User:Razzi/2021-09-24
~ ❯ ssh an-test-client1001.e
Linux an-test-client1001 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64
Debian GNU/Linux 10 (buster)
  _  __         _               _             _   _               _
 | |/ /        | |             (_)           | | | |             | |
 | ' / ___ _ __| |__   ___ _ __ _ _______  __| | | |__   ___  ___| |_
 |  < / _ \ '__| '_ \ / _ \ '__| |_  / _ \/ _` | | '_ \ / _ \/ __| __|
 | . \  __/ |  | |_) |  __/ |  | |/ /  __/ (_| | | | | | (_) \__ \ |_
 |_|\_\___|_|  |_.__/ \___|_|  |_/___\___|\__,_| |_| |_|\___/|___/\__|

This host is capable of Kerberos authentication in the WIKIMEDIA realm.
For more info: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Kerberos/UserGuide

an-test-client1001 is a Analytics Hadoop test client (analytics_test_cluster::client)
The last Puppet run was at Fri Sep 24 20:55:58 UTC 2021 (16 minutes ago). Last puppet commit: (ff4eb50a5e) Ryan Kemper - query_service: Add monitoring::groups for wcqs
Debian GNU/Linux 10 auto-installed on Wed Oct 21 22:23:14 UTC 2020.
Last login: Thu Jul 29 16:15:19 2021 from 2620:0:863:1:198:35:26:13

You do not have a valid Kerberos ticket in the credential cache, remember to kinit.
razzi@an-test-client1001:~$ kinit
Password for razzi@WIKIMEDIA:
razzi@an-test-client1001:~$ klist
Ticket cache: FILE:/tmp/krb5cc_26051
Default principal: razzi@WIKIMEDIA

Valid starting       Expires              Service principal
09/24/2021 21:12:26  09/26/2021 21:12:23  krbtgt/WIKIMEDIA@WIKIMEDIA
        renew until 10/01/2021 21:12:23
razzi@an-test-client1001:~$ presto --help
NAME
        presto - Presto interactive console

SYNOPSIS
        presto [--access-token <access token>]
                [--catalog <catalog>]
                [--client-info <client-info>]
                [--client-request-timeout <client request timeout>]
                [--client-tags <client tags>]
                [--debug]
                [--disable-compression]
                [--execute <execute>]
                [--extra-credential <extra-credential>...]
                [(-f <file> | --file <file>)]
                [(-h | --help)]
                [--http-proxy <http-proxy>]
                [--ignore-errors]
                [--keystore-password <keystore password>]
                [--keystore-path <keystore path>]
                [--krb5-config-path <krb5 config path>]
                [--krb5-credential-cache-path <krb5 credential cache path>]
                [--krb5-disable-remote-service-hostname-canonicalization]
                [--krb5-keytab-path <krb5 keytab path>]
                [--krb5-principal <krb5 principal>]
                [--krb5-remote-service-name <krb5 remote service name>]
                [--log-levels-file <log levels file>]
                [--output-format <output-format>]
                [--password]
                [--resource-estimate <resource-estimate>...]
                [--schema <schema>]
                [--server <server>]
                [--session <session>...]
                [--socks-proxy <socks-proxy>]
                [--source <source>]
                [--truststore-password <truststore password>]
                [--truststore-path <truststore path>]
                [--user <user>]
                [--version]

OPTIONS
        --access-token <access token>
            Access token

        --catalog <catalog>
            Default catalog

        --client-info <client-info>
            Extra information about client making query

        --client-request-timeout <client request timeout>
            Client request timeout (default: 2m)

        --client-tags <client tags>
            Client tags

        --debug
            Enable debug information

        --disable-compression
            Disable compression of query results

        --execute <execute>
            Execute specified statements and exit

        --extra-credential <extra-credential>
            Extra credentials (property can be used multiple times; format is key=value)

        -f <file>, --file <file>
            Execute statements from file and exit

        -h, --help
            Display help information

        --http-proxy <http-proxy>
            HTTP proxy to use for server connections

        --ignore-errors
            Continue processing in batch mode when an error occurs (default is to exit immediately)

        --keystore-password <keystore password>
            Keystore password

        --keystore-path <keystore path>
            Keystore path

        --krb5-config-path <krb5 config path>
            Kerberos config file path (default: /etc/krb5.conf)

        --krb5-credential-cache-path <krb5 credential cache path>
            Kerberos credential cache path

        --krb5-disable-remote-service-hostname-canonicalization
            Disable service hostname canonicalization using the DNS reverse lookup

        --krb5-keytab-path <krb5 keytab path>
            Kerberos key table path (default: /etc/krb5.keytab)

        --krb5-principal <krb5 principal>
            Kerberos principal to be used

        --krb5-remote-service-name <krb5 remote service name>
            Remote peer's kerberos service name

        --log-levels-file <log levels file>
            Configure log levels for debugging using this file

        --output-format <output-format>
            Output format for batch mode [ALIGNED, VERTICAL, CSV, TSV, CSV_HEADER, TSV_HEADER, NULL] (default: CSV)

        --password
            Prompt for password

        --resource-estimate <resource-estimate>
            Resource estimate (property can be used multiple times; format is key=value)

        --schema <schema>
            Default schema

        --server <server>
            Presto server location (default: localhost:8080)

        --session <session>
            Session property (property can be used multiple times; format is key=value; use 'SHOW SESSION' to see available properties)

        --socks-proxy <socks-proxy>
            SOCKS proxy to use for server connections

        --source <source>
            Name of source making query

        --truststore-password <truststore password>
            Truststore password

        --truststore-path <truststore path>
            Truststore path

        --user <user>
            Username

        --version
            Display version information and exit

razzi@an-test-client1001:~$ presto
presto> select meta.id from event.navigationtiming where year=2021 limit 10;
Query 20210924_211309_00000_4ytii failed: line 1:21: Catalog must be specified when session catalog is not set
select meta.id from event.navigationtiming where year=2021 limit 10

presto> use analytics_hive;
Query 20210924_211315_00001_4ytii failed: line 1:1: Catalog must be specified when session catalog is not set
use analytics_hive

presto> show catalogs;
       Catalog
---------------------
 analytics_test_hive
 system
(2 rows)

Query 20210924_211321_00002_4ytii, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]

presto> use analytics_test_hive;
Query 20210924_211328_00003_4ytii failed: line 1:1: Catalog must be specified when session catalog is not set
use analytics_test_hive

presto> help

Supported commands:
QUIT
EXPLAIN [ ( option [, ...] ) ] <query>
    options: FORMAT { TEXT | GRAPHVIZ }
             TYPE { LOGICAL | DISTRIBUTED }
DESCRIBE
SHOW COLUMNS FROM

 analytics_test_hive
 system
(2 rows)

Query 20210924_211357_00004_4ytii, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

presto> use analytics_test_hive
     ->
     -> ;
Query 20210924_211406_00005_4ytii failed: line 1:1: Catalog must be specified when session catalog is not set
use analytics_test_hive

presto> show tables
     ->
presto> showhl
presto> help

Supported commands:
QUIT
EXPLAIN [ ( option [, ...] ) ] <query>
    options: FORMAT { TEXT | GRAPHVIZ }
             TYPE { LOGICAL | DISTRIBUTED }
DESCRIBE

 default
 elukey
 event
 event_sanitized
 information_schema
 otto_sanitized
 wmf
 wmf_raw
(8 rows)

Query 20210924_211445_00006_4ytii, FINISHED, 2 nodes
Splits: 19 total, 19 done (100.00%)
0:01 [8 rows, 115B] [9 rows/s, 142B/s]

presto> use analytics_test_hive.wmf
     -> ;
USE
presto:wmf> show tables;
       Table
-------------------
 anomaly_detection
 webrequest
(2 rows)

Query 20210924_211507_00008_4ytii, FINISHED, 2 nodes
Splits: 19 total, 19 done (100.00%)
0:01 [2 rows, 53B] [2 rows/s, 76B/s]

presto:wmf> select * from webrequest
         -> limit 1;
Query 20210924_211522_00009_4ytii failed: Query over table 'wmf.webrequest' can potentially read more than 840 partitions

presto:wmf> select count(*) from webrequest limit 1;
Query 20210924_211532_00010_4ytii failed: Query over table 'wmf.webrequest' can potentially read more than 840 partitions

presto:wmf> describe table webrequest;
Query 20210924_211546_00011_4ytii failed: line 1:10: mismatched input 'table'. Expecting: 'INPUT', 'OUTPUT', <identifier>
describe table webrequest

presto:wmf> describe webrequest;
      Column       |                                           Type
-------------------+------------------------------------------------------------------------------------------
 hostname          | varchar
 sequence          | bigint
 dt                | varchar
 time_firstbyte    | double
 ip                | varchar
 cache_status      | varchar
 http_status       | varchar
 response_size     | bigint
 http_method       | varchar
 uri_host          | varchar
 uri_path          | varchar
 uri_query         | varchar
 content_type      | varchar
 referer           | varchar
 x_forwarded_for   | varchar
 user_agent        | varchar
 accept_language   | varchar
 x_analytics       | varchar
 range             | varchar
 is_pageview       | boolean
 record_version    | varchar
 client_ip         | varchar
 geocoded_data     | map(varchar, varchar)
 x_cache           | varchar
 user_agent_map    | map(varchar, varchar)
 x_analytics_map   | map(varchar, varchar)
 ts                | timestamp
 access_method     | varchar
 agent_type        | varchar
 is_zero           | boolean
 referer_class     | varchar
 normalized_host   | row(project_class varchar, project varchar, qualifiers array(varchar), tld varchar, proje
 pageview_info     | map(varchar, varchar)
 page_id           | bigint
 namespace_id      | integer
 tags              | array(varchar)
 isp_data          | map(varchar, varchar)
 accept            | varchar
 tls               | varchar
 tls_map           | map(varchar, varchar)
 webrequest_source | varchar
 year              | integer
 month             | integer
 day               | integer
 hour              | integer
(45 rows)

Query 20210924_211610_00012_4ytii, FINISHED, 2 nodes
Splits: 19 total, 19 done (100.00%)
0:13 [45 rows, 5.56KB] [3 rows/s, 424B/s]

presto:wmf> select count(*) from webrequest where year = 2021 and month = 5 and day = 5 and hour = 5;
 _col0
-------
     0
(1 row)

Query 20210924_211644_00013_4ytii, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

presto:wmf> select count(*) from webrequest where year = 2021 and month = 8 and day = 5 and hour = 5;
 _col0
-------
 10405
(1 row)

Query 20210924_211653_00014_4ytii, FINISHED, 1 node
Splits: 273 total, 273 done (100.00%)
0:05 [10.4K rows, 0B] [1.95K rows/s, 0B/s]

presto:wmf> select count(*) from webrequest where year = 2021 and month = 8 and day = 5 and hour = 5;
 _col0
-------
 10405
(1 row)

Query 20210924_211703_00015_4ytii, FINISHED, 1 node
Splits: 273 total, 273 done (100.00%)
0:02 [10.4K rows, 0B] [4.85K rows/s, 0B/s]

presto:wmf> select count(*) from webrequest where year = 2021 and month = 8 and day = 5;
 _col0
--------
 288890
(1 row)

Query 20210924_211843_00016_4ytii, FINISHED, 1 node
Splits: 6,161 total, 6,161 done (100.00%)
0:47 [289K rows, 0B] [6.1K rows/s, 0B/s]

presto:wmf> Connection to an-test-client1001.eqiad.wmnet closed.
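
Note on the errors in this session: the "Catalog must be specified when session catalog is not set" failures stopped once a catalog and schema were selected together (use analytics_test_hive.wmf), and the "can potentially read more than 840 partitions" failures stopped once the query filtered on year, month, day, and hour. A minimal sketch of doing both non-interactively is below. It relies only on the --catalog, --schema, and --execute flags listed in the presto --help output. The helper names partition_filter and count_query are hypothetical, and the sketch assumes that year, month, day, and hour are the partition columns of webrequest and that the presto wrapper on the client host already knows which server to contact, as it appeared to in the session.

```shell
#!/bin/sh
# Hypothetical helpers: build a count query restricted to one hourly partition.

# partition_filter YEAR MONTH DAY HOUR
# Prints the WHERE-clause body that pins the query to a single hour.
partition_filter() {
    printf 'year = %s and month = %s and day = %s and hour = %s' "$1" "$2" "$3" "$4"
}

# count_query YEAR MONTH DAY HOUR
# Prints a count(*) query over webrequest for that hour.
count_query() {
    printf 'select count(*) from webrequest where %s' "$(partition_filter "$@")"
}

# Dry run: print the query instead of sending it anywhere.
query="$(count_query 2021 8 5 5)"
printf '%s\n' "$query"

# On a client host with a valid Kerberos ticket, the query could then be run as:
#   presto --catalog analytics_test_hive --schema wmf --execute "$query"
```

With --execute the CLI runs in batch mode, for which the help text lists CSV as the default output format, so the result would likely be a single CSV line and not the aligned table shown in the interactive session.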