Obsolete:Beeline
beeline is obsolete; use hive or wmfdata-python instead.
Beeline is the command-line shell that ships with HiveServer2, which was introduced in Hive 0.11[1]. The original Hive CLI (hive) has been officially deprecated in favor of Beeline[2], but as of October 2018, the Analytics team does not recommend switching, since the original client still has better error reporting and a few other advantages.
Usage
SSH into stat1007 or stat1004, then run beeline --showNestedErrs=true
You should get a prompt like:
madhuvishy@stat1004:~$ beeline
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
scan complete in 2ms
Connecting to jdbc:hive2://analytics1003.eqiad.wmnet:10000
Connected to: Apache Hive (version 1.1.0-cdh5.5.2)
Driver: Hive JDBC (version 1.1.0-cdh5.5.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.5.2 by Apache Hive
0: jdbc:hive2://analytics1003.eqiad.wmnet:100>
Press Ctrl+C to exit.
Help
Run beeline --help to show all the available options.
Defaults
We have a wrapper script set up around beeline that sets defaults for the database URL, username (the current user), and outputformat (tsv2). Any of these can be overridden by passing the corresponding option when invoking beeline.
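As a rough sketch of overriding one of those defaults, the helper below forces CSV output instead of tsv2 and forwards every other argument unchanged. The function name beeline_csv is hypothetical; --outputformat and its csv2 value are standard Beeline options, and the sketch assumes the wrapper honors an explicitly passed option as described above.

```shell
# Hypothetical helper: override the wrapper's default outputformat (tsv2)
# with csv2, and pass all remaining arguments straight through to beeline.
beeline_csv() {
  beeline --outputformat=csv2 "$@"
}
```

For example, beeline_csv -e 'SHOW DATABASES;' would print the result as comma-separated values, while the database URL and username still come from the wrapper's defaults.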
Running queries
Running queries works the same way as with the hive CLI: you can read a query from a file using -f, pass it as a string using -e, and so on. See Analytics/Cluster/Hive#Querying.
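The two invocation styles can be sketched as small shell helpers. The function names and the file paths passed to them are hypothetical; -e and -f are the Beeline options named above, and the connection details are assumed to come from the wrapper's defaults.

```shell
# Hypothetical helper: run an inline HiveQL string (-e) and print the
# result to stdout.
hive_query() {
  beeline -e "$1"
}

# Hypothetical helper: run the HiveQL statements stored in a file (-f)
# and save the result to a second, caller-supplied path.
hive_query_file() {
  beeline -f "$1" > "$2"
}
```

Typical use would be hive_query 'SHOW DATABASES;' for a quick check, or hive_query_file query.hql results.tsv for a longer query kept in a file.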
Documentation
Beeline usage and all of its options are explained in the Apache Hive documentation: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–CommandLineShell
Features [WIP]
Beeline adds some features and fixes some bugs relative to the old Hive client.
- View results with a large number of columns in a vertical layout using --outputformat=vertical
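One way to use the vertical format is to inspect a single row of a wide table, so that each column appears on its own line. The sketch below assumes a hypothetical helper name (peek_row) and a caller-supplied table name; only --outputformat=vertical and -e are actual Beeline options.

```shell
# Hypothetical helper: print one row of the given table with each column
# on its own line, which is easier to read when there are many columns.
# $1 is a table name such as "some_db.some_table" (placeholder).
peek_row() {
  beeline --outputformat=vertical -e "SELECT * FROM $1 LIMIT 1;"
}
```

The table name is interpolated into the query as-is, so this is only suitable for interactive use with names you typed yourself.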