设为首页 加入收藏

TOP

SparkSQL使用之Spark SQL CLI(一)
2014-11-23 21:26:33 来源: 作者: 【 】 浏览:76
Tags:SparkSQL 使用 Spark SQL CLI

Spark SQL CLI描述


Spark SQL CLI的引入使得在SparkSQL中通过hive metastore就可以直接对hive进行查询更加方便;当前版本中还不能使用Spark SQL CLI与ThriftServer进行交互。


使用Spark SQL CLI前需要注意:


1、将hive-site.xml配置文件拷贝到$SPARK_HOME/conf目录下;


2、需要在$SPARK_HOME/conf/spark-env.sh中的SPARK_CLASSPATH添加jdbc驱动的jar包


Spark SQL CLI命令参数介绍:


cd $SPARK_HOME/bin
spark-sql --help


Usage: ./bin/spark-sql [options] [cli option]
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
--deploy-mode DEPLOY_MODE Whether to launch the driver program locally ("client") or
on one of the worker machines inside the cluster ("cluster")
(Default: client).
--class CLASS_NAME Your application's main class (for Java / Scala apps).
--name NAME A name of your application.
--jars JARS Comma-separated list of local jars to include on the driver
and executor classpaths.
--py-files PY_FILES Comma-separated list of .zip, .egg, or .py files to place
on the PYTHONPATH for Python apps.
--files FILES Comma-separated list of files to be placed in the working
directory of each executor.


--conf PROP=VALUE Arbitrary Spark configuration property.
--properties-file FILE Path to a file from which to load extra properties. If not
specified, this will look for conf/spark-defaults.conf.


--driver-memory MEM Memory for driver (e.g. 1000M, 2G) (Default: 512M).
--driver-java-options Extra Java options to pass to the driver.
--driver-library-path Extra library path entries to pass to the driver.
--driver-class-path Extra class path entries to pass to the driver. Note that
jars added with --jars are automatically included in the
classpath.


--executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G).


--help, -h Show this help message and exit
--verbose, -v Print additional debug output


Spark standalone with cluster deploy mode only:
--driver-cores NUM Cores for driver (Default: 1).
--supervise If given, restarts the driver on failure.


Spark standalone and Mesos only:
--total-executor-cores NUM Total cores for all executors.


YARN-only:
--executor-cores NUM Number of cores per executor (Default: 1).
--queue QUEUE_NAME The YARN queue to submit to (Default: "default").
--num-executors NUM Number of executors to launch (Default: 2).
--archives ARCHIVES Comma separated list of archives to be extracted into the
working directory of each executor.


CLI options:
-d,--define Variable subsitution to apply to hive
commands. e.g. -d A=B or --define A=B
--database Specify the database to use
-e SQL from command line
-f SQL from files
-h connecting to Hive Server on remote host
--hiveconf Use value for given property
--hivevar Variable subsitution to apply to hive
commands. e.g. --hivevar A=B
-i Initialization SQL file
-p connecting to Hive Server on port number
-S,--silent Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the console)


在启动spark-sql时,如果不指定master,则以local的方式运行,master既可以指定standalone的地址,也可以指定yarn;


当设定master为yarn时(spark-sql

首页 上一页 1 2 下一页 尾页 1/2/2
】【打印繁体】【投稿】【收藏】 【推荐】【举报】【评论】 【关闭】 【返回顶部
分享到: 
上一篇SparkSQL使用之Thrift JDBC Server 下一篇C#把对象类型转化为指定类型,转..

评论

帐  号: 密码: (新用户注册)
验 证 码:
表  情:
内  容: