Spark Learning - SparkSQL - 02 - Spark History Server

Spark History Server: Configuration and Usage

1. Background: Why the Spark History Server Exists

Take standalone mode as an example. While a Spark application is running, Spark provides a web UI that lists the application's runtime information. However, that web UI shuts down as soon as the application finishes (whether it succeeds or fails), so once an application has completed, its history can no longer be viewed.

The Spark History Server was created to address this. With the proper configuration, event log information is recorded while the application runs, so after the application finishes, the web UI can be re-rendered from those logs to show the application's runtime information.

The same applies when Spark runs on YARN or Mesos: as long as the application's event logs were recorded, the history server can reconstruct the runtime information of a completed application.

2. Configuring & Using the Spark History Server

Start the Spark history server with its default configuration:

cd $SPARK_HOME/sbin
start-history-server.sh
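
Note that the history server only displays applications that actually wrote event logs. A minimal sketch of the configuration, assuming event logs are kept on HDFS under /spark_job_history (the path is illustrative; any directory that exists and is writable works):

# $SPARK_HOME/conf/spark-defaults.conf
# Applications write event logs to this directory while they run
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://mycluster:8020/spark_job_history
# The history server reads completed event logs from the same directory
spark.history.fs.logDirectory    hdfs://mycluster:8020/spark_job_history

With these properties in place, start-history-server.sh should pick up spark.history.fs.logDirectory from spark-defaults.conf, so no directory argument is needed on the command line.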

This fails with the following error:

starting org.apache.spark.deploy.history.HistoryServer, logging to /home/spark/software/source/compile/deploy_spark/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-hadoop000.out
failed to launch org.apache.spark.deploy.history.HistoryServer:
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:44)
        ... 6 more

Try again, this time passing the HDFS log directory on the command line:

[root@biluos logs]# /opt/moudles/spark-2.2.0-bin-hadoop2.7/sbin/start-history-server.sh hdfs://mycluster:8020/spark_job_history
starting org.apache.spark.deploy.history.HistoryServer, logging to /opt/moudles/spark-2.2.0-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-biluos.com.out

[root@biluos logs]# cat spark-root-org.apache.spark.deploy.history.HistoryServer-1-biluos.com.out 
Spark Command: /opt/moudles/jdk1.8.0_121/bin/java -cp /opt/moudles/spark-2.2.0-bin-hadoop2.7/conf/:/opt/moudles/spark-2.2.0-bin-hadoop2.7/jars/*:/opt/moudles/hadoop-2.7.3/etc/hadoop/ -Xmx1g org.apache.spark.deploy.history.HistoryServer hdfs://mycluster:8020/spark_job_history
========================================
17/08/03 03:22:18 INFO HistoryServer: Started daemon with process name: 2666@biluos.com
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for TERM
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for HUP
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for INT
17/08/03 03:22:18 WARN HistoryServerArguments: Setting log directory through the command line is deprecated as of Spark 1.1.0. Please set this through spark.history.fs.logDirectory instead.
17/08/03 03:22:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/03 03:22:19 INFO SecurityManager: Changing view acls to: root
17/08/03 03:22:19 INFO SecurityManager: Changing modify acls to: root
17/08/03 03:22:19 INFO SecurityManager: Changing view acls groups to: 
17/08/03 03:22:19 INFO SecurityManager: Changing modify acls groups to: 
17/08/03 03:22:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
17/08/03 03:22:19 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
        at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: Log directory specified does not exist: hdfs://mycluster:8020/spark_job_history
        at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:214)
        at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:160)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:156)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:78)
        ... 6 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://mycluster:8020/spark_job_history
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
        at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:204)
        ... 9 more
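
Two problems show up in this log: the directory passed on the command line does not exist on HDFS yet, and the WARN message notes that passing the log directory on the command line has been deprecated since Spark 1.1.0 in favor of spark.history.fs.logDirectory. A sketch of the recommended alternative, assuming a standalone deployment and the same HDFS path, is to export SPARK_HISTORY_OPTS in spark-env.sh:

# $SPARK_HOME/conf/spark-env.sh
# Equivalent to passing the directory on the command line, but uses the
# configuration property the WARN message recommends
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://mycluster:8020/spark_job_history"

Setting the same property in spark-defaults.conf, as shown earlier, works as well. Either way, the directory still has to exist on HDFS, which is the actual cause of the failure above.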


Solution
Create the missing event log directory on HDFS:

[root@biluos logs]# hdfs dfs -mkdir /spark_job_history

After restarting the history server, the error no longer appears.
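
A minimal end-to-end check, assuming the same HDFS path and Spark's default history server UI port of 18080 (change it via spark.history.ui.port if needed):

# Create the event log directory, including parents if necessary
hdfs dfs -mkdir -p hdfs://mycluster:8020/spark_job_history

# Restart the history server
$SPARK_HOME/sbin/stop-history-server.sh
$SPARK_HOME/sbin/start-history-server.sh

# The web UI should now be reachable at http://<history-server-host>:18080
# Completed applications appear there once they have written event logs
# to spark.eventLog.dir.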

The history server web UI then looks like the figure below.
[Figure: Spark History Server web UI]
