PATH
(6) Make the configuration take effect
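Assuming the previous step appended sbt's directory to PATH in ~/.bashrc (a common setup; adjust the file name if yours differs), reload the shell configuration:
$ source ~/.bashrc  # re-read ~/.bashrc so the updated PATH takes effect in the current shell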
(7) Verify that the installation succeeded
$ sbt sbt-version
# If that command fails, use the one below instead (sbt 1.x dropped the hyphenated key names)
$ sbt sbtVersion
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
[info] Loading project definition from /home/pxh/project
[info] Set current project to pxh (in build file:/home/pxh/)
[info] 1.2.1
Writing a Scala Application
(1) In a terminal, create a folder named sparkapp to serve as the application's root directory
cd ~
mkdir ./sparkapp
mkdir -p ./sparkapp/src/main/scala  # create the required directory structure
(2) In ./sparkapp/src/main/scala, create a file named SimpleApp.scala and add the following code
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "file:///home/pxh/hello.ts"  // input file to analyze
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()  // read the file as an RDD with 2 partitions and cache it
    val numAs = logData.filter(line => line.contains("a")).count()  // count lines containing the letter "a"
    println("Lines with a: %s".format(numAs))
    sc.stop()  // release the SparkContext's resources
  }
}
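The job reads /home/pxh/hello.ts, so that file must exist before step (6) below is run. A minimal way to create a test input (the sample contents here are an assumption, chosen so that three lines contain the letter "a" to match the output shown later; any text file works as long as logFile points to it):
$ printf 'apache\nhadoop\nspark\n' > ~/hello.ts  # three lines, each containing "a"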
(3) Add the application's metadata and its dependency on Spark
vim ./sparkapp/simple.sbt
Add the following content to the file:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"
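Note that the %% operator appends the project's Scala binary version to the artifact name, so this dependency resolves to spark-core_2.11. Spark 2.2.0 is built against Scala 2.11, which is why scalaVersion is set to 2.11.8 here and why the packaged JAR ends up under target/scala-2.11 in step (5).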
(4) Check the application's file structure
Running find . from inside ~/sparkapp should show the following structure:
.
./simple.sbt
./src
./src/main
./src/main/scala
./src/main/scala/SimpleApp.scala
(5) Package the whole application into a JAR (the first run downloads the dependencies and can take a long time; please be patient)
sparkapp$ /usr/local/sbt/sbt package
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
[info] Loading project definition from /home/pxh/sparkapp/project
[info] Loading settings for project sparkapp from simple.sbt ...
[info] Set current project to Simple Project (in build file:/home/pxh/sparkapp/)
[success] Total time: 2 s, completed 2018-10-1 0:04:59
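On success the JAR is written under target/ for the configured Scala version; a quick check (the path follows from the scalaVersion set above):
$ ls ~/sparkapp/target/scala-2.11/
simple-project_2.11-1.0.jar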
(6) Submit the generated JAR to Spark with spark-submit and run it (2>&1 merges Spark's verbose log output into stdout so that grep can filter it down to the program's own output line)
:~$ /home/pxh/spark-2.2.0-bin-hadoop2.7/bin/spark-submit --class "SimpleApp" /home/pxh/sparkapp/target/scala-2.11/simple-project_2.11-1.0.jar 2>&1 | grep "Lines with a:"
Lines with a: 3
END........