Notes on installing Flume on Hadoop
2019-02-09 00:42:53
Tags: flume, install, hadoop, notes

Flume introduction

1. Purpose: data acquisition
    1.1 collecting
    1.2 aggregating
    1.3 moving
2. Data flows from a source to a destination.
3. Streaming: like flowing water, the data is dynamic and ordered.
        3.1 Usable for both real-time and offline pipelines.
4. How Flume handles data sources that live on Windows:
        4.1 Linux: NFS (Network File System)
        4.2 Mount the Windows directory onto the Linux system via NFS.
5. Three core components: source, channel, sink.
6. Channel types:
        6.1 file channel
        6.2 memory channel
        6.3 kafka channel
7. A source, a channel, and a sink together form an agent.
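Point 4 above can be sketched as a mount command. This is a minimal sketch only: the host name `winhost`, the export path `/logs`, and the mount point `/mnt/winlogs` are placeholder assumptions, not values from this article, and the Windows side must already be exporting the directory over NFS. The sketch builds and prints the command instead of executing it, since a real mount needs root and a live export.

```shell
# Hypothetical values -- replace with your real NFS server and paths.
NFS_SERVER=winhost            # Windows host running an NFS server (assumption)
REMOTE_EXPORT=/logs           # directory exported by that host (assumption)
MOUNT_POINT=/mnt/winlogs      # local mount point on the Linux box (assumption)

# Build the mount command; run it manually (as root) once the export exists.
MOUNT_CMD="mount -t nfs ${NFS_SERVER}:${REMOTE_EXPORT} ${MOUNT_POINT}"
echo "$MOUNT_CMD"
```

Once mounted, the directory can be watched by a Flume spooldir source like any local Linux directory.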

Flume deployment

Installation package download
Link: https://pan.baidu.com/s/170PUOjZIl_MSQU6AmzYe6w  Password: gigf

There are three installation scenarios for Flume relative to a Hadoop cluster.

If Flume is inside the Hadoop cluster:

  • Edit flume-env.sh:
    export JAVA_HOME=/opt/moduels/jdk1.7.0_67

If Flume is inside the Hadoop cluster and Hadoop is configured with HA:

  • Edit flume-env.sh:
    export JAVA_HOME=/opt/moduels/jdk1.7.0_67
  • Copy Hadoop's core-site.xml and hdfs-site.xml into flume/conf.

If Flume is not inside the Hadoop cluster:

  • Edit flume-env.sh:
    export JAVA_HOME=/opt/moduels/jdk1.7.0_67
  • Copy Hadoop's core-site.xml and hdfs-site.xml into flume/conf.
  • Put the Hadoop-related jars into the lib directory; they are included in the download above.
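The copy steps above can be sketched as a short script. So the sketch runs anywhere, temp directories with empty files stand in for the real `HADOOP_HOME` and `FLUME_HOME` trees (both hypothetical here); in a real install you would point the variables at the actual directories and skip the setup block.

```shell
# Stand-ins so the sketch is self-contained; replace with real paths.
HADOOP_HOME=$(mktemp -d)      # e.g. /opt/moduels/hadoop-2.5.0 (assumption)
FLUME_HOME=$(mktemp -d)       # the flume install directory (assumption)
mkdir -p "$HADOOP_HOME/etc/hadoop" "$FLUME_HOME/conf" "$FLUME_HOME/lib"
touch "$HADOOP_HOME/etc/hadoop/core-site.xml" \
      "$HADOOP_HOME/etc/hadoop/hdfs-site.xml"

# The actual step: copy the HDFS client configs into flume/conf so the
# hdfs sink can resolve the (possibly HA) NameNode address.
cp "$HADOOP_HOME/etc/hadoop/core-site.xml" \
   "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" \
   "$FLUME_HOME/conf/"
ls "$FLUME_HOME/conf/"
```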

A commonly used debugging option (logs events to the console at INFO level):
-Dflume.root.logger=INFO,console


flume-hdfs-dir-mem-conf.properties

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#  http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.


# The configuration file needs to define the sources, 
# the channels and the sinks.
# Sources, channels and sinks are defined per agent, 
# in this case called 'agent'

agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = loggerSink

# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = spooldir
agent.sources.seqGenSrc.spoolDir = /opt/cdhmoduels/apache-flume-1.5.0-cdh5.3.6-bin/file/spooling

# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel

# Each sink's type must be defined; this sink writes to HDFS
agent.sinks.loggerSink.type = hdfs
agent.sinks.loggerSink.hdfs.path = /flume/event/hdfsdir
agent.sinks.loggerSink.hdfs.filePrefix = hive-log
# Roll a new HDFS file every 10240 bytes; 0 disables time-based
# and event-count-based rolling
agent.sinks.loggerSink.hdfs.rollSize = 10240
agent.sinks.loggerSink.hdfs.rollInterval = 0
agent.sinks.loggerSink.hdfs.rollCount = 0
#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel

# Each channel's type is defined.   
agent.channels.memoryChannel.type = memory

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 1000

Start command

bin/flume-ng agent --conf conf/ --name agent --conf-file conf/flume-hdfs-dir-mem-conf.properties -Dflume.root.logger=INFO,console
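With the agent running, the spooldir source can be smoke-tested by dropping a file into the watched directory; the source ingests it and renames it with a `.COMPLETED` suffix. In this sketch a temp directory stands in for the real spoolDir from the config (`.../file/spooling`), so it runs without a Flume install.

```shell
# Stand-in for the spoolDir configured above (assumption: in a real test
# you would write into the directory the agent actually watches).
SPOOL_DIR=$(mktemp -d)
echo "hello flume $(date +%s)" > "$SPOOL_DIR/app.log"
ls "$SPOOL_DIR"
```

Note that the spooldir source expects files to be complete and immutable when they appear; write elsewhere and move them in, rather than appending in place.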
