Using flume-ng-sql-source to extract data from MySQL into Kafka for consumption by Storm
2019-01-19 14:09:01

1. Download and build flume-ng-sql-source. Download URL: https://github.com/keedio/flume-ng-sql-source.git

Following the project's documentation, build it and copy the jar package into Flume's lib directory.
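The plugin is a standard Maven project. A minimal sketch of the build-and-install step, assuming the Flume install path used later in this article's config (the MySQL JDBC driver jar must end up in Flume's lib directory as well):

```shell
# Clone and build the source plugin (requires git, a JDK, and Maven)
git clone https://github.com/keedio/flume-ng-sql-source.git
cd flume-ng-sql-source
mvn package -DskipTests

# Copy the built jar into Flume's lib directory
# (FLUME_HOME matches the status.file.path used in the config below)
FLUME_HOME=/home/hadoop/export/server/apache-flume-1.7.0-bin
cp target/flume-ng-sql-source-*.jar "$FLUME_HOME/lib/"

# The MySQL JDBC driver (mysql-connector-java-*.jar) must also be
# placed in $FLUME_HOME/lib/, since the source loads com.mysql.jdbc.Driver.
```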

2. Write the flume-ng configuration file

a1.channels = ch-1
a1.sources = src-1
a1.sinks = k1
###########sql source#################
# For each one of the sources, the type is defined
a1.sources.src-1.type = org.keedio.flume.source.SQLSource
# Hibernate Database connection properties
a1.sources.src-1.hibernate.connection.url = jdbc:mysql://172.16.43.21:3306/test
a1.sources.src-1.hibernate.connection.user = hadoop
a1.sources.src-1.hibernate.connection.password = hadoop
a1.sources.src-1.hibernate.connection.autocommit = true
a1.sources.src-1.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect
a1.sources.src-1.hibernate.connection.driver_class = com.mysql.jdbc.Driver
a1.sources.src-1.run.query.delay=5000
a1.sources.src-1.status.file.path = /home/hadoop/export/server/apache-flume-1.7.0-bin
a1.sources.src-1.status.file.name = sqlSource.status
# Custom query
a1.sources.src-1.start.from = 0
a1.sources.src-1.custom.query = select `id`, `str` from json_str where id > $@$ order by id asc
a1.sources.src-1.batch.size = 1000
a1.sources.src-1.max.rows = 1000
a1.sources.src-1.hibernate.connection.provider_class = org.hibernate.connection.C3P0ConnectionProvider
a1.sources.src-1.hibernate.c3p0.min_size=1
a1.sources.src-1.hibernate.c3p0.max_size=10

################################################################
a1.channels.ch-1.type = memory
a1.channels.ch-1.capacity = 10000
a1.channels.ch-1.transactionCapacity = 10000
a1.channels.ch-1.byteCapacityBufferPercentage = 20
a1.channels.ch-1.byteCapacity = 800000

################################################################
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = testuser
a1.sinks.k1.brokerList = test0:9092,test1:9092,test2:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20

a1.sinks.k1.channel = ch-1
a1.sources.src-1.channels=ch-1
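Note that the `$@$` placeholder in `custom.query` is replaced by the last incremental value recorded in the status file (`sqlSource.status` under `status.file.path`), which is how the source resumes where it left off; `start.from = 0` only seeds the very first run. With the config saved to a file (the name `mysql-kafka.conf` below is an assumption), the agent can be started like this:

```shell
# Start the agent named a1 defined in the config above;
# the config file name is illustrative
FLUME_HOME=/home/hadoop/export/server/apache-flume-1.7.0-bin
"$FLUME_HOME/bin/flume-ng" agent \
  --conf "$FLUME_HOME/conf" \
  --conf-file "$FLUME_HOME/conf/mysql-kafka.conf" \
  --name a1 \
  -Dflume.root.logger=INFO,console
```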

3. Problems encountered

After the MySQL rows are collected into Kafka, the messages contain many extra double quotes.

MySQL data format: (original screenshot not preserved)

Kafka data format: (original screenshot not preserved)
Use Storm to clean up the format of the data in Kafka.
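The extra quotes come from the source's CSV serialization, which wraps every column in double quotes and doubles any quote character inside a field, so a JSON value like {"name":"tom"} arrives as "{""name"":""tom""}". A minimal sketch of the cleanup logic a Storm bolt's execute() might apply to each message (class and method names are illustrative, and the naive split assumes field values never contain the literal sequence ","):

```java
// Hypothetical cleanup for lines emitted by flume-ng-sql-source's
// CSV serializer: strip the quotes wrapping each field and collapse
// doubled quotes back to single quotes.
public class QuoteCleaner {
    public static String clean(String line) {
        // Split on the "," boundary between quoted fields
        String[] fields = line.split("\",\"", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            String f = fields[i];
            // Remove the leading quote of the first field
            // and the trailing quote of the last field
            if (i == 0 && f.startsWith("\"")) f = f.substring(1);
            if (i == fields.length - 1 && f.endsWith("\"")) f = f.substring(0, f.length() - 1);
            // Un-double quotes escaped inside a field: "" -> "
            f = f.replace("\"\"", "\"");
            if (i > 0) out.append(',');
            out.append(f);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "1","{""name"":""tom""}"  ->  1,{"name":"tom"}
        System.out.println(clean("\"1\",\"{\"\"name\"\":\"\"tom\"\"}\""));
    }
}
```

In an actual topology, this logic would run inside the bolt that consumes from the Kafka spout, before the cleaned tuple is emitted downstream.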

