HBase: Writing to Two Tables with One MapReduce Job (Part 1)
2014-11-24 08:32:24 | Views: 3
Tags: HBase, MapReduce, write to multiple tables

The raw input data is as follows:


fansy,22,blog.csdu.net/fansy1990
tom,25,blog.csdu.net/tom1987
kate,23,blog.csdu.net/kate1989
jake,20,blog.csdu.net/jake1992
john,35,blog.csdu.net/john1977
ben,30,blog.csdu.net/ben1982


The first column is the name, the second the age, and the third the webPage. The goal is to store name and age in table 1, and name and webPage in table 2. The code follows:
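Before wiring up the job, the per-line transformation can be tried out in plain Java with no HBase dependency. The class and method names below are illustrative only; the row-key scheme (name + age for table 1) is the one used in the mapper later on:

```java
public class LineSplitDemo {
    // split one CSV line into (name, age, webPage); null if malformed
    static String[] parse(String line) {
        String[] info = line.split(",");
        return info.length == 3 ? info : null;
    }

    public static void main(String[] args) {
        String[] r = parse("fansy,22,blog.csdu.net/fansy1990");
        System.out.println("table1 row key: " + r[0] + r[1]); // name + age
        System.out.println("table2 value:   " + r[2]);        // webPage
    }
}
```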


ImportToHB.java:


package org.fansy.multipletables;


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
/**
* write to multiple tables
* @author fansy
*
*/
public class ImportToHB extends Configured implements Tool{



public static void main(String[] args) throws Exception {
int exitCode = ToolRunner.run(new ImportToHB(), args);
System.exit(exitCode);
}


@Override
public int run(String[] args) throws Exception {
if(args.length!=7){
System.err.println("wrong args length: "+args.length);
System.err.println("Usage: <input> <table1> <t1-fam> <t1-qua> <table2> <t2-fam> <t2-qua>");
System.exit(-1);
}
// use HBaseConfiguration so hbase-site.xml (ZooKeeper quorum, etc.) is picked up
Configuration conf = org.apache.hadoop.hbase.HBaseConfiguration.create();
conf.set("TABLE1", args[1]);
conf.set("T1-FAM", args[2]);
conf.set("T1-QUA", args[3]);
conf.set("TABLE2", args[4]);
conf.set("T2-FAM", args[5]);
conf.set("T2-QUA", args[6]);
Job job = new Job(conf);
job.setJarByClass(ImportToHB.class);
job.setMapperClass(MapperHB.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(Writable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
job.setOutputFormatClass(MultiTableOutputFormat.class);
job.setNumReduceTasks(0);


if(job.waitForCompletion(true)){
return 0;
}
return -1;
}
}
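The driver hands the table, family, and qualifier names to the mapper through the job Configuration. That string key/value round trip can be sketched with plain java.util.Properties as a stand-in for org.apache.hadoop.conf.Configuration (the table and column names here are made up for the demo):

```java
import java.util.Properties;

public class ConfRoundTripDemo {
    // driver side: stash per-table settings under fixed keys
    static Properties driverConf() {
        Properties conf = new Properties();
        conf.setProperty("TABLE1", "t_age"); // hypothetical table name
        conf.setProperty("T1-FAM", "info");
        conf.setProperty("T1-QUA", "age");
        return conf;
    }

    // mapper side, as in setup(): read a value back by the same key
    static String lookup(Properties conf, String key) {
        return conf.getProperty(key);
    }

    public static void main(String[] args) {
        Properties conf = driverConf();
        System.out.println(lookup(conf, "TABLE1")); // t_age
    }
}
```

The same pattern is why the string keys ("TABLE1", "T1-FAM", ...) must match exactly between run() and setup().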


MapperHB.java:


package org.fansy.multipletables;


import java.io.IOException;


import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Mapper;


public class MapperHB extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put>{
private byte[] table1;
private byte[] table2;
private byte[] t1_fam;
private byte[] t1_qua;
private byte[] t2_fam;
private byte[] t2_qua;

public void setup(Context context){
table1=Bytes.toBytes(context.getConfiguration().get("TABLE1"));
table2=Bytes.toBytes(context.getConfiguration().get("TABLE2"));
t1_fam=Bytes.toBytes(context.getConfiguration().get("T1-FAM"));
t1_qua=Bytes.toBytes(context.getConfiguration().get("T1-QUA"));
t2_fam=Bytes.toBytes(context.getConfiguration().get("T2-FAM"));
t2_qua=Bytes.toBytes(context.getConfiguration().get("T2-QUA"));
}

public void map(LongWritable key,Text value,Context context) throws IOException, InterruptedException{
String[] info=value.toString().split(",");
if(info.length!=3){
return;
}
String name=info[0];
String age=info[1];
String webPage=info[2];

// write to the first table: row = name+age, value = age
ImmutableBytesWritable putTable = new ImmutableBytesWritable(table1);
Put put = new Put(Bytes.toBytes(name + age));
put.add(t1_fam, t1_qua, Bytes.toBytes(age));
context.write(putTable, put);

// write to the second table (the listing is truncated in the original page;
// this completion follows the same pattern, with row = name+webPage inferred)
putTable = new ImmutableBytesWritable(table2);
Put put2 = new Put(Bytes.toBytes(name + webPage));
put2.add(t2_fam, t2_qua, Bytes.toBytes(webPage));
context.write(putTable, put2);
}
}
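The end-to-end effect of the mapper on the sample data can be simulated without a cluster. The in-memory maps below stand in for the two HBase tables; the table-2 row-key scheme (name + webPage) is an assumption mirroring the table-1 comment in map():

```java
import java.util.HashMap;
import java.util.Map;

public class TwoTableFanoutDemo {
    // emulate MultiTableOutputFormat: each input record fans out to two "tables"
    static Map<String, Map<String, String>> run(String[] lines) {
        Map<String, String> t1 = new HashMap<>();
        Map<String, String> t2 = new HashMap<>();
        for (String line : lines) {
            String[] info = line.split(",");
            if (info.length != 3) continue;   // skip malformed lines, like map() does
            String name = info[0], age = info[1], webPage = info[2];
            t1.put(name + age, age);          // table 1: row = name+age, value = age
            t2.put(name + webPage, webPage);  // table 2: assumed row = name+webPage
        }
        Map<String, Map<String, String>> tables = new HashMap<>();
        tables.put("table1", t1);
        tables.put("table2", t2);
        return tables;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> out = run(new String[] {
            "fansy,22,blog.csdu.net/fansy1990",
            "tom,25,blog.csdu.net/tom1987"
        });
        System.out.println(out.get("table1").get("fansy22")); // 22
    }
}
```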