设为首页 加入收藏

TOP

hive利用hbase实现更新
2018-11-26 01:05:21 】 浏览:116
Tags:hive 利用 hbase 实现 更新
全过程:
1. 建立hive表映射到hbase表(建表的时候不能建外表)
2. 之后修改hive表属性,企图将内表改为外表,失败,不支持
3. 向hive表插入数据,hive中有结果,hbase中也有结果
4. 在hbase中插入数据,hbase中有结果,hive中也有结果
5. 在hbase中更新数据,hbase中结果有变化,hive中结果也有变化
6. 用hive插入hbase中主键有重复的记录:在一次插入多条主键一样的记录时,只是将最后一条记录插入了;而且明明用的是overwrite,但是表里面原有的其他记录没有被覆盖;


以下是全过程:

========================================================================================================================


create external table test_hive_hbase
(a string,b string)
row format delimited
fields terminated by '\001'
lines terminated by '\n'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf1:val")
TBLPROPERTIES("hbase.table.name" = "test_hive_hbase");

提示有错误,hbase连同hive的时候不支持见hive 外表
FAILED: Error in metadata: MetaException(message:HBase table test_hive_hbase doesn't exist while the table is declared as an external table.)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

========================================================================================================================

create table test_hive_hbase
(a string,b string)
row format delimited
fields terminated by '\001'
lines terminated by '\n'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf1:val")
TBLPROPERTIES("hbase.table.name" = "test_hive_hbase");
成功
========================================================================================================================

alter table test_hive_hbase set TBLPROPERTIES ('EXTERNAL'='TRUE');
不能改成外表,但是在desc的时候,表是MANAGED_TABLE
FAILED: Error in metadata: Cannot use ALTER TABLE on a non-native table
========================================================================================================================

insert overwrite table test_hive_hbase
select round(rand()*100),a
from test_col

hive> select * from test_hive_hbase;
OK
13 liuhongchao
14 shiliyuan
25 shiliyuan
27 liuhongchao
45 liuhongchao
47 shiliyuan
54 liuhongchao
67 liuhongchao
72 liuhongchao
78 shiliyuan

========================================================================================================================

hbase(main):003:0> scan 'test_hive_hbase'
ROW COLUMN+CELL
13 column=cf1:val, timestamp=1369127997228, value=liuhongchao
14 column=cf1:val, timestamp=1369127997228, value=shiliyuan
25 column=cf1:val, timestamp=1369127999973, value=shiliyuan
27 column=cf1:val, timestamp=1369127997231, value=liuhongchao
45 column=cf1:val, timestamp=1369127997231, value=liuhongchao
47 column=cf1:val, timestamp=1369127999973, value=shiliyuan
54 column=cf1:val, timestamp=1369127997231, value=liuhongchao
67 column=cf1:val, timestamp=1369127997231, value=liuhongchao
72 column=cf1:val, timestamp=1369127997231, value=liuhongchao
78 column=cf1:val, timestamp=1369127997231, value=shiliyuan

========================================================================================================================

hbase插入数据,在hive中也增加了
put'test_hive_hbase',79,'cf1:val','zhouweiping'

hbase(main):007:0> scan 'test_hive_hbase'
ROW COLUMN+CELL
13 column=cf1:val, timestamp=1369127997228, value=liuhongchao
14 column=cf1:val, timestamp=1369127997228, value=shiliyuan
25 column=cf1:val, timestamp=1369127999973, value=shiliyuan
27 column=cf1:val, timestamp=1369127997231, value=liuhongchao
45 column=cf1:val, timestamp=1369127997231, value=liuhongchao
47 column=cf1:val, timestamp=1369127999973, value=shiliyuan
54 column=cf1:val, timestamp=1369127997231, value=liuhongchao
67 column=cf1:val, timestamp=1369127997231, value=liuhongchao
72 column=cf1:val, timestamp=1369127997231, value=liuhongchao
78 column=cf1:val, timestamp=1369127997231, value=shiliyuan
79 column=cf1:val, timestamp=1369128399282, value=zhouweiping

hive> select * from test_hive_hbase;
OK
13 liuhongchao
14 shiliyuan
25 shiliyuan
27 liuhongchao
45 liuhongchao
47 shiliyuan
54 liuhongchao
67 liuhongchao
72 liuhongchao
78 shiliyuan
79 zhouweiping
========================================================================================================================

修改hbase中的记录,hive中的记录也跟着变了
hbase(main):008:0> put'test_hive_hbase',79,'cf1:val','zhouweiping1'
0 row(s) in 3.9380 seconds

hbase(main):009:0> scan 'test_hive_hbase'
ROW COLUMN+CELL
13 column=cf1:val, timestamp=1369127997228, value=liuhongchao
14 column=cf1:val, timestamp=1369127997228, value=shiliyuan
25 column=cf1:val, timestamp=1369127999973, value=shiliyuan
27 column=cf1:val, timestamp=1369127997231, value=liuhongchao
45 column=cf1:val, timestamp=1369127997231, value=liuhongchao
47 column=cf1:val, timestamp=1369127999973, value=shiliyuan
54 column=cf1:val, timestamp=1369127997231, value=liuhongchao
67 column=cf1:val, timestamp=1369127997231, value=liuhongchao
72 column=cf1:val, timestamp=1369127997231, value=liuhongchao
78 column=cf1:val, timestamp=1369127997231, value=shiliyuan
79 column=cf1:val, timestamp=1369128548928, value=zhouweiping1


hive> select * from test_hive_hbase;
OK
13 liuhongchao
14 shiliyuan
25 shiliyuan
27 liuhongchao
45 liuhongchao
47 shiliyuan
54 liuhongchao
67 liuhongchao
72 liuhongchao
78 shiliyuan
79 zhouweiping1

========================================================================================================================

用hive插入hbase中主键有重复的
在一次插入多条主键一样的记录时,只是将最后一条记录插入了;而且明明用的是overwrite,但是表里面的其他记录没有被覆盖;
insert overwrite table test_hive_hbase
select 1,a
from test_col
hive> select * from test_hive_hbase;
OK
1 shiliyuan
13 liuhongchao
14 shiliyuan
25 shiliyuan
27 liuhongchao
45 liuhongchao
47 shiliyuan
54 liuhongchao
67 liuhongchao
72 liuhongchao
78 shiliyuan
79 zhouweiping1
结果只能是只插入了一条,而且原来的记录也没有被覆盖;另外新增的那条记录是select结果中的最后一条,后来居上;

insert overwrite table test_hive_hbase
select 78,'zhouweiping'
from test_col limit 1


hive> select * from test_hive_hbase;
OK
1 shiliyuan
13 liuhongchao
14 shiliyuan
25 shiliyuan
27 liuhongchao
45 liuhongchao
47 shiliyuan
54 liuhongchao
67 liuhongchao
72 liuhongchao
78 zhouweiping
79 zhouweiping1

显然,把原来78那条记录覆盖了


结论:

可以通过hive初始化数据,hbase更新数据(也可以直接用hive将新的数据insert overwrite 到hive表,这个同样能实现更新,并且原来的记录不会被覆盖);
但是hive的表不能是外表,也不支持分区表

】【打印繁体】【投稿】【收藏】 【推荐】【举报】【评论】 【关闭】 【返回顶部
上一篇Hive-异常处理Hive Schema versio.. 下一篇【Hadoop】Hive搭建

最新文章

热门文章

Hot 文章

Python

C 语言

C++基础

大数据基础

linux编程基础

C/C++面试题目