Importing HBase 1.2 Data into HBase 2.0
2019-02-12 13:36:43
Tags: HBase 1.2, data, import, 2.0
Copyright notice: original article. Reposting is welcome; please credit the source: https://blog.csdn.net/zhangshenghang/article/details/82594143

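Scenario: we had a batch of previously exported data. The hbck tool changed in version 2.0, so the data could not be imported directly, and running export/import across data centers would have meant re-transferring the data over the public network, which is slow. We therefore set up a temporary HBase 1.2 cluster and ran export/import within the same data center.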

1. Import the data into HBase 1.2

On CDH the hbase user cannot log in by default. Change its login shell so that it can (remember to change it back once you are done).

## Not all entries are shown here
[root@test ~]# vim /etc/passwd
cloudera-scm:x:997:995:Cloudera Manager:/var/lib/cloudera-scm-server:/sbin/nologin
mysql:x:27:27:MariaDB Server:/var/lib/mysql:/sbin/nologin
flume:x:996:993:Flume:/var/lib/flume-ng:/bin/false
hdfs:x:995:992:Hadoop HDFS:/var/lib/hadoop-hdfs:/bin/bash
solr:x:994:991:Solr:/var/lib/solr:/sbin/nologin
zookeeper:x:993:990:ZooKeeper:/var/lib/zookeeper:/bin/false
llama:x:992:989:Llama:/var/lib/llama:/bin/bash
httpfs:x:991:988:Hadoop HTTPFS:/var/lib/hadoop-httpfs:/bin/bash
mapred:x:990:987:Hadoop MapReduce:/var/lib/hadoop-mapreduce:/bin/bash
sqoop:x:989:986:Sqoop:/var/lib/sqoop:/bin/false
yarn:x:988:985:Hadoop Yarn:/var/lib/hadoop-yarn:/bin/bash
kms:x:987:984:Hadoop KMS:/var/lib/hadoop-kms:/bin/bash
hive:x:986:983:Hive:/var/lib/hive:/bin/false
sqoop2:x:985:982:Sqoop 2 User:/var/lib/sqoop2:/sbin/nologin
oozie:x:984:981:Oozie User:/var/lib/oozie:/bin/false
kudu:x:983:980:Kudu:/var/lib/kudu:/sbin/nologin
hbase:x:982:979:HBase:/var/lib/hbase:/bin/false
sentry:x:981:978:Sentry:/var/lib/sentry:/sbin/nologin
impala:x:980:977:Impala:/var/lib/impala:/bin/bash
spark:x:979:976:Spark:/var/lib/spark:/sbin/nologin
hue:x:978:975:Hue:/usr/lib/hue:/bin/false
ntp:x:38:38::/etc/ntp:/sbin/nologin

Change

hbase:x:982:979:HBase:/var/lib/hbase:/bin/false

to

hbase:x:982:979:HBase:/var/lib/hbase:/bin/bash
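
Editing /etc/passwd by hand is easy to get wrong. As a minimal sketch of the same change, the function below rewrites only the login-shell field (the seventh colon-separated field) for one user in a passwd-format file. The name set_login_shell and the scratch file are hypothetical, introduced only for illustration. On a real host the usual tool would be usermod -s /bin/bash hbase, run as root.

```shell
#!/bin/sh
# Rewrite the login shell (7th field) of one user in a passwd-format file.
# set_login_shell is an illustrative helper, not part of any distribution.
set_login_shell() {
    file=$1; user=$2; shell=$3
    awk -F: -v OFS=: -v u="$user" -v s="$shell" \
        '$1 == u { $7 = s } { print }' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Try it on a scratch copy, never directly on /etc/passwd.
scratch=$(mktemp)
printf '%s\n' \
    'hbase:x:982:979:HBase:/var/lib/hbase:/bin/false' \
    'hue:x:978:975:Hue:/usr/lib/hue:/bin/false' > "$scratch"
set_login_shell "$scratch" hbase /bin/bash
cat "$scratch"
```

Only the hbase entry changes; every other line is printed back untouched, which makes it simple to reverse the change afterwards by calling the function again with /bin/false.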

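Switch to the hbase user and load the data into HBase 1.2. Use the following command to copy the data from the local Linux filesystem into the corresponding HDFS directory: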

 hdfs dfs -copyFromLocal /linux/dic/table1 /hdfs/hbase/table1  

Once the copy has succeeded, repair the meta table:

hbase hbck -fixMeta -fixAssignments

2. Export the HBase 1.2 data to the HBase 2.0 cluster

 hbase org.apache.hadoop.hbase.mapreduce.Export <name of the HBase table to export> <output location>

The output location can be an HDFS path on another cluster or a path on the local Linux filesystem.
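
To make the argument order concrete, here is a small dry-run sketch that only prints an Export command line instead of running it. The helper name export_cmd is hypothetical. Besides the table name and output directory, Export also accepts optional trailing arguments for versions, start time and end time, which the helper simply passes through.

```shell
#!/bin/sh
# Print (do not run) an Export command line.
# Usage: export_cmd <table> <outputdir> [<versions> [<starttime> [<endtime>]]]
# export_cmd is a hypothetical helper; table and NameNode address are placeholders.
export_cmd() {
    printf 'hbase org.apache.hadoop.hbase.mapreduce.Export'
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Output directory on another cluster's HDFS:
export_cmd HbasetableName hdfs://xxxxx:8020/hbase/HbasetableName
```

Printing the command first is a cheap way to check the table name and target URI before launching a MapReduce job that may run for hours.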

Error: insufficient permissions

[hbase@test ~]$ hbase org.apache.hadoop.hbase.mapreduce.Export  HbasetableName hdfs://xxxxx:8020/hbase/HbasetableName
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
18/09/10 21:39:53 INFO mapreduce.Export: versions=1, starttime=0, endtime=9223372036854775807, keepDeletedCells=false
18/09/10 21:39:54 INFO client.RMProxy: Connecting to ResourceManager at fwqml006.zh/10.248.161.16:8032
18/09/10 21:39:54 WARN security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3770)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3753)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3735)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6723)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4493)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4463)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4436)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:876)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:326)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)

Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3770)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3753)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3735)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6723)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4493)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4463)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4436)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:876)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:326)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3120)
        at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:3085)
        at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1004)
        at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1000)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1000)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:992)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:133)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:148)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1325)
        at org.apache.hadoop.hbase.mapreduce.Export.main(Export.java:188)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hbase, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3770)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3753)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3735)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6723)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4493)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4463)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4436)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:876)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:326)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)

        at org.apache.hadoop.ipc.Client.call(Client.java:1504)
        at org.apache.hadoop.ipc.Client.call(Client.java:1441)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:573)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy12.mkdirs(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3118)
        ... 16 more

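The message says that the hbase user lacks write permission on the /user directory in HDFS. Change the permissions on /user; for convenience, hdfs dfs -chmod -R 777 was used here. Remember to restore the previous permissions once the export has finished.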
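
A narrower alternative to opening up all of /user is to give the hbase user its own HDFS home directory. This is a sketch under two assumptions: that the job's staging directory lives under /user/hbase (which the JobSubmissionFiles.getStagingDir frame in the trace suggests), and that an hbase group exists. The helper name home_dir_cmds is hypothetical, and it only prints the commands, which would then be run as the hdfs superuser.

```shell
#!/bin/sh
# Print the HDFS commands that create a home directory for one user.
# home_dir_cmds is a hypothetical helper; it assumes a group with the
# same name as the user. Run the printed commands as the hdfs superuser.
home_dir_cmds() {
    user=$1
    printf 'hdfs dfs -mkdir -p /user/%s\n' "$user"
    printf 'hdfs dfs -chown %s:%s /user/%s\n' "$user" "$user" "$user"
}

home_dir_cmds hbase
```

With this approach there are no permissions on /user to restore afterwards.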

This then failed with a JDK environment error:

[hdfs@test ~]$ hdfs dfs -chmod -R 777 /user
ERROR: JAVA_HOME /home/xxx/tools/jdk1.8.0_101 does not exist.
Fix the JAVA_HOME environment variable in ~/.bashrc.
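
Before editing ~/.bashrc it can help to confirm that a candidate JAVA_HOME is actually usable. The sketch below checks that the directory contains an executable bin/java. The helper name check_java_home is hypothetical, and the path is the one from the error message above.

```shell
#!/bin/sh
# Report whether a candidate JAVA_HOME is usable: it must contain an
# executable bin/java. check_java_home is a hypothetical helper.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# The path from the error message; expected to fail on most machines.
check_java_home /home/xxx/tools/jdk1.8.0_101 || true
```

Once a path passes the check, set it with a line such as export JAVA_HOME=<that path> in ~/.bashrc and open a new shell.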

Running the export again succeeded:

 hbase org.apache.hadoop.hbase.mapreduce.Export  HbasetableName hdfs://xxxxx:8020/hbase/HbasetableName

The data can now be seen on the target cluster:

[hbase@fwqzx002 ~]$ hdfs dfs -ls /hbase/xxx/xxx
Found 12 items
-rw-r--r--   1 hbase hbase           0 2018-09-10 23:04 /hbase/tt_user/offline_user/_SUCCESS
-rw-r--r--   1 hbase hbase 47065003510 2018-09-10 23:04 /hbase/xxx/xxx/part-m-00003
-rw-r--r--   1 hbase hbase 23541633987 2018-09-10 22:24 /hbase/xxx/xxx/part-m-00004
-rw-r--r--   1 hbase hbase 23532345447 2018-09-10 22:40 /hbase/xxx/xxx/part-m-00005
-rw-r--r--   1 hbase hbase 23551359671 2018-09-10 22:39 /hbase/xxx/xxx/part-m-00006
-rw-r--r--   1 hbase hbase 23522350569 2018-09-10 22:25 /hbase/xxx/xxx/part-m-00007
-rw-r--r--   1 hbase hbase 23544202929 2018-09-10 22:48 /hbase/xxx/xxx/part-m-00008
-rw-r--r--   1 hbase hbase 23529537743 2018-09-10 22:40 /hbase/xxx/xxx/part-m-00009
-rw-r--r--   1 hbase hbase 11749139280 2018-09-10 22:36 /hbase/xxx/xxx/part-m-00010
-rw-r--r--   1 hbase hbase 11754855832 2018-09-10 22:36 /hbase/xxx/xxx/part-m-00011
-rw-r--r--   1 hbase hbase 11775381448 2018-09-10 22:27 /hbase/xxx/xxx/part-m-00012
-rw-r--r--   1 hbase hbase 11767607324 2018-09-10 22:35 /hbase/xxx/xxx/part-m-00013
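
A listing like this can be checked mechanically. The sketch below reads hdfs dfs -ls output on standard input and reports whether the _SUCCESS marker is present, how many part-m-* files there are, and their total size in bytes. The helper name summarize_listing is hypothetical, and the sample sizes are made up.

```shell
#!/bin/sh
# Summarise `hdfs dfs -ls` output read from stdin: _SUCCESS marker present,
# number of part-m-* files, and their total size in bytes (5th column).
# summarize_listing is a hypothetical helper.
summarize_listing() {
    awk '
        $NF ~ /\/_SUCCESS$/      { ok = 1 }
        $NF ~ /\/part-m-[0-9]+$/ { n++; bytes += $5 }
        END { printf "success=%d parts=%d bytes=%.0f\n", ok, n, bytes }
    '
}

# Lines in the format shown above, with made-up sizes.
summarize_listing <<'EOF'
Found 2 items
-rw-r--r--   1 hbase hbase    0 2018-09-10 23:04 /hbase/xxx/xxx/_SUCCESS
-rw-r--r--   1 hbase hbase 1500 2018-09-10 22:24 /hbase/xxx/xxx/part-m-00004
EOF
```

On a real cluster the input would come from a pipe, for example hdfs dfs -ls /hbase/xxx/xxx | summarize_listing.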

3. Import the data into HBase 2.0

 hbase org.apache.hadoop.hbase.mapreduce.Import -Dmapred.job.queue.name=etl crawl:wechat_biz /hbase/test4
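
The command above takes a YARN queue, a target table and an input directory. As a dry-run sketch, the hypothetical helper import_cmd below prints the same command using mapreduce.job.queuename, the current Hadoop name for the property that the older mapred.job.queue.name maps to. Both spellings should be accepted by Hadoop 2.x; the older one is the deprecated form.

```shell
#!/bin/sh
# Print (do not run) an Import command that submits to a given YARN queue.
# import_cmd is a hypothetical helper; queue, table and input directory
# are taken from the command shown above.
import_cmd() {
    queue=$1; table=$2; indir=$3
    printf 'hbase org.apache.hadoop.hbase.mapreduce.Import -Dmapreduce.job.queuename=%s %s %s\n' \
        "$queue" "$table" "$indir"
}

import_cmd etl crawl:wechat_biz /hbase/test4
```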

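Small data sets import without problems. With large data sets HBase cannot keep up with the write rate, mainly because the memstore grows too large while regions are splitting, and the job fails with RetriesExhaustedWithDetailsException.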

Solution: https://blog.csdn.net/zhangshenghang/article/details/82621101

Running the import again succeeded.

