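Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/moquancsdn/article/details/81700320
This series of guides builds its environment on a real cluster, not a pseudo-distributed one; it uses three Tencent Cloud servers.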
HBase
Configuration
- Edit hbase-env.sh
export JAVA_HOME=/opt/java/jdk1.8
export HADOOP_HOME=/opt/hadoop/hadoop2.8
export HBASE_HOME=/opt/hbase/hbase1.2
export HBASE_CLASSPATH=/opt/hadoop/hadoop2.8/etc/hadoop
export HBASE_PID_DIR=/root/hbase/pids
export HBASE_MANAGES_ZK=false
- Edit hbase-site.xml (the properties go inside the <configuration> element)
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>120000</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>150000</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave1,slave2</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/root/hbase/tmp</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>master:60000</value>
</property>
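- Edit regionservers
This file defines HBase's master/slave relationship, much like the master/slave configuration in Hadoop.
Adding slave1 and slave2 is enough.
Sync the configuration from the master to the slaves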
scp -r /opt/hbase root@slave1:/opt
scp -r /opt/hbase root@slave2:/opt
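Startup
Run starthbase (an alias), then run jps on each host to check the processes.
On master the process is HMaster; on slave1 it is HRegionServer.
Open the web UI at http://134.175.xxx.xxx:16010; if both slaves are listed, the configuration succeeded.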
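Using HBase
HBase does not provide joins between tables and is weak at handling relationships between them. If you need to handle very complex relational logic, HBase is not a good fit.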
HBase requires ZooKeeper in order to run.
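HBase table structure
Conventional table structure
The columns of a table are defined in advance. Adding a column later means altering the schema of the whole table, which is tedious and also wastes storage space.
HBase table structure
A table is defined with column families, not columns. When data is inserted, a column family can hold any number of columns (key-value pairs: column name and column value).
To read a single value you specify its coordinates: table name -> row key -> column family (ColumnFamily):column name (Qualifier) -> version. If no version is given, the value of the latest version is returned.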
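The coordinate scheme above can be pictured as nested sorted maps. The following is a minimal sketch using only the Java standard library; the class `CellCoordinatesSketch` and its methods are hypothetical names for illustration and are not part of the HBase API.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of one HBase table; not part of the HBase API.
final class CellCoordinatesSketch {
    // rowKey -> "family:qualifier" -> timestamp -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // No column has to be declared in advance: a new qualifier just becomes a new map key.
    void put(String rowKey, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier, k -> new TreeMap<>())
            .put(timestamp, value);
    }

    // No version given: return the newest one, or null if the cell does not exist.
    String get(String rowKey, String family, String qualifier) {
        NavigableMap<Long, String> versions = versions(rowKey, family, qualifier);
        return versions == null || versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    // Explicit version: return the value stored at exactly this timestamp, or null.
    String get(String rowKey, String family, String qualifier, long timestamp) {
        NavigableMap<Long, String> versions = versions(rowKey, family, qualifier);
        return versions == null ? null : versions.get(timestamp);
    }

    private NavigableMap<Long, String> versions(String rowKey, String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(family + ":" + qualifier);
    }

    public static void main(String[] args) {
        CellCoordinatesSketch table = new CellCoordinatesSketch();
        table.put("rk001", "baseInfo", "name", 1L, "one");
        table.put("rk001", "baseInfo", "name", 2L, "two");
        System.out.println(table.get("rk001", "baseInfo", "name"));     // prints "two"
        System.out.println(table.get("rk001", "baseInfo", "name", 1L)); // prints "one"
    }
}
```

Real HBase stores everything as byte arrays and limits the number of versions kept per column family; the sketch uses strings and keeps every version for readability.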
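A huge table is split into many regions. Regions are hosted on servers, and those servers are called region servers.
In the figure from the original post (not reproduced here) there are two regions of table1 and one region of table2. Each region's data is stored on HDFS as one or more HFiles per column family.
In practice, to keep data access local, a region server and a DataNode usually run on the same machine.
To manage the many region servers, HBase provides the HMaster. It does not store table data; it tracks the state of the region servers and handles load balancing.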
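Table addressing mechanism
The location of the root table is managed by ZooKeeper (since HBase 0.96, which includes the 1.2 release used here, the -ROOT- table no longer exists and ZooKeeper stores the location of hbase:meta directly). ZooKeeper only becomes fault tolerant with an ensemble of three or more nodes, so using HBase in practice depends on a cluster.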
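Once the client knows where the catalog table lives, finding the region for a row key is essentially a sorted lookup on region start keys. The following is a simplified sketch with the catalog modeled as a `TreeMap`; `RegionLookupSketch`, `addRegion` and `locate` are hypothetical names, and string comparison stands in for HBase's byte comparison (equivalent only for plain ASCII keys).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of row-key-to-region lookup; not the HBase client implementation.
final class RegionLookupSketch {
    // start row key of each region -> region server hosting it
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String server) {
        regionsByStartKey.put(startKey, server);
    }

    // The region responsible for a row is the one with the greatest start key <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("no region covers " + rowKey);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch catalog = new RegionLookupSketch();
        catalog.addRegion("", "slave1");      // the first region starts at the empty key
        catalog.addRegion("rk500", "slave2"); // the second region starts at rk500
        System.out.println(catalog.locate("rk001")); // prints "slave1"
        System.out.println(catalog.locate("rk777")); // prints "slave2"
    }
}
```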
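HBase system architecture
The MemStore is key to HBase's fast random access: it keeps the most recently written data sorted in memory until it is flushed to an HFile. Frequently read data is cached separately, in the BlockCache.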
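The write-buffer idea can be sketched in a few lines. This is a toy model under stated assumptions: the class `MemStoreSketch` is hypothetical, the flush threshold counts entries (HBase uses a size in bytes, set by hbase.hregion.memstore.flush.size), and flushed "files" are just immutable in-memory maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of a sorted write buffer that flushes to immutable sorted files.
final class MemStoreSketch {
    private final int flushThreshold; // entries held in memory before a flush (assumed unit)
    private NavigableMap<String, String> memStore = new TreeMap<>();
    private final List<NavigableMap<String, String>> flushedFiles = new ArrayList<>(); // oldest first

    MemStoreSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Writes only touch the in-memory sorted map.
    void put(String key, String value) {
        memStore.put(key, value);
        if (memStore.size() >= flushThreshold) {
            flush();
        }
    }

    // Freeze the current buffer as an immutable "file" and start a fresh buffer.
    private void flush() {
        flushedFiles.add(Collections.unmodifiableNavigableMap(memStore));
        memStore = new TreeMap<>();
    }

    // Reads check memory first, then the flushed files from newest to oldest.
    String get(String key) {
        if (memStore.containsKey(key)) {
            return memStore.get(key);
        }
        for (int i = flushedFiles.size() - 1; i >= 0; i--) {
            String value = flushedFiles.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int flushedFileCount() {
        return flushedFiles.size();
    }

    public static void main(String[] args) {
        MemStoreSketch store = new MemStoreSketch(2);
        store.put("rk001", "one");
        store.put("rk002", "two");  // threshold reached, buffer is flushed
        store.put("rk001", "uno");  // newer value stays in memory
        System.out.println(store.get("rk001"));       // prints "uno"
        System.out.println(store.get("rk002"));       // prints "two"
        System.out.println(store.flushedFileCount()); // prints 1
    }
}
```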
HBase shell operations
create 'table_name',{NAME=>'family1',VERSIONS=>1},{NAME=>'family2'}
describe 'table_name'
The shell is not very convenient for scripting; the Java API is recommended instead.
HBase+Eclipse
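When configuring Eclipse, make sure the necessary XML property files are present. Using Maven is recommended.
If you add the dependencies by hand instead of using Maven, the original post showed the resulting project directory tree in a screenshot at this point.
Note that some configuration files must be added to the project, for example hbase-site.xml; these files are copied from the cluster's configuration.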
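The client needs hbase-site.xml mainly to learn where ZooKeeper is (hbase.zookeeper.quorum and the client port). To illustrate what the file carries, here is a standard-library sketch that reads name/value pairs from XML in the same layout. `SiteXmlSketch` is a hypothetical helper; real client code does not parse the file itself but calls HBaseConfiguration.create(), which loads hbase-site.xml from the classpath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper that reads <property><name/><value/></property> entries.
final class SiteXmlSketch {
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            // Wrap parser errors so callers do not need to handle checked exceptions.
            throw new IllegalArgumentException("not a valid site XML document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>hbase.zookeeper.quorum</name><value>master,slave1,slave2</value></property>"
                + "<property><name>hbase.zookeeper.property.clientPort</name><value>2181</value></property>"
                + "</configuration>";
        Map<String, String> props = parse(xml);
        System.out.println(props.get("hbase.zookeeper.quorum"));              // prints "master,slave1,slave2"
        System.out.println(props.get("hbase.zookeeper.property.clientPort")); // prints "2181"
    }
}
```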
Sample code (the code is not fully commented yet)
package cn.colony.cloudhadoop.hbase;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator;
import org.apache.hadoop.hbase.filter.ByteArrayComparable;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.MultipleColumnPrefixFilter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.filter.RegexStringComparator;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Before;
import org.junit.Test;
import org.apache.hadoop.hbase.TableName;

public class HbaseDemo {

    // Client configuration shared by the tests; create() loads hbase-site.xml from the classpath.
    private Configuration testConf = null;

    @Before
    public void init() {
        testConf = HBaseConfiguration.create();
    }

    @Test
    public void testCreate() throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        // Connection and Admin are Closeable; try-with-resources releases both.
        try (Connection connection = ConnectionFactory.createConnection(testConf);
             Admin admin = connection.getAdmin()) {
            TableName name = TableName.valueOf("testTable");
            HTableDescriptor desc = new HTableDescriptor(name);
            // A table is created with column families, not columns.
            HColumnDescriptor cd = new HColumnDescriptor("baseInfo");
            desc.addFamily(cd);
            admin.createTable(desc);
        }
    }

    @Test
    public void testDrop() throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        try (Connection connection = ConnectionFactory.createConnection(testConf);
             Admin admin = connection.getAdmin()) {
            TableName tablename = TableName.valueOf("testTable");
            // A table must be disabled before it can be deleted.
            admin.disableTable(tablename);
            admin.deleteTable(tablename);
        }
    }

    @Test
    public void testPut() throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(testConf);
             Table table = connection.getTable(TableName.valueOf("testTable"))) {
            // Row key rk001, column baseInfo:name
            Put p = new Put(Bytes.toBytes("rk001"));
            p.addColumn(Bytes.toBytes("baseInfo"), Bytes.toBytes("name"), Bytes.toBytes("dingqimin"));
            table.put(p);
        }
    }

    @Test
    public void testInsert() throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(testConf);
             Table table = connection.getTable(TableName.valueOf("testTable"))) {
            // Do not use String.getBytes(): it depends on the platform charset and may cause
            // encoding errors. Bytes.toBytes(String) always encodes as UTF-8.
            Put name = new Put(Bytes.toBytes("rk001"));
            name.addColumn(Bytes.toBytes("baseInfo"), Bytes.toBytes("name"), Bytes.toBytes("one"));
            Put age = new Put(Bytes.toBytes("rk001"));
            age.addColumn(Bytes.toBytes("baseInfo"), Bytes.toBytes("age"), Bytes.toBytes("6"));
            // Send both puts in one batch.
            List<Put> puts = new ArrayList<>();
            puts.add(name);
            puts.add(age);
            table.put(puts);
        }
    }

    @Test
    public void testGet() throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(testConf);
             Table table = connection.getTable(TableName.valueOf("testTable"))) {
            Get get = new Get(Bytes.toBytes("rk001"));
            // get.setMaxVersions(3);
            Result result = table.get(get);
            // listCells() returns null when the row has no cells.
            List<Cell> cells = result.listCells();
            if (cells == null) {
                return;
            }
            // CellUtil.getRowByte(Cell, int) returns a single byte, so printing it looks garbled.
            // getRow() and friends are deprecated in HBase 1.x in favor of CellUtil.cloneRow(cell) etc.
            for (Cell cell : cells) {
                System.out.println(Bytes.toString(cell.getRow()));
                System.out.println(Bytes.toString(cell.getFamily()));
                System.out.println(Bytes.toString(cell.getQualifier()));
                System.out.println(Bytes.toString(cell.getValue()));
            }
        }
    }

    @Test
    public void testScan() throws IOException {
        // The filters below are alternatives shown side by side; only the last value assigned
        // to `filter` is applied. They assume a table with families such as base_info / info,
        // whereas testTable above only has baseInfo.

        // Scan rows in [startRow, stopRow); the stop row is exclusive.
        Scan scan = new Scan(Bytes.toBytes("person_rk_bj_zhang_000001"), Bytes.toBytes("person_rk_bj_zhang_000002"));

        // Prefix filter: matches on the row key
        Filter filter = new PrefixFilter(Bytes.toBytes("rk"));

        // Row filter
        ByteArrayComparable rowComparator = new BinaryComparator(Bytes.toBytes("person_rk_bj_zhang_000001"));
        RowFilter rf = new RowFilter(CompareOp.LESS_OR_EQUAL, rowComparator);
        rf = new RowFilter(CompareOp.EQUAL, new SubstringComparator("_2014-12-21_"));

        // Single column value filter 1: exact match on the byte array
        filter = new SingleColumnValueFilter(Bytes.toBytes("base_info"), Bytes.toBytes("name"), CompareOp.EQUAL, Bytes.toBytes("zhangsan"));
        // Single column value filter 2: match a regular expression
        ByteArrayComparable comparator = new RegexStringComparator("zhang.");
        filter = new SingleColumnValueFilter(Bytes.toBytes("info"), Bytes.toBytes("NAME"), CompareOp.EQUAL, comparator);
        // Single column value filter 3: match a substring, case-insensitive
        comparator = new SubstringComparator("wu");
        filter = new SingleColumnValueFilter(Bytes.toBytes("info"), Bytes.toBytes("NAME"), CompareOp.EQUAL, comparator);

        // Key-value metadata filter, family, exact byte-array match
        FamilyFilter ff = new FamilyFilter(
                CompareOp.EQUAL,
                new BinaryComparator(Bytes.toBytes("inf"))); // no family named inf exists, so the result is empty
        // Key-value metadata filter, family, byte-array prefix match
        ff = new FamilyFilter(
                CompareOp.EQUAL,
                new BinaryPrefixComparator(Bytes.toBytes("inf"))); // family info starts with inf, so all its rows match

        // Key-value metadata filter, qualifier, exact byte-array match
        filter = new QualifierFilter(
                CompareOp.EQUAL,
                new BinaryComparator(Bytes.toBytes("na"))); // no column named na exists, so the result is empty
        filter = new QualifierFilter(
                CompareOp.EQUAL,
                new BinaryPrefixComparator(Bytes.toBytes("na"))); // column name starts with na, so that column of every row matches

        // ColumnPrefixFilter: filter by a column name (qualifier) prefix
        filter = new ColumnPrefixFilter(Bytes.toBytes("na"));
        // MultipleColumnPrefixFilter: filter by several column name (qualifier) prefixes
        byte[][] prefixes = new byte[][] {Bytes.toBytes("na"), Bytes.toBytes("me")};
        filter = new MultipleColumnPrefixFilter(prefixes);

        // Attach the filter to the scan
        scan.setFilter(filter);
        scan.addFamily(Bytes.toBytes("base_info"));

        try (Connection connection = ConnectionFactory.createConnection(testConf);
             Table table = connection.getTable(TableName.valueOf("testTable"));
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result r : scanner) {
                // Read one specific value straight from the Result
                byte[] value = r.getValue(Bytes.toBytes("base_info"), Bytes.toBytes("name"));
                // Bytes.toString tolerates a missing (null) value
                System.out.println(Bytes.toString(value));
            }
        }
    }

    public static void createTable(String tableName, String[] familyNames, int[] maxVersions, Configuration conf) throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            TableName tablename = TableName.valueOf(tableName);
            HTableDescriptor desc = new HTableDescriptor(tablename);
            // One column family per entry, each with its own maximum number of versions
            for (int index = 0; index < familyNames.length; index++) {
                HColumnDescriptor cd = new HColumnDescriptor(familyNames[index]);
                cd.setMaxVersions(maxVersions[index]);
                desc.addFamily(cd);
            }
            admin.createTable(desc);
        }
    }

    public static void dropTable(String tableName, Configuration conf) throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            TableName tablename = TableName.valueOf(tableName);
            // A table must be disabled before it can be deleted.
            admin.disableTable(tablename);
            admin.deleteTable(tablename);
        }
    }

    public static void putData(String tableName, String rowKey, String columnFamily, String qualifier, String value, Configuration conf) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf(tableName))) {
            Put p = new Put(Bytes.toBytes(rowKey));
            // Bytes.toBytes always encodes as UTF-8, unlike String.getBytes()
            p.addColumn(Bytes.toBytes(columnFamily), Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(p);
        }
    }

    public static void main(String[] args) throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        Configuration myConf = HBaseConfiguration.create();
        // String tableName = "testTable";
        // String[] familyNames = {"baseInfo", "extraInfo"};
        // int[] maxVersions = {2, 2};
        // createTable(tableName, familyNames, maxVersions, myConf);
        dropTable("testTable", myConf);
    }
}
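The start/stop rows and the PrefixFilter used in testScan both rely on row keys being sorted as raw bytes. Here is a small standard-library sketch of that ordering, of the half-open scan range, and of a prefix check. `ScanRangeSketch` and its methods are hypothetical and only imitate the behavior; they do not call HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical imitation of Scan(startRow, stopRow) plus a row key prefix check.
final class ScanRangeSketch {
    // Unsigned lexicographic byte comparison, the order in which HBase sorts row keys.
    static final Comparator<byte[]> BYTE_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the row keys in [startRow, stopRow) that also begin with prefix, in byte order.
    static List<String> scan(List<String> rowKeys, String startRow, String stopRow, String prefix) {
        TreeMap<byte[], String> sorted = new TreeMap<>(BYTE_ORDER);
        for (String key : rowKeys) {
            sorted.put(bytes(key), key);
        }
        List<String> hits = new ArrayList<>();
        for (String key : sorted.subMap(bytes(startRow), true, bytes(stopRow), false).values()) {
            if (key.startsWith(prefix)) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("rk10", "rk2", "rk001", "user1");
        System.out.println(scan(keys, "rk", "rl", "rk"));  // prints [rk001, rk10, rk2]
        System.out.println(scan(keys, "rk", "rk2", "rk")); // prints [rk001, rk10]
    }
}
```

Note that "rk10" sorts before "rk2": the order is byte by byte, not numeric, which is why row keys with numeric parts are usually zero-padded, as in rk001.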