Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/hochoy/article/details/82684340
ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row - even the current one. The TTL time encoded in HBase for the row is specified in UTC.
Store files which contain only expired rows are deleted on minor compaction. Setting hbase.store.delete.expired.storefile to false disables this feature. Setting the minimum number of versions to a value other than 0 also disables this.
See HColumnDescriptor for more information.
Recent versions of HBase also support setting time to live on a per-cell basis. See HBASE-10560 for more information. Cell TTLs are submitted as an attribute on mutation requests (Appends, Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied to all cells updated on the server by the operation. There are two notable differences between cell TTL handling and ColumnFamily TTLs, listed below.
40. Time To Live (TTL)
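Differences between cell TTLs and ColumnFamily TTLs:
- A ColumnFamily TTL is expressed in seconds; a cell TTL is expressed in milliseconds.
- If a cell-level TTL is set, it overrides the ColumnFamily TTL for that cell, but it cannot extend the cell's lifetime beyond the ColumnFamily TTL.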
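The two differences above are easy to trip over when both kinds of TTL are in play. As a rough sketch (the function name `effective_ttl_ms` and the sample values are made up for illustration and are not part of HBase), the following shell arithmetic converts a ColumnFamily TTL from seconds to milliseconds and takes the smaller of the two values as the effective lifetime of a cell:

```shell
#!/bin/bash
# Hypothetical helper, not an HBase API: effective lifetime of a cell in ms.
# $1 = ColumnFamily TTL in seconds, $2 = cell TTL in milliseconds.
effective_ttl_ms() {
    local cf_ttl_ms=$(( $1 * 1000 ))   # ColumnFamily TTLs are given in seconds
    local cell_ttl_ms=$2               # cell TTLs are already in milliseconds
    if (( cell_ttl_ms < cf_ttl_ms )); then
        echo "${cell_ttl_ms}"          # a shorter cell TTL takes effect
    else
        echo "${cf_ttl_ms}"            # a cell TTL cannot outlive the CF TTL
    fi
}

effective_ttl_ms 600 10000     # prints 10000
effective_ttl_ms 600 3600000   # prints 600000
```

In the second call the cell asks for one hour, but the 600-second ColumnFamily TTL caps it at 600000 ms.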
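The material above comes from the official Apache HBase website and is provided for reference. The rest of this post walks through it in practice.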
Create the table:
create 'dc:event',{NAME => 'f1'},{NAME => 'cf'},{NAME => 'f2'}
Inspect the table schema:
desc "dc:event"
'dc:event',
{NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
Put data:
put 'dc:event','866925023233621','f1:eventid','866925023233621'
put 'dc:event','866925023233622','f1:eventid','866925023233621'
put 'dc:event','866925023233623','f1:eventid','866925023233621'
put 'dc:event','866925023233624','f1:eventid','866925023233621'
put 'dc:event','866925023233625','f1:eventid','866925023233621'
put 'dc:event','866925023233626','f1:eventid','866925023233621'
put 'dc:event','866925023233627','f1:eventid','866925023233621'
put 'dc:event','866925023233628','f1:eventid','866925023233621'
put 'dc:event','866925023233629','f1:eventid','866925023233621'
put 'dc:event','866925023233630','f1:eventid','866925023233621'
put 'dc:event','8669250232336-21','cf:eventid','866925023233621'
put 'dc:event','8669250232336-22','cf:eventid','866925023233621'
put 'dc:event','8669250232336-23','cf:eventid','866925023233621'
put 'dc:event','8669250232336-24','cf:eventid','866925023233621'
put 'dc:event','8669250232336-25','cf:eventid','866925023233621'
put 'dc:event','8669250232336-26','cf:eventid','866925023233621'
put 'dc:event','8669250232336-27','cf:eventid','866925023233621'
put 'dc:event','8669250232336-28','cf:eventid','866925023233621'
put 'dc:event','8669250232336-29','cf:eventid','866925023233621'
put 'dc:event','8669250232336-30','cf:eventid','866925023233621'
put 'dc:event','866925023233-6-21','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-22','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-23','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-24','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-25','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-26','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-27','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-28','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-29','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-30','f2:eventid','866925023233621'
scan 'dc:event'
hbase(main):048:0> scan 'dc:event'
ROW COLUMN+CELL
866925023233-6-21 column=f2:eventid, timestamp=1536805384815, value=866925023233621
866925023233-6-22 column=f2:eventid, timestamp=1536805384873, value=866925023233621
866925023233-6-23 column=f2:eventid, timestamp=1536805384881, value=866925023233621
866925023233-6-24 column=f2:eventid, timestamp=1536805384890, value=866925023233621
866925023233-6-25 column=f2:eventid, timestamp=1536805384898, value=866925023233621
866925023233-6-26 column=f2:eventid, timestamp=1536805384907, value=866925023233621
866925023233-6-27 column=f2:eventid, timestamp=1536805384922, value=866925023233621
866925023233-6-28 column=f2:eventid, timestamp=1536805384936, value=866925023233621
866925023233-6-29 column=f2:eventid, timestamp=1536805384946, value=866925023233621
866925023233-6-30 column=f2:eventid, timestamp=1536805384958, value=866925023233621
8669250232336-21 column=cf:eventid, timestamp=1536805310816, value=866925023233621
8669250232336-22 column=cf:eventid, timestamp=1536805310850, value=866925023233621
8669250232336-23 column=cf:eventid, timestamp=1536805310861, value=866925023233621
8669250232336-24 column=cf:eventid, timestamp=1536805310870, value=866925023233621
8669250232336-25 column=cf:eventid, timestamp=1536805310881, value=866925023233621
8669250232336-26 column=cf:eventid, timestamp=1536805310890, value=866925023233621
8669250232336-27 column=cf:eventid, timestamp=1536805310911, value=866925023233621
8669250232336-28 column=cf:eventid, timestamp=1536805310918, value=866925023233621
8669250232336-29 column=cf:eventid, timestamp=1536805310930, value=866925023233621
8669250232336-30 column=cf:eventid, timestamp=1536805310937, value=866925023233621
866925023233621 column=f1:eventid, timestamp=1536805258985, value=866925023233621
866925023233622 column=f1:eventid, timestamp=1536805259053, value=866925023233621
866925023233623 column=f1:eventid, timestamp=1536805259060, value=866925023233621
866925023233624 column=f1:eventid, timestamp=1536805259070, value=866925023233621
866925023233625 column=f1:eventid, timestamp=1536805259078, value=866925023233621
866925023233626 column=f1:eventid, timestamp=1536805259084, value=866925023233621
866925023233627 column=f1:eventid, timestamp=1536805259112, value=866925023233621
866925023233628 column=f1:eventid, timestamp=1536805259119, value=866925023233621
866925023233629 column=f1:eventid, timestamp=1536805259127, value=866925023233621
866925023233630 column=f1:eventid, timestamp=1536805259143, value=866925023233621
30 row(s) in 0.0920 seconds
Next, set the TTL:
1. disable 'dc:event'
2. alter "dc:event", NAME=>'cf', TTL=>600
   alter "dc:event", NAME=>'f1', TTL=>600
3. enable 'dc:event'
4. scan 'dc:event'  # run once the 600-second TTL has elapsed
ROW COLUMN+CELL
866925023233-6-21 column=f2:eventid, timestamp=1536805384815, value=866925023233621
866925023233-6-22 column=f2:eventid, timestamp=1536805384873, value=866925023233621
866925023233-6-23 column=f2:eventid, timestamp=1536805384881, value=866925023233621
866925023233-6-24 column=f2:eventid, timestamp=1536805384890, value=866925023233621
866925023233-6-25 column=f2:eventid, timestamp=1536805384898, value=866925023233621
866925023233-6-26 column=f2:eventid, timestamp=1536805384907, value=866925023233621
866925023233-6-27 column=f2:eventid, timestamp=1536805384922, value=866925023233621
866925023233-6-28 column=f2:eventid, timestamp=1536805384936, value=866925023233621
866925023233-6-29 column=f2:eventid, timestamp=1536805384946, value=866925023233621
866925023233-6-30 column=f2:eventid, timestamp=1536805384958, value=866925023233621
10 row(s) in 0.0740 seconds
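Of the table's three column families (cf, f1, f2), only cf and f1 were given a TTL. Once the TTL has elapsed, the data in cf and f1 expires automatically and no longer appears in the scan, while the data in f2, which has no TTL, is still there.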
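To see why the cf and f1 rows disappeared, it helps to work out when a cell expires: its timestamp (in milliseconds) plus the ColumnFamily TTL (in seconds). A small sketch, where the function names are made up for illustration and the cell timestamp is taken from the first f1 row in the scan output above:

```shell
#!/bin/bash
# Illustrative helpers, not part of HBase.
# expiry_ms: $1 = cell timestamp in ms, $2 = ColumnFamily TTL in seconds.
expiry_ms() {
    echo $(( $1 + $2 * 1000 ))
}

# is_expired: $1 = cell timestamp (ms), $2 = TTL (s), $3 = current time (ms).
# Succeeds (exit status 0) once the current time has passed the expiry time.
is_expired() {
    (( $3 > $(expiry_ms "$1" "$2") ))
}

cell_ts=1536805258985             # f1 row 866925023233621 in the scan output
expiry_ms "${cell_ts}" 600        # prints 1536805858985

now_ms=$(( $(date +%s) * 1000 ))  # current time in milliseconds
if is_expired "${cell_ts}" 600 "${now_ms}"; then
    echo "expired"
fi
```

The f2 rows never reach such a deadline because that column family keeps the default TTL of FOREVER.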
Comparison of the table's TTL before and after the change:
Shell script for changing the HBase TTL:
#!/bin/bash -l
# Decide beforehand whether this change might need to be rolled back.
# If so, check the current TTL of the corresponding production tables first.
# A rollback only restores the TTL setting; data that has expired cannot be recovered.
WB_DIR=$(cd "$(dirname "$0")" && pwd)
HBASE_NAMESPACE='hochoy'
origin_tables="tabTest1 tabTest2 tabTest3"
alter_ttl="alter_hbase.script"

# Convert a number of years into a TTL in seconds (365-day years).
get_ttl_value(){
    local years=${1}
    local ttl
    ttl=$(echo "scale = 0; 60 * 60 * 24 * 365 * ${years}" | bc)
    # Drop any fractional part, e.g. when years is given as 0.5.
    echo "${ttl%.*}"
}

# Generate the HBase shell commands that set the TTL on column family 'f'
# of every table in origin_tables.
gen_alt_script(){
    local ttl=${1}
    local script="${WB_DIR}/${alter_ttl}"
    : > "${script}"
    for table in ${origin_tables}
    do
        echo "desc '${HBASE_NAMESPACE}:${table}'" >> "${script}"
        echo "disable '${HBASE_NAMESPACE}:${table}'" >> "${script}"
        echo "alter '${HBASE_NAMESPACE}:${table}', {NAME=>'f', TTL=>${ttl}}" >> "${script}"
        echo "enable '${HBASE_NAMESPACE}:${table}'" >> "${script}"
        echo "desc '${HBASE_NAMESPACE}:${table}'" >> "${script}"
    done
    echo "exit" >> "${script}"
}

if [ $# -lt 1 ]; then
    echo "Usage: $0 <years|FOREVER>
    Input value of TTL please!"
    exit 1
fi

if [ "${1}" = "FOREVER" ]; then
    gen_alt_script FOREVER
else
    ttl=$(get_ttl_value "${1}")
    gen_alt_script "${ttl}"
fi

# Show the generated commands, then run them in the HBase shell.
cat "${WB_DIR}/${alter_ttl}"
hbase shell "${WB_DIR}/${alter_ttl}"
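The script hands whatever `bc` computes straight to HBase. Since the ColumnFamily TTL is a 32-bit integer number of seconds (FOREVER corresponds to the largest such value), an input of 69 years or more would not fit. One possible guard is sketched below, assuming whole-number years are enough; the function name `years_to_ttl` and the constants are hypothetical and not part of the original script:

```shell
#!/bin/bash
# Hypothetical guard, not part of the original script.
INT_MAX=2147483647           # largest 32-bit signed integer
SECONDS_PER_YEAR=31536000    # 60 * 60 * 24 * 365

# years_to_ttl: print the TTL in seconds for a whole number of years.
# Fails if the input is not a positive integer (at most 9 digits, so the
# arithmetic below cannot overflow) or if the result would exceed INT_MAX.
years_to_ttl() {
    local years=$1
    if ! [[ ${years} =~ ^[1-9][0-9]{0,8}$ ]]; then
        echo "invalid number of years: ${years}" >&2
        return 1
    fi
    if (( years > INT_MAX / SECONDS_PER_YEAR )); then
        echo "TTL of ${years} years exceeds ${INT_MAX} seconds" >&2
        return 1
    fi
    echo $(( years * SECONDS_PER_YEAR ))
}

years_to_ttl 1                       # prints 31536000
years_to_ttl 68                      # prints 2144448000
years_to_ttl 69 || echo "rejected"   # error on stderr, then prints rejected
```

Such a check could run before gen_alt_script is called, so that an out-of-range value never reaches the generated alter commands.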