Hive 高级编程――深入浅出学Hive(二)

2014-11-24 03:16:45 · 作者: · 浏览: 1
select explode_map(explode_map(properties)) from src 不可以和group by/cluster by/distribute by/sort by一起使用:select explode_map(properties) as (col1,col2) from src group by col1, col2 和lateral view一起使用:select src.id, mytable.col1, mytable.col2 from src lateral view explode_map(properties) mytable as col1, col2; 此方法更为方便日常使用。执行过程相当于单独执行了两次抽取,然后union到一个表里。 lateral view Lateral View语法 lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)* fromClause: FROM baseTable (lateralView)* Lateral View用于UDTF(user-defined table generating functions)中将行转成列,例如explode(). 目前Lateral View不支持有上而下的优化。如果使用Where子句,查询可能将不被编译。解决方法见: 此时,在查询之前执行set hive.optimize.ppd=false; 例子 pageAds。它有两个列
string pageid
Array adid_list
" front_page" [1, 2, 3]
"contact_page " [ 3, 4, 5]
SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid; 将输出如下结果 string pageid int adid "front_page" 1 ……. “contact_page" 3 代码示例 public class MyUDTF extends GenericUDTF{ public StructObjectInspector initialize(ObjectInspector[] args) {} public void process(Object[] args) throws HiveException { } } 实现:切分 ” key:value;key:value;” 这种字符串, 返回结果为 key, value 两个字段 转载请注明出处【 http://sishuok.com/forum/blogPost/list/6226. html