CPU 4.68 sec
MapReduce Total cumulative CPU time: 4 seconds 680 msec
Ended Job = job_201307151509_15501
Copying data to local directory /tmp/hivetest/sortby
Copying data to local directory /tmp/hivetest/sortby
7 Rows loaded to /tmp/hivetest/sortby
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 2 Cumulative CPU: 4.68 sec HDFS Read: 458 HDFS Write: 112 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 680 msec
OK
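Query result:
Explanation: sort by does not use a hash algorithm to distribute rows across the different reduce output files; it only sorts the rows within each reducer.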
2.4 cluster by
In addition to doing what distribute by does, cluster by also sorts on the same column, so cluster by = distribute by + sort by.
---Can cluster by take asc or desc?
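To illustrate the equivalence, the two queries below should send rows to the same reducers and order them the same way within each reducer. This is a sketch that reuses the table, columns, and filter from the examples in this section; it has not been run against this cluster.

```sql
-- Shorthand form: distribute and sort on the same column.
select id, devid, job_time from tb_in_base where job_time=030729 cluster by id;

-- Expanded form: the same distribution and per-reducer ordering, spelled out.
select id, devid, job_time from tb_in_base where job_time=030729 distribute by id sort by id;
```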
hive> select id,devid,job_time from tb_in_base where job_time=030729 cluster by id desc;
FAILED: Parse Error: line 1:77 mismatched input 'desc' expecting EOF near 'id'
hive> select id,devid,job_time from tb_in_base where job_time=030729 cluster by id asc;
FAILED: Parse Error: line 1:77 mismatched input 'asc' expecting EOF near 'id'
Note: cluster by does not accept asc or desc; it always sorts in ascending order, which is the default for sort by.
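Since the parser rejects a direction keyword after cluster by (see the two parse errors above), one way to get descending output per reducer is to write out the two clauses separately and put the direction on sort by. A minimal sketch, assuming the same table and filter as above; it has not been run here, so no output is shown.

```sql
-- cluster by id desc fails to parse, so split it into its two parts
-- and attach the sort direction to sort by.
set mapred.reduce.tasks=2;

select id, devid, job_time
from tb_in_base
where job_time=030729
distribute by id      -- rows with the same id go to the same reducer
sort by id desc;      -- each reducer's output is sorted by id, descending
```

The ordering is still per reducer only. A single globally sorted result needs order by instead.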
hive> set mapred.reduce.tasks=2;
hive> insert overwrite local directory '/tmp/hivetest/clusterby' select id,devid,job_time from tb_in_base where job_time=030729 cluster by id;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 2
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201307151509_15532, Tracking URL = http://mwtec-50:50030/jobdetails.jsp?jobid=job_201307151509_15532
Kill Command = /home/hadoop/hadoop-0.20.2/bin/hadoop job -Dmapred.job.tracker=mwtec-50:9002 -kill job_201307151509_15532
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 2
2013-08-05 19:41:15,138 Stage-1 map = 0%, reduce = 0%
2013-08-05 19:41:17,147 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:18,153 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:19,158 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:20,163 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:21,169 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:22,174 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:23,180 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:24,186 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
2013-08-05 19:41:25,193 Stage-1 map = 100%, reduce = 67%, Cumulative CPU 2.5 sec
2013-08-05 19:41:26,199 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.34 sec
2013-08-05 19:41:27,205 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.34 sec
2013-08-05 19:41:28,210 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.34 sec
MapReduce Total cumulative CPU time: 4 seconds 340 msec
Ended Job = job_201307151509_15532
Copying data to local directory /tmp/hivetest/clusterby
Copying data to local directory /tmp/hivetest/clusterby
7 Rows loaded to /tmp/hivetest/clusterby
MapReduce Jobs Launched:
Job 0: Map