2013-08-05 18:42:11,784 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.61 sec
2013-08-05 18:42:12,789 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.61 sec
2013-08-05 18:42:13,795 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.61 sec
2013-08-05 18:42:14,801 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.61 sec
2013-08-05 18:42:15,810 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.61 sec
2013-08-05 18:42:16,816 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.61 sec
2013-08-05 18:42:17,821 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 0.61 sec
2013-08-05 18:42:18,827 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.89 sec
2013-08-05 18:42:19,833 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.89 sec
2013-08-05 18:42:20,839 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.89 sec
MapReduce Total cumulative CPU time: 3 seconds 890 msec
Ended Job = job_201307151509_15500
Copying data to local directory /tmp/hivetest/distributeby
Copying data to local directory /tmp/hivetest/distributeby
7 Rows loaded to /tmp/hivetest/distributeby
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 2 Cumulative CPU: 3.89 sec HDFS Read: 458 HDFS Write: 112 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 890 msec
OK
Time taken: 16.678 seconds
Viewing the results after the query:
Result explanation: rows whose distribute-by key produces the same hash value are assigned to the same reducer.
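For reference, the output above would come from a distribute by statement of roughly the following shape (a reconstruction assumed from the /tmp/hivetest/distributeby output directory and the tb_in_base table used in the next test; the original statement is not shown in this excerpt). Hive's default partitioner routes each row to reducer hash(key) mod number-of-reducers, which is why rows with the same hash value land in the same reducer's output file:

set mapred.reduce.tasks=2;
hive> insert overwrite local directory '/tmp/hivetest/distributeby' select id,devid,job_time from tb_in_base where job_time=030729 distribute by id;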
Additional verification:
--- Does sort by also use a hash algorithm, the way distribute by does?
set mapred.reduce.tasks=2;
hive> insert overwrite local directory '/tmp/hivetest/sortby' select id,devid,job_time from tb_in_base where job_time=030729 sort by id;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 2
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_201307151509_15501, Tracking URL = http://mwtec-50:50030/jobdetails.jsp?jobid=job_201307151509_15501
Kill Command = /home/hadoop/hadoop-0.20.2/bin/hadoop job -Dmapred.job.tracker=mwtec-50:9002 -kill job_201307151509_15501
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 2
2013-08-05 18:57:33,616 Stage-1 map = 0%, reduce = 0%
2013-08-05 18:57:35,625 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:36,631 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:37,636 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:38,642 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:39,648 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:40,653 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:41,659 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:42,669 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2013-08-05 18:57:43,675 Stage-1 map = 100%, reduce = 67%, Cumulative CPU 3.02 sec
2013-08-05 18:57:44,681 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.68 sec
2013-08-05 18:57:45,687 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.68 sec
2013-08-05 18:57:46,693 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.68 sec