Resolving exceptions when running WordCount on Hadoop
How do I set memory limits for Hadoop Map/Reduce tasks?
Configuring a Hadoop environment on Ubuntu: http://www.linuxidc.com/Linux/2012-11/74539.htm
The relevant parameters are:

mapred.cluster.map.memory.mb (set by admin, cluster-wide): Cluster definition of memory per map slot. The maximum amount of memory, in MB, each map task on a tasktracker can consume.

mapred.cluster.reduce.memory.mb (set by admin, cluster-wide): Cluster definition of memory per reduce slot. The maximum amount of memory, in MB, each reduce task on a tasktracker can consume.

mapred.job.map.memory.mb (set by user, per-job): Job requirement for map tasks. The maximum amount of memory each map task of a job can consume, in MB.

mapred.job.reduce.memory.mb (set by user, per-job): Job requirement for reduce tasks. The maximum amount of memory each reduce task of a job can consume, in MB.

mapred.cluster.max.map.memory.mb (set by admin, cluster-wide): Max limit on jobs. The maximum value a user can specify via mapred.job.map.memory.mb, in MB. A job that asks for more than this is failed at submission.

mapred.cluster.max.reduce.memory.mb (set by admin, cluster-wide): Max limit on jobs. The maximum value a user can specify via mapred.job.reduce.memory.mb, in MB. A job that asks for more than this is failed at submission.
If these are not set, they all default to -1, which means no limit.
For background, see the analysis of the Hadoop-0.20.2 job memory control policy: http://www.linuxidc.com/Linux/2012-06/63310.htm
When setting these parameters, pay attention to how they relate to one another. For example, if you set mapred.cluster.map.memory.mb to 1024 but submit a job without setting mapred.job.map.memory.mb (which defaults to -1, unlimited), submission fails with the following error:
2012-06-13 16:18:10,951 ERROR exec.Task (SessionState.java:printError(380)) - Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(java.io.IOException: job_201206131602_0003(-1 memForMapTasks -1 memForReduceTasks): Invalid job requirements.
    at org.apache.hadoop.mapred.JobTracker.checkMemoryRequirements(JobTracker.java:5160)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3949)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
)'
org.apache.hadoop.ipc.RemoteException: java.io.IOException: job_201206131602_0003(-1 memForMapTasks -1 memForReduceTasks): Invalid job requirements.
    at org.apache.hadoop.mapred.JobTracker.checkMemoryRequirements(JobTracker.java:5160)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3949)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
    at org.apache.hadoop.ipc.Client.call(Client.java:1030)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
    at org.apache.hadoop.mapred.$Proxy7.submitJob(Unknown Source)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:862)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:791)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:765)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:629)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:617)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
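To clear this error, have the job declare its memory requirements explicitly whenever the cluster-wide slot limits are enabled. A minimal sketch for a job submitted from your own driver code, with illustrative values of 1024 MB that must not exceed the admin-configured maxima:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// Declare this job's per-task memory requirements, in MB. Leaving them
// at the default of -1 while the cluster-wide limits are set triggers
// the "Invalid job requirements" rejection shown above.
conf.setInt("mapred.job.map.memory.mb", 1024);
conf.setInt("mapred.job.reduce.memory.mb", 1024);
Job job = new Job(conf, "wordcount");

Tools that go through GenericOptionsParser also accept the same settings at submission time as -Dmapred.job.map.memory.mb=1024 -Dmapred.job.reduce.memory.mb=1024.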
The cluster consists of four virtual machines: 192.168.137.111 (master), 192.168.137.112 (slave1), 192.168.137.113 (slave2), and 192.168.137.114 (slave3).
Detailed illustrated tutorial on setting up a single-node Hadoop environment: http://www.linuxidc.com/Linux/2012-02/53927.htm
Hadoop version: hadoop-1.1.2
Exception 1:
Ubuntu 12.10 + Hadoop 1.2.1 cluster configuration: http://www.linuxidc.com/Linux/2013-09/90600.htm
Package the project as a jar file, hadoop-test.jar, and place it in the project root directory.
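Before submitting from Eclipse, point the job configuration at that jar. A minimal sketch of the relevant lines inside the driver's main() (this is the conf.set("mapred.jar", ...) fix given below):

Configuration conf = new Configuration();
// Ship the packaged jar with the job so the tasktrackers can load the
// mapper and reducer classes; without it, submission from Eclipse on
// Windows may fail to locate the job classes.
conf.set("mapred.jar", "hadoop-test.jar");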
Hadoop cluster username: hadoop

Running the WordCount job against this cluster produced the following output:
14/10/18 10:12:27 INFO input.FileInputFormat: Total input paths to process : 2
14/10/18 10:12:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/10/18 10:12:27 WARN snappy.LoadSnappy: Snappy native library not loaded
14/10/18 10:12:27 INFO mapred.JobClient: Running job: job_201410181754_0004
14/10/18 10:12:28 INFO mapred.JobClient: map 0% reduce 0%
14/10/18 10:12:32 INFO mapred.JobClient: map 100% reduce 0%
14/10/18 10:12:39 INFO mapred.JobClient: map 100% reduce 33%
14/10/18 10:12:40 INFO mapred.JobClient: map 100% reduce 100%
14/10/18 10:12:40 INFO mapred.JobClient: Job complete: job_201410181754_0004
14/10/18 10:12:40 INFO mapred.JobClient: Counters: 29
14/10/18 10:12:40 INFO mapred.JobClient: Job Counters
14/10/18 10:12:40 INFO mapred.JobClient: Launched reduce tasks=1
14/10/18 10:12:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=4614
14/10/18 10:12:40 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/10/18 10:12:40 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/10/18 10:12:40 INFO mapred.JobClient: Launched map tasks=2
14/10/18 10:12:40 INFO mapred.JobClient: Data-local map tasks=2
14/10/18 10:12:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=8329
14/10/18 10:12:40 INFO mapred.JobClient: File Output Format Counters
14/10/18 10:12:40 INFO mapred.JobClient: Bytes Written=31
14/10/18 10:12:40 INFO mapred.JobClient: FileSystemCounters
14/10/18 10:12:40 INFO mapred.JobClient: FILE_BYTES_READ=75
14/10/18 10:12:40 INFO mapred.JobClient: HDFS_BYTES_READ=264
14/10/18 10:12:40 INFO mapred.JobClient: FILE_BYTES_WRITTEN=154204
14/10/18 10:12:40 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=31
14/10/18 10:12:40 INFO mapred.JobClient: File Input Format Counters
14/10/18 10:12:40 INFO mapred.JobClient: Bytes Read=44
14/10/18 10:12:40 INFO mapred.JobClient: Map-Reduce Framework
14/10/18 10:12:40 INFO mapred.JobClient: Map output materialized bytes=81
14/10/18 10:12:40 INFO mapred.JobClient: Map input records=2
14/10/18 10:12:40 INFO mapred.JobClient: Reduce shuffle bytes=81
14/10/18 10:12:40 INFO mapred.JobClient: Spilled Records=12
14/10/18 10:12:40 INFO mapred.JobClient: Map output bytes=78
14/10/18 10:12:40 INFO mapred.JobClient: CPU time spent (ms)=1090
14/10/18 10:12:40 INFO mapred.JobClient: Total committed heap usage (bytes)=241246208
14/10/18 10:12:40 INFO mapred.JobClient: Combine input records=8
14/10/18 10:12:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=220
14/10/18 10:12:40 INFO mapred.JobClient: Reduce input records=6
14/10/18 10:12:40 INFO mapred.JobClient: Reduce input groups=4
14/10/18 10:12:40 INFO mapred.JobClient: Combine output records=6
14/10/18 10:12:40 INFO mapred.JobClient: Physical memory (bytes) snapshot=311574528
14/10/18 10:12:40 INFO mapred.JobClient: Reduce output records=4
14/10/18 10:12:40 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1034760192
14/10/18 10:12:40 INFO mapred.JobClient: Map output records=8
I have recently been learning Hadoop and ran into many problems while running MapReduce programs in a Windows + Eclipse + virtual-machine Hadoop cluster environment. After searching online and analyzing them myself, I eventually solved them all, and I am sharing the solutions here as a reference for anyone who hits the same issues.
Fix 1: add conf.set("mapred.jar", "hadoop-test.jar"); to the job configuration.
Fix 2: either rename the Windows administrator (Administrator) account to the Hadoop account name, or create an account on the cluster with the same name as the Windows administrator account and grant it read, write, and execute permissions on the Hadoop directory. The first approach is recommended.
With these fixes in place, the job finally ran successfully.
Related reading:
Setting up a Hadoop environment on Ubuntu 13.04: http://www.linuxidc.com/Linux/2013-06/86106.htm
Setting up a Hadoop environment (two Ubuntu systems virtualized under Windows): http://www.linuxidc.com/Linux/2011-12/48894.htm
The full WordCount code (the mapper and driver follow the standard Hadoop WordCount example):
package com.guilin.hadoop.mapreduce;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: tokenize each input line and emit (word, 1).
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            this.result.set(sum);
            context.write(key, this.result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The mapred.jar fix described above: ship the packaged jar with the job.
        conf.set("mapred.jar", "hadoop-test.jar");
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}