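I have a MapReduce program that writes its data to HDFS under several different file names. To do that, my reducer uses MultipleOutputs to write to different files in HDFS (the full reducer code is below).
I want to test this code with MRUnit. Here is my test method: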
@Test
public void reducerMRUnit() throws IOException {
    String output = "";
    ArrayList<Text> list = new ArrayList<Text>(0);
    list.add(new Text(""));
    reduceDriver.withInput(new Text(""), list);
    reduceDriver.withPathOutput(new Text(output), NullWritable.get(), "");
    reduceDriver.runTest();
}
However, when I run this test, it fails with a NullPointerException:
java.lang.NullPointerException
at org.apache.hadoop.fs.Path.<init>(Path.java:104)
at org.apache.hadoop.fs.Path.<init>(Path.java:93)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.getDefaultWorkFile(FileOutputFormat.java:286)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:129)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:476)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:456)
at org.clinical3PO.learn.fasta.ArffToFastAReducer.reduce(ArffToFastAReducer.java:127)
at org.clinical3PO.learn.fasta.ArffToFastAReducer.reduce(ArffToFastAReducer.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mrunit.mapreduce.ReduceDriver.run(ReduceDriver.java:265)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:640)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:627)
at org.clinical3PO.learn.fasta.MRUnitTest.ArffToFastAReducerMRUnitTest.reducerMRUnit(ArffToFastAReducerMRUnitTest.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Reducer code:
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class AReducer extends Reducer<Text, Text, Text, NullWritable> {
    private MultipleOutputs<Text, NullWritable> mos = null;

    @Override
    public void setup(Context context) throws IOException {
        // Create the MultipleOutputs once per task, not once per reduce() call.
        mos = new MultipleOutputs<Text, NullWritable>(context);
    }

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // The output value type is NullWritable; "filename" is the base output path.
        mos.write(key, NullWritable.get(), "filename");
    }

    @Override
    public void cleanup(Context context) throws IOException, InterruptedException {
        // Closing flushes and releases the record writers opened by MultipleOutputs.
        mos.close();
    }
}
Any suggestions?
1 answer
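MRUnit currently has a known issue that is not well documented: testing MultipleOutputs requires running the test with PowerMockRunner and adding a PrepareForTest annotation that names the reducer class. The JIRA issues MRUNIT-13 and MRUNIT-213 discuss this in detail. MRUNIT-213 is still unresolved. Once PowerMock has been added to the project, a further challenge is lining up compatible versions of Mockito and PowerMock. The PowerMock documentation on using PowerMock with Mockito describes which versions are compatible.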
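The setup described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name TestAReducer is hypothetical, and it assumes that MRUnit, JUnit 4, and mutually compatible versions of PowerMock and Mockito are on the test classpath, and that AReducer writes with the base output path "filename".

```java
import java.io.IOException;
import java.util.ArrayList;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

// Run under PowerMock so that MRUnit can substitute a mock for the
// MultipleOutputs instance created inside the reducer.
@RunWith(PowerMockRunner.class)
@PrepareForTest(AReducer.class)
public class TestAReducer { // hypothetical test class name
    @Test
    public void reducerMRUnit() throws IOException {
        ReduceDriver<Text, Text, Text, NullWritable> reduceDriver =
                ReduceDriver.newReduceDriver(new AReducer());

        ArrayList<Text> list = new ArrayList<Text>();
        list.add(new Text(""));
        reduceDriver.withInput(new Text(""), list);

        // The expected path has to match the base output path that the
        // reducer passes to MultipleOutputs.write (assumed: "filename").
        reduceDriver.withPathOutput(new Text(""), NullWritable.get(), "filename");
        reduceDriver.runTest();
    }
}
```

The test has to be launched through the JUnit runner (from the IDE or the build tool) rather than by calling reducerMRUnit() directly, because the PowerMock class loader is only installed by PowerMockRunner.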
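I tried your sample with a few changes. That got past the NullPointerException, but then I hit one last problem: the expected path output declared in the test did not match the "filename" path used by the reducer code. I changed the expected path output so that the test passes completely. Here is my final result: a fully working project containing your sample test. Enjoy!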
pom.xml
src/main/java/AReducer.java
src/test/java/TestAReducer.java