import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class TreeMapWritable extends TreeMap<Text, IntWritable>
        implements Writable {

    @Override
    public void write(DataOutput out) throws IOException {
        // write out the number of entries
        out.writeInt(size());
        // output each entry pair
        for (Map.Entry<Text, IntWritable> entry : entrySet()) {
            entry.getKey().write(out);
            entry.getValue().write(out);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // clear current contents - Hadoop reuses objects
        // between calls to your map / reduce methods
        clear();
        // read how many items to expect
        int count = in.readInt();
        // deserialize a key and value pair, insert into map
        while (count-- > 0) {
            Text key = new Text();
            key.readFields(in);
            IntWritable value = new IntWritable();
            value.readFields(in);
            put(key, value);
        }
    }
}
1 Answer
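Expanding on Tariq's link, and simply detailing a <Text, IntWritable> TreeMap: the default serialization factory in Hadoop expects output objects to implement the Writable interface (the readFields and write methods detailed above). In this way you can extend pretty much any class to retrofit the serialization methods.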
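To see the write/readFields contract in action without a cluster, here is a minimal round-trip sketch. It is only an illustration: SimpleWritable is a hypothetical stand-in for Hadoop's Writable interface (same two methods), and CountMap uses String and Integer instead of Text and IntWritable so that the example compiles with the JDK alone. All names in it are made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

public class WritableRoundTrip {

    // Hypothetical stand-in for org.apache.hadoop.io.Writable (same two methods).
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Same layout as TreeMapWritable: entry count first, then key/value pairs.
    static class CountMap extends TreeMap<String, Integer> implements SimpleWritable {
        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(size());
            for (Map.Entry<String, Integer> e : entrySet()) {
                out.writeUTF(e.getKey());
                out.writeInt(e.getValue());
            }
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            clear(); // the instance may be reused, so drop stale entries first
            int count = in.readInt();
            while (count-- > 0) {
                String key = in.readUTF();
                int value = in.readInt();
                put(key, value);
            }
        }
    }

    // Serialize src to a byte array, then load those bytes into dst.
    static void copyThroughBytes(CountMap src, CountMap dst) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        src.write(new DataOutputStream(buffer));
        dst.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
    }

    public static void main(String[] args) throws IOException {
        CountMap original = new CountMap();
        original.put("hadoop", 3);
        original.put("writable", 1);

        CountMap reused = new CountMap();
        reused.put("stale", 99); // simulates leftovers from a previous record

        copyThroughBytes(original, reused);
        System.out.println(reused); // prints {hadoop=3, writable=1}
    }
}
```

The "stale" entry disappears because readFields calls clear() before reading, which is the same reason the clear() call matters in the class above.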
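Another option would be to enable Java serialization (which uses the default Java serialization mechanism) by adding org.apache.hadoop.io.serializer.JavaSerialization to the io.serializations configuration property, but I wouldn't recommend that.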
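The answer gives no reason for that recommendation. One plausible factor is stream overhead, which the following sketch makes visible. It uses only the JDK's ObjectOutputStream, not Hadoop's JavaSerialization class, so it only approximates what the plug-in would do. The class and method names are hypothetical, and String and Integer are used because both implement java.io.Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Map;
import java.util.TreeMap;

public class JavaSerializationSketch {

    // Built-in Java serialization: no custom code, but class metadata goes into the stream.
    static byte[] javaSerialize(TreeMap<String, Integer> map) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(map);
        }
        return buffer.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static TreeMap<String, Integer> javaDeserialize(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (TreeMap<String, Integer>) in.readObject();
        }
    }

    // Hand-written layout in the style of write(): entry count, then key/value pairs.
    static byte[] compactSerialize(TreeMap<String, Integer> map) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(map.size());
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeInt(e.getValue());
        }
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        TreeMap<String, Integer> counts = new TreeMap<>();
        counts.put("hadoop", 3);
        counts.put("writable", 1);

        byte[] javaBytes = javaSerialize(counts);
        byte[] compactBytes = compactSerialize(counts);

        System.out.println("round trip ok: " + counts.equals(javaDeserialize(javaBytes)));
        System.out.println("java serialization bytes: " + javaBytes.length);
        System.out.println("hand-written bytes: " + compactBytes.length);
    }
}
```

Both encodings restore the same map. The Java-serialized form is larger because it also carries class descriptors, while the hand-written layout stores only the count and the entries.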