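I am working on a Hive UDTF that transposes each row around its primary key. As part of the requirement, I need to associate each column name and its corresponding value with the key.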
e.g.
Source Data
Customer_id Customer_name Customer_type
1000000 ABCD Individual
The Hive UDTF should transform the data into:
Key att_name att_val
10000000 customer_name ABCD
10000000 customer_type Individual
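To make the intended transformation concrete, here is a minimal sketch in plain Java with no Hive dependency. `RowTransposer` and its `transpose` method are hypothetical names introduced only for illustration. It assumes the real column names are available as a list and that the first column is the key. Names are lowercased only to match the desired output shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Hive: transposes one row around its first column.
class RowTransposer {

    // columnNames and row must have the same length; index 0 is treated as the key column.
    static List<String[]> transpose(List<String> columnNames, List<String> row) {
        if (columnNames.size() != row.size()) {
            throw new IllegalArgumentException("columnNames and row must have the same length");
        }
        List<String[]> out = new ArrayList<>();
        if (row.isEmpty()) {
            return out;
        }
        String key = row.get(0);
        // Emit one (key, att_name, att_val) triple per non-key column.
        for (int i = 1; i < row.size(); i++) {
            String name = columnNames.get(i).toLowerCase(Locale.ROOT);
            out.add(new String[] { key, name, row.get(i) });
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Customer_id", "Customer_name", "Customer_type");
        List<String> row = Arrays.asList("10000000", "ABCD", "Individual");
        for (String[] triple : transpose(names, row)) {
            System.out.println(String.join(" ", triple));
        }
    }
}
```

The sketch only shows the shape of the output. Inside a real UDTF each triple would be passed to forward() instead of being collected in a list.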
I have written the UDTF and it runs, but it currently produces the following data:
Key att_name att_val
10000000 _col0 ABCD
10000000 _col1 Individual
In the code below, ((StructField)inputFields.get(i)).getFieldName()
returns _col0 instead of customer_name.
This may be a bug in Apache Hive, or there may be another map from _col0 to the actual schema that I should be referring to.
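import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.MetadataListStructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructField;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

public class transposeUDTF extends GenericUDTF {
    // Maps an argument position to the field name reported by the inspector.
    private Map<Integer, String> tableMap = new HashMap<Integer, String>();
    private MetadataListStructObjectInspector metadataDetails; // currently unused

    @Override
    public StructObjectInspector initialize(StructObjectInspector args) throws UDFArgumentException {
        List<? extends StructField> inputFields = args.getAllStructFieldRefs();
        args.getTypeName(); // result is discarded
        for (int i = 0; i < inputFields.size(); ++i) {
            // Field i is stored under key i + 1, so process() labels record[i + 1] with it.
            tableMap.put(i + 1, ((StructField) inputFields.get(i)).getFieldName());
        }
        return super.initialize(args);
    }

    @Override
    public StructObjectInspector initialize(ObjectInspector[] argOIs) throws UDFArgumentException {
        // Output schema: three string columns (key, AttrName, AttrVal).
        List<String> fieldNames = new ArrayList<String>(3);
        List<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>(3);
        fieldNames.add("key");
        fieldNames.add("AttrName");
        fieldNames.add("AttrVal");
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
    }

    @Override
    public void process(Object[] record) throws HiveException {
        // record[0] is the key; every other argument becomes one output row.
        ArrayList<Object[]> results = new ArrayList<Object[]>();
        for (int i = 1; i < record.length; ++i) {
            results.add(new Object[] {record[0], tableMap.get(i), record[i]});
        }
        Iterator<Object[]> it = results.iterator();
        while (it.hasNext()) {
            Object[] r = it.next();
            forward(r);
        }
    }

    @Override
    public void close() throws HiveException {
        // do nothing
    }
}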
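The question raises the idea of a second map from _col0 to the actual schema. As a rough sketch of that idea only, the helper below translates internal names of the form _colN into real names. `InternalNameResolver` and `resolve` are hypothetical names, not a Hive API. The sketch assumes that the caller supplies the real column names in argument order (for example as extra constant arguments to the UDTF) and that N is the zero-based argument position.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Hive API: resolves "_colN" against a caller-supplied name list.
class InternalNameResolver {

    // Matches internal names such as _col0 or _col12 (at most 9 digits, so parseInt cannot overflow).
    private static final Pattern INTERNAL = Pattern.compile("_col(\\d{1,9})");

    // Returns realNames[N] for "_colN" when N is in range; otherwise returns fieldName unchanged.
    static String resolve(String fieldName, List<String> realNames) {
        Matcher m = INTERNAL.matcher(fieldName);
        if (m.matches()) {
            int index = Integer.parseInt(m.group(1));
            if (index < realNames.size()) {
                return realNames.get(index);
            }
        }
        return fieldName;
    }
}
```

Falling back to the unchanged name keeps the output readable when the supplied list is shorter than the argument list. Whether Hive exposes the original column names to a UDTF by some other route is not settled by this sketch.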