I am using a SerDe to process my XML data, and I have created a custom InputFormat and RecordReader for it.
Below are the class signatures.
SerDe class:
public class XMLSerde extends AbstractSerDe {
Input format:
public class XMLInputFormat extends FileInputFormat<LongWritable, BookWritable> {
@Override
public RecordReader<LongWritable, BookWritable> createRecordReader(InputSplit arg0,
TaskAttemptContext arg1) throws IOException, InterruptedException {
// TODO Auto-generated method stub
return new XMLRecordReader();
}
Record reader:
public class XMLRecordReader extends RecordReader<LongWritable, BookWritable> {
BookWritable is a custom Writable class that I created.
Now, when I use this SerDe as follows:
CREATE TABLE xml_items(Author STRING, Title STRING, ISBN STRING) ROW FORMAT SERDE 'com.xml.serde.XMLSerde' STORED AS INPUTFORMAT 'com.xml.serde.XMLInputFormat';
I get the following error when I run a SELECT query against this table:
FAILED: SemanticException 1:14 Input format must implement InputFormat. Error encountered near token 'books'
Please advise.
Ajay
1 Answer
muk1a3rh #1
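Hive only supports the old MapReduce API (the org.apache.hadoop.mapred package). Your XMLInputFormat extends the new-API FileInputFormat from org.apache.hadoop.mapreduce, which is why Hive rejects it. Make XMLInputFormat extend org.apache.hadoop.mapred.FileInputFormat instead.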
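
If it is unclear which API a compiled input format was built against, a small reflection check can help before registering the class with Hive. The following is only a sketch: the class InputFormatCheck and its method hasSupertype are hypothetical names introduced here, not part of Hive or Hadoop. It walks the superclass and interface chain and compares fully qualified names, so it needs nothing beyond the JDK to compile. The jar that contains the class under test (and the Hadoop jars it depends on) must be on the classpath when you run it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical diagnostic helper (not part of Hive or Hadoop).
public class InputFormatCheck {
    // Old-API interface, which the answer above says Hive requires.
    static final String OLD_API = "org.apache.hadoop.mapred.InputFormat";
    // New-API abstract class, which createRecordReader(...) belongs to.
    static final String NEW_API = "org.apache.hadoop.mapreduce.InputFormat";

    /**
     * Returns true if cls itself, one of its superclasses, or one of its
     * directly or indirectly implemented interfaces has the given name.
     */
    public static boolean hasSupertype(Class<?> cls, String typeName) {
        Deque<Class<?>> pending = new ArrayDeque<>();
        Set<Class<?>> seen = new HashSet<>();
        pending.push(cls);
        while (!pending.isEmpty()) {
            Class<?> current = pending.pop();
            if (!seen.add(current)) {
                continue; // already visited through another path
            }
            if (current.getName().equals(typeName)) {
                return true;
            }
            if (current.getSuperclass() != null) {
                pending.push(current.getSuperclass());
            }
            for (Class<?> iface : current.getInterfaces()) {
                pending.push(iface);
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        if (args.length == 0) {
            System.out.println("usage: java InputFormatCheck <fully.qualified.ClassName>");
            return;
        }
        // Load without running static initializers; only the type hierarchy is needed.
        Class<?> cls = Class.forName(args[0], false, InputFormatCheck.class.getClassLoader());
        System.out.println("old API (mapred):    " + hasSupertype(cls, OLD_API));
        System.out.println("new API (mapreduce): " + hasSupertype(cls, NEW_API));
    }
}
```

Run it with com.xml.serde.XMLInputFormat as the argument. For the class as posted in the question, the new-API line should be the one that reports true. When porting to the old API, note that org.apache.hadoop.mapred.FileInputFormat expects getRecordReader(InputSplit, JobConf, Reporter) rather than createRecordReader(InputSplit, TaskAttemptContext), and that the old-API RecordReader is an interface, so XMLRecordReader has to be ported as well.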