Using Apache Pig (Hadoop) to get several children with the same name for each element of a given XML

j9per5c4 posted on 2021-05-29 in Hadoop

I want to use Pig to get several children with the same name for each element of a given XML document, but it does not work.
I have this XML:

<A>
    <B> 
        <C>valueC1</C>
        <D>
            <E>valueE1</E>
            <F>
                <G>valueG11</G>
                <H>valueH11</H>
            </F>
            <F>
                <G>valueG12</G>
                <H>valueH12</H>
            </F>
            <F>
                <G>valueG13</G>
                <H>valueH13</H>
            </F>
        </D>
    </B>
    <B> 
        <C>valueC2</C>
        <D>
            <E>valueE2</E>
            <F>
                <G>valueG21</G>
                <H>valueH21</H>
            </F>
            <F>
                <G>valueG22</G>
                <H>valueH22</H>
            </F>
            <F>
                <G>valueG23</G>
                <H>valueH23</H> 
            </F>
        </D>
    </B>
</A>

I want:

(valueC1, valueE1,(valueG11,valueG12,valueG13))
(valueC2, valueE2,(valueG21,valueG22,valueG23))

I tried:

DEFINE XPath org.apache.pig.piggybank.evaluation.xml.XPath();
A = LOAD '/user/hue/maelia/adhesions_xml/MMC.20140214.xml'
    USING org.apache.pig.piggybank.storage.XMLLoader('B') as (x:chararray);
B = FOREACH A GENERATE XPath(x,'B/C'), XPath(x,'B/D/E'),XPath(x,'B/D/F/G');
C = FOREACH A GENERATE XPath(x,'B/C'), XPath(x,'B/D/E'),XPath(x,'B/D/F[0]/G');
D = FOREACH A GENERATE XPath(x,'B/C'), XPath(x,'B/D/E'),XPath(x,'B/D/F[position()=0]/G');
E = FOREACH A GENERATE XPath(x,'B/C'), XPath(x,'B/D/E'),XPath(x,'B/D/F[1]/G');
F = FOREACH A GENERATE XPath(x,'B/C'), XPath(x,'B/D/E'),XPath(x,'B/D/F[position()=1]/G');
dump B;
dump C;
dump D;
dump E;
dump F;

It returns:

(valueC1, valueE1,valueG11)
(valueC2, valueE2,valueG21)

or:

(valueC1, valueE1,)
(valueC2, valueE2,)

Then I tried:

DEFINE XPath org.apache.pig.piggybank.evaluation.xml.XPath();
A = LOAD '/user/.../sample.xml' 
USING org.apache.pig.piggybank.storage.XMLLoader('B') as (x:chararray); 
B = FOREACH A {
LOAD '/user/.../sample.xml' 
    USING org.apache.pig.piggybank.storage.XMLLoader(
        CONCAT(
            CONCAT(
                'B/C[text()=\'',
                x
             ),
             '\']/../D/F/G'
         )
     ) as (y:chararray);
GENERATE XPath(x,'B/C'),y;
};
dump C

It returns an error:

...ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <file script.pig, line 5, column 9> mismatched input ''/user/.../sample.xml'' expecting LEFT_PAREN
Failed to parse: <file script.pig, line 5, column 9> mismatched input ''/user/.../sample.xml'' expecting LEFT_PAREN
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:244)
...

Does anyone have a way to solve this? Also, can a LOAD be placed inside a nested FOREACH at all?
Thanks in advance!
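
For reference, the extraction being asked for can be sketched outside Pig. This is not a Pig solution and does not use the piggybank `XPath` UDF; it is a minimal Python sketch using the standard-library `xml.etree.ElementTree`, only to pin down the target output. It assumes the sample XML is well formed, with the second `<E>` closed and holding `valueE2`, as the expected output suggests. The names `SAMPLE`, `rows` and `first_g` are made up for the illustration. It also shows that XPath positional predicates are 1-based, so `F[1]` is the first `<F>`, while a path without a predicate matches every `<G>`.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the sample from the question (hypothetical fixture).
SAMPLE = """<A>
  <B>
    <C>valueC1</C>
    <D>
      <E>valueE1</E>
      <F><G>valueG11</G><H>valueH11</H></F>
      <F><G>valueG12</G><H>valueH12</H></F>
      <F><G>valueG13</G><H>valueH13</H></F>
    </D>
  </B>
  <B>
    <C>valueC2</C>
    <D>
      <E>valueE2</E>
      <F><G>valueG21</G><H>valueH21</H></F>
      <F><G>valueG22</G><H>valueH22</H></F>
      <F><G>valueG23</G><H>valueH23</H></F>
    </D>
  </B>
</A>"""


def rows(xml_text):
    """Return one (C, E, (G, G, ...)) tuple per <B> element."""
    root = ET.fromstring(xml_text)
    result = []
    for b in root.findall("B"):
        c = b.findtext("C")
        e = b.findtext("D/E")
        # findall returns every matching <G>, not just the first one.
        gs = tuple(g.text for g in b.findall("D/F/G"))
        result.append((c, e, gs))
    return result


def first_g(xml_text):
    """Return the <G> of the first <F> in each <B>; positions start at 1."""
    root = ET.fromstring(xml_text)
    return [b.findtext("D/F[1]/G") for b in root.findall("B")]


for row in rows(SAMPLE):
    print(row)
```

The point of the sketch is the difference between asking for a single value (`findtext`) and asking for every match (`findall`): the desired tuple of G values needs the latter.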
