-- Default PigStorage splits on tabs, so each CSV row lands in one chararray field.
A = LOAD 'input.csv' AS (line:chararray);
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line,'(\\w+),(\\w+),(.*)$')) AS (col1:chararray,col2:chararray,col3:chararray);
DUMP B;
Output:
(car,deer,"bear,cat")
(car,deer,"bear,cat")
Pig script, output format 2:
A = LOAD 'input.csv' AS (line:chararray);
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line,'(\\w+),(\\w+),"(\\w+),(.*)"$')) AS (col1:chararray,col2:chararray,col3:chararray,col4:chararray);
DUMP B;
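To see what each pattern captures, the two expressions above can be checked outside Pig. The following is a minimal Python sketch, not part of the original answer. It assumes the input line is `car,deer,"bear,cat"` (inferred from the output shown) and that `REGEX_EXTRACT_ALL` requires the whole line to match, which `re.fullmatch` imitates. The names `LINE`, `FORMAT_1`, `FORMAT_2`, and `extract_all` are hypothetical.

```python
import re

# Sample line inferred from the output shown above (an assumption).
LINE = 'car,deer,"bear,cat"'

# Same patterns as the Pig scripts; Pig's '\\w' is a single \w once unescaped.
FORMAT_1 = r'(\w+),(\w+),(.*)$'
FORMAT_2 = r'(\w+),(\w+),"(\w+),(.*)"$'


def extract_all(pattern, line):
    """Return every capture group if the whole line matches, else None."""
    match = re.fullmatch(pattern, line)
    return match.groups() if match else None


print(extract_all(FORMAT_1, LINE))  # ('car', 'deer', '"bear,cat"')
print(extract_all(FORMAT_2, LINE))  # ('car', 'deer', 'bear', 'cat')
```

Format 1 keeps the quoted field intact as a third column, quotes included, while format 2 strips the quotes and splits the quoted field into two columns.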
4 Answers
c9x0cxw01#
Can you try this?
input.csv
Pig script, output format 1:
Output:
Pig script, output format 2:
Output:
jw5wzhpr2#
r55awzrz3#
You can use the Apache CSV library to handle this problem.
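The point of using a CSV library is that it understands quoting, so a comma inside a quoted field is not treated as a separator. As a rough illustration of that idea only, here is a sketch using Python's standard `csv` module rather than the Apache library the answer refers to. The helper name `parse_line` and the sample line are assumptions.

```python
import csv
import io


def parse_line(line):
    """Parse one CSV line, keeping commas inside quoted fields."""
    # csv.reader reads from a file-like object, so wrap the string.
    return next(csv.reader(io.StringIO(line)))


print(parse_line('car,deer,"bear,cat"'))  # ['car', 'deer', 'bear,cat']
```

Unlike the regex approach, this does not depend on a fixed number of columns or on knowing which column is quoted.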
vi4fp9gy4#
You can try this regular expression:
Take a look here; this has already been asked on SO and was explained well by a user.
Click here