How do I use MapReduce in Hadoop in this situation?

Posted by 5cnsuln7 on 2021-06-02 in Hadoop


I want to analyze a text file. The file's format looks like this:

<msg time='2015-07-30T16:37:48.408+09:00' org_id='oracle' comp_id='rdbms' 
msg_id='opiexe:3056:2780954927' client_id='' type='NOTIFICATION'
group='admin_ddl' level='16' host_id='TEST_DB1'
host_addr='127.0.0.1' module='sqlplus@TEST_DB1 (TNS V1-V3)' pid='24436'>
<txt>ORA-1543 signalled during: create tablespace TS_MODULE_I datafile &apos;/data001/orasvc01/NEWDB/ts_module_i_01.dbf&apos; size 20m...
</txt>
</msg>

<msg time='2015-07-30T16:39:13.173+09:00' org_id='oracle' comp_id='rdbms'
client_id='' type='UNKNOWN' level='16'
host_id='TEST_DB1' host_addr='127.0.0.1' module=''
pid='23242'>
<txt>Errors in
file /logs001/orasvc01/diag/rdbms/newdb/NEWDB/trace/NEWDB_smon_23242.trc:
ORA-01116: error in opening database file 6
ORA-01110: data file 6:
&apos;/data001/orasvc01/NEWDB/ts_module_d_01.dbf&apos;
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
</txt>
</msg>

....
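Some records span 7 lines, while others span 10 lines. In this case,
I want output of the form (column[0]) (column[1]) (sum of errors), for example:
2015-07-31 ORA-1051 7
How should I do this?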

hadoop mapreduce

Source: https://stackoverflow.com/questions/31777789/how-to-mapreduce-in-this-situation-in-hadoop

1 Answer


2skhul33 #1

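Your input file is XML. If each line held a complete XML record as a single string, you could use plain MapReduce with the default line-based input directly. Your input has a different shape, though: a record spans several lines and is delimited by its start and end tags.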
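To make the start-tag/end-tag idea concrete, here is a minimal plain-Java sketch of cutting the log into records. It only illustrates the delimiting logic on an in-memory string; it is not a Hadoop RecordReader. The class name `MsgRecordSplitter` and the method `splitRecords` are hypothetical, and the delimiters `<msg ` and `</msg>` are assumptions taken from the sample in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class MsgRecordSplitter {
    // Assumed delimiters for the sample log. The start tag carries attributes,
    // so it is matched as "<msg " (with a trailing space) rather than "<msg>".
    static final String START = "<msg ";
    static final String END = "</msg>";

    /** Returns every START...END span found in the text, in order. */
    static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(START, from);
            if (start < 0) break;           // no further record
            int end = text.indexOf(END, start);
            if (end < 0) break;             // incomplete trailing record is dropped
            end += END.length();
            records.add(text.substring(start, end));
            from = end;
        }
        return records;
    }

    public static void main(String[] args) {
        // Shortened version of the sample: one 4-line and one 5-line record.
        String log = "<msg time='2015-07-30T16:37:48.408+09:00' type='NOTIFICATION'>\n"
                + "<txt>ORA-1543 signalled during: create tablespace\n</txt>\n</msg>\n\n"
                + "<msg time='2015-07-30T16:39:13.173+09:00' type='UNKNOWN'>\n"
                + "<txt>Errors in\nORA-01116: error in opening database file 6\n</txt>\n</msg>\n";
        System.out.println(splitRecords(log).size() + " records");
    }
}
```

Because the split depends only on the tags, it does not matter whether a record takes 7 lines or 10.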
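So you need a custom RecordReader and input format for MapReduce: XmlInputFormat. The good news is that one already exists and you only need to customize it. Search for "XmlInputFormat mahout" to find the actual class. An easier route, however, is to look at an example that uses this format. You can find one here. Once your mapper receives one record and you can get at its contents, the rest is straightforward and depends on which details you want to send to the output. Happy coding!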
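As a rough illustration of the "rest", the sketch below shows the per-record logic in plain Java, without any Hadoop classes: a map-like step that emits one `date<TAB>ORA-code` key per error found in a record, and a reduce-like step that sums the keys. The names `OraErrorCount`, `mapRecord`, and `countErrors` are hypothetical. It assumes the date is the first 10 characters of the `time` attribute and that every `ORA-nnnn` occurrence inside `<txt>` counts once. In a real job the same logic would live in a Mapper and a Reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OraErrorCount {
    // Date part (yyyy-MM-dd) of the time attribute on the <msg> tag.
    static final Pattern DATE = Pattern.compile("time='(\\d{4}-\\d{2}-\\d{2})T");
    // Oracle error codes such as ORA-1543 or ORA-01116.
    static final Pattern ORA = Pattern.compile("ORA-\\d+");

    /** Map-like step: one "date<TAB>code" key per ORA code in a single record. */
    static List<String> mapRecord(String record) {
        List<String> keys = new ArrayList<>();
        Matcher date = DATE.matcher(record);
        if (!date.find()) return keys;      // skip records without a timestamp
        int txt = record.indexOf("<txt>");
        if (txt < 0) return keys;           // skip records without a message body
        Matcher ora = ORA.matcher(record.substring(txt));
        while (ora.find()) {
            keys.add(date.group(1) + "\t" + ora.group());
        }
        return keys;
    }

    /** Reduce-like step: sum the occurrences of each key across all records. */
    static Map<String, Integer> countErrors(List<String> records) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String record : records) {
            for (String key : mapRecord(record)) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        records.add("<msg time='2015-07-30T16:37:48.408+09:00' type='NOTIFICATION'>\n"
                + "<txt>ORA-1543 signalled during: create tablespace\n</txt>\n</msg>");
        records.add("<msg time='2015-07-30T16:39:13.173+09:00' type='UNKNOWN'>\n"
                + "<txt>Errors in\nORA-01116: error in opening database file 6\n</txt>\n</msg>");
        // Prints one "date<TAB>code<TAB>count" line per distinct key.
        countErrors(records).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

Keying on date plus error code lets the framework's grouping do the aggregation, so the reducer only has to add up ones.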

Answered 2021-06-02
