How to configure Apache Flume to fetch data from Twitter for a specific period?

Posted by 9rbhqvlz on 2021-06-03 in Hadoop


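I have a Hadoop cluster and use Apache Flume to integrate data from Twitter into HDFS. By default it fetches data in chronological order, with the latest tweets fetched first. I now have a use case that requires fetching data from Twitter for a specific period, for example February 2013. Please let me know whether there is any configuration or property, in Flume or on the Twitter side, that needs to be set.
Thanks in advance.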

hadoop flume twitter data-integration

Source: https://stackoverflow.com/questions/18395989/how-to-configure-apache-flume-to-fetch-data-from-twitter-for-specific-period

1 Answer


5lhxktic1#

You may need to use a custom source for Flume.
http://blog.cloudera.com/blog/2012/10/analyzing-twitter-data-with-hadoop-part-2-gathering-data-with-flume/
The TwitterSource described in the link above will help you fetch Twitter data based on keywords.
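For reference, here is a rough sketch of what an agent configuration for that TwitterSource might look like. The agent and component names (TwitterAgent, Twitter, MemChannel, HDFS), the keyword list, and the HDFS path are placeholders chosen for illustration, and the credential values have to come from your own Twitter application. As far as I know, this source only exposes the credentials and a keywords property, so a date range such as February 2013 is not something you can set here; that filtering would have to live in the custom source itself.

```
# Hypothetical agent layout: one source, one channel, one sink
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

# Custom source from the Cloudera blog post; its jar must be on Flume's classpath
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
# Placeholders: fill in the credentials of your own Twitter application
TwitterAgent.sources.Twitter.consumerKey = <your consumer key>
TwitterAgent.sources.Twitter.consumerSecret = <your consumer secret>
TwitterAgent.sources.Twitter.accessToken = <your access token>
TwitterAgent.sources.Twitter.accessTokenSecret = <your access token secret>
# Example keywords only; tweets are filtered by keyword, not by date
TwitterAgent.sources.Twitter.keywords = hadoop, flume

# In-memory channel buffering events between source and sink
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000

# HDFS sink; the namenode host and directory are assumptions
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://namenode:8020/user/flume/tweets/%Y/%m/%d/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
```

Note that the date escapes in hdfs.path only partition the output by event timestamp, assuming the source sets a timestamp header on each event; they do not select which tweets are fetched.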

2021-06-03
