How do I use the GeoLite database in Hadoop MapReduce?

Posted by xxe27gdn on 2021-06-02 in Hadoop


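I am trying to write a MapReduce program that uses the GeoLite database to resolve the location of IP addresses. I don't know how to pass the database file to the mappers, or which dependencies to use.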

Java hadoop

Source: https://stackoverflow.com/questions/31760893/how-to-use-geolite-database-in-mapreduce-hadoop

1 answer


dz6r00yl1#

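One way to use the GeoLite database in a Hadoop MapReduce job is to pass the database as a cache file:
DistributedCache.addCacheFile(inputPath.toUri(), job.getConfiguration());
The cache file makes the .mmdb file available to every mapper.
The dependencies I used for the GeoLite database are: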

        <dependency>
            <groupId>com.maxmind.geoip2</groupId>
            <artifactId>geoip2</artifactId>
            <version>2.3.0</version>
        </dependency>

        <dependency>
            <groupId>com.maxmind.db</groupId>
            <artifactId>maxmind-db</artifactId>
            <version>1.0.0</version>
        </dependency>

You can then override setup and open the cached file in the mapper, like this:

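// Fields of the Mapper class
private Path[] cachefiles;
private DatabaseReader reader;

@Override
public void setup(Context context) throws IOException {
  Configuration conf = context.getConfiguration();

  // Local paths of the files added with DistributedCache.addCacheFile
  cachefiles = DistributedCache.getLocalCacheFiles(conf);

  // The .mmdb file is assumed to be the first cache file
  File database = new File(cachefiles[0].toString());

  // Build the reader once per task and reuse it in every map() call.
  // An IOException is allowed to propagate so the task fails here
  // instead of hitting a null reader later in map().
  reader = new DatabaseReader.Builder(database).build();
}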
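On Hadoop 2 and later, DistributedCache is deprecated in favor of Job.addCacheFile(URI) in the driver and context.getCacheFiles() in the mapper. As far as I understand that API, each cache file is symlinked into the task's working directory under the URI fragment if one is given, otherwise under the file name. The following is only a sketch of how the local name could be derived under that assumption; CacheFileNames and localName are hypothetical names, not part of Hadoop or of the original answer.

```java
import java.net.URI;

// Hypothetical helper (not part of Hadoop or the original answer).
class CacheFileNames {

    // Returns the name a cache file is assumed to have in the task's
    // working directory: the URI fragment if present, else the last
    // path segment.
    static String localName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = cacheUri.getPath();
        if (path == null) {
            throw new IllegalArgumentException("URI has no path: " + cacheUri);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Example URIs are made up for illustration.
        System.out.println(localName(URI.create("hdfs://namenode:8020/data/GeoLite2-City.mmdb#geo.mmdb")));
        System.out.println(localName(URI.create("/data/GeoLite2-City.mmdb")));
    }
}
```

In setup, this could be used as new File(CacheFileNames.localName(context.getCacheFiles()[0])) before building the DatabaseReader, assuming the .mmdb file is the first cache file.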

I then use the reader in the map function like this:

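public void map(Object key, Text line, Context context) throws IOException,
      InterruptedException {

    // Assumes each input line holds a single IP address;
    // adjust the parsing to your own record format.
    InetAddress ipAddress = InetAddress.getByName(line.toString().trim());
    CityResponse response;
    try {
      response = reader.city(ipAddress);
    } catch (GeoIp2Exception ex) {
      // For example, the address is not in the database
      ex.printStackTrace();
      return;
    }

    Country country = response.getCountry();
    String countryName = country.getName(); // e.g. "United States"; getIsoCode() gives "US"

    if (countryName == null) {
      return;
    }

    // ... the rest of the method is not shown in the original answer
}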
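The map function needs an InetAddress for each record, but how the address is pulled out of the input line depends on the input format. As a minimal sketch, assuming the input lines are log records that contain an IPv4 address somewhere in the text, a helper like the one below could do the extraction. IpExtractor and firstIpv4 are hypothetical names, not part of the original answer or of the GeoIP2 library.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the original answer).
class IpExtractor {

    // Four dot-separated groups of 1-3 digits, not adjacent to other
    // digits. The 0-255 range is checked separately.
    private static final Pattern IPV4 = Pattern.compile(
        "(?<!\\d)(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})(?!\\d)");

    // Returns the first valid IPv4 address found in the line, if any.
    static Optional<InetAddress> firstIpv4(String line) {
        Matcher m = IPV4.matcher(line);
        while (m.find()) {
            if (!octetsInRange(m)) {
                continue; // e.g. 999.1.1.1, try the next candidate
            }
            try {
                // For a literal IP address, getByName only parses the
                // text; it does not perform a DNS lookup.
                return Optional.of(InetAddress.getByName(m.group()));
            } catch (UnknownHostException e) {
                // Not expected for a validated literal; keep searching.
            }
        }
        return Optional.empty();
    }

    private static boolean octetsInRange(Matcher m) {
        for (int i = 1; i <= 4; i++) {
            if (Integer.parseInt(m.group(i)) > 255) {
                return false;
            }
        }
        return true;
    }
}
```

In map, the result could be passed to reader.city(...) when present, and the record skipped otherwise. Validating the octets first keeps a malformed token from being treated as a host name, which would trigger a DNS lookup inside the mapper.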

You can see a working example here.

