Input data to AWS Elasticsearch using Glue

Posted by ar7v8xwq on 2021-05-27 in Spark


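I am looking for a way to insert data into AWS Elasticsearch using AWS Glue with Python or PySpark. I have looked at the boto3 SDK for Elasticsearch, but could not find any function that inserts data into Elasticsearch. Can anyone help me find a solution? Any useful links or code?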

apache-spark pyspark aws-glue amazon-web-services aws-elasticsearch

Source: https://stackoverflow.com/questions/62829791/input-data-to-aws-elastic-search-using-glue

1 Answer


noj0wjuj1#

For AWS Glue, you need to add an extra jar to the job.
Download the jar from https://repo1.maven.org/maven2/org/elasticsearch/elasticsearch-hadoop/7.8.0/elasticsearch-hadoop-7.8.0.jar
Save the jar on S3 and pass it to the Glue job.
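
One way to pass the jar is Glue's `--extra-jars` special job parameter, set in the job's default arguments. The sketch below builds the arguments for boto3's Glue `create_job` call. The job name, role ARN, bucket paths, and Glue version are hypothetical placeholders, not values from the answer, so adjust them to your account.

```python
def build_glue_job_args(job_name, role_arn, script_s3_path, jar_s3_path):
    """Build keyword arguments for the boto3 Glue client's create_job call."""
    return {
        "Name": job_name,
        "Role": role_arn,
        # Assumed version: pick one whose Spark/Scala build matches the jar.
        "GlueVersion": "1.0",
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Glue special parameter: S3 path(s) of extra jars for the job classpath.
            "--extra-jars": jar_s3_path,
        },
    }


def create_job(job_args):
    """Create the Glue job; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builder above runs without boto3

    return boto3.client("glue").create_job(**job_args)


# Hypothetical values, for illustration only.
job_args = build_glue_job_args(
    job_name="load-to-es",
    role_arn="arn:aws:iam::123456789012:role/GlueJobRole",
    script_s3_path="s3://my-bucket/scripts/load_to_es.py",
    jar_s3_path="s3://my-bucket/jars/elasticsearch-hadoop-7.8.0.jar",
)
```

Calling `create_job(job_args)` would register the job. The same S3 path can also be entered in the Glue console as the job's dependent jars path.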
Now, when saving the DataFrame, use the following:

# host and port are the Elasticsearch endpoint and port, defined elsewhere
df.write.format("org.elasticsearch.spark.sql").\
         option("es.resource", "index/document").\
         option("es.nodes", host).\
         option("es.port", port).\
         save()

If you are using AWS-managed Elasticsearch, try setting this option to true:

option("es.nodes.wan.only", "true")
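
Putting the options above together, a minimal sketch of a helper for a Glue PySpark script might look like this. The endpoint, index name, `use_ssl` flag (which sets the connector's `es.net.ssl` option), and the `append` save mode are assumptions added for illustration and are not part of the answer.

```python
def es_write_options(host, port, resource, managed=True, use_ssl=False):
    """Return elasticsearch-hadoop connector options as a dict of strings."""
    options = {
        "es.nodes": host,
        "es.port": str(port),
        "es.resource": resource,
    }
    if managed:
        # Talk only to the declared endpoint and skip node discovery.
        options["es.nodes.wan.only"] = "true"
    if use_ssl:
        # Assumption: the endpoint is served over HTTPS.
        options["es.net.ssl"] = "true"
    return options


def write_to_es(df, options, mode="append"):
    """Write a Spark DataFrame to Elasticsearch through the connector jar."""
    (df.write
       .format("org.elasticsearch.spark.sql")
       .options(**options)
       .mode(mode)
       .save())


# Hypothetical endpoint and index, for illustration only.
opts = es_write_options(
    host="my-domain.us-east-1.es.amazonaws.com",
    port=443,
    resource="index/document",
    use_ssl=True,
)
```

Inside the job, `write_to_es(df, opts)` performs the write; it only works when the connector jar is on the job's classpath.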

For more properties, see https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html
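Note: this elasticsearch-hadoop 7.8.0 jar is built against Scala 2.11, so it only works with Spark builds that use Scala 2.11 (Spark 2.3 and the default Spark 2.4 builds). Spark 3.0 is built on Scala 2.12 and needs a connector built for that Scala version.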

Answered 2021-05-27
