Logstash: with an Elasticsearch index as input and a MySQL database as output, how can I set the document ID to avoid duplicates?

bbuxkriu, posted on 2023-03-21 in Logstash

In my scenario, I use an Elasticsearch index as the input:

input {
  elasticsearch {
    hosts => "https://***.***.***.***:9200"
    index => "****"
    user => "***"
    password => "***"
    query => '{ "query": { "query_string": { "query": "*" } } }'
    size => 500
    scroll => "5m"
    docinfo => true
    ssl => true
    ca_file => "/etc/logstash/newfile.crt.pem"
    codec => "json"
  }
}
and a MySQL database as the output:

output {
  if "PRV_API_REQUEST" in [message] {
    jdbc {
      driver_class => "com.mysql.jdbc.Driver"
      connection_string => "jdbc:mysql://***.***.***.***/indexname?user=test&password=****"
      enable_event_as_json_keyword => true
      statement => [
        "INSERT INTO indexname (logLevel, timestamp, requestURL, date, response_code) VALUES (?, ?, ?, ?, ?)",
        "logLevel",
        "timestamp",
        "requestURL",
        "date",
        "[response][code]"
      ]
    }
  }
}

My question is: how do I avoid duplicates, and what do I need to add to my configuration to set this up?

cvxl0en2 #1

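Do you want to insert the Elasticsearch document_id into the table? If so, you could add a condition to the SQL statement that checks whether the document_id already exists, like this: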

INSERT INTO TABLE (document_id, field1, field2, ...)
SELECT v_document_id, v_field1, v_field2, ...
FROM DUAL
WHERE NOT EXISTS (SELECT 1 FROM TABLE WHERE document_id = v_document_id)
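
Another option is to let MySQL enforce uniqueness itself. The following is only a sketch under several assumptions, not a tested configuration: it assumes the table has a document_id column (a hypothetical name) with a UNIQUE index on it, and that the elasticsearch input, with docinfo => true, exposes the document ID as [@metadata][_id]. Newer versions of the input plugin with ECS compatibility enabled may store it under [@metadata][input][elasticsearch][_id] instead, so check which path applies to your version.

```
filter {
  # Copy the Elasticsearch document ID out of @metadata into a regular field
  # so that the jdbc output can reference it by name.
  # Assumption: the ID lives at [@metadata][_id] (depends on plugin version).
  mutate {
    add_field => { "document_id" => "%{[@metadata][_id]}" }
  }
}

output {
  if "PRV_API_REQUEST" in [message] {
    jdbc {
      driver_class => "com.mysql.jdbc.Driver"
      connection_string => "jdbc:mysql://***.***.***.***/indexname?user=test&password=****"
      # Assumption: document_id is a hypothetical column with a unique index, e.g.
      #   ALTER TABLE indexname ADD UNIQUE KEY uq_document_id (document_id);
      # INSERT IGNORE then skips rows whose document_id is already present.
      statement => [
        "INSERT IGNORE INTO indexname (document_id, logLevel, timestamp, requestURL, date, response_code) VALUES (?, ?, ?, ?, ?, ?)",
        "document_id",
        "logLevel",
        "timestamp",
        "requestURL",
        "date",
        "[response][code]"
      ]
    }
  }
}
```

With the unique index in place, re-running the pipeline over the same index should not create duplicate rows. Note that INSERT IGNORE also downgrades other insert errors to warnings; if you would rather refresh existing rows than skip them, MySQL's INSERT ... ON DUPLICATE KEY UPDATE is an alternative.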
