Logstash CSV data parsing issue

fnatzsnv posted on 2023-06-27 in Logstash

I am trying to load a CSV report into Logstash, but it is not working as expected.
My CSV file has more than 200 rows; one of them is shown below for reference.

$ cat app.csv
Sl.No,Emp_ID,Date,Emp_Name,Product_Name,Item_Details
1,1234556,30-12-2022,frank.van,SAMPLE_PRODUCT,"[Name] Frank Van Puffelen JAVA.
[Area/Pin] San Francisco, CA 
[Region/Status/Identify] Android Plaltfrom
[Case#] Jira-01234
[Problem] Messaging app not booting.
[Staring Point] Google service for the notifications
[Evaluate] Cloud Messaging.
[Verification Mode] Local Device.
[Empname] Frank Van.

Domain:Cloud_S,Android:S_OS
***** Ticket Status : https://jenkins.company.com/job/889900112 *****
"

My Logstash conf file is as follows.

input {
   file {
      path => "/home/user/logs/app.csv"
      start_position => "beginning"
      sincedb_path => "/dev/null"
      codec => multiline {
         pattern => '^"'
         negate => "true"
         what => "next"
      }
   }
}
filter {
    csv {
        separator => ","
        columns => ["Sl.No", "Emp_ID", "Date", "Emp_Name", "Product_Name", "Item_Details"]
    }

}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "java-app"
    document_type => "Emp_ID"
  }
  stdout{}
}

In the Logstash log, the fields contain the CSV column header names instead of the actual values.

logstash         |                  "Emp_Name" => "Emp_Name",
logstash         |        "Product_Name" => "Product_Name",
logstash         |               "message" => "Sl.No,Emp_ID,Date,Emp_Name,Product_Name,Item_Details\r\n1,1234556,30-12-2022,frank.van,SAMPLE_PRODUCT,\"[Name] Frank Van Puffelen JAVA.\r\n[Area/Pin] San Francisco, CA \r\n[Region/Status/Identify] Android Plaltfrom\r\n[Case#] Jira-01234\r\n[Problem] Messaging app not booting.\r\n[Staring Point] Google service for the notifications\r\n[Evaluate] Cloud Messaging.\r\n[Verification Mode] Local Device.\r\n[Empname] Frank Van.\r\n\r\nDomain:Cloud_S,Android:S_OS\r\n***** Ticket Status : https://jenkins.company.com/job/889900112 *****\r",
logstash         |              "@version" => "1",
logstash         |                  "path" => "/home/user/logs/app.csv",
logstash         |          "Date" => "Date",
logstash         |           "Item_Details" => "Item_Details",
logstash         |     "Emp_ID" => "Emp_ID",
logstash         |            "@timestamp" => 2023-06-22T05:48:22.714Z,
logstash         |                  "tags" => [
logstash         |         [0] "multiline"
logstash         |     ],
logstash         |         "host" => "828967718f28",
logstash         |         "Sl.No" => "Sl.No"
logstash         | }

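Can you tell me how to load my CSV file data into Logstash? My Item_Details column is wrapped in double quotes and spans multiple lines.
@Paulo, here is my updated Logstash conf file.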

input {
  file {
    path => "/home/user/logs/app.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => '^\d'
      negate => "false"
      what => "next"
    }
  }
}

filter {
    csv {
      separator => ","
      skip_header => true
      columns => ["Sl.No", "Emp_ID", "Date", "Emp_Name", "Product_Name", "Item_Details"]
  }
}

Logstash output

logstash         | [2023-06-24T02:46:02,959][WARN ][logstash.filters.csv     ][main][aacc2f62062158dfe25eef66ddf5744e6c49abde51f3b44a542bcdacf04017fc] Error parsing csv {:field=>"message", :source=>"\"\r", :exception=>#<CSV::MalformedCSVError: Unclosed quoted field on line 1.>}
logstash         | {
logstash         |           "tags" => [
logstash         |         [0] "multiline",
logstash         |         [1] "_csvparsefailure"
logstash         |     ],
logstash         |        "message" => "1,1234556,30-12-2022,frank.van,SAMPLE_PRODUCT,\"[Name] Frank Van Puffelen JAVA.\r\n[Area/Pin] San Francisco, CA \r",
logstash         |     "@timestamp" => 2023-06-24T02:46:02.757Z,
logstash         |           "path" => "/home/user/logs/app.csv",
logstash         |       "@version" => "1",
logstash         |           "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Evaluate] Cloud Messaging.\r",
logstash         |     "serial_number" => "[Evaluate] Cloud Messaging.",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.759Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "path" => "/home/user/logs/app.csv",
logstash         |       "@version" => "1",
logstash         |        "message" => "\r",
logstash         |           "host" => "11f690730fff",
logstash         |     "@timestamp" => 2023-06-24T02:46:02.760Z
logstash         | }
logstash         | {
logstash         |           "message" => "***** Ticket Status : https://jenkins.company.com/job/889900112 *****\r",
logstash         |     "serial_number" => "***** Ticket Status : https://jenkins.company.com/job/889900112 *****",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.760Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Staring Point] Google service for the notifications\r",
logstash         |     "serial_number" => "[Staring Point] Google service for the notifications",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.759Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Problem] Messaging app not booting.\r",
logstash         |     "serial_number" => "[Problem] Messaging app not booting.",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.758Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Verification Mode] Local Device.\r",
logstash         |     "serial_number" => "[Verification Mode] Local Device.",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.759Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Region/Status/Identify] Android Plaltfrom\r",
logstash         |     "serial_number" => "[Region/Status/Identify] Android Plaltfrom",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.758Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Case#] Jira-01234\r",
logstash         |     "serial_number" => "[Case#] Jira-01234",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.758Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |           "message" => "[Empname] Frank Van.\r",
logstash         |     "serial_number" => "[Empname] Frank Van.",
logstash         |        "@timestamp" => 2023-06-24T02:46:02.760Z,
logstash         |              "path" => "/home/user/logs/app.csv",
logstash         |          "@version" => "1",
logstash         |              "host" => "11f690730fff"
logstash         | }
logstash         | {
logstash         |               "message" => "Domain:Cloud_S,Android:S_OS\r",
logstash         |         "serial_number" => "Domain:Cloud_S",
logstash         |            "@timestamp" => 2023-06-24T02:46:02.760Z,
logstash         |                  "path" => "/home/user/logs/app.csv",
logstash         |              "@version" => "1",
logstash         |                  "host" => "11f690730fff",
logstash         |     "changelist_number" => "Android:S_OS"
logstash         | }
logstash         | {
logstash         |           "tags" => [
logstash         |         [0] "_csvparsefailure"
logstash         |     ],
logstash         |        "message" => "\"\r",
logstash         |     "@timestamp" => 2023-06-24T02:46:02.761Z,
logstash         |           "path" => "/home/user/logs/app.csv",
logstash         |       "@version" => "1",
logstash         |           "host" => "11f690730fff"
logstash         | }

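This is not the expected result: the field values are not aligned with the columns.
Each row of my CSV file should be loaded into Logstash as a single event.
It would be very helpful if you could help me with a conf file that ingests my CSV file into Logstash.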

4dc9hkyq1#

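TL;DR

It looks like on the first pass your multiline codec merged the header row and the first row of actual data into a single event, so the csv filter ended up reading the header values.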
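
To make that grouping concrete, here is the codec from the question again, annotated with how each option is applied. The comments are my reading of the multiline codec's documented behavior, not something derived from your logs:

```
codec => multiline {
  pattern => '^"'     # only a line that starts with a double quote matches
  negate => "true"    # inverted: every line that does NOT start with a quote counts as a match
  what => "next"      # a matching line is merged with the line that follows it
}
# Effect on app.csv: the header line does not start with a quote, so it is
# merged forward together with the record lines. Header and first record
# therefore end up in one event.
```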

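Solution

Either remove the first line, which contains the header, before sending the file to Logstash,
or use a different pattern that matches each entry correctly.
An idea for a pattern that could work: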

codec => multiline { 
  pattern => '^\d'
  negate => "false"
  what => "next"
}
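
If that pattern still splits one record into several events, an alternative is to invert the logic: treat every line that does not begin with a record number as a continuation of the previous line. This is a minimal sketch, not tested against your file. It assumes every record starts with a numeric Sl.No followed by a comma, and that no line inside Item_Details starts that way. The auto_flush_interval value and the gsub step (for the \r characters visible in your logged messages) are my own additions:

```
input {
  file {
    path => "/home/user/logs/app.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # Assumption: a new record always begins with digits followed by a comma.
      pattern => '^\d+,'
      negate => true
      what => "previous"
      # Flush the last buffered record when no further line arrives (value is arbitrary).
      auto_flush_interval => 2
    }
  }
}
filter {
  # Strip carriage returns left over from CRLF line endings
  # (assumption: they are not wanted in the indexed fields).
  mutate {
    gsub => ["message", "\r", ""]
  }
  csv {
    separator => ","
    skip_header => true
    columns => ["Sl.No", "Emp_ID", "Date", "Emp_Name", "Product_Name", "Item_Details"]
  }
}
```

With this grouping the header line should become its own event, which skip_header can then drop because it matches the columns list exactly.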
