Elasticsearch: how to parse a specific log field (the `message` string inside the log record) with Fluent Bit

cigdeys3, posted 2023-04-05 in ElasticSearch

**I am using Fluent Bit in Kubernetes together with the kubernetes filter, which parses the container logs into the right format. But I want to extract a specific field as JSON and use it in Elasticsearch. The message content is a string, not JSON, so I created a second parser for it, but it is still not working. How should I configure Fluent Bit to achieve this?**

Example log:
{"timestamp":"2023-03-24T15:32:14.293Z","sequence":8479,"loggerClassName":"org.jboss.logging.Logger","loggerName":"org.keycloak.events","level":"WARN","message":"type=LOGIN_ERROR, realmId=test, clientId=test, userId=null, ipAddress=10.10.0.1, error=user_not_found, auth_method=openid-connect, auth_type=code, redirect_uri=https://keycloak.example-domain.com/111/console/#/, code_id=5c2c1240-3374-4655-ac21-ce9ecffc7916, username=test, authSessionParentId=5c2c1240-3374-4655-ac21-ce9ecffc7916, authSessionTabId=jm0L-tr5X1C","threadName":"executor-thread-2","threadId":52,"mdc":{},"ndc":"","hostName":"keycloak-0","processName":"QuarkusEntryPoint","processId":1}
I want to extract the values from message like type, username, realmId, clientId and so on from this part of the message:
"message":"type=LOGIN_ERROR, realmId=master, clientId=security-admin-console, userId=null, ipAddress=172.18.0.6, error=user_not_found, auth_method=openid-connect, auth_type=code, redirect_uri=https://keycloak.example-domain.com/admin/master/console/#/, code_id=5c2c1240-3374-4655-ac21-ce9ecffc7916, username=test, authSessionParentId=5c2c1240-3374-4655-ac21-ce9ecffc7916, authSessionTabId=jm0L-tr5K5A"
My parser looks like this (I also checked the regex on https://rubular.com/ ):

[PARSER]
          Name                keycloak_events
          Format              regex
          Regex               type=(?<type>[^,]*), realmId=(?<realmId>[^,]*), clientId=(?<clientId>[^,]*), userId=(?<userId>[^,]*), ipAddress=(?<ipAddress>[^,]*), error=(?<error>[^,]*), auth_method=(?<authMethod>[^,]*), auth_type=(?<authType>[^,]*), redirect_uri=(?<redirectUri>[^,]*), code_id=(?<codeId>[^,]*), username=(?<username>[^,]*), authSessionParentId=(?<authSessionParentId>[^,]*), authSessionTabId=(?<authSessionTabId>[^,]*)
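Since a long regex like this is easy to get wrong, it can help to test it outside Fluent Bit first. A minimal Python sketch is below; note this is only a rough check, since Fluent Bit uses the Onigmo engine, which writes named groups as `(?<name>...)` while Python's `re` requires `(?P<name>...)`:

```python
import re

# Same pattern as the keycloak_events parser, with Python-style named groups.
PATTERN = re.compile(
    r"type=(?P<type>[^,]*), realmId=(?P<realmId>[^,]*), "
    r"clientId=(?P<clientId>[^,]*), userId=(?P<userId>[^,]*), "
    r"ipAddress=(?P<ipAddress>[^,]*), error=(?P<error>[^,]*), "
    r"auth_method=(?P<authMethod>[^,]*), auth_type=(?P<authType>[^,]*), "
    r"redirect_uri=(?P<redirectUri>[^,]*), code_id=(?P<codeId>[^,]*), "
    r"username=(?P<username>[^,]*), "
    r"authSessionParentId=(?P<authSessionParentId>[^,]*), "
    r"authSessionTabId=(?P<authSessionTabId>[^,]*)"
)

# The message string from the example log above.
sample = (
    "type=LOGIN_ERROR, realmId=test, clientId=test, userId=null, "
    "ipAddress=10.10.0.1, error=user_not_found, auth_method=openid-connect, "
    "auth_type=code, "
    "redirect_uri=https://keycloak.example-domain.com/111/console/#/, "
    "code_id=5c2c1240-3374-4655-ac21-ce9ecffc7916, username=test, "
    "authSessionParentId=5c2c1240-3374-4655-ac21-ce9ecffc7916, "
    "authSessionTabId=jm0L-tr5X1C"
)

fields = PATTERN.search(sample).groupdict()
print(fields["type"])  # LOGIN_ERROR
```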

My two filters are:

[FILTER]
        Name                kubernetes
        Match               keycloak.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     keycloak.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off
[FILTER]
        Name                parser
        Match               keycloak.*
        Parser              keycloak_events
        Preserve_Key        On
        Reserve_Data        On
        Key_Name            $log_processed['message']

After the parsing it looks like this:

{
    "stream": "stdout",
    "logtag": "F",
    "log": "{\"timestamp\":\"2023-03-28T17:15:18.447Z\",\"sequence\":8567,\"loggerClassName\":\"org.jboss.logging.Logger\",\"loggerName\":\"org.keycloak.events\",\"level\":\"WARN\",\"message\":\"type=LOGIN_ERROR, realmId=test, clientId=test, userId=null, ipAddress=10.10.0.1, error=user_not_found, auth_method=openid-connect, auth_type=code, redirect_uri=https://keycloak.example-domain.com/111/console/#/, code_id=5c2c1240-3374-4655-ac21-ce9ecffc7916, username=test, authSessionParentId=5c2c1240-3374-4655-ac21-ce9ecffc7916, authSessionTabId=jm0L-tr5X1C\",\"threadName\":\"executor-thread-98\",\"threadId\":1070,\"mdc\":{},\"ndc\":\"\",\"hostName\":\"keycloak-1\",\"processName\":\"QuarkusEntryPoint\",\"processId\":1}",
    "log_processed": {
        "timestamp": "2023-03-28T17:15:18.447Z",
        "sequence": 8567,
        "loggerClassName": "org.jboss.logging.Logger",
        "loggerName": "org.keycloak.events",
        "level": "WARN",
        "message": "type=LOGIN_ERROR, realmId=test, clientId=test, userId=null, ipAddress=10.10.0.1, error=user_not_found, auth_method=openid-connect, auth_type=code, redirect_uri=https://keycloak.example-domain.com/111/console/#/, code_id=5c2c1240-3374-4655-ac21-ce9ecffc7916, username=test, authSessionParentId=5c2c1240-3374-4655-ac21-ce9ecffc7916, authSessionTabId=jm0L-tr5X1C",
        "threadName": "executor-thread-98",
        "threadId": 1070,
        "mdc": {},
        "ndc": "",
        "hostName": "example_hostName",
        "processName": "QuarkusEntryPoint",
        "processId": 1
    },
    "kubernetes": {
        "pod_name": "example_pod_name",
        "namespace_name": "example_namespace_name",
        "pod_id": "example_pod_id",
        "host": "example_ip-addr",
        "container_name": "example_container_name",
        "docker_id": "example_docker_id",
        "container_hash": "example_docker_hash",
        "container_image": "example_container_image"
    }
}

It seems that Fluent Bit only applies the first filter. Am I using the correct Key_Name in the second (parser) filter? How can I debug this pipeline to find the problem with the second parser? I also enabled debug logging, but it does not give any useful information.
Thanks in advance!

xeufq47z1#

I was able to solve this with the nest filter plugin. The first step is to lift the message out of the log_processed section; then I parse it with the custom parser I wrote for this specific case, shown above.

[FILTER]
    Name                nest
    Match               keycloak.*
    Operation           lift
    Nested_under        log_processed
    Add_prefix          log_
    Wildcard            message
[FILTER]
    Name                parser
    Match               keycloak.*
    Key_Name            log_message
    Parser              keycloak_events
    Preserve_Key        On
    Reserve_Data        On

With that, I get the logs in the correct format.
