Converting a txt file to JSON using Python?

jgwigjjp  posted on 2022-12-14  in Python

I have a log file in the following format:

Nov 28 06:26:45 server-01 dhcpd: DHCPDISCOVER from cc:d3:e2:7a:af:40 via 10.39.192.1 
Nov 28 06:26:45 server-01 dhcpd: DHCPOFFER on 10.39.255.253 to cc:d3:e2:7a:af:40 via 10.39.192.1

The next step is to convert the text data to JSON using Python. So far I have the Python script below, which currently creates the JSON file:

# Python program to convert text
# file to JSON

import json

# the file to be converted
filename = 'Logs.txt'

# resultant dictionary
dict1 = {}

# fields in the sample file
fields =['timestamp', 'Server', 'Service', 'Message']

with open(filename) as fh:
    # count variable for employee id creation
    l = 1

    for line in fh:
        # reading line by line from the text file
        description = list( line.strip().split(None, 4))

        # for output see below
        print(description)

        # for automatic creation of id for each employee
        sno ='emp'+str(l)

        # loop variable
        i = 0
        # intermediate dictionary
        dict2 = {}
        while i<len(fields):

                # creating dictionary for each employee
                dict2[fields[i]]= description[i]
                i = i + 1

        # appending the record of each employee to
        # the main dictionary
        dict1[sno]= dict2
        l = l + 1

# creating json file
out_file = open("test5.json", "w")
json.dump(dict1, out_file, indent = 4)
out_file.close()

It gives the following output:

{
 "emp1": { "timestamp": "Nov", "Server": "28", "Service": "06:26:45", "Message": "server-01" },
 "emp2": { "timestamp": "Nov", "Server": "28", "Service": "06:26:45", "Message": "server-01" }
}

But I need output like this:

{
"timestamp":"Nov 28 06:26:26", 
"Server":"server-01", 
"Service":"dhcpd",
"Message":"DHCPOFFER on 10.45.45.31 to cc:d3:e2:7a:b9:6b via 10.45.0.1",
}


I don't know why not all of the data ends up in the output. Can anyone help me?

p1iqtdky #1

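The problem with your code is that you use .split(None, 4), which allows at most 4 splits of the input string and therefore returns at most five elements. Because the date itself contains spaces, the result is (for example, for the first line of the input):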

['Nov',         # timestamp
 '28',          # Server
 '06:26:45',    # Service
 'server-01',   # Message
 'dhcpd: DHCPDISCOVER from cc:d3:e2:7a:af:40 via 10.39.192.1']

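You even printed this out, so I'm surprised you didn't notice that something was off.
Now the first element of the list is assigned to the key 'timestamp', the second to the key 'Server', and so on. The fifth element is never used, because the loop only runs over the four entries in fields. That gives you a dict like this: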

{ "timestamp": "Nov", "Server": "28", "Service": "06:26:45", "Message": "server-01" }

Instead, you want to split at most five times and treat the first three elements of the result as the timestamp.

# Don't need that extra list(), since .split() already returns a list
description = line.strip().split(None, 5) 

# Join the first three elements,
joined_timestamp = " ".join(description[:3])

# and replace them in the list
# Setting a slice of a list: See https://stackoverflow.com/q/10623302/843953
description[:3] = [joined_timestamp]

Your description then looks like this:

['Nov 28 06:26:45',
 'server-01',
 'dhcpd:',
 'DHCPDISCOVER from cc:d3:e2:7a:af:40 via 10.39.192.1']

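The elements of fields now line up with the values in description.
Note that you can replace the entire while i < len(fields)... loop with dict2 = dict(zip(fields, description))
Side note: you may want to clean up the other elements of description as well, e.g. description[2] = description[2].rstrip(":") to remove the trailing colon from 'dhcpd:'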
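
Putting the pieces above together, here is a minimal sketch of how the whole conversion might look. The names FIELDS, parse_line and parse_lines are made up for illustration, and skipping lines with fewer than six whitespace-separated parts is an assumption of mine, not something the question specifies. It also collects the records in a list rather than under 'emp1', 'emp2', ... keys, which seems closer to the output you asked for.

```python
import json

# Same field names as in the question
FIELDS = ["timestamp", "Server", "Service", "Message"]


def parse_line(line):
    """Turn one log line into a dict keyed by FIELDS (hypothetical helper)."""
    # At most 5 splits -> up to 6 parts:
    # month, day, time, server, service, message
    parts = line.strip().split(None, 5)
    if len(parts) < 6:
        # Assumption: lines that don't match the expected layout are skipped
        return None
    # Collapse month, day and time into a single timestamp element
    parts[:3] = [" ".join(parts[:3])]
    # Drop the trailing colon from the service name: 'dhcpd:' -> 'dhcpd'
    parts[2] = parts[2].rstrip(":")
    return dict(zip(FIELDS, parts))


def parse_lines(lines):
    """Parse an iterable of lines (a list or an open file) into a list of dicts."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is not None:
            records.append(record)
    return records


# The two sample lines from the question
sample = [
    "Nov 28 06:26:45 server-01 dhcpd: DHCPDISCOVER from cc:d3:e2:7a:af:40 via 10.39.192.1",
    "Nov 28 06:26:45 server-01 dhcpd: DHCPOFFER on 10.39.255.253 to cc:d3:e2:7a:af:40 via 10.39.192.1",
]

records = parse_lines(sample)
print(json.dumps(records, indent=4))
```

To work on the real file, you could pass the open file handle instead of the sample list, e.g. records = parse_lines(fh) inside with open('Logs.txt') as fh:, and then write the result with json.dump(records, out_file, indent=4) as in your original script.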
