Convert a text file containing colons to JSON

Posted by yk9xbfzb on 2023-08-08 in Other

Current code:

import json
filename = 'thunar-volman/debian/control'
dict1 = {}
with open(filename) as fh:
    for line in fh:
        print(line)
        command, description = line.strip().split(': ')
        print(command)
        print(description)
        dict1[command.strip()] = description.strip()

with open("test.json", "w") as out_file:
    json.dump(dict1, out_file, indent=4, sort_keys = False)

The error:

Build-Depends
debhelper-compat (= 13),
               intltool,

Traceback (most recent call last):
  File "read.py", line 7, in <module>
    command, description = line.strip().split(': ')
ValueError: not enough values to unpack (expected 2, got 1)


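The text file I want to convert to JSON is here: https://salsa.debian.org/xfce-team/goodies/thunar-volman/-/blob/debian/master/debian/control
How can I process the content so that everything after the colon following Build-Depends, including the indented lines below it, is treated as the description for the command Build-Depends?
Any help would be much appreciated, as I am new to JSON.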

kx7yvsdv #1

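Your problem is simple; I wrote working code for it in under five minutes, in one go.
You have lines that represent a mapping. A non-indented line containing a colon marks the start of a new key-value pair.
The key is to the left of the first colon, and the value can span multiple lines.
We keep a variable named key, initialized to None. We then loop over the lines; whenever a line contains a colon and its first character is not whitespace, we have found a new key-value pair.
If key is not None, we add the previous key-value pair to the dictionary. Then we remember the current key and value so that later iterations can use them.
Otherwise, if the line is not empty and is not the start of a new pair, it is a continuation of the previous value, so we append it to that value.
This handles every item correctly except the last one, which the loop never stores.
We add it after the loop.
Code: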

import json

lines = """Source: thunar-volman
Section: xfce
Priority: optional
Maintainer: Debian Xfce Maintainers <debian-xfce@lists.debian.org>
Uploaders: Yves-Alexis Perez <corsac@debian.org>
Build-Depends: debhelper-compat (= 13),
               intltool,
               libexo-2-dev,
               libgtk-3-dev,
               libgudev-1.0-dev,
               libnotify-dev,
               libxfce4ui-2-dev,
               libxfce4util-dev,
               libxfconf-0-dev,
               xfce4-dev-tools (>= 4.16)
Rules-Requires-Root: no
Standards-Version: 4.6.1
Vcs-Git: https://salsa.debian.org/xfce-team/goodies/thunar-volman.git
Vcs-Browser: https://salsa.debian.org/xfce-team/goodies/thunar-volman
Homepage: https://docs.xfce.org/xfce/thunar/thunar-volman

Package: thunar-volman
Architecture: linux-any
Depends: exo-utils, thunar, ${misc:Depends}, ${shlibs:Depends}
Description: Thunar extension for volumes management
 The Thunar Volume Manager is an extension for the Thunar file manager, which
 enables automatic management of removable drives and media.
""".splitlines()

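key = None
cache = []
dic = {}
for line in lines:
    # Test the first character BEFORE stripping; after strip() it can never be a space.
    if ':' in line and not line[0].isspace():
        if key is not None:
            # Store the previous pair: a plain string for one line, a list for several.
            dic[key] = cache[0] if len(cache) == 1 else cache
        key, value = line.split(':', 1)
        key, cache = key.strip(), [value.strip()]
    elif line.strip():
        # Non-empty line that does not start a new pair: continuation of the value.
        cache.append(line.strip())

# The loop never stores the last pair, so add it here.
if key is not None:
    dic[key] = cache[0] if len(cache) == 1 else cache
print(json.dumps(dic, indent=4, ensure_ascii=False))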
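
To reuse the loop on a real file rather than an inline string, it can be wrapped in a function that accepts any iterable of lines. This is only a sketch: the name parse_control is made up for illustration, and it keeps multi-line values as lists, as the code above does.

```python
import json


def parse_control(lines):
    """Parse 'Key: value' lines; indented lines continue the previous value.

    parse_control is a hypothetical helper name, not part of any library.
    """
    result = {}
    key, cache = None, []
    for line in lines:
        if ':' in line and not line[0].isspace():
            # A non-indented line with a colon starts a new pair.
            if key is not None:
                result[key] = cache[0] if len(cache) == 1 else cache
            key, value = line.split(':', 1)
            key, cache = key.strip(), [value.strip()]
        elif line.strip() and key is not None:
            # Non-empty continuation line of the current value.
            cache.append(line.strip())
    if key is not None:
        # Store the final pair, which the loop itself never writes.
        result[key] = cache[0] if len(cache) == 1 else cache
    return result


# A few sample lines standing in for the real file.
sample = [
    "Priority: optional\n",
    "Build-Depends: debhelper-compat (= 13),\n",
    "               intltool,\n",
    "\n",
    "Homepage: https://docs.xfce.org/xfce/thunar/thunar-volman\n",
]
parsed = parse_control(sample)
print(json.dumps(parsed, indent=4))
```

With a file on disk, pass the open file object instead of the sample list, for example inside `with open(filename) as fh:` call `parse_control(fh)`.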


neekobn8 #2

You can use re (text contains the string from your question) (regex101):

import re

out = dict(re.findall(r"^(\S+):\s*(.*?)(?=^\S*:|\Z)", text, flags=re.M | re.S))
print(out)

Prints:

{
    "Source": "thunar-volman\n",
    "Section": "xfce\n",
    "Priority": "optional\n",
    "Maintainer": "Debian Xfce Maintainers <debian-xfce@lists.debian.org>\n",
    "Uploaders": "Yves-Alexis Perez <corsac@debian.org>\n",
    "Build-Depends": "debhelper-compat (= 13),\n               intltool,\n               libexo-2-dev,\n               libgtk-3-dev,\n               libgudev-1.0-dev,\n               libnotify-dev,\n               libxfce4ui-2-dev,\n               libxfce4util-dev,\n               libxfconf-0-dev,\n               xfce4-dev-tools (>= 4.16)\n",
    "Rules-Requires-Root": "no\n",
    "Standards-Version": "4.6.1\n",
    "Vcs-Git": "https://salsa.debian.org/xfce-team/goodies/thunar-volman.git\n",
    "Vcs-Browser": "https://salsa.debian.org/xfce-team/goodies/thunar-volman\n",
    "Homepage": "https://docs.xfce.org/xfce/thunar/thunar-volman\n\n",
    "Package": "thunar-volman\n",
    "Architecture": "linux-any\n",
    "Depends": "exo-utils, thunar, ${misc:Depends}, ${shlibs:Depends}\n",
    "Description": "Thunar extension for volumes management\n The Thunar Volume Manager is an extension for the Thunar file manager, which\n enables automatic management of removable drives and media.",
}
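
The values captured by this pattern keep their trailing newlines and the indentation of continuation lines. If that is unwanted, one option is to collapse the whitespace before dumping to JSON. The following is a minimal sketch, assuming a short excerpt in place of the full file text and that joining continuation lines with single spaces is acceptable.

```python
import json
import re

# A short excerpt standing in for the full control file text.
text = """Priority: optional
Build-Depends: debhelper-compat (= 13),
               intltool,
Rules-Requires-Root: no

Package: thunar-volman
"""

pairs = re.findall(r"^(\S+):\s*(.*?)(?=^\S*:|\Z)", text, flags=re.M | re.S)
# Collapse newlines and indentation inside each value to single spaces.
out = {key: " ".join(value.split()) for key, value in pairs}
print(json.dumps(out, indent=4))
```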

axkjgtzd #3

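Your file can be read as YAML. (Strictly speaking, debian/control uses Debian's own control-file format, which only resembles YAML, so this approach may not work for every control file.)
To work with YAML files you need the ruamel.yaml library.

Install: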

pip install ruamel.yaml

Load the file and write it out as JSON:

from ruamel.yaml import YAML
import json

yaml = YAML(typ='safe')   # default, if not specified, is 'rt' (round-trip)
filename = 'thunar-volman/debian/control'
with open(filename) as fh:
    # pass the open file object to load(), not the file name string
    dict1 = yaml.load(fh)
with open("test.json", "w") as out_file:
    json.dump(dict1, out_file, indent=4, sort_keys=False)

bpsygsoo #4

Use maxsplit (see Python String split() Method):

from pathlib import Path
import json

with Path('control').open() as file:   # be sure to adjust your path
    commands, description = {}, ''
    for line in file:
        if not line.strip(): continue  # if line is blank continue loop
        if ':' in line and not line[0].isspace():
            # use 'maxsplit=1' to split only at the first occurrence of ':'
            command, description = map(str.strip, line.split(':', maxsplit=1))
        else:
            # indented continuation line: append it to the current description
            description += ' %s' % line.strip()
        commands[command] = description

# write_text() opens and closes the file itself; a Path is not a context manager
Path('test.json').write_text(json.dumps(commands, indent=4))

Output:

{
    "Source": "thunar-volman",
    "Section": "xfce",
    "Priority": "optional",
    "Maintainer": "Debian Xfce Maintainers <debian-xfce@lists.debian.org>",
    "Uploaders": "Yves-Alexis Perez <corsac@debian.org>",
    "Build-Depends": "debhelper-compat (= 13), intltool, libexo-2-dev, libgtk-3-dev, libgudev-1.0-dev, libnotify-dev, libxfce4ui-2-dev, libxfce4util-dev, libxfconf-0-dev, xfce4-dev-tools (>= 4.16)",
    "Rules-Requires-Root": "no",
    "Standards-Version": "4.6.1",
    "Vcs-Git": "https://salsa.debian.org/xfce-team/goodies/thunar-volman.git",
    "Vcs-Browser": "https://salsa.debian.org/xfce-team/goodies/thunar-volman",
    "Homepage": "https://docs.xfce.org/xfce/thunar/thunar-volman",
    "Package": "thunar-volman",
    "Architecture": "linux-any",
    "Depends": "exo-utils, thunar, ${misc:Depends}, ${shlibs:Depends}",
    "Description": "Thunar extension for volumes management The Thunar Volume Manager is an extension for the Thunar file manager, which enables automatic management of removable drives and media."
}
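
The effect of maxsplit is easiest to see on a line whose value itself contains a colon, such as a URL. A small illustration using one line from the control file:

```python
line = "Vcs-Git: https://salsa.debian.org/xfce-team/goodies/thunar-volman.git\n"

# Without maxsplit, the colon inside the URL produces a third piece,
# so unpacking into two names would fail.
all_parts = line.strip().split(':')

# maxsplit=1 splits only at the first colon, so the URL stays intact.
key, value = (part.strip() for part in line.split(':', maxsplit=1))

print(len(all_parts))
print(key)
print(value)
```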

hmmo2u0o #5

You can do it like this:

import json
filename = 'thunar-volman/debian/control'
dict1 = {}
command = ""
with open(filename) as fh:
    for line in fh:
        if not line.strip():
            continue  # skip the blank line between the two blocks
        if line[0] == " ":
            # continuation line: append it to the previous command's list
            dict1[command].append(line.strip())
        else:
            # split only at the first colon so values may contain colons
            command, description = line.strip().split(':', 1)
            command = command.strip()
            # every value is stored as a list so that append() works
            dict1[command] = [description.strip()]

with open("test.json", "w") as out_file:
    json.dump(dict1, out_file, indent=4, sort_keys=False)

So basically, if a line starts with whitespace, it is appended as a new item to the previous command.

7rfyedvj #6

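Open the file and read it line by line. Ignore blank lines. Split on the first colon and check the number of tokens, so that malformed input is reported rather than silently accepted.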

from json import dumps

CONTROL = '/Volumes/G-Drive/control'

jdata = {}

with open(CONTROL) as control:
    previous_key = None
    for line in control:
        if len(sline := line.strip()) > 0: # make sure to skip blank lines
            if line[0].isspace():
                if previous_key is not None:
                    # you may not want the newline prefix
                    jdata[previous_key] += '\n' + sline
                else:
                    raise ValueError('Line has leading whitespace but no previous keyword')
            elif len(tokens := sline.split(':', 1)) == 2: # note second argument to split()
                # looks like a normal keyword and value
                key, value = tokens
                jdata[key] = value.lstrip()
                previous_key = key
            else:
                raise ValueError(f'Cannot understand "{line.rstrip()}"')

print(dumps(jdata, indent=2))

Output:

{
  "Source": "thunar-volman",
  "Section": "xfce",
  "Priority": "optional",
  "Maintainer": "Debian Xfce Maintainers <debian-xfce@lists.debian.org>",
  "Uploaders": "Yves-Alexis Perez <corsac@debian.org>",
  "Build-Depends": "debhelper-compat (= 13),\nintltool,\nlibexo-2-dev,\nlibgtk-3-dev,\nlibgudev-1.0-dev,\nlibnotify-dev,\nlibxfce4ui-2-dev,\nlibxfce4util-dev,\nlibxfconf-0-dev,\nxfce4-dev-tools (>= 4.16)",
  "Rules-Requires-Root": "no",
  "Standards-Version": "4.6.1",
  "Vcs-Git": "https://salsa.debian.org/xfce-team/goodies/thunar-volman.git",
  "Vcs-Browser": "https://salsa.debian.org/xfce-team/goodies/thunar-volman",
  "Homepage": "https://docs.xfce.org/xfce/thunar/thunar-volman",
  "Package": "thunar-volman",
  "Architecture": "linux-any",
  "Depends": "exo-utils, thunar, ${misc:Depends}, ${shlibs:Depends}",
  "Description": "Thunar extension for volumes management\nThe Thunar Volume Manager is an extension for the Thunar file manager, which\nenables automatic management of removable drives and media."
}
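
Note that the control file contains two blocks separated by a blank line (one starting with Source, one with Package), and the output above merges them into a single object. If the blocks should stay separate, the same logic can emit a list with one dict per block. This is a hedged sketch only: parse_stanzas is a hypothetical helper name, and it assumes that a blank line always ends a block.

```python
import json


def parse_stanzas(lines):
    """Return one dict per blank-line-separated block (hypothetical helper)."""
    stanzas, current, previous_key = [], {}, None
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # A blank line closes the current block, if it has any fields.
            if current:
                stanzas.append(current)
            current, previous_key = {}, None
        elif line[0].isspace():
            # Indented line: continuation of the previous field's value.
            if previous_key is None:
                raise ValueError('Line has leading whitespace but no previous keyword')
            current[previous_key] += '\n' + stripped
        else:
            # partition() splits at the first colon only.
            key, sep, value = stripped.partition(':')
            if not sep:
                raise ValueError(f'Cannot understand "{stripped}"')
            current[key] = value.lstrip()
            previous_key = key
    if current:
        stanzas.append(current)
    return stanzas


# A shortened sample standing in for the real file.
sample = """Source: thunar-volman
Build-Depends: debhelper-compat (= 13),
               intltool

Package: thunar-volman
Architecture: linux-any
""".splitlines()

stanzas = parse_stanzas(sample)
print(json.dumps(stanzas, indent=2))
```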
