Using a Python format string in reverse for parsing

Posted by z8dt9xmd on 2023-02-11 in Python

I have been using the following Python code to format an integer part ID as a formatted part number string:

pn = 'PN-{:0>9}'.format(id)

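I was wondering if there is a way to use that same format string ('PN-{:0>9}') in reverse to extract the integer ID from the formatted part number. If that can't be done, is there a way to use a single format string (or regex?) to both create and parse?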

b09cbbtk #1

The parse module "is the opposite of format()".
Example usage:

>>> import parse
>>> format_string = 'PN-{:0>9}'
>>> id = 123
>>> pn = format_string.format(id)
>>> pn
'PN-000000123'
>>> parsed = parse.parse(format_string, pn)
>>> parsed
<Result ('123',) {}>
>>> parsed[0]
'123'

ubby3x7f #2

You might find simulating scanf interesting (the Python re documentation has a section on it).
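
As a rough illustration of that idea, the sketch below pairs the question's format string with a hand-written regular expression, much as one would translate a scanf pattern into a regex. The names PN_FORMAT, PN_PATTERN, format_pn and parse_pn are made up for this example, and the pattern assumes the ID is a non-negative integer padded to at least nine digits.

```python
import re

# Hypothetical names for illustration; only the format string comes from the question.
PN_FORMAT = 'PN-{:0>9}'
# Hand-written counterpart of the format string: literal prefix, then 9 or more digits.
PN_PATTERN = re.compile(r'PN-(\d{9,})')


def format_pn(part_id):
    """Format an integer part ID as a part number string."""
    return PN_FORMAT.format(part_id)


def parse_pn(pn):
    """Extract the integer part ID from a part number string."""
    m = PN_PATTERN.fullmatch(pn)
    if m is None:
        raise ValueError('not a valid part number: {!r}'.format(pn))
    # int() discards the zero padding.
    return int(m.group(1))


print(format_pn(123))            # PN-000000123
print(parse_pn('PN-000000123'))  # 123
```

The format string and the pattern still have to be kept in sync by hand, so this does not give a single source of truth; it only shows what the regex side of the pair could look like.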

ctzwtxfj #3

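If you don't want to use the parse module, here is a solution. It converts the format string into a regular expression with named groups. It makes a few assumptions (described in the docstring) that were fine in my case but may not be in yours.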

import re


def match_format_string(format_str, s):
    """Match s against the given format string, return dict of matches.

    We assume all of the arguments in format string are named keyword arguments (i.e. no {} or
    {:0.2f}). We also assume that all chars are allowed in each keyword argument, so separators
    need to be present which aren't present in the keyword arguments (i.e. '{one}{two}' won't work
    reliably as a format string but '{one}-{two}' will if the hyphen isn't used in {one} or {two}).

    We raise if the format string does not match s.

    Example:
    fs = '{test}-{flight}-{go}'
    s = fs.format(test='first', flight='second', go='third')
    match_format_string(fs, s) -> {'test': 'first', 'flight': 'second', 'go': 'third'}
    """

    # First split on any keyword arguments, note that the names of keyword arguments will be in the
    # 1st, 3rd, ... positions in this list
    tokens = re.split(r'\{(.*?)\}', format_str)
    keywords = tokens[1::2]

    # Now replace keyword arguments with named groups matching them. We also escape between keyword
    # arguments so we support meta-characters there. Re-join tokens to form our regexp pattern
    tokens[1::2] = map(u'(?P<{}>.*)'.format, keywords)
    tokens[0::2] = map(re.escape, tokens[0::2])
    pattern = ''.join(tokens)

    # Use our pattern to match the given string, raise if it doesn't match
    matches = re.match(pattern, s)
    if not matches:
        raise Exception("Format string did not match")

    # Return a dict with all of our keywords and their values
    return {x: matches.group(x) for x in keywords}
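
The function above only handles named fields, so it cannot be applied to the question's 'PN-{:0>9}' as is. A possible variation, sketched below, tokenizes the format string with the standard library's string.Formatter().parse() instead of re.split, which also copes with positional fields and format specs. This is not the answer author's code: the names format_to_regex and parse_with_format, and the '_0', '_1', ... group names for positional fields, are invented for this example. It assumes simple field names, no repeated fields, and it ignores the format spec when matching.

```python
import re
import string


def format_to_regex(format_str):
    """Compile a regex in which every replacement field becomes a named group."""
    parts = []
    auto_index = 0
    for literal, field, _spec, _conv in string.Formatter().parse(format_str):
        # Literal text between fields is matched verbatim.
        parts.append(re.escape(literal))
        if field is None:
            continue  # trailing literal text with no field after it
        if field == '':
            # Auto-numbered field such as {} or {:0>9}.
            name = '_{}'.format(auto_index)
            auto_index += 1
        elif field.isdigit():
            # Explicitly numbered field such as {0}.
            name = '_{}'.format(field)
        else:
            name = field
        parts.append('(?P<{}>.+?)'.format(name))
    return re.compile(''.join(parts))


def parse_with_format(format_str, s):
    """Return a dict of field values, or raise ValueError if s does not match."""
    m = format_to_regex(format_str).fullmatch(s)
    if m is None:
        raise ValueError('format string did not match')
    return m.groupdict()


print(parse_with_format('PN-{:0>9}', 'PN-000000123'))  # {'_0': '000000123'}
```

The values come back as strings with their padding intact, so the caller still has to apply int() to recover the ID.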

wfsdck30 #4

How about this:

id = int(pn.split('-')[1])

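This splits the part number at the dash, takes the second piece, and converts it to an integer.
Also, I kept id as the variable name so the connection to your question is clear. It would be a good idea to rename that variable so that it doesn't shadow the built-in function.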
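
For completeness, here is a small round trip using the same split, with the variable renamed to part_id so the built-in id() stays available. The names PN_FORMAT and part_id_from_pn are chosen for this sketch only, and it assumes the part number contains exactly one dash (so a negative ID would not survive the trip).

```python
PN_FORMAT = 'PN-{:0>9}'  # format string from the question


def part_id_from_pn(pn):
    """Split at the dash, take the second piece, and let int() drop the zero padding."""
    return int(pn.split('-')[1])


pn = PN_FORMAT.format(123)
part_id = part_id_from_pn(pn)
print(pn)       # PN-000000123
print(part_id)  # 123
```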

vbkedwbf #5

Use lucidity:

import lucidity

template = lucidity.Template('model', '/jobs/{job}/assets/{asset_name}/model/{lod}/{asset_name}_{lod}_v{version}.{filetype}')

path = '/jobs/monty/assets/circus/model/high/circus_high_v001.abc'
data = template.parse(path)
print(data)

# Output 
#   {'job': 'monty', 
#    'asset_name': 'circus',
#    'lod': 'high', 
#    'version': '001', 
#    'filetype': 'abc'}
