regex: How to query for a particular string under a list of strings and retrieve the value

Posted by rks48beu on 2023-05-23 in Other


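Given a data file, I need to query for a particular value under each item of a given list and extract the value associated with it.
Suppose my data file looks like this: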

Surface name: wing

Total CL    (   86%):    0.994313 | Pressure (  100%):    0.994348 | Friction (    0%):   -0.000035 | Momentum (    0%):    0.000000

Surface name: body

Total CL    (    1%):    0.018554 | Pressure (   99%):    0.018535 | Friction (    0%):    0.000019 | Momentum (    0%):    0.000000

I need to query for the value of Total CL under each surface name and extract it. The final output should be:

Surface name  Total CL
wing           0.9943
body           0.0185

My novice attempt was to use a regex to query for the Total CL value, as follows:

import os
import re
import shutil

SurfaceList=[wing,body]
CL=[]

# I need to query for Total CL for each of the elements of the list

regexp1=re.compile(r'Total CL: .*?([0-9.-]+)')
for surface in SurfaceList:
   with open(file) as f:
        for line in f:
            match1 = regexp1.match(line)
            if (match1):
                CL.append(match2.group(1))

However, this only finds the first occurrence and then stops, so I cannot move on to the other elements of the list.

regex

Source: https://stackoverflow.com/questions/76312221/how-to-query-for-a-particular-string-under-a-list-of-string-and-retrieve-the-val

3 Answers


klsxnrf1 #1

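Your code has several errors:

What are wing and body? Have you defined these variables somewhere else?
What is match2? Did you mean match1?
re.match is not the function you want, because it only matches at the very beginning of the string. re.search, which scans the whole string, is the right choice.
Your regex is wrong! Why put a colon after Total CL when there is none in the file? Also, you are searching for the first number after it, but the first number is the percentage! You should use something like ^Total CL.*?([0-9]+\.[0-9]+\b(?!%)), where \b is a word boundary and (?!%) means "not followed by %".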
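
To make the last two points concrete, here is a small sketch run against one line of the sample data. The variable names (original, suggested, number) are illustrative only and do not come from the question.

```python
import re

line = "Total CL    (   86%):    0.994313 | Pressure (  100%):    0.994348"

# The asker's pattern expects a colon right after "Total CL", which the file lacks.
original = re.compile(r'Total CL: .*?([0-9.-]+)')
# The pattern suggested above: first decimal number that is not followed by '%'.
suggested = re.compile(r'^Total CL.*?([0-9]+\.[0-9]+\b(?!%))')

original_hit = original.search(line)      # no match at all
suggested_hit = suggested.search(line)
value = suggested_hit.group(1)            # the Total CL value as a string

# re.match only tries position 0; re.search scans the whole string.
number = re.compile(r'[0-9]+\.[0-9]+')
match_at_start = number.match(line)       # None, because the line starts with "Total"
found_anywhere = number.search(line)      # finds the first decimal number

print(original_hit, value, match_at_start, found_anywhere.group(0))
```

Note that the suggested pattern has no provision for a leading minus sign, so a negative Total CL would be captured without its sign; adding an optional -? before the digits is one way to handle that, assuming negative values can occur in your files.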

I believe this is what you need:

import re

file = "my_file.txt"
SurfaceList = ["wing", "body"]
CL = []

# Captures the surface name from a "Surface name: ..." line.
regexp1 = re.compile(r'^Surface name:[ \t]+(\w+)')
# Captures the first decimal number on a "Total CL" line that is not a percentage.
regexp2 = re.compile(r'^Total CL.*?([0-9]+\.[0-9]+\b(?!%))')
with open(file) as f:
    surface = None  # most recently seen surface name
    for line in f:
        src1 = regexp1.search(line)
        if src1:
            surface = src1.group(1)
        elif surface in SurfaceList:
            src2 = regexp2.search(line)
            if src2:
                CL.append((surface, src2.group(1)))
print(CL)
# [('wing', '0.994313'), ('body', '0.018554')]

Alternatively, if you can read the whole file with a single read call, you can use re.findall:

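import re

file = "my_file.txt"

# (?m) makes ^ match at the start of every line, so one pattern can span
# the "Surface name" line and the "Total CL" line that follows it.
# Note: this variant returns every surface in the file, without filtering by a list.
regexp1 = re.compile(r'(?m)^Surface name:[ \t]+(\w+)\s+^Total CL.*?([0-9]+\.[0-9]+\b(?!%))')
with open(file) as f:
    whole_text = f.read()
    CL = regexp1.findall(whole_text)
print(CL)
# [('wing', '0.994313'), ('body', '0.018554')]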
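
Both variants above produce a list of (name, value) string pairs, while the question asks for a two-column table. The following is a minimal sketch of that last step, assuming the values should be cut off after four decimals (the question's sample shows 0.0185 for 0.018554, whereas rounding would give 0.0186). The function name format_table and the inline sample text are illustrative, not part of the original answer.

```python
import re
from decimal import Decimal, ROUND_DOWN

# Sample text copied from the question; in practice this would come from f.read().
text = """Surface name: wing

Total CL    (   86%):    0.994313 | Pressure (  100%):    0.994348 | Friction (    0%):   -0.000035 | Momentum (    0%):    0.000000

Surface name: body

Total CL    (    1%):    0.018554 | Pressure (   99%):    0.018535 | Friction (    0%):    0.000019 | Momentum (    0%):    0.000000
"""

pattern = re.compile(r'(?m)^Surface name:[ \t]+(\w+)\s+^Total CL.*?([0-9]+\.[0-9]+\b(?!%))')

def format_table(pairs):
    """Build the header plus one left-aligned row per (name, value) pair."""
    rows = ['Surface name  Total CL']
    for name, value in pairs:
        # Truncate (not round) to four decimals, matching the question's sample output.
        truncated = Decimal(value).quantize(Decimal('0.0001'), rounding=ROUND_DOWN)
        rows.append(f'{name:<15}{truncated}')
    return '\n'.join(rows)

table = format_table(pattern.findall(text))
print(table)
```

If rounding is preferred over truncation, replacing the quantize call with f'{float(value):.4f}' should do it.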

Posted 2023-05-23

0yycz8jy #2

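The surface names can be derived from the file itself.
The value to extract is the first run of characters between a : (colon) and a | (pipe) on a line that starts with "Total CL".
Hence: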

import re

result = dict()
surface = None
# Lazily matches whatever sits between a ':' and the next '|'.
pattern = re.compile(r'(?<=\:).+?(?=\|)')

with open('/Volumes/G-Drive/data.txt') as data:
    for line in map(str.strip, data):
        if line.startswith('Surface'):
            # The surface name is the last whitespace-separated token.
            surface = line.split()[-1]
        elif surface and line.startswith('Total CL'):
            # float() ignores the padding spaces around the number.
            result[surface] = float(pattern.findall(line)[0])

print('Surface name  Total CL')

for k, v in result.items():
    print(f'{k:<15}{v}')

Output:

Surface name  Total CL
wing           0.994313
body           0.018554
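
The same line also carries the Pressure, Friction and Momentum contributions, and Friction can be negative. As a hedged sketch only, the pattern below pulls every "name ( pct%): value" group out of one line. The names component and parse_components are hypothetical, and the pattern assumes the exact layout shown in the question.

```python
import re

# One "name ( pct%): value" group; the value may carry a leading minus sign.
component = re.compile(r'(\w[\w ]*?)\s*\(\s*\d+%\):\s*(-?\d+\.\d+)')

def parse_components(line):
    """Return {component name: value} for one 'Total CL' line."""
    return {name: float(value) for name, value in component.findall(line)}

line = ("Total CL    (   86%):    0.994313 | Pressure (  100%):    0.994348 | "
        "Friction (    0%):   -0.000035 | Momentum (    0%):    0.000000")
components = parse_components(line)
print(components['Total CL'], components['Friction'])
```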

Posted 2023-05-23

s4n0splo #3

You know where the name and the value sit on each line, so you can do without regex. Here is an example:

data = """Surface name: wing

Total CL    (   86%):    0.994313 | Pressure (  100%):    0.994348 | Friction (    0%):   -0.000035 | Momentum (    0%):    0.000000

Surface name: body

Total CL    (    1%):    0.018554 | Pressure (   99%):    0.018535 | Friction (  
"""

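name = ''
print('Surface name  Total CL')  # print the header once, before the rows
for line in data.split('\n\n'):
    if line.startswith('Surface name: '):
        name = line[14:]  # text after the 14-character "Surface name: " prefix
        continue

    total = line[25:31]  # fixed columns 25-30: the value cut to four decimals
    # or:
    # total = round(float(line[25:33]), 4)
    print(f'{name:<15}{total}')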
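
Fixed column offsets work only while the file keeps exactly this spacing. A slightly more tolerant regex-free sketch splits on the delimiters instead. It assumes the value always sits between the first ':' and the first '|' of a "Total CL" line, and parse_total_cl is an illustrative name rather than anything from the answer above.

```python
def parse_total_cl(text):
    """Return [(surface name, Total CL value)] using only string methods."""
    results = []
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('Surface name:'):
            name = line.split(':', 1)[1].strip()
        elif name is not None and line.startswith('Total CL'):
            # The value sits between the first ':' and the first '|'.
            after_colon = line.split(':', 1)[1]
            results.append((name, float(after_colon.split('|', 1)[0])))
    return results

sample = """Surface name: wing

Total CL    (   86%):    0.994313 | Pressure (  100%):    0.994348 | Friction (    0%):   -0.000035 | Momentum (    0%):    0.000000

Surface name: body

Total CL    (    1%):    0.018554 | Pressure (   99%):    0.018535 | Friction (    0%):    0.000019 | Momentum (    0%):    0.000000
"""

results = parse_total_cl(sample)
print(results)
```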

Posted 2023-05-23
