Python: How do I split elements in a list of lists by spaces and quotes?

Posted by sq1bmfud on 2023-01-01 in Python

I have a data list df1 made up of three lists, each containing a single string:

df1 = [
    ['1 "P040" 68.13 "P040_1" 2.55 8'],
    ['2 "P040" 46.82 "P040_2" 2.53 8'],
    ['3 "P040" 46.82 "P040_3" 2.51 8']
]

I want to convert it into the following list df2, without the double quotes ("):

df2 = [
    ['1', 'P040', '68.13', 'P040_1', '2.55', '8'],
    ['2', 'P040', '46.82', 'P040_2', '2.53', '8'],
    ['3', 'P040', '46.82', 'P040_3', '2.51', '8']
]

I tried the following, but it did not work:

for row in df1:
    for elem in row:
        elem.strip().split('"')
        elem.strip().split('"')

cigdeys3 1#

Here is a simple approach: remove the unwanted quotes, then split on a single space, using a list comprehension to build the result:

df2 = [''.join(row).replace('"', '').split(" ") for row in df1]

print(df2)

Output:

[['1', 'P040', '68.13', 'P040_1', '2.55', '8'],
['2', 'P040', '46.82', 'P040_2', '2.53', '8'],
['3', 'P040', '46.82', 'P040_3', '2.51', '8']]
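
One caveat with split(" "): it treats every single space as a separator, so two consecutive spaces would produce an empty string in the result. The sketch below shows the difference on a hypothetical row with a doubled space (an assumption for illustration; the question's data uses single spaces throughout). Calling split() with no argument splits on any run of whitespace instead.

```python
# Hypothetical row with two spaces after the first field (an assumption for
# illustration; it is not part of the question's data).
row = ['1  "P040" 68.13']

text = row[0].replace('"', '')

# split(" ") splits at every single space, so the doubled space yields an
# empty string between '1' and 'P040'.
by_single_space = text.split(" ")

# split() with no argument splits on runs of whitespace and drops empties.
by_whitespace = text.split()

print(by_single_space)  # ['1', '', 'P040', '68.13']
print(by_whitespace)    # ['1', 'P040', '68.13']
```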

xv8emn3q 2#

Since each inner list contains only one element, you don't need two for loops; a one-line list comprehension is enough:

df1 = [
['1 "P040" 68.13 "P040_1" 2.55 8'],
['2 "P040" 46.82 "P040_2" 2.53 8'],
['3 "P040" 46.82 "P040_3" 2.51 8']
]

df2 = [row[0].replace('"', '').split(' ') for row in df1]

print(df2)

# Output:
# [['1', 'P040', '68.13', 'P040_1', '2.55', '8'],
#  ['2', 'P040', '46.82', 'P040_2', '2.53', '8'],
#  ['3', 'P040', '46.82', 'P040_3', '2.51', '8']]

ttp71kqs 3#

You can do it with a nested list comprehension:

df2 = [[item.strip('""') for item in elem.split()] for row in df1 for elem in row]

Or with nested loops and a list comprehension:

df2 = []
for row in df1:
    for elem in row:
        df2.append([item.strip('""') for item in elem.split()])

Output (one row per line):

['1', 'P040', '68.13', 'P040_1', '2.55', '8']
['2', 'P040', '46.82', 'P040_2', '2.53', '8']
['3', 'P040', '46.82', 'P040_3', '2.51', '8']
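
A note on strip('""') in the code above: the argument to str.strip() is a set of characters to remove from both ends of the string, so '""' behaves the same as '"'. The sketch below checks this and also contrasts strip() with replace() on a hypothetical token that has a quote in the middle (an assumption; no such token appears in the question's data).

```python
# str.strip() takes a set of characters, so repeating the quote changes nothing.
token = '"P040_1"'
same = token.strip('""') == token.strip('"')
stripped = token.strip('"')

# strip() only removes characters at the ends; replace() removes them everywhere.
# Hypothetical token with an inner quote (not from the question's data).
odd = '"P0"40"'
ends_only = odd.strip('"')
everywhere = odd.replace('"', '')

print(same, stripped)         # True P040_1
print(ends_only, everywhere)  # P0"40 P040
```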

pgx2nnw8 4#

df1 = [['1 "P040" 68.13 "P040_1" 2.55 8'],
       ['2 "P040" 46.82 "P040_2" 2.53 8'],
       ['3 "P040" 46.82 "P040_3" 2.51 8']]

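# Remove the quotes first, then split on whitespace. (Removing the spaces and
# splitting on '"' instead would fuse the fields '2.55' and '8' into '2.558'.)
df2 = [v.replace('"', '').split() for l in df1 for v in l]
print(df2)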

7uzetpgm 5#

You can combine the split() and strip() methods to split on both quotes and spaces.
This example splits the elements of df1 and builds a new list, df2:

df2 = []
for row in df1:
    new_row = []
    for elem in row[0].split('"'):
        new_row.extend(elem.strip().split())
    df2.append(new_row)

print(df2)

ia2d9nvy 6#

You can use shlex.split() to split on 'words', where a word may be a quoted string, like this:

import shlex

for i in range(len(df1)):
    df1[i] = shlex.split(df1[i][0])

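This assumes that each item of your list is itself a list containing a single string. The code above modifies df1 in place; to create a new 'df2' instead, it would be: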

import shlex

df2 = []
for row in df1:
    df2.append(shlex.split(row[0]))  # use the loop variable; `i` is not defined here

The output will be:

[
  ['1', 'P040', '68.13', 'P040_1', '2.55', '8'],
  ['2', 'P040', '46.82', 'P040_2', '2.53', '8'],
  ['3', 'P040', '46.82', 'P040_3', '2.51', '8']
]

The advantage of shlex.split() is that quoted strings may contain spaces, a case that a plain 'split()' cannot handle correctly.
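
To illustrate that point, here is a minimal sketch comparing the two approaches on a hypothetical row whose quoted field contains a space (an assumption; the question's data has no spaces inside quotes).

```python
import shlex

# Hypothetical row with a space inside the first quoted field
# (not from the question's data).
row = ['1 "P040 A" 68.13 "P040_1" 2.55 8']

# Removing the quotes and splitting on whitespace breaks "P040 A" in two.
plain = row[0].replace('"', '').split()

# shlex.split() keeps the quoted text together and drops the quotes.
quote_aware = shlex.split(row[0])

print(plain)        # ['1', 'P040', 'A', '68.13', 'P040_1', '2.55', '8']
print(quote_aware)  # ['1', 'P040 A', '68.13', 'P040_1', '2.55', '8']
```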
