Pandas reads in a CSV file and splits sentences into multiple columns because of values following the "'" character

yk9xbfzb posted on 2022-11-27 in Other

I have the following dictionary:

{0: {0: "It's chic", 1: 'Samsung and Panasonic were overpriced.', 2: 'Others that compare are much more expensive.', 3: "Can't beat it at the price.", 4: 'I bought the more expensive case  for my 8.9 but it made my kindle very heavy.'}, 1: {0: " looks expensive but it's affordable", 1: nan, 2: nan, 3: nan, 4: nan}, 2: {0: ' what more can you want.', 1: nan, 2: nan, 3: nan, 4: nan}}

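When I try to read it with the following code: pd.DataFrame(dict)
Unfortunately, the text ends up split across three columns instead of just one. How can this be solved?
Edit: Ideally, it would be read in the following format: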

dict_2 = {0: {0: "It's chic looks expensive but it's affordable what more can you want.",
1: 'Samsung and Panasonic were overpriced.',
2: 'Others that compare are much more expensive.',
3: "Can't beat it at the price.",
4: 'I bought the more expensive case  for my 8.9 but it made my kindle very heavy.'}}

Thanks in advance.

tyg4sfes

You can use:

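from collections import defaultdict

import pandas as pd

# dictt is the nested dictionary from the question: {column: {row: text or NaN}}
dd = defaultdict(list)
for column in dictt.values():  # works for any number of inner dicts
    for key, value in column.items():
        # NaN never compares equal to np.nan, so test with pd.isna instead
        if not pd.isna(value):
            dd[key].append(value)

# Join the fragments collected for each row into a single string
joined = {key: ''.join(parts) for key, parts in dd.items()}

df = pd.DataFrame(data={'col1': joined})
print(df)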
'''
    col1
0   It's chic looks expensive but it's affordable what more can you want.
1   Samsung and Panasonic were overpriced.
2   Others that compare are much more expensive.
3   Can't beat it at the price.
4   I bought the more expensive case  for my 8.9 but it made my kindle very heavy.

'''
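A note on skipping missing values in a loop like the one above: NaN never compares equal to anything, including itself, so an equality test against np.nan cannot be used to detect it. The sketch below shows one reliable check, pd.isna. The helper name keep_text is hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

value = np.nan

# Equality never matches NaN, not even against itself
print(value == np.nan)   # False
print(value != value)    # True

# pd.isna detects NaN and returns False for ordinary strings
print(pd.isna(value))        # True
print(pd.isna("It's chic"))  # False


def keep_text(values):
    """Return only the non-missing entries (hypothetical helper)."""
    return [v for v in values if not pd.isna(v)]


fragments = keep_text(["It's chic", np.nan, ' what more can you want.'])
print(''.join(fragments))
```

Filtering on str(x) != 'nan' also works for this data, but it would drop a genuine text value that happens to be the string 'nan'.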

Or (better):

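import pandas as pd

df = pd.DataFrame(dictt)  # dictt is the nested dictionary from the question
# Replace NaN with '' so each row's cells can be joined into one string
df2 = pd.Series(df.fillna('').values.tolist()).str.join('')
print(df2)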
'''
    0
0   It's chic looks expensive but it's affordable what more can you want.
1   Samsung and Panasonic were overpriced.
2   Others that compare are much more expensive.
3   Can't beat it at the price.
4   I bought the more expensive case  for my 8.9 but it made my kindle very heavy.

'''
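For a self-contained run of the row-wise join on the question's data, here is a minimal sketch. It is a variation rather than the code above: it uses DataFrame.apply with ''.join along axis=1 and returns a one-column DataFrame. The names join_row_fragments and 'text' are hypothetical, chosen only for this example.

```python
import numpy as np
import pandas as pd

nan = np.nan

# The nested dictionary from the question: outer keys become columns
dictt = {
    0: {0: "It's chic",
        1: 'Samsung and Panasonic were overpriced.',
        2: 'Others that compare are much more expensive.',
        3: "Can't beat it at the price.",
        4: 'I bought the more expensive case  for my 8.9 but it made my kindle very heavy.'},
    1: {0: " looks expensive but it's affordable", 1: nan, 2: nan, 3: nan, 4: nan},
    2: {0: ' what more can you want.', 1: nan, 2: nan, 3: nan, 4: nan},
}


def join_row_fragments(nested):
    """Build a one-column DataFrame of joined row text (hypothetical helper)."""
    df = pd.DataFrame(nested)
    # Empty strings stand in for NaN so that ''.join only sees text
    joined = df.fillna('').apply(''.join, axis=1)
    return joined.to_frame(name='text')


result = join_row_fragments(dictt)
print(result.shape)  # (5, 1)
print(result['text'].iloc[0])
```

Because the fragments in the question already carry their leading spaces, joining with an empty string is enough; with different data a separator such as ' ' might be needed.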
