I have this CSV file, favsites.csv:
Emails Favorite Site
batman@email.com something.com
batman@email.com hamburgers.com
poisonivy@email.com yonder.com
superman@email.com cookies.com
catgirl@email.com cattreats.com
catgirl@email.com fishcaviar.com
catgirl@email.com elegantfashion.com
joker@email.com cards.com
supergirl@email.com nailart.com
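I want to group the duplicates, merge the columns, and then write the result to a CSV file.
After grouping and merging, it should look like this: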
Emails Favorite Site
batman@email.com something.com
hamburgers.com
poisonivy@email.com yonder.com
superman@email.com cookies.com
catgirl@email.com cattreats.com
fishcaviar.com
elegantfashion.com
joker@email.com cards.com
supergirl@email.com nailart.com
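How do I write this to a CSV file so that it looks like that? Either with something.com and hamburgers.com in a single cell for batman, and cattreats.com, fishcaviar.com, and elegantfashion.com in a single cell for catgirl, or, alternatively, with them on the same row but in different columns, as shown below.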
Emails Favorite Site
batman@email.com something.com hamburgers.com
poisonivy@email.com yonder.com
superman@email.com cookies.com
catgirl@email.com cattreats.com fishcaviar.com elegantfashion.com
joker@email.com cards.com
supergirl@email.com nailart.com
Here is my code:
import pandas as pd
Dir='favsites.csv'
sendcsv='mergednames.csv'
df = pd.read_csv(Dir)
df = pd.DataFrame(df)
df_sort = df.sort_values('Emails')
grouped = df_sort.groupby(['Emails', 'Favorite Site']).agg('sum')
When I print grouped, it shows:
Empty DataFrame
Columns: []
Index: [(batman@email.com, hamburgers.com), (batman@email.com, something.com), (catgirl@email.com, cattreats.com), (catgirl@email.com, elegantfashion.com), (catgirl@email.com, fishcaviar.com), (joker@email.com, cards.com), (poisonivy@email.com, yonder.com), (supergirl@email.com, nailart.com), (superman@email.com, cookies.com)]
2 Answers
idfiyjo81#
You can replace the duplicated values with an empty string:
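A minimal sketch of this idea, assuming the file is comma-separated with the two columns shown in the question. The data is inlined so the snippet runs on its own, and `duplicated` is one possible way to find the repeats, not necessarily the one this answer used:

```python
import io
import pandas as pd

# Sample data from the question, inlined (assumption: comma-separated file).
raw = """Emails,Favorite Site
batman@email.com,something.com
batman@email.com,hamburgers.com
poisonivy@email.com,yonder.com
superman@email.com,cookies.com
catgirl@email.com,cattreats.com
catgirl@email.com,fishcaviar.com
catgirl@email.com,elegantfashion.com
joker@email.com,cards.com
supergirl@email.com,nailart.com
"""
df = pd.read_csv(io.StringIO(raw))

blanked = df.copy()
# duplicated() marks every occurrence after the first; blank those addresses.
blanked.loc[blanked['Emails'].duplicated(), 'Emails'] = ''

# Returns the CSV as text; pass a path such as 'mergednames.csv' to write a file.
csv_text = blanked.to_csv(index=False)
print(csv_text)
```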
Output:
| Emails | Favorite Site |
| --- | --- |
| batman@email.com | something.com |
| | hamburgers.com |
| poisonivy@email.com | yonder.com |
| superman@email.com | cookies.com |
qcbq4gxm2#
IIUC, you can use pandas.Series.str.ljust together with pandas.DataFrame.to_csv, passing a tab (\t) as sep, so that the columns line up when the file is opened in Notepad.
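A rough sketch of how that could look, under the assumption that the input is comma-separated and that repeated addresses should be blanked before padding. The inlined data, the variable names, and the use of `mask` are illustrative choices, not taken from this answer:

```python
import io
import pandas as pd

# Sample data from the question, inlined (assumption: comma-separated file).
raw = """Emails,Favorite Site
batman@email.com,something.com
batman@email.com,hamburgers.com
poisonivy@email.com,yonder.com
superman@email.com,cookies.com
catgirl@email.com,cattreats.com
catgirl@email.com,fishcaviar.com
catgirl@email.com,elegantfashion.com
joker@email.com,cards.com
supergirl@email.com,nailart.com
"""
df = pd.read_csv(io.StringIO(raw))

out = df.copy()
# Show each address only once: blank every repeat after the first occurrence.
out['Emails'] = out['Emails'].mask(out['Emails'].duplicated(), '')

# Pad every cell to the length of the longest address so the column aligns.
width = int(df['Emails'].str.len().max())
out['Emails'] = out['Emails'].str.ljust(width)

# Tab-separated text; pass a path such as 'mergednames.csv' to write a file.
tsv_text = out.to_csv(sep='\t', index=False)
print(tsv_text)
```

The padding only affects how the file looks in a plain-text viewer; a spreadsheet program would read the padded cells as strings with trailing spaces.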
For the second format (one row per email), you can use pandas.DataFrame.groupby, with the output again viewed in Notepad.
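One way the groupby step might be written, as a sketch under the same assumptions (comma-separated input, data inlined). The names `one_cell` and `wide` and the `Site N` column labels are hypothetical; the two variants correspond to the two layouts asked for in the question:

```python
import io
import pandas as pd

# Sample data from the question, inlined (assumption: comma-separated file).
raw = """Emails,Favorite Site
batman@email.com,something.com
batman@email.com,hamburgers.com
poisonivy@email.com,yonder.com
superman@email.com,cookies.com
catgirl@email.com,cattreats.com
catgirl@email.com,fishcaviar.com
catgirl@email.com,elegantfashion.com
joker@email.com,cards.com
supergirl@email.com,nailart.com
"""
df = pd.read_csv(io.StringIO(raw))

# Group on the address only; sort=False keeps the order of first appearance.
grouped = df.groupby('Emails', sort=False)['Favorite Site']

# Variant A: all sites for an address in a single cell, separated by spaces.
one_cell = grouped.agg(' '.join).reset_index()

# Variant B: one column per site; shorter rows are padded with empty strings.
lists = grouped.agg(list)
wide = pd.DataFrame(lists.tolist(), index=lists.index).fillna('')
wide.columns = [f'Site {i + 1}' for i in range(wide.shape[1])]

# Tab-separated text; pass a path such as 'mergednames.csv' to write a file.
print(one_cell.to_csv(sep='\t', index=False))
print(wide.to_csv(sep='\t'))
```

Grouping on `Emails` alone is what differs from the code in the question: grouping on both columns moves every column into the index, which leaves nothing to aggregate and produces the empty DataFrame shown there.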