pandas: Iteratively replace every cell in a DataFrame using values from the original DataFrame

Posted by jjjwad0x on 2023-04-18 in Other

Here is an example DataFrame:

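I need to take "2023-01-01" (edit: actually a string of random digits, not a real Date object) and "Python is awesome" and send them through a function, do_calculations(date, phrase), which returns a new value that replaces "Python is awesome". Then I send "2023-01-01" and "Is the pizza" through the same function, and the returned value goes where "Is the pizza" was. Finally, I take "2023-01-01" and "Pizza" and do the same thing.
I then move down the column and repeat the same operation for "2023-01-02", then "2023-01-03", and so on until every cell has been replaced.
I have tried the following: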

for i, row in new_df.iterrows():
    print('index: ', i)
    print('row: ', row['Date'], row['Title1'], row.index)
    if row['Title1']:
        text = do_calculations(row['Date'], row['Title1'][0])
        #print("TEXT:", text)
        value = new_df.at[i, row.index[1]]
        print("VALUE:", value)
        
        new_df.at[i, row.index[2]] = text

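But I can't get it to work. I suspect I need another for loop here and better use of the i index.
It doesn't matter whether this produces a new DataFrame or updates the DataFrame in place; whichever is faster is preferred.
Here is the code that generates the sample DataFrame: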

import pandas as pd
import random
import datetime

# Create a list of dates
date_rng = pd.date_range(start='1/1/2023', end='1/10/2023', freq='D')

# Generate random phrases
phrases = ['Hello world', 'Python is awesome', None, 'Data science is fun', 'I love coding', 'Pandas is powerful', 'Pineapples', 'Pizza', 'Krusty', 'krab', 'Is the pizza']

# Collect the rows first; DataFrame.append was removed in pandas 2.0
rows = []

# Populate the rows with random phrases
for date in date_rng:
    # Pick three distinct random phrases for this row
    rows.append([date, *random.sample(phrases, 3)])

# Build the DataFrame in one step
df = pd.DataFrame(rows, columns=['Date', 'title1', 'title2', 'title3'])

# Print DataFrame
print(df)

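Edit: I have clarified that one of the arguments passed is a string of digits, not a true date object, since most of the answers seem to take that into account.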

pandas

Source: https://stackoverflow.com/questions/76019862/iteratively-replace-every-cell-in-a-dataframe-using-values-from-the-original-dat

3 Answers

qzlgjiam1#

IIUC, you can do this with two loops:

for i, row in new_df.iterrows():
    for col in ['title1', 'title2', 'title3']:
        if row[col]:
            text = do_calculations(row['Date'], row[col])
            new_df.loc[i, col] = text
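
For reference, here is a self-contained sketch of the same two-loop idea. The do_calculations body and the sample values are placeholders I made up (the real function is not shown in the question), and Date is assumed to be a plain string, as the edit to the question says. It uses pd.notna so that only missing cells are skipped, and .at for the scalar writes.

```python
import pandas as pd

# Hypothetical stand-in for the asker's function; the real logic is not shown.
def do_calculations(date, phrase):
    return f"{phrase} | {date}"

# Small frame in the shape of the question; Date is a plain string here.
new_df = pd.DataFrame({
    "Date": ["20230101", "20230102"],
    "title1": ["Python is awesome", None],
    "title2": ["Is the pizza", "Krusty"],
    "title3": ["Pizza", "krab"],
})

# Every column except Date holds phrases to be replaced
phrase_cols = [c for c in new_df.columns if c != "Date"]

for i, row in new_df.iterrows():
    for col in phrase_cols:
        # Skip missing cells (None/NaN) instead of passing them to the function
        if pd.notna(row[col]):
            new_df.at[i, col] = do_calculations(row["Date"], row[col])

print(new_df)
```

Note that `if row[col]:` in the loop above also skips empty strings, while pd.notna only skips missing values; which one is right depends on what do_calculations should see.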

Answered 2023-04-18

zpf6vheq2#

Here is a way to do what you asked by using apply to call the function cell by cell:

df[df.columns[1:]] = df.apply(
    lambda row: [do_calculations(row.Date.date(), val) for val in row[1:]],
    axis=1,
    result_type='expand',
)

Sample function:

callnum = [0]
def do_calculations(date, phrase):
    callnum[0] += 1
    return f'{phrase} {date} {callnum[0]}'

Output:

Date                            title1                          title2                             title3
0  2023-01-01 00:00:00               Krusty 2023-01-01 1      I love coding 2023-01-01 2    Pandas is powerful 2023-01-01 3
1  2023-01-02 00:00:00                 krab 2023-01-02 4  Python is awesome 2023-01-02 5   Data science is fun 2023-01-02 6
2  2023-01-03 00:00:00                 None 2023-01-03 7      I love coding 2023-01-03 8                 Pizza 2023-01-03 9
3  2023-01-04 00:00:00   Python is awesome 2023-01-04 10        Pineapples 2023-01-04 11                 krab 2023-01-04 12
4  2023-01-05 00:00:00       I love coding 2023-01-05 13        Pineapples 2023-01-05 14                 krab 2023-01-05 15
5  2023-01-06 00:00:00               Pizza 2023-01-06 16              None 2023-01-06 17               Krusty 2023-01-06 18
6  2023-01-07 00:00:00          Pineapples 2023-01-07 19             Pizza 2023-01-07 20    Python is awesome 2023-01-07 21
7  2023-01-08 00:00:00        Is the pizza 2023-01-08 22        Pineapples 2023-01-08 23                 None 2023-01-08 24
8  2023-01-09 00:00:00  Pandas is powerful 2023-01-09 25             Pizza 2023-01-09 26  Data science is fun 2023-01-09 27
9  2023-01-10 00:00:00  Pandas is powerful 2023-01-10 28              None 2023-01-10 29           Pineapples 2023-01-10 30

Alternatively, if your function can accept Series arguments for both date and phrase, you can use apply on a column-by-column basis:

df[df.columns[1:]] = ( df[df.columns[1:]]
    .apply(lambda col: vect_calculations(df.Date.infer_objects(), col)) )

Sample function:

vect_callnum = [0]
def vect_calculations(date, phrase):
    vect_callnum[0] += 1
    return phrase + ' ' + date.astype(str) + f' {vect_callnum[0]}'

Output:

Date                            title1                            title2                            title3
0  2023-01-01 00:00:00          Hello world 2023-01-01 1               Krusty 2023-01-01 2                 krab 2023-01-01 3
1  2023-01-02 00:00:00         Is the pizza 2023-01-02 1               Krusty 2023-01-02 2                               NaN
2  2023-01-03 00:00:00          Hello world 2023-01-03 1    Python is awesome 2023-01-03 2                 krab 2023-01-03 3
3  2023-01-04 00:00:00           Pineapples 2023-01-04 1               Krusty 2023-01-04 2    Python is awesome 2023-01-04 3
4  2023-01-05 00:00:00          Hello world 2023-01-05 1               Krusty 2023-01-05 2         Is the pizza 2023-01-05 3
5  2023-01-06 00:00:00           Pineapples 2023-01-06 1  Data science is fun 2023-01-06 2    Python is awesome 2023-01-06 3
6  2023-01-07 00:00:00   Pandas is powerful 2023-01-07 1          Hello world 2023-01-07 2                Pizza 2023-01-07 3
7  2023-01-08 00:00:00           Pineapples 2023-01-08 1                               NaN  Data science is fun 2023-01-08 3
8  2023-01-09 00:00:00  Data science is fun 2023-01-09 1                Pizza 2023-01-09 2                               NaN
9  2023-01-10 00:00:00                Pizza 2023-01-10 1         Is the pizza 2023-01-10 2                 krab 2023-01-10 3

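Note that in the first solution above, the number of function calls (shown by the callnum value visible in the output df) equals the number of cells, whereas in the second solution the function is called only once per phrase column (shown by the vect_callnum value in the output).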
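
The difference in call counts can be checked directly with a small run. This is only a sketch: the fixed 4-row frame is made up so the result is reproducible (the question's data is random), the counters are kept separate from the returned strings, and the counts assume a recent pandas version in which apply calls the function exactly once per row or column.

```python
import pandas as pd

# Fixed stand-in for the random sample data (values are illustrative only)
df = pd.DataFrame({
    "Date": pd.date_range("2023-01-01", periods=4, freq="D"),
    "title1": ["Hello world", "Pizza", "krab", "Krusty"],
    "title2": ["Pineapples", "I love coding", "Pizza", "krab"],
    "title3": ["Krusty", "Hello world", "Pineapples", "Pizza"],
})
phrase_cols = df.columns[1:]

callnum = [0]
def do_calculations(date, phrase):
    # Scalar version: receives one date and one phrase
    callnum[0] += 1
    return f"{phrase} {date}"

vect_callnum = [0]
def vect_calculations(date, phrase):
    # Series version: receives the whole Date column and one phrase column
    vect_callnum[0] += 1
    return phrase + " " + date.astype(str)

# Cell by cell: one call per phrase cell
per_cell = df.apply(
    lambda row: [do_calculations(row["Date"].date(), row[c]) for c in phrase_cols],
    axis=1,
    result_type="expand",
)
per_cell.columns = phrase_cols

# Column by column: one call per phrase column
per_column = df[phrase_cols].apply(lambda col: vect_calculations(df["Date"], col))

print("cell-by-cell calls:", callnum[0])
print("column-by-column calls:", vect_callnum[0])
```

With 4 rows and 3 phrase columns, the first counter should end at 12 and the second at 3, which is why the column-wise form tends to scale better when the function can work on whole Series.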

Answered 2023-04-18

js5cn81o3#

If you need a vectorized approach, one possible solution is:

def f(x):
    return x

df.iloc[:,1:] = f(df.iloc[:,1:].values)
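
The f above is an identity placeholder and never sees the Date column. A rough sketch of how an array-level function could also receive the dates, using NumPy broadcasting on object arrays, might look like this. The two-argument f, its string concatenation, and the sample values are assumptions for illustration only, and Date is assumed to be a plain string. Operations on object arrays still run element by element in Python, so this is a whole-block formulation rather than a guaranteed speedup.

```python
import numpy as np
import pandas as pd

# Illustrative data; Date is a plain string, some cells are missing
df = pd.DataFrame({
    "Date": ["20230101", "20230102", "20230103"],
    "title1": ["Hello world", None, "Pizza"],
    "title2": ["krab", "Krusty", None],
})

# Hypothetical array-level function: takes a column vector of dates and the
# 2-D block of phrases, and returns a block with the same shape as phrases.
def f(dates, phrases):
    missing = pd.isna(phrases)
    # Replace missing cells with "" so elementwise string addition is safe
    filled = np.where(missing, "", phrases)
    combined = filled + " " + dates  # dates broadcasts across the columns
    # Put None back where the original cell was missing
    return np.where(missing, None, combined)

dates = df["Date"].to_numpy(dtype=object)[:, None]   # shape (n, 1)
phrases = df.iloc[:, 1:].to_numpy(dtype=object)      # shape (n, k)
df.iloc[:, 1:] = f(dates, phrases)

print(df)
```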

Answered 2023-04-18
