numpy: How to reset a DataFrame column and fill in missing values with 0 using pandas in Python

fiei3ece, posted on 2023-08-05 in Python

I have a pandas DataFrame that looks like this:

import pandas as pd

data = {
    'sec': [35,36,38,0,1,2,3,4,5,9],
    'rpm': [40.5,41.6,56,56.8,67,89,90,91,102,123]
}

df = pd.DataFrame(data)

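I want output like the following. In the sec column, 37 is missing between 36 and 38, so I want to add a row with sec 37 and rpm 0. Likewise, at index 8 the values 6, 7 and 8 are missing between 5 and 9, so I want to add those rows with rpm 0, 0, 0: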

data = {
    'sec': [35,36,37,38,0,1,2,3,4,5,6,7,8,9],
    'rpm': [40.5,41.6,0,56,56.8,67,89,90,91,102,0,0,0,123]
}
df1 = pd.DataFrame(data)
Finally, in my expected output the sec column has to be reset like this:
data = {
    'sec': [1,2,3,4,5,6,7,8,9,10,11,12,13,14],
    'rpm': [40.5,41.6,0,56,56.8,67,89,90,91,102,0,0,0,123]
}
df2 = pd.DataFrame(data)

The sec column in my final output must be reset as shown above. How can I do this? Any help would be appreciated.

9ceoxa92 #1

Here is one approach:

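# Start a new group wherever sec drops, then reindex each group over its full range.
df1 = (df.groupby(df['sec'].diff().lt(0).cumsum(), group_keys=False)
         .apply(lambda x: x.set_index('sec')
                           .reindex(range(x['sec'].min(), x['sec'].max() + 1), fill_value=0))
         .reset_index())

# Renumber sec from 1.
df2 = df1.assign(sec=df1.index + 1)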

Output:

    sec    rpm
0     1   40.5
1     2   41.6
2     3    0.0
3     4   56.0
4     5   56.8
5     6   67.0
6     7   89.0
7     8   90.0
8     9   91.0
9    10  102.0
10   11    0.0
11   12    0.0
12   13    0.0
13   14  123.0
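
The grouping key in the one-liner above can be hard to read at a glance. The following is a minimal sketch that breaks it into steps on the question's data; the intermediate names `step`, `drops`, `run_id` and `first` are introduced here for illustration only and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'sec': [35, 36, 38, 0, 1, 2, 3, 4, 5, 9],
    'rpm': [40.5, 41.6, 56, 56.8, 67, 89, 90, 91, 102, 123],
})

step = df['sec'].diff()   # difference from the previous row; NaN for the first row
drops = step.lt(0)        # True where sec falls, i.e. where a new run starts
run_id = drops.cumsum()   # running count of drops: one integer label per run

print(run_id.tolist())    # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Each run is then reindexed over its own full range; shown here for the first run.
first = df[run_id == 0].set_index('sec').reindex(range(35, 39), fill_value=0)
print(first)
```

Because the label only changes where sec decreases, each run is filled over its own min-to-max range rather than over the range of the whole column.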

bqujaahr #2

Here is a way to get the result described in the question:

from itertools import chain

# Mark the first and last row of each run of increasing sec values.
df['start'] = df.sec.shift() > df.sec
df.loc[df.start.index[0], 'start'] = True
df['end'] = df.start.shift(-1, fill_value=True)
# Concatenate the full integer range of every run.
secFull = list(chain(*map(lambda x, y: list(range(x, y)), df.loc[df.start].sec, df.loc[df.end].sec + 1)))
df = (df.set_index('sec').reindex(secFull, fill_value=0).reset_index()
        .drop(columns=['start','end']).assign(sec=list(range(1, len(secFull) + 1))))

Output:

    sec    rpm
0     1   40.5
1     2   41.6
2     3    0.0
3     4   56.0
4     5   56.8
5     6   67.0
6     7   89.0
7     8   90.0
8     9   91.0
9    10  102.0
10   11    0.0
11   12    0.0
12   13    0.0
13   14  123.0


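Explanation:

  • Populate the columns start and end with True values that mark the boundaries of each non-decreasing subsequence in column sec.
  • Create the list secFull, the concatenation of the full integer range spanned by each such subsequence of values in column sec.
  • Use reindex() to expand the number of rows in the input to make room for the missing sec values, filling their corresponding rpm with 0, and drop the intermediate columns (start and end).
  • Replace sec with an integer sequence starting at 1.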
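
To make the first two steps concrete, here is a small sketch that prints the boundary values and the resulting list for the question's data. It works on standalone Series rather than adding columns to df, and the names `run_starts`, `run_ends` and `sec_full` are illustrative assumptions, not names from the answer.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    'sec': [35, 36, 38, 0, 1, 2, 3, 4, 5, 9],
    'rpm': [40.5, 41.6, 56, 56.8, 67, 89, 90, 91, 102, 123],
})

start = df['sec'].shift() > df['sec']    # True where sec drops below the previous value
start.iloc[0] = True                     # the first row always opens a run
end = start.shift(-1, fill_value=True)   # a run ends on the row before the next start

run_starts = df.loc[start, 'sec'].tolist()
run_ends = df.loc[end, 'sec'].tolist()
print(run_starts, run_ends)              # [35, 0] [38, 9]

# Full integer range of each run, concatenated in the original order.
sec_full = list(chain.from_iterable(
    range(a, b + 1) for a, b in zip(run_starts, run_ends)
))
print(sec_full)
```

One thing to keep in mind with this approach: reindex() needs unique index labels, so it is likely to raise an error if two runs contain the same sec value.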

jxct1oxe #3

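In Python with pandas, you can find the 'sec' values missing from each run of increasing values, append them with 'rpm' set to 0 using concat(), and then renumber the 'sec' column. The gaps have to be computed per run: taking the minimum and maximum of the whole column would also insert 10 through 34, and sorting the whole frame by 'sec' would move the 0 to 9 run in front of 35 to 38. Here is an example of how to achieve the desired transformation:

import pandas as pd

data = {
    'sec': [35, 36, 38, 0, 1, 2, 3, 4, 5, 9],
    'rpm': [40.5, 41.6, 56, 56.8, 67, 89, 90, 91, 102, 123]
}

df = pd.DataFrame(data)

# Label each run of increasing 'sec' values; a new run starts where 'sec' drops.
run = df['sec'].diff().lt(0).cumsum()

pieces = []
for _, part in df.groupby(run):
    # Find the 'sec' values missing from this run.
    missing_sec = set(range(part['sec'].min(), part['sec'].max() + 1)) - set(part['sec'])
    if missing_sec:
        # Append the missing values to this run with 'rpm' set to 0.
        df_missing = pd.DataFrame({'sec': sorted(missing_sec), 'rpm': 0.0})
        part = pd.concat([part, df_missing])
    # Sort within the run only, so the runs keep their original order.
    pieces.append(part.sort_values('sec'))

# Join the runs and reset the DataFrame's index.
df_concat = pd.concat(pieces).reset_index(drop=True)

# Renumber the 'sec' column starting from 1.
df_concat['sec'] = range(1, len(df_concat) + 1)

# Print the completed DataFrame.
print(df_concat)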

Output:
    sec    rpm
0     1   40.5
1     2   41.6
2     3    0.0
3     4   56.0
4     5   56.8
5     6   67.0
6     7   89.0
7     8   90.0
8     9   91.0
9    10  102.0
10   11    0.0
11   12    0.0
12   13    0.0
13   14  123.0
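
All three answers print the same table, so a quick way to compare a result against the expected output in the question is `pandas.testing.assert_frame_equal`. The sketch below wraps the per-run fill in a function and checks it; the function name `fill_gaps_and_renumber` is hypothetical, and it assumes that sec is an integer column and that a new run starts wherever sec decreases.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def fill_gaps_and_renumber(df):
    """Fill missing sec values per run with rpm 0, then renumber sec from 1."""
    run_id = df['sec'].diff().lt(0).cumsum()  # one label per run of increasing sec
    filled = pd.concat(
        part.set_index('sec')
            .reindex(range(part['sec'].min(), part['sec'].max() + 1), fill_value=0)
        for _, part in df.groupby(run_id)
    ).reset_index()
    filled['sec'] = range(1, len(filled) + 1)
    return filled


df = pd.DataFrame({
    'sec': [35, 36, 38, 0, 1, 2, 3, 4, 5, 9],
    'rpm': [40.5, 41.6, 56, 56.8, 67, 89, 90, 91, 102, 123],
})

# Expected output, copied from the question.
expected = pd.DataFrame({
    'sec': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    'rpm': [40.5, 41.6, 0, 56, 56.8, 67, 89, 90, 91, 102, 0, 0, 0, 123],
})

result = fill_gaps_and_renumber(df)
assert_frame_equal(result, expected)  # raises AssertionError if the frames differ
```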
