Extracting a table from a PDF produces the following DataFrame:
Date Transaction Details Withdrawals Deposits Balance
0 01-01-2020 Tx1-Description - Line1 1625.0 NaN 97994.82
1 NaN Line 2 NaN NaN NaN
2 01-01-2020 Tx2-Description - Line1 NaN 84994.82 90000.00
3 NaN Line 2 NaN NaN NaN
4 NaN Line 3 NaN NaN NaN
5 02-01-2020 Tx3-Description - Line1 71.0 NaN 84923.82
6 NaN Line 2 NaN NaN NaN
7 02-01-2020 Tx4-Description - Line1 NaN 80.00 90000.00
8 NaN Line 2 NaN NaN NaN
9 NaN Line 3 NaN NaN NaN
10 03-01-2020 Tx5-Description - Line1 100.0 NaN 85000.00
How can I correctly merge the Transaction Details column?
Expected output:
Date Transaction Details Withdrawals Deposits Balance
0 01-01-2020 Tx1-Description - Line1 Line 2 1625.0 NaN 97994.82
1 01-01-2020 Tx2-Description - Line1 Line 2 Line 3 NaN 84994.82 90000.00
2 02-01-2020 Tx3-Description - Line1 Line 2 71.0 NaN 84923.82
3 02-01-2020 Tx4-Description - Line1 Line 2 Line 3 NaN 80.00 90000.00
4 03-01-2020 Tx5-Description - Line1 100.0 NaN 85000.00
2 Answers
9vw9lbht1#
IIUC, you can use "Date" to form groups with groupby, then aggregate and apply replace(0, float('nan')):
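The code for this answer did not survive in the post, so the following is only a minimal sketch of the groupby-then-aggregate idea, not the answerer's exact code. It assumes that a non-null Date marks the first row of each transaction, and it uses a running count of non-null dates as the group key, because grouping on the Date value alone would merge different transactions that share a date. The names `group_id` and `out` are illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Rebuild the DataFrame from the question (values copied from the post).
df = pd.DataFrame({
    "Date": ["01-01-2020", nan, "01-01-2020", nan, nan,
             "02-01-2020", nan, "02-01-2020", nan, nan, "03-01-2020"],
    "Transaction Details": [
        "Tx1-Description - Line1", "Line 2",
        "Tx2-Description - Line1", "Line 2", "Line 3",
        "Tx3-Description - Line1", "Line 2",
        "Tx4-Description - Line1", "Line 2", "Line 3",
        "Tx5-Description - Line1",
    ],
    "Withdrawals": [1625.0, nan, nan, nan, nan, 71.0, nan, nan, nan, nan, 100.0],
    "Deposits": [nan, nan, 84994.82, nan, nan, nan, nan, 80.00, nan, nan, nan],
    "Balance": [97994.82, nan, 90000.00, nan, nan,
                84923.82, nan, 90000.00, nan, nan, 85000.00],
})

# Assumption: each non-null Date starts a new transaction, so the cumulative
# count of non-null dates gives one id per transaction (1, 1, 2, 2, 2, ...).
group_id = df["Date"].notna().cumsum().rename("tx_id")

out = (
    df.groupby(group_id)
      .agg({
          "Date": "first",                  # the date on the first row of the group
          "Transaction Details": " ".join,  # concatenate the description lines
          "Withdrawals": "sum",             # sum skips NaN; an all-NaN group gives 0
          "Deposits": "sum",
          "Balance": "sum",
      })
      .replace(0, float("nan"))             # turn those 0 sums back into NaN
      .reset_index(drop=True)
)

print(out)
```

One caveat of this sketch: replace(0, float('nan')) also blanks a genuine zero amount or balance. If that matters, aggregating the numeric columns with "first" instead of "sum" keeps NaN for empty groups without the replace step.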
Output:
x4shl7ld2#
Output: