Removing extra delimiters from a CSV file

Posted by bjp0bcyl on 2022-12-15 in Other

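I'm cleaning up a CSV file with Python 3. Sometimes a line has four entries instead of two because, for some reason, the datalogger didn't insert a newline. This happens periodically and I don't know why.
So I tried removing the erroneous characters from the CSV, which worked, but those lines still have four entries instead of two. I'd like to find the delimiter and replace it with a newline.
It sounds simple, but I don't have the *code-fu*, and I was wondering if anyone could help. :) Thanks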

import csv 

with open('outty1.csv', 'w', newline='') as outcsv:
    writer = csv.writer(outcsv)
    writer.writerow(["Date", "Temperature", "Humidity"])

text = open("temperature.csv", "r")
text = ''.join([i for i in text]) \
    .replace("ÿÿ", ",")

for i in text:
    if i.count(',')>1:
        text.replace(",", "/n")

x = open("outty1.csv","a")
x.writelines(text)
x.close()

A sample of the temperature log before parsing :)

1629881977,24.27
1629882037,24.28ÿÿ1629882097,24.29
1629882157,24.31ÿÿ1629882217,23.52
1629882277,23.38ÿÿ1629882337,23.72
1629882397,23.87ÿÿ1629882457,23.92
1629882517,23.98ÿÿ1629882577,24.02
1629882637,24.08ÿÿ1629882697,24.12
1629882757,24.15
1629882817,24.19
1629882877,24.24
1629882937,24.31
1629882997,24.36
1629883057,24.40
1629883117,24.44
1629883177,24.38
1629883237,24.50
1629883298,24.60
1629883358,24.72
1629883418,24.88
1629883478,25.05
1629883538,25.23
1629883598,25.42
1629883658,25.63ÿÿ1629883718,25.85
1629883778,26.08ÿÿ1629883838,26.31
1629883898,26.53ÿÿ1629883958,26.74
1629884018,26.96ÿÿ1629884078,27.12
1629884138,27.26ÿÿ1629884198,27.38
1629884258,27.48ÿÿ1629884318,27.56
1629884378,27.63ÿÿ1629884438,27.69
1629884498,27.73.

This is how far I get after running the program:

Date,Temperature,Humidity
1629881977,24.27
1629882037,24.28,1629882097,24.29
1629882157,24.31,1629882217,23.52
1629882277,23.38,1629882337,23.72
1629882397,23.87,1629882457,23.92
1629882517,23.98,1629882577,24.02
1629882637,24.08,1629882697,24.12
1629882757,24.15
1629882817,24.19
1629882877,24.24
1629882937,24.31
1629882997,24.36
1629883057,24.40
1629883117,24.44
1629883177,24.38
1629883237,24.50
1629883298,24.60
1629883358,24.72
1629883418,24.88
1629883478,25.05
1629883538,25.23
1629883598,25.42
1629883658,25.63,1629883718,25.85
1629883778,26.08,1629883838,26.31
1629883898,26.53,1629883958,26.74
1629884018,26.96,1629884078,27.12
1629884138,27.26,1629884198,27.38
1629884258,27.48,1629884318,27.56
1629884378,27.63,1629884438,27.69
1629884498,27.73

And here is the fixed sample output. I spotted the answer as soon as I pasted the input and compared the outputs, LOL :)

Date,Temperature,Humidity
1629881977,24.27
1629882037,24.28
1629882097,24.29
1629882157,24.31
1629882217,23.52
1629882277,23.38
1629882337,23.72
1629882397,23.87
1629882457,23.92
1629882517,23.98
1629882577,24.02
1629882637,24.08
1629882697,24.12
1629882757,24.15
1629882817,24.19
1629882877,24.24
1629882937,24.31
1629882997,24.36
1629883057,24.40
1629883117,24.44
1629883177,24.38
1629883237,24.50
1629883298,24.60
1629883358,24.72
1629883418,24.88
1629883478,25.05
1629883538,25.23
1629883598,25.42
1629883658,25.63
1629883718,25.85
1629883778,26.08
1629883838,26.31
1629883898,26.53
1629883958,26.74
1629884018,26.96
1629884078,27.12
1629884138,27.26
1629884198,27.38
1629884258,27.48
1629884318,27.56
1629884378,27.63
1629884438,27.69
1629884498,27.73
1629884558,27.75

Old code:

text = ''.join([i for i in text]) \
    .replace("ÿÿ", ",")

New code:

text = ''.join([i for i in text]) \
    .replace("ÿÿ", "\n")
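
For anyone landing here later, the whole cleanup with that one-character change could be sketched roughly as below. This is only a sketch under a few assumptions: `clean_log` is a made-up helper name, the default header simply mirrors the one in my script above, and reading the file as `latin-1` is my guess at why the stray bytes show up as "ÿÿ" (in that encoding the byte 0xFF decodes to "ÿ").

```python
GARBAGE = "ÿÿ"  # the stray characters that appear where a newline is missing


def clean_log(src_path, dst_path, header=("Date", "Temperature", "Humidity")):
    """Rewrite src_path into dst_path with one record per line (hypothetical helper)."""
    # Assumption: latin-1 decodes every byte, so 0xFF bytes become "ÿ" instead of raising.
    with open(src_path, "r", encoding="latin-1") as src:
        text = src.read()

    # The actual fix: turn the garbage into a newline rather than a comma.
    text = text.replace(GARBAGE, "\n")

    # Write the header and the cleaned data in one pass; "with" closes the file.
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(",".join(header) + "\n")
        dst.write(text)
```

It would be called as `clean_log("temperature.csv", "outty1.csv")`. Opening the output once in "w" mode avoids the separate write-then-append steps in my original script.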

Answer #1 (yacmzcpb)

If you are looking for an alternative, here is something you can try:

data = """1629881977,24.27
1629882037,24.28ÿÿ1629882097,24.29
1629882757,24.15
"""

# splitting on the garbage chars has the side effect of removing them
a = data.split('ÿÿ')

# then simply join() to reassemble the original data
b = '\n'.join(a)

Or, as a one-liner:

fixed_data = '\n'.join(data.split('ÿÿ'))
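
If the log is large, the same split idea can probably be applied line by line instead of on one big string. The following is only a sketch: `split_records` is a name made up for illustration, and it assumes every record is a plain `timestamp,value` pair with no quoted fields.

```python
import csv
import io


def split_records(lines, garbage="ÿÿ"):
    """Yield one list of fields per record, even when several records share a line."""
    for line in lines:
        # Drop the line ending, then split wherever the garbage chars appear.
        for chunk in line.rstrip("\r\n").split(garbage):
            if chunk:  # skip blank lines and empty pieces
                yield chunk.split(",")


# Small demonstration using the sample data from above.
sample = io.StringIO(
    "1629881977,24.27\n"
    "1629882037,24.28ÿÿ1629882097,24.29\n"
    "1629882757,24.15\n"
)
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(split_records(sample))
fixed_data = out.getvalue()
```

Because `split_records` accepts any iterable of lines, an open file object can be passed in place of the `io.StringIO` sample, so the whole file never has to be held in memory.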
