Converting an annotated CSV (InfluxDB) to a plain CSV file with a Python script

kkih6yb8  posted on 2023-04-03  in  Python

I have a CSV file that I downloaded from the InfluxDB UI, and I want to extract the useful data from it. A snippet of the downloaded file follows:

#group  FALSE   FALSE   TRUE    TRUE    FALSE   FALSE   TRUE    TRUE    TRUE    TRUE    TRUE
#datatype   string  long    dateTime:RFC3339    dateTime:RFC3339    dateTime:RFC3339    double  string  string  string  string  string
#default    mean                                        
    result  table   _start  _stop   _time   _value  _field  _measurement    smart_module    serial  type
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T08:20:00Z    0   sm_alarm    system_test 8   2.14301E+11 sm_extended
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T08:40:00Z    0   sm_alarm    system_test 8   2.14301E+11 sm_extended
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T09:00:00Z    0   sm_alarm    system_test 8   2.14301E+11 sm_extended
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  0   sm_alarm    system_test 8   2.14301E+11 sm_extended

I would like the output CSV to look like this:

_time                   sm_alarm  next_column next_column ....... ...........
2023-03-29T08:41:15Z    0

Note that sm_alarm is just one field among 9 others (listed under _field).
I tried the following script, but it did not solve my problem.

import csv

# Specify the input and output file names
input_file = 'influx.csv'
output_file = 'output.csv'

try:
    # Open the input file for reading
    with open(input_file, 'r') as csv_file:
        # Create a CSV reader object
        csv_reader = csv.reader(csv_file)

        # Skip the first row (header)
        next(csv_reader)

        # Open the output file for writing
        with open(output_file, 'w', newline='') as output_csv:
            # Create a CSV writer object
            csv_writer = csv.writer(output_csv)

            # Write the header row
            csv_writer.writerow(['_time', '_field', '_value'])

            # Iterate over the input file and write the rows to the output file
            for row in csv_reader:
                # Check if the row is not empty
                if row:
                    # Split the fields
                    fields = row[0].split(',')

                    # Write the row to the output file
                    csv_writer.writerow(fields)

    print(f'{input_file} converted to {output_file} successfully!')

except FileNotFoundError:
    print(f'Error: File {input_file} not found.')

except Exception as e:
    print(f'Error: {e}')

Thanks, everyone.

qij5mzcb (answer #1)

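The format of the expected output is ambiguous and not entirely clear.
But as a *starting point*, you can use read_csv from pandas to straighten out your file: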

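import pandas as pd

# The 4th line of the file holds the column names; drop the leading "result" name
with open("influx.csv", "r") as csv_file:
    headers = csv_file.readlines()[3].strip().split()[1:]

# Skip the three annotation lines and the header line, split on whitespace,
# then drop the first remaining column (table)
df = pd.read_csv("influx.csv", header=None, skiprows=4, sep=r"\s+",
                 engine="python", names=headers).iloc[:, 1:]

#df.to_csv("output.csv", index=False, sep=",") # <- uncomment this line to make a real csv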

Output:

print(df)

                           _start                           _stop                           _time  _value    _field _measurement  smart_module        serial         type
0  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z            2023-03-31T08:20:00Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
1  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z            2023-03-31T08:40:00Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
2  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z            2023-03-31T09:00:00Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
3  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T09:12:40.697076925Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
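If the goal is one row per _time with one column per _field value (which is how the sample output in the question reads), the long frame could then be pivoted. The following is only a sketch under that assumption: the inline sample data and the sm_temp field are invented for illustration, and only the column names _time, _field and _value come from the question.

```python
import io
import pandas as pd

# Hypothetical long-format sample using the question's column names;
# the sm_temp rows are invented purely to show a second field.
sample = io.StringIO(
    "_time,_value,_field\n"
    "2023-03-31T08:20:00Z,0,sm_alarm\n"
    "2023-03-31T08:20:00Z,21.5,sm_temp\n"
    "2023-03-31T08:40:00Z,0,sm_alarm\n"
    "2023-03-31T08:40:00Z,22.0,sm_temp\n"
)
long_df = pd.read_csv(sample)

# One row per timestamp, one column per distinct _field value.
wide_df = long_df.pivot(index="_time", columns="_field", values="_value")
wide_df.columns.name = None      # drop the "_field" label above the columns
wide_df = wide_df.reset_index()  # turn _time back into an ordinary column

print(wide_df)
# wide_df.to_csv("output.csv", index=False)  # uncomment to write the file
```

Note that pivot raises a ValueError if the same (_time, _field) pair occurs more than once; in that case pivot_table with an explicit aggfunc may be a better fit.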
  • If you share a clear expected output, I will update my answer accordingly.
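For completeness, a similar reshape can be sketched with only the standard csv module, closer to the attempt in the question. This assumes the real file is comma-separated (the snippet in the question looks whitespace-aligned, possibly from pasting) and contains a single header block. The helper name pivot_annotated_csv, the inline sample and the sm_temp field are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical comma-separated annotated CSV; sm_temp is an invented second field.
raw = """#group,false,false,true,true,false,false,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string
#default,mean,,,,,,
,result,table,_start,_stop,_time,_value,_field
,,0,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:20:00Z,0,sm_alarm
,,1,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:20:00Z,21.5,sm_temp
,,0,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:40:00Z,0,sm_alarm
"""

def pivot_annotated_csv(text):
    """Return (header, rows): one row per _time, one column per _field."""
    # Drop blank lines and annotation lines, which start with '#'.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(lines)  # first remaining line is the header
    by_time = defaultdict(dict)
    fields = []                     # field names in order of first appearance
    for rec in reader:
        if rec["_field"] not in fields:
            fields.append(rec["_field"])
        by_time[rec["_time"]][rec["_field"]] = rec["_value"]
    header = ["_time"] + fields
    # Rows are ordered by the timestamp string; missing values become "".
    rows = [[t] + [vals.get(f, "") for f in fields]
            for t, vals in sorted(by_time.items())]
    return header, rows

header, rows = pivot_annotated_csv(raw)
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(rows)
print(out.getvalue())
```

To work on a real file, the text could be read with open("influx.csv").read() and the result written to a file opened with newline="". An export that repeats the annotation and header block for several tables would need extra handling.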
