Print data from a csv file after finding a specific string

Posted by dwbf0jvd 11 months ago in Other

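I'm new to Python and eager to learn. I want to open a csv file and use its data, but only the data that comes after a specific string (s,s,N,s) (four columns of values, separated by ","). The ssNs string is not always on the same row, so I can't rely on a row number.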


Can you help me work with this data? My current code looks like this:

import pandas as pd
import math
import sys  
data = pd.read_csv(r'C:\Users\User\Documents\pythonProject\filename.csv', engine="python",sep=',',encoding='latin-1')

if len(row) > 1:
   if row[0].startswith('s,s,N,s'):
      print(row)


Answer #1, by rqqzpn5f

Here is a working example that I use in a personal project. Don't mind the chardet module; it is only there to detect the encoding.

import chardet
import pandas as pd


def read_file(path, keyword, delim=";", encoding="latin-1"):
    """
    Read a csv file into a data frame, starting at the first line that contains keyword.

    @param path - path to the file to read
    @param keyword - text marking the header line; all lines above it are skipped
    @param delim - delimiter used in the csv file, default is ";"
    @param encoding - encoding used to open the file for the search, default is "latin-1"
    @return data frame, or None if the keyword is not found
    """
    num = None
    with open(path, encoding=encoding) as f:
        # Scan every line, since the keyword is not always on the same row
        for i, line in enumerate(f):
            if keyword in line:
                # Guess the encoding from the bytes of the matching line
                detected = chardet.detect(line.encode(encoding))["encoding"]
                num = i
                break
    if num is None:
        return None
    # chardet reports plain ASCII text as "ascii"; latin-1 is a superset, so use that
    if detected is None or detected == "ascii":
        detected = "latin-1"
    try:
        df = pd.read_csv(path, delimiter=delim, skiprows=num, on_bad_lines="skip", encoding=detected)
    except Exception:
        # Fall back to tab-separated latin-1 if the first attempt fails
        df = pd.read_csv(path, delimiter="\t", skiprows=num, on_bad_lines="skip", encoding="latin-1")
    return df

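Just call this function with the path to your file. In your case the keyword would be 's,s,N,s', and since your values are separated by commas you would also pass delim=','.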
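If you would rather see the idea without pandas or chardet, the following is a minimal sketch using only the standard library csv module. It is not the function above, just an alternative illustration: the helper name rows_after_marker and the sample content are made up for the example, and it assumes the marker line starts with s,s,N,s exactly as written in the question.

```python
import csv
import io


def rows_after_marker(lines, marker="s,s,N,s", delimiter=","):
    """Return the csv rows that follow the first line starting with marker.

    lines can be any iterable of text lines, e.g. an open file.
    Returns None if the marker is never found.
    """
    found = False
    data_lines = []
    for line in lines:
        if found:
            # Everything after the marker line is treated as data
            data_lines.append(line)
        elif line.strip().startswith(marker):
            found = True
    if not found:
        return None
    # Parse the collected lines and drop empty rows
    return [row for row in csv.reader(data_lines, delimiter=delimiter) if row]


# Made-up sample content, only to show the call (not the asker's real file)
sample = io.StringIO(
    "some header text\n"
    "more preamble\n"
    "s,s,N,s\n"
    "0.1,0.2,5,0.3\n"
    "0.4,0.5,6,0.7\n"
)
rows = rows_after_marker(sample)
print(rows)
```

For a real file you would pass the open file object instead of the sample, for instance inside a with open(path, encoding="latin-1", newline="") as f: block. The values come back as strings, so convert them with float() or int() before doing any math on them.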
