从大于5条记录的csv文件中提取数据

bnl4lu3b  于 2023-09-28  发布在  其他
关注(0)|答案(1)|浏览(106)

更多内容请看我的第一篇文章:第一邮报
我需要从CSV文件中提取数据到现有的Excel文件中。我对自定义函数和Excel的整体机制知之甚少,因此在GPT 3.5的帮助下,我开发了一个自定义函数,用于从具有2条记录的测试csv文件中提取数据。
下面是GPT给我的函数,根据我得到的反馈做了一些调整:

Function GetCSVCellValueFromRecord(csvFilePath As String, recordIndex As Long, targetColumnName As String) As Variant
    Dim csvContent As String
    Dim lines() As String
    Dim headers() As String
    Dim columnIndex As Long
    Dim i As Long
    
    ' Read the entire CSV file into a string
    Open ThisWorkbook.Path & "\" & csvFilePath For Input As #1
    csvContent = Input$(LOF(1), 1)
    Close #1
    
    ' Split the CSV content into lines
    lines = Split(csvContent, vbCrLf)
    
    ' Get the headers from the first line
    headers = Split(lines(0), ",")
    
    ' Find the column index of the target data
    columnIndex = -1
    For i = LBound(headers) To UBound(headers)
        If Trim(headers(i)) = targetColumnName Then
            columnIndex = i
            Exit For
        End If
    Next i
    
    ' Return an error if the column name is not found
    If columnIndex = -1 Then
        GetCSVCellValueFromRecord = CVErr(xlErrValue)
        Exit Function
    End If
    
    ' Check if the requested record index is within bounds
    If recordIndex >= 1 And recordIndex <= UBound(lines) Then
        Dim fields() As String
        fields = Split(lines(UBound(lines) - recordIndex), ",")  ' Remove the subtraction of 1 here
        If UBound(fields) >= columnIndex Then
            GetCSVCellValueFromRecord = Trim(fields(columnIndex))
            Exit Function
        End If
    End If

    
    ' Return "N/A" if the record or data is not found
    GetCSVCellValueFromRecord = "N/A"
End Function

我有一个包含所需csv文件名称的单元格,我的函数调用引用此单元格,以便使用csv中的信息更新所有相关字段。
函数调用示例:=GetCSVCellValueFromRecord(F1, 1, "Comment")如果找不到记录,则返回“N/A”。
该函数可以很好地处理一个包含5条记录的小csv文件,除了不存在的第6条记录,它返回指定列的名称,然后正确地显示不存在的第7条到第20条记录的“N/A”。我不确定是什么导致了这个错误。当测试一个有10条记录的文件时,该函数会完全失败,并为所有字段返回错误的数据,为应该存在的字段返回N/A。
我再一次不确定这个问题。请告知
csv文件的第一行是最近的记录,出于我的目的,我需要将最旧的记录作为第一行
来自工作csv的数据:

Date,Time,Service provider,Client name,Client phone,Comment,Service category
24/8/2023,01:00 PM - 01:55 PM,Tim Robinson,Severus Mitsu,18765894838,"5.Checked only 1 mirror before moving off
",Manual General License
24/8/2023,10:15 AM - 11:10 AM,Tim Robinson,Severus Mitsu,18765894838,"4.Started the vehicle properly
",Manual General License
24/8/2023,09:20 AM - 10:15 AM,Tim Robinson,Severus Mitsu,18765894838,"3.Failed to disengage parking brake
",Manual General License
24/8/2023,08:25 AM - 09:20 AM,Tim Robinson,Severus Mitsu,18765894838,"2.Failed to engage seatbelt
",Manual General License
24/8/2023,07:30 AM - 08:25 AM,Tim Robinson,Severus Mitsu,18765894838,"1.Successfully Opened door
",Manual General License

来自损坏csv的数据:

Date,Tim e,Service,"Service provider","Client name","Client phone",Comment,"Service category"
26-08-2023,"01:00 PM - 01:55 PM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"10. Excessive Speeding","Manual General License"
26-08-2023,"10:15 AM - 11:10 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"9.Ran the red lights","Manual General License"
26-08-2023,"09:20 AM - 10:15 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"8.Mastered Parallel Parking","Manual General License"
26-08-2023,"08:25 AM - 09:20 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"7.botched attempt at parallel parking","Manual General License"
26-08-2023,"07:30 AM - 08:25 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"6. Making good progress","Manual General License"
24-08-2023,"01:00 PM - 01:55 PM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"5.Checked only 1 mirror before moving off
","Manual General License"
24-08-2023,"10:15 AM - 11:10 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"4.Started the vehicle properly
","Manual General License"
24-08-2023,"09:20 AM - 10:15 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"3.Failed to disengage parking brake
","Manual General License"
24-08-2023,"08:25 AM - 09:20 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"2.Failed to engage seatbelt
","Manual General License"
24-08-2023,"07:30 AM - 08:25 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"1.Success fully Opened door
","Manual General License"
eufgjt7s

eufgjt7s1#

某些引用的字段值包含换行符。vbCrLf和逗号分割只有在文件的行间和字段之间只有这些东西时才有效:如果字段值可能包含其中任何一个,则不起作用。
为了让它更清楚,这将显示换行符在文件中的位置(vbCrLf/vbLf/vbCr中的任何一个):

ff = FreeFile ' Read the entire CSV file into a string...
    Open ThisWorkBook.path & "\" & csvFilePath For Input As ff
    csvContent = Input$(LOF(ff), ff)
    Close ff
    
    csvContent = Replace(csvContent, vbCrLf, "<CrLf>")
    csvContent = Replace(csvContent, vbCr, "<Cr>")
    csvContent = Replace(csvContent, vbLf, "<Lf>")
    csvContent = Replace(csvContent, "<CrLf>", "<CrLf>" & vbLf)
    Debug.Print csvContent

某些记录在引号 * 中有CrLf * -这些不是分隔行的新行,而是嵌入字段数据中的新行。

Date,Tim e,Service,"Service provider","Client name","Client phone",Comment,"Service category"<CrLf>
26-08-2023,"01:00 PM - 01:55 PM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"10. Excessive Speeding","Manual General License"<CrLf>
26-08-2023,"10:15 AM - 11:10 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"9.Ran the red lights","Manual General License"<CrLf>
26-08-2023,"09:20 AM - 10:15 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"8.Mastered Parallel Parking","Manual General License"<CrLf>
26-08-2023,"08:25 AM - 09:20 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"7.botched attempt at parallel parking","Manual General License"<CrLf>
26-08-2023,"07:30 AM - 08:25 AM","Nissan Frontier #CO 026","Harold  Richards","Severus Mitsu",+18765894838,"6. Making good progress","Manual General License"<CrLf>
24-08-2023,"01:00 PM - 01:55 PM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"5.Checked only 1 mirror before moving off<CrLf>
","Manual General License"<CrLf>
24-08-2023,"10:15 AM - 11:10 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"4.Started the vehicle properly<CrLf>
","Manual General License"<CrLf>
24-08-2023,"09:20 AM - 10:15 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"3.Failed to disengage parking brake<CrLf>
","Manual General License"<CrLf>
24-08-2023,"08:25 AM - 09:20 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"2.Failed to engage seatbelt<CrLf>
","Manual General License"<CrLf>
24-08-2023,"07:30 AM - 08:25 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"1.Success fully Opened door<CrLf>
","Manual General License"<CrLf>

EDIT:这成功地从“问题”文件中读取内容:

Sub Tester()
    Debug.Print GetCSVCellValueFromRecord("Test2.csv", 1, "Comment")
    '>> 1.Success fully Opened door
    Debug.Print GetCSVCellValueFromRecord("Test2.csv", 2, "Comment")
    '>> 2.Failed to engage seatbelt
    Debug.Print GetCSVCellValueFromRecord("Test2.csv", 1, "Commentx")
    '>> ?header?
    Debug.Print GetCSVCellValueFromRecord("Test2.csv", 4, "Comment")
    '>> 4.Started the vehicle properly
    Debug.Print GetCSVCellValueFromRecord("Test2.csv", 30, "Comment")
    '>> ?rec num?
End Sub

Function GetCSVCellValueFromRecord(csvFilePath As String, recordIndex As Long, targetColumnName As String) As Variant
    Dim numLines As Long, lines As Collection
    Dim headers() As String, fields() As String
    Dim columnIndex As Long, i As Long
   
    Set lines = CSVToLines(ThisWorkBook.path & "\" & csvFilePath)
    numLines = lines.Count
    If numLines = 0 Then
        GetCSVCellValueFromRecord = "?no data?"
        Exit Function
    End If
    
    ' Find the column index of the target data
    headers = lines(1) ' Get the headers from the first line
    columnIndex = -1
    For i = LBound(headers) To UBound(headers)
        If Trim(headers(i)) = targetColumnName Then
            columnIndex = i
            Exit For
        End If
    Next i
    
    ' check the column name was found
    If columnIndex = -1 Then
        GetCSVCellValueFromRecord = "?header?"
    Else
        ' record index is within bounds?
        If recordIndex >= 1 And recordIndex <= lines.Count - 1 Then
            fields = lines(numLines - (recordIndex - 1))
            If UBound(fields) >= columnIndex Then
                GetCSVCellValueFromRecord = Trim(fields(columnIndex))
                Exit Function
            End If
        Else
            GetCSVCellValueFromRecord = "?rec num?"
        End If
    End If
End Function

'Return the content from a CSV file as a Collection of
'  arrays, one array per line
'Account for quoted fields containing commas, quotes, or newlines
Function CSVToLines(csvFilePath As String) As Collection

    Dim csvContent As String, line As String, cNext As String
    Dim c, col As New Collection
    Dim i As Long, ff As Integer, length As Long, inQ As Boolean
    
    ff = FreeFile ' Read the entire CSV file into a string...
    Open csvFilePath For Input As ff
    csvContent = Input$(LOF(ff), ff)
    Close ff
    
    csvContent = Replace(csvContent, vbCrLf, vbLf) & " " 'normalize newlines and add a space
    length = Len(csvContent) - 1                         'skip the added space when looping...
    inQ = False                                          'not yet inside a quoted field
    i = 1
    Do While i < length
        c = Mid(csvContent, i, 1)
        cNext = Mid(csvContent, i + 1, 1)
        If c = """" Then       'already in quotes?
            If cNext = """" Then
                line = line & c     'was a doubled-up quote - add one quote
                i = i + 1           'skip the next character
            Else
                inQ = Not inQ       'switching in/out of a quoted field
            End If
        Else
            If Not inQ Then
                'Not in a quoted field - decide how to handle the character
                Select Case c
                    Case vbLf   'not inside quotes, so this is the end of a line...
                        col.Add Split(line, Chr(0))  'split fields on chr(0)
                        line = ""
                    Case ",": line = line & Chr(0) 'sub comma for chr(0) for later splitting
                    Case Else: line = line & c
                End Select
            Else
                'In a quoted field: add each character unless a vblf
                '  change behaviour to suit your needs....
                Select Case c
                    Case vbLf 'do nothing?
                    Case Else: line = line & c
                End Select
            End If
        End If
        i = i + 1
    Loop
    If Len(line) > 0 Then col.Add Split(line, Chr(0)) 'add any remaining line
    Set CSVToLines = col
End Function

相关问题