How to make a script merge all .txt files into one .csv file with multiple columns in PowerShell

w6mmgewl  posted on 2023-01-18  in Shell

I can't figure out how to merge multiple .txt data files into one .csv file, with each .txt file placed in its own column.
This is my code so far:

$location = (Get-Location).Path
$files = Get-ChildItem $location -Filter "*.asd.txt"
$data = @()

foreach ($file in $files) {
    $fileData = Get-Content $file.FullName

    foreach ($line in $fileData) {
        $lineData = $line -split "\t"
        $data = $lineData[1]
        Add-Content -Path "$location\output.csv" -Value  $data
    } 

}

Each file looks like this:

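I want to keep the first column, "Wavelength", and put the second column of every file in the folder side by side. Each header will start with the exact name "stovikmladyDoupno2 2020080500001.asd" or "stovikmladyDoupno2 2020080500002.asd", and so on...
So it should look like this: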

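I have been searching for information for two days, but I still can't figure it out. I tried appending "" to the end of the file, thinking Excel would handle it, but that didn't help.
Here are some files as test data: https://mega.nz/folder/zNhTzR4Z#rpc-BQdRfm3wxl87r9XUkw
A few lines of data: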

Wavelength  stovikmladyDoupno2 2020080500000.asd
350  6.38961399706465E-02 
351  6.14107911262903E-02 
352  6.04866108251357E-02 
353  5.83485359067184E-02 
354  0.054978792413247 
355  5.27014859356317E-02 
356  5.34849237528764E-02 
357  5.32841277775603E-02 
358  5.23466655229364E-02 
359  5.47595002186027E-02 
360  5.22061034631109E-02 
361  4.90149806042666E-02 
362  4.81633530421385E-02 
363  4.83974076557941E-02 
364  4.65219929658367E-02 
365  0.044800930294557 
366  4.47830287392802E-02 
367  4.46947539436297E-02 
368  0.043756926558447 
369  4.31725380363072E-02 
370  4.36867609723618E-02 
371  4.33227601805265E-02 
372  4.29978664449687E-02 
373  4.23860463187361E-02 
374  4.12183604375401E-02 
375  4.14306521081773E-02 
376  4.11760903772502E-02 
377  4.06421127128478E-02 
378  4.09771489689262E-02 
379  4.10083126746385E-02 
380  4.05161601354181E-02 
381  3.97904564387456E-02
bxgwgixi  #1

Hope this helps!
I assumed a location, because I don't like declaring file paths without a literal path. Adjust the path as needed.

$Files = Get-ChildItem J:\Test\*.txt -Recurse 

$Filecount = 0

$ObjectCollectionArray = @()

#First parse and collect each row in an array, while keeping the datetime information from the filename.

foreach($File in $Files){

$Filecount++
Write-Host $Filecount 

$DateTime = $File.fullname.split(" ").split(".")[1]

$Content = Get-Content $File.FullName

foreach($Row in $Content){

    $Split = $Row.Split("`t")

    if($Split[0] -ne 'Wavelength'){

        $Object = [PSCustomObject]@{
            'Datetime' = $DateTime
            'Number' = $Split[0]
            'Wavelength' = $Split[1]
        }

        $ObjectCollectionArray += $Object
    }
}    
}

#Match by number and create a new object with relation to the number and different datetime. 

$GroupedCollection = @()
$Grouped =  $ObjectCollectionArray | Group-Object number

foreach($GroupedNumber in $Grouped){
    $NumberObject = [PSCustomObject]@{
            'Number' = $GroupedNumber.Name
    }

foreach($Occurance in $GroupedNumber.Group){
        $NumberObject | Add-Member -NotePropertyName $Occurance.Datetime -NotePropertyValue $Occurance.wavelength
}

$GroupedCollection += $NumberObject

}

$GroupedCollection | Export-Csv -Path J:\Test\result.csv -NoClobber -NoTypeInformation
myzjeezk  #2

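What you're trying to do is a fairly hard task, and there are a few ways to do it. This method requires all of the files to be in memory in order to process them. You can definitely treat these files as TSVs, so Import-Csv -Delimiter "`t" is an option that lets you work with objects instead of plain text.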

# using this temp dictionary to create objects for each line of each tsv
$tmp  = [ordered]@{}
# get all files and enumerate
$csvs = Get-ChildItem $location -Filter *.asd.txt | ForEach-Object {
    # get their content as objects
    $content  = $_ | Import-Csv -Delimiter "`t"
    # get their property Name that is not `Wavelength`
    $property = $content[0].PSObject.Properties.Where{ $_.Name -ne 'Wavelength' }.Name

    # output an object holding the total lines of this csv,
    # its content and the property name of interest
    [pscustomobject]@{
        Lines    = $content.Count
        Content  = $content
        Property = $property
    }
}

# use a scriptblock to allow streaming so `Export-Csv` starts exporting as
# output is going through the pipeline
& {
    # for loop used for each line of the Tsv having the highest number of lines
    for($i = 0; $i -lt [System.Linq.Enumerable]::Max([int[]] $csvs.Lines); $i++) {
        # this boolean is used to preserve the "Wavelength" value of the first Tsv
        $isFirstCsv = $true

        foreach($csv in $csvs) {
            # if this is the first object
            if($isFirstCsv) {
                # add the value of "Wavelength"
                $tmp['Wavelength'] = $csv.Content[$i].Wavelength
                # and set the bool to false, since we are only using this once
                $isFirstCsv = $false
            }
            # then add the value of each property of each Tsv to the temp dictionary
            $tmp[$csv.Property] = $csv.Content[$i].($csv.Property)
        }

        # then output this object
        [pscustomobject] $tmp
        # clear the temp dictionary
        $tmp.Clear()
    }
} | Export-Csv path\to\result.csv -NoTypeInformation
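
To see what Import-Csv -Delimiter "`t" gives you for a single file, here is a minimal sketch. It writes two sample rows to a temporary file and reads back the name of the column that is not Wavelength. The temp file and the $sample variable are assumptions made for illustration only, they are not part of the code above.

```powershell
# Hypothetical sample: two rows in the same tab-separated layout as the question's data
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.asd.txt'
Set-Content -Path $sample -Value @(
    "Wavelength`tstovikmladyDoupno2 2020080500000.asd"
    "350`t6.38961399706465E-02 "
    "351`t6.14107911262903E-02 "
)

# Each data row becomes an object with one property per header column
$content = @(Import-Csv -Path $sample -Delimiter "`t")

# Name of the column that is not 'Wavelength' (the per-file header)
$property = $content[0].PSObject.Properties.Where{ $_.Name -ne 'Wavelength' }.Name

$property                           # the per-file column header
$content[0].Wavelength              # wavelength of the first row
($content[0].$property).Trim()      # value of the first row, without surrounding spaces

# Clean up the temporary sample file
Remove-Item $sample
```

Because the header contains a space, the value has to be read through a variable ($content[0].$property) or a quoted property name rather than plain dot notation.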
7nbnzgx9  #3

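Here is a more efficient approach that treats the files as plain text. It is faster and more memory-efficient, but less reliable: lines are matched purely by position, so it assumes every file lists the same wavelengths in the same order. It uses a StreamReader to read each file line by line and a StringBuilder to construct each output line.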

& {
    # get all files and enumerate
    $readers = Get-ChildItem $location -Filter *.asd.txt | ForEach-Object {
        # create a stream reader for each file
        [System.IO.StreamReader] $_.FullName
    }

    # this StringBuilder is used to construct each line
    $sb = [System.Text.StringBuilder]::new()
    # while any of the readers has more content
    while($readers.EndOfStream -contains $false) {
        # signals this is our first Tsv
        $isFirstReader = $true
        # enumerate each reader
        foreach($reader in $readers) {
            # if this is the first Tsv
            if($isFirstReader) {
                # append the line as-is, only trimming excess white space
                $sb = $sb.Append($reader.ReadLine().Trim())
                $isFirstReader = $false
                # go to next reader
                continue
            }
            # if this is not the first Tsv,
            # split on Tab and exclude the first token (Wavelength)
            $null, $line = $reader.ReadLine().Trim() -split '\t'
            # append a Tab + this line
            $sb = $sb.Append("`t$line")
        }
        # append a new line and output the constructed string
        $sb.AppendLine().ToString()
        # and clear it for next lines
        $sb = $sb.Clear()
    }

    # dispose all readers when done
    $readers | ForEach-Object Dispose
} | Set-Content path\to\result.tsv -NoNewline
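
One thing the loop above relies on is that all files have the same number of lines: ReadLine() returns $null once a reader is exhausted, and calling .Trim() on $null raises an error. Below is a minimal sketch of a guard, assuming you would rather emit an empty cell for a shorter file. The helper name Read-Cell is hypothetical, and the demo uses in-memory StringReader objects instead of real files.

```powershell
# Hypothetical helper: returns the value column of the next line,
# or an empty string if the reader has no more lines
function Read-Cell {
    param([System.IO.TextReader] $Reader)

    $line = $Reader.ReadLine()
    if ($null -eq $line) { return '' }       # reader is exhausted
    # drop the first token (Wavelength) and keep the value
    $null, $value = $line.Trim() -split '\t'
    return "$value"
}

# Demo with in-memory readers of different lengths (StringReader is also a TextReader)
$long  = [System.IO.StringReader]::new("350`t0.06 `n351`t0.05 ")
$short = [System.IO.StringReader]::new("350`t0.07 ")

# First pass: both readers still have a line
$first  = "$(Read-Cell $long)`t$(Read-Cell $short)"
# Second pass: the short reader is exhausted and contributes an empty cell
$second = "$(Read-Cell $long)`t$(Read-Cell $short)"
```

In the loop above, the non-first readers could call such a helper in place of $reader.ReadLine().Trim() -split '\t'; the first reader would need a similar $null check of its own.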
