PowerShell - Loop to split and process large files

xjreopfe  posted on 2022-12-13  in Shell

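I have many SFTP log files, each about 1 GB with millions of lines. For each file, I need to extract only the accounts that managed to log in.
My idea was to first extract the lines in which an account appears from the log file into a .csv file, then use "Split" to extract only the account name and write it to another .csv file. It works, but with this much data the split step takes a lot of time and resources, even for a single file.
As a PowerShell beginner, I wonder whether there is a better way to do this, or a way to make it run faster. I suspect my script is slow because every time I extract an account I write it to the .csv file. Maybe there is a way to keep the results in memory and then export them to the .csv file in one go?
Thanks for your help!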

#Location of sftp logs in .txt
$log_file = get-childitem -name C:\temp\logs_sftp\Logs

#Extraction of lines where an account logged in
foreach ($File in $log_file)
{
get-date
Select-String -Path C:\Temp\logs_sftp\Logs\$file -Pattern "logged in" | select-string -Pattern "230" -NotMatch | select-string -Pattern "530 Not logged in" -NotMatch | Export-Csv C:\Temp\logs_sftp\Logs\$file.csv
get-date
}

#Location of sftp logs with only the account lines in .csv

#Split loop to extract only the account
$log_file_csv = get-childitem -name C:\temp\logs_sftp\Logs\*.csv
foreach ($File in $log_file_csv)
{    
$ToSplit = Get-Content C:\temp\logs_sftp\Logs\$file | Select-Object -Skip 2
$ToSplit | ForEach-Object {
    $aItems = $_ -split { $_ -eq " "}
    $aItems[$aItems.length-4]  >> C:\Temp\logs_sftp\Result\export_split.csv
    }
}

Edit: as requested, here are a few lines from one of the SFTP logs (only the lines containing accounts):

"True","12","[5] Tue 01Nov22 00:00:00 - (63025700) User User_1 logged in","LogsServ-U01112022.txt","C:\Temp\logs_mutualise\Logs\LogsServ-U01112022.txt","logged in",,"System.Text.RegularExpressions.Match[]"
"True","31","[5] Tue 01Nov22 00:00:00 - (63025701) User User_2 logged in","LogsServ-U01112022.txt","C:\Temp\logs_mutualise\Logs\LogsServ-U01112022.txt","logged in",,"System.Text.RegularExpressions.Match[]"
"True","49","[5] Tue 01Nov22 00:00:00 - (63025702) User User_3 logged in","LogsServ-U01112022.txt","C:\Temp\logs_mutualise\Logs\LogsServ-U01112022.txt","logged in",,"System.Text.RegularExpressions.Match[]"

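I should have made clear that the goal is to get a list of the users who have connected to the SFTP server at least once. The file I extracted this example from has more than 700,000 lines. What I want: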

User_1
User_2
User_3

kpbpu008 #1

For those who need it, this is how I do it now:

$log_file = get-childitem -name C:\temp\logs_mut\Logs

#loop to extract lines from files
foreach ($File in $log_file)
{
Select-String -Path C:\Temp\logs_mut\Logs\$file -Pattern "logged in" | select-string -Pattern "230" -NotMatch | select-string -Pattern "530 Not logged in" -NotMatch | Export-Csv C:\Temp\logs_mut\Logs\$file.csv
}

#Loop to extract only the account name
$log_file_csv = get-childitem -name C:\temp\logs_mut\Logs\*.csv
foreach ($File in $log_file_csv)
{    
$ToSplit = Get-Content C:\temp\logs_mut\Logs\$file | Select-Object -Skip 2
$ToSplit -replace '"True.+User ','' -replace ' logged.+$','' | sort-object | get-unique >> C:\Temp\logs_mut\Result\$file.result.csv
}
 

#Put results in one file
$date_result = get-date -format ddMMyy
get-content C:\Temp\logs_mut\result\*.csv | sort-object | get-unique >> C:\Temp\logs_mut\final\Logs-$date_result-Result.csv

Much faster!
Thanks to everyone who helped me.
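
As a possible further step, the "keep the results in memory and write once" idea from the question can be sketched as a single pass that skips the intermediate .csv files. This is only a minimal sketch, not the method used above: the function name Get-LoggedInUser is hypothetical, and the regex assumes every successful login line has the "User <name> logged in" shape shown in the sample lines. The separate "230" and "530 Not logged in" filters from the scripts above are not reproduced, so the result should be checked against real logs.

```powershell
# Hypothetical helper (not part of the original post): collects the unique
# account names from "User <name> logged in" lines in one streaming pass.
function Get-LoggedInUser {
    param(
        [Parameter(Mandatory)]
        [string[]] $Path   # one or more log files; wildcards are allowed
    )

    # Assumed line shape: "... User <name> logged in".
    # A bare "530 Not logged in" line does not match because it has no "User <name>" part.
    $pattern = [regex] 'User (\S+) logged in'

    # A HashSet keeps each account once, so no sort/unique pass over millions of rows is needed.
    $users = [System.Collections.Generic.HashSet[string]]::new()

    # Convert-Path expands wildcards and yields absolute paths, which the .NET call below needs.
    foreach ($file in (Convert-Path -Path $Path)) {
        # ReadLines streams the file line by line instead of loading it whole.
        foreach ($line in [System.IO.File]::ReadLines($file)) {
            $m = $pattern.Match($line)
            if ($m.Success) { [void] $users.Add($m.Groups[1].Value) }
        }
    }

    # Only the (small) set of distinct names is sorted and returned.
    $users | Sort-Object
}
```

It could be called as Get-LoggedInUser -Path C:\Temp\logs_mut\Logs\*.txt and piped to Set-Content with the result file path, so the output file is written once at the end rather than once per extracted account.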
