Filtering large log files on Linux for a report

Posted by sbtkgmzw on 2023-11-17 in Linux

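I have a large set of logs, more than 26,000 files, and each file has content like the sample below. I need to filter out all the lines that have a 404 for JSON content. In the sample below, I need to get only the last line, because it has a 404 and is not JSON content. Any help writing a filter regular expression would be appreciated.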


#Version: 1.0

#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type

2015-07-28 11:34:00 57 MAD50 658 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404-NDS % 2520VM%2520Engine/002%2520Apr%252004%25202014%2520(OSD:%252032%2520; SD)--Error tdlmnsfrOCxOelbe82y3kIp_QfbBF7S3dDCn4rHR65JOMkOtZu4dz A == www.example.com http 151 0.004-Error 2015-07-28 11:34:53 SIN3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404-NDS%2520VM %2520Engine/002%2520Apr%252004% 124.13.170.152mediaInfo_211.json 404-NDS%2520VM%2520Engine/002%2520Apr%252004%2520201 4%2520(OSD:%252032%2520; SD)--Error bvLIe540oNMCeZ0QpOmX1OKoClgNgvSWppGuOmgVS85WnAXKJ1ryDg == www.cnX1000000.example.com http 151 0.002-Error 2015-07-28 11:34:54 SIN3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404-NDS%252 0VM%2520Engine/002%2520Apr%252004% d2v2sjgehuhalt.cloudfront.net211.json 404-NDS%2520VM%2520Engine/002%2520Apr%252004%25202014%2520(OSD:%252032%2520; SD)--Error hTbk9HE5nyFSla1DmeC1D1jhuMtoUY6E7QQvyf0v1YYJ1GBp-I40bw == www.example.com http 151 0.001-Error 2015-07-28 11:34:55 SIN3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404-NDS%2520Error@%2520Engine/002%2520Apr%252004%25202014% pdl.astro.com.my HD)--Error avWgysZyGeGXdt.ZHLfP5uLJ4ie5Hx8pa6ZJC5GHXfvOkyEXXp8o0g == www.example.com http 151@.001-Error 2015-07-28 11:34:55 SIN3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404-NDS%25 20VM%2520Engine/002%2520Apr%252004%25202014%2520(OSD:%252032%2520; SD)--Error wBepjCn58o9AiTifvtrCprkjdAdg--zsLTsjDpUBkxnEU5tahmJxxQ == www.wbjCn58o9AiTifvtrCprkjdAdg@.example.com http 151 0.004-Error 2015-07-28 11:34:55 SIN3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404-NDS%252 0VM % 14.192.214.93(OSD:%252032%2520; SD)--Error pbmzjYvLFIlLeth6mN2Yox9DH4vap1hcFHuJgNosd0XHVSxGdRcrWw == www.example.com http 151 0.004-Error 2015-07-28 11:34:55 SIN 3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.json 404

  • Error pbmzjYvLFIlLeth6mN2Yox9DH4vap1hcFHuJgNosd0XHVSxGdRcrWw == www.example.com http 151 0.004-Error 2015-07-28 11:34:55 SIN 3 659 www.example.com GET www.example.com/thumbnail/mediaInfo_211.jpg 404
Answer #1 (bcs8qyzn)

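If you want to parse large HTTP logs, you should use visitors. If you want JSON output, since this community is about coding, you could extend it to produce that.
Otherwise, for your original question, here is an awk approach that skips the 404 lines for .json files and prints everything else: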

awk '$NF == 404 && $(NF -1) ~ /\.json$/ { next; } {print}' /path/to/yourfile.log

$NF == 404  # the last field is 404
$(NF -1)    # the field before the last
~ /\.json$/ # ends with .json
{ next; }   # skip this line
{ print }   # print anything else
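
The command above prints every line except the 404s for .json files, so header lines and non-404 requests pass through as well. If only the 404 lines that are not .json requests are wanted, as the question describes, the test can be inverted. This is a minimal sketch, assuming (as the command above does) that the status is the last field and the URI is the field before it. The function names, the /path/to/logs directory and the *.log pattern are hypothetical:

```shell
# Print only lines whose last field is 404 and whose second-to-last field
# does not end in .json (the inverse of skipping them and printing the rest).
filter_404_non_json() {
    awk '$NF == 404 && $(NF - 1) !~ /\.json$/' "$@"
}

# Apply the same filter to every matching file under a directory.
# "-exec ... {} +" hands the files to awk in batches, which helps
# when there are tens of thousands of them.
filter_404_non_json_dir() {
    find "$1" -type f -name '*.log' \
        -exec awk '$NF == 404 && $(NF - 1) !~ /\.json$/' {} +
}
```

Something like `filter_404_non_json_dir /path/to/logs > report.txt` would then collect the matching lines from all files into one report. If the real log lines carry more fields after the status, the field numbers would need adjusting.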
