Windows batch file regex search and replace

flvlnr44 posted on 2022-12-05 in Windows

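I have a set of data
7859 10000:00 7859 10000:00 (xfer#1, to-check=1033/1035)
32768 000:17 22174479 10000:00 (xfer#2, to-check=1032/1035)
These lines are read from a file and passed one at a time to a method in a batch script. Inside that method, I want to extract only
7859
22174479
from these lines. Basically, whatever follows "\d+:\d\d\s+" is the number I need, and it is in turn followed by another "\d\d.*".
Is this possible using only batch script regex search and replace? I have tried things and read a bunch of articles but could not find a solution. I want to add up the numbers.
Thank you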

Edit

  • Per Andrei's comment on David Ruhmann's answer, what Andrei wants is the token 2 positions before (xfer#, not the 3rd token from the start of the line.

qpgpyjmq1#

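Note that batch is not the best language for regex! Cmd processes input one line at a time, whereas regex allows multi-line processing.

It sounds like you just need to grab a token from each of those lines. Assume that a more complete regex for the line looks like (?:\d+\s+\d+:\d\d\s+)+\(xfer#\d+, to-check=\d+/\d+\)
That tells us the line has constant delimiters: the colon : and whitespace \s+. From there we only need to use those anchors to locate the token.
Grab the third whitespace-delimited token from the line.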

for /f "tokens=3" %%A in ("line") do echo %%A

Grab the second whitespace-delimited token from the second colon-delimited token of the line.

for /f "tokens=2 delims=:" %%A in ("line") do (
    for /f "tokens=2" %%B in ("%%A") do echo %%B
)
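
As a rough illustration of what the two for /f commands above select, here is a minimal JavaScript sketch (runnable under Node.js). JavaScript is used only because it is close to the JScript that appears later in this thread, and the helper names tokens, thirdToken and secondTokenAfterFirstColon are hypothetical. The sketch relies on for /f treating a run of delimiters as a single delimiter.

```javascript
// Hypothetical helper: split text on runs of delimiter characters and drop
// empty pieces, which approximates how `for /f` tokenizes a line.
function tokens(text, delims) {
  const out = [];
  let current = "";
  for (const ch of text) {
    if (delims.includes(ch)) {
      if (current !== "") out.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") out.push(current);
  return out;
}

// Counterpart of: for /f "tokens=3" %%A in ("line") do echo %%A
function thirdToken(line) {
  return tokens(line, " \t")[2];
}

// Counterpart of the nested loops: second colon-delimited token first,
// then the second whitespace-delimited token inside it.
function secondTokenAfterFirstColon(line) {
  const segment = tokens(line, ":")[1];
  return segment === undefined ? undefined : tokens(segment, " \t")[1];
}

const line1 = "7859 10000:00 7859 10000:00 (xfer#1, to-check=1033/1035)";
const line2 = "32768 000:17 22174479 10000:00 (xfer#2, to-check=1032/1035)";
console.log(thirdToken(line1), thirdToken(line2));
console.log(secondTokenAfterFirstColon(line1), secondTokenAfterFirstColon(line2));
```

Both approaches count from the start of the line, so they only give the wanted number when every line has the same number of leading fields.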

Update

Extract the second token before the last colon.

@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "Line=32768 004:47 2686976 2200:03 11707819 10000:01 (xfer#5264, to-check=1020/6975)"

set "Last="
rem Clear This too, so a value inherited from the environment is never used.
set "This="
for /f "delims=" %%A in ('echo("%Line::="^&echo("%"') do (
    for /f "tokens=2" %%B in ("%%A") do (
        if defined This set "Last=!This!"
        set "This=%%B"
    )
)
echo %Last%

endlocal
pause >nul

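Limitations

1. A line containing an odd number of double quotes " will break the script. One way to prevent this is to strip the quotes before the for loop with set Line=%Line:"=%.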
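
The bookkeeping in the update script can be hard to follow, so here is a minimal JavaScript sketch of the same idea (Node.js assumed; the function name secondTokenBeforeLastColon is hypothetical). It splits the line at each colon, takes the second word of every piece, and returns the value from the next-to-last piece. Two simplifying assumptions: the quotes that the batch version wraps around each piece are ignored, and pieces with fewer than two words are skipped.

```javascript
// Hypothetical helper mirroring the This/Last variables of the batch loop.
function secondTokenBeforeLastColon(line) {
  let last;     // second word of the previous colon-separated piece
  let current;  // second word of the piece being processed
  for (const piece of line.split(":")) {
    const words = piece.split(/[ \t]+/).filter((w) => w !== "");
    if (words.length < 2) continue; // assumption: such pieces are ignored
    if (current !== undefined) last = current;
    current = words[1];
  }
  return last;
}

const sample =
  "32768 004:47 2686976 2200:03 11707819 10000:01 (xfer#5264, to-check=1020/6975)";
console.log(secondTokenBeforeLastColon(sample));
```

Because the value is taken relative to the last colon rather than the start of the line, the number of leading fields no longer matters.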


cgyqldqp2#

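Based on your comment on David Ruhmann's answer, you want the token 2 positions before the (xfer# string. I suppose it can be done with native batch commands, but it would be a serious pain.
I am assuming you are restricted to commands that come with Windows, with no downloaded executables.
I hope you can use JScript, since it ships with Windows.
I have written a hybrid JScript/batch utility script called "REPL.BAT" that performs regex search and replace. It is a very useful utility, even though it does not take much code. The utility makes the solution very simple.
I use FINDSTR to filter out lines that do not match the template, which requires at least 2 space-delimited tokens before (xfer#. I pipe the result to the REPL utility, which keeps only the desired token. The result is sent to stdout.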

findstr /r /c:" [^ ][^ ]* [^ ][^ ]* (xfer#" test.txt | repl ".* ([^ ]+) ([^ ]+) \(xfer#.*" "$1"
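
To see what the filter and the replacement do without running the batch pipeline, here is a minimal JavaScript sketch (Node.js assumed). The replacement regex is the one passed to REPL above, and the filter is a JavaScript rendering of the FINDSTR pattern. The name extractTokens and the non-matching sample line are hypothetical, and the final sum is only there because the question mentions adding the numbers.

```javascript
// JavaScript rendering of the FINDSTR filter: two space-delimited tokens
// must precede "(xfer#".
const filter = / [^ ]+ [^ ]+ \(xfer#/;
// The search expression given to REPL; REPL always adds the "g" flag.
const capture = /.* ([^ ]+) ([^ ]+) \(xfer#.*/g;

// Hypothetical helper: keep matching lines, then reduce each to capture group 1.
function extractTokens(lines) {
  return lines
    .filter((line) => filter.test(line))
    .map((line) => line.replace(capture, "$1"));
}

const sampleLines = [
  "7859 10000:00 7859 10000:00 (xfer#1, to-check=1033/1035)",
  "some unrelated line", // hypothetical line that the filter drops
  "32768 000:17 22174479 10000:00 (xfer#2, to-check=1032/1035)",
];

const extracted = extractTokens(sampleLines);
const total = extracted.reduce((sum, text) => sum + Number(text), 0);
console.log(extracted, total);
```

The filter step matters because a line that does not match the search expression would otherwise pass through the replacement unchanged.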

Here is the code for the REPL.BAT utility script. Full documentation is embedded within the script.

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
:::
:::REPL  Search  Replace  [Options  [SourceVar]]
:::REPL  /?
:::
:::  Performs a global search and replace operation on each line of input from
:::  stdin and prints the result to stdout.
:::
:::  Each parameter may be optionally enclosed by double quotes. The double
:::  quotes are not considered part of the argument. The quotes are required
:::  if the parameter contains a batch token delimiter like space, tab, comma,
:::  semicolon. The quotes should also be used if the argument contains a
:::  batch special character like &, |, etc. so that the special character
:::  does not need to be escaped with ^.
:::
:::  If called with a single argument of /? then prints help documentation
:::  to stdout.
:::
:::  Search  - By default this is a case sensitive JScript (ECMA) regular
:::            expression expressed as a string.
:::
:::            JScript syntax documentation is available at
:::            http://msdn.microsoft.com/en-us/library/ae5bf541(v=vs.80).aspx
:::
:::  Replace - By default this is the string to be used as a replacement for
:::            each found search expression. Full support is provided for
:::            substitution patterns available to the JScript replace method.
:::            A $ literal can be escaped as $$. An empty replacement string
:::            must be represented as "".
:::
:::            Replace substitution pattern syntax is documented at
:::            http://msdn.microsoft.com/en-US/library/efy6s3e6(v=vs.80).aspx
:::
:::  Options - An optional string of characters used to alter the behavior
:::            of REPL. The option characters are case insensitive, and may
:::            appear in any order.
:::
:::            I - Makes the search case-insensitive.
:::
:::            L - The Search is treated as a string literal instead of a
:::                regular expression. Also, all $ found in Replace are
:::                treated as $ literals.
:::
:::            E - Search and Replace represent the name of environment
:::                variables that contain the respective values. An undefined
:::                variable is treated as an empty string.
:::
:::            M - Multi-line mode. The entire contents of stdin is read and
:::                processed in one pass instead of line by line. ^ anchors
:::                the beginning of a line and $ anchors the end of a line.
:::
:::            X - Enables extended substitution pattern syntax with support
:::                for the following escape sequences:
:::
:::                \\     -  Backslash
:::                \b     -  Backspace
:::                \f     -  Formfeed
:::                \n     -  Newline
:::                \r     -  Carriage Return
:::                \t     -  Horizontal Tab
:::                \v     -  Vertical Tab
:::                \xnn   -  Ascii (Latin 1) character expressed as 2 hex digits
:::                \unnnn -  Unicode character expressed as 4 hex digits
:::
:::                Escape sequences are supported even when the L option is used.
:::
:::            S - The source is read from an environment variable instead of
:::                from stdin. The name of the source environment variable is
:::                specified in the next argument after the option string.
:::

::************ Batch portion ***********
@echo off
if .%2 equ . (
  if "%~1" equ "/?" (
    findstr "^:::" "%~f0" | cscript //E:JScript //nologo "%~f0" "^:::" ""
    exit /b 0
  ) else (
    call :err "Insufficient arguments"
    exit /b 1
  )
)
echo(%~3|findstr /i "[^SMILEX]" >nul && (
  call :err "Invalid option(s)"
  exit /b 1
)
cscript //E:JScript //nologo "%~f0" %*
exit /b 0

:err
>&2 echo ERROR: %~1. Use REPL /? to get help.
exit /b

************* JScript portion **********/
var env=WScript.CreateObject("WScript.Shell").Environment("Process");
var args=WScript.Arguments;
var search=args.Item(0);
var replace=args.Item(1);
var options="g";
if (args.length>2) {
  options+=args.Item(2).toLowerCase();
}
var multi=(options.indexOf("m")>=0);
var srcVar=(options.indexOf("s")>=0);
if (srcVar) {
  options=options.replace(/s/g,"");
}
if (options.indexOf("e")>=0) {
  options=options.replace(/e/g,"");
  search=env(search);
  replace=env(replace);
}
if (options.indexOf("l")>=0) {
  options=options.replace(/l/g,"");
  search=search.replace(/([.^$*+?()[{\\|])/g,"\\$1");
  replace=replace.replace(/\$/g,"$$$$");
}
if (options.indexOf("x")>=0) {
  options=options.replace(/x/g,"");
  replace=replace.replace(/\\\\/g,"\\B");
  replace=replace.replace(/\\b/g,"\b");
  replace=replace.replace(/\\f/g,"\f");
  replace=replace.replace(/\\n/g,"\n");
  replace=replace.replace(/\\r/g,"\r");
  replace=replace.replace(/\\t/g,"\t");
  replace=replace.replace(/\\v/g,"\v");
  replace=replace.replace(/\\x[0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}/g,
    function($0,$1,$2){
      return String.fromCharCode(parseInt("0x"+$0.substring(2)));
    }
  );
  replace=replace.replace(/\\B/g,"\\");
}
var search=new RegExp(search,options);

if (srcVar) {
  WScript.Stdout.Write(env(args.Item(3)).replace(search,replace));
} else {
  while (!WScript.StdIn.AtEndOfStream) {
    if (multi) {
      WScript.Stdout.Write(WScript.StdIn.ReadAll().replace(search,replace));
    } else {
      WScript.Stdout.WriteLine(WScript.StdIn.ReadLine().replace(search,replace));
    }
  }
}
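
The L option handling in the script can be tried in isolation. Below is a minimal JavaScript sketch (Node.js assumed) that reuses the two escaping expressions from the JScript portion; the function names escapeSearch, escapeReplacement and replaceLiteral are hypothetical wrappers, not part of REPL.BAT.

```javascript
// Escape regex metacharacters so the search text is matched literally
// (same expression as in the script's "l" branch).
function escapeSearch(search) {
  return search.replace(/([.^$*+?()[{\\|])/g, "\\$1");
}

// Double every "$" so the replace method emits it as a literal dollar sign.
function escapeReplacement(replacement) {
  return replacement.replace(/\$/g, "$$$$");
}

// Hypothetical wrapper: global literal search and replace.
function replaceLiteral(text, search, replacement) {
  return text.replace(
    new RegExp(escapeSearch(search), "g"),
    escapeReplacement(replacement)
  );
}

console.log(escapeSearch("1.5 (approx)"));
console.log(replaceLiteral("cost: 1.5 and 125", "1.5", "$1"));
```

Without the escaping, the search text 1.5 would also match 125, and $1 in the replacement would be read as a backreference.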

f0brbegy3#

:: Does %variable% =~ s/old/new/
  setlocal ENABLEDELAYEDEXPANSION     
  for /f "delims=" %%a in ('echo !variable! ^|perl -pe "s/regexp/replace/" ') do set variable=%%a

lmyy7pcs4#

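The easiest and most flexible way to do what you want is to use awk or sed (for example: sed -i -r -e "s/([0-9]+:[0-9][0-9]\s+)[0-9]+/\1replacementstring/g" filename). Both use POSIX-style regular expressions rather than full Perl syntax, which is why the example writes [0-9] instead of \d. I think what you are doing is exactly what awk was designed for.
If you are limited to the tools that ship with Windows and cannot use third-party tools, you can perform the regexp match with VBScript. You can invoke VBScript by echoing the script to a .vbs file, calling cscript on that file, and capturing its output. Here is a proof of concept.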

@echo off & setlocal enabledelayedexpansion

:: rxp.bat
:: rxp /? for usage instructions

if #%4==# goto usage
set global=false
set replace=false
for %%I in (%*) do (
    if not #!next!==# (
        if !next!==string set string=%%I
        if !next!==pattern set pattern=%%I
        if !next!==replace set replace=%%I
        set next=
    )
    if #%%I==#/s set next=string
    if #%%I==#/p set next=pattern
    if #%%I==#/r set next=replace
    if #%%I==#/g set global=true
)
if #!string!==# goto usage
if #!pattern!==# goto usage

rem Double every quote for VBScript, then reduce a backslash-escaped quote (\"") to a doubled quote.
set string=!string:"=""!
set string=!string:\""=""!
set pattern=!pattern:"=""!
set pattern=!pattern:\""=""!
if #!replace!==#false (
    call :rxp !string:~1,-1! !pattern:~1,-1! !global!
) else (
    set replace=!replace:"=""!
    set replace=!replace:\""=""!
    call :rxp !string:~1,-1! !pattern:~1,-1! !global! !replace:~1,-1!
)
goto :EOF

:rxp string pattern global replacement
echo Set rxp = New RegExp>regexp.vbs
echo rxp.Pattern = %2>>regexp.vbs
echo rxp.Global = %3>>regexp.vbs
if #%4==# (
    echo Set res = rxp.Execute^(%1^)>>regexp.vbs
    echo For Each match in res>>regexp.vbs
    echo Wscript.Echo match.value>>regexp.vbs
    echo Next>>regexp.vbs
) else (
    echo Wscript.echo rxp.Replace^(%1, %4^)>>regexp.vbs
)
cscript /nologo regexp.vbs
del /q regexp.vbs
goto :EOF

:usage
echo Usage: %~nx0 /s "string" /p "regexp" [/g] [/r "replacement text"]
echo;
echo    /s -- search string
echo;
echo    /p -- regular expression pattern
echo          Example: /p "<[^>]+>" to search for markup tags
echo          matches ^<span class='a'^> or similar
echo;
echo    /r -- replacement text (optional)
echo          If specified, replace the matched text
echo          Example: /p "(<div class=')blue('>)" /r "$1red$2"
echo          matches ^<div class='blue'^>
echo          replaces match with ^<div class='red'^>
echo;
echo    /g -- global match (optional)
echo          match every occurrence (matches only the first by default)
echo;
echo notes: If the regexp pattern includes capturing parentheses, use ^$1-^$9 as
echo backreferences in your replacement text.  If any of your strings include
echo quotation marks, they can be escaped with a backslash (\).
echo;
echo Example:
echo %~nx0 /s "text begin <div id=\"foo\"> text end" /p "(<div)[^>]+(>)"
echo /r "$1 class=\"bar\"$2"
echo;
echo matches ^<div id="foo"^>, replaces match with ^<div class="bar"^>
echo output: text begin ^<div class="bar"^> text end

Sample output:

C:\Users\me\Desktop>rxp /s "7859 10000:00 7849 10000:00 (xfer#1, to-check=1033/1035)" /p "(\d+:\d\d\s+)\d+" /r "$1foo"
7859 10000:00 foo 10000:00 (xfer#1, to-check=1033/1035)

C:\Users\me\Desktop>rxp
Usage: rxp.bat /s "string" /p "regexp" [/g] [/r "replacement text"]

   /s -- search string

   /p -- regular expression pattern
         Example: /p "<[^>]+>" to search for markup tags
         matches <span class='a'> or similar

   /r -- replacement text (optional)
         If specified, replace the matched text

   /g -- global match (optional)
         match every occurrence (matches only the first by default)

notes: If the regexp pattern includes capturing parentheses, use $1-$9 as
backreferences in your replacement text.  If any of your strings include
quotation marks, they can be escaped with a backslash (\).

Example:
rxp.bat /s "text begin <div id=\"foo\"> text end" /p "(<div)[^>]+(>)"
/r "$1 class=\"bar\"$2"

matches <div id="foo">, replaces match with <div class="bar">
output: text begin <div class="bar"> text end
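
The difference between the default first-match behavior and /g can be previewed with a minimal JavaScript sketch (Node.js assumed). It uses the same pattern and $1 replacement as the sample run; the function name rxpReplace is hypothetical, and the sketch assumes the VBScript RegExp object treats this particular pattern the same way JavaScript does.

```javascript
// Hypothetical stand-in for rxp.bat /s text /p pattern /r replacement [/g].
function rxpReplace(text, pattern, replacement, global) {
  return text.replace(new RegExp(pattern, global ? "g" : ""), replacement);
}

const pattern = "(\\d+:\\d\\d\\s+)\\d+";
const shortLine = "7859 10000:00 7849 10000:00 (xfer#1, to-check=1033/1035)";
const longLine =
  "32768 004:47 2686976 2200:03 11707819 10000:01 (xfer#5264, to-check=1020/6975)";

// Without /g only the first match is replaced.
console.log(rxpReplace(shortLine, pattern, "$1foo", false));
console.log(rxpReplace(longLine, pattern, "$1foo", false));
// With /g every match is replaced.
console.log(rxpReplace(longLine, pattern, "$1foo", true));
```

On the longer line, the global run replaces the number after each time field, so the pattern alone does not isolate the token next to (xfer#.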
