Regex to add a space after defined intervals

Posted by q8l4jmvw on 2023-01-06 in Other

I have the following string:

string str = "***********4123";

I want the output to be:

**** ****** *4123

The following code adds a space after every 4 characters:

Regex.Replace(maskedString, ".{4}", "$0 ");

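Is there a way to add a space after the 4th character and then again after the next 6 characters (that is, after the 10th)?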

Answer #1 (b91juud3)

Use capturing groups.
Replace the match with a new value of the form "{first group} {second group} {remainder}".

string str = "***********4123";
var replaceStr = Regex.Replace(str, @"(\*{4})(\*{6})(.*)", "$1 $2 $3");

Demo @ .NET Fiddle
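
The pattern above only matches when the input contains ten literal asterisks. If the mask character can vary, a variant can group by position instead. This is a minimal sketch under that assumption; the class name `RegexGrouping` and the method `AddSpaces` are hypothetical and not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexGrouping
{
    // Anchored pattern: first 4 characters, next 6 characters, then the remainder.
    // "." is used instead of "\*" so any mask character is accepted (an assumption).
    private static readonly Regex GroupPattern = new Regex(@"^(.{4})(.{6})(.*)$");

    public static string AddSpaces(string input)
    {
        // Inputs shorter than 10 characters do not match and come back unchanged.
        // An input of exactly 10 characters ends with a trailing space,
        // because the third group is then empty.
        return GroupPattern.Replace(input, "$1 $2 $3");
    }
}
```

For the string from the question, `RegexGrouping.AddSpaces("***********4123")` returns `**** ****** *4123`.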

Answer #2 (y0u0uwnf)

If you are not set on using Regex, you can do this:

using System;
                    
public class Program
{
    public static void Main()
    {
        string str = "***********4123";
        var strSpan = str.AsSpan();
        Console.WriteLine($"{strSpan[..4]} {strSpan[4..10]} {strSpan[10..]}");
    }
}

Output:

**** ****** *4123

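See it in action: https://dotnetfiddle.net/kKoRJb
Note that for production I would add sanity checks on the input:

  • Is it null, empty, or whitespace?
  • Trimming (maybe)
  • Insufficient length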
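
The checks listed above could look roughly like this. It is only a sketch: the class name `SafeSpanFormatter`, the choice to return an empty string for blank input, and the choice to return short inputs unchanged are all assumptions, and interpolating spans requires .NET 6 or later.

```csharp
using System;

public static class SafeSpanFormatter
{
    // Hypothetical helper; fallback behaviour for bad input is an assumption.
    public static string Format(string input)
    {
        // Null, empty, or whitespace-only input: nothing to format.
        if (string.IsNullOrWhiteSpace(input)) return string.Empty;

        // Trimming is optional and depends on where the input comes from.
        ReadOnlySpan<char> span = input.AsSpan().Trim();

        // Fewer than 11 characters would leave the third group empty
        // (or make the slices throw), so return the trimmed input as is.
        if (span.Length < 11) return span.ToString();

        // Same slicing as in the answer: 4 characters, 6 characters, remainder.
        return $"{span[..4]} {span[4..10]} {span[10..]}";
    }
}
```

Whether short input should be returned unchanged, rejected with an exception, or logged is a design decision that depends on the caller.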

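Also: if you go with Regex, consider precompiling and caching it. I have not benchmarked this solution against the Regex one, so if performance is critical here, you may want to do that.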

Update

So I got curious and ran a small benchmark:

BenchmarkDotNet=v0.13.3, OS=Windows 10 (10.0.19044.2364/21H2/November2021Update)
Intel Core i9-10885H CPU 2.40GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.101
  [Host]     : .NET 7.0.1 (7.0.122.56804), X64 RyuJIT AVX2
  DefaultJob : .NET 7.0.1 (7.0.122.56804), X64 RyuJIT AVX2

|                 Method |     N |        Mean |     Error |    StdDev | Ratio | RatioSD | Allocated | Alloc Ratio |
|----------------------- |------ |------------:|----------:|----------:|------:|--------:|----------:|------------:|
| RegexVersionUncompiled |  1000 |   336.97 us |  6.616 us |  6.189 us |  7.79 |    0.21 |  62.52 KB |        1.00 |
|   RegexVersionCompiled |  1000 |   252.49 us |  4.904 us |  5.036 us |  5.85 |    0.19 |  62.52 KB |        1.00 |
|            SpanVersion |  1000 |    43.15 us |  0.812 us |  0.834 us |  1.00 |    0.00 |  62.52 KB |        1.00 |
|    StringInsertVersion |  1000 |    33.27 us |  0.655 us |  0.852 us |  0.78 |    0.03 | 117.21 KB |        1.87 |
|                        |       |             |           |           |       |         |           |             |
| RegexVersionUncompiled | 10000 | 3,254.56 us | 54.729 us | 48.515 us |  7.55 |    0.20 | 625.03 KB |        1.00 |
|   RegexVersionCompiled | 10000 | 2,424.43 us | 39.463 us | 32.953 us |  5.62 |    0.14 | 625.03 KB |        1.00 |
|            SpanVersion | 10000 |   432.77 us |  8.456 us |  9.048 us |  1.00 |    0.00 | 625.02 KB |        1.00 |
|    StringInsertVersion | 10000 |   429.94 us |  4.070 us |  3.178 us |  1.00 |    0.02 | 1171.9 KB |        1.87 |

// * Hints *
Outliers
  Benchmark.SpanVersion: Default            -> 4 outliers were removed (45.56 us..71.61 us)
  Benchmark.RegexVersionUncompiled: Default -> 1 outlier  was  removed, 4 outliers were detected (3.16 ms..3.18 ms, 3.37 ms)
  Benchmark.RegexVersionCompiled: Default   -> 2 outliers were removed, 5 outliers were detected (2.36 ms..2.39 ms, 2.48 ms, 3.42 ms)
  Benchmark.StringInsertVersion: Default    -> 3 outliers were removed (441.17 us..460.36 us)

// * Legends *
  N           : Value of the 'N' parameter
  Mean        : Arithmetic mean of all measurements
  Error       : Half of 99.9% confidence interval
  StdDev      : Standard deviation of all measurements
  Ratio       : Mean of the ratio distribution ([Current]/[Baseline])
  RatioSD     : Standard deviation of the ratio distribution ([Current]/[Baseline])
  Allocated   : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  Alloc Ratio : Allocated memory ratio distribution ([Current]/[Baseline])
  1 us        : 1 Microsecond (0.000001 sec)

This is based on the following code. I "stole" the Regex from Yong Shun's answer and the String.Insert version from Hossein Sabziani's answer:

using BenchmarkDotNet.Attributes;
using Bogus;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

namespace RegexBenchmark
{
    [MemoryDiagnoser(false)]
    public class Benchmark
    {
        [Params(1000, 10_000)]
        public int N = 1000;

        private readonly Regex _regex = new (@"(\*{4})(\*{6})(.*)", RegexOptions.Compiled);
        private string[] _inputs;

        [GlobalSetup]
        public void Setup()
        {
            var faker = new Faker();
            _inputs = Enumerable.Range(0, N).Select(_ => faker.Random.ReplaceNumbers("***********####")).ToArray();
        }

        [Benchmark]
        public string[] RegexVersionUncompiled()
        {
            string[] result = new string[N];
            for( int i = 0; i < N; i++ ) result[i] = Regex.Replace(_inputs[i], @"(\*{4})(\*{6})(.*)", "$1 $2 $3");
            return result;
        }

        [Benchmark]
        public string[] RegexVersionCompiled()
        {
            string[] result = new string[N];
            for (int i = 0; i < N; i++) result[i] = _regex.Replace(_inputs[i], "$1 $2 $3");
            return result;
        }

        [Benchmark]
        public string[] SpanVersion()
        {
            string[] result = new string[N];
            for (int i = 0; i < N; i++)
            {
                var strSpan = _inputs[i].AsSpan();
                result[i] = $"{strSpan[..4]} {strSpan[4..10]} {strSpan[10..]}";
            }
            return result;
        }

        [Benchmark]
        public string[] StringInsertVersion()
        {
            string[] result = new string[N];
            for (int i = 0; i < N; i++)
            {
                result[i] = _inputs[i].Insert(4, " ").Insert(11, " ");
            }
            return result;
        }
    }
}
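
A timing comparison is only meaningful if all variants produce the same output. The following sketch checks that for the three techniques used in the benchmark; the class `EquivalenceCheck` and its method names are hypothetical and were not part of the benchmark project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EquivalenceCheck
{
    // Same pattern and replacement as in the benchmark.
    public static string ViaRegex(string s) =>
        Regex.Replace(s, @"(\*{4})(\*{6})(.*)", "$1 $2 $3");

    // Same slicing as SpanVersion (span interpolation needs .NET 6 or later).
    public static string ViaSpan(string s)
    {
        ReadOnlySpan<char> span = s.AsSpan();
        return $"{span[..4]} {span[4..10]} {span[10..]}";
    }

    // Same calls as StringInsertVersion.
    public static string ViaInsert(string s) => s.Insert(4, " ").Insert(11, " ");

    // True when all three techniques agree for the given input.
    public static bool AllAgree(string s)
    {
        string expected = ViaRegex(s);
        return expected == ViaSpan(s) && expected == ViaInsert(s);
    }
}
```

Calling `EquivalenceCheck.AllAgree` on a few inputs of the form `***********####` before benchmarking guards against comparing methods that do different work.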

Interesting: when I turn on the GC columns, Regex seems to put slightly less pressure on garbage collection:

|                 Method |     N |        Mean |     Error |    StdDev | Ratio | RatioSD |     Gen0 |     Gen1 | Allocated | Alloc Ratio |
|----------------------- |------ |------------:|----------:|----------:|------:|--------:|---------:|---------:|----------:|------------:|
| RegexVersionUncompiled |  1000 |   329.92 us |  6.402 us |  8.547 us |  8.55 |    0.29 |   7.3242 |   1.4648 |  62.52 KB |        1.00 |
|   RegexVersionCompiled |  1000 |   244.21 us |  4.637 us |  4.962 us |  6.34 |    0.17 |   7.5684 |   1.7090 |  62.52 KB |        1.00 |
|            SpanVersion |  1000 |    38.60 us |  0.717 us |  0.670 us |  1.00 |    0.00 |   7.6294 |   1.8921 |  62.52 KB |        1.00 |
|    StringInsertVersion |  1000 |    32.69 us |  0.302 us |  0.267 us |  0.85 |    0.02 |  14.3433 |   3.5400 | 117.21 KB |        1.87 |
|                        |       |             |           |           |       |         |          |          |           |             |
| RegexVersionUncompiled | 10000 | 3,242.25 us | 61.809 us | 66.135 us |  7.50 |    0.09 |  74.2188 |  70.3125 | 625.03 KB |        1.00 |
|   RegexVersionCompiled | 10000 | 2,431.65 us | 47.894 us | 44.800 us |  5.64 |    0.13 |  74.2188 |  70.3125 | 625.03 KB |        1.00 |
|            SpanVersion | 10000 |   431.01 us |  5.069 us |  4.741 us |  1.00 |    0.00 |  76.1719 |  75.6836 | 625.02 KB |        1.00 |
|    StringInsertVersion | 10000 |   429.69 us |  7.117 us |  5.943 us |  1.00 |    0.02 | 142.5781 | 142.0898 | 1171.9 KB |        1.87 |

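With scaling in mind, I would probably still go with the Span solution. At N = 100000 it is roughly 2 times faster than the String.Insert version and about 2.7 times faster than the compiled Regex: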

|                 Method |      N |         Mean |      Error |       StdDev | Ratio | RatioSD |      Gen0 |     Gen1 |     Gen2 |   Allocated | Alloc Ratio |
|----------------------- |------- |-------------:|-----------:|-------------:|------:|--------:|----------:|---------:|---------:|------------:|------------:|
| RegexVersionUncompiled |   1000 |    333.54 us |   6.403 us |     6.288 us |  7.69 |    0.15 |    7.3242 |   1.4648 |        - |    62.52 KB |        1.00 |
|   RegexVersionCompiled |   1000 |    239.71 us |   4.601 us |     4.519 us |  5.52 |    0.14 |    7.5684 |   1.7090 |        - |    62.52 KB |        1.00 |
|            SpanVersion |   1000 |     43.35 us |   0.412 us |     0.365 us |  1.00 |    0.00 |    7.6294 |   1.8921 |        - |    62.52 KB |        1.00 |
|    StringInsertVersion |   1000 |     34.18 us |   0.523 us |     0.489 us |  0.79 |    0.01 |   14.3433 |   3.5400 |        - |   117.21 KB |        1.87 |
|                        |        |              |            |              |       |         |           |          |          |             |             |
| RegexVersionUncompiled |  10000 |  3,343.72 us |  66.047 us |    90.406 us |  6.85 |    0.21 |   74.2188 |  70.3125 |        - |   625.03 KB |        1.00 |
|   RegexVersionCompiled |  10000 |  2,450.42 us |  31.348 us |    29.323 us |  5.00 |    0.10 |   74.2188 |  70.3125 |        - |   625.03 KB |        1.00 |
|            SpanVersion |  10000 |    490.05 us |   6.722 us |     6.288 us |  1.00 |    0.00 |   76.1719 |  75.6836 |        - |   625.02 KB |        1.00 |
|    StringInsertVersion |  10000 |    436.04 us |   5.316 us |     4.973 us |  0.89 |    0.02 |  142.5781 | 142.0898 |        - |   1171.9 KB |        1.87 |
|                        |        |              |            |              |       |         |           |          |          |             |             |
| RegexVersionUncompiled | 100000 | 40,730.47 us | 793.225 us | 1,058.932 us |  3.39 |    0.12 |  846.1538 | 769.2308 | 230.7692 |  6251.31 KB |        1.00 |
|   RegexVersionCompiled | 100000 | 32,578.02 us | 645.500 us | 1,163.969 us |  2.74 |    0.14 |  937.5000 | 906.2500 | 281.2500 |  6251.14 KB |        1.00 |
|            SpanVersion | 100000 | 12,016.15 us | 237.602 us |   300.491 us |  1.00 |    0.00 |  968.7500 | 953.1250 | 312.5000 |  6250.24 KB |        1.00 |
|    StringInsertVersion | 100000 | 25,158.93 us | 372.056 us |   329.818 us |  2.09 |    0.06 | 1625.0000 | 968.7500 | 312.5000 | 11719.01 KB |        1.87 |

Answer #3 (lstz6jyr)

You can use String.Insert(Int32, String) to add a string at a specific index:

var result = str.Insert(4, " ").Insert(11, " ");
Console.WriteLine(result); // "**** ****** *4123"
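
The second index is 11 rather than 10 because the first insert has already shifted the rest of the string by one character. One way to avoid that bookkeeping is to insert from right to left, so every position refers to the original string. This is a sketch of that idea; `OffsetInserter` and `InsertSpaces` are hypothetical names.

```csharp
using System;

public static class OffsetInserter
{
    // Hypothetical helper: positions are offsets into the ORIGINAL string.
    public static string InsertSpaces(string input, params int[] positions)
    {
        // Sort a copy so the caller's array is left untouched.
        int[] sorted = (int[])positions.Clone();
        Array.Sort(sorted);

        string result = input;
        // Walk from the highest offset down so earlier inserts do not shift later ones.
        for (int i = sorted.Length - 1; i >= 0; i--)
        {
            // String.Insert throws ArgumentOutOfRangeException for offsets
            // that are negative or greater than the string length.
            result = result.Insert(sorted[i], " ");
        }
        return result;
    }
}
```

With this helper, `OffsetInserter.InsertSpaces("***********4123", 4, 10)` gives the same result as `str.Insert(4, " ").Insert(11, " ")`. Each insert still allocates a new string, so the allocation profile is no better than that of the chained calls.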
