regex 正则表达式-从markdown字符串中提取所有标头

6jjcrrmo  于 2023-04-13  发布在  其他
关注(0)|答案(3)|浏览(190)

我使用gray-matter将文件系统中的.MD文件解析为字符串。解析器生成的结果是这样的字符串:

\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n

我现在试着写一个正则表达式,从字符串中提取所有的标题文本。markdown中的标题以#开头(可以是1-6),在我的例子中总是以新的一行结束。
我试过使用下面的正则表达式,但是在测试字符串上调用它时没有返回任何结果:

const testString = "\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n"
const HEADING_R = /(?<!#)#{1,6} (.*?)(\\r(?:\\n)?|\\n)/gm;
const headings = HEADING_R.exec(content);

console.log('headings: ', headings);

此控制台将headings记录为null(未找到匹配项)。我要查找的结果是:["# Clean-er ReactJS Code - Conditional Rendering", "## TL;DR", "## Introduction"]
我相信正则表达式是错误的,但不知道为什么。

6pp0gazn

6pp0gazn1#

/#{1,6}.+(?=\n)/g

  • #{1,6}...与字符#至少匹配一次,或者作为最多6个相等字符的序列。
  • .+匹配任何字符(行终止符除外)至少一次,并尽可能多地匹配(贪婪)
  • 这样做,直到 positive lookahead(?=\n)匹配…
  • 也就是... \n...一个换行符。
  • 使用g全局修饰符,它匹配所有内容。
    编辑

已经提到

  • “匹配任何字符(行终止符除外)"*

因此,像... /#{1,6}.+/g...这样的正则表达式应该已经完成OP用例的工作(不需要积极的前瞻),OP用例是...

  • “markdown中的标题以#开头(可以是1-6),在我的情况下总是以新行结束。"*
    我想得到的结果是:["# Clean-er ReactJS Code - Conditional Rendering", "## TL;DR", "## Introduction"] .
const testString = `\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n`;

// see...[https://regex101.com/r/n6XQub/2]
const regXHeader = /#{1,6}.+/g

console.log(
  testString.match(regXHeader)
);
.as-console-wrapper { min-height: 100%!important; top: 0; }

奖金

将上面的正则表达式重构为例如/(?<flag>#{1,6})\s+(?<content>.+)/g,通过利用 named capturing groups 以及matchAllmap ping任务,可以实现如下提供的示例代码计算的结果...

const testString = `\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n`;

// see...[https://regex101.com/r/n6XQub/4]
const regXHeader = /(?<flag>#{1,6})\s+(?<content>.+)/g

console.log(
  Array
    .from(
      testString.matchAll(regXHeader)
    )
    .map(({ groups: { flag, content } }) => ({
      heading: `h${ flag.length }`,
      content,
    }))
);
.as-console-wrapper { min-height: 100%!important; top: 0; }
qyzbxkaa

qyzbxkaa2#

问题是,您正在使用正则表达式的文字,不应该对反斜杠进行双转义,因此您可以将其写成(?<!#)#{1,6} (.*?)(\r(?:\n)?|\n)
您可以缩短模式,捕获您想要的内容,并匹配尾随的换行符,而不是使用lookbehindAssert。

(#{1,6} .*)\r?\n

正在检索所有捕获组1值:

const testString = "\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n"
const HEADING_R = /(#{1,6} .*)\r?\n/g;
const headings = Array.from(testString.matchAll(HEADING_R), m => m[1]);
console.log('headings: ', headings);
8mmmxcuj

8mmmxcuj3#

角箱:在许多编程语言中,注解掉的代码行以#开头,与标题相同-例如Bash,Ruby,Python。它看起来像这样:

# run command
npm install

表示此的示例markdown字符串为:

const md = "\n# Bash\n\n```\n# nuke OS\nfork()\n```\n"

迄今为止,没有任何答案可以解决这个问题。一个可能的解决方案是事先删除所有代码块:

const md = "\n# Clean-er ReactJS Code - Conditional Rendering\n\n```bash\n# initialize project via CLI\nnpm create-react-app conditional-rendering\n```\n\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n"
const headings = md
    .replaceAll(/```(.|\n)+```/gm, "")
    .split("\n")
    .filter(line => line.startsWith("#"))
console.log("Headings:", headings)

相关问题