regex PHP: Splitting a string into an array around words wrapped in tildes, keeping those words

Posted by q5iwbnjs on 2023-03-04 in PHP


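It's late and I think I've been staring at this for too long to figure it out, but: I have a bunch of raw text in which anything between tildes (~) is a title and everything else is plain text. For example:
Title and text on the same line: ~THE BURGER MINI~A tiny little burger patty in a tiny little bun.
Title and text on different lines: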

~THE BURGER MAX~
A gigantic hunk of steak in between two toasted baguettes, each stuffed with beef & cheese

A combination of both:

~THE BURGER ZERO~
No burger, no bun, just air.

~THE BURGER ITALIANO~
A soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.~NOTE~This is basically giant ravioli.

Ultimately, the result I want is:

Array
(
    [0] => Array
        (
            [title] => THE BURGER ZERO
        )

    [1] => Array
        (
            [text] => No burger, no bun, just air.
        )

    [2] => Array
        (
            [title] => THE BURGER ITALIANO
        )

    [3] => Array
        (
            [text] => A soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.
        )

    [4] => Array
        (
            [title] => NOTE
        )

    [5] => Array
        (
            [text] => This is basically giant ravioli.
        )

)

...so that I can tell titles from text but, crucially, *in the order in which they appear*.
I can split the string on newlines into an array like this:

$tempArray = preg_split('/\s*\R\s*/', trim($str), NULL, PREG_SPLIT_NO_EMPTY);

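But after that I'm stuck. Using preg_split on anything wrapped in tildes (preg_split('/~(.*?)~/uim', $line);) gives me all of the paragraph text but loses the titles (because they are consumed as the delimiters). I've been banging my head against various forms of preg_match and preg_match_all, and all I've got is a headache.
Is there a straightforward way to get what I want that works for all of the examples above?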

regex

Source: https://stackoverflow.com/questions/75597027/php-splitting-a-string-into-an-array-around-words-wrapped-in-tildes-keeping-t

2 Answers


cetgtptt1#

preg_match_all('/~([^~]+)~\n*([^~\n]+)/', $str, $match);

This matches a tilde, then one or more characters other than a tilde, then another tilde, capturing whatever is between the tildes:

~([^~]+)~

Followed by zero or more newlines:

\n*

Followed by one or more characters that are neither a tilde nor a newline, which are also captured:

([^~\n]+)

This gives you the titles in $match[1] and the descriptions in $match[2]:

print_r($match[1]);

Array
(
    [0] => THE BURGER ZERO
    [1] => THE BURGER ITALIANO
    [2] => NOTE
)

print_r($match[2]);

Array
(
    [0] => No burger, no bun, just air.
    [1] => A soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.
    [2] => This is basically giant ravioli.
)

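You can then combine them into a single array keyed by title (note that a repeated title would overwrite the earlier entry):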

$items = array_combine($match[1], $match[2]);
print_r($items);

Array
(
    [THE BURGER ZERO] => No burger, no bun, just air.
    [THE BURGER ITALIANO] => A soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.
    [NOTE] => This is basically giant ravioli.
)
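
To get the flat, ordered title/text list asked for in the question from this pattern, a minimal sketch might look like the following. It assumes every title is directly followed by its text, replaces \n with \R so that Windows line endings are also accepted (a small change to the pattern above), and uses the hypothetical variable name $ordered.

```php
<?php
// Sample input taken from the question.
$str = "~THE BURGER ZERO~\nNo burger, no bun, just air.\n\n~THE BURGER ITALIANO~\nA soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.~NOTE~This is basically giant ravioli.";

// PREG_SET_ORDER groups the captures per match,
// so each $m holds one title ($m[1]) and its text ($m[2]).
preg_match_all('/~([^~]+)~\R*([^~\r\n]+)/', $str, $matches, PREG_SET_ORDER);

$ordered = [];
foreach ($matches as $m) {
    // Append title and text as separate entries, in the order they appear.
    $ordered[] = ['title' => trim($m[1])];
    $ordered[] = ['text'  => trim($m[2])];
}

print_r($ordered);
```

Because the text group stops at the first line break, a description that spans several lines would be cut off after its first line, and a title with no text after it would not match at all.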

2023-03-04

wqsoz72f2#

<?php
$input = '~THE BURGER ZERO~
No burger, no bun, just air.

~THE BURGER ITALIANO~
A soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.~NOTE~This is basically giant ravioli.';

$splittedText = array_values(array_filter(explode ("~", $input)));

foreach($splittedText as $key => $value){
    if (ctype_upper(str_replace(' ', '', $value))){
        $splittedText[$key] = ['title' => $value];
    }
    else{
        $splittedText[$key] = ['text' => $value];
    }
}

print_r($splittedText);

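This solution does not use any regular expressions.
It works by:

first exploding the whole string on the tildes,
then removing the empty entries from the array, re-indexing the keys, and iterating over the array,
checking whether each value consists only of upper-case letters (with spaces removed); if so, the key is set to "title", otherwise it is "text", as in the expected output.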

The output is:

Array
(
    [0] => Array
        (
            [title] => THE BURGER ZERO
        )

    [1] => Array
        (
            [text] => 
No burger, no bun, just air.

        )

    [2] => Array
        (
            [title] => THE BURGER ITALIANO
        )

    [3] => Array
        (
            [text] => 
A soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.
        )

    [4] => Array
        (
            [title] => NOTE
        )

    [5] => Array
        (
            [text] => This is basically giant ravioli.
        )

)
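
The upper-case check treats a title containing digits or punctuation as text, and the text entries above keep their surrounding line breaks. As a possible alternative, here is a hedged sketch that goes back to the preg_split attempt from the question and adds the PREG_SPLIT_DELIM_CAPTURE flag, so the titles are returned instead of being discarded. It assumes tildes are only ever used to wrap titles; the variable names $parts and $result are illustrative.

```php
<?php
// Sample input taken from the question.
$input = "~THE BURGER ZERO~\nNo burger, no bun, just air.\n\n~THE BURGER ITALIANO~\nA soft mix of ground beef & mozzarella stuffed between two pillowy pieces of pasta.~NOTE~This is basically giant ravioli.";

// With PREG_SPLIT_DELIM_CAPTURE the captured title is returned between the
// text pieces, so odd indexes hold titles and even indexes hold text.
$parts = preg_split('/~([^~]*)~/', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

$result = [];
foreach ($parts as $i => $part) {
    $part = trim($part); // drop the line breaks around each piece
    if ($part === '') {
        continue; // skip empty pieces, e.g. the one before the first title
    }
    $result[] = ($i % 2 === 1) ? ['title' => $part] : ['text' => $part];
}

print_r($result);
```

PREG_SPLIT_NO_EMPTY is deliberately left out here: removing empty pieces inside preg_split would shift the indexes and break the odd/even distinction, so empty pieces are skipped in the loop instead.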

2023-03-04
