c++ 字符数组表示的字符串中的模式匹配

jfewjypa 于 2024-01-09 发布在其他

关注(0)|答案(1)|浏览(99)

我试图只使用数组来解决一个非常基本的问题（文本是字符的数组），但在实现某些特定的东西时遇到了问题。
因此，我尝试编写一个函数，该函数以两个字符数组（即text和pattern）为参数，并返回该模式在文本中出现的次数（文本中有多少不同的子字符串与该模式匹配）。文本仅由数字和拉丁字母组成。该模式可能包含以下特殊字符（这些字符不能出现在文本中）：
“*”-正好对应一个任意字符;“%”-对应一个或两个数字（十进制数字系统中）;“@”-对应一个拉丁字母。
下面我们来分析一些例子：

text: "te3t zdrte44q t33t"

pattern: "t\*%@"

字符串
说明：
模式“t*%@”可以匹配子字符串，如“te 3a”、“t337 q”等。* 正好对应一个任意字符，%对应一个或两个数字，@对应一个拉丁字母。
在给定的文本“te 3 t zdrte 44 q t33 t”中，该模式匹配三个子字符串：“te 3 t”、“te 44 q”和“t33 t”。
因此，此示例的预期输出为3。

text: "aaaaaa"

pattern: "aa"

Expected output: 5

型
....

text: "123"

pattern: "%%"

Expected output: 3

型
我在使用模式符号 * 和@方面取得了很大的进步。它们运行得很好，与预期的完全匹配一个字符。但是，我在使用%时遇到了一些障碍。与其他符号不同，它不仅可以匹配一个符号，有时还可以匹配两个符号，这让我有点头疼。
你看，棘手的是%没有固定的长度。它是可变的，所以它可以在一个示例中匹配一个符号，在另一个示例中匹配两个。弄清楚如何处理这种动态匹配对我来说现在有点挑战。
如果你有任何想法或建议如何解决这个问题，我真的很感激你的意见。
以下是我目前的代码：

#include <iostream>

using namespace std;

const int MAX_SIZE_TEXT = 201;
const int MAX_SIZE_PATTERN = 201;

unsigned getStrLen(char str[])
{
    int i = 0, count = 0;

    while (str[i] != '\0')
    {
        i++;
        count++;
    }

    return count;
}

bool isLetter(char ch)
{
    return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

bool isDigit(char ch)
{
    return ch >= '0' && ch <= '9';
}

unsigned countMatches(char text[], char pattern[])
{
    int textLen = getStrLen(text);
    int patternLen = getStrLen(pattern);

    int i = 0, j = 0, currentLen = 0, matches = 0;

    bool matchedASingleDigit = false;
    while (i < textLen)
    {
        cout << text[i] << "?=" << pattern[j] << endl;
        if (text[i] == pattern[j] || pattern[j] == '*' 
            || (pattern[j] == '@' && isLetter(text[i])))
        {
            cout << "(before) i=" << i << "; j=" << j << endl;
            currentLen += 1;
            i++;
            j++;
            cout << "(after) i=" << i << "; j=" << j << endl;

            cout << "currentLen = " << currentLen << endl;
            if (currentLen == patternLen)
            {
                matches++;
            }
        }
        else if (pattern[j] == '%')
        {
            /*if (isDigit(text[i]))
            {
                matchedASingleDigit = true;
                currentLen += 1;
                i++;
                j++;
                cout << "(after) i=" << i << "; j=" << j << endl;

                cout << "currentLen = " << currentLen << endl;
                if (currentLen == patternLen)
                {
                    matches++;
                }
            }

            if (i < textLen - 1 && isDigit[text])*/
        }
        else
        {
            i -= currentLen - 1;
            j = 0;
            cout << "(after) i=" << i << "; j=" << j << endl;
            
            currentLen = 0;
            cout << "currentLen = " << currentLen << endl;
        }
    }

    return matches;
}

void testCountingMatches()
{
    char text[MAX_SIZE_TEXT];
    cin.getline(text, MAX_SIZE_TEXT);

    char pattern[MAX_SIZE_PATTERN];
    cin.getline(pattern, MAX_SIZE_PATTERN);

    unsigned matches = countMatches(text, pattern);

    cout << matches << endl;
}

int main()
{
    testCountingMatches();  
}

型

c++

来源：https://stackoverflow.com/questions/77771396/pattern-matching-in-a-string-represented-by-char-array

1条答案

按热度按时间

eanckbw91#

对于初学者，您可以使用递归。

bool matchSingle(char text[], char pattern[], int remainLen)
{
    if(remainLen < 0){
        return 0;
    }
    if(text[0] == '\0' || pattern[0]=='\0')
    {
        return pattern[0]=='\0' && remainLen == 0;
    }
    if (text[0] == pattern[0] || pattern[0] == '*' 
            || (pattern[0] == '@' && isLetter(text[0])))
    {
        return matchSingle(text + 1, pattern + 1, remainLen - 1);
    }
    else if(pattern[0] == '%')
    {
        if(!isDigit(text[0]))
        {
            return false;
        }
        else if(matchSingle(text + 1, pattern + 1, remainLen - 1))
        {
            return true;
        }
        else if(isDigit(text[1]) && matchSingle(text + 2, pattern + 1, remainLen - 2))
        {
            return true;
        }

    }
    return false;
}

unsigned countMatches(char text[], char pattern[])
{
    int textLen = getStrLen(text);
    int patternLen = getStrLen(pattern);

    int i = 0, j = 0, currentLen = 0, matches = 0;

    for(i = 0; i < getStrLen(text); i++){
        for(j = getStrLen(pattern); j <= getStrLen(text) - i; j++){            
            matches += matchSingle(text + i, pattern, j));
        }
    }

    return matches;
}

字符串
在这里我们首先将不同的i的计数分解成它自己的函数，然后我们可以利用递归来推进文本和模式指针。我已经尽可能地保持了代码与原始代码的接近。在分解后的函数中，我们只关心精确匹配，而在countMatches函数中，我们枚举了所有可能的长度。尽管效率很低，这应该能够处理注解中的text: 3333, pattern: %3%情况。
正如@n.m.couldbeanAI评论的那样，让模式匹配变得高效并不容易。这并不意味着高效，只是为了给予你一个想法，因为你被卡住了。
演示：https://godbolt.org/z/PWjbhbjKK
代码的一些附加注解：
不要使用using namespace std。它现在看起来很方便，但迟早会起作用，请参阅this question以了解更多详细信息。在这种情况下，您需要从std命名空间中获得的所有内容是cout、cin和endl。因此，您应该拼写出std::cout的全名，std::cin和std::endl（我的首选项），或者如果您不想在任何地方键入它们，请使用

using std::cout;
using std::cin;
using std::endl;

型
而不是.
另外，在这些地方实际上并不需要std::endl，应该只使用'\n'，因为std::endl会导致不必要的刷新。
另外，不要写C风格的C++。C标准库已经提供了很多你可以使用的工具。如果你的老师是那些限制学生使用C标准库的人之一，那么他就没有真正正确地教C++，我对此感到遗憾。不管这是否是事实，你需要知道如何使用标准库。
比如说用std::string来代替固定长度的raw char数组，这样就不用担心MAX_SIZE_TEXT够不够大了，不管你选择不选择raw char数组。除非有明确的要求，否则没有理由自己编写getStrLen。如果您选择std::string，则应使用std::string::size()，或者std::strlen()，如果你想坚持使用原始的char数组。也有std::isdigit()，虽然那个更棘手，使用起来不会在程序中引入微妙的bug。

赞(0）回复(0）举报 2024-01-09

我来回答

c++ 字符数组表示的字符串中的模式匹配

1条答案

相关问题

热门标签

最新问答