Improving an "if/else" strcmp() ladder to determine a usable value

jhkqcmku  posted on 2023-01-29  in  Other
//...
if( strcmp( str, "January" ) == 0 )
    month = 1;
else if( strcmp( str, "February") == 0 )
    month = 2;
//...

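Q: Is there a more efficient way to determine that, for example, "April" is the fourth month of the year? Calling strcmp() repeatedly must be very inefficient, and writing the if/else ladder is tedious. Sometimes the input is "March", sometimes it is abbreviated to "MAR"... There must be a better way...
Putting the known strings into a sorted array of structs would at least allow a binary search, but that still leaves the code doing a lot of hunting and guessing.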
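The sorted-array idea mentioned in the question can be sketched with the standard bsearch(). The table, the struct name month_entry and the function month_lookup() are hypothetical names chosen for illustration; the sketch assumes exact, case-sensitive full names, so abbreviations or other spellings would each need their own (sorted) entry.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table for illustration. Entries must stay sorted by name
   in strcmp() order, otherwise bsearch() gives wrong answers. */
struct month_entry {
    const char *name;
    int number;
};

static const struct month_entry month_table[] = {
    { "April", 4 },    { "August", 8 },    { "December", 12 },
    { "February", 2 }, { "January", 1 },   { "July", 7 },
    { "June", 6 },     { "March", 3 },     { "May", 5 },
    { "November", 11 }, { "October", 10 }, { "September", 9 },
};

/* bsearch() callback: first argument is the key, second a table element. */
static int cmp_entry(const void *key, const void *elem)
{
    const struct month_entry *e = elem;
    return strcmp(key, e->name);
}

/* Returns 1..12 on an exact match, 0 if the string is not in the table. */
static int month_lookup(const char *str)
{
    const struct month_entry *hit =
        bsearch(str, month_table,
                sizeof month_table / sizeof month_table[0],
                sizeof month_table[0], cmp_entry);
    return hit ? hit->number : 0;
}
```

With 12 entries this needs at most 4 strcmp() calls per lookup instead of up to 12, but it still compares strings repeatedly, which is the cost the answers below try to avoid.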

gdx19jrr1#

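This is a "Can I answer my own question?" answer. Other answers are welcome.
There are several ways to convert an arbitrary string from a limited set of strings into a concise, usable form. Most of them involve an iterative (or, suboptimally, linear) search with repeated comparisons, which may also have to account for case sensitivity.
An answer to a recent question of mine suggested "sharing" an (admittedly arcane) hash function which, with the caveat of false positives, returns the month ordinal (1-12) when passed a string containing an English month name in 7-bit ASCII. The function performs primitive operations on the 2nd and 3rd characters to produce a hash value for the string. Note that "January", "jan" and "JAN" all return the value 1. Likewise, "feb", "FEBRUARY" and "Feb" return the value 2.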

static int monthOrd( char cp[] ) { return "DIE@CB@LJF@HAG@K"[ cp[1]/4&7 ^ cp[2]*2 &0xF ] &0xF; }

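The operations shown were found by "brute force" permutation of a large number of primitive operations, searching for a combination that yields 12 distinct values between 0x0 and 0xF (4 bits). The reader is encouraged to pick apart each step of the mangling of the bits of the two ASCII characters. The result was not "invented" but "discovered".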
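The author's search program is not shown, so the following is only a minimal sketch of what such a brute-force search might look like. It assumes candidates of the form ((c1 / div) & mask1) ^ ((c2 * mul) & mask2) over a small, arbitrarily chosen grid; the names mangle(), is_usable() and count_usable() are hypothetical.

```c
#include <ctype.h>

static const char *const month_abbr[12] = {
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec"
};

/* One candidate mangling of the 2nd (c1) and 3rd (c2) characters. */
static int mangle(int c1, int c2, int div, int mask1, int mul, int mask2)
{
    return ((c1 / div) & mask1) ^ ((c2 * mul) & mask2);
}

/* Returns 1 if the candidate maps the 12 months to 12 distinct values
   in 0..15 and gives the same value regardless of letter case. */
static int is_usable(int div, int mask1, int mul, int mask2)
{
    unsigned seen = 0;                  /* bit h is set once value h is taken */
    int m;

    for (m = 0; m < 12; m++) {
        int l1 = month_abbr[m][1], l2 = month_abbr[m][2];
        int u1 = toupper(l1), u2 = toupper(l2);
        int h = mangle(l1, l2, div, mask1, mul, mask2);

        if (h != mangle(u1, l2, div, mask1, mul, mask2) ||
            h != mangle(l1, u2, div, mask1, mul, mask2) ||
            h != mangle(u1, u2, div, mask1, mul, mask2))
            return 0;                   /* result depends on letter case */
        if (h < 0 || h > 15 || (seen & (1u << h)))
            return 0;                   /* out of range, or collision */
        seen |= 1u << h;
    }
    return 1;
}

/* Tries every combination in a small grid and counts the usable ones. */
static int count_usable(void)
{
    static const int factors[] = { 1, 2, 4, 8 };
    static const int masks[]   = { 0x3, 0x7, 0xF };
    int found = 0;
    int a, b, c, d;

    for (a = 0; a < 4; a++)
        for (b = 0; b < 3; b++)
            for (c = 0; c < 4; c++)
                for (d = 0; d < 3; d++)
                    found += is_usable(factors[a], masks[b],
                                       factors[c], masks[d]);
    return found;
}
```

The combination used by monthOrd() corresponds to is_usable(4, 7, 2, 0xF) and lies inside this grid. A real search would presumably also try other operators and then derive the lookup string from the resulting indices.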
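After the bits of the two characters have been mangled, the value is used as an index into a string (a "poor man's LUT") whose 12 letters A-L are positioned so that "?an" (January) mangles to the index of the letter "A". Masking the low 4 bits of that letter yields 1, the ordinal of "JANUARY". So when the function is passed any variant of "Jan", the return value is 1.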
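To make the steps easier to follow, here is the same computation as monthOrd() unrolled into named intermediate values. The function name monthOrdVerbose and the variable names are illustrative only; the arithmetic follows C operator precedence in the one-liner (/ and * bind tighter than &, which binds tighter than ^).

```c
/* Step-by-step equivalent of monthOrd(); names are illustrative. */
static int monthOrdVerbose(const char cp[])
{
    /* 16 slots: 'A'..'L' encode 1..12, '@' (0x40) encodes 0 = "no month". */
    static const char lut[] = "DIE@CB@LJF@HAG@K";

    /* 2nd char: divide by 4, keep 3 bits. The ASCII case bit (0x20)
       moves to 0x08 and is masked off, so 'a' and 'A' give the same value. */
    int part1 = (cp[1] / 4) & 7;

    /* 3rd char: multiply by 2, keep 4 bits. The case bit moves to 0x40
       and is masked off as well. */
    int part2 = (cp[2] * 2) & 0xF;

    int index = part1 ^ part2;          /* 0..15 */

    /* Low 4 bits of the letter: 'A' (0x41) -> 1 ... 'L' (0x4C) -> 12. */
    return lut[index] & 0xF;
}
```

For "Jan": 'a' is 0x61, so part1 is 0; 'n' is 0x6E, so part2 is 0xC; index 12 holds 'A', and 'A' & 0xF is 1.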
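Note: using this function lets the caller check whether the string really is "JAN", "jan" or "January", as suits the application. The caller does not need to try to match the names of the other 11 months. The function returns the false positive 1 for the string "Random", so the caller only has to validate against a single month name (with length and case as the application requires).
Bonus round: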
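A minimal sketch of that caller-side check follows. The wrapper monthOrdChecked() and its acceptance policy (3-letter abbreviation or full name, any letter case) are assumptions for illustration; monthOrd() is reproduced from above with const and explicit parentheses added. The length guard is also an addition: monthOrd() reads cp[1] and cp[2], so strings shorter than 3 characters must be rejected first.

```c
#include <ctype.h>
#include <stddef.h>

static int monthOrd(const char cp[])
{
    return "DIE@CB@LJF@HAG@K"[((cp[1] / 4) & 7) ^ ((cp[2] * 2) & 0xF)] & 0xF;
}

/* Hypothetical wrapper: returns 1..12 for a valid month string, else 0. */
static int monthOrdChecked(const char *str)
{
    static const char *const names[] = {
        "", "january", "february", "march", "april", "may", "june", "july",
        "august", "september", "october", "november", "december"
    };
    size_t i;
    int ord;

    /* Make sure the characters monthOrd() reads actually exist. */
    if (str[0] == '\0' || str[1] == '\0' || str[2] == '\0')
        return 0;

    ord = monthOrd(str);                /* candidate, 0..12 */
    if (ord == 0)
        return 0;

    /* Compare against the single candidate name, ignoring case. */
    for (i = 0; str[i] != '\0'; i++) {
        if (tolower((unsigned char)str[i]) != names[ord][i])
            return 0;                   /* mismatch, or str is too long */
    }

    /* Accept only the 3-letter abbreviation or the complete name. */
    return (i == 3 || names[ord][i] == '\0') ? ord : 0;
}
```

With this policy "Random" and "Janu" are rejected while "jan", "Jan" and "JANUARY" all yield 1. An application that accepts other forms, such as "Sept", would adjust the final test.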

static int wkdayOrd( char cp[] ) { return "65013427"[*cp/2 + ~cp[1] & 0x7] & 0x7; }

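An equivalent function that converts "Sun(day)" (case-insensitive) to 1, "MON" to 2, "tue" to 3, and so on.
Again, the caller must confirm the string against only one day name to rule out a false positive.
While we are at it, here is an equivalent function for the number names "zero" through "ten", again case-insensitive (number names are not abbreviated the way month or weekday names are).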

static int numberOrd( char cp[] ) { return "@~IBAH~FCGE~~DJ~"[ ( cp[0] ^ cp[1]/2 + cp[2]*4 ) & 0xF ] & 0xF; }
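One difference from monthOrd() is worth spelling out: here 0 is a legitimate result ("zero"), and the unused slots of the lookup string hold '~', whose low 4 bits are 14. A checked wrapper therefore needs a separate failure value. The sketch below assumes -1 for "not a number name"; numberOrdChecked() is a hypothetical name, and numberOrd() is reproduced with const and explicit parentheses added.

```c
#include <ctype.h>
#include <stddef.h>

static int numberOrd(const char cp[])
{
    return "@~IBAH~FCGE~~DJ~"[(cp[0] ^ ((cp[1] / 2) + (cp[2] * 4))) & 0xF] & 0xF;
}

/* Hypothetical wrapper: returns 0..10 for "zero".."ten" in any letter
   case, or -1 for anything else. */
static int numberOrdChecked(const char *str)
{
    static const char *const names[] = {
        "zero", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ten"
    };
    size_t i;
    int ord;

    /* numberOrd() reads cp[0], cp[1] and cp[2]. */
    if (str[0] == '\0' || str[1] == '\0' || str[2] == '\0')
        return -1;

    ord = numberOrd(str);
    if (ord > 10)
        return -1;                      /* hit an unused '~' slot (14) */

    /* Number names are never abbreviated: require a full match. */
    for (i = 0; str[i] != '\0'; i++) {
        if (tolower((unsigned char)str[i]) != names[ord][i])
            return -1;
    }
    return names[ord][i] == '\0' ? ord : -1;
}
```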

yptwkmov2#

I checked what gperf does when all the months are passed to it as "January", "Jan", "JANUARY", "JAN", "january", "jan", and so on.

#include <string.h>   /* strcmp(), size_t */

struct months {
char *name;
int number;
};

#define TOTAL_KEYWORDS 69
#define MIN_WORD_LENGTH 3
#define MAX_WORD_LENGTH 9
#define MIN_HASH_VALUE 3
#define MAX_HASH_VALUE 218
/* maximum key range = 216, duplicates = 0 */

#ifdef __GNUC__
__inline
#else
#ifdef __cplusplus
inline
#endif
#endif
static unsigned int
hash (register const char *str, register size_t len)
{
  static unsigned char asso_values[] =
    {
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219,  10,  80,  75,  95,   5,
      125,  95, 219, 219,   5, 219,  95,  55,  45,  60,
       60, 219,  85,  95,  50,  90,  25, 219, 219,  12,
      219, 219, 219, 219, 219, 219, 219,   0,  40,  35,
       35,  35,  40,  25, 219, 219,   0, 219,  10,  50,
        0,  25,  15, 219,  15,  35,  30,  10,  25, 219,
      219,  25, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219, 219, 219, 219, 219,
      219, 219, 219, 219, 219, 219
    };
  return len + asso_values[(unsigned char)str[2]] + asso_values[(unsigned char)str[1]] + asso_values[(unsigned char)str[0]];
}

struct months *
in_word_set (register const char *str, register size_t len)
{
  static struct months wordlist[] =
    {
      {""}, {""}, {""},
      {"jan",1},
      {""}, {""}, {""},
      {"january",1},
      {"Jan",1},
      {""}, {""}, {""},
      {"January",1},
      {"jun",6},
      {"june",6},
      {""}, {""}, {""},
      {"Jun",6},
      {"June",6},
      {""}, {""}, {""},
      {"jul",7},
      {"july",7},
      {""}, {""}, {""},
      {"Jul",7},
      {"July",7},
      {""}, {""}, {""},
      {"apr",4},
      {""},
      {"april",4},
      {""}, {""},
      {"aug",8},
      {""}, {""},
      {"august",8},
      {""},
      {"Apr",4},
      {""},
      {"April",4},
      {""}, {""},
      {"Aug",8},
      {""}, {""},
      {"August",8},
      {""},
      {"nov",11},
      {""}, {""}, {""}, {""},
      {"november",11},
      {""}, {""}, {""}, {""},
      {"JAN",1},
      {""}, {""}, {""},
      {"JANUARY",1},
      {"mar",3},
      {""},
      {"march",3},
      {""}, {""},
      {"Mar",3},
      {""},
      {"March",3},
      {""}, {""},
      {"may",5},
      {""},
      {"MAY",5},
      {""}, {""},
      {"May",5},
      {""}, {""}, {""}, {""},
      {"sep",9},
      {""}, {""}, {""}, {""},
      {"oct",10},
      {"september",9},
      {""}, {""},
      {"october",10},
      {"Nov",11},
      {""}, {""}, {""}, {""},
      {"November",11},
      {""}, {""}, {""}, {""},
      {"dec",12},
      {""}, {""}, {""}, {""},
      {"december",12},
      {""}, {""}, {""}, {""},
      {"feb",2},
      {""}, {""}, {""}, {""},
      {"february",2},
      {""}, {""}, {""}, {""},
      {"Oct",10},
      {""}, {""}, {""},
      {"October",10},
      {"NOV",11},
      {""}, {""}, {""}, {""},
      {"NOVEMBER",11},
      {""}, {""}, {""}, {""},
      {"JUN",6},
      {"JUNE",6},
      {""}, {""}, {""},
      {"Sep",9},
      {""}, {""}, {""}, {""},
      {"MAR",3},
      {"September",9},
      {"MARCH",3},
      {""}, {""},
      {"APR",4},
      {""},
      {"APRIL",4},
      {""}, {""},
      {"SEP",9},
      {""}, {""}, {""}, {""},
      {"Dec",12},
      {"SEPTEMBER",9},
      {""}, {""}, {""},
      {"December",12},
      {""}, {""}, {""}, {""},
      {"DEC",12},
      {""}, {""}, {""}, {""},
      {"DECEMBER",12},
      {""}, {""}, {""}, {""},
      {"OCT",10},
      {""}, {""}, {""},
      {"OCTOBER",10},
      {"JUL",7},
      {"JULY",7},
      {""}, {""}, {""},
      {"AUG",8},
      {""}, {""},
      {"AUGUST",8},
      {""},
      {"Feb",2},
      {""}, {""}, {""}, {""},
      {"February",2},
      {""}, {""}, {""}, {""},
      {"FEB",2},
      {""}, {""}, {""}, {""},
      {"FEBRUARY",2}
    };

  if (len <= MAX_WORD_LENGTH && len >= MIN_WORD_LENGTH)
    {
      register unsigned int key = hash (str, len);

      if (key <= MAX_HASH_VALUE)
        {
          register const char *s = wordlist[key].name;

          if (*str == *s && !strcmp (str + 1, s + 1))
            return &wordlist[key];
        }
    }
  return 0;
}

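I think this is quite fast, and it needs only one strcmp per string. This is exactly what GCC uses for keyword checking.
There is a very good introduction to gperf here.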
