Histogram of the length of words in C: exercise hint?

Posted by zc0qhyus 12 months ago in Other


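I am learning C from the book "The C Programming Language", and I am trying to solve exercise 1.13:
"Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging."
I wrote the code, but when I press CTRL+Z (end of file), it shows all zeros instead of the lengths of the words.
Could anyone give me a hint about what I am doing wrong?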

#include <stdio.h>

/* print a histogram of the length of words from input */
main()
{
    int c, i, wordn, space;
    int lengthn[20];

    wordn = space = 0;
    for (i = 0; i < 20; ++i)
        lengthn[i] = 0;

    while ((c = getchar()) != EOF) {
        if (c == ' ' || c == '\t' || c == '\n')
            if (space == 1) {
                ++wordn;
                space = 0;
                ++i;
            }
        if (c != ' ' && c != '\t' && c != '\n') {
            ++lengthn[i];
            space = 1;
        }
    }
    printf("Length: ");
    for (i = 0; i < 16; ++i)
        printf("%d   ", lengthn[i]);
    printf("\n        --------------------------------------------------------------\n");
    printf("Word:   1   2   3   4   5   6   7   8   9   10   11   12   13   14   15\n");
}


Tags: c

Source: https://stackoverflow.com/questions/5843156/histogram-of-the-length-of-words-exercise-hint

4 Answers


jhiyze9q #1

(Because the OP asked for a hint, not a solution)

So... what is i equal to after this loop?

for (i = 0; i < 20; ++i)
    lengthn[i] = 0;

And where do you use it next?


vom3gejh #2

for (i = 0; i < 20; ++i)     
    lengthn[i] = 0;

After this loop, i has the value 20,
so i must be set back to 0 before the while loop.
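To make the hint concrete, here is a minimal sketch of the fix. It assumes the text comes from a string instead of getchar(), so that the function can be tried in isolation; the name word_lengths and its parameters are illustrative and do not come from the question.

```c
/* Stores the length of each word of `text` in lengths[0..max_words-1]
 * and returns the number of words found (at most max_words).
 * Hypothetical helper: the question reads from getchar() instead. */
int word_lengths(const char *text, int lengths[], int max_words)
{
    int i;
    int in_word = 0;

    for (i = 0; i < max_words; ++i)
        lengths[i] = 0;

    i = 0; /* the fix: after the loop above i == max_words, so reset it */

    for (; *text != '\0' && i < max_words; ++text) {
        if (*text == ' ' || *text == '\t' || *text == '\n') {
            if (in_word) {   /* a word just ended: move to the next slot */
                in_word = 0;
                ++i;
            }
        } else {
            ++lengths[i];
            in_word = 1;
        }
    }
    if (in_word)             /* last word not followed by whitespace */
        ++i;
    return i;
}
```

The extra `i < max_words` test keeps the index inside the array when the input has more words than slots, which the code in the question does not check.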


cigdeys3 #3

I wrote a version for the vertical orientation. I am new to C, so the code may not be good.

#include <stdio.h>

#define MAX_WORDS 100
#define IN 1
#define OUT 0

int maxlength(int length[], int num_of_word);

int main(void)
{
    int c;              /* int, not char, so that EOF can be detected */
    int i, j, num_of_word;
    int state = OUT;    /* start outside any word */
    int length[MAX_WORDS];

    /* initialize length[] */
    for (i = 0; i < MAX_WORDS; i++) {
        length[i] = 0;
    }
    /* find the length of each word */
    num_of_word = 0;
    while (num_of_word < MAX_WORDS && (c = getchar()) != EOF) {
        if (c != ' ' && c != '\t' && c != '\n') {
            state = IN;
            length[num_of_word]++;
        } else if (state != OUT) {
            state = OUT;
            num_of_word++;
        }
    }
    /* count a last word that is not followed by whitespace */
    if (state == IN && num_of_word < MAX_WORDS) {
        num_of_word++;
    }
    /* draw histogram: one row per height, tallest row first */
    for (i = maxlength(length, num_of_word); i > 0; i--) {
        for (j = 0; j < num_of_word; j++) {
            if (length[j] < i) {
                printf(" ");
            } else {
                printf("|");
            }
        }
        printf("\n");
    }
    /* print name of each column (last digit only, to keep columns aligned) */
    for (i = 0; i < num_of_word; i++) {
        printf("%d", (i + 1) % 10);
    }
    printf("\n");

    return 0;
}

/* sub-function that finds the longest word */
int maxlength(int length[], int num_of_word)
{
    int i, max;
    max = length[0];
    for (i = 1; i < num_of_word; i++) {
        if (max < length[i]) {
            max = length[i];
        }
    }
    return max;
}


b0zn9rqh #4

First, here is the output of the original code, with only one line changed:

#define MAXLEN 3

100 is definitely a bit too large for testing.

t te tes test tests
^Z
  1 -   1  *
  2 -   1  *
  3 -   0
  4 -   0
  5 -   0
  6 -   0
  7 -   0
  8 -   0
  9 -   0
 10 -   0
 11 -   0
 12 -   0
 13 -   0
 14 -   0
 15 -   0
 16 -   0
 17 -   0
 18 -   0
 19 -   0
 20 - -858993460

Overflow: 2

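And we see the first problems:

The numbers do not match: 5 words were entered, but only 4 are accounted for.
A negative value is shown for words of length 20.

There are some off-by-one errors.
Below are some hints for the 3 errors, plus a complete alternative example, including source code, output and arguments.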

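Problem: words of length `MAXLEN` are ignored

It is not clear whether MAXLEN is meant to be inclusive or exclusive. Including words of length MAXLEN in the histogram seems more natural, but either way these words must be counted somewhere: in the histogram or in the overflow count:

if (nc < MAXLEN)
    ++wlen[nc];
else if (nc > MAXLEN)
    ++overflow;

Here they are left out, because the case nc == MAXLEN matches neither test.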

Change 1

if (nc <= MAXLEN)
    ++wlen[nc];
else if (nc > MAXLEN)
    ++overflow;

We get

t te tes test tests
^Z
  1 -   1  *
  2 -   1  *
  3 -   1  *
  4 -   0
...
 20 - -858993460

Overflow: 2

Now all five words are accounted for.
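The inclusive test from Change 1 can be isolated in a small helper, which makes it easy to check that every word lands in exactly one counter. This is only a sketch: tally_word is a hypothetical name, and MAXLEN is set to 3 as in the test run above.

```c
#define MAXHIST 20 /* histogram rows, as in the answer */
#define MAXLEN  3  /* longest word that still goes into the histogram */

/* Counts one word of length nc: into wlen[nc] when nc <= MAXLEN,
 * otherwise into *overflow. wlen must have at least MAXLEN + 1 items. */
void tally_word(int nc, int wlen[], int *overflow)
{
    if (nc <= 0)
        return;        /* not a word: nothing to count */
    if (nc <= MAXLEN)
        ++wlen[nc];    /* inclusive bound: length MAXLEN is kept */
    else
        ++*overflow;
}
```

With the five test words (lengths 1 to 5), three go to the histogram and two to the overflow count, so the total matches the input.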

Change 2

The huge negative number printed for length 20 comes from the bug here

for (i = 1; i < MAXHIST + 1; ++i)
    {
        printf("%3d - %3d  ", i, wlen[i]);
        for (j = wlen[i]; j > MINLEN && j < MAXLEN; --j)
        {
            printf("*");
        }
        printf("\n");
    }

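because the printf loop runs the index all the way up to MAXHIST, while wlen only has the indices 0 to MAXHIST-1.
There are many ways to fix this. A simple one is used here: just add 1 item to the wlen array.

int wlen[1+MAXHIST];

and use

for (i = 1; i <= MAXHIST; ++i) wlen[i] = 0;

We get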

t te tes test tests
^Z
  1 -   1  *
  2 -   1  *
  3 -   1  *
  4 -   0
 ...
 20 -   0

Overflow: 2

More about the code

First

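nc = overflow = 0;
for (i = 0; i < MAXHIST; ++i) wlen[i] = 0;

For decades now (since C99) we have been able to declare i inside the for, and we should. In C++, for example, we can also declare variables in the header of a switch, while or if. It is important to narrow the scope (and lifetime) of every variable, especially ones with simple names such as i and j.
Chained initialization looks fancy, but it makes the variables harder to find.

We can write

int nc       = 0;
int overflow = 0;
for (int i = 0; i < MAXHIST; ++i) wlen[i] = 0;

We can also write

int wlen[1 + MAXHIST] = {0};

and remove this for loop, since the program initializes the array only once.
Instead of

int state;
state = OUT;

you should use

int state = OUT;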

A function should print the histogram, something like

int print_h(unsigned len, int wlen[], unsigned overflow)
{
  for (unsigned i = 1; i <= len; ++i)
  {
      printf("%3d - %3d  ", i, wlen[i]);
      for (unsigned j = wlen[i]; j > MINLEN && j < MAXLEN;
           --j)
          printf("*");
      printf("\n");
  }
  printf("\nOverflow: %d\n", overflow);
  return 0;
}

This is fine, but the second loop, which is there only to print the *, is overkill.
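Note that the loop condition in print_h compares a word count (j) with MINLEN and MAXLEN, which are word lengths, so a bucket holding MAXLEN or more words prints no stars at all. A minimal sketch of a row formatter whose bar depends only on the count follows; format_row and the cap MAXSTARS are hypothetical names, and writing into a buffer instead of calling printf is just a choice made here so that the result can be inspected.

```c
#include <stdio.h>

#define MAXSTARS 40 /* hypothetical cap on the bar width */

/* Writes one histogram row (for example "  3 -   2  **") into buf.
 * Returns the number of stars written. */
int format_row(char *buf, size_t size, unsigned len, unsigned count)
{
    int used = snprintf(buf, size, "%3u - %3u  ", len, count);
    if (used < 0 || (size_t)used >= size)
        return 0; /* buffer too small for the label */

    unsigned stars   = count < MAXSTARS ? count : MAXSTARS;
    unsigned written = 0;
    while (written < stars && (size_t)used + 1 < size) {
        buf[used++] = '*'; /* one star per word in this bucket */
        ++written;
    }
    buf[used] = '\0';
    return (int)written;
}
```

A caller could then print each row with puts(buf), keeping the same "length - count" layout as the output shown in this answer.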

The program now, with these changes

#include <stdio.h>

/* Prints a histogram of the lengths of words. */

#define MAXHIST 20
#define MAXLEN 3
#define MINLEN 0
#define IN 1
#define OUT 0

int print_h(unsigned len, int wlen[], unsigned overflow);

int main()
{
    int wlen[1 + MAXHIST] = {0};
    int nc                = 0;
    int overflow          = 0;

    int c     = 0;
    int state = OUT;
    while ((c = getchar()) != EOF)
    {
        if (c == '\t' || c == '\n' || c == ' ')
        {
            state = OUT;
            if (nc <= MAXLEN)
                ++wlen[nc];
            else if (nc > MAXLEN)
                ++overflow;
        }
        else if (state == OUT)
        {
            state = IN;
            nc    = 1;
        }
        else { ++nc; }
    }
    print_h(MAXHIST, wlen, overflow);
    return 0;
}

int print_h(unsigned len, int wlen[], unsigned overflow)
{
    for (unsigned i = 1; i <= len; ++i)
    {
        printf("%3d - %3d  ", i, wlen[i]);
        for (unsigned j = wlen[i]; j > MINLEN && j < MAXLEN;
             --j)
            printf("*");
        printf("\n");
    }
    printf("\nOverflow: %d\n", overflow);
    return 0;
}

Sample output

t te test tests
^Z
  1 -   1  *
  2 -   1  *
  3 -   0
  4 -   0
  5 -   0
  6 -   0
  7 -   0
  8 -   0
  9 -   0
 10 -   0
 11 -   0
 12 -   0
 13 -   0
 14 -   0
 15 -   0
 16 -   0
 17 -   0
 18 -   0
 19 -   0
 20 -   0

Overflow: 2

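About the state machine and another possible problem

Using if to control the state, testing the stream of letters first and the state afterwards, is hard to follow. In general, a finite state machine (FSM) is a filter for some input (here, the stream of letters) that is fed in a loop; a switch then looks at the state and performs its actions.
The FSM state must be tested before the item (here, the letter). The idea is that each state is independent and can be coded separately, even by different people at different times.
* overflow should be a state: once a word reaches the overflow count, all of its remaining letters are skipped. This is a different behavior, and it is easier to code this way.
* EOF should also be a state: at EOF we just print the results and terminate.
* But at EOF we may be in the middle of a word, and with the way the provided code works that word is ignored.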
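The points above can be sketched as a small machine that is fed one item at a time, with the state tested before the item. This is only an illustration under assumptions: struct fsm, fsm_feed and is_space are hypothetical names, and max_len is assumed to lie between 1 and MAXHIST.

```c
#include <stdio.h> /* EOF */

#define MAXHIST 20

enum state { S_OUT, S_IN, S_OVERFLOW, S_EOF };

struct fsm {
    enum state state;
    unsigned   nc;                /* letters in the current word */
    unsigned   max_len;           /* longest word kept, 1..MAXHIST */
    unsigned   wlen[1 + MAXHIST]; /* wlen[n]: words of length n */
    unsigned   overflow;          /* words longer than max_len */
};

static int is_space(int c) { return c == ' ' || c == '\t' || c == '\n'; }

/* Feeds one item (a character or EOF) to the machine. */
void fsm_feed(struct fsm *m, int c)
{
    switch (m->state) {
    case S_OUT: /* between words */
        if (c == EOF) {
            m->state = S_EOF;
        } else if (!is_space(c)) {
            m->state = S_IN;
            m->nc    = 1;
        }
        break;
    case S_IN: /* inside a word of at most max_len letters so far */
        if (c == EOF || is_space(c)) {
            ++m->wlen[m->nc]; /* the word ended, possibly at EOF */
            m->state = (c == EOF) ? S_EOF : S_OUT;
        } else if (++m->nc > m->max_len) {
            ++m->overflow;    /* counted once, on entering the state */
            m->state = S_OVERFLOW;
        }
        break;
    case S_OVERFLOW: /* skip the rest of a word that is too long */
        if (c == EOF)
            m->state = S_EOF;
        else if (is_space(c))
            m->state = S_OUT;
        break;
    case S_EOF: /* terminal state: nothing left to do */
        break;
    }
}
```

A caller would zero-initialize the struct, set max_len, feed every character returned by getchar() including the final EOF, and print the counters once the state is S_EOF.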

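Another example

Rather than changing the provided code step by step, I am leaving here a complete C example, sample output and some arguments, to avoid an even longer post.

Some results from the example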

Stack Overflow C>_ p
    maximum word len is 4 (inclusive)
test
^Z
  1 -   0
  2 -   0
  3 -   0
  4 -   1  *
  5 -   0
  6 -   0
  7 -   0
  8 -   0
  9 -   0
 10 -   0
 11 -   0
 12 -   0
 13 -   0
 14 -   0
 15 -   0
 16 -   0
 17 -   0
 18 -   0
 19 -   0
 20 -   0

1 words in histogram [overflow: 0 words]. [All words accounted for].

Stack Overflow C>_ echo t te tes test tests | p
    maximum word len is 4 (inclusive)
  1 -   1  *
  2 -   1  *
  3 -   1  *
  4 -   1  *
  5 -   0
  6 -   0
...
 20 -   0

4 words in histogram [overflow: 1 words]. [All words accounted for].

Stack Overflow C>_ p 5
    maximum word len is 5 (inclusive)
1 12 123 1234
^Z
  1 -   1  *
  2 -   1  *
  3 -   1  *
  4 -   1  *
  5 -   0
  6 -   0
...
 20 -   0

4 words in histogram [overflow: 0 words]. [All words accounted for].

Stack Overflow C>_ echo t | p 5
    maximum word len is 5 (inclusive)
  1 -   1  *
  2 -   0
  3 -   0
 ...
 20 -   0

1 words in histogram [overflow: 0 words]. [All words accounted for].

Stack Overflow C>_ echo t te tes test tests | p 4
    maximum word len is 4 (inclusive)
  1 -   1  *
  2 -   1  *
  3 -   1  *
  4 -   1  *
  5 -   0
...
 20 -   0

4 words in histogram [overflow: 1 words]. [All words accounted for].

Stack Overflow C>_

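The program p assumes a default maximum length of 4 letters, but the user can supply a value on the command line, as in p 12, to accept words of up to 12 letters.
A terminal line such as echo t te tes test tests | p 4 runs the program with 4 as the maximum length and with the text between echo and the pipe symbol | as input, which makes testing faster and easier.
The code keeps a count of the words read. At the end it compares that count with the number of words in the histogram plus the overflow count, and reports any mismatch.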
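The value given on the command line ends up as the largest index used in the wlen array, so it should never exceed MAXHIST. A minimal sketch of a stricter parser based on strtol follows; parse_max_len and its parameters are hypothetical names, not part of the program p.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses a maximum word length from a command-line argument.
 * Returns `fallback` when arg is NULL, not a positive number,
 * or has trailing characters; clamps the result to `cap`. */
unsigned parse_max_len(const char *arg, unsigned fallback, unsigned cap)
{
    if (arg == NULL)
        return fallback;

    char *end = NULL;
    errno = 0;
    long value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE || value <= 0)
        return fallback;
    return value > (long)cap ? cap : (unsigned)value;
}
```

In main this could be called as parse_max_len(argc > 1 ? argv[1] : NULL, MAXLEN, MAXHIST), so that input such as p 99 or p abc cannot push the index past the end of the array.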

Complete code for the example

/* Prints a histogram of the lengths of words. */
#include <stdio.h>
#include <stdlib.h>

#define MAXHIST 20
#define MAXLEN 4
#define MINLEN 0

// states for the FSM
#define S_OUT 0
#define S_IN 1
#define S_OVERFLOW 2
#define S_EOF 3

// delimiters for words
#define TAB '\t'
#define NEWLINE '\n'
#define SPACE ' '

int print_h(
    unsigned len, int wlen[], unsigned overflow,
    unsigned n_words);

int main(int argc, char** argv)
{
    unsigned max_l = MAXLEN;  // default
    if (argc > 1) max_l = (unsigned)atoi(argv[1]);
    if (max_l > MAXHIST) max_l = MAXHIST;  // wlen[] has only MAXHIST + 1 items
    fprintf(
        stderr, "    maximum word len is %u (inclusive)\n",
        max_l);

    int      wlen[1 + MAXHIST] = {0};
    unsigned nc                = 0;
    unsigned n_words           = 0;
    unsigned overflow          = 0;

    int state = S_OUT;

    int c = getchar();  // read first char
    do {
        switch (state)
        {
            case S_OUT:
                switch (c)
                {
                    case SPACE:
                    case TAB:
                    case NEWLINE:
                        // keep searching
                        c = getchar();  // keep going
                        break;
                    case EOF:
                        state = S_EOF;
                        // just go
                        break;
                    default:
                        // some non-space char
                        state = S_IN;
                        nc    = 1;
                        c     = getchar();  // keep going
                        break;
                };  // switch
                break;

            case S_IN:  // in a word
                switch (c)
                {
                    case SPACE:
                    case TAB:
                    case NEWLINE:  // white-space
                        ++wlen[nc];
                        ++n_words;
                        state = S_OUT;
                        c     = getchar();  // keep going
                        break;
                    case EOF:
                        if (nc <= max_l) ++wlen[nc]; // last word
                        ++n_words;
                        state = S_EOF;
                        break;
                    default:  // another non-space char
                        nc += 1;
                        c = getchar();  // keep going
                        if (nc > max_l)
                        {
                            overflow += 1;
                            ++n_words;
                            state = S_OVERFLOW;
                            break;
                        }
                        break;
                };              // switch
                break;

            case S_OVERFLOW:
                switch (c)
                {
                    case SPACE:
                    case TAB:
                    case NEWLINE:       // white-space
                        state = S_OUT;  // back to search
                        c     = getchar();  // keep going
                        break;
                    case EOF:  // we have a last word to
                               // account for
                        state = S_EOF;
                        break;
                    default:  // another non-space char
                        // do nothing
                        c = getchar();  // keep going
                        break;
                };              // switch
                break;

            case S_EOF:  // print results and return
            default:
                print_h(MAXHIST, wlen, overflow, n_words);
                return 0;
                break;

        };  // switch(state)

    } while (1);  // while

    return 0;  // to make the compiler happy
};             // main



int print_h(
    unsigned len, int wlen[], unsigned overflow,
    unsigned n_words)
{
    long unsigned n_hist = 0;  // words on table
    for (unsigned i = 1; i <= len; ++i)
    {
        printf("%3u - %3d  ", i, wlen[i]);
        n_hist += wlen[i];
        if (wlen[i] > 0)
            printf("*\n");
        else
            printf("\n");
    }
    printf(
        "\n%lu words in histogram [overflow: %u words].",
        n_hist, overflow);
    long unsigned total = n_hist + overflow;
    if (n_words == total)
        printf(" [All words accounted for].\n");
    else
        printf(
            "\
\n\
[ERROR]: %lu words in histogram\n\
         %u words on overflow\n\
         %u words on input.\n",
            n_hist, overflow, n_words);
    return 0;
}

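About the example code

There are 4 states: overflow and EOF were added
The maximum length is accepted from the command line
The state is tested inside an infinite loop

As an example, here is the code for the S_IN state, where we are inside a word and a new value of c has just been read: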

case S_IN:  // in a word
               switch (c)
               {
                   case SPACE:
                   case TAB:
                   case NEWLINE:  // white-space
                       ++wlen[nc];
                       ++n_words;
                       state = S_OUT;
                       c     = getchar();  // keep going
                       break;
                   case EOF:
                       if (nc <= max_l) ++wlen[nc]; // last word
                       ++n_words;
                       state = S_EOF;
                       break;
                   default:  // another non-space char
                       nc += 1;
                       c = getchar();  // keep going
                       if (nc > max_l)
                       {
                           overflow += 1;
                           ++n_words;
                           state = S_OVERFLOW;
                           break;
                       }
                       break;
               };              // switch
               break;

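All states use similar code: a switch on the letter. That makes the states faster to write and to manage. Above we see the transitions to the other 3 states. The code for S_OVERFLOW is similar: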

case S_OVERFLOW:
                switch (c)
                {
                    case SPACE:
                    case TAB:
                    case NEWLINE:       // white-space
                        state = S_OUT;  // back to search
                        c     = getchar();  // keep going
                        break;
                    case EOF:  // we have a last word to
                               // account for
                        state = S_EOF;
                        break;
                    default:  // another non-space char
                        // do nothing
                        c = getchar();  // keep going
                        break;
                };              // switch
                break;

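That is what I wanted to show. The overflow logic is not hidden inside S_IN, and neither is the test for skipping the remaining letters. An FSM makes the abstraction easier to model.
The code for each state can be developed and tested by different people.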
