Is there a way in MATLAB to determine the number of lines in a file without looping through each line?

Posted by czfnxgou on 2023-05-29 in Matlab

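Obviously one could loop through the file using fgetl or a similar function and increment a counter, but is there a way to determine the number of lines in a file *without* doing such a loop?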

Source: https://stackoverflow.com/questions/12176519/is-there-a-way-in-matlab-to-determine-the-number-of-lines-in-a-file-without-loop

6 Answers

iyfamqjs1#

I like to use the following code for exactly this task:

fid = fopen('someTextFile.txt', 'rb');
%# Get file size.
fseek(fid, 0, 'eof');
fileSize = ftell(fid);
frewind(fid);
%# Read the whole file.
data = fread(fid, fileSize, 'uint8');
%# Count line feeds; add one only if the last line lacks a trailing line feed.
numLines = sum(data == 10);
if ~isempty(data) && data(end) ~= 10
    numLines = numLines + 1;
end
fclose(fid);

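It is quite fast if you have enough memory to read the whole file at once. It should work for both Windows-style and Linux-style line endings, because both contain the line-feed byte (value 10).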
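To see why counting byte value 10 does not depend on the line-ending style, here is a minimal sketch that works on in-memory byte arrays rather than a real file. The helper name countLinesInBytes is hypothetical, and it adds one only when the data does not already end in a line feed, so a trailing newline is not counted as an extra line.

```matlab
% Hypothetical helper: count LF bytes (value 10), plus one if the final
% line is not terminated by a line feed.
countLinesInBytes = @(data) sum(data == 10) + double(~isempty(data) && data(end) ~= 10);

% Three lines each, with different line-ending conventions.
unixText       = uint8(sprintf('1.5\n2.5\n3.5\n'));        % LF endings
windowsText    = uint8(sprintf('1.5\r\n2.5\r\n3.5\r\n'));  % CRLF endings
noFinalNewline = uint8(sprintf('1.5\n2.5\n3.5'));          % last line unterminated

nUnix    = countLinesInBytes(unixText);
nWindows = countLinesInBytes(windowsText);
nNoFinal = countLinesInBytes(noFinalNewline);
```

All three inputs describe three lines; the carriage returns in the CRLF case are simply ignored because only the line-feed byte is counted.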

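**Edit:** I measured the performance of the answers provided so far. Below are the results for determining the number of lines of a text file containing one million double values (one value per line), averaged over 10 trials.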

Author           Mean time +- standard deviation (s)
------------------------------------------------------
 Rody Oldenhuis      0.3189 +- 0.0314
 Edric (2)           0.3282 +- 0.0248
 Mehrwolf            0.4075 +- 0.0178
 Jonas               1.0813 +- 0.0665
 Edric (1)          26.8825 +- 0.6790

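So the fastest approaches are the one using Perl and the one reading the whole file as binary data. I would not be surprised if Perl internally also reads large blocks of the file at once instead of looping through it line by line (just a guess; I do not know anything about Perl).
Using a simple fgetl() loop is slower than the other approaches by a factor of 25-75.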
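The answer does not show how the timings were collected. One plausible way to take such a measurement is sketched below, assuming tic/toc over 10 trials and a much smaller temporary test file (10,000 lines instead of one million). The names fname, countFn, and numTrials are made up for this illustration, and the method being timed is just a stand-in byte-counting approach.

```matlab
% Write a small test file with one value per line (size chosen for the sketch).
fname = [tempname, '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%g\n', rand(10000, 1));
fclose(fid);

% Method under test: read the whole file and count line-feed characters.
countFn = @() nnz(fileread(fname) == char(10));

% Time the method over several trials.
numTrials = 10;
t = zeros(numTrials, 1);
for k = 1:numTrials
    tStart = tic;
    n = countFn();
    t(k) = toc(tStart);
end

% Report mean and standard deviation in the same form as the table above.
fprintf('%d lines: %.4f +- %.4f s\n', n, mean(t), std(t));
delete(fname);
```

Swapping a different function handle into countFn would time another approach under the same conditions.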

**Edit 2:** Included Edric's second approach, which is much faster than his first and, I would say, on par with the Perl solution.

noj0wjuj2#

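I think a loop is in fact the best option. All the other options suggested so far either rely on external programs (which need error checking, need str2num, and are harder to debug or run cross-platform) or read the whole file in one go. Loops are not that bad. Here is my variant: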

function count = countLines(fname)
  fh = fopen(fname, 'rt');
  assert(fh ~= -1, 'Could not read: %s', fname);
  x = onCleanup(@() fclose(fh));
  count = 0;
  while ischar(fgetl(fh))
    count = count + 1;
  end
end

Edit: Jonas rightly points out that the loop above is really slow. Here is a faster version.

function count = countLines(fname)
fh = fopen(fname, 'rt');
assert(fh ~= -1, 'Could not read: %s', fname);
x = onCleanup(@() fclose(fh));
count = 0;
while ~feof(fh)
    count = count + sum( fread( fh, 16384, 'char' ) == char(10) );
end
end

It is still not as fast as wc -l, but it is not a disaster either.
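One caveat with counting line feeds chunk by chunk is that a final line without a trailing newline is not counted. A minimal sketch of a variant that handles this case follows; the function name countLinesChunked and the lastByte bookkeeping are additions for illustration, not part of the answer above. Local functions at the end of a script assume MATLAB R2016b or later; otherwise the function would go in its own file.

```matlab
% Demo: build a temporary file whose last line has no trailing newline.
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, sprintf('a\nb\nc'));
fclose(fid);
n = countLinesChunked(fname);   % three lines, but only two line feeds
delete(fname);

function count = countLinesChunked(fname)
    % Count lines by reading fixed-size chunks of raw bytes.
    fh = fopen(fname, 'r');
    assert(fh ~= -1, 'Could not read: %s', fname);
    cleaner = onCleanup(@() fclose(fh));
    count = 0;
    lastByte = 10;  % an empty file has no unterminated final line
    while true
        chunk = fread(fh, 16384, '*uint8');
        if isempty(chunk)
            break;
        end
        count = count + sum(chunk == 10);
        lastByte = chunk(end);
    end
    % Account for a final line that does not end in a line feed.
    if lastByte ~= 10
        count = count + 1;
    end
end
```

Remembering only the last byte of each chunk keeps memory use constant regardless of file size.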

qacovj5a3#

I found the following nice trick:

if (isunix) %# Linux, mac
    [status, result] = system( ['wc -l ', 'your_file'] );
    %# wc prints "<count> your_file", so read only the leading integer.
    numlines = sscanf(result, '%d', 1);

elseif (ispc) %# Windows
    numlines = str2num( perl('countlines.pl', 'your_file') );

else
    error('...');

end

where 'countlines.pl' is a Perl script containing

while (<>) {};
print $.,"\n";
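When wc -l is given a file name as an argument, it prints the count followed by that name, so the text returned by system contains more than digits. A small sketch of extracting the leading integer with sscanf is shown here; the sample string is made up and stands in for real command output.

```matlab
% Made-up sample of what "wc -l your_file" might return via system().
result = sprintf('  42 your_file\n');

% %d skips leading whitespace and stops at the first non-digit;
% the size argument 1 limits the read to a single value.
numlines = sscanf(result, '%d', 1);
```

The same call also handles output that consists of the number alone, such as what the Perl script prints.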

ubbxdtey4#

You can read the entire file at once, and then count how many lines you have read.

fid = fopen('yourFile.ext');
allText = textscan(fid,'%s','delimiter','\n');
numberOfLines = length(allText{1});
fclose(fid);

btqmn9zl5#

I would recommend using an external tool for this, for example an application called cloc, which is free to download.
On Linux, you simply type cloc <repository path> and get

YourPC$ cloc <directory_path>
      87 text files.
      81 unique files.                              
      23 files ignored.

http://cloc.sourceforge.net v 1.60  T=0.19 s (311.7 files/s, 51946.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
MATLAB                          59           1009           1074           4993
HTML                             1              0              0             23
-------------------------------------------------------------------------------
SUM:                            60           1009           1074           5016
-------------------------------------------------------------------------------

They also claim it should work on Windows.

wtlkbnrh6#

The problem of miscounted lines in Edric's answer can be solved with this:

function count = countlines(fname)
    fid = fopen(fname, 'r');
    assert(fid ~= -1, 'Could not read: %s', fname);
    x = onCleanup(@() fclose(fid));
    count = 0;
    % while ~feof(fid)
    %     count = count + sum( fread( fid, 16384, 'char' ) == char(10) );
    % end
    while ~feof(fid)
        [~] = fgetl(fid);
        count = count + 1;
    end
end
