Don't understand Python's csv.reader object [duplicate]

Posted by c6ubokkw on 2023-01-08 in Python


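This question already has answers here:

Why can't I iterate twice over the same iterator? How can I "reset" the iterator or reuse the data? (5 answers)
Proper way to reset csv.reader for multiple iterations? (3 answers)
Closed 11 hours ago.

I've come across a behavior in Python's built-in csv module that I've never noticed before. Typically, when I read in a csv, I follow the docs pretty much verbatim, using "with" to open the file and then looping over the reader object with a "for" loop. However, I recently tried iterating over the csv.reader object twice in a row, only to find that the second "for" loop did nothing.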

import csv

with open('smallfriends.csv','rU') as csvfile:
    readit = csv.reader(csvfile,delimiter=',')

    for line in readit:
        print line

    for line in readit:
        print 'foo'

Console output:

Austins-iMac:Desktop austin$ python -i amy.py 
['Amy', 'James', 'Nathan', 'Sara', 'Kayley', 'Alexis']
['James', 'Nathan', 'Tristan', 'Miles', 'Amy', 'Dave']
['Nathan', 'Amy', 'James', 'Tristan', 'Will', 'Zoey']
['Kayley', 'Amy', 'Alexis', 'Mikey', 'Sara', 'Baxter']
>>>
>>> readit
<_csv.reader object at 0x1023fa3d0>
>>>

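So the second "for" loop basically does nothing. One thought I had was that the csv.reader object is released from memory after being read once. That isn't the case, though, since it still retains its memory address. I found a post that mentions a similar problem. The reason given there is that once the object has been read, the pointer stays at the end of the memory address, ready to write data to the object. Is this correct? Could someone explain in more detail what is going on here? Is there a way to push the pointer back to the beginning of the memory address so the data can be reread? I know doing that is bad coding practice, but I'm mainly just curious and want to learn more about what goes on under Python's hood.
Thanks!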

python

Source: https://stackoverflow.com/questions/27264818/dont-understand-pythons-csv-reader-object

3 Answers


t9aqgxwy #1

If it's not too much data, you can always read it into a list:

import csv

with open('smallfriends.csv','rU') as csvfile:
    readit = csv.reader(csvfile,delimiter=',')
    csvdata = list(readit)

    for line in csvdata :
        print line

    for line in csvdata :
        print 'foo'
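
The same idea as a self-contained Python 3 sketch, assuming an in-memory io.StringIO in place of smallfriends.csv (the sample rows are made up for illustration). Once the rows are stored in a list they can be traversed as often as needed, while the reader itself stays exhausted:

```python
import csv
import io

# Hypothetical sample data standing in for smallfriends.csv.
csvfile = io.StringIO("Amy,James,Nathan\nJames,Nathan,Tristan\n")

reader = csv.reader(csvfile, delimiter=',')
rows = list(reader)          # consumes the reader once and keeps every row

first_pass = [row[0] for row in rows]
second_pass = [row[0] for row in rows]   # a list can be iterated again

leftover = list(reader)      # the reader itself has nothing more to give
```

The trade-off is memory: the whole file is held in `rows`, which is why this only suits data of modest size.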


eqqqjvef #2

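I'll try to answer your other questions about what the reader is doing and why reset() or seek(0) might help. In its most basic form, the csv reader might look something like this: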

def csv_reader(it):
    for line in it:
        yield line.strip().split(',')

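That is, it takes any iterator that produces strings and gives you a generator. All it does is take an item from your iterator, process it, and return the result. When `it` is consumed, csv_reader quits. The reader has no idea where the iterator came from or how to properly create a fresh one, so it doesn't even try to reset itself. That is left to the programmer.
We can either modify the iterator in place without the reader knowing, or just create a new reader. Here are some examples to demonstrate my point.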
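
To see that exhaustion in action, here is a small Python 3 sketch built on the simplified csv_reader from this answer (a stand-in, not the real csv module). The sample strings are invented for illustration:

```python
def csv_reader(it):
    # Simplified stand-in for csv.reader, as sketched in this answer.
    for line in it:
        yield line.strip().split(',')

lines = ['1,2,3', '4,5,6']

reader = csv_reader(iter(lines))
first_pass = list(reader)    # drains the generator
second_pass = list(reader)   # nothing left: the generator has finished

# To get the rows again, build a new reader over a fresh iterator
# of the underlying data.
fresh_pass = list(csv_reader(iter(lines)))
```

One difference worth keeping in mind: a finished generator can never be resumed, whereas the real csv.reader asks its underlying iterator for another line on every call, which is why rewinding the file can bring it back to life.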

import csv

# Assumes data.csv has at least 7 lines.
data = open('data.csv', 'r')
reader = csv.reader(data)

print(next(reader))               # Parse the first line
[next(data) for _ in range(5)]    # Skip the next 5 lines on the underlying iterator
print(next(reader))               # This will be the 7th line in data
print(reader.line_num)            # reader thinks this is the 2nd line
data.seek(0)                      # Go back to the beginning of the file
print(next(reader))               # gives first line again
data.close()

data = ['1,2,3', '4,5,6', '7,8,9']
reader = csv.reader(data)         # works fine on lists of strings too
print(next(reader))               # ['1', '2', '3']

In general, if you need a second pass, it's best to close and reopen your file and use a new csv reader. It's clean and ensures good bookkeeping.
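
A minimal Python 3 sketch of that reopen-per-pass pattern follows. The helper names count_rows and first_column are hypothetical, and a throwaway temporary file stands in for the real CSV so the example is self-contained:

```python
import csv
import os
import tempfile

# Write a small throwaway CSV (made-up contents) to read back twice.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    tmp.write("a,b\nc,d\n")
    path = tmp.name

def count_rows(path):
    # Each call opens the file afresh and builds its own reader.
    with open(path, newline='') as f:
        return sum(1 for _ in csv.reader(f))

def first_column(path):
    # A second, independent pass with a brand-new file object and reader.
    with open(path, newline='') as f:
        return [row[0] for row in csv.reader(f)]

total = count_rows(path)        # first pass
firsts = first_column(path)     # second pass

os.remove(path)                 # clean up the temporary file
```

Because every pass owns its file object and reader, no state such as the file position or line_num leaks from one pass into the next.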


djmepvbi #3

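Iterating over a csv reader simply wraps iterating over the lines of the underlying file object. On each iteration, the reader gets the next line from the file, converts it, and returns it.
So iterating over a csv reader follows the same conventions as iterating over a file: once the file has reached its end, you have to seek back to the start before iterating a second time.
The following should work, though I haven't tested it: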

import csv

with open('smallfriends.csv','rU') as csvfile:
    readit = csv.reader(csvfile,delimiter=',')

    for line in readit:
        print line

    # go back to the start of the file
    csvfile.seek(0)

    for line in readit:
        print 'foo'
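
A Python 3 sketch of the same rewind, assuming an in-memory io.StringIO with made-up rows in place of smallfriends.csv. It relies on CPython's csv.reader requesting another line from the underlying stream on every call, so treat it as an illustration of observed behavior rather than a documented guarantee:

```python
import csv
import io

csvfile = io.StringIO("Amy,James\nNathan,Sara\n")   # made-up sample rows
reader = csv.reader(csvfile, delimiter=',')

first_pass = list(reader)
exhausted = list(reader)     # empty: the underlying stream is at its end

csvfile.seek(0)              # rewind the underlying stream
second_pass = list(reader)   # the same reader yields the rows again

# line_num keeps counting across the rewind instead of restarting.
lines_seen = reader.line_num
```

The drifting line_num is one reason a fresh reader per pass tends to be the tidier choice when line numbers matter.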

