Opening text files in different folders and writing them to a CSV cell

Posted by zpf6vheq on 2021-08-20 in Java


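I am trying to collect the text from files in different folders and write each file's text, together with its file name (*.txt), into a single cell of a CSV file.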

import os
folders = os.listdir("/Users/hilo/Documents/digitization/ReleasedDataset_mp3")
folders

import  glob, csv

Here I get the list of folder names, which looks like this:

['Becton Dickinson_20170803',
 'CIGNA Corp._20170202',
 'The Bank of New York Mellon Corp._20170720',
 'JPMorgan Chase & Co._20170714']

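Here I tried to loop over the folders, open each txt file, extract all of its text, and write that text into a single cell of the CSV file together with its key (the file name):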

for i in folders:
    files=glob.glob("/Users/hilo/Documents/digitization/ReleasedDataset_mp3/i/*.txt")
with open('writeData.csv', mode='w') as new_file:
  writer = csv.writer(new_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
  for filename in files:

    # Take all sentences from a given file
    file = open(filename, 'rt')
    text = file.read()
    file.close()

    for text in text:
      writer.writerow((filename, text))

This keeps producing an empty CSV. Can anyone suggest what is wrong with the code?
Update: a link to a small sample of the data!

python csv glob

Source: https://stackoverflow.com/questions/68309791/opening-text-files-in-different-folders-and-write-to-a-csv-cell

2 Answers


tez616oj1#

Based on the additional information you provided in the comments, I think this will work:

import csv
import glob
import os
from pprint import pprint

# root_folder = "/Users/hilo/Documents/digitization/ReleasedDataset_mp3"
root_folder = "/Stack Overflow/_test_files_root"

# folders = ['Becton Dickinson_20170803',
#            'CIGNA Corp._20170202',
#            'The Bank of New York Mellon Corp._20170720',
#            'JPMorgan Chase & Co._20170714']
folders = ['Subfolder1', 'Subfolder3']

filepaths = []
for subfolder in folders:
    filepaths.extend(glob.glob(os.path.join(root_folder, subfolder, "*.txt")))

if os.name == 'nt':  # Improve readability on Windows (optional)
    filepaths[:] = [filepath.replace('\\', '/') for filepath in filepaths]
pprint(filepaths, width=128)  # Show files to be processed (optional)

# Process the files.

with open('writeData.csv', mode='w', newline='') as new_file:
    writer = csv.writer(new_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for filename in filepaths:
        # Take all sentences from a given file.
        with open(filename, 'rt') as file:
            text = file.read()
        # Write them into CSV along with filename.
        writer.writerow((filename, text))

print('-FINI-')

Here is how the resulting file looks in Excel:
(I used the text of various online news articles for testing.)
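
As for why the loop in the question produced an empty CSV: the pattern string ".../ReleasedDataset_mp3/i/*.txt" contains a literal "i" rather than the value of the loop variable, so glob.glob() looks for a folder that is actually named "i" and returns an empty list. In addition, files is overwritten on every pass through the loop, and "for text in text" would iterate over the text one character at a time. The following is only a minimal sketch of the same approach wrapped in a function. The name texts_to_csv and its parameters are hypothetical, and UTF-8 encoding of the text files is an assumption.

```python
import csv
import glob
import os


def texts_to_csv(root_folder, folders, csv_path):
    """Write one CSV row per .txt file: (file path, full text).

    Hypothetical helper for illustration; returns the list of files written.
    """
    filepaths = []
    for subfolder in folders:
        # Build the pattern from the loop variable. A literal ".../i/*.txt"
        # would only match a folder that is actually named "i".
        pattern = os.path.join(root_folder, subfolder, "*.txt")
        # extend() accumulates matches instead of overwriting them per folder.
        filepaths.extend(sorted(glob.glob(pattern)))

    # newline='' lets the csv module control line endings itself, so text
    # with embedded line breaks stays inside a single quoted cell.
    with open(csv_path, mode="w", newline="", encoding="utf-8") as new_file:
        writer = csv.writer(new_file)
        for filepath in filepaths:
            # Assumption: the text files are UTF-8 encoded.
            with open(filepath, "rt", encoding="utf-8") as file:
                text = file.read()
            # One row per file; the whole text goes into the second cell.
            writer.writerow((filepath, text))
    return filepaths
```

With the question's data, it could be called as texts_to_csv("/Users/hilo/Documents/digitization/ReleasedDataset_mp3", folders, "writeData.csv"), where folders is the list returned by os.listdir().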