How to store a 2D array of strings containing common delimiter characters in Python?

Posted by fcipmucu on 2023-03-21 in Python


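I am trying to store a large number of news articles (eventually possibly more than a few thousand). For convenience, I would like to store them as a 2D array in a text file, e.g. [[ID, Title, Article], [1, 'Bill's Burgers', 'The owner, Bill, makes good burgers.']]. However, the solutions I have found online need some character, such as a comma, space or newline, to separate the entries. Because these characters commonly appear in news articles, I can't use them to separate elements.
I tried formatting my 2D array with json, but found that it had no effect on the array. When I print it or open the txt file, it looks exactly as I declared it: [["ID", "URL", "Title", "Date", "Article"], ["1", "2", "3", "4", "5"]]. My code is as follows: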

import json

scraped_articles_array_headings = [["ID", "URL", "Title", "Date", "Article"], ["1", "2", "3", "4", "5"]]
# Serialize the nested list to a JSON string
headings_encoded = json.dumps(scraped_articles_array_headings)
print(headings_encoded)
# Write the JSON text to disk; the with block closes the file automatically
with open("articles_encoded2.txt", "w", encoding="utf-8") as f:
    f.write(headings_encoded)

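I welcome any suggestions on a suitable way to store this data. Ideally, I just want a system that can easily search the contents of each parameter (ID, Title, etc.), and I realise the approach above may not be a sensible route to that.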

python

Source: https://stackoverflow.com/questions/75796254/how-to-store-a-2d-array-of-strings-containing-common-delimited-characters-in-pyt

2 Answers


x7yiwoj4 #1

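I think you may want to look at an example of a JSON-formatted document. Remember that this is different from CSV, which is row-based and uses the delimiters you mentioned.
To store these values in JSON, you would probably want something like the following: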

[
    {
        "id": 1,
        "url": "https://example.com",
        "title": "Example Title",
        "date": "21-03-2003",
        "article": "Some article content..."
    },
    {
        "id": 2,
        "url": "https://example.com",
        "title": "Example Title",
        "date": "21-03-2003",
        "article": "Some article content..."
    }
]
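As a minimal sketch of how records in this shape could be written and read back with Python's json module: the file name articles.json, the temporary directory and the sample records are assumptions made for illustration, not part of the question. Commas, quotes and newlines inside the strings are escaped by the JSON encoder, so they survive the round trip.

```python
import json
import os
import tempfile

# Hypothetical sample records; the text deliberately contains commas,
# quotes and a newline.
articles = [
    {"id": 1, "url": "https://example.com", "title": "Bill's Burgers",
     "date": "21-03-2003",
     "article": 'The owner, Bill, makes "good" burgers.\nSecond line.'},
    {"id": 2, "url": "https://example.com", "title": "Example Title",
     "date": "21-03-2003", "article": "Some article content..."},
]

# Hypothetical file location; a temp directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "articles.json")

# Write the list of dicts as one JSON document.
with open(path, "w", encoding="utf-8") as f:
    json.dump(articles, f, ensure_ascii=False, indent=4)

# Read it back into ordinary Python lists and dicts.
with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

# Search one field: every record whose title mentions "Burgers".
matches = [a for a in loaded if "Burgers" in a["title"]]
print(matches[0]["article"])
```

Because each record is a dict, a field is looked up by name (a["title"]) rather than by position in a row.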

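For storing large amounts of data, you may want to consider a database, which improves performance and allows queries (for example, finding all articles from a given URL).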
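One possible way to try the database route is the sqlite3 module from the standard library. This is only a sketch under assumptions: the table name articles, the column layout and the sample rows are invented for illustration, and an in-memory database is used so the example leaves no file behind.

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; pass a file path to persist data.
conn = sqlite3.connect(":memory:")

# Hypothetical schema mirroring the fields from the question.
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, url TEXT, title TEXT, date TEXT, article TEXT)"
)

# Hypothetical sample rows.
rows = [
    (1, "https://example.com/a", "Bill's Burgers", "2023-03-21",
     "The owner, Bill, makes good burgers."),
    (2, "https://example.com/b", "Example Title", "2023-03-21",
     "Some article content..."),
]

# "?" placeholders pass the values separately from the SQL text, so commas,
# quotes and newlines in an article need no manual escaping.
conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# Query by one column: all articles from a given URL.
found = conn.execute(
    "SELECT id, title FROM articles WHERE url = ?",
    ("https://example.com/a",),
).fetchall()
print(found)

conn.close()
```

The same WHERE clause pattern applies to any other column, such as title or date.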

2023-03-21

ttcibm8c #2

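You can have articles that contain "commas, spaces, newlines, etc." and still use a common format that relies on those characters as delimiters. The csv module quotes any field that contains a delimiter, a quote character or a newline, so the field boundaries stay unambiguous. An example using the Python csv module: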

import csv

scraped_articles = [['ID', 'URL', 'Title', 'Date', 'Article'],
                    ['1','2','3','4','article1 with\n "quotes" and commas(,) and newlines'],
                    ['1','2','3','4','article2 with\n "quotes" and commas(,) and newlines']]

# Write: csv.writer quotes fields that contain commas, quotes or newlines.
with open('articles.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerows(scraped_articles)

# Read back: DictReader uses the first row as the field names.
with open('articles.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for line in reader:
        print('-'*80)
        print(line['Article'])
    print('-'*80)

Resulting articles.csv:

ID,URL,Title,Date,Article
1,2,3,4,"article1 with
 ""quotes"" and commas(,) and newlines"
1,2,3,4,"article2 with
 ""quotes"" and commas(,) and newlines"

Output after reading the data back:

--------------------------------------------------------------------------------
article1 with
 "quotes" and commas(,) and newlines
--------------------------------------------------------------------------------
article2 with
 "quotes" and commas(,) and newlines
--------------------------------------------------------------------------------
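Since the question also asks about searching by field, here is a small sketch of filtering rows on one column with csv.DictReader. The sample rows, the temporary file location and the search term are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample data; the second article contains a comma and a newline.
rows = [
    ["ID", "URL", "Title", "Date", "Article"],
    ["1", "https://example.com/a", "Bill's Burgers", "2023-03-21",
     "The owner, Bill, makes good burgers."],
    ["2", "https://example.com/b", "Other news", "2023-03-21",
     "Line one,\nline two."],
]

# Hypothetical file location; a temp directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "articles.csv")

with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Keep only the rows whose Title contains the search term (case-insensitive).
with open(path, "r", newline="", encoding="utf-8") as f:
    hits = [row for row in csv.DictReader(f) if "burgers" in row["Title"].lower()]

print([row["ID"] for row in hits])
```

This scans the whole file on every search, which is usually fine for a few thousand articles.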

2023-03-21
