How to load CSV data without the header into a Cassandra column family using a Python script

Posted by yjghlzjz on 2021-06-09 in Cassandra


I created a column family in a local Cassandra instance using cqlsh, as shown below.

CREATE TABLE sample.stackoverflow_question12 (
    id1 int,
    class1 int,
    name1 text,
    PRIMARY KEY (id1)
)

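I have a sample CSV file named "data.csv", and the data in the file looks like this:
id1 | name1 | class1
1 | hello | 10
2 | world | 20
I use the Python code below to connect to the database and load the data from the CSV under Anaconda (after installing the Cassandra driver with pip in Anaconda).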


# Connecting to local Cassandra server

from cassandra.cluster import Cluster

from cassandra.auth import PlainTextAuthProvider

auth_provider = PlainTextAuthProvider(username='cassandra', password='cassandra')

cluster = Cluster(["127.0.0.1"],auth_provider = auth_provider,protocol_version=4)
session = cluster.connect()
session.set_keyspace('sample')
cluster.connect()

# File loading

prepared = session.prepare(' Insert into stackoverflow_question12 (id1,class1,name1)VALUES (?, ?, ?)')
with open('D:/Cassandra/NoSQL/data.csv', 'r') as fares:
    for fare in fares:
        columns=fare.split(",")
        id1=columns[0]
        class1=columns[1]
        name1=columns[2]
        session.execute(prepared, [id1,class1,name1])

# closing the file

fares.close()

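When I run the code above, I get the following error.
Received an argument of invalid type for column "id1". Expected: <class 'cassandra.cqltypes.Int32Type'>, Got: <class 'str'>; (required argument is not an integer)
When I change the data types to text and run the code above, it loads the header row as data as well.
Can anyone help me change the code so that it loads the data without the header? Working code of your own would also be welcome.
I named the columns id1 and class1 because id and class are keywords and caused errors in the code when used inside the "fares" loop.
But in the real world the columns will be named class and id. How can the code be run when columns like these are involved?
Another question on my mind: Cassandra stores the primary key first and then the remaining columns in ascending order. Can we load CSV columns whose order differs from the order in which Cassandra stores the columns?
Based on this, I need to build another solution.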

cassandra python csv pycassa

Source: https://stackoverflow.com/questions/64108047/how-to-load-csv-data-without-header-into-cassandra-column-family-using-python-sc

1 Answer


ax6ht2ek1#

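You need to use types that match your schema: for integer columns, use int(columns...) because split produces strings. If you want to skip the header, you can do something like this: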

cnt = 0
with open('D:/Cassandra/NoSQL/data.csv', 'r') as fares:
    for fare in fares:
        # Skip the first line (the header), then process the rest
        if cnt == 0:
            cnt += 1
            continue
        ...
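The type conversion described above could look roughly like the sketch below. It assumes the file columns appear in the order id1, name1, class1, as in the sample data, and that the delimiter is passed in explicitly; `parse_row` is a hypothetical helper name, not part of the Cassandra driver.

```python
def parse_row(line, delimiter=","):
    """Split one CSV line and convert each field to the type the table expects.

    Assumes the file column order is id1, name1, class1 (as in the sample data).
    """
    columns = [field.strip() for field in line.split(delimiter)]
    id1 = int(columns[0])      # id1 is an int column, so convert from str
    name1 = columns[1]         # name1 is a text column, keep as str
    class1 = int(columns[2])   # class1 is an int column, so convert from str
    # Return the values in the order of the prepared statement's
    # placeholders: (id1, class1, name1)
    return [id1, class1, name1]
```

Inside the loop, the result can then be passed straight to the prepared statement, for example `session.execute(prepared, parse_row(fare))`.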

That said, it is better to use Python's built-in csv reader, which can be customized to skip the header automatically...
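As a minimal sketch of that suggestion, `csv.DictReader` from the standard library consumes the header line itself and returns each record keyed by header name. The example assumes a comma-delimited file whose header is exactly id1,name1,class1; `read_rows` is a hypothetical helper, and the in-memory sample stands in for data.csv.

```python
import csv
import io


def read_rows(csv_file, delimiter=","):
    """Yield one dict per data row; the header line is consumed by DictReader."""
    reader = csv.DictReader(csv_file, delimiter=delimiter, skipinitialspace=True)
    for record in reader:
        # Fields are looked up by header name, so the column order in the
        # file does not have to match the column order in the table.
        yield {
            "id1": int(record["id1"]),        # int column
            "class1": int(record["class1"]),  # int column
            "name1": record["name1"].strip(), # text column
        }


# In-memory stand-in for data.csv (assumed comma-delimited with a header)
sample = io.StringIO("id1,name1,class1\n1,hello,10\n2,world,20\n")
rows = list(read_rows(sample))
```

Each row can then be bound to the prepared statement in placeholder order, for example `session.execute(prepared, [row["id1"], row["class1"], row["name1"]])`. Because the values are fetched by header name, the Python variable names do not need to match the column names.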
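P.S. If you only want to load data from CSV, I recommend using an external tool such as DSBulk, which is very flexible and heavily optimized for this task. For examples, see the following blog posts: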
https://www.datastax.com/blog/2019/03/datastax-bulk-loader-introduction-and-loading
https://www.datastax.com/blog/2019/04/datastax-bulk-loader-more-loading
https://www.datastax.com/blog/2019/04/datastax-bulk-loader-common-settings
https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
https://www.datastax.com/blog/2019/07/datastax-bulk-loader-counting
https://www.datastax.com/blog/2019/12/datastax-bulk-loader-examples-loading-other-locations

Answered 2021-06-09
