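I have a CSV file that is not in the format AGE expects, so AGE cannot load it. My task is to convert it into a new file that AGE can read to create nodes, as specified in the documentation. To do this, I wrote a Python script that creates the new file, connects to Postgres, and executes the queries. I think this could be useful: if someone has a CSV file and wants to create nodes and edges and send them to AGE, but the file is not in the required format, the script can be used to solve the problem quickly.
Below is the old CSV file (ProductsData.csv). It contains data on products purchased by other users (identified by their user_id), the store where each product was purchased (identified by its store_id), and the product_id, which is the id of the node: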
product_name,price,description,store_id,user_id,product_id
iPhone 12,999,"Apple iPhone 12 - 64GB, Space Gray",1234,1001,123
Samsung Galaxy S21,899,"Samsung Galaxy S21 - 128GB, Phantom Black",5678,1002,124
AirPods Pro,249,"Apple AirPods Pro with Active Noise Cancellation",1234,1003,125
Sony PlayStation 5,499,"Sony PlayStation 5 Gaming Console, 1TB",9012,1004,126
Here is the Python file:
import psycopg2
import age
import csv


def read_csv(csv_file):
    with open(csv_file, 'r') as file:
        reader = csv.reader(file)
        rows = list(reader)
    return rows


def create_csv(csv_file):
    new_header = ['id', 'product_name', 'description', 'price', 'store_id', 'user_id']
    property_order = [5, 0, 2, 1, 3, 4]  # Reorder the properties accordingly.
    rows = read_csv(csv_file)
    new_csv_file = 'products.csv'
    with open(new_csv_file, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(new_header)
        # Write each row with reordered properties.
        for row in rows[1:]:
            new_row = [row[i] for i in property_order]
            writer.writerow(new_row)
    print(f"New CSV file '{new_csv_file}' has been created with the desired format.")


def load_csv_nodes(csv_file, graph_name, conn):
    with conn.cursor() as cursor:
        try:
            cursor.execute("""LOAD 'age';""")
            cursor.execute("""SET search_path = ag_catalog, "$user", public;""")
            cursor.execute("""SELECT load_labels_from_file(%s, 'Node', %s)""", (graph_name, csv_file,))
            conn.commit()
        except Exception as ex:
            print(type(ex), ex)
            conn.rollback()


def main():
    csv_file = 'ProductsData.csv'
    create_csv(csv_file)
    new_csv_file = 'products.csv'
    GRAPH_NAME = 'csv_test_graph'
    conn = psycopg2.connect(host="localhost", port="5432", dbname="database", user="user", password="password")
    age.setUpAge(conn, GRAPH_NAME)
    path_to_csv = '/path/to/folder/' + new_csv_file
    load_csv_nodes(path_to_csv, GRAPH_NAME, conn)


main()
The generated file:
id,product_name,description,price,store_id,user_id
123,iPhone 12,"Apple iPhone 12 - 64GB, Space Gray",999,1234,1001
124,Samsung Galaxy S21,"Samsung Galaxy S21 - 128GB, Phantom Black",899,5678,1002
125,AirPods Pro,Apple AirPods Pro with Active Noise Cancellation,249,1234,1003
126,Sony PlayStation 5,"Sony PlayStation 5 Gaming Console, 1TB",499,9012,1004
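As a side note, the reordering done by create_csv depends on the numeric positions in property_order, which silently break if the source columns ever move. A minimal sketch of the same transformation keyed by column name is below. The function name reorder_products and the NEW_HEADER constant are hypothetical and not part of the original script; the column names come from the two CSV files shown above.

```python
import csv
import io

# Target column order, as shown in the generated file above.
NEW_HEADER = ["id", "product_name", "description", "price", "store_id", "user_id"]


def reorder_products(src, dst):
    """Copy rows from src to dst, renaming product_id to id and reordering columns."""
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(NEW_HEADER)
    for row in reader:
        # The old file calls the node identifier product_id; the new file calls it id.
        row["id"] = row["product_id"]
        writer.writerow([row[name] for name in NEW_HEADER])


if __name__ == "__main__":
    source = io.StringIO(
        "product_name,price,description,store_id,user_id,product_id\n"
        'iPhone 12,999,"Apple iPhone 12 - 64GB, Space Gray",1234,1001,123\n'
    )
    target = io.StringIO()
    reorder_products(source, target)
    print(target.getvalue(), end="")
```

With real files, pass objects opened with newline='' instead of the StringIO buffers, as the original script does for its output file.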
However, when I run the script, it shows the following message:
<class 'psycopg2.errors.InvalidParameterValue'> label_id must be 1 .. 65535
The IDs are all between 1 and 65535, so I don't understand why this error message is shown.
2 Answers
0md85ypi #1
For how to use load_labels_from_file, see the regress testing file, which shows how to use all of the commands. Before calling load_labels_from_file, you first need to create the Node vlabel. Then run the script as is.
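The command itself is not shown in this answer. A minimal sketch of what it probably looks like is below, assuming AGE's create_vlabel(graph_name, label_name) function as it is used in the regression tests. The helper name load_nodes_with_label is hypothetical; the other statements are taken from the question's load_csv_nodes. The likely reason for the error is that the label_id in the message refers to the label's internal id rather than the id column of the CSV, so a label that does not exist yet cannot be resolved to a valid value.

```python
def load_nodes_with_label(cursor, graph_name, label, csv_path):
    """Create the vertex label first, then load the CSV rows into it.

    cursor is any DB-API cursor, for example one from psycopg2.
    """
    cursor.execute("LOAD 'age';")
    cursor.execute('SET search_path = ag_catalog, "$user", public;')
    # Assumption: create_vlabel registers the label in the graph,
    # which load_labels_from_file needs before it can insert vertices.
    cursor.execute("SELECT create_vlabel(%s, %s);", (graph_name, label))
    cursor.execute(
        "SELECT load_labels_from_file(%s, %s, %s);",
        (graph_name, label, csv_path),
    )
```

In the question's script this would replace the three cursor.execute calls inside the try block, called as load_nodes_with_label(cursor, graph_name, 'Node', csv_file) and followed by conn.commit(). Note that calling create_vlabel a second time for a label that already exists is expected to raise an error, so a rerun may need the label dropped or the call skipped.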
to94eoyn #2
That line is not written correctly; it needs to be fixed to use the correct path.
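The answer does not say which line it means; the most likely candidate is the placeholder '/path/to/folder/' used to build path_to_csv. A minimal sketch of building an absolute path and checking it before handing it to AGE is below. The helper name resolve_csv_path is hypothetical. It also assumes the script and the Postgres server run on the same machine, since the file is presumably read by the server process rather than by the Python client.

```python
from pathlib import Path


def resolve_csv_path(filename, base_dir=None):
    """Return the absolute path of filename, raising if the file does not exist."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CSV file not found: {path}")
    return str(path)
```

In main(), path_to_csv = resolve_csv_path(new_csv_file) would then replace the hard-coded folder, because create_csv writes products.csv to the current working directory.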