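I am doing some self-paced learning on creating and configuring LLMs for personal use. In this scenario, I am trying to connect to AstraDB and store headlines from a sample of news articles in a vector database.
The Python code is as follows: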
ASTRA_DB_SECURE_BUNDLE_PATH = "<INSERT PATH>.zip"  # Secure connect bundle: a zip file downloaded from AstraDB
ASTRA_DB_APPLICATION_TOKEN = "<INSERT TOKEN>"
ASTRA_DB_CLIENT_ID = "<INSERT CLIENT_ID>"
ASTRA_DB_CLIENT_SECRET = "<INSERT CLIENT_SECRET>"
ASTRA_DB_KEYSPACE_NAME = "<INSERT KEYSPACE NAME>"
OPEN_API_KEY = "<INSERT OPENAI KEY>"
from langchain.vectorstores.cassandra import Cassandra
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from datasets import load_dataset
cloud_config = {
    'secure_connect_bundle': ASTRA_DB_SECURE_BUNDLE_PATH
}
auth_provider = PlainTextAuthProvider(ASTRA_DB_CLIENT_ID, ASTRA_DB_CLIENT_SECRET)
cluster = Cluster(cloud=cloud_config, auth_provider=auth_provider)
astraSession = cluster.connect()
llm = OpenAI(openai_api_key=OPEN_API_KEY)
myEmbedding = OpenAIEmbeddings(openai_api_key=OPEN_API_KEY)
myCassandraVStore = Cassandra(
    embedding=myEmbedding,
    session=astraSession,
    keyspace=ASTRA_DB_KEYSPACE_NAME,
    table_name="qa_mini_demo",
)
print("loading data from huggingface")
myDataset = load_dataset("Biddls/Onion_News", split = "train")
headlines = myDataset["text"][:50]
print("\nGenerating embeddings and storing in AstraDB")
myCassandraVStore.add_texts(headlines)
print("Inserted %i headlines.\n" % len(headlines))
When I run the file, I get the following error:
Traceback (most recent call last):
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\cassandra\datastax\cloud\__init__.py", line 138, in read_metadata_info
response = urlopen(url, context=config.ssl_context, timeout=timeout)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 519, in open
response = self._open(req, data)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 496, in _call_chain
result = func(*args)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 1352, in do_open
r = h.getresponse()
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\http\client.py", line 1375, in getresponse
response.begin()
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\http\client.py", line 318, in begin
version, status, reason = self._read_status()
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\http\client.py", line 279, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\socket.py", line 705, in readinto
return self._sock.recv_into(b)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\ssl.py", line 1274, in recv_into
return self.read(nbytes, buffer)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\ssl.py", line 1130, in read
return self._sslobj.read(len, buffer)
TimeoutError: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\PATH\", line 22, in <module>
cluster = Cluster(cloud=cloud_config, auth_provider=auth_provider)
File "cassandra\cluster.py", line 1132, in cassandra.cluster.Cluster.__init__
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\cassandra\datastax\cloud\__init__.py", line 92, in get_cloud_config
config = read_metadata_info(config, cloud_config)
File "C:\PATH\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\cassandra\datastax\cloud\__init__.py", line 141, in read_metadata_info
raise DriverException("Unable to connect to the metadata service at %s. "
cassandra.DriverException: Unable to connect to the metadata service at https://3b3b9a1d-bb70-4078-8d4f-5b0e69e5a4b3-us-east1.db.astra.datastax.com:29080/metadata. Check the cluster status in the cloud console.
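I have double-, triple-, and quadruple-checked that the cluster is active. My guess is that the timeout error is what is causing the problem, perhaps because of a slow internet connection, but I don't know how to run the connection with different timeout lengths to test that.
Any insight here is appreciated.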
1 Answer
What if you update it as follows?
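One update that targets the read timeout in the traceback is to give the metadata request more time. The sketch below assumes the DataStax Python driver honours an optional `connect_timeout` key (in seconds) in the `cloud` dict passed to `Cluster`. The helper names `build_cloud_config` and `try_connect` and the timeout values are hypothetical, chosen only for illustration. It also shows one way to try several timeout lengths in turn, which is the test the question asks about.

```python
import time


def build_cloud_config(bundle_path, connect_timeout=30):
    """Build the dict passed as Cluster(cloud=...).

    Assumption: the driver reads 'connect_timeout' (seconds) when it
    contacts the Astra metadata service; its built-in default is short.
    """
    return {
        "secure_connect_bundle": bundle_path,
        "connect_timeout": connect_timeout,
    }


def try_connect(bundle_path, client_id, client_secret, timeouts=(10, 30, 60)):
    """Try each timeout in turn; return (session, timeout) on first success."""
    # Imported here so the config helper above works without the driver installed.
    from cassandra import DriverException
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    auth_provider = PlainTextAuthProvider(client_id, client_secret)
    for timeout in timeouts:
        started = time.monotonic()
        try:
            cluster = Cluster(
                cloud=build_cloud_config(bundle_path, timeout),
                auth_provider=auth_provider,
            )
            return cluster.connect(), timeout
        except DriverException as exc:
            # The metadata-service failure in the traceback is a DriverException.
            elapsed = time.monotonic() - started
            print(f"connect_timeout={timeout}s failed after {elapsed:.1f}s: {exc}")
    raise RuntimeError("Could not reach the metadata service with any timeout")
```

In the question's script this would replace the `cloud_config`, `Cluster(...)` and `cluster.connect()` lines, for example `astraSession, used = try_connect(ASTRA_DB_SECURE_BUNDLE_PATH, ASTRA_DB_CLIENT_ID, ASTRA_DB_CLIENT_SECRET)`. If even the longest timeout fails, the cause is more likely a firewall or proxy blocking port 29080 than a slow connection.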