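I am building a chatbot that accesses an external knowledge base, docs. I want to get the relevant documents the bot used to produce its answer. That should not happen when the user input is something like "hello", "how are you", "what is 2+2", or anything else whose answer is not retrieved from the external knowledge base docs. In those cases, I want retriever.get_relevant_documents(query), or whichever other line applies, to return an empty list or something similar.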

    import os

    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.chains import ConversationalRetrievalChain
    from langchain.memory import ConversationBufferMemory
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import PromptTemplate

    os.environ['OPENAI_API_KEY'] = ''

    custom_template = """
    This is conversation with a human. Answer the questions you get based on the knowledge you have.
    If you don't know the answer, just say that you don't, don't try to make up an answer.
    Chat History:
    {chat_history}
    Follow Up Input: {question}
    """
    CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)

    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",  # Name of the language model
        temperature=0  # Parameter that controls the randomness of the generated responses
    )
    embeddings = OpenAIEmbeddings()

    docs = [
        "Buildings are made out of brick",
        "Buildings are made out of wood",
        "Buildings are made out of stone",
        "Buildings are made out of atoms",
        "Buildings are made out of building materials",
        "Cars are made out of metal",
        "Cars are made out of plastic",
    ]

    vectorstore = FAISS.from_texts(docs, embeddings)
    retriever = vectorstore.as_retriever()
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

    qa = ConversationalRetrievalChain.from_llm(
        llm,
        retriever,
        condense_question_prompt=CUSTOM_QUESTION_PROMPT,
        memory=memory
    )

    query = "what are cars made of?"
    result = qa({"question": query})
    print(result)
    print(retriever.get_relevant_documents(query))

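I tried setting a threshold on the retriever, but I still get documents back with high similarity scores. And for other user prompts that do have relevant documents, I get no documents at all.

    retriever = vectorstore.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": .9})
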
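One way to see why a threshold behaves like this is to look at the actual scores before picking a cutoff. The sketch below only shows the filtering step: it keeps documents whose score clears a threshold and returns an empty list otherwise. The function name filter_by_score and all scores are made up for illustration. With embedding models the scores may sit in a fairly narrow band, so a cutoff of 0.9 can exclude everything while a lower one lets unrelated text through.

```python
def filter_by_score(scored_docs, threshold):
    """Keep documents whose relevance score is at least `threshold`.

    `scored_docs` is a list of (document, score) pairs where a higher
    score means a closer match. Returns [] when nothing qualifies.
    """
    return [doc for doc, score in scored_docs if score >= threshold]


# Made-up scores, only to illustrate how the cutoff changes the outcome.
scored_for_cars = [
    ("Cars are made out of metal", 0.86),
    ("Cars are made out of plastic", 0.84),
    ("Buildings are made out of brick", 0.74),
]
scored_for_hello = [
    ("Buildings are made out of atoms", 0.71),
    ("Cars are made out of metal", 0.70),
]

print(filter_by_score(scored_for_cars, 0.9))   # too strict: nothing passes
print(filter_by_score(scored_for_cars, 0.8))   # only the two car documents pass
print(filter_by_score(scored_for_hello, 0.8))  # small talk: nothing passes
```

With the real store, the pairs could come from vectorstore.similarity_search_with_relevance_scores(query, k=4). Printing them for a few related and unrelated queries should show where a workable cutoff lies for this data.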
4 Answers

Answer 1 (neekobn8)
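You need to pass the "return_source_documents" parameter to the chain. The result then includes your source documents along with their similarity scores, so you can get all the relevant documents.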
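A minimal sketch of what this could look like with the chain from the question. It assumes the legacy langchain imports used above. The output_key="answer" setting on the memory is an assumption about what the memory needs once the chain returns more than one key. build_chain and source_texts are hypothetical helper names, and fake_result is a stand-in that only mimics the shape of a chain result.

```python
from types import SimpleNamespace


def build_chain(llm, retriever):
    """Build the chain so that its results include the retrieved documents."""
    # Local imports: the helper below can be tried without LangChain installed.
    from langchain.chains import ConversationalRetrievalChain
    from langchain.memory import ConversationBufferMemory

    memory = ConversationBufferMemory(
        memory_key="chat_history",
        return_messages=True,
        output_key="answer",  # assumption: tells the memory which key to store
    )
    return ConversationalRetrievalChain.from_llm(
        llm,
        retriever,
        memory=memory,
        return_source_documents=True,
    )


def source_texts(result):
    """Return the text of each source document in a chain result, or []."""
    return [doc.page_content for doc in result.get("source_documents", [])]


# Stand-in for a chain result, used only to show the shape of the data.
fake_result = {
    "answer": "Cars are made out of metal and plastic.",
    "source_documents": [
        SimpleNamespace(page_content="Cars are made out of metal"),
        SimpleNamespace(page_content="Cars are made out of plastic"),
    ],
}
print(source_texts(fake_result))
```

Note that this chain runs the retriever for every input, so on its own it would still return documents for "hello". It exposes the sources but does not decide whether they were needed.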
Answer 2 (yc0p9oo0)

To solve this, I had to change the chain type to RetrievalQA and introduce an agent and tools.
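A rough sketch of such a setup, not necessarily the exact code used here. It assumes the legacy langchain agent API (Tool, initialize_agent, AgentType) and a tool that returns the full RetrievalQA result dict. The tool name docs_search, its description, and the helper names build_agent and sources_from_steps are made up for illustration.

```python
def build_agent(llm, retriever):
    """Wrap a RetrievalQA chain as a tool and hand it to an agent (assumed setup)."""
    # Local imports: the helper below can be tried without LangChain installed.
    from langchain.agents import AgentType, Tool, initialize_agent
    from langchain.chains import RetrievalQA

    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        return_source_documents=True,
    )
    tools = [
        Tool(
            name="docs_search",  # hypothetical tool name
            func=lambda q: qa({"query": q}),  # returns the full result dict
            description="Answers questions about what buildings and cars are made of.",
        )
    ]
    return initialize_agent(
        tools,
        llm,
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
        return_intermediate_steps=True,  # exposes each (action, observation) pair
    )


def sources_from_steps(result):
    """Collect source documents from every tool call recorded in the result."""
    docs = []
    for _action, observation in result.get("intermediate_steps", []):
        if isinstance(observation, dict):
            docs.extend(observation.get("source_documents", []))
    return docs
```

The idea is that the agent decides whether to call the tool at all. For small talk it can answer directly, no tool call is recorded, and sources_from_steps returns an empty list.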
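If the result accessed the sources, it will have a value under the key "intermediate_steps", and the source documents can then be accessed through result1["intermediate_steps"][0][1]["source_documents"]. Otherwise, when the query does not need the sources, result2["intermediate_steps"] will be empty.

Answer 3 (uxh89sit)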
Sorry for asking here, I cannot add comments. My question is: when you added the agent, did your answers get shorter? Were you able to fix that?

Answer 4 (bfhwhh0e)

This helped me get the URLs of the source documents. There is also a version that formats them better.
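As an illustration of the idea, here is a small sketch that pulls URLs out of a list of source documents and formats them. It assumes each document keeps its URL under the "source" key of its metadata, which depends on how the documents were loaded. The helper names and the example.com URLs are placeholders.

```python
from types import SimpleNamespace


def source_urls(source_documents):
    """Return the distinct "source" metadata values, in first-seen order."""
    urls = []
    for doc in source_documents:
        url = doc.metadata.get("source")
        if url and url not in urls:
            urls.append(url)
    return urls


def format_sources(urls):
    """Render the URLs as a numbered list, one per line."""
    return "\n".join(f"{i}. {url}" for i, url in enumerate(urls, start=1))


# Stand-in documents with placeholder URLs, only to show the data shape.
sample_docs = [
    SimpleNamespace(page_content="...", metadata={"source": "https://example.com/cars"}),
    SimpleNamespace(page_content="...", metadata={"source": "https://example.com/buildings"}),
    SimpleNamespace(page_content="...", metadata={"source": "https://example.com/cars"}),
]
print(format_sources(source_urls(sample_docs)))
```

With a real chain, the list would be result["source_documents"] from a chain built with return_source_documents=True.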