使用源文档进行检索式问答
本文档介绍了如何在索引上使用源文档进行问答。它通过使用 RetrievalQAWithSourcesChain
实现,该链式结构可以从索引中查找文档。
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch
from langchain.vectorstores import Chroma
with open("../../state_of_the_union.txt") as f:
state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)
embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{"source": f"{i}-pl"} for i in range(len(texts))])
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
from langchain.chains import RetrievalQAWithSourcesChain
from langchain import OpenAI
chain = RetrievalQAWithSourcesChain.from_chain_type(OpenAI(temperature=0), chain_type="stuff", retriever=docsearch.as_retriever())
chain({"question": "What did the president say about Justice Breyer"}, return_only_outputs=True)
{'answer': ' The president honored Justice Breyer for his service and mentioned his legacy of excellence.\n',
'sources': '31-pl'}
链类型
您可以轻松地指定不同的链类型,以在 RetrievalQAWithSourcesChain 链中加载和使用。有关这些类型的更详细说明,请参见 此笔记本。
有两种加载不同链类型的方式。首先,您可以在 from_chain_type
方法中指定链类型参数。这允许您传递要使用的链类型的名称。例如,在下面的示例中,我们将链类型更改为 map_reduce
。
chain = RetrievalQAWithSourcesChain.from_chain_type(OpenAI(temperature=0), chain_type="map_reduce", retriever=docsearch.as_retriever())
chain({"question": "What did the president say about Justice Breyer"}, return_only_outputs=True)
{'answer': ' The president said "Justice Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service."\n',
'sources': '31-pl'}
上述方法允许您非常简单地更改链式类型,但它确实在链式类型的参数上提供了很大的灵活性。如果您想控制这些参数,可以直接加载链式结构(就像在这个笔记本中所做的那样),然后使用 combine_documents_chain
参数将其直接传递给 RetrievalQAWithSourcesChain 链式结构。例如:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
qa_chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
qa = RetrievalQAWithSourcesChain(combine_documents_chain=qa_chain, retriever=docsearch.as_retriever())
qa({"question": "What did the president say about Justice Breyer"}, return_only_outputs=True)
{'answer': ' The president honored Justice Breyer for his service and mentioned his legacy of excellence.\n',
'sources': '31-pl'}