
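Adaptive RAG

Adaptive RAG is a strategy for RAG that unites (1) query analysis with (2) active / self-corrective RAG.

In the paper, the authors report using query analysis to route across:

  • No retrieval
  • Single-shot RAG
  • Iterative RAG

Let's build on this using LangGraph.

In our implementation, we will route between:

  • Web search: for questions related to recent events
  • Self-corrective RAG: for questions related to our index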
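
The two-way routing in the list above can be pictured with a toy, rule-based stand-in. This is only a sketch: the notebook itself routes with an LLM, and the `route` function and `INDEX_TOPICS` tuple here are hypothetical names introduced purely for illustration.

```python
from typing import Literal

# Hypothetical keyword list standing in for "topics covered by our index".
INDEX_TOPICS = ("agent", "prompt engineering", "adversarial attack")


def route(question: str) -> Literal["vectorstore", "web_search"]:
    """Toy router: index topics go to the vectorstore, everything else to web search."""
    q = question.lower()
    if any(topic in q for topic in INDEX_TOPICS):
        return "vectorstore"
    return "web_search"


print(route("What are the types of agent memory?"))
print(route("Who will the Bears draft first in the NFL draft?"))
```

A keyword match like this is brittle, which is one reason to let an LLM with a constrained output schema make the decision instead.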


Setup

First, let's install the required packages and set our API keys.

%%capture --no-stderr
! pip install -U langchain_community tiktoken langchain-openai langchain-cohere langchainhub chromadb langchain langgraph  tavily-python
import getpass
import os


def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("OPENAI_API_KEY")
_set_env("COHERE_API_KEY")
_set_env("TAVILY_API_KEY")

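Set up LangSmith for LangGraph development

Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor LLM apps built with LangGraph; see the LangSmith documentation for details on getting started.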

Create Index

### Build Index

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

### from langchain_cohere import CohereEmbeddings

# Set embeddings
embd = OpenAIEmbeddings()

# Docs to index
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

# Load
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]

# Split
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

# Add to vectorstore
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="rag-chroma",
    embedding=embd,
)
retriever = vectorstore.as_retriever()
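
To build intuition for the `chunk_size` and `chunk_overlap` parameters used above, here is a simplified sliding-window splitter. It is a sketch under stated assumptions: it cuts fixed windows over a plain list of tokens, whereas the real splitter also respects text separators, and `split_with_overlap` is a hypothetical helper, not part of LangChain.

```python
def split_with_overlap(tokens, chunk_size=500, chunk_overlap=0):
    """Cut a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    return [tokens[i : i + chunk_size] for i in range(0, len(tokens), step)]


tokens = list(range(1200))  # stand-in for a tokenized document
print([len(c) for c in split_with_overlap(tokens)])
print([len(c) for c in split_with_overlap(tokens, chunk_overlap=100)])
```

With an overlap of 0, as in this notebook, no token appears in two chunks; a non-zero overlap trades some duplication for context that spans chunk boundaries.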

LLMs

Using Pydantic with LangChain

This notebook uses Pydantic v2 BaseModel, which requires langchain-core >= 0.3. Using langchain-core < 0.3 will result in errors due to mixing of Pydantic v1 and v2 BaseModels.

### Router

from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

from pydantic import BaseModel, Field


# Data model
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["vectorstore", "web_search"] = Field(
        ...,
        description="Given a user question choose to route it to web search or a vectorstore.",
    )


# LLM with function call
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm_router = llm.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to a vectorstore or web search.
The vectorstore contains documents related to agents, prompt engineering, and adversarial attacks.
Use the vectorstore for questions on these topics. Otherwise, use web-search."""
route_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

question_router = route_prompt | structured_llm_router
print(
    question_router.invoke(
        {"question": "Who will the Bears draft first in the NFL draft?"}
    )
)
print(question_router.invoke({"question": "What are the types of agent memory?"}))
datasource='web_search'
datasource='vectorstore'


### Retrieval Grader


# Data model
class GradeDocuments(BaseModel):
    """Binary score for relevance check on retrieved documents."""

    binary_score: str = Field(
        description="Documents are relevant to the question, 'yes' or 'no'"
    )


# LLM with function call
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeDocuments)

# Prompt
system = """You are a grader assessing relevance of a retrieved document to a user question. \n 
    If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant. \n
    It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \n
    Give a binary score 'yes' or 'no' to indicate whether the document is relevant to the question."""
grade_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "Retrieved document: \n\n {document} \n\n User question: {question}"),
    ]
)

retrieval_grader = grade_prompt | structured_llm_grader
question = "agent memory"
docs = retriever.invoke(question)
doc_txt = docs[1].page_content
print(retrieval_grader.invoke({"question": question, "document": doc_txt}))
binary_score='no'
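
Because `binary_score` is typed as a plain `str`, the grader could in principle return variants such as "Yes" or "yes.", which a strict `== "yes"` comparison would treat as a rejection. The sketch below shows one defensive way to normalize the value; `is_yes` is a hypothetical helper, not something the notebook defines.

```python
def is_yes(binary_score: str) -> bool:
    """Return True when a grader's free-form score reads as 'yes'."""
    # Drop surrounding whitespace, then stray punctuation or quotes, then case.
    cleaned = binary_score.strip().strip(".!'\"").lower()
    return cleaned == "yes"


for raw in ["yes", " Yes. ", "no", "yes and no"]:
    print(repr(raw), is_yes(raw))
```

Another option, under the same assumption, is to constrain the field to a literal set of values, as the router's `datasource` field does.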

### Generate

from langchain import hub
from langchain_core.output_parsers import StrOutputParser

# Prompt
prompt = hub.pull("rlm/rag-prompt")

# LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)


# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Chain
rag_chain = prompt | llm | StrOutputParser()

# Run
generation = rag_chain.invoke({"context": docs, "question": question})
print(generation)
The design of generative agents combines LLM with memory, planning, and reflection mechanisms to enable agents to behave based on past experience and interact with other agents. Memory stream is a long-term memory module that records agents' experiences in natural language. The retrieval model surfaces context to inform the agent's behavior based on relevance, recency, and importance.
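
Note that the cell above defines `format_docs` but passes the raw document list to the chain as `context`. As a minimal sketch of what the helper produces, the example below applies it to stand-in objects; the `Doc` dataclass is a hypothetical substitute for a LangChain document that models only `page_content`.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document; only page_content is modeled."""

    page_content: str


def format_docs(docs):
    # Same helper as in the notebook: join page contents with blank lines.
    return "\n\n".join(doc.page_content for doc in docs)


sample_docs = [
    Doc("Memory stream records experiences."),
    Doc("Retrieval surfaces context."),
]
context = format_docs(sample_docs)
print(context)
```

Passing a string built this way as `context` would give the prompt only the text of each chunk, without metadata.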

### Hallucination Grader


# Data model
class GradeHallucinations(BaseModel):
    """Binary score for hallucination present in generation answer."""

    binary_score: str = Field(
        description="Answer is grounded in the facts, 'yes' or 'no'"
    )


# LLM with function call
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeHallucinations)

# Prompt
system = """You are a grader assessing whether an LLM generation is grounded in / supported by a set of retrieved facts. \n 
     Give a binary score 'yes' or 'no'. 'Yes' means that the answer is grounded in / supported by the set of facts."""
hallucination_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "Set of facts: \n\n {documents} \n\n LLM generation: {generation}"),
    ]
)

hallucination_grader = hallucination_prompt | structured_llm_grader
hallucination_grader.invoke({"documents": docs, "generation": generation})
GradeHallucinations(binary_score='yes')
### Answer Grader


# Data model
class GradeAnswer(BaseModel):
    """Binary score to assess whether the answer addresses the question."""

    binary_score: str = Field(
        description="Answer addresses the question, 'yes' or 'no'"
    )


# LLM with function call
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeAnswer)

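# Prompt
system = """You are a grader assessing whether an answer addresses / resolves a question. Give a binary score 'yes' or 'no'. 'Yes' means that the answer resolves the question."""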
answer_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "User question: \n\n {question} \n\n LLM generation: {generation}"),
    ]
)

answer_grader = answer_prompt | structured_llm_grader
answer_grader.invoke({"question": question, "generation": generation})
GradeAnswer(binary_score='yes')
### Question Re-writer

# LLM
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)

# Prompt
system = """You are a question re-writer that converts an input question to a better version that is optimized \n 
     for vectorstore retrieval. Look at the input and try to reason about the underlying semantic intent / meaning."""
re_write_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        (
            "human",
            "Here is the initial question: \n\n {question} \n Formulate an improved question.",
        ),
    ]
)

question_rewriter = re_write_prompt | llm | StrOutputParser()
question_rewriter.invoke({"question": question})
"What is the role of memory in an agent's functioning?"

Web Search Tool

### Search

from langchain_community.tools.tavily_search import TavilySearchResults

# max_results caps the number of search hits returned
web_search_tool = TavilySearchResults(max_results=3)

Construct the Graph

Capture the flow as a graph.

Define Graph State

from typing import List

from typing_extensions import TypedDict


class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: question
        generation: LLM generation
        documents: list of documents
    """

    question: str
    generation: str
    documents: List[str]
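
Each node in the graph returns a dictionary containing only the keys it wants to set, and those values are written into the shared state. The sketch below models that behavior in plain Python under a simplifying assumption (returned keys overwrite, all other keys persist); `apply_update` is a hypothetical helper and not a LangGraph function.

```python
from typing import List, TypedDict


class GraphState(TypedDict, total=False):
    question: str
    generation: str
    documents: List[str]


def apply_update(state: GraphState, update: dict) -> GraphState:
    """Simplified model of a state update: returned keys overwrite, others persist."""
    merged = dict(state)
    merged.update(update)
    return merged


state = {"question": "agent memory"}
state = apply_update(state, {"documents": ["doc a", "doc b"]})  # e.g. a retrieve step
state = apply_update(state, {"generation": "an answer"})  # e.g. a generate step
print(state)
```

This is why a node such as a retriever can return just the documents it found and still leave the original question available to later nodes.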

Define Graph Flow

from langchain.schema import Document


def retrieve(state):
    """
    Retrieve documents

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, documents, that contains retrieved documents
    """
    print("---RETRIEVE---")
    question = state["question"]

    # Retrieval
    documents = retriever.invoke(question)
    return {"documents": documents, "question": question}


def generate(state):
    """
    Generate answer

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]

    # RAG generation
    generation = rag_chain.invoke({"context": documents, "question": question})
    return {"documents": documents, "question": question, "generation": generation}


def grade_documents(state):
    """
    Determines whether the retrieved documents are relevant to the question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates documents key with only filtered relevant documents
    """

    print("---CHECK DOCUMENT RELEVANCE TO QUESTION---")
    question = state["question"]
    documents = state["documents"]

    # Score each doc
    filtered_docs = []
    for d in documents:
        score = retrieval_grader.invoke(
            {"question": question, "document": d.page_content}
        )
        grade = score.binary_score
        if grade == "yes":
            print("---GRADE: DOCUMENT RELEVANT---")
            filtered_docs.append(d)
        else:
            print("---GRADE: DOCUMENT NOT RELEVANT---")
            continue
    return {"documents": filtered_docs, "question": question}


def transform_query(state):
    """
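    Transform the query to produce a better question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates question key with a re-phrased question
    """

    print("---TRANSFORM QUERY---")
    question = state["question"]
    documents = state["documents"]

    # Re-write question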
    better_question = question_rewriter.invoke({"question": question})
    return {"documents": documents, "question": better_question}


def web_search(state):
    """
    Web search based on the re-phrased question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates documents key with appended web results
    """

    print("---WEB SEARCH---")
    question = state["question"]

    # Web search
    docs = web_search_tool.invoke({"query": question})
    web_results = "\n".join([d["content"] for d in docs])
    web_results = Document(page_content=web_results)

    return {"documents": web_results, "question": question}


### Edges ###


def route_question(state):
    """
    Route question to web search or RAG.

    Args:
        state (dict): The current graph state

    Returns:
        str: Next node to call
    """

    print("---ROUTE QUESTION---")
    question = state["question"]
    source = question_router.invoke({"question": question})
    if source.datasource == "web_search":
        print("---ROUTE QUESTION TO WEB SEARCH---")
        return "web_search"
    elif source.datasource == "vectorstore":
        print("---ROUTE QUESTION TO RAG---")
        return "vectorstore"


def decide_to_generate(state):
    """
    Determines whether to generate an answer, or re-generate a question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Binary decision for next node to call
    """

    print("---ASSESS GRADED DOCUMENTS---")
    filtered_documents = state["documents"]

    if not filtered_documents:
        # All documents have been filtered out by the relevance check
        # We will re-generate a new query
        print(
            "---DECISION: ALL DOCUMENTS ARE NOT RELEVANT TO QUESTION, TRANSFORM QUERY---"
        )
        return "transform_query"
    else:
        # We have relevant documents, so generate answer
        print("---DECISION: GENERATE---")
        return "generate"


def grade_generation_v_documents_and_question(state):
    """
    Determines whether the generation is grounded in the document and answers question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Decision for next node to call
    """

    print("---CHECK HALLUCINATIONS---")
    question = state["question"]
    documents = state["documents"]
    generation = state["generation"]

    score = hallucination_grader.invoke(
        {"documents": documents, "generation": generation}
    )
    grade = score.binary_score

    # Check hallucination
    if grade == "yes":
        print("---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---")
        # Check question-answering
        print("---GRADE GENERATION vs QUESTION---")
        score = answer_grader.invoke({"question": question, "generation": generation})
        grade = score.binary_score
        if grade == "yes":
            print("---DECISION: GENERATION ADDRESSES QUESTION---")
            return "useful"
        else:
            print("---DECISION: GENERATION DOES NOT ADDRESS QUESTION---")
            return "not useful"
    else:
        print("---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---")
        return "not supported"
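
The final edge function above boils down to a small decision table: the hallucination check runs first, and the answer check runs only if the generation is grounded. The sketch below isolates that logic with stub graders; `grade_generation` and its callable parameters are hypothetical names used here so the example runs without any LLM calls.

```python
from typing import Callable


def grade_generation(
    is_grounded: Callable[[], bool],
    addresses_question: Callable[[], bool],
) -> str:
    """Return the edge label: hallucination check first, answer check only if grounded."""
    if not is_grounded():
        return "not supported"
    return "useful" if addresses_question() else "not useful"


# Stub graders standing in for the LLM-backed graders.
print(grade_generation(lambda: True, lambda: True))  # useful
print(grade_generation(lambda: True, lambda: False))  # not useful
print(grade_generation(lambda: False, lambda: True))  # not supported
```

Ordering the checks this way avoids spending an answer-grading call on a generation that will be discarded anyway.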

Compile Graph

from langgraph.graph import END, StateGraph, START

workflow = StateGraph(GraphState)

# Define the nodes
workflow.add_node("web_search", web_search)  # web search
workflow.add_node("retrieve", retrieve)  # retrieve
workflow.add_node("grade_documents", grade_documents)  # grade documents
workflow.add_node("generate", generate)  # generate
workflow.add_node("transform_query", transform_query)  # transform query

# Build graph
workflow.add_conditional_edges(
    START,
    route_question,
    {
        "web_search": "web_search",
        "vectorstore": "retrieve",
    },
)
workflow.add_edge("web_search", "generate")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "transform_query": "transform_query",
        "generate": "generate",
    },
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {
        "not supported": "generate",
        "useful": END,
        "not useful": "transform_query",
    },
)

# Compile
app = workflow.compile()
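
The graph wired above contains cycles: an unsupported generation loops back to `generate`, and an unhelpful one goes through `transform_query` and `retrieve` again. To make the control flow easier to follow, here is a pure-Python walk over the same edge table, driven by a scripted list of decisions. It is a sketch only: `EDGES`, `walk`, and `max_steps` are hypothetical names, and the step cap merely illustrates why a limit on looping is useful (LangGraph applies its own configurable recursion limit).

```python
# Edge table mirroring the workflow: node -> {decision label: next node}.
# A None label marks an unconditional edge.
EDGES = {
    "START": {"web_search": "web_search", "vectorstore": "retrieve"},
    "web_search": {None: "generate"},
    "retrieve": {None: "grade_documents"},
    "grade_documents": {"transform_query": "transform_query", "generate": "generate"},
    "transform_query": {None: "retrieve"},
    "generate": {
        "not supported": "generate",
        "useful": "END",
        "not useful": "transform_query",
    },
}


def walk(decisions, max_steps=25):
    """Follow EDGES, consuming one scripted decision per conditional edge."""
    pending = list(decisions)
    node, path = "START", []
    while node != "END":
        if len(path) >= max_steps:
            raise RuntimeError("step limit reached; the graph may be looping")
        targets = EDGES[node]
        label = None if None in targets else pending.pop(0)
        node = targets[label]
        if node != "END":
            path.append(node)
    return path


print(walk(["web_search", "useful"]))
print(walk(["vectorstore", "transform_query", "generate", "useful"]))
```

The second call shows the self-corrective path: no relevant documents on the first pass, a rewritten query, a second retrieval, and then generation.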

Use Graph

from pprint import pprint

# Run
inputs = {
    "question": "What player at the Bears expected to draft first in the 2024 NFL draft?"
}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint(f"Node '{key}':")
        # Optional: print full state at each node
        # pprint(value, indent=2, width=80, depth=None)
    pprint("\n---\n")

# Final generation
pprint(value["generation"])
---ROUTE QUESTION---
---ROUTE QUESTION TO WEB SEARCH---
---WEB SEARCH---
"Node 'web_search':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'generate':"
'\n---\n'
('It is expected that the Chicago Bears could have the opportunity to draft '
 'the first defensive player in the 2024 NFL draft. The Bears have the first '
 'overall pick in the draft, giving them a prime position to select top '
 'talent. The top wide receiver Marvin Harrison Jr. from Ohio State is also '
 'mentioned as a potential pick for the Cardinals.')
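
The run above reads the final answer from `value`, the loop variable left over from the last streamed step. A slightly more explicit pattern is to track the latest `generation` while iterating. The sketch below demonstrates this with a stub generator; `fake_stream` is a hypothetical stand-in that assumes each streamed item maps a node name to that node's output, as in the loop above.

```python
def fake_stream():
    """Hypothetical stand-in for the streamed run: yields {node_name: node_output}."""
    yield {"web_search": {"documents": "search results", "question": "q"}}
    yield {
        "generate": {
            "documents": "search results",
            "question": "q",
            "generation": "final answer",
        }
    }


final_generation = None
for output in fake_stream():
    for node_name, node_output in output.items():
        # Keep the most recent generation instead of relying on the loop variable.
        if "generation" in node_output:
            final_generation = node_output["generation"]

print(final_generation)
```

If no node ever produced a generation, `final_generation` stays `None` rather than raising a `KeyError`.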

# Run
inputs = {"question": "What are the types of agent memory?"}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint(f"Node '{key}':")
        # Optional: print full state at each node
        # pprint(value, indent=2, width=80, depth=None)
    pprint("\n---\n")

# Final generation
pprint(value["generation"])
---ROUTE QUESTION---
---ROUTE QUESTION TO RAG---
---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
"Node 'grade_documents':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'generate':"
'\n---\n'
('The types of agent memory include Sensory Memory, Short-Term Memory (STM) or '
 'Working Memory, and Long-Term Memory (LTM) with subtypes of Explicit / '
 'declarative memory and Implicit / procedural memory. Sensory memory retains '
 'sensory information briefly, STM stores information for cognitive tasks, and '
 'LTM stores information for a long time with different types of memories.')