Implementing a custom connector for Cohere RAG

Time: 45 min | Level: Intermediate

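The usual approach to retrieval-augmented generation requires users to build their prompts with the relevant context the LLM may rely on, and to send them to the model manually. Cohere takes a different approach: their models can now talk to external tools and extract meaningful data on their own. You can connect virtually any data source and let the Cohere LLM know how to access it. Vector search pairs well with LLMs, and enabling semantic search over your data is a typical use case.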

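Cohere RAG has a number of interesting features, such as inline citations, which point to the specific parts of the documents that were used to generate the response.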

Figure: Cohere RAG citations

Source: https://docs.cohere.com/docs/retrieval-augmented-generation-rag

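Connectors have to implement a specific interface and expose the data source as an HTTP REST API. The Cohere documentation describes the general process of creating a connector. This tutorial walks you through building such a service around Qdrant, step by step.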

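Qdrant connector

You probably already have some collections you would like to bring to the LLM. Maybe your pipeline was set up with one of the popular libraries such as Langchain, Llama Index, or Haystack. Cohere connectors may implement even more complex logic, such as hybrid search. In our case, we will start with a fresh Qdrant collection, index the data using Cohere Embed v3, build the connector, and finally connect it with the Command-R model.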

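Building the collection

First, let's build a collection and configure it for the Cohere embed-multilingual-v3.0 model. It produces 1024-dimensional embeddings, and we can choose any of the distance metrics available in Qdrant. Our connector will act as a personal assistant for a software engineer, exposing our notes so the model can suggest priorities or actions to take.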

from qdrant_client import QdrantClient, models

client = QdrantClient(
    "https://my-cluster.cloud.qdrant.io:6333", 
    api_key="my-api-key",
)
client.create_collection(
    collection_name="personal-notes",
    vectors_config=models.VectorParams(
        size=1024,
        distance=models.Distance.DOT,
    ),
)
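
Since any Qdrant distance metric can be used here, it may help to see how dot product and cosine similarity relate. The sketch below is a plain-Python illustration with no Qdrant involved; the toy vectors and the function names are made up for the example. The two scores coincide only when the vectors are normalized to unit length.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: the dot product of the unit-length versions of a and b."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(a: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]

# Toy 3-dimensional vectors standing in for 1024-dimensional embeddings.
query = [1.0, 2.0, 2.0]
doc = [2.0, 4.0, 4.0]  # same direction as the query, twice the length

print(dot(query, doc))     # 18.0
print(cosine(query, doc))  # 1.0
# After normalization the dot product matches the cosine similarity.
print(math.isclose(dot(normalize(query), normalize(doc)), 1.0))  # True
```

Whichever metric is configured, the same metric applies to every query against the collection, so it is worth choosing before data is indexed.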

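Our notes will be represented as simple JSON objects with the title and text of a specific note. The embeddings will be created from the text field only.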

notes = [
    {
        "title": "Project Alpha Review",
        "text": "Review the current progress of Project Alpha, focusing on the integration of the new API. Check for any compatibility issues with the existing system and document the steps needed to resolve them. Schedule a meeting with the development team to discuss the timeline and any potential roadblocks."
    },
    {
        "title": "Learning Path Update",
        "text": "Update the learning path document with the latest courses on React and Node.js from Pluralsight. Schedule at least 2 hours weekly to dedicate to these courses. Aim to complete the React course by the end of the month and the Node.js course by mid-next month."
    },
    {
        "title": "Weekly Team Meeting Agenda",
        "text": "Prepare the agenda for the weekly team meeting. Include the following topics: project updates, review of the sprint backlog, discussion on the new feature requests, and a brainstorming session for improving remote work practices. Send out the agenda and the Zoom link by Thursday afternoon."
    },
    {
        "title": "Code Review Process Improvement",
        "text": "Analyze the current code review process to identify inefficiencies. Consider adopting a new tool that integrates with our version control system. Explore options such as GitHub Actions for automating parts of the process. Draft a proposal with recommendations and share it with the team for feedback."
    },
    {
        "title": "Cloud Migration Strategy",
        "text": "Draft a plan for migrating our current on-premise infrastructure to the cloud. The plan should cover the selection of a cloud provider, cost analysis, and a phased migration approach. Identify critical applications for the first phase and any potential risks or challenges. Schedule a meeting with the IT department to discuss the plan."
    },
    {
        "title": "Quarterly Goals Review",
        "text": "Review the progress towards the quarterly goals. Update the documentation to reflect any completed objectives and outline steps for any remaining goals. Schedule individual meetings with team members to discuss their contributions and any support they might need to achieve their targets."
    },
    {
        "title": "Personal Development Plan",
        "text": "Reflect on the past quarter's achievements and areas for improvement. Update the personal development plan to include new technical skills to learn, certifications to pursue, and networking events to attend. Set realistic timelines and check-in points to monitor progress."
    },
    {
        "title": "End-of-Year Performance Reviews",
        "text": "Start preparing for the end-of-year performance reviews. Collect feedback from peers and managers, review project contributions, and document achievements. Consider areas for improvement and set goals for the next year. Schedule preliminary discussions with each team member to gather their self-assessments."
    },
    {
        "title": "Technology Stack Evaluation",
        "text": "Conduct an evaluation of our current technology stack to identify any outdated technologies or tools that could be replaced for better performance and productivity. Research emerging technologies that might benefit our projects. Prepare a report with findings and recommendations to present to the management team."
    },
    {
        "title": "Team Building Event Planning",
        "text": "Plan a team-building event for the next quarter. Consider activities that can be done remotely, such as virtual escape rooms or online game nights. Survey the team for their preferences and availability. Draft a budget proposal for the event and submit it for approval."
    }
]

Storing the embeddings along with the metadata is fairly simple.

import cohere
import uuid

cohere_client = cohere.Client(api_key="my-cohere-api-key")

response = cohere_client.embed(
    texts=[
        note.get("text")
        for note in notes
    ],
    model="embed-multilingual-v3.0",
    input_type="search_document",
)

client.upload_points(
    collection_name="personal-notes",
    points=[
        models.PointStruct(
            id=uuid.uuid4().hex,
            vector=embedding,
            payload=note,
        )
        for note, embedding in zip(notes, response.embeddings)
    ]
)
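
With only ten notes a single embed call is enough, but larger collections usually have to be embedded in batches. Here is a minimal sketch of a batching helper. The `batched` function and the batch size of 96 in the commented usage are assumptions for illustration, so check the current Cohere limits on texts per request.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

# Hypothetical usage with the clients defined above (not executed here):
# for batch in batched(notes, 96):
#     response = cohere_client.embed(
#         texts=[note["text"] for note in batch],
#         model="embed-multilingual-v3.0",
#         input_type="search_document",
#     )
#     # then upload `batch` together with `response.embeddings`

sizes = [len(batch) for batch in batched(list(range(10)), 4)]
print(sizes)  # [4, 4, 2]
```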

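Our collection is now ready to be searched. In the real world, the set of notes changes over time, so the ingestion process would not be this straightforward. The data is not yet exposed to the LLM; we will build the connector in the next step.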
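
One way to make re-ingestion safer as notes change is to derive point IDs deterministically, so that uploading an edited note overwrites the old point instead of creating a duplicate. This is a sketch that assumes the note title is a stable, unique key (an assumption, not something the tutorial states); `note_id` and `NOTES_NAMESPACE` are hypothetical names.

```python
import uuid

# Hypothetical namespace for this collection; any fixed UUID works.
NOTES_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "personal-notes")

def note_id(note: dict) -> str:
    """Return the same UUID string every time for the same note title."""
    return str(uuid.uuid5(NOTES_NAMESPACE, note["title"]))

original = {"title": "Cloud Migration Strategy", "text": "Draft a plan."}
edited = {"title": "Cloud Migration Strategy", "text": "Draft a revised plan."}

# The edited note maps to the same ID, so it would replace the original point.
print(note_id(original) == note_id(edited))  # True
```

Passing `id=note_id(note)` in place of `uuid.uuid4().hex` would then key each point by its title.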

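Connector web service

FastAPI is a modern web framework and a good fit for a simple HTTP API. We will use it to implement our connector. As the model requires, there will be just one endpoint. It will accept POST requests at the /search path, with a single required query parameter. Let's define a corresponding model.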

from pydantic import BaseModel

class SearchQuery(BaseModel):
    query: str

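RAG connectors do not have to return documents in any specific format. There are some good practices to follow, but Cohere models are quite flexible here. Results just have to be returned as JSON, with a list of objects in the results property of the output. We will use the same document structure as the Qdrant payloads, so no conversion is needed. That requires two additional models.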

from typing import List

class Document(BaseModel):
    title: str
    text: str

class SearchResults(BaseModel):
    results: List[Document]
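
To make the expected response shape concrete, here is a standard-library sketch that builds the same JSON body from payload dictionaries. `build_response` is a hypothetical helper and not part of the connector; it only illustrates the top-level results list.

```python
import json

def build_response(payloads: list[dict]) -> str:
    """Serialize payloads as {"results": [{"title": ..., "text": ...}, ...]}."""
    documents = [
        {"title": payload["title"], "text": payload["text"]}
        for payload in payloads
    ]
    return json.dumps({"results": documents})

body = build_response([
    {"title": "Quarterly Goals Review", "text": "Review the progress."},
])
print(body)
# {"results": [{"title": "Quarterly Goals Review", "text": "Review the progress."}]}
```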

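Once our model classes are ready, we can implement the logic that takes the query and returns the relevant notes. Note that the LLM does not define the number of documents to return. How many documents you bring into the context is entirely up to you.

We need to interact with two services: the Qdrant server and the Cohere API. FastAPI has a concept of dependency injection, and we will use it to provide both clients to the implementation.

For queries, we need to set input_type to search_query when calling the Cohere API.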

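from typing import Annotated

import cohere
from fastapi import Depends, FastAPI
from qdrant_client import QdrantClient

# Assumed local module defining QDRANT_URL, QDRANT_API_KEY and COHERE_API_KEY
import config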

app = FastAPI()

def client() -> QdrantClient:
    return QdrantClient(config.QDRANT_URL, api_key=config.QDRANT_API_KEY)

def cohere_client() -> cohere.Client:
    return cohere.Client(api_key=config.COHERE_API_KEY)

@app.post("/search")
def search(
    query: SearchQuery,
    client: Annotated[QdrantClient, Depends(client)],
    cohere_client: Annotated[cohere.Client, Depends(cohere_client)],
) -> SearchResults:
    response = cohere_client.embed(
        texts=[query.query],
        model="embed-multilingual-v3.0",
        input_type="search_query",
    )
    results = client.query_points(
        collection_name="personal-notes",
        query=response.embeddings[0],
        limit=2,
    ).points
    return SearchResults(
        results=[
            Document(**point.payload)
            for point in results
        ]
    )
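
Because the number of returned documents is up to the connector, it can be handy to make it configurable and to drop weak matches. The following sketch works on plain (score, payload) pairs rather than real Qdrant results; the `SEARCH_LIMIT` environment variable, `select_documents`, and the threshold value are all assumptions for illustration.

```python
import os

# Hypothetical setting: fall back to 2 documents, as in the endpoint above.
DEFAULT_LIMIT = int(os.environ.get("SEARCH_LIMIT", "2"))

def select_documents(scored, limit=DEFAULT_LIMIT, min_score=None):
    """Keep the `limit` highest-scoring payloads, optionally above `min_score`."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    if min_score is not None:
        ranked = [pair for pair in ranked if pair[0] >= min_score]
    return [payload for _, payload in ranked[:limit]]

# Made-up scores, not real search output.
scored = [
    (0.41, {"title": "Project Alpha Review"}),
    (0.63, {"title": "Cloud Migration Strategy"}),
    (0.12, {"title": "Team Building Event Planning"}),
]
print(select_documents(scored, limit=2, min_score=0.3))
# [{'title': 'Cloud Migration Strategy'}, {'title': 'Project Alpha Review'}]
```

Qdrant's `query_points` also accepts a `score_threshold` argument, which would apply a similar filter on the server side.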

Our app can be launched locally for development purposes, assuming the uvicorn server is installed:

uvicorn main:app

FastAPI exposes interactive documentation at http://localhost:8000/docs, where we can test our endpoint. The /search endpoint is available there.

Figure: FastAPI documentation

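We can interact with it and check which documents are returned for a specific query. For example, we want to know what we should do regarding the project infrastructure.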

curl -X "POST" \
    -H "Content-type: application/json" \
    -d '{"query": "Is there anything I have to do regarding the project infrastructure?"}' \
    "http://localhost:8000/search"

The output should look like this:

{
  "results": [
    {
      "title": "Cloud Migration Strategy",
      "text": "Draft a plan for migrating our current on-premise infrastructure to the cloud. The plan should cover the selection of a cloud provider, cost analysis, and a phased migration approach. Identify critical applications for the first phase and any potential risks or challenges. Schedule a meeting with the IT department to discuss the plan."
    },
    {
      "title": "Project Alpha Review",
      "text": "Review the current progress of Project Alpha, focusing on the integration of the new API. Check for any compatibility issues with the existing system and document the steps needed to resolve them. Schedule a meeting with the development team to discuss the timeline and any potential roadblocks."
    }
  ]
}
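
The same request can be prepared from Python with the standard library. This sketch only builds the request object; actually sending it (the commented-out lines) assumes the service is running on localhost:8000. `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request

def build_search_request(base_url: str, query: str) -> urllib.request.Request:
    """Build a POST request for the connector's /search endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request(
    "http://localhost:8000",
    "Is there anything I have to do regarding the project infrastructure?",
)
print(request.get_method(), request.full_url)
# POST http://localhost:8000/search

# To send it, with the service running locally:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["results"])
```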

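Connecting to Command-R

Our web service is implemented, but it is only running on our local machine. It has to be exposed publicly before Command-R can interact with it. For a quick experiment, setting up a tunnel with a service such as ngrok may be enough. We won't cover all the details in this tutorial, but their Quickstart is a great resource that describes the process step by step. Alternatively, you can deploy the service with a public URL.

Once that is done, we can create the connector first and then tell the model to use it while interacting through the chat API. Creating a connector is a single call to the Cohere client: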

connector_response = cohere_client.connectors.create(
    name="personal-notes",
    url="https://this-is-my-domain.app/search",
)

connector_response.connector is a descriptor, with id being one of its attributes. We will use this identifier in our interactions like this:

response = cohere_client.chat(
    message=(
        "Is there anything I have to do regarding the project infrastructure? "
        "Please mention the tasks briefly."
    ),
    connectors=[
        cohere.ChatConnector(id=connector_response.connector.id)
    ],
    model="command-r",
)

We set model to command-r, as this is the best publicly available Cohere model at the time of writing. response.text is the output of the model:

Here are some of the tasks related to project infrastructure that you might have to perform:
- You need to draft a plan for migrating your on-premise infrastructure to the cloud and come up with a plan for the selection of a cloud provider, cost analysis, and a gradual migration approach.
- It's important to evaluate your current technology stack to identify any outdated technologies. You should also research emerging technologies and the benefits they could bring to your projects.
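
The inline citations mentioned at the beginning come back alongside the text. As a sketch of how they could be rendered, the helper below inserts a bracketed marker after each cited span. It operates on plain dictionaries: the `end` and `document_ids` field names mirror what the Cohere chat response is assumed to expose, while `annotate` and the sample data are made up for illustration.

```python
def annotate(text: str, citations: list[dict]) -> str:
    """Append [document ids] after each cited span, working right to left
    so that earlier character offsets stay valid."""
    for citation in sorted(citations, key=lambda c: c["end"], reverse=True):
        marker = "[" + ", ".join(citation["document_ids"]) + "]"
        text = text[:citation["end"]] + marker + text[citation["end"]:]
    return text

# Made-up sample, not real model output.
text = "Draft a cloud migration plan and evaluate the technology stack."
citations = [
    {"start": 8, "end": 28, "document_ids": ["doc_0"]},
    {"start": 46, "end": 62, "document_ids": ["doc_1"]},
]
print(annotate(text, citations))
# Draft a cloud migration plan[doc_0] and evaluate the technology stack[doc_1].
```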

You only need to create a specific connector once! Do not call cohere_client.connectors.create for every message you send to the chat method.
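
One way to avoid re-creating the connector is to persist its ID after the first call. This is a sketch using a local JSON file; `load_or_create_connector_id`, the file name, and the `create` callback are hypothetical, with the callback standing in for the connector creation call.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

def load_or_create_connector_id(path: Path, create: Callable[[], str]) -> str:
    """Return the stored connector ID, calling `create` only if none is saved."""
    if path.exists():
        return json.loads(path.read_text())["connector_id"]
    connector_id = create()
    path.write_text(json.dumps({"connector_id": connector_id}))
    return connector_id

# Demo with a fake creator that records how often it is invoked.
calls = []

def fake_create() -> str:
    calls.append(1)
    return "connector-123"

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "connector.json"
    first = load_or_create_connector_id(cache, fake_create)
    second = load_or_create_connector_id(cache, fake_create)

print(first == second, len(calls))  # True 1
```

In a real setup, `create` would wrap the call to the Cohere client and return the `id` of the resulting connector.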

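Wrap up

We have built a Cohere RAG connector that integrates with an existing knowledge base stored in Qdrant. We covered just the basic flow; in real-world scenarios you should also consider, for example, building an authentication system to prevent unauthorized access.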
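
As one possible starting point for authentication, the connector could require a shared bearer token on every request. The sketch below shows only the header check, using a constant-time comparison; `is_authorized` and the token value are hypothetical, and wiring the check into FastAPI and into the connector registration is left out.

```python
import hmac
from typing import Optional

def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header value against a secret."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(token.encode(), expected_token.encode())

SECRET = "example-token"  # placeholder; load from configuration in practice

print(is_authorized("Bearer example-token", SECRET))  # True
print(is_authorized("Bearer wrong", SECRET))          # False
print(is_authorized(None, SECRET))                    # False
```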
