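Nile Vector Store (Multi-tenant PostgreSQL)

This notebook shows how to use NileVectorStore, a Postgres-based vector store, to store and query vector embeddings for multi-tenant RAG applications.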

What is Nile?

Nile is a Postgres database that enables all database operations per tenant, including auto-scaling, branching, and backups, with full customer isolation.

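Multi-tenant RAG applications are increasingly popular because they provide security and privacy while using large language models.

However, managing the underlying Postgres database is not straightforward. A database per tenant is expensive and complex to manage, while a shared database raises security and privacy concerns and also limits the scalability and performance of RAG applications. Nile re-engineered Postgres to deliver the best of both worlds: the isolation of a database per tenant with the cost efficiency and developer experience of a shared database.

Storing millions of vectors in a shared database can be slow and requires significant resources to index and query. But if you store 1000 tenants in Nile's virtual tenant databases, each with 1000 vectors, this becomes quite manageable, especially since you can place larger tenants on their own compute while smaller tenants efficiently share compute resources and auto-scale as needed.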
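
The arithmetic behind this claim can be sketched as follows. This is a back-of-the-envelope illustration that assumes an exact (brute-force) scan and equally sized tenants; indexed searches and real workloads will behave differently.

```python
# Back-of-the-envelope sketch: distance computations per query for an
# exact (brute-force) scan. Assumes equally sized tenants; the numbers
# are the ones used in the paragraph above.
num_tenants = 1000
vectors_per_tenant = 1000

shared_scan = num_tenants * vectors_per_tenant  # one big shared table
per_tenant_scan = vectors_per_tenant            # one virtual tenant database

print(f"shared table: {shared_scan:,} comparisons per query")
print(f"per tenant:   {per_tenant_scan:,} comparisons per query")
print(f"reduction:    {shared_scan // per_tenant_scan}x")
```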

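Getting started with Nile

Start by signing up for Nile. Once you have signed up, you will be prompted to create your first database. Go ahead and do so. You will then be redirected to the "Query Editor" page of your new database.

From there, click "Home" (the top icon in the left menu), click "generate credentials", and copy the resulting connection string. You will need it shortly.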

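Additional resources

Before you begin

Install dependencies

Let's install and import the dependencies.

If you are opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.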

%pip install llama-index-vector-stores-nile

!pip install llama-index

import logging

from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)
from llama_index.vector_stores.nile import NileVectorStore, IndexType

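Set up the connection to the Nile database

Assuming you followed the instructions in the "Getting started with Nile" section above, you should now have a connection string for your Nile database.

You can set it in an environment variable called NILEDB_SERVICE_URL, or directly in Python.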

%env NILEDB_SERVICE_URL=postgresql://username:password@us-west-2.db.thenile.dev:5432/niledb
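
Before handing the connection string to the vector store, it can help to check that it has the expected shape. The sketch below uses only the standard library. The helper name `describe_service_url` is hypothetical and not part of LlamaIndex or Nile, and the URL is the placeholder from the cell above.

```python
from urllib.parse import urlparse


def describe_service_url(url: str) -> dict:
    """Split a Postgres connection URL into its parts (hypothetical helper)."""
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname or not parts.path.lstrip("/"):
        raise ValueError("connection URL must include a host and a database name")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


info = describe_service_url(
    "postgresql://username:password@us-west-2.db.thenile.dev:5432/niledb"
)
print(info)  # the password is deliberately left out of the summary
```

A malformed value, for example one missing the database name, fails here with a readable message instead of a driver error later on.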

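Now we'll create a NileVectorStore. Note that in addition to the usual parameters, such as the URL and the number of dimensions, we also set tenant_aware=True.

:fire: NileVectorStore supports both tenant-aware vector stores, which isolate documents for each tenant, and regular stores, which are typically used for shared data that all tenants can access. Below, we demonstrate the tenant-aware vector store.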

# Read the service URL from the NILEDB_SERVICE_URL environment variable
import os

NILEDB_SERVICE_URL = os.environ["NILEDB_SERVICE_URL"]

# OR set it explicitly
# NILEDB_SERVICE_URL = "postgresql://nile:password@db.thenile.dev:5432/nile"

vector_store = NileVectorStore(
    service_url=NILEDB_SERVICE_URL,
    table_name="documents",
    tenant_aware=True,
    num_dimensions=1536,
)

Set up OpenAI

You can set your OpenAI API key in a .env file, or directly in Python.

%env OPENAI_API_KEY=sk-...

# Uncomment and set it explicitly if you prefer not to use .env
# os.environ["OPENAI_API_KEY"] = "sk-..."

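Multi-tenant similarity search

To demonstrate multi-tenant similarity search with LlamaIndex and Nile, we will download two documents, each containing a transcript of a sales call from a different company. Nexiv provides IT services and ModaMart is in retail. We'll add a tenant identifier to each document and load both into a tenant-aware vector store. Then we will query the store for each tenant. You will see how the same question produces two different responses, because it retrieves different documents for each tenant.

Download data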

!mkdir -p data
!wget "https://raw.githubusercontent.com/niledatabase/niledatabase/main/examples/ai/sales_insight/data/transcripts/nexiv-solutions__0_transcript.txt" -O "data/nexiv-solutions__0_transcript.txt"
!wget "https://raw.githubusercontent.com/niledatabase/niledatabase/main/examples/ai/sales_insight/data/transcripts/modamart__0_transcript.txt" -O "data/modamart__0_transcript.txt"

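Load documents

We'll use LlamaIndex's SimpleDirectoryReader to load the documents. Because we want to update the documents with tenant metadata after loading, we'll use a separate reader for each tenant.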

reader = SimpleDirectoryReader(
    input_files=["data/nexiv-solutions__0_transcript.txt"]
)
documents_nexiv = reader.load_data()

reader = SimpleDirectoryReader(input_files=["data/modamart__0_transcript.txt"])
documents_modamart = reader.load_data()

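Enrich documents with tenant metadata

We are going to create two Nile tenants and add the tenant ID of each to the document metadata. We also add some additional metadata, such as a custom document ID and a category. This metadata can be used to filter documents during retrieval. In your own application, you could of course also load documents for existing tenants and add any metadata you find useful.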

tenant_id_nexiv = str(vector_store.create_tenant("nexiv-solutions"))
tenant_id_modamart = str(vector_store.create_tenant("modamart"))

# Add the tenant id to the metadata
for i, doc in enumerate(documents_nexiv, start=1):
    doc.metadata["tenant_id"] = tenant_id_nexiv
    # We will use this to apply additional filters in a later example
    doc.metadata["category"] = "IT"
    # Setting a custom id is optional but can be useful
    doc.id_ = f"nexiv_doc_id_{i}"

for i, doc in enumerate(documents_modamart, start=1):
    doc.metadata["tenant_id"] = tenant_id_modamart
    doc.metadata["category"] = "Retail"
    doc.id_ = f"modamart_doc_id_{i}"
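
If you have more than a couple of tenants, the two loops above can be folded into one helper. This is a sketch: `tag_documents` is a hypothetical name, and `SimpleNamespace` objects stand in for LlamaIndex documents so the snippet runs on its own. It relies only on the `metadata` and `id_` attributes used above.

```python
from types import SimpleNamespace


def tag_documents(docs, tenant_id, category, id_prefix):
    """Attach tenant metadata and a custom id to each document, in place."""
    for i, doc in enumerate(docs, start=1):
        doc.metadata["tenant_id"] = tenant_id
        doc.metadata["category"] = category
        doc.id_ = f"{id_prefix}_doc_id_{i}"
    return docs


# Stand-ins for loaded documents; the tenant id is a placeholder string.
sample_docs = [SimpleNamespace(metadata={}, id_=None) for _ in range(2)]
tag_documents(sample_docs, "tenant-123", "IT", "nexiv")
print([(d.id_, d.metadata) for d in sample_docs])
```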

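Create a VectorStoreIndex with NileVectorStore

We load all documents into the same VectorStoreIndex. Since we created a tenant-aware NileVectorStore when we set things up, Nile will use the tenant_id field in the metadata to isolate them.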

Loading documents without a tenant_id into a tenant-aware store raises a ValueError.
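
One way to catch this early is to check the metadata before indexing. The sketch below is an assumption about how you might do that in your own code: `find_untagged` is a hypothetical helper, and the sample objects are stand-ins for documents.

```python
from types import SimpleNamespace


def find_untagged(docs):
    """Return the ids of documents whose metadata lacks a tenant_id."""
    return [d.id_ for d in docs if not d.metadata.get("tenant_id")]


docs = [
    SimpleNamespace(id_="doc_1", metadata={"tenant_id": "tenant-123"}),
    SimpleNamespace(id_="doc_2", metadata={}),  # would be rejected by the store
]
missing = find_untagged(docs)
if missing:
    print(f"Missing tenant_id on: {missing}")
```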

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents_nexiv + documents_modamart,
    storage_context=storage_context,
    show_progress=True,
)

Parsing nodes: 100%|██████████| 2/2 [00:00<00:00, 1129.32it/s]
Generating embeddings: 100%|██████████| 2/2 [00:00<00:00,  4.58it/s]

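Query the index for each tenant

Below, we specify the tenant for each query, so each answer is relevant to that tenant and draws only on that tenant's documents.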

nexiv_query_engine = index.as_query_engine(
    similarity_top_k=3,
    vector_store_kwargs={
        "tenant_id": str(tenant_id_nexiv),
    },
)

print(nexiv_query_engine.query("What were the customer pain points?"))

The customer pain points were related to managing customer data using multiple platforms, leading to data discrepancies, time-consuming reconciliation efforts, and decreased productivity.

modamart_query_engine = index.as_query_engine(
    similarity_top_k=3,
    vector_store_kwargs={
        "tenant_id": str(tenant_id_modamart),
    },
)

print(modamart_query_engine.query("What were the customer pain points?"))

The customer's pain points were concerns about the quality and value of the winter jackets, skepticism towards reviews, worries about sizing and fit when ordering clothes online, and the desire for a warm but lightweight jacket.
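
The isolation seen in the two answers above can be pictured with a small in-memory model. This is a toy sketch, not how Nile or NileVectorStore is implemented: rows carry a tenant id, and the search only ever ranks rows that belong to the requested tenant. All names, texts, and vectors are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# (tenant_id, text, embedding) rows in one shared table; values are invented.
rows = [
    ("nexiv", "data spread across multiple platforms", [0.9, 0.1]),
    ("nexiv", "slow reconciliation", [0.7, 0.3]),
    ("modamart", "unsure about jacket sizing", [0.8, 0.2]),
]


def tenant_search(tenant_id, query_vec, top_k=3):
    """Rank only the rows that belong to tenant_id."""
    scoped = [r for r in rows if r[0] == tenant_id]
    scoped.sort(key=lambda r: cosine(r[2], query_vec), reverse=True)
    return [text for _, text, _ in scoped[:top_k]]


query = [1.0, 0.0]
print(tenant_search("nexiv", query))
print(tenant_search("modamart", query))
```

The same query vector returns different texts for each tenant because the candidate set is narrowed before any similarity is computed.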

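Query existing embeddings

In the example above, we created the index by loading and embedding new documents. But what if you have already generated the embeddings and stored them in Nile? In that case, you still initialize NileVectorStore as above, but instead of VectorStoreIndex.from_documents(...) you use the following: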

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
query_engine = index.as_query_engine(
    vector_store_kwargs={
        "tenant_id": str(tenant_id_modamart),
    },
)
response = query_engine.query("What action items do we need to follow up on?")

print(response)

The action items to follow up on include sending the customer detailed testimonials about the lightweight and warm qualities of the jackets, providing the customer with a sizing guide, and emailing the customer a 10% discount on their first purchase.

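Approximate nearest neighbor search with ANN indexes

Nile supports all index types supported by pgvector: IVFFlat and HNSW. IVFFlat is faster to build, uses fewer resources, and is simpler to tune. HNSW requires more resources to create and use and is more challenging to tune, but offers an excellent trade-off between accuracy and speed. Let's see how to use the indexes, even though an example with only two documents doesn't really need them.

IVFFlat index

An IVFFlat index works by dividing the vector space into regions called "lists". A search first finds the lists nearest to the query and then searches for the nearest neighbors within those lists. You specify the number of lists (nlists) when you create the index, and at query time you can specify how many of the nearest lists to search (ivfflat_probes).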

try:
    vector_store.create_index(index_type=IndexType.PGVECTOR_IVFFLAT, nlists=10)
except Exception as e:
    # This will throw an error if the index already exists, which may be expected
    print(e)

nexiv_query_engine = index.as_query_engine(
    similarity_top_k=3,
    vector_store_kwargs={
        "tenant_id": str(tenant_id_nexiv),
        "ivfflat_probes": 10,
    },
)

print(
    nexiv_query_engine.query("What action items do we need to follow up on?")
)

vector_store.drop_index()

Index documents_embedding_idx already exists
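
To make the lists and probes idea from the IVFFlat description concrete, here is a toy version in plain Python. It is a sketch under simplifying assumptions, not pgvector's implementation: centroids are sampled at random instead of being found with k-means, vectors are 2-dimensional, and the function names are made up. The `nlists` and `probes` arguments play the same conceptual role as `nlists` and `ivfflat_probes` above.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def build_lists(vectors, nlists, seed=0):
    """Assign each vector to the list of its nearest centroid.

    Real IVFFlat runs k-means to pick centroids; sampling them at random
    keeps the sketch short.
    """
    centers = random.Random(seed).sample(vectors, nlists)
    lists = [[] for _ in range(nlists)]
    for v in vectors:
        lists[nearest(v, centers)].append(v)
    return centers, lists


def search(query, centers, lists, probes, top_k):
    """Scan only the `probes` lists whose centroids are closest to query."""
    order = sorted(range(len(centers)), key=lambda i: math.dist(query, centers[i]))
    candidates = [v for i in order[:probes] for v in lists[i]]
    return sorted(candidates, key=lambda v: math.dist(query, v))[:top_k]


rng = random.Random(42)
vectors = [(rng.random(), rng.random()) for _ in range(200)]
centers, lists = build_lists(vectors, nlists=10)

query = (0.5, 0.5)
exact = sorted(vectors, key=lambda v: math.dist(query, v))[:3]
approx = search(query, centers, lists, probes=2, top_k=3)
full = search(query, centers, lists, probes=10, top_k=3)

print("probes=2 matches exact:", approx == exact)
print("probes=10 matches exact:", full == exact)
```

Probing every list scans all vectors, so it is equivalent to an exact search. Fewer probes scan less data and may miss a true neighbor that sits in a list that was not probed.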

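HNSW index

An HNSW index works by organizing the vectors into a multi-layer graph, where each layer holds connections between points at a different level of granularity. During a search, it navigates from the coarse layers to the fine ones to identify the nearest neighbors in the data. When creating the index, you specify the maximum number of connections per layer (m) and the number of candidate vectors considered while building the graph (ef_construction). At query time, you can specify the size of the candidate list to search (hnsw_ef).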

try:
    vector_store.create_index(
        index_type=IndexType.PGVECTOR_HNSW, m=16, ef_construction=64
    )
except Exception as e:
    # This will throw an error if the index already exists, which may be expected
    print(e)

nexiv_query_engine = index.as_query_engine(
    similarity_top_k=3,
    vector_store_kwargs={
        "tenant_id": str(tenant_id_nexiv),
        "hnsw_ef": 10,
    },
)

print(nexiv_query_engine.query("Did we mention any pricing?"))

vector_store.drop_index()
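
The role of the candidate list size can be illustrated with a heavily simplified, single-layer version of the graph search. This is a sketch and not pgvector's HNSW: the graph has one layer, it is built by brute force, and a chain of extra links is added so that every point is reachable. Here `m` and `ef` play the same conceptual role as `m` and `hnsw_ef` above; all function names are made up.

```python
import heapq
import math
import random


def build_graph(vectors, m):
    """Link every point to its m nearest neighbours (brute force).

    Real HNSW builds the graph incrementally across several layers; this
    sketch uses one layer and links i to i+1 so the graph is connected.
    """
    n = len(vectors)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        near = sorted(range(n), key=lambda j: math.dist(vectors[i], vectors[j]))
        for j in near[1 : m + 1]:
            graph[i].add(j)
            graph[j].add(i)
        if i + 1 < n:
            graph[i].add(i + 1)
            graph[i + 1].add(i)
    return graph


def search_layer(graph, vectors, query, entry, ef):
    """Best-first search that keeps at most ef results (the candidate list)."""
    d0 = math.dist(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap: closest unexplored node first
    results = [(-d0, entry)]    # max-heap: worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break  # nothing left that could improve the result set
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(query, vectors[nb])
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    return sorted((-neg, node) for neg, node in results)


rng = random.Random(7)
vectors = [(rng.random(), rng.random()) for _ in range(100)]
graph = build_graph(vectors, m=4)
query = (0.5, 0.5)

small = search_layer(graph, vectors, query, entry=0, ef=3)
large = search_layer(graph, vectors, query, entry=0, ef=len(vectors))
exact = sorted((math.dist(query, v), i) for i, v in enumerate(vectors))

print("ef=3 top hit:", small[0])
print("ef=100 top 3 match exact:", large[:3] == exact[:3])
```

A larger `ef` keeps more candidates alive during the walk, which costs more distance computations but makes it less likely that the search stops in a local minimum.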

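Additional vector store operations

Metadata filters

NileVectorStore also supports filtering vectors based on metadata. For example, when we loaded the documents, we included a category in each document's metadata. We can now use it to filter the retrieved documents. Note that this filter is applied in addition to the tenant filter. In a tenant-aware vector store, the tenant filter is mandatory, in order to prevent accidental data leaks.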

filters = MetadataFilters(
    filters=[
        MetadataFilter(
            key="category", operator=FilterOperator.EQ, value="Retail"
        ),
    ]
)

nexiv_query_engine_filtered = index.as_query_engine(
    similarity_top_k=3,
    filters=filters,
    vector_store_kwargs={"tenant_id": str(tenant_id_nexiv)},
)
print(
    "test query on nexiv with filter on category = Retail (should return empty): ",
    nexiv_query_engine_filtered.query("What were the customer pain points?"),
)

test query on nexiv with filter on category = Retail (should return empty):  Empty Response
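
The empty result follows from the two filters being combined with AND. A conceptual model in plain Python is shown below. It is an illustration with made-up rows and a hypothetical `select` function, not the query that NileVectorStore actually runs.

```python
rows = [
    {"tenant_id": "nexiv", "category": "IT", "text": "nexiv transcript"},
    {"tenant_id": "modamart", "category": "Retail", "text": "modamart transcript"},
]


def select(rows, tenant_id, **equals):
    """Keep rows for tenant_id that also match every key == value filter."""
    return [
        r
        for r in rows
        if r["tenant_id"] == tenant_id
        and all(r.get(k) == v for k, v in equals.items())
    ]


# The tenant filter always applies; the metadata filter narrows it further.
print(select(rows, "nexiv", category="Retail"))
print(select(rows, "modamart", category="Retail"))
```

The Nexiv tenant has no document in the Retail category, so the first call has nothing to return, while the same filter under the ModaMart tenant matches its transcript.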

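Delete documents

Deleting documents can be quite important, especially if some of your tenants are located in regions where GDPR compliance is required.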

ref_doc_id = "nexiv_doc_id_1"
vector_store.delete(ref_doc_id, tenant_id=tenant_id_nexiv)

# Query the data again
print(
    "test query on nexiv after deletion (should return empty): ",
    nexiv_query_engine.query("What were the customer pain points?"),
)

test query on nexiv after deletion (should return empty):  Empty Response