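Similarity search
Searching for the nearest vectors is at the core of many representation learning applications. Modern neural networks are trained to transform objects into vectors so that objects that are close in the real world appear close in vector space. Examples include texts with similar meanings, visually similar pictures, or songs of the same genre.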

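This is how vector similarity works
Query API
Available as of v1.10.0
Qdrant provides a single interface for all kinds of search and exploration requests: the Query API.
Depending on the query parameter, Qdrant might prefer different strategies for the search.
Below is a reference list of the query types you can perform with the Query API in Qdrant:
| Query type | Description |
|---|---|
| Nearest neighbors search | Vector similarity search, also known as k-NN |
| Search by ID | Search by an already stored vector, skipping embedding model inference |
| Recommendations | Provide positive and negative examples |
| Discovery search | Guide the search using context as a one-shot training set |
| Scroll | Get all points, with optional filtering |
| Grouping | Group results by a certain field |
| Order by | Order points by a payload key |
| Hybrid search | Combine multiple queries to get better results |
| Multi-stage search | Optimize performance for large embeddings |
| Random sampling | Get random points from the collection |
Nearest neighbors search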
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7] // <--- Dense vector
}
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7], # <--- Dense vector
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7], // <--- Dense vector
});
use qdrant_client::Qdrant;
use qdrant_client::qdrant::{Condition, Filter, Query, QueryPointsBuilder};
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(Query::new_nearest(vec![0.2, 0.1, 0.9, 0.7]))
)
.await?;
import java.util.List;
import static io.qdrant.client.QueryFactory.nearest;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
QdrantClient client = new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(List.of(0.2f, 0.1f, 0.9f, 0.7f)))
.build()).get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f }
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
})
Search by ID
POST /collections/{collection_name}/points/query
{
"query": "43cf51e2-8777-4f52-bc74-c2cbde0c8b04" // <--- point id
}
client.query_points(
collection_name="{collection_name}",
query="43cf51e2-8777-4f52-bc74-c2cbde0c8b04", # <--- point id
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: '43cf51e2-8777-4f52-bc74-c2cbde0c8b04', // <--- point id
});
use qdrant_client::Qdrant;
use qdrant_client::qdrant::{Condition, Filter, PointId, Query, QueryPointsBuilder};
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(Query::new_nearest(PointId::new("43cf51e2-8777-4f52-bc74-c2cbde0c8b04")))
)
.await?;
import java.util.UUID;
import static io.qdrant.client.QueryFactory.nearest;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
QdrantClient client = new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(UUID.fromString("43cf51e2-8777-4f52-bc74-c2cbde0c8b04")))
.build()).get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: Guid.Parse("43cf51e2-8777-4f52-bc74-c2cbde0c8b04")
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQueryID(qdrant.NewID("43cf51e2-8777-4f52-bc74-c2cbde0c8b04")),
})
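Metrics
There are many ways to estimate the similarity of vectors. In Qdrant terminology, these methods are called metrics. The choice of metric depends on how the vectors were obtained and, in particular, on how the neural network encoder was trained.
Qdrant supports these most popular types of metrics:
- Dot product: Dot - https://en.wikipedia.org/wiki/Dot_product
- Cosine similarity: Cosine - https://en.wikipedia.org/wiki/Cosine_similarity
- Euclidean distance: Euclid - https://en.wikipedia.org/wiki/Euclidean_distance
- Manhattan distance: Manhattan* - https://en.wikipedia.org/wiki/Taxicab_geometry
*Available as of v1.7
The most typical metric used in similarity learning models is the cosine metric.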

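Qdrant computes this metric in two steps, which lets it achieve a higher search speed. The first step is to normalize each vector when it is added to the collection. This happens only once per vector.
The second step is the comparison of vectors. At this point it becomes equivalent to a dot product, which is a very fast operation thanks to SIMD.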
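The two-step computation described above can be sketched in plain Python. This is only an illustration of the idea, not Qdrant's implementation: the helper names normalize, dot, and cosine are made up for this example, and a real engine would use SIMD instead of Python loops.

```python
import math


def normalize(vector):
    """Step 1: scale a vector to unit length (done once, at insertion time)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # a zero vector has no direction; leave it unchanged
    return [x / norm for x in vector]


def dot(a, b):
    """Step 2: a plain dot product, used at query time."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Textbook cosine similarity, kept only as a reference for comparison."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


raw_stored = [0.2, 0.1, 0.9, 0.7]
raw_query = [0.5, 0.3, 0.2, 0.3]

stored = normalize(raw_stored)  # normalized once when the point is upserted
query = normalize(raw_query)    # the query vector is normalized as well
score = dot(stored, query)      # now equal to the cosine similarity
```

Because both vectors have unit length after normalization, the dot product in the last line gives the same value as cosine(raw_stored, raw_query), without recomputing any norms at query time.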
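Depending on the query configuration, Qdrant might prefer different strategies for the search. Read more about it in the query planning section.
Search API
Let's look at an example of a search query.
REST API (schema definition):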
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7],
"filter": {
"must": [
{
"key": "city",
"match": {
"value": "London"
}
}
]
},
"params": {
"hnsw_ef": 128,
"exact": false
},
"limit": 3
}
from qdrant_client import QdrantClient, models
client = QdrantClient(url="http://localhost:6333")
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7],
query_filter=models.Filter(
must=[
models.FieldCondition(
key="city",
match=models.MatchValue(
value="London",
),
)
]
),
search_params=models.SearchParams(hnsw_ef=128, exact=False),
limit=3,
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7],
filter: {
must: [
{
key: "city",
match: {
value: "London",
},
},
],
},
params: {
hnsw_ef: 128,
exact: false,
},
limit: 3,
});
use qdrant_client::qdrant::{Condition, Filter, QueryPointsBuilder, SearchParamsBuilder};
use qdrant_client::Qdrant;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(3)
.filter(Filter::must([Condition::matches(
"city",
"London".to_string(),
)]))
.params(SearchParamsBuilder::default().hnsw_ef(128).exact(false)),
)
.await?;
import java.util.List;
import static io.qdrant.client.ConditionFactory.matchKeyword;
import static io.qdrant.client.QueryFactory.nearest;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.Filter;
import io.qdrant.client.grpc.Points.QueryPoints;
import io.qdrant.client.grpc.Points.SearchParams;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setFilter(Filter.newBuilder().addMust(matchKeyword("city", "London")).build())
.setParams(SearchParams.newBuilder().setExact(false).setHnswEf(128).build())
.setLimit(3)
.build()).get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
using static Qdrant.Client.Grpc.Conditions;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
filter: MatchKeyword("city", "London"),
searchParams: new SearchParams { Exact = false, HnswEf = 128 },
limit: 3
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
Filter: &qdrant.Filter{
Must: []*qdrant.Condition{
qdrant.NewMatch("city", "London"),
},
},
Params: &qdrant.SearchParams{
Exact: qdrant.PtrOf(false),
HnswEf: qdrant.PtrOf(uint64(128)),
},
})
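In this example, we are looking for vectors similar to the vector [0.2, 0.1, 0.9, 0.7].
The limit parameter (or its alias, top) specifies the number of most similar results we would like to retrieve.
Values under the params key specify custom parameters for the search.
Currently, these can be:
- hnsw_ef - the value of the ef parameter of the HNSW algorithm.
- exact - an option to skip the approximate search (ANN). If set to true, the search may run for a long time, because it performs a full scan to retrieve exact results.
- indexed_only - an option to disable search in segments that do not yet have a vector index built. This can be useful if you want to minimize the impact on search performance while the collection is being updated. Using this option may lead to partial results if the collection is not fully indexed yet, so consider it only when eventual consistency is acceptable for your use case.
Since the filter parameter is specified, the search is performed only among the points that satisfy the filter condition.
See the Filtering section for details on the available filters and how they work.
An example result of this API would be: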
{
"result": [
{ "id": 10, "score": 0.81 },
{ "id": 14, "score": 0.75 },
{ "id": 11, "score": 0.73 }
],
"status": "ok",
"time": 0.001
}
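The result contains a list of found point IDs ordered by score.
Note that payload and vector data are missing from these results by default. See Payload and vector in the result for how to include them.
Available as of v0.10.0
If the collection was created with multiple vectors, the name of the vector to use for searching should be provided: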
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7],
"using": "image",
"limit": 3
}
from qdrant_client import QdrantClient
client = QdrantClient(url="http://localhost:6333")
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7],
using="image",
limit=3,
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7],
using: "image",
limit: 3,
});
use qdrant_client::qdrant::QueryPointsBuilder;
use qdrant_client::Qdrant;
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(3)
.using("image"),
)
.await?;
import java.util.List;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setUsing("image")
.setLimit(3)
.build()).get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
usingVector: "image",
limit: 3
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
Using: qdrant.PtrOf("image"),
})
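Search is performed only among vectors with the same name.
Available as of v1.7.0
If the collection was created with sparse vectors, the name of the sparse vector to use for searching should be provided:
You can still use payload filtering and the other features of the search API with sparse vectors.
However, there are important differences between dense and sparse vector search:
| Index | Sparse query | Dense query |
|---|---|---|
| Scoring metric | Defaults to Dot product, no need to specify it | Any metric supported by Distance, e.g. Dot, Cosine |
| Search type | Always exact in Qdrant | HNSW is an approximate nearest neighbor search |
| Return behaviour | Returns only vectors that have non-zero values at the same indices as the query vector | Returns limit vectors |
In general, the search speed is proportional to the number of non-zero values in the query vector.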
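To make the sparse scoring and return behaviour in the table more concrete, here is a minimal pure-Python sketch. It assumes sparse vectors are given as (indices, values) pairs, as in the request bodies in this section; the functions sparse_dot and sparse_search and the sample documents are hypothetical and only mimic the described behaviour, not Qdrant's actual index.

```python
def sparse_dot(query_indices, query_values, doc_indices, doc_values):
    """Dot product of two sparse vectors: only shared indices contribute."""
    doc = dict(zip(doc_indices, doc_values))
    return sum(v * doc[i] for i, v in zip(query_indices, query_values) if i in doc)


def sparse_search(query, docs, limit):
    """Exact search that skips documents sharing no index with the query."""
    q_idx, q_val = query
    hits = []
    for doc_id, (d_idx, d_val) in docs.items():
        if not set(q_idx) & set(d_idx):
            continue  # no overlapping non-zero index: never returned
        hits.append((doc_id, sparse_dot(q_idx, q_val, d_idx, d_val)))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:limit]


# Hypothetical stored sparse vectors, keyed by point id.
docs = {
    "a": ([1, 3], [0.5, 1.0]),
    "b": ([2, 4], [1.0, 1.0]),
    "c": ([1, 7], [1.0, 3.0]),
}
results = sparse_search(([1, 7], [2.0, 1.0]), docs, limit=3)
```

Even though limit is 3, only two hits come back here: point "b" shares no index with the query, so it is not returned at all. The work per query grows with the number of non-zero query entries, which is consistent with the speed remark above.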
POST /collections/{collection_name}/points/query
{
"query": {
"indices": [1, 7],
"values": [2.0, 1.0]
},
"using": "text",
"limit": 3
}
from qdrant_client import QdrantClient, models
client = QdrantClient(url="http://localhost:6333")
client.query_points(
collection_name="{collection_name}",
query=models.SparseVector(
indices=[1, 7],
values=[2.0, 1.0],
),
using="text",
limit=3,
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: {
indices: [1, 7],
values: [2.0, 1.0]
},
using: "text",
limit: 3,
});
use qdrant_client::qdrant::QueryPointsBuilder;
use qdrant_client::Qdrant;
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![(1, 2.0), (7, 1.0)])
.limit(3)
.using("text"),
)
.await?;
import java.util.List;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(
QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setUsing("text")
.setQuery(nearest(List.of(2.0f, 1.0f), List.of(1, 7)))
.setLimit(3)
.build())
.get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new (float, uint)[] { (2.0f, 1), (1.0f, 7) },
usingVector: "text",
limit: 3
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuerySparse(
[]uint32{1, 7},
[]float32{2.0, 1.0}),
Using: qdrant.PtrOf("text"),
})
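Filtering results by score
In addition to payload filtering, it can be useful to filter out results with a low similarity score.
For example, if you know the minimal acceptable score for your model, you may not want any results below that threshold.
In this case, you can use the score_threshold parameter of the search query.
It excludes all results with a score lower than the given value.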
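As a rough illustration, the request body below adds score_threshold to a query, and a small helper mimics what the threshold does to a result list. The helper apply_score_threshold is hypothetical and exists only for this example (the server does the filtering itself); the sample hits are taken from the example result shown earlier.

```python
# Request body for POST /collections/{collection_name}/points/query
request_body = {
    "query": [0.2, 0.1, 0.9, 0.7],
    "score_threshold": 0.75,
    "limit": 3,
}


def apply_score_threshold(hits, score_threshold):
    """Keep only hits whose score is not lower than the threshold."""
    return [hit for hit in hits if hit["score"] >= score_threshold]


hits = [
    {"id": 10, "score": 0.81},
    {"id": 14, "score": 0.75},
    {"id": 11, "score": 0.73},
]
kept = apply_score_threshold(hits, request_body["score_threshold"])
```

With a threshold of 0.75, the point with score 0.73 is dropped, so fewer than limit results can be returned.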
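Payload and vector in the result
By default, retrieval methods do not return any stored information, such as payload and vectors. The additional parameters with_vectors and with_payload change this behavior.
Example: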
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7],
"with_vectors": true,
"with_payload": true
}
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7],
with_vectors=True,
with_payload=True,
)
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7],
with_vector: true,
with_payload: true,
});
use qdrant_client::qdrant::QueryPointsBuilder;
use qdrant_client::Qdrant;
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(3)
.with_payload(true)
.with_vectors(true),
)
.await?;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.WithVectorsSelectorFactory;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
import static io.qdrant.client.WithPayloadSelectorFactory.enable;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(
QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setWithPayload(enable(true))
.setWithVectors(WithVectorsSelectorFactory.enable(true))
.setLimit(3)
.build())
.get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
payloadSelector: true,
vectorsSelector: true,
limit: 3
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
WithPayload: qdrant.NewWithPayload(true),
WithVectors: qdrant.NewWithVectors(true),
})
You can use with_payload to scope or filter a specific payload subset.
You can even specify an array of items to include, such as city, village, and town:
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7],
"with_payload": ["city", "village", "town"]
}
from qdrant_client import QdrantClient
client = QdrantClient(url="http://localhost:6333")
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7],
with_payload=["city", "village", "town"],
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7],
with_payload: ["city", "village", "town"],
});
use qdrant_client::qdrant::{with_payload_selector::SelectorOptions, QueryPointsBuilder};
use qdrant_client::Qdrant;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(3)
.with_payload(SelectorOptions::Include(
vec![
"city".to_string(),
"village".to_string(),
"town".to_string(),
]
.into(),
))
.with_vectors(true),
)
.await?;
import java.util.List;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
import static io.qdrant.client.WithPayloadSelectorFactory.include;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(
QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setWithPayload(include(List.of("city", "village", "town")))
.setLimit(3)
.build())
.get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
payloadSelector: new WithPayloadSelector
{
Include = new PayloadIncludeSelector
{
Fields = { new string[] { "city", "village", "town" } }
}
},
limit: 3
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
WithPayload: qdrant.NewWithPayloadInclude("city", "village", "town"),
})
Or use include or exclude explicitly. For example, to exclude city:
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7],
"with_payload": {
"exclude": ["city"]
}
}
from qdrant_client import QdrantClient, models
client = QdrantClient(url="http://localhost:6333")
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7],
with_payload=models.PayloadSelectorExclude(
exclude=["city"],
),
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7],
with_payload: {
exclude: ["city"],
},
});
use qdrant_client::qdrant::{with_payload_selector::SelectorOptions, QueryPointsBuilder};
use qdrant_client::Qdrant;
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(3)
.with_payload(SelectorOptions::Exclude(vec!["city".to_string()].into()))
.with_vectors(true),
)
.await?;
import java.util.List;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
import static io.qdrant.client.WithPayloadSelectorFactory.exclude;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(
QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setWithPayload(exclude(List.of("city")))
.setLimit(3)
.build())
.get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
payloadSelector: new WithPayloadSelector
{
Exclude = new PayloadExcludeSelector { Fields = { new string[] { "city" } } }
},
limit: 3
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
WithPayload: qdrant.NewWithPayloadExclude("city"),
})
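It is possible to target nested fields using dot notation:
- payload.nested_field - for a nested field
- payload.nested_array[].sub_field - for projecting nested fields within an array
Accessing array elements by index is currently not supported.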
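The path syntax above can be illustrated with a small resolver written in plain Python. This is a sketch of how such paths could be interpreted, not Qdrant's implementation; the function select_path and the sample payload are assumptions made for this example.

```python
def select_path(payload, path):
    """Return every value reached by a dot-notation path.

    A segment ending in "[]" projects over all elements of an array;
    addressing a single element by index is deliberately not handled.
    """
    current = [payload]
    for part in path.split("."):
        is_array = part.endswith("[]")
        key = part[:-2] if is_array else part
        next_level = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue  # path does not exist in this branch
            value = node[key]
            if is_array:
                if isinstance(value, list):
                    next_level.extend(value)
            else:
                next_level.append(value)
        current = next_level
    return current


# Hypothetical payload with a nested object and a nested array.
payload = {
    "country": {
        "name": "Germany",
        "cities": [
            {"name": "Berlin", "population": 3.7},
            {"name": "Munich", "population": 1.5},
        ],
    }
}

nested_field = select_path(payload, "country.name")
projected = select_path(payload, "country.cities[].name")
```

Here "country.name" plays the role of payload.nested_field, and "country.cities[].name" plays the role of payload.nested_array[].sub_field.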
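Batch search API
Available as of v0.10.0
The batch search API lets you execute multiple search requests in a single request.
Its semantics are straightforward: n batched search requests are equivalent to n individual search requests.
This approach has several advantages. Logically, fewer network connections are required, which can be very beneficial on its own.
More importantly, batched requests are processed efficiently by the query planner, which can detect and optimize requests that share the same filter.
For non-trivial filters this can have a significant effect on latency, because intermediate results can be shared among the requests.
To use it, simply pack your search requests together. All the regular attributes of a search request are, of course, available.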
POST /collections/{collection_name}/points/query/batch
{
"searches": [
{
"query": [0.2, 0.1, 0.9, 0.7],
"filter": {
"must": [
{
"key": "city",
"match": {
"value": "London"
}
}
]
},
"limit": 3
},
{
"query": [0.5, 0.3, 0.2, 0.3],
"filter": {
"must": [
{
"key": "city",
"match": {
"value": "London"
}
}
]
},
"limit": 3
}
]
}
from qdrant_client import QdrantClient, models
client = QdrantClient(url="http://localhost:6333")
filter_ = models.Filter(
must=[
models.FieldCondition(
key="city",
match=models.MatchValue(
value="London",
),
)
]
)
search_queries = [
models.QueryRequest(query=[0.2, 0.1, 0.9, 0.7], filter=filter_, limit=3),
models.QueryRequest(query=[0.5, 0.3, 0.2, 0.3], filter=filter_, limit=3),
]
client.query_batch_points(collection_name="{collection_name}", requests=search_queries)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
const filter = {
must: [
{
key: "city",
match: {
value: "London",
},
},
],
};
const searches = [
{
query: [0.2, 0.1, 0.9, 0.7],
filter,
limit: 3,
},
{
query: [0.5, 0.3, 0.2, 0.3],
filter,
limit: 3,
},
];
client.queryBatch("{collection_name}", {
searches,
});
use qdrant_client::qdrant::{Condition, Filter, QueryBatchPointsBuilder, QueryPointsBuilder};
use qdrant_client::Qdrant;
let client = Qdrant::from_url("http://localhost:6334").build()?;
let filter = Filter::must([Condition::matches("city", "London".to_string())]);
let searches = vec![
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(3)
.filter(filter.clone())
.build(),
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.5, 0.3, 0.2, 0.3])
.limit(3)
.filter(filter)
.build(),
];
client
.query_batch(QueryBatchPointsBuilder::new("{collection_name}", searches))
.await?;
import java.util.List;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.Filter;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
import static io.qdrant.client.ConditionFactory.matchKeyword;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
Filter filter = Filter.newBuilder().addMust(matchKeyword("city", "London")).build();
List<QueryPoints> searches = List.of(
QueryPoints.newBuilder()
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setFilter(filter)
.setLimit(3)
.build(),
QueryPoints.newBuilder()
.setQuery(nearest(0.5f, 0.3f, 0.2f, 0.3f))
.setFilter(filter)
.setLimit(3)
.build());
client.queryBatchAsync("{collection_name}", searches).get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
using static Qdrant.Client.Grpc.Conditions;
var client = new QdrantClient("localhost", 6334);
var filter = MatchKeyword("city", "London");
var queries = new List<QueryPoints>
{
new()
{
CollectionName = "{collection_name}",
Query = new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
Filter = filter,
Limit = 3
},
new()
{
CollectionName = "{collection_name}",
Query = new float[] { 0.5f, 0.3f, 0.2f, 0.3f },
Filter = filter,
Limit = 3
}
};
await client.QueryBatchAsync(collectionName: "{collection_name}", queries: queries);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
filter := qdrant.Filter{
Must: []*qdrant.Condition{
qdrant.NewMatch("city", "London"),
},
}
client.QueryBatch(context.Background(), &qdrant.QueryBatchPoints{
CollectionName: "{collection_name}",
QueryPoints: []*qdrant.QueryPoints{
{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
Filter: &filter,
},
{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.5, 0.3, 0.2, 0.3),
Filter: &filter,
},
},
})
The result of this API contains one array per search request.
{
"result": [
[
{ "id": 10, "score": 0.81 },
{ "id": 14, "score": 0.75 },
{ "id": 11, "score": 0.73 }
],
[
{ "id": 1, "score": 0.92 },
{ "id": 3, "score": 0.89 },
{ "id": 9, "score": 0.75 }
]
],
"status": "ok",
"time": 0.001
}
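Pagination
Available as of v0.8.3
The search and recommendation APIs allow you to skip the first results of a search and return only the results starting from a specified offset:
Example: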
POST /collections/{collection_name}/points/query
{
"query": [0.2, 0.1, 0.9, 0.7],
"with_vectors": true,
"with_payload": true,
"limit": 10,
"offset": 100
}
from qdrant_client import QdrantClient
client = QdrantClient(url="http://localhost:6333")
client.query_points(
collection_name="{collection_name}",
query=[0.2, 0.1, 0.9, 0.7],
with_vectors=True,
with_payload=True,
limit=10,
offset=100,
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
client.query("{collection_name}", {
query: [0.2, 0.1, 0.9, 0.7],
with_vector: true,
with_payload: true,
limit: 10,
offset: 100,
});
use qdrant_client::qdrant::QueryPointsBuilder;
use qdrant_client::Qdrant;
let client = Qdrant::from_url("http://localhost:6334").build()?;
client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(vec![0.2, 0.1, 0.9, 0.7])
.with_payload(true)
.with_vectors(true)
.limit(10)
.offset(100),
)
.await?;
import java.util.List;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.WithVectorsSelectorFactory;
import io.qdrant.client.grpc.Points.QueryPoints;
import static io.qdrant.client.QueryFactory.nearest;
import static io.qdrant.client.WithPayloadSelectorFactory.enable;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client.queryAsync(
QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setWithPayload(enable(true))
.setWithVectors(WithVectorsSelectorFactory.enable(true))
.setLimit(10)
.setOffset(100)
.build())
.get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
payloadSelector: true,
vectorsSelector: true,
limit: 10,
offset: 100
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
WithPayload: qdrant.NewWithPayload(true),
WithVectors: qdrant.NewWithVectors(true),
Offset: qdrant.PtrOf(uint64(100)),
})
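This is equivalent to retrieving the 11th page with 10 records per page.
Vector-based retrieval in general, and the HNSW index in particular, is not designed for pagination. It is impossible to retrieve the Nth closest vector without retrieving the first N vectors first.
However, using the offset parameter saves resources by reducing network traffic and the number of storage accesses.
When the offset parameter is used, offset + limit points have to be retrieved internally, but the payload and vectors are read from storage only for the points that are actually returned.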
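The cost model described above can be sketched as follows. The function query_page, the load_point callback, and the pre-ranked list of IDs are all hypothetical stand-ins: in a real engine the ranking comes from the vector index, and this sketch only shows which points are ranked and which are actually read from storage.

```python
def query_page(ranked_ids, load_point, limit, offset):
    """Rank offset + limit candidates, but load data only for the returned page."""
    candidates = ranked_ids[: offset + limit]  # the index must produce all of these
    page_ids = candidates[offset:]             # only these are returned
    return [load_point(point_id) for point_id in page_ids]


storage_reads = []


def load_point(point_id):
    """Stand-in for reading payload and vectors from storage."""
    storage_reads.append(point_id)
    return {"id": point_id, "payload": {}, "vector": []}


ranked_ids = list(range(1000))  # pretend these are point IDs ordered by score
page = query_page(ranked_ids, load_point, limit=10, offset=100)
page_number = 100 // 10 + 1     # offset 100 with limit 10 is page 11
```

In this sketch 110 candidates are ranked, but storage is touched only 10 times, which is where the saving from offset comes from.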
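Grouping API
Available as of v1.2.0
It is possible to group results by a certain field. This is useful when you have multiple points for the same item and want to avoid redundant entries for that item in the results.
For example, if you have a large document split into multiple chunks and you want to search or recommend on a per-document basis, you can group the results by the document ID.
Consider points with the following payloads: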
[
{
"id": 0,
"payload": {
"chunk_part": 0,
"document_id": "a"
},
"vector": [0.91]
},
{
"id": 1,
"payload": {
"chunk_part": 1,
"document_id": ["a", "b"]
},
"vector": [0.8]
},
{
"id": 2,
"payload": {
"chunk_part": 2,
"document_id": "a"
},
"vector": [0.2]
},
{
"id": 3,
"payload": {
"chunk_part": 0,
"document_id": 123
},
"vector": [0.79]
},
{
"id": 4,
"payload": {
"chunk_part": 1,
"document_id": 123
},
"vector": [0.75]
},
{
"id": 5,
"payload": {
"chunk_part": 0,
"document_id": -10
},
"vector": [0.6]
}
]
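With the groups API, you can get the best N points for each document, assuming that the payload of the points contains the document ID. There may be cases where the best N points cannot be fulfilled, because there are not enough points or because the remaining points are too distant from the query. In every case, group_size is a best-effort parameter, similar to the limit parameter.
Search groups
REST API (schema):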
POST /collections/{collection_name}/points/query/groups
{
// Same as in the regular query API
"query": [1.1],
// Grouping parameters
"group_by": "document_id", // Path of the field to group by
"limit": 4, // Max amount of groups
"group_size": 2 // Max amount of points per group
}
client.query_points_groups(
collection_name="{collection_name}",
# Same as in the regular query_points() API
query=[1.1],
# Grouping parameters
group_by="document_id", # Path of the field to group by
limit=4, # Max amount of groups
group_size=2, # Max amount of points per group
)
client.queryGroups("{collection_name}", {
query: [1.1],
group_by: "document_id",
limit: 4,
group_size: 2,
});
use qdrant_client::qdrant::QueryPointGroupsBuilder;
client
.query_groups(
QueryPointGroupsBuilder::new("{collection_name}", "document_id")
.query(vec![0.2, 0.1, 0.9, 0.7])
.group_size(2u64)
.with_payload(true)
.with_vectors(true)
.limit(4u64),
)
.await?;
import java.util.List;
import io.qdrant.client.grpc.Points.QueryPointGroups;
import static io.qdrant.client.QueryFactory.nearest;
client.queryGroupsAsync(
QueryPointGroups.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setGroupBy("document_id")
.setLimit(4)
.setGroupSize(2)
.build())
.get();
using Qdrant.Client;
var client = new QdrantClient("localhost", 6334);
await client.QueryGroupsAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
groupBy: "document_id",
limit: 4,
groupSize: 2
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.QueryGroups(context.Background(), &qdrant.QueryPointGroups{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
GroupBy: "document_id",
GroupSize: qdrant.PtrOf(uint64(2)),
})
The output of a groups call looks like this:
{
"result": {
"groups": [
{
"id": "a",
"hits": [
{ "id": 0, "score": 0.91 },
{ "id": 1, "score": 0.85 }
]
},
{
"id": "b",
"hits": [
{ "id": 1, "score": 0.85 }
]
},
{
"id": 123,
"hits": [
{ "id": 3, "score": 0.79 },
{ "id": 4, "score": 0.75 }
]
},
{
"id": -10,
"hits": [
{ "id": 5, "score": 0.6 }
]
}
]
},
"status": "ok",
"time": 0.001
}
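The groups are ordered by the score of the top point in each group. Within each group, the points are sorted too.
If the group_by field of a point is an array (e.g. "document_id": ["a", "b"]), the point can be included in multiple groups (e.g. "document_id": "a" and "document_id": "b").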
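The grouping rules can be reproduced with a short pure-Python sketch. It assumes the hits have already been scored; the scores for points 0, 1, 3, 4, and 5 are the ones from the example output, while the score for point 2 is an assumption, since that point does not appear in the output. The function group_hits is hypothetical and is not how Qdrant implements grouping internally.

```python
def group_hits(hits, group_by, limit, group_size):
    """Group scored hits by a payload field, best-effort up to group_size each."""
    groups = {}
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        value = hit["payload"].get(group_by)
        # An array value places the point into several groups.
        keys = value if isinstance(value, list) else [value]
        for key in keys:
            if key is None:
                continue  # points without the field are not grouped
            members = groups.setdefault(key, [])
            if len(members) < group_size:
                members.append({"id": hit["id"], "score": hit["score"]})
    # Groups are ordered by the score of their top point.
    ordered = sorted(groups.items(), key=lambda kv: kv[1][0]["score"], reverse=True)
    return [{"id": key, "hits": members} for key, members in ordered[:limit]]


hits = [
    {"id": 0, "score": 0.91, "payload": {"document_id": "a"}},
    {"id": 1, "score": 0.85, "payload": {"document_id": ["a", "b"]}},
    {"id": 2, "score": 0.20, "payload": {"document_id": "a"}},  # assumed score
    {"id": 3, "score": 0.79, "payload": {"document_id": 123}},
    {"id": 4, "score": 0.75, "payload": {"document_id": 123}},
    {"id": 5, "score": 0.60, "payload": {"document_id": -10}},
]
result = group_hits(hits, group_by="document_id", limit=4, group_size=2)
```

Point 1 lands in both group "a" and group "b", and point 2 is left out because group "a" already holds group_size points with better scores.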
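Limitations:
Lookup in groups
Available as of v1.3.0
Having multiple points for parts of the same item often introduces redundancy in the stored data. That may be fine if the information shared by the points is small, but it can become a problem with larger payloads, because the storage space needed for the shared data is multiplied by the number of points in the group.
One way to optimize storage when using groups is to store the information shared by the points with the same group ID in a single point in another collection. Then, when using the groups API, add the with_lookup parameter to bring that information into each group.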

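This has the extra benefit that when the information shared by the points in a group changes, only the single point needs to be updated.
For example, if you have a collection of documents, you may want to chunk them and store the points for the chunks in a separate collection, making sure that the payload of each chunk point stores the ID of the point for the document it belongs to.
In this case, to bring the information from the documents into the chunks grouped by document ID, you can use the with_lookup parameter: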
POST /collections/chunks/points/query/groups
{
// Same as in the regular query API
"query": [1.1],
// Grouping parameters
"group_by": "document_id",
"limit": 2,
"group_size": 2,
// Lookup parameters
"with_lookup": {
// Name of the collection to look up points in
"collection": "documents",
// Options for specifying what to bring from the payload
// of the looked up point, true by default
"with_payload": ["title", "text"],
// Options for specifying what to bring from the vector(s)
// of the looked up point, true by default
"with_vectors": false
}
}
client.query_points_groups(
collection_name="chunks",
# Same as in the regular search() API
query=[1.1],
# Grouping parameters
group_by="document_id", # Path of the field to group by
limit=2, # Max amount of groups
group_size=2, # Max amount of points per group
# Lookup parameters
with_lookup=models.WithLookup(
# Name of the collection to look up points in
collection="documents",
# Options for specifying what to bring from the payload
# of the looked up point, True by default
with_payload=["title", "text"],
# Options for specifying what to bring from the vector(s)
# of the looked up point, True by default
with_vectors=False,
),
)
client.queryGroups("{collection_name}", {
query: [1.1],
group_by: "document_id",
limit: 2,
group_size: 2,
with_lookup: {
collection: "documents",
with_payload: ["title", "text"],
with_vectors: false,
},
});
use qdrant_client::qdrant::{with_payload_selector::SelectorOptions, QueryPointGroupsBuilder, WithLookupBuilder};
client
.query_groups(
QueryPointGroupsBuilder::new("{collection_name}", "document_id")
.query(vec![0.2, 0.1, 0.9, 0.7])
.limit(2u64)
.group_size(2u64)
.with_lookup(
WithLookupBuilder::new("documents")
.with_payload(SelectorOptions::Include(
vec!["title".to_string(), "text".to_string()].into(),
))
.with_vectors(false),
),
)
.await?;
import java.util.List;
import io.qdrant.client.grpc.Points.QueryPointGroups;
import io.qdrant.client.grpc.Points.WithLookup;
import static io.qdrant.client.QueryFactory.nearest;
import static io.qdrant.client.WithVectorsSelectorFactory.enable;
import static io.qdrant.client.WithPayloadSelectorFactory.include;
client.queryGroupsAsync(
QueryPointGroups.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(nearest(0.2f, 0.1f, 0.9f, 0.7f))
.setGroupBy("document_id")
.setLimit(2)
.setGroupSize(2)
.setWithLookup(
WithLookup.newBuilder()
.setCollection("documents")
.setWithPayload(include(List.of("title", "text")))
.setWithVectors(enable(false))
.build())
.build())
.get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
var client = new QdrantClient("localhost", 6334);
await client.QueryGroupsAsync(
collectionName: "{collection_name}",
query: new float[] { 0.2f, 0.1f, 0.9f, 0.7f },
groupBy: "document_id",
limit: 2,
groupSize: 2,
withLookup: new WithLookup
{
Collection = "documents",
WithPayload = new WithPayloadSelector
{
Include = new PayloadIncludeSelector { Fields = { new string[] { "title", "text" } } }
},
WithVectors = false
}
);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.QueryGroups(context.Background(), &qdrant.QueryPointGroups{
CollectionName: "{collection_name}",
Query: qdrant.NewQuery(0.2, 0.1, 0.9, 0.7),
GroupBy: "document_id",
GroupSize: qdrant.PtrOf(uint64(2)),
WithLookup: &qdrant.WithLookup{
Collection: "documents",
WithPayload: qdrant.NewWithPayloadInclude("title", "text"),
},
})
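For the with_lookup parameter, you can also use the shorthand with_lookup="documents" to bring the whole payload and vector(s) without specifying them explicitly.
The looked-up result will show up under lookup in each group.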
{
"result": {
"groups": [
{
"id": 1,
"hits": [
{ "id": 0, "score": 0.91 },
{ "id": 1, "score": 0.85 }
],
"lookup": {
"id": 1,
"payload": {
"title": "Document A",
"text": "This is document A"
}
}
},
{
"id": 2,
"hits": [
{ "id": 1, "score": 0.85 }
],
"lookup": {
"id": 2,
"payload": {
"title": "Document B",
"text": "This is document B"
}
}
}
]
},
"status": "ok",
"time": 0.001
}
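Since the lookup is done by matching directly against the point ID, any group ID that is not an existing (and valid) point ID in the lookup collection is ignored, and the lookup field will be empty.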
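The matching rule can be sketched in plain Python. Here the lookup collection is modelled as a dictionary from point ID to payload, which is an assumption for illustration; the function attach_lookup, the author field, and the group with ID 99 are hypothetical.

```python
def attach_lookup(groups, lookup_collection, with_payload):
    """Attach the looked-up point to each group by matching group ID to point ID."""
    enriched = []
    for group in groups:
        point = lookup_collection.get(group["id"])
        lookup = None  # stays empty when no point has this ID
        if point is not None:
            lookup = {
                "id": group["id"],
                "payload": {key: point[key] for key in with_payload if key in point},
            }
        enriched.append({**group, "lookup": lookup})
    return enriched


# Hypothetical "documents" collection: point ID -> payload.
documents = {
    1: {"title": "Document A", "text": "This is document A", "author": "n/a"},
    2: {"title": "Document B", "text": "This is document B", "author": "n/a"},
}
groups = [
    {"id": 1, "hits": [{"id": 0, "score": 0.91}, {"id": 1, "score": 0.85}]},
    {"id": 2, "hits": [{"id": 1, "score": 0.85}]},
    {"id": 99, "hits": [{"id": 7, "score": 0.5}]},  # no such document point
]
result = attach_lookup(groups, documents, with_payload=["title", "text"])
```

The first two groups receive only the title and text fields, mirroring with_payload, while the group with ID 99 keeps an empty lookup because no document point has that ID.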
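Random sampling
Available as of v1.11.0
In some cases it can be useful to retrieve a random sample of points from the collection, for example for debugging, testing, or providing entry points for exploration.
The random sampling API is part of the universal Query API and can be used in the same way as the regular search API.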
{
"collection_name": "{collection_name}",
"query": {
"sample": "random"
}
}
from qdrant_client import QdrantClient, models
client = QdrantClient(url="http://localhost:6333")
sampled = client.query_points(
collection_name="{collection_name}",
query=models.SampleQuery(sample=models.Sample.RANDOM)
)
import { QdrantClient } from "@qdrant/js-client-rest";
const client = new QdrantClient({ host: "localhost", port: 6333 });
const sampled = await client.query("{collection_name}", {
query: {
sample: "random",
},
});
use qdrant_client::Qdrant;
use qdrant_client::qdrant::{Query, QueryPointsBuilder, Sample};
let client = Qdrant::from_url("http://localhost:6334").build()?;
let sampled = client
.query(
QueryPointsBuilder::new("{collection_name}")
.query(Query::new_sample(Sample::Random))
)
.await?;
import static io.qdrant.client.QueryFactory.sample;
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.QueryPoints;
import io.qdrant.client.grpc.Points.Sample;
QdrantClient client =
new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());
client
.queryAsync(
QueryPoints.newBuilder()
.setCollectionName("{collection_name}")
.setQuery(sample(Sample.Random))
.build())
.get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
var client = new QdrantClient("localhost", 6334);
await client.QueryAsync(collectionName: "{collection_name}", query: Sample.Random);
import (
"context"
"github.com/qdrant/go-client/qdrant"
)
client, err := qdrant.NewClient(&qdrant.Config{
Host: "localhost",
Port: 6334,
})
client.Query(context.Background(), &qdrant.QueryPoints{
CollectionName: "{collection_name}",
Query: qdrant.NewQuerySample(qdrant.Sample_Random),
})
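Query planning
Depending on the filter used in the search, there are several possible scenarios for query execution. Qdrant chooses one of the query execution options depending on the available indexes, the complexity of the conditions, and the cardinality of the filtering result. This process is called query planning.
The strategy selection process relies heavily on heuristics and can vary from release to release. However, the general principles are:
- planning is performed for each segment independently (see Storage for more information about segments)
- prefer a full scan if the number of points is below a threshold
- estimate the cardinality of the filtered result before selecting a strategy
- retrieve points using the payload index (see Indexing) if the cardinality is below a threshold
- use the filterable vector index if the cardinality is above a threshold
You can adjust the threshold using a configuration file, as well as independently for each collection.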
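The general principles above can be condensed into a small decision function. This is a hedged sketch only: the real planner is heuristic and changes between releases, and the function choose_strategy, the two threshold parameters, their values, and the segment numbers are all assumptions made for this example.

```python
def choose_strategy(points_in_segment, estimated_cardinality,
                    full_scan_threshold, cardinality_threshold):
    """Pick an execution strategy for one segment (illustrative heuristic)."""
    if points_in_segment < full_scan_threshold:
        return "full_scan"                 # small segment: just scan everything
    if estimated_cardinality < cardinality_threshold:
        return "payload_index"             # few matches: fetch them via the payload index
    return "filterable_vector_index"       # many matches: use the vector index with the filter


# Hypothetical segments; planning is done for each one independently.
segments = [
    {"points": 500, "estimated_cardinality": 40},
    {"points": 200_000, "estimated_cardinality": 300},
    {"points": 200_000, "estimated_cardinality": 90_000},
]
plan = [
    choose_strategy(s["points"], s["estimated_cardinality"],
                    full_scan_threshold=10_000, cardinality_threshold=10_000)
    for s in segments
]
```

With these made-up numbers, the three segments end up with three different strategies, which shows why a single query can be executed differently in different parts of the same collection.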
