Using Weaviate for Embeddings Search

Jun 28, 2023

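This notebook takes you through a simple flow to download some data, embed it, and then index and search it using a vector database, here Weaviate. This is a common requirement for customers who want to store and search our embeddings with their own data in a secure environment to support production use cases such as chatbots, topic modelling and more.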

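What is a Vector Database

A vector database is a database made to store, manage and search embedding vectors. The use of embeddings to encode unstructured data (text, audio, video and more) as vectors for consumption by machine-learning models has exploded in recent years, due to the increasing effectiveness of AI in solving use cases involving natural language, image recognition and other unstructured forms of data. Vector databases have emerged as an effective solution for enterprises to deliver and scale these use cases.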
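
To make the idea of "searching embedding vectors" concrete, here is a minimal sketch of a brute-force nearest-neighbour search using cosine similarity. The three-dimensional vectors and the names `cosine_similarity`, `nearest` and `toy_index` are illustrative assumptions, not part of the notebook; a real vector database replaces the linear scan with an approximate index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, index, top_k=2):
    """Return the top_k (key, similarity) pairs, most similar first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
toy_index = {
    "art": [0.9, 0.1, 0.0],
    "painting": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.2, 0.9],
}

results = nearest([1.0, 0.0, 0.0], toy_index)
for key, score in results:
    print(f"{key}: {score:.3f}")
```

The query vector points along the first axis, so the two keys whose vectors lean the same way rank first.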

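Why use a Vector Database

Vector databases enable enterprises to take many of the embeddings use cases we've shared in this repo (question answering, chatbot and recommendation services, for example) and use them in a secure, scalable environment. Many of our customers have solved their problems with embeddings at small scale, but performance and security concerns hold them back from going into production. We see vector databases as a key component in solving that, and in this guide we'll walk through the basics of embedding text data, storing it in a vector database and using it for semantic search.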

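Demo Flow

The demo flow is:

  • Setup: Import packages and set any required variables
  • Load data: Load a dataset and embed it using OpenAI embeddings
  • Weaviate
    • Setup: Here we'll set up the Python client for Weaviate. For more details, see the Weaviate Python client documentation
    • Index data: We'll create an index with title search vectors in it
    • Search data: We'll run a few searches to confirm it works

Once you've run through this notebook you should have a basic understanding of how to set up and use vector databases, and can move on to more complex use cases that make use of our embeddings.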

Setup

Import the required libraries and set the embedding model that we'd like to use.

# We'll need to install the Weaviate client
!pip install weaviate-client

#Install wget to pull zip file
!pip install wget
import openai

from typing import List, Iterator
import pandas as pd
import numpy as np
import os
import wget
from ast import literal_eval

# Weaviate's client library for Python
import weaviate

# Embedding model used to embed search queries. It must match the model that produced the
# stored vectors; the schema below is configured for ada-002, so we use the same model here.
EMBEDDING_MODEL = "text-embedding-ada-002"

# Ignore unclosed SSL socket warnings - optional in case you get these errors
import warnings

warnings.filterwarnings(action="ignore", message="unclosed", category=ResourceWarning)
warnings.filterwarnings("ignore", category=DeprecationWarning) 

Load data

In this section we'll load embedded data that we prepared before this session.

embeddings_url = 'https://cdn.openai.com/API/examples/data/vector_database_wikipedia_articles_embedded.zip'

# The file is ~700 MB so this will take some time
wget.download(embeddings_url)
import zipfile
with zipfile.ZipFile("vector_database_wikipedia_articles_embedded.zip","r") as zip_ref:
    zip_ref.extractall("../data")
article_df = pd.read_csv('../data/vector_database_wikipedia_articles_embedded.csv')
article_df.head()
   id  url                                       title   text                                               title_vector                                       content_vector                                     vector_id
0   1  https://simple.wikipedia.org/wiki/April   April   April is the fourth month of the year in the J...  [0.001009464613161981, -0.020700545981526375, ...  [-0.011253940872848034, -0.013491976074874401,...  0
1   2  https://simple.wikipedia.org/wiki/August  August  August (Aug.) is the eighth month of the year ...  [0.0009286514250561595, 0.000820168002974242, ...  [0.0003609954728744924, 0.007262262050062418, ...  1
2   6  https://simple.wikipedia.org/wiki/Art     Art     Art is a creative activity that expresses imag...  [0.003393713850528002, 0.0061537534929811954, ...  [-0.004959689453244209, 0.015772193670272827, ...  2
3   8  https://simple.wikipedia.org/wiki/A       A       A or a is the first letter of the English alph...  [0.0153952119871974, -0.013759135268628597, 0....  [0.024894846603274345, -0.022186409682035446, ...  3
4   9  https://simple.wikipedia.org/wiki/Air     Air     Air refers to the Earth's atmosphere. Air is a...  [0.02224554680287838, -0.02044147066771984, -0...  [0.021524671465158463, 0.018522677943110466, -...  4
# Read vectors from strings back into a list
article_df['title_vector'] = article_df.title_vector.apply(literal_eval)
article_df['content_vector'] = article_df.content_vector.apply(literal_eval)

# Set vector_id to be a string
article_df['vector_id'] = article_df['vector_id'].apply(str)
article_df.info(show_counts=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25000 entries, 0 to 24999
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   id              25000 non-null  int64 
 1   url             25000 non-null  object
 2   title           25000 non-null  object
 3   text            25000 non-null  object
 4   title_vector    25000 non-null  object
 5   content_vector  25000 non-null  object
 6   vector_id       25000 non-null  object
dtypes: int64(1), object(6)
memory usage: 1.3+ MB
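
The `literal_eval` step above is needed because the CSV stores each vector as a string. A minimal sketch of what that conversion does, using a made-up three-element vector (the values are illustrative, not taken from the dataset):

```python
from ast import literal_eval

# A vector as it would appear after pd.read_csv: a string, not a list.
raw = "[0.1, -0.2, 0.3]"

vector = literal_eval(raw)
print(type(raw).__name__, "->", type(vector).__name__, "of length", len(vector))

# literal_eval only accepts Python literals, so arbitrary expressions are rejected.
try:
    literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
print("expression rejected:", rejected)
```

Because `literal_eval` refuses anything that is not a plain literal, it is a safer choice than `eval` for parsing data read from a file.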

Weaviate

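Weaviate offers both a managed SaaS option and a self-hosted open-source option. We'll try the self-hosted option here.

For this we will:

  • Set up a local deployment of Weaviate
  • Create indices in Weaviate
  • Store our data there
  • Fire some similarity search queries
  • Try a real use case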

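Bring your own vectors approach

In this cookbook, we provide the data with already generated vectors. This is a good approach for scenarios where your data is already vectorized.

Automated vectorization with OpenAI module

For scenarios where your data is not vectorized yet, you can delegate the vectorization task with OpenAI to Weaviate. Weaviate offers a built-in module, text2vec-openai, which takes care of the vectorization for you:

  • at import
  • for any CRUD operations
  • for semantic search

Check out the Getting Started with Weaviate and OpenAI module cookbook to learn, step by step, how to import and vectorize data in one step.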

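Setup

To run Weaviate locally, you'll need Docker. Following the instructions in the Weaviate documentation, we created an example docker-compose.yml file in this repo, saved at ./weaviate/docker-compose.yml

After starting Docker, you can start Weaviate locally by navigating to the examples/vector_databases/weaviate/ directory and running docker-compose up -d.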

SaaS

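Alternatively, you can use Weaviate Cloud Service (WCS) to create a free Weaviate cluster.

  1. Create a free account and/or log in to WCS
  2. Create a Weaviate Cluster with the following settings:
    • Sandbox: Sandbox Free
    • Weaviate Version: use the default (latest)
    • OIDC Authentication: Disabled
  3. Your instance should be ready in a minute or two
  4. Make a note of the Cluster Id. The link will take you to the full path of your cluster (you will need it later to connect to it). It should be something like: https://your-project-name-suffix.weaviate.network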
# Run only ONE of the two options below: each assigns `client`, so the one run last wins.

# Option #1 - Self-hosted - Weaviate Open Source
client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")
    }
)
# Option #2 - SaaS - (Weaviate Cloud Service)
client = weaviate.Client(
    url="https://your-wcs-instance-name.weaviate.network",
    additional_headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")
    }
)
client.is_ready()
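
Since the two connection options differ only in their URL, one way to avoid editing the cell is to read the endpoint from the environment. This is a sketch under assumptions: the `WEAVIATE_URL` variable and the helper names `weaviate_url` and `openai_headers` are hypothetical and not part of the notebook or the Weaviate client.

```python
import os

def weaviate_url(default="http://localhost:8080"):
    """Return WEAVIATE_URL (hypothetical variable) if set, else the local default."""
    return os.getenv("WEAVIATE_URL", default)

def openai_headers():
    """Build the header dict used above, failing early if the API key is missing."""
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {"X-OpenAI-Api-Key": api_key}

print("Connecting to:", weaviate_url())

# Intended usage with the client from this notebook (not executed here):
# client = weaviate.Client(url=weaviate_url(), additional_headers=openai_headers())
```

Failing early on a missing key avoids passing `None` as a header value, which is what `os.getenv("OPENAI_API_KEY")` returns when the variable is unset.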

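Index data

In Weaviate you create schemas to capture each of the entities you will be searching.

In this case we'll create a schema called Article with the title vector from above included for us to search by.

The next few steps closely follow the documentation provided by Weaviate.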

# Clear up the schema, so that we can recreate it
client.schema.delete_all()
client.schema.get()

# Define the Schema object: text2vec-openai (ada-002) vectorizes `title`; vectorization is skipped for `content`
article_schema = {
    "class": "Article",
    "description": "A collection of articles",
    "vectorizer": "text2vec-openai",
    "moduleConfig": {
        "text2vec-openai": {
          "model": "ada",
          "modelVersion": "002",
          "type": "text"
        }
    },
    "properties": [{
        "name": "title",
        "description": "Title of the article",
        "dataType": ["string"]
    },
    {
        "name": "content",
        "description": "Contents of the article",
        "dataType": ["text"],
        "moduleConfig": { "text2vec-openai": { "skip": True } }
    }]
}

# add the Article schema
client.schema.create_class(article_schema)

# get the schema to make sure it worked
client.schema.get()
{'classes': [{'class': 'Article',
   'description': 'A collection of articles',
   'invertedIndexConfig': {'bm25': {'b': 0.75, 'k1': 1.2},
    'cleanupIntervalSeconds': 60,
    'stopwords': {'additions': None, 'preset': 'en', 'removals': None}},
   'moduleConfig': {'text2vec-openai': {'model': 'ada',
     'modelVersion': '002',
     'type': 'text',
     'vectorizeClassName': True}},
   'properties': [{'dataType': ['string'],
     'description': 'Title of the article',
     'moduleConfig': {'text2vec-openai': {'skip': False,
       'vectorizePropertyName': False}},
     'name': 'title',
     'tokenization': 'word'},
    {'dataType': ['text'],
     'description': 'Contents of the article',
     'moduleConfig': {'text2vec-openai': {'skip': True,
       'vectorizePropertyName': False}},
     'name': 'content',
     'tokenization': 'word'}],
   'replicationConfig': {'factor': 1},
   'shardingConfig': {'virtualPerPhysical': 128,
    'desiredCount': 1,
    'actualCount': 1,
    'desiredVirtualCount': 128,
    'actualVirtualCount': 128,
    'key': '_id',
    'strategy': 'hash',
    'function': 'murmur3'},
   'vectorIndexConfig': {'skip': False,
    'cleanupIntervalSeconds': 300,
    'maxConnections': 64,
    'efConstruction': 128,
    'ef': -1,
    'dynamicEfMin': 100,
    'dynamicEfMax': 500,
    'dynamicEfFactor': 8,
    'vectorCacheMaxObjects': 1000000000000,
    'flatSearchCutoff': 40000,
    'distance': 'cosine'},
   'vectorIndexType': 'hnsw',
   'vectorizer': 'text2vec-openai'}]}
### Step 1 - configure Weaviate Batch, which optimizes CRUD operations in bulk
# - starting batch size of 100
# - dynamically increase/decrease based on performance
# - add timeout retries if something goes wrong

client.batch.configure(
    batch_size=100,
    dynamic=True,
    timeout_retries=3,
)
<weaviate.batch.crud_batch.Batch at 0x3f0ca0fa0>
### Step 2 - import data

print("Uploading data with vectors to Article schema..")

counter=0

with client.batch as batch:
    for k,v in article_df.iterrows():
        
        # print update message every 100 objects        
        if (counter %100 == 0):
            print(f"Import {counter} / {len(article_df)} ")
        
        properties = {
            "title": v["title"],
            "content": v["text"]
        }
        
        vector = v["title_vector"]
        
        batch.add_data_object(properties, "Article", None, vector)
        counter = counter+1

print(f"Importing ({len(article_df)}) Articles complete")  
Uploading data with vectors to Article schema..
Import 0 / 25000 
Import 100 / 25000 
Import 200 / 25000 
Import 300 / 25000 
Import 400 / 25000 
Import 500 / 25000 
Import 600 / 25000 
Import 700 / 25000 
Import 800 / 25000 
Import 900 / 25000 
Import 1000 / 25000 
Import 1100 / 25000 
Import 1200 / 25000 
Import 1300 / 25000 
Import 1400 / 25000 
Import 1500 / 25000 
Import 1600 / 25000 
Import 1700 / 25000 
Import 1800 / 25000 
Import 1900 / 25000 
Import 2000 / 25000 
Import 2100 / 25000 
Import 2200 / 25000 
Import 2300 / 25000 
Import 2400 / 25000 
Import 2500 / 25000 
Import 2600 / 25000 
Import 2700 / 25000 
Import 2800 / 25000 
Import 2900 / 25000 
Import 3000 / 25000 
Import 3100 / 25000 
Import 3200 / 25000 
Import 3300 / 25000 
Import 3400 / 25000 
Import 3500 / 25000 
Import 3600 / 25000 
Import 3700 / 25000 
Import 3800 / 25000 
Import 3900 / 25000 
Import 4000 / 25000 
Import 4100 / 25000 
Import 4200 / 25000 
Import 4300 / 25000 
Import 4400 / 25000 
Import 4500 / 25000 
Import 4600 / 25000 
Import 4700 / 25000 
Import 4800 / 25000 
Import 4900 / 25000 
Import 5000 / 25000 
Import 5100 / 25000 
Import 5200 / 25000 
Import 5300 / 25000 
Import 5400 / 25000 
Import 5500 / 25000 
Import 5600 / 25000 
Import 5700 / 25000 
Import 5800 / 25000 
Import 5900 / 25000 
Import 6000 / 25000 
Import 6100 / 25000 
Import 6200 / 25000 
Import 6300 / 25000 
Import 6400 / 25000 
Import 6500 / 25000 
Import 6600 / 25000 
Import 6700 / 25000 
Import 6800 / 25000 
Import 6900 / 25000 
Import 7000 / 25000 
Import 7100 / 25000 
Import 7200 / 25000 
Import 7300 / 25000 
Import 7400 / 25000 
Import 7500 / 25000 
Import 7600 / 25000 
Import 7700 / 25000 
Import 7800 / 25000 
Import 7900 / 25000 
Import 8000 / 25000 
Import 8100 / 25000 
Import 8200 / 25000 
Import 8300 / 25000 
Import 8400 / 25000 
Import 8500 / 25000 
Import 8600 / 25000 
Import 8700 / 25000 
Import 8800 / 25000 
Import 8900 / 25000 
Import 9000 / 25000 
Import 9100 / 25000 
Import 9200 / 25000 
Import 9300 / 25000 
Import 9400 / 25000 
Import 9500 / 25000 
Import 9600 / 25000 
Import 9700 / 25000 
Import 9800 / 25000 
Import 9900 / 25000 
Import 10000 / 25000 
Import 10100 / 25000 
Import 10200 / 25000 
Import 10300 / 25000 
Import 10400 / 25000 
Import 10500 / 25000 
Import 10600 / 25000 
Import 10700 / 25000 
Import 10800 / 25000 
Import 10900 / 25000 
Import 11000 / 25000 
Import 11100 / 25000 
Import 11200 / 25000 
Import 11300 / 25000 
Import 11400 / 25000 
Import 11500 / 25000 
Import 11600 / 25000 
Import 11700 / 25000 
Import 11800 / 25000 
Import 11900 / 25000 
Import 12000 / 25000 
Import 12100 / 25000 
Import 12200 / 25000 
Import 12300 / 25000 
Import 12400 / 25000 
Import 12500 / 25000 
Import 12600 / 25000 
Import 12700 / 25000 
Import 12800 / 25000 
Import 12900 / 25000 
Import 13000 / 25000 
Import 13100 / 25000 
Import 13200 / 25000 
Import 13300 / 25000 
Import 13400 / 25000 
Import 13500 / 25000 
Import 13600 / 25000 
Import 13700 / 25000 
Import 13800 / 25000 
Import 13900 / 25000 
Import 14000 / 25000 
Import 14100 / 25000 
Import 14200 / 25000 
Import 14300 / 25000 
Import 14400 / 25000 
Import 14500 / 25000 
Import 14600 / 25000 
Import 14700 / 25000 
Import 14800 / 25000 
Import 14900 / 25000 
Import 15000 / 25000 
Import 15100 / 25000 
Import 15200 / 25000 
Import 15300 / 25000 
Import 15400 / 25000 
Import 15500 / 25000 
Import 15600 / 25000 
Import 15700 / 25000 
Import 15800 / 25000 
Import 15900 / 25000 
Import 16000 / 25000 
Import 16100 / 25000 
Import 16200 / 25000 
Import 16300 / 25000 
Import 16400 / 25000 
Import 16500 / 25000 
Import 16600 / 25000 
Import 16700 / 25000 
Import 16800 / 25000 
Import 16900 / 25000 
Import 17000 / 25000 
Import 17100 / 25000 
Import 17200 / 25000 
Import 17300 / 25000 
Import 17400 / 25000 
Import 17500 / 25000 
Import 17600 / 25000 
Import 17700 / 25000 
Import 17800 / 25000 
Import 17900 / 25000 
Import 18000 / 25000 
Import 18100 / 25000 
Import 18200 / 25000 
Import 18300 / 25000 
Import 18400 / 25000 
Import 18500 / 25000 
Import 18600 / 25000 
Import 18700 / 25000 
Import 18800 / 25000 
Import 18900 / 25000 
Import 19000 / 25000 
Import 19100 / 25000 
Import 19200 / 25000 
Import 19300 / 25000 
Import 19400 / 25000 
Import 19500 / 25000 
Import 19600 / 25000 
Import 19700 / 25000 
Import 19800 / 25000 
Import 19900 / 25000 
Import 20000 / 25000 
Import 20100 / 25000 
Import 20200 / 25000 
Import 20300 / 25000 
Import 20400 / 25000 
Import 20500 / 25000 
Import 20600 / 25000 
Import 20700 / 25000 
Import 20800 / 25000 
Import 20900 / 25000 
Import 21000 / 25000 
Import 21100 / 25000 
Import 21200 / 25000 
Import 21300 / 25000 
Import 21400 / 25000 
Import 21500 / 25000 
Import 21600 / 25000 
Import 21700 / 25000 
Import 21800 / 25000 
Import 21900 / 25000 
Import 22000 / 25000 
Import 22100 / 25000 
Import 22200 / 25000 
Import 22300 / 25000 
Import 22400 / 25000 
Import 22500 / 25000 
Import 22600 / 25000 
Import 22700 / 25000 
Import 22800 / 25000 
Import 22900 / 25000 
Import 23000 / 25000 
Import 23100 / 25000 
Import 23200 / 25000 
Import 23300 / 25000 
Import 23400 / 25000 
Import 23500 / 25000 
Import 23600 / 25000 
Import 23700 / 25000 
Import 23800 / 25000 
Import 23900 / 25000 
Import 24000 / 25000 
Import 24100 / 25000 
Import 24200 / 25000 
Import 24300 / 25000 
Import 24400 / 25000 
Import 24500 / 25000 
Import 24600 / 25000 
Import 24700 / 25000 
Import 24800 / 25000 
Import 24900 / 25000 
Importing (25000) Articles complete
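
The batch configuration and import loop above send objects to Weaviate in groups rather than one request per object. The sketch below illustrates only the fixed-size grouping idea in plain Python; it is not the client's implementation, and the dynamic sizing and timeout retries configured above are not modelled. The helper name `batched` is an assumption for illustration.

```python
from itertools import islice

def batched(items, batch_size=100):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk

# 250 stand-in objects split into batches of 100.
sizes = [len(chunk) for chunk in batched(range(250), batch_size=100)]
print(sizes)  # [100, 100, 50]
```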
# Test that all data has loaded – get object count
result = (
    client.query.aggregate("Article")
    .with_fields("meta { count }")
    .do()
)
print("Object count: ", result["data"]["Aggregate"]["Article"])
Object count:  [{'meta': {'count': 25000}}]
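
The count check above can be turned into an assertion so that a partial import fails loudly. A minimal sketch, assuming the response has the shape printed above; `object_count` is a hypothetical helper and `sample` is a hand-written stand-in for `result`:

```python
def object_count(aggregate_result, class_name):
    """Extract meta.count from an Aggregate response shaped like the one above."""
    groups = aggregate_result["data"]["Aggregate"][class_name]
    return groups[0]["meta"]["count"]

# Stand-in for `result`, copied from the printed output above.
sample = {"data": {"Aggregate": {"Article": [{"meta": {"count": 25000}}]}}}

expected = 25000  # in the notebook this would be len(article_df)
count = object_count(sample, "Article")
assert count == expected, f"expected {expected} objects, found {count}"
print("Object count verified:", count)
```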
# Test one article has worked by checking one object
test_article = (
    client.query
    .get("Article", ["title", "content", "_additional {id}"])
    .with_limit(1)
    .do()
)["data"]["Get"]["Article"][0]

print(test_article["_additional"]["id"])
print(test_article["title"])
print(test_article["content"])
000393f2-1182-4e3d-abcf-4217eda64be0
Lago d'Origlio
Lago d'Origlio is a lake in the municipality of Origlio, in Ticino, Switzerland.

Lakes of Ticino

Search data

As above, we'll fire some queries at our new index and get back results based on their closeness to our existing vectors.

def query_weaviate(query, collection_name, top_k=20):

    # Creates embedding vector from user query
    # (openai.Embedding.create is the interface of the openai SDK before version 1.0)
    embedded_query = openai.Embedding.create(
        input=query,
        model=EMBEDDING_MODEL,
    )["data"][0]['embedding']
    
    near_vector = {"vector": embedded_query}

    # Queries input schema with vectorised user query
    query_result = (
        client.query
        .get(collection_name, ["title", "content", "_additional {certainty distance}"])
        .with_near_vector(near_vector)
        .with_limit(top_k)
        .do()
    )
    
    return query_result
query_result = query_weaviate("modern art in Europe", "Article")
counter = 0
for article in query_result["data"]["Get"]["Article"]:
    counter += 1
    print(f"{counter}. { article['title']} (Certainty: {round(article['_additional']['certainty'],3) }) (Distance: {round(article['_additional']['distance'],3) })")
1. Museum of Modern Art (Certainty: 0.938) (Distance: 0.125)
2. Western Europe (Certainty: 0.934) (Distance: 0.133)
3. Renaissance art (Certainty: 0.932) (Distance: 0.136)
4. Pop art (Certainty: 0.93) (Distance: 0.14)
5. Northern Europe (Certainty: 0.927) (Distance: 0.145)
6. Hellenistic art (Certainty: 0.926) (Distance: 0.147)
7. Modernist literature (Certainty: 0.924) (Distance: 0.153)
8. Art film (Certainty: 0.922) (Distance: 0.157)
9. Central Europe (Certainty: 0.921) (Distance: 0.157)
10. European (Certainty: 0.921) (Distance: 0.159)
11. Art (Certainty: 0.921) (Distance: 0.159)
12. Byzantine art (Certainty: 0.92) (Distance: 0.159)
13. Postmodernism (Certainty: 0.92) (Distance: 0.16)
14. Eastern Europe (Certainty: 0.92) (Distance: 0.161)
15. Europe (Certainty: 0.919) (Distance: 0.161)
16. Cubism (Certainty: 0.919) (Distance: 0.161)
17. Impressionism (Certainty: 0.919) (Distance: 0.162)
18. Bauhaus (Certainty: 0.919) (Distance: 0.162)
19. Expressionism (Certainty: 0.918) (Distance: 0.163)
20. Surrealism (Certainty: 0.918) (Distance: 0.163)
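
The two numbers printed for each result are related. For the cosine metric configured in the schema above, our understanding of Weaviate's documentation is that certainty is derived from distance as certainty = 1 - distance / 2. The sketch below recomputes certainty from a few of the distances printed above; `certainty_from_distance` is a hypothetical helper. Because the printed distances are already rounded to three decimals, a recomputed value can differ from the printed certainty in the last digit.

```python
def certainty_from_distance(distance):
    """Map a cosine distance in [0, 2] to a certainty in [0, 1]."""
    return 1 - distance / 2

# (title, distance) pairs copied from the output above.
samples = [
    ("Museum of Modern Art", 0.125),
    ("Western Europe", 0.133),
    ("Pop art", 0.14),
]

for title, distance in samples:
    print(f"{title}: distance={distance}, certainty={certainty_from_distance(distance):.4f}")
```

A distance of 0 (identical direction) maps to a certainty of 1, and the maximum cosine distance of 2 maps to 0.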
query_result = query_weaviate("Famous battles in Scottish history", "Article")
counter = 0
for article in query_result["data"]["Get"]["Article"]:
    counter += 1
    print(f"{counter}. {article['title']} (Score: {round(article['_additional']['certainty'],3) })")
1. Historic Scotland (Score: 0.946)
2. First War of Scottish Independence (Score: 0.946)
3. Battle of Bannockburn (Score: 0.946)
4. Wars of Scottish Independence (Score: 0.944)
5. Second War of Scottish Independence (Score: 0.94)
6. List of Scottish monarchs (Score: 0.937)
7. Scottish Borders (Score: 0.932)
8. Braveheart (Score: 0.929)
9. John of Scotland (Score: 0.929)
10. Guardians of Scotland (Score: 0.926)
11. Holyrood Abbey (Score: 0.925)
12. Scottish (Score: 0.925)
13. Scots (Score: 0.925)
14. Robert I of Scotland (Score: 0.924)
15. Scottish people (Score: 0.924)
16. Edinburgh Castle (Score: 0.924)
17. Alexander I of Scotland (Score: 0.924)
18. Robert Burns (Score: 0.924)
19. Battle of Bosworth Field (Score: 0.922)
20. David II of Scotland (Score: 0.922)
def near_text_weaviate(query, collection_name):
    
    nearText = {
        "concepts": [query],
        "distance": 0.7,
    }

    properties = [
        "title", "content",
        "_additional {certainty distance}"
    ]

    query_result = (
        client.query
        .get(collection_name, properties)
        .with_near_text(nearText)
        .with_limit(20)
        .do()
    )["data"]["Get"][collection_name]
    
    print (f"Objects returned: {len(query_result)}")
    
    return query_result
query_result = near_text_weaviate("modern art in Europe","Article")
counter = 0
for article in query_result:
    counter += 1
    print(f"{counter}. { article['title']} (Certainty: {round(article['_additional']['certainty'],3) }) (Distance: {round(article['_additional']['distance'],3) })")
Objects returned: 20
1. Museum of Modern Art (Certainty: 0.938) (Distance: 0.125)
2. Western Europe (Certainty: 0.934) (Distance: 0.133)
3. Renaissance art (Certainty: 0.932) (Distance: 0.136)
4. Pop art (Certainty: 0.93) (Distance: 0.14)
5. Northern Europe (Certainty: 0.927) (Distance: 0.145)
6. Hellenistic art (Certainty: 0.926) (Distance: 0.147)
7. Modernist literature (Certainty: 0.923) (Distance: 0.153)
8. Art film (Certainty: 0.922) (Distance: 0.157)
9. Central Europe (Certainty: 0.921) (Distance: 0.157)
10. European (Certainty: 0.921) (Distance: 0.159)
11. Art (Certainty: 0.921) (Distance: 0.159)
12. Byzantine art (Certainty: 0.92) (Distance: 0.159)
13. Postmodernism (Certainty: 0.92) (Distance: 0.16)
14. Eastern Europe (Certainty: 0.92) (Distance: 0.161)
15. Europe (Certainty: 0.919) (Distance: 0.161)
16. Cubism (Certainty: 0.919) (Distance: 0.161)
17. Impressionism (Certainty: 0.919) (Distance: 0.162)
18. Bauhaus (Certainty: 0.919) (Distance: 0.162)
19. Surrealism (Certainty: 0.918) (Distance: 0.163)
20. Expressionism (Certainty: 0.918) (Distance: 0.163)
query_result = near_text_weaviate("Famous battles in Scottish history","Article")
counter = 0
for article in query_result:
    counter += 1
    print(f"{counter}. { article['title']} (Certainty: {round(article['_additional']['certainty'],3) }) (Distance: {round(article['_additional']['distance'],3) })")
Objects returned: 20
1. Historic Scotland (Certainty: 0.946) (Distance: 0.107)
2. First War of Scottish Independence (Certainty: 0.946) (Distance: 0.108)
3. Battle of Bannockburn (Certainty: 0.946) (Distance: 0.109)
4. Wars of Scottish Independence (Certainty: 0.944) (Distance: 0.111)
5. Second War of Scottish Independence (Certainty: 0.94) (Distance: 0.121)
6. List of Scottish monarchs (Certainty: 0.937) (Distance: 0.127)
7. Scottish Borders (Certainty: 0.932) (Distance: 0.137)
8. Braveheart (Certainty: 0.929) (Distance: 0.141)
9. John of Scotland (Certainty: 0.929) (Distance: 0.142)
10. Guardians of Scotland (Certainty: 0.926) (Distance: 0.148)
11. Holyrood Abbey (Certainty: 0.925) (Distance: 0.15)
12. Scottish (Certainty: 0.925) (Distance: 0.15)
13. Scots (Certainty: 0.925) (Distance: 0.15)
14. Robert I of Scotland (Certainty: 0.924) (Distance: 0.151)
15. Scottish people (Certainty: 0.924) (Distance: 0.152)
16. Edinburgh Castle (Certainty: 0.924) (Distance: 0.153)
17. Alexander I of Scotland (Certainty: 0.924) (Distance: 0.153)
18. Robert Burns (Certainty: 0.924) (Distance: 0.153)
19. Battle of Bosworth Field (Certainty: 0.922) (Distance: 0.155)
20. David II of Scotland (Certainty: 0.922) (Distance: 0.157)
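
The near-vector and near-text searches above return almost the same rankings. One way to check this programmatically is to compare the two title lists. A minimal sketch, where `top_k_overlap` and `moved_titles` are hypothetical helpers and the sample lists are ranks 17 to 20 of the two "modern art in Europe" runs above:

```python
def top_k_overlap(first, second):
    """Fraction of titles shared by two result lists, ignoring order."""
    longest = max(len(first), len(second))
    if longest == 0:
        return 1.0  # two empty lists agree trivially
    return len(set(first) & set(second)) / longest

def moved_titles(first, second):
    """Titles present in both lists whose position differs."""
    rank_in_second = {title: rank for rank, title in enumerate(second)}
    return [
        title
        for rank, title in enumerate(first)
        if title in rank_in_second and rank_in_second[title] != rank
    ]

near_vector_titles = ["Impressionism", "Bauhaus", "Expressionism", "Surrealism"]
near_text_titles = ["Impressionism", "Bauhaus", "Surrealism", "Expressionism"]

print("Overlap:", top_k_overlap(near_vector_titles, near_text_titles))
print("Moved:", moved_titles(near_vector_titles, near_text_titles))
```

In the notebook, the full lists could be built from `query_result` with a list comprehension over the returned articles' `title` fields.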