
Knowledge Graph Query Engine

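Building a knowledge graph usually involves specialized and complex work. However, by combining LlamaIndex (with an LLM), the KnowledgeGraphIndex, and a GraphStore, we can build a reasonably effective knowledge graph from any data source supported by Llama Hub.

Querying a knowledge graph also tends to require knowledge specific to the storage system, such as Cypher. With the help of an LLM and the LlamaIndex KnowledgeGraphQueryEngine, the same thing can be done in natural language!

In this demo, we will walk through the following steps:

  • Extract and set up a knowledge graph using LlamaIndex
  • Query the knowledge graph using Cypher
  • Query the knowledge graph using natural language

If you are opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.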

%pip install llama-index-readers-wikipedia
%pip install llama-index-llms-azure-openai
%pip install llama-index-graph-stores-nebula
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-azure-openai
!pip install llama-index

First, let's take care of the basic LlamaIndex setup.

# For OpenAI
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
import logging
import sys
logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output
# define LLM
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
Settings.llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.chunk_size = 512

# For Azure OpenAI (an alternative to the OpenAI setup above)
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
api_key = "<api-key>"
azure_endpoint = "https://<your-resource-name>.openai.azure.com/"
api_version = "2023-07-01-preview"
llm = AzureOpenAI(
    model="gpt-35-turbo-16k",
    deployment_name="my-custom-llm",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="my-custom-embedding",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512

Before creating the knowledge graph in the next step, let's make sure we have a running NebulaGraph instance with the data schema defined.

# Create a NebulaGraph (version 3.5.0 or newer) cluster with:
# Option 0 for machines with Docker: `curl -fsSL nebula-up.siwei.io/install.sh | bash`
# Option 1 for Desktop: NebulaGraph Docker Extension https://hub.docker.com/extensions/weygu/nebulagraph-dd-ext
# If not, create it with the following commands from NebulaGraph's console:
# CREATE SPACE llamaindex(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
# :sleep 10;
# USE llamaindex;
# CREATE TAG entity(name string);
# CREATE EDGE relationship(relationship string);
# :sleep 10;
# CREATE TAG INDEX entity_index ON entity(name(256));
%pip install ipython-ngql nebula3-python
os.environ["NEBULA_USER"] = "root"
os.environ["NEBULA_PASSWORD"] = "nebula" # default is "nebula"
os.environ["NEBULA_ADDRESS"] = "127.0.0.1:9669"  # assumes NebulaGraph is installed locally
space_name = "llamaindex"
# defaults; these could be omitted when creating from an empty KG
edge_types, rel_prop_names = ["relationship"], ["relationship"]
tags = ["entity"]

Prepare a StorageContext with graph_store set to a NebulaGraphStore:

from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore
graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)
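
Connection problems are easier to diagnose before the store is constructed. The sketch below checks that the three environment variables set earlier are present and splits NEBULA_ADDRESS into a host and a port. The helper names `missing_nebula_env` and `split_address` are hypothetical and not part of LlamaIndex; it is also an assumption that the store takes its connection settings from these variables, as the setup above implies.

```python
import os

REQUIRED_VARS = ("NEBULA_USER", "NEBULA_PASSWORD", "NEBULA_ADDRESS")


def missing_nebula_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def split_address(address):
    """Split 'host:port' into (host, port), e.g. for the %ngql flags."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {address!r}")
    return host, int(port)


# Example values mirroring the ones used in this walkthrough.
example_env = {
    "NEBULA_USER": "root",
    "NEBULA_PASSWORD": "nebula",
    "NEBULA_ADDRESS": "127.0.0.1:9669",
}
print(missing_nebula_env(example_env))
print(split_address(example_env["NEBULA_ADDRESS"]))
```

Calling `missing_nebula_env()` with no argument inspects the real process environment, which makes it a convenient first check in a notebook cell.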

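With the help of LlamaIndex and an LLM, we can build a knowledge graph from the given documents.

If a knowledge graph already exists in the NebulaGraphStore, this step can be skipped.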

from llama_index.readers.wikipedia import WikipediaReader
loader = WikipediaReader()
documents = loader.load_data(
    pages=["Guardians of the Galaxy Vol. 3"], auto_suggest=False
)

Step 2, Generate a KnowledgeGraphIndex with NebulaGraph as graph_store

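Next, we create a KnowledgeGraphIndex to enable graph-based RAG (see the KnowledgeGraphIndex documentation for details). Beyond that, we also end up with a knowledge graph up and running that can serve other purposes!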

from llama_index.core import KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=10,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)
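
To make the target of this step concrete: the index stores knowledge as (subject, relationship, object) triplets, which fit the schema created earlier, with each subject and object becoming an `entity` vertex and each relationship an edge carrying a `relationship` property. The following is a minimal in-memory sketch of that shape, not LlamaIndex's implementation; the `TinyGraph` class is hypothetical, and the sample triplets are taken from the query results shown later on this page.

```python
from collections import defaultdict


class TinyGraph:
    """Toy triplet store: entity name -> list of (relationship, object)."""

    def __init__(self):
        self._edges = defaultdict(list)

    def upsert_triplet(self, subj, rel, obj):
        # Skip exact duplicates so repeated extraction stays idempotent.
        if (rel, obj) not in self._edges[subj]:
            self._edges[subj].append((rel, obj))

    def get(self, subj):
        """Return all (relationship, object) pairs leaving `subj`."""
        return list(self._edges.get(subj, []))


graph = TinyGraph()
triplets = [
    ("Peter Quill", "is leader of", "Guardians of the Galaxy"),
    ("Peter Quill", "is half-human", "half-Celestial"),
    ("Peter Quill", "is leader of", "Guardians of the Galaxy"),  # duplicate
]
for subj, rel, obj in triplets:
    graph.upsert_triplet(subj, rel, obj)

for rel, obj in graph.get("Peter Quill"):
    print(f"Peter Quill -[{rel}]-> {obj}")
```

A real graph store adds persistence, indexing, and a query language on top of this idea, which is what NebulaGraph provides in this walkthrough.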

We now have a knowledge graph about the film "Guardians of the Galaxy Vol. 3" on the NebulaGraph cluster, under the graph space named llamaindex. Let's explore it a little.

# install related packages, password is nebula by default
%pip install ipython-ngql networkx pyvis
%load_ext ngql
%ngql --address 127.0.0.1 --port 9669 --user root --password <password>
Connection Pool Created
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
[ERROR]:
'IPythonNGQL' object has no attribute '_decode_value'
Name
0 llamaindex
# Query some random Relationships with Cypher
%ngql USE llamaindex;
%ngql MATCH ()-[e]->() RETURN e LIMIT 10
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
e
0 ("A second trailer for the film")-[:relationship...
1 ("Adam McKay")-[:relationship@-442854342936029...
2 ("Adam McKay")-[:relationship@8513344855738553...
3 ("Asim Chaudhry")-[:relationship@-803614038978...
4 ("Bakalova")-[:relationship@-25325064520311626...
5 ("Bautista")-[:relationship@-90386029986457371...
6 ("Bautista")-[:relationship@-90386029986457371...
7 ("Beth Mickle")-[:relationship@716197657641767...
8 ("Bradley Cooper")-[:relationship@138630731832...
9 ("Bradley Cooper")-[:relationship@838402633192...
# draw the result
%ng_draw
nebulagraph_draw.html

Finally, let's demonstrate how to query the knowledge graph in natural language!

Here we use the KnowledgeGraphQueryEngine, with the NebulaGraphStore as storage_context.graph_store.

from IPython.display import Markdown, display
from llama_index.core.query_engine import KnowledgeGraphQueryEngine
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore
query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    llm=Settings.llm,  # the LLM configured in the setup step
    verbose=True,
)
response = query_engine.query(
    "Tell me about Peter Quill?",
)
display(Markdown(f"<b>{response}</b>"))
Graph Store Query:
```
MATCH (p:`entity`)-[:relationship]->(m:`entity`) WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`;
```
Graph Store Response:
{'p.entity.name': ['Peter Quill', 'Peter Quill', 'Peter Quill', 'Peter Quill', 'Peter Quill']}
Final Response:
Peter Quill is a character in the Marvel Universe. He is the son of Meredith Quill and Ego the Living Planet.


graph_query = query_engine.generate_query(
    "Tell me about Peter Quill?",
)
graph_query = graph_query.replace("WHERE", "\n WHERE").replace(
    "RETURN", "\nRETURN"
)
display(
    Markdown(
        f"""
```cypher
{graph_query}
```
"""
    )
)

```cypher
MATCH (p:`entity`)-[:relationship]->(m:`entity`)
 WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`;
```

We can see that it helps generate the graph query:

MATCH (p:`entity`)-[:relationship]->(e:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN e.`entity`.`name`;

and synthesizes the answer based on its result:

{'e2.entity.name': ['grandfather', 'alternate version of Gamora', 'Guardians of the Galaxy']}
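
The two stages described above, generating a graph query from the question and then synthesizing an answer from the query result, can be sketched as a small pipeline. Everything here is a stand-in: `answer_question` and the three callables are hypothetical, the "LLM" steps are hard-coded stubs, and this is not how KnowledgeGraphQueryEngine is implemented internally.

```python
def answer_question(question, generate_query, run_query, synthesize):
    """Text -> graph query -> raw result -> natural-language answer."""
    graph_query = generate_query(question)
    result = run_query(graph_query)
    return {
        "query": graph_query,
        "result": result,
        "answer": synthesize(question, result),
    }


# Stub stages; a real engine would call an LLM and a graph store here.
def fake_generate_query(question):
    name = "Peter Quill"  # assumed to be extracted from the question
    return (
        "MATCH (p:`entity`)-[:relationship]->(e:`entity`)\n"
        f"  WHERE p.`entity`.`name` == '{name}'\n"
        "RETURN e.`entity`.`name`;"
    )


def fake_run_query(graph_query):
    return {"e.entity.name": ["Guardians of the Galaxy"]}


def fake_synthesize(question, result):
    names = ", ".join(result["e.entity.name"])
    return f"Peter Quill is related to: {names}."


out = answer_question(
    "Tell me about Peter Quill?",
    fake_generate_query,
    fake_run_query,
    fake_synthesize,
)
print(out["query"])
print(out["answer"])
```

Keeping the intermediate query and raw result in the returned dictionary mirrors what verbose mode prints, and it makes each stage easy to inspect on its own.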

Of course, we can still query the graph directly, too! In that sense, this query engine could be our best graph query language learning bot :).

%%ngql
MATCH (p:`entity`)-[e:relationship]->(m:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`, e.relationship, m.`entity`.`name`;
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
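   p.entity.name  e.relationship           m.entity.name
0  Peter Quill    would return to the MCU  May 2021
1  Peter Quill    was abducted from Earth  as a child
2  Peter Quill    is leader of             Guardians of the Galaxy
3  Peter Quill                             a group of alien thieves and smugglers
4  Peter Quill    is half-human            half-Celestial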

And change the query so that the result can be rendered:

%%ngql
MATCH (p:`entity`)-[e:relationship]->(m:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p, e, m;
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
   p                                             e                                                  m
0  ("Peter Quill" :entity{name: "Peter Quill"})  ("Peter Quill")-[:relationship@-84437522554765...  ("May 2021" :entity{name: "May 2021"})
1  ("Peter Quill" :entity{name: "Peter Quill"})  ("Peter Quill")-[:relationship@-11770408155938...  ("as a child" :entity{name: "as a child"})
2  ("Peter Quill" :entity{name: "Peter Quill"})  ("Peter Quill")-[:relationship@-79394488349732...  ("Guardians of the Galaxy" :entity{name: "Guardians...
3  ("Peter Quill" :entity{name: "Peter Quill"})  ("Peter Quill")-[:relationship@325695233021653...  ("a group of alien thieves and smugglers" :ent...
4  ("Peter Quill" :entity{name: "Peter Quill"})  ("Peter Quill")-[:relationship@555553046209276...  ("half-Celestial" :entity{name: "half-Celestial...
%ng_draw
nebulagraph_draw.html

From the rendered graph, the result of this knowledge-fetching query could hardly be clearer.
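
For a picture of the graph outside the notebook, rows shaped like the (subject, relationship, object) results above can be turned into Graphviz DOT text using only the standard library. This is a sketch that is separate from `%ng_draw`; the `rows_to_dot` helper is hypothetical, and the sample rows are copied from the tabular query result earlier on this page.

```python
def rows_to_dot(rows, graph_name="kg"):
    """Build Graphviz DOT text from (subject, relationship, object) rows."""

    def quote(text):
        # Escape backslashes and double quotes for a DOT string literal.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {graph_name} {{"]
    for subj, rel, obj in rows:
        lines.append(f"  {quote(subj)} -> {quote(obj)} [label={quote(rel)}];")
    lines.append("}")
    return "\n".join(lines)


rows = [
    ("Peter Quill", "is leader of", "Guardians of the Galaxy"),
    ("Peter Quill", "is half-human", "half-Celestial"),
]
print(rows_to_dot(rows))
```

The printed text can be saved to a file and rendered with any tool that understands the DOT format.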