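Knowledge Graph Query Engine
Creating a knowledge graph usually involves specialized and complex tasks. However, by combining LlamaIndex with an LLM, the KnowledgeGraphIndex, and a GraphStore, we can build a reasonably effective knowledge graph from any data source supported by Llama Hub.
Querying a knowledge graph also usually requires domain-specific knowledge tied to the storage system, such as Cypher. With the help of an LLM and the LlamaIndex KnowledgeGraphQueryEngine, it can be done in natural language instead!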
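In this demo, we will walk you through the following steps:
- Extract and set up a knowledge graph using LlamaIndex
- Query the knowledge graph using Cypher
- Query the knowledge graph using natural language
If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.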
%pip install llama-index-readers-wikipedia
%pip install llama-index-llms-azure-openai
%pip install llama-index-graph-stores-nebula
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-azure-openai
!pip install llama-index

Let's first get the basic LlamaIndex setup ready.
OpenAI
# For OpenAI
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
import logging
import sys

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output

# define LLM
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

Settings.llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.chunk_size = 512

from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
# For Azure OpenAI
api_key = "<api-key>"
azure_endpoint = "https://<your-resource-name>.openai.azure.com/"
api_version = "2023-07-01-preview"

llm = AzureOpenAI(
    model="gpt-35-turbo-16k",
    deployment_name="my-custom-llm",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="my-custom-embedding",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512

Prepare for NebulaGraph
Before creating the knowledge graph in the next step, let's make sure we have a running NebulaGraph instance with the data schema defined.
# Create a NebulaGraph (version 3.5.0 or newer) cluster with:
# Option 0 for machines with Docker: `curl -fsSL nebula-up.siwei.io/install.sh | bash`
# Option 1 for Desktop: NebulaGraph Docker Extension https://hub.docker.com/extensions/weygu/nebulagraph-dd-ext

# If the graph space does not exist yet, create it with the following commands from NebulaGraph's console:
# CREATE SPACE llamaindex(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
# :sleep 10;
# USE llamaindex;
# CREATE TAG entity(name string);
# CREATE EDGE relationship(relationship string);
# :sleep 10;
# CREATE TAG INDEX entity_index ON entity(name(256));
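The console commands in the comments above follow a fixed pattern: create the space, wait, switch to it, define one tag and one edge type, wait again, then index the tag. As a rough sketch only, the same statements can be produced for another space name with a small helper. The function name `nebula_schema_statements` and its parameters are hypothetical and not part of LlamaIndex or NebulaGraph; the statement text is copied from the comments above.

```python
def nebula_schema_statements(space: str = "llamaindex", vid_len: int = 256) -> list:
    """Return the console statements listed above, in order, for a given space.

    Hypothetical helper for illustration; it only builds strings and does not
    talk to NebulaGraph. `:sleep 10;` is a console command, so these lines are
    meant to be pasted into the NebulaGraph console.
    """
    return [
        f"CREATE SPACE {space}(vid_type=FIXED_STRING({vid_len}), partition_num=1, replica_factor=1);",
        ":sleep 10;",  # give the cluster time to create the space
        f"USE {space};",
        "CREATE TAG entity(name string);",
        "CREATE EDGE relationship(relationship string);",
        ":sleep 10;",  # give the cluster time to apply the schema
        f"CREATE TAG INDEX entity_index ON entity(name({vid_len}));",
    ]


for statement in nebula_schema_statements():
    print(statement)
```

Keeping the tag named `entity` and the edge type named `relationship` matters here, because those are the defaults the rest of this notebook passes to the graph store.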
%pip install ipython-ngql nebula3-python
os.environ["NEBULA_USER"] = "root"
os.environ["NEBULA_PASSWORD"] = "nebula"  # default is "nebula"
os.environ[
    "NEBULA_ADDRESS"
] = "127.0.0.1:9669"  # assumes NebulaGraph is installed locally

space_name = "llamaindex"
edge_types, rel_prop_names = ["relationship"], [
    "relationship"
]  # default, can be omitted if creating from an empty KG
tags = ["entity"]  # default, can be omitted if creating from an empty KG

Prepare the StorageContext with graph_store set to NebulaGraphStore
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore

graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)

(Optional) Build the Knowledge Graph with LlamaIndex
With the help of LlamaIndex and an LLM, we can build a knowledge graph from the given documents.
If we already have a knowledge graph in NebulaGraphStore, this step can be skipped.
Step 1, load data from Wikipedia for "Guardians of the Galaxy Vol. 3"
from llama_index.readers.wikipedia import WikipediaReader
loader = WikipediaReader()
documents = loader.load_data(
    pages=["Guardians of the Galaxy Vol. 3"], auto_suggest=False
)

Step 2, generate a KnowledgeGraphIndex with NebulaGraph as graph_store
Next, we will create a KnowledgeGraphIndex to enable graph-based RAG. Apart from that, we also get a knowledge graph up and running for other purposes!
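To make the data shape concrete: a knowledge graph index stores facts as (subject, relationship, object) triplets, and with the schema used in this notebook each triplet maps to two `entity` vertices joined by one `relationship` edge. The following is only an illustrative sketch of that mapping. The `Triplet` class and `to_graph` function are hypothetical names and do not reflect how LlamaIndex implements it internally; the two sample triplets are taken from the query results shown later in this notebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One extracted fact: subject -[relationship]-> obj (hypothetical type)."""

    subject: str
    relationship: str
    obj: str


def to_graph(triplets):
    """Split triplets into unique vertex names and directed, labeled edges."""
    vertices = []
    edges = []
    for t in triplets:
        for name in (t.subject, t.obj):
            if name not in vertices:  # keep first-seen order, skip duplicates
                vertices.append(name)
        edges.append((t.subject, t.relationship, t.obj))
    return vertices, edges


triplets = [
    Triplet("Peter Quill", "is leader of", "Guardians of the Galaxy"),
    Triplet("Peter Quill", "is half-human", "half-Celestial"),
]
vertices, edges = to_graph(triplets)
print(vertices)
print(edges)
```

Note that "Peter Quill" appears once as a vertex even though it is the subject of both facts, which is what lets a graph query follow all of its outgoing relationships from a single node.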
from llama_index.core import KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=10,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)

Now we have a knowledge graph about the film "Guardians of the Galaxy Vol. 3" on the NebulaGraph cluster, under the graph space named llamaindex. Let's explore it a bit.
# install related packages, password is nebula by default
%pip install ipython-ngql networkx pyvis
%load_ext ngql
%ngql --address 127.0.0.1 --port 9669 --user root --password <password>

| | Name |
|---|---|
| 0 | llamaindex |
# Query some random Relationships with Cypher
%ngql USE llamaindex;
%ngql MATCH ()-[e]->() RETURN e LIMIT 10

| | e |
|---|---|
| 0 | ("A second trailer for the film")-[:relationship... |
| 1 | ("Adam McKay")-[:relationship@-442854342936029... |
| 2 | ("Adam McKay")-[:relationship@8513344855738553... |
| 3 | ("Asim Chaudhry")-[:relationship@-803614038978... |
| 4 | ("Bakalova")-[:relationship@-25325064520311626... |
| 5 | ("Bautista")-[:relationship@-90386029986457371... |
| 6 | ("Bautista")-[:relationship@-90386029986457371... |
| 7 | ("Beth Mickle")-[:relationship@716197657641767... |
| 8 | ("Bradley Cooper")-[:relationship@138630731832... |
| 9 | ("Bradley Cooper")-[:relationship@838402633192... |
# draw the result
%ng_draw

Finally, let's demo how to query the knowledge graph with natural language!
Here, we will use the KnowledgeGraphQueryEngine, with NebulaGraphStore as the storage_context.graph_store.
from llama_index.core.query_engine import KnowledgeGraphQueryEngine
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore
from IPython.display import Markdown, display

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    llm=Settings.llm,  # the LLM configured above (OpenAI or Azure OpenAI)
    verbose=True,
)
response = query_engine.query(
    "Tell me about Peter Quill?",
)
display(Markdown(f"<b>{response}</b>"))

Graph Store Query:
MATCH (p:`entity`)-[:relationship]->(m:`entity`) WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`;
Graph Store Response:
{'p.entity.name': ['Peter Quill', 'Peter Quill', 'Peter Quill', 'Peter Quill', 'Peter Quill']}
Final Response:
Peter Quill is a character in the Marvel Universe. He is the son of Meredith Quill and Ego the Living Planet.
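The verbose log above shows three stages: a graph query is generated from the question, the graph store runs it, and a final answer is written from the result. A minimal sketch of that flow follows, assuming nothing about LlamaIndex internals. Every function here (`answer_with_graph` and the three `fake_*` stand-ins) is hypothetical, and the stand-ins return canned values so the example runs without an LLM or a database.

```python
from typing import Callable, Dict, List

GraphResult = Dict[str, List[str]]


def answer_with_graph(
    question: str,
    generate_query: Callable[[str], str],
    run_query: Callable[[str], GraphResult],
    synthesize: Callable[[str, GraphResult], str],
) -> str:
    """Question -> graph query -> graph result -> natural-language answer."""
    graph_query = generate_query(question)  # stage 1: an LLM would write this
    graph_result = run_query(graph_query)  # stage 2: the graph store executes it
    return synthesize(question, graph_result)  # stage 3: an LLM would phrase this


# Canned stand-ins so the sketch is runnable on its own.
def fake_generate_query(question: str) -> str:
    return (
        "MATCH (p:`entity`)-[:relationship]->(m:`entity`) "
        "WHERE p.`entity`.`name` == 'Peter Quill' "
        "RETURN m.`entity`.`name`;"
    )


def fake_run_query(graph_query: str) -> GraphResult:
    return {"m.entity.name": ["Guardians of the Galaxy"]}


def fake_synthesize(question: str, graph_result: GraphResult) -> str:
    names = ", ".join(graph_result["m.entity.name"])
    return f"Peter Quill is connected to: {names}"


answer = answer_with_graph(
    "Tell me about Peter Quill?",
    fake_generate_query,
    fake_run_query,
    fake_synthesize,
)
print(answer)
```

Passing the three stages in as callables keeps the flow easy to test and makes it clear that the quality of the final answer depends on the generated query returning useful columns.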
graph_query = query_engine.generate_query(
    "Tell me about Peter Quill?",
)

graph_query = graph_query.replace("WHERE", "\n WHERE").replace(
    "RETURN", "\nRETURN"
)

display(
    Markdown(
        f"""
```cypher
{graph_query}
```
"""
    )
)

MATCH (p:`entity`)-[:relationship]->(m:`entity`)
 WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`;
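The line-breaking step in the cell above can be wrapped in a small reusable function. This is a sketch under one stated assumption: the keywords `WHERE` and `RETURN` appear only as clause keywords, because a plain string replace would also rewrite them inside quoted values. The name `break_before_clauses` is hypothetical.

```python
def break_before_clauses(graph_query: str) -> str:
    """Put WHERE and RETURN on their own lines and trim trailing spaces.

    Naive by design: it uses plain string replacement, so a quoted value
    containing "WHERE" or "RETURN" would be broken as well.
    """
    broken = graph_query.replace("WHERE", "\n WHERE").replace("RETURN", "\nRETURN")
    return "\n".join(line.rstrip() for line in broken.splitlines())


formatted = break_before_clauses("MATCH (p) WHERE p.name == 'x' RETURN p;")
print(formatted)
```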
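We can see that it helps generate the graph query:
MATCH (p:`entity`)-[:relationship]->(e:`entity`)
  WHERE p.`entity`.`name` == 'Peter Quill'
RETURN e.`entity`.`name`;
and synthesize the answer based on its result:
{'e2.entity.name': ['grandfather', 'alternate version of Gamora', 'Guardians of the Galaxy']}
Of course, we can still query the graph ourselves! This query engine could then become our best graph query language learning bot :).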
%%ngql
MATCH (p:`entity`)-[e:relationship]->(m:`entity`)
  WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`, e.relationship, m.`entity`.`name`;

| | p.entity.name | e.relationship | m.entity.name |
|---|---|---|---|
| 0 | Peter Quill | would return to the MCU | May 2021 |
| 1 | Peter Quill | was abducted from Earth | as a child |
| 2 | Peter Quill | is leader of | Guardians of the Galaxy |
| 3 | Peter Quill | was raised by | a group of alien thieves and smugglers |
| 4 | Peter Quill | is half-human | half-Celestial |
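For readers new to Cypher, the pattern `(p)-[e]->(m)` with a `WHERE` filter on `p` simply selects every outgoing edge of one vertex. The sketch below mirrors that over an in-memory list of tuples; it is an analogy, not how NebulaGraph evaluates the query. The function name `match_outgoing` is hypothetical, the first two edges are taken from the table above, and the third is a made-up row added only for contrast.

```python
edges = [
    ("Peter Quill", "would return to the MCU", "May 2021"),
    ("Peter Quill", "is leader of", "Guardians of the Galaxy"),
    ("Rocket", "is member of", "Guardians of the Galaxy"),  # made-up row for contrast
]


def match_outgoing(edges, name):
    """Return (p.name, e.relationship, m.name) rows for edges that start at `name`."""
    return [(subj, rel, obj) for (subj, rel, obj) in edges if subj == name]


for row in match_outgoing(edges, "Peter Quill"):
    print(row)
```

Only the two rows whose subject is "Peter Quill" are returned, and the direction of the arrow matters: edges that merely point at the vertex are not matched by this pattern.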
And change the query so that it can be rendered:

%%ngql
MATCH (p:`entity`)-[e:relationship]->(m:`entity`)
  WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p, e, m;

| | p | e | m |
|---|---|---|---|
| 0 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@-84437522554765... | ("May 2021" :entity{name: "May 2021"}) |
| 1 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@-11770408155938... | ("as a child" :entity{name: "as a child"}) |
| 2 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@-79394488349732... | ("Guardians of the Galaxy" :entity{name: "Guardians... |
| 3 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@325695233021653... | ("a group of alien thieves and smugglers" :ent... |
| 4 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@555553046209276... | ("half-Celestial" :entity{name: "half-Celestial... |
%ng_draw

The results of this knowledge-fetching query could not be clearer than in the rendered graph.
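If the notebook rendering is not available, the same triplets can be exported as text and drawn elsewhere. The sketch below writes Graphviz DOT, which is a swapped-in format chosen for illustration and not what `%ng_draw` produces. The function name `to_dot` is hypothetical, and the sample edges come from the query result above.

```python
def to_dot(edges, graph_name="kg"):
    """Render (subject, relationship, object) edges as Graphviz DOT text."""

    def quote(text):
        # Escape backslashes and double quotes so any name is a valid DOT string.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {graph_name} {{"]
    for subj, rel, obj in edges:
        lines.append(f"  {quote(subj)} -> {quote(obj)} [label={quote(rel)}];")
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(
    [
        ("Peter Quill", "is leader of", "Guardians of the Galaxy"),
        ("Peter Quill", "is half-human", "half-Celestial"),
    ]
)
print(dot)
```

The resulting text can be saved to a file and passed to any DOT-compatible viewer.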