Vanna Logo Vanna.AI
Get Started for Free
  • Home
  • Docs
  • Blog
  • Contact
  • Star
  • How It Works
  • Try a Sample Database
  • Train Vanna
    • Built-in Web App
    • Streamlit
    • Flask
    • Slack
    • Connect to Snowflake
    • Connect to BigQuery
    • Connect to Postgres
    • Connect to Other Databases
  • Running Locally
  • API Reference

使用Vanna.AI(推荐)、Vanna托管的向量数据库(推荐)通过OpenAI生成Postgres的SQL¶

本笔记本介绍了使用vanna Python包通过AI(RAG + LLMs)生成SQL的过程,包括连接到数据库和训练。如果您还没有准备好在自己的数据库上进行训练,您仍然可以使用示例SQLite数据库进行尝试。

你想使用哪个LLM?

  • [Selected] OpenAI via Vanna.AI (Recommended)
    Use Vanna.AI for free to generate your queries
  • OpenAI
    Use OpenAI with your own API key
  • Azure OpenAI
    If you have OpenAI models deployed on Azure
  • Mistral via Mistral API
    If you have a Mistral API key
  • Other LLM
    If you have a different LLM model

你想在哪里存储“训练”数据?

  • [Selected] Vanna Hosted Vector DB (Recommended)
    Use Vanna.AIs hosted vector database (pgvector) for free. This is usable across machines with no additional setup.
  • ChromaDB
    Use ChromaDBs open-source vector database for free locally. No additional setup is necessary -- all database files will be created and stored locally.
  • Marqo
    Use Marqo locally for free. Requires additional setup. Or use their hosted option.
  • Other VectorDB
    Use any other vector database. Requires additional setup.

设置¶

In [ ]:
%pip install 'vanna[postgres]'
In [ ]:
import vanna
from vanna.remote import VannaDefault
In [ ]:
api_key = vanna.get_api_key('my-email@example.com')

vanna_model_name = # Put a unique name here
vn = VannaDefault(model=vanna_model_name, api_key=api_key)

你想查询哪个数据库?

  • [Selected] Postgres
  • Snowflake
  • BigQuery
  • SQLite
  • Other Database
    Use Vanna to generate queries for any SQL database
In [ ]:
vn.connect_to_postgres(host='my-host', dbname='my-dbname', user='my-user', password='my-password', port='my-port')

训练¶

你只需要训练一次。除非你想添加更多的训练数据,否则不要再次训练。

In [ ]:
# The information schema query may need some tweaking depending on your database. This is a good starting point.
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")

# This will break up the information schema into bite-sized chunks that can be referenced by the LLM
plan = vn.get_training_plan_generic(df_information_schema)
plan

# If you like the plan, then uncomment this and run it to train
# vn.train(plan=plan)
In [ ]:
# The following are methods for adding training data. Make sure you modify the examples to match your database.

# DDL statements are powerful because they specify table names, colume names, types, and potentially relationships
vn.train(ddl="""
    CREATE TABLE IF NOT EXISTS my-table (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT
    )
""")

# Sometimes you may want to add documentation about your business terminology or definitions.
vn.train(documentation="Our business defines OTIF score as the percentage of orders that are delivered on time and in full")

# You can also add SQL queries to your training data. This is useful if you have some queries already laying around. You can just copy and paste those from your editor to begin generating new SQL.
vn.train(sql="SELECT * FROM my-table WHERE name = 'John Doe'")
In [ ]:
# At any time you can inspect what training data the package is able to reference
training_data = vn.get_training_data()
training_data
In [ ]:
# You can remove training data if there's obsolete/incorrect information. 
vn.remove_training_data(id='1-ddl')

向AI提问¶

每当你提出一个新问题时,它将找到10个最相关的训练数据,并将其作为LLM提示的一部分来生成SQL。

In [ ]:
vn.ask(question=...)

启动用户界面¶

vanna-flask

In [ ]:
from vanna.flask import VannaFlaskApp
app = VannaFlaskApp(vn)
app.run()

下一步¶

通过Jupyter笔记本使用Vanna非常适合入门,但也可以查看其他可定制的界面,例如

  • Streamlit app
  • Flask app
  • Slackbot

公司

  • Docs
  • GitHub
  • Blog

帮助中心

  • Discord
  • Slack
  • LinkedIn
  • Email Us

Vanna.AI Logo Vanna.AI © Vanna.AI. All Rights Reserved.