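Vanna.ai Integration with Google Cloud Platform: Natural Language Access to BigQuery Data
Executive Summary
Vanna.ai uses Google Cloud Platform (GCP) services to enable natural language querying of BigQuery data, broadening access to data across the organization. The solution maintains enterprise security through end-user authentication, so existing BigQuery permissions and row-level security are preserved while business users get an intuitive interface to their data.
Business Value and Benefits
Democratizing Data Access
Natural Language Interface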
- Ask questions in plain English, Spanish, Portuguese, etc.
- No SQL knowledge required
- Immediate access to insights
Examples:
- "What were our total sales by region last quarter?"
- "Show me customer churn rates over the past 12 months"
- "Which products had the highest profit margin in Q4?"
Increased Productivity for Business Users
- Self-service analytics capabilities
- Reduced dependency on data teams
- Faster time to insight
- Direct access to required data
- Ability to explore data independently
Better Decision-Making
- Real-time access to data insights
- Quick validation of business hypotheses
- Data-driven decision support
- Reduced time from question to answer
- Enhanced data exploration capabilities
Enterprise Security and Compliance
Native Security Integration
Vanna.ai uses Google Cloud's native security model through end-user authentication:
- Users can only access data they already have permission to see in BigQuery
- All existing BigQuery permissions are automatically enforced
- Row-level security policies are automatically inherited
- Queries execute under the end user's credentials
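As a minimal sketch of what "queries execute under the end user's credentials" can mean in practice, the snippet below builds a request to the BigQuery REST `jobs.query` endpoint that carries the end user's own OAuth 2.0 access token, so BigQuery evaluates permissions for that user rather than for a shared service account. This is an illustration under stated assumptions, not Vanna.ai's actual implementation: the helper name, project ID, and token are hypothetical, and the request is built but never sent.

```python
import json
import urllib.request

# Public BigQuery REST endpoint for synchronous queries (jobs.query).
BIGQUERY_QUERY_URL = (
    "https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/queries"
)


def build_user_query_request(project_id, sql, user_access_token):
    """Build (but do not send) a jobs.query request authorized as one end user.

    Hypothetical helper: the bearer token belongs to the signed-in user,
    so BigQuery applies that user's table, row, and column permissions.
    """
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        BIGQUERY_QUERY_URL.format(project_id=project_id),
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {user_access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_user_query_request(
    project_id="example-project",          # hypothetical project ID
    sql="SELECT 1",                        # placeholder query
    user_access_token="ya29.placeholder",  # hypothetical end-user token
)
```

Because the token identifies the user, no permission logic needs to live in the application itself; a query the user is not allowed to run would be rejected by BigQuery.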
Security Benefits
Inherited Access Controls
- BigQuery permissions carry over automatically
- No separate permission management required
- Row-level security remains enforced
- Column-level security is maintained
Authentication and Authorization
- Secure OAuth 2.0 integration
- End-user credential validation
- Automatic session management
- Audit trail maintenance
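The OAuth 2.0 integration listed above typically begins by redirecting the user to Google's authorization endpoint. The sketch below shows one way such a URL could be assembled for the authorization-code flow with the BigQuery scope. The client ID, the callback path, and the function name are assumptions made for illustration; the whitepaper does not specify them.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
BIGQUERY_SCOPE = "https://www.googleapis.com/auth/bigquery"


def build_authorization_url(client_id, redirect_uri):
    """Return (url, state) for the first step of the authorization-code flow.

    The random `state` value should be stored in the user's session and
    compared on the callback to guard against cross-site request forgery.
    """
    state = secrets.token_urlsafe(16)
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": f"openid email {BIGQUERY_SCOPE}",
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}", state


url, state = build_authorization_url(
    # Hypothetical values: a real client ID comes from the OAuth client
    # configuration, and the callback path here is an assumption.
    client_id="example-client-id.apps.googleusercontent.com",
    redirect_uri="https://example-service.a.run.app/oauth/callback",
)
```

After the user consents, Google redirects back with a one-time code that the service exchanges for the user's access token.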
Compliance and Governance
- Maintains existing data governance
- Preserves audit capabilities
- Ensures regulatory compliance
- Supports data privacy requirements
Administrative Benefits
Centralized Management
- Monitor query patterns
- Track usage metrics
- Manage user access
- Optimize performance
Training and Improvement
- Review query history
- Update training examples
- Customize responses
- Enhance accuracy
Cost Control
- Monitor resource usage
- Optimize query performance
- Control access patterns
- Manage computational resources
Technical Implementation
System Architecture
The integration uses several GCP components to provide a secure, scalable solution:
Core Components
- Cloud Run for serverless deployment
- BigQuery for data storage and querying
- Firestore for session management
- OAuth 2.0 for authentication
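To make the session-management role concrete, here is a small sketch of the kind of record a session store might keep: a random session ID mapped to the signed-in user and an expiry time. An in-memory dictionary stands in for Firestore so the example runs without any cloud dependency. The class names, fields, and one-hour lifetime are all assumptions; the whitepaper does not describe the actual session schema.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class Session:
    user_email: str
    expires_at: float  # Unix timestamp after which the session is invalid


class InMemorySessionStore:
    """Hypothetical stand-in for a Firestore-backed session store."""

    def __init__(self, ttl_seconds=3600):
        self._ttl = ttl_seconds
        self._sessions = {}

    def create(self, user_email, now=None):
        """Create a session and return its unguessable ID."""
        now = time.time() if now is None else now
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = Session(user_email, now + self._ttl)
        return session_id

    def get(self, session_id, now=None):
        """Return the live session, or None if it is unknown or expired."""
        now = time.time() if now is None else now
        session = self._sessions.get(session_id)
        if session is None or session.expires_at <= now:
            self._sessions.pop(session_id, None)  # drop expired entries
            return None
        return session
```

Keeping session state in an external store rather than in process memory matters on Cloud Run, where instances are stateless and may be replaced between requests.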
Security Architecture
Implementation Requirements
Prerequisites
Service Configuration
Deployment Process
Enable Required APIs
    # Enable Cloud Run API
    gcloud services enable run.googleapis.com
    # Enable Firestore API
    gcloud services enable firestore.googleapis.com
Deploy to Cloud Run
    gcloud run deploy vanna-ai \
      --image=us-central1-docker.pkg.dev/{provided-by-vanna-ai}:latest \
      --region=us-central1 \
      --allow-unauthenticated
Environment Configuration
Required variables:
- BASE_URL: Cloud Run service URL
- PROJECT_ID: GCP project identifier
- GEMINI_API_KEY: API key for Gemini
- GOOGLE_OAUTH_CLIENT_CONFIG: OAuth client configuration
- FLASK_SECRET_KEY: Session encryption key
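A service that depends on these variables benefits from failing fast when one is missing. The sketch below checks the five variables named above at startup and reports every missing one in a single error. It is only an illustration: the function name is hypothetical, and it validates presence only, since the expected format of each value is not specified here.

```python
import os

# The five variables listed in this section.
REQUIRED_VARIABLES = (
    "BASE_URL",
    "PROJECT_ID",
    "GEMINI_API_KEY",
    "GOOGLE_OAUTH_CLIENT_CONFIG",
    "FLASK_SECRET_KEY",
)


def load_config(environ=None):
    """Return the required settings as a dict, or raise if any is unset.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to exercise in tests.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in REQUIRED_VARIABLES}
```

Calling `load_config()` once at startup surfaces a misconfigured deployment immediately rather than on a user's first request.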
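Conclusion
Vanna.ai's integration with GCP provides a secure, user-friendly natural language interface to BigQuery data. By relying on end-user authentication and existing security policies, organizations can democratize data access while maintaining strict security and governance controls.
Note: This whitepaper reflects configurations as of December 2024. For the latest updates or changes to services and features, consult the current documentation.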