Vanna.ai Enterprise Integration with Google Cloud Platform

How to Use Vanna.ai with Google Cloud Platform

Vanna.ai Integration with Google Cloud Platform: Natural Language Access to BigQuery Data

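Executive Summary

Vanna.ai uses Google Cloud Platform (GCP) services to enable natural language queries over BigQuery data, democratizing data access. The solution maintains enterprise security through end-user authentication, so existing BigQuery permissions and row-level security are preserved, while business users get an intuitive interface to their data.

Business Value and Benefits

Democratizing Data Access

  1. Natural language interface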

    • Ask questions in plain English, Spanish, Portuguese, etc.
    • No SQL knowledge required
    • Immediate access to insights
    • Examples:
      • "What were our total sales by region last quarter?"
      • "Show me customer churn rates over the past 12 months"
      • "Which products had the highest profit margin in Q4?"
  2. Higher productivity for business users

    • Self-service analytics capabilities
    • Reduced dependency on data teams
    • Faster time to insight
    • Direct access to required data
    • Ability to explore data independently
  3. Better decision-making

    • Real-time access to data insights
    • Quick validation of business hypotheses
    • Data-driven decision support
    • Reduced time from question to answer
    • Enhanced data exploration capabilities

Enterprise Security and Compliance

Native Security Integration

Vanna.ai relies on end-user authentication to build on Google Cloud's native security model:

  • Users can only access data they already have permission to see in BigQuery
  • All existing BigQuery permissions are automatically enforced
  • Row-level security policies are automatically inherited
  • Queries execute under the end user's credentials
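
To make the last point concrete, here is a minimal sketch of what "executing under the end user's credentials" can look like at the API level. It only prepares a request to BigQuery's public REST `jobs.query` endpoint and attaches the signed-in user's OAuth access token; it does not send anything. The function name `build_query_request` and the sample values are hypothetical, and this is not a description of Vanna.ai's internal code.

```python
import json
import urllib.request

# Public base URL of the BigQuery v2 REST API.
BIGQUERY_API = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id: str, user_access_token: str, sql: str) -> urllib.request.Request:
    """Prepare (but do not send) a jobs.query call that runs as the signed-in user.

    Hypothetical helper for illustration only.
    """
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BIGQUERY_API}/projects/{project_id}/queries",
        data=body,
        method="POST",
        headers={
            # The bearer token belongs to the end user rather than to a shared
            # service account, so BigQuery applies that user's own permissions
            # and row-level access policies when the query runs.
            "Authorization": f"Bearer {user_access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder project, token, and SQL; replace with real values to use.
    request = build_query_request("my-project", "user-access-token", "SELECT 1")
    print(request.get_method(), request.full_url)
```

Because no service-account key is involved in this sketch, a user who lacks access to a table would receive BigQuery's own permission error rather than data.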

Security Benefits

  1. Inherited access control

    • BigQuery permissions carry over automatically
    • No separate permission management required
    • Row-level security remains enforced
    • Column-level security is maintained
  2. Authentication and authorization

    • Secure OAuth 2.0 integration
    • End-user credential validation
    • Automatic session management
    • Audit trail maintenance
  3. Compliance and governance

    • Maintains existing data governance
    • Preserves audit capabilities
    • Ensures regulatory compliance
    • Supports data privacy requirements
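
As an illustration of the OAuth 2.0 integration listed under authentication and authorization, the sketch below builds the URL that sends a user to Google's consent screen in the standard authorization-code flow. The endpoint and the BigQuery scope are Google's public values; the function name, the exact scope set, and the sample client ID and redirect URI are assumptions, since the source does not say which scopes the product requests.

```python
import secrets
from urllib.parse import urlencode

# Google's public OAuth 2.0 authorization endpoint and BigQuery scope.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
BIGQUERY_SCOPE = "https://www.googleapis.com/auth/bigquery"


def build_authorization_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Return the consent-screen URL for the authorization-code flow.

    Hypothetical helper; the scope list is an assumption for illustration.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",  # ask for an authorization code
        "scope": f"openid email {BIGQUERY_SCOPE}",
        "state": state,  # echoed back by Google; compare it on return to block CSRF
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # A fresh random state value per sign-in attempt, stored in the user's session.
    state = secrets.token_urlsafe(16)
    print(build_authorization_url(
        "example-client-id.apps.googleusercontent.com",  # placeholder client ID
        "https://example.invalid/oauth/callback",         # placeholder redirect URI
        state,
    ))
```

After the user consents, the application exchanges the returned code for an access token tied to that user, which is what allows BigQuery to enforce per-user permissions.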

Administrative Benefits

  1. Centralized management

    • Monitor query patterns
    • Track usage metrics
    • Manage user access
    • Optimize performance
  2. Training and improvement

    • Review query history
    • Update training examples
    • Customize responses
    • Enhance accuracy
  3. Cost control

    • Monitor resource usage
    • Optimize query performance
    • Control access patterns
    • Manage computational resources

Technical Implementation

System Architecture

The integration uses several GCP components to provide a secure, scalable solution:

  1. Core components

    • Cloud Run for serverless deployment
    • BigQuery for data storage and querying
    • Firestore for session management
    • OAuth 2.0 for authentication
  2. Security architecture

Implementation Requirements

  1. Prerequisites

  2. Service configuration

Deployment Process

  1. Enable the required APIs

    # Enable Cloud Run API
    gcloud services enable run.googleapis.com
    
    # Enable Firestore API
    gcloud services enable firestore.googleapis.com
    
  2. Deploy to Cloud Run

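    # Deploy the container image supplied by Vanna.ai as a Cloud Run service.
    # --allow-unauthenticated makes the service URL publicly reachable at the
    # Cloud Run layer; users then sign in through the app's OAuth 2.0 flow.
    gcloud run deploy vanna-ai \
      --image=us-central1-docker.pkg.dev/{provided-by-vanna-ai}:latest \
      --region=us-central1 \
      --allow-unauthenticated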
    
  3. Environment configuration. Required variables:

    BASE_URL: Cloud Run service URL
    PROJECT_ID: GCP project identifier
    GEMINI_API_KEY: API key for Gemini
    GOOGLE_OAUTH_CLIENT_CONFIG: OAuth client configuration
    FLASK_SECRET_KEY: Session encryption key
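
A small startup check can catch a missing variable before the service begins handling traffic. The sketch below is one possible way to do that, using only the variable names listed above; the function name `missing_settings` is hypothetical and is not part of the Vanna.ai image.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the list of required variables above.
REQUIRED_VARS = (
    "BASE_URL",
    "PROJECT_ID",
    "GEMINI_API_KEY",
    "GOOGLE_OAUTH_CLIENT_CONFIG",
    "FLASK_SECRET_KEY",
)


def missing_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or blank.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not source.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        # Exit with a clear message instead of failing later on first use.
        raise SystemExit("Missing required settings: " + ", ".join(missing))
    print("All required settings are present.")
```

Values such as `GEMINI_API_KEY` and `FLASK_SECRET_KEY` are secrets, so a check like this should report only the names of missing variables and never print their values.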
    

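Conclusion

Vanna.ai's integration with GCP provides a secure, user-friendly natural language interface to BigQuery data. By relying on end-user authentication and existing security policies, organizations can democratize data access while keeping strict security and governance controls in place.


Note: This white paper reflects the configuration as of December 2024. For the latest updates or changes to services and features, consult the current documentation.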