# OpenBB Integration Deployment Architecture
## Environment and Dependencies
### Base Requirements
- **Python Version**: 3.8 or higher
- **Core Dependencies**: As specified in `requirements.txt` (Streamlit, etc.)
### Optional OpenBB Dependency
- **OpenBB Library**: `openbb>=4.1.0`
- **Installation**: Not included in default `requirements.txt` to maintain lightweight base installation
- **Activation**: Install via `pip install "openbb>=4.1.0"` when needed
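
Because OpenBB is optional, application code has to find out at runtime whether it is installed. The following is a minimal sketch of one way to do that, not the project's actual implementation. The helper names and the returned labels (`"openbb"`, `"perpetual_engine"`) are hypothetical; only the package name `openbb` comes from this document.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if a top-level package can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def choose_data_source() -> str:
    """Pick a data source label depending on whether OpenBB is installed."""
    # The labels are illustrative; the real code may use different names.
    return "openbb" if is_package_available("openbb") else "perpetual_engine"
```

Checking with `find_spec` avoids paying OpenBB's import cost just to learn whether it is present.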
## Configuration Management
### Environment Variables
- **Standard Project Variables**: Managed through Doppler (RAPIDAPI_KEY, GOOGLE_API_KEY, etc.)
- **OpenBB Provider Variables**:
  - For public data sources (like yfinance): No specific configuration required
  - For premium data sources (e.g., Polygon, FMP):
    - Variables are managed by OpenBB internally
    - Follow OpenBB documentation for provider-specific setup
    - Example: `POLYGON_API_KEY` for Polygon.io data
### Feature Flags
- **JIXIA_MEMORY_BACKEND**: When set to "cloudflare", enables Cloudflare AutoRAG as memory backend
- **GOOGLE_GENAI_USE_VERTEXAI**: When set to "TRUE", enables Vertex AI for memory bank
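
The two flags above could be resolved into a single backend choice roughly as follows. This is a sketch under stated assumptions: the precedence (Cloudflare flag checked first), the case-insensitive comparison, and the returned labels, including the `"local_simulation"` default, are illustrative and not confirmed by the project.

```python
import os
from typing import Mapping, Optional


def select_memory_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Map the feature flags to a memory backend label."""
    env = os.environ if env is None else env
    # Assumption: an explicit JIXIA_MEMORY_BACKEND=cloudflare takes precedence.
    if env.get("JIXIA_MEMORY_BACKEND", "").strip().lower() == "cloudflare":
        return "cloudflare_autorag"
    # Assumption: the value is compared case-insensitively.
    if env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower() == "true":
        return "vertex_memory_bank"
    # Assumption: with neither flag set, fall back to a local simulation mode.
    return "local_simulation"
```

Passing the environment as a mapping keeps the function easy to test without touching `os.environ`.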
## Deployment Scenarios
### 1. Base Deployment (OpenBB Not Installed)
- **Characteristics**:
  - Lightweight installation
  - Relies on existing RapidAPI-based perpetual engine
  - UI falls back to demo or synthetic data in OpenBB tab
- **Use Cases**:
  - Minimal environment setups
  - Systems where OpenBB installation is not feasible
  - Development environments focusing on other features
### 2. Full Deployment (With OpenBB)
- **Characteristics**:
  - Enhanced data capabilities through OpenBB
  - Access to multiple data providers
  - Improved data quality and coverage
- **Use Cases**:
  - Production environments requiring comprehensive market data
  - Advanced financial analysis and debate scenarios
  - Integration with premium data sources
### 3. Hybrid Deployment (Selective Features)
- **Characteristics**:
  - Selective installation of OpenBB providers
  - Mix of OpenBB and perpetual engine data sources
  - Fallback mechanisms ensure continuous operation
- **Use Cases**:
  - Cost optimization by using free providers where possible
  - Gradual migration from perpetual engine to OpenBB
  - Testing new data sources without full commitment
## Containerization (Docker)
### Base Image
- Python 3.10-slim or equivalent
### Multi-Stage Build
1. **Builder Stage**:
   - Install build dependencies
   - Install Python dependencies
2. **Runtime Stage**:
   - Copy installed packages from builder
   - Copy application code
   - Install optional OpenBB dependencies if specified
### Docker Compose Configuration
- Service definitions for main application
- Optional service for database (if using persistent memory backends)
- Volume mounts for configuration and data persistence
## Cloud Deployment
### Google Cloud Platform (GCP)
- **App Engine**:
  - Flexible environment with a custom runtime (App Engine's standard environment does not support custom runtimes)
  - Environment variables configured through `app.yaml`
- **Cloud Run**:
  - Containerized deployment
  - Secrets managed through Secret Manager
- **Compute Engine**:
  - Full control over VM configuration
  - Persistent disks for data storage
### Considerations for Cloud Deployment
- **API Key Security**:
  - Use secret management services (Google Secret Manager, Doppler)
  - Never store keys in code or environment files
- **Memory Backend Configuration**:
  - For Vertex AI Memory Bank: Configure `GOOGLE_CLOUD_PROJECT_ID` and authentication
  - For Cloudflare AutoRAG: Configure `CLOUDFLARE_ACCOUNT_ID` and API token
- **Scalability**:
  - Stateless application design allows horizontal scaling
  - Memory backends provide persistence across instances
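
A deployment can fail fast when the memory backend settings are incomplete. The sketch below shows one possible startup check. `GOOGLE_CLOUD_PROJECT_ID` and `CLOUDFLARE_ACCOUNT_ID` come from this document, while `CLOUDFLARE_API_TOKEN`, the backend keys, and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Mapping

# Assumption: CLOUDFLARE_API_TOKEN is a hypothetical name for the API token
# variable; the document only says an API token must be configured.
REQUIRED_VARS: Dict[str, List[str]] = {
    "vertex": ["GOOGLE_CLOUD_PROJECT_ID"],
    "cloudflare": ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"],
}


def missing_settings(backend: str, env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty for a backend."""
    return [name for name in REQUIRED_VARS[backend] if not env.get(name)]
```

At startup the application could call `missing_settings("cloudflare", os.environ)` and refuse to enable the backend if the list is not empty.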
## Memory Backend Integration
### Vertex AI Memory Bank (Default/Primary)
- **Activation**: Requires `GOOGLE_GENAI_USE_VERTEXAI=TRUE` and proper GCP authentication
- **Dependencies**: `google-cloud-aiplatform` (installed with google-adk)
- **Deployment**: Requires GCP project with Vertex AI API enabled
### Cloudflare AutoRAG (Alternative)
- **Activation**: Requires JIXIA_MEMORY_BACKEND=cloudflare and Cloudflare credentials
- **Dependencies**: `aiohttp` (already in requirements)
- **Deployment**: Requires Cloudflare account with Vectorize and Workers AI enabled
## Monitoring and Observability
### Health Checks
- Application startup verification
- OpenBB availability check endpoint
- Memory backend connectivity verification
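
The three checks above could be aggregated into one report, for example for a container health probe. This is a sketch only: the function name, the status labels, and the split between required and optional checks are assumptions, chosen so that a missing optional component such as OpenBB marks the service as degraded but not down.

```python
from typing import Callable, Dict, Iterable


def run_health_checks(
    checks: Dict[str, Callable[[], bool]],
    required: Iterable[str] = (),
) -> Dict[str, object]:
    """Run named checks and summarize them as ok, degraded, or down."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as a failed check.
            results[name] = False

    if any(not results.get(name, False) for name in required):
        status = "down"
    elif all(results.values()):
        status = "ok"
    else:
        status = "degraded"
    return {"status": status, "checks": results}
```

Each check is a zero-argument callable, so the OpenBB availability check and the memory backend connectivity check can be plugged in without changing the aggregator.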
### Logging
- Structured logging for data access patterns
- Error tracking for failed data retrievals
- Performance metrics for data loading times
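
One way to cover all three logging points with a single helper is a context manager that emits a JSON record per data access. The sketch below uses only the standard library; the logger name `data_access` and the record fields are hypothetical.

```python
import json
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; the project may organize its loggers differently.
logger = logging.getLogger("data_access")


@contextmanager
def log_data_access(source: str, symbol: str):
    """Log one structured record with status and duration for a data access."""
    start = time.perf_counter()
    record = {"event": "data_access", "source": source, "symbol": symbol, "status": "ok"}
    try:
        yield record
    except Exception as exc:
        record["status"] = "error"
        record["error"] = type(exc).__name__
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
        level = logging.ERROR if record["status"] == "error" else logging.INFO
        logger.log(level, json.dumps(record))
```

Wrapping a fetch in `with log_data_access("openbb", "AAPL"):` then records the source, the outcome, and the load time in one line that log tooling can parse.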
### Metrics Collection
- API usage counters (both RapidAPI and OpenBB)
- Fallback trigger rates
- Memory backend operation statistics
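
As an illustration of the first two metrics, a small in-process collector might look like the following. The class and method names are hypothetical, and a production deployment would more likely export these values to a metrics system than keep them in memory.

```python
from collections import Counter


class DataMetrics:
    """In-memory counters for API usage and fallback triggers."""

    def __init__(self) -> None:
        self.calls: Counter = Counter()
        self.fallbacks = 0

    def record_call(self, provider: str) -> None:
        """Count one API call against a provider such as 'rapidapi' or 'openbb'."""
        self.calls[provider] += 1

    def record_fallback(self) -> None:
        """Count one occasion where a fallback data source was used."""
        self.fallbacks += 1

    def fallback_rate(self) -> float:
        """Fallbacks divided by total calls, or 0.0 when nothing was recorded."""
        total = sum(self.calls.values())
        return self.fallbacks / total if total else 0.0
```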
## Security Posture
### Data Security
- In-memory data processing
- No persistent storage of sensitive financial data
- Secure handling of API responses
### Access Control
- Streamlit authentication (if enabled)
- API key isolation per data provider
- Memory backend access controls through provider mechanisms
### Network Security
- HTTPS for all external API calls
- Outbound firewall rules for API endpoints
- Secure credential injection mechanisms
## Disaster Recovery and Business Continuity
### Data Source Redundancy
- Multiple API providers through OpenBB
- Fallback to perpetual engine when OpenBB fails
- Synthetic data generation for UI continuity
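
The redundancy order above (OpenBB, then the perpetual engine, then synthetic data) can be expressed as an ordered fallback chain. This is a minimal sketch, not the project's code: the function names, the list-of-floats data shape, and the random-walk generator are assumptions for illustration.

```python
import random
from typing import Callable, Iterable, List, Tuple

Fetcher = Callable[[str], List[float]]


def synthetic_prices(symbol: str, points: int = 5) -> List[float]:
    """Generate a deterministic random-walk series for UI continuity."""
    rng = random.Random(symbol)  # seeded by symbol so reruns look the same
    price = 100.0
    series = []
    for _ in range(points):
        price *= 1 + rng.uniform(-0.01, 0.01)
        series.append(round(price, 2))
    return series


def fetch_with_fallback(
    symbol: str, sources: Iterable[Tuple[str, Fetcher]]
) -> Tuple[str, List[float]]:
    """Try each source in order and return the first non-empty result."""
    for name, fetch in sources:
        try:
            data = fetch(symbol)
        except Exception:
            continue  # this source failed; try the next one
        if data:
            return name, data
    raise RuntimeError(f"no data source returned data for {symbol}")
```

Returning the source name alongside the data lets the UI label synthetic results clearly so they are not mistaken for market data.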
### Memory Backend Failover
- Local simulation mode when cloud backends are unavailable
- Graceful degradation of memory features
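
A local simulation mode could be wired in as a wrapper that routes operations to an in-process store whenever the cloud backend raises. The classes below are hypothetical stand-ins, assuming a backend interface with `add` and `search` methods; the real memory backends may expose a different interface.

```python
from typing import List


class LocalMemory:
    """In-process stand-in for a cloud memory backend (substring search only)."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def add(self, text: str) -> None:
        self._items.append(text)

    def search(self, query: str) -> List[str]:
        return [t for t in self._items if query.lower() in t.lower()]


class FailoverMemory:
    """Use the primary backend, degrading to the fallback when it fails."""

    def __init__(self, primary, fallback) -> None:
        self.primary = primary
        self.fallback = fallback
        self.degraded = False  # set once any operation had to fall back

    def add(self, text: str) -> None:
        try:
            self.primary.add(text)
        except Exception:
            self.degraded = True
            self.fallback.add(text)

    def search(self, query: str) -> List[str]:
        try:
            return self.primary.search(query)
        except Exception:
            self.degraded = True
            return self.fallback.search(query)
```

The `degraded` flag gives health checks and the UI a simple way to report that memory features are running in reduced mode.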
### Recovery Procedures
- Automated restart on critical failures
- Manual intervention procedures for configuration issues
- Rollback capabilities through version control
## Performance Optimization
### Caching Strategies
- OpenBB's internal caching mechanisms
- Streamlit's built-in caching for UI components
- Memory backend for persistent agent knowledge
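
To make the caching idea concrete, here is a plain-Python time-to-live cache sketch. In the Streamlit UI, `st.cache_data` with its `ttl` argument serves a similar purpose; the decorator below is a hypothetical helper shown only to illustrate the mechanism, and it supports hashable positional arguments only.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds: float, clock=time.monotonic):
    """Cache results per positional-argument tuple for ttl_seconds."""

    def decorator(func):
        store = {}  # maps args -> (timestamp, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper

    return decorator
```

A short TTL keeps market data reasonably fresh while avoiding repeated provider calls on every UI rerun.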
### Resource Management
- Asynchronous data loading where possible
- Memory-efficient data structures
- Connection pooling for API requests
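
Asynchronous loading with a bounded number of in-flight requests can be sketched with `asyncio` alone. The function name and the `fetch` coroutine signature are assumptions. The semaphore here limits concurrency and is not a connection pool; an HTTP client such as `aiohttp` handles pooling separately.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def load_all(
    symbols: Iterable[str],
    fetch: Callable[[str], Awaitable[object]],
    max_concurrent: int = 5,
) -> Dict[str, object]:
    """Fetch all symbols concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def load_one(symbol: str):
        async with semaphore:
            try:
                return symbol, await fetch(symbol)
            except Exception as exc:
                # Keep the failure in the result so one bad symbol does not
                # cancel the others.
                return symbol, exc

    pairs = await asyncio.gather(*(load_one(s) for s in symbols))
    return dict(pairs)
```

Callers can then separate successful results from exceptions and hand the failed symbols to a fallback source.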
### Scaling Considerations
- Horizontal scaling for handling concurrent users
- Vertical scaling for memory-intensive operations
- Load balancing for distributed deployments