# Management Infrastructure

## 🚨 Key Issue Log

### Nomad Consul KV Template Syntax Issue

**Problem description:**
Nomad cannot read configuration from Consul KV and fails with: `Missing: kv.block(config/dev/cloudflare/token)`

**Root cause:**
1. **The Nomad client has no Consul connection configured** - Nomad cannot reach Consul KV
2. **The template syntax is not at fault** - `{{ key "path/to/key" }}` is valid syntax
3. **The Consul KV data is not missing** - `config/dev/cloudflare/token` does exist

**Solution:**
1. **Temporary workaround** - hard-code the token in the configuration file
2. **Long-term fix** - configure the Nomad client to connect to Consul

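
As a rough illustration of the long-term fix, the sketch below prints a Nomad agent `consul` block and a job `template` block that reads the token with `key`. The helper names, the Consul address `127.0.0.1:8500`, the `/etc/nomad.d/consul.hcl` path, the `secrets/cloudflare.env` destination, and the `CF_DNS_API_TOKEN` variable name are all assumptions for illustration, not values taken from this repository.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and default values below are assumptions.

# Print a Nomad agent "consul" block pointing at the given Consul HTTP address.
render_nomad_consul_block() {
  local consul_addr="${1:-127.0.0.1:8500}"   # assumed: a local Consul agent
  cat <<EOF
consul {
  address = "${consul_addr}"
}
EOF
}

# Print a job "template" block that reads one Consul KV key into an env file.
#   $1 = Consul KV path, $2 = environment variable name
render_kv_template_block() {
  local kv_path="$1" env_name="$2"
  cat <<EOF
template {
  data        = <<EOT
${env_name}={{ key "${kv_path}" }}
EOT
  destination = "secrets/cloudflare.env"
  env         = true
}
EOF
}

# Example usage (not executed here; target path and systemd unit are hypothetical):
#   render_nomad_consul_block 127.0.0.1:8500 | sudo tee /etc/nomad.d/consul.hcl
#   sudo systemctl restart nomad
#   render_kv_template_block config/dev/cloudflare/token CF_DNS_API_TOKEN
```
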
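**Core requirements:**
- **Centralized storage** → Consul KV stores all sensitive configuration
- **Distributed deployment** → Nomad reads configuration from Consul and deploys to multiple nodes
- **Direct reads** → the Nomad template system reads configuration straight from Consul KV

**Current status:**
- ✅ Consul KV storage works
- ✅ The Traefik service is running normally
- ❌ Nomad cannot read Consul KV (the connection still needs to be configured)

**Next steps:**
1. Configure the Nomad client to connect to Consul
2. Restore the template syntax that reads configuration from Consul KV
3. Achieve truly centralized configuration management

---
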
## 🎯 Traefik Configuration Architecture: Best Practice for Separating Configuration from the Application

### ⚠️ Important: avoid clumsy operations

**❌ Wrong approach (looks amateurish):**
- Editing the Nomad job file to add a new domain
- Redeploying the whole Traefik service
- Embedding configuration in the application definition

**✅ Right approach (clean and professional):**

### Separated configuration file architecture

**1. Configuration file locations:**
- **Dynamic configuration**: `/root/mgmt/components/traefik/config/dynamic.yml`
- **Application configuration**: `/root/mgmt/components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad`

**2. Key features:**
- ✅ **Hot reload**: Traefik is configured with the `file` provider and `watch: true`
- ✅ **Takes effect automatically**: changes to the YAML file are picked up without a restart
- ✅ **Separation of concerns**: configuration and application are fully separated, in line with best practice

**3. Workflow for adding a new domain:**
```bash
# Only the configuration file needs editing
vim /root/mgmt/components/traefik/config/dynamic.yml

# Add a new router entry (YAML, shown as comments because it belongs in dynamic.yml, not the shell)
# routers:
#   new-service-ui:
#     rule: "Host(`new-service.git-4ta.live`)"
#     service: new-service-cluster
#     entryPoints:
#       - websecure
#     tls:
#       certResolver: cloudflare

# Takes effect as soon as the file is saved; no restart needed!
```
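
For a router like the one in the workflow above to resolve, Traefik's file provider also expects a matching entry under `http.services`. The sketch below prints a complete fragment for one new domain. The helper name `render_route` and the backend URL are hypothetical, and the output still has to be merged by hand under the existing `http.routers` and `http.services` keys in `dynamic.yml`.

```bash
#!/usr/bin/env bash
# Sketch only: render_route and its arguments are illustrative, not part of this repo.

# Print a Traefik dynamic-config fragment (router + service) for one domain.
#   $1 = short service name, $2 = public hostname, $3 = backend URL
render_route() {
  local name="$1" host="$2" backend="$3"
  cat <<EOF
http:
  routers:
    ${name}-ui:
      rule: "Host(\`${host}\`)"
      service: ${name}-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
  services:
    ${name}-cluster:
      loadBalancer:
        servers:
          - url: "${backend}"
EOF
}

# Example usage (the backend address is an assumption):
#   render_route new-service new-service.git-4ta.live http://10.0.0.10:8080
```
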
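
**4. Architecture advantages:**
- 🚀 **Zero downtime**: configuration changes do not require a service restart
- 🔧 **Flexible management**: configuration and application are managed independently
- 📝 **Version control**: configuration files can be versioned on their own
- 🎯 **Professional standard**: in line with modern DevOps best practice

**Remember: separating configuration from the application is a core principle of modern infrastructure management!**

---
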
## Architecture Overview

### Centralized + distributed architecture

**Centralized storage:**
- **Consul KV** → stores all sensitive configuration (tokens, certificates, keys)
- **Consul Service Discovery** → service registration and discovery
- **Consul Health Checks** → service health checks

**Distributed deployment:**
- **Asia node** → `warden.tailnet-68f9.ts.net` (Beijing)
- **Asia node** → `ch4.tailnet-68f9.ts.net` (South Korea)
- **Americas node** → `ash3c.tailnet-68f9.ts.net` (United States)

### Service endpoints

- `https://consul.git-4ta.live` → Consul UI
- `https://traefik.git-4ta.live` → Traefik Dashboard
- `https://nomad.git-4ta.live` → Nomad UI
- `https://vault.git-4ta.live` → Vault UI
- `https://waypoint.git-4ta.live` → Waypoint UI
- `https://authentik.git-4ta.live` → Authentik identity and authentication

### Technology stack

- **Nomad** → workload orchestration
- **Consul** → service discovery and configuration management
- **Traefik** → reverse proxy and load balancing
- **Cloudflare** → DNS and SSL certificate management
- **Waypoint** → application deployment platform
- **Authentik** → identity authentication and authorization management

---

## Deployment Status

### ✅ Completed
- [x] Cloudflare token stored in Consul KV
- [x] Wildcard DNS `*.git-4ta.live` configured
- [x] Traefik configured and deployed
- [x] Automatic SSL certificate issuance
- [x] All service endpoints configured
- [x] Vault migrated to Nomad management
- [x] Vault three-node high-availability deployment
- [x] Waypoint server deployed and bootstrapped
- [x] Waypoint auth token obtained and stored
- [x] Nomad job configurations backed up to Consul KV
- [x] Authentik container deployed and SSH keys configured
- [x] Traefik configuration architecture optimized (configuration separated from the application)

### ⚠️ Pending
- [ ] Nomad client Consul connection configuration
- [ ] Restore reading configuration from Consul KV
- [ ] Achieve truly centralized configuration management

---

## Quick Start

### Check service status
```bash
# Check all services
curl -k -I https://consul.git-4ta.live
curl -k -I https://traefik.git-4ta.live
curl -k -I https://nomad.git-4ta.live
curl -k -I https://waypoint.git-4ta.live
```
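
To check several endpoints in one pass and get a non-zero exit code when any of them fails, a loop along these lines may help. The function name `check_endpoints` is hypothetical. It assumes `curl` is installed and treats any 2xx or 3xx response as healthy, which is a simplification (a login redirect also counts as up).

```bash
#!/usr/bin/env bash
# Sketch only: check_endpoints is an illustrative helper, not part of this repo.

# Probe https://<service>.<domain> for each service name and print its HTTP status.
#   $1 = base domain, remaining args = service subdomains
# Returns 1 if any endpoint answers outside the 2xx/3xx range or is unreachable.
check_endpoints() {
  local domain="$1"; shift
  local svc code rc=0
  for svc in "$@"; do
    # -k mirrors the manual curl checks; drop it once certificates are trusted.
    code="$(curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 \
      "https://${svc}.${domain}")" || code="000"
    printf '%-10s %s\n' "$svc" "$code"
    case "$code" in
      2??|3??) ;;      # treated as healthy
      *) rc=1 ;;       # anything else marks the run as failed
    esac
  done
  return "$rc"
}

# Example usage (service names taken from the endpoint list in this README):
#   check_endpoints git-4ta.live consul traefik nomad vault waypoint authentik
```
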

### Deploy Traefik
```bash
cd /root/mgmt
nomad job run components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad
```

### Manage Traefik configuration (recommended)
```bash
# Adding a new domain only requires editing the configuration file
vim /root/mgmt/components/traefik/config/dynamic.yml

# Changes take effect automatically after saving; no restart needed!
# That is the elegance of separating configuration from the application
```

### Check Consul KV
```bash
consul kv get config/dev/cloudflare/token
consul kv get -recurse config/
```

### Backup management
```bash
# View the backup index
consul kv get backup/nomad-jobs/index

# View metadata for the latest backup
consul kv get backup/nomad-jobs/20251004/metadata

# Restore a backup
consul kv get backup/nomad-jobs/20251004/data > restore.tar.gz
tar -xzf restore.tar.gz
```
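
Because the restored archive comes straight out of a KV value, it may be worth confirming that it is a readable gzip tarball before unpacking it. The helper below is a minimal sketch. The name `safe_extract` and the destination directory are hypothetical, and it assumes the backup was stored as a raw (not base64-encoded) `.tar.gz`.

```bash
#!/usr/bin/env bash
# Sketch only: safe_extract is an illustrative helper, not part of this repo.

# Verify that an archive lists cleanly before extracting it into a directory.
#   $1 = path to the .tar.gz file, $2 = destination directory
safe_extract() {
  local archive="$1" dest="$2"
  # Listing the archive fails fast on truncated or non-gzip data.
  if ! tar -tzf "$archive" >/dev/null 2>&1; then
    echo "not a readable gzip tarball: $archive" >&2
    return 1
  fi
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
}

# Example usage (the destination directory is an assumption):
#   consul kv get backup/nomad-jobs/20251004/data > restore.tar.gz
#   safe_extract restore.tar.gz ./restored-jobs
```
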

---

## Important Files

- `components/traefik/config/dynamic.yml` → **Traefik dynamic configuration file (recommended)**
- `components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad` → Traefik Nomad job configuration
- `README-Traefik.md` → **Traefik configuration management guide (required reading)**
- `infrastructure/opentofu/environments/dev/` → Terraform infrastructure configuration
- `deployment/ansible/inventories/production/hosts` → server inventory
- `README-Vault.md` → Vault configuration and usage
- `README-Waypoint.md` → Waypoint configuration and usage
- `README-Backup.md` → backup management and restore instructions
- `nomad-jobs/vault-cluster.nomad` → Vault Nomad job configuration
- `waypoint-server.nomad` → Waypoint Nomad job configuration

---

## 🔧 Service Initialization Notes

### Vault initialization

**Current status:** Vault uses the local `file` storage backend and needs to be initialized

**Initialization steps:**
```bash
# 1. Check Vault status
curl -s http://warden.tailnet-68f9.ts.net:8200/v1/sys/health

# 2. Initialize Vault (if the check returns "no available server")
vault operator init -address=http://warden.tailnet-68f9.ts.net:8200

# 3. Save the unseal keys and root token
# 4. Unseal Vault
vault operator unseal -address=http://warden.tailnet-68f9.ts.net:8200 <unseal-key-1>
vault operator unseal -address=http://warden.tailnet-68f9.ts.net:8200 <unseal-key-2>
vault operator unseal -address=http://warden.tailnet-68f9.ts.net:8200 <unseal-key-3>
```
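
The `/v1/sys/health` endpoint used in step 1 also signals Vault's state through its HTTP status code, which can be easier to script against than the JSON body. The sketch below maps Vault's default status codes to a short description. The function name `vault_health_meaning` is hypothetical, and the mapping assumes the endpoint is queried without parameters that override the default codes.

```bash
#!/usr/bin/env bash
# Sketch only: vault_health_meaning is an illustrative helper, not part of this repo.

# Translate the default HTTP status codes of Vault's /v1/sys/health endpoint.
#   $1 = HTTP status code as returned by curl's %{http_code}
vault_health_meaning() {
  case "$1" in
    200) echo "initialized, unsealed, active" ;;
    429) echo "unsealed, standby" ;;
    472) echo "disaster recovery secondary" ;;
    473) echo "performance standby" ;;
    501) echo "not initialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown or unreachable" ;;
  esac
}

# Example usage (not executed here):
#   code="$(curl -s -o /dev/null -w '%{http_code}' \
#     http://warden.tailnet-68f9.ts.net:8200/v1/sys/health)"
#   vault_health_meaning "$code"
```
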

**🔑 Vault key information (final initialization, 2025-10-04):**
```
Unseal Key 1: 5XQ6vSekewZj9SigcIS8KcpnsOyEzgG5UFe/mqPVXkre
Unseal Key 2: vmLu+Ry+hajWjQhX3YVnZG72aZRn5cowcUm5JIVtv/kR
Unseal Key 3: 3eDhfnHZnG9OT6RFOhpoK/aO5TghPypz4XPlXxFMm52F
Unseal Key 4: LWGkYB7qD3GPPc/nRuqKmMUiQex8ygYF1BkSXA1Tov3J
Unseal Key 5: rIidFy7d/SxcPOCrNy569VZ86I56oMQxqL7qVgM+PYPy

Root Token: hvs.OgVR2hEihbHM7qFxtFr7oeo3
```

**Configuration notes:**
- **Storage**: file (local filesystem)
- **Path**: `/opt/nomad/data/vault-storage` (persistent storage)
- **Port**: 8200
- **UI**: enabled
- **Important**: persistent storage is configured, so the keys are not lost on restart

### Waypoint initialization

**Current status:** Waypoint is running normally but may need to be re-initialized

**Initialization steps:**
```bash
# 1. Check Waypoint status
curl -I https://waypoint.git-4ta.live

# 2. Re-initialize if needed
waypoint server init -server-addr=https://waypoint.git-4ta.live

# 3. Configure the Waypoint CLI
waypoint auth login -server-addr=https://waypoint.git-4ta.live
```

**Configuration notes:**
- **Storage**: local database `/opt/waypoint/waypoint.db`
- **Port**: HTTP 9701, gRPC 9702
- **UI**: enabled

### Consul service registration

**Registered services:**
- ✅ **vault**: `vault.git-4ta.live` (tags: vault, secrets, kv)
- ✅ **waypoint**: `waypoint.git-4ta.live` (tags: waypoint, ci-cd, deployment)
- ✅ **consul**: `consul.git-4ta.live` (tags: consul, service-discovery)
- ✅ **traefik**: `traefik.git-4ta.live` (tags: traefik, proxy, load-balancer)
- ✅ **nomad**: `nomad.git-4ta.live` (tags: nomad, scheduler, orchestrator)

**Health checks:**
- **vault**: `/v1/sys/health`
- **waypoint**: `/`
- **consul**: `/v1/status/leader`
- **traefik**: `/ping`
- **nomad**: `/v1/status/leader`

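
A registration for one of the services listed above could be expressed as a Consul service definition with an HTTP check on the listed health path. The sketch below prints such a definition in HCL. The function name `render_service_def`, the check interval and timeout, and the example address are assumptions rather than values from this setup.

```bash
#!/usr/bin/env bash
# Sketch only: render_service_def is an illustrative helper, not part of this repo.

# Print a Consul service definition with one HTTP health check.
#   $1 = service name, $2 = port, $3 = health check URL, remaining args = tags
render_service_def() {
  local name="$1" port="$2" check_url="$3"; shift 3
  local tags="" t
  for t in "$@"; do
    tags="${tags:+$tags, }\"$t\""    # build a quoted, comma-separated tag list
  done
  cat <<EOF
service {
  name = "${name}"
  port = ${port}
  tags = [${tags}]

  check {
    http     = "${check_url}"
    interval = "10s"
    timeout  = "2s"
  }
}
EOF
}

# Example usage (not executed here; interval, timeout, and file name are assumptions):
#   render_service_def vault 8200 \
#     http://warden.tailnet-68f9.ts.net:8200/v1/sys/health vault secrets kv > vault.hcl
#   consul services register vault.hcl
```
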
---

**Last updated:** 2025-10-08 02:55 UTC  
**Status:** Services are running normally, the Traefik configuration architecture has been optimized, and Authentik is integrated