Commit: eab95c8c80 (parent: 1c994f9f60)
@ -0,0 +1,162 @@
# Nomad Jobs Backup Management

This document explains how to manage and restore backups of the Nomad job configurations.

## 📁 Backup Storage Locations

### Local Backups
- **Path**: `/root/mgmt/backups/nomad-jobs-YYYYMMDD-HHMMSS/`
- **Archive**: `/root/mgmt/nomad-jobs-backup-YYYYMMDD.tar.gz`

### Consul KV Backups
- **Data**: `backup/nomad-jobs/YYYYMMDD/data`
- **Metadata**: `backup/nomad-jobs/YYYYMMDD/metadata`
- **Index**: `backup/nomad-jobs/index`

## 📋 Current Backups

### Backup of 2025-10-04
- **Backup time**: 2025-10-04 07:44:11
- **Backup type**: full Nomad jobs configuration
- **File count**: 25 `.nomad` files
- **Original size**: 208 KB
- **Compressed size**: 13 KB
- **Consul KV path**: `backup/nomad-jobs/20251004/data`

#### Service Status
- ✅ **Traefik** (`traefik-cloudflare-v1`) - SSL certificates healthy
- ✅ **Vault** (`vault-cluster`) - three-node high-availability cluster
- ✅ **Waypoint** (`waypoint-server`) - Web UI reachable

#### Domains and Certificates
- **Domain**: `*.git4ta.me`
- **Certificates**: Let's Encrypt (Cloudflare DNS challenge)
- **Status**: all certificates valid

## 🔧 Backup Management Commands

### List Backups
```bash
# View the backup index in Consul KV
consul kv get backup/nomad-jobs/index

# View the metadata of a specific backup
consul kv get backup/nomad-jobs/20251004/metadata
```

### Restore a Backup
```bash
# Pull the backup out of Consul KV
consul kv get backup/nomad-jobs/20251004/data > nomad-jobs-backup-20251004.tar.gz

# Unpack the backup
tar -xzf nomad-jobs-backup-20251004.tar.gz

# Inspect the backup contents
ls -la backups/nomad-jobs-20251004-074411/
```
### Create a New Backup
```bash
# Capture the timestamps once so every step refers to the same directory
# (re-evaluating $(date ...) in each command can straddle a second boundary)
STAMP=$(date +%Y%m%d-%H%M%S)
DAY=$(date +%Y%m%d)

# Create the local backup directory
mkdir -p "backups/nomad-jobs-${STAMP}"

# Back up the current configuration
cp -r components "backups/nomad-jobs-${STAMP}/"
cp -r nomad-jobs "backups/nomad-jobs-${STAMP}/"
cp waypoint-server.nomad "backups/nomad-jobs-${STAMP}/"

# Compress the backup
tar -czf "nomad-jobs-backup-${DAY}.tar.gz" "backups/nomad-jobs-${STAMP}/"

# Store it in Consul KV
consul kv put "backup/nomad-jobs/${DAY}/data" "@nomad-jobs-backup-${DAY}.tar.gz"
```
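The `metadata` and `index` keys referenced above should be written in the same run. A minimal sketch of that bookkeeping, assuming the metadata value is a small JSON blob and the index is a newline-separated list of dates (the exact formats stored in Consul KV are not documented here):

```bash
#!/usr/bin/env bash
set -euo pipefail
DAY=$(date +%Y%m%d)

# Record metadata for today's backup (hypothetical JSON layout)
FILES=$(tar -tzf "nomad-jobs-backup-${DAY}.tar.gz" | wc -l)
consul kv put "backup/nomad-jobs/${DAY}/metadata" \
  "{\"created\":\"$(date -u +%FT%TZ)\",\"files\":${FILES}}"

# Merge today's date into the backup index, keeping entries unique
OLD_INDEX=$(consul kv get backup/nomad-jobs/index 2>/dev/null || true)
printf '%s\n%s\n' "${OLD_INDEX}" "${DAY}" | sed '/^$/d' | sort -u |
  consul kv put backup/nomad-jobs/index -
```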
## 📊 Backup Strategy

### Backup Frequency
- **Automatic backups**: recommended once a week
- **Before major changes**: before deploying a new service or making significant configuration changes
- **Emergencies**: back up the current state immediately when a service misbehaves

### Backup Contents
- All `.nomad` files
- Configuration file templates
- Service dependency relationships
- Network and storage configuration

### Backup Verification
```bash
# Verify backup integrity
tar -tzf nomad-jobs-backup-20251004.tar.gz | wc -l

# Check for the key files
tar -tzf nomad-jobs-backup-20251004.tar.gz | grep -E "(traefik|vault|waypoint)"
```

## 🚨 Recovery Procedure

### Emergency Recovery
1. **Stop all services**
```bash
nomad job stop traefik-cloudflare-v1
nomad job stop vault-cluster
nomad job stop waypoint-server
```

2. **Restore the backup**
```bash
consul kv get backup/nomad-jobs/20251004/data > restore.tar.gz
tar -xzf restore.tar.gz
```

3. **Redeploy**
```bash
nomad job run backups/nomad-jobs-20251004-074411/components/traefik/jobs/traefik-cloudflare.nomad
nomad job run backups/nomad-jobs-20251004-074411/nomad-jobs/vault-cluster.nomad
nomad job run backups/nomad-jobs-20251004-074411/waypoint-server.nomad
```
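If the backup contains more jobs than the three above, every job file in the restored tree can be resubmitted in one pass; a sketch, assuming all `.nomad` files captured in the backup are meant to be running:

```bash
# Submit every job definition found in the restored backup directory
find backups/nomad-jobs-20251004-074411 -name '*.nomad' -print0 |
  while IFS= read -r -d '' job; do
    nomad job run "$job"
  done
```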
### Partial Recovery
```bash
# Restore only a specific service
cp backups/nomad-jobs-20251004-074411/components/traefik/jobs/traefik-cloudflare.nomad components/traefik/jobs/
nomad job run components/traefik/jobs/traefik-cloudflare.nomad
```

## 📝 Backup Records

| Date | Backup Type | Service Status | Size | Consul KV Path |
|------|-------------|----------------|------|----------------|
| 2025-10-04 | Full backup | All running | 13 KB | `backup/nomad-jobs/20251004/data` |

## ⚠️ Caveats

1. **Certificate backups**: the SSL certificates live inside the container and are lost on restart
2. **Consul KV**: important configuration lives in Consul KV and must be backed up separately
3. **Network configuration**: the Tailscale network configuration must be recorded separately
4. **Credential safety**: the Vault and Waypoint credentials are stored in Consul KV

## 🔍 Troubleshooting

### Corrupted Backup
```bash
# Check the integrity of the backup file
tar -tzf nomad-jobs-backup-20251004.tar.gz > /dev/null && echo "backup intact" || echo "backup corrupted"
```

### Consul KV Access Problems
```bash
# Check the Consul connection
consul members

# Check the KV store
consul kv get backup/nomad-jobs/index
```

---

**Last updated**: 2025-10-04 07:45:00
**Backup status**: ✅ the current backup is complete and usable
**Service status**: ✅ all services running normally
@ -0,0 +1,166 @@
# Traefik Configuration Management Guide

## 🎯 Best Practice: Separate Configuration from the Application

### ⚠️ Important: Avoid Amateurish Shortcuts

**❌ Wrong approach (sloppy):**
- Editing the Nomad job file to add a new domain
- Redeploying the entire Traefik service
- Embedding configuration inside the application definition

**✅ Right approach (clean and professional):**

## Separated Configuration Architecture

### 1. Configuration File Locations

- **Dynamic configuration**: `/root/mgmt/components/traefik/config/dynamic.yml`
- **Application configuration**: `/root/mgmt/components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad`

### 2. Key Properties

- ✅ **Hot reload**: Traefik is configured with the `file` provider and `watch: true`
- ✅ **Automatic effect**: edits to the YAML configuration file take effect automatically, no restart needed
- ✅ **Separation**: configuration is fully decoupled from the application, matching best practice

### 3. Workflow for Adding a New Domain

```bash
# Only the configuration file needs editing
vim /root/mgmt/components/traefik/config/dynamic.yml
```

Add the new service and router entries; saving the file is enough, no restart required:

```yaml
# Add the new service
services:
  new-service-cluster:
    loadBalancer:
      servers:
        - url: "https://new-service.tailnet-68f9.ts.net:8080"
      healthCheck:
        path: "/health"
        interval: "30s"
        timeout: "15s"

# Add the new router
routers:
  new-service-ui:
    rule: "Host(`new-service.git-4ta.live`)"
    service: new-service-cluster
    entryPoints:
      - websecure
    tls:
      certResolver: cloudflare
```
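Once saved, the file provider should pick the new router up within seconds; a quick sanity check against the Traefik API documented in the troubleshooting section below:

```bash
# Confirm the new router was loaded by the file provider
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/http/routers | grep -i new-service
```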
### 4. Architectural Advantages

- 🚀 **Zero downtime**: configuration changes require no service restart
- 🔧 **Flexible management**: configuration and application are managed independently
- 📝 **Version control**: the configuration file can be versioned on its own
- 🎯 **Professional standard**: matches modern DevOps best practice

## Current Service Configuration

### Configured Services

1. **Consul cluster**
   - Domain: `consul.git-4ta.live`
   - Backend: multi-node load balancing
   - Health check: `/v1/status/leader`

2. **Nomad cluster**
   - Domain: `nomad.git-4ta.live`
   - Backend: multi-node load balancing
   - Health check: `/v1/status/leader`

3. **Waypoint service**
   - Domain: `waypoint.git-4ta.live`
   - Backend: `hcp1.tailnet-68f9.ts.net:9701`
   - Protocol: HTTPS (certificate verification skipped)

4. **Vault service**
   - Domain: `vault.git-4ta.live`
   - Backend: `warden.tailnet-68f9.ts.net:8200`
   - Health check: `/ui/`

5. **Authentik service**
   - Domain: `authentik.git-4ta.live`
   - Backend: `authentik.tailnet-68f9.ts.net:9443`
   - Protocol: HTTPS (certificate verification skipped)
   - Health check: `/flows/-/default/authentication/`

6. **Traefik dashboard**
   - Domain: `traefik.git-4ta.live`
   - Service: built-in dashboard

### SSL Certificate Management

- **Certificate resolver**: Cloudflare DNS challenge
- **Automatic renewal**: Let's Encrypt certificates are managed automatically
- **Storage location**: `/opt/traefik/certs/acme.json`
- **Forced HTTPS**: all HTTP requests are redirected to HTTPS

## Troubleshooting

### Check Service Status

```bash
# Check the Traefik API
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/overview

# Check the router configuration
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/http/routers

# Check the service configuration
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/http/services
```

### Check Certificate Status

```bash
# Check the SSL certificate
openssl s_client -connect consul.git-4ta.live:443 -servername consul.git-4ta.live < /dev/null 2>/dev/null | openssl x509 -noout -subject -issuer

# Check the certificate file
ssh root@hcp1 "cat /opt/traefik/certs/acme.json | jq '.cloudflare.Certificates'"
```

### View Logs

```bash
# Tail the Traefik logs (picks an allocation of the job)
nomad alloc logs -tail -job traefik-cloudflare-v1

# Filter for errors
nomad alloc logs -tail -job traefik-cloudflare-v1 | grep -i "error\|warn\|fail"
```

## Best Practices

1. **Configuration management**
   - Always manage routing configuration through `dynamic.yml`
   - Avoid editing the Nomad job file
   - Keep the configuration file under version control

2. **Service discovery**
   - Prefer Tailscale network addresses
   - Configure appropriate health checks
   - Use HTTPS (skipping verification for self-signed certificates)

3. **SSL certificates**
   - Rely on the Cloudflare DNS challenge
   - Monitor automatic certificate renewal
   - Check certificate status regularly

4. **Monitoring and logs**
   - Enable Traefik API monitoring
   - Configure access logs
   - Check service health regularly

## Remember

**Separating configuration from the application is a core principle of modern infrastructure management!**

This architecture not only improves flexibility and maintainability, it also reflects professional DevOps practice.
@ -0,0 +1,120 @@
# Vault Configuration

## Overview
Vault has been migrated under Nomad management. It runs on the ch4, ash3c, and warden nodes in a high-availability deployment.

## Access Information

### Vault Service Addresses
- **Active node**: `http://100.117.106.136:8200` (ch4)
- **Standby node**: `http://100.116.80.94:8200` (ash3c)
- **Standby node**: `http://100.122.197.112:8200` (warden)
- **Web UI**: `http://100.117.106.136:8200/ui`

### Credentials
- **Unseal Key**: `/iHuxLbHWmx5xlJhqaTUMniiRc71eO1UAwNJj/lDWow=`
- **Root Token**: `hvs.dHtno0cCpWtFYMCvJZTgGmfn`

## Usage

### Environment Variables
```bash
export VAULT_ADDR=http://100.117.106.136:8200
export VAULT_TOKEN=hvs.dHtno0cCpWtFYMCvJZTgGmfn
```

### Basic Commands
```bash
# Check Vault status
vault status

# If Vault is sealed, unseal it with the unseal key
vault operator unseal /iHuxLbHWmx5xlJhqaTUMniiRc71eO1UAwNJj/lDWow=

# Log in to the Vault CLI with the token
vault login hvs.dHtno0cCpWtFYMCvJZTgGmfn
```
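Both credentials are also mirrored in Consul KV (see Storage Locations below), so a shell can be set up without pasting the values by hand; a small sketch:

```bash
# Pull the credentials straight from Consul KV
export VAULT_ADDR=http://100.117.106.136:8200
export VAULT_TOKEN=$(consul kv get vault/root-token)

# Unseal with the stored key if needed
vault operator unseal "$(consul kv get vault/unseal-key)"
```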
## Storage Locations

### Consul KV
- **Unseal Key**: `vault/unseal-key`
- **Root Token**: `vault/root-token`
- **Configuration**: `vault/config/dev`

### Local Backups
- **Backup directory**: `/root/vault-backup/`
- **Initialization script**: `/root/mgmt/scripts/vault-init.sh`

## Deployment Information

### Nomad Job
- **Job name**: `vault-cluster-nomad`
- **Job file**: `/root/mgmt/nomad-jobs/vault-cluster.nomad`
- **Deployment nodes**: ch4, ash3c, warden
- **Parallel deployment**: runs on all 3 nodes at once

### Configuration Highlights
- **Storage backend**: Consul
- **High availability**: enabled
- **Seal type**: Shamir
- **Key shares**: 1
- **Threshold**: 1

## Troubleshooting

### If Vault Is Sealed
```bash
# 1. Check the status
vault status

# 2. Unseal every node with the unseal key
# ch4
export VAULT_ADDR=http://100.117.106.136:8200
vault operator unseal /iHuxLbHWmx5xlJhqaTUMniiRc71eO1UAwNJj/lDWow=

# ash3c
export VAULT_ADDR=http://100.116.80.94:8200
vault operator unseal /iHuxLbHWmx5xlJhqaTUMniiRc71eO1UAwNJj/lDWow=

# warden
export VAULT_ADDR=http://100.122.197.112:8200
vault operator unseal /iHuxLbHWmx5xlJhqaTUMniiRc71eO1UAwNJj/lDWow=

# 3. Verify the unseal state
vault status
```
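The same three steps collapse into a loop over the node addresses; a sketch:

```bash
# Unseal every node with the key stored in Consul KV
KEY=$(consul kv get vault/unseal-key)
for addr in http://100.117.106.136:8200 \
            http://100.116.80.94:8200 \
            http://100.122.197.112:8200; do
  VAULT_ADDR="$addr" vault operator unseal "$KEY"
done
```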
### If You Lose the Credentials
```bash
# Fetch them from Consul KV
consul kv get vault/unseal-key
consul kv get vault/root-token
```

### Restarting the Vault Service
```bash
# Restart the Nomad job
nomad job restart vault-cluster-nomad

# Or restart a specific allocation
nomad alloc restart <allocation-id>
```

## Security Notes

⚠️ **Important**:
- Keep the unseal key and root token safe
- Do not use the root token for day-to-day operations in production
- Create users and policies with appropriately scoped permissions
- Rotate keys and tokens regularly

## Change History

- **2025-10-04**: migrated Vault under Nomad management
- **2025-10-04**: re-initialized Vault and obtained fresh credentials
- **2025-10-04**: tuned the deployment strategy to run on three nodes in parallel

---
*Last updated: 2025-10-04*
*Maintainer: ben*
@ -0,0 +1,157 @@
# Waypoint Configuration and Usage Guide

## Service Information

- **Server address**: `hcp1.tailnet-68f9.ts.net:9702` (gRPC)
- **HTTP API**: `hcp1.tailnet-68f9.ts.net:9701` (HTTPS)
- **Web UI**: `https://waypoint.git4ta.me/auth/token`

## Authentication

### Auth Token
```
3K4wQUdH1dfES7e2KRygoJ745wgjDCG6X7LmLCAseEs3a5jrK185Yk4ZzYQUDvwEacPTfaF5hbUW1E3JNA7fvMthHWrkAFyRZoocmjCqj72YfJRzXW7KsurdSoMoKpEVJyiWRxPAg3VugzUx
```

### Token Storage
- **Consul KV**: `waypoint/auth-token`
- **Retrieval command**: `consul kv get waypoint/auth-token`

## Access Methods

### 1. Web UI
```
https://waypoint.git4ta.me/auth/token
```
Log in with the auth token above.

### 2. CLI
```bash
# Create a context
waypoint context create \
  -server-addr=hcp1.tailnet-68f9.ts.net:9702 \
  -server-tls-skip-verify \
  -set-default waypoint-server

# Verify the connection
waypoint server info
```

### 3. Using the Auth Token
```bash
# Set the environment variable
export WAYPOINT_TOKEN="3K4wQUdH1dfES7e2KRygoJ745wgjDCG6X7LmLCAseEs3a5jrK185Yk4ZzYQUDvwEacPTfaF5hbUW1E3JNA7fvMthHWrkAFyRZoocmjCqj72YfJRzXW7KsurdSoMoKpEVJyiWRxPAg3VugzUx"

# Or pass the -server-auth-token flag
waypoint server info -server-auth-token="$WAYPOINT_TOKEN"
```
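Since the token is mirrored in Consul KV, it can also be loaded without pasting it; a sketch:

```bash
# Load the token from Consul KV and verify connectivity
export WAYPOINT_TOKEN=$(consul kv get waypoint/auth-token)
waypoint server info
```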
## Service Configuration

### Nomad Job
- **File**: `/root/mgmt/waypoint-server.nomad`
- **Node**: `hcp1.tailnet-68f9.ts.net`
- **Database**: `/opt/waypoint/waypoint.db`
- **gRPC port**: 9702
- **HTTP port**: 9701

### Traefik Routing
- **Domain**: `waypoint.git4ta.me`
- **Backend**: `https://hcp1.tailnet-68f9.ts.net:9701`
- **TLS**: certificate verification skipped (`insecureSkipVerify: true`)

## Common Commands

### Server Management
```bash
# Check server status
waypoint server info

# Get the server cookie
waypoint server cookie

# Create a snapshot backup
waypoint server snapshot
```

### Project Management
```bash
# List all projects
waypoint list projects

# Initialize a new project
waypoint init

# Deploy an application
waypoint up

# Check deployment status
waypoint list deployments
```

### Application Management
```bash
# List applications
waypoint list apps

# View application logs
waypoint logs -app=<app-name>

# Run a command inside an application
waypoint exec -app=<app-name> <command>
```

## Troubleshooting

### 1. Connection Problems
```bash
# Check that the server is running
nomad job status waypoint-server

# Check that the ports are listening
netstat -tlnp | grep 970
```

### 2. Authentication Problems
```bash
# Re-bootstrap the server (this generates a new token)
nomad job stop waypoint-server
ssh hcp1.tailnet-68f9.ts.net "rm -f /opt/waypoint/waypoint.db"
nomad job run /root/mgmt/waypoint-server.nomad
waypoint server bootstrap -server-addr=hcp1.tailnet-68f9.ts.net:9702 -server-tls-skip-verify
```
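Assuming `waypoint server bootstrap` prints only the new token on stdout (worth verifying on your version), the stale copy in Consul KV can be refreshed in the same step; a sketch:

```bash
# Capture the new token from bootstrap and store it back in Consul KV
NEW_TOKEN=$(waypoint server bootstrap \
  -server-addr=hcp1.tailnet-68f9.ts.net:9702 -server-tls-skip-verify)
consul kv put waypoint/auth-token "$NEW_TOKEN"
```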
### 3. Web UI Access Problems
- Make sure the correct path is used: `/auth/token`
- Check the Traefik routing configuration
- Verify that the SSL certificate is valid

## Integrations

### With Nomad
```bash
# Configure Nomad as the runtime platform
waypoint config source-set -type=nomad nomad-platform \
  addr=http://localhost:4646
```

### With Vault
```bash
# Configure the Vault integration
waypoint config source-set -type=vault vault-secrets \
  addr=http://localhost:8200 \
  token=<vault-token>
```

## Security Notes

1. **Token protection**: the auth token grants full access; guard it carefully
2. **Network exposure**: the server listens on all interfaces; make sure the firewall is configured correctly
3. **TLS verification**: the current setup skips TLS verification; enable it for production
4. **Backups**: back up the `/opt/waypoint/waypoint.db` database file regularly

## Changelog

- **2025-10-04**: initial deployment and configuration
- **2025-10-04**: obtained the auth token and stored it in Consul KV
- **2025-10-04**: configured the Traefik route and Web UI access
README.md
@ -1,586 +1,284 @@
# Management Infrastructure

A modern multi-cloud infrastructure management platform focused on integrating OpenTofu, Ansible, and Nomad + Podman.

## 🚨 Known Issue Log

### Nomad Consul KV Template Syntax Issue

**Symptom:**
Nomad cannot read configuration from Consul KV and reports: `Missing: kv.block(config/dev/cloudflare/token)`

**Root cause:**
1. **The Nomad clients are not configured with a Consul connection** - Nomad cannot reach Consul KV
2. **The template syntax is correct** - `{{ key "path/to/key" }}` is the right syntax
3. **The Consul KV data exists** - `config/dev/cloudflare/token` is really there

**Workarounds:**
1. **Short term** - hard-code the token into the configuration file
2. **Long term** - configure the Nomad clients to connect to Consul

**Design goals:**
- **Centralized storage** → Consul KV holds all sensitive configuration
- **Decentralized deployment** → Nomad reads configuration from Consul and deploys across nodes
- **Direct reads** → Nomad's template system reads configuration straight from Consul KV

**Current state:**
- ✅ Consul KV storage works
- ✅ The Traefik service runs normally
- ❌ Nomad cannot read Consul KV (the connection still needs configuring)

**Next steps:**
1. Configure the Nomad clients to connect to Consul
2. Restore the template syntax that reads configuration from Consul KV
3. Achieve genuinely centralized configuration management

## 📝 Status Notes (Sticky Note)

### ✅ Consul Cluster Status Update

**Current state**: the Consul cluster is healthy and all nodes are running normally.

**Cluster details**:
- **Leader**: warden (100.122.197.112:8300)
- **Node count**: 3 server nodes
- **Health**: all nodes pass their health checks
- **Node list**:
  - master (100.117.106.136) - primary node, Korea
  - ash3c (100.116.80.94) - server node, USA
  - warden (100.122.197.112) - server node, Beijing; current cluster leader

**Configuration state**:
- The Ansible inventory matches the actual cluster state
- All nodes run in server mode
- `bootstrap_expect=3`, matching the actual node count

**Dependency chain**:
- Tailscale (day 1) ✅
- Ansible (day 2) ✅
- Nomad (day 3) ✅
- Consul (day 4) ✅ **done**
- Terraform (day 5) ✅ **going well**
- Vault (day 6) ⏳ planned
- Waypoint (day 7) ⏳ planned

**Next plans**:
- Keep pushing Terraform state management forward
- Prepare the Vault secrets-management integration
- Plan the Waypoint application-deployment workflow

---
## 🎯 Project Features

- **🌩️ Multi-cloud support**: Oracle Cloud, Huawei Cloud, Google Cloud, AWS, DigitalOcean
- **🏗️ Infrastructure as code**: cloud resources managed with OpenTofu
- **⚙️ Configuration management**: automated configuration and deployment with Ansible
- **🐳 Container orchestration**: Nomad cluster management with the Podman container runtime
- **🔄 CI/CD**: automated pipelines with Gitea Actions
- **📊 Monitoring**: a Prometheus + Grafana monitoring stack
- **🔐 Security**: layered security controls and compliance

## 🎯 Traefik Configuration Architecture: Separating Configuration from the Application

### ⚠️ Important: Avoid Amateurish Shortcuts

**❌ Wrong approach (sloppy):**
- Editing the Nomad job file to add a new domain
- Redeploying the entire Traefik service
- Embedding configuration inside the application definition

**✅ Right approach (clean and professional):**

### Separated Configuration Architecture

**1. Configuration file locations:**
- **Dynamic configuration**: `/root/mgmt/components/traefik/config/dynamic.yml`
- **Application configuration**: `/root/mgmt/components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad`

**2. Key properties:**
- ✅ **Hot reload**: Traefik is configured with the `file` provider and `watch: true`
- ✅ **Automatic effect**: edits to the YAML configuration file take effect automatically, no restart needed
- ✅ **Separation**: configuration is fully decoupled from the application, matching best practice

**3. Workflow for adding a new domain:**
```bash
# Only the configuration file needs editing
vim /root/mgmt/components/traefik/config/dynamic.yml
```

```yaml
# Add the new router configuration
routers:
  new-service-ui:
    rule: "Host(`new-service.git-4ta.live`)"
    service: new-service-cluster
    entryPoints:
      - websecure
    tls:
      certResolver: cloudflare

# Saving the file takes effect immediately - no restart!
```

**4. Architectural advantages:**
- 🚀 **Zero downtime**: configuration changes require no service restart
- 🔧 **Flexible management**: configuration and application are managed independently
- 📝 **Version control**: the configuration file can be versioned on its own
- 🎯 **Professional standard**: matches modern DevOps best practice

**Remember: separating configuration from the application is a core principle of modern infrastructure management!**

## 🔄 Architecture Layers and Responsibilities

### ⚠️ Important: Terraform vs. Nomad Responsibilities

The project uses a layered architecture that clearly separates the responsibilities of the different tools:

#### 1. **Terraform/OpenTofu Layer - Infrastructure Lifecycle Management**
- **Responsibility**: manage the lifecycle of the compute resources (virtual machines) provided by the cloud vendors
- **Scope**:
  - Create, update, and delete VM instances
  - Manage network resources (VCNs, subnets, security groups, etc.)
  - Manage storage resources (block storage, object storage, etc.)
  - Manage cloud services such as load balancers
- **Goal**: keep the underlying infrastructure correctly configured and its state managed

#### 2. **Nomad Layer - Application Scheduling and Orchestration**
- **Responsibility**: allocate resources and orchestrate applications inside VMs that are already running
- **Scope**:
  - Schedule and run containerized applications on existing VMs
  - Manage application lifecycles (start, stop, update)
  - Allocate and limit resources (CPU, memory, storage)
  - Service discovery and load balancing
- **Goal**: run application services efficiently on existing infrastructure

#### 3. **The Key Distinction**
- **Terraform** manages the lifecycle of the **virtual machines themselves**
- **Nomad** schedules the applications running **inside the virtual machines**
- **Terraform** decides "which virtual machines exist"
- **Nomad** decides "what runs on them"

#### 4. **Workflow Example**
```
1. Terraform creates the VM (cloud-vendor layer)
   ↓
2. The VM boots and runs its operating system
   ↓
3. The Nomad client is installed and configured on the VM
   ↓
4. Nomad schedules and runs application containers on the VM
```

**Important**: these two layers must not be mixed. Terraform should not manage application-level resources, and Nomad should not create virtual machines. Strictly respecting this layering is key to the project's success.

## 📁 Project Structure

```
mgmt/
├── .gitea/workflows/      # CI/CD workflows
├── tofu/                  # OpenTofu infrastructure code (infrastructure lifecycle)
│   ├── environments/      # Environment configs (dev/staging/prod)
│   ├── modules/           # Reusable modules
│   ├── providers/         # Cloud provider configuration
│   └── shared/            # Shared configuration
├── configuration/         # Ansible configuration management
│   ├── inventories/       # Host inventories
│   ├── playbooks/         # Playbooks
│   ├── templates/         # Template files
│   └── group_vars/        # Group variables
├── jobs/                  # Nomad job definitions (application scheduling)
│   ├── consul/            # Consul cluster configuration
│   └── podman/            # Podman-related jobs
├── configs/               # Configuration files
│   ├── nomad-master.hcl   # Nomad server configuration
│   └── nomad-ash3c.hcl    # Nomad client configuration
├── docs/                  # Documentation
├── security/              # Security configuration
│   ├── certificates/      # Certificate files
│   └── policies/          # Security policies
├── tests/                 # Test scripts and reports
│   ├── mcp_servers/       # MCP server test scripts
│   ├── mcp_server_test_report.md  # MCP server test report
│   └── legacy/            # Old test scripts
├── tools/                 # Tools and utilities
├── playbooks/             # Core Ansible playbooks
└── Makefile               # Project management commands
```

**Layering notes**:
- The **tofu/** directory holds the Terraform/OpenTofu code that manages the lifecycle of cloud compute resources
- The **jobs/** directory holds the Nomad job definitions that schedule applications inside existing VMs
- The two directories are kept strictly separate so the responsibility boundary stays clear

**Note:** the project has migrated from Docker Swarm to Nomad + Podman; the old swarm directory is no longer used. Intermediate scripts and test files have been cleaned up, keeping the core configuration files in line with GitOps principles.

## 🔄 GitOps Principles

The project follows a GitOps workflow, keeping infrastructure state consistent with the code in the Git repository:

- **Declarative configuration**: all infrastructure and application configuration is stored declaratively in Git
- **Version control and auditing**: every change goes through a Git commit, giving a complete change history and audit trail
- **Automated synchronization**: CI/CD pipelines automatically apply changes from Git to the real environment
- **State convergence**: the system continuously monitors actual state and automatically corrects any drift from the desired state

### GitOps Workflow

1. **Declare the desired state**: define the desired infrastructure and application state in Git
2. **Commit changes**: apply changes through Git commits
3. **Automatic sync**: the CI/CD system detects the change and applies it to the environment
4. **State verification**: the system verifies that actual state matches the desired state
5. **Monitoring and alerting**: state is continuously monitored and alerts fire on drift

This workflow gives consistent, repeatable, reliable environments along with a full change history and rollback capability.

## Architecture Overview

### Centralized + Decentralized Architecture

**Centralized storage:**
- **Consul KV** → stores all sensitive configuration (tokens, certificates, keys)
- **Consul Service Discovery** → service registration and discovery
- **Consul Health Checks** → service health checking

**Decentralized deployment:**
- **Asia node** → `warden.tailnet-68f9.ts.net` (Beijing)
- **Asia node** → `ch4.tailnet-68f9.ts.net` (Korea)
- **Americas node** → `ash3c.tailnet-68f9.ts.net` (USA)

### Service Endpoints

- `https://consul.git-4ta.live` → Consul UI
- `https://traefik.git-4ta.live` → Traefik dashboard
- `https://nomad.git-4ta.live` → Nomad UI
- `https://vault.git-4ta.live` → Vault UI
- `https://waypoint.git-4ta.live` → Waypoint UI
- `https://authentik.git-4ta.live` → Authentik identity provider

### Tech Stack

- **Nomad** → workload orchestration
- **Consul** → service discovery and configuration management
- **Traefik** → reverse proxy and load balancing
- **Cloudflare** → DNS and SSL certificate management
- **Waypoint** → application deployment platform
- **Authentik** → identity and access management

---

## Deployment Status

### ✅ Done
- [x] Cloudflare token stored in Consul KV
- [x] Wildcard DNS for `*.git-4ta.live` configured
- [x] Traefik configured and deployed
- [x] Automatic SSL certificate issuance
- [x] All service endpoints configured
- [x] Vault migrated under Nomad management
- [x] Vault high-availability three-node deployment
- [x] Waypoint server deployed and bootstrapped
- [x] Waypoint auth token obtained and stored
- [x] Nomad jobs configuration backed up to Consul KV
- [x] Authentik container deployed with SSH key access configured
- [x] Traefik configuration architecture improved (configuration separated from the application)

### ⚠️ Open
- [ ] Configure the Nomad clients' Consul connection
- [ ] Restore reading configuration from Consul KV
- [ ] Achieve genuinely centralized configuration management

---

## 🚀 Quick Start

### 1. Environment Preparation
```bash
# Clone the project
git clone <repository-url>
cd mgmt

# Check environment status
./mgmt.sh status

# Quick deploy (development environments)
./mgmt.sh deploy
```
### 2. Configure the Cloud Providers
```bash
# Copy the configuration template
cp tofu/environments/dev/terraform.tfvars.example tofu/environments/dev/terraform.tfvars

# Edit the file and fill in your cloud provider credentials
vim tofu/environments/dev/terraform.tfvars
```

### 3. Initialize the Infrastructure
```bash
# Initialize OpenTofu
./mgmt.sh tofu init

# Review the execution plan
./mgmt.sh tofu plan

# Apply the infrastructure changes
cd tofu/environments/dev && tofu apply
```

### 4. Deploy the Nomad Services
```bash
# Deploy the Consul cluster
nomad run /root/mgmt/jobs/consul/consul-cluster.nomad

# Check Nomad jobs
nomad job status

# Check node status
nomad node status
```

### Check Service Status
```bash
# Check all services
curl -k -I https://consul.git4ta.tech
curl -k -I https://traefik.git4ta.tech
curl -k -I https://nomad.git4ta.tech
curl -k -I https://waypoint.git4ta.tech
```

### Deploy Traefik
```bash
cd /root/mgmt
nomad job run components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad
```

### Manage the Traefik Configuration (Recommended)
```bash
# Adding a new domain only requires editing the configuration file
vim /root/mgmt/components/traefik/config/dynamic.yml

# Changes take effect automatically on save - no restart!
# That is the elegance of separating configuration from the application
```

### Check Consul KV
```bash
consul kv get config/dev/cloudflare/token
consul kv get -recurse config/
```

### ⚠️ Important: Network Access Notes

**Tailscale network access**:
- The Nomad and Consul services in this project are reached over the Tailscale network
- When accessing Nomad (port 4646) and Consul (port 8500), you must use the Tailscale-assigned IP addresses
- Wrong: `http://127.0.0.1:4646` or `http://localhost:8500` (connection refused)
- Right: `http://100.x.x.x:4646` or `http://100.x.x.x:8500` (Tailscale IPs)

**Getting the Tailscale IPs**:
```bash
# Show this node's Tailscale IP
tailscale ip -4

# Show every node on the Tailscale network
tailscale status
```

**Common problems**:
- On "connection refused" errors, confirm you used the correct Tailscale IP
- Make sure the Tailscale service is up and running
- Check that network policy allows the relevant ports over the Tailscale interface
- For more detail, see: [Consul and Nomad access lessons](.gitea/issues/consul-nomad-access-lesson.md)
### 🔄 Nomad Cluster Leader Rotation and Access Strategy

**Nomad's leader mechanism**:
- Nomad uses the Raft protocol for distributed consensus; the cluster has exactly one leader node
- The leader handles all writes and coordinates cluster state
- When the leader fails, the cluster automatically elects a new one

**Access strategies across leader rotation**:

1. **Discover the leader dynamically**:
```bash
# Query the current leader from any Nomad server
curl -s http://<any-nomad-server-ip>:4646/v1/status/leader
# Example response: "100.90.159.68:4647"

# Use the returned leader address for API calls
curl -s http://100.90.159.68:4646/v1/nodes
```

```bash
#!/bin/bash
# Leader-discovery script: any Nomad server IP works as the starting point
SERVER_IP="100.116.158.95"
# The API returns "ip:4647" (the RPC port), so strip the quotes and
# port before building the HTTP address on port 4646
LEADER_IP=$(curl -s http://${SERVER_IP}:4646/v1/status/leader | tr -d '"' | cut -d: -f1)
# Run commands against the leader's HTTP API
nomad node status -address=http://${LEADER_IP}:4646
```

2. **Load-balancing options**:
   - **DNS load balancing**: use Consul DNS and resolve `nomad.service.consul` to the current leader
   - **Proxy-level load balancing**: add health checks to the Nginx/HAProxy configuration to route to the active leader automatically
   - **Client retry**: implement retry logic in client code that tries the other server nodes when a connection fails

3. **Recommended access pattern**: use the leader-discovery script above

4. **High-availability configuration**:
   - Add every Nomad server node to the client configuration
   - Clients connect automatically to whichever servers are available
   - Writes are automatically redirected to the leader node

**Notes**:
- Nomad leader rotation is automatic and normally needs no human intervention
- During a leader election the cluster may briefly be unable to process writes
- Applications should implement sensible retry logic to ride out leader switches

### Backup Management
```bash
# List backups
consul kv get backup/nomad-jobs/index

# Show the latest backup's metadata
consul kv get backup/nomad-jobs/20251004/metadata

# Restore a backup
consul kv get backup/nomad-jobs/20251004/data > restore.tar.gz
tar -xzf restore.tar.gz
```

---

## Important Files

- `components/traefik/config/dynamic.yml` → **Traefik dynamic configuration (use this)**
- `components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad` → Traefik Nomad job
- `README-Traefik.md` → **Traefik configuration management guide (required reading)**
- `infrastructure/opentofu/environments/dev/` → Terraform infrastructure configuration
- `deployment/ansible/inventories/production/hosts` → server inventory
- `README-Vault.md` → Vault configuration and usage
- `README-Waypoint.md` → Waypoint configuration and usage
- `README-Backup.md` → backup management and recovery
- `nomad-jobs/vault-cluster.nomad` → Vault Nomad job
- `waypoint-server.nomad` → Waypoint Nomad job

---

## 🛠️ Common Commands

| Command | Description |
|---------|-------------|
| `make status` | Show a project status overview |
| `make deploy` | Quick-deploy all services |
| `make cleanup` | Remove all deployed services |
| `cd tofu/environments/dev && tofu <cmd>` | OpenTofu management commands |
| `nomad job status` | Show Nomad job status |
| `nomad node status` | Show Nomad node status |
| `podman ps` | List running containers |
| `ansible-playbook playbooks/configure-nomad-clients.yml` | Configure the Nomad clients |
| `./run_tests.sh` or `make test-mcp` | Run all MCP server tests |
| `make test-kali` | Run the Kali Linux quick health check |
| `make test-kali-security` | Run the Kali Linux security-tool tests |
| `make test-kali-full` | Run the full Kali Linux test suite |

## 🔧 Service Initialization Notes

### Vault Initialization

**Current state:** Vault uses local file storage and needs initialization

**Initialization steps:**
```bash
# 1. Check Vault's state
curl -s http://warden.tailnet-68f9.ts.net:8200/v1/sys/health

# 2. Initialize Vault (if it answers "no available server")
vault operator init -address=http://warden.tailnet-68f9.ts.net:8200

# 3. Save the unseal keys and root token
# 4. Unseal Vault
vault operator unseal -address=http://warden.tailnet-68f9.ts.net:8200 <unseal-key-1>
vault operator unseal -address=http://warden.tailnet-68f9.ts.net:8200 <unseal-key-2>
vault operator unseal -address=http://warden.tailnet-68f9.ts.net:8200 <unseal-key-3>
```

**🔑 Vault key material (final initialization, 2025-10-04):**
```
Unseal Key 1: 5XQ6vSekewZj9SigcIS8KcpnsOyEzgG5UFe/mqPVXkre
Unseal Key 2: vmLu+Ry+hajWjQhX3YVnZG72aZRn5cowcUm5JIVtv/kR
Unseal Key 3: 3eDhfnHZnG9OT6RFOhpoK/aO5TghPypz4XPlXxFMm52F
Unseal Key 4: LWGkYB7qD3GPPc/nRuqKmMUiQex8ygYF1BkSXA1Tov3J
Unseal Key 5: rIidFy7d/SxcPOCrNy569VZ86I56oMQxqL7qVgM+PYPy

Root Token: hvs.OgVR2hEihbHM7qFxtFr7oeo3
```

**Configuration notes:**
- **Storage**: file (local filesystem)
- **Path**: `/opt/nomad/data/vault-storage` (persistent storage)
- **Port**: 8200
- **UI**: enabled
- **Important**: persistent storage is configured, so keys are not lost on restart
### Waypoint Initialization

**Current state:** Waypoint is running normally; re-initialization may occasionally be needed

**Initialization steps:**
```bash
# 1. Check Waypoint's state
curl -I https://waypoint.git-4ta.live

# 2. Re-initialize if needed
waypoint server init -server-addr=https://waypoint.git-4ta.live

# 3. Configure the Waypoint CLI
waypoint auth login -server-addr=https://waypoint.git-4ta.live
```

**Configuration notes:**
- **Storage**: local database at `/opt/waypoint/waypoint.db`
- **Ports**: HTTP 9701, gRPC 9702
- **UI**: enabled

### Consul Service Registration

**Registered services:**
- ✅ **vault**: `vault.git-4ta.live` (tags: vault, secrets, kv)
- ✅ **waypoint**: `waypoint.git-4ta.live` (tags: waypoint, ci-cd, deployment)
- ✅ **consul**: `consul.git-4ta.live` (tags: consul, service-discovery)
- ✅ **traefik**: `traefik.git-4ta.live` (tags: traefik, proxy, load-balancer)
- ✅ **nomad**: `nomad.git-4ta.live` (tags: nomad, scheduler, orchestrator)

**Health checks:**
- **vault**: `/v1/sys/health`
- **waypoint**: `/`
- **consul**: `/v1/status/leader`
- **traefik**: `/ping`
- **nomad**: `/v1/status/leader`

## 🌩️ Supported Cloud Providers

### Oracle Cloud Infrastructure (OCI)
- ✅ Compute instances
- ✅ Networking (VCN, subnets, security groups)
- ✅ Storage (block storage, object storage)
- ✅ Load balancers

### Huawei Cloud
- ✅ Elastic Cloud Server (ECS)
- ✅ Virtual Private Cloud (VPC)
- ✅ Elastic Load Balance (ELB)
- ✅ Elastic Volume Service (EVS)

### Google Cloud Platform
- ✅ Compute Engine
- ✅ VPC networking
- ✅ Cloud Load Balancing
- ✅ Persistent Disk

### Amazon Web Services
- ✅ EC2 instances
- ✅ VPC networking
- ✅ Application Load Balancer
- ✅ EBS storage

### DigitalOcean
- ✅ Droplets
- ✅ VPC networking
- ✅ Load Balancers
- ✅ Block Storage

## 🔄 CI/CD Workflow

### Infrastructure Deployment
1. **Code commit** → triggers Gitea Actions
2. **OpenTofu plan** → generates the execution plan
3. **Human review** → confirms the change
4. **OpenTofu apply** → applies the infrastructure change
5. **Ansible deploy** → configures and deploys applications

### Application Deployment
1. **Application code update** → builds the container image
2. **Image push** → pushes to the image registry
3. **Nomad job update** → updates the job definition
4. **Nomad deploy** → rolling update of the service
5. **Health check** → verifies the deployment

## 📊 Monitoring and Observability

### Monitoring Components
- **Prometheus**: metrics collection and storage
- **Grafana**: dashboards
- **AlertManager**: alert management
- **Node Exporter**: system metrics export

### Log Management
- **ELK stack**: Elasticsearch + Logstash + Kibana
- **Fluentd**: log collection and forwarding
- **Structured logging**: standardized JSON format

## 🔐 Security Best Practices

### Infrastructure Security
- **Network isolation**: VPCs, security groups, firewalls
- **Access control**: IAM roles and policies
- **Data encryption**: in transit and at rest
- **Key management**: cloud-vendor key management services

### Application Security
- **Container security**: image scanning, least privilege
- **Network security**: service mesh, TLS termination
- **Secret management**: Docker Secrets, Ansible Vault
- **Security auditing**: log monitoring and audits

## 🧪 Testing Strategy

### Infrastructure Tests
- **Syntax checks**: OpenTofu validate
- **Security scanning**: Checkov, tfsec
- **Compliance checks**: OPA (Open Policy Agent)

### Application Tests
- **Unit tests**: application code tests
- **Integration tests**: cross-service integration tests
- **End-to-end tests**: full workflow tests

### MCP Server Tests
The project ships a complete MCP (Model Context Protocol) server test suite under `tests/mcp_servers/`:

- **context7 server tests**: verify initialization, tool listing, and search
- **qdrant server tests**: exercise document add, search, and delete
- **qdrant-ollama server tests**: verify the vector database + LLM integration

The suite includes shell and Python scripts and can drive the MCP servers directly over the JSON-RPC protocol. Detailed results and fixes are recorded in `tests/mcp_server_test_report.md`.

Run the tests:
```bash
# Run an individual test script
cd tests/mcp_servers
./test_local_mcp_servers.sh

# Or run the Python tests
python test_mcp_servers_simple.py
```

### Kali Linux System Tests
The project also ships a complete Kali Linux test suite under `configuration/playbooks/test/`:

1. **Quick health check** (`kali-health-check.yml`): basic system status checks
2. **Security tool tests** (`kali-security-tools.yml`): verifies installation and function of the security tools
3. **Full system test** (`test-kali.yml`): comprehensive tests plus report generation
4. **Full test suite** (`kali-full-test-suite.yml`): runs all tests in order

Run the tests:
```bash
# Kali Linux quick health check
make test-kali

# Kali Linux security-tool tests
make test-kali-security

# Full Kali Linux test suite
make test-kali-full
```
## 📚 Documentation

- [Consul cluster troubleshooting](docs/consul-cluster-troubleshooting.md)
- [Disk management](docs/disk-management.md)
- [Nomad NFS setup](docs/nomad-nfs-setup.md)
- [Consul-Terraform integration](docs/setup/consul-terraform-integration.md)
- [OCI credentials setup](docs/setup/oci-credentials-setup.md)
- [Oracle Cloud setup](docs/setup/oracle-cloud-setup.md)

## 🤝 Contributing

1. Fork the project
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is released under the MIT license - see the [LICENSE](LICENSE) file for details.

## 🆘 Support

If you run into problems or have questions:

1. Read the [documentation](docs/)
2. Search the [issues](../../issues)
3. Open a new [issue](../../issues/new)

## ⚠️ Key Lessons Learned

### Separating Terraform and Nomad Responsibilities
**Problem**: it is easy to blur the responsibilities of Terraform and Nomad, which muddles the architecture.

**Root cause**: although both manage infrastructure, they operate at different layers and manage different kinds of resources.

**Solution**:
1. **Make the layering explicit**:
   - **Terraform/OpenTofu**: lifecycle management of the cloud vendors' compute resources (virtual machines)
   - **Nomad**: application scheduling and orchestration inside existing virtual machines

2. **Keep the boundary clear**:
   - Terraform decides "which virtual machines exist"
   - Nomad decides "what runs on them"
   - Neither should manage the other's resources

3. **Separate the workflows**:
```
1. Terraform creates the VM (cloud-vendor layer)
   ↓
2. The VM boots and runs its operating system
   ↓
3. The Nomad client is installed and configured on the VM
   ↓
4. Nomad schedules and runs application containers on the VM
```

**Important**: strictly respecting this layering is key to the project's success. Any blurring of the two layers leads to architectural confusion and management pain.

### Consul and Nomad Access Problems
**Problem**: accessing Consul via `http://localhost:8500` or `http://127.0.0.1:8500` fails to connect.

**Root cause**: the Consul and Nomad services in this project run in the cluster under Nomad + Podman and are reached over the Tailscale network. They do not run locally, so localhost cannot reach them.

**Solution**:
1. **Use Tailscale IPs**: services must be reached through their Tailscale-assigned addresses
```bash
# Show this node's Tailscale IP
tailscale ip -4

# Show every node on the Tailscale network
tailscale status

# Access Consul (with the actual Tailscale IP)
curl http://100.x.x.x:8500/v1/status/leader

# Access Nomad (with the actual Tailscale IP)
curl http://100.x.x.x:4646/v1/status/leader
```

2. **Service discovery**: the Consul cluster has 3 nodes and the Nomad cluster more than ten, so identify the node a service actually runs on

3. **Cluster layout**:
   - Consul cluster: 3 nodes (kr-master, us-ash3c, bj-warden)
   - Nomad cluster: more than ten nodes, split between server and client nodes

**Important**: during development and debugging, always use the Tailscale IP rather than localhost to reach cluster services. This is a basic requirement of the project's architecture and must be followed strictly.

### Consul Cluster Configuration Management
**Problem**: the Consul cluster's configuration files disagreed with its actual running state, causing confusion and misconfiguration.

**Root cause**: the node information in the Ansible inventory files did not match the real cluster, including node roles, node counts, and the expect value.

**Solution**:
1. **Verify cluster state regularly**: use the Consul API to check the real state and keep the configuration files in sync
```bash
# List the cluster's nodes
curl -s http://<consul-server>:8500/v1/catalog/nodes

# Show node details
curl -s http://<consul-server>:8500/v1/agent/members

# Show the cluster leader
curl -s http://<consul-server>:8500/v1/status/leader
```

2. **Keep the configuration files consistent**: all related inventory files (`csol-consul-nodes.ini`, `consul-nodes.ini`, `consul-cluster.ini`) must agree on:
   - The server node list and count
   - The client node list and count
   - The `bootstrap_expect` value (must match the actual number of server nodes)
   - Node roles and IP addresses

3. **Identify node roles correctly**: confirm each node's actual role through the API, so servers are not misconfigured as clients or vice versa
```json
// Example node record returned by the API
{
  "Name": "warden",
  "Addr": "100.122.197.112",
  "Port": 8300,
  "Status": 1,
  "ProtocolVersion": 2,
  "Delegate": 1,
  "Server": true   // confirms the node's role
}
```

4. **Update procedure**: when configuration and reality diverge:
   - Fetch the real cluster state via the API
   - Update every related configuration file to match
   - Make sure all files agree with one another
   - Update the comments and notes in the files to reflect the latest cluster state

**Case study**:
- **Initial configuration**: 2 server nodes and 5 client nodes, `bootstrap_expect=2`
- **Actual state**: 3 server nodes (master, ash3c, warden), no client nodes, `expect=3`
- **Fix**: update every configuration file to 3 server nodes, drop all client node entries, and set `bootstrap_expect` to 3

**Important**: the Consul cluster configuration must match its real running state exactly. Any mismatch risks instability or broken functionality. Regularly verifying the cluster through the Consul API and updating the files promptly is key to keeping it stable.

## 🎉 Acknowledgements

Thanks to every developer and community member who has contributed to this project!
## Script Organization

The project's scripts have been reorganized by function under the `scripts/` directory:

- `scripts/setup/` - environment setup and initialization
- `scripts/deployment/` - deployment scripts
- `scripts/testing/` - test scripts
- `scripts/utilities/` - utility scripts
- `scripts/mcp/` - MCP server scripts
- `scripts/ci-cd/` - CI/CD scripts

See the [script index](scripts/SCRIPT_INDEX.md) for details.
---

**Last updated:** 2025-10-08 02:55 UTC
**Status:** services running normally; Traefik configuration architecture improved; Authentik integrated
|
|
@ -12,16 +12,18 @@
    - "100.116.80.94:8300"  # ash3c (USA)

  tasks:
    - name: Update APT cache (ignore GPG errors)
      apt:
        update_cache: yes
        force_apt_get: yes
      ignore_errors: yes

    - name: Install consul via APT (assumes the repo already exists)
      apt:
        name: consul={{ consul_version }}-*
        state: present
        update_cache: yes
        force_apt_get: yes
      register: consul_installed
      ignore_errors: yes

    - name: Create consul user (if not exists)
      user:
@ -1,59 +0,0 @@
---
# Ansible Inventory for Consul Client Deployment
all:
  children:
    consul_servers:
      hosts:
        master.tailnet-68f9.ts.net:
          ansible_host: 100.117.106.136
          region: korea
        warden.tailnet-68f9.ts.net:
          ansible_host: 100.122.197.112
          region: beijing
        ash3c.tailnet-68f9.ts.net:
          ansible_host: 100.116.80.94
          region: usa

    nomad_servers:
      hosts:
        # Nomad server nodes also need a Consul client
        semaphore.tailnet-68f9.ts.net:
          ansible_host: 100.116.158.95
          region: korea
        ch3.tailnet-68f9.ts.net:
          ansible_host: 100.86.141.112
          region: switzerland
        ash1d.tailnet-68f9.ts.net:
          ansible_host: 100.81.26.3
          region: usa
        ash2e.tailnet-68f9.ts.net:
          ansible_host: 100.103.147.94
          region: usa
        ch2.tailnet-68f9.ts.net:
          ansible_host: 100.90.159.68
          region: switzerland
        de.tailnet-68f9.ts.net:
          ansible_host: 100.120.225.29
          region: germany
        onecloud1.tailnet-68f9.ts.net:
          ansible_host: 100.98.209.50
          region: unknown

    nomad_clients:
      hosts:
        # Nodes that still need a Consul client deployed
        influxdb1.tailnet-68f9.ts.net:
          ansible_host: "{{ influxdb1_ip }}"  # fill in the real IP
          region: beijing
        browser.tailnet-68f9.ts.net:
          ansible_host: "{{ browser_ip }}"  # fill in the real IP
          region: beijing
        # hcp1 already runs a Consul client; re-configuring it is optional
        # hcp1.tailnet-68f9.ts.net:
        #   ansible_host: 100.97.62.111
        #   region: beijing

  vars:
    ansible_user: root
    ansible_ssh_private_key_file: ~/.ssh/id_rsa
    consul_datacenter: dc1
@ -0,0 +1,192 @@
# Authentik Traefik Proxy Configuration Guide

## Overview

Traefik has been configured as a proxy for Authentik, providing automatic SSL certificate management and domain-based access.

## Configuration Details

### Authentik Service Information
- **Container IP**: 192.168.31.144
- **HTTP port**: 9000 (optional)
- **HTTPS port**: 9443 (primary)
- **Container status**: running normally
- **SSH auth**: key-based authentication configured, no password needed

### Traefik Proxy Configuration

#### Service
```yaml
authentik-cluster:
  loadBalancer:
    servers:
      - url: "https://192.168.31.144:9443"  # Authentik container HTTPS port
    serversTransport: authentik-insecure
    healthCheck:
      path: "/flows/-/default/authentication/"
      interval: "30s"
      timeout: "15s"
```

#### Router
```yaml
authentik-ui:
  rule: "Host(`authentik.git-4ta.live`)"
  service: authentik-cluster
  entryPoints:
    - websecure
  tls:
    certResolver: cloudflare
```

## DNS Requirements

The following DNS record must be added in Cloudflare:

### A Record
```
authentik.git-4ta.live    A    <hcp1's Tailscale IP>
```

### Getting hcp1's Tailscale IP
```bash
# Option 1: via the Tailscale CLI
tailscale ip -4 hcp1

# Option 2: via ping
ping hcp1.tailnet-68f9.ts.net
```
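The record can also be created through Cloudflare's DNS API instead of the dashboard; a sketch, where `CF_API_TOKEN` and `ZONE_ID` are placeholders you must supply for the `git-4ta.live` zone:

```bash
# Resolve hcp1's Tailscale IP and create the A record (ttl 1 = "Auto")
TS_IP=$(tailscale ip -4 hcp1)
curl -s -X POST "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records" \
  -H "Authorization: Bearer ${CF_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data "{\"type\":\"A\",\"name\":\"authentik\",\"content\":\"${TS_IP}\",\"ttl\":1}"
```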
## Deployment Steps

### 1. Update the Traefik Configuration
```bash
# Redeploy the Traefik job
nomad job run components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad
```

### 2. Configure the DNS Record
Add an A record in the Cloudflare dashboard:
- **Name**: authentik
- **Type**: A
- **Content**: <hcp1's Tailscale IP>
- **TTL**: Auto

### 3. Verify the SSL Certificate
```bash
# Check that a certificate was issued automatically
curl -I https://authentik.git-4ta.live

# Expect a 200 status code and a valid SSL certificate
```

### 4. Test Access
```bash
# Open the Authentik Web UI
open https://authentik.git-4ta.live

# Or test with curl
curl -k https://authentik.git-4ta.live
```

## Health Checks

### Authentik Health Check Endpoint
- **Path**: `/if/flow/default-authentication-flow/`
- **Interval**: 30 seconds
- **Timeout**: 15 seconds

### Check Service Status
```bash
# Check the Traefik router state
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/http/routers | jq '.[] | select(.name=="authentik-ui")'

# Check service health
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/http/services | jq '.[] | select(.name=="authentik-cluster")'
```

## Troubleshooting

### Common Problems

1. **DNS resolution**
```bash
# Check DNS resolution
nslookup authentik.git-4ta.live

# Check Cloudflare DNS directly
dig @1.1.1.1 authentik.git-4ta.live
```

2. **SSL certificates**
```bash
# Check certificate status
openssl s_client -connect authentik.git-4ta.live:443 -servername authentik.git-4ta.live

# Check Traefik's certificate store
ls -la /opt/traefik/certs/
```

3. **Service connectivity**
```bash
# Check the Authentik container's listener
sshpass -p "Aa313131@ben" ssh -o StrictHostKeyChecking=no root@pve "pct exec 113 -- netstat -tlnp | grep 9000"

# Check the Traefik logs
nomad alloc logs -f -job traefik-cloudflare-v1
```

### Debugging Commands

```bash
# Inspect the Traefik router configuration
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/rawdata | jq '.routers[] | select(.name=="authentik-ui")'

# Inspect service discovery
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/rawdata | jq '.services[] | select(.name=="authentik-cluster")'

# Inspect the middlewares
curl -s http://hcp1.tailnet-68f9.ts.net:8080/api/rawdata | jq '.middlewares'
```

## Next Steps

Once the configuration is in place, you can:

1. **Configure an OAuth2 provider**
   - Create an OAuth2 application in Authentik
   - Configure the callback URLs
   - Set the client credentials

2. **Integrate the HCP services**
   - OAuth2 authentication for the Nomad UI
   - OAuth2 authentication for the Consul UI
   - OIDC authentication for Vault

3. **User management**
   - Create user groups and permissions
   - Configure multi-factor authentication
   - Define access policies

## Security Notes

1. **Network security**
   - The Authentik container uses an internal IP (192.168.31.144)
   - It is reached only through the Traefik proxy, never exposed directly

2. **SSL/TLS**
   - Automatic SSL certificates via Cloudflare
   - Forced HTTPS redirect
   - Modern TLS protocol support

3. **Access control**
   - Consider an IP allowlist
   - Enable multi-factor authentication
   - Rotate keys regularly

---

**Configured at**: $(date)
**Configuration file**: `/root/mgmt/components/traefik/jobs/traefik-cloudflare-git4ta-live.nomad`
**Domain**: `authentik.git-4ta.live`
**Status**: pending deployment and testing
@ -0,0 +1,99 @@
# Nomad Jobs Backup

**Backup time**: 2025-10-04 07:44:11
**Reason**: all services running normally, SSL certificates fully configured

## Current Running State

### ✅ Deployed and Working Services

1. **Traefik** (`traefik-cloudflare-v1`)
   - File: `components/traefik/jobs/traefik-cloudflare.nomad`
   - Status: running, SSL certificates healthy
   - Domain: `*.git4ta.me`
   - Certificates: Let's Encrypt (Cloudflare DNS challenge)

2. **Vault** (`vault-cluster`)
   - File: `nomad-jobs/vault-cluster.nomad`
   - Status: three-node cluster running
   - Nodes: ch4, ash3c, warden
   - Configuration: stored in Consul KV under `vault/config`

3. **Waypoint** (`waypoint-server`)
   - File: `waypoint-server.nomad`
   - Status: running
   - Node: hcp1
   - Web UI: `https://waypoint.git4ta.me/auth/token`

### 🔧 Key Configuration

#### Traefik
- Uses the Cloudflare DNS challenge to obtain SSL certificates
- Certificate storage: `/local/acme.json` (local storage)
- Domain: `git4ta.me`
- Service routes: consul, nomad, vault, waypoint

#### Vault
- Three-node high-availability cluster
- Configuration stored centrally in Consul KV
- Uses the `exec` driver
- Services registered in Consul

#### Waypoint
- Uses the `raw_exec` driver
- HTTPS API: 9701, gRPC: 9702
- Bootstrapped, auth token obtained

### 📋 Service Endpoints

- `https://consul.git4ta.me` → Consul UI
- `https://traefik.git4ta.me` → Traefik dashboard
- `https://nomad.git4ta.me` → Nomad UI
- `https://vault.git4ta.me` → Vault UI
- `https://waypoint.git4ta.me/auth/token` → Waypoint UI

### 🔑 Key Credentials

#### Vault
- Unseal keys: stored in Consul KV under `vault/unseal-keys`
- Root token: stored in Consul KV under `vault/root-token`
- Details: `/root/mgmt/README-Vault.md`

#### Waypoint
- Auth token: stored in Consul KV under `waypoint/auth-token`
- Details: `/root/mgmt/README-Waypoint.md`

### 🚀 Deployment Commands

```bash
# Deploy Traefik
nomad job run components/traefik/jobs/traefik-cloudflare.nomad

# Deploy Vault
nomad job run nomad-jobs/vault-cluster.nomad

# Deploy Waypoint
nomad job run waypoint-server.nomad
```

### 📝 Caveats

1. **Certificate management**: certificates live in the Traefik container at `/local/acme.json` and are lost on container restart
2. **Vault configuration**: all configuration is loaded dynamically from Consul KV; changes require a job restart
3. **Networking**: all services use Tailscale network addresses
4. **Backup policy**: back up the configuration and credentials in Consul KV regularly

### 🔄 Recovery Steps

To restore to this state:

1. Restore the Consul KV configuration
2. Deploy in order: Traefik → Vault → Waypoint (see the sketch below)
3. Verify every service endpoint is reachable
4. Check the SSL certificate status
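A sketch covering steps 2-4, redeploying in dependency order and then probing each public endpoint:

```bash
# Redeploy in dependency order
nomad job run components/traefik/jobs/traefik-cloudflare.nomad
nomad job run nomad-jobs/vault-cluster.nomad
nomad job run waypoint-server.nomad

# Probe every public endpoint and print its HTTP status
for host in consul traefik nomad vault waypoint; do
  curl -k -s -o /dev/null -w "%{http_code}  https://${host}.git4ta.me\n" "https://${host}.git4ta.me"
done
```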
---

**Backup completed**: 2025-10-04 07:44:11
**Backed up by**: AI Assistant
**Status**: all services running normally ✅
@ -0,0 +1,19 @@
# Consul Configuration

## Deployment

```bash
nomad job run components/consul/jobs/consul-cluster.nomad
```

## Job Information

- **Job name**: `consul-cluster-nomad`
- **Type**: service
- **Nodes**: master, ash3c, warden

## Access

- Master: `http://master.tailnet-68f9.ts.net:8500`
- Ash3c: `http://ash3c.tailnet-68f9.ts.net:8500`
- Warden: `http://warden.tailnet-68f9.ts.net:8500`
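All three servers should report the same leader; a quick check looping over the addresses above:

```bash
# Each node should print the identical leader address
for node in master ash3c warden; do
  printf '%s: ' "$node"
  curl -s "http://${node}.tailnet-68f9.ts.net:8500/v1/status/leader"
  echo
done
```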
@ -0,0 +1,88 @@
# Consul configuration file
# Contains the full Consul configuration, including variables and storage settings

# Basics
data_dir = "/opt/consul/data"
raft_dir = "/opt/consul/raft"

# Enable the UI
ui_config {
  enabled = true
}

# Datacenter
datacenter = "dc1"

# Server settings
server = true
bootstrap_expect = 3

# Networking
client_addr = "0.0.0.0"
bind_addr = "{{ GetInterfaceIP `eth0` }}"
advertise_addr = "{{ GetInterfaceIP `eth0` }}"

# Ports
ports {
  dns = 8600
  http = 8500
  https = -1
  grpc = 8502
  grpc_tls = 8503
  serf_lan = 8301
  serf_wan = 8302
  server = 8300
}

# Cluster join
retry_join = ["100.117.106.136", "100.116.80.94", "100.122.197.112"]

# Service discovery
enable_service_script = true
enable_script_checks = true
enable_local_script_checks = true

# Performance tuning
performance {
  raft_multiplier = 1
}

# Logging
log_level = "INFO"
enable_syslog = false
log_file = "/var/log/consul/consul.log"

# Security
encrypt = "YourEncryptionKeyHere"

# Connection settings
reconnect_timeout = "30s"
reconnect_timeout_wan = "30s"
session_ttl_min = "10s"

# Autopilot
autopilot {
  cleanup_dead_servers = true
  last_contact_threshold = "200ms"
  max_trailing_logs = 250
  server_stabilization_time = "10s"
  redundancy_zone_tag = ""
  disable_upgrade_migration = false
  upgrade_version_tag = ""
}

# Snapshots
snapshot {
  enabled = true
  interval = "24h"
  retain = 30
  name = "consul-snapshot-{{.Timestamp}}"
}

# Backups
backup {
  enabled = true
  interval = "6h"
  retain = 7
  name = "consul-backup-{{.Timestamp}}"
}
@ -0,0 +1,93 @@
|
|||
# Consul配置模板文件
|
||||
# 此文件使用Consul模板语法从KV存储中动态获取配置
|
||||
# 遵循 config/{environment}/{provider}/{region_or_service}/{key} 格式
|
||||
|
||||
# 基础配置
|
||||
data_dir = "{{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/cluster/data_dir` `/opt/consul/data` }}"
|
||||
raft_dir = "{{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/cluster/raft_dir` `/opt/consul/raft` }}"
|
||||
|
||||
# 启用UI
|
||||
ui_config {
|
||||
enabled = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ui/enabled` `true` }}
|
||||
}
|
||||
|
||||
# 数据中心配置
|
||||
datacenter = "{{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/cluster/datacenter` `dc1` }}"
|
||||
|
||||
# 服务器配置
|
||||
server = true
|
||||
bootstrap_expect = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/cluster/bootstrap_expect` `3` }}
|
||||
|
||||
# 网络配置
|
||||
client_addr = "{{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/network/client_addr` `0.0.0.0` }}"
|
||||
bind_addr = "{{ GetInterfaceIP (keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/network/bind_interface` `ens160`) }}"
|
||||
advertise_addr = "{{ GetInterfaceIP (keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/network/advertise_interface` `ens160`) }}"
|
||||
|
||||
# 端口配置
|
||||
ports {
|
||||
dns = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ports/dns` `8600` }}
|
||||
http = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ports/http` `8500` }}
|
||||
https = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ports/https` `-1` }}
|
||||
grpc = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ports/grpc` `8502` }}
|
||||
grpc_tls = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ports/grpc_tls` `8503` }}
|
||||
serf_lan = {{ keyOrDefault `config/` + env "ENVIRONMENT" + `/consul/ports/serf_lan` `8301` }}
|
||||
  serf_wan = {{ keyOrDefault (printf "config/%s/consul/ports/serf_wan" (env "ENVIRONMENT")) "8302" }}
  server = {{ keyOrDefault (printf "config/%s/consul/ports/server" (env "ENVIRONMENT")) "8300" }}
}

# Cluster join - node IPs resolved dynamically
retry_join = [
  "{{ keyOrDefault (printf "config/%s/consul/nodes/master/ip" (env "ENVIRONMENT")) "100.117.106.136" }}",
  "{{ keyOrDefault (printf "config/%s/consul/nodes/ash3c/ip" (env "ENVIRONMENT")) "100.116.80.94" }}",
  "{{ keyOrDefault (printf "config/%s/consul/nodes/warden/ip" (env "ENVIRONMENT")) "100.122.197.112" }}"
]

# Service discovery
enable_service_script = {{ keyOrDefault (printf "config/%s/consul/service/enable_service_script" (env "ENVIRONMENT")) "true" }}
enable_script_checks = {{ keyOrDefault (printf "config/%s/consul/service/enable_script_checks" (env "ENVIRONMENT")) "true" }}
enable_local_script_checks = {{ keyOrDefault (printf "config/%s/consul/service/enable_local_script_checks" (env "ENVIRONMENT")) "true" }}

# Performance tuning
performance {
  raft_multiplier = {{ keyOrDefault (printf "config/%s/consul/performance/raft_multiplier" (env "ENVIRONMENT")) "1" }}
}

# Logging
log_level = "{{ keyOrDefault (printf "config/%s/consul/cluster/log_level" (env "ENVIRONMENT")) "INFO" }}"
enable_syslog = {{ keyOrDefault (printf "config/%s/consul/log/enable_syslog" (env "ENVIRONMENT")) "false" }}
log_file = "{{ keyOrDefault (printf "config/%s/consul/log/log_file" (env "ENVIRONMENT")) "/var/log/consul/consul.log" }}"

# Security
encrypt = "{{ keyOrDefault (printf "config/%s/consul/cluster/encrypt_key" (env "ENVIRONMENT")) "YourEncryptionKeyHere" }}"

# Connection settings
reconnect_timeout = "{{ keyOrDefault (printf "config/%s/consul/connection/reconnect_timeout" (env "ENVIRONMENT")) "30s" }}"
reconnect_timeout_wan = "{{ keyOrDefault (printf "config/%s/consul/connection/reconnect_timeout_wan" (env "ENVIRONMENT")) "30s" }}"
session_ttl_min = "{{ keyOrDefault (printf "config/%s/consul/connection/session_ttl_min" (env "ENVIRONMENT")) "10s" }}"

# Autopilot
autopilot {
  cleanup_dead_servers = {{ keyOrDefault (printf "config/%s/consul/autopilot/cleanup_dead_servers" (env "ENVIRONMENT")) "true" }}
  last_contact_threshold = "{{ keyOrDefault (printf "config/%s/consul/autopilot/last_contact_threshold" (env "ENVIRONMENT")) "200ms" }}"
  max_trailing_logs = {{ keyOrDefault (printf "config/%s/consul/autopilot/max_trailing_logs" (env "ENVIRONMENT")) "250" }}
  server_stabilization_time = "{{ keyOrDefault (printf "config/%s/consul/autopilot/server_stabilization_time" (env "ENVIRONMENT")) "10s" }}"
  redundancy_zone_tag = ""
  disable_upgrade_migration = {{ keyOrDefault (printf "config/%s/consul/autopilot/disable_upgrade_migration" (env "ENVIRONMENT")) "false" }}
  upgrade_version_tag = ""
}

# Snapshots
snapshot {
  enabled = {{ keyOrDefault (printf "config/%s/consul/snapshot/enabled" (env "ENVIRONMENT")) "true" }}
  interval = "{{ keyOrDefault (printf "config/%s/consul/snapshot/interval" (env "ENVIRONMENT")) "24h" }}"
  retain = {{ keyOrDefault (printf "config/%s/consul/snapshot/retain" (env "ENVIRONMENT")) "30" }}
  name = "{{ keyOrDefault (printf "config/%s/consul/snapshot/name" (env "ENVIRONMENT")) "consul-snapshot-{{.Timestamp}}" }}"
}

# Backups
backup {
  enabled = {{ keyOrDefault (printf "config/%s/consul/backup/enabled" (env "ENVIRONMENT")) "true" }}
  interval = "{{ keyOrDefault (printf "config/%s/consul/backup/interval" (env "ENVIRONMENT")) "6h" }}"
  retain = {{ keyOrDefault (printf "config/%s/consul/backup/retain" (env "ENVIRONMENT")) "7" }}
  name = "{{ keyOrDefault (printf "config/%s/consul/backup/name" (env "ENVIRONMENT")) "consul-backup-{{.Timestamp}}" }}"
}
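The `keyOrDefault` calls above fall back to their literal defaults until the matching keys exist in Consul KV. A minimal seeding sketch, assuming `ENVIRONMENT=prod` and a template file named `consul.hcl.tpl` (both names are assumptions):

```bash
# Hypothetical environment name; match whatever ENVIRONMENT is set to.
export ENVIRONMENT=prod

# Seed a few of the keys the template reads; missing keys use the defaults above.
consul kv put "config/${ENVIRONMENT}/consul/ports/serf_wan" 8302
consul kv put "config/${ENVIRONMENT}/consul/nodes/master/ip" 100.117.106.136
consul kv put "config/${ENVIRONMENT}/consul/cluster/log_level" INFO

# Render the template once to check the output before wiring it into a service.
consul-template -template "consul.hcl.tpl:consul.hcl" -once
```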
@@ -0,0 +1,50 @@
job "consul-clients-additional" {
  datacenters = ["dc1"]
  type = "service"

  constraint {
    attribute = "${node.unique.name}"
    operator = "regexp"
    value = "ch2|ch3|de"
  }

  group "consul-client" {
    count = 3

    task "consul-client" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-config-dir=/etc/consul.d",
          "-data-dir=/opt/consul",
          "-node=${node.unique.name}",
          "-bind=${attr.unique.network.ip-address}",
          "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-retry-join=ch4.tailnet-68f9.ts.net:8301",
          "-retry-join=ash3c.tailnet-68f9.ts.net:8301",
          "-client=0.0.0.0"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "consul-client"
        port = "http"

        check {
          type = "http"
          path = "/v1/status/leader"
          interval = "30s"
          timeout = "5s"
        }
      }
    }
  }
}
@@ -0,0 +1,154 @@
job "consul-clients-dedicated" {
  datacenters = ["dc1"]
  type = "service"

  group "consul-client-hcp1" {
    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      port "http" {
        static = 8500
      }
    }

    task "consul-client" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-data-dir=/opt/consul",
          "-node=hcp1",
          "-bind=100.97.62.111",
          "-advertise=100.97.62.111",
          "-retry-join=hcp1.tailnet-68f9.ts.net:80",
          "-client=0.0.0.0",
          "-http-port=8500",
          "-datacenter=dc1"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "consul-client"
        port = "http"

        check {
          type = "script"
          command = "consul"
          args = ["members"]
          interval = "10s"
          timeout = "3s"
        }
      }
    }
  }

  group "consul-client-influxdb1" {
    constraint {
      attribute = "${node.unique.name}"
      value = "influxdb1"
    }

    network {
      port "http" {
        static = 8500
      }
    }

    task "consul-client" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-data-dir=/opt/consul",
          "-node=influxdb1",
          "-bind=100.100.7.4",
          "-advertise=100.100.7.4",
          "-retry-join=hcp1.tailnet-68f9.ts.net:80",
          "-client=0.0.0.0",
          "-http-port=8500",
          "-datacenter=dc1"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "consul-client"
        port = "http"

        check {
          type = "script"
          command = "consul"
          args = ["members"]
          interval = "10s"
          timeout = "3s"
        }
      }
    }
  }

  group "consul-client-browser" {
    constraint {
      attribute = "${node.unique.name}"
      value = "browser"
    }

    network {
      port "http" {
        static = 8500
      }
    }

    task "consul-client" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-data-dir=/opt/consul",
          "-node=browser",
          "-bind=100.116.112.45",
          "-advertise=100.116.112.45",
          "-retry-join=hcp1.tailnet-68f9.ts.net:80",
          "-client=0.0.0.0",
          "-http-port=8500",
          "-datacenter=dc1"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "consul-client"
        port = "http"

        check {
          type = "script"
          command = "consul"
          args = ["members"]
          interval = "10s"
          timeout = "3s"
        }
      }
    }
  }
}
@@ -0,0 +1,66 @@
job "consul-clients-dedicated" {
  datacenters = ["dc1"]
  type = "service"

  constraint {
    attribute = "${node.unique.name}"
    operator = "regexp"
    value = "hcp1|influxdb1|browser"
  }

  group "consul-client" {
    count = 3

    update {
      max_parallel = 3
      min_healthy_time = "5s"
      healthy_deadline = "2m"
      progress_deadline = "5m"
      auto_revert = false
    }

    network {
      port "http" {
        static = 8500
      }
    }

    task "consul-client" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-data-dir=/opt/consul",
          "-node=${node.unique.name}",
          "-bind=${attr.unique.network.ip-address}",
          "-advertise=${attr.unique.network.ip-address}",
          "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-retry-join=ch4.tailnet-68f9.ts.net:8301",
          "-retry-join=ash3c.tailnet-68f9.ts.net:8301",
          "-client=0.0.0.0",
          "-http-port=${NOMAD_PORT_http}",
          "-datacenter=dc1"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "consul-client"
        port = "http"

        check {
          type = "http"
          path = "/v1/status/leader"
          interval = "10s"
          timeout = "3s"
        }
      }
    }
  }
}
@@ -0,0 +1,43 @@
job "consul-clients" {
  datacenters = ["dc1"]
  type = "system"

  group "consul-client" {
    count = 0 # system job, runs on all nodes

    task "consul-client" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-config-dir=/etc/consul.d",
          "-data-dir=/opt/consul",
          "-node=${node.unique.name}",
          "-bind=${attr.unique.network.ip-address}",
          "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-retry-join=ch4.tailnet-68f9.ts.net:8301",
          "-retry-join=ash3c.tailnet-68f9.ts.net:8301"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "consul-client"
        port = "http"

        check {
          type = "http"
          path = "/v1/status/leader"
          interval = "30s"
          timeout = "5s"
        }
      }
    }
  }
}
@@ -0,0 +1,115 @@
job "consul-cluster-nomad" {
  datacenters = ["dc1"]
  type = "service"

  group "consul-ch4" {
    constraint {
      attribute = "${node.unique.name}"
      value = "ch4"
    }

    task "consul" {
      driver = "exec"

      config {
        command = "consul"
        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind=100.117.106.136",
          "-advertise=100.117.106.136",
          "-retry-join=100.116.80.94",
          "-retry-join=100.122.197.112",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
          "-serf-wan-port=8302"
        ]
      }

      resources {
        cpu = 300
        memory = 512
      }

    }
  }

  group "consul-ash3c" {
    constraint {
      attribute = "${node.unique.name}"
      value = "ash3c"
    }

    task "consul" {
      driver = "exec"

      config {
        command = "consul"
        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind=100.116.80.94",
          "-advertise=100.116.80.94",
          "-retry-join=100.117.106.136",
          "-retry-join=100.122.197.112",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
          "-serf-wan-port=8302"
        ]
      }

      resources {
        cpu = 300
        memory = 512
      }

    }
  }

  group "consul-warden" {
    constraint {
      attribute = "${node.unique.name}"
      value = "warden"
    }

    task "consul" {
      driver = "exec"

      config {
        command = "consul"
        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind=100.122.197.112",
          "-advertise=100.122.197.112",
          "-retry-join=100.117.106.136",
          "-retry-join=100.116.80.94",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
          "-serf-wan-port=8302"
        ]
      }

      resources {
        cpu = 300
        memory = 512
      }

    }
  }
}
@@ -0,0 +1,66 @@
job "consul-ui-service" {
  datacenters = ["dc1"]
  type = "service"

  group "consul-ui" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "warden"
    }

    network {
      mode = "host"
      port "http" {
        static = 8500
        host_network = "tailscale0"
      }
    }

    service {
      name = "consul-ui"
      port = "http"

      tags = [
        "traefik.enable=true",
        "traefik.http.routers.consul-ui.rule=PathPrefix(`/consul`)",
        "traefik.http.routers.consul-ui.priority=100"
      ]

      check {
        type = "http"
        path = "/v1/status/leader"
        interval = "10s"
        timeout = "2s"
      }
    }

    task "consul-ui" {
      driver = "exec"

      config {
        command = "/usr/bin/consul"
        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind=100.122.197.112",
          "-advertise=100.122.197.112",
          "-retry-join=100.117.106.136",
          "-retry-join=100.116.80.94",
          "-ui",
          "-http-port=8500"
        ]
      }

      resources {
        cpu = 300
        memory = 512
      }
    }
  }
}
@@ -0,0 +1,8 @@
# Nomad Configuration

## Jobs

- `install-podman-driver.nomad` - installs the Podman task driver
- `nomad-consul-config.nomad` - Nomad-Consul configuration
- `nomad-consul-setup.nomad` - Nomad-Consul setup
- `nomad-nfs-volume.nomad` - NFS volume configuration
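These are standalone job files; a minimal run-and-verify sketch (the relative path is an assumption about this repo's layout):

```bash
# Hypothetical path; point at wherever the job file lives.
nomad job run nomad-jobs/nomad-consul-setup.nomad

# Confirm the job was scheduled and its allocations are healthy.
nomad job status nomad-consul-setup
```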
@@ -16,7 +16,7 @@ job "nomad-consul-config" {
        command = "sh"
        args = [
          "-c",
-         "sed -i '/^consul {/,/^}/c\\consul {\\n address = \"master.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = true\\n client_auto_join = false\\n}' /etc/nomad.d/nomad.hcl && systemctl restart nomad"
+         "sed -i '/^consul {/,/^}/c\\consul {\\n address = \"ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = true\\n client_auto_join = false\\n}' /etc/nomad.d/nomad.hcl && systemctl restart nomad"
        ]
      }

@@ -31,7 +31,7 @@ job "nomad-consul-config" {
      constraint {
        attribute = "${node.unique.name}"
        operator = "regexp"
-       value = "master|ash3c|browser|influxdb1|hcp1|warden"
+       value = "ch4|ash3c|browser|influxdb1|hcp1|warden"
      }

      task "update-nomad-config" {
@@ -41,7 +41,7 @@ job "nomad-consul-config" {
        command = "sh"
        args = [
          "-c",
-         "sed -i '/^consul {/,/^}/c\\consul {\\n address = \"master.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = false\\n client_auto_join = true\\n}' /etc/nomad.d/nomad.hcl && systemctl restart nomad"
+         "sed -i '/^consul {/,/^}/c\\consul {\\n address = \"ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = false\\n client_auto_join = true\\n}' /etc/nomad.d/nomad.hcl && systemctl restart nomad"
        ]
      }

@@ -0,0 +1,23 @@
job "nomad-consul-setup" {
  datacenters = ["dc1"]
  type = "system"

  group "nomad-config" {
    task "setup-consul" {
      driver = "exec"

      config {
        command = "sh"
        args = [
          "-c",
          "if grep -q 'server.*enabled.*true' /etc/nomad.d/nomad.hcl; then sed -i '/^consul {/,/^}/c\\consul {\\n address = \"ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = true\\n client_auto_join = false\\n}' /etc/nomad.d/nomad.hcl; else sed -i '/^consul {/,/^}/c\\consul {\\n address = \"ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = false\\n client_auto_join = true\\n}' /etc/nomad.d/nomad.hcl; fi && systemctl restart nomad"
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }
    }
  }
}
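The `sed` one-liner rewrites the whole `consul { ... }` stanza of `/etc/nomad.d/nomad.hcl` in place, choosing server or client auto-join based on whether the node runs a Nomad server. A quick way to confirm the result after a run (expected content shown as comments, derived from the replacement text above):

```bash
# Print the rewritten consul stanza.
grep -A 7 '^consul {' /etc/nomad.d/nomad.hcl
# consul {
#  address = "ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500"
#  server_service_name = "nomad"
#  client_service_name = "nomad-client"
#  auto_advertise = true
#  server_auto_join = true    # false on client-only nodes
#  client_auto_join = false   # true on client-only nodes
# }
```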
@@ -0,0 +1,28 @@
# Traefik Configuration

## Deployment

```bash
nomad job run components/traefik/jobs/traefik.nomad
```

## Configuration notes

- Binds explicitly to the Tailscale IP (100.97.62.111)
- Consul cluster ordered by geography (Beijing → Korea → US)
- Relaxed health checks suited to trans-Pacific links
- No service health check, to avoid flapping

## Access

- Dashboard: `http://hcp1.tailnet-68f9.ts.net:8080/dashboard/`
- Direct IP: `http://100.97.62.111:8080/dashboard/`
- Consul LB: `http://hcp1.tailnet-68f9.ts.net:80`

## Troubleshooting

If services flap, work through the following (a verification sketch follows this list):
1. Check whether RFC1918 private addresses are in use
2. Confirm Tailscale network connectivity
3. Increase health-check intervals
4. Account for geography-driven network latency
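A minimal verification sketch for the first two steps, assuming the Tailscale CLI is available on the node:

```bash
# Step 2: confirm Tailscale reachability of the Traefik node.
tailscale ping hcp1.tailnet-68f9.ts.net

# Step 1: check which addresses Traefik actually bound (Tailscale 100.x vs RFC1918).
ss -ltnp | grep -E ':(80|443|8080)\s'

# Probe a Consul backend the same way Traefik's health check does.
curl -s http://warden.tailnet-68f9.ts.net:8500/v1/status/leader
```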
@@ -0,0 +1,28 @@
job "test-simple" {
  datacenters = ["dc1"]
  type = "service"

  group "test" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "warden"
    }

    task "test" {
      driver = "exec"

      config {
        command = "sleep"
        args = ["3600"]
      }

      resources {
        cpu = 100
        memory = 64
      }
    }
  }
}
@@ -0,0 +1,213 @@
job "traefik-cloudflare-v1" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
        host_network = "tailscale0"
      }
      port "https" {
        static = 443
        host_network = "tailscale0"
      }
      port "traefik" {
        static = 8080
        host_network = "tailscale0"
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "0.0.0.0:80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
          permanent: true
  websecure:
    address: "0.0.0.0:443"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
    exposedByDefault: false
    prefix: "traefik"
    defaultRule: "Host(`{{ .Name }}.git4ta.me`)"
  file:
    filename: /local/dynamic.yml
    watch: true

certificatesResolvers:
  cloudflare:
    acme:
      email: houzhongxu.houzhongxu@gmail.com
      storage: /local/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 30s
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

log:
  level: DEBUG
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  serversTransports:
    waypoint-insecure:
      insecureSkipVerify: true

  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"
    waypoint-auth:
      replacePathRegex:
        regex: "^/auth/token(.*)$"
        replacement: "/auth/token$1"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8500"  # Beijing, preferred
          - url: "http://ch4.tailnet-68f9.ts.net:8500"     # Korea, fallback
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:4646"  # Beijing, preferred
          - url: "http://ch4.tailnet-68f9.ts.net:4646"     # Korea, fallback
          - url: "http://ash3c.tailnet-68f9.ts.net:4646"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    waypoint-cluster:
      loadBalancer:
        servers:
          - url: "https://hcp1.tailnet-68f9.ts.net:9701"  # hcp1 node HTTPS API
        serversTransport: waypoint-insecure

    vault-cluster:
      loadBalancer:
        servers:
          - url: "http://ch4.tailnet-68f9.ts.net:8200"     # Korea, active node
          - url: "http://ash3c.tailnet-68f9.ts.net:8200"   # US, standby node
          - url: "http://warden.tailnet-68f9.ts.net:8200"  # Beijing, standby node
        healthCheck:
          path: "/v1/sys/health"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "Host(`consul.git4ta.me`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    traefik-dashboard:
      rule: "Host(`traefik.git4ta.me`)"
      service: dashboard@internal
      middlewares:
        - dashboard_redirect@internal
        - dashboard_stripprefix@internal
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-ui:
      rule: "Host(`nomad.git4ta.me`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    waypoint-ui:
      rule: "Host(`waypoint.git4ta.me`)"
      service: waypoint-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    vault-ui:
      rule: "Host(`vault.git4ta.me`)"
      service: vault-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
EOF
        destination = "local/dynamic.yml"
      }

      template {
        data = <<EOF
CLOUDFLARE_EMAIL=houzhongxu.houzhongxu@gmail.com
CLOUDFLARE_DNS_API_TOKEN=HYT-cfZTP_jq6Xd9g3tpFMwxopOyIrf8LZpmGAI3
CLOUDFLARE_ZONE_API_TOKEN=HYT-cfZTP_jq6Xd9g3tpFMwxopOyIrf8LZpmGAI3
EOF
        destination = "local/cloudflare.env"
        env = true
      }

      resources {
        cpu = 500
        memory = 512
      }
    }
  }
}
@@ -0,0 +1,217 @@
job "traefik-consul-kv" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
        host_network = "tailscale0"
      }
      port "traefik" {
        static = 8080
        host_network = "tailscale0"
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "0.0.0.0:80"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
  file:
    filename: /local/dynamic.yml
    watch: true

metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    addRoutersLabels: true

log:
  level: INFO
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"

    traefik-stripprefix:
      stripPrefix:
        prefixes:
          - "/traefik"

    nomad-stripprefix:
      stripPrefix:
        prefixes:
          - "/nomad"

    consul-redirect:
      redirectRegex:
        regex: "^/consul/?$"
        replacement: "/consul/ui/"
        permanent: false

    nomad-redirect:
      redirectRegex:
        regex: "^/nomad/?$"
        replacement: "/nomad/ui/"
        permanent: false

    traefik-redirect:
      redirectRegex:
        regex: "^/traefik/?$"
        replacement: "/traefik/dashboard/"
        permanent: false

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8500"  # Beijing, preferred
          - url: "http://ch4.tailnet-68f9.ts.net:8500"     # Korea, fallback
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://ch2.tailnet-68f9.ts.net:4646"  # Nomad server leader
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-redirect:
      rule: "Path(`/consul`) || Path(`/consul/`)"
      service: consul-cluster
      middlewares:
        - consul-redirect
      entryPoints:
        - web
      priority: 100

    consul-ui:
      rule: "PathPrefix(`/consul/ui`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - web
      priority: 5

    consul-api:
      rule: "PathPrefix(`/consul/v1`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - web
      priority: 5

    traefik-api:
      rule: "PathPrefix(`/traefik/api`)"
      service: api@internal
      middlewares:
        - traefik-stripprefix
      entryPoints:
        - web
      priority: 6

    traefik-dashboard:
      rule: "PathPrefix(`/traefik/dashboard`)"
      service: dashboard@internal
      middlewares:
        - traefik-stripprefix
      entryPoints:
        - web
      priority: 5

    traefik-redirect:
      rule: "Path(`/traefik`) || Path(`/traefik/`)"
      middlewares:
        - "traefik-redirect"
      entryPoints:
        - web
      priority: 100

    nomad-redirect:
      rule: "Path(`/nomad`) || Path(`/nomad/`)"
      service: nomad-cluster
      middlewares:
        - nomad-redirect
      entryPoints:
        - web
      priority: 100

    nomad-ui:
      rule: "PathPrefix(`/nomad/ui`)"
      service: nomad-cluster
      middlewares:
        - nomad-stripprefix
      entryPoints:
        - web
      priority: 5

    nomad-api:
      rule: "PathPrefix(`/nomad/v1`)"
      service: nomad-cluster
      middlewares:
        - nomad-stripprefix
      entryPoints:
        - web
      priority: 5
EOF
        destination = "local/dynamic.yml"
      }

      resources {
        cpu = 500
        memory = 512
      }
    }
  }
}
@@ -0,0 +1,150 @@
job "traefik-consul-lb" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "warden"
    }

    update {
      min_healthy_time = "60s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = false
    }

    network {
      mode = "host"
      port "http" {
        static = 80
        host_network = "tailscale0"
      }
      port "traefik" {
        static = 8080
        host_network = "tailscale0"
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "hcp1.tailnet-68f9.ts.net:80"
  traefik:
    address: "100.97.62.111:8080"

providers:
  file:
    filename: /local/dynamic.yml
    watch: true

metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    addRoutersLabels: true

log:
  level: INFO
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"

    traefik-stripprefix:
      stripPrefix:
        prefixes:
          - "/traefik"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8500"  # Beijing, preferred
          - url: "http://ch4.tailnet-68f9.ts.net:8500"     # Korea, fallback
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "PathPrefix(`/consul`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - web

    traefik-dashboard:
      rule: "PathPrefix(`/traefik`)"
      service: dashboard@internal
      middlewares:
        - traefik-stripprefix
      entryPoints:
        - web
EOF
        destination = "local/dynamic.yml"
      }

      resources {
        cpu = 500
        memory = 512
      }

      service {
        name = "consul-lb"
        port = "http"

        check {
          name = "consul-lb-health"
          type = "http"
          path = "/consul/v1/status/leader"
          interval = "30s"
          timeout = "5s"
        }
      }

      service {
        name = "traefik-dashboard"
        port = "traefik"

        check {
          name = "traefik-dashboard-health"
          type = "http"
          path = "/api/rawdata"
          interval = "30s"
          timeout = "5s"
        }
      }

    }
  }
}
@@ -0,0 +1,40 @@
job "traefik-no-service" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
        host_network = "tailscale0"
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--api.dashboard=true",
          "--api.insecure=true",
          "--providers.file.directory=/tmp",
          "--entrypoints.web.address=:80"
        ]
      }

      resources {
        cpu = 200
        memory = 128
      }
    }
  }
}
@@ -0,0 +1,68 @@
job "traefik-simple" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
        host_network = "tailscale0"
      }
      port "traefik" {
        static = 8080
        host_network = "tailscale0"
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "0.0.0.0:80"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
    exposedByDefault: false
    prefix: "traefik"

log:
  level: INFO
EOF
        destination = "local/traefik.yml"
      }

      resources {
        cpu = 500
        memory = 512
      }
    }
  }
}
@@ -11,9 +11,9 @@ job "traefik-consul-lb" {
    }

    update {
-     min_healthy_time = "5s"
-     healthy_deadline = "10m"
-     progress_deadline = "15m"
+     min_healthy_time = "60s"
+     healthy_deadline = "5m"
+     progress_deadline = "10m"
      auto_revert = false
    }

@@ -56,6 +56,12 @@ providers:
    filename: /local/dynamic.yml
    watch: true

metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    addRoutersLabels: true

log:
  level: INFO
EOF
@@ -65,13 +71,24 @@ EOF
      template {
        data = <<EOF
http:
  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"

    traefik-stripprefix:
      stripPrefix:
        prefixes:
          - "/traefik"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8500"  # Beijing, preferred
-         - url: "http://master.tailnet-68f9.ts.net:8500"  # fallback
-         - url: "http://ash3c.tailnet-68f9.ts.net:8500"   # fallback
+         - url: "http://ch4.tailnet-68f9.ts.net:8500"     # Korea, fallback
+         - url: "http://ash3c.tailnet-68f9.ts.net:8500"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
@@ -79,8 +96,18 @@ http:

  routers:
    consul-api:
-     rule: "PathPrefix(`/`)"
+     rule: "PathPrefix(`/consul`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - web

    traefik-dashboard:
      rule: "PathPrefix(`/traefik`)"
      service: dashboard@internal
      middlewares:
        - traefik-stripprefix
      entryPoints:
        - web
EOF
@@ -92,6 +119,32 @@ EOF
        memory = 512
      }

      service {
        name = "consul-lb"
        port = "http"

        check {
          name = "consul-lb-health"
          type = "http"
          path = "/consul/v1/status/leader"
          interval = "30s"
          timeout = "5s"
        }
      }

      service {
        name = "traefik-dashboard"
        port = "traefik"

        check {
          name = "traefik-dashboard-health"
          type = "http"
          path = "/api/rawdata"
          interval = "30s"
          timeout = "5s"
        }
      }

    }
  }
}
@@ -0,0 +1,7 @@
# Vault Configuration

## Jobs

- `vault-cluster-exec.nomad` - Vault cluster (exec driver)
- `vault-cluster-podman.nomad` - Vault cluster (podman driver)
- `vault-dev-warden.nomad` - Vault dev environment
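A deploy-and-check sketch for the exec-driver variant; the `components/vault/jobs/` path mirrors the Traefik layout and is an assumption:

```bash
# Hypothetical path, mirroring components/traefik/jobs/.
nomad job run components/vault/jobs/vault-cluster-exec.nomad

# /v1/sys/health returns 200 for the unsealed active node,
# 429 for an unsealed standby, and 503 for a sealed node.
curl -s -o /dev/null -w '%{http_code}\n' http://ch4.tailnet-68f9.ts.net:8200/v1/sys/health
```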
@@ -2,7 +2,7 @@ job "vault-cluster-exec" {
  datacenters = ["dc1"]
  type = "service"

- group "vault-master" {
+ group "vault-ch4" {
    count = 1

    # Use an attribute that exists instead of a consul version check
@@ -14,7 +14,7 @@ job "vault-cluster-exec" {

    constraint {
      attribute = "${node.unique.name}"
-     value = "kr-master"
+     value = "ch4"
    }

    network {
@@ -0,0 +1,241 @@
job "vault-cluster-nomad" {
  datacenters = ["dc1"]
  type = "service"

  group "vault-ch4" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator = "="
      value = "ch4"
    }

    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    task "vault" {
      driver = "exec"

      consul {
        namespace = "default"
      }

      resources {
        cpu = 500
        memory = 1024
      }

      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Read the config from Consul
      template {
        data = <<EOF
{{ key "vault/config" }}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      config {
        command = "vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      restart {
        attempts = 2
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    update {
      max_parallel = 3
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }

  group "vault-ash3c" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator = "="
      value = "ash3c"
    }

    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    task "vault" {
      driver = "exec"

      consul {
        namespace = "default"
      }

      resources {
        cpu = 500
        memory = 1024
      }

      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Read the config from Consul
      template {
        data = <<EOF
{{ key "vault/config" }}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      config {
        command = "vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      restart {
        attempts = 2
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    update {
      max_parallel = 3
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }

  group "vault-warden" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator = "="
      value = "warden"
    }

    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    task "vault" {
      driver = "exec"

      consul {
        namespace = "default"
      }

      resources {
        cpu = 500
        memory = 1024
      }

      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Read the config from Consul
      template {
        data = <<EOF
{{ key "vault/config" }}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      config {
        command = "vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      restart {
        attempts = 2
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    update {
      max_parallel = 3
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }
}
@@ -0,0 +1,157 @@
job "vault" {
  datacenters = ["dc1"]
  type = "service"

  # Run only on the warden, ch4, and ash3c nodes
  constraint {
    attribute = "${node.unique.name}"
    operator = "regexp"
    value = "^(warden|ch4|ash3c)$"
  }

  group "vault" {
    count = 3

    # Ensure each node runs exactly one instance
    constraint {
      operator = "distinct_hosts"
      value = "true"
    }

    # Networking
    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    # Service registration - includes version info
    service {
      name = "vault"
      port = "http"

      # Version tag added so checks are not rejected
      tags = [
        "vault",
        "secrets",
        "version:1.20.3"
      ]

      check {
        name = "vault-health"
        type = "http"
        path = "/v1/sys/health"
        interval = "10s"
        timeout = "3s"
        method = "GET"
      }

      # Seal-status health check
      check {
        name = "vault-sealed-check"
        type = "script"
        command = "/bin/sh"
        args = ["-c", "vault status -format=json | jq -r '.sealed' | grep -q 'false'"]
        interval = "30s"
        timeout = "5s"
        task = "vault"
      }
    }

    # Task
    task "vault" {
      driver = "raw_exec"

      # Resources
      resources {
        cpu = 500
        memory = 1024
      }

      # Environment variables
      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Template - Vault config file
      template {
        data = <<EOF
ui = true

storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault"
}

# HTTP listener (no TLS, since load balancing is handled in front of Nomad)
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 1
}

# Disable mlock to avoid permission issues
disable_mlock = true

# Logging
log_level = "INFO"
log_format = "json"

# Performance tuning
max_lease_ttl = "168h"
default_lease_ttl = "24h"

# HA configuration
ha_storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault"
}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      # Startup command (the config above is a server config, so run "server")
      config {
        command = "/usr/bin/vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      # Restart policy
      restart {
        attempts = 3
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    # Update strategy
    update {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    # Migration strategy
    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }
}
@@ -0,0 +1,213 @@
job "traefik-cloudflare-v1" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
        host_network = "tailscale0"
      }
      port "https" {
        static = 443
        host_network = "tailscale0"
      }
      port "traefik" {
        static = 8080
        host_network = "tailscale0"
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "0.0.0.0:80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
          permanent: true
  websecure:
    address: "0.0.0.0:443"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
    exposedByDefault: false
    prefix: "traefik"
    defaultRule: "Host(`{{ .Name }}.git4ta.me`)"
  file:
    filename: /local/dynamic.yml
    watch: true

certificatesResolvers:
  cloudflare:
    acme:
      email: houzhongxu.houzhongxu@gmail.com
      storage: /local/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 30s
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

log:
  level: DEBUG
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  serversTransports:
    waypoint-insecure:
      insecureSkipVerify: true

  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"
    waypoint-auth:
      replacePathRegex:
        regex: "^/auth/token(.*)$"
        replacement: "/auth/token$1"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8500"  # Beijing, preferred
          - url: "http://ch4.tailnet-68f9.ts.net:8500"     # Korea, fallback
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:4646"  # Beijing, preferred
          - url: "http://ch4.tailnet-68f9.ts.net:4646"     # Korea, fallback
          - url: "http://ash3c.tailnet-68f9.ts.net:4646"   # US, fallback
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    waypoint-cluster:
      loadBalancer:
        servers:
          - url: "https://hcp1.tailnet-68f9.ts.net:9701"  # hcp1 node HTTPS API
        serversTransport: waypoint-insecure

    vault-cluster:
      loadBalancer:
        servers:
          - url: "http://ch4.tailnet-68f9.ts.net:8200"     # Korea, active node
          - url: "http://ash3c.tailnet-68f9.ts.net:8200"   # US, standby node
          - url: "http://warden.tailnet-68f9.ts.net:8200"  # Beijing, standby node
        healthCheck:
          path: "/v1/sys/health"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "Host(`consul.git4ta.me`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    traefik-dashboard:
      rule: "Host(`traefik.git4ta.me`)"
      service: dashboard@internal
      middlewares:
        - dashboard_redirect@internal
        - dashboard_stripprefix@internal
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-ui:
      rule: "Host(`nomad.git4ta.me`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    waypoint-ui:
      rule: "Host(`waypoint.git4ta.me`)"
      service: waypoint-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    vault-ui:
      rule: "Host(`vault.git4ta.me`)"
      service: vault-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
EOF
        destination = "local/dynamic.yml"
      }

      template {
        data = <<EOF
CLOUDFLARE_EMAIL=houzhongxu.houzhongxu@gmail.com
CLOUDFLARE_DNS_API_TOKEN=HYT-cfZTP_jq6Xd9g3tpFMwxopOyIrf8LZpmGAI3
CLOUDFLARE_ZONE_API_TOKEN=HYT-cfZTP_jq6Xd9g3tpFMwxopOyIrf8LZpmGAI3
EOF
        destination = "local/cloudflare.env"
        env = true
      }

      resources {
        cpu = 500
        memory = 512
      }
    }
  }
}
@@ -0,0 +1,241 @@
job "vault-cluster-nomad" {
  datacenters = ["dc1"]
  type = "service"

  group "vault-ch4" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator = "="
      value = "ch4"
    }

    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    task "vault" {
      driver = "exec"

      consul {
        namespace = "default"
      }

      resources {
        cpu = 500
        memory = 1024
      }

      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Read the config from Consul
      template {
        data = <<EOF
{{ key "vault/config" }}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      config {
        command = "vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      restart {
        attempts = 2
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    update {
      max_parallel = 3
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }

  group "vault-ash3c" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator = "="
      value = "ash3c"
    }

    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    task "vault" {
      driver = "exec"

      consul {
        namespace = "default"
      }

      resources {
        cpu = 500
        memory = 1024
      }

      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Read the config from Consul
      template {
        data = <<EOF
{{ key "vault/config" }}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      config {
        command = "vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      restart {
        attempts = 2
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    update {
      max_parallel = 3
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }

  group "vault-warden" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator = "="
      value = "warden"
    }

    network {
      port "http" {
        static = 8200
        to = 8200
      }
    }

    task "vault" {
      driver = "exec"

      consul {
        namespace = "default"
      }

      resources {
        cpu = 500
        memory = 1024
      }

      env {
        VAULT_ADDR = "http://127.0.0.1:8200"
      }

      # Read the config from Consul
      template {
        data = <<EOF
{{ key "vault/config" }}
EOF
        destination = "local/vault.hcl"
        perms = "644"
        wait {
          min = "2s"
          max = "10s"
        }
      }

      config {
        command = "vault"
        args = [
          "server",
          "-config=/local/vault.hcl"
        ]
      }

      restart {
        attempts = 2
        interval = "30m"
        delay = "15s"
        mode = "fail"
      }
    }

    update {
      max_parallel = 3
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
      progress_deadline = "10m"
      auto_revert = true
      canary = 0
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }
  }
}
@@ -0,0 +1,49 @@
job "waypoint-server" {
  datacenters = ["dc1"]
  type = "service"

  group "waypoint" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value = "hcp1"
    }

    network {
      port "http" {
        static = 9701
      }
      port "grpc" {
        static = 9702
      }
    }

    task "waypoint" {
      driver = "raw_exec"

      config {
        command = "/usr/local/bin/waypoint"

        args = [
          "server", "run",
          "-accept-tos",
          "-vvv",
          "-db=/opt/waypoint/waypoint.db",
          "-listen-grpc=0.0.0.0:9702",
          "-listen-http=0.0.0.0:9701"
        ]
      }

      resources {
        cpu = 500
        memory = 512
      }

      env {
        WAYPOINT_LOG_LEVEL = "DEBUG"
      }
    }
  }
}
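A quick check that the server came up on its static ports; a sketch, with `-k` because `waypoint server run` serves a self-signed certificate on the HTTP listener:

```bash
# UI/API listener on 9701 (TLS with a self-signed cert, hence -k).
curl -k -s -o /dev/null -w '%{http_code}\n' https://hcp1.tailnet-68f9.ts.net:9701

# gRPC listener on 9702: just confirm something is listening.
nc -z hcp1.tailnet-68f9.ts.net 9702 && echo "grpc listening"
```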
@@ -0,0 +1 @@
CF Token: 0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr
@@ -0,0 +1,95 @@
#!/bin/bash

# Nomad ARMv7 build script
# For the onecloud1 node

set -e

echo "🚀 Building Nomad for ARMv7..."

# Check the system architecture
ARCH=$(uname -m)
echo "📋 Current architecture: $ARCH"

# Set the Go environment variables
export GOOS=linux
export GOARCH=arm
export GOARM=7
export CGO_ENABLED=0

echo "🔧 Build environment:"
echo "   GOOS=$GOOS"
echo "   GOARCH=$GOARCH"
echo "   GOARM=$GOARM"
echo "   CGO_ENABLED=$CGO_ENABLED"

# Check the Go toolchain
if ! command -v go &> /dev/null; then
    echo "❌ Go is not installed; installing..."
    # Install Go (assumes Ubuntu/Debian)
    sudo apt update
    sudo apt install -y golang-go
fi

GO_VERSION=$(go version)
echo "✅ Go version: $GO_VERSION"

# Create the build directory
BUILD_DIR="/tmp/nomad-build"
mkdir -p $BUILD_DIR
cd $BUILD_DIR

echo "📥 Cloning the Nomad source..."
if [ -d "nomad" ]; then
    echo "🔄 Updating the existing repository..."
    cd nomad
    git pull
else
    git clone https://github.com/hashicorp/nomad.git
    cd nomad
fi

# Check out the latest stable release
echo "🏷️ Checking out the latest stable tag..."
git checkout $(git describe --tags --abbrev=0)

# Build
echo "🔨 Building..."
make dev

# Check the build result
if [ -f "bin/nomad" ]; then
    echo "✅ Build succeeded!"

    # Show file info
    file bin/nomad
    ls -lh bin/nomad

    # Back up the existing Nomad binary
    if [ -f "/usr/bin/nomad" ]; then
        echo "💾 Backing up the existing Nomad..."
        sudo cp /usr/bin/nomad /usr/bin/nomad.backup.$(date +%Y%m%d-%H%M%S)
    fi

    # Install the new version
    echo "📦 Installing the new version..."
    sudo cp bin/nomad /usr/bin/nomad
    sudo chmod +x /usr/bin/nomad

    # Verify the installation
    echo "🔍 Verifying the installation..."
    /usr/bin/nomad version

    echo "🎉 Nomad ARMv7 build installed!"

else
    echo "❌ Build failed!"
    exit 1
fi

# Clean up
echo "🧹 Cleaning up build files..."
cd /
rm -rf $BUILD_DIR

echo "✨ Done!"
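A usage sketch, assuming the script is saved as `build-nomad-armv7.sh` (the filename is an assumption) and run on the ARMv7 node itself:

```bash
chmod +x build-nomad-armv7.sh
./build-nomad-armv7.sh

# Afterwards, the binary should report an ARM build.
nomad version
file /usr/bin/nomad   # expect: ELF 32-bit LSB executable, ARM, EABI5
```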
@@ -2,10 +2,25 @@ job "consul-cluster-nomad" {
  datacenters = ["dc1"]
  type = "service"

- group "consul-master" {
+ group "consul-ch4" {
    constraint {
      attribute = "${node.unique.name}"
-     value = "master"
+     value = "ch4"
    }

    network {
      port "http" {
        static = 8500
      }
      port "server" {
        static = 8300
      }
      port "serf-lan" {
        static = 8301
      }
      port "serf-wan" {
        static = 8302
      }
    }

    task "consul" {
@@ -16,18 +31,18 @@ job "consul-cluster-nomad" {
        args = [
          "agent",
          "-server",
-         "-bootstrap-expect=3",
+         "-bootstrap-expect=2",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
-         "-bind=100.117.106.136",
-         "-advertise=100.117.106.136",
-         "-retry-join=100.116.80.94",
-         "-retry-join=100.122.197.112",
+         "-bind={{ env "NOMAD_IP_http" }}",
+         "-advertise={{ env "NOMAD_IP_http" }}",
+         "-retry-join=ash3c.tailnet-68f9.ts.net:8301",
+         "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
-         "-serf-wan-port=8302"
+         "-serf-wan-port=8302",
        ]
      }

@@ -45,6 +60,21 @@ job "consul-cluster-nomad" {
      value = "ash3c"
    }

    network {
      port "http" {
        static = 8500
      }
      port "server" {
        static = 8300
      }
      port "serf-lan" {
        static = 8301
      }
      port "serf-wan" {
        static = 8302
      }
    }

    task "consul" {
      driver = "exec"

@@ -53,13 +83,12 @@ job "consul-cluster-nomad" {
        args = [
          "agent",
          "-server",
-         "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
-         "-bind=100.116.80.94",
-         "-advertise=100.116.80.94",
-         "-retry-join=100.117.106.136",
-         "-retry-join=100.122.197.112",
+         "-bind={{ env "NOMAD_IP_http" }}",
+         "-advertise={{ env "NOMAD_IP_http" }}",
+         "-retry-join=ch4.tailnet-68f9.ts.net:8301",
+         "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
@@ -82,6 +111,21 @@ job "consul-cluster-nomad" {
      value = "warden"
    }

    network {
      port "http" {
        static = 8500
      }
      port "server" {
        static = 8300
      }
      port "serf-lan" {
        static = 8301
      }
      port "serf-wan" {
        static = 8302
      }
    }

    task "consul" {
      driver = "exec"

@@ -90,13 +134,12 @@ job "consul-cluster-nomad" {
        args = [
          "agent",
          "-server",
-         "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
-         "-bind=100.122.197.112",
-         "-advertise=100.122.197.112",
-         "-retry-join=100.117.106.136",
-         "-retry-join=100.116.80.94",
+         "-bind={{ env "NOMAD_IP_http" }}",
+         "-advertise={{ env "NOMAD_IP_http" }}",
+         "-retry-join=ch4.tailnet-68f9.ts.net:8301",
+         "-retry-join=ash3c.tailnet-68f9.ts.net:8301",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
@@ -0,0 +1,158 @@
job "consul-cluster-nomad" {
  datacenters = ["dc1"]
  type = "service"

  group "consul-ch4" {
    constraint {
      attribute = "${node.unique.name}"
      value = "ch4"
    }

    network {
      port "http" {
        static = 8500
      }
      port "server" {
        static = 8300
      }
      port "serf-lan" {
        static = 8301
      }
      port "serf-wan" {
        static = 8302
      }
    }

    task "consul" {
      driver = "exec"

      config {
        command = "consul"
        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind={{ env "NOMAD_IP_http" }}",
          "-advertise={{ env "NOMAD_IP_http" }}",
          "-retry-join=ash3c.tailnet-68f9.ts.net:8301",
          "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
          "-serf-wan-port=8302"
        ]
      }

      resources {
        cpu    = 300
        memory = 512
      }
    }
  }

  group "consul-ash3c" {
    constraint {
      attribute = "${node.unique.name}"
      value = "ash3c"
    }

    network {
      port "http" {
        static = 8500
      }
      port "server" {
        static = 8300
      }
      port "serf-lan" {
        static = 8301
      }
      port "serf-wan" {
        static = 8302
      }
    }

    task "consul" {
      driver = "exec"

      config {
        command = "consul"
        args = [
          "agent",
          "-server",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind={{ env "NOMAD_IP_http" }}",
          "-advertise={{ env "NOMAD_IP_http" }}",
          "-retry-join=ch4.tailnet-68f9.ts.net:8301",
          "-retry-join=warden.tailnet-68f9.ts.net:8301",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
          "-serf-wan-port=8302"
        ]
      }

      resources {
        cpu    = 300
        memory = 512
      }
    }
  }

  group "consul-warden" {
    constraint {
      attribute = "${node.unique.name}"
      value = "warden"
    }

    network {
      port "http" {
        static = 8500
      }
      port "server" {
        static = 8300
      }
      port "serf-lan" {
        static = 8301
      }
      port "serf-wan" {
        static = 8302
      }
    }

    task "consul" {
      driver = "exec"

      config {
        command = "consul"
        args = [
          "agent",
          "-server",
          "-data-dir=/opt/nomad/data/consul",
          "-client=0.0.0.0",
          "-bind={{ env "NOMAD_IP_http" }}",
          "-advertise={{ env "NOMAD_IP_http" }}",
          "-retry-join=ch4.tailnet-68f9.ts.net:8301",
          "-retry-join=ash3c.tailnet-68f9.ts.net:8301",
          "-ui",
          "-http-port=8500",
          "-server-port=8300",
          "-serf-lan-port=8301",
          "-serf-wan-port=8302"
        ]
      }

      resources {
        cpu    = 300
        memory = 512
      }
    }
  }
}
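Only the ch4 group passes `-bootstrap-expect=3`, so Raft still needs all three agents up before it elects a leader. A minimal check after running the job (file name assumed):

```bash
nomad job run consul-cluster-nomad.nomad

# Expect three voting peers once bootstrap completes
consul operator raft list-peers -http-addr=http://ch4.tailnet-68f9.ts.net:8500
```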
@@ -0,0 +1,43 @@
job "juicefs-controller" {
  datacenters = ["dc1"]
  type = "system"

  group "controller" {
    task "plugin" {
      driver = "podman"

      config {
        image = "juicedata/juicefs-csi-driver:v0.14.1"
        args = [
          "--endpoint=unix://csi/csi.sock",
          "--logtostderr",
          "--nodeid=${node.unique.id}",
          "--v=5",
          "--by-process=true"
        ]
        privileged = true
      }

      csi_plugin {
        id        = "juicefs-nfs"
        type      = "controller"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 100
        memory = 512
      }

      env {
        POD_NAME = "csi-controller"
      }
    }
  }
}
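Once the controller allocation is healthy, the plugin should report as registered; the plugin ID comes from the `csi_plugin` stanza above:

```bash
nomad plugin status juicefs-nfs
```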
@@ -0,0 +1,38 @@
job "juicefs-csi-controller" {
  datacenters = ["dc1"]
  type = "system"

  group "controller" {
    task "juicefs-csi-driver" {
      driver = "podman"

      config {
        image = "juicedata/juicefs-csi-driver:v0.14.1"
        args = [
          "--endpoint=unix://csi/csi.sock",
          "--logtostderr",
          "--nodeid=${node.unique.id}",
          "--v=5"
        ]
        privileged = true
      }

      env {
        POD_NAME      = "juicefs-csi-controller"
        POD_NAMESPACE = "default"
        NODE_NAME     = "${node.unique.id}"
      }

      csi_plugin {
        id        = "juicefs0"
        type      = "controller"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 100
        memory = 512
      }
    }
  }
}
@@ -1,23 +0,0 @@
job "nomad-consul-setup" {
  datacenters = ["dc1"]
  type = "system"

  group "nomad-config" {
    task "setup-consul" {
      driver = "exec"

      config {
        command = "sh"
        args = [
          "-c",
          "if grep -q 'server.*enabled.*true' /etc/nomad.d/nomad.hcl; then sed -i '/^consul {/,/^}/c\\consul {\\n address = \"master.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = true\\n client_auto_join = false\\n}' /etc/nomad.d/nomad.hcl; else sed -i '/^consul {/,/^}/c\\consul {\\n address = \"master.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500\"\\n server_service_name = \"nomad\"\\n client_service_name = \"nomad-client\"\\n auto_advertise = true\\n server_auto_join = false\\n client_auto_join = true\\n}' /etc/nomad.d/nomad.hcl; fi && systemctl restart nomad"
        ]
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}
@@ -0,0 +1,43 @@
# NFS CSI Volume Definition for Nomad
# This file defines a CSI volume so the NFS storage shows up in the Nomad UI

volume "nfs-shared-csi" {
  type = "csi"

  # CSI plugin name
  source = "csi-nfs"

  # Capacity settings
  capacity_min = "1GiB"
  capacity_max = "10TiB"

  # Access mode - multi-node read/write
  access_mode = "multi-node-multi-writer"

  # Mount options
  mount_options {
    fs_type     = "nfs4"
    mount_flags = "rw,relatime,vers=4.2"
  }

  # Topology constraint - make sure it runs on nodes that have the NFS mount
  topology_request {
    required {
      topology {
        "node" = "{{ range $node := nomadNodes }}{{ if eq $node.Status "ready" }}{{ $node.Name }}{{ end }}{{ end }}"
      }
    }
  }

  # Volume parameters
  parameters {
    server = "snail"
    share  = "/fs/1000/nfs/Fnsync"
  }
}
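Volume definitions like this one are registered against an existing CSI plugin rather than run as jobs; a sketch of the registration flow, with `nfs-shared-csi.volume.hcl` as an assumed filename:

```bash
nomad volume register nfs-shared-csi.volume.hcl
nomad volume status nfs-shared-csi
```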
@@ -0,0 +1,22 @@
# Dynamic Host Volume Definition for NFS
# This file defines a dynamic host volume so the NFS storage shows up in the Nomad UI

volume "nfs-shared-dynamic" {
  type = "host"

  # Use a dynamic host volume
  source = "fnsync"

  # Read-only setting
  read_only = false

  # Capacity information (for display)
  capacity_min = "1GiB"
  capacity_max = "10TiB"
}
@@ -0,0 +1,22 @@
# NFS Host Volume Definition for Nomad UI
# This file defines a host volume so the NFS storage shows up in the Nomad UI

volume "nfs-shared-host" {
  type = "host"

  # Use a host volume
  source = "fnsync"

  # Read-only setting
  read_only = false

  # Capacity information (for display)
  capacity_min = "1GiB"
  capacity_max = "10TiB"
}
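Both host-volume variants resolve against a `fnsync` volume that each client must expose in its own configuration; one way to confirm a client actually advertises it (`<node-id>` is a placeholder):

```bash
# List nodes, then inspect one for the "fnsync" host volume
nomad node status
nomad node status -verbose <node-id> | grep -A 3 "Host Volumes"
```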
@@ -0,0 +1,123 @@
http:
  serversTransports:
    waypoint-insecure:
      insecureSkipVerify: true
    authentik-insecure:
      insecureSkipVerify: true

  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"
    waypoint-auth:
      replacePathRegex:
        regex: "^/auth/token(.*)$"
        replacement: "/auth/token$1"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://ch4.tailnet-68f9.ts.net:8500"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:8500" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://ch2.tailnet-68f9.ts.net:4646"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:4646" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:4646"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    waypoint-cluster:
      loadBalancer:
        servers:
          - url: "https://hcp1.tailnet-68f9.ts.net:9701" # hcp1 node HTTPS API
        serversTransport: waypoint-insecure

    vault-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8200" # Beijing, single node
        healthCheck:
          path: "/ui/"
          interval: "30s"
          timeout: "15s"

    authentik-cluster:
      loadBalancer:
        servers:
          - url: "https://authentik.tailnet-68f9.ts.net:9443" # Authentik container HTTPS port
        serversTransport: authentik-insecure
        healthCheck:
          path: "/flows/-/default/authentication/"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "Host(`consul.git4ta.tech`)"
      service: consul-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
      middlewares:
        - consul-stripprefix

    consul-ui:
      rule: "Host(`consul.git-4ta.live`) && PathPrefix(`/ui`)"
      service: consul-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-api:
      rule: "Host(`nomad.git-4ta.live`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-ui:
      rule: "Host(`nomad.git-4ta.live`) && PathPrefix(`/ui`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    waypoint-ui:
      rule: "Host(`waypoint.git-4ta.live`)"
      service: waypoint-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    vault-ui:
      rule: "Host(`vault.git-4ta.live`)"
      service: vault-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    authentik-ui:
      rule: "Host(`authentik1.git-4ta.live`)"
      service: authentik-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
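A minimal smoke test of the routed services, assuming DNS for the `git-4ta.live` hosts already points at this Traefik instance:

```bash
curl -s https://consul.git-4ta.live/v1/status/leader
curl -s https://nomad.git-4ta.live/v1/status/leader
curl -ks -o /dev/null -w '%{http_code}\n' https://vault.git-4ta.live/ui/
```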
@@ -0,0 +1,254 @@
job "traefik-cloudflare-v2" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator  = "="
      value     = "hcp1"
    }

    volume "traefik-certs" {
      type      = "host"
      read_only = false
      source    = "traefik-certs"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
      port "traefik" {
        static = 8080
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      env {
        CLOUDFLARE_EMAIL          = "houzhongxu.houzhongxu@gmail.com"
        CLOUDFLARE_DNS_API_TOKEN  = "HYT-cfZTP_jq6Xd9g3tpFMwxopOyIrf8LZpmGAI3"
        CLOUDFLARE_ZONE_API_TOKEN = "HYT-cfZTP_jq6Xd9g3tpFMwxopOyIrf8LZpmGAI3"
      }

      volume_mount {
        volume      = "traefik-certs"
        destination = "/opt/traefik/certs"
        read_only   = false
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true
  debug: true

entryPoints:
  web:
    address: "0.0.0.0:80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
          permanent: true
  websecure:
    address: "0.0.0.0:443"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
    exposedByDefault: false
    prefix: "traefik"
    defaultRule: "Host(`{{ .Name }}.git-4ta.live`)"
  file:
    filename: /local/dynamic.yml
    watch: true

certificatesResolvers:
  cloudflare:
    acme:
      email: {{ env "CLOUDFLARE_EMAIL" }}
      storage: /opt/traefik/certs/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 30s
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

log:
  level: DEBUG
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  serversTransports:
    waypoint-insecure:
      insecureSkipVerify: true
    authentik-insecure:
      insecureSkipVerify: true

  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"
    waypoint-auth:
      replacePathRegex:
        regex: "^/auth/token(.*)$"
        replacement: "/auth/token$1"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://ch4.tailnet-68f9.ts.net:8500"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:8500" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://ch2.tailnet-68f9.ts.net:4646"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:4646" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:4646"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    waypoint-cluster:
      loadBalancer:
        servers:
          - url: "https://hcp1.tailnet-68f9.ts.net:9701" # hcp1 node HTTPS API
        serversTransport: waypoint-insecure

    vault-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8200" # Beijing, single node
        healthCheck:
          path: "/ui/"
          interval: "30s"
          timeout: "15s"

    authentik-cluster:
      loadBalancer:
        servers:
          - url: "https://authentik.tailnet-68f9.ts.net:9443" # Authentik container HTTPS port
        serversTransport: authentik-insecure
        healthCheck:
          path: "/flows/-/default/authentication/"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "Host(`consul.git-4ta.live`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    traefik-dashboard:
      rule: "Host(`traefik.git-4ta.live`)"
      service: dashboard@internal
      middlewares:
        - dashboard_redirect@internal
        - dashboard_stripprefix@internal
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-ui:
      rule: "Host(`nomad.git-4ta.live`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    waypoint-ui:
      rule: "Host(`waypoint.git-4ta.live`)"
      service: waypoint-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    vault-ui:
      rule: "Host(`vault.git-4ta.live`)"
      service: vault-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    authentik-ui:
      rule: "Host(`authentik.git-4ta.live`)"
      service: authentik-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
EOF
        destination = "local/dynamic.yml"
      }

      template {
        data = <<EOF
CLOUDFLARE_EMAIL={{ env "CLOUDFLARE_EMAIL" }}
CLOUDFLARE_DNS_API_TOKEN={{ env "CLOUDFLARE_DNS_API_TOKEN" }}
CLOUDFLARE_ZONE_API_TOKEN={{ env "CLOUDFLARE_ZONE_API_TOKEN" }}
EOF
        destination = "local/cloudflare.env"
        env         = true
      }

      # Test certificate for permission control
      template {
        data        = "-----BEGIN CERTIFICATE-----\nTEST CERTIFICATE FOR PERMISSION CONTROL\n-----END CERTIFICATE-----"
        destination = "/opt/traefik/certs/test-cert.pem"
        perms       = "600"
      }

      resources {
        cpu    = 500
        memory = 512
      }
    }
  }
}
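If the DNS challenge succeeds, Traefik writes issued certificates into the mounted `acme.json`; two hedged checks after the job starts (the `-job` flag picks an allocation of the job at random):

```bash
# Watch ACME activity in the task log
nomad alloc logs -job traefik-cloudflare-v2 traefik | grep -i acme

# The store should exist on the host volume and stay mode 600
ls -l /opt/traefik/certs/acme.json
```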
@@ -0,0 +1,239 @@
job "traefik-cloudflare-v2" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value     = "hcp1"
    }

    volume "traefik-certs" {
      type      = "host"
      read_only = false
      source    = "traefik-certs"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
      port "traefik" {
        static = 8080
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      volume_mount {
        volume      = "traefik-certs"
        destination = "/opt/traefik/certs"
        read_only   = false
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "0.0.0.0:80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
          permanent: true
  websecure:
    address: "0.0.0.0:443"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
    exposedByDefault: false
    prefix: "traefik"
    defaultRule: "Host(`{{ .Name }}.git-4ta.live`)"
  file:
    filename: /local/dynamic.yml
    watch: true

certificatesResolvers:
  cloudflare:
    acme:
      email: houzhongxu.houzhongxu@gmail.com
      storage: /opt/traefik/certs/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 30s
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

log:
  level: DEBUG
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  serversTransports:
    waypoint-insecure:
      insecureSkipVerify: true
    authentik-insecure:
      insecureSkipVerify: true

  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"
    waypoint-auth:
      replacePathRegex:
        regex: "^/auth/token(.*)$"
        replacement: "/auth/token$1"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://ch4.tailnet-68f9.ts.net:8500"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:8500" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://ch2.tailnet-68f9.ts.net:4646"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:4646" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:4646"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    waypoint-cluster:
      loadBalancer:
        servers:
          - url: "https://hcp1.tailnet-68f9.ts.net:9701" # hcp1 node HTTPS API
        serversTransport: waypoint-insecure

    vault-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8200" # Beijing, single node
        healthCheck:
          path: "/ui/"
          interval: "30s"
          timeout: "15s"

    authentik-cluster:
      loadBalancer:
        servers:
          - url: "https://authentik.tailnet-68f9.ts.net:9443" # Authentik container HTTPS port
        serversTransport: authentik-insecure
        healthCheck:
          path: "/flows/-/default/authentication/"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "Host(`consul.git-4ta.live`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    traefik-dashboard:
      rule: "Host(`traefik.git-4ta.live`)"
      service: dashboard@internal
      middlewares:
        - dashboard_redirect@internal
        - dashboard_stripprefix@internal
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-ui:
      rule: "Host(`nomad.git-4ta.live`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    waypoint-ui:
      rule: "Host(`waypoint.git-4ta.live`)"
      service: waypoint-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    vault-ui:
      rule: "Host(`vault.git-4ta.live`)"
      service: vault-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    authentik-ui:
      rule: "Host(`authentik.git4ta.tech`)"
      service: authentik-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
EOF
        destination = "local/dynamic.yml"
      }

      template {
        data = <<EOF
CLOUDFLARE_EMAIL=houzhongxu.houzhongxu@gmail.com
CLOUDFLARE_DNS_API_TOKEN=0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr
CLOUDFLARE_ZONE_API_TOKEN=0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr
EOF
        destination = "local/cloudflare.env"
        env         = true
      }

      resources {
        cpu    = 500
        memory = 512
      }
    }
  }
}
@@ -0,0 +1,249 @@
job "traefik-cloudflare-v3" {
  datacenters = ["dc1"]
  type = "service"

  group "traefik" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      value     = "hcp1"
    }

    volume "traefik-certs" {
      type      = "host"
      read_only = false
      source    = "traefik-certs"
    }

    network {
      mode = "host"
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
      port "traefik" {
        static = 8080
      }
    }

    task "traefik" {
      driver = "exec"

      config {
        command = "/usr/local/bin/traefik"
        args = [
          "--configfile=/local/traefik.yml"
        ]
      }

      env {
        CLOUDFLARE_EMAIL          = "locksmithknight@gmail.com"
        CLOUDFLARE_DNS_API_TOKEN  = "0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr"
        CLOUDFLARE_ZONE_API_TOKEN = "0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr"
      }

      volume_mount {
        volume      = "traefik-certs"
        destination = "/opt/traefik/certs"
        read_only   = false
      }

      template {
        data = <<EOF
api:
  dashboard: true
  insecure: true

entryPoints:
  web:
    address: "0.0.0.0:80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
          permanent: true
  websecure:
    address: "0.0.0.0:443"
  traefik:
    address: "0.0.0.0:8080"

providers:
  consulCatalog:
    endpoint:
      address: "warden.tailnet-68f9.ts.net:8500"
      scheme: "http"
    watch: true
    exposedByDefault: false
    prefix: "traefik"
    defaultRule: "Host(`{{ .Name }}.git-4ta.live`)"
  file:
    filename: /local/dynamic.yml
    watch: true

certificatesResolvers:
  cloudflare:
    acme:
      email: {{ env "CLOUDFLARE_EMAIL" }}
      storage: /opt/traefik/certs/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 30s

log:
  level: DEBUG
EOF
        destination = "local/traefik.yml"
      }

      template {
        data = <<EOF
http:
  serversTransports:
    waypoint-insecure:
      insecureSkipVerify: true
    authentik-insecure:
      insecureSkipVerify: true

  middlewares:
    consul-stripprefix:
      stripPrefix:
        prefixes:
          - "/consul"
    waypoint-auth:
      replacePathRegex:
        regex: "^/auth/token(.*)$"
        replacement: "/auth/token$1"

  services:
    consul-cluster:
      loadBalancer:
        servers:
          - url: "http://ch4.tailnet-68f9.ts.net:8500"    # South Korea, leader
          - url: "http://warden.tailnet-68f9.ts.net:8500" # Beijing, follower
          - url: "http://ash3c.tailnet-68f9.ts.net:8500"  # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    nomad-cluster:
      loadBalancer:
        servers:
          - url: "http://ch2.tailnet-68f9.ts.net:4646"   # South Korea, leader
          - url: "http://ash3c.tailnet-68f9.ts.net:4646" # United States, follower
        healthCheck:
          path: "/v1/status/leader"
          interval: "30s"
          timeout: "15s"

    waypoint-cluster:
      loadBalancer:
        servers:
          - url: "https://hcp1.tailnet-68f9.ts.net:9701" # hcp1 node HTTPS API
        serversTransport: waypoint-insecure

    vault-cluster:
      loadBalancer:
        servers:
          - url: "http://warden.tailnet-68f9.ts.net:8200" # Beijing, single node
        healthCheck:
          path: "/ui/"
          interval: "30s"
          timeout: "15s"

    authentik-cluster:
      loadBalancer:
        servers:
          - url: "https://authentik.tailnet-68f9.ts.net:9443" # Authentik container HTTPS port
        serversTransport: authentik-insecure
        healthCheck:
          path: "/flows/-/default/authentication/"
          interval: "30s"
          timeout: "15s"

  routers:
    consul-api:
      rule: "Host(`consul.git-4ta.live`)"
      service: consul-cluster
      middlewares:
        - consul-stripprefix
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    traefik-dashboard:
      rule: "Host(`traefik.git-4ta.live`)"
      service: dashboard@internal
      middlewares:
        - dashboard_redirect@internal
        - dashboard_stripprefix@internal
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    traefik-api:
      rule: "Host(`traefik.git-4ta.live`) && PathPrefix(`/api`)"
      service: api@internal
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    nomad-ui:
      rule: "Host(`nomad.git-4ta.live`)"
      service: nomad-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    waypoint-ui:
      rule: "Host(`waypoint.git-4ta.live`)"
      service: waypoint-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    vault-ui:
      rule: "Host(`vault.git-4ta.live`)"
      service: vault-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare

    authentik-ui:
      rule: "Host(`authentik1.git-4ta.live`)"
      service: authentik-cluster
      entryPoints:
        - websecure
      tls:
        certResolver: cloudflare
EOF
        destination = "local/dynamic.yml"
      }

      template {
        data = <<EOF
CLOUDFLARE_EMAIL=locksmithknight@gmail.com
CLOUDFLARE_DNS_API_TOKEN=0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr
CLOUDFLARE_ZONE_API_TOKEN=0aPWoLaQ59l0nyL1jIVzZaEx2e41Gjgcfhn3ztJr
EOF
        destination = "local/cloudflare.env"
        env         = true
      }

      resources {
        cpu    = 500
        memory = 512
      }
    }
  }
}
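v3 adds a dedicated `traefik-api` router alongside the dashboard; once it is up, the internal API can confirm that every file-provider router loaded:

```bash
curl -s https://traefik.git-4ta.live/api/http/routers | jq -r '.[].name'
```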
@@ -0,0 +1,65 @@
# Consul Server Configuration for onecloud1
datacenter = "dc1"
data_dir   = "/opt/consul/data"
log_level  = "INFO"
node_name  = "onecloud1"
bind_addr  = "100.98.209.50"

# Server mode
server           = true
bootstrap_expect = 4

# Join existing cluster
retry_join = [
  "100.117.106.136", # ch4
  "100.122.197.112", # warden
  "100.116.80.94"    # ash3c
]

# Performance optimization
performance {
  raft_multiplier = 5
}

# Ports configuration
ports {
  grpc     = 8502
  http     = 8500
  dns      = 8600
  server   = 8300
  serf_lan = 8301
  serf_wan = 8302
}

# Enable Connect for service mesh
connect {
  enabled = true
}

# Cache configuration for performance
cache {
  entry_fetch_max_burst = 42
  entry_fetch_rate      = 30
}

# Node metadata
node_meta = {
  region = "unknown"
  zone   = "nomad-client"
}

# UI enabled for servers
ui_config {
  enabled = true
}

# ACL configuration (if needed)
acl = {
  enabled        = false
  default_policy = "allow"
}

# Logging
log_file             = "/var/log/consul/consul.log"
log_rotate_duration  = "24h"
log_rotate_max_files = 7
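Note that `bootstrap_expect = 4` yields an even voter count (quorum of 3), so losing two servers stalls Raft. Either way, the file can be checked and the join observed before trusting it:

```bash
consul validate /etc/consul.d/
consul members -http-addr=http://100.98.209.50:8500
```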
@@ -0,0 +1,219 @@
# Example Configuration: Connecting to Consul Through Traefik

## 🎯 Goal
Let other nodes reach the services through `consul.git4ta.me` and `nomad.git4ta.me` instead of connecting to IPs directly.

## ✅ Current Status Verification

### Consul smart detection
```bash
# Leader detection
curl -s https://consul.git4ta.me/v1/status/leader
# Returns: "100.117.106.136:8300" (ch4 is the leader)

# Node currently routed to
curl -s https://consul.git4ta.me/v1/agent/self | jq -r '.Config.NodeName'
# Returns: "ash3c" (Traefik routed to ash3c)
```

### Nomad smart detection
```bash
# Leader detection
curl -s https://nomad.git4ta.me/v1/status/leader
# Returns: "100.90.159.68:4647" (ch2 is the leader)
```

## 🔧 Node Configuration Examples

### 1. Consul client configuration

#### Current configuration (direct connection)
```hcl
# /etc/consul.d/consul.hcl
datacenter = "dc1"
node_name = "client-node"

retry_join = [
  "warden.tailnet-68f9.ts.net:8301",
  "ch4.tailnet-68f9.ts.net:8301",
  "ash3c.tailnet-68f9.ts.net:8301"
]
```

#### New configuration (through Traefik)
```hcl
# /etc/consul.d/consul.hcl
datacenter = "dc1"
node_name = "client-node"

# Join Consul through Traefik
retry_join = ["consul.git4ta.me:8301"]

# Or use the HTTP API
addresses {
  http = "consul.git4ta.me"
}

ports {
  http = 8301
}
```

### 2. Nomad client configuration

#### Current configuration (direct connection)
```hcl
# /etc/nomad.d/nomad.hcl
consul {
  address = "http://warden.tailnet-68f9.ts.net:8500"
}
```

#### New configuration (through Traefik)
```hcl
# /etc/nomad.d/nomad.hcl
consul {
  address = "https://consul.git4ta.me:8500"
  # Or over HTTP
  # address = "http://consul.git4ta.me:8500"
}
```

### 3. Vault configuration

#### Current configuration (direct connection)
```hcl
# Consul KV: vault/config
storage "consul" {
  address = "ch4.tailnet-68f9.ts.net:8500"
  path    = "vault/"
}

service_registration "consul" {
  address = "ch4.tailnet-68f9.ts.net:8500"
  service = "vault"
}
```

#### New configuration (through Traefik)
```hcl
# Consul KV: vault/config
storage "consul" {
  address = "consul.git4ta.me:8500"
  path    = "vault/"
}

service_registration "consul" {
  address = "consul.git4ta.me:8500"
  service = "vault"
}
```

## 🚀 Implementation Steps

### Step 1: Verify Traefik routing
```bash
# Test the Consul route
curl -I https://consul.git4ta.me/v1/status/leader

# Test the Nomad route
curl -I https://nomad.git4ta.me/v1/status/leader
```

### Step 2: Update node configuration
```bash
# Run on the target node
# Back up the existing configuration
cp /etc/consul.d/consul.hcl /etc/consul.d/consul.hcl.backup
cp /etc/nomad.d/nomad.hcl /etc/nomad.d/nomad.hcl.backup

# Rewrite the Consul configuration
sed -i 's/warden\.tailnet-68f9\.ts\.net:8301/consul.git4ta.me:8301/g' /etc/consul.d/consul.hcl
sed -i 's/ch4\.tailnet-68f9\.ts\.net:8301/consul.git4ta.me:8301/g' /etc/consul.d/consul.hcl
sed -i 's/ash3c\.tailnet-68f9\.ts\.net:8301/consul.git4ta.me:8301/g' /etc/consul.d/consul.hcl

# Rewrite the Nomad configuration
sed -i 's/warden\.tailnet-68f9\.ts\.net:8500/consul.git4ta.me:8500/g' /etc/nomad.d/nomad.hcl
sed -i 's/ch4\.tailnet-68f9\.ts\.net:8500/consul.git4ta.me:8500/g' /etc/nomad.d/nomad.hcl
sed -i 's/ash3c\.tailnet-68f9\.ts\.net:8500/consul.git4ta.me:8500/g' /etc/nomad.d/nomad.hcl
```
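Before restarting anything in step 3, it is cheap to confirm the sed rewrites left both files syntactically valid (the `nomad config validate` subcommand assumes Nomad 1.4 or newer):

```bash
consul validate /etc/consul.d/
nomad config validate /etc/nomad.d/nomad.hcl
```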
### Step 3: Restart services
```bash
# Restart Consul
systemctl restart consul

# Restart Nomad
systemctl restart nomad

# Restart Vault (if applicable)
systemctl restart vault
```

### Step 4: Verify connectivity
```bash
# Check Consul connectivity
consul members

# Check Nomad connectivity
nomad node status

# Check Vault connectivity
vault status
```

## 📊 Performance Comparison

### Latency test
```bash
# Direct connection
time curl -s http://ch4.tailnet-68f9.ts.net:8500/v1/status/leader

# Through Traefik
time curl -s https://consul.git4ta.me/v1/status/leader
```

### Reliability test
```bash
# Test failover
# 1. Stop Consul on ch4
# 2. Check whether Traefik automatically routes to another node
curl -s https://consul.git4ta.me/v1/status/leader
```

## 🎯 Summary of Benefits

### 1. Single entry point
- **Before**: every node had to know the IPs of all Consul/Nomad nodes
- **Now**: only `consul.git4ta.me` and `nomad.git4ta.me` are needed

### 2. Automatic failover
- **Before**: nodes had to be hand-configured with multiple IPs
- **Now**: Traefik automatically routes to healthy nodes

### 3. Simpler configuration
- **Before**: hard-coded IP addresses, hard to maintain
- **Now**: domain names, easy to manage and update

### 4. Load balancing
- **Before**: all requests went to the same node
- **Now**: Traefik can spread requests across multiple nodes

## ⚠️ Caveats

### 1. Port mapping
- **Traefik external**: 443 (HTTPS) / 80 (HTTP)
- **Service internal**: 8500 (Consul), 4646 (Nomad)
- **Required**: Traefik port forwarding

### 2. SSL certificates
- **HTTPS**: requires a valid certificate
- **HTTP**: a self-signed certificate will do

### 3. Single point of failure
- **Risk**: Traefik becomes a single point of failure
- **Mitigation**: Traefik itself is also run highly available

---

**Conclusion**: Entirely feasible. Unifying access to Consul and Nomad through Traefik is a solid architectural improvement that brings better maintainability and reliability.
@@ -0,0 +1,191 @@
# Configuration Plan: Connecting Consul Through Traefik

## 🎯 Goal
Have every node access Consul through `consul.git4ta.me` instead of connecting to IP addresses directly.

## ✅ Feasibility Check

### Test results
```bash
# Access the Consul API through Traefik
curl -s https://consul.git4ta.me/v1/status/leader
# Returns: "100.117.106.136:8300" (ch4 is the leader)

curl -s https://consul.git4ta.me/v1/agent/self | jq -r '.Config.NodeName'
# Returns: "warden" (the node currently routed to)
```

### Advantages
1. **Single entry point**: every service is reached via a domain name
2. **Automatic failover**: Traefik automatically routes to healthy Consul nodes
3. **Simpler configuration**: no hard-coded IP addresses
4. **Load balancing**: requests can be spread across multiple Consul nodes

## 🔧 Configuration Options

### Option 1: Modify existing node configuration

#### Consul client configuration
```hcl
# /etc/consul.d/consul.hcl
datacenter = "dc1"
node_name = "node-name"

# Join Consul through Traefik
retry_join = ["consul.git4ta.me:8500"]

# Or connect over HTTP
addresses {
  http  = "consul.git4ta.me"
  https = "consul.git4ta.me"
}

ports {
  http  = 8500
  https = 8500
}
```

#### Nomad configuration
```hcl
# /etc/nomad.d/nomad.hcl
consul {
  address = "https://consul.git4ta.me:8500"
  # or
  address = "http://consul.git4ta.me:8500"
}
```

#### Vault configuration
```hcl
# In Consul KV vault/config
storage "consul" {
  address = "consul.git4ta.me:8500"
  path    = "vault/"
}

service_registration "consul" {
  address      = "consul.git4ta.me:8500"
  service      = "vault"
  service_tags = "vault-server"
}
```

### Option 2: Create a new service-discovery configuration

#### Add Consul service discovery in Traefik
```yaml
# Add to dynamic.yml
services:
  consul-api:
    loadBalancer:
      servers:
        - url: "http://ch4.tailnet-68f9.ts.net:8500"    # Leader
        - url: "http://warden.tailnet-68f9.ts.net:8500" # Follower
        - url: "http://ash3c.tailnet-68f9.ts.net:8500"  # Follower
      healthCheck:
        path: "/v1/status/leader"
        interval: "30s"
        timeout: "15s"

routers:
  consul-api:
    rule: "Host(`consul.git4ta.me`)"
    service: consul-api
    entryPoints:
      - websecure
    tls:
      certResolver: cloudflare
```

## 🚨 Caveats

### 1. Port mapping
- **Traefik external ports**: 443 (HTTPS) / 80 (HTTP)
- **Consul internal port**: 8500
- **Required**: Traefik port forwarding

### 2. SSL certificates
- **HTTPS**: requires a valid SSL certificate
- **HTTP**: a self-signed certificate, or skipping verification, will do

### 3. Health checks
- **Path**: `/v1/status/leader`
- **Interval**: 30 seconds
- **Timeout**: 15 seconds

### 4. Failover
- **Automatic switchover**: Traefik automatically routes to healthy nodes
- **Leader election**: Consul elects a new leader on its own

## 🔄 Implementation Steps

### Step 1: Verify the Traefik configuration
```bash
# Check whether Traefik already routes Consul
curl -I https://consul.git4ta.me/v1/status/leader
```

### Step 2: Update node configuration
```bash
# Back up the existing configuration
cp /etc/consul.d/consul.hcl /etc/consul.d/consul.hcl.backup

# Switch the configuration to the domain name
sed -i 's/warden\.tailnet-68f9\.ts\.net:8500/consul.git4ta.me:8500/g' /etc/consul.d/consul.hcl
```

### Step 3: Restart services
```bash
# Restart Consul
systemctl restart consul

# Restart Nomad
systemctl restart nomad

# Restart Vault
systemctl restart vault
```

### Step 4: Verify connectivity
```bash
# Check Consul connectivity
consul members

# Check Nomad connectivity
nomad node status

# Check Vault connectivity
vault status
```

## 📊 Performance Impact

### Latency
- **Direct connection**: ~1-2 ms
- **Through Traefik**: ~5-10 ms (adds 3-8 ms)

### Throughput
- **Traefik limits**: depends on the Traefik configuration
- **Recommendation**: monitor Traefik performance metrics

### Reliability
- **Gain**: automatic failover
- **Risk**: Traefik as a single point of failure

## 🎯 Recommended Option

**Option 1 is recommended**, because it is:
1. **Simple and direct**: only configuration files change
2. **Backward compatible**: existing functionality is unaffected
3. **Easy to maintain**: one managed entry point

**Implementation priority**:
1. ✅ **Traefik configuration** - done
2. 🔄 **Consul clients** - needs changes
3. 🔄 **Nomad configuration** - needs changes
4. 🔄 **Vault configuration** - needs changes

---

**Conclusion**: Entirely feasible. Unifying access to Consul through Traefik is a good architectural improvement.
@@ -0,0 +1,57 @@
---
- name: Clean up Consul configuration from dedicated clients
  hosts: hcp1,influxdb1,browser
  become: yes

  tasks:
    - name: Stop Consul service
      systemd:
        name: consul
        state: stopped
        enabled: no

    - name: Disable Consul service
      systemd:
        name: consul
        enabled: no

    - name: Kill any remaining Consul processes
      shell: |
        pkill -f consul || true
        sleep 2
        pkill -9 -f consul || true
      ignore_errors: yes

    - name: Remove Consul systemd service file
      file:
        path: /etc/systemd/system/consul.service
        state: absent

    - name: Remove Consul configuration directory
      file:
        path: /etc/consul.d
        state: absent

    - name: Remove Consul data directory
      file:
        path: /opt/consul
        state: absent

    - name: Reload systemd daemon
      systemd:
        daemon_reload: yes

    - name: Verify Consul is stopped
      shell: |
        if pgrep -f consul; then
          echo "Consul still running"
          exit 1
        else
          echo "Consul stopped successfully"
        fi
      register: consul_status
      failed_when: consul_status.rc != 0

    - name: Display cleanup status
      debug:
        msg: "Consul cleanup completed on {{ inventory_hostname }}"
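A run of this playbook is destructive (it deletes `/etc/consul.d` and `/opt/consul`), so a dry run first is sensible; the inventory and playbook filenames below are assumptions:

```bash
ansible-playbook -i inventory.ini cleanup-consul-clients.yml --check
ansible-playbook -i inventory.ini cleanup-consul-clients.yml
```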
@@ -0,0 +1,55 @@
---
- name: Configure Consul Auto-Discovery
  hosts: all
  become: yes
  vars:
    consul_servers:
      - "warden.tailnet-68f9.ts.net:8301"
      - "ch4.tailnet-68f9.ts.net:8301"
      - "ash3c.tailnet-68f9.ts.net:8301"

  tasks:
    - name: Backup current nomad.hcl
      copy:
        src: /etc/nomad.d/nomad.hcl
        dest: /etc/nomad.d/nomad.hcl.backup.{{ ansible_date_time.epoch }}
        remote_src: yes
        backup: yes

    - name: Update Consul configuration for auto-discovery
      blockinfile:
        path: /etc/nomad.d/nomad.hcl
        marker: "# {mark} ANSIBLE MANAGED CONSUL CONFIG"
        block: |
          consul {
            retry_join = [
              "warden.tailnet-68f9.ts.net:8301",
              "ch4.tailnet-68f9.ts.net:8301",
              "ash3c.tailnet-68f9.ts.net:8301"
            ]
            server_service_name = "nomad"
            client_service_name = "nomad-client"
          }
        insertbefore: '^consul \{'

    - name: Restart Nomad service
      systemd:
        name: nomad
        state: restarted
        enabled: yes

    - name: Wait for Nomad to be ready
      wait_for:
        port: 4646
        host: "{{ ansible_default_ipv4.address }}"
        delay: 5
        timeout: 30

    - name: Verify Consul connection
      shell: |
        NOMAD_ADDR=http://localhost:4646 nomad node status | grep -q "ready"
      register: nomad_ready
      failed_when: nomad_ready.rc != 0
      retries: 3
      delay: 10
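Since blockinfile only manages the text between its markers, running this against hosts that still carry a hand-written `consul { ... }` block can leave two stanzas behind; a `--check --diff` pass shows exactly what would change before a real run (playbook filename assumed):

```bash
ansible-playbook -i inventory.ini consul-auto-discovery.yml --check --diff
```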
@@ -0,0 +1,75 @@
---
- name: Remove Consul configuration from Nomad servers
  hosts: semaphore,ash1d,ash2e,ch2,ch3,onecloud1,de
  become: yes

  tasks:
    - name: Remove entire Consul configuration block
      blockinfile:
        path: /etc/nomad.d/nomad.hcl
        marker: "# {mark} ANSIBLE MANAGED CONSUL CONFIG"
        state: absent

    - name: Remove Consul configuration lines
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^consul \{'
        state: absent

    - name: Remove Consul configuration content
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  address ='
        state: absent

    - name: Remove Consul service names
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  server_service_name ='
        state: absent

    - name: Remove Consul client service name
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  client_service_name ='
        state: absent

    - name: Remove Consul auto-advertise
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  auto_advertise ='
        state: absent

    - name: Remove Consul server auto-join
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  server_auto_join ='
        state: absent

    - name: Remove Consul client auto-join
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  client_auto_join ='
        state: absent

    - name: Remove Consul closing brace
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^}'
        state: absent

    - name: Restart Nomad service
      systemd:
        name: nomad
        state: restarted

    - name: Wait for Nomad to be ready
      wait_for:
        port: 4646
        host: "{{ ansible_default_ipv4.address }}"
        delay: 5
        timeout: 30

    - name: Display completion message
      debug:
        msg: "Removed Consul configuration from {{ inventory_hostname }}"
@@ -0,0 +1,32 @@
---
- name: Enable Nomad Client Mode on Servers
  hosts: ch2,ch3,de
  become: yes

  tasks:
    - name: Enable Nomad client mode
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^client \{'
        line: 'client {'
        state: present

    - name: Enable client mode
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  enabled = false'
        line: '  enabled = true'
        state: present

    - name: Restart Nomad service
      systemd:
        name: nomad
        state: restarted

    - name: Wait for Nomad to be ready
      wait_for:
        port: 4646
        host: "{{ ansible_default_ipv4.address }}"
        delay: 5
        timeout: 30
@@ -0,0 +1,62 @@
---
- name: Fix all master references to ch4
  hosts: localhost
  gather_facts: no
  vars:
    files_to_fix:
      - "scripts/diagnose-consul-sync.sh"
      - "scripts/register-traefik-to-all-consul.sh"
      - "deployment/ansible/playbooks/update-nomad-consul-config.yml"
      - "deployment/ansible/templates/nomad-server.hcl.j2"
      - "deployment/ansible/templates/nomad-client.hcl"
      - "deployment/ansible/playbooks/fix-nomad-consul-roles.yml"
      - "deployment/ansible/onecloud1_nomad.hcl"
      - "ansible/templates/consul-client.hcl.j2"
      - "ansible/consul-client-deployment.yml"
      - "ansible/consul-client-simple.yml"

  tasks:
    - name: Replace master.tailnet-68f9.ts.net with ch4.tailnet-68f9.ts.net
      replace:
        path: "{{ item }}"
        regexp: 'master\.tailnet-68f9\.ts\.net'
        replace: 'ch4.tailnet-68f9.ts.net'
      loop: "{{ files_to_fix }}"
      when: item is file

    - name: Replace master hostname references
      replace:
        path: "{{ item }}"
        regexp: '\bmaster\b'
        replace: 'ch4'
      loop: "{{ files_to_fix }}"
      when: item is file

    - name: Replace master IP references in comments
      replace:
        path: "{{ item }}"
        regexp: '# master'
        replace: '# ch4'
      loop: "{{ files_to_fix }}"
      when: item is file

    - name: Fix inventory files
      replace:
        path: "{{ item }}"
        regexp: 'master ansible_host=master'
        replace: 'ch4 ansible_host=ch4'
      loop:
        - "deployment/ansible/inventories/production/inventory.ini"
        - "deployment/ansible/inventories/production/csol-consul-nodes.ini"
        - "deployment/ansible/inventories/production/nomad-clients.ini"
        - "deployment/ansible/inventories/production/master-ash3c.ini"
        - "deployment/ansible/inventories/production/consul-nodes.ini"
        - "deployment/ansible/inventories/production/vault.ini"

    - name: Fix IP address references (100.117.106.136 comments)
      replace:
        path: "{{ item }}"
        regexp: '100\.117\.106\.136.*# master'
        replace: '100.117.106.136 # ch4'
      loop: "{{ files_to_fix }}"
      when: item is file
@@ -72,7 +72,7 @@
    "description": "Consul client nodes, used for service discovery and health checks",
    "nodes": [
      {
        "name": "master",
        "name": "ch4",
        "host": "100.117.106.136",
        "user": "ben",
        "password": "3131",
@@ -2,21 +2,21 @@
# Server nodes (7 server nodes)
# ⚠️ Warning: with great power comes great responsibility - operate on server nodes with extreme care!
# ⚠️ Any operation on a server node can affect the stability of the entire cluster!
semaphore ansible_host=semaphore.tailnet-68f9.ts.net ansible_user=root ansible_password=313131 ansible_become_password=313131
semaphore ansible_host=semaphore.tailnet-68f9.ts.net ansible_user=root ansible_password=3131 ansible_become_password=3131
ash1d ansible_host=ash1d.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
ash2e ansible_host=ash2e.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
ch2 ansible_host=ch2.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
ch3 ansible_host=ch3.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
onecloud1 ansible_host=onecloud1.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
de ansible_host=de.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
hcp1 ansible_host=hcp1.tailnet-68f9.ts.net ansible_user=root ansible_password=3131 ansible_become_password=3131

[nomad_clients]
# Client nodes
master ansible_host=master.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131 ansible_port=60022
# Client nodes (5 client nodes)
ch4 ansible_host=ch4.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
ash3c ansible_host=ash3c.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
browser ansible_host=browser.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
influxdb1 ansible_host=influxdb1.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131
hcp1 ansible_host=hcp1.tailnet-68f9.ts.net ansible_user=root ansible_password=3131 ansible_become_password=3131
warden ansible_host=warden.tailnet-68f9.ts.net ansible_user=ben ansible_password=3131 ansible_become_password=3131

[nomad_nodes:children]
@@ -11,7 +11,7 @@ ash1d ansible_host=ash1d ansible_user=ben ansible_become=yes ansible_become_pass
ash2e ansible_host=ash2e ansible_user=ben ansible_become=yes ansible_become_pass=3131

[oci_a1]
master ansible_host=master ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131
ch4 ansible_host=ch4 ansible_user=ben ansible_become=yes ansible_become_pass=3131
ash3c ansible_host=ash3c ansible_user=ben ansible_become=yes ansible_become_pass=3131
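After the hostname swap lands in the inventories, a quick reachability pass catches any stale entries (inventory path assumed):

```bash
ansible -i deployment/ansible/inventories/production/inventory.ini all -m ping
```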
@@ -0,0 +1,62 @@
---
- name: Configure Nomad Dynamic Host Volumes for NFS
  hosts: nomad_clients
  become: yes
  vars:
    nfs_server: "snail"
    nfs_share: "/fs/1000/nfs/Fnsync"
    mount_point: "/mnt/fnsync"

  tasks:
    - name: Stop Nomad service
      systemd:
        name: nomad
        state: stopped

    - name: Update Nomad configuration for dynamic host volumes
      blockinfile:
        path: /etc/nomad.d/nomad.hcl
        marker: "# {mark} DYNAMIC HOST VOLUMES CONFIGURATION"
        block: |
          client {
            # Enable dynamic host volumes
            host_volume "fnsync" {
              path      = "{{ mount_point }}"
              read_only = false
            }

            # Add NFS-related node metadata
            meta {
              nfs_server  = "{{ nfs_server }}"
              nfs_share   = "{{ nfs_share }}"
              nfs_mounted = "true"
            }
          }
        insertafter: 'client {'

    - name: Start Nomad service
      systemd:
        name: nomad
        state: started
        enabled: yes

    - name: Wait for Nomad to start
      wait_for:
        port: 4646
        delay: 10
        timeout: 60

    - name: Check Nomad status
      command: nomad node status
      register: nomad_status
      ignore_errors: yes

    - name: Display Nomad status
      debug:
        var: nomad_status.stdout_lines
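Once the clients restart, each one should advertise the new volume; a spot check from any machine with Nomad CLI access:

```bash
# "fnsync" should appear under Host Volumes for every nomad_clients member
nomad node status -self -verbose | grep -A 3 "Host Volumes"
```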
@@ -0,0 +1,41 @@
---
- name: Deploy the Nomad server configuration template
  hosts: nomad_servers
  become: yes

  tasks:
    - name: Deploy the Nomad configuration file
      template:
        src: nomad-server.hcl.j2
        dest: /etc/nomad.d/nomad.hcl
        backup: yes
        owner: root
        group: root
        mode: '0644'

    - name: Restart the Nomad service
      systemd:
        name: nomad
        state: restarted
        enabled: yes

    - name: Wait for the Nomad service to start
      wait_for:
        port: 4646
        host: "{{ ansible_host }}"
        timeout: 30

    - name: Read the Nomad service state
      systemd:
        name: nomad
      register: nomad_status

    - name: Show the service state
      debug:
        msg: "{{ inventory_hostname }} Nomad service state: {{ nomad_status.status.ActiveState }}"
@@ -0,0 +1,39 @@
---
- name: Emergency fix for the Nomad bootstrap_expect setting
  hosts: nomad_servers
  become: yes

  tasks:
    - name: Set bootstrap_expect to 3
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^  bootstrap_expect = \d+'
        line: '  bootstrap_expect = 3'
        backup: yes

    - name: Restart the Nomad service
      systemd:
        name: nomad
        state: restarted
        enabled: yes

    - name: Wait for the Nomad service to start
      wait_for:
        port: 4646
        host: "{{ ansible_host }}"
        timeout: 30

    - name: Check the Nomad service state
      systemd:
        name: nomad
      register: nomad_status

    - name: Show the Nomad service state
      debug:
        msg: "{{ inventory_hostname }} Nomad service state: {{ nomad_status.status.ActiveState }}"
@ -0,0 +1,103 @@
---
- name: Fix ch4 Nomad configuration - convert from server to client
  hosts: ch4
  become: yes
  vars:
    ansible_host: 100.117.106.136

  tasks:
    - name: Backup current Nomad config
      copy:
        src: /etc/nomad.d/nomad.hcl
        dest: /etc/nomad.d/nomad.hcl.backup
        remote_src: yes
        backup: yes

    # Note: blockinfile only inserts its managed block; the pre-existing
    # server/client stanzas must be removed separately (see the sketch after
    # this playbook). The original task also passed a `replace:` key, which
    # blockinfile does not support, so it has been dropped here.
    - name: Update Nomad config to client mode
      blockinfile:
        path: /etc/nomad.d/nomad.hcl
        marker: "# {mark} ANSIBLE MANAGED CLIENT CONFIG"
        block: |
          server {
            enabled = false
          }

          client {
            enabled = true
            network_interface = "tailscale0"

            servers = [
              "semaphore.tailnet-68f9.ts.net:4647",
              "ash1d.tailnet-68f9.ts.net:4647",
              "ash2e.tailnet-68f9.ts.net:4647",
              "ch2.tailnet-68f9.ts.net:4647",
              "ch3.tailnet-68f9.ts.net:4647",
              "onecloud1.tailnet-68f9.ts.net:4647",
              "de.tailnet-68f9.ts.net:4647"
            ]

            meta {
              consul = "true"
              consul_version = "1.21.5"
              consul_server = "true"
            }
          }
        insertbefore: '^server \{'

    - name: Update client block
      blockinfile:
        path: /etc/nomad.d/nomad.hcl
        marker: "# {mark} ANSIBLE MANAGED CLIENT BLOCK"
        block: |
          client {
            enabled = true
            network_interface = "tailscale0"

            servers = [
              "semaphore.tailnet-68f9.ts.net:4647",
              "ash1d.tailnet-68f9.ts.net:4647",
              "ash2e.tailnet-68f9.ts.net:4647",
              "ch2.tailnet-68f9.ts.net:4647",
              "ch3.tailnet-68f9.ts.net:4647",
              "onecloud1.tailnet-68f9.ts.net:4647",
              "de.tailnet-68f9.ts.net:4647"
            ]

            meta {
              consul = "true"
              consul_version = "1.21.5"
              consul_server = "true"
            }
          }
        insertbefore: '^client \{'

    - name: Restart Nomad service
      systemd:
        name: nomad
        state: restarted
        enabled: yes

    - name: Wait for Nomad to be ready
      wait_for:
        port: 4646
        host: "{{ ansible_default_ipv4.address }}"
        delay: 5
        timeout: 30

    - name: Verify Nomad client status
      shell: |
        NOMAD_ADDR=http://localhost:4646 nomad node status | grep -q "ready"
      register: nomad_ready
      failed_when: nomad_ready.rc != 0
      until: nomad_ready.rc == 0
      retries: 3
      delay: 10

    - name: Display completion message
      debug:
        msg: |
          ✅ Successfully converted ch4 from Nomad server to client
          ✅ Nomad service restarted
          ✅ Configuration updated
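A hedged companion sketch for the playbook above: since blockinfile never deletes the pre-existing `server { ... }` stanza, something like the `replace` module has to strip it first. The multiline regex is an assumption about the file's layout (a flat, brace-balanced block with no nested braces):

```yaml
# Hypothetical task to run before the blockinfile edits, so the managed
# block becomes the only server stanza in /etc/nomad.d/nomad.hcl.
- name: Remove the original server block
  replace:
    path: /etc/nomad.d/nomad.hcl
    regexp: '(?ms)^server \{.*?^\}\n'
    replace: ''
```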
@ -0,0 +1,82 @@
---
- name: Fix master node - rename to ch4 and restore SSH port 22
  hosts: master
  become: yes
  vars:
    new_hostname: ch4
    old_hostname: master

  tasks:
    - name: Backup current hostname
      copy:
        content: "{{ old_hostname }}"
        dest: /etc/hostname.backup
        mode: '0644'
      when: ansible_hostname == old_hostname

    - name: Update hostname to ch4
      hostname:
        name: "{{ new_hostname }}"
      when: ansible_hostname == old_hostname

    - name: Update /etc/hostname file
      copy:
        content: "{{ new_hostname }}"
        dest: /etc/hostname
        mode: '0644'
      when: ansible_hostname == old_hostname

    - name: Update /etc/hosts file
      lineinfile:
        path: /etc/hosts
        regexp: '^127\.0\.1\.1.*{{ old_hostname }}'
        line: '127.0.1.1 {{ new_hostname }}'
        state: present
      when: ansible_hostname == old_hostname

    - name: Update Tailscale hostname
      shell: |
        tailscale set --hostname={{ new_hostname }}
      when: ansible_hostname == old_hostname

    - name: Backup SSH config
      copy:
        src: /etc/ssh/sshd_config
        dest: /etc/ssh/sshd_config.backup
        remote_src: yes
        backup: yes

    - name: Restore SSH port to 22
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^Port '
        line: 'Port 22'
        state: present

    - name: Restart SSH service
      systemd:
        name: ssh
        state: restarted
        enabled: yes

    - name: Wait for SSH to be ready on port 22
      wait_for:
        port: 22
        host: "{{ ansible_default_ipv4.address }}"
        delay: 5
        timeout: 30

    - name: Test SSH connection on port 22
      ping:
      delegate_to: "{{ inventory_hostname }}"
      vars:
        ansible_port: 22

    - name: Display completion message
      debug:
        msg: |
          ✅ Successfully renamed {{ old_hostname }} to {{ new_hostname }}
          ✅ SSH port restored to 22
          ✅ Tailscale hostname updated
          🔄 Please update your inventory file to use the new hostname and port
@ -0,0 +1,71 @@
---
- name: Install and configure Consul clients on all nodes
  hosts: all
  become: yes
  vars:
    consul_servers:
      - "100.117.106.136"  # ch4 (Korea)
      - "100.122.197.112"  # warden (Beijing)
      - "100.116.80.94"    # ash3c (US)

  tasks:
    - name: Get Tailscale IP address
      shell: ip addr show tailscale0 | grep 'inet ' | awk '{print $2}' | cut -d/ -f1
      register: tailscale_ip_result
      changed_when: false

    - name: Set Tailscale IP fact
      set_fact:
        tailscale_ip: "{{ tailscale_ip_result.stdout }}"

    - name: Install Consul
      apt:
        name: consul
        state: present
        update_cache: yes

    - name: Create Consul data directory
      file:
        path: /opt/consul/data
        state: directory
        owner: consul
        group: consul
        mode: '0755'

    - name: Create Consul log directory
      file:
        path: /var/log/consul
        state: directory
        owner: consul
        group: consul
        mode: '0755'

    - name: Create Consul config directory
      file:
        path: /etc/consul.d
        state: directory
        owner: consul
        group: consul
        mode: '0755'

    - name: Generate Consul client configuration
      template:
        src: consul-client.hcl.j2
        dest: /etc/consul.d/consul.hcl
        owner: consul
        group: consul
        mode: '0644'
      notify: restart consul

    - name: Enable and start Consul service
      systemd:
        name: consul
        enabled: yes
        state: started
        daemon_reload: yes

  handlers:
    - name: restart consul
      systemd:
        name: consul
        state: restarted
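A quick verification sketch once the play finishes; `consul members` should list every client alongside the three servers:

```bash
# Run on any node with a local Consul agent.
consul members
# Expect ch4, warden and ash3c with type "server" and the rest as "client".
```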
@ -0,0 +1,91 @@
---
- name: Install NFS CSI Plugin for Nomad
  hosts: nomad_nodes
  become: yes
  vars:
    nomad_user: nomad
    nomad_plugins_dir: /opt/nomad/plugins
    csi_driver_version: "v4.0.0"
    csi_driver_url: "https://github.com/kubernetes-csi/csi-driver-nfs/releases/download/{{ csi_driver_version }}/csi-nfs-driver"

  tasks:
    - name: Stop Nomad service
      systemd:
        name: nomad
        state: stopped

    - name: Create plugins directory
      file:
        path: "{{ nomad_plugins_dir }}"
        state: directory
        owner: "{{ nomad_user }}"
        group: "{{ nomad_user }}"
        mode: '0755'

    - name: Download NFS CSI driver
      get_url:
        url: "{{ csi_driver_url }}"
        dest: "{{ nomad_plugins_dir }}/csi-nfs-driver"
        owner: "{{ nomad_user }}"
        group: "{{ nomad_user }}"
        mode: '0755'

    - name: Install required packages for CSI
      package:
        name:
          - nfs-common
          - mount
        state: present

    - name: Create CSI mount directory
      file:
        path: /opt/nomad/csi
        state: directory
        owner: "{{ nomad_user }}"
        group: "{{ nomad_user }}"
        mode: '0755'

    - name: Update Nomad configuration for CSI plugin
      blockinfile:
        path: /etc/nomad.d/nomad.hcl
        marker: "# {mark} CSI PLUGIN CONFIGURATION"
        block: |
          plugin_dir = "{{ nomad_plugins_dir }}"

          plugin "csi-nfs" {
            type = "csi"
            config {
              driver_name = "nfs.csi.k8s.io"
              mount_dir = "/opt/nomad/csi"
              health_timeout = "30s"
              log_level = "INFO"
            }
          }
        insertafter: 'data_dir = "/opt/nomad/data"'

    - name: Start Nomad service
      systemd:
        name: nomad
        state: started
        enabled: yes

    - name: Wait for Nomad to start
      wait_for:
        port: 4646
        delay: 10
        timeout: 60

    - name: Check Nomad status
      command: nomad node status
      register: nomad_status
      ignore_errors: yes

    - name: Display Nomad status
      debug:
        var: nomad_status.stdout_lines
@ -0,0 +1,33 @@
---
- name: Start all Nomad servers to form the cluster
  hosts: nomad_servers
  become: yes

  tasks:
    - name: Check the Nomad service state
      systemd:
        name: nomad
      register: nomad_status

    - name: Start the Nomad service (if it is not running)
      systemd:
        name: nomad
        state: started
        enabled: yes
      when: nomad_status.status.ActiveState != "active"

    - name: Wait for the Nomad service to come up
      wait_for:
        port: 4646
        host: "{{ ansible_host }}"
        timeout: 30

    - name: Show the Nomad service state
      debug:
        msg: "{{ inventory_hostname }} Nomad service state: {{ nomad_status.status.ActiveState }}"
@ -0,0 +1,61 @@
# Consul Client Configuration for {{ inventory_hostname }}
datacenter = "dc1"
data_dir = "/opt/consul/data"
log_level = "INFO"
node_name = "{{ inventory_hostname }}"
bind_addr = "{{ hostvars[inventory_hostname]['tailscale_ip'] }}"

# Client mode (not server)
server = false

# Connect to the Consul servers (the three-node cluster)
retry_join = [
{% for server in consul_servers %}
  "{{ server }}"{% if not loop.last %},{% endif %}
{% endfor %}
]

# Performance optimization
performance {
  raft_multiplier = 5
}

# Ports configuration
ports {
  grpc = 8502
  http = 8500
  dns = 8600
}

# Enable Connect for service mesh
connect {
  enabled = true
}

# Cache configuration for performance
cache {
  entry_fetch_max_burst = 42
  entry_fetch_rate = 30
}

# Node metadata
node_meta = {
  region = "unknown"
  zone = "nomad-{{ 'server' if 'server' in group_names else 'client' }}"
}

# UI disabled for clients
ui_config {
  enabled = false
}

# ACL configuration (if needed)
acl = {
  enabled = false
  default_policy = "allow"
}

# Logging
log_file = "/var/log/consul/consul.log"
log_rotate_duration = "24h"
log_rotate_max_files = 7
@ -0,0 +1,106 @@
datacenter = "dc1"
data_dir = "/opt/nomad/data"
plugin_dir = "/opt/nomad/plugins"
log_level = "INFO"
name = "{{ ansible_hostname }}"

bind_addr = "0.0.0.0"

addresses {
  http = "{{ ansible_host }}"
  rpc = "{{ ansible_host }}"
  serf = "{{ ansible_host }}"
}

advertise {
  http = "{{ ansible_host }}:4646"
  rpc = "{{ ansible_host }}:4647"
  serf = "{{ ansible_host }}:4648"
}

ports {
  http = 4646
  rpc = 4647
  serf = 4648
}

server {
  enabled = true
  bootstrap_expect = 3
  server_join {
    retry_join = [
      "semaphore.tailnet-68f9.ts.net:4648",
      "ash1d.tailnet-68f9.ts.net:4648",
      "ash2e.tailnet-68f9.ts.net:4648",
      "ch2.tailnet-68f9.ts.net:4648",
      "ch3.tailnet-68f9.ts.net:4648",
      "onecloud1.tailnet-68f9.ts.net:4648",
      "de.tailnet-68f9.ts.net:4648",
      "hcp1.tailnet-68f9.ts.net:4648"
    ]
  }
}

{% if ansible_hostname == 'hcp1' %}
client {
  enabled = true
  network_interface = "tailscale0"

  servers = [
    "semaphore.tailnet-68f9.ts.net:4647",
    "ash1d.tailnet-68f9.ts.net:4647",
    "ash2e.tailnet-68f9.ts.net:4647",
    "ch2.tailnet-68f9.ts.net:4647",
    "ch3.tailnet-68f9.ts.net:4647",
    "onecloud1.tailnet-68f9.ts.net:4647",
    "de.tailnet-68f9.ts.net:4647",
    "hcp1.tailnet-68f9.ts.net:4647"
  ]

  host_volume "traefik-certs" {
    path = "/opt/traefik/certs"
    read_only = false
  }

  host_volume "fnsync" {
    path = "/mnt/fnsync"
    read_only = false
  }

  meta {
    consul = "true"
    consul_version = "1.21.5"
    consul_client = "true"
  }

  gc_interval = "5m"
  gc_disk_usage_threshold = 80
  gc_inode_usage_threshold = 70
}

plugin "nomad-driver-podman" {
  config {
    socket_path = "unix:///run/podman/podman.sock"
    volumes {
      enabled = true
    }
  }
}
{% endif %}

consul {
  address = "ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500"
  server_service_name = "nomad"
  client_service_name = "nomad-client"
  auto_advertise = true
  server_auto_join = false
  client_auto_join = true
}

telemetry {
  collection_interval = "1s"
  disable_hostname = false
  prometheus_metrics = true
  publish_allocation_metrics = true
  publish_node_metrics = true
}
@ -19,7 +19,7 @@
 - ip: "100.120.225.29"
   hostnames: ["de"]
 - ip: "100.117.106.136"
-  hostnames: ["master"]
+  hostnames: ["ch4"]
 - ip: "100.116.80.94"
   hostnames: ["ash3c", "influxdb1"]
 - ip: "100.116.112.45"
@ -0,0 +1,56 @@
---
- name: Update the Nomad server configuration to add hcp1 as a peer
  hosts: nomad_servers
  become: yes
  vars:
    hcp1_ip: "100.97.62.111"
    bootstrap_expect: 8

  tasks:
    - name: Back up the original config file
      copy:
        src: /etc/nomad.d/nomad.hcl
        dest: /etc/nomad.d/nomad.hcl.bak
        remote_src: yes
        backup: yes

    - name: Add hcp1 to the retry_join list
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^ retry_join = \['
        line: ' retry_join = ["{{ hcp1_ip }}",'
        backup: yes

    - name: Update bootstrap_expect to 8
      lineinfile:
        path: /etc/nomad.d/nomad.hcl
        regexp: '^ bootstrap_expect = \d+'
        line: ' bootstrap_expect = {{ bootstrap_expect }}'
        backup: yes

    - name: Restart the Nomad service
      systemd:
        name: nomad
        state: restarted
        enabled: yes

    - name: Wait for the Nomad service to come up
      wait_for:
        port: 4646
        host: "{{ ansible_host }}"
        timeout: 30

    - name: Check the Nomad service state
      systemd:
        name: nomad
      register: nomad_status

    - name: Show the Nomad service state
      debug:
        msg: "Nomad service state: {{ nomad_status.status.ActiveState }}"
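A hedged verification sketch for after the rolling restart; `nomad server members` is the standard way to confirm hcp1 joined the server set. Note also that an even `bootstrap_expect` of 8 tolerates no more failures than 7 would; odd cluster sizes are the usual Raft recommendation.

```bash
# Point at any server's HTTP API (the address is an assumption).
NOMAD_ADDR=http://semaphore.tailnet-68f9.ts.net:4646 nomad server members
# hcp1.global should be listed as alive once the join succeeds.
```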
@ -0,0 +1,72 @@
---
- name: Remove Consul configuration from all Nomad servers
  hosts: semaphore,ash1d,ash2e,ch2,ch3,onecloud1,de
  become: yes

  tasks:
    - name: Create clean Nomad server configuration
      copy:
        content: |
          datacenter = "dc1"
          data_dir = "/opt/nomad/data"
          plugin_dir = "/opt/nomad/plugins"
          log_level = "INFO"
          name = "{{ inventory_hostname }}"

          bind_addr = "{{ inventory_hostname }}.tailnet-68f9.ts.net"

          addresses {
            http = "{{ inventory_hostname }}.tailnet-68f9.ts.net"
            rpc = "{{ inventory_hostname }}.tailnet-68f9.ts.net"
            serf = "{{ inventory_hostname }}.tailnet-68f9.ts.net"
          }

          advertise {
            http = "{{ inventory_hostname }}.tailnet-68f9.ts.net:4646"
            rpc = "{{ inventory_hostname }}.tailnet-68f9.ts.net:4647"
            serf = "{{ inventory_hostname }}.tailnet-68f9.ts.net:4648"
          }

          ports {
            http = 4646
            rpc = 4647
            serf = 4648
          }

          server {
            enabled = true
            bootstrap_expect = 7
            retry_join = ["ash1d.tailnet-68f9.ts.net","ash2e.tailnet-68f9.ts.net","ch2.tailnet-68f9.ts.net","ch3.tailnet-68f9.ts.net","onecloud1.tailnet-68f9.ts.net","de.tailnet-68f9.ts.net"]
          }

          client {
            enabled = false
          }

          plugin "nomad-driver-podman" {
            config {
              socket_path = "unix:///run/podman/podman.sock"
              volumes {
                enabled = true
              }
            }
          }
        dest: /etc/nomad.d/nomad.hcl
        mode: '0644'

    - name: Restart Nomad service
      systemd:
        name: nomad
        state: restarted

    - name: Wait for Nomad to be ready
      wait_for:
        port: 4646
        host: "{{ ansible_default_ipv4.address }}"
        delay: 5
        timeout: 30

    - name: Display completion message
      debug:
        msg: "Removed Consul configuration from {{ inventory_hostname }}"
@ -0,0 +1,62 @@
# Consul Client Configuration for {{ inventory_hostname }}
datacenter = "dc1"
data_dir = "/opt/consul/data"
log_level = "INFO"
node_name = "{{ inventory_hostname }}"
bind_addr = "{{ ansible_host }}"

# Client mode (not server)
server = false

# Connect to the Consul servers (the three-node cluster)
retry_join = [
{% for server in consul_servers %}
  "{{ server }}"{% if not loop.last %},{% endif %}
{% endfor %}
]

# Performance optimization
performance {
  raft_multiplier = 5
}

# Ports configuration
ports {
  grpc = 8502
  http = 8500
  dns = 8600
}

# Enable Connect for service mesh
connect {
  enabled = true
}

# Cache configuration for performance
cache {
  entry_fetch_max_burst = 42
  entry_fetch_rate = 30
}

# Node metadata
node_meta = {
  region = "unknown"
  zone = "nomad-{{ 'server' if 'server' in group_names else 'client' }}"
}

# UI disabled for clients
ui_config {
  enabled = false
}

# ACL configuration (if needed)
acl = {
  enabled = false
  default_policy = "allow"
}

# Logging
log_file = "/var/log/consul/consul.log"
log_rotate_duration = "24h"
log_rotate_max_files = 7
@ -49,6 +49,11 @@ client {
     read_only = false
   }
 
+  host_volume "vault-storage" {
+    path = "/opt/nomad/data/vault-storage"
+    read_only = false
+  }
+
   # Disable the Docker driver; use Podman only
   options {
     "driver.raw_exec.enable" = "1"
@ -2,20 +2,20 @@ datacenter = "dc1"
 data_dir = "/opt/nomad/data"
 plugin_dir = "/opt/nomad/plugins"
 log_level = "INFO"
-name = "{{ server_name }}"
+name = "{{ ansible_hostname }}"
 
-bind_addr = "{{ server_name }}.tailnet-68f9.ts.net"
+bind_addr = "0.0.0.0"
 
 addresses {
-  http = "{{ server_name }}.tailnet-68f9.ts.net"
-  rpc = "{{ server_name }}.tailnet-68f9.ts.net"
-  serf = "{{ server_name }}.tailnet-68f9.ts.net"
+  http = "{{ ansible_host }}"
+  rpc = "{{ ansible_host }}"
+  serf = "{{ ansible_host }}"
 }
 
 advertise {
-  http = "{{ server_name }}.tailnet-68f9.ts.net:4646"
-  rpc = "{{ server_name }}.tailnet-68f9.ts.net:4647"
-  serf = "{{ server_name }}.tailnet-68f9.ts.net:4648"
+  http = "{{ ansible_host }}:4646"
+  rpc = "{{ ansible_host }}:4647"
+  serf = "{{ ansible_host }}:4648"
 }
 
 ports {
@ -26,18 +26,56 @@ ports {
 
 server {
   enabled = true
-  bootstrap_expect = 7
-  retry_join = [
-    {%- for server in groups['nomad_servers'] -%}
-    {%- if server != inventory_hostname -%}
-    "{{ server }}.tailnet-68f9.ts.net"{% if not loop.last %},{% endif %}
-    {%- endif -%}
-    {%- endfor -%}
-  ]
+  bootstrap_expect = 3
+  server_join {
+    retry_join = [
+      "semaphore.tailnet-68f9.ts.net:4648",
+      "ash1d.tailnet-68f9.ts.net:4648",
+      "ash2e.tailnet-68f9.ts.net:4648",
+      "ch2.tailnet-68f9.ts.net:4648",
+      "ch3.tailnet-68f9.ts.net:4648",
+      "onecloud1.tailnet-68f9.ts.net:4648",
+      "de.tailnet-68f9.ts.net:4648",
+      "hcp1.tailnet-68f9.ts.net:4648"
+    ]
+  }
 }
 
+{% if ansible_hostname == 'hcp1' %}
 client {
-  enabled = false
+  enabled = true
+  network_interface = "tailscale0"
+
+  servers = [
+    "semaphore.tailnet-68f9.ts.net:4647",
+    "ash1d.tailnet-68f9.ts.net:4647",
+    "ash2e.tailnet-68f9.ts.net:4647",
+    "ch2.tailnet-68f9.ts.net:4647",
+    "ch3.tailnet-68f9.ts.net:4647",
+    "onecloud1.tailnet-68f9.ts.net:4647",
+    "de.tailnet-68f9.ts.net:4647",
+    "hcp1.tailnet-68f9.ts.net:4647"
+  ]
+
+  host_volume "traefik-certs" {
+    path = "/opt/traefik/certs"
+    read_only = false
+  }
+
+  host_volume "fnsync" {
+    path = "/mnt/fnsync"
+    read_only = false
+  }
+
+  meta {
+    consul = "true"
+    consul_version = "1.21.5"
+    consul_client = "true"
+  }
+
+  gc_interval = "5m"
+  gc_disk_usage_threshold = 80
+  gc_inode_usage_threshold = 70
 }
 
 plugin "nomad-driver-podman" {
@ -48,20 +86,21 @@ plugin "nomad-driver-podman" {
     }
   }
 }
+{% endif %}
 
 consul {
-  address = "master.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500"
+  address = "ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500"
   server_service_name = "nomad"
   client_service_name = "nomad-client"
   auto_advertise = true
-  server_auto_join = true
+  server_auto_join = false
   client_auto_join = true
 }
 
-vault {
-  enabled = true
-  address = "http://master.tailnet-68f9.ts.net:8200,http://ash3c.tailnet-68f9.ts.net:8200,http://warden.tailnet-68f9.ts.net:8200"
-  token = "hvs.A5Fu4E1oHyezJapVllKPFsWg"
-  create_from_role = "nomad-cluster"
-  tls_skip_verify = true
+telemetry {
+  collection_interval = "1s"
+  disable_hostname = false
+  prometheus_metrics = true
+  publish_allocation_metrics = true
+  publish_node_metrics = true
 }
@ -64,7 +64,7 @@ plugin "nomad-driver-podman" {
 }
 
 consul {
-  address = "master.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500"
+  address = "ch4.tailnet-68f9.ts.net:8500,ash3c.tailnet-68f9.ts.net:8500,warden.tailnet-68f9.ts.net:8500"
   server_service_name = "nomad"
   client_service_name = "nomad-client"
   auto_advertise = true
@ -74,7 +74,7 @@ consul {
 
 vault {
   enabled = true
-  address = "http://master.tailnet-68f9.ts.net:8200,http://ash3c.tailnet-68f9.ts.net:8200,http://warden.tailnet-68f9.ts.net:8200"
+  address = "http://ch4.tailnet-68f9.ts.net:8200,http://ash3c.tailnet-68f9.ts.net:8200,http://warden.tailnet-68f9.ts.net:8200"
   token = "hvs.A5Fu4E1oHyezJapVllKPFsWg"
   create_from_role = "nomad-cluster"
   tls_skip_verify = true
@ -0,0 +1,45 @@
# Vault Configuration for {{ inventory_hostname }}

# Storage backend - Consul
storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault/"

  # Consul datacenter
  datacenter = "{{ vault_datacenter }}"

  # Service registration
  service = "vault"
  service_tags = "vault-server"

  # Session TTL
  session_ttl = "15s"
  lock_wait_time = "15s"
}

# Listener configuration
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 1
}

# API address - uses the Tailscale network address
api_addr = "http://{{ ansible_host }}:8200"

# Cluster address - uses the Tailscale network address
cluster_addr = "http://{{ ansible_host }}:8201"

# UI
ui = true

# Cluster name
cluster_name = "{{ vault_cluster_name }}"

# Disable mlock for development (remove in production)
disable_mlock = true

# Log level
log_level = "INFO"

# Plugin directory
plugin_directory = "/opt/vault/plugins"
@ -0,0 +1,34 @@
[Unit]
Description=Vault
Documentation=https://www.vaultproject.io/docs/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/vault.d/vault.hcl
StartLimitIntervalSec=60
StartLimitBurst=3

[Service]
Type=notify
User=vault
Group=vault
ProtectSystem=full
ProtectHome=read-only
PrivateTmp=yes
PrivateDevices=yes
SecureBits=keep-caps
AmbientCapabilities=CAP_IPC_LOCK
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
NoNewPrivileges=yes
ExecStart=/usr/bin/vault server -config=/etc/vault.d/vault.hcl
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
StartLimitInterval=60
StartLimitBurst=3
LimitNOFILE=65536
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target
@ -0,0 +1,66 @@
---
- name: Initialize Vault Cluster
  hosts: ch4  # initialize on a single node only
  become: yes

  tasks:
    - name: Check if Vault is already initialized
      uri:
        url: "http://{{ ansible_host }}:8200/v1/sys/health"
        method: GET
        status_code: [200, 429, 472, 473, 501, 503]
      register: vault_health

    - name: Initialize Vault (only if not initialized)
      uri:
        url: "http://{{ ansible_host }}:8200/v1/sys/init"
        method: POST
        body_format: json
        body:
          secret_shares: 5
          secret_threshold: 3
        status_code: 200
      register: vault_init_result
      when: not vault_health.json.initialized

    # Note: the unseal keys are read as json['keys'] rather than json.keys,
    # since attribute access would resolve to the dict's keys() method.
    - name: Save initialization results to local file
      copy:
        content: |
          # Vault Cluster Initialization Results
          Generated on: {{ ansible_date_time.iso8601 }}
          Initialized by: {{ inventory_hostname }}

          ## Root Token
          {{ vault_init_result.json.root_token }}

          ## Unseal Keys
          {% for key in vault_init_result.json['keys'] %}
          Key {{ loop.index }}: {{ key }}
          {% endfor %}

          ## Base64 Unseal Keys
          {% for key in vault_init_result.json.keys_base64 %}
          Key {{ loop.index }} (base64): {{ key }}
          {% endfor %}

          ## Important Notes
          - Store these keys securely and separately
          - You need 3 out of 5 keys to unseal Vault
          - Root token provides full access to Vault
          - Consider revoking root token after initial setup
        dest: /tmp/vault-init-results.txt
      delegate_to: localhost
      when: vault_init_result is defined and vault_init_result.json is defined

    - name: Display initialization results
      debug:
        msg: |
          Vault initialized successfully!
          Root Token: {{ vault_init_result.json.root_token }}
          Unseal Keys: {{ vault_init_result.json['keys'] }}
      when: vault_init_result is defined and vault_init_result.json is defined

    - name: Display already initialized message
      debug:
        msg: "Vault is already initialized on {{ inventory_hostname }}"
      when: vault_health.json.initialized
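A hedged follow-up sketch: initialization leaves every node sealed, so each one still needs three of the five unseal keys saved in /tmp/vault-init-results.txt. The loop below assumes the three cluster nodes from this commit and placeholder key variables:

```bash
# Run once per node; each node must be unsealed with 3 of the 5 keys.
# UNSEAL_KEY_1..3 are placeholders for keys from secure storage.
for host in ch4 ash3c warden; do
  export VAULT_ADDR="http://${host}.tailnet-68f9.ts.net:8200"
  vault operator unseal "$UNSEAL_KEY_1"
  vault operator unseal "$UNSEAL_KEY_2"
  vault operator unseal "$UNSEAL_KEY_3"
  vault status
done
```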
@ -0,0 +1,85 @@
---
- name: Deploy Vault Cluster with Consul Integration
  hosts: ch4,ash3c,warden
  become: yes
  vars:
    vault_version: "1.15.2"
    vault_datacenter: "dc1"
    vault_cluster_name: "vault-cluster"

  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600

    - name: Add HashiCorp GPG key (if not exists)
      shell: |
        if [ ! -f /etc/apt/sources.list.d/hashicorp.list ]; then
          curl -fsSL https://apt.releases.hashicorp.com/gpg | gpg --dearmor | sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg
          echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
        fi
      args:
        creates: /etc/apt/sources.list.d/hashicorp.list

    - name: Install Vault
      apt:
        name: vault
        state: present
        update_cache: yes
        allow_downgrade: yes

    - name: Create vault user and directories
      block:
        - name: Create vault data directory
          file:
            path: /opt/vault/data
            state: directory
            owner: vault
            group: vault
            mode: '0755'

        - name: Create vault config directory
          file:
            path: /etc/vault.d
            state: directory
            owner: vault
            group: vault
            mode: '0755'

    - name: Generate Vault configuration
      template:
        src: vault.hcl.j2
        dest: /etc/vault.d/vault.hcl
        owner: vault
        group: vault
        mode: '0640'
      notify: restart vault

    - name: Create Vault systemd service
      template:
        src: vault.service.j2
        dest: /etc/systemd/system/vault.service
        owner: root
        group: root
        mode: '0644'
      notify:
        - reload systemd
        - restart vault

    - name: Enable and start Vault service
      systemd:
        name: vault
        enabled: yes
        state: started
        daemon_reload: yes

  handlers:
    - name: reload systemd
      systemd:
        daemon_reload: yes

    - name: restart vault
      systemd:
        name: vault
        state: restarted
@ -0,0 +1,67 @@
---
- name: Verify Vault Cluster Status
  hosts: ch4,ash3c,warden
  become: yes

  tasks:
    - name: Check Vault service status
      systemd:
        name: vault
      register: vault_service_status

    - name: Display Vault service status
      debug:
        msg: "Vault service on {{ inventory_hostname }}: {{ vault_service_status.status.ActiveState }}"

    - name: Check Vault process
      shell: ps aux | grep vault | grep -v grep
      register: vault_process
      ignore_errors: yes

    - name: Display Vault process
      debug:
        msg: "Vault process on {{ inventory_hostname }}: {{ vault_process.stdout_lines }}"

    - name: Check Vault port 8200
      wait_for:
        port: 8200
        host: "{{ ansible_default_ipv4.address }}"
        timeout: 10
      register: vault_port_check
      ignore_errors: yes

    - name: Display port check result
      debug:
        msg: "Vault port 8200 on {{ inventory_hostname }}: {{ 'OPEN' if vault_port_check.failed == false else 'CLOSED' }}"

    - name: Get Vault status
      uri:
        url: "http://{{ ansible_default_ipv4.address }}:8200/v1/sys/health"
        method: GET
        status_code: [200, 429, 472, 473, 501, 503]
      register: vault_health
      ignore_errors: yes

    - name: Display Vault health status
      debug:
        msg: "Vault health on {{ inventory_hostname }}: {{ vault_health.json if vault_health.json is defined else 'Connection failed' }}"

    - name: Check Consul integration
      uri:
        url: "http://127.0.0.1:8500/v1/kv/vault/?recurse"
        method: GET
      register: consul_vault_kv
      ignore_errors: yes

    - name: Display Consul Vault KV
      debug:
        msg: "Consul Vault KV on {{ inventory_hostname }}: {{ 'Found vault keys' if consul_vault_kv.status == 200 else 'No vault keys found' }}"

    - name: Check Vault logs for errors
      shell: journalctl -u vault --no-pager -n 10 | grep -i error || echo "No errors found"
      register: vault_logs
      ignore_errors: yes

    - name: Display Vault error logs
      debug:
        msg: "Vault errors on {{ inventory_hostname }}: {{ vault_logs.stdout_lines }}"
@ -38,6 +38,12 @@ terraform {
       source  = "hashicorp/vault"
       version = "~> 4.0"
     }
+
+    # Cloudflare Provider
+    cloudflare = {
+      source  = "cloudflare/cloudflare"
+      version = "~> 3.0"
+    }
   }
 
 # Backend configuration
@ -53,10 +59,17 @@ provider "consul" {
   datacenter = "dc1"
 }
 
-# Vault provider configuration
-provider "vault" {
-  address = var.vault_config.address
-  token = var.vault_token
+# Fetch the Cloudflare configuration from Consul
+data "consul_keys" "cloudflare_config" {
+  key {
+    name = "token"
+    path = "config/dev/cloudflare/token"
+  }
+}
+
+# Cloudflare provider configuration
+provider "cloudflare" {
+  api_token = data.consul_keys.cloudflare_config.var.token
 }
 
 # Fetch the Oracle Cloud configuration from Consul
@ -185,8 +198,28 @@ module "nomad_cluster" {
   depends_on = [module.oracle_cloud]
 }
 
-# Output the Nomad cluster information
-output "nomad_cluster" {
-  description = "Nomad multi-datacenter cluster information"
-  value = module.nomad_cluster
+# Cloudflare connectivity test
+data "cloudflare_zones" "available" {
+  filter {
+    status = "active"
+  }
+}
+
+data "cloudflare_accounts" "available" {}
+
+# Output the Cloudflare connectivity test results
+output "cloudflare_connectivity_test" {
+  description = "Cloudflare API connectivity test results"
+  value = {
+    zones_count = length(data.cloudflare_zones.available.zones)
+    accounts_count = length(data.cloudflare_accounts.available.accounts)
+    zones = [for zone in data.cloudflare_zones.available.zones : {
+      name = zone.name
+      id = zone.id
+    }]
+    accounts = [for account in data.cloudflare_accounts.available.accounts : {
+      name = account.name
+      id = account.id
+    }]
+  }
 }
@ -17,7 +17,7 @@ output "cluster_overview" {
       name = "dc2"
       location = "Korea (KR)"
       provider = "oracle"
-      node = "master"
+      node = "ch4"
       ip = try(oci_core_instance.nomad_kr_node[0].public_ip, "pending")
       status = "deployed"
     } : null
@ -0,0 +1,305 @@
# Vault and Consul Integration Best Practices

## 1. Architecture

### 1.1 High-Availability Architecture
- **Vault cluster**: 3 nodes (1 leader + 2 followers)
- **Consul cluster**: 3 nodes (1 leader + 2 followers)
- **Network**: Tailscale secure network
- **Storage**: Consul as Vault's storage backend

### 1.2 Node Layout
```
Vault nodes:
- ch4.tailnet-68f9.ts.net:8200 (Leader)
- ash3c.tailnet-68f9.ts.net:8200 (Follower)
- warden.tailnet-68f9.ts.net:8200 (Follower)

Consul nodes:
- ch4.tailnet-68f9.ts.net:8500 (Leader)
- ash3c.tailnet-68f9.ts.net:8500 (Follower)
- warden.tailnet-68f9.ts.net:8500 (Follower)
```

## 2. Vault Configuration Best Practices

### 2.1 Storage Backend
```hcl
storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault/"

  # High-availability settings
  datacenter = "dc1"
  service = "vault"
  service_tags = "vault-server"

  # Session settings
  session_ttl = "15s"
  lock_wait_time = "15s"

  # Consistency
  consistency_mode = "strong"

  # Failover settings
  max_parallel = 128
  disable_registration = false
}
```

### 2.2 Listeners
```hcl
listener "tcp" {
  address = "0.0.0.0:8200"

  # Enable TLS in production
  tls_cert_file = "/opt/vault/tls/vault.crt"
  tls_key_file = "/opt/vault/tls/vault.key"
  tls_min_version = "1.2"
}

# Cluster listener
listener "tcp" {
  address = "0.0.0.0:8201"
  purpose = "cluster"

  tls_cert_file = "/opt/vault/tls/vault.crt"
  tls_key_file = "/opt/vault/tls/vault.key"
}
```

### 2.3 Cluster Settings
```hcl
# API address - on the Tailscale network
api_addr = "https://{{ ansible_host }}:8200"

# Cluster address - on the Tailscale network
cluster_addr = "https://{{ ansible_host }}:8201"

# Cluster name
cluster_name = "vault-cluster"

# Disable mlock (keep it enabled in production)
disable_mlock = false

# Logging
log_level = "INFO"
log_format = "json"
```

## 3. Consul Configuration Best Practices

### 3.1 Service Registration
```hcl
services {
  name = "vault"
  tags = ["vault-server", "secrets"]
  port = 8200

  check {
    name = "vault-health"
    http = "http://127.0.0.1:8200/v1/sys/health"
    interval = "10s"
    timeout = "3s"
  }
}
```

### 3.2 ACLs
```hcl
acl {
  enabled = true
  default_policy = "deny"
  enable_token_persistence = true

  # Token for the Vault service
  tokens {
    default = "{{ vault_consul_token }}"
  }
}
```

## 4. Security Best Practices

### 4.1 TLS
- All Vault node-to-node traffic uses TLS
- All Consul node-to-node traffic uses TLS
- Client-to-Vault traffic uses TLS (a certificate bootstrap sketch follows this list)
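A minimal sketch of generating the Consul TLS material with the built-in CLI; the output filenames follow Consul's defaults, and the distribution path is an assumption rather than part of this setup:

```bash
# Create a private CA and a server certificate for dc1.
# Consul writes consul-agent-ca.pem and the dc1-server-consul-0
# key pair into the working directory.
consul tls ca create
consul tls cert create -server -dc dc1

# Copy consul-agent-ca.pem and the server key pair to each node,
# e.g. under /opt/consul/tls (hypothetical path), then reference them
# from the agent and Vault listener configuration above.
```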
### 4.2 Authentication
```hcl
# Enable several authentication methods
auth {
  enabled = true

  # AppRole auth
  approle {
    enabled = true
  }

  # LDAP auth
  ldap {
    enabled = true
    url = "ldap://authentik.tailnet-68f9.ts.net:389"
    userdn = "ou=users,dc=authentik,dc=local"
    groupdn = "ou=groups,dc=authentik,dc=local"
  }

  # OIDC auth
  oidc {
    enabled = true
    oidc_discovery_url = "https://authentik1.git-4ta.live/application/o/vault/"
  }
}
```

## 5. Monitoring and Auditing

### 5.1 Audit Logs
```hcl
audit {
  enabled = true

  # File audit device
  file {
    path = "/opt/vault/logs/audit.log"
    format = "json"
  }

  # Syslog audit device
  syslog {
    facility = "AUTH"
    tag = "vault"
  }
}
```

### 5.2 Telemetry
```hcl
telemetry {
  prometheus_retention_time = "30s"
  disable_hostname = false

  # Metrics settings
  metrics {
    enabled = true
    prefix = "vault"
  }
}
```

## 6. Backup and Recovery

### 6.1 Automated Backup Script
```bash
#!/bin/bash
# /opt/vault/scripts/backup.sh

VAULT_ADDR="https://vault.git-4ta.live"
VAULT_TOKEN="$(cat /opt/vault/token)"

# Create a snapshot
# Note: raft snapshots require Vault's integrated (raft) storage; with
# the Consul backend described above, rely on the Consul snapshot in 6.2.
vault operator raft snapshot save /opt/vault/backups/vault-$(date +%Y%m%d-%H%M%S).snapshot

# Prune old backups (keep 7 days)
find /opt/vault/backups -name "vault-*.snapshot" -mtime +7 -delete
```

### 6.2 Consul Snapshots
```bash
#!/bin/bash
# /opt/consul/scripts/backup.sh

CONSUL_ADDR="http://127.0.0.1:8500"

# Create a Consul snapshot
consul snapshot save /opt/consul/backups/consul-$(date +%Y%m%d-%H%M%S).snapshot
```

## 7. Failover and Disaster Recovery

### 7.1 Automatic Failover
- Vault elects a new leader automatically via the Raft protocol
- Consul elects a new leader automatically via the Raft protocol
- Clients reconnect to the new leader automatically

### 7.2 Disaster Recovery Procedure
1. Stop all Vault nodes
2. Restore the data from Consul
3. Start the Vault cluster
4. Verify service health (a command-level sketch follows this list)
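A hedged sketch of those four steps; the snapshot filename is a placeholder and the unseal keys must come from the operator's secure storage:

```bash
# 1. Stop Vault everywhere (run on each node).
systemctl stop vault

# 2. Restore the Consul KV data that backs Vault from a snapshot.
consul snapshot restore /opt/consul/backups/consul-20251004-074411.snapshot

# 3. Start the Vault cluster again (run on each node).
systemctl start vault

# 4. Unseal (3 of the 5 keys) and verify.
vault operator unseal   # repeat three times, one key each time
vault status
```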
## 8. Performance Tuning

### 8.1 Cache
```hcl
cache {
  enabled = true
  size = 1000
  persist {
    type = "kubernetes"
    path = "/opt/vault/cache"
  }
}
```

### 8.2 Connection Pool
```hcl
storage "consul" {
  # Connection pool settings
  max_parallel = 128
  max_requests_per_second = 100
}
```

## 9. Deployment Checklist

### 9.1 Before Deployment
- [ ] Consul cluster is healthy
- [ ] Network connectivity tested
- [ ] TLS certificates configured
- [ ] Firewall rules configured
- [ ] Disk space checked

### 9.2 After Deployment
- [ ] Vault cluster status checked
- [ ] Service registration verified
- [ ] Authentication tested
- [ ] Backups tested
- [ ] Monitoring metrics verified

## 10. Common Issues and Solutions

### 10.1 Common Issues
1. **Vault cannot reach Consul**
   - Check network connectivity
   - Verify the Consul service state
   - Check ACL permissions

2. **Cluster split-brain**
   - Check for network partitions
   - Verify Raft log consistency
   - Run the disaster recovery procedure

3. **Performance problems**
   - Tune the connection pool size
   - Enable caching
   - Optimize the network configuration

### 10.2 Troubleshooting Commands
```bash
# Check Vault status
vault status

# List Consul members
consul members

# List registered services
consul catalog services

# Tail Vault logs
journalctl -u vault -f

# Tail Consul logs
journalctl -u consul -f
```
@ -0,0 +1,42 @@
# Cloudflare configuration
# Uses the Cloudflare token stored in Consul for API calls

# Fetch the Cloudflare configuration from Consul
data "consul_keys" "cloudflare_config" {
  key {
    name = "token"
    path = "config/dev/cloudflare/token"
  }
}

# Cloudflare provider configuration
provider "cloudflare" {
  api_token = data.consul_keys.cloudflare_config.var.token
}

# Test Cloudflare API connectivity - list active zones
data "cloudflare_zones" "available" {
  filter {
    status = "active"
  }
}

# Test Cloudflare API connectivity - list accounts
data "cloudflare_accounts" "available" {}

# Output the Cloudflare connectivity test results
output "cloudflare_connectivity_test" {
  description = "Cloudflare API connectivity test results"
  value = {
    zones_count = length(data.cloudflare_zones.available.zones)
    accounts_count = length(data.cloudflare_accounts.available.accounts)
    zones = [for zone in data.cloudflare_zones.available.zones : {
      name = zone.name
      id = zone.id
    }]
    accounts = [for account in data.cloudflare_accounts.available.accounts : {
      name = account.name
      id = account.id
    }]
  }
}
@ -0,0 +1,66 @@
# Korea region instance configuration - import existing resources

# ch4 instance (formerly ARM)
resource "oci_core_instance" "ch4" {
  # Base settings - match the existing instance
  compartment_id = data.consul_keys.oracle_config.var.tenancy_ocid
  availability_domain = "CSRd:AP-CHUNCHEON-1-AD-1"
  shape = "VM.Standard.A1.Flex"
  display_name = "ch4"

  shape_config {
    ocpus = 4
    memory_in_gbs = 24
  }

  # Guard against accidental recreation
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      source_details,
      metadata,
      create_vnic_details,
      time_created
    ]
  }
}

# ch2 instance
resource "oci_core_instance" "ch2" {
  # Base settings - match the existing instance
  compartment_id = data.consul_keys.oracle_config.var.tenancy_ocid
  availability_domain = "CSRd:AP-CHUNCHEON-1-AD-1"
  shape = "VM.Standard.E2.1.Micro"
  display_name = "ch2"

  # Guard against accidental recreation
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      source_details,
      metadata,
      create_vnic_details,
      time_created
    ]
  }
}

# ch3 instance
resource "oci_core_instance" "ch3" {
  # Base settings - match the existing instance
  compartment_id = data.consul_keys.oracle_config.var.tenancy_ocid
  availability_domain = "CSRd:AP-CHUNCHEON-1-AD-1"
  shape = "VM.Standard.E2.1.Micro"
  display_name = "ch3"

  # Guard against accidental recreation
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      source_details,
      metadata,
      create_vnic_details,
      time_created
    ]
  }
}
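Since these resources describe machines that already exist, they have to be brought under Terraform state with `terraform import` before the first plan; the OCIDs below are placeholders for the real instance OCIDs:

```bash
# Import the existing Korea-region instances into state.
terraform import oci_core_instance.ch4 ocid1.instance.oc1.ap-chuncheon-1.EXAMPLE1
terraform import oci_core_instance.ch2 ocid1.instance.oc1.ap-chuncheon-1.EXAMPLE2
terraform import oci_core_instance.ch3 ocid1.instance.oc1.ap-chuncheon-1.EXAMPLE3
# terraform plan should then show no changes for these resources.
```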
@ -0,0 +1,4 @@
# Test connectivity to the Korea region
data "oci_identity_availability_domains" "kr_test" {
  compartment_id = data.consul_keys.oracle_config.var.tenancy_ocid
}
@ -44,6 +44,12 @@ terraform {
       source  = "digitalocean/digitalocean"
       version = "~> 2.0"
     }
+
+    # Cloudflare Provider
+    cloudflare = {
+      source  = "cloudflare/cloudflare"
+      version = "~> 4.0"
+    }
   }
 
 # Backend configuration
@ -65,64 +71,7 @@ provider "vault" {
   token = var.vault_token
 }
 
-# Fetch the Oracle Cloud configuration from Consul
-data "consul_keys" "oracle_config" {
-  key {
-    name = "tenancy_ocid"
-    path = "config/dev/oracle/kr/tenancy_ocid"
-  }
-  key {
-    name = "user_ocid"
-    path = "config/dev/oracle/kr/user_ocid"
-  }
-  key {
-    name = "fingerprint"
-    path = "config/dev/oracle/kr/fingerprint"
-  }
-  key {
-    name = "private_key"
-    path = "config/dev/oracle/kr/private_key"
-  }
-}
-
-# Fetch the Oracle Cloud US-region configuration from Consul
-data "consul_keys" "oracle_config_us" {
-  key {
-    name = "tenancy_ocid"
-    path = "config/dev/oracle/us/tenancy_ocid"
-  }
-  key {
-    name = "user_ocid"
-    path = "config/dev/oracle/us/user_ocid"
-  }
-  key {
-    name = "fingerprint"
-    path = "config/dev/oracle/us/fingerprint"
-  }
-  key {
-    name = "private_key"
-    path = "config/dev/oracle/us/private_key"
-  }
-}
-
-# OCI provider using the configuration fetched from Consul
-provider "oci" {
-  tenancy_ocid = data.consul_keys.oracle_config.var.tenancy_ocid
-  user_ocid = data.consul_keys.oracle_config.var.user_ocid
-  fingerprint = data.consul_keys.oracle_config.var.fingerprint
-  private_key = data.consul_keys.oracle_config.var.private_key
-  region = "ap-chuncheon-1"
-}
-
-# OCI provider for the US region
-provider "oci" {
-  alias = "us"
-  tenancy_ocid = data.consul_keys.oracle_config_us.var.tenancy_ocid
-  user_ocid = data.consul_keys.oracle_config_us.var.user_ocid
-  fingerprint = data.consul_keys.oracle_config_us.var.fingerprint
-  private_key = data.consul_keys.oracle_config_us.var.private_key
-  region = "us-ashburn-1"
-}
+# Oracle Cloud configuration has moved to oracle.tf
 
 # Oracle Cloud infrastructure - temporarily commented out to avoid the VCN quota limit
 # module "oracle_cloud" {
@ -0,0 +1,61 @@
# Oracle Cloud Infrastructure configuration
# Manages multiple Oracle Cloud accounts and regions

# Fetch the Oracle Cloud Korea-region configuration from Consul
data "consul_keys" "oracle_config" {
  key {
    name = "tenancy_ocid"
    path = "config/dev/oracle/kr/tenancy_ocid"
  }
  key {
    name = "user_ocid"
    path = "config/dev/oracle/kr/user_ocid"
  }
  key {
    name = "fingerprint"
    path = "config/dev/oracle/kr/fingerprint"
  }
  key {
    name = "private_key"
    path = "config/dev/oracle/kr/private_key"
  }
}

# Fetch the Oracle Cloud US-region configuration from Consul
data "consul_keys" "oracle_config_us" {
  key {
    name = "tenancy_ocid"
    path = "config/dev/oracle/us/tenancy_ocid"
  }
  key {
    name = "user_ocid"
    path = "config/dev/oracle/us/user_ocid"
  }
  key {
    name = "fingerprint"
    path = "config/dev/oracle/us/fingerprint"
  }
  key {
    name = "private_key"
    path = "config/dev/oracle/us/private_key"
  }
}

# OCI provider for the Korea region
provider "oci" {
  tenancy_ocid = data.consul_keys.oracle_config.var.tenancy_ocid
  user_ocid = data.consul_keys.oracle_config.var.user_ocid
  fingerprint = data.consul_keys.oracle_config.var.fingerprint
  private_key = data.consul_keys.oracle_config.var.private_key
  region = "ap-chuncheon-1"
}

# OCI provider for the US region
provider "oci" {
  alias = "us"
  tenancy_ocid = data.consul_keys.oracle_config_us.var.tenancy_ocid
  user_ocid = data.consul_keys.oracle_config_us.var.user_ocid
  fingerprint = data.consul_keys.oracle_config_us.var.fingerprint
  private_key = data.consul_keys.oracle_config_us.var.private_key
  region = "us-ashburn-1"
}
@ -0,0 +1,72 @@
# Import the existing US-region instances - create nothing new, only manage what exists

# ash1d instance
resource "oci_core_instance" "ash1d" {
  provider = oci.us

  # Base settings - match the existing instance
  compartment_id = data.consul_keys.oracle_config_us.var.tenancy_ocid
  availability_domain = "TZXJ:US-ASHBURN-AD-1"
  shape = "VM.Standard.E2.1.Micro"
  display_name = "ash1d"

  # Guard against accidental recreation
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      source_details,
      metadata,
      create_vnic_details,
      time_created
    ]
  }
}

# ash2e instance
resource "oci_core_instance" "ash2e" {
  provider = oci.us

  # Base settings - match the existing instance
  compartment_id = data.consul_keys.oracle_config_us.var.tenancy_ocid
  availability_domain = "TZXJ:US-ASHBURN-AD-1"
  shape = "VM.Standard.E2.1.Micro"
  display_name = "ash2e"

  # Guard against accidental recreation
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      source_details,
      metadata,
      create_vnic_details,
      time_created
    ]
  }
}

# ash3c instance
resource "oci_core_instance" "ash3c" {
  provider = oci.us

  # Base settings - match the existing instance
  compartment_id = data.consul_keys.oracle_config_us.var.tenancy_ocid
  availability_domain = "TZXJ:US-ASHBURN-AD-1"
  shape = "VM.Standard.A1.Flex"
  display_name = "ash3c"

  shape_config {
    ocpus = 4
    memory_in_gbs = 24
  }

  # Guard against accidental recreation
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      source_details,
      metadata,
      create_vnic_details,
      time_created
    ]
  }
}
@ -0,0 +1,5 @@
# Test connectivity to the US region
data "oci_identity_availability_domains" "us_test" {
  provider = oci.us
  compartment_id = data.consul_keys.oracle_config_us.var.tenancy_ocid
}
@ -17,7 +17,7 @@ output "cluster_overview" {
       name = "dc2"
       location = "Korea (KR)"
       provider = "oracle"
-      node = "master"
+      node = "ch4"
       ip = try(module.oracle_korea_node[0].public_ip, "pending")
       status = "deployed"
     } : null
@ -0,0 +1,75 @@
datacenter = "dc1"
data_dir = "/opt/nomad/data"
plugin_dir = "/opt/nomad/plugins"
log_level = "INFO"
name = "de"

bind_addr = "0.0.0.0"

addresses {
  http = "de.tailnet-68f9.ts.net"
  rpc = "de.tailnet-68f9.ts.net"
  serf = "de.tailnet-68f9.ts.net"
}

advertise {
  http = "de.tailnet-68f9.ts.net:4646"
  rpc = "de.tailnet-68f9.ts.net:4647"
  serf = "de.tailnet-68f9.ts.net:4648"
}

ports {
  http = 4646
  rpc = 4647
  serf = 4648
}

server {
  enabled = true
  bootstrap_expect = 3
  server_join {
    retry_join = [
      "semaphore.tailnet-68f9.ts.net:4648",
      "ash1d.tailnet-68f9.ts.net:4648",
      "ash2e.tailnet-68f9.ts.net:4648",
      "ch2.tailnet-68f9.ts.net:4648",
      "ch3.tailnet-68f9.ts.net:4648",
      "onecloud1.tailnet-68f9.ts.net:4648",
      "de.tailnet-68f9.ts.net:4648",
      "hcp1.tailnet-68f9.ts.net:4648"
    ]
  }
}

client {
  enabled = true
  servers = [
    "ch3.tailnet-68f9.ts.net:4647",
    "ash1d.tailnet-68f9.ts.net:4647",
    "ash2e.tailnet-68f9.ts.net:4647",
    "ch2.tailnet-68f9.ts.net:4647",
    "hcp1.tailnet-68f9.ts.net:4647",
    "onecloud1.tailnet-68f9.ts.net:4647",
    "de.tailnet-68f9.ts.net:4647",
    "semaphore.tailnet-68f9.ts.net:4647"
  ]
  network_interface = "tailscale0"
  cgroup_parent = ""
}

consul {
  address = "ch4.tailnet-68f9.ts.net:8500"
  server_service_name = "nomad"
  client_service_name = "nomad-client"
  auto_advertise = true
  server_auto_join = false
  client_auto_join = true
}

telemetry {
  collection_interval = "1s"
  disable_hostname = false
  prometheus_metrics = true
  publish_allocation_metrics = true
  publish_node_metrics = true
}