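feat: update OCI Provider to 7.20 and integrate Vault configuration
refactor: rework the Terraform configuration to store sensitive data in Consul and Vault
docs: add Vault implementation documentation and configuration guides
chore: remove configuration files and scripts that are no longer used
feat: add Nomad cluster leader discovery script and documentation
feat: implement MCP configuration sharing scheme and sync script
style: update the network access notes in the README
test: add Consul Provider integration test script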
179
docs/consul-provider-integration.md
Normal file
@@ -0,0 +1,179 @@
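# Terraform Consul Provider Integration Guide

This guide explains how to use the Terraform Consul Provider to read the Oracle Cloud configuration directly from Consul, so that the private key no longer has to be copied into a temporary file by hand.

## Integration Overview

The Terraform Consul Provider is now part of the existing Terraform configuration and provides the following:

1. Reads the Oracle Cloud configuration (`tenancy_ocid`, `user_ocid`, `fingerprint`, and `private_key`) directly from Consul
2. Automatically writes the private key fetched from Consul to a temporary file
3. Initializes the OCI Provider with the configuration fetched from Consul
4. Supports configuration for multiple regions (Korea and the United States)
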
## Configuration Structure

### 1. Configuration Stored in Consul

The Oracle Cloud configuration is stored under the following Consul KV paths:

- Korea region: `config/dev/oracle/kr/`
  - `tenancy_ocid`
  - `user_ocid`
  - `fingerprint`
  - `private_key`

- US region: `config/dev/oracle/us/`
  - `tenancy_ocid`
  - `user_ocid`
  - `fingerprint`
  - `private_key`

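The path layout above can be sketched in a few lines of Python. This is only an illustration: the helper name `oracle_key_paths` and the `FIELDS` and `REGIONS` constants are hypothetical, and only the `config/<env>/oracle/<region>/<field>` layout is taken from this guide.

```python
# Hypothetical helper: names are illustrative, only the path layout comes from the guide.
FIELDS = ("tenancy_ocid", "user_ocid", "fingerprint", "private_key")
REGIONS = ("kr", "us")


def oracle_key_paths(region: str, env: str = "dev") -> dict[str, str]:
    """Map each credential field to its Consul KV path for one region."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    return {field: f"config/{env}/oracle/{region}/{field}" for field in FIELDS}


if __name__ == "__main__":
    # Print every path that should exist in Consul before running Terraform.
    for region in REGIONS:
        for field, path in oracle_key_paths(region).items():
            print(f"{region}: {field} -> {path}")
```

A list like this is convenient for checking that all eight keys exist before `terraform plan` is run.
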

### 2. Terraform Configuration

#### Provider Configuration

```hcl
# Consul Provider configuration
provider "consul" {
  address    = "localhost:8500"
  scheme     = "http"
  datacenter = "dc1"
}
```

#### Data Source Configuration

```hcl
# Read the Oracle Cloud configuration from Consul
data "consul_keys" "oracle_config" {
  key {
    name = "tenancy_ocid"
    path = "config/dev/oracle/kr/tenancy_ocid"
  }
  key {
    name = "user_ocid"
    path = "config/dev/oracle/kr/user_ocid"
  }
  key {
    name = "fingerprint"
    path = "config/dev/oracle/kr/fingerprint"
  }
  key {
    name = "private_key"
    path = "config/dev/oracle/kr/private_key"
  }
}
```

#### Private Key File

```hcl
# Write the private key fetched from Consul to a temporary file.
# local_sensitive_file keeps the key out of plan output and restricts file
# permissions; the key is still written to disk and recorded in the state.
resource "local_sensitive_file" "oci_kr_private_key" {
  content         = data.consul_keys.oracle_config.var.private_key
  filename        = "/tmp/oci_kr_private_key.pem"
  file_permission = "0600"
}
```

#### OCI Provider Configuration

```hcl
# OCI Provider using the configuration fetched from Consul
provider "oci" {
  tenancy_ocid     = data.consul_keys.oracle_config.var.tenancy_ocid
  user_ocid        = data.consul_keys.oracle_config.var.user_ocid
  fingerprint      = data.consul_keys.oracle_config.var.fingerprint
  private_key_path = local_sensitive_file.oci_kr_private_key.filename
  region           = "ap-chuncheon-1"
}
```

## Usage

### 1. Make Sure Consul Is Running

```bash
# Check whether Consul is running
pgrep consul
```

### 2. Make Sure the Oracle Cloud Configuration Is Stored in Consul

```bash
# Check the Korea region configuration
consul kv get config/dev/oracle/kr/tenancy_ocid
consul kv get config/dev/oracle/kr/user_ocid
consul kv get config/dev/oracle/kr/fingerprint
consul kv get config/dev/oracle/kr/private_key

# Check the US region configuration
consul kv get config/dev/oracle/us/tenancy_ocid
consul kv get config/dev/oracle/us/user_ocid
consul kv get config/dev/oracle/us/fingerprint
consul kv get config/dev/oracle/us/private_key
```

### 3. Initialize Terraform

```bash
cd /root/mgmt/tofu/environments/dev
terraform init -upgrade
```

### 4. Run the Test Script

```bash
# Run the integration test script
/root/mgmt/test_consul_provider.sh
```

### 5. Run Terraform with the Consul Configuration

```bash
cd /root/mgmt/tofu/environments/dev
terraform plan -var-file=consul.tfvars
terraform apply -var-file=consul.tfvars
```

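## Advantages

Reading the configuration directly from Consul through the Consul Provider has the following advantages:

1. **Better security**: the private key no longer lives in configuration files on disk; it is fetched from Consul at run time. It is still written to a temporary file and to the Terraform state, so both must be protected
2. **Simpler configuration**: no temporary file has to be created by hand; Terraform handles it
3. **Declarative style**: fully in line with Terraform's declarative configuration style
4. **Easier maintenance**: the configuration is stored centrally in Consul, which makes it easier to manage and update
5. **Multi-environment support**: multiple environments (dev, staging, production) can be supported easily
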
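## Troubleshooting

### 1. Consul Connection Problems

If Terraform cannot connect to Consul, check:

- Whether the Consul service is running
- Whether the Consul address and port are correct (default: `localhost:8500`)
- Whether the network connection works

### 2. Problems Reading the Configuration

If the configuration cannot be read from Consul, check:

- Whether the configuration has been stored in Consul correctly
- Whether the paths are correct
- Whether the permissions (ACL token) are sufficient

### 3. Terraform Initialization Problems

If Terraform initialization fails, check:

- Whether the Terraform version meets the requirement (>= 1.6)
- Whether the network connection works
- Whether the provider registry is reachable
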
## Version Information

- Terraform: `>= 1.6`
- Consul Provider: `~> 2.22.0`
- OCI Provider: `~> 5.0`

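Assuming the `~` entries in the version list are meant as Terraform's pessimistic constraint operator `~>`, the accepted range can be worked out as in the sketch below. The function name `pessimistic_bounds` is hypothetical and the code only illustrates the rule; it is not how Terraform itself is implemented.

```python
def pessimistic_bounds(version: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for '~> version'.

    Only the rightmost component given in the constraint may increase, so the
    upper bound is the version with its second-to-last component bumped.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) < 2:
        raise ValueError("'~>' needs at least MAJOR.MINOR")
    upper = parts[:-1]
    upper[-1] += 1
    return version, ".".join(str(p) for p in upper)


if __name__ == "__main__":
    for constraint in ("2.22.0", "5.0"):
        low, high = pessimistic_bounds(constraint)
        print(f"~> {constraint} means >= {low}, < {high}")
```

Read this way, `~> 2.22.0` accepts only 2.22.x patch releases, while `~> 5.0` accepts any 5.x release.
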
268
docs/vault/ansible_vault_integration.md
Normal file
@@ -0,0 +1,268 @@
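# Ansible and HashiCorp Vault Integration Guide

This document describes how to integrate Ansible with HashiCorp Vault so that sensitive information can be managed and used securely.

## 1. Install the Required Python Package

First install `hvac`, the Python client for Vault that Ansible's Vault lookup plugins depend on. The `hashi_vault` lookup itself ships in the `community.hashi_vault` collection.

```bash
pip install hvac
```
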
## 2. Configure Ansible to Use Vault

### 2.1 Create the Vault Connection Configuration

Create a Vault connection configuration file, `vault_config.yml`:

```yaml
vault_addr: http://localhost:8200
vault_role_id: "your-approle-role-id"
vault_secret_id: "your-approle-secret-id"
```

### 2.2 Create a Vault Role for Lookups

Create an AppRole in Vault that is dedicated to Ansible:

```bash
# Enable AppRole authentication
vault auth enable approle

# Create the policy
cat > ansible-policy.hcl <<EOF
path "kv/data/ansible/*" {
  capabilities = ["read"]
}
EOF

vault policy write ansible ansible-policy.hcl

# Create the AppRole
vault write auth/approle/role/ansible \
    token_policies="ansible" \
    token_ttl=1h \
    token_max_ttl=4h

# Read the Role ID
vault read auth/approle/role/ansible/role-id

# Generate a Secret ID
vault write -f auth/approle/role/ansible/secret-id
```

## 3. Use Vault in Ansible

### 3.1 Use the Lookup Plugin

Use the `hashi_vault` lookup plugin in an Ansible playbook:

```yaml
---
- name: Example using HashiCorp Vault
  hosts: all
  vars:
    vault_addr: "http://localhost:8200"
    role_id: "{{ lookup('file', '/path/to/role_id') }}"
    secret_id: "{{ lookup('file', '/path/to/secret_id') }}"

    # Fetch the database password from Vault
    db_password: "{{ lookup('hashi_vault', 'secret=kv/data/ansible/db:password auth_method=approle role_id=' + role_id + ' secret_id=' + secret_id + ' url=' + vault_addr) }}"

  tasks:
    - name: Configure the database connection
      template:
        src: db_config.j2
        dest: /etc/app/db_config.ini
```

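### 3.2 Use Environment Variables

The Vault authentication details can also be supplied through environment variables. Lookup plugins run on the control node, so the variables must be exported in the shell that runs `ansible-playbook`; the play-level `environment:` keyword only affects tasks executed on the managed hosts.

```yaml
# Export on the control node before running ansible-playbook:
#   export VAULT_ADDR=http://localhost:8200
#   export ANSIBLE_HASHI_VAULT_AUTH_METHOD=approle
#   export ANSIBLE_HASHI_VAULT_ROLE_ID="$(cat /path/to/role_id)"
#   export ANSIBLE_HASHI_VAULT_SECRET_ID="$(cat /path/to/secret_id)"
---
- name: Vault example using environment variables
  hosts: all

  tasks:
    - name: Fetch a key from Vault
      set_fact:
        api_key: "{{ lookup('hashi_vault', 'secret=kv/data/ansible/api:key') }}"
      no_log: true
```
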
## 4. Create a Vault Secrets Role

Create a custom Ansible role that reads secrets from Vault:

### 4.1 Role Structure

```
roles/
└── vault_secrets/
    ├── defaults/
    │   └── main.yml
    ├── tasks/
    │   └── main.yml
    └── vars/
        └── main.yml
```

### 4.2 Main Task File

`roles/vault_secrets/tasks/main.yml` is shown below. The `hashi_vault` lookup always reads a secret, so the login step uses the collection's `vault_login` lookup to obtain a token.

```yaml
---
- name: Ensure a valid Vault token
  block:
    - name: Log in to Vault with AppRole
      set_fact:
        vault_token: "{{ lookup('community.hashi_vault.vault_login', auth_method='approle', role_id=vault_role_id, secret_id=vault_secret_id, url=vault_addr).auth.client_token }}"
      no_log: true
  rescue:
    - name: Vault authentication failed
      fail:
        msg: "Could not obtain a valid token from Vault"

- name: Read secrets from Vault
  set_fact:
    secrets: "{{ lookup('hashi_vault', 'secret=' + vault_path + ' token=' + vault_token + ' url=' + vault_addr) }}"
  no_log: true

- name: Set a variable for each secret
  set_fact:
    "{{ item.key }}": "{{ item.value }}"
  with_dict: "{{ secrets.data.data }}"
  no_log: true
```

## 5. Migrate Existing Ansible Vault Content to HashiCorp Vault

### 5.1 Create the Migration Script

Create a script that migrates Ansible Vault content to HashiCorp Vault automatically:

```bash
#!/bin/bash
# migrate_to_hashicorp_vault.sh
set -euo pipefail

# Arguments
ANSIBLE_VAULT_FILE=${1:-}
VAULT_PATH=${2:-}
export VAULT_ADDR=${VAULT_ADDR:-"http://localhost:8200"}

# Validate the arguments
if [ -z "$ANSIBLE_VAULT_FILE" ] || [ -z "$VAULT_PATH" ]; then
    echo "Usage: $0 <ansible_vault_file> <vault_path>"
    echo "Example: $0 group_vars/all/vault.yml kv/ansible/group_vars/all"
    exit 1
fi

# Check the Vault login state
if ! vault token lookup >/dev/null 2>&1; then
    echo "Log in to Vault first: vault login <token>"
    exit 1
fi

# Decrypt the Ansible Vault file; the plaintext copy is removed on any exit
echo "Decrypting the Ansible Vault file..."
TEMP_FILE=$(mktemp)
trap 'rm -f "$TEMP_FILE"' EXIT
ansible-vault decrypt --output="$TEMP_FILE" "$ANSIBLE_VAULT_FILE"

# Load the YAML and store each top-level key in HashiCorp Vault
echo "Migrating secrets to HashiCorp Vault..."
python3 - "$TEMP_FILE" "$VAULT_PATH" <<'PY'
import json, subprocess, sys, yaml

temp_file, vault_path = sys.argv[1], sys.argv[2]
with open(temp_file) as f:
    data = yaml.safe_load(f) or {}
for key, value in data.items():
    # Strings are stored as-is; other types are stored as JSON text.
    text = value if isinstance(value, str) else json.dumps(value)
    # "value=-" makes the vault CLI read the value from stdin.
    subprocess.run(['vault', 'kv', 'put', f'{vault_path}/{key}', 'value=-'],
                   input=text, text=True, check=True)
PY

echo "Migration complete. Data is stored under the Vault path: $VAULT_PATH/"
```

### 5.2 Run the Migration

```bash
# Make the script executable
chmod +x migrate_to_hashicorp_vault.sh

# Run the migration
./migrate_to_hashicorp_vault.sh group_vars/all/vault.yml kv/ansible/group_vars/all
```

## 6. Update the Ansible Configuration

### 6.1 Modify ansible.cfg

Update `ansible.cfg` and add the Vault settings. The `role_id` and `secret_id` options take the credential values themselves, not file paths:

```ini
[defaults]
vault_identity_list = dev@~/.ansible/vault_dev.txt, prod@~/.ansible/vault_prod.txt

[hashi_vault_collection]
url = http://localhost:8200
auth_method = approle
role_id = <your-approle-role-id>
secret_id = <your-approle-secret-id>
```

### 6.2 Update Existing Playbooks

Replace the Ansible Vault references in existing playbooks with HashiCorp Vault references:

```yaml
# Old approach
- name: Use an Ansible Vault variable
  debug:
    msg: "Database password: {{ vault_db_password }}"

# New approach
- name: Use a HashiCorp Vault variable
  debug:
    msg: "Database password: {{ lookup('hashi_vault', 'secret=kv/data/ansible/db:password') }}"
```

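## 7. Best Practices

1. **Avoid hard-coded credentials**: keep Vault authentication details in environment variables or external files
2. **Limit token permissions**: grant the Vault tokens created for Ansible only the minimum permissions they need
3. **Set a reasonable TTL**: give Vault tokens a reasonable lifetime and avoid long-lived tokens
4. **Use no_log**: set `no_log: true` on tasks that handle sensitive data to keep it out of the logs
5. **Rotate credentials regularly**: rotate the AppRole Secret ID on a regular schedule
6. **Integrate with CI/CD**: integrate Vault authentication into the CI/CD pipeline instead of managing tokens by hand
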
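## 8. Troubleshooting

### 8.1 Common Problems

1. **Authentication failure**:
   - Check that the Role ID and Secret ID are correct
   - Verify that the AppRole has the correct policy attached

2. **Wrong path**:
   - With the KV v2 engine, API paths, policies, and lookups need `data` in the path, for example `kv/data/path` rather than `kv/path` (the `vault kv` CLI adds it automatically)

3. **Permission problems**:
   - Make sure the AppRole has sufficient permissions to read the requested secret
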
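The KV v2 path rule can be illustrated with a small Python sketch. The helper name `kv2_api_path` and the default mount name `kv` are assumptions for illustration; a secret whose own name begins with `data/` would be ambiguous here and is deliberately not handled.

```python
def kv2_api_path(cli_path: str, mount: str = "kv") -> str:
    """Turn a `vault kv` CLI path into the KV v2 API and policy path.

    Example: "kv/ansible/db" becomes "kv/data/ansible/db".
    """
    prefix = mount.strip("/") + "/"
    if not cli_path.startswith(prefix):
        raise ValueError(f"{cli_path!r} is not under mount {mount!r}")
    rest = cli_path[len(prefix):]
    if rest.startswith("data/"):
        # Already written in API form; leave it unchanged.
        return cli_path
    return f"{prefix}data/{rest}"


if __name__ == "__main__":
    print(kv2_api_path("kv/ansible/db"))
```

This is the same mapping that links the migration target `kv/ansible/...` to the policy path `kv/data/ansible/*`.
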
### 8.2 Debugging Tips

Run the playbook with `-vvv` to see detailed lookup errors. The task below prints the secret it fetches, so use it only with test data:

```yaml
- name: Debug a Vault lookup
  debug:
    msg: "{{ lookup('hashi_vault', 'secret=kv/data/ansible/db:password auth_method=approle role_id=' + role_id + ' secret_id=' + secret_id + ' url=' + vault_addr) }}"
```

94
docs/vault/vault-cluster.nomad
Normal file
@@ -0,0 +1,94 @@
job "vault-cluster" {
  datacenters = ["dc1"]
  type        = "service"

  group "vault-servers" {
    count = 3

    constraint {
      attribute = "${node.unique.name}"
      operator  = "regexp"
      value     = "(warden|ash3c|master)"
    }

    task "vault" {
      driver = "podman"

      config {
        image = "hashicorp/vault:latest"
        ports = ["api", "cluster"]

        # Start the Vault server with the rendered configuration file
        command = "vault"
        args = [
          "server",
          "-config=/vault/config/vault.hcl"
        ]

        # Container network settings
        network_mode = "host"

        # Security settings
        cap_add = ["IPC_LOCK"]
      }

      template {
        data = <<EOH
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"
  token   = "{{ with secret "consul/creds/vault" }}{{ .Data.token }}{{ end }}"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1  # Enable TLS in production
}

api_addr     = "http://{{ env "NOMAD_IP_api" }}:8200"
cluster_addr = "http://{{ env "NOMAD_IP_cluster" }}:8201"

ui            = true
disable_mlock = true
EOH
        destination = "vault/config/vault.hcl"
      }

      volume_mount {
        volume      = "vault-data"
        destination = "/vault/data"
        read_only   = false
      }

      resources {
        cpu    = 500
        memory = 1024

        network {
          mbits = 10
          port "api" { static = 8200 }
          port "cluster" { static = 8201 }
        }
      }

      service {
        name = "vault"
        port = "api"

        check {
          name     = "vault-health"
          type     = "http"
          path     = "/v1/sys/health"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }

    volume "vault-data" {
      type      = "host"
      read_only = false
      source    = "vault-data"
    }
  }
}
169
docs/vault/vault_implementation_proposal.md
Normal file
@@ -0,0 +1,169 @@
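# HashiCorp Vault Implementation Proposal

## 1. Current State of the Project

### 1.1 Existing Infrastructure
- **Multi-cloud environment**: Oracle Cloud, Huawei Cloud, Google Cloud, AWS, DigitalOcean
- **Infrastructure management**: OpenTofu (Terraform)
- **Configuration management**: Ansible
- **Container orchestration**: Nomad + Podman
- **Service discovery**: Consul (deployed on the warden, ash3c, and master nodes)
- **CI/CD**: Gitea Actions

### 1.2 Current State of Secret Management
- Ansible Vault is used for part of the sensitive information
- Plaintext secrets are stored in the repository (for example `security/secrets/key.md`)
- There is no unified secret management or rotation mechanism
- There is no centralized access control or auditing

### 1.3 Security Risks
- Plaintext secrets in the repository are a potential security vulnerability
- Without rotation, long-lived credentials are more likely to leak
- Scattered secret management makes maintenance harder and raises security risk
- Without auditing, it is hard to trace who accessed which sensitive data and when
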
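## 2. The HashiCorp Vault Solution

### 2.1 About Vault
HashiCorp Vault is a secret management and data protection tool designed for modern cloud environments. Its core features are:
- Secure storage of secrets and sensitive data
- Dynamic generation of short-lived credentials
- Encryption as a service
- Detailed audit logs
- Fine-grained access control

### 2.2 How Vault Addresses the Current Problems
- **Centralized secret management**: all secrets and sensitive data are stored and managed in one place
- **Dynamic secrets**: short-lived credentials are generated for databases, cloud services, and so on, which reduces the risk of long-lived credentials leaking
- **Automatic rotation**: secrets are rotated automatically on a schedule
- **Access control**: role-based access control ensures that only authorized users can read a given secret
- **Audit logs**: every secret access is recorded in detail for security audits
- **Integration with existing infrastructure**: Vault integrates with Nomad and Consul
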
## 3. Deployment Plan

### 3.1 Deployment Architecture
We recommend deploying Vault on the existing Consul cluster nodes (warden, ash3c, master) to form a highly available Vault cluster:

```
+-------------------+    +-------------------+    +-------------------+
|      warden       |    |       ash3c       |    |      master       |
|                   |    |                   |    |                   |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|  |   Consul    |  |    |  |   Consul    |  |    |  |   Consul    |  |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|                   |    |                   |    |                   |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|  |    Vault    |  |    |  |    Vault    |  |    |  |    Vault    |  |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
+-------------------+    +-------------------+    +-------------------+
```

### 3.2 Storage Backend
Use the existing Consul cluster as Vault's storage backend to benefit from Consul's high availability and consistency:
- Vault data is stored encrypted in Consul
- Consul's distributed design keeps the data highly available
- The Vault servers themselves are stateless, which makes them easy to scale and maintain

### 3.3 Resource Requirements
Recommended resources for the Vault service on each node:
- CPU: 2-4 cores
- Memory: 4-8 GB
- Storage: 20 GB (for logs and temporary data)

### 3.4 Network Configuration
- Vault API port: 8200
- Vault cluster communication port: 8201
- Encrypt all traffic with TLS
- Set firewall rules that restrict access to the Vault API

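## 4. Implementation Plan

### 4.1 Preparation
1. **Environment preparation**
   - Install the required dependencies on the target nodes
   - Generate TLS certificates for encrypting Vault traffic
   - Configure firewall rules

2. **Configuration file preparation**
   - Create the Vault configuration file
   - Configure the Consul storage backend
   - Set the TLS and encryption parameters

### 4.2 Deployment
1. **Initial deployment**
   - Install Vault on the three nodes
   - Configure Consul as the storage backend
   - Initialize Vault and generate the unseal keys

2. **High availability**
   - Configure the Vault cluster
   - Set up auto-unseal
   - Configure load balancing

### 4.3 Integration
1. **Integration with existing systems**
   - Configure Nomad to fetch secrets from Vault
   - Update the Ansible scripts to read sensitive data through the Vault API
   - Integrate Vault into the CI/CD pipeline

2. **Secret migration**
   - Migrate existing secrets into Vault
   - Define a secret rotation policy
   - Remove plaintext secrets from the repository

### 4.4 Verification and Testing
1. **Functional testing**
   - Verify Vault's basic functionality
   - Test secret access and management
   - Verify high availability and failover

2. **Security testing**
   - Run penetration tests
   - Verify the access control policies
   - Test the audit log
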
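## 5. Operations and Management

### 5.1 Day-to-Day Operations
- Back up the Vault data regularly
- Monitor the state of the Vault service
- Review the audit logs

### 5.2 Disaster Recovery
- Write a detailed disaster recovery plan
- Run recovery drills regularly
- Keep the unseal keys stored securely

### 5.3 Security Best Practices
- Apply the principle of least privilege
- Rotate the root key regularly
- Use multi-factor authentication
- Review access policies regularly
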
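## 6. Implementation Timeline

| Phase | Task | Estimate |
|-------|------|----------|
| Preparation | Environment preparation | 1 day |
| Preparation | Configuration file preparation | 1 day |
| Deployment | Initial deployment | 1 day |
| Deployment | High availability | 1 day |
| Integration | Integration with existing systems | 3 days |
| Integration | Secret migration | 2 days |
| Testing | Functional and security testing | 2 days |
| Documentation | Write the operations documentation | 1 day |
| **Total** | | **12 days** |
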
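## 7. Conclusion and Recommendation

Based on the analysis of the current infrastructure and security requirements, we strongly recommend deploying HashiCorp Vault on the existing Consul cluster nodes to improve the project's security and secret management.

The main benefits are:
- Eliminates the security risk of plaintext secret storage
- Provides centralized secret management and access control
- Supports dynamic secrets and automatic rotation
- Integrates with the existing HashiCorp ecosystem (Nomad, Consul)
- Provides detailed audit logs that help meet compliance requirements

Deploying Vault on the existing nodes makes full use of existing resources while significantly improving security, and it gives the multi-cloud environment a single secret management solution.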
252
docs/vault/vault_setup_guide.md
Normal file
@@ -0,0 +1,252 @@
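# Vault Deployment and Configuration Guide

This document gives detailed steps for deploying and configuring HashiCorp Vault on the existing Consul cluster nodes.

## 1. Prerequisites

### 1.1 Create the Data Directory

Create the Vault data directory on every node:

```bash
sudo mkdir -p /opt/vault/data
sudo chown -R nomad:nomad /opt/vault
```

### 1.2 Generate TLS Certificates (Required in Production)

The Vault CLI has no command for issuing TLS certificates; one option is `openssl`. The subject names below are placeholders:

```bash
# Generate a CA certificate
openssl req -x509 -newkey rsa:4096 -sha256 -days 3650 -nodes \
  -keyout ca.key -out ca.cert -subj "/CN=vault-ca"

# Generate a server key and a certificate signed by the CA
openssl req -newkey rsa:4096 -nodes -keyout server.key -out server.csr -subj "/CN=vault.service.consul"
openssl x509 -req -in server.csr -CA ca.cert -CAkey ca.key -CAcreateserial \
  -out server.cert -days 365 -sha256
```
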
## 2. Deploy the Vault Cluster

### 2.1 Deploy with Nomad

Submit the `vault-cluster.nomad` file to Nomad:

```bash
nomad job run vault-cluster.nomad
```

### 2.2 Verify the Deployment

```bash
# Check the Nomad job status
nomad job status vault-cluster

# Check the Vault service status
curl http://localhost:8200/v1/sys/health
```

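The health endpoint reports Vault's state through its HTTP status code, so a freshly deployed node that has not been initialized yet does not answer with 200. The sketch below, with hypothetical function names, maps the documented default status codes to a readable state. The address is the one used in this guide.

```python
import urllib.error
import urllib.request

# Default status codes of Vault's /v1/sys/health endpoint.
HEALTH_STATES = {
    200: "initialized, unsealed, active",
    429: "unsealed, standby",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_health(status_code: int) -> str:
    """Translate a /v1/sys/health status code into a readable state."""
    return HEALTH_STATES.get(status_code, f"unexpected status {status_code}")


def fetch_health_code(addr: str = "http://localhost:8200") -> int:
    """Return the HTTP status of the health endpoint.

    urllib raises HTTPError for non-2xx answers, so the code is read from it.
    """
    try:
        with urllib.request.urlopen(f"{addr}/v1/sys/health", timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

# Usage against a running node: print(describe_health(fetch_health_code()))
```

Right after `nomad job run`, the expected answer is therefore "not initialized" until section 3 has been completed.
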
## 3. Initialize and Unseal Vault

### 3.1 Initialize Vault

Run on any one node:

```bash
# Initialize Vault and generate the unseal keys and the root token
vault operator init -key-shares=5 -key-threshold=3
```

**Important:** Store the generated unseal keys and root token securely!

### 3.2 Unseal Vault

Unseal Vault on every node (at least 3 unseal keys are required):

```bash
# Unseal Vault
vault operator unseal <unseal-key-1>
vault operator unseal <unseal-key-2>
vault operator unseal <unseal-key-3>
```

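The "3 of 5" rule comes from Shamir's secret sharing: the key is split into five shares and any three of them reconstruct it. The toy Python sketch below illustrates the idea over a prime field. It is not Vault's implementation (Vault works bytewise over GF(2^8)), and the function names and the choice of prime are assumptions made for illustration.

```python
import secrets

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus in this toy example


def split_secret(secret: int, shares: int = 5, threshold: int = 3) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine_shares(points: list[tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total
```

Any three of the five shares give back the secret, which is why each node needs three separate `vault operator unseal` calls.
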
## 4. Configure Vault

### 4.1 Log In to Vault

```bash
# Set the Vault address
export VAULT_ADDR='http://127.0.0.1:8200'

# Log in with the root token
vault login <root-token>
```

### 4.2 Enable Secrets Engines

```bash
# Enable the KV v2 secrets engine
vault secrets enable -version=2 kv

# Enable the AWS secrets engine (if needed)
vault secrets enable aws

# Enable the database secrets engine (if needed)
vault secrets enable database
```

### 4.3 Configure Access Policies

```bash
# Create the policy file
cat > nomad-server-policy.hcl <<EOF
path "kv/data/nomad/*" {
  capabilities = ["read"]
}
EOF

# Create the policy
vault policy write nomad-server nomad-server-policy.hcl

# Create a token
vault token create -policy=nomad-server
```

## 5. Integrate with Nomad

### 5.1 Configure Nomad to Use Vault

Edit the Nomad configuration file (`/etc/nomad.d/nomad.hcl`) and add the Vault block:

```hcl
vault {
  enabled = true
  address = "http://127.0.0.1:8200"
  token   = "<vault-token-for-the-nomad-server>"
}
```

### 5.2 Restart the Nomad Service

```bash
sudo systemctl restart nomad
```

## 6. Migrate Existing Secrets to Vault

### 6.1 Store API Keys

```bash
# Store the OCI API key (key=@file reads the value from the file)
vault kv put kv/oci/api-key key=@/root/mgmt/security/secrets/key.md

# Store credentials for other cloud providers
vault kv put kv/aws/credentials aws_access_key_id="<access-key-id>" aws_secret_access_key="<secret-access-key>"
```

### 6.2 Configure a Secret Rotation Policy

```bash
# Configure the database secrets engine so that credentials are issued dynamically and expire automatically
vault write database/config/mysql \
    plugin_name=mysql-database-plugin \
    connection_url="{{username}}:{{password}}@tcp(database.example.com:3306)/" \
    allowed_roles="app-role" \
    username="root" \
    password="<database-root-password>"

# Configure the role
vault write database/roles/app-role \
    db_name=mysql \
    creation_statements="CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}';GRANT SELECT ON *.* TO '{{name}}'@'%';" \
    default_ttl="1h" \
    max_ttl="24h"
```

## 7. Security Best Practices

### 7.1 Enable Audit Logging

```bash
# Enable the file audit device
vault audit enable file file_path=/var/log/vault/audit.log
```

### 7.2 Configure Auto-Unseal (Production)

In production, configure an auto-unseal mechanism, for example with a cloud KMS service:

```hcl
# Example auto-unseal configuration with AWS KMS
seal "awskms" {
  region     = "us-west-2"
  kms_key_id = "<aws-kms-key-id>"
}
```

### 7.3 Rotate the Encryption Key Regularly

`vault operator rotate` rotates the encryption key that Vault uses for new writes. Changing the unseal keys is a separate operation, `vault operator rekey`.

```bash
# Rotate the encryption key
vault operator rotate
```

## 8. Troubleshooting

### 8.1 Check the Vault Status

```bash
# Check the Vault status
vault status

# Check the seal state
vault status -format=json | jq '.sealed'
```

### 8.2 Check the Consul Storage

```bash
# Check the Vault data in Consul
consul kv get -recurse vault/
```

### 8.3 Common Problems

- **Vault fails to start**: check the configuration file syntax and permissions
- **Unseal fails**: make sure the correct unseal keys are used
- **API unreachable**: check the firewall rules and the listener address

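The seal check from section 8.1 can also be scripted. The Python sketch below uses hypothetical function names and assumes the `vault` binary is on the `PATH` with `VAULT_ADDR` set. `vault status` exits with code 2 when Vault is sealed, so a non-zero exit code is not treated as an error here.

```python
import json
import subprocess


def is_sealed(status_json: str) -> bool:
    """Read the `sealed` flag from `vault status -format=json` output."""
    return bool(json.loads(status_json)["sealed"])


def vault_is_sealed() -> bool:
    """Run the CLI and report the seal state of the node at VAULT_ADDR."""
    result = subprocess.run(
        ["vault", "status", "-format=json"],
        capture_output=True,
        text=True,
    )
    return is_sealed(result.stdout)
```

Calling `vault_is_sealed()` on each of the three nodes after a restart shows which ones still need to be unsealed.
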
## 9. Backup and Restore

### 9.1 Back Up the Vault Data

```bash
# Back up the Vault data stored in Consul (the snapshot covers the whole Consul datacenter state)
consul snapshot save vault-backup.snap
```

### 9.2 Restore the Vault Data

```bash
# Restore the Consul snapshot (this replaces all Consul state, not only the Vault data)
consul snapshot restore vault-backup.snap
```

## 10. Routine Maintenance

### 10.1 Monitor the Vault Status

```bash
# Scrape metrics in Prometheus format
# (requires telemetry { prometheus_retention_time = "30s" } in the Vault configuration)
curl -H "X-Vault-Token: $VAULT_TOKEN" "$VAULT_ADDR/v1/sys/metrics?format=prometheus"
```

### 10.2 Review the Audit Log

```bash
# Analyze the audit log
jq . /var/log/vault/audit.log
```

### 10.3 Update the Vault Version Regularly

```bash
# Update the Vault version (by updating the Nomad job)
nomad job run -detach vault-cluster.nomad
```
99
docs/waypoint/waypoint-server.nomad
Normal file
@@ -0,0 +1,99 @@
job "waypoint-server" {
  datacenters = ["dc1"]
  type        = "service"

  group "waypoint" {
    count = 1

    constraint {
      attribute = "${node.unique.name}"
      operator  = "="
      value     = "warden"
    }

    network {
      port "ui" {
        static = 9701
      }

      port "api" {
        static = 9702
      }

      port "grpc" {
        static = 9703
      }
    }

    task "server" {
      driver = "podman"

      config {
        image = "hashicorp/waypoint:latest"
        ports = ["ui", "api", "grpc"]

        args = [
          "server",
          "run",
          "-accept-tos",
          "-vvv",
          "-platform=nomad",
          "-nomad-host=${attr.nomad.advertise.address}",
          "-nomad-consul-service=true",
          "-nomad-consul-service-hostname=${attr.unique.hostname}",
          "-nomad-consul-datacenter=dc1",
          "-listen-grpc=0.0.0.0:9703",
          "-listen-http=0.0.0.0:9702",
          "-url-api=http://${attr.unique.hostname}:9702",
          "-url-ui=http://${attr.unique.hostname}:9701"
        ]
      }

      env {
        WAYPOINT_SERVER_DISABLE_MEMORY_DB = "true"
      }

      resources {
        cpu    = 500
        memory = 1024
      }

      service {
        name = "waypoint-ui"
        port = "ui"

        check {
          name     = "waypoint-ui-alive"
          type     = "http"
          path     = "/"
          interval = "10s"
          timeout  = "2s"
        }
      }

      service {
        name = "waypoint-api"
        port = "api"

        check {
          name     = "waypoint-api-alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }

      volume_mount {
        volume      = "waypoint-data"
        destination = "/data"
        read_only   = false
      }
    }

    volume "waypoint-data" {
      type      = "host"
      read_only = false
      source    = "waypoint-data"
    }
  }
}
245
docs/waypoint/waypoint_implementation_proposal.md
Normal file
@@ -0,0 +1,245 @@
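# HashiCorp Waypoint Implementation Proposal

## 1. Current State of the Project

### 1.1 Existing Deployment Process
- **Infrastructure management**: OpenTofu (Terraform)
- **Configuration management**: Ansible
- **Container orchestration**: Nomad + Podman
- **CI/CD**: Gitea Actions
- **Multi-cloud environment**: Oracle Cloud, Huawei Cloud, Google Cloud, AWS, DigitalOcean

### 1.2 Challenges in the Current Deployment Process
- Deployment processes differ across the cloud platforms
- Managing configuration differences between environments (development, testing, production) is complex
- Application lifecycle management is spread across several tools
- There is no unified interface for deploying and releasing applications
- Developers need to know several tools and platform-specific details

### 1.3 Existing GitOps Workflow
The project already uses a GitOps workflow, which includes:
- Declarative configuration stored in Git
- Changes applied automatically through the CI/CD pipeline
- State convergence and monitoring
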
## 2. The HashiCorp Waypoint Solution

### 2.1 About Waypoint
HashiCorp Waypoint is an application deployment tool that provides one consistent workflow to build, deploy, and release applications regardless of the underlying platform. Its main features are:

- A unified workflow interface
- Multi-platform support
- Application version management
- Automated release control
- An extensible plugin system

### 2.2 How Waypoint Complements the Existing Toolchain

| Existing tool | Main responsibility | What Waypoint adds |
|---------------|---------------------|--------------------|
| OpenTofu | Infrastructure management | Does not replace it; integrates with it and uses the infrastructure it creates |
| Ansible | Configuration management | Can call Ansible as part of a build or deploy step |
| Nomad | Container orchestration | Integrates directly and simplifies deploying and managing Nomad jobs |
| Gitea Actions | CI/CD pipeline | Can be called from the pipeline, or can trigger the pipeline |

### 2.3 How Waypoint Works with the Existing Tools
```
+----------------+     +----------------+     +----------------+
|    OpenTofu    |     |    Waypoint    |     |     Nomad      |
|                |---->|                |---->|                |
|(infrastructure)|     |(app deployment)|     |(orchestration) |
+----------------+     +----------------+     +----------------+
                               |
                               v
                       +----------------+
                       |    Ansible     |
                       |                |
                       | (config mgmt)  |
                       +----------------+
```

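## 3. Value Analysis of Adopting Waypoint

### 3.1 Potential Benefits

#### 3.1.1 Better Developer Experience
- **Simpler interface**: developers deploy applications through one interface without knowing the details of the underlying platform
- **Consistent local development**: development and production environments use the same deployment process
- **Fast feedback**: deployment results and logs are visible in one place

#### 3.1.2 Better Operational Efficiency
- **Standardized deployment process**: a consistent deployment method across teams and projects
- **Fewer platform-specific scripts**: fewer custom scripts to maintain for each platform
- **Centralized deployment management**: all application deployments are managed from the UI or the CLI

#### 3.1.3 Support for the Multi-Cloud Strategy
- **Platform-independent deployment**: the same Waypoint configuration can be used on different cloud platforms
- **Simpler cloud migration**: applications are easier to move between cloud providers
- **Hybrid cloud support**: deployments across several cloud platforms are managed in one place

#### 3.1.4 Integration with the Existing HashiCorp Ecosystem
- **Nomad integration**: native support for Nomad as a deployment platform
- **Consul integration**: service discovery and configuration management
- **Vault integration**: secure access to the secrets and certificates needed for deployment

### 3.2 Potential Challenges

#### 3.2.1 Implementation Cost
- **Learning curve**: the team has to learn a new tool
- **Migration work**: existing deployment processes have to be adapted to Waypoint
- **Maintenance overhead**: an additional infrastructure component has to be maintained

#### 3.2.2 Overlap with Existing Processes
- **Overlap with Gitea Actions**: some features overlap with the existing CI/CD process
- **Toolchain complexity**: adding a new tool may increase overall complexity

#### 3.2.3 Maturity
- **Relatively new project**: Waypoint is newer than other HashiCorp products
- **Community size**: the community and ecosystem are still developing
- **Plugin ecosystem**: plugins for some platforms may not be mature enough
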
## 4. Implementation

### 4.1 Deployment Architecture
We recommend deploying the Waypoint server in the same environment as Nomad and Consul:

```
+-------------------+    +-------------------+    +-------------------+
|      warden       |    |       ash3c       |    |      master       |
|                   |    |                   |    |                   |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|  |   Consul    |  |    |  |   Consul    |  |    |  |   Consul    |  |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|                   |    |                   |    |                   |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|  |    Nomad    |  |    |  |    Nomad    |  |    |  |    Nomad    |  |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|                   |    |                   |    |                   |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|  |    Vault    |  |    |  |    Vault    |  |    |  |    Vault    |  |
|  +-------------+  |    |  +-------------+  |    |  +-------------+  |
|                   |    |                   |    |                   |
|  +-------------+  |    |                   |    |                   |
|  |  Waypoint   |  |    |                   |    |                   |
|  +-------------+  |    |                   |    |                   |
+-------------------+    +-------------------+    +-------------------+
```

### 4.2 Resource Requirements
Recommended resources for the Waypoint server:
- CPU: 2 cores
- Memory: 2 GB
- Storage: 10 GB

### 4.3 Network Configuration
- Waypoint API port: 9702
- Waypoint UI port: 9701
- Encrypt all traffic with TLS

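## 5. Implementation Plan

### 5.1 Pilot Phase
1. **Environment preparation**
   - Deploy the Waypoint server on a single node
   - Configure the integration with Nomad, Consul, and Vault

2. **Choose a pilot project**
   - Pick a non-critical application as the pilot
   - Create the Waypoint configuration file
   - Implement the build, deploy, and release workflow

3. **Evaluate the results**
   - Collect feedback from development and operations
   - Assess the gain in deployment efficiency
   - Identify potential problems and improvements

### 5.2 Expansion Phase
1. **Expand to more applications**
   - Migrate more applications to Waypoint step by step
   - Create standardized Waypoint templates
   - Write best-practice documentation

2. **Team training**
   - Train the development and operations teams on Waypoint
   - Build an internal knowledge base with examples

3. **CI/CD integration**
   - Integrate Waypoint into the existing Gitea Actions pipelines
   - Trigger deployments automatically

### 5.3 Full Integration Phase
1. **Expand to all environments**
   - Use Waypoint consistently in development, testing, and production
   - Manage environment-specific configuration

2. **Advanced features**
   - Configure automatic rollback policies
   - Implement blue-green deployments and canary releases
   - Integrate monitoring and alerting

3. **Continuous optimization**
   - Review and optimize the deployment process regularly
   - Track Waypoint updates and new features
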
## 6. Implementation Timeline

| Phase | Task | Estimate |
|-------|------|----------|
| Preparation | Environment preparation and Waypoint server deployment | 2 days |
| Pilot | Pilot project implementation | 5 days |
| Pilot | Evaluation and adjustment | 3 days |
| Expansion | Expand to more applications | 10 days |
| Expansion | Team training | 2 days |
| Expansion | CI/CD integration | 3 days |
| Integration | Expand to all environments | 5 days |
| Integration | Advanced features | 5 days |
| **Total** | | **35 days** |

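## 7. Cost-Benefit Analysis

### 7.1 Implementation Cost
- **Infrastructure cost**: low (uses existing nodes)
- **License cost**: none (open-source edition)
- **Staff cost**: medium (learning and migration work)
- **Maintenance cost**: low (integrates with the existing HashiCorp products)

### 7.2 Expected Benefits
- **Developer efficiency**: an estimated 20-30% less deployment-related work
- **Deployment consistency**: an estimated 50% fewer environment-specific problems
- **Shorter time to production**: an estimated 15-25% shorter application rollout time
- **Lower operations load**: fewer cross-platform deployment scripts to maintain

### 7.3 Payback Period
- Clear benefits are expected within 3-6 months after implementation
- Full return on investment is expected within 9-12 months
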
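## 8. Conclusion and Recommendation

### 8.1 Decision Factors for Adopting Waypoint

#### Factors in Favor
- The project already uses the HashiCorp ecosystem (Nomad, Consul)
- The multi-cloud environment needs a unified deployment process
- The deployment experience for developers needs to be simpler
- The application deployment process needs to be standardized

#### Factors Against
- The existing CI/CD process already meets the requirements
- Team resources are limited, which makes it hard to learn and maintain another tool
- Deployment needs are fairly simple and do not require advanced release strategies

### 8.2 Recommended Path

Based on the analysis of the project's current state, we recommend an **incremental** strategy:

1. **Implement Vault first**: address the security problems first by using Vault for secret management
2. **Pilot Waypoint on a small scale**: pilot Waypoint on a non-critical application and evaluate its actual value
3. **Decide based on the pilot**: use the pilot results to decide whether to extend the use of Waypoint

### 8.3 Final Recommendation

Although Waypoint offers a unified deployment experience and multi-cloud support, the project already has a fairly mature GitOps workflow and CI/CD process, so Waypoint should have a lower priority than Vault.

We recommend completing the Vault implementation first to address the current security problems and then, if resources allow, evaluating Waypoint's actual value through a small pilot. This incremental approach lowers risk and directs resources to the most valuable improvements.

If the pilot shows that Waypoint significantly improves developer efficiency and deployment consistency, a broader rollout can be considered.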
712
docs/waypoint/waypoint_integration_examples.md
Normal file
@@ -0,0 +1,712 @@
|
||||
# Waypoint Integration Examples

This document provides concrete examples of integrating Waypoint with the existing infrastructure and tools.

## 1. Nomad Integration

### 1.1 Basic Nomad Deployment Configuration

```hcl
app "api-service" {
  build {
    use "docker" {
      dockerfile = "Dockerfile"
      disable_entrypoint = true
    }
  }

  deploy {
    use "nomad" {
      // Nomad cluster address
      address = "http://nomad-server:4646"

      // Deployment settings
      datacenter = "dc1"
      namespace = "default"

      // Resource settings
      resources {
        cpu = 500
        memory = 256
      }

      // Service settings
      service_provider = "consul" {
        service_name = "api-service"
        tags = ["api", "v1"]

        check {
          type = "http"
          path = "/health"
          interval = "10s"
          timeout = "2s"
        }
      }
    }
  }
}
```

### 1.2 Advanced Nomad Configuration

```hcl
app "web-app" {
  deploy {
    use "nomad" {
      // Basic settings...

      // Volume settings
      volume_mount {
        volume = "app-data"
        destination = "/data"
        read_only = false
      }

      // Network settings
      network {
        mode = "bridge"
        port "http" {
          static = 8080
          to = 80
        }
      }

      // Environment variables
      env {
        NODE_ENV = "production"
      }

      // Health check
      health_check {
        timeout = "5m"
        check {
          name = "http-check"
          route = "/health"
          method = "GET"
          code = 200
        }
      }
    }
  }
}
```

## 2. Vault Integration

### 2.1 Reading Static Secrets from Vault

```hcl
app "database-service" {
  deploy {
    use "nomad" {
      // Basic settings...

      env {
        // Read the database credentials from Vault
        DB_USERNAME = dynamic("vault", {
          path = "kv/data/database/creds"
          key = "username"
        })

        DB_PASSWORD = dynamic("vault", {
          path = "kv/data/database/creds"
          key = "password"
        })
      }
    }
  }
}
```

### 2.2 Using Vault Dynamic Secrets

```hcl
app "api-service" {
  deploy {
    use "nomad" {
      // Basic settings...

      template {
        destination = "secrets/db-creds.txt"
        data = <<EOF
{{- with secret "database/creds/api-role" -}}
DB_USERNAME={{ .Data.username }}
DB_PASSWORD={{ .Data.password }}
{{- end -}}
EOF
      }

      env_from_file = ["secrets/db-creds.txt"]
    }
  }
}
```

## 3. Consul Integration

### 3.1 Service Discovery Configuration

```hcl
app "frontend" {
  deploy {
    use "nomad" {
      // Basic settings...

      service_provider = "consul" {
        service_name = "frontend"

        meta {
          version = "v1.2.3"
          team = "frontend"
        }

        tags = ["web", "frontend"]
      }
    }
  }
}
```

### 3.2 Storing Configuration in Consul KV

```hcl
app "config-service" {
  deploy {
    use "nomad" {
      // Basic settings...

      template {
        destination = "config/app-config.json"
        data = <<EOF
{
  "settings": {{ key "config/app-settings" | toJSON }},
  "features": {{ key "config/features" | toJSON }}
}
EOF
      }
    }
  }
}
```

## 4. Gitea Actions Integration

### 4.1 Basic CI/CD Pipeline

```yaml
name: Build and Deploy

on:
  push:
    branches: [ main ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Install Waypoint
        run: |
          curl -fsSL https://releases.hashicorp.com/waypoint/0.11.0/waypoint_0.11.0_linux_amd64.zip -o waypoint.zip
          unzip waypoint.zip
          sudo mv waypoint /usr/local/bin/

      - name: Configure Waypoint
        run: |
          waypoint context create \
            -server-addr=${{ secrets.WAYPOINT_SERVER_ADDR }} \
            -server-auth-token=${{ secrets.WAYPOINT_AUTH_TOKEN }} \
            -set-default ci-context

      - name: Build and Deploy
        run: waypoint up
```

### 4.2 Multi-Environment Deployment Pipeline

```yaml
name: Multi-Environment Deploy

on:
  push:
    branches: [ main, staging, production ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Install Waypoint
        run: |
          curl -fsSL https://releases.hashicorp.com/waypoint/0.11.0/waypoint_0.11.0_linux_amd64.zip -o waypoint.zip
          unzip waypoint.zip
          sudo mv waypoint /usr/local/bin/

      - name: Configure Waypoint
        run: |
          waypoint context create \
            -server-addr=${{ secrets.WAYPOINT_SERVER_ADDR }} \
            -server-auth-token=${{ secrets.WAYPOINT_AUTH_TOKEN }} \
            -set-default ci-context

      - name: Determine Environment
        id: env
        # Writes the step output to $GITHUB_OUTPUT; the older ::set-output command is deprecated.
        run: |
          if [[ "${{ github.ref }}" == 'refs/heads/main' ]]; then
            echo "environment=development" >> "$GITHUB_OUTPUT"
          elif [[ "${{ github.ref }}" == 'refs/heads/staging' ]]; then
            echo "environment=staging" >> "$GITHUB_OUTPUT"
          elif [[ "${{ github.ref }}" == 'refs/heads/production' ]]; then
            echo "environment=production" >> "$GITHUB_OUTPUT"
          fi

      - name: Build and Deploy
        run: |
          waypoint up -workspace=${{ steps.env.outputs.environment }}
```

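The branch-to-environment mapping in the workflow above can also live in a small shell function, which makes it easy to test outside the pipeline. The following is a minimal sketch under that assumption; `workspace_for_ref` is a hypothetical helper name, and the three branch names are the ones used in the workflow.

```bash
# Map a Git ref to the Waypoint workspace used by the multi-environment pipeline.
# Prints the workspace name, or fails for refs that have no mapping.
workspace_for_ref() {
  case "$1" in
    refs/heads/main)       echo "development" ;;
    refs/heads/staging)    echo "staging" ;;
    refs/heads/production) echo "production" ;;
    *)
      echo "unsupported ref: $1" >&2
      return 1
      ;;
  esac
}
```

A step could then call `waypoint up -workspace="$(workspace_for_ref "$GITHUB_REF")"`. Failing on unknown refs prevents a deployment from running with an empty workspace name.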
## 5. Multi-Cloud Deployment Examples

### 5.1 AWS ECS Deployment

```hcl
app "microservice" {
  build {
    use "docker" {}
  }

  deploy {
    use "aws-ecs" {
      region = "us-west-2"
      cluster = "production"

      service {
        name = "microservice"
        desired_count = 3

        load_balancer {
          target_group_arn = "arn:aws:elasticloadbalancing:us-west-2:..."
          container_name = "microservice"
          container_port = 8080
        }
      }
    }
  }
}
```

### 5.2 Google Cloud Run Deployment

```hcl
app "api" {
  build {
    use "docker" {}
  }

  deploy {
    use "google-cloud-run" {
      project = "my-gcp-project"
      location = "us-central1"

      port = 8080

      capacity {
        memory = 512
        cpu_count = 1
        max_requests_per_container = 10
        request_timeout = 300
      }

      auto_scaling {
        max_instances = 10
      }
    }
  }
}
```

### 5.3 Multi-Cloud Deployment Strategy

```hcl
// Use a variable to choose the deployment target
variable "deploy_target" {
  type = string
  default = "nomad"
}

app "multi-cloud-app" {
  build {
    use "docker" {}
  }

  deploy {
    // Select the deployment platform based on the variable
    use dynamic {
      value = var.deploy_target

      // Nomad deployment settings
      nomad {
        datacenter = "dc1"
        // Other Nomad settings...
      }

      // AWS ECS deployment settings
      aws-ecs {
        region = "us-west-2"
        cluster = "production"
        // Other ECS settings...
      }

      // Google Cloud Run deployment settings
      google-cloud-run {
        project = "my-gcp-project"
        location = "us-central1"
        // Other Cloud Run settings...
      }
    }
  }
}
```

## 6. Advanced Release Strategies

### 6.1 Blue-Green Deployment

```hcl
app "web-app" {
  build {
    use "docker" {}
  }

  deploy {
    use "nomad" {
      // Basic deployment settings...
    }
  }

  release {
    use "nomad-bluegreen" {
      service = "web-app"
      datacenter = "dc1"
      namespace = "default"

      // Traffic shifting settings
      traffic_step = 25 // Shift 25% of traffic per step
      confirm_step = true // Require confirmation at each step

      // Health check
      health_check {
        timeout = "2m"
        check {
          route = "/health"
          method = "GET"
        }
      }
    }
  }
}
```

### 6.2 Canary Release

```hcl
app "api-service" {
  build {
    use "docker" {}
  }

  deploy {
    use "nomad" {
      // Basic deployment settings...
    }
  }

  release {
    use "nomad-canary" {
      service = "api-service"
      datacenter = "dc1"

      // Canary settings
      canary {
        percentage = 10 // Release to 10% of instances first
        duration = "15m" // Observe for 15 minutes
      }

      // Automatic rollback
      auto_rollback = true

      // Metrics monitoring
      metrics {
        provider = "prometheus"
        address = "http://prometheus:9090"
        query = "sum(rate(http_requests_total{status=~\"5..\"}[5m])) / sum(rate(http_requests_total[5m])) > 0.01"
      }
    }
  }
}
```

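The Prometheus query in the canary example is true when the share of 5xx responses exceeds 1% of all requests. As a rough illustration of that comparison only, the sketch below applies the same rule to two request counts. `error_rate_exceeds` is a hypothetical helper; it does not query Prometheus, and treating zero traffic as "not exceeded" is an assumption of this sketch.

```bash
# error_rate_exceeds ERRORS TOTAL THRESHOLD
# Succeeds (exit 0) when ERRORS / TOTAL is strictly greater than THRESHOLD.
# With no traffic (TOTAL <= 0) the check never fires.
error_rate_exceeds() {
  awk -v e="$1" -v t="$2" -v th="$3" 'BEGIN {
    if (t <= 0) exit 1
    if ((e / t) > th) exit 0
    exit 1
  }'
}
```

For example, `error_rate_exceeds 5 100 0.01` succeeds because 5% is above the 1% threshold, which is the condition under which the canary above would be rolled back.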
## 7. Custom Plugin Examples

### 7.1 Custom Builder Plugin

```go
// custom_builder.go
package main

import (
	"context"
	"fmt"
	"os/exec"

	sdk "github.com/hashicorp/waypoint-plugin-sdk"
	"github.com/hashicorp/waypoint-plugin-sdk/terminal"
)

// CustomBuilder implements custom build logic.
type CustomBuilder struct {
	config BuildConfig
}

// BuildConfig holds the options accepted in the `use "custom"` block.
type BuildConfig struct {
	Command string `hcl:"command"`
}

// ConfigSet stores the decoded configuration.
func (b *CustomBuilder) ConfigSet(config interface{}) error {
	c, ok := config.(*BuildConfig)
	if !ok {
		return fmt.Errorf("invalid configuration")
	}
	b.config = *c
	return nil
}

// BuildFunc returns the function that performs the build.
func (b *CustomBuilder) BuildFunc() interface{} {
	return b.build
}

func (b *CustomBuilder) build(ctx context.Context, ui terminal.UI) (*Binary, error) {
	// Stream the command's output through the Waypoint UI.
	stdout, stderr, err := ui.OutputWriters()
	if err != nil {
		return nil, err
	}

	// Run the custom build command.
	cmd := exec.CommandContext(ctx, "sh", "-c", b.config.Command)
	cmd.Stdout = stdout
	cmd.Stderr = stderr

	if err := cmd.Run(); err != nil {
		return nil, err
	}

	return &Binary{
		Source: "custom",
	}, nil
}

// Register the plugin.
func main() {
	sdk.Main(sdk.WithComponents(&CustomBuilder{}))
}
```

### 7.2 Using the Custom Plugin

```hcl
app "custom-app" {
  build {
    use "custom" {
      command = "make build"
    }
  }

  deploy {
    use "nomad" {
      // Deployment settings...
    }
  }
}
```

## 8. Monitoring and Observability Integration

### 8.1 Prometheus Integration

```hcl
app "monitored-app" {
  deploy {
    use "nomad" {
      // Basic settings...

      // Prometheus annotations
      service_provider = "consul" {
        service_name = "monitored-app"

        meta {
          "prometheus.io/scrape" = "true"
          "prometheus.io/path" = "/metrics"
          "prometheus.io/port" = "8080"
        }
      }
    }
  }
}
```

### 8.2 ELK Stack Integration

```hcl
app "logging-app" {
  deploy {
    use "nomad" {
      // Basic settings...

      // Logging settings
      logging {
        type = "fluentd"
        config {
          fluentd_address = "fluentd.service.consul:24224"
          tag = "app.${nomad.namespace}.${app.name}"
        }
      }
    }
  }
}
```

## 9. Local Development Workflow

### 9.1 Local Development Configuration

```hcl
app "dev-app" {
  build {
    use "docker" {}
  }

  deploy {
    use "docker" {
      service_port = 3000

      // Development-specific settings
      env {
        NODE_ENV = "development"
        DEBUG = "true"
      }

      // Mount the source directory
      binds {
        source = abspath("./src")
        destination = "/app/src"
      }
    }
  }
}
```

### 9.2 Switching Between Local and Remote Environments

```hcl
variable "environment" {
  type = string
  default = "local"
}

app "fullstack-app" {
  build {
    use "docker" {}
  }

  deploy {
    // Select the deployment method based on the environment variable
    use dynamic {
      value = var.environment

      // Local development
      local {
        use "docker" {
          // Local Docker settings...
        }
      }

      // Development environment
      dev {
        use "nomad" {
          // Development Nomad settings...
        }
      }

      // Production environment
      prod {
        use "nomad" {
          // Production Nomad settings...
        }
      }
    }
  }
}
```

## 10. Multi-Application Coordination

### 10.1 Dependency Management

```hcl
project = "microservices"

app "database" {
  // Database service settings...
}

app "backend" {
  // Backend API settings...

  // Declare the dependency
  depends_on = ["database"]
}

app "frontend" {
  // Frontend settings...

  // Declare the dependency
  depends_on = ["backend"]
}
```

### 10.2 Shared Configuration

```hcl
// Define shared variables
variable "version" {
  type = string
  default = "1.0.0"
}

variable "environment" {
  type = string
  default = "development"
}

// Shared function
function "service_name" {
  params = [name]
  result = "${var.environment}-${name}"
}

// Application configuration
app "api" {
  build {
    use "docker" {
      tag = "${var.version}"
    }
  }

  deploy {
    use "nomad" {
      service_provider = "consul" {
        service_name = service_name("api")
      }

      env {
        APP_VERSION = var.version
        ENVIRONMENT = var.environment
      }
    }
  }
}
```

331
docs/waypoint/waypoint_setup_guide.md
Normal file
@@ -0,0 +1,331 @@
|
||||
# Waypoint Deployment and Configuration Guide

This document provides detailed steps for deploying and configuring HashiCorp Waypoint on the existing infrastructure.

## 1. Prerequisites

### 1.1 Create the Data Directory

Create the data directory on the Waypoint server node:

```bash
sudo mkdir -p /opt/waypoint/data
sudo chown -R nomad:nomad /opt/waypoint
```

### 1.2 Install the Waypoint CLI

Install the Waypoint CLI on development machines and CI/CD servers:

```bash
curl -fsSL https://releases.hashicorp.com/waypoint/0.11.0/waypoint_0.11.0_linux_amd64.zip -o waypoint.zip
unzip waypoint.zip
sudo mv waypoint /usr/local/bin/
```

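Before unpacking a downloaded release archive, it can be worth checking it against a checksum list. The sketch below assumes a checksum file in the usual `sha256sum` format (`<hash>  <filename>` per line), such as the `SHA256SUMS` file HashiCorp typically publishes next to each release; `verify_sha256` is a hypothetical helper name, and the archive must keep the file name that appears in the checksum list.

```bash
# verify_sha256 FILE SUMS_FILE
# Succeeds only if SUMS_FILE lists FILE's base name and the listed hash
# matches the SHA-256 of FILE.
verify_sha256() {
  local file="$1"
  local sums="$2"
  local name expected actual
  name="$(basename "$file")"
  expected="$(awk -v f="$name" '$2 == f { print $1 }' "$sums")"
  actual="$(sha256sum "$file" | awk '{ print $1 }')"
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A typical call would be `verify_sha256 waypoint_0.11.0_linux_amd64.zip waypoint_0.11.0_SHA256SUMS && unzip waypoint_0.11.0_linux_amd64.zip`, so that extraction only runs when the hashes match.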
## 2. Deploy the Waypoint Server

### 2.1 Deploy with Nomad

Submit the `waypoint-server.nomad` file to Nomad:

```bash
nomad job run waypoint-server.nomad
```

### 2.2 Verify the Deployment

```bash
# Check the Nomad job status
nomad job status waypoint-server

# Check that the Waypoint UI is reachable
curl -I http://warden:9701
```

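Right after the job is submitted, the UI may not answer on the first request, so a single `curl` can fail even when the deployment is healthy. One way to handle this is a small retry wrapper, sketched below; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max="$1"
  local delay="$2"
  shift 2
  local attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With the address used above, `retry 30 2 curl -fsS -o /dev/null http://warden:9701` would poll the UI for roughly a minute before giving up.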
## 3. Initialize Waypoint

### 3.1 Connect to the Waypoint Server

```bash
# Connect the CLI to the server
waypoint context create \
  -server-addr=warden:9703 \
  -server-tls-skip-verify \
  -set-default my-waypoint-server
```

### 3.2 Verify the Connection

```bash
waypoint context verify
waypoint server info
```

## 4. Configure Waypoint

### 4.1 Configure Nomad as the Runtime Platform

```bash
# Configure the Nomad connection
waypoint config source-set -type=nomad nomad-platform \
  addr=http://localhost:4646
```

### 4.2 Configure the Vault Integration

```bash
# Configure the Vault integration
waypoint config source-set -type=vault vault-secrets \
  addr=http://localhost:8200 \
  token=<vault-token>
```

## 5. Create the First Waypoint Project

### 5.1 Create the Project Configuration File

Create a `waypoint.hcl` file in the application source directory:

```hcl
project = "example-app"

app "web" {
  build {
    use "docker" {
      dockerfile = "Dockerfile"
    }
  }

  deploy {
    use "nomad" {
      datacenter = "dc1"
      namespace = "default"

      service_provider = "consul" {
        service_name = "web"
      }
    }
  }
}
```

### 5.2 Initialize and Deploy the Project

```bash
# Initialize the project
cd /path/to/app
waypoint init

# Deploy the application
waypoint up
```

## 6. Integration with Existing Tools

### 6.1 Gitea Actions Integration

Create a Gitea Actions workflow file at `.gitea/workflows/waypoint.yml`:

```yaml
name: Waypoint Deploy

on:
  push:
    branches: [ main ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Install Waypoint
        run: |
          curl -fsSL https://releases.hashicorp.com/waypoint/0.11.0/waypoint_0.11.0_linux_amd64.zip -o waypoint.zip
          unzip waypoint.zip
          sudo mv waypoint /usr/local/bin/

      - name: Configure Waypoint
        run: |
          waypoint context create \
            -server-addr=${{ secrets.WAYPOINT_SERVER_ADDR }} \
            -server-auth-token=${{ secrets.WAYPOINT_AUTH_TOKEN }} \
            -set-default ci-context

      - name: Deploy Application
        run: waypoint up -app=web
```

### 6.2 Vault Integration

Use Vault in `waypoint.hcl` to read sensitive configuration:

```hcl
app "web" {
  deploy {
    use "nomad" {
      # Other settings...

      env {
        DB_PASSWORD = dynamic("vault", {
          path = "kv/data/app/db"
          key = "password"
        })
      }
    }
  }
}
```

## 7. Advanced Configuration

### 7.1 Configure Blue-Green Deployment

```hcl
app "web" {
  deploy {
    use "nomad" {
      # Basic settings...
    }
  }

  release {
    use "nomad-bluegreen" {
      service = "web"
      datacenter = "dc1"
      namespace = "default"
      traffic_step = 25
      confirm_step = true
    }
  }
}
```

### 7.2 Configure Canary Release

```hcl
app "web" {
  deploy {
    use "nomad" {
      # Basic settings...
    }
  }

  release {
    use "nomad-canary" {
      service = "web"
      datacenter = "dc1"
      namespace = "default"

      canary {
        percentage = 10
        duration = "5m"
      }
    }
  }
}
```

### 7.3 Configure Automatic Rollback

```hcl
app "web" {
  deploy {
    use "nomad" {
      # Basic settings...

      health_check {
        timeout = "5m"
        check {
          name = "http-check"
          route = "/health"
          method = "GET"
          code = 200
        }
      }
    }
  }
}
```

## 8. Monitoring and Logs

### 8.1 View Deployment Status

```bash
# List all projects
waypoint project list

# List the deployments of a specific application
waypoint deployment list -app=web

# Show deployment details
waypoint deployment inspect <deployment-id>
```

### 8.2 View Application Logs

```bash
# Show application logs
waypoint logs -app=web
```

## 9. Backup and Restore

### 9.1 Back Up Waypoint Data

```bash
# Back up the data directory
tar -czf waypoint-backup.tar.gz /opt/waypoint/data
```

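For recurring backups, a timestamped archive name avoids overwriting the previous backup. The function below is a minimal sketch of that idea; `backup_waypoint_data` and the destination directory are hypothetical, and unlike the command above it stores paths relative to the parent of the data directory rather than absolute paths.

```bash
# backup_waypoint_data DATA_DIR DEST_DIR
# Archives DATA_DIR into DEST_DIR/waypoint-backup-<timestamp>.tar.gz,
# checks that the archive can be listed, and prints the archive path.
backup_waypoint_data() {
  local data_dir="$1"
  local dest_dir="$2"
  local stamp archive
  if [ ! -d "$data_dir" ]; then
    echo "missing data directory: $data_dir" >&2
    return 1
  fi
  stamp="$(date +%Y%m%d-%H%M%S)"
  archive="${dest_dir}/waypoint-backup-${stamp}.tar.gz"
  mkdir -p "$dest_dir" || return 1
  tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1
  tar -tzf "$archive" > /dev/null || return 1
  echo "$archive"
}
```

A call such as `backup_waypoint_data /opt/waypoint/data /var/backups/waypoint` produces an archive whose top-level entry is `data/`, so it would be restored with `tar -xzf <archive> -C /opt/waypoint`.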
### 9.2 Restore Waypoint Data

```bash
# Stop the Waypoint service
nomad job stop waypoint-server

# Restore the data
rm -rf /opt/waypoint/data/*
tar -xzf waypoint-backup.tar.gz -C /

# Restart the service
nomad job run waypoint-server.nomad
```

## 10. Troubleshooting

### 10.1 Common Problems

1. **Connection problems**:
   - Check that the Waypoint server is running
   - Verify network connectivity and firewall rules

2. **Deployment failures**:
   - Check the Nomad cluster status
   - Review the detailed deployment logs: `waypoint logs -app=<app> -deploy=<deployment-id>`

3. **Permission problems**:
   - Make sure Waypoint has sufficient permissions to access Nomad and Vault

### 10.2 Debugging Commands

```bash
# Check the Waypoint server status
waypoint server info

# Verify the Nomad connection
waypoint config source-get nomad-platform

# Enable debug logging
WAYPOINT_LOG=debug waypoint up
```

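## 11. Best Practices

1. **Modular configuration**: Extract common configuration into reusable Waypoint plugins
2. **Environment variables**: Use environment variables to distinguish the configuration of different environments
3. **Version control**: Keep the `waypoint.hcl` file under version control
4. **Automated testing**: Add automated test steps before deployment
5. **Monitoring integration**: Integrate deployment status with the monitoring system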