# Vault and Consul Integration Best Practices

## 1. Architecture Design

### 1.1 High-Availability Architecture
- **Vault cluster**: 3 nodes (1 active + 2 standby)
- **Consul cluster**: 3 server nodes (1 leader + 2 followers)
- **Network**: Tailscale secure network
- **Storage**: Consul as the Vault storage backend

### 1.2 Node Layout
```
Vault nodes:
- ch4.tailnet-68f9.ts.net:8200 (active)
- ash3c.tailnet-68f9.ts.net:8200 (standby)
- warden.tailnet-68f9.ts.net:8200 (standby)

Consul nodes:
- ch4.tailnet-68f9.ts.net:8500 (leader)
- ash3c.tailnet-68f9.ts.net:8500 (follower)
- warden.tailnet-68f9.ts.net:8500 (follower)
```
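
To see which Vault node currently holds which role, the `/v1/sys/health` endpoint can be polled on each node. The sketch below assumes TLS is enabled on port 8200 and that the endpoint uses its default status codes; the function names `vault_health_state` and `check_vault_nodes` are hypothetical helpers, not part of Vault.

```bash
# Hypothetical helper: map the HTTP status of /v1/sys/health to a node state
# (default status codes of the Vault health endpoint).
vault_health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unreachable" ;;
  esac
}

# Query each Vault node from the layout above and print its state.
check_vault_nodes() {
  for node in ch4 ash3c warden; do
    # curl reports 000 when the connection fails, which maps to "unreachable".
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
      "https://${node}.tailnet-68f9.ts.net:8200/v1/sys/health" || true)"
    echo "${node}: $(vault_health_state "$code")"
  done
}
```

Calling `check_vault_nodes` from any machine on the tailnet prints one line per node, which is usually enough to confirm that exactly one node is active.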

## 2. Vault Configuration Best Practices

### 2.1 Storage Backend Configuration
```hcl
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"

  # High-availability settings
  datacenter   = "dc1"
  service      = "vault"
  service_tags = "vault-server"

  # Session settings (used for the leader lock)
  session_ttl    = "15s"
  lock_wait_time = "15s"

  # Consistency settings
  consistency_mode = "strong"

  # Concurrency and service registration settings
  max_parallel         = 128
  disable_registration = false
}
```

### 2.2 Listener Configuration
```hcl
listener "tcp" {
  address = "0.0.0.0:8200"

  # Cluster (node-to-node) address; Vault secures this port with its own
  # mutual TLS, so it does not need a separate listener block
  cluster_address = "0.0.0.0:8201"

  # Enable TLS in production
  tls_cert_file   = "/opt/vault/tls/vault.crt"
  tls_key_file    = "/opt/vault/tls/vault.key"
  tls_min_version = "tls12"
}
```

### 2.3 Cluster Configuration
```hcl
# API address, reachable over the Tailscale network
# ({{ ansible_host }} is an Ansible template variable rendered at deploy time)
api_addr = "https://{{ ansible_host }}:8200"

# Cluster address, reachable over the Tailscale network
cluster_addr = "https://{{ ansible_host }}:8201"

# Cluster name
cluster_name = "vault-cluster"

# Keep mlock enabled (recommended in production)
disable_mlock = false

# Logging
log_level  = "INFO"
log_format = "json"
```

## 3. Consul Configuration Best Practices

### 3.1 Service Registration Configuration
```hcl
services {
  name = "vault"
  tags = ["vault-server", "secrets"]
  port = 8200

  check {
    name     = "vault-health"
    # The Vault listener serves TLS, so the health check must use https
    http     = "https://127.0.0.1:8200/v1/sys/health"
    interval = "10s"
    timeout  = "3s"
  }
}
```

### 3.2 ACL Configuration
```hcl
acl {
  enabled                  = true
  default_policy           = "deny"
  enable_token_persistence = true

  # Token that carries the Vault service permissions
  tokens {
    default = "{{ vault_consul_token }}"
  }
}
```
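
With `default_policy = "deny"`, the token used by Vault needs an explicit policy. The following is a minimal sketch of how such a token could be created with the `consul acl` CLI. The policy name `vault-storage` and the helper function names are assumptions, the rules follow the commonly documented policy for a Vault storage path of `vault/`, and a management token is assumed to be available in `CONSUL_HTTP_TOKEN`.

```bash
# Hypothetical helper: print a Consul ACL policy for the Vault storage backend.
render_vault_policy() {
  cat <<'EOF'
key_prefix "vault/" {
  policy = "write"
}
service "vault" {
  policy = "write"
}
agent_prefix "" {
  policy = "read"
}
session_prefix "" {
  policy = "write"
}
EOF
}

# Create the policy and a token bound to it.
# Assumes CONSUL_HTTP_TOKEN holds a management token.
create_vault_consul_token() {
  policy_file="$(mktemp)"
  render_vault_policy > "$policy_file"
  consul acl policy create -name "vault-storage" -rules "@${policy_file}"
  consul acl token create -description "Vault storage backend" -policy-name "vault-storage"
  rm -f "$policy_file"
}
```

The `SecretID` printed by `consul acl token create` is the value that would be supplied as `vault_consul_token`.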

## 4. Security Best Practices

### 4.1 TLS Configuration
- Use TLS for all communication between Vault nodes
- Use TLS for communication between Consul nodes
- Use TLS for client-to-Vault communication

### 4.2 Authentication Configuration
```hcl
# Illustrative layout of the auth methods to enable.
# Vault does not read auth methods from the server configuration file;
# they are enabled at runtime through the CLI or API (`vault auth enable ...`).
auth {
  enabled = true

  # AppRole auth
  approle {
    enabled = true
  }

  # LDAP auth
  ldap {
    enabled = true
    url     = "ldap://authentik.tailnet-68f9.ts.net:389"
    userdn  = "ou=users,dc=authentik,dc=local"
    groupdn = "ou=groups,dc=authentik,dc=local"
  }

  # OIDC auth
  oidc {
    enabled            = true
    oidc_discovery_url = "https://authentik1.git-4ta.live/application/o/vault/"
  }
}
```
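
The same three auth methods can be enabled at runtime with the Vault CLI. This is a sketch under the assumption that `VAULT_ADDR` and a sufficiently privileged `VAULT_TOKEN` are already exported; `enable_auth_methods`, `OIDC_CLIENT_ID`, and `OIDC_CLIENT_SECRET` are hypothetical names introduced for illustration.

```bash
# Enable and configure AppRole, LDAP, and OIDC auth at runtime.
enable_auth_methods() {
  vault auth enable approle

  vault auth enable ldap
  vault write auth/ldap/config \
    url="ldap://authentik.tailnet-68f9.ts.net:389" \
    userdn="ou=users,dc=authentik,dc=local" \
    groupdn="ou=groups,dc=authentik,dc=local"

  vault auth enable oidc
  # The client ID and secret come from the identity provider (assumed variables).
  vault write auth/oidc/config \
    oidc_discovery_url="https://authentik1.git-4ta.live/application/o/vault/" \
    oidc_client_id="${OIDC_CLIENT_ID:-}" \
    oidc_client_secret="${OIDC_CLIENT_SECRET:-}"
}
```

Running `vault auth list` afterwards shows the mounted methods and their paths.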
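
## 5. Monitoring and Auditing

### 5.1 Audit Logs
```hcl
# Illustrative layout of the audit devices to enable.
# Audit devices are normally enabled at runtime (`vault audit enable ...`),
# not in the server configuration file.
audit {
  enabled = true

  # File audit device
  file {
    path   = "/opt/vault/logs/audit.log"
    format = "json"
  }

  # Syslog audit device
  syslog {
    facility = "AUTH"
    tag      = "vault"
  }
}
```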
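
As a sketch of the runtime equivalent, the two audit devices can be enabled with `vault audit enable`. The function name `enable_audit_devices` is hypothetical, and the log directory is assumed to exist and be writable by the Vault process.

```bash
# Enable the file and syslog audit devices, then list what is active.
enable_audit_devices() {
  vault audit enable file file_path=/opt/vault/logs/audit.log format=json
  vault audit enable syslog tag="vault" facility="AUTH"
  vault audit list -detailed
}
```

Enabling more than one device is a deliberate choice: Vault refuses requests when no enabled audit device can record them, so a second device keeps the cluster serving if the log file becomes unwritable.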

### 5.2 Telemetry Configuration
```hcl
telemetry {
  prometheus_retention_time = "30s"
  disable_hostname          = false

  # Metric name prefix
  metrics_prefix = "vault"
}
```
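
Once `prometheus_retention_time` is set, metrics can be read from `/v1/sys/metrics` in Prometheus format. The sketch below assumes `VAULT_ADDR` and a token with read access to that path in `VAULT_TOKEN`; the function names are hypothetical.

```bash
# Fetch metrics in Prometheus text format from the local Vault node.
fetch_vault_metrics() {
  curl -s --header "X-Vault-Token: ${VAULT_TOKEN:-}" \
    "${VAULT_ADDR:-https://127.0.0.1:8200}/v1/sys/metrics?format=prometheus"
}

# Count the samples whose metric name starts with the "vault" prefix.
count_vault_metrics() {
  # grep -c exits non-zero when nothing matches, so keep the function successful.
  fetch_vault_metrics | grep -c '^vault_' || true
}
```

A count of zero after deployment is a quick signal that telemetry is not configured or the token lacks access.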
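
## 6. Backup and Recovery

### 6.1 Automated Backup Script
```bash
#!/bin/bash
# /opt/vault/scripts/backup.sh
set -euo pipefail

# Export both variables so the vault CLI can read them
export VAULT_ADDR="https://vault.git-4ta.live"
VAULT_TOKEN="$(cat /opt/vault/token)"
export VAULT_TOKEN

# Create a snapshot.
# Note: `vault operator raft snapshot` requires integrated (Raft) storage; with the
# Consul storage backend, Vault data is captured by the Consul snapshot in section 6.2.
vault operator raft snapshot save "/opt/vault/backups/vault-$(date +%Y%m%d-%H%M%S).snapshot"

# Remove old backups (keep 7 days)
find /opt/vault/backups -name "vault-*.snapshot" -mtime +7 -delete
```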
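
### 6.2 Consul Snapshot
```bash
#!/bin/bash
# /opt/consul/scripts/backup.sh
set -euo pipefail

# The consul CLI reads the agent address from CONSUL_HTTP_ADDR.
# With ACLs enabled, CONSUL_HTTP_TOKEN must also be exported.
export CONSUL_HTTP_ADDR="http://127.0.0.1:8500"

# Create a Consul snapshot
consul snapshot save "/opt/consul/backups/consul-$(date +%Y%m%d-%H%M%S).snapshot"
```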
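
A backup is only useful if it can be read back, so one possible extension is to inspect each snapshot after saving it and to apply the same 7-day retention used for the Vault backups. This is a sketch with hypothetical function names; it assumes GNU `find` (for `-delete`) and that the `consul` environment variables are already exported.

```bash
# Hypothetical helper: delete Consul snapshots older than N days.
# $1 = backup directory, $2 = retention in days
prune_snapshots() {
  find "$1" -name "consul-*.snapshot" -mtime +"$2" -delete
}

# Save a snapshot, check that it is readable, then prune old ones.
# $1 = backup directory
backup_consul() {
  snapshot="$1/consul-$(date +%Y%m%d-%H%M%S).snapshot"
  consul snapshot save "$snapshot" || return 1
  # inspect fails on a truncated or corrupt file, so pruning only runs
  # after a verified snapshot exists.
  consul snapshot inspect "$snapshot" || return 1
  prune_snapshots "$1" 7
}
```

For example, `backup_consul /opt/consul/backups` could be called from a cron job or systemd timer.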
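
## 7. Failover and Disaster Recovery

### 7.1 Automatic Failover
- Vault elects a new active node automatically: with the Consul storage backend, a standby node takes over by acquiring the leader lock in Consul
- Consul elects a new leader automatically using the Raft protocol
- Clients reconnect to the new active node; standby nodes forward or redirect requests to it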

### 7.2 Disaster Recovery Procedure
1. Stop all Vault nodes
2. Restore the data from Consul (for example, from a Consul snapshot)
3. Start the Vault cluster
4. Verify service status
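
The four steps above can be sketched as a script. This assumes SSH access with sudo to every node, a systemd unit named `vault` (as in the `journalctl -u vault` usage elsewhere in this document), and a snapshot created with `consul snapshot save`; the function and variable names are hypothetical.

```bash
# Vault nodes from section 1.2 (hostnames only, no port).
VAULT_NODES="ch4.tailnet-68f9.ts.net ash3c.tailnet-68f9.ts.net warden.tailnet-68f9.ts.net"

# Run a systemctl action ($1) for the vault unit on every node.
vault_service_all() {
  for node in $VAULT_NODES; do
    ssh "$node" sudo systemctl "$1" vault || return 1
  done
}

# $1 = path to a Consul snapshot file
recover_vault() {
  # 1. Stop all Vault nodes.
  vault_service_all stop || return 1
  # 2. Restore the Consul data that holds Vault's storage.
  consul snapshot restore "$1" || return 1
  # 3. Start the Vault cluster.
  vault_service_all start || return 1
  # 4. Verify status; nodes stay sealed until unsealed (manually or via auto-unseal).
  vault status
}
```

The steps are chained with early returns so that a failed restore never leads to Vault being started against partial data.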
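
## 8. Performance Optimization

### 8.1 Cache Configuration
```hcl
# Note: the cache stanza belongs to the Vault Agent configuration, not the Vault server,
# and the "kubernetes" persistence type only works when the agent runs in Kubernetes.
cache {
  enabled = true
  size    = 1000
  persist {
    type = "kubernetes"
    path = "/opt/vault/cache"
  }
}
```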

### 8.2 Connection Pool Configuration
```hcl
storage "consul" {
  # Connection pool settings (max_parallel caps concurrent requests to Consul)
  max_parallel            = 128
  max_requests_per_second = 100
}
```

## 9. Deployment Checklist

### 9.1 Pre-Deployment Checks
- [ ] Consul cluster is healthy
- [ ] Network connectivity tested
- [ ] TLS certificates configured
- [ ] Firewall rules configured
- [ ] Storage space checked
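
The first checklist item can be automated in a rough way by counting the Consul servers that gossip reports as alive and comparing that number with the Raft quorum size, which is `floor(n / 2) + 1` (2 for the 3-node cluster described here). The function names below are hypothetical, and the column positions assume the default tabular output of `consul members`.

```bash
# Hypothetical helper: number of servers required for a Raft quorum.
# $1 = total number of servers
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds when enough Consul servers are alive to form a quorum.
# $1 = expected number of servers (3 in this document)
consul_has_quorum() {
  # Column 3 is Status and column 4 is Type in `consul members` output.
  alive="$(consul members |
    awk '$3 == "alive" && $4 == "server" { n++ } END { print n + 0 }')"
  [ "$alive" -ge "$(quorum_size "$1")" ]
}
```

For example, `consul_has_quorum 3 || echo "Consul quorum at risk"` could run as a gate before the Vault rollout.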

### 9.2 Post-Deployment Verification
- [ ] Vault cluster status checked
- [ ] Service registration verified
- [ ] Authentication tested
- [ ] Backups tested
- [ ] Monitoring metrics verified
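
## 10. Common Issues and Solutions

### 10.1 Common Issues
1. **Vault cannot connect to Consul**
   - Check network connectivity
   - Verify the Consul service status
   - Check ACL permissions

2. **Cluster split (split-brain)**
   - Check for network partitions
   - Verify Raft log consistency
   - Run the disaster recovery procedure

3. **Performance problems**
   - Tune the connection pool size
   - Enable caching
   - Optimize the network configuration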

### 10.2 Troubleshooting Commands
```bash
# Check Vault status
vault status

# Check Consul members
consul members

# Check service registration
consul catalog services

# Follow the Vault logs
journalctl -u vault -f

# Follow the Consul logs
journalctl -u consul -f
```