Clean repository: organized structure and GitOps setup
- Organized root directory structure
- Moved orphan files to proper locations
- Updated .gitignore to ignore temporary files
- Set up Gitea Runner for GitOps automation
- Fixed Tailscale access issues
- Added workflow for automated Nomad deployment
196
docs/setup/consul-terraform-integration.md
Normal file
@@ -0,0 +1,196 @@
# Consul + Terraform Integration Guide

This guide explains how to use Consul to manage sensitive Terraform configuration securely, in particular Oracle Cloud credentials.

## Overview

We use Consul as a secure secret store so that sensitive values are never exposed directly in Terraform configuration files.

## Architecture

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Terraform    │───▶│     Consul      │───▶│  Oracle Cloud   │
│                 │    │ (secret store)  │    │                 │
│ consul provider │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

## Prerequisites

1. A Consul cluster is running
2. The Consul API is reachable (default: http://localhost:8500)
3. curl and Terraform are installed

## Quick Start

### 1. Start the Consul cluster

The cluster has been migrated from Docker Swarm to Nomad + Podman, so deploy the Consul cluster with Nomad:

```bash
nomad run /root/mgmt/consul-cluster-nomad.nomad
```

### 2. Store the Oracle Cloud configuration

```bash
# Store the configuration with the secrets-management script
./scripts/utilities/consul-secrets-manager.sh set-oracle
```

The script prompts for:
- Tenancy OCID
- User OCID
- API key fingerprint
- Path to the private key file
- Compartment OCID

### 3. Configure Terraform

```bash
# Set up the Terraform Consul provider
./scripts/utilities/terraform-consul-provider.sh setup
```

### 4. Verify the configuration

```bash
# Show the configuration stored in Consul
./scripts/utilities/consul-secrets-manager.sh get-oracle
```

### 5. Run Terraform

```bash
cd infrastructure/environments/dev

# Initialize Terraform
terraform init

# Plan the deployment
terraform plan

# Apply the configuration
terraform apply
```

## Details

### Consul key storage layout

```
config/
└── dev/
    └── oracle/
        ├── tenancy_ocid
        ├── user_ocid
        ├── fingerprint
        ├── private_key
        └── compartment_ocid
```
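
The layout above can also be enumerated from the shell, for example to loop over the keys when checking that all of them exist. This is a minimal sketch: the function name `oracle_kv_keys` is hypothetical and is not part of the repository's scripts.

```bash
# Hypothetical helper: print the full Consul KV path of every Oracle
# credential for one environment, following the tree shown above.
oracle_kv_keys() (
  env=${1:-dev}   # environment segment; defaults to dev
  for key in tenancy_ocid user_ocid fingerprint private_key compartment_ocid; do
    printf 'config/%s/oracle/%s\n' "$env" "$key"
  done
)

oracle_kv_keys dev
```

Passing a different environment name (for example `oracle_kv_keys prod`) only changes the second path segment, which is what keeps environments isolated from each other.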

### Script commands

#### consul-secrets-manager.sh

- `set-oracle`: store the Oracle Cloud configuration in Consul
- `get-oracle`: read the configuration back from Consul
- `delete-oracle`: delete the configuration from Consul
- `generate-vars`: generate a temporary Terraform variables file
- `cleanup`: remove temporary files

#### terraform-consul-provider.sh

- `setup`: create the Terraform Consul provider configuration file

### Security features

1. **Isolation of sensitive data**: private keys and other sensitive values are stored only in Consul
2. **Temporary files**: the private key file is created only temporarily, while Terraform runs
3. **Permission control**: the temporary private key file is created with mode 600
4. **Automatic cleanup**: a cleanup command removes the temporary files
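
The temporary-file handling described in items 2 to 4 can be sketched as follows. This only illustrates the pattern; it is not the code in `consul-secrets-manager.sh`, and the function name `write_temp_key` and the `oci_key.XXXXXX` file name template are assumptions.

```bash
# Hypothetical helper: write key material from stdin to a fresh temporary
# file that only the owner can read or write, and print the file's path.
write_temp_key() (
  umask 077                          # files created from here on get no group/other bits
  tmp=$(mktemp "${TMPDIR:-/tmp}/oci_key.XXXXXX") || exit 1
  cat > "$tmp"                       # copy stdin into the temporary file
  chmod 600 "$tmp"                   # make the mode explicit
  printf '%s\n' "$tmp"
)

# Usage sketch (commented out so that sourcing this file has no side effects):
#   key_file=$(consul kv get config/dev/oracle/private_key | write_temp_key)
#   trap 'rm -f "$key_file"' EXIT    # remove the key when the shell exits
```

Registering the `trap` right after the file is created means the key is removed even when a later command fails.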

## Environment variables

```bash
# Consul address
export CONSUL_ADDR="http://localhost:8500"

# Consul ACL token (if ACLs are enabled)
export CONSUL_TOKEN="your-token"

# Environment name
export ENVIRONMENT="dev"
```
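
As an illustration of how these variables might be combined, the sketch below builds the KV API URL for one key and falls back to the defaults shown above when a variable is unset. The function name `consul_kv_url` is hypothetical; the repository scripts may assemble URLs differently.

```bash
# Hypothetical helper: build the KV API URL for one key, using
# CONSUL_ADDR and ENVIRONMENT with the documented defaults.
consul_kv_url() (
  addr=${CONSUL_ADDR:-http://localhost:8500}
  env=${ENVIRONMENT:-dev}
  # ${addr%/} drops a trailing slash so the URL never contains "//v1"
  printf '%s/v1/kv/config/%s/%s\n' "${addr%/}" "$env" "$1"
)

consul_kv_url oracle/tenancy_ocid
```

When ACLs are enabled, the token is sent in the `X-Consul-Token` header, for example `curl -H "X-Consul-Token: $CONSUL_TOKEN" "$(consul_kv_url oracle/fingerprint)"`.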

## Troubleshooting

### Consul connection problems

```bash
# Check Consul status
curl http://localhost:8500/v1/status/leader

# Check the Consul service (the cluster now runs under Nomad + Podman, not Docker)
podman ps | grep consul
```

### Verifying the configuration

```bash
# Verify the configuration stored in Consul (the URL is quoted so the shell does not interpret "?")
curl "http://localhost:8500/v1/kv/config/dev/oracle?recurse"

# Check the Terraform configuration
terraform validate
```
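
The KV HTTP API returns each value base64-encoded in the `Value` field of the JSON response, so the raw `curl` output above is not directly readable. A minimal sketch for decoding a single value follows; the helper name `decode_kv_value` is hypothetical.

```bash
# Hypothetical helper: decode one base64 string copied from the "Value"
# field of a /v1/kv response.
decode_kv_value() {
  printf '%s' "$1" | base64 -d
}

# "bXktYXBwbGljYXRpb24=" is the base64 encoding of "my-application".
decode_kv_value 'bXktYXBwbGljYXRpb24='
```

For a single key, appending `?raw` to the URL asks Consul to return the undecorated value, which avoids the decoding step entirely.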

### Cleanup and reset

```bash
# Remove temporary files
./scripts/utilities/consul-secrets-manager.sh cleanup

# Delete the configuration from Consul
./scripts/utilities/consul-secrets-manager.sh delete-oracle
```

## Best Practices

1. **Rotate keys regularly**: update the Oracle Cloud API key on a regular schedule
2. **Back up configuration**: back up Consul data regularly
3. **Monitor access**: monitor the access logs for Consul keys
4. **Isolate environments**: use a separate Consul path for each environment

## Extending to other cloud providers

Consul integration can be added for other cloud providers in the same way:

```
# Huawei Cloud configuration paths
config/dev/huawei/access_key
config/dev/huawei/secret_key

# AWS configuration paths
config/dev/aws/access_key
config/dev/aws/secret_key

# Google Cloud configuration path
config/dev/gcp/service_account_key
```

## Related files

- `infrastructure/environments/dev/terraform.tfvars` - Terraform variable configuration
- `scripts/utilities/consul-secrets-manager.sh` - Consul secrets-management script
- `scripts/utilities/terraform-consul-provider.sh` - Terraform Consul provider configuration script
- `swarm/configs/traefik-consul-setup.yml` - Consul cluster configuration

## Support

If you run into problems, check that:
1. The Consul cluster is running normally
2. Network connectivity is working
3. Permissions are set correctly
4. Environment variables are set correctly

690
docs/setup/consul_variables_and_storage_guide.md
Normal file
@@ -0,0 +1,690 @@
# Consul Variables and Storage Configuration Guide

This document describes how to configure Consul variables and storage to make the cluster more capable and more reliable.

## Overview

Two Consul capabilities are covered here:
1. **Variables**: store configuration values, feature flags, application parameters, and similar data
2. **Storage**: persist data, snapshots, and backups

## Variables Configuration

### Naming convention

We follow a single naming convention for configuration stored in the Consul KV store:

```
config/{environment}/{provider}/{region_or_service}/{key}
```

The segments are:
- **config**: fixed prefix that marks the entry as configuration
- **environment**: environment name, such as `dev`, `staging`, or `prod`
- **provider**: cloud provider, such as `oracle`, `digitalocean`, `aws`, or `gcp`
- **region_or_service**: region or service name, such as `kr`, `us`, or `sgp`
- **key**: the specific configuration key, such as `token`, `tenancy_ocid`, or `user_ocid`
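
To make the convention concrete, the sketch below assembles a key from its four variable segments and rejects empty segments or segments that contain a slash. The function name `kv_path` is hypothetical and only illustrates the format.

```bash
# Hypothetical helper: join the four variable segments into a key that
# follows config/{environment}/{provider}/{region_or_service}/{key}.
kv_path() (
  if [ "$#" -ne 4 ]; then
    echo "usage: kv_path ENVIRONMENT PROVIDER REGION_OR_SERVICE KEY" >&2
    exit 2
  fi
  for segment in "$@"; do
    case $segment in
      ''|*/*) echo "invalid segment: '$segment'" >&2; exit 2 ;;
    esac
  done
  printf 'config/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
)

kv_path dev oracle kr tenancy_ocid
```

Building keys through one function like this keeps every script on the same layout, so a typo in a path fails early instead of silently writing to a different prefix.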

### Consul cluster configuration variables

Consul's own cluster configuration should follow the same naming convention. The following are examples of key configuration variables:

#### Basic cluster settings
```
config/dev/consul/cluster/data_dir
config/dev/consul/cluster/raft_dir
config/dev/consul/cluster/datacenter
config/dev/consul/cluster/bootstrap_expect
config/dev/consul/cluster/log_level
config/dev/consul/cluster/encrypt_key
```

#### Node settings
```
config/dev/consul/nodes/master/ip
config/dev/consul/nodes/ash3c/ip
config/dev/consul/nodes/warden/ip
```

#### Network settings
```
config/dev/consul/network/client_addr
config/dev/consul/network/bind_interface
config/dev/consul/network/advertise_interface
```

#### Port settings
```
config/dev/consul/ports/dns
config/dev/consul/ports/http
config/dev/consul/ports/https
config/dev/consul/ports/grpc
config/dev/consul/ports/grpc_tls
config/dev/consul/ports/serf_lan
config/dev/consul/ports/serf_wan
config/dev/consul/ports/server
```

#### Service discovery settings
```
config/dev/consul/service/enable_script_checks
config/dev/consul/service/enable_local_script_checks
config/dev/consul/service/enable_service_script
```

#### Performance settings
```
config/dev/consul/performance/raft_multiplier
```

#### Logging settings
```
config/dev/consul/log/enable_syslog
config/dev/consul/log/log_file
```

#### Connection settings
```
config/dev/consul/connection/reconnect_timeout
config/dev/consul/connection/reconnect_timeout_wan
config/dev/consul/connection/session_ttl_min
```

#### Autopilot settings
```
config/dev/consul/autopilot/cleanup_dead_servers
config/dev/consul/autopilot/last_contact_threshold
config/dev/consul/autopilot/max_trailing_logs
config/dev/consul/autopilot/server_stabilization_time
config/dev/consul/autopilot/disable_upgrade_migration
```

#### Snapshot settings
```
config/dev/consul/snapshot/enabled
config/dev/consul/snapshot/interval
config/dev/consul/snapshot/retain
config/dev/consul/snapshot/name
```

#### Backup settings
```
config/dev/consul/backup/enabled
config/dev/consul/backup/interval
config/dev/consul/backup/retain
config/dev/consul/backup/name
```

### Example configuration

#### Application settings
```
config/dev/app/name
config/dev/app/version
config/dev/app/environment
```

#### Database settings
```
config/dev/database/host
config/dev/database/port
config/dev/database/name
```

#### Cache settings
```
config/dev/cache/host
config/dev/cache/port
```

#### Feature flags
```
config/dev/features/new_ui
config/dev/features/advanced_analytics
```

### How to add variables

#### Using curl
```bash
# Add a single variable
curl -X PUT http://localhost:8500/v1/kv/config/dev/app/name -d "my-application"

# Add several variables
curl -X PUT http://localhost:8500/v1/kv/config/dev/database/host -d "db.example.com"
curl -X PUT http://localhost:8500/v1/kv/config/dev/database/port -d "5432"
```

#### Using the consul CLI
```bash
# Add a single variable
consul kv put config/dev/app/name my-application

# Add several variables
consul kv put config/dev/database/host db.example.com
consul kv put config/dev/database/port 5432
```

#### Using the automation script
An automation script is provided to configure the Consul variables:

```bash
# Run the configuration script
./deployment/scripts/setup_consul_variables_and_storage.sh
```

### How to use variables

#### In Terraform
```hcl
data "consul_keys" "app_config" {
  key {
    name = "app_name"
    path = "config/dev/app/name"
  }
  key {
    name = "db_host"
    path = "config/dev/database/host"
  }
}

resource "some_resource" "example" {
  name = data.consul_keys.app_config.var.app_name
  host = data.consul_keys.app_config.var.db_host
}
```

#### In applications
Most Consul client libraries provide a way to read from the KV store. For example, in Go:

```go
package main

import (
	"github.com/hashicorp/consul/api"
	"log"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig()) // create the Consul client
	if err != nil {
		log.Fatal(err)
	}
	pair, _, err := client.KV().Get("config/dev/app/name", nil) // read the key
	if err != nil || pair == nil { // pair is nil when the key does not exist
		log.Fatalf("read failed or key missing: %v", err)
	}
	log.Printf("app name: %s", pair.Value)
}
```

## Deploying a Consul Cluster That Follows the Naming Convention

To make sure the Consul cluster fully follows the variable naming convention, we provide a complete deployment workflow.

### Deployment workflow

1. **Set the Consul variables**: use a script to store all Consul cluster configuration in the Consul KV store
2. **Generate the configuration file**: use Consul Template to render the configuration file from the KV store
3. **Deploy the cluster**: use Nomad to deploy a Consul cluster that runs with the rendered configuration

### Deployment scripts

The following scripts simplify the deployment:

#### setup_consul_cluster_variables.sh
This script stores the Consul cluster configuration in the Consul KV store, following the `config/{environment}/{provider}/{region_or_service}/{key}` format.

```bash
# Set the Consul cluster variables
./deployment/scripts/setup_consul_cluster_variables.sh
```

#### generate_consul_config.sh
This script uses Consul Template to render the final Consul configuration file from the KV store.

```bash
# Generate the Consul configuration file
./deployment/scripts/generate_consul_config.sh
```

#### deploy_consul_cluster_kv.sh
This is the all-in-one script that runs the complete deployment workflow.

```bash
# Deploy a Consul cluster that follows the naming convention
./deployment/scripts/deploy_consul_cluster_kv.sh
```

### Configuration template

The Consul configuration template `consul.hcl.tmpl` uses Consul Template syntax to read its values from the KV store:

```hcl
# Base configuration
data_dir = "{{ keyOrDefault `config/dev/consul/cluster/data_dir` `/opt/consul/data` }}"
# Note: raft_dir is not a documented Consul agent option (Raft data lives under data_dir);
# check the rendered file with `consul validate` before use
raft_dir = "{{ keyOrDefault `config/dev/consul/cluster/raft_dir` `/opt/consul/raft` }}"

# Enable the UI
ui_config {
  enabled = {{ keyOrDefault `config/dev/consul/ui/enabled` `true` }}
}

# Server configuration
server = true
bootstrap_expect = {{ keyOrDefault `config/dev/consul/cluster/bootstrap_expect` `3` }}

# Network configuration
client_addr = "{{ keyOrDefault `config/dev/consul/nodes/master/ip` `100.117.106.136` }}"
bind_addr = "{{ keyOrDefault `config/dev/consul/nodes/master/ip` `100.117.106.136` }}"
advertise_addr = "{{ keyOrDefault `config/dev/consul/nodes/master/ip` `100.117.106.136` }}"

# Cluster join: read the other nodes' IPs from KV
retry_join = [
  "{{ keyOrDefault `config/dev/consul/nodes/ash3c/ip` `100.116.80.94` }}",
  "{{ keyOrDefault `config/dev/consul/nodes/warden/ip` `100.122.197.112` }}"
]
```

### Nomad job configuration

The Nomad job file `consul-cluster-kv.nomad` follows the naming convention throughout and uses a template to read its configuration from the KV store:

```hcl
task "consul" {
  driver = "exec"

  # Use a template to read configuration from Consul KV
  template {
    data = <<EOF
# Consul configuration file, read dynamically from the KV store
# Follows the config/{environment}/{provider}/{region_or_service}/{key} format

# Base configuration
data_dir = "{{ keyOrDefault `config/dev/consul/cluster/data_dir` `/opt/consul/data` }}"
raft_dir = "{{ keyOrDefault `config/dev/consul/cluster/raft_dir` `/opt/consul/raft` }}"

# Server configuration
server = true
bootstrap_expect = {{ keyOrDefault `config/dev/consul/cluster/bootstrap_expect` `3` }}

# Network configuration
client_addr = "{{ keyOrDefault `config/dev/consul/nodes/master/ip` `100.117.106.136` }}"
bind_addr = "{{ keyOrDefault `config/dev/consul/nodes/master/ip` `100.117.106.136` }}"
advertise_addr = "{{ keyOrDefault `config/dev/consul/nodes/master/ip` `100.117.106.136` }}"

# Cluster join: read the other nodes' IPs from KV
retry_join = [
  "{{ keyOrDefault `config/dev/consul/nodes/ash3c/ip` `100.116.80.94` }}",
  "{{ keyOrDefault `config/dev/consul/nodes/warden/ip` `100.122.197.112` }}"
]
EOF
    destination = "local/consul.hcl"
  }

  config {
    command = "consul"
    args = [
      "agent",
      "-config-dir=local"
    ]
  }
}
```

### Verifying the deployment

After deployment, verify that the Consul cluster follows the naming convention as follows:

1. **Check the configuration in Consul KV**:
```bash
# List the Consul cluster configuration keys (the URL is quoted so the shell does not interpret "?")
curl -s "http://localhost:8500/v1/kv/config/dev/consul/?keys" | jq '.'
```

2. **Verify the Consul cluster state**:
```bash
# Check the cluster leader
curl -s http://localhost:8500/v1/status/leader

# Check the cluster peers
curl -s http://localhost:8500/v1/status/peers
```

3. **Validate the configuration file**:
```bash
# Validate the syntax of the generated configuration file
consul validate /root/mgmt/components/consul/configs/consul.hcl
```

### Updating configuration dynamically

With this deployment approach you can update the Consul cluster configuration without redeploying the whole cluster:

1. **Update the configuration in Consul KV**:
```bash
# Change the log level
curl -X PUT http://localhost:8500/v1/kv/config/dev/consul/cluster/log_level -d "DEBUG"

# Change the snapshot interval
curl -X PUT http://localhost:8500/v1/kv/config/dev/consul/snapshot/interval -d "12h"
```

2. **Regenerate the configuration file**:
```bash
# Regenerate the configuration file
./deployment/scripts/generate_consul_config.sh
```

3. **Reload the Consul configuration**:
```bash
# Reload the Consul configuration
consul reload
```

Note that `consul reload` applies only the settings Consul treats as reloadable (the log level is one of them); other changes take effect only after the agent restarts.

### Environment isolation

Environment variables combined with separate configuration paths make it easy to isolate environments:

```bash
# Development
ENVIRONMENT=dev ./deployment/scripts/setup_consul_cluster_variables.sh

# Production
ENVIRONMENT=prod ./deployment/scripts/setup_consul_cluster_variables.sh
```

Each environment's configuration is then stored under its own path:
- Development: `config/dev/consul/...`
- Production: `config/prod/consul/...`

## Storage Configuration

### Persistent storage

Consul needs persistent storage for its Raft log and snapshot data. The Nomad job configuration already sets the data directory:

```hcl
config {
  command = "consul"
  args = [
    "agent",
    "-server",
    "-bootstrap-expect=3",
    "-data-dir=/opt/nomad/data/consul", # data directory
    # other arguments...
  ]
}
```

### Snapshot configuration

A snapshot is a point-in-time backup of the Consul cluster state, used for disaster recovery.

#### Enabling snapshots
Add the following to the Consul configuration file. Note that a `snapshot` block does not appear in the open-source Consul agent configuration reference, so treat it as illustrative and check it with `consul validate` before relying on it. Scheduled snapshots are commonly taken by running `consul snapshot save` from cron instead.

```hcl
snapshot {
  enabled  = true
  interval = "24h" # create a snapshot every 24 hours
  retain   = 30    # keep 30 snapshots
  name     = "consul-snapshot-{{.Timestamp}}"
}
```

#### Creating a snapshot manually
```bash
# Create a snapshot
consul snapshot save backup-$(date +%Y%m%d).snap

# Restore a snapshot
consul snapshot restore backup-20231201.snap
```

### Backup configuration

Backing up Consul data regularly is an important safeguard for data safety.

#### Configuring automatic backups
As with the `snapshot` block, a `backup` block is not listed in the open-source Consul agent configuration reference; validate it before use or rely on the backup script further down.

```hcl
backup {
  enabled  = true
  interval = "6h" # back up every 6 hours
  retain   = 7    # keep 7 backups
  name     = "consul-backup-{{.Timestamp}}"
}
```

#### Backup script
```bash
#!/bin/bash
# backup_consul.sh
set -euo pipefail

DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/consul"
CONSUL_ADDR="http://localhost:8500"
SNAPSHOT="${BACKUP_DIR}/consul-snapshot-${DATE}.snap"

# Create the backup directory
mkdir -p "$BACKUP_DIR"

# Take the snapshot; -f makes curl fail on HTTP errors instead of saving an error page
curl -sf "${CONSUL_ADDR}/v1/snapshot" -o "$SNAPSHOT"

# Delete snapshots older than 7 days
find "$BACKUP_DIR" -name "consul-snapshot-*.snap" -mtime +7 -delete

echo "Backup complete: $SNAPSHOT"
```
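
The script above prunes by age (`-mtime +7`), while the `retain` settings earlier in this guide are expressed as a number of snapshots. A count-based variant could look like the sketch below. It assumes the timestamped `consul-snapshot-*.snap` naming used by the backup script, so that sorting by name is also sorting by time; the function name `prune_snapshots` is hypothetical.

```bash
# Hypothetical helper: keep only the newest KEEP snapshots in DIR.
# File names embed a sortable timestamp, so lexical order is age order.
prune_snapshots() (
  dir=$1
  keep=$2
  total=$(find "$dir" -maxdepth 1 -type f -name 'consul-snapshot-*.snap' | wc -l)
  remove=$(( total - keep ))
  [ "$remove" -gt 0 ] || exit 0      # nothing to delete
  # The oldest files sort first; delete exactly that many.
  find "$dir" -maxdepth 1 -type f -name 'consul-snapshot-*.snap' | sort | head -n "$remove" |
  while IFS= read -r old; do
    rm -f -- "$old"
  done
)

# Usage sketch (commented out so that sourcing this file deletes nothing):
#   prune_snapshots /backups/consul 7
```

Count-based retention keeps a fixed number of restore points even if backups stop running for a while, whereas age-based pruning would eventually delete every snapshot.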
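
### Autopilot configuration

Autopilot is Consul's automated operator feature; it handles failed servers and helps the cluster recover on its own.

```hcl
autopilot {
  cleanup_dead_servers      = true    # remove dead servers automatically
  last_contact_threshold    = "200ms" # maximum time since last leader contact before a server is unhealthy
  max_trailing_logs         = 250     # maximum number of log entries a server may trail the leader by
  server_stabilization_time = "10s"   # time a new server must stay healthy before it is promoted
  redundancy_zone_tag       = ""      # redundancy zone tag (Consul Enterprise)
  disable_upgrade_migration = false   # disable upgrade migration (Consul Enterprise)
  upgrade_version_tag       = ""      # upgrade version tag (Consul Enterprise)
}
```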

## Complete Configuration Example

### Consul configuration file (consul.hcl)
Several keys in this example (`raft_dir`, `enable_service_script`, and the `snapshot` and `backup` blocks) are not listed in the open-source Consul agent configuration reference and may be rejected by a stock agent, so run `consul validate` on the file before deploying it.

```hcl
# Base configuration
data_dir = "/opt/consul/data"
raft_dir = "/opt/consul/raft"

# Enable the UI
ui_config {
  enabled = true
}

# Datacenter
datacenter = "dc1"

# Server configuration
server = true
bootstrap_expect = 3

# Network configuration
client_addr = "0.0.0.0"
bind_addr = "{{ GetInterfaceIP `eth0` }}"
advertise_addr = "{{ GetInterfaceIP `eth0` }}"

# Ports
ports {
  dns = 8600
  http = 8500
  https = -1
  grpc = 8502
  grpc_tls = 8503
  serf_lan = 8301
  serf_wan = 8302
  server = 8300
}

# Cluster join
retry_join = ["100.117.106.136", "100.116.80.94", "100.122.197.112"]

# Service discovery
enable_service_script = true
enable_script_checks = true
enable_local_script_checks = true

# Performance tuning
performance {
  raft_multiplier = 1
}

# Logging
log_level = "INFO"
enable_syslog = false
log_file = "/var/log/consul/consul.log"

# Security (placeholder; generate a real gossip key with `consul keygen`)
encrypt = "YourEncryptionKeyHere"

# Connection settings
# Note: Consul documents a minimum of 8h for the reconnect timeouts, so "30s" is likely to be rejected
reconnect_timeout = "30s"
reconnect_timeout_wan = "30s"
session_ttl_min = "10s"

# Autopilot
autopilot {
  cleanup_dead_servers = true
  last_contact_threshold = "200ms"
  max_trailing_logs = 250
  server_stabilization_time = "10s"
  redundancy_zone_tag = ""
  disable_upgrade_migration = false
  upgrade_version_tag = ""
}

# Snapshots
snapshot {
  enabled = true
  interval = "24h"
  retain = 30
  name = "consul-snapshot-{{.Timestamp}}"
}

# Backups
backup {
  enabled = true
  interval = "6h"
  retain = 7
  name = "consul-backup-{{.Timestamp}}"
}
```

## Deployment Steps

### 1. Prepare the configuration file
```bash
# Create the configuration directory
mkdir -p /root/mgmt/components/consul/configs

# Create the configuration file
# (the quoted 'EOF' stops the shell from expanding backticks and $ in the pasted config)
cat > /root/mgmt/components/consul/configs/consul.hcl << 'EOF'
# Paste the complete configuration example from above
EOF
```

### 2. Run the configuration script
```bash
# Run the automation script
./deployment/scripts/setup_consul_variables_and_storage.sh
```

### 3. Restart the Consul service
```bash
# Stop the Consul service
nomad job stop consul-cluster-simple

# Start the Consul service again
nomad job run /root/mgmt/components/consul/jobs/consul-cluster-simple.nomad
```

### 4. Verify the configuration
```bash
# Check Consul status
curl http://localhost:8500/v1/status/leader

# Check the variable configuration
curl -s "http://localhost:8500/v1/kv/config/dev/?recurse" | jq

# Check the storage configuration
curl -s "http://localhost:8500/v1/kv/storage/?recurse" | jq
```
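
## Best Practices

1. **Back up regularly**: schedule regular backups of Consul data and test the restore procedure
2. **Monitor storage space**: watch the usage of the Consul data directory so the disk does not fill up
3. **Secure the cluster**: protect the Consul cluster with ACLs and TLS
4. **Use version control**: keep Consul configuration files under version control
5. **Isolate environments**: use separate configuration paths for each environment (dev/staging/prod)
6. **Document everything**: record the purpose and valid range of every configuration item

## Troubleshooting

### Common problems

#### 1. Variables cannot be read
- Check that the Consul service is running normally
- Verify that the variable path is correct
- Confirm that the ACL permissions are sufficient

#### 2. Storage space is running out
- Check the size of the data directory
- Adjust the snapshot and backup retention policy
- Remove old snapshots and backups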
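
The first check can be automated with a small script. The sketch below compares a directory's disk usage against a limit in kilobytes; the function name `check_dir_usage` and any limit you pass to it are assumptions to adapt to your own setup.

```bash
# Hypothetical helper: succeed when DIR uses at most LIMIT_KB kilobytes,
# otherwise print a warning to stderr and fail.
check_dir_usage() (
  dir=$1
  limit_kb=$2
  used_kb=$(du -sk "$dir" | awk '{print $1}')   # total size of DIR in KB
  if [ "$used_kb" -gt "$limit_kb" ]; then
    echo "WARNING: $dir uses ${used_kb} KB (limit ${limit_kb} KB)" >&2
    exit 1
  fi
  echo "OK: $dir uses ${used_kb} KB"
)

# Usage sketch (commented out; the path and the 1 GiB limit are examples):
#   check_dir_usage /opt/consul/data 1048576
```

Because the function returns a non-zero status when the limit is exceeded, it can be run from cron or used as a monitoring check without extra parsing.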

#### 3. Snapshots fail
- Check the available disk space
- Verify file permissions
- Look at the Consul logs for the detailed error message

### Debugging commands
```bash
# List Consul members
consul members

# Show the Raft peers
consul operator raft list-peers

# Show the key/value store
consul kv get --recurse config/dev/

# Inspect a snapshot
consul snapshot inspect backup.snap
```

## Extensions

### Integration with Vault

Consul can be integrated with Vault for stronger secret management:

```bash
# Enable Vault's Consul secrets engine, which issues Consul ACL tokens on demand
vault secrets enable consul

# Note: Consul has no `consul encrypt` subcommand. Gossip encryption keys are
# generated with `consul keygen` and set through the `encrypt` option.
consul keygen
```

### Integration with Nomad

Consul can be integrated with Nomad to provide service discovery and configuration management:

```hcl
# Consul integration in the Nomad configuration
consul {
  address = "localhost:8500"
  token   = "your-consul-token"
  ssl     = false
}
```

## Summary

Configuring Consul's variables and storage makes the cluster both more capable and more reliable. Variables provide flexible configuration management, while storage keeps data safe and durable. Combined with the automation scripts and the best practices above, they form the basis of a robust Consul cluster that is easy to maintain.

86
docs/setup/oci-credentials-setup.md
Normal file
@@ -0,0 +1,86 @@
# Oracle Cloud Credentials Setup Guide

## Credential file locations

### 1. OpenTofu configuration file
**Location**: `infrastructure/environments/dev/terraform.tfvars`

This is the main configuration file. Fill in your OCI credentials:

```hcl
# Oracle Cloud configuration
oci_config = {
  tenancy_ocid     = "ocid1.tenancy.oc1..aaaaaaaa_your_tenancy_id"
  user_ocid        = "ocid1.user.oc1..aaaaaaaa_your_user_id"
  fingerprint      = "aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp"
  private_key_path = "~/.oci/oci_api_key.pem"
  region           = "ap-seoul-1"
  compartment_ocid = "ocid1.compartment.oc1..aaaaaaaa_your_compartment_id"
}
```

### 2. OCI private key file
**Location**: `~/.oci/oci_api_key.pem`

This is your API private key file. Its content looks like this:

```
-----BEGIN PRIVATE KEY-----
MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC...
(your private key content)
-----END PRIVATE KEY-----
```

### 3. OCI configuration file (optional)
**Location**: `~/.oci/config`

This is the OCI CLI configuration file, which can serve as a fallback:

```ini
[DEFAULT]
user=ocid1.user.oc1..aaaaaaaa_your_user_id
fingerprint=aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp
tenancy=ocid1.tenancy.oc1..aaaaaaaa_your_tenancy_id
region=ap-seoul-1
key_file=~/.oci/oci_api_key.pem
```

## Setup steps

### Step 1: Create the .oci directory
```bash
mkdir -p ~/.oci
chmod 700 ~/.oci
```

### Step 2: Save the private key file
```bash
# Save your private key content to the file
nano ~/.oci/oci_api_key.pem

# Set the correct permissions
chmod 400 ~/.oci/oci_api_key.pem
```

### Step 3: Edit terraform.tfvars
```bash
# Edit the configuration file
nano infrastructure/environments/dev/terraform.tfvars
```

## Security notes

1. **Private key permissions**: make sure the private key file has mode 400 (readable by the owner only)
2. **Do not commit to Git**: `.gitignore` is already configured to ignore `*.tfvars` files
3. **Back up credentials**: keep a secure backup of your private key and configuration details
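
The first note can be checked with a short script. The sketch below succeeds only when a key file grants no permissions to group or others, which covers both mode 400 and mode 600. The function name `key_is_private` is hypothetical.

```bash
# Hypothetical helper: succeed only if FILE exists and grants no
# permissions to group or others (for example mode 400 or 600).
key_is_private() (
  file=$1
  [ -f "$file" ] || exit 1
  # Characters 5-10 of the ls -l mode string are the group and other bits;
  # -L follows a symlink so the target's mode is checked.
  mode=$(ls -lL "$file" | cut -c5-10)
  [ "$mode" = "------" ]
)

# Usage sketch (commented out; the path is the one used in this guide):
#   key_is_private ~/.oci/oci_api_key.pem || echo "fix with: chmod 400 ~/.oci/oci_api_key.pem"
```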

## Verifying the configuration

Once the configuration is complete, verify it with the following commands:

```bash
# Check the configuration
./scripts/setup/setup-opentofu.sh check

# Initialize OpenTofu
./scripts/setup/setup-opentofu.sh init
```

153
docs/setup/oracle-cloud-setup.md
Normal file
@@ -0,0 +1,153 @@
# Oracle Cloud Setup Guide

## Overview

This guide helps you configure Oracle Cloud Infrastructure (OCI) for use with OpenTofu.

## Prerequisites

1. An Oracle Cloud account (the free tier is sufficient)
2. OpenTofu installed
3. OCI CLI installed (optional but recommended)

## Step 1: Create an Oracle Cloud account

1. Go to [Oracle Cloud](https://cloud.oracle.com/)
2. Click "Start for free" to create a free account
3. Complete the sign-up process

## Step 2: Collect the required OCIDs

### Tenancy OCID

1. Sign in to the Oracle Cloud Console
2. Click the user icon in the upper-right corner
3. Select "Tenancy: <your-tenancy-name>"
4. Copy the OCID value

### User OCID

1. Open the Oracle Cloud Console
2. Click the user icon in the upper-right corner
3. Select "User Settings"
4. Copy the OCID value

### Compartment OCID

1. In the navigation menu, select "Identity & Security" > "Compartments"
2. Select the compartment you want to use (usually the root compartment)
3. Copy the OCID value
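
With three OCIDs to copy, it is easy to paste one into the wrong field. An OCID starts with `ocid1.` followed by the resource type, so a quick sanity check is possible before editing the configuration. This is a minimal sketch; the function name `ocid_type` is hypothetical and the check covers only the prefix, not whether the OCID actually exists.

```bash
# Hypothetical helper: print the resource-type segment of an OCID, or fail
# if the value does not look like "ocid1.<type>.<realm>.<rest>".
ocid_type() (
  case $1 in
    ocid1.*.*.*) ;;
    *) echo "not an OCID: $1" >&2; exit 1 ;;
  esac
  rest=${1#ocid1.}               # drop the version prefix
  printf '%s\n' "${rest%%.*}"    # keep everything before the next dot
)

ocid_type 'ocid1.tenancy.oc1..aaaaaaaaexample'
```

Keep in mind that the root compartment shares its OCID with the tenancy, so a root compartment OCID reports the type `tenancy` rather than `compartment`.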

## Step 3: Create an API key

### Generate the key pair

```bash
# Create the .oci directory
mkdir -p ~/.oci

# Generate the private key
openssl genrsa -out ~/.oci/oci_api_key.pem 2048

# Generate the public key
openssl rsa -pubout -in ~/.oci/oci_api_key.pem -out ~/.oci/oci_api_key_public.pem

# Set permissions
chmod 400 ~/.oci/oci_api_key.pem
chmod 400 ~/.oci/oci_api_key_public.pem
```

### Add the public key to Oracle Cloud

1. In the Oracle Cloud Console, open "User Settings"
2. Select "API Keys" in the left-hand menu
3. Click "Add API Key"
4. Select "Paste Public Key"
5. Copy the content of `~/.oci/oci_api_key_public.pem` and paste it
6. Click "Add"
7. Copy the fingerprint that is displayed
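
The fingerprint shown by the console can be cross-checked locally: it is the MD5 digest of the DER-encoded public key, printed as colon-separated hex pairs. The sketch below wraps the two `openssl` calls involved; the function name `key_fingerprint` is hypothetical, and the `sed` step only strips the label that `openssl md5` prints before the digest.

```bash
# Hypothetical helper: print the fingerprint of the RSA private key in FILE
# in the colon-separated form shown by the Oracle Cloud Console.
key_fingerprint() {
  [ -r "$1" ] || { echo "cannot read key file: $1" >&2; return 1; }
  openssl rsa -pubout -outform DER -in "$1" 2>/dev/null |
    openssl md5 -c |
    sed 's/^.*= *//'     # drop the "(stdin)= " style prefix
}

# Usage sketch (commented out; the path is the one used in this guide):
#   key_fingerprint ~/.oci/oci_api_key.pem
```

If the value printed locally differs from the one in the console, the uploaded public key does not belong to this private key, which is a common cause of authentication failures.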

## Step 4: Configure terraform.tfvars

Edit the `infrastructure/environments/dev/terraform.tfvars` file:

```hcl
# Oracle Cloud configuration
oci_config = {
  tenancy_ocid     = "ocid1.tenancy.oc1..aaaaaaaa_your_actual_tenancy_id"
  user_ocid        = "ocid1.user.oc1..aaaaaaaa_your_actual_user_id"
  fingerprint      = "aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp"
  private_key_path = "~/.oci/oci_api_key.pem"
  region           = "ap-seoul-1" # or the region you chose
  compartment_ocid = "ocid1.compartment.oc1..aaaaaaaa_your_compartment_id"
}
```

## Step 5: Verify the configuration

```bash
# Check the configuration
./scripts/setup/setup-opentofu.sh check

# Initialize OpenTofu
./scripts/setup/setup-opentofu.sh init

# Generate a plan
./scripts/setup/setup-opentofu.sh plan
```

## Available regions

Commonly used Oracle Cloud regions:

- `ap-seoul-1` - Seoul, South Korea
- `ap-tokyo-1` - Tokyo, Japan
- `us-ashburn-1` - Virginia, USA
- `us-phoenix-1` - Arizona, USA
- `eu-frankfurt-1` - Frankfurt, Germany

## Free tier resources

The Oracle Cloud free tier includes:

- 2 AMD compute instances (VM.Standard.E2.1.Micro)
- 4 Arm compute instances (VM.Standard.A1.Flex)
- 200 GB of block storage
- 10 GB of object storage
- A load balancer
- Databases, and more

## Troubleshooting

### Common errors

1. **401 Unauthorized**: check the API key configuration
2. **404 Not Found**: check that the OCIDs are correct
3. **Permission errors**: make sure the user has sufficient permissions

### Verifying the connection

```bash
# Install the OCI CLI (optional)
pip install oci-cli

# Configure the OCI CLI
oci setup config

# Test the connection
oci iam compartment list
```

## Security best practices

1. Rotate API keys regularly
2. Follow the principle of least privilege
3. Do not hard-code credentials in code
4. Use compartments to isolate resources
5. Enable audit logging

## References

- [Oracle Cloud Infrastructure documentation](https://docs.oracle.com/en-us/iaas/)
- [OCI Terraform Provider](https://registry.terraform.io/providers/oracle/oci/latest/docs)
- [Oracle Cloud Free Tier](https://www.oracle.com/cloud/free/)
