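feat: restructure the Ansible playbooks directory and add new features

- Group playbooks by function into separate directories (system management / security / services / monitoring / cloud)
- Add Traefik and Consul cluster deployment configuration
- Add a Docker Swarm monitoring stack configuration
- Implement an automated deployment script
- Update the README to document the new structure and usage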
168
ansible/README.md
Normal file
@@ -0,0 +1,168 @@
# Ansible Playbooks Management Guide

## 📁 Directory structure

```
ansible/
├── playbooks/              # Main playbooks directory
│   ├── 01-system/          # System management
│   ├── 02-security/        # Security management
│   ├── 03-services/        # Service management
│   ├── 04-monitoring/      # Monitoring and checks
│   ├── 05-cloud/           # Cloud-provider specific
│   └── 99-tools/           # Tools and integrations
├── inventory.ini           # Host inventory
├── ansible.cfg             # Ansible configuration
├── run.sh                  # Original run script
└── run-playbook.sh         # New categorized run script
```

## 🎯 Categories

### 01-system (System management)
Maintenance and management tasks for the base system.

| Playbook | Description | Target hosts |
|----------|-------------|--------------|
| `system-update.yml` | System package updates and upgrades | All Linux hosts |
| `system-cleanup.yml` | System cleanup and maintenance | All hosts |
| `cron-setup.yml` | Scheduled task (cron) configuration | Hosts that need scheduled tasks |

### 02-security (Security management)
Security-related configuration and monitoring.

| Playbook | Description | Target hosts |
|----------|-------------|--------------|
| `security-hardening.yml` | SSH hardening and backup | All hosts |
| `certificate-management.yml` | SSL certificate management and monitoring | Web servers and SSL services |

### 03-services (Service management)
Manages services and containers.

| Playbook | Description | Target hosts |
|----------|-------------|--------------|
| `docker-management.yml` | Docker container management | Docker hosts |
| `docker-status-check.yml` | Docker status check | Docker Swarm nodes |

### 04-monitoring (Monitoring and checks)
Health checks for systems and services.

| Playbook | Description | Target hosts |
|----------|-------------|--------------|
| `service-health-check.yml` | Service health monitoring | All hosts |
| `network-connectivity.yml` | Network connectivity and performance checks | All hosts |

### 05-cloud (Cloud-provider specific)
Scripts tuned for specific cloud providers.

| Playbook | Description | Target hosts |
|----------|-------------|--------------|
| `cloud-providers-update.yml` | System updates for cloud-provider hosts | huawei, google, digitalocean, aws |

### 99-tools (Tools and integrations)
Operations tools and integration scripts.

| Playbook | Description | Target hosts |
|----------|-------------|--------------|
| `ops-toolkit.yml` | Unified operations dashboard | All hosts |

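When only a playbook's file name is known, the category that holds it can be looked up from the directory layout. The helper below is a minimal sketch and is not part of this repository: `find_playbook` is a hypothetical name, and it assumes the two-level `playbooks/<category>/<playbook>.yml` layout described above.

```bash
# Hypothetical helper: print the path of a playbook given only its file name.
# Assumes playbooks live exactly one level below the playbooks directory.
find_playbook() {
    local name="$1" dir="${2:-playbooks}" match
    for match in "$dir"/*/"$name"; do
        # An unmatched glob stays literal, so confirm the file really exists
        if [ -f "$match" ]; then
            echo "$match"
            return 0
        fi
    done
    echo "playbook not found: $name" >&2
    return 1
}
```

For example, `find_playbook docker-status-check.yml` run from the `ansible/` directory would print the playbook's path under `playbooks/03-services/`, if the file is present there.
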
## 🚀 Usage

### 1. Use the new categorized run script

```bash
# Show help
./run-playbook.sh help

# List all available playbooks
./run-playbook.sh list

# Run a playbook from a specific category
./run-playbook.sh 01-system system-update.yml all
./run-playbook.sh 03-services docker-status-check.yml hcp
./run-playbook.sh 04-monitoring network-connectivity.yml dev1
```

### 2. Run ansible-playbook directly

```bash
# Run a system update
ansible-playbook -i inventory.ini playbooks/01-system/system-update.yml

# Check Docker status
ansible-playbook -i inventory.ini playbooks/03-services/docker-status-check.yml --limit hcp

# Check network connectivity
ansible-playbook -i inventory.ini playbooks/04-monitoring/network-connectivity.yml --limit dev1
```

## 📋 Host groups

Host groups configured in `inventory.ini`:

- **dev**: Development environment (dev1, dev2)
- **hcp**: HCP nodes (hcp1, hcp2), which form the Docker Swarm cluster
- **oci_kr**: Oracle Cloud Korea (ch2, ch3, master)
- **oci_us**: Oracle Cloud US (ash1d, ash2e, ash3c)
- **huawei**: Huawei Cloud (hcs)
- **google**: Google Cloud (benwork)
- **digitalocean**: DigitalOcean (syd)
- **aws**: Amazon Web Services (awsirish)
- **proxmox**: Proxmox virtualization (pve, xgp, nuc12)
- **lxc**: LXC containers (warden, gitea, influxdb, mysql, postgresql)
- **alpine**: Alpine Linux containers (redis, authentik, calibreweb)
- **vm**: Virtual machines (kali)

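To see which hosts a group covers before running anything, `ansible-inventory -i inventory.ini --graph` is the built-in option. For a quick look without Ansible installed, the sketch below reads one group straight from the INI file. `list_group_hosts` is a hypothetical helper, and it assumes a flat inventory: it does not resolve `:children` groups or `:vars` sections.

```bash
# Hypothetical helper: print the host names listed under one [group] header.
# Assumption: flat INI inventory; :children and :vars sections are not resolved.
list_group_hosts() {
    local group="$1" inventory="${2:-inventory.ini}"
    awk -v target="[$group]" '
        /^\[/          { in_group = ($0 == target); next }  # a new section starts
        /^[ \t]*$/     { next }                              # skip blank lines
        /^[ \t]*[#;]/  { next }                              # skip comments
        in_group       { print $1 }                          # first field is the host name
    ' "$inventory"
}
```

Calling `list_group_hosts hcp` from the `ansible/` directory should print one host per line for the `hcp` group, provided the group is declared as a plain `[hcp]` section.
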
## 🔧 Configuration files

### ansible.cfg
Updated to support the new directory structure. It includes:
- Path settings for the new playbooks layout
- SSH connection tuning
- Dynamic inventory support

### inventory.ini
Contains connection details and group assignments for all hosts.

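## 📝 Best practices

1. **Run by category**: pick the category directory that matches the task.
2. **Use host groups**: use the groups defined in the inventory for batch operations.
3. **Test first**: test in the development environment before applying changes to production.
4. **Keep logs**: record execution logs for important operations.
5. **Maintain regularly**: run the system cleanup and update playbooks on a regular schedule.
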
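The "test first" practice can be sketched as a two-stage run: a dry run with `--check --diff`, then the real run. The function below is illustrative only. `run_playbook_staged` and the `PRINT_ONLY` variable are hypothetical names, and whether `--check` gives a meaningful preview depends on the modules each playbook uses (plain `command` tasks are skipped in check mode).

```bash
# Hypothetical helper: run a playbook in "check" (dry-run) or "apply" mode.
# Set PRINT_ONLY=1 to print the command instead of executing it.
run_playbook_staged() {
    local mode="$1" playbook="$2" limit="$3"
    # Build the argument list in the positional parameters to keep quoting intact
    set -- ansible-playbook -i inventory.ini "$playbook" --limit "$limit"
    if [ "$mode" = "check" ]; then
        # --check reports what would change; --diff shows file-level differences
        set -- "$@" --check --diff
    fi
    if [ -n "${PRINT_ONLY:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

A typical sequence would be `run_playbook_staged check playbooks/01-system/system-update.yml dev` followed by the same call with `apply`. For the logging practice, exporting `ANSIBLE_LOG_PATH` with a file path before the run makes Ansible write its output to that file.
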
## 🆘 Troubleshooting

### Common problems

1. **SSH connection fails**
   - Check that the host is reachable
   - Verify the SSH key or password
   - Confirm the user's permissions

2. **Playbook execution fails**
   - Check the target host's operating system type
   - Verify that the required packages are installed
   - Review the detailed error log

3. **Permission problems**
   - Confirm that `ansible_become` is configured correctly
   - Verify sudo privileges

### Debugging commands

```bash
# Test connectivity
ansible all -i inventory.ini -m ping

# Verbose output
ansible-playbook -i inventory.ini playbooks/01-system/system-update.yml -vvv

# Check syntax
ansible-playbook --syntax-check playbooks/01-system/system-update.yml
```

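The syntax check above covers a single playbook. To cover the whole tree in one pass, a loop along these lines could be used. This is a sketch under the assumption that every playbook sits at `playbooks/<category>/<name>.yml`; `syntax_check_all` is a hypothetical name.

```bash
# Hypothetical helper: run --syntax-check on every playbook in the tree.
# Returns non-zero if at least one playbook fails the check.
syntax_check_all() {
    local dir="${1:-playbooks}" failed=0 playbook
    for playbook in "$dir"/*/*.yml; do
        [ -f "$playbook" ] || continue   # the glob matched nothing
        # Redirect stdin so the command cannot consume input meant for the caller
        if ! ansible-playbook -i inventory.ini --syntax-check "$playbook" </dev/null; then
            echo "syntax check failed: $playbook" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Running `syntax_check_all` from the `ansible/` directory checks each category in turn and keeps going after a failure, so one run reports every broken playbook.
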
---

*Last updated: $(date '+%Y-%m-%d %H:%M:%S')*
@@ -4,7 +4,14 @@ host_key_checking = False
timeout = 30
gathering = smart
fact_caching = memory
# Support the new playbooks directory structure
roles_path = playbooks/
collections_path = playbooks/

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no
pipelining = True

[inventory]
# Enable plugins to support dynamic inventory
enable_plugins = host_list, script, auto, yaml, ini, toml
97
ansible/playbooks/03-services/docker-status-check.yml
Normal file
@@ -0,0 +1,97 @@
---
- name: Docker Status Check for HCP Nodes
  hosts: hcp
  gather_facts: yes
  become: yes

  tasks:
    - name: Check if Docker is installed
      command: docker --version
      register: docker_version
      ignore_errors: yes

    - name: Display Docker version
      debug:
        msg: "Docker version: {{ docker_version.stdout }}"
      when: docker_version.rc == 0

    - name: Check Docker service status
      systemd:
        name: docker
      register: docker_service_status

    - name: Display Docker service status
      debug:
        msg: "Docker service is {{ docker_service_status.status.ActiveState }}"

    - name: Check Docker daemon info
      command: docker info --format "{{ '{{' }}.ServerVersion{{ '}}' }}"
      register: docker_info
      ignore_errors: yes

    - name: Display Docker daemon info
      debug:
        msg: "Docker daemon version: {{ docker_info.stdout }}"
      when: docker_info.rc == 0

    - name: Check Docker Swarm status
      command: docker info --format "{{ '{{' }}.Swarm.LocalNodeState{{ '}}' }}"
      register: swarm_status
      ignore_errors: yes

    - name: Display Swarm status
      debug:
        msg: "Swarm status: {{ swarm_status.stdout }}"
      when: swarm_status.rc == 0

    - name: Get Docker Swarm node info (if in swarm)
      command: docker node ls
      register: swarm_nodes
      ignore_errors: yes
      when: swarm_status.stdout == "active"

    # A skipped task still registers its variable, but without `rc`,
    # so test `rc` itself rather than the variable.
    - name: Display Swarm nodes
      debug:
        msg: "{{ swarm_nodes.stdout_lines }}"
      when: swarm_nodes.rc is defined and swarm_nodes.rc == 0

    - name: List running containers
      command: docker ps --format "table {{ '{{' }}.Names{{ '}}' }}\t{{ '{{' }}.Status{{ '}}' }}\t{{ '{{' }}.Ports{{ '}}' }}"
      register: running_containers
      ignore_errors: yes

    - name: Display running containers
      debug:
        msg: "{{ running_containers.stdout_lines }}"
      when: running_containers.rc == 0

    - name: Check Docker network list
      command: docker network ls
      register: docker_networks
      ignore_errors: yes

    - name: Display Docker networks
      debug:
        msg: "{{ docker_networks.stdout_lines }}"
      when: docker_networks.rc == 0

    - name: Get Docker system info
      command: docker system df
      register: docker_system_info
      ignore_errors: yes

    - name: Display Docker system usage
      debug:
        msg: "{{ docker_system_info.stdout_lines }}"
      when: docker_system_info.rc == 0

    # .ManagerStatus.Leader reports whether this node is the Swarm leader;
    # the command fails on worker nodes, which ignore_errors tolerates.
    - name: Check if node is the Swarm leader
      command: docker node inspect self --format "{{ '{{' }}.ManagerStatus.Leader{{ '}}' }}"
      register: is_manager
      ignore_errors: yes
      when: swarm_status.stdout == "active"

    - name: Display leader status
      debug:
        msg: "Is Swarm leader: {{ is_manager.stdout }}"
      when: is_manager.rc is defined and is_manager.rc == 0
109
ansible/run-playbook.sh
Executable file
@@ -0,0 +1,109 @@
#!/bin/bash

# Categorized runner for Ansible playbooks
# Usage: ./run-playbook.sh [category] [playbook] [hosts]

set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PLAYBOOKS_DIR="$SCRIPT_DIR/playbooks"

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Show usage help
show_help() {
    echo -e "${BLUE}Categorized runner for Ansible playbooks${NC}"
    echo ""
    echo "Usage:"
    echo "  $0 [category] [playbook] [hosts]"
    echo ""
    echo "Available categories:"
    echo -e "  ${GREEN}01-system${NC}      - System management (updates, cleanup, scheduled tasks)"
    echo -e "  ${GREEN}02-security${NC}    - Security management (hardening, certificate management)"
    echo -e "  ${GREEN}03-services${NC}    - Service management (Docker, container services)"
    echo -e "  ${GREEN}04-monitoring${NC}  - Monitoring and checks (health checks, network connectivity)"
    echo -e "  ${GREEN}05-cloud${NC}       - Cloud-provider specific"
    echo -e "  ${GREEN}99-tools${NC}       - Tools and integrations"
    echo ""
    echo "Examples:"
    echo "  $0 list                                          # List all available playbooks"
    echo "  $0 01-system system-update.yml all               # Run a system update on all hosts"
    echo "  $0 03-services docker-status-check.yml hcp       # Check Docker status on the hcp group"
    echo "  $0 04-monitoring network-connectivity.yml dev1   # Check network connectivity on host dev1"
}

# List all available playbooks
list_playbooks() {
    echo -e "${BLUE}Available Ansible playbooks:${NC}"
    echo ""

    # Globs expand in sorted order, so there is no need to parse `ls` output
    for category_dir in "$PLAYBOOKS_DIR"/*/; do
        [ -d "$category_dir" ] || continue
        echo -e "${GREEN}📁 $(basename "$category_dir")${NC}"
        for playbook in "$category_dir"*.yml; do
            if [ -f "$playbook" ]; then
                echo -e "  └── ${YELLOW}$(basename "$playbook")${NC}"
            fi
        done
        echo ""
    done
}

# Run the specified playbook
run_playbook() {
    local category="$1"
    local playbook="$2"
    local hosts="$3"

    local playbook_path="$PLAYBOOKS_DIR/$category/$playbook"

    if [ ! -f "$playbook_path" ]; then
        echo -e "${RED}Error: playbook file not found: $playbook_path${NC}"
        exit 1
    fi

    echo -e "${GREEN}Running playbook:${NC} $category/$playbook"
    echo -e "${GREEN}Target hosts:${NC} $hosts"
    echo ""

    # Run ansible-playbook; resolve the inventory relative to the script
    # so the command also works when invoked from another directory
    ansible-playbook -i "$SCRIPT_DIR/inventory.ini" "$playbook_path" --limit "$hosts"
}

# Main logic
case "${1:-}" in
    "help"|"-h"|"--help"|"")
        show_help
        ;;
    "list"|"ls")
        list_playbooks
        ;;
    *)
        if [ $# -lt 3 ]; then
            echo -e "${RED}Error: not enough arguments${NC}"
            echo ""
            show_help
            exit 1
        fi

        category="$1"
        playbook="$2"
        hosts="$3"

        if [ ! -d "$PLAYBOOKS_DIR/$category" ]; then
            echo -e "${RED}Error: category directory not found: $category${NC}"
            echo ""
            list_playbooks
            exit 1
        fi

        run_playbook "$category" "$playbook" "$hosts"
        ;;
esac