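feat: integrate OpenTofu + Ansible + Gitea CI/CD

- Restructure the project directory layout
- Add OpenTofu multi-cloud support
- Configure automated deployment with Ansible
- Integrate the Gitea Actions CI/CD pipeline
- Add Docker Swarm management
- Improve monitoring and security configuration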
2025-09-20 10:48:41 +00:00
parent d755f237a0
commit 7eb4a33523
55 changed files with 3745 additions and 1921 deletions


@@ -1,168 +0,0 @@
# Ansible Playbooks Management Guide
## 📁 Directory Structure
```
ansible/
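├── playbooks/          # Main playbooks directory
│   ├── 01-system/      # System management
│   ├── 02-security/    # Security management
│   ├── 03-services/    # Service management
│   ├── 04-monitoring/  # Monitoring and checks
│   ├── 05-cloud/       # Cloud provider specific
│   └── 99-tools/       # Tools and integrations
├── inventory.ini       # Host inventory
├── ansible.cfg         # Ansible configuration
├── run.sh              # Original runner script
└── run-playbook.sh     # New categorized runner script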
```
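## 🎯 Categories
### 01-system (System Management)
Handles maintenance and management of the base system.
| Playbook | Description | Target hosts |
|----------|----------|----------|
| `system-update.yml` | System package updates and upgrades | All Linux hosts |
| `system-cleanup.yml` | System cleanup and maintenance | All hosts |
| `cron-setup.yml` | Scheduled task (cron) configuration | Hosts that need scheduled tasks |
### 02-security (Security Management)
Handles security-related configuration and monitoring.
| Playbook | Description | Target hosts |
|----------|----------|----------|
| `security-hardening.yml` | SSH hardening and configuration backup | All hosts |
| `certificate-management.yml` | SSL certificate management and monitoring | Web servers and SSL services |
### 03-services (Service Management)
Manages services and containers.
| Playbook | Description | Target hosts |
|----------|----------|----------|
| `docker-management.yml` | Docker container management | Docker hosts |
| `docker-status-check.yml` | Docker status check | Docker Swarm nodes |
### 04-monitoring (Monitoring and Checks)
Health checks for systems and services.
| Playbook | Description | Target hosts |
|----------|----------|----------|
| `service-health-check.yml` | Service health monitoring | All hosts |
| `network-connectivity.yml` | Network connectivity and performance check | All hosts |
### 05-cloud (Cloud Provider Specific)
Playbooks tuned for specific cloud providers.
| Playbook | Description | Target hosts |
|----------|----------|----------|
| `cloud-providers-update.yml` | System updates for cloud providers | huawei, google, digitalocean, aws |
### 99-tools (Tools and Integrations)
Operations tools and integration scripts.
| Playbook | Description | Target hosts |
|----------|----------|----------|
| `ops-toolkit.yml` | Unified operations dashboard | All hosts |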
## 🚀 Usage
### 1. Use the new categorized runner script
```bash
# Show help
./run-playbook.sh help
# List all available playbooks
./run-playbook.sh list
# Run a playbook from a specific category: <category> <playbook> <target>
./run-playbook.sh 01-system system-update.yml all
./run-playbook.sh 03-services docker-status-check.yml hcp
./run-playbook.sh 04-monitoring network-connectivity.yml dev1
```
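The source of `run-playbook.sh` is not shown here, so the following is only a sketch of how a wrapper with this `<category> <playbook> <target>` interface could map its arguments onto an `ansible-playbook` call. The function name `build_playbook_cmd` and the default target `all` are assumptions for illustration; the function prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from run-playbook.sh): translate
# "<category> <playbook> [target]" into an ansible-playbook command line.
build_playbook_cmd() {
    local category="$1"      # e.g. 01-system
    local playbook="$2"      # e.g. system-update.yml
    local target="${3:-all}" # host or group; assumed to default to "all"
    # Print the command instead of executing it, so it can be reviewed first.
    printf 'ansible-playbook -i inventory.ini playbooks/%s/%s --limit %s\n' \
        "$category" "$playbook" "$target"
}

# Example: show the command for the Docker status check on the hcp group.
build_playbook_cmd 03-services docker-status-check.yml hcp
```

Printing the command first makes it easy to compare against the direct `ansible-playbook` invocations in this guide before anything touches a host.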
### 2. Call ansible-playbook directly
```bash
# Run the system update
ansible-playbook -i inventory.ini playbooks/01-system/system-update.yml
# Check Docker status
ansible-playbook -i inventory.ini playbooks/03-services/docker-status-check.yml --limit hcp
# Check network connectivity
ansible-playbook -i inventory.ini playbooks/04-monitoring/network-connectivity.yml --limit dev1
```
## 📋 Host Groups
Host groups configured in `inventory.ini`:
- **dev**: Development environment (dev1, dev2)
- **hcp**: HCP nodes (hcp1, hcp2), the Docker Swarm cluster
- **oci_kr**: Oracle Cloud Korea (ch2, ch3, master)
- **oci_us**: Oracle Cloud US (ash1d, ash2e, ash3c)
- **huawei**: Huawei Cloud (hcs)
- **google**: Google Cloud (benwork)
- **digitalocean**: DigitalOcean (syd)
- **aws**: Amazon Web Services (awsirish)
- **proxmox**: Proxmox virtualization hosts (pve, xgp, nuc12)
- **lxc**: LXC containers (warden, gitea, influxdb, mysql, postgresql)
- **alpine**: Alpine Linux containers (redis, authentik, calibreweb)
- **vm**: Virtual machines (kali)
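To cross-check a group list like this against the inventory file itself, a small shell helper can print the group headers of an INI inventory. This is a minimal sketch: `list_groups` is a hypothetical name, and the sample inventory written to a temporary file is made up for the demonstration. With Ansible installed, `ansible-inventory -i inventory.ini --graph` gives a fuller view.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the group names of an INI-style inventory.
# Sections such as [all:vars] or [group:children] are skipped because
# they contain a colon.
list_groups() {
    awk '/^\[[^]:]+\]$/ { print substr($0, 2, length($0) - 2) }' "$1"
}

# Demonstration on a made-up inventory in a temporary file.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
[dev]
dev1 ansible_host=dev1
[hcp]
hcp1 ansible_host=hcp1
[all:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
EOF
list_groups "$sample"
rm -f "$sample"
```

Comparing this output with the documented groups is a quick way to spot a misspelled or undocumented section name.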
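## 🔧 Configuration Files
### ansible.cfg
Updated to support the new directory structure. It includes:
- Path settings for the new playbooks layout
- SSH connection tuning
- Dynamic inventory support
### inventory.ini
Contains the connection details and group assignments for all hosts.
## 📝 Best Practices
1. **Run by category**: pick the category directory that matches the task
2. **Use host groups**: use the inventory host groups for batch operations
3. **Test first**: test in the development environment before applying to production
4. **Keep logs**: record execution logs for important operations
5. **Maintain regularly**: run the system cleanup and update playbooks on a regular schedule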
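The "test first" and "keep logs" practices can be combined into a two-step rollout. The sketch below only prints the two commands. `plan_rollout` and the `logs/` path are hypothetical names, while `--check`, `--diff`, and `--limit` are standard `ansible-playbook` options. Note that tasks using `shell` or `command` are normally skipped in check mode, so a dry run does not exercise them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a dry run against the dev group, followed by
# the real run against the chosen target with output appended to a dated log.
plan_rollout() {
    local playbook="$1" # e.g. playbooks/01-system/system-update.yml
    local target="$2"   # host or group for the real run
    local log
    log="logs/$(basename "$playbook" .yml)-$(date +%Y%m%d).log" # assumed log location
    echo "ansible-playbook -i inventory.ini $playbook --limit dev --check --diff"
    echo "ansible-playbook -i inventory.ini $playbook --limit $target | tee -a $log"
}

plan_rollout playbooks/01-system/system-update.yml lxc
```

Keeping the dry run and the real run as separate, reviewable commands leaves room to stop after the first step if the reported changes look wrong.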
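## 🆘 Troubleshooting
### Common Problems
1. **SSH connection fails**
   - Check that the host is reachable
   - Verify the SSH key or password
   - Confirm the user's permissions
2. **Playbook execution fails**
   - Check the target host's operating system type
   - Verify that the required packages are installed
   - Read the detailed error log
3. **Permission problems**
   - Confirm that `ansible_become` is configured correctly
   - Verify sudo privileges
### Debugging Commands
```bash
# Test connectivity
ansible all -i inventory.ini -m ping
# Verbose output
ansible-playbook -i inventory.ini playbooks/01-system/system-update.yml -vvv
# Check syntax
ansible-playbook --syntax-check playbooks/01-system/system-update.yml
```
---
*Last updated: $(date '+%Y-%m-%d %H:%M:%S')*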


@@ -1,17 +0,0 @@
[defaults]
inventory = inventory.ini
host_key_checking = False
timeout = 30
gathering = smart
fact_caching = memory
# Support the new playbooks directory structure
roles_path = playbooks/
collections_path = playbooks/
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no
pipelining = True
[inventory]
# Enable plugins to support dynamic inventory
enable_plugins = host_list, script, auto, yaml, ini, toml


@@ -1,60 +0,0 @@
[dev]
dev1 ansible_host=dev1 ansible_user=ben ansible_become=yes ansible_become_pass=3131
dev2 ansible_host=dev2 ansible_user=ben ansible_become=yes ansible_become_pass=3131
[oci_kr]
ch2 ansible_host=ch2 ansible_user=ben ansible_become=yes ansible_become_pass=3131
ch3 ansible_host=ch3 ansible_user=ben ansible_become=yes ansible_become_pass=3131
master ansible_host=master ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131
[oci_us]
ash1d ansible_host=ash1d ansible_user=ben ansible_become=yes ansible_become_pass=3131
ash2e ansible_host=ash2e ansible_user=ben ansible_become=yes ansible_become_pass=3131
ash3c ansible_host=ash3c ansible_user=ben ansible_become=yes ansible_become_pass=3131
[huawei]
hcs ansible_host=hcs ansible_user=ben ansible_become=yes ansible_become_pass=3131
[google]
benwork ansible_host=benwork ansible_user=ben ansible_become=yes ansible_become_pass=3131
[digitalocean]
syd ansible_host=syd ansible_user=ben ansible_become=yes ansible_become_pass=3131
[aws]
#aws linux dnf
awsirish ansible_host=awsirish ansible_user=ben ansible_become=yes ansible_become_pass=3131
[proxmox]
pve ansible_host=pve ansible_user=root ansible_become=yes ansible_become_pass=Aa313131@ben
xgp ansible_host=xgp ansible_user=root ansible_become=yes ansible_become_pass=Aa313131@ben
nuc12 ansible_host=nuc12 ansible_user=root ansible_become=yes ansible_become_pass=Aa313131@ben
[lxc]
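# These containers are concentrated on three machines. Do not upgrade them at the same time or the hosts will hang; schedule upgrades sequentially. (Debian/Ubuntu containers using apt)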
warden ansible_host=warden ansible_user=ben ansible_become=yes ansible_become_pass=3131
gitea ansible_host=gitea ansible_user=root ansible_become=yes ansible_become_pass=313131
influxdb ansible_host=influxdb1 ansible_user=root ansible_become=yes ansible_become_pass=313131
mysql ansible_host=mysql ansible_user=root ansible_become=yes ansible_become_pass=313131
postgresql ansible_host=postgresql ansible_user=root ansible_become=yes ansible_become_pass=313131
[alpine]
#Alpine Linux containers using apk package manager
redis ansible_host=redis ansible_user=root ansible_become=yes ansible_become_pass=313131
authentik ansible_host=authentik ansible_user=root ansible_become=yes ansible_become_pass=313131
calibreweb ansible_host=calibreweb ansible_user=root ansible_become=yes ansible_become_pass=313131
[vm]
kali ansible_host=kali ansible_user=ben ansible_become=yes ansible_become_pass=3131
[hcp]
hcp1 ansible_host=hcp1 ansible_user=root ansible_become=yes ansible_become_pass=313131
hcp2 ansible_host=hcp2 ansible_user=root ansible_become=yes ansible_become_pass=313131
[feiniu]
snail ansible_host=snail ansible_user=houzhongxu ansible_ssh_pass=Aa313131@ben ansible_become=yes ansible_become_pass=Aa313131@ben
[armbian]
onecloud1 ansible_host=onecloud1 ansible_user=ben ansible_ssh_pass=3131 ansible_become=yes ansible_become_pass=3131
[germany]
de ansible_host=de ansible_user=ben ansible_ssh_pass=3131 ansible_become=yes ansible_become_pass=3131
[all:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'


@@ -1,183 +0,0 @@
---
- name: Setup Automated Maintenance Cron Jobs
  hosts: localhost
  gather_facts: no
  vars:
    # Cron job definitions
    cron_jobs:
      # Daily quick check
      - name: "Daily system health check"
        job: "cd /root/mgmt && ./scripts/ops-manager.sh toolkit all --check > /var/log/daily-health-check.log 2>&1"
        minute: "0"
        hour: "8"
        day: "*"
        month: "*"
        weekday: "*"
      # Weekly system cleanup
      - name: "Weekly system cleanup"
        job: "cd /root/mgmt && ./scripts/ops-manager.sh cleanup all > /var/log/weekly-cleanup.log 2>&1"
        minute: "0"
        hour: "2"
        day: "*"
        month: "*"
        weekday: "0"  # Sunday
      # Monthly security check
      - name: "Monthly security hardening check"
        job: "cd /root/mgmt && ./scripts/ops-manager.sh security all --check > /var/log/monthly-security-check.log 2>&1"
        minute: "0"
        hour: "3"
        day: "1"
        month: "*"
        weekday: "*"
      # Weekly certificate check
      - name: "Weekly certificate check"
        job: "cd /root/mgmt && ./scripts/ops-manager.sh cert all > /var/log/weekly-cert-check.log 2>&1"
        minute: "30"
        hour: "4"
        day: "*"
        month: "*"
        weekday: "1"  # Monday
      # Daily Docker cleanup (LXC group only)
      - name: "Daily Docker cleanup for LXC"
        job: "cd /root/mgmt && ansible lxc -i ansible/inventory.ini -m shell -a 'docker system prune -f' --become -e 'ansible_ssh_pass=313131' > /var/log/daily-docker-cleanup.log 2>&1"
        minute: "0"
        hour: "1"
        day: "*"
        month: "*"
        weekday: "*"
      # Weekly network connectivity check
      - name: "Weekly network connectivity check"
        job: "cd /root/mgmt && ./scripts/ops-manager.sh network all > /var/log/weekly-network-check.log 2>&1"
        minute: "0"
        hour: "6"
        day: "*"
        month: "*"
        weekday: "2"  # Tuesday
  tasks:
    # Create the log directory
    - name: Create log directory
      file:
        path: /var/log/ansible-automation
        state: directory
        mode: '0755'
      become: yes
    # Make the script executable
    - name: Make ops-manager.sh executable
      file:
        path: /root/mgmt/scripts/ops-manager.sh
        mode: '0755'
    # Create the cron jobs
    - name: Setup cron jobs for automated maintenance
      cron:
        name: "{{ item.name }}"
        job: "{{ item.job }}"
        minute: "{{ item.minute }}"
        hour: "{{ item.hour }}"
        day: "{{ item.day }}"
        month: "{{ item.month }}"
        weekday: "{{ item.weekday }}"
        user: root
      loop: "{{ cron_jobs }}"
      become: yes
    # Create the log rotation configuration
    - name: Setup log rotation for automation logs
      copy:
        content: |
          /var/log/*-health-check.log
          /var/log/*-cleanup.log
          /var/log/*-security-check.log
          /var/log/*-cert-check.log
          /var/log/*-docker-cleanup.log
          /var/log/*-network-check.log {
              daily
              missingok
              rotate 30
              compress
              delaycompress
              notifempty
              copytruncate
          }
        dest: /etc/logrotate.d/ansible-automation
        mode: '0644'
      become: yes
    # Create the monitoring script
    - name: Create monitoring dashboard script
      copy:
        content: |
          #!/bin/bash
          # Automation Monitoring Dashboard
          echo "🤖 Ansible Automation Status Dashboard"
          echo "======================================"
          echo ""
          echo "📅 Last Execution Times:"
          echo "------------------------"
          for log in /var/log/*-check.log /var/log/*-cleanup.log; do
              if [ -f "$log" ]; then
                  echo "$(basename "$log" .log): $(stat -c %y "$log" | cut -d. -f1)"
              fi
          done
          echo ""
          echo "📊 Recent Log Summary:"
          echo "---------------------"
          for log in /var/log/daily-health-check.log /var/log/weekly-cleanup.log; do
              if [ -f "$log" ]; then
                  echo "=== $(basename "$log") ==="
                  tail -5 "$log" | grep -E "(TASK|PLAY RECAP|ERROR|WARNING)" || echo "No recent activity"
                  echo ""
              fi
          done
          echo "⏰ Next Scheduled Jobs:"
          echo "----------------------"
          crontab -l | grep -E "(health|cleanup|security|cert|docker|network)" | while read line; do
              echo "$line"
          done
          echo ""
          echo "💾 Log File Sizes:"
          echo "-----------------"
          ls -lh /var/log/*-*.log 2>/dev/null | awk '{print $5, $9}' || echo "No log files found"
        dest: /usr/local/bin/automation-status
        mode: '0755'
      become: yes
    # Show the setup summary
    - name: Display setup completion info
      debug:
        msg: |
          🎉 Automated cron jobs are set up!
          📋 Configured jobs:
          • Daily 08:00 - system health check
          • Daily 01:00 - Docker cleanup (LXC group)
          • Every Sunday 02:00 - system cleanup
          • Every Monday 04:30 - certificate check
          • Every Tuesday 06:00 - network connectivity check
          • 1st of each month 03:00 - security check
          📊 Monitoring commands:
          • Show status: automation-status
          • Show cron jobs: crontab -l
          • Follow logs: tail -f /var/log/daily-health-check.log
          📁 Log location: /var/log/
          🔄 Log rotation: logs are cleaned up automatically after 30 days
          💡 Manual run examples:
          • ./scripts/ops-manager.sh toolkit all
          • ./scripts/ops-manager.sh cleanup lxc
          • ./scripts/ops-manager.sh health proxmox


@@ -1,83 +0,0 @@
---
- name: System Cleanup and Maintenance
  hosts: all
  become: yes
  gather_facts: yes
  tasks:
    # Clean the package cache and orphaned packages
    - name: Clean package cache (Debian/Ubuntu)
      apt:
        autoclean: yes
        autoremove: yes
      when: ansible_os_family == "Debian"
    - name: Remove orphaned packages (Debian/Ubuntu)
      shell: apt-get autoremove --purge -y
      when: ansible_os_family == "Debian"
    # Clean up log files
    - name: Clean old journal logs (keep 7 days)
      shell: journalctl --vacuum-time=7d
    - name: Clean old log files
      find:
        paths: /var/log
        patterns: "*.log.*,*.gz"
        age: "7d"
        recurse: yes
      register: old_logs
    - name: Remove old log files
      file:
        path: "{{ item.path }}"
        state: absent
      loop: "{{ old_logs.files }}"
      when: old_logs.files is defined
    # Clean up temporary files
    - name: Clean /tmp directory (files older than 7 days)
      find:
        paths: /tmp
        age: "7d"
        recurse: yes
      register: tmp_files
    - name: Remove old temp files
      file:
        path: "{{ item.path }}"
        state: absent
      loop: "{{ tmp_files.files }}"
      when: tmp_files.files is defined
    # Docker cleanup (if Docker is present)
    - name: Check if Docker is installed
      command: which docker
      register: docker_check
      failed_when: false
      changed_when: false
    - name: Clean Docker system
      shell: |
        docker system prune -f
        docker image prune -f
        docker volume prune -f
      when: docker_check.rc == 0
    # Disk space check
    - name: Check disk usage
      shell: df -h
      register: disk_usage
    - name: Display disk usage
      debug:
        msg: "{{ disk_usage.stdout_lines }}"
    # Memory usage check
    - name: Check memory usage
      shell: free -h
      register: memory_usage
    - name: Display memory usage
      debug:
        msg: "{{ memory_usage.stdout_lines }}"


@@ -1,43 +0,0 @@
---
- name: System Update Playbook
  hosts: all
  become: yes
  gather_facts: yes
  tasks:
    - name: Wait for automatic system updates to complete
      shell: while fuser /var/lib/dpkg/lock-frontend >/dev/null 2>&1; do sleep 5; done
      when: ansible_os_family == "Debian"
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600
      when: ansible_os_family == "Debian"
      retries: 3
      delay: 10
    - name: Upgrade all packages
      apt:
        upgrade: yes
        autoremove: yes
        autoclean: yes
      when: ansible_os_family == "Debian"
      register: upgrade_result
      retries: 3
      delay: 10
    - name: Display upgrade results
      debug:
        # upgrade_result.changed is a boolean, not a package count
        msg: "System upgrade completed. Packages changed: {{ upgrade_result.changed }}"
    - name: Check if reboot is required
      stat:
        path: /var/run/reboot-required
      register: reboot_required
      when: ansible_os_family == "Debian"
    - name: Notify if reboot is required
      debug:
        msg: "System reboot is required to complete the update."
      when: reboot_required.stat.exists is defined and reboot_required.stat.exists


@@ -1,152 +0,0 @@
---
- name: SSL Certificate Management and Monitoring
  hosts: all
  gather_facts: yes
  vars:
    # Common certificate paths
    cert_paths:
      - /etc/ssl/certs
      - /etc/letsencrypt/live
      - /etc/nginx/ssl
      - /etc/apache2/ssl
      - /usr/local/share/ca-certificates
    # Service ports to check
    ssl_services:
      - { name: "HTTPS", port: 443 }
      - { name: "SMTPS", port: 465 }
      - { name: "IMAPS", port: 993 }
      - { name: "LDAPS", port: 636 }
  tasks:
    # Check the certificate directories
    - name: Check certificate directories
      stat:
        path: "{{ item }}"
      register: cert_dirs
      loop: "{{ cert_paths }}"
    - name: List existing certificate directories
      debug:
        msg: "📁 Certificate directory {{ item.item }}: {{ 'EXISTS' if item.stat.exists else 'NOT FOUND' }}"
      loop: "{{ cert_dirs.results }}"
    # Find certificate files
    - name: Find certificate files
      find:
        paths: "{{ cert_paths }}"
        patterns: "*.crt,*.pem,*.cert"
        recurse: yes
      register: cert_files
    - name: Display found certificates
      debug:
        msg: "🔐 Found {{ cert_files.files | length }} certificate files"
    # Check certificate expiration dates
    - name: Check certificate expiration
      shell: |
        if [ -f "{{ item.path }}" ]; then
          openssl x509 -in "{{ item.path }}" -noout -enddate 2>/dev/null | cut -d= -f2
        fi
      register: cert_expiry
      loop: "{{ cert_files.files[:10] }}"  # Only check the first 10 certificates
      failed_when: false
    - name: Display certificate expiration dates
      debug:
        msg: "📅 {{ item.item.path | basename }}: expires {{ item.stdout if item.stdout else 'INVALID/UNREADABLE' }}"
      loop: "{{ cert_expiry.results }}"
      when: item.stdout != ""
    # Check for certificates expiring soon (within 30 days)
    - name: Check certificates expiring soon
      shell: |
        if [ -f "{{ item.path }}" ]; then
          exp_date=$(openssl x509 -in "{{ item.path }}" -noout -enddate 2>/dev/null | cut -d= -f2)
          if [ ! -z "$exp_date" ]; then
            exp_epoch=$(date -d "$exp_date" +%s 2>/dev/null)
            now_epoch=$(date +%s)
            days_left=$(( (exp_epoch - now_epoch) / 86400 ))
            if [ $days_left -lt 30 ]; then
              echo "WARNING: $days_left days left"
            else
              echo "OK: $days_left days left"
            fi
          fi
        fi
      register: cert_warnings
      loop: "{{ cert_files.files[:10] }}"
      failed_when: false
    - name: Display certificate warnings
      debug:
        msg: "⚠️ {{ item.item.path | basename }}: {{ item.stdout }}"
      loop: "{{ cert_warnings.results }}"
      when: item.stdout != "" and "WARNING" in item.stdout
    # Check Let's Encrypt certificates
    - name: Check Let's Encrypt certificates
      shell: certbot certificates 2>/dev/null || echo "Certbot not installed"
      register: letsencrypt_certs
      failed_when: false
    - name: Display Let's Encrypt status
      debug:
        msg: "🔒 Let's Encrypt: {{ letsencrypt_certs.stdout_lines }}"
      when: "'not installed' not in letsencrypt_certs.stdout"
    # Check SSL service ports
    - name: Check SSL service ports
      wait_for:
        port: "{{ item.port }}"
        timeout: 3
      register: ssl_ports
      loop: "{{ ssl_services }}"
      # ignore_errors keeps each item's real 'failed' value;
      # 'failed_when: false' would force it to false for every port.
      ignore_errors: yes
    - name: Display SSL service status
      debug:
        msg: "🔌 {{ item.item.name }} (port {{ item.item.port }}): {{ 'LISTENING' if not item.failed else 'NOT AVAILABLE' }}"
      loop: "{{ ssl_ports.results }}"
    # Test the HTTPS connection
    - name: Test HTTPS connection to localhost
      uri:
        url: "https://{{ ansible_default_ipv4.address }}"
        method: GET
        validate_certs: no
        timeout: 5
      register: https_test
      failed_when: false
      when: ssl_ports.results[0] is defined and not ssl_ports.results[0].failed
    - name: Display HTTPS test result
      debug:
        # A connection error is reported by uri as status -1, so test for a real HTTP status
        msg: "🌐 HTTPS Test: {{ 'SUCCESS' if (https_test.status | default(-1) | int) > 0 else 'FAILED' }}"
      when: https_test is defined
    # Check the certificate chain
    - name: Check certificate chain for HTTPS
      shell: |
        echo | openssl s_client -connect {{ ansible_default_ipv4.address }}:443 -servername {{ ansible_hostname }} 2>/dev/null | openssl x509 -noout -subject -issuer
      register: cert_chain
      failed_when: false
      when: ssl_ports.results[0] is defined and not ssl_ports.results[0].failed
    - name: Display certificate chain info
      debug:
        msg: "🔗 Certificate Chain: {{ cert_chain.stdout_lines }}"
      # rc is absent when the previous task was skipped
      when: cert_chain is defined and (cert_chain.rc | default(1)) == 0
    # Generate the certificate health report
    - name: Generate certificate health summary
      debug:
        msg: |
          🔐 Certificate Health Summary for {{ inventory_hostname }}:
          📁 Certificate directories found: {{ (cert_dirs.results | selectattr('stat.exists') | list | length) }}
          📄 Certificate files found: {{ cert_files.files | length }}
          ⚠️ Certificates expiring soon: {{ (cert_warnings.results | selectattr('stdout', 'search', 'WARNING') | list | length) }}
          🔒 Let's Encrypt: {{ 'Configured' if 'not installed' not in letsencrypt_certs.stdout else 'Not installed' }}
          🌐 SSL Services: {{ (ssl_ports.results | rejectattr('failed') | list | length) }}/{{ ssl_services | length }} available


@@ -1,119 +0,0 @@
---
- name: Security Hardening and Backup
  hosts: all
  become: yes
  gather_facts: yes
  tasks:
    # Enforce SSH security settings
    - name: Check SSH configuration security
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "{{ item.regexp }}"
        line: "{{ item.line }}"
        backup: yes
      loop:
        - { regexp: '^#?PermitRootLogin', line: 'PermitRootLogin no' }
        - { regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no' }
        - { regexp: '^#?X11Forwarding', line: 'X11Forwarding no' }
        - { regexp: '^#?MaxAuthTries', line: 'MaxAuthTries 3' }
      notify: restart ssh
      when: ansible_os_family == "Debian"
    # Check the firewall status
    - name: Check UFW firewall status
      shell: ufw status
      register: ufw_status
      changed_when: false
      failed_when: false
      when: ansible_os_family == "Debian"
    - name: Display firewall status
      debug:
        msg: "🔥 Firewall Status: {{ ufw_status.stdout_lines }}"
      when: ansible_os_family == "Debian" and ufw_status.stdout_lines is defined
    # Check for suspicious logins
    - name: Check for failed login attempts
      shell: grep "Failed password" /var/log/auth.log | tail -10
      register: failed_logins
      changed_when: false
      failed_when: false
    - name: Report suspicious login attempts
      debug:
        msg: "🚨 Recent failed logins: {{ failed_logins.stdout_lines }}"
      when: failed_logins.stdout_lines | length > 0
    # Check root user activity
    - name: Check recent root activity
      shell: grep "sudo.*root" /var/log/auth.log | tail -5
      register: root_activity
      changed_when: false
      failed_when: false
    - name: Display root activity
      debug:
        msg: "👑 Recent root activity: {{ root_activity.stdout_lines }}"
      when: root_activity.stdout_lines | length > 0
    # Back up important configuration files
    - name: Create backup directory
      file:
        path: /backup/configs
        state: directory
        mode: '0700'
    - name: Backup important configuration files
      copy:
        src: "{{ item }}"
        dest: "/backup/configs/{{ item | basename }}.{{ ansible_date_time.epoch }}"
        remote_src: yes
        backup: yes
      loop:
        - /etc/ssh/sshd_config
        - /etc/hosts
        - /etc/fstab
        - /etc/crontab
      failed_when: false
    # Check system integrity
    - name: Check for world-writable files
      shell: find /etc /usr /bin /sbin -type f -perm -002 2>/dev/null | head -10
      register: world_writable
      changed_when: false
    - name: Report world-writable files
      debug:
        msg: "⚠️ World-writable files found: {{ world_writable.stdout_lines }}"
      when: world_writable.stdout_lines | length > 0
    # Check SUID files
    - name: Check for SUID files
      shell: find /usr /bin /sbin -type f -perm -4000 2>/dev/null
      register: suid_files
      changed_when: false
    - name: Display SUID files count
      debug:
        msg: "🔐 Found {{ suid_files.stdout_lines | length }} SUID files"
    # Synchronize the system time
    - name: Sync system time
      shell: timedatectl set-ntp true
      failed_when: false
    - name: Check time synchronization
      shell: timedatectl status
      register: time_status
    - name: Display time sync status
      debug:
        msg: "🕐 Time sync: {{ time_status.stdout_lines | select('match', '.*synchronized.*') | list }}"
  handlers:
    - name: restart ssh
      systemd:
        name: ssh
        state: restarted
      when: ansible_os_family == "Debian"


@@ -1,128 +0,0 @@
---
- name: Docker Container Management
  hosts: all
  become: yes
  gather_facts: yes
  tasks:
    # Check whether Docker is installed
    - name: Check if Docker is installed
      command: which docker
      register: docker_installed
      failed_when: false
      changed_when: false
    - name: Skip Docker tasks if not installed
      debug:
        msg: "Docker not installed on {{ inventory_hostname }}, skipping Docker tasks"
      when: docker_installed.rc != 0
    # Docker system information
    - name: Get Docker system info
      shell: docker system df
      register: docker_system_info
      when: docker_installed.rc == 0
    - name: Display Docker system usage
      debug:
        msg: "🐳 Docker System Usage: {{ docker_system_info.stdout_lines }}"
      when: docker_installed.rc == 0
    # List running containers
    # Go template braces are escaped so that Jinja2 does not try to render them
    - name: List running containers
      shell: docker ps --format "table {{ '{{' }}.Names{{ '}}' }}\t{{ '{{' }}.Status{{ '}}' }}\t{{ '{{' }}.Ports{{ '}}' }}"
      register: running_containers
      when: docker_installed.rc == 0
    - name: Display running containers
      debug:
        msg: "📦 Running Containers: {{ running_containers.stdout_lines }}"
      when: docker_installed.rc == 0
    # List stopped containers
    - name: List stopped containers
      shell: docker ps -a --filter "status=exited" --format "table {{ '{{' }}.Names{{ '}}' }}\t{{ '{{' }}.Status{{ '}}' }}"
      register: stopped_containers
      when: docker_installed.rc == 0
    - name: Display stopped containers
      debug:
        msg: "⏹️ Stopped Containers: {{ stopped_containers.stdout_lines }}"
      when: docker_installed.rc == 0 and stopped_containers.stdout_lines | length > 1
    # List Docker images
    - name: List Docker images
      shell: docker images --format "table {{ '{{' }}.Repository{{ '}}' }}\t{{ '{{' }}.Tag{{ '}}' }}\t{{ '{{' }}.Size{{ '}}' }}"
      register: docker_images
      when: docker_installed.rc == 0
    - name: Display Docker images
      debug:
        msg: "🖼️ Docker Images: {{ docker_images.stdout_lines }}"
      when: docker_installed.rc == 0
    # Check for dangling images
    - name: Check for dangling images
      shell: docker images -f "dangling=true" -q
      register: dangling_images
      when: docker_installed.rc == 0
    - name: Report dangling images
      debug:
        msg: "🗑️ Found {{ dangling_images.stdout_lines | length }} dangling images"
      when: docker_installed.rc == 0
    # List Docker volumes
    - name: List Docker volumes
      shell: docker volume ls
      register: docker_volumes
      when: docker_installed.rc == 0
    - name: Display Docker volumes
      debug:
        msg: "💾 Docker Volumes: {{ docker_volumes.stdout_lines }}"
      when: docker_installed.rc == 0
    # List Docker networks
    - name: List Docker networks
      shell: docker network ls
      register: docker_networks
      when: docker_installed.rc == 0
    - name: Display Docker networks
      debug:
        msg: "🌐 Docker Networks: {{ docker_networks.stdout_lines }}"
      when: docker_installed.rc == 0
    # Check container resource usage
    - name: Check container resource usage
      shell: docker stats --no-stream --format "table {{ '{{' }}.Name{{ '}}' }}\t{{ '{{' }}.CPUPerc{{ '}}' }}\t{{ '{{' }}.MemUsage{{ '}}' }}\t{{ '{{' }}.NetIO{{ '}}' }}"
      register: container_stats
      when: docker_installed.rc == 0
    - name: Display container resource usage
      debug:
        msg: "📊 Container Stats: {{ container_stats.stdout_lines }}"
      when: docker_installed.rc == 0
    # Check the Docker service status
    - name: Check Docker service status
      systemd:
        name: docker
      register: docker_service_status
      when: docker_installed.rc == 0
    - name: Display Docker service status
      debug:
        msg: "🔧 Docker Service: {{ docker_service_status.status.ActiveState }}"
      when: docker_installed.rc == 0
    # Cleanup suggestions
    - name: Suggest cleanup if needed
      debug:
        msg: |
          💡 Cleanup suggestions:
          - Run 'docker system prune -f' to remove unused data
          - Run 'docker image prune -f' to remove dangling images
          - Run 'docker volume prune -f' to remove unused volumes
      when: docker_installed.rc == 0 and (dangling_images.stdout_lines | length > 0 or stopped_containers.stdout_lines | length > 1)


@@ -1,97 +0,0 @@
---
- name: Docker Status Check for HCP Nodes
  hosts: hcp
  gather_facts: yes
  become: yes
  tasks:
    - name: Check if Docker is installed
      command: docker --version
      register: docker_version
      ignore_errors: yes
    - name: Display Docker version
      debug:
        msg: "Docker version: {{ docker_version.stdout }}"
      when: docker_version.rc == 0
    - name: Check Docker service status
      systemd:
        name: docker
      register: docker_service_status
    - name: Display Docker service status
      debug:
        msg: "Docker service is {{ docker_service_status.status.ActiveState }}"
    - name: Check Docker daemon info
      command: docker info --format "{{ '{{' }}.ServerVersion{{ '}}' }}"
      register: docker_info
      ignore_errors: yes
    - name: Display Docker daemon info
      debug:
        msg: "Docker daemon version: {{ docker_info.stdout }}"
      when: docker_info.rc == 0
    - name: Check Docker Swarm status
      command: docker info --format "{{ '{{' }}.Swarm.LocalNodeState{{ '}}' }}"
      register: swarm_status
      ignore_errors: yes
    - name: Display Swarm status
      debug:
        msg: "Swarm status: {{ swarm_status.stdout }}"
      when: swarm_status.rc == 0
    - name: Get Docker Swarm node info (if in swarm)
      command: docker node ls
      register: swarm_nodes
      ignore_errors: yes
      when: swarm_status.stdout == "active"
    - name: Display Swarm nodes
      debug:
        msg: "{{ swarm_nodes.stdout_lines }}"
      # rc is absent when the previous task was skipped
      when: swarm_nodes is defined and (swarm_nodes.rc | default(1)) == 0
    - name: List running containers
      command: docker ps --format "table {{ '{{' }}.Names{{ '}}' }}\t{{ '{{' }}.Status{{ '}}' }}\t{{ '{{' }}.Ports{{ '}}' }}"
      register: running_containers
      ignore_errors: yes
    - name: Display running containers
      debug:
        msg: "{{ running_containers.stdout_lines }}"
      when: running_containers.rc == 0
    - name: Check Docker network list
      command: docker network ls
      register: docker_networks
      ignore_errors: yes
    - name: Display Docker networks
      debug:
        msg: "{{ docker_networks.stdout_lines }}"
      when: docker_networks.rc == 0
    - name: Get Docker system info
      command: docker system df
      register: docker_system_info
      ignore_errors: yes
    - name: Display Docker system usage
      debug:
        msg: "{{ docker_system_info.stdout_lines }}"
      when: docker_system_info.rc == 0
    - name: Check if node is Swarm manager
      command: docker node inspect self --format "{{ '{{' }}.ManagerStatus.Leader{{ '}}' }}"
      register: is_manager
      ignore_errors: yes
      when: swarm_status.stdout == "active"
    - name: Display manager status
      debug:
        msg: "Is Swarm manager: {{ is_manager.stdout }}"
      # rc is absent when the previous task was skipped
      when: is_manager is defined and (is_manager.rc | default(1)) == 0


@@ -1,143 +0,0 @@
---
- name: Network Connectivity and Performance Check
  hosts: all
  gather_facts: yes
  vars:
    test_domains:
      - google.com
      - github.com
      - docker.io
      - tailscale.com
    test_ports:
      - { host: "8.8.8.8", port: 53, name: "Google DNS" }
      - { host: "1.1.1.1", port: 53, name: "Cloudflare DNS" }
      - { host: "github.com", port: 443, name: "GitHub HTTPS" }
      - { host: "docker.io", port: 443, name: "Docker Hub" }
  tasks:
    # Basic network information
    - name: Get network interfaces
      shell: ip addr show | grep -E "^[0-9]+:|inet "
      register: network_interfaces
    - name: Display network interfaces
      debug:
        msg: "🌐 Network Interfaces: {{ network_interfaces.stdout_lines }}"
    # Check the default route
    - name: Check default route
      shell: ip route | grep default
      register: default_route
    - name: Display default route
      debug:
        msg: "🛣️ Default Route: {{ default_route.stdout }}"
    # DNS resolution test
    - name: Test DNS resolution
      shell: nslookup {{ item }} | grep -A2 "Name:"
      register: dns_test
      loop: "{{ test_domains }}"
      failed_when: false
    - name: Display DNS test results
      debug:
        msg: "🔍 DNS Test for {{ item.item }}: {{ 'SUCCESS' if item.rc == 0 else 'FAILED' }}"
      loop: "{{ dns_test.results }}"
    # Network connectivity test
    - name: Test network connectivity (ping)
      shell: ping -c 3 {{ item }}
      register: ping_test
      loop: "{{ test_domains }}"
      failed_when: false
    - name: Display ping test results
      debug:
        msg: "🏓 Ping to {{ item.item }}: {{ 'SUCCESS' if item.rc == 0 else 'FAILED' }}"
      loop: "{{ ping_test.results }}"
    # Port connectivity test
    - name: Test port connectivity
      wait_for:
        host: "{{ item.host }}"
        port: "{{ item.port }}"
        timeout: 5
      register: port_test
      loop: "{{ test_ports }}"
      # ignore_errors keeps each item's real 'failed' value;
      # 'failed_when: false' would force it to false for every port.
      ignore_errors: yes
    - name: Display port test results
      debug:
        msg: "🔌 {{ item.item.name }} ({{ item.item.host }}:{{ item.item.port }}): {{ 'SUCCESS' if not item.failed else 'FAILED' }}"
      loop: "{{ port_test.results }}"
    # Check the Tailscale status
    - name: Check Tailscale status
      shell: tailscale status
      register: tailscale_status
      failed_when: false
    - name: Display Tailscale status
      debug:
        msg: "🔗 Tailscale Status: {{ 'CONNECTED' if tailscale_status.rc == 0 else 'NOT CONNECTED' }}"
    - name: Show Tailscale details
      debug:
        msg: "{{ tailscale_status.stdout_lines }}"
      when: tailscale_status.rc == 0
    # Check the firewall status
    - name: Check UFW status (Ubuntu/Debian)
      shell: ufw status
      register: ufw_status
      failed_when: false
      when: ansible_os_family == "Debian"
    - name: Display UFW status
      debug:
        msg: "🛡️ UFW Firewall: {{ ufw_status.stdout_lines }}"
      when: ansible_os_family == "Debian" and ufw_status.rc == 0
    # Check iptables rules
    - name: Check iptables rules
      shell: iptables -L -n | head -20
      register: iptables_rules
      failed_when: false
      become: yes
    - name: Display iptables summary
      debug:
        msg: "🔥 Iptables Rules: {{ iptables_rules.stdout_lines[:10] }}"
      when: iptables_rules.rc == 0
    # Network performance test
    - name: Test download speed (small file)
      shell: curl -o /dev/null -s -w "%{time_total}" http://speedtest.wdc01.softlayer.com/downloads/test10.zip
      register: download_speed
      failed_when: false
    - name: Display download speed test
      debug:
        msg: "⚡ Download Speed Test: {{ download_speed.stdout }}s for 10MB file"
      when: download_speed.rc == 0
    # Collect network statistics
    - name: Get network statistics
      shell: cat /proc/net/dev | grep -v "lo:" | grep ":"
      register: network_stats
    - name: Display network statistics
      debug:
        msg: "📊 Network Stats: {{ network_stats.stdout_lines }}"
    # Generate the network health report
    - name: Generate network health summary
      debug:
        msg: |
          🌐 Network Health Summary for {{ inventory_hostname }}:
          ✅ DNS Resolution: {{ (dns_test.results | selectattr('rc', 'equalto', 0) | list | length) }}/{{ test_domains | length }} domains
          ✅ Ping Connectivity: {{ (ping_test.results | selectattr('rc', 'equalto', 0) | list | length) }}/{{ test_domains | length }} hosts
          ✅ Port Connectivity: {{ (port_test.results | rejectattr('failed') | list | length) }}/{{ test_ports | length }} ports
          ✅ Tailscale: {{ 'Connected' if tailscale_status.rc == 0 else 'Disconnected' }}


@@ -1,135 +0,0 @@
---
- name: Service Health Check and Monitoring
hosts: all
become: yes
gather_facts: yes
vars:
critical_services:
- ssh
- systemd-resolved
- cron
web_services:
- nginx
- apache2
database_services:
- mysql
- mariadb
- postgresql
container_services:
- docker
- containerd
network_services:
- tailscale
- cloudflared
tasks:
# 检查关键系统服务
- name: Check critical system services
systemd:
name: "{{ item }}"
register: critical_service_status
loop: "{{ critical_services }}"
failed_when: false
- name: Report critical service issues
debug:
msg: "⚠️ Critical service {{ item.item }} is {{ item.status.ActiveState | default('not found') }}"
loop: "{{ critical_service_status.results }}"
when: item.status is defined and item.status.ActiveState != "active"
# 检查 Web 服务
- name: Check web services
systemd:
name: "{{ item }}"
register: web_service_status
loop: "{{ web_services }}"
failed_when: false
- name: Report web service status
debug:
msg: "🌐 Web service {{ item.item }}: {{ item.status.ActiveState | default('not installed') }}"
loop: "{{ web_service_status.results }}"
when: item.status is defined
# Check database services
- name: Check database services
systemd:
name: "{{ item }}"
register: db_service_status
loop: "{{ database_services }}"
failed_when: false
- name: Report database service status
debug:
msg: "🗄️ Database service {{ item.item }}: {{ item.status.ActiveState | default('not installed') }}"
loop: "{{ db_service_status.results }}"
when: item.status is defined
# Check container services
- name: Check container services
systemd:
name: "{{ item }}"
register: container_service_status
loop: "{{ container_services }}"
failed_when: false
- name: Report container service status
debug:
msg: "📦 Container service {{ item.item }}: {{ item.status.ActiveState | default('not installed') }}"
loop: "{{ container_service_status.results }}"
when: item.status is defined
# Check network services
- name: Check network services
systemd:
name: "{{ item }}"
register: network_service_status
loop: "{{ network_services }}"
failed_when: false
- name: Report network service status
debug:
msg: "🌐 Network service {{ item.item }}: {{ item.status.ActiveState | default('not installed') }}"
loop: "{{ network_service_status.results }}"
when: item.status is defined
# Check system load
- name: Check system load
shell: uptime
register: system_load
- name: Display system load
debug:
msg: "📊 System Load: {{ system_load.stdout }}"
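# Check disk space warnings ($5 + 0 drops the trailing % so the comparison is numeric; NR > 1 skips the header)
- name: Check disk space usage
shell: df -h | awk 'NR > 1 && $5 + 0 > 80 {print $0}'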
register: disk_warning
changed_when: false
- name: Warn about high disk usage
debug:
msg: "⚠️ High disk usage detected: {{ disk_warning.stdout_lines }}"
when: disk_warning.stdout_lines | length > 0
# Check memory usage percentage
- name: Check memory usage percentage
shell: free | awk 'NR==2{printf "%.2f%%", $3*100/$2}'
register: memory_percent
- name: Display memory usage
debug:
msg: "🧠 Memory Usage: {{ memory_percent.stdout }}"
# Check recent system errors
- name: Check recent system errors
shell: journalctl --since "1 hour ago" --priority=err --no-pager | tail -10
register: recent_errors
changed_when: false
- name: Display recent errors
debug:
msg: "🚨 Recent system errors: {{ recent_errors.stdout_lines }}"
when: recent_errors.stdout_lines | length > 0
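
The disk-space check in this playbook compares the `Use%` column of `df` against a threshold. Because that column carries a trailing `%`, the value has to be coerced to a number before comparing. The following is a minimal sketch of that filter, assuming `df -P`-style input on stdin; the function name `high_disk_usage` and the canned sample rows are illustrative and not part of the playbook.

```shell
#!/usr/bin/env bash
# Sketch: print mount points whose usage exceeds a threshold.
# `high_disk_usage` is a hypothetical helper, not part of the playbook.
high_disk_usage() {
  local threshold="${1:-80}"
  # NR > 1 skips the header row; $5 + 0 turns "85%" into the number 85.
  awk -v limit="$threshold" 'NR > 1 && $5 + 0 > limit { print $6 " " $5 }'
}

# Canned input so the sketch runs without touching real disks.
printf '%s\n' \
  'Filesystem 1024-blocks Used Available Capacity Mounted on' \
  '/dev/sda1 100 85 15 85% /' \
  '/dev/sdb1 100 9 91 9% /data' | high_disk_usage 80
```

On a real host the same function could be fed with `df -P | high_disk_usage 80`.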

@@ -1,72 +0,0 @@
---
- name: Cloud Providers System Update Playbook
hosts: huawei,google,digitalocean,aws
become: yes
gather_facts: yes
tasks:
# Ubuntu/Debian system update (apt)
- name: Update apt cache (Ubuntu/Debian)
apt:
update_cache: yes
cache_valid_time: 3600
when: ansible_os_family == "Debian"
- name: Upgrade all packages (Ubuntu/Debian)
apt:
upgrade: yes
autoremove: yes
autoclean: yes
when: ansible_os_family == "Debian"
register: apt_upgrade_result
# AWS Linux system update (dnf)
- name: Update dnf cache (AWS Linux/RHEL)
dnf:
update_cache: yes
when: ansible_os_family == "RedHat"
- name: Upgrade all packages (AWS Linux/RHEL)
dnf:
name: "*"
state: latest
skip_broken: yes
when: ansible_os_family == "RedHat"
register: dnf_upgrade_result
# Display upgrade results ("changed" is a boolean flag, not a package count)
- name: Display apt upgrade results
debug:
msg: "APT system upgrade completed. Packages changed: {{ apt_upgrade_result.changed }}"
when: ansible_os_family == "Debian" and apt_upgrade_result is defined
- name: Display dnf upgrade results
debug:
msg: "DNF system upgrade completed. Packages changed: {{ dnf_upgrade_result.changed }}"
when: ansible_os_family == "RedHat" and dnf_upgrade_result is defined
# Check whether a reboot is required (Ubuntu/Debian)
- name: Check if reboot is required (Ubuntu/Debian)
stat:
path: /var/run/reboot-required
register: debian_reboot_required
when: ansible_os_family == "Debian"
# Check whether a reboot is required (AWS Linux/RHEL)
- name: Check if reboot is required (AWS Linux/RHEL)
command: needs-restarting -r
register: rhel_reboot_required
failed_when: false
changed_when: false
when: ansible_os_family == "RedHat"
# Report whether a reboot is required
- name: Notify if reboot is required (Ubuntu/Debian)
debug:
msg: "System reboot is required to complete the update."
when: ansible_os_family == "Debian" and debian_reboot_required.stat.exists is defined and debian_reboot_required.stat.exists
- name: Notify if reboot is required (AWS Linux/RHEL)
debug:
msg: "System reboot is required to complete the update."
when: ansible_os_family == "RedHat" and rhel_reboot_required.rc == 1

@@ -1,131 +0,0 @@
---
- name: Operations Toolkit - Unified Management Dashboard
hosts: all
gather_facts: yes
vars:
# Available operations scripts
available_scripts:
- { name: "system-update", desc: "System package updates", file: "system-update.yml" }
- { name: "system-cleanup", desc: "System cleanup and maintenance", file: "system-cleanup.yml" }
- { name: "service-health", desc: "Service health monitoring", file: "service-health-check.yml" }
- { name: "security-hardening", desc: "Security hardening and backup", file: "security-hardening.yml" }
- { name: "docker-management", desc: "Docker container management", file: "docker-management.yml" }
- { name: "network-connectivity", desc: "Network connectivity check", file: "network-connectivity.yml" }
- { name: "certificate-management", desc: "SSL certificate monitoring", file: "certificate-management.yml" }
tasks:
# Display system overview
- name: Display system overview
debug:
msg: |
🖥️ System Overview for {{ inventory_hostname }}:
📊 OS: {{ ansible_distribution }} {{ ansible_distribution_version }}
💾 Memory: {{ (ansible_memtotal_mb/1024)|round(1) }}GB total, {{ (ansible_memfree_mb/1024)|round(1) }}GB free
💿 CPU: {{ ansible_processor_vcpus }} cores
🏠 Architecture: {{ ansible_architecture }}
🌐 IP: {{ ansible_default_ipv4.address }}
⏰ Uptime: {{ ansible_uptime_seconds//86400 }}d {{ (ansible_uptime_seconds%86400)//3600 }}h {{ ((ansible_uptime_seconds%3600)//60) }}m
# Quick system status check
- name: Quick system status check
shell: |
echo "=== DISK USAGE ==="
df -h | grep -E "(Filesystem|/dev/)"
echo ""
echo "=== MEMORY USAGE ==="
free -h
echo ""
echo "=== LOAD AVERAGE ==="
uptime
echo ""
echo "=== TOP PROCESSES ==="
ps aux --sort=-%cpu | head -6
register: quick_status
- name: Display quick status
debug:
msg: "{{ quick_status.stdout_lines }}"
# Check critical service status
- name: Check critical services
systemd:
name: "{{ item }}"
register: service_status
loop:
- ssh
- systemd-resolved
- cron
failed_when: false
- name: Display service status
debug:
msg: "🔧 {{ item.item }}: {{ item.status.ActiveState if item.status is defined else 'NOT FOUND' }}"
loop: "{{ service_status.results }}"
# Check recent system log errors
- name: Check recent system errors
shell: journalctl --since "1 hour ago" --priority=err --no-pager | tail -10
register: recent_errors
failed_when: false
- name: Display recent errors
debug:
msg: "🚨 Recent Errors: {{ recent_errors.stdout_lines if recent_errors.stdout_lines else ['No recent errors found'] }}"
# Check network connectivity
- name: Quick network check
shell: |
echo "=== NETWORK INTERFACES ==="
ip -br addr show
echo ""
echo "=== DEFAULT ROUTE ==="
ip route | grep default
echo ""
echo "=== DNS TEST ==="
nslookup google.com | grep -A1 "Name:" || echo "DNS resolution failed"
register: network_check
failed_when: false
- name: Display network status
debug:
msg: "🌐 Network Status: {{ network_check.stdout_lines }}"
# Display available operations scripts
- name: Display available operations scripts
debug:
msg: |
🛠️ Available Operations Scripts:
{% for script in available_scripts %}
{{ loop.index }}. {{ script.name }}: {{ script.desc }}
{% endfor %}
💡 Usage Examples:
ansible-playbook -i inventory.ini system-cleanup.yml --limit {{ inventory_hostname }}
ansible-playbook -i inventory.ini docker-management.yml --limit lxc
ansible-playbook -i inventory.ini network-connectivity.yml --limit proxmox
# Generate maintenance recommendations
- name: Generate maintenance recommendations
debug:
msg: |
💡 Maintenance Recommendations for {{ inventory_hostname }}:
🔄 Regular Tasks (Weekly):
- Run system-cleanup.yml to free up disk space
- Check service-health-check.yml for service status
- Review certificate-management.yml for expiring certificates
🔒 Security Tasks (Monthly):
- Execute security-hardening.yml for security updates
- Review network-connectivity.yml for network security
🐳 Container Tasks (As needed):
- Use docker-management.yml for Docker maintenance
📊 Monitoring Tasks (Daily):
- Quick check with ops-toolkit.yml (this script)
⚡ Emergency Tasks:
- Use system-update.yml for critical security patches
- Run network-connectivity.yml for connectivity issues
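
The system overview task formats uptime from a raw second count using integer division and remainders: days are `seconds // 86400`, hours are the remainder of that divided by 3600, and minutes are `(seconds % 3600) // 60`. A minimal sketch of the same arithmetic in shell follows; `format_uptime` is a hypothetical helper name used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: format a second count as "<days>d <hours>h <minutes>m".
# `format_uptime` is a hypothetical helper mirroring the playbook's Jinja2 arithmetic.
format_uptime() {
  local seconds="$1"
  # Shell arithmetic is integer-only, so / behaves like Jinja2's // here.
  echo "$(( seconds / 86400 ))d $(( seconds % 86400 / 3600 ))h $(( seconds % 3600 / 60 ))m"
}

# 90061 s = 1 day + 1 hour + 1 minute + 1 second
format_uptime 90061   # prints "1d 1h 1m"
```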

@@ -1,109 +0,0 @@
#!/bin/bash
# Ansible Playbooks categorized runner script
# Usage: ./run-playbook.sh [category] [playbook] [hosts]
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PLAYBOOKS_DIR="$SCRIPT_DIR/playbooks"
# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Show usage help
show_help() {
echo -e "${BLUE}Ansible Playbooks categorized runner script${NC}"
echo ""
echo "Usage:"
echo " $0 [category] [playbook] [hosts]"
echo ""
echo "Available categories:"
echo -e " ${GREEN}01-system${NC} - System management (updates, cleanup, cron jobs)"
echo -e " ${GREEN}02-security${NC} - Security management (hardening, certificate management)"
echo -e " ${GREEN}03-services${NC} - Service management (Docker, container services)"
echo -e " ${GREEN}04-monitoring${NC} - Monitoring checks (health checks, network connectivity)"
echo -e " ${GREEN}05-cloud${NC} - Cloud-provider specific"
echo -e " ${GREEN}99-tools${NC} - Tools and integrations"
echo ""
echo "Examples:"
echo " $0 list # List all available playbooks"
echo " $0 01-system system-update.yml all # Run the system update on all hosts"
echo " $0 03-services docker-status-check.yml hcp # Check Docker status on the hcp group"
echo " $0 04-monitoring network-connectivity.yml dev1 # Check network connectivity on host dev1"
}
# List all available playbooks
list_playbooks() {
echo -e "${BLUE}Available Ansible Playbooks:${NC}"
echo ""
for category in $(ls -1 "$PLAYBOOKS_DIR" | sort); do
if [ -d "$PLAYBOOKS_DIR/$category" ]; then
echo -e "${GREEN}📁 $category${NC}"
for playbook in $(ls -1 "$PLAYBOOKS_DIR/$category"/*.yml 2>/dev/null | sort); do
if [ -f "$playbook" ]; then
basename_playbook=$(basename "$playbook")
echo -e " └── ${YELLOW}$basename_playbook${NC}"
fi
done
echo ""
fi
done
}
# Run the specified playbook
run_playbook() {
local category="$1"
local playbook="$2"
local hosts="$3"
local playbook_path="$PLAYBOOKS_DIR/$category/$playbook"
if [ ! -f "$playbook_path" ]; then
echo -e "${RED}Error: playbook file does not exist: $playbook_path${NC}"
exit 1
fi
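echo -e "${GREEN}Running playbook:${NC} $category/$playbook"
echo -e "${GREEN}Target hosts:${NC} $hosts"
echo ""
# Run ansible-playbook, resolving the inventory relative to this script
# so the command also works when invoked from another directory
ansible-playbook -i "$SCRIPT_DIR/inventory.ini" "$playbook_path" --limit "$hosts"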
}
# Main logic
case "${1:-}" in
"help"|"-h"|"--help"|"")
show_help
;;
"list"|"ls")
list_playbooks
;;
*)
if [ $# -lt 3 ]; then
echo -e "${RED}Error: not enough arguments${NC}"
echo ""
show_help
exit 1
fi
category="$1"
playbook="$2"
hosts="$3"
if [ ! -d "$PLAYBOOKS_DIR/$category" ]; then
echo -e "${RED}Error: category directory does not exist: $category${NC}"
echo ""
list_playbooks
exit 1
fi
run_playbook "$category" "$playbook" "$hosts"
;;
esac
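
The listing function in this script walks the category directories by capturing the output of `ls`, which splits on whitespace. A glob-based walk avoids that. The sketch below shows one way to do it, assuming the same `playbooks/<category>/*.yml` layout; `list_playbooks_in` is a hypothetical helper name, and the colored output of the original is omitted for brevity.

```shell
#!/usr/bin/env bash
# Sketch: list playbooks per category using shell globs instead of `ls`.
# `list_playbooks_in` is a hypothetical helper, not part of run-playbook.sh.
list_playbooks_in() {
  local root="$1" category playbook
  for category in "$root"/*/; do
    [ -d "$category" ] || continue      # no categories: the glob stays literal
    echo "$(basename "$category")"
    for playbook in "$category"*.yml; do
      [ -f "$playbook" ] || continue    # empty category: skip the literal pattern
      echo "  $(basename "$playbook")"
    done
  done
}
```

Called as `list_playbooks_in "$PLAYBOOKS_DIR"`, it prints each category followed by its indented playbook file names, in the shell's glob sort order.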

@@ -1,123 +0,0 @@
#!/bin/bash
# Ansible Playbook Runner Script
# Usage: ./run.sh -dev (or any group name)
# Set script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
INVENTORY_FILE="$SCRIPT_DIR/inventory.ini"
PLAYBOOK_FILE="$SCRIPT_DIR/system-update.yml"
# Function to display usage
show_usage() {
echo "Usage: $0 -<group_name>"
echo ""
echo "Examples:"
echo " $0 -dev # Run on dev group (dev1, dev2)"
echo " $0 -prod # Run on prod group"
echo " $0 -all # Run on all hosts"
echo ""
echo "Available groups in inventory:"
grep '^\[' "$INVENTORY_FILE" | grep -v ':vars' | sed 's/\[//g' | sed 's/\]//g' | sort
}
# Function to check if group exists in inventory
check_group_exists() {
local group_name="$1"
if [ "$group_name" = "all" ]; then
return 0
fi
if grep -q "^\[$group_name\]" "$INVENTORY_FILE"; then
return 0
else
return 1
fi
}
# Function to run ansible playbook
run_playbook() {
local group_name="$1"
echo "========================================="
echo "Running Ansible Playbook on group: $group_name"
echo "========================================="
echo "Inventory: $INVENTORY_FILE"
echo "Playbook: $PLAYBOOK_FILE"
echo "Target: $group_name"
echo "========================================="
echo ""
# Set environment variables for better output
export LANG=C
export ANSIBLE_HOST_KEY_CHECKING=False
# Run the playbook
cd "$SCRIPT_DIR"
ansible-playbook -i "$INVENTORY_FILE" "$PLAYBOOK_FILE" --limit "$group_name" -v
local exit_code=$?
echo ""
echo "========================================="
if [ $exit_code -eq 0 ]; then
echo "✅ Playbook execution completed successfully!"
else
echo "❌ Playbook execution failed with exit code: $exit_code"
fi
echo "========================================="
return $exit_code
}
# Main script logic
main() {
# Check if argument is provided
if [ $# -eq 0 ]; then
echo "❌ Error: No group specified"
echo ""
show_usage
exit 1
fi
# Parse argument
local arg="$1"
if [[ "$arg" =~ ^-(.+)$ ]]; then
local group_name="${BASH_REMATCH[1]}"
else
echo "❌ Error: Invalid argument format. Use -<group_name>"
echo ""
show_usage
exit 1
fi
# Check if files exist
if [ ! -f "$INVENTORY_FILE" ]; then
echo "❌ Error: Inventory file not found: $INVENTORY_FILE"
exit 1
fi
if [ ! -f "$PLAYBOOK_FILE" ]; then
echo "❌ Error: Playbook file not found: $PLAYBOOK_FILE"
exit 1
fi
# Check if group exists
if ! check_group_exists "$group_name"; then
echo "❌ Error: Group '$group_name' not found in inventory"
echo ""
show_usage
exit 1
fi
# Run the playbook
run_playbook "$group_name"
}
# Handle help argument
if [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
show_usage
exit 0
fi
# Run main function
main "$@"
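
The group check in this script interpolates the group name into a regular expression, so a name containing characters such as `.` would be treated as a pattern. A minimal sketch of a stricter variant is shown below, matching the whole `[group]` header line as a fixed string. The helper name `group_exists` and its two-argument form are assumptions for illustration, and the sketch assumes group headers have no trailing whitespace.

```shell
#!/usr/bin/env bash
# Sketch: exact, fixed-string check for an INI inventory group header.
# `group_exists` is a hypothetical helper, not part of run.sh.
group_exists() {
  local group="$1" inventory="$2"
  # "all" is implicit in Ansible and never appears as a header.
  if [ "$group" = "all" ]; then
    return 0
  fi
  # -x matches the whole line, -F disables regex interpretation,
  # so "[dev:vars]" or "[dev:children]" do not count as group "dev".
  grep -qxF "[$group]" "$inventory"
}
```

For example, `group_exists dev "$INVENTORY_FILE"` succeeds only if the inventory contains a line that is exactly `[dev]`.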