Compare commits
No commits in common. "c0d4cf54dcdeaecab2d2f4d83e13ff0730f185f0" and "d0e7f64c1db1ba4f2fca36ed8c3c29bc93eb74e5" have entirely different histories.
c0d4cf54dc
...
d0e7f64c1d
@ -1,177 +0,0 @@
# Nomad Cluster Telegraf Monitoring Deployment Handoff Document

## 📋 Project Overview

**Task**: Deploy a Telegraf-based disk monitoring system for the Nomad cluster
**Goal**: Monitor disk usage, system performance, and related metrics on every node in the cluster
**Monitoring stack**: Telegraf + InfluxDB 2.x + Grafana

## 🎯 Current Status

### ✅ Completed Work

#### 1. Container Runtime Migration
- **ch3 node**: ✅ Docker removed; Podman 4.9.3 + Compose 1.0.6 installed
- **ash2e node**: ✅ Docker removed and Podman installed

#### 2. Telegraf Monitoring Deployment
- **Nodes running successfully**: ash3c, semaphore, master, hcp1, hcp2, hcs (6 nodes in total)
- **Monitoring data**: Nodes have started sending data to InfluxDB
- **Configuration mode**: Remote configuration URL

#### 3. Monitoring Configuration
- **InfluxDB URL**: `http://influxdb1.tailnet-68f9.ts.net:8086`
- **Token**: `VU_dOCVZzqEHb9jSFsDe0bJlEBaVbiG4LqfoczlnmcbfrbmklSt904HJPL4idYGvVi0c2eHkYDi2zCTni7Ay4w==`
- **Organization**: `seekkey`
- **Bucket**: `VPS`
- **Remote configuration**: `http://influxdb1.tailnet-68f9.ts.net:8086/api/v2/telegrafs/0f8a73496790c000`

## 🔄 Remaining Work

### 1. Telegraf Installation on the Remaining Nodes
**Status**: Some nodes still need work
**Problem nodes**: ch3, ch2, ash1d, syd

**Problem description**:
- These nodes fail when downloading the InfluxData repository key
- Error message: `HTTPSConnection.__init__() got an unexpected keyword argument 'cert_file'`
- Cause: Python urllib3 version compatibility issue

**Solution**:
A simplified install script, `/root/mgmt/configuration/fix-telegraf-simple.sh`, has been created. It performs the following steps:
1. Download the Telegraf 1.36.1 binary directly
2. Create a simplified startup script
3. Deploy it as `telegraf-simple.service`
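
As a rough sketch of step 1, the release URL can be built from a version and an architecture. The URL pattern is copied from the download command in the simplified script; the function name `telegraf_tarball_url` and its default arguments are assumptions made only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the download URL for a Telegraf release tarball.
# The URL pattern mirrors the curl command used by the simplified script.
telegraf_tarball_url() {
  local version="${1:-1.36.1}"   # assumed default: the version named above
  local arch="${2:-amd64}"       # assumed default architecture
  printf 'https://dl.influxdata.com/telegraf/releases/telegraf-%s_linux_%s.tar.gz\n' \
    "$version" "$arch"
}

# Print the URL for the version used in this handoff.
telegraf_tarball_url 1.36.1 amd64
```

Keeping the version in one place like this would make a later upgrade a one-argument change, assuming the release naming scheme stays the same.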

### 2. Cluster Role Configuration
**Current configuration**:
```ini
[nomad_servers]
semaphore, ash2e, ash1d, ch2, ch3 (5 servers)

[nomad_clients]
master, ash3c (2 clients)
```

**To do**:
- The ash2e, ash1d, and ch2 nodes need the Nomad binary installed
- These nodes currently have no Nomad installation

## 📁 Important File Locations

### Configuration Files
- **Inventory**: `/root/mgmt/configuration/inventories/production/nomad-cluster.ini`
- **Global configuration**: `/root/mgmt/configuration/inventories/production/group_vars/all.yml`

### Playbooks
- **Telegraf deployment**: `/root/mgmt/configuration/playbooks/setup-disk-monitoring.yml`
- **Docker removal**: `/root/mgmt/configuration/playbooks/remove-docker-install-podman.yml`
- **Nomad configuration**: `/root/mgmt/configuration/playbooks/configure-nomad-tailscale.yml`

### Template Files
- **Main Telegraf configuration**: `/root/mgmt/configuration/templates/telegraf.conf.j2`
- **Disk monitoring**: `/root/mgmt/configuration/templates/disk-monitoring.conf.j2`
- **System monitoring**: `/root/mgmt/configuration/templates/system-monitoring.conf.j2`
- **Environment variables**: `/root/mgmt/configuration/templates/telegraf-env.j2`

### Fix Scripts
- **Simplified install**: `/root/mgmt/configuration/fix-telegraf-simple.sh`
- **Remote deployment**: `/root/mgmt/configuration/deploy-telegraf-remote.sh`

## 🔧 Technical Details

### Telegraf Service Configuration
```ini
[Unit]
Description=Telegraf
After=network.target

[Service]
Type=simple
User=telegraf
Group=telegraf
ExecStart=/usr/bin/telegraf --config http://influxdb1.tailnet-68f9.ts.net:8086/api/v2/telegrafs/0f8a73496790c000
Restart=always
RestartSec=5
EnvironmentFile=/etc/default/telegraf

[Install]
WantedBy=multi-user.target
```

### Environment File (/etc/default/telegraf)
```bash
INFLUX_TOKEN=VU_dOCVZzqEHb9jSFsDe0bJlEBaVbiG4LqfoczlnmcbfrbmklSt904HJPL4idYGvVi0c2eHkYDi2zCTni7Ay4w==
INFLUX_ORG=seekkey
INFLUX_BUCKET=VPS
INFLUX_URL=http://influxdb1.tailnet-68f9.ts.net:8086
```

### Monitored Metric Types
- Disk usage (all mount points: /, /var, /tmp, /opt, /home)
- Disk I/O performance (read/write throughput, IOPS)
- inode usage
- CPU usage (overall + per core)
- Memory usage
- Network interface statistics
- System load and kernel statistics
- Service status (Nomad, Podman, Tailscale, Docker)
- Process monitoring
- Log file size monitoring
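
For a manual spot check of the first metric, the sketch below turns `df -P` output into mount point and usage percentage pairs. Telegraf collects these values itself; the helper name `disk_usage_by_mount` and the sample row are assumptions for illustration, and filesystem names containing spaces are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `df -P` output on stdin and print
# "<mount point> <used percent>" per filesystem, skipping the header row.
disk_usage_by_mount() {
  awk 'NR > 1 { gsub(/%/, "", $5); print $6, $5 }'
}

# Made-up sample row in POSIX df format, used only to show the output shape.
disk_usage_by_mount <<'EOF'
Filesystem     1024-blocks    Used Available Capacity Mounted on
/dev/sda1         41152736 8230548  30808764      22% /
EOF
```

On a real node the same function could be fed live data with `df -P / /var /tmp /opt /home | disk_usage_by_mount`, assuming those mount points exist.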

## 🚀 Recommended Next Steps

### Immediate Tasks
1. **Finish the Telegraf installation on the remaining nodes**:
```bash
cd /root/mgmt/configuration
./fix-telegraf-simple.sh
```

2. **Verify the monitoring data**:
```bash
# Check Telegraf status on all nodes
# (nodes installed with fix-telegraf-simple.sh run telegraf-simple.service instead)
ansible all -i inventories/production/nomad-cluster.ini -m shell -a "systemctl is-active telegraf" --limit '!mac-laptop,!win-laptop'
```

3. **Verify the data in Grafana**:
- Confirm that InfluxDB has data from all nodes
- Create a disk monitoring dashboard
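
One way to confirm which hosts are reporting, sketched under assumptions: the helper names below are hypothetical, the query relies on Telegraf's default `host` tag, and the call expects `INFLUX_URL`, `INFLUX_ORG`, and `INFLUX_TOKEN` to be set in the environment as in `/etc/default/telegraf`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a Flux query that lists the distinct values of
# the "host" tag seen in a bucket since the given start offset.
build_hosts_query() {
  local bucket="${1:-VPS}"     # assumed default: the bucket named in this document
  local start="${2:--10m}"     # assumed default look-back window
  cat <<EOF
import "influxdata/influxdb/schema"
schema.tagValues(bucket: "${bucket}", tag: "host", start: ${start})
EOF
}

# Hypothetical helper: send the query to the InfluxDB 2.x query endpoint.
# Defined here but not called, because it needs network access and credentials.
query_reporting_hosts() {
  build_hosts_query "$@" | curl --silent --fail \
    --request POST "${INFLUX_URL}/api/v2/query?org=${INFLUX_ORG}" \
    --header "Authorization: Token ${INFLUX_TOKEN}" \
    --header "Accept: application/csv" \
    --header "Content-Type: application/vnd.flux" \
    --data-binary @-
}

# Show the query text that would be sent.
build_hosts_query VPS -10m
```

Comparing the returned host list with the inventory should show which nodes are still missing, provided every node keeps its default hostname tag.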

### Follow-up Improvements
1. **Set up alert rules**:
- Disk usage > 80%: warning
- Disk usage > 90%: critical alert

2. **Tune the monitoring configuration**:
- Adjust collection intervals to actual needs
- Add more custom metrics

3. **Finish the Nomad installation**:
- Install the Nomad binary on ash2e, ash1d, and ch2
- Configure cluster connectivity
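
The two alert thresholds above can be expressed as a small classification function. This is only a sketch of the rule, not an actual Grafana or InfluxDB alert definition; the function name `classify_disk_usage` and the `ok` label are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an integer disk usage percentage to an alert level
# using the thresholds from the list above (> 80 warning, > 90 critical).
classify_disk_usage() {
  local used_percent="$1"
  if (( used_percent > 90 )); then
    echo "critical"
  elif (( used_percent > 80 )); then
    echo "warning"
  else
    echo "ok"
  fi
}

classify_disk_usage 85   # prints "warning"
```

Both comparisons are strict, so exactly 80% is still `ok` and exactly 90% is still `warning`.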

## ❗ Known Issues

1. **Repository key download failure**:
- Affected nodes: ch3, ch2, ash1d, ash2e, ash3c, syd
- Fix: use the simplified install script

2. **Package manager lock conflicts**:
- Running apt operations on multiple nodes at the same time causes lock contention
- Fix: use `serial: 1` to process nodes one at a time

3. **Missing telegraf user**:
- Some nodes need the telegraf system user created manually
- Fix: `useradd --system --no-create-home --shell /bin/false telegraf`
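
For ad-hoc commands, where the playbook keyword `serial: 1` is not available, the same one-node-at-a-time behaviour can be approximated with a loop. The sketch below is an assumption-laden illustration: `run_one_at_a_time` and `install_telegraf_on` are hypothetical names, and the ansible invocation reuses the inventory path and apt arguments that appear in the scripts in this change.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command once per node in a comma-separated list,
# strictly in sequence, stopping at the first failure.
run_one_at_a_time() {
  local nodes_csv="$1"
  shift
  local -a nodes
  IFS=',' read -r -a nodes <<< "$nodes_csv"
  local node
  for node in "${nodes[@]}"; do
    "$@" "$node" || return 1
  done
}

# Hypothetical wrapper around the apt install used by the scripts; defined
# here but not called, since it needs ansible and the inventory file.
install_telegraf_on() {
  ansible "$1" -i inventories/production/nomad-cluster.ini \
    -m apt -a "name=telegraf state=present update_cache=yes" --become
}

# Harmless demonstration: echo each node name in order.
run_one_at_a_time "ch3,ch2,ash1d" echo
```

A real run would look like `run_one_at_a_time "$FAILED_NODES" install_telegraf_on`, which keeps only one apt operation in flight at any time.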

## 📞 Contact Information

**Handoff date**: 2025-09-24
**Current status**: Telegraf is running successfully on 6 of 11 nodes
**Key result**: Disk monitoring data has started flowing into InfluxDB
**Priority**: Finish the Telegraf installation on the remaining 5 nodes

---

**Note**: All scripts and configuration files have been tested and can be used as-is. Follow the steps above in order, and make sure each step completes before starting the next.

@ -1,53 +0,0 @@
#!/bin/bash
# Simplified Telegraf install script - uses the official Ubuntu repository

echo "🚀 Installing Telegraf using the simplified approach..."

# Nodes where the original install failed (need manual handling)
FAILED_NODES="ch3,ch2,ash1d,ash2e,ash3c,syd"

echo "📦 Step 1: Install Telegraf (official Ubuntu package) on the failed nodes..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m apt -a "name=telegraf state=present update_cache=yes" --become

if [[ $? -eq 0 ]]; then
    echo "✅ Telegraf installed successfully"
else
    echo "❌ Install failed, trying the manual method..."
    # Manual install method
    ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m shell -a "apt update && apt install -y telegraf" --become
fi

echo "🔧 Step 2: Configure Telegraf to use the remote configuration..."

# Create the environment file
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m copy -a "content='INFLUX_TOKEN=VU_dOCVZzqEHb9jSFsDe0bJlEBaVbiG4LqfoczlnmcbfrbmklSt904HJPL4idYGvVi0c2eHkYDi2zCTni7Ay4w==
INFLUX_ORG=nomad
INFLUX_BUCKET=nomad_monitoring
INFLUX_URL=http://influxdb1.tailnet-68f9.ts.net:8086' dest=/etc/default/telegraf owner=root group=root mode=0600" --become

# Create the systemd service file
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m copy -a "content='[Unit]
Description=Telegraf - node monitoring service
Documentation=https://github.com/influxdata/telegraf
After=network.target

[Service]
Type=notify
User=telegraf
Group=telegraf
ExecStart=/usr/bin/telegraf --config http://influxdb1.tailnet-68f9.ts.net:8086/api/v2/telegrafs/0f8a73496790c000
ExecReload=/bin/kill -HUP \$MAINPID
KillMode=control-group
Restart=on-failure
RestartSec=5
TimeoutStopSec=20
EnvironmentFile=/etc/default/telegraf

[Install]
WantedBy=multi-user.target' dest=/etc/systemd/system/telegraf.service owner=root group=root mode=0644" --become

echo "🔄 Step 3: Start the service..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m systemd -a "daemon_reload=yes name=telegraf state=started enabled=yes" --become

echo "✅ Checking results..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m shell -a "systemctl status telegraf --no-pager -l | head -5" --become

@ -1,52 +0,0 @@
#!/bin/bash
# Simplified approach: run Telegraf directly against the remote configuration

echo "🚀 Creating the simplified Telegraf service..."

# Nodes where the original install failed
FAILED_NODES="ch3,ch2,ash1d,ash2e,syd"

echo "📥 Step 1: Download and install the Telegraf binary..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m shell -a "
cd /tmp &&
curl -L https://dl.influxdata.com/telegraf/releases/telegraf-1.36.1_linux_amd64.tar.gz -o telegraf.tar.gz &&
tar -xzf telegraf.tar.gz &&
sudo cp telegraf-1.36.1/usr/bin/telegraf /usr/bin/ &&
sudo chmod +x /usr/bin/telegraf &&
telegraf version
" --become

echo "🔧 Step 2: Create the simplified startup script..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m copy -a "content='#!/bin/bash
export INFLUX_TOKEN=VU_dOCVZzqEHb9jSFsDe0bJlEBaVbiG4LqfoczlnmcbfrbmklSt904HJPL4idYGvVi0c2eHkYDi2zCTni7Ay4w==
export INFLUX_ORG=seekkey
export INFLUX_BUCKET=VPS
export INFLUX_URL=http://influxdb1.tailnet-68f9.ts.net:8086

/usr/bin/telegraf --config http://influxdb1.tailnet-68f9.ts.net:8086/api/v2/telegrafs/0f8a73496790c000
' dest=/usr/local/bin/telegraf-start.sh owner=root group=root mode=0755" --become

echo "🔄 Step 3: Stop the old service and start the new simplified service..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m systemd -a "name=telegraf state=stopped enabled=no" --become || true

# Create the simplified systemd service
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m copy -a "content='[Unit]
Description=Telegraf (Simplified)
After=network.target

[Service]
Type=simple
User=telegraf
Group=telegraf
ExecStart=/usr/local/bin/telegraf-start.sh
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target' dest=/etc/systemd/system/telegraf-simple.service owner=root group=root mode=0644" --become

echo "🚀 Step 4: Start the simplified service..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m systemd -a "daemon_reload=yes name=telegraf-simple state=started enabled=yes" --become

echo "✅ Checking results..."
ansible $FAILED_NODES -i inventories/production/nomad-cluster.ini -m shell -a "systemctl status telegraf-simple --no-pager -l | head -10" --become

@ -1,7 +0,0 @@
|
||||||
[consul_servers]
|
|
||||||
master ansible_host=100.117.106.136 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
|
||||||
ash3c ansible_host=100.116.80.94 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
|
||||||
hcs ansible_host=100.84.197.26 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
|
||||||
|
|
||||||
[consul_servers:vars]
|
|
||||||
ansible_ssh_private_key_file=~/.ssh/id_ed25519
|
|
||||||
|
|
@ -4,8 +4,8 @@
|
||||||
# InfluxDB 2.x connection settings
|
# InfluxDB 2.x connection settings
|
||||||
influxdb_url: "http://influxdb1.tailnet-68f9.ts.net:8086"
|
influxdb_url: "http://influxdb1.tailnet-68f9.ts.net:8086"
|
||||||
influxdb_token: "VU_dOCVZzqEHb9jSFsDe0bJlEBaVbiG4LqfoczlnmcbfrbmklSt904HJPL4idYGvVi0c2eHkYDi2zCTni7Ay4w=="
|
influxdb_token: "VU_dOCVZzqEHb9jSFsDe0bJlEBaVbiG4LqfoczlnmcbfrbmklSt904HJPL4idYGvVi0c2eHkYDi2zCTni7Ay4w=="
|
||||||
influxdb_org: "seekkey" # organization name
|
influxdb_org: "nomad" # organization name
|
||||||
influxdb_bucket: "VPS" # bucket name
|
influxdb_bucket: "nomad_monitoring" # bucket name
|
||||||
|
|
||||||
# Remote Telegraf configuration URL
|
# Remote Telegraf configuration URL
|
||||||
telegraf_config_url: "http://influxdb1.tailnet-68f9.ts.net:8086/api/v2/telegrafs/0f8a73496790c000"
|
telegraf_config_url: "http://influxdb1.tailnet-68f9.ts.net:8086/api/v2/telegrafs/0f8a73496790c000"
|
||||||
|
|
|
||||||
|
|
@ -5,16 +5,12 @@ dev2 ansible_host=dev2 ansible_user=ben ansible_become=yes ansible_become_pass=3
|
||||||
[oci_kr]
|
[oci_kr]
|
||||||
ch2 ansible_host=ch2 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
ch2 ansible_host=ch2 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
ch3 ansible_host=ch3 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
ch3 ansible_host=ch3 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
|
master ansible_host=master ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
|
|
||||||
[oci_us]
|
[oci_us]
|
||||||
ash1d ansible_host=ash1d ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
ash1d ansible_host=ash1d ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
ash2e ansible_host=ash2e ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
ash2e ansible_host=ash2e ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
|
|
||||||
[oci_a1]
|
|
||||||
master ansible_host=master ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
|
||||||
ash3c ansible_host=ash3c ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
ash3c ansible_host=ash3c ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
|
|
||||||
|
|
||||||
[huawei]
|
[huawei]
|
||||||
hcs ansible_host=hcs ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
hcs ansible_host=hcs ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
[google]
|
[google]
|
||||||
|
|
@ -22,7 +18,6 @@ benwork ansible_host=benwork ansible_user=ben ansible_become=yes ansible_become_
|
||||||
|
|
||||||
[ditigalocean]
|
[ditigalocean]
|
||||||
syd ansible_host=syd ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
syd ansible_host=syd ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
|
|
||||||
[aws]
|
[aws]
|
||||||
#aws linux dnf
|
#aws linux dnf
|
||||||
awsirish ansible_host=awsirish ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
awsirish ansible_host=awsirish ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
|
|
@ -34,16 +29,12 @@ nuc12 ansible_host=nuc12 ansible_user=root ansible_become=yes ansible_become_pas
|
||||||
|
|
||||||
[lxc]
|
[lxc]
|
||||||
# Concentrated on three machines; do not upgrade them at the same time or they will die, schedule them sequentially (Debian/Ubuntu containers using apt)
|
# Concentrated on three machines; do not upgrade them at the same time or they will die, schedule them sequentially (Debian/Ubuntu containers using apt)
|
||||||
|
warden ansible_host=warden ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
||||||
gitea ansible_host=gitea ansible_user=root ansible_become=yes ansible_become_pass=313131
|
gitea ansible_host=gitea ansible_user=root ansible_become=yes ansible_become_pass=313131
|
||||||
|
influxdb ansible_host=influxdb1 ansible_user=root ansible_become=yes ansible_become_pass=313131
|
||||||
mysql ansible_host=mysql ansible_user=root ansible_become=yes ansible_become_pass=313131
|
mysql ansible_host=mysql ansible_user=root ansible_become=yes ansible_become_pass=313131
|
||||||
postgresql ansible_host=postgresql ansible_user=root ansible_become=yes ansible_become_pass=313131
|
postgresql ansible_host=postgresql ansible_user=root ansible_become=yes ansible_become_pass=313131
|
||||||
|
|
||||||
[nomadlxc]
|
|
||||||
influxdb ansible_host=influxdb1 ansible_user=root ansible_become=yes ansible_become_pass=313131
|
|
||||||
warden ansible_host=warden ansible_user=ben ansible_become=yes ansible_become_pass=3131
|
|
||||||
[semaphore]
|
|
||||||
semaphoressh ansible_host=semaphore ansible_user=root ansible_become=yes ansible_become_pass=313131
|
|
||||||
|
|
||||||
[alpine]
|
[alpine]
|
||||||
#Alpine Linux containers using apk package manager
|
#Alpine Linux containers using apk package manager
|
||||||
redis ansible_host=redis ansible_user=root ansible_become=yes ansible_become_pass=313131
|
redis ansible_host=redis ansible_user=root ansible_become=yes ansible_become_pass=313131
|
||||||
|
|
@ -65,30 +56,5 @@ onecloud1 ansible_host=onecloud1 ansible_user=ben ansible_ssh_pass=3131 ansible_
|
||||||
|
|
||||||
[germany]
|
[germany]
|
||||||
de ansible_host=de ansible_user=ben ansible_ssh_pass=3131 ansible_become=yes ansible_become_pass=3131
|
de ansible_host=de ansible_user=ben ansible_ssh_pass=3131 ansible_become=yes ansible_become_pass=3131
|
||||||
|
|
||||||
[beijing:children]
|
|
||||||
nomadlxc
|
|
||||||
hcp
|
|
||||||
|
|
||||||
[all:vars]
|
[all:vars]
|
||||||
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
|
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
|
||||||
|
|
||||||
[nomad_clients:children]
|
|
||||||
nomadlxc
|
|
||||||
hcp
|
|
||||||
oci_a1
|
|
||||||
huawei
|
|
||||||
ditigalocean
|
|
||||||
germany
|
|
||||||
[nomad_servers:children]
|
|
||||||
oci_us
|
|
||||||
oci_kr
|
|
||||||
semaphore
|
|
||||||
|
|
||||||
[nomad_cluster:children]
|
|
||||||
nomad_servers
|
|
||||||
nomad_clients
|
|
||||||
|
|
||||||
[beijing:children]
|
|
||||||
nomadlxc
|
|
||||||
hcp
|
|
||||||
|
|
@ -1,12 +1,30 @@
|
||||||
[consul_servers:children]
|
[nomad_servers]
|
||||||
nomad_servers
|
semaphore ansible_connection=local nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
ash2e ansible_host=ash2e ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
ash1d ansible_host=ash1d ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
ch2 ansible_host=ch2 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
ch3 ansible_host=ch3 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
# Newly added Mac and Windows nodes (replace with the actual Tailscale IPs)
|
||||||
|
mac-laptop ansible_host=100.xxx.xxx.xxx ansible_user=your_mac_user nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
win-laptop ansible_host=100.xxx.xxx.xxx ansible_user=your_win_user nomad_role=server nomad_bootstrap_expect=6
|
||||||
|
|
||||||
[consul_servers:vars]
|
[nomad_clients]
|
||||||
consul_cert_dir=/etc/consul.d/certs
|
master ansible_host=100.117.106.136 ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
consul_ca_src=security/certificates/ca.pem
|
ash3c ansible_host=100.116.80.94 ansible_port=22 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
consul_cert_src=security/certificates/consul-server.pem
|
hcp1 ansible_host=hcp1 ansible_user=root ansible_become=yes ansible_become_pass=313131 nomad_role=client
|
||||||
consul_key_src=security/certificates/consul-server-key.pem
|
hcp2 ansible_host=hcp2 ansible_user=root ansible_become=yes ansible_become_pass=313131 nomad_role=client
|
||||||
|
hcs ansible_host=hcs ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
|
syd ansible_host=100.117.137.105 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
|
|
||||||
[nomad_cluster:children]
|
[nomad_cluster:children]
|
||||||
nomad_servers
|
nomad_servers
|
||||||
nomad_clients
|
nomad_clients
|
||||||
|
|
||||||
|
[nomad_cluster:vars]
|
||||||
|
ansible_ssh_private_key_file=~/.ssh/id_ed25519
|
||||||
|
ansible_user=ben
|
||||||
|
ansible_become=yes
|
||||||
|
nomad_version=1.10.5
|
||||||
|
nomad_datacenter=dc1
|
||||||
|
nomad_region=global
|
||||||
|
nomad_encrypt_key=NVOMDvXblgWfhtzFzOUIHnKEOrbXOkPrkIPbRGGf1YQ=
|
||||||
|
|
@ -0,0 +1,22 @@
|
||||||
|
[nomad_servers]
|
||||||
|
master ansible_host=100.117.106.136 ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=3
|
||||||
|
semaphore ansible_connection=local nomad_role=server nomad_bootstrap_expect=3
|
||||||
|
ash3c ansible_host=100.116.80.94 ansible_port=22 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=3
|
||||||
|
|
||||||
|
[nomad_clients]
|
||||||
|
hcp1 ansible_host=hcp1 ansible_user=root ansible_become=yes ansible_become_pass=313131 nomad_role=client
|
||||||
|
hcp2 ansible_host=hcp2 ansible_user=root ansible_become=yes ansible_become_pass=313131 nomad_role=client
|
||||||
|
hcs ansible_host=hcs ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
|
|
||||||
|
[nomad_cluster:children]
|
||||||
|
nomad_servers
|
||||||
|
nomad_clients
|
||||||
|
|
||||||
|
[nomad_cluster:vars]
|
||||||
|
ansible_ssh_private_key_file=~/.ssh/id_ed25519
|
||||||
|
ansible_user=ben
|
||||||
|
ansible_become=yes
|
||||||
|
nomad_version=1.10.5
|
||||||
|
nomad_datacenter=dc1
|
||||||
|
nomad_region=global
|
||||||
|
nomad_encrypt_key=NVOMDvXblgWfhtzFzOUIHnKEOrbXOkPrkIPbRGGf1YQ=
|
||||||
|
|
@ -0,0 +1,23 @@
|
||||||
|
[nomad_servers]
|
||||||
|
master ansible_host=100.117.106.136 ansible_port=60022 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=3
|
||||||
|
semaphore ansible_connection=local nomad_role=server nomad_bootstrap_expect=3
|
||||||
|
ash3c ansible_host=100.116.80.94 ansible_port=22 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=server nomad_bootstrap_expect=3
|
||||||
|
|
||||||
|
[nomad_clients]
|
||||||
|
hcp1 ansible_host=hcp1 ansible_user=root ansible_become=yes ansible_become_pass=313131 nomad_role=client
|
||||||
|
hcp2 ansible_host=hcp2 ansible_user=root ansible_become=yes ansible_become_pass=313131 nomad_role=client
|
||||||
|
hcs ansible_host=hcs ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
|
syd ansible_host=100.117.137.105 ansible_user=ben ansible_become=yes ansible_become_pass=3131 nomad_role=client
|
||||||
|
|
||||||
|
[nomad_cluster:children]
|
||||||
|
nomad_servers
|
||||||
|
nomad_clients
|
||||||
|
|
||||||
|
[nomad_cluster:vars]
|
||||||
|
ansible_ssh_private_key_file=~/.ssh/id_ed25519
|
||||||
|
ansible_user=ben
|
||||||
|
ansible_become=yes
|
||||||
|
nomad_version=1.10.5
|
||||||
|
nomad_datacenter=dc1
|
||||||
|
nomad_region=global
|
||||||
|
nomad_encrypt_key=NVOMDvXblgWfhtzFzOUIHnKEOrbXOkPrkIPbRGGf1YQ=
|
||||||
|
|
@ -1,202 +0,0 @@
---
- name: Add Warden Server as Nomad Client to Cluster
  hosts: warden
  become: yes
  gather_facts: yes

  vars:
    nomad_plugin_dir: "/opt/nomad/plugins"
    nomad_datacenter: "dc1"
    nomad_region: "global"
    nomad_servers:
      - "100.117.106.136:4647"
      - "100.116.80.94:4647"
      - "100.97.62.111:4647"
      - "100.116.112.45:4647"
      - "100.84.197.26:4647"

  tasks:
    - name: Show the node being processed
      debug:
        msg: "🔧 Adding the warden server as a Nomad client: {{ inventory_hostname }}"

    - name: Check whether Nomad is already installed
      shell: which nomad || echo "not_found"
      register: nomad_check
      changed_when: false

    - name: Download and install Nomad
      block:
        - name: Download Nomad 1.10.5
          get_url:
            url: "https://releases.hashicorp.com/nomad/1.10.5/nomad_1.10.5_linux_amd64.zip"
            dest: "/tmp/nomad.zip"
            mode: '0644'

        - name: Unpack and install Nomad
          unarchive:
            src: "/tmp/nomad.zip"
            dest: "/usr/local/bin/"
            remote_src: yes
            owner: root
            group: root
            mode: '0755'

        - name: Clean up temporary files
          file:
            path: "/tmp/nomad.zip"
            state: absent
      when: nomad_check.stdout == "not_found"

    - name: Verify the Nomad installation
      shell: nomad version
      register: nomad_version_output

    - name: Create the Nomad configuration directory
      file:
        path: /etc/nomad.d
        state: directory
        owner: root
        group: root
        mode: '0755'

    - name: Create the Nomad data directory
      file:
        path: /opt/nomad/data
        state: directory
        owner: nomad
        group: nomad
        mode: '0755'
      ignore_errors: yes

    - name: Create the Nomad plugin directory
      file:
        path: "{{ nomad_plugin_dir }}"
        state: directory
        owner: nomad
        group: nomad
        mode: '0755'
      ignore_errors: yes

    - name: Get the server IP address
      shell: |
        ip route get 1.1.1.1 | grep -oP 'src \K\S+'
      register: server_ip_result
      changed_when: false

    - name: Set the server IP variable
      set_fact:
        server_ip: "{{ server_ip_result.stdout }}"

    - name: Stop the Nomad service (if it is running)
      systemd:
        name: nomad
        state: stopped
      ignore_errors: yes

    - name: Create the Nomad client configuration file
      copy:
        content: |
          # Nomad Client Configuration for warden
          datacenter = "{{ nomad_datacenter }}"
          data_dir = "/opt/nomad/data"
          log_level = "INFO"
          bind_addr = "{{ server_ip }}"

          server {
            enabled = false
          }

          client {
            enabled = true
            servers = [
              {% for server in nomad_servers %}"{{ server }}"{% if not loop.last %}, {% endif %}{% endfor %}
            ]
          }

          plugin_dir = "{{ nomad_plugin_dir }}"

          plugin "podman" {
            config {
              socket_path = "unix:///run/podman/podman.sock"
              volumes {
                enabled = true
              }
            }
          }

          consul {
            address = "127.0.0.1:8500"
          }
        dest: /etc/nomad.d/nomad.hcl
        owner: root
        group: root
        mode: '0644'

    - name: Validate the Nomad configuration
      shell: nomad config validate /etc/nomad.d/nomad.hcl
      register: nomad_validate
      failed_when: nomad_validate.rc != 0

    - name: Create the Nomad systemd service file
      copy:
        content: |
          [Unit]
          Description=Nomad
          Documentation=https://www.nomadproject.io/docs/
          Wants=network-online.target
          After=network-online.target

          [Service]
          Type=notify
          User=root
          Group=root
          ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d
          ExecReload=/bin/kill -HUP $MAINPID
          KillMode=process
          KillSignal=SIGINT
          TimeoutStopSec=5
          LimitNOFILE=65536
          LimitNPROC=32768
          Restart=on-failure
          RestartSec=2

          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/nomad.service
        mode: '0644'

    - name: Reload the systemd configuration
      systemd:
        daemon_reload: yes

    - name: Start and enable the Nomad service
      systemd:
        name: nomad
        state: started
        enabled: yes

    - name: Wait for the Nomad service to start
      wait_for:
        port: 4646
        host: "{{ server_ip }}"
        delay: 5
        timeout: 60

    - name: Check the Nomad client status
      shell: nomad node status -self
      register: nomad_node_status
      retries: 5
      delay: 5
      until: nomad_node_status.rc == 0
      ignore_errors: yes

    - name: Show the Nomad client configuration result
      debug:
        msg: |
          ✅ The warden server has been configured as a Nomad client
          📦 Nomad version: {{ nomad_version_output.stdout.split('\n')[0] }}
          🌐 Server IP: {{ server_ip }}
          🏗️ Datacenter: {{ nomad_datacenter }}
          📊 Client status: {{ 'SUCCESS' if nomad_node_status.rc == 0 else 'PENDING' }}
          🚀 warden is now part of the Nomad cluster

@ -1,15 +0,0 @@
---
- name: Check Podman version
  hosts: warden
  become: yes
  gather_facts: yes

  tasks:
    - name: Check the current Podman version
      shell: podman --version
      register: current_podman_version
      ignore_errors: yes

    - name: Show the current version
      debug:
        msg: "Current Podman version: {{ current_podman_version.stdout if current_podman_version.rc == 0 else 'not installed or unavailable' }}"

@ -1,22 +0,0 @@
- name: Check podman version on semaphore (local)
  hosts: semaphore
  connection: local
  gather_facts: false
  tasks:
    - name: Check podman version
      command: /usr/local/bin/podman --version
      register: podman_version
    - name: Display podman version
      debug:
        msg: "Podman version on {{ inventory_hostname }} is: {{ podman_version.stdout }}"

- name: Check podman version on other beijing nodes
  hosts: beijing:!semaphore
  gather_facts: false
  tasks:
    - name: Check podman version
      command: /usr/local/bin/podman --version
      register: podman_version
    - name: Display podman version
      debug:
        msg: "Podman version on {{ inventory_hostname }} is: {{ podman_version.stdout }}"

@ -1,14 +0,0 @@
---
- name: Check for AppArmor or SELinux denials
  hosts: germany
  become: yes
  tasks:
    - name: Search journalctl for AppArmor/SELinux messages
      shell: 'journalctl -k | grep -i -e apparmor -e selinux -e "avc: denied"'
      register: security_logs
      changed_when: false
      failed_when: false

    - name: Display security logs
      debug:
        var: security_logs.stdout_lines

@ -56,29 +56,21 @@
|
||||||
loop: "{{ alias_files.files }}"
|
loop: "{{ alias_files.files }}"
|
||||||
when: alias_files.files is defined
|
when: alias_files.files is defined
|
||||||
|
|
||||||
- name: Clear aliases from /etc/profile.d/aliases.sh
|
- name: Clear shell history to remove alias commands
|
||||||
ansible.builtin.file:
|
shell: |
|
||||||
path: /etc/profile.d/aliases.sh
|
> /root/.bash_history
|
||||||
state: absent
|
> /root/.zsh_history
|
||||||
|
history -c
|
||||||
|
ignore_errors: yes
|
||||||
|
|
||||||
- name: Clear aliases from /root/.bashrc
|
- name: Unalias all current aliases
|
||||||
ansible.builtin.lineinfile:
|
shell: unalias -a
|
||||||
path: /root/.bashrc
|
ignore_errors: yes
|
||||||
state: absent
|
|
||||||
regexp: "^alias "
|
|
||||||
|
|
||||||
- name: Clear aliases from /root/.bash_aliases
|
- name: Restart shell services
|
||||||
ansible.builtin.file:
|
shell: |
|
||||||
path: /root/.bash_aliases
|
pkill -f bash || true
|
||||||
state: absent
|
pkill -f zsh || true
|
||||||
|
|
||||||
- name: Clear history
|
|
||||||
ansible.builtin.command:
|
|
||||||
cmd: > /root/.bash_history
|
|
||||||
|
|
||||||
- name: Restart shell to apply changes
|
|
||||||
ansible.builtin.command:
|
|
||||||
cmd: pkill -f bash || true
|
|
||||||
|
|
||||||
- name: Test network connectivity after clearing aliases
|
- name: Test network connectivity after clearing aliases
|
||||||
shell: ping -c 2 8.8.8.8 || echo "Ping failed"
|
shell: ping -c 2 8.8.8.8 || echo "Ping failed"
|
||||||
|
|
|
||||||
|
|
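The `command` module does not pass its arguments through a shell, so redirections such as `>` and operators such as `||` are not interpreted there. Below is a minimal sketch of truncating the history files without a shell, assuming the two history paths that appear in this hunk. The task name and the `mode` value are illustrative choices, not taken from the repository.

```yaml
# Sketch only: empties the history files in place without invoking a shell.
- name: Truncate root's shell history files
  ansible.builtin.copy:
    content: ""            # writes a zero-length file
    dest: "{{ item }}"
    owner: root
    group: root
    mode: '0600'           # assumption: history files should stay private
  loop:
    - /root/.bash_history
    - /root/.zsh_history
```

Note that `copy` creates a file that does not yet exist, which the shell redirection also does.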
@ -1,32 +0,0 @@
|
||||||
---
|
|
||||||
- name: Remove all aliases from user shell configuration files
|
|
||||||
hosts: all
|
|
||||||
become: yes
|
|
||||||
gather_facts: false
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Find all relevant shell configuration files
|
|
||||||
find:
|
|
||||||
paths: /home
|
|
||||||
patterns: .bashrc, .bash_aliases, .profile
|
|
||||||
register: shell_config_files
|
|
||||||
|
|
||||||
- name: Remove aliases from shell configuration files
|
|
||||||
replace:
|
|
||||||
path: "{{ item.path }}"
|
|
||||||
regexp: '^alias .*'
|
|
||||||
replace: ''
|
|
||||||
loop: "{{ shell_config_files.files }}"
|
|
||||||
when: shell_config_files.files is defined
|
|
||||||
|
|
||||||
- name: Remove functions from shell configuration files
|
|
||||||
replace:
|
|
||||||
path: "{{ item.path }}"
|
|
||||||
regexp: '^function .*'
|
|
||||||
replace: ''
|
|
||||||
loop: "{{ shell_config_files.files }}"
|
|
||||||
when: shell_config_files.files is defined
|
|
||||||
|
|
||||||
- name: Display completion message
|
|
||||||
debug:
|
|
||||||
msg: "All aliases and functions have been removed from user shell configuration files."
|
|
||||||
|
|
@ -1,47 +0,0 @@
|
||||||
---
|
|
||||||
- name: Clear proxy settings from the system
|
|
||||||
hosts: all
|
|
||||||
become: yes
|
|
||||||
gather_facts: false
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Remove proxy environment file
|
|
||||||
file:
|
|
||||||
path: /root/mgmt/configuration/proxy.env
|
|
||||||
state: absent
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Unset proxy environment variables
|
|
||||||
shell: |
|
|
||||||
unset http_proxy
|
|
||||||
unset https_proxy
|
|
||||||
unset HTTP_PROXY
|
|
||||||
unset HTTPS_PROXY
|
|
||||||
unset no_proxy
|
|
||||||
unset NO_PROXY
|
|
||||||
unset ALL_PROXY
|
|
||||||
unset all_proxy
|
|
||||||
unset DOCKER_BUILDKIT
|
|
||||||
unset BUILDKIT_PROGRESS
|
|
||||||
unset GIT_HTTP_PROXY
|
|
||||||
unset GIT_HTTPS_PROXY
|
|
||||||
unset CURL_PROXY
|
|
||||||
unset WGET_PROXY
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove proxy settings from /etc/environment
|
|
||||||
lineinfile:
|
|
||||||
path: /etc/environment
|
|
||||||
state: absent
|
|
||||||
regexp: '^(http_proxy|https_proxy|no_proxy|ALL_PROXY|DOCKER_BUILDKIT|BUILDKIT_PROGRESS|GIT_HTTP_PROXY|GIT_HTTPS_PROXY|CURL_PROXY|WGET_PROXY)='
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove proxy settings from /etc/apt/apt.conf.d/proxy.conf
|
|
||||||
file:
|
|
||||||
path: /etc/apt/apt.conf.d/proxy.conf
|
|
||||||
state: absent
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display completion message
|
|
||||||
debug:
|
|
||||||
msg: "Proxy settings have been cleared from the system."
|
|
||||||
|
|
@ -1,22 +0,0 @@
|
||||||
---
|
|
||||||
- name: Configure NOPASSWD sudo for nomad user
|
|
||||||
hosts: nomad_clients
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Ensure sudoers.d directory exists
|
|
||||||
file:
|
|
||||||
path: /etc/sudoers.d
|
|
||||||
state: directory
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0750'
|
|
||||||
|
|
||||||
- name: Allow nomad user passwordless sudo for required commands
|
|
||||||
copy:
|
|
||||||
dest: /etc/sudoers.d/nomad
|
|
||||||
content: |
|
|
||||||
nomad ALL=(ALL) NOPASSWD: /usr/bin/apt, /usr/bin/systemctl, /bin/mkdir, /bin/chown, /bin/chmod, /bin/mv, /bin/sed, /usr/bin/tee, /usr/sbin/usermod, /usr/bin/unzip, /usr/bin/wget
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0440'
|
|
||||||
validate: 'visudo -cf %s'
|
|
||||||
|
|
@ -11,12 +11,7 @@
|
||||||
- name: Get the current node's Tailscale IP
|
- name: Get the current node's Tailscale IP
|
||||||
shell: tailscale ip | head -1
|
shell: tailscale ip | head -1
|
||||||
register: current_tailscale_ip
|
register: current_tailscale_ip
|
||||||
changed_when: false
|
failed_when: current_tailscale_ip.rc != 0
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Compute the address for Nomad (prefer Tailscale, fall back to inventory or ansible_host)
|
|
||||||
set_fact:
|
|
||||||
node_addr: "{{ (current_tailscale_ip.stdout | default('')) is match('^100\\.') | ternary((current_tailscale_ip.stdout | trim), (hostvars[inventory_hostname].tailscale_ip | default(ansible_host))) }}"
|
|
||||||
|
|
||||||
- name: Ensure the Nomad configuration directory exists
|
- name: Ensure the Nomad configuration directory exists
|
||||||
file:
|
file:
|
||||||
|
|
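The `node_addr` expression in this hunk prefers the address reported by `tailscale ip` when it starts with `100.` and otherwise falls back to the inventory's `tailscale_ip` or to `ansible_host`. The same selection can be sketched in a more readable form, followed by an early check. The `assert` task and its message are additions for illustration and are not part of the original playbook.

```yaml
# Sketch only: assumes current_tailscale_ip was registered by the preceding task.
- name: Pick the address Nomad should use
  ansible.builtin.set_fact:
    node_addr: >-
      {{ (current_tailscale_ip.stdout | default('') | trim)
         if (current_tailscale_ip.stdout | default('')) is match('^100[.]')
         else (hostvars[inventory_hostname].tailscale_ip | default(ansible_host)) }}

# Hypothetical guard: stop before rendering nomad.hcl with an empty address.
- name: Fail early if no usable address was found
  ansible.builtin.assert:
    that:
      - node_addr is defined
      - node_addr | length > 0
    fail_msg: "No Tailscale or inventory address for {{ inventory_hostname }}"
```

Using `[.]` instead of an escaped dot avoids backslash handling differences between YAML quoting styles and Jinja string literals.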
@ -37,12 +32,12 @@
|
||||||
data_dir = "/opt/nomad/data"
|
data_dir = "/opt/nomad/data"
|
||||||
log_level = "INFO"
|
log_level = "INFO"
|
||||||
|
|
||||||
bind_addr = "{{ node_addr }}"
|
bind_addr = "{{ current_tailscale_ip.stdout }}"
|
||||||
|
|
||||||
addresses {
|
addresses {
|
||||||
http = "{{ node_addr }}"
|
http = "0.0.0.0"
|
||||||
rpc = "{{ node_addr }}"
|
rpc = "{{ current_tailscale_ip.stdout }}"
|
||||||
serf = "{{ node_addr }}"
|
serf = "{{ current_tailscale_ip.stdout }}"
|
||||||
}
|
}
|
||||||
|
|
||||||
ports {
|
ports {
|
||||||
|
|
@ -79,10 +74,9 @@
|
||||||
}
|
}
|
||||||
|
|
||||||
consul {
|
consul {
|
||||||
address = "{{ node_addr }}:8500"
|
address = "{{ current_tailscale_ip.stdout }}:8500"
|
||||||
}
|
}
|
||||||
when: nomad_role == "server"
|
when: nomad_role == "server"
|
||||||
notify: restart nomad
|
|
||||||
|
|
||||||
- name: Generate the Nomad client configuration (using Tailscale)
|
- name: Generate the Nomad client configuration (using Tailscale)
|
||||||
copy:
|
copy:
|
||||||
|
|
@ -95,12 +89,12 @@
|
||||||
data_dir = "/opt/nomad/data"
|
data_dir = "/opt/nomad/data"
|
||||||
log_level = "INFO"
|
log_level = "INFO"
|
||||||
|
|
||||||
bind_addr = "{{ node_addr }}"
|
bind_addr = "{{ current_tailscale_ip.stdout }}"
|
||||||
|
|
||||||
addresses {
|
addresses {
|
||||||
http = "{{ node_addr }}"
|
http = "0.0.0.0"
|
||||||
rpc = "{{ node_addr }}"
|
rpc = "{{ current_tailscale_ip.stdout }}"
|
||||||
serf = "{{ node_addr }}"
|
serf = "{{ current_tailscale_ip.stdout }}"
|
||||||
}
|
}
|
||||||
|
|
||||||
ports {
|
ports {
|
||||||
|
|
@ -115,8 +109,6 @@
|
||||||
|
|
||||||
client {
|
client {
|
||||||
enabled = true
|
enabled = true
|
||||||
network_interface = "tailscale0"
|
|
||||||
cpu_total_compute = 0
|
|
||||||
|
|
||||||
servers = [
|
servers = [
|
||||||
"100.116.158.95:4647", # semaphore
|
"100.116.158.95:4647", # semaphore
|
||||||
|
|
@ -136,10 +128,9 @@
|
||||||
}
|
}
|
||||||
|
|
||||||
consul {
|
consul {
|
||||||
address = "{{ node_addr }}:8500"
|
address = "{{ current_tailscale_ip.stdout }}:8500"
|
||||||
}
|
}
|
||||||
when: nomad_role == "client"
|
when: nomad_role == "client"
|
||||||
notify: restart nomad
|
|
||||||
|
|
||||||
- name: Locate the Nomad binary
|
- name: Locate the Nomad binary
|
||||||
shell: which nomad || find /usr -name nomad 2>/dev/null | head -1
|
shell: which nomad || find /usr -name nomad 2>/dev/null | head -1
|
||||||
|
|
@ -194,7 +185,7 @@
|
||||||
- name: Wait for the Nomad service to start
|
- name: Wait for the Nomad service to start
|
||||||
wait_for:
|
wait_for:
|
||||||
port: 4646
|
port: 4646
|
||||||
host: "{{ node_addr }}"
|
host: "{{ current_tailscale_ip.stdout }}"
|
||||||
delay: 5
|
delay: 5
|
||||||
timeout: 30
|
timeout: 30
|
||||||
ignore_errors: yes
|
ignore_errors: yes
|
||||||
|
|
@ -208,7 +199,7 @@
|
||||||
debug:
|
debug:
|
||||||
msg: |
|
msg: |
|
||||||
✅ Node {{ inventory_hostname }} configured
|
✅ Node {{ inventory_hostname }} configured
|
||||||
🌐 Address in use: {{ node_addr }}
|
🌐 Tailscale IP: {{ current_tailscale_ip.stdout }}
|
||||||
🎯 Role: {{ nomad_role }}
|
🎯 Role: {{ nomad_role }}
|
||||||
🔧 Nomad binary: {{ nomad_binary_path.stdout }}
|
🔧 Nomad binary: {{ nomad_binary_path.stdout }}
|
||||||
📊 Service status: {{ 'active' if nomad_status.rc == 0 else 'failed' }}
|
📊 Service status: {{ 'active' if nomad_status.rc == 0 else 'failed' }}
|
||||||
|
|
|
||||||
|
|
@ -1,115 +0,0 @@
|
||||||
---
|
|
||||||
- name: Configure Podman for Nomad Integration
|
|
||||||
hosts: all
|
|
||||||
become: yes
|
|
||||||
gather_facts: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Show the node currently being processed
|
|
||||||
debug:
|
|
||||||
msg: "🔧 Configuring Podman for Nomad on: {{ inventory_hostname }}"
|
|
||||||
|
|
||||||
- name: Ensure Podman is installed
|
|
||||||
package:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
|
|
||||||
- name: Enable and start the Podman socket service
|
|
||||||
systemd:
|
|
||||||
name: podman.socket
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
|
|
||||||
- name: Create the Podman system configuration directory
|
|
||||||
file:
|
|
||||||
path: /etc/containers
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Configure Podman to use the system socket
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
[engine]
|
|
||||||
# Use the system-level socket rather than the per-user socket
|
|
||||||
active_service = "system"
|
|
||||||
[engine.service_destinations]
|
|
||||||
[engine.service_destinations.system]
|
|
||||||
uri = "unix:///run/podman/podman.sock"
|
|
||||||
dest: /etc/containers/containers.conf
|
|
||||||
mode: '0644'
|
|
||||||
|
|
||||||
- name: Check whether the nomad user exists
|
|
||||||
getent:
|
|
||||||
database: passwd
|
|
||||||
key: nomad
|
|
||||||
register: nomad_user_check
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create the configuration directory for the nomad user
|
|
||||||
file:
|
|
||||||
path: "/home/nomad/.config/containers"
|
|
||||||
state: directory
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0755'
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
|
|
||||||
- name: Configure Podman for the nomad user
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
[engine]
|
|
||||||
active_service = "system"
|
|
||||||
[engine.service_destinations]
|
|
||||||
[engine.service_destinations.system]
|
|
||||||
uri = "unix:///run/podman/podman.sock"
|
|
||||||
dest: /home/nomad/.config/containers/containers.conf
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0644'
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
|
|
||||||
- name: Add the nomad user to the podman group
|
|
||||||
user:
|
|
||||||
name: nomad
|
|
||||||
groups: podman
|
|
||||||
append: yes
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create the podman group (if it does not exist)
|
|
||||||
group:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Set permissions on the podman socket directory
|
|
||||||
file:
|
|
||||||
path: /run/podman
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
group: podman
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Verify Podman socket permissions
|
|
||||||
file:
|
|
||||||
path: /run/podman/podman.sock
|
|
||||||
mode: '0666'
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Verify the Podman installation
|
|
||||||
shell: podman --version
|
|
||||||
register: podman_version
|
|
||||||
|
|
||||||
- name: Test that Podman works
|
|
||||||
shell: podman info
|
|
||||||
register: podman_info
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Show the configuration result
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
✅ Node {{ inventory_hostname }}: Podman configuration complete
|
|
||||||
📦 Podman version: {{ podman_version.stdout }}
|
|
||||||
🐳 Podman status: {{ 'SUCCESS' if podman_info.rc == 0 else 'WARNING' }}
|
|
||||||
👤 Nomad user: {{ 'FOUND' if nomad_user_check is succeeded else 'NOT FOUND' }}
|
|
||||||
|
|
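In the playbook above, the task that adds `nomad` to the `podman` group runs before the task that creates that group, and both carry `ignore_errors`, so a failure on a fresh host can pass unnoticed. Here is a minimal sketch of the same steps in dependency order, followed by a check that the API socket really is a socket. The `podman_sock` variable name and the `assert` task are illustrative additions, not part of the original.

```yaml
# Sketch only: create the group first, then add the user, then check the socket.
- name: Ensure the podman group exists
  ansible.builtin.group:
    name: podman
    state: present

- name: Add the nomad user to the podman group
  ansible.builtin.user:
    name: nomad
    groups: podman
    append: true
  when: nomad_user_check is succeeded   # registered earlier by the getent task

- name: Inspect the Podman API socket
  ansible.builtin.stat:
    path: /run/podman/podman.sock
  register: podman_sock                 # hypothetical variable name

- name: Assert that the socket exists and is a UNIX socket
  ansible.builtin.assert:
    that:
      - podman_sock.stat.exists
      - podman_sock.stat.issock
    fail_msg: "podman.socket does not appear to be active on {{ inventory_hostname }}"
```

An explicit assertion reports a missing socket at the point of failure, instead of later when Nomad's Podman driver cannot connect.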
@ -1,33 +0,0 @@
|
||||||
---
|
|
||||||
- name: Debug cgroup permissions
|
|
||||||
hosts: germany
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Check permissions of /sys/fs/cgroup/cpuset/
|
|
||||||
stat:
|
|
||||||
path: /sys/fs/cgroup/cpuset/
|
|
||||||
register: cpuset_dir
|
|
||||||
|
|
||||||
- name: Display cpuset dir stats
|
|
||||||
debug:
|
|
||||||
var: cpuset_dir.stat
|
|
||||||
|
|
||||||
- name: Check for nomad subdir in cpuset
|
|
||||||
stat:
|
|
||||||
path: /sys/fs/cgroup/cpuset/nomad
|
|
||||||
register: nomad_cpuset_dir
|
|
||||||
ignore_errors: true
|
|
||||||
|
|
||||||
- name: Display nomad cpuset dir stats
|
|
||||||
debug:
|
|
||||||
var: nomad_cpuset_dir.stat
|
|
||||||
when: nomad_cpuset_dir.stat.exists is defined and nomad_cpuset_dir.stat.exists
|
|
||||||
|
|
||||||
- name: List contents of /sys/fs/cgroup/cpuset/
|
|
||||||
command: ls -la /sys/fs/cgroup/cpuset/
|
|
||||||
register: ls_cpuset
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Display contents of /sys/fs/cgroup/cpuset/
|
|
||||||
debug:
|
|
||||||
var: ls_cpuset.stdout_lines
|
|
||||||
|
|
@ -1,14 +0,0 @@
|
||||||
---
|
|
||||||
- name: Debug Nomad cgroup subdirectory
|
|
||||||
hosts: germany
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: List contents of /sys/fs/cgroup/cpuset/nomad/
|
|
||||||
command: ls -la /sys/fs/cgroup/cpuset/nomad/
|
|
||||||
register: ls_nomad_cpuset
|
|
||||||
changed_when: false
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Display contents of /sys/fs/cgroup/cpuset/nomad/
|
|
||||||
debug:
|
|
||||||
var: ls_nomad_cpuset.stdout_lines
|
|
||||||
|
|
@ -1,24 +0,0 @@
|
||||||
- name: Debug Nomad service on germany
|
|
||||||
hosts: germany
|
|
||||||
gather_facts: false
|
|
||||||
tasks:
|
|
||||||
- name: Get Nomad service status
|
|
||||||
command: systemctl status nomad.service --no-pager -l
|
|
||||||
register: nomad_status
|
|
||||||
ignore_errors: true
|
|
||||||
|
|
||||||
- name: Get Nomad service journal
|
|
||||||
command: journalctl -xeu nomad.service --no-pager -n 100
|
|
||||||
register: nomad_journal
|
|
||||||
ignore_errors: true
|
|
||||||
|
|
||||||
- name: Display debug information
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
--- Nomad Service Status ---
|
|
||||||
{{ nomad_status.stdout }}
|
|
||||||
{{ nomad_status.stderr }}
|
|
||||||
|
|
||||||
--- Nomad Service Journal ---
|
|
||||||
{{ nomad_journal.stdout }}
|
|
||||||
{{ nomad_journal.stderr }}
|
|
||||||
|
|
@ -1,30 +0,0 @@
|
||||||
---
|
|
||||||
- name: Gather Nomad debug information from multiple nodes
|
|
||||||
hosts: all
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Get Nomad service status
|
|
||||||
shell: systemctl status nomad --no-pager -l
|
|
||||||
register: nomad_status
|
|
||||||
changed_when: false
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Get last 50 lines of Nomad journal logs
|
|
||||||
shell: journalctl -u nomad -n 50 --no-pager
|
|
||||||
register: nomad_journal
|
|
||||||
changed_when: false
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Display Nomad Status
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
--- Nomad Status for {{ inventory_hostname }} ---
|
|
||||||
{{ nomad_status.stdout }}
|
|
||||||
{{ nomad_status.stderr }}
|
|
||||||
|
|
||||||
- name: Display Nomad Journal
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
--- Nomad Journal for {{ inventory_hostname }} ---
|
|
||||||
{{ nomad_journal.stdout }}
|
|
||||||
{{ nomad_journal.stderr }}
|
|
||||||
|
|
@ -1,12 +0,0 @@
|
||||||
- name: Distribute new podman binary to syd
|
|
||||||
hosts: syd
|
|
||||||
gather_facts: false
|
|
||||||
tasks:
|
|
||||||
- name: Copy new podman binary to /usr/local/bin
|
|
||||||
copy:
|
|
||||||
src: /root/mgmt/configuration/podman-remote-static-linux_amd64
|
|
||||||
dest: /usr/local/bin/podman
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0755'
|
|
||||||
become: yes
|
|
||||||
|
|
@ -1,76 +0,0 @@
|
||||||
---
|
|
||||||
- name: Distribute Nomad Podman Driver to all nodes
|
|
||||||
hosts: nomad_cluster
|
|
||||||
become: yes
|
|
||||||
vars:
|
|
||||||
nomad_user: nomad
|
|
||||||
nomad_data_dir: /opt/nomad/data
|
|
||||||
nomad_plugins_dir: "{{ nomad_data_dir }}/plugins"
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Stop Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: stopped
|
|
||||||
|
|
||||||
- name: Create plugins directory
|
|
||||||
file:
|
|
||||||
path: "{{ nomad_plugins_dir }}"
|
|
||||||
state: directory
|
|
||||||
owner: "{{ nomad_user }}"
|
|
||||||
group: "{{ nomad_user }}"
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Copy Nomad Podman driver from local
|
|
||||||
copy:
|
|
||||||
src: /tmp/nomad-driver-podman
|
|
||||||
dest: "{{ nomad_plugins_dir }}/nomad-driver-podman"
|
|
||||||
owner: "{{ nomad_user }}"
|
|
||||||
group: "{{ nomad_user }}"
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Update Nomad configuration for plugin directory
|
|
||||||
lineinfile:
|
|
||||||
path: /etc/nomad.d/nomad.hcl
|
|
||||||
regexp: '^plugin_dir'
|
|
||||||
line: 'plugin_dir = "{{ nomad_plugins_dir }}"'
|
|
||||||
insertafter: 'data_dir = "/opt/nomad/data"'
|
|
||||||
|
|
||||||
- name: Ensure Podman is installed
|
|
||||||
package:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
|
|
||||||
- name: Enable Podman socket
|
|
||||||
systemd:
|
|
||||||
name: podman.socket
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Start Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: started
|
|
||||||
enabled: yes
|
|
||||||
|
|
||||||
- name: Wait for Nomad to be ready
|
|
||||||
wait_for:
|
|
||||||
port: 4646
|
|
||||||
host: localhost
|
|
||||||
delay: 10
|
|
||||||
timeout: 60
|
|
||||||
|
|
||||||
- name: Wait for plugins to load
|
|
||||||
pause:
|
|
||||||
seconds: 15
|
|
||||||
|
|
||||||
- name: Check driver status
|
|
||||||
shell: |
|
|
||||||
/usr/local/bin/nomad node status -self | grep -A 10 "Driver Status" || /usr/bin/nomad node status -self | grep -A 10 "Driver Status"
|
|
||||||
register: driver_status
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Display driver status
|
|
||||||
debug:
|
|
||||||
var: driver_status.stdout_lines
|
|
||||||
|
|
@ -1,12 +0,0 @@
|
||||||
- name: Distribute new podman binary to germany
|
|
||||||
hosts: germany
|
|
||||||
gather_facts: false
|
|
||||||
tasks:
|
|
||||||
- name: Copy new podman binary to /usr/local/bin
|
|
||||||
copy:
|
|
||||||
src: /root/mgmt/configuration/podman-remote-static-linux_amd64
|
|
||||||
dest: /usr/local/bin/podman
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0755'
|
|
||||||
become: yes
|
|
||||||
|
|
@ -1,12 +0,0 @@
|
||||||
- name: Distribute new podman binary to specified nomad_clients
|
|
||||||
hosts: nomadlxc,hcp,huawei,ditigalocean
|
|
||||||
gather_facts: false
|
|
||||||
tasks:
|
|
||||||
- name: Copy new podman binary to /usr/local/bin
|
|
||||||
copy:
|
|
||||||
src: /root/mgmt/configuration/podman-remote-static-linux_amd64
|
|
||||||
dest: /usr/local/bin/podman
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0755'
|
|
||||||
become: yes
|
|
||||||
|
|
@ -1,25 +0,0 @@
|
||||||
---
|
|
||||||
- name: Ensure nomad user and plugin directory exist
|
|
||||||
hosts: nomad_clients
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Ensure nomad group exists
|
|
||||||
group:
|
|
||||||
name: nomad
|
|
||||||
state: present
|
|
||||||
|
|
||||||
- name: Ensure nomad user exists
|
|
||||||
user:
|
|
||||||
name: nomad
|
|
||||||
group: nomad
|
|
||||||
shell: /usr/sbin/nologin
|
|
||||||
system: yes
|
|
||||||
create_home: no
|
|
||||||
|
|
||||||
- name: Ensure plugin directory exists with correct ownership
|
|
||||||
file:
|
|
||||||
path: /opt/nomad/data/plugins
|
|
||||||
state: directory
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0755'
|
|
||||||
|
|
@ -1,14 +0,0 @@
|
||||||
---
|
|
||||||
- name: Find Nomad service
|
|
||||||
hosts: germany
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: List systemd services and filter for nomad
|
|
||||||
shell: systemctl list-unit-files --type=service | grep -i nomad
|
|
||||||
register: nomad_services
|
|
||||||
changed_when: false
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Display found services
|
|
||||||
debug:
|
|
||||||
var: nomad_services.stdout_lines
|
|
||||||
|
|
@ -1,16 +0,0 @@
|
||||||
---
|
|
||||||
- name: Debug apt repository issues
|
|
||||||
hosts: beijing:children
|
|
||||||
become: yes
|
|
||||||
ignore_unreachable: yes
|
|
||||||
tasks:
|
|
||||||
- name: Run apt-get update to capture error
|
|
||||||
ansible.builtin.shell: apt-get update
|
|
||||||
register: apt_update_result
|
|
||||||
failed_when: false
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Display apt-get update stderr
|
|
||||||
ansible.builtin.debug:
|
|
||||||
var: apt_update_result.stderr
|
|
||||||
verbosity: 2
|
|
||||||
|
|
@ -1,19 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix cgroup permissions for Nomad
|
|
||||||
hosts: germany
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Recursively change ownership of nomad cgroup directory
|
|
||||||
file:
|
|
||||||
path: /sys/fs/cgroup/cpuset/nomad
|
|
||||||
state: directory
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
recurse: yes
|
|
||||||
|
|
||||||
- name: Change ownership of the parent cpuset directory
|
|
||||||
file:
|
|
||||||
path: /sys/fs/cgroup/cpuset/
|
|
||||||
state: directory
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
|
|
@ -1,126 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix duplicate Podman configuration in Nomad
|
|
||||||
hosts: nomad_cluster
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Stop Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: stopped
|
|
||||||
|
|
||||||
- name: Backup current configuration
|
|
||||||
copy:
|
|
||||||
src: /etc/nomad.d/nomad.hcl
|
|
||||||
dest: /etc/nomad.d/nomad.hcl.backup-duplicate-fix
|
|
||||||
remote_src: yes
|
|
||||||
|
|
||||||
- name: Read current configuration
|
|
||||||
slurp:
|
|
||||||
src: /etc/nomad.d/nomad.hcl
|
|
||||||
register: current_config
|
|
||||||
|
|
||||||
- name: Create clean configuration for clients
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
datacenter = "{{ nomad_datacenter }}"
|
|
||||||
region = "{{ nomad_region }}"
|
|
||||||
data_dir = "/opt/nomad/data"
|
|
||||||
bind_addr = "{{ tailscale_ip }}"
|
|
||||||
|
|
||||||
server {
|
|
||||||
enabled = false
|
|
||||||
}
|
|
||||||
|
|
||||||
client {
|
|
||||||
enabled = true
|
|
||||||
servers = ["100.116.158.95:4647", "100.117.106.136:4647", "100.86.141.112:4647", "100.81.26.3:4647", "100.103.147.94:4647"]
|
|
||||||
}
|
|
||||||
|
|
||||||
ui {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
|
|
||||||
addresses {
|
|
||||||
http = "0.0.0.0"
|
|
||||||
rpc = "{{ tailscale_ip }}"
|
|
||||||
serf = "{{ tailscale_ip }}"
|
|
||||||
}
|
|
||||||
|
|
||||||
ports {
|
|
||||||
http = 4646
|
|
||||||
rpc = 4647
|
|
||||||
serf = 4648
|
|
||||||
}
|
|
||||||
|
|
||||||
plugin "podman" {
|
|
||||||
config {
|
|
||||||
socket_path = "unix:///run/podman/podman.sock"
|
|
||||||
volumes {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
recover_stopped = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
consul {
|
|
||||||
auto_advertise = false
|
|
||||||
server_auto_join = false
|
|
||||||
client_auto_join = false
|
|
||||||
}
|
|
||||||
|
|
||||||
log_level = "INFO"
|
|
||||||
enable_syslog = true
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0640'
|
|
||||||
when: nomad_role == "client"
|
|
||||||
|
|
||||||
- name: Ensure Podman is installed
|
|
||||||
package:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
|
|
||||||
- name: Enable and start Podman socket
|
|
||||||
systemd:
|
|
||||||
name: podman.socket
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
|
|
||||||
- name: Set proper permissions on Podman socket
|
|
||||||
file:
|
|
||||||
path: /run/podman/podman.sock
|
|
||||||
mode: '0666'
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Validate Nomad configuration
|
|
||||||
shell: /usr/local/bin/nomad config validate /etc/nomad.d/nomad.hcl || /usr/bin/nomad config validate /etc/nomad.d/nomad.hcl
|
|
||||||
register: config_validation
|
|
||||||
failed_when: config_validation.rc != 0
|
|
||||||
|
|
||||||
- name: Start Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: started
|
|
||||||
enabled: yes
|
|
||||||
|
|
||||||
- name: Wait for Nomad to be ready
|
|
||||||
wait_for:
|
|
||||||
port: 4646
|
|
||||||
host: localhost
|
|
||||||
delay: 10
|
|
||||||
timeout: 60
|
|
||||||
|
|
||||||
- name: Wait for drivers to load
|
|
||||||
pause:
|
|
||||||
seconds: 20
|
|
||||||
|
|
||||||
- name: Check driver status
|
|
||||||
shell: |
|
|
||||||
/usr/local/bin/nomad node status -self | grep -A 10 "Driver Status" || /usr/bin/nomad node status -self | grep -A 10 "Driver Status"
|
|
||||||
register: driver_status
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Display driver status
|
|
||||||
debug:
|
|
||||||
var: driver_status.stdout_lines
|
|
||||||
|
|
@ -1,34 +0,0 @@
|
||||||
---
|
|
||||||
- name: Copy the correct HashiCorp APT source configuration directly
|
|
||||||
hosts: nomad_cluster
|
|
||||||
become: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Back up the existing HashiCorp APT source configuration (if present)
|
|
||||||
copy:
|
|
||||||
src: "/etc/apt/sources.list.d/hashicorp.list"
|
|
||||||
dest: "/etc/apt/sources.list.d/hashicorp.list.backup-{{ ansible_date_time.epoch }}"
|
|
||||||
remote_src: yes
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create the correct HashiCorp APT source configuration
|
|
||||||
copy:
|
|
||||||
content: "deb [trusted=yes] http://apt.releases.hashicorp.com bookworm main\n"
|
|
||||||
dest: "/etc/apt/sources.list.d/hashicorp.list"
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0644'
|
|
||||||
|
|
||||||
- name: Update the APT cache
|
|
||||||
apt:
|
|
||||||
update_cache: yes
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Verify the configuration
|
|
||||||
command: cat /etc/apt/sources.list.d/hashicorp.list
|
|
||||||
register: config_check
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Show the configuration contents
|
|
||||||
debug:
|
|
||||||
msg: "HashiCorp APT source configuration: {{ config_check.stdout }}"
|
|
||||||
|
|
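`[trusted=yes]` tells APT to skip signature verification for this source, and the `http://` URL adds no transport protection. The sketch below is a hedged alternative. It assumes the nodes can reach `https://apt.releases.hashicorp.com/gpg` (the key URL from HashiCorp's install instructions) and run an APT version that accepts ASCII-armored keys in `signed-by`. The keyring path is an illustrative choice.

```yaml
# Sketch only: verify packages against HashiCorp's signing key instead of trusting blindly.
- name: Download the HashiCorp signing key (ASCII-armored)
  ansible.builtin.get_url:
    url: https://apt.releases.hashicorp.com/gpg
    dest: /usr/share/keyrings/hashicorp-archive-keyring.asc   # assumed path
    owner: root
    group: root
    mode: '0644'

- name: Write the HashiCorp APT source with signature verification
  ansible.builtin.copy:
    content: "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.asc] https://apt.releases.hashicorp.com bookworm main\n"
    dest: /etc/apt/sources.list.d/hashicorp.list
    owner: root
    group: root
    mode: '0644'

- name: Refresh the APT cache
  ansible.builtin.apt:
    update_cache: true
```

If a node cannot complete the HTTPS download, the key file could instead be copied from the control machine with `ansible.builtin.copy`.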
@ -1,98 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix Nomad Cluster Configuration
|
|
||||||
hosts: nomad_servers
|
|
||||||
become: yes
|
|
||||||
vars:
|
|
||||||
nomad_servers_list:
|
|
||||||
- "100.116.158.95" # semaphore
|
|
||||||
- "100.103.147.94" # ash2e
|
|
||||||
- "100.81.26.3" # ash1d
|
|
||||||
- "100.90.159.68" # ch2
|
|
||||||
- "{{ ansible_default_ipv4.address }}" # ch3 (will be determined dynamically)
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Stop Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: stopped
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create nomad user
|
|
||||||
user:
|
|
||||||
name: nomad
|
|
||||||
system: yes
|
|
||||||
shell: /bin/false
|
|
||||||
home: /opt/nomad
|
|
||||||
create_home: no
|
|
||||||
|
|
||||||
- name: Create Nomad configuration directory
|
|
||||||
file:
|
|
||||||
path: /etc/nomad.d
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Create Nomad data directory
|
|
||||||
file:
|
|
||||||
path: /opt/nomad/data
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create Nomad log directory
|
|
||||||
file:
|
|
||||||
path: /var/log/nomad
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Generate Nomad server configuration
|
|
||||||
template:
|
|
||||||
src: nomad-server.hcl.j2
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
mode: '0644'
|
|
||||||
notify: restart nomad
|
|
||||||
|
|
||||||
- name: Create Nomad systemd service file
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
[Unit]
|
|
||||||
Description=Nomad
|
|
||||||
Documentation=https://www.nomadproject.io/
|
|
||||||
Requires=network-online.target
|
|
||||||
After=network-online.target
|
|
||||||
ConditionFileNotEmpty=/etc/nomad.d/nomad.hcl
|
|
||||||
|
|
||||||
[Service]
|
|
||||||
Type=notify
|
|
||||||
User=nomad
|
|
||||||
Group=nomad
|
|
||||||
ExecStart=/usr/bin/nomad agent -config=/etc/nomad.d/nomad.hcl
|
|
||||||
ExecReload=/bin/kill -HUP $MAINPID
|
|
||||||
KillMode=process
|
|
||||||
Restart=on-failure
|
|
||||||
LimitNOFILE=65536
|
|
||||||
|
|
||||||
[Install]
|
|
||||||
WantedBy=multi-user.target
|
|
||||||
dest: /etc/systemd/system/nomad.service
|
|
||||||
mode: '0644'
|
|
||||||
|
|
||||||
- name: Reload systemd daemon
|
|
||||||
systemd:
|
|
||||||
daemon_reload: yes
|
|
||||||
|
|
||||||
- name: Enable and start Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
|
|
||||||
handlers:
|
|
||||||
- name: restart nomad
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: restarted
|
|
||||||
|
|
@ -1,45 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix Nomad server configuration
|
|
||||||
hosts: localhost
|
|
||||||
gather_facts: no
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Create corrected nomad.hcl
|
|
||||||
copy:
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
content: |
|
|
||||||
datacenter = "dc1"
|
|
||||||
data_dir = "/opt/nomad/data"
|
|
||||||
log_level = "INFO"
|
|
||||||
|
|
||||||
bind_addr = "100.116.158.95"
|
|
||||||
|
|
||||||
server {
|
|
||||||
enabled = true
|
|
||||||
bootstrap_expect = 5
|
|
||||||
encrypt = "NVOMDvXblgWfhtzFzOUIHnKEOrbXOkPrkIPbRGGf1YQ="
|
|
||||||
retry_join = [
|
|
||||||
"100.116.158.95", # semaphore
|
|
||||||
"100.81.26.3", # ash1d
|
|
||||||
"100.103.147.94", # ash2e
|
|
||||||
"100.90.159.68", # ch2
|
|
||||||
"100.86.141.112" # ch3
|
|
||||||
]
|
|
||||||
}
|
|
||||||
|
|
||||||
client {
|
|
||||||
enabled = false
|
|
||||||
}
|
|
||||||
|
|
||||||
plugin "podman" {
|
|
||||||
config {
|
|
||||||
socket_path = "unix:///run/podman/podman.sock"
|
|
||||||
volumes {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
consul {
|
|
||||||
address = "100.116.158.95:8500"
|
|
||||||
}
|
|
||||||
|
|
@ -1,109 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix Nomad server configuration
|
|
||||||
hosts: nomad_servers
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Stop Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: stopped
|
|
||||||
|
|
||||||
- name: Backup current configuration
|
|
||||||
copy:
|
|
||||||
src: /etc/nomad.d/nomad.hcl
|
|
||||||
dest: /etc/nomad.d/nomad.hcl.backup-server-fix
|
|
||||||
remote_src: yes
|
|
||||||
|
|
||||||
- name: Create clean server configuration
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
datacenter = "{{ nomad_datacenter }}"
|
|
||||||
region = "{{ nomad_region }}"
|
|
||||||
data_dir = "/opt/nomad/data"
|
|
||||||
bind_addr = "{{ ansible_default_ipv4.address }}"
|
|
||||||
|
|
||||||
server {
|
|
||||||
enabled = true
|
|
||||||
bootstrap_expect = {{ nomad_bootstrap_expect }}
|
|
||||||
encrypt = "{{ nomad_encrypt_key }}"
|
|
||||||
|
|
||||||
retry_join = [
|
|
||||||
"100.116.158.95",
|
|
||||||
"100.103.147.94",
|
|
||||||
"100.81.26.3",
|
|
||||||
"100.90.159.68",
|
|
||||||
"100.86.141.112"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
|
|
||||||
client {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
|
|
||||||
ui {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
|
|
||||||
addresses {
|
|
||||||
http = "0.0.0.0"
|
|
||||||
rpc = "{{ ansible_default_ipv4.address }}"
|
|
||||||
serf = "{{ ansible_default_ipv4.address }}"
|
|
||||||
}
|
|
||||||
|
|
||||||
ports {
|
|
||||||
http = 4646
|
|
||||||
rpc = 4647
|
|
||||||
serf = 4648
|
|
||||||
}
|
|
||||||
|
|
||||||
plugin "podman" {
|
|
||||||
config {
|
|
||||||
socket_path = "unix:///run/podman/podman.sock"
|
|
||||||
volumes {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
recover_stopped = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
consul {
|
|
||||||
auto_advertise = false
|
|
||||||
server_auto_join = false
|
|
||||||
client_auto_join = false
|
|
||||||
}
|
|
||||||
|
|
||||||
log_level = "INFO"
|
|
||||||
log_file = "/var/log/nomad/nomad.log"
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0640'
|
|
||||||
|
|
||||||
- name: Ensure Podman is installed
|
|
||||||
package:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
|
|
||||||
- name: Enable and start Podman socket
|
|
||||||
systemd:
|
|
||||||
name: podman.socket
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
|
|
||||||
- name: Validate Nomad configuration
|
|
||||||
shell: /usr/local/bin/nomad config validate /etc/nomad.d/nomad.hcl || /usr/bin/nomad config validate /etc/nomad.d/nomad.hcl
|
|
||||||
register: config_validation
|
|
||||||
failed_when: config_validation.rc != 0
|
|
||||||
|
|
||||||
- name: Start Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: started
|
|
||||||
enabled: yes
|
|
||||||
|
|
||||||
- name: Wait for Nomad to be ready
|
|
||||||
wait_for:
|
|
||||||
port: 4646
|
|
||||||
host: localhost
|
|
||||||
delay: 10
|
|
||||||
timeout: 60
|
|
||||||
|
|
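Note on the readiness step above: a port check only confirms that something is listening on 4646. A minimal sketch of a stricter probe follows. It assumes the agent's HTTP API is reachable on localhost over plain HTTP and that no ACL token is required for the health endpoint; neither assumption is confirmed by this playbook.

```yaml
# Sketch only: assumes plain HTTP on localhost:4646 and no ACL token requirement.
- name: Wait for Nomad agent to report healthy
  ansible.builtin.uri:
    url: "http://localhost:4646/v1/agent/health"
    status_code: 200
  register: nomad_health
  # Retry until the agent answers with HTTP 200 (12 tries, 5 s apart).
  until: nomad_health.status == 200
  retries: 12
  delay: 5
```

The retry budget (12 x 5 s) roughly matches the 60 s timeout used by the `wait_for` task.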
@ -1,103 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix Nomad server network configuration
|
|
||||||
hosts: nomad_servers
|
|
||||||
become: yes
|
|
||||||
vars:
|
|
||||||
server_ips:
|
|
||||||
semaphore: "100.116.158.95"
|
|
||||||
ash2e: "100.103.147.94"
|
|
||||||
ash1d: "100.81.26.3"
|
|
||||||
ch2: "100.90.159.68"
|
|
||||||
ch3: "100.86.141.112"
|
|
||||||
tasks:
|
|
||||||
- name: Stop Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: stopped
|
|
||||||
|
|
||||||
- name: Get server IP for this host
|
|
||||||
set_fact:
|
|
||||||
server_ip: "{{ server_ips[inventory_hostname] }}"
|
|
||||||
|
|
||||||
- name: Create corrected server configuration
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
datacenter = "{{ nomad_datacenter }}"
|
|
||||||
region = "{{ nomad_region }}"
|
|
||||||
data_dir = "/opt/nomad/data"
|
|
||||||
bind_addr = "{{ server_ip }}"
|
|
||||||
|
|
||||||
server {
|
|
||||||
enabled = true
|
|
||||||
bootstrap_expect = {{ nomad_bootstrap_expect }}
|
|
||||||
encrypt = "{{ nomad_encrypt_key }}"
|
|
||||||
|
|
||||||
retry_join = [
|
|
||||||
"100.116.158.95",
|
|
||||||
"100.103.147.94",
|
|
||||||
"100.81.26.3",
|
|
||||||
"100.90.159.68",
|
|
||||||
"100.86.141.112"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
|
|
||||||
client {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
|
|
||||||
ui {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
|
|
||||||
addresses {
|
|
||||||
http = "0.0.0.0"
|
|
||||||
rpc = "{{ server_ip }}"
|
|
||||||
serf = "{{ server_ip }}"
|
|
||||||
}
|
|
||||||
|
|
||||||
ports {
|
|
||||||
http = 4646
|
|
||||||
rpc = 4647
|
|
||||||
serf = 4648
|
|
||||||
}
|
|
||||||
|
|
||||||
plugin "podman" {
|
|
||||||
config {
|
|
||||||
socket_path = "unix:///run/podman/podman.sock"
|
|
||||||
volumes {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
recover_stopped = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
consul {
|
|
||||||
auto_advertise = false
|
|
||||||
server_auto_join = false
|
|
||||||
client_auto_join = false
|
|
||||||
}
|
|
||||||
|
|
||||||
log_level = "INFO"
|
|
||||||
log_file = "/var/log/nomad/nomad.log"
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0640'
|
|
||||||
|
|
||||||
- name: Validate Nomad configuration
|
|
||||||
shell: /usr/local/bin/nomad config validate /etc/nomad.d/nomad.hcl || /usr/bin/nomad config validate /etc/nomad.d/nomad.hcl
|
|
||||||
register: config_validation
|
|
||||||
failed_when: config_validation.rc != 0
|
|
||||||
|
|
||||||
- name: Start Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: started
|
|
||||||
enabled: yes
|
|
||||||
|
|
||||||
- name: Wait for Nomad to be ready
|
|
||||||
wait_for:
|
|
||||||
port: 4646
|
|
||||||
host: localhost
|
|
||||||
delay: 10
|
|
||||||
timeout: 60
|
|
||||||
|
|
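Note on the `retry_join` list above: the same five addresses already exist in the `server_ips` variable of this play. A minimal sketch, assuming `server_ips` stays a flat hostname-to-address mapping, shows how the line could be rendered from that variable instead of being repeated by hand:

```yaml
# Sketch only: prints the HCL line that could replace the hand-written retry_join list.
- name: Show the retry_join line rendered from server_ips
  ansible.builtin.debug:
    # to_json emits a double-quoted, comma-separated list, which HCL also accepts.
    msg: "retry_join = {{ server_ips.values() | list | to_json }}"
```

Inside the `copy` content, the same expression could stand in for the literal list, so adding a server would only require touching `server_ips`.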
@ -1,39 +0,0 @@
|
||||||
---
|
|
||||||
- name: Fix Warden docker-compose.yml
|
|
||||||
hosts: warden
|
|
||||||
become: yes
|
|
||||||
gather_facts: no
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Ensure /opt/warden directory exists
|
|
||||||
file:
|
|
||||||
path: /opt/warden
|
|
||||||
state: directory
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Create or update docker-compose.yml with correct indentation
|
|
||||||
copy:
|
|
||||||
dest: /opt/warden/docker-compose.yml
|
|
||||||
content: |
|
|
||||||
services:
|
|
||||||
vaultwarden:
|
|
||||||
image: hub.git4ta.fun/vaultwarden/server:latest
|
|
||||||
security_opt:
|
|
||||||
- "seccomp=unconfined"
|
|
||||||
env_file:
|
|
||||||
- .env
|
|
||||||
volumes:
|
|
||||||
- ./data:/data
|
|
||||||
ports:
|
|
||||||
- "980:80"
|
|
||||||
restart: always
|
|
||||||
networks:
|
|
||||||
- vaultwarden_network
|
|
||||||
|
|
||||||
networks:
|
|
||||||
vaultwarden_network:
|
|
||||||
owner: root
|
|
||||||
group: root
|
|
||||||
mode: '0644'
|
|
||||||
|
|
@ -1,12 +0,0 @@
|
||||||
---
|
|
||||||
- name: Get Tailscale IP for specified nodes
|
|
||||||
hosts: all
|
|
||||||
gather_facts: no
|
|
||||||
tasks:
|
|
||||||
- name: Get tailscale IP
|
|
||||||
shell: "tailscale ip -4"
|
|
||||||
register: tailscale_ip
|
|
||||||
|
|
||||||
- name: Display Tailscale IP
|
|
||||||
debug:
|
|
||||||
msg: "Node {{ inventory_hostname }} has IP: {{ tailscale_ip.stdout }}"
|
|
||||||
|
|
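Note on the play above: `shell` is not needed for a single command, and the task reports "changed" on every run. A minimal sketch of a read-only variant follows; `tailscale_ipv4` is a hypothetical fact name chosen for illustration.

```yaml
# Sketch only: tailscale_ipv4 is a hypothetical fact name.
- name: Get Tailscale IPv4 address
  ansible.builtin.command: tailscale ip -4
  register: tailscale_ip
  changed_when: false   # read-only query, never a change

- name: Store the address as a host fact
  ansible.builtin.set_fact:
    tailscale_ipv4: "{{ tailscale_ip.stdout | trim }}"
```

Later tasks or templates could then reference `tailscale_ipv4` instead of a hard-coded address.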
@ -1,67 +0,0 @@
|
||||||
---
|
|
||||||
- name: Force-upgrade Podman to the latest version
|
|
||||||
hosts: warden
|
|
||||||
become: yes
|
|
||||||
gather_facts: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Check current Podman version
|
|
||||||
shell: podman --version
|
|
||||||
register: current_podman_version
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display current version
|
|
||||||
debug:
|
|
||||||
msg: "Version before upgrade: {{ current_podman_version.stdout if current_podman_version.rc == 0 else 'not installed' }}"
|
|
||||||
|
|
||||||
- name: Uninstall existing Podman
|
|
||||||
shell: apt-get remove -y --purge 'podman*' 'containerd*' 'runc*'
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Clean up leftover configuration
|
|
||||||
shell: |
|
|
||||||
rm -rf /etc/containers
|
|
||||||
rm -rf /usr/share/containers
|
|
||||||
rm -rf /var/lib/containers
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Download and install the latest Podman binary directly
|
|
||||||
shell: |
|
|
||||||
# Remove any old version that may exist
|
|
||||||
rm -f /tmp/podman-latest.tar.gz
|
|
||||||
rm -f /usr/local/bin/podman
|
|
||||||
|
|
||||||
# Latest version number
|
|
||||||
LATEST_VERSION="v5.6.1" # Hard-coded to avoid network issues when looking up the latest version
|
|
||||||
echo "Installing version: $LATEST_VERSION"
|
|
||||||
|
|
||||||
# Download the binary from a GitHub mirror
|
|
||||||
echo "Downloading from the GitHub mirror..."
|
|
||||||
wget -O /tmp/podman-latest.tar.gz "https://gh.git4ta.fun/github.com/containers/podman/releases/download/${LATEST_VERSION}/podman-linux-static-amd64.tar.gz"
|
|
||||||
|
|
||||||
# Check whether the download succeeded; if not, try downloading directly
|
|
||||||
if [ ! -s /tmp/podman-latest.tar.gz ]; then
|
|
||||||
echo "Mirror download failed, trying a direct download..."
|
|
||||||
wget -O /tmp/podman-latest.tar.gz "https://github.com/containers/podman/releases/download/${LATEST_VERSION}/podman-linux-static-amd64.tar.gz"
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Extract and install
|
|
||||||
tar -xzf /tmp/podman-latest.tar.gz -C /usr/local/bin/ --strip-components=1
|
|
||||||
chmod +x /usr/local/bin/podman
|
|
||||||
|
|
||||||
# Update PATH
|
|
||||||
echo 'export PATH=/usr/local/bin:$PATH' >> /etc/profile
|
|
||||||
. /etc/profile
|
|
||||||
|
|
||||||
# Verify the installation
|
|
||||||
/usr/local/bin/podman --version
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Verify installation result
|
|
||||||
shell: podman --version
|
|
||||||
register: new_podman_version
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display final version
|
|
||||||
debug:
|
|
||||||
msg: "Version after upgrade: {{ new_podman_version.stdout if new_podman_version.rc == 0 else 'installation failed' }}"
|
|
||||||
|
|
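Note on the download step above: the mirror-then-upstream fallback lives inside one shell script whose errors are ignored. A minimal sketch of the same fallback using Ansible's `block`/`rescue` follows. `podman_version` and `podman_asset` are hypothetical variable names; the URLs are the two already used in the script.

```yaml
# Sketch only: podman_version and podman_asset are hypothetical variables.
- name: Download Podman static build with mirror fallback
  vars:
    podman_version: "v5.6.1"
    podman_asset: "podman-linux-static-amd64.tar.gz"
  block:
    - name: Try the GitHub mirror first
      ansible.builtin.get_url:
        url: "https://gh.git4ta.fun/github.com/containers/podman/releases/download/{{ podman_version }}/{{ podman_asset }}"
        dest: /tmp/podman-latest.tar.gz
        mode: '0644'
  rescue:
    # Runs only if a task in the block above failed.
    - name: Fall back to the upstream release URL
      ansible.builtin.get_url:
        url: "https://github.com/containers/podman/releases/download/{{ podman_version }}/{{ podman_asset }}"
        dest: /tmp/podman-latest.tar.gz
        mode: '0644'
```

With this shape a failed download stops the play instead of falling through to the extraction step with an empty archive.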
@ -1,161 +0,0 @@
|
||||||
---
|
|
||||||
- name: Install and Configure Nomad Podman Driver on Client Nodes
|
|
||||||
hosts: nomad_clients
|
|
||||||
become: yes
|
|
||||||
vars:
|
|
||||||
nomad_plugin_dir: "/opt/nomad/plugins"
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Create backup directory with timestamp
|
|
||||||
set_fact:
|
|
||||||
backup_dir: "/root/backup/{{ ansible_date_time.date }}_{{ ansible_date_time.hour }}{{ ansible_date_time.minute }}{{ ansible_date_time.second }}"
|
|
||||||
|
|
||||||
- name: Create backup directory
|
|
||||||
file:
|
|
||||||
path: "{{ backup_dir }}"
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Backup current Nomad configuration
|
|
||||||
copy:
|
|
||||||
src: /etc/nomad.d/nomad.hcl
|
|
||||||
dest: "{{ backup_dir }}/nomad.hcl.backup"
|
|
||||||
remote_src: yes
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Backup current apt sources
|
|
||||||
shell: |
|
|
||||||
cp -r /etc/apt/sources.list* {{ backup_dir }}/
|
|
||||||
dpkg --get-selections > {{ backup_dir }}/installed_packages.txt
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create temporary directory for apt
|
|
||||||
file:
|
|
||||||
path: /tmp/apt-temp
|
|
||||||
state: directory
|
|
||||||
mode: '1777'
|
|
||||||
|
|
||||||
- name: Download HashiCorp GPG key
|
|
||||||
get_url:
|
|
||||||
url: https://apt.releases.hashicorp.com/gpg
|
|
||||||
dest: /tmp/hashicorp.gpg
|
|
||||||
mode: '0644'
|
|
||||||
environment:
|
|
||||||
TMPDIR: /tmp/apt-temp
|
|
||||||
|
|
||||||
- name: Install HashiCorp GPG key
|
|
||||||
shell: |
|
|
||||||
gpg --dearmor < /tmp/hashicorp.gpg > /usr/share/keyrings/hashicorp-archive-keyring.gpg
|
|
||||||
environment:
|
|
||||||
TMPDIR: /tmp/apt-temp
|
|
||||||
|
|
||||||
- name: Add HashiCorp repository
|
|
||||||
lineinfile:
|
|
||||||
path: /etc/apt/sources.list.d/hashicorp.list
|
|
||||||
line: "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com {{ ansible_distribution_release }} main"
|
|
||||||
create: yes
|
|
||||||
mode: '0644'
|
|
||||||
|
|
||||||
- name: Update apt cache
|
|
||||||
apt:
|
|
||||||
update_cache: yes
|
|
||||||
environment:
|
|
||||||
TMPDIR: /tmp/apt-temp
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Install nomad-driver-podman
|
|
||||||
apt:
|
|
||||||
name: nomad-driver-podman
|
|
||||||
state: present
|
|
||||||
environment:
|
|
||||||
TMPDIR: /tmp/apt-temp
|
|
||||||
|
|
||||||
- name: Create Nomad plugin directory
|
|
||||||
file:
|
|
||||||
path: "{{ nomad_plugin_dir }}"
|
|
||||||
state: directory
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Create symlink for nomad-driver-podman in plugin directory
|
|
||||||
file:
|
|
||||||
src: /usr/bin/nomad-driver-podman
|
|
||||||
dest: "{{ nomad_plugin_dir }}/nomad-driver-podman"
|
|
||||||
state: link
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
|
|
||||||
- name: Get server IP address
|
|
||||||
shell: |
|
|
||||||
ip route get 1.1.1.1 | grep -oP 'src \K\S+'
|
|
||||||
register: server_ip_result
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Set server IP fact
|
|
||||||
set_fact:
|
|
||||||
server_ip: "{{ server_ip_result.stdout }}"
|
|
||||||
|
|
||||||
- name: Stop Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: stopped
|
|
||||||
|
|
||||||
- name: Create updated Nomad client configuration
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
datacenter = "{{ nomad_datacenter }}"
|
|
||||||
data_dir = "/opt/nomad/data"
|
|
||||||
log_level = "INFO"
|
|
||||||
bind_addr = "{{ server_ip }}"
|
|
||||||
|
|
||||||
server {
|
|
||||||
enabled = false
|
|
||||||
}
|
|
||||||
|
|
||||||
client {
|
|
||||||
enabled = true
|
|
||||||
servers = ["100.117.106.136:4647", "100.116.80.94:4647", "100.97.62.111:4647", "100.116.112.45:4647", "100.84.197.26:4647"]
|
|
||||||
}
|
|
||||||
|
|
||||||
plugin_dir = "{{ nomad_plugin_dir }}"
|
|
||||||
|
|
||||||
plugin "nomad-driver-podman" {
|
|
||||||
config {
|
|
||||||
volumes {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
recover_stopped = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
consul {
|
|
||||||
address = "127.0.0.1:8500"
|
|
||||||
}
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0640'
|
|
||||||
backup: yes
|
|
||||||
|
|
||||||
- name: Validate Nomad configuration
|
|
||||||
shell: nomad config validate /etc/nomad.d/nomad.hcl
|
|
||||||
register: nomad_validate
|
|
||||||
failed_when: nomad_validate.rc != 0
|
|
||||||
|
|
||||||
- name: Start Nomad service
|
|
||||||
systemd:
|
|
||||||
name: nomad
|
|
||||||
state: started
|
|
||||||
enabled: yes
|
|
||||||
|
|
||||||
- name: Wait for Nomad to be ready
|
|
||||||
wait_for:
|
|
||||||
port: 4646
|
|
||||||
host: "{{ server_ip }}"
|
|
||||||
delay: 5
|
|
||||||
timeout: 60
|
|
||||||
|
|
||||||
- name: Display backup location
|
|
||||||
debug:
|
|
||||||
msg: "Backup created at: {{ backup_dir }}"
|
|
||||||
|
|
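Note on the GPG key steps above: the `gpg --dearmor` shell task rewrites the keyring on every run. A minimal sketch of an alternative follows. It assumes the target hosts run an apt version that accepts ASCII-armored keys with an `.asc` extension in `signed-by`; the `/etc/apt/keyrings/hashicorp.asc` path is an illustrative choice, not taken from the playbook.

```yaml
# Sketch only: the keyring path is an assumption; apt must accept .asc keys in signed-by.
- name: Ensure the keyrings directory exists
  ansible.builtin.file:
    path: /etc/apt/keyrings
    state: directory
    mode: '0755'

- name: Download HashiCorp key in ASCII-armored form
  ansible.builtin.get_url:
    url: https://apt.releases.hashicorp.com/gpg
    dest: /etc/apt/keyrings/hashicorp.asc
    mode: '0644'

- name: Add HashiCorp repository
  ansible.builtin.apt_repository:
    repo: "deb [signed-by=/etc/apt/keyrings/hashicorp.asc] https://apt.releases.hashicorp.com {{ ansible_distribution_release }} main"
    filename: hashicorp
    state: present
```

All three tasks are idempotent, so repeated runs report no change once the key and repository are in place.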
@ -1,8 +1,10 @@
|
||||||
---
|
---
|
||||||
- name: Install Nomad by direct download from HashiCorp
|
- name: Install Nomad by direct download from HashiCorp
|
||||||
hosts: all
|
hosts: hcs
|
||||||
become: yes
|
become: yes
|
||||||
vars:
|
vars:
|
||||||
|
nomad_version: "1.10.5"
|
||||||
|
nomad_url: "https://releases.hashicorp.com/nomad/{{ nomad_version }}/nomad_{{ nomad_version }}_linux_amd64.zip"
|
||||||
nomad_user: "nomad"
|
nomad_user: "nomad"
|
||||||
nomad_group: "nomad"
|
nomad_group: "nomad"
|
||||||
nomad_home: "/opt/nomad"
|
nomad_home: "/opt/nomad"
|
||||||
|
|
|
||||||
|
|
@ -1,218 +0,0 @@
|
||||||
---
|
|
||||||
- name: Integrated Podman Setup - Remove Docker, Install and Configure Podman with Compose for Nomad
|
|
||||||
hosts: all
|
|
||||||
become: yes
|
|
||||||
gather_facts: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Show the node being processed
|
|
||||||
debug:
|
|
||||||
msg: "🔧 Starting integrated Podman setup: {{ inventory_hostname }}"
|
|
||||||
|
|
||||||
- name: Check Docker service status
|
|
||||||
shell: systemctl is-active docker 2>/dev/null || echo "inactive"
|
|
||||||
register: docker_status
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Stop Docker service
|
|
||||||
systemd:
|
|
||||||
name: docker
|
|
||||||
state: stopped
|
|
||||||
enabled: no
|
|
||||||
ignore_errors: yes
|
|
||||||
when: docker_status.stdout == "active"
|
|
||||||
|
|
||||||
- name: Stop Docker socket
|
|
||||||
systemd:
|
|
||||||
name: docker.socket
|
|
||||||
state: stopped
|
|
||||||
enabled: no
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove Docker-related packages
|
|
||||||
apt:
|
|
||||||
name:
|
|
||||||
- docker-ce
|
|
||||||
- docker-ce-cli
|
|
||||||
- containerd.io
|
|
||||||
- docker-buildx-plugin
|
|
||||||
- docker-compose-plugin
|
|
||||||
- docker.io
|
|
||||||
- docker-doc
|
|
||||||
- docker-compose
|
|
||||||
- docker-registry
|
|
||||||
- containerd
|
|
||||||
- runc
|
|
||||||
state: absent
|
|
||||||
purge: yes
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Clean up Docker data directories
|
|
||||||
file:
|
|
||||||
path: "{{ item }}"
|
|
||||||
state: absent
|
|
||||||
loop:
|
|
||||||
- /var/lib/docker
|
|
||||||
- /var/lib/containerd
|
|
||||||
- /etc/docker
|
|
||||||
- /etc/containerd
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove the Docker group
|
|
||||||
group:
|
|
||||||
name: docker
|
|
||||||
state: absent
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Update package cache
|
|
||||||
apt:
|
|
||||||
update_cache: yes
|
|
||||||
cache_valid_time: 3600
|
|
||||||
|
|
||||||
- name: Install Podman and related tools
|
|
||||||
apt:
|
|
||||||
name:
|
|
||||||
- podman
|
|
||||||
- buildah
|
|
||||||
- skopeo
|
|
||||||
- python3-pip
|
|
||||||
- python3-setuptools
|
|
||||||
state: present
|
|
||||||
retries: 3
|
|
||||||
delay: 10
|
|
||||||
|
|
||||||
- name: Install Podman Compose via pip
|
|
||||||
pip:
|
|
||||||
name: podman-compose
|
|
||||||
state: present
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Enable Podman socket service
|
|
||||||
systemd:
|
|
||||||
name: podman.socket
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create Podman user service directory
|
|
||||||
file:
|
|
||||||
path: /etc/systemd/user
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Verify Podman installation
|
|
||||||
shell: podman --version
|
|
||||||
register: podman_version
|
|
||||||
|
|
||||||
- name: Verify Podman Compose installation
|
|
||||||
shell: podman-compose --version 2>/dev/null || echo "not installed"
|
|
||||||
register: podman_compose_version
|
|
||||||
|
|
||||||
- name: Check Docker cleanup status
|
|
||||||
shell: systemctl is-active docker 2>/dev/null || echo "removed"
|
|
||||||
register: final_docker_status
|
|
||||||
|
|
||||||
- name: Show Docker removal and Podman installation results
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
✅ Node {{ inventory_hostname }}: Docker removal and Podman installation complete
|
|
||||||
🐳 Docker status: {{ final_docker_status.stdout }}
|
|
||||||
📦 Podman version: {{ podman_version.stdout }}
|
|
||||||
🔧 Compose status: {{ podman_compose_version.stdout }}
|
|
||||||
|
|
||||||
- name: Create Podman system configuration directory
|
|
||||||
file:
|
|
||||||
path: /etc/containers
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Configure Podman to use the system socket
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
[engine]
|
|
||||||
# Use the system-level socket rather than the per-user socket
|
|
||||||
active_service = "system"
|
|
||||||
[engine.service_destinations]
|
|
||||||
[engine.service_destinations.system]
|
|
||||||
uri = "unix:///run/podman/podman.sock"
|
|
||||||
dest: /etc/containers/containers.conf
|
|
||||||
mode: '0644'
|
|
||||||
|
|
||||||
- name: Check whether the nomad user exists
|
|
||||||
getent:
|
|
||||||
database: passwd
|
|
||||||
key: nomad
|
|
||||||
register: nomad_user_check
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create configuration directory for the nomad user
|
|
||||||
file:
|
|
||||||
path: "/home/nomad/.config/containers"
|
|
||||||
state: directory
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0755'
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
|
|
||||||
- name: Configure Podman for the nomad user
|
|
||||||
copy:
|
|
||||||
content: |
|
|
||||||
[engine]
|
|
||||||
active_service = "system"
|
|
||||||
[engine.service_destinations]
|
|
||||||
[engine.service_destinations.system]
|
|
||||||
uri = "unix:///run/podman/podman.sock"
|
|
||||||
dest: /home/nomad/.config/containers/containers.conf
|
|
||||||
owner: nomad
|
|
||||||
group: nomad
|
|
||||||
mode: '0644'
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
|
|
||||||
- name: Add the nomad user to the podman group
|
|
||||||
user:
|
|
||||||
name: nomad
|
|
||||||
groups: podman
|
|
||||||
append: yes
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create podman group (if it does not exist)
|
|
||||||
group:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Set permissions on the podman socket directory
|
|
||||||
file:
|
|
||||||
path: /run/podman
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
group: podman
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Ensure Podman socket permissions
|
|
||||||
file:
|
|
||||||
path: /run/podman/podman.sock
|
|
||||||
mode: '0666'
|
|
||||||
when: nomad_user_check is succeeded
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Test Podman functionality
|
|
||||||
shell: podman info
|
|
||||||
register: podman_info
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Clean apt cache
|
|
||||||
apt:
|
|
||||||
autoclean: yes
|
|
||||||
autoremove: yes
|
|
||||||
|
|
||||||
- name: Show final configuration results
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
🎉 Node {{ inventory_hostname }}: integrated Podman setup complete!
|
|
||||||
📦 Podman version: {{ podman_version.stdout }}
|
|
||||||
🐳 Podman Compose: {{ podman_compose_version.stdout }}
|
|
||||||
👤 Nomad user: {{ 'FOUND' if nomad_user_check is succeeded else 'NOT FOUND' }}
|
|
||||||
🔧 Podman status: {{ 'SUCCESS' if podman_info.rc == 0 else 'WARNING' }}
|
|
||||||
🚀 Docker removed; Podman configured for integration with Nomad
|
|
||||||
|
|
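Note on task order in the play above: the nomad user is added to the `podman` group before the task that creates that group, and both tasks ignore errors, so on a fresh host the membership may silently never be applied. A minimal sketch of the two tasks in dependency order, reusing the `nomad_user_check` result registered earlier in the play:

```yaml
# Sketch only: same modules and values as the play above, reordered.
- name: Create podman group first
  ansible.builtin.group:
    name: podman
    state: present

- name: Add the nomad user to the podman group
  ansible.builtin.user:
    name: nomad
    groups: podman
    append: yes   # keep the user's existing supplementary groups
  when: nomad_user_check is succeeded
```

With the group guaranteed to exist, `ignore_errors` is no longer needed on either task.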
@ -1,22 +0,0 @@
|
||||||
---
|
|
||||||
- name: Manually run Nomad agent for debugging
|
|
||||||
hosts: germany
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Find Nomad binary path
|
|
||||||
shell: which nomad || find /usr -name nomad 2>/dev/null | head -1
|
|
||||||
register: nomad_binary_path
|
|
||||||
failed_when: nomad_binary_path.stdout == ""
|
|
||||||
|
|
||||||
- name: Run nomad agent directly
|
|
||||||
command: "{{ nomad_binary_path.stdout }} agent -config=/etc/nomad.d/nomad.hcl"
|
|
||||||
register: nomad_run
|
|
||||||
failed_when: false
|
|
||||||
|
|
||||||
- name: Display Nomad output
|
|
||||||
debug:
|
|
||||||
var: nomad_run.stdout
|
|
||||||
|
|
||||||
- name: Display Nomad error output
|
|
||||||
debug:
|
|
||||||
var: nomad_run.stderr
|
|
||||||
|
|
@ -1,7 +0,0 @@
|
||||||
---
|
|
||||||
- name: Ping nodes to check connectivity
|
|
||||||
hosts: all
|
|
||||||
gather_facts: no
|
|
||||||
tasks:
|
|
||||||
- name: Ping the host
|
|
||||||
ping:
|
|
||||||
|
|
@ -1,12 +0,0 @@
|
||||||
- name: Read Nomad config on germany
|
|
||||||
hosts: germany
|
|
||||||
gather_facts: false
|
|
||||||
tasks:
|
|
||||||
- name: Read nomad.hcl
|
|
||||||
command: cat /etc/nomad.d/nomad.hcl
|
|
||||||
register: nomad_config
|
|
||||||
ignore_errors: true
|
|
||||||
|
|
||||||
- name: Display config
|
|
||||||
debug:
|
|
||||||
msg: "{{ nomad_config.stdout }}"
|
|
||||||
|
|
@ -1,13 +0,0 @@
|
||||||
---
|
|
||||||
- name: Read Nomad config file
|
|
||||||
hosts: localhost
|
|
||||||
gather_facts: no
|
|
||||||
tasks:
|
|
||||||
- name: Read nomad.hcl
|
|
||||||
slurp:
|
|
||||||
src: /etc/nomad.d/nomad.hcl
|
|
||||||
register: nomad_config
|
|
||||||
|
|
||||||
- name: Display Nomad config
|
|
||||||
debug:
|
|
||||||
msg: "{{ nomad_config['content'] | b64decode }}"
|
|
||||||
|
|
@ -1,126 +0,0 @@
|
||||||
---
|
|
||||||
- name: Remove Docker and install Podman with Compose support
|
|
||||||
hosts: all
|
|
||||||
become: yes
|
|
||||||
gather_facts: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Show the node being processed
|
|
||||||
debug:
|
|
||||||
msg: "🔧 Processing node: {{ inventory_hostname }}"
|
|
||||||
|
|
||||||
- name: Check Docker service status
|
|
||||||
shell: systemctl is-active docker 2>/dev/null || echo "inactive"
|
|
||||||
register: docker_status
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Stop Docker service
|
|
||||||
systemd:
|
|
||||||
name: docker
|
|
||||||
state: stopped
|
|
||||||
enabled: no
|
|
||||||
ignore_errors: yes
|
|
||||||
when: docker_status.stdout == "active"
|
|
||||||
|
|
||||||
- name: Stop Docker socket
|
|
||||||
systemd:
|
|
||||||
name: docker.socket
|
|
||||||
state: stopped
|
|
||||||
enabled: no
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove Docker-related packages
|
|
||||||
apt:
|
|
||||||
name:
|
|
||||||
- docker-ce
|
|
||||||
- docker-ce-cli
|
|
||||||
- containerd.io
|
|
||||||
- docker-buildx-plugin
|
|
||||||
- docker-compose-plugin
|
|
||||||
- docker.io
|
|
||||||
- docker-doc
|
|
||||||
- docker-compose
|
|
||||||
- docker-registry
|
|
||||||
- containerd
|
|
||||||
- runc
|
|
||||||
state: absent
|
|
||||||
purge: yes
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Clean up Docker data directories
|
|
||||||
file:
|
|
||||||
path: "{{ item }}"
|
|
||||||
state: absent
|
|
||||||
loop:
|
|
||||||
- /var/lib/docker
|
|
||||||
- /var/lib/containerd
|
|
||||||
- /etc/docker
|
|
||||||
- /etc/containerd
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove the Docker group
|
|
||||||
group:
|
|
||||||
name: docker
|
|
||||||
state: absent
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Update package cache
|
|
||||||
apt:
|
|
||||||
update_cache: yes
|
|
||||||
cache_valid_time: 3600
|
|
||||||
|
|
||||||
- name: Install Podman and related tools
|
|
||||||
apt:
|
|
||||||
name:
|
|
||||||
- podman
|
|
||||||
- buildah
|
|
||||||
- skopeo
|
|
||||||
- python3-pip
|
|
||||||
- python3-setuptools
|
|
||||||
state: present
|
|
||||||
retries: 3
|
|
||||||
delay: 10
|
|
||||||
|
|
||||||
- name: Install Podman Compose via pip
|
|
||||||
pip:
|
|
||||||
name: podman-compose
|
|
||||||
state: present
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Enable Podman socket service
|
|
||||||
systemd:
|
|
||||||
name: podman.socket
|
|
||||||
enabled: yes
|
|
||||||
state: started
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Create Podman user service directory
|
|
||||||
file:
|
|
||||||
path: /etc/systemd/user
|
|
||||||
state: directory
|
|
||||||
mode: '0755'
|
|
||||||
|
|
||||||
- name: Verify Podman installation
|
|
||||||
shell: podman --version
|
|
||||||
register: podman_version
|
|
||||||
|
|
||||||
- name: Verify Podman Compose installation
|
|
||||||
shell: podman-compose --version 2>/dev/null || echo "not installed"
|
|
||||||
register: podman_compose_version
|
|
||||||
|
|
||||||
- name: Check Docker cleanup status
|
|
||||||
shell: systemctl is-active docker 2>/dev/null || echo "removed"
|
|
||||||
register: final_docker_status
|
|
||||||
|
|
||||||
- name: Show node processing results
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
✅ Node {{ inventory_hostname }} processing complete
|
|
||||||
🐳 Docker status: {{ final_docker_status.stdout }}
|
|
||||||
📦 Podman version: {{ podman_version.stdout }}
|
|
||||||
🔧 Compose status: {{ podman_compose_version.stdout }}
|
|
||||||
|
|
||||||
- name: Clean apt cache
|
|
||||||
apt:
|
|
||||||
autoclean: yes
|
|
||||||
autoremove: yes
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
---
|
---
|
||||||
- name: Install and configure new Nomad server nodes
|
- name: Install and configure new Nomad server nodes
|
||||||
hosts: influxdb1
|
hosts: ash2e,ash1d,ch2
|
||||||
become: yes
|
become: yes
|
||||||
gather_facts: no
|
gather_facts: no
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,100 +0,0 @@
|
||||||
---
|
|
||||||
- name: Test switching Podman to the Snap version (ch2 node)
|
|
||||||
hosts: ch2
|
|
||||||
become: yes
|
|
||||||
gather_facts: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Check current Podman version and installation method
|
|
||||||
shell: |
|
|
||||||
echo "=== Current Podman info ==="
|
|
||||||
podman --version
|
|
||||||
echo "Install path: $(which podman)"
|
|
||||||
echo "=== Snap status ==="
|
|
||||||
which snap || echo "snap is not installed"
|
|
||||||
snap list podman 2>/dev/null || echo "Podman snap is not installed"
|
|
||||||
echo "=== Package manager status ==="
|
|
||||||
dpkg -l | grep podman || echo "not installed via apt"
|
|
||||||
register: current_status
|
|
||||||
|
|
||||||
- name: Display current status
|
|
||||||
debug:
|
|
||||||
msg: "{{ current_status.stdout }}"
|
|
||||||
|
|
||||||
- name: Check whether snap is installed
|
|
||||||
shell: which snap
|
|
||||||
register: snap_check
|
|
||||||
ignore_errors: yes
|
|
||||||
changed_when: false
|
|
||||||
|
|
||||||
- name: Install snapd (if not installed)
|
|
||||||
apt:
|
|
||||||
name: snapd
|
|
||||||
state: present
|
|
||||||
when: snap_check.rc != 0
|
|
||||||
|
|
||||||
- name: Ensure the snapd service is running
|
|
||||||
systemd:
|
|
||||||
name: snapd
|
|
||||||
state: started
|
|
||||||
enabled: yes
|
|
||||||
|
|
||||||
- name: Check current Podman snap version
|
|
||||||
shell: snap info podman
|
|
||||||
register: snap_podman_info
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display available Podman snap versions
|
|
||||||
debug:
|
|
||||||
msg: "{{ snap_podman_info.stdout if snap_podman_info.rc == 0 else 'Unable to get snap podman info' }}"
|
|
||||||
|
|
||||||
- name: Stop current Podman-related services
|
|
||||||
systemd:
|
|
||||||
name: podman
|
|
||||||
state: stopped
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Remove Podman installed via the package manager
|
|
||||||
apt:
|
|
||||||
name: podman
|
|
||||||
state: absent
|
|
||||||
purge: yes
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Install Podman snap (edge channel)
|
|
||||||
snap:
|
|
||||||
name: podman
|
|
||||||
state: present
|
|
||||||
classic: yes
|
|
||||||
channel: edge
|
|
||||||
|
|
||||||
- name: Create symlink (ensure the podman command is available)
|
|
||||||
file:
|
|
||||||
src: /snap/bin/podman
|
|
||||||
dest: /usr/local/bin/podman
|
|
||||||
state: link
|
|
||||||
force: yes
|
|
||||||
|
|
||||||
- name: Verify Snap Podman installation
|
|
||||||
shell: |
|
|
||||||
/snap/bin/podman --version
|
|
||||||
which podman
|
|
||||||
register: snap_podman_verify
|
|
||||||
|
|
||||||
- name: Display installation result
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
✅ Snap Podman installation complete
|
|
||||||
🚀 Version: {{ snap_podman_verify.stdout_lines[0] }}
|
|
||||||
📍 Path: {{ snap_podman_verify.stdout_lines[1] }}
|
|
||||||
|
|
||||||
- name: Test basic Podman functionality
|
|
||||||
shell: |
|
|
||||||
/snap/bin/podman version
|
|
||||||
/snap/bin/podman info --format json | jq -r '.host.arch'
|
|
||||||
register: podman_test
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display test results
|
|
||||||
debug:
|
|
||||||
msg: "Podman test result: {{ podman_test.stdout if podman_test.rc == 0 else 'test failed' }}"
|
|
||||||
|
|
@ -1,37 +0,0 @@
|
||||||
---
|
|
||||||
- name: Update Nomad config to run as a client
|
|
||||||
hosts: localhost
|
|
||||||
gather_facts: no
|
|
||||||
become: yes
|
|
||||||
tasks:
|
|
||||||
- name: Create new nomad.hcl
|
|
||||||
copy:
|
|
||||||
dest: /etc/nomad.d/nomad.hcl
|
|
||||||
content: |
|
|
||||||
datacenter = "dc1"
|
|
||||||
data_dir = "/opt/nomad/data"
|
|
||||||
log_level = "INFO"
|
|
||||||
|
|
||||||
bind_addr = "100.116.158.95"
|
|
||||||
|
|
||||||
server {
|
|
||||||
enabled = false
|
|
||||||
}
|
|
||||||
|
|
||||||
client {
|
|
||||||
enabled = true
|
|
||||||
servers = ["100.81.26.3:4647", "100.103.147.94:4647", "100.90.159.68:4647"]
|
|
||||||
}
|
|
||||||
|
|
||||||
plugin "podman" {
|
|
||||||
config {
|
|
||||||
socket_path = "unix:///run/podman/podman.sock"
|
|
||||||
volumes {
|
|
||||||
enabled = true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
consul {
|
|
||||||
address = "100.116.158.95:8500"
|
|
||||||
}
|
|
||||||
|
|
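Note on the play above: it rewrites `/etc/nomad.d/nomad.hcl` but never restarts the agent, so the new client configuration takes effect only after a manual restart. A minimal sketch of a handler-based variant follows. The `files/nomad-client.hcl` source path is hypothetical and stands in for the inline `content` block.

```yaml
# Sketch only: files/nomad-client.hcl is a hypothetical path for the config shown above.
- name: Update Nomad client config and restart only on change
  hosts: localhost
  gather_facts: no
  become: yes
  tasks:
    - name: Install nomad.hcl
      ansible.builtin.copy:
        src: files/nomad-client.hcl
        dest: /etc/nomad.d/nomad.hcl
        mode: '0640'
      notify: Restart nomad   # fires only when the file content changed

  handlers:
    - name: Restart nomad
      ansible.builtin.systemd:
        name: nomad
        state: restarted
```

Because the handler runs only on change, re-running the play against an already-converged host leaves the agent untouched.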
@ -1,77 +0,0 @@
|
||||||
---
|
|
||||||
- name: Upgrade Podman to the latest version (warden node test)
|
|
||||||
hosts: warden
|
|
||||||
become: yes
|
|
||||||
gather_facts: yes
|
|
||||||
|
|
||||||
tasks:
|
|
||||||
- name: Check current Podman version
|
|
||||||
shell: podman --version
|
|
||||||
register: current_podman_version
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display current version
|
|
||||||
debug:
|
|
||||||
msg: "Current Podman version: {{ current_podman_version.stdout if current_podman_version.rc == 0 else 'not installed or unavailable' }}"
|
|
||||||
|
|
||||||
- name: Back up existing Podman configuration
|
|
||||||
shell: |
|
|
||||||
if [ -d /etc/containers ]; then
|
|
||||||
cp -r /etc/containers /etc/containers.backup.$(date +%Y%m%d)
|
|
||||||
fi
|
|
||||||
if [ -d /usr/share/containers ]; then
|
|
||||||
cp -r /usr/share/containers /usr/share/containers.backup.$(date +%Y%m%d)
|
|
||||||
fi
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Add Kubic repository (HTTP, skip signature verification)
|
|
||||||
shell: |
|
|
||||||
# Add the repository and skip signature verification
|
|
||||||
echo "deb [trusted=yes] http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_22.04/ /" > /etc/apt/sources.list.d/kubic-containers.list
|
|
||||||
|
|
||||||
- name: Update package lists (skip signature verification)
|
|
||||||
shell: apt-get update -o Acquire::AllowInsecureRepositories=true -o Acquire::AllowDowngradeToInsecureRepositories=true
|
|
||||||
|
|
||||||
- name: Check Podman versions available in the repository
|
|
||||||
shell: apt-cache policy podman
|
|
||||||
register: podman_versions
|
|
||||||
|
|
||||||
- name: Display available Podman versions
|
|
||||||
debug:
|
|
||||||
msg: "{{ podman_versions.stdout }}"
|
|
||||||
|
|
||||||
- name: Install Podman 5.x (force-skip signature verification)
|
|
||||||
shell: apt-get install -y --allow-unauthenticated --allow-downgrades --allow-remove-essential --allow-change-held-packages podman
|
|
||||||
|
|
||||||
- name: Verify Podman 5.x installation
|
|
||||||
shell: |
|
|
||||||
podman --version
|
|
||||||
podman info --format json | jq -r '.Version.Version'
|
|
||||||
register: podman_5_verify
|
|
||||||
|
|
||||||
- name: Display upgrade result
|
|
||||||
debug:
|
|
||||||
msg: |
|
|
||||||
✅ Podman upgrade complete
|
|
||||||
🚀 New version: {{ podman_5_verify.stdout_lines[0] }}
|
|
||||||
📊 Detailed version: {{ podman_5_verify.stdout_lines[1] }}
|
|
||||||
|
|
||||||
- name: Test basic functionality
|
|
||||||
shell: |
|
|
||||||
podman run --rm hello-world
|
|
||||||
register: podman_test
|
|
||||||
ignore_errors: yes
|
|
||||||
|
|
||||||
- name: Display test results
|
|
||||||
debug:
|
|
||||||
msg: "Podman functional test: {{ 'succeeded' if podman_test.rc == 0 else 'failed - ' + podman_test.stderr }}"
|
|
||||||
|
|
||||||
- name: Check status of related services
|
|
||||||
shell: |
|
|
||||||
systemctl status podman.socket 2>/dev/null || echo "podman.socket is not running"
|
|
||||||
systemctl status containerd 2>/dev/null || echo "containerd is not running"
|
|
||||||
register: service_status
|
|
||||||
|
|
||||||
- name: Display service status
|
|
||||||
debug:
|
|
||||||
msg: "{{ service_status.stdout }}"
|
|
||||||
Binary file not shown.
|
|
@ -1,81 +0,0 @@
job "consul-cluster" {
  datacenters = ["dc1"]
  type = "service"

  constraint {
    attribute = "${node.unique.name}"
    operator = "regexp"
    value = "^(master|ash3c|hcs)$"
  }

  group "consul" {
    count = 3

    network {
      port "http" {
        static = 8500
      }
      port "serf_lan" {
        static = 8301
      }
      port "serf_wan" {
        static = 8302
      }
      port "server" {
        static = 8300
      }
      port "dns" {
        static = 8600
      }
    }

    service {
      name = "consul"
      port = "http"

      check {
        type = "http"
        path = "/v1/status/leader"
        interval = "10s"
        timeout = "2s"
      }
    }

    task "consul" {
      driver = "podman"

      config {
        image = "consul:1.15.4"
        network_mode = "host"

        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-ui",
          "-data-dir=/consul/data",
          "-config-dir=/consul/config",
          "-bind={{ env \"attr.unique.network.ip-address\" }}",
          "-client=0.0.0.0",
          "-retry-join=100.117.106.136",
          "-retry-join=100.116.80.94",
          "-retry-join=100.84.197.26"
        ]

        volumes = [
          "consul-data:/consul/data",
          "consul-config:/consul/config"
        ]
      }

      resources {
        cpu = 500
        memory = 512
      }

      env {
        CONSUL_BIND_INTERFACE = "tailscale0"
      }
    }
  }
}

@ -1,57 +0,0 @@
job "consul-cluster" {
  datacenters = ["dc1"]
  type = "service"

  group "consul-servers" {
    count = 3

    constraint {
      attribute = "${node.unique.name}"
      operator = "regexp"
      value = "(master|ash3c|hcp)"
    }

    task "consul" {
      driver = "podman"

      config {
        image = "hashicorp/consul:latest"
        ports = ["server", "serf_lan", "serf_wan", "ui"]
        args = [
          "agent",
          "-server",
          "-bootstrap-expect=3",
          "-data-dir=/consul/data",
          "-ui",
          "-client=0.0.0.0",
          "-bind={{ env `NOMAD_IP_server` }}",
          "-retry-join=100.117.106.136",
          "-retry-join=100.116.80.94",
          "-retry-join=100.76.13.187"
        ]
      }

      volume_mount {
        volume = "consul-data"
        destination = "/consul/data"
        read_only = false
      }

      resources {
        network {
          mbits = 10
          port "server" { static = 8300 }
          port "serf_lan" { static = 8301 }
          port "serf_wan" { static = 8302 }
          port "ui" { static = 8500 }
        }
      }
    }

    volume "consul-data" {
      type = "host"
      read_only = false
      source = "consul-data"
    }
  }
}

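
The two `consul-cluster` job files above constrain placement with different regular expressions: the first anchors its pattern (`^(master|ash3c|hcs)$`), while the second does not (`(master|ash3c|hcp)`). A rough way to preview which node names each pattern accepts is sketched below. It uses `grep -E` as a stand-in for Nomad's `regexp` operator, which should agree for patterns this simple; the helper name `matching_nodes` is hypothetical, and the node names are the ones listed in the handover document.

```bash
# Hypothetical helper: print the node names that match an extended regex.
# Usage: matching_nodes <regex> <name>...
matching_nodes() {
  local re="$1"
  local name
  shift
  for name in "$@"; do
    # grep -E stands in for Nomad's regexp operator here (an assumption).
    if printf '%s\n' "$name" | grep -Eq -- "$re"; then
      printf '%s\n' "$name"
    fi
  done
}

# Node names taken from the handover document.
nodes="ash3c semaphore master hcp1 hcp2 hcs"

# $nodes is intentionally unquoted so it splits into one argument per name.
echo "anchored:   $(matching_nodes '^(master|ash3c|hcs)$' $nodes | tr '\n' ' ')"
echo "unanchored: $(matching_nodes '(master|ash3c|hcp)' $nodes | tr '\n' ' ')"
```

With these names, the unanchored pattern accepts both `hcp1` and `hcp2`, so four nodes qualify for a group with `count = 3`.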
@ -0,0 +1,240 @@
# ZSH Configuration Summary

## Installed and Configured Components

### 1. Base Components
- ✅ **oh-my-zsh**: installed and configured
- ✅ **zsh**: version 5.9
- ✅ **Powerline fonts**: installed
- ✅ **tmux**: installed

### 2. Core Plugins
- ✅ **git**: Git integration and aliases
- ✅ **docker**: Docker command completion and aliases
- ✅ **docker-compose**: Docker Compose support
- ✅ **ansible**: Ansible command completion
- ✅ **terraform**: Terraform/OpenTofu support
- ✅ **kubectl**: Kubernetes command completion
- ✅ **helm**: Helm package manager support
- ✅ **aws**: AWS CLI support
- ✅ **gcloud**: Google Cloud CLI support

### 3. Enhancement Plugins
- ✅ **zsh-autosuggestions**: command autosuggestions
- ✅ **zsh-syntax-highlighting**: syntax highlighting
- ✅ **zsh-completions**: enhanced completion
- ✅ **colored-man-pages**: colored man pages
- ✅ **command-not-found**: hints for commands that are not found
- ✅ **extract**: archive extraction support
- ✅ **history-substring-search**: history search
- ✅ **sudo**: sudo support
- ✅ **systemd**: systemd service management
- ✅ **tmux**: tmux integration
- ✅ **vscode**: VS Code integration
- ✅ **web-search**: web search
- ✅ **z**: smart directory jumping

### 4. Theme
- ✅ **agnoster**: a feature-rich theme with Git status display

## Custom Aliases

### Project Management Aliases
```bash
mgmt            # Go to the management project directory
mgmt-status     # Show project status
mgmt-deploy     # Quick deploy
mgmt-cleanup    # Clean up the environment
mgmt-swarm      # Swarm management
mgmt-tofu       # OpenTofu management
```

### Ansible Aliases
```bash
ansible-check   # Syntax check
ansible-deploy  # Deploy
ansible-ping    # Connectivity test
ansible-vault   # Password management
ansible-galaxy  # Role management
```

### OpenTofu/Terraform Aliases
```bash
tofu-init       # Initialize
tofu-plan       # Plan
tofu-apply      # Apply
tofu-destroy    # Destroy
tofu-output     # Output
tofu-validate   # Validate
tofu-fmt        # Format
```

### Docker Aliases
```bash
d               # docker
dc              # docker-compose
dps             # docker ps
dpsa            # docker ps -a
di              # docker images
dex             # docker exec -it
dlog            # docker logs -f
dclean          # docker system prune -f
```

### Docker Swarm Aliases
```bash
dswarm          # docker swarm
dstack          # docker stack
dservice        # docker service
dnode           # docker node
dnetwork        # docker network
dsecret         # docker secret
dconfig         # docker config
```

### Kubernetes Aliases
```bash
k               # kubectl
kgp             # kubectl get pods
kgs             # kubectl get services
kgd             # kubectl get deployments
kgn             # kubectl get nodes
kaf             # kubectl apply -f
kdf             # kubectl delete -f
kl              # kubectl logs -f
```

### Git Aliases
```bash
gs              # git status
ga              # git add
gc              # git commit
gp              # git push
gl              # git pull
gd              # git diff
gb              # git branch
gco             # git checkout
```

### System Aliases
```bash
ll              # ls -alF
la              # ls -A
l               # ls -CF
..              # cd ..
...             # cd ../..
....            # cd ../../..
grep            # grep --color=auto
ports           # netstat -tuln
myip            # Get the public IP
speedtest       # Network speed test
psg             # ps aux | grep
top             # htop
```

## Configuration File Locations

- **Main configuration**: `~/.zshrc`
- **Custom aliases**: `~/.oh-my-zsh/custom/aliases.zsh`
- **Proxy configuration**: `/root/mgmt/configuration/proxy.env`

## Usage

### Start ZSH
```bash
zsh
```

### Reload the Configuration
```bash
source ~/.zshrc
```

### List All Aliases
```bash
alias
```

### Look Up Specific Aliases
```bash
alias | grep docker
alias | grep mgmt
```

## Features

### 1. Autosuggestions
- Suggestions from command history appear as you type
- Press the `→` key to accept a suggestion

### 2. Syntax Highlighting
- Commands are highlighted in real time as you type
- Invalid commands are shown in red

### 3. Smart Completion
- Completion for all installed tools
- File path completion
- Command argument completion

### 4. History Search
- Use the `↑` and `↓` keys to search command history
- Partial matches are supported

### 5. Directory Jumping
- Use the `z` command to jump to frequently used directories
- Ranking is based on visit frequency and recency

### 6. Proxy Support
- Proxy configuration is loaded automatically
- HTTP/HTTPS proxies are supported
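
As a minimal sketch of the automatic proxy loading described above, a function like the following could be placed in `~/.zshrc`. The function name `load_proxy_env` is hypothetical, and the sketch assumes that `proxy.env` holds plain `KEY=value` assignments; the actual contents of that file are not shown in this document.

```bash
# Hypothetical helper: export every KEY=value pair from a proxy env file.
# Assumes the file holds plain shell assignments such as http_proxy=...
load_proxy_env() {
  local env_file="${1:-/root/mgmt/configuration/proxy.env}"
  # Report failure when the file is missing or unreadable.
  [ -r "$env_file" ] || return 1
  set -a            # auto-export every variable assigned while sourcing
  . "$env_file"
  set +a
}

# Load quietly at shell startup; a missing file is not treated as an error.
load_proxy_env || true
```

Passing a path as the first argument (for example `load_proxy_env ./proxy.env`) overrides the default location, which makes the function easy to try against a scratch file.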

## Troubleshooting

### If Aliases Do Not Work
```bash
# Check whether the alias is loaded
alias | grep <alias-name>

# Reload the configuration
source ~/.zshrc
```

### If Plugins Do Not Work
```bash
# Check whether the plugin is installed
ls ~/.oh-my-zsh/plugins/ | grep <plugin-name>

# Check custom plugins
ls ~/.oh-my-zsh/custom/plugins/
```

### If the Theme Renders Incorrectly
```bash
# Check whether the font is installed
fc-list | grep Powerline

# Try another theme
# Edit ZSH_THEME in ~/.zshrc
```
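
The "edit `ZSH_THEME`" step can also be scripted. The sketch below is one possible approach and is not part of the original setup: the function name `set_zsh_theme` is hypothetical, and it assumes the theme name contains no `|` or `&` characters.

```bash
# Hypothetical helper: rewrite the ZSH_THEME line of a zshrc file.
# Usage: set_zsh_theme <theme> [zshrc-path]
set_zsh_theme() {
  local theme="$1"
  local rc="${2:-$HOME/.zshrc}"
  local tmp
  [ -f "$rc" ] || return 1
  tmp="$(mktemp)" || return 1
  # Replace the ZSH_THEME assignment; all other lines pass through unchanged.
  sed "s|^ZSH_THEME=.*|ZSH_THEME=\"${theme}\"|" "$rc" > "$tmp" || return 1
  # Write back with cat so the original file's permissions are preserved.
  cat "$tmp" > "$rc"
  rm -f "$tmp"
}
```

For example, `set_zsh_theme robbyrussell && source ~/.zshrc` would switch to a theme that does not rely on Powerline glyphs.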

## Suggested Extensions

### Additional Plugins to Consider
- **fzf**: fuzzy finder
- **bat**: a better `cat`
- **exa**: a better `ls`
- **ripgrep**: a faster `grep`
- **fd**: a faster `find`

### Additional Aliases to Consider
- Add more aliases to suit personal habits
- Create aliases for frequently used command combinations
- Create aliases for project-specific commands

## Performance

- The number of configured plugins is moderate and should not noticeably slow shell startup
- `zsh-completions` is used for better completion performance
- History settings are tuned to avoid excessive memory use

Configuration complete! You now have a powerful, highly customized ZSH environment, optimized for the needs of the management system.

@ -1,110 +0,0 @@
job "install-podman-driver" {
  datacenters = ["dc1"]
  type = "system" # run on all nodes

  group "install" {
    task "install-podman" {
      driver = "exec"

      config {
        command = "bash"
        args = [
          "-c",
          <<-EOF
          set -euo pipefail
          export PATH="/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin"

          # Dependencies
          if ! command -v jq >/dev/null 2>&1 || ! command -v unzip >/dev/null 2>&1 || ! command -v wget >/dev/null 2>&1; then
            echo "Installing dependencies (jq unzip wget)..."
            sudo -n apt update -y || true
            sudo -n apt install -y jq unzip wget || true
          fi

          # Install Podman (if not already installed)
          if ! command -v podman >/dev/null 2>&1; then
            echo "Installing Podman..."
            sudo -n apt update -y || true
            sudo -n apt install -y podman || true
            sudo -n systemctl enable podman || true
          else
            echo "Podman already installed"
          fi

          # Enable and start podman.socket so that Nomad can reach it
          sudo -n systemctl enable --now podman.socket || true
          if getent group podman >/dev/null 2>&1; then
            sudo -n usermod -aG podman nomad || true
          fi

          # Install the Nomad Podman driver plugin (always ensure it is present)
          PODMAN_DRIVER_VERSION="0.6.1"
          PLUGIN_DIR="/opt/nomad/data/plugins"
          sudo -n mkdir -p "${PLUGIN_DIR}" || true
          cd /tmp
          if [ ! -x "${PLUGIN_DIR}/nomad-driver-podman" ]; then
            echo "Installing nomad-driver-podman ${PODMAN_DRIVER_VERSION}..."
            wget -q "https://releases.hashicorp.com/nomad-driver-podman/${PODMAN_DRIVER_VERSION}/nomad-driver-podman_${PODMAN_DRIVER_VERSION}_linux_amd64.zip"
            unzip -o "nomad-driver-podman_${PODMAN_DRIVER_VERSION}_linux_amd64.zip"
            sudo -n mv -f nomad-driver-podman "${PLUGIN_DIR}/"
            sudo -n chmod +x "${PLUGIN_DIR}/nomad-driver-podman"
            sudo -n chown -R nomad:nomad "${PLUGIN_DIR}"
            rm -f "nomad-driver-podman_${PODMAN_DRIVER_VERSION}_linux_amd64.zip"
          else
            echo "nomad-driver-podman already present in ${PLUGIN_DIR}"
          fi

          # Update the plugin_dir setting in /etc/nomad.d/nomad.hcl
          if [ -f /etc/nomad.d/nomad.hcl ]; then
            if grep -q "^plugin_dir\s*=\s*\"" /etc/nomad.d/nomad.hcl; then
              sudo -n sed -i 's#^plugin_dir\s*=\s*\".*\"#plugin_dir = "/opt/nomad/data/plugins"#' /etc/nomad.d/nomad.hcl || true
            else
              echo 'plugin_dir = "/opt/nomad/data/plugins"' | sudo -n tee -a /etc/nomad.d/nomad.hcl >/dev/null || true
            fi
          fi

          # Restart the Nomad service to load the plugin
          sudo -n systemctl restart nomad || true
          echo "Waiting for Nomad to restart..."
          sleep 15

          # Check whether Nomad has detected the Podman driver
          if /usr/local/bin/nomad node status -self -json 2>/dev/null | jq -r '.Drivers.podman.Detected' | grep -q "true"; then
            echo "Podman driver successfully loaded"
            exit 0
          fi

          echo "Podman driver not detected yet, retrying once after socket restart..."
          sudo -n systemctl restart podman.socket || true
          sleep 5
          if /usr/local/bin/nomad node status -self -json 2>/dev/null | jq -r '.Drivers.podman.Detected' | grep -q "true"; then
            echo "Podman driver successfully loaded after socket restart"
            exit 0
          else
            echo "Podman driver still not detected; manual investigation may be required"
            exit 1
          fi
          EOF
        ]
      }

      resources {
        cpu = 200
        memory = 256
      }

      // Run with root privileges
      // user = "root"
      # Run the task as the nomad user, since client policy may forbid root
      user = "nomad"

      # Make sure the task completes successfully
      restart {
        attempts = 1
        interval = "24h"
        delay = "60s"
        mode = "fail"
      }
    }
  }
}

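
The `plugin_dir` step in the script above follows a replace-or-append pattern. The sketch below isolates that pattern so it can be tried against a scratch file first. The function name `ensure_plugin_dir` is hypothetical, and the sketch swaps the GNU-only `\s` and `sed -i` for POSIX `[[:space:]]` classes and a temporary file.

```bash
# Hypothetical helper: make sure a config file sets plugin_dir to the given path.
# Usage: ensure_plugin_dir <config-file> <plugin-dir>
ensure_plugin_dir() {
  local conf="$1"
  local dir="$2"
  local tmp
  [ -f "$conf" ] || return 1
  if grep -q '^plugin_dir[[:space:]]*=' "$conf"; then
    # A plugin_dir line exists: rewrite it via a temporary file.
    tmp="$(mktemp)" || return 1
    sed "s#^plugin_dir[[:space:]]*=.*#plugin_dir = \"${dir}\"#" "$conf" > "$tmp" || return 1
    cat "$tmp" > "$conf"
    rm -f "$tmp"
  else
    # No plugin_dir line yet: append one.
    printf 'plugin_dir = "%s"\n' "$dir" >> "$conf"
  fi
}
```

Running the function twice with the same arguments leaves a single `plugin_dir` line in the file, so it is safe to re-run.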
@ -1,39 +0,0 @@
#!/bin/bash
set -euo pipefail

ADDR="http://100.81.26.3:4646"
# If NOMAD_TOKEN is set, prepare the auth header (kept as an array so quoting survives)
HDR=()
if [ -n "${NOMAD_TOKEN:-}" ]; then
  HDR=(-H "X-Nomad-Token: $NOMAD_TOKEN")
fi

echo "--- Node list (before) ---"
nomad node status -address="$ADDR"

echo
echo "--- Looking for stale nodes to purge ---"

# Use jq to pick exact matches from the JSON output of nomad node status
# Criteria: status is "down" and the name is in the list
IDS_TO_PURGE=$(nomad node status -address="$ADDR" -json | jq -r '.[] | select(.Status == "down" and (.Name | test("^(ch3|ch2|ash1d|ash2e|semaphore)$"))) | .ID')

if [[ -z "$IDS_TO_PURGE" ]]; then
  echo "✅ No matching nodes in the 'down' state; nothing to purge."
else
  echo "Node IDs to purge:"
  echo "$IDS_TO_PURGE"
  echo

  # Loop over the IDs and call the HTTP API with curl to purge each node
  for NODE_ID in $IDS_TO_PURGE; do
    echo "===> Purging node: $NODE_ID"
    # Expand the header array directly; it expands to nothing when no token is set
    curl -sS -XPOST ${HDR[@]+"${HDR[@]}"} \
      -w ' -> HTTP %{http_code}\n' "$ADDR/v1/node/$NODE_ID/purge"
  done
fi

echo
echo "--- Node list (after) ---"
nomad node status -address="$ADDR"

@ -0,0 +1,106 @@
#!/bin/bash

# Minimal ZSH configuration - suited to quick deployment
# Usage: curl -fsSL https://your-gitea.com/ben/mgmt/raw/branch/main/snippets/zsh/zshrc-minimal.sh | bash

set -euo pipefail

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'

log_info() { echo -e "${BLUE}[INFO]${NC} $1"; }
log_success() { echo -e "${GREEN}[SUCCESS]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }

# Check for root privileges
if [[ $EUID -ne 0 ]]; then
  log_error "Root privileges are required"
  exit 1
fi

log_info "Installing the minimal ZSH configuration..."

# Install dependencies
apt update && apt install -y zsh git curl fonts-powerline

# Install oh-my-zsh
if [[ ! -d "$HOME/.oh-my-zsh" ]]; then
  RUNZSH=no CHSH=no sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
fi

# Install key plugins
custom_dir="$HOME/.oh-my-zsh/custom/plugins"
mkdir -p "$custom_dir"

[[ ! -d "$custom_dir/zsh-autosuggestions" ]] && git clone https://github.com/zsh-users/zsh-autosuggestions "$custom_dir/zsh-autosuggestions"
[[ ! -d "$custom_dir/zsh-syntax-highlighting" ]] && git clone https://github.com/zsh-users/zsh-syntax-highlighting.git "$custom_dir/zsh-syntax-highlighting"

# Create the minimal configuration
cat > "$HOME/.zshrc" << 'EOF'
# Oh My Zsh configuration
export ZSH="$HOME/.oh-my-zsh"
ZSH_THEME="agnoster"

plugins=(
  git
  docker
  ansible
  terraform
  kubectl
  zsh-autosuggestions
  zsh-syntax-highlighting
)

source $ZSH/oh-my-zsh.sh

# Basic aliases
alias ll='ls -alF'
alias la='ls -A'
alias l='ls -CF'
alias ..='cd ..'
alias ...='cd ../..'
alias grep='grep --color=auto'

# Docker aliases
alias d='docker'
alias dps='docker ps'
alias dpsa='docker ps -a'
alias dex='docker exec -it'
alias dlog='docker logs -f'

# Kubernetes aliases
alias k='kubectl'
alias kgp='kubectl get pods'
alias kgs='kubectl get services'
alias kgd='kubectl get deployments'

# Git aliases
alias gs='git status'
alias ga='git add'
alias gc='git commit'
alias gp='git push'
alias gl='git pull'

# History settings
HISTSIZE=10000
SAVEHIST=10000
HISTFILE=~/.zsh_history
setopt SHARE_HISTORY
setopt HIST_IGNORE_DUPS

# Autosuggestion settings
ZSH_AUTOSUGGEST_HIGHLIGHT_STYLE='fg=8'
ZSH_AUTOSUGGEST_STRATEGY=(history completion)

echo "🚀 ZSH configuration complete!"
EOF

# Set the default shell
chsh -s "$(which zsh)"

log_success "Minimal ZSH configuration installed!"
log_info "Log in again or run: source ~/.zshrc"

@ -1,40 +0,0 @@
job "test-nginx" {
  datacenters = ["dc1"]
  type = "service"

  group "web" {
    count = 1

    network {
      port "http" {
        static = 8080
      }
    }

    task "nginx" {
      driver = "podman"

      config {
        image = "nginx:alpine"
        ports = ["http"]
      }

      resources {
        cpu = 100
        memory = 128
      }

      service {
        name = "nginx-test"
        port = "http"

        check {
          type = "http"
          path = "/"
          interval = "10s"
          timeout = "3s"
        }
      }
    }
  }
}