xalgorithm/epyc2

Kubernetes Infrastructure on Proxmox

A comprehensive Infrastructure as Code (IaC) solution for deploying a production-ready Kubernetes cluster on Proxmox with monitoring, backup, and network scanning capabilities.

🏗️ Architecture Overview

This project deploys a complete Kubernetes infrastructure stack including:

  • Kubernetes Cluster: K3s-based cluster with control plane and worker nodes
  • Load Balancing: MetalLB for bare-metal load balancing
  • Monitoring Stack: Prometheus, Grafana, Loki, and Mimir for comprehensive observability
  • Log Aggregation: Syslog receiver for OPNsense and external device logs
  • Backup System: Automated backup solution with NFS storage
  • Ingress: Nginx ingress controller with host-based routing
  • Automation: N8N workflow automation platform

🚀 Quick Start

Prerequisites

  • Proxmox VE 7.0+ with API access
  • Terraform 1.0+
  • SSH key pair for VM access
  • NFS server for backup storage (optional)
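
The prerequisites above can be checked from a shell before starting. This is a minimal sketch and not part of the repository: the helper names `require_cmd` and `require_file` are hypothetical, and the key path simply mirrors the example value used in terraform.tfvars.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Hypothetical helper: report whether a regular file exists.
require_file() {
  if [ -f "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Example run; adjust the key path to match your setup.
require_cmd terraform || true
require_cmd ssh || true
require_file "$HOME/.ssh/id_ed25519.pub" || true
```

The check only confirms that the tools exist; it does not verify the Terraform or Proxmox VE versions listed above.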

1. Clone and Configure

git clone <repository-url>
cd <repository-directory>
cp terraform.tfvars.example terraform.tfvars

2. Configure Variables

Edit terraform.tfvars with your environment settings:

# Proxmox Configuration
proxmox_api_url          = "https://your-proxmox:8006/api2/json"
proxmox_api_token_id     = "your-token-id"
proxmox_api_token_secret = "your-token-secret"

# VM Configuration
ssh_public_key_path  = "~/.ssh/id_ed25519.pub"
ssh_private_key_path = "~/.ssh/id_ed25519"

# Network Configuration
vm_network_bridge = "vmbr0"
vm_network_vlan   = 100

# NFS Backup Configuration (optional)
nfs_server_ip   = "192.168.1.100"
nfs_backup_path = "/data/kubernetes/backups"
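
Before deploying, it can help to confirm that no required variable was left out of terraform.tfvars. The sketch below is an illustration only: `missing_tfvars` is a hypothetical helper, and treating the Proxmox and SSH variables as required is an assumption based on the example above, not on the repository's variables.tf.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list required variables that are absent from a tfvars file.
# Prints one missing name per line and returns 1 if any are missing.
missing_tfvars() {
  local file="$1"; shift
  local name status=0
  for name in "$@"; do
    # Match "name = ..." at the start of a line, allowing padding around "=".
    if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$file"; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Example run against the file from step 2 (skipped if the file is absent).
if [ -f terraform.tfvars ]; then
  missing_tfvars terraform.tfvars \
    proxmox_api_url proxmox_api_token_id proxmox_api_token_secret \
    ssh_public_key_path ssh_private_key_path \
    || echo "add the variables listed above to terraform.tfvars"
fi
```

The check looks only for the presence of an assignment; it does not detect placeholder values such as "your-token-id" that were never replaced.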

3. Deploy Infrastructure

# Pre-flight checks
./scripts/deployment/pre-flight-check.sh

# Deploy full stack
./scripts/deployment/deploy-full-stack.sh
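
The two commands above can be chained so that deployment never starts after a failed check. This is a sketch under one assumption that the source does not confirm: that pre-flight-check.sh exits with a non-zero status when a check fails. The wrapper name `run_gated` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a check script, then a deploy script only if the check passed.
# Assumes the check script signals failure with a non-zero exit status.
run_gated() {
  local check="$1" deploy="$2"
  if ! "$check"; then
    echo "pre-flight failed; skipping deployment" >&2
    return 1
  fi
  "$deploy"
}

# Usage (from the repository root):
#   run_gated ./scripts/deployment/pre-flight-check.sh ./scripts/deployment/deploy-full-stack.sh
```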

📁 Project Structure

├── docs/                          # Documentation
│   ├── deployment/               # Deployment guides
│   ├── backup/                   # Backup documentation
│   ├── monitoring/               # Monitoring setup
│   ├── troubleshooting/          # Troubleshooting guides
│   └── TERRAFORM_STRUCTURE.md    # Terraform organization guide
├── scripts/                      # Automation scripts
│   ├── deployment/               # Deployment scripts
│   ├── backup/                   # Backup and restore scripts
│   ├── maintenance/              # Maintenance scripts
│   └── troubleshooting/          # Troubleshooting scripts
├── configs/                      # Configuration files
│   ├── grafana/                  # Grafana dashboards
│   ├── prometheus/               # Prometheus configs
│   └── backup/                   # Backup configurations
├── *.tf                          # Terraform configuration files
└── terraform.tfvars              # Environment variables

🔧 Components

Infrastructure (Terraform)

The Terraform configuration is split into files by concern, so each layer can be read and changed on its own:

Core Configuration

  • versions.tf: Terraform and provider version requirements
  • providers.tf: Provider configurations
  • variables.tf: Input variable declarations
  • outputs.tf: Output value declarations
  • backend.tf: Backend configuration

Infrastructure Layer

  • infrastructure-proxmox.tf: VM definitions and provisioning
  • infrastructure-network.tf: MetalLB and Ingress controller

Kubernetes Platform

  • kubernetes-cluster.tf: Cluster bootstrapping and setup
  • kubernetes-storage.tf: NFS storage configuration
  • kubernetes-ingress.tf: Ingress resource definitions

Monitoring & Backup

  • monitoring.tf: Prometheus, Grafana, Loki, Mimir stack
  • backup.tf: Backup system and CronJobs
  • opnsense-logging.tf: OPNsense log integration

Applications

  • applications-media.tf: Media applications (Mylar)
  • applications-automation.tf: Automation tools (N8N)

See docs/TERRAFORM_STRUCTURE.md for details.

Key Features

🔍 Monitoring & Observability

  • Prometheus: Metrics collection and alerting
  • Grafana: Visualization and dashboards
  • Loki: Log aggregation and analysis
  • Mimir: Long-term metrics storage

💾 Backup & Recovery

  • Automated Backups: Scheduled etcd and application data backups
  • Manual Backup Triggers: On-demand backup capabilities
  • Restoration Testing: Comprehensive restore validation
  • NFS Storage: Centralized backup storage with redundancy
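
Scheduled backups are only useful if they keep arriving, so a freshness check on the backup directory is a cheap safeguard. The sketch below is not taken from the repository's scripts: `has_recent_backup` is a hypothetical helper, and the mount point in the usage comment is a placeholder for wherever the NFS export is mounted.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed if DIR contains at least one regular file
# modified within the last MAX_AGE_MIN minutes. Returns 2 if DIR is not a directory.
has_recent_backup() {
  local dir="$1" max_age_min="$2"
  [ -d "$dir" ] || return 2
  # -mmin -N matches files modified less than N minutes ago; head stops at the first hit.
  [ -n "$(find "$dir" -type f -mmin "-$max_age_min" 2>/dev/null | head -n 1)" ]
}

# Usage: warn when nothing was written in the past 24 hours (1440 minutes).
# The path below is an assumption; substitute your own mount point.
#   has_recent_backup /mnt/backups 1440 || echo "no backup in the last 24h"
```

A recent file shows that something was written, not that it restores cleanly; the restoration tests listed above cover that part.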

🌐 Networking

  • MetalLB: Layer 2 load balancing for bare-metal
  • Ingress: HTTP/HTTPS routing with automatic SSL
  • Network Policies: Secure inter-pod communication
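
For readers new to MetalLB, a Layer 2 setup boils down to an address pool plus an advertisement for it. The repository configures MetalLB through Terraform, so the sketch below is only an illustration of the resulting objects: `render_metallb_l2` is a hypothetical helper, the pool name and address range are placeholders, and the `metallb-system` namespace is assumed to be the one MetalLB was installed into.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a MetalLB Layer 2 configuration for one address range.
# Uses the metallb.io/v1beta1 CRDs (MetalLB 0.13 and later).
render_metallb_l2() {
  local pool_name="$1" range="$2"
  cat <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ${pool_name}
  namespace: metallb-system
spec:
  addresses:
  - ${range}
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ${pool_name}-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - ${pool_name}
EOF
}

# Usage: choose a range outside your DHCP scope, then apply the output.
#   render_metallb_l2 default-pool 192.168.1.240-192.168.1.250 | kubectl apply -f -
```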

📖 Documentation

  • Deployment: docs/deployment/
  • Backup & Recovery: docs/backup/
  • Monitoring: docs/monitoring/
  • Troubleshooting: docs/troubleshooting/

🛠️ Common Operations

Deployment

# Full stack deployment
./scripts/deployment/deploy-full-stack.sh

# Step-by-step deployment
./scripts/deployment/deploy-step-by-step.sh

# Pre-flight checks
./scripts/deployment/pre-flight-check.sh

Backup Operations

# Manual backup (all components)
./scripts/backup/trigger-manual-backup.sh

# Test backup restoration
./scripts/backup/test-backup-restoration.sh dry-run

# Test restoration of a single component
./scripts/backup/test-individual-restore.sh grafana

Maintenance

# Check NFS permissions
./scripts/maintenance/test-nfs-permissions.sh

# Update Grafana dashboards
./scripts/maintenance/update-grafana-dashboards.sh

Troubleshooting

# Diagnose NFS access issues
./scripts/troubleshooting/diagnose-nfs-access.sh

# Fix kubeconfig secret encoding
./scripts/troubleshooting/fix-kubeconfig-secret.sh

🔐 Security Considerations

  • SSH Key Authentication: Password authentication disabled by default
  • Network Segmentation: VLANs and network policies for isolation
  • Secret Management: Kubernetes secrets for sensitive data
  • Backup Encryption: Consider encrypting backup data at rest
  • Access Control: RBAC policies for service accounts

📊 Monitoring & Alerting

Default Dashboards

  • Kubernetes Cluster Overview: Node and pod metrics
  • Backup Monitoring: Backup status and performance
  • Application Metrics: Component-specific dashboards

Key Metrics

  • Cluster resource utilization
  • Backup success/failure rates
  • Network device discovery status
  • Application performance metrics

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

🆘 Support

  • Documentation: Check the docs/ directory for detailed guides
  • Issues: Report bugs and feature requests via GitHub issues
  • Troubleshooting: Use the troubleshooting scripts in scripts/troubleshooting/

🏷️ Version

Current version: 1.0.0

📝 Changelog

See CHANGELOG.md for version history and updates.


Note: This infrastructure is designed for production use but should be thoroughly tested in your environment before deployment. Always follow your organization's security and operational guidelines.
