Troubleshooting Guide
Solutions for common issues encountered on the collabrains.eu infrastructure.
Container Issues
Service Won't Start
Symptoms: Container crashes immediately or never becomes healthy
Steps:
1. Check logs for error messages:
docker logs CONTAINER_NAME --tail 100
2. Check dependencies (database, cache):
docker ps | grep postgres   # Check if PostgreSQL is running
docker ps | grep redis      # Check if Redis is running
3. Verify .env configuration:
cat /data/coolify/services/SERVICE_ID/.env
# Check for typos or missing values
4. Restart with fresh state:
cd /data/coolify/services/SERVICE_ID
docker compose down
docker compose up -d
docker logs SERVICE_NAME --tail 50
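Scanning a long .env file by eye for missing values is error-prone. The following is a minimal sketch of a helper that lists keys with an empty value. The function name `empty_env_keys` is hypothetical, and the sketch assumes simple `KEY=value` lines with `#` comments; it does not handle multi-line values or `export` prefixes.

```shell
# List keys in a .env-style file whose value is empty.
# Hypothetical helper; assumes one KEY=value pair per line.
empty_env_keys() {
    # $1: path to the .env file
    awk -F '=' '
        /^[ \t]*#/ { next }    # skip comment lines
        /^[ \t]*$/ { next }    # skip blank lines
        {
            # Everything after the first "=" is the value
            value = substr($0, index($0, "=") + 1)
            gsub(/[ \t]/, "", value)
            # Treat KEY=, KEY="" and KEY=(two single quotes) as empty
            if (value == "" || value == "\"\"" || value == "\047\047")
                print $1
        }
    ' "$1"
}
```

A possible call is `empty_env_keys /data/coolify/services/SERVICE_ID/.env`; no output means every key has a value.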
Out of Memory (OOM) Errors
Symptoms: Service crashes with "Killed" or OOMKilled status
Check memory usage:
free -h # Check available RAM and swap
docker stats # Monitor container memory
Solutions:
1. Check swap availability:
free -h | grep Swap
# Should show ~8GB available
2. Identify memory hogs:
docker stats --no-stream | sort -k 4 -h
3. Restart memory-heavy services (Immich, Grist):
docker restart SERVICE_NAME
4. Increase service limits in docker-compose.yml:
services:
  service_name:
    mem_limit: 4g
    memswap_limit: 8g
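For scripted checks, the free swap figure can be read from /proc/meminfo rather than parsed out of the human-readable `free -h` output. This is a small sketch under that assumption; the function name `swap_free_mib` is hypothetical.

```shell
# Print free swap in MiB.
# Reads /proc/meminfo-style text on stdin, where SwapFree is reported in kB.
swap_free_mib() {
    awk '/^SwapFree:/ { print int($2 / 1024) }'
}
```

On the server it could be called as `swap_free_mib < /proc/meminfo`. A value near zero while containers are being OOM-killed suggests that swap is exhausted as well as RAM.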
Port Conflicts
Symptoms: "Port already allocated" error
Solution:
# Find what's using the port
netstat -tuln | grep :8000
lsof -i :8000
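# Stop the process; a plain kill sends SIGTERM so it can shut down cleanly
kill PID
# Only if it is still running afterwards
kill -9 PID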
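Grepping for `:8000` also matches ports such as 18000. The sketch below does an exact match on the port instead. It assumes `ss` (from iproute2) is available as an alternative to netstat, and the function name `port_is_listening` is hypothetical.

```shell
# port_is_listening PORT
# Reads `ss -tuln` output on stdin and succeeds if any address column
# ends in exactly :PORT (so 8000 does not match 18000).
port_is_listening() {
    awk -v p=":$1" '
        NR > 1 {                       # skip the header line
            for (i = 1; i <= NF; i++)
                if ($i ~ (p "$")) found = 1
        }
        END { exit !found }
    '
}
```

A possible call is `ss -tuln | port_is_listening 8000 && echo "port 8000 is taken"`.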
Database Issues
PostgreSQL Connection Refused
Symptoms: Service can't connect to database
Check database status:
docker ps | grep postgres
docker logs postgres-SERVICE_ID --tail 20
Verify connectivity:
docker exec -it postgres-SERVICE_ID psql -U username -d dbname
# If fails, check credentials in .env
Restart PostgreSQL:
docker restart postgres-SERVICE_ID
# Wait 10 seconds for startup
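A fixed wait can be too short on a busy host and needlessly long on an idle one. As an alternative, here is a minimal sketch of a polling helper that retries a command until it succeeds. The name `wait_until` is hypothetical. The example call assumes the PostgreSQL image ships `pg_isready`, as the official postgres image does.

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping DELAY seconds between attempts.
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # No need to sleep after the final failed attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_until 10 1 docker exec postgres-SERVICE_ID pg_isready -U username` polls once per second for up to ten attempts and returns as soon as the database accepts connections.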
Database Disk Full
Symptoms: Database inserts fail, logs show "no space left"
Check disk:
df -h /
df -h /data
Solutions:
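1. Clean old backups (removes top-level entries in /backups older than 30 days):
find /backups -mindepth 1 -maxdepth 1 -mtime +30 -exec rm -rf {} +
2. Check PostgreSQL data directory size:
docker exec postgres-SERVICE_ID du -sh /var/lib/postgresql/data
3. Vacuum PostgreSQL database:
docker exec -it postgres-SERVICE_ID psql -U username -d dbname -c "VACUUM FULL;"
# VACUUM FULL rewrites each table, so it needs temporary free space and locks the table while it runs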
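Because removing backups cannot be undone, it helps to preview what an age-based cleanup would match before deleting anything. A minimal dry-run sketch follows. The function name `list_old_backups` is hypothetical, and it assumes backups sit directly under the given directory.

```shell
# list_old_backups DIR DAYS
# Prints the top-level entries in DIR that were last modified more than
# DAYS days ago. Nothing is deleted.
list_old_backups() {
    find "$1" -mindepth 1 -maxdepth 1 -mtime +"$2" -print
}
```

Running `list_old_backups /backups 30` and reviewing the output first makes it clear exactly which entries a cleanup with the same criteria would remove.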
Redis Authentication Error
Symptoms: "redis.exceptions.AuthenticationError: Authentication required"
Cause: Timing issue on startup or misconfigured credentials
Solution:
# Verify Redis is running
docker ps | grep redis
# Test connection
docker exec REDIS_ID redis-cli ping
# Should return: PONG
# A NOAUTH reply means Redis is up but needs the password (redis-cli -a PASSWORD ping)
# If connection fails, restart both Redis and dependent service
docker compose -f /data/coolify/services/SERVICE_ID/docker-compose.yml down
sleep 5
docker compose -f /data/coolify/services/SERVICE_ID/docker-compose.yml up -d
# Wait for health check
sleep 15
docker logs SERVICE_NAME --tail 20
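The two causes named above, a startup timing issue and misconfigured credentials, produce different replies to a ping. The sketch below tells them apart. The function name `redis_ping_status` and its three labels are hypothetical; it relies only on the reply containing `PONG` or `NOAUTH`.

```shell
# Reads the output of `redis-cli ping` on stdin and prints a short diagnosis:
#   ok             Redis answered PONG
#   auth-required  Redis is up but rejected the unauthenticated command
#   unreachable    anything else (no reply, connection refused, ...)
redis_ping_status() {
    reply=$(cat)
    case $reply in
        *PONG*)   echo "ok" ;;
        *NOAUTH*) echo "auth-required" ;;
        *)        echo "unreachable" ;;
    esac
}
```

A possible call is `docker exec REDIS_ID redis-cli ping 2>&1 | redis_ping_status`. An `auth-required` result points at the credentials in .env, while `unreachable` right after a restart is more consistent with the timing issue.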
Network & SSL Issues
HTTPS Certificate Error
Symptoms: Browser warning, "certificate not trusted" or "domain mismatch"
Check certificate:
openssl s_client -connect docs.collabrains.eu:443 -servername docs.collabrains.eu </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates
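Once the certificate's `notAfter` date is known, the remaining lifetime can be turned into a number of days, which is easier to act on than a raw date. This is a sketch that assumes GNU `date` (for the `-d` option); the function name `days_until_expiry` is hypothetical.

```shell
# days_until_expiry NOT_AFTER [NOW_EPOCH]
# NOT_AFTER: a date as printed by openssl, with or without the "notAfter="
#            prefix, e.g. "notAfter=Jun  1 12:00:00 2030 GMT"
# NOW_EPOCH: optional reference time in epoch seconds (defaults to now)
# Prints the number of whole days left; negative if already expired.
days_until_expiry() {
    not_after=${1#notAfter=}
    now=${2:-$(date +%s)}
    expiry=$(date -d "$not_after" +%s) || return 1
    echo $(( (expiry - now) / 86400 ))
}
```

The date line can be obtained by feeding the certificate to `openssl x509 -noout -enddate` and then passed to the function as its first argument.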
Check Let's Encrypt status:
docker logs traefik --tail 50 2>&1 | grep -iE "acme|certificate|error"
Force certificate renewal:
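# Remove the stored certificates (Traefik requests new ones after the restart).
# This discards every certificate in the file and counts against Let's Encrypt rate limits, so keep a copy first.
cp /data/coolify/proxy/acme.json /data/coolify/proxy/acme.json.bak
rm /data/coolify/proxy/acme.json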
# Restart Traefik
docker restart traefik
# Wait for certificate generation (30-60 seconds)
sleep 60
docker logs traefik --tail 20
Service Not Accessible via URL
Symptoms: https://docs.collabrains.eu returns "connection refused" or 502
Steps:
1. Check Traefik routing:
docker logs traefik --tail 50 | grep "collabrains.eu"
2. Verify backend service is healthy:
docker ps | grep SERVICE_NAME
# Should show "healthy" status
3. Test container is listening:
docker exec SERVICE_CONTAINER curl http://localhost:8000
# Should return HTML/response, not connection refused
4. Check UFW firewall:
ufw status
# Ports 80, 443 should be allowed
5. Verify DNS:
dig docs.collabrains.eu
nslookup docs.collabrains.eu
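When many containers are running, the health column is easy to misread. The sketch below filters a container listing down to the ones that need attention. The function name `container_health_report` and its labels are hypothetical; it assumes status strings of the form Docker prints, such as `Up 2 hours (healthy)` or `Exited (1) 2 days ago`.

```shell
# Reads "NAME<TAB>STATUS" lines on stdin and prints one line per container
# that is unhealthy, still starting, or not running. Containers that are
# healthy, or running without a health check, are omitted.
container_health_report() {
    awk -F '\t' '
        $2 ~ /\(healthy\)/          { next }
        $2 ~ /\(unhealthy\)/        { print $1 "\tunhealthy"; next }
        $2 ~ /\(health: starting\)/ { print $1 "\tstarting"; next }
        $2 ~ /^Up/                  { next }   # running, no health check defined
                                    { print $1 "\tnot running" }
    '
}
```

A possible call is `docker ps -a --format '{{.Names}}\t{{.Status}}' | container_health_report`. A backend listed as `unhealthy` or `not running` would explain a 502 from the proxy.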
Backup & Recovery
Backup Script Failed
Run a manual backup and check the result:
/usr/local/bin/backup-collabrains.sh
# View backup directory
ls -lh /backups/$(date +%Y-%m-%d)/
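Listing the directory shows what is there, but a script or cron job needs a yes/no answer. A minimal sketch follows, assuming the dated layout used above (one `YYYY-MM-DD` directory per day). The function name `backup_present` is hypothetical.

```shell
# backup_present ROOT [DATE]
# Succeeds if ROOT/DATE exists and contains at least one entry.
# DATE defaults to today in YYYY-MM-DD form.
backup_present() {
    dir="$1/${2:-$(date +%Y-%m-%d)}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]
}
```

For example, `backup_present /backups || echo "no backup for today"` reports a missing or empty backup directory.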
Restore from Backup
See Backups & Recovery for detailed procedures.
Performance Issues
High CPU Usage
Identify culprit:
docker stats --no-stream | sort -k 3 -h
top
Common causes:
- Immich AI indexing: normal during photo uploads
- n8n workflows: check for infinite loops
- Paperless OCR: normal during document processing
Solution: Let the process complete, or adjust settings so it runs during off-peak hours
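Sorting the default `docker stats` table also sorts its header line. One alternative is to ask Docker for just the name and CPU columns and rank those. This is a sketch; the function name `top_cpu` is hypothetical, and it expects tab-separated `NAME<TAB>CPU%` lines on stdin.

```shell
# top_cpu [COUNT]
# Reads "NAME<TAB>CPU%" lines on stdin and prints the COUNT busiest
# containers (default 5) as "CPU<TAB>NAME", highest first.
top_cpu() {
    count=${1:-5}
    awk -F '\t' '{ pct = $2; sub(/%$/, "", pct); print pct "\t" $1 }' |
        sort -rn |
        head -n "$count"
}
```

A possible call is `docker stats --no-stream --format '{{.Name}}\t{{.CPUPerc}}' | top_cpu 3`.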
High Memory Usage
docker stats --no-stream | sort -k 4 -h
free -h
Solutions:
- Increase swap (use a different path if /swapfile is already active): dd if=/dev/zero of=/swapfile bs=1M count=8192 && chmod 600 /swapfile && mkswap /swapfile && swapon /swapfile
- Restart heavy service: docker restart SERVICE_NAME
- Reduce concurrent tasks (e.g., OCR workers in Paperless)
Slow File Operations
Check disk I/O:
iostat -x 1 5
iotop
Solutions:
- Check disk space (should be >20% free)
- Reduce concurrent tasks
- Check for stuck processes: ps aux | grep defunct
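The 20% free-space guideline can be checked with a one-line helper rather than read off the `df -h` table. This is a sketch; the function name `disk_free_percent` is hypothetical, and it relies on the POSIX output format of `df -P`, where the fifth column is the used percentage.

```shell
# disk_free_percent PATH
# Prints the percentage of free space on the filesystem holding PATH.
disk_free_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}
```

For example, `[ "$(disk_free_percent /data)" -lt 20 ] && echo "less than 20% free on /data"` flags a filesystem that has dropped below the guideline.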
Service-Specific Issues
Immich Photos Not Uploading
# Check logs
docker logs immich-server --tail 50
# Verify storage
du -sh /data/coolify/services/IMMICH_ID/
# Restart service
docker restart immich-server
Paperless Documents Not Processing
# Check consume directory
ls -la /data/coolify/services/woq978nbzog6dmddhrmeujvk/consume/
# Check Redis
docker exec redis-woq978nbzog6dmddhrmeujvk redis-cli ping
# Restart
cd /data/coolify/services/woq978nbzog6dmddhrmeujvk && docker compose restart
n8n Workflows Not Triggering
# Check logs (follows the output; press Ctrl+C to stop)
docker logs n8n-main -f
# Check webhook receiver (if using GitHub webhooks)
docker logs webhook-receiver-webhook-1 --tail 20
# Restart n8n
docker restart n8n-main
Grist Slow/Unresponsive
# Check database
docker logs postgres-GRIST_ID --tail 20
# Check memory
docker stats grist
# Restart
cd /data/coolify/services/GRIST_ID && docker compose restart
Emergency Procedures
Full System Recovery
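# 1. Remember which containers are running, then stop them
RUNNING=$(docker ps -q)
docker stop $RUNNING
# 2. Wait 5 seconds
sleep 5
# 3. Start the same containers again (docker ps -aq would also start ones that were stopped on purpose)
docker start $RUNNING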
# 4. Monitor startup
watch -n 2 'docker ps'
Restore from Backup
See Backups & Recovery for detailed procedures.
Clear All Caches
# Redis (FLUSHALL deletes every key in every database, including queued jobs, not only cached data)
docker exec redis-SERVICE_ID redis-cli FLUSHALL
# Docker build cache
docker builder prune -a
# System cache
sync && echo 3 > /proc/sys/vm/drop_caches
Getting Help
- Check logs first: Most issues are visible in container logs
- Search this guide: Use Ctrl+F to find symptoms
- Check service docs: Each service has its own troubleshooting section
- System status: docker ps -a, free -h, df -h
- Monitoring dashboard: https://grafana.collabrains.eu
Contact Information
- Server: Hetzner Cloud VPS (FSN1)
- SSH: ssh root@collabrains.eu
- Console: Hetzner web console
- Email: dijkstra247@gmail.com