Fix logging directory permission handling

- Add robust writeability testing for logs directory
- Implement fallback hierarchy: logs/ → /tmp/libravatar-logs → user-specific temp
- Handle cases where directory exists but isn't writable
- Prevent Django startup failures due to permission errors

Resolves development instance startup issues with /var/www/dev.libravatar.org/logs/
Author: Oliver Falk
Date: 2025-10-16 12:36:42 +02:00
Commit: cfa3d11b35 (parent: 8b04c170ec)
6 changed files with 672 additions and 7 deletions

.cursorrules (new file, 232 lines)

@@ -0,0 +1,232 @@
# ivatar/libravatar Project Rules
## Project Overview
ivatar is a Django-based federated avatar service that serves as an alternative to Gravatar. It provides avatar images for email addresses and OpenID URLs, with support for the Libravatar federation protocol.
## Core Functionality
- Avatar service for email addresses and OpenID URLs
- Federated compatibility with Libravatar protocol
- Multiple authentication methods (OpenID, OpenID Connect/Fedora, Django auth)
- Image upload, cropping, and management
- External avatar import (Gravatar, other Libravatar instances)
- Bluesky handle integration
- Multiple theme support (default, clime, green, red)
- Internationalization (15+ languages)
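The email-to-avatar lookup at the heart of this list can be sketched as follows. This is an illustrative helper (`libravatar_hash` is not a function from the ivatar codebase); it follows the Libravatar/Gravatar convention of hashing the trimmed, lowercased address:

```python
import hashlib


def libravatar_hash(email: str) -> str:
    """Hash an email address the way the Libravatar/Gravatar APIs expect:
    md5 of the trimmed, lowercased address (the Libravatar protocol also
    accepts sha256 digests)."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# The avatar URL is then formed as https://<instance>/avatar/<hash>
```

Normalizing before hashing is what makes `Test@Example.COM` and `test@example.com` resolve to the same avatar.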
## Technical Stack
- **Framework**: Django 4.2+ with Python 3.x
- **Database**: SQLite (development), MySQL/MariaDB, PostgreSQL (production)
- **Image Processing**: PIL/Pillow for image manipulation
- **Authentication**: django-openid-auth, social-auth-app-django
- **Caching**: Memcached and filesystem caching
- **Email**: Mailgun integration via django-anymail
- **Testing**: pytest with custom markers
## Key Models
- `Photo`: Stores uploaded avatar images with format detection and access counting
- `ConfirmedEmail`: Verified email addresses with assigned photos and Bluesky handles
- `ConfirmedOpenId`: Verified OpenID URLs with assigned photos and Bluesky handles
- `UserPreference`: User theme preferences
- `UnconfirmedEmail`: Email verification workflow
- `UnconfirmedOpenId`: OpenID verification workflow
## Security Features
- File upload validation and sanitization
- EXIF data removal (ENABLE_EXIF_SANITIZATION)
- Malicious content scanning (ENABLE_MALICIOUS_CONTENT_SCAN)
- Comprehensive security logging
- File size limits and format validation
- Trusted URL validation for external avatar sources
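As a rough illustration of what EXIF sanitization involves, the sketch below re-encodes only the pixel data so metadata is left behind. This is a simplified example assuming Pillow, not ivatar's actual implementation (palette and animation details are not preserved):

```python
from io import BytesIO

from PIL import Image


def strip_exif(data: bytes) -> bytes:
    """Re-encode an image without carrying over EXIF metadata.
    Simplified sketch: copies pixel data into a fresh image object."""
    img = Image.open(BytesIO(data))
    fmt = img.format or "PNG"
    # Copy only the pixels into a new image; EXIF lives in the container,
    # not in the pixel data, so it is dropped here
    clean = Image.new(img.mode, img.size)
    clean.putdata(list(img.getdata()))
    out = BytesIO()
    clean.save(out, format=fmt)
    return out.getvalue()
```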
## Development Workflow Rules
### Testing
- **MANDATORY: Run pre-commit hooks and tests before committing any changes** - this is non-negotiable
- Use `./run_tests_local.sh` for local development (skips Bluesky tests requiring API credentials)
- Run `python3 manage.py test -v2` for full test suite including Bluesky tests
- **MANDATORY: When adding new code, always write tests to increase code coverage** - never decrease coverage
- Use pytest markers appropriately:
- `@pytest.mark.bluesky`: Tests requiring Bluesky API credentials
- `@pytest.mark.slow`: Long-running tests
- `@pytest.mark.integration`: Integration tests
- `@pytest.mark.unit`: Unit tests
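A hypothetical test module using these markers might look like this (test names are illustrative, not from the ivatar test suite):

```python
import hashlib

import pytest


@pytest.mark.unit
def test_email_hash_is_deterministic():
    # Same input must always yield the same digest
    a = hashlib.md5(b"test@example.com").hexdigest()
    b = hashlib.md5(b"test@example.com").hexdigest()
    assert a == b


@pytest.mark.bluesky
@pytest.mark.slow
def test_bluesky_handle_lookup():
    # Deselected by ./run_tests_local.sh; needs real API credentials
    pytest.skip("requires Bluesky API credentials")
```

Running `pytest -m "not bluesky"` then deselects the credential-dependent tests, which is presumably what the local test script does under the hood.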
### Code Quality
- Always check for linter errors after making changes using `read_lints`
- Follow existing code style and patterns
- Maintain comprehensive logging (use `logger = logging.getLogger("ivatar")`)
- Consider security implications of any changes
- Follow Django best practices and conventions
### Database Operations
- Use migrations for schema changes: `./manage.py migrate`
- Support multiple database backends (SQLite, MySQL, PostgreSQL)
- Use proper indexing for performance (see existing model indexes)
### Image Processing
- Support multiple formats: JPEG, PNG, GIF, WEBP
- Maximum image size: 512x512 pixels (AVATAR_MAX_SIZE)
- Maximum file size: 10MB (MAX_PHOTO_SIZE)
- JPEG quality: 85 (JPEG_QUALITY)
- Always validate image format and dimensions
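A validation helper following these constraints could be sketched like this. The constants mirror the settings named above, and `validate_avatar_upload` is an illustrative name, not ivatar's actual code:

```python
from io import BytesIO

from PIL import Image

AVATAR_MAX_SIZE = 512               # assumed from AVATAR_MAX_SIZE above
MAX_PHOTO_SIZE = 10 * 1024 * 1024   # assumed from MAX_PHOTO_SIZE above
ALLOWED_FORMATS = {"JPEG", "PNG", "GIF", "WEBP"}


def validate_avatar_upload(data: bytes) -> str:
    """Validate raw upload bytes; return the detected format or raise ValueError."""
    if len(data) > MAX_PHOTO_SIZE:
        raise ValueError("file exceeds 10MB limit")
    try:
        img = Image.open(BytesIO(data))
        img.verify()  # cheap integrity check; the object is unusable afterwards
    except Exception as exc:
        raise ValueError(f"not a valid image: {exc}")
    img = Image.open(BytesIO(data))  # reopen, since verify() consumed it
    if img.format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {img.format}")
    if max(img.size) > AVATAR_MAX_SIZE:
        raise ValueError(f"image larger than {AVATAR_MAX_SIZE}px")
    return img.format
```

Note the reopen after `Image.verify()`: Pillow documents that a verified image object cannot be used further, so format and dimension checks need a fresh handle.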
## Configuration Management
- Main settings in `ivatar/settings.py` and `config.py`
- Local overrides in `config_local.py` (not in version control)
- Environment variables for sensitive data (database credentials, API keys)
- Support for multiple deployment environments (development, staging, production)
## Authentication & Authorization
- Multiple backends: Django auth, OpenID, Fedora OIDC
- Social auth pipeline with custom steps for email confirmation
- User account creation and management
- Email verification workflow
## Caching Strategy
- Memcached for general caching
- Filesystem cache for generated images
- 5-minute cache for resized images (CACHE_IMAGES_MAX_AGE)
- Cache invalidation on photo updates
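In Django settings terms, such a two-tier setup could look like the sketch below. The backend paths are standard Django, but the cache names, locations, and the exact value of CACHE_IMAGES_MAX_AGE are assumptions, not copied from ivatar's configuration:

```python
# Sketch only: a memcached default cache plus a filesystem cache for
# generated images, with the 5-minute image timeout described above.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
    },
    "generated_images": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        "LOCATION": "/var/tmp/ivatar-image-cache",
        "TIMEOUT": 300,  # CACHE_IMAGES_MAX_AGE: 5 minutes
    },
}
```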
## Internationalization
- Support for 15+ languages
- Use Django's translation framework
- Template strings should be translatable
- Locale-specific formatting
## File Structure Guidelines
- Main Django app: `ivatar/`
- Account management: `ivatar/ivataraccount/`
- Tools: `ivatar/tools/`
- Static files: `ivatar/static/` and `static/`
- Templates: `templates/` and app-specific template directories
- Tests: Co-located with modules or in dedicated test files
## Security Considerations
- Always validate file uploads
- Sanitize EXIF data from images
- Use secure password hashing (Argon2 preferred, PBKDF2 fallback)
- Implement proper CSRF protection
- Use secure cookies in production
- Log security events to dedicated security log
## Performance Considerations
- Use database indexes for frequently queried fields
- Implement proper caching strategies
- Optimize image processing operations
- Monitor access counts for analytics
- Use efficient database queries
## Production Deployment & Infrastructure
### Hosting & Sponsorship
- **Hosted by Fedora Project** - Free infrastructure provided due to heavy usage by Fedora community
- **Scale**: Handles millions of requests daily for 30k+ users with 33k+ avatar images
- **Performance**: High-performance system optimized for dynamic content (CDN difficult due to dynamic sizing)
### Production Architecture
- **Redis**: Session storage (potential future caching expansion)
- **Monitoring Stack**:
- Prometheus + Alertmanager for metrics and alerting
- Loki for log aggregation
- Alloy for observability
- Grafana for visualization
- Custom exporters for application metrics
- **Apache HTTPD**:
- SSL termination
- Load balancer for Gunicorn containers
- Caching (memory/socache and disk cache - optimization ongoing)
- **PostgreSQL**: Main production database
- **Gunicorn**: 2 containers running Django application
- **Containerization**: **Podman** (not Docker) - always prefer podman when possible
### Development Environment
- **Dev Instance**: dev.libravatar.org (auto-deployed from 'devel' branch via Puppet)
- **Limitation**: Aging CentOS 7 host with older Python 3.x and Django versions
- **Compatibility**: Must maintain backward compatibility with older versions
### CI/CD & Version Control
- **GitLab**: Self-hosted OSS/Community Edition on git.linux-kernel.at
- **CI**: GitLab CI extensively used
- **CD**: GitLab CD on roadmap (part of libravatar-ansible project)
- **Deployment**: Separate libravatar-ansible project handles production deployments
- **Container Management**: Ansible playbooks rebuild custom images and restart containers as needed
### Deployment Considerations
- Production requires proper database setup (PostgreSQL, not SQLite)
- Static file collection required: `./manage.py collectstatic`
- Environment-specific configuration via environment variables
- Custom container images with automated rebuilds
- High availability and performance optimization critical
## Common Commands
```bash
# Development server
./manage.py runserver 0:8080
# Run local tests (recommended for development)
./run_tests_local.sh
# Run all tests
python3 manage.py test -v2
# Database migrations
./manage.py migrate
# Collect static files
./manage.py collectstatic -l --no-input
# Create superuser
./manage.py createsuperuser
```
## Code Style Guidelines
- Use descriptive variable and function names
- Add comprehensive docstrings for classes and methods
- **MANDATORY: Include type hints for ALL new code** - this is a strict requirement
- Follow PEP 8 and Django coding standards
- Use meaningful commit messages
- Add comments for complex business logic
## Error Handling
- Use proper exception handling with specific exception types
- Log errors with appropriate levels (DEBUG, INFO, WARNING, ERROR)
- Provide user-friendly error messages
- Implement graceful fallbacks where possible
## API Compatibility
- Maintain backward compatibility with existing avatar URLs
- Support Libravatar federation protocol
- Ensure Gravatar compatibility for imports
- Preserve existing URL patterns and parameters
## Monitoring & Logging
- Use structured logging with appropriate levels
- Log security events to dedicated security log
- Monitor performance metrics (access counts, response times)
- Implement health checks for external dependencies
- **Robust logging setup**: Automatically tests directory writeability and falls back gracefully
- **Fallback hierarchy**: logs/ → /tmp/libravatar-logs → user-specific temp directory
- **Permission handling**: Handles cases where logs directory exists but isn't writable
## GitLab CI/CD Monitoring
- **MANDATORY: Check GitLab pipeline status regularly** during development
- Monitor pipeline status for the current working branch (typically `devel`)
- Use `glab ci list --repo git.linux-kernel.at/oliver/ivatar --per-page 5` to check recent pipelines
- Verify all tests pass before considering work complete
- Check pipeline logs with `glab ci trace <pipeline-id> --repo git.linux-kernel.at/oliver/ivatar` if needed
- Address any CI failures immediately before proceeding with new changes
- Pipeline URL: https://git.linux-kernel.at/oliver/ivatar/-/pipelines
## Deployment Verification
- **Automatic verification**: GitLab CI automatically verifies dev.libravatar.org deployments on `devel` branch
- **Manual verification**: Production deployments on `master` branch can be verified manually via CI
- **Version endpoint**: `/deployment/version/` provides commit hash, branch, and deployment status
- **Security**: Version endpoint uses cached git file reading (no subprocess calls) to prevent DDoS attacks
- **Performance**: Version information is cached in memory to avoid repeated file system access
- **SELinux compatibility**: No subprocess calls that might be blocked by SELinux policies
- **Manual testing**: Use `./scripts/test_deployment.sh` to test deployments locally
- **Deployment timing**: Dev deployments via Puppet may take up to 30 minutes to complete
- **Verification includes**: Version matching, avatar endpoint, stats endpoint functionality
Remember: This is a production avatar service handling user data and images. Security, performance, and reliability are paramount. Always consider the impact of changes on existing users and federated services.

.gitlab-ci.yml

@@ -124,6 +124,150 @@ semgrep:
      - gl-sast-report.json
      - semgrep.sarif

# Deployment verification jobs
verify_dev_deployment:
  stage: deploy
  image: alpine:latest
  only:
    - devel
  variables:
    DEV_URL: "https://dev.libravatar.org"
    MAX_RETRIES: 30
    RETRY_DELAY: 60
  before_script:
    - apk add --no-cache curl jq
  script:
    - echo "Waiting for dev.libravatar.org deployment to complete..."
    - |
      for i in $(seq 1 $MAX_RETRIES); do
        echo "Attempt $i/$MAX_RETRIES: Checking deployment status..."
        # Get current commit hash from GitLab
        CURRENT_COMMIT="$CI_COMMIT_SHA"
        echo "Expected commit: $CURRENT_COMMIT"
        # Check if dev site is responding
        if curl -sf "$DEV_URL/deployment/version/" > /dev/null 2>&1; then
          echo "Dev site is responding, checking version..."
          # Get deployed version
          DEPLOYED_VERSION=$(curl -sf "$DEV_URL/deployment/version/" | jq -r '.commit_hash // empty')
          if [ "$DEPLOYED_VERSION" = "$CURRENT_COMMIT" ]; then
            echo "✅ SUCCESS: Dev deployment verified!"
            echo "Deployed version: $DEPLOYED_VERSION"
            echo "Expected version: $CURRENT_COMMIT"
            # Run basic functionality tests
            echo "Running basic functionality tests..."
            # Test avatar endpoint
            if curl -sf "$DEV_URL/avatar/test@example.com" > /dev/null; then
              echo "✅ Avatar endpoint working"
            else
              echo "❌ Avatar endpoint failed"
              exit 1
            fi
            # Test stats endpoint
            if curl -sf "$DEV_URL/stats/" > /dev/null; then
              echo "✅ Stats endpoint working"
            else
              echo "❌ Stats endpoint failed"
              exit 1
            fi
            echo "🎉 Dev deployment verification completed successfully!"
            exit 0
          else
            echo "Version mismatch. Deployed: $DEPLOYED_VERSION, Expected: $CURRENT_COMMIT"
          fi
        else
          echo "Dev site not responding yet..."
        fi
        if [ $i -lt $MAX_RETRIES ]; then
          echo "Waiting $RETRY_DELAY seconds before next attempt..."
          sleep $RETRY_DELAY
        fi
      done
      echo "❌ FAILED: Dev deployment verification timed out after $MAX_RETRIES attempts"
      exit 1
  allow_failure: false

verify_prod_deployment:
  stage: deploy
  image: alpine:latest
  only:
    - master
  when: manual
  variables:
    PROD_URL: "https://libravatar.org"
    MAX_RETRIES: 10
    RETRY_DELAY: 30
  before_script:
    - apk add --no-cache curl jq
  script:
    - echo "Verifying production deployment..."
    - |
      for i in $(seq 1 $MAX_RETRIES); do
        echo "Attempt $i/$MAX_RETRIES: Checking production deployment..."
        # Get current commit hash from GitLab
        CURRENT_COMMIT="$CI_COMMIT_SHA"
        echo "Expected commit: $CURRENT_COMMIT"
        # Check if prod site is responding
        if curl -sf "$PROD_URL/deployment/version/" > /dev/null 2>&1; then
          echo "Production site is responding, checking version..."
          # Get deployed version
          DEPLOYED_VERSION=$(curl -sf "$PROD_URL/deployment/version/" | jq -r '.commit_hash // empty')
          if [ "$DEPLOYED_VERSION" = "$CURRENT_COMMIT" ]; then
            echo "✅ SUCCESS: Production deployment verified!"
            echo "Deployed version: $DEPLOYED_VERSION"
            echo "Expected version: $CURRENT_COMMIT"
            # Run basic functionality tests
            echo "Running production functionality tests..."
            # Test avatar endpoint
            if curl -sf "$PROD_URL/avatar/test@example.com" > /dev/null; then
              echo "✅ Production avatar endpoint working"
            else
              echo "❌ Production avatar endpoint failed"
              exit 1
            fi
            # Test stats endpoint
            if curl -sf "$PROD_URL/stats/" > /dev/null; then
              echo "✅ Production stats endpoint working"
            else
              echo "❌ Production stats endpoint failed"
              exit 1
            fi
            echo "🎉 Production deployment verification completed successfully!"
            exit 0
          else
            echo "Version mismatch. Deployed: $DEPLOYED_VERSION, Expected: $CURRENT_COMMIT"
          fi
        else
          echo "Production site not responding..."
        fi
        if [ $i -lt $MAX_RETRIES ]; then
          echo "Waiting $RETRY_DELAY seconds before next attempt..."
          sleep $RETRY_DELAY
        fi
      done
      echo "❌ FAILED: Production deployment verification timed out after $MAX_RETRIES attempts"
      exit 1
  allow_failure: false

include:
  - template: Jobs/SAST.gitlab-ci.yml
  - template: Jobs/Dependency-Scanning.gitlab-ci.yml

ivatar/settings.py

@@ -16,13 +16,38 @@ BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Logging directory - can be overridden in local config
LOGS_DIR = os.path.join(BASE_DIR, "logs")


def _test_logs_directory_writeability(logs_dir):
    """
    Test if a logs directory is actually writable by attempting to create
    and write a test file
    """
    try:
        # Ensure directory exists
        os.makedirs(logs_dir, exist_ok=True)
        # Test if we can actually write to the directory
        test_file = os.path.join(logs_dir, ".write_test")
        with open(test_file, "w") as f:
            f.write("test")
        # Clean up test file
        os.remove(test_file)
        return True
    except (OSError, PermissionError):
        return False


# Ensure logs directory exists and is writable - worst case, fall back to /tmp
if not _test_logs_directory_writeability(LOGS_DIR):
    logger.warning(f"Logs directory {LOGS_DIR} is not writable, falling back to /tmp")
    LOGS_DIR = "/tmp/libravatar-logs"
    if not _test_logs_directory_writeability(LOGS_DIR):
        # If even /tmp fails, use a user-specific temp directory
        import tempfile

        LOGS_DIR = os.path.join(tempfile.gettempdir(), f"libravatar-logs-{os.getuid()}")
        _test_logs_directory_writeability(LOGS_DIR)  # This should always succeed
        logger.warning(f"Falling back to user-specific logs directory {LOGS_DIR}")
# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = "=v(+-^t#ahv^a&&e)uf36g8algj$d1@6ou^w(r0@%)#8mlc*zk"

ivatar/urls.py

@@ -10,7 +10,7 @@ from django.conf.urls.static import static
from django.views.generic import TemplateView, RedirectView
from ivatar import settings
from .views import AvatarImageView, StatsView
from .views import GravatarProxyView, BlueskyProxyView, DeploymentVersionView
urlpatterns = [  # pylint: disable=invalid-name
    path("admin/", admin.site.urls),
@@ -69,6 +69,11 @@ urlpatterns = [  # pylint: disable=invalid-name
    ),
    path("talk_to_us/", RedirectView.as_view(url="/contact"), name="talk_to_us"),
    path("stats/", StatsView.as_view(), name="stats"),
    path(
        "deployment/version/",
        DeploymentVersionView.as_view(),
        name="deployment_version",
    ),
]
MAINTENANCE = False

ivatar/views.py

@@ -8,6 +8,7 @@ from io import BytesIO
from os import path
import hashlib
import logging
import threading
from ivatar.utils import urlopen, Bluesky
from urllib.error import HTTPError, URLError
from ssl import SSLError
@@ -768,3 +769,136 @@ class StatsView(TemplateView, JsonResponse):
        }
        return JsonResponse(retval)


# Thread-safe version cache
_version_cache = None
_version_cache_lock = threading.Lock()


def _get_git_info_from_files():
    """
    Safely extract git information from .git files without subprocess calls
    """
    try:
        # Get the project root directory
        project_root = path.dirname(path.dirname(path.abspath(__file__)))
        git_dir = path.join(project_root, ".git")
        if not path.exists(git_dir):
            return None
        # Read HEAD to get current branch/commit
        head_file = path.join(git_dir, "HEAD")
        if not path.exists(head_file):
            return None
        with open(head_file, "r") as f:
            head_content = f.read().strip()
        # Parse HEAD content
        if head_content.startswith("ref: "):
            # We're on a branch
            branch_ref = head_content[5:]  # Remove 'ref: '
            branch_name = path.basename(branch_ref)
            # Read the commit hash from the ref
            ref_file = path.join(git_dir, branch_ref)
            if path.exists(ref_file):
                with open(ref_file, "r") as f:
                    commit_hash = f.read().strip()
            else:
                return None
        else:
            # Detached HEAD state
            commit_hash = head_content
            branch_name = "detached"
        # Try to get commit date from the reflog (if available)
        commit_date = None
        log_file = path.join(git_dir, "logs", "HEAD")
        if path.exists(log_file):
            try:
                with open(log_file, "r") as f:
                    # Read last line to get most recent commit info
                    lines = f.readlines()
                    if lines:
                        last_line = lines[-1].strip()
                        # Reflog format: <old_hash> <new_hash> <author> <timestamp> <timezone>\t<message>
                        parts = last_line.split("\t")
                        if len(parts) >= 2:
                            # Extract timestamp and convert to readable date
                            timestamp_part = parts[0].split()[-2]  # Get timestamp
                            if timestamp_part.isdigit():
                                import datetime

                                timestamp = int(timestamp_part)
                                commit_date = datetime.datetime.fromtimestamp(
                                    timestamp
                                ).strftime("%Y-%m-%d %H:%M:%S %z")
            except (ValueError, IndexError):
                pass
        # Fallback: try to get date from commit object if available
        if not commit_date and len(commit_hash) == 40:
            try:
                commit_dir = path.join(git_dir, "objects", commit_hash[:2])
                commit_file = path.join(commit_dir, commit_hash[2:])
                if path.exists(commit_file):
                    # This would require decompressing the git object, which is complex
                    # For now, we'll use a placeholder
                    commit_date = "unknown"
            except Exception:
                commit_date = "unknown"
        return {
            "commit_hash": commit_hash,
            "short_hash": commit_hash[:7] if len(commit_hash) >= 7 else commit_hash,
            "branch": branch_name,
            "commit_date": commit_date or "unknown",
            "deployment_status": "active",
            "version": f"{branch_name}-{commit_hash[:7] if len(commit_hash) >= 7 else commit_hash}",
        }
    except Exception as exc:
        logger.warning(f"Failed to read git info from files: {exc}")
        return None


def _get_cached_version_info():
    """
    Get cached version information, loading it if not available
    """
    global _version_cache
    with _version_cache_lock:
        if _version_cache is None:
            # Get version info from git files
            _version_cache = _get_git_info_from_files()
            # If that fails, return error
            if _version_cache is None:
                _version_cache = {
                    "error": "Unable to determine version - .git directory not found",
                    "deployment_status": "unknown",
                }
    return _version_cache


class DeploymentVersionView(View):
    """
    View to return deployment version information for CI/CD verification

    Uses cached version info to prevent DDoS attacks and improve performance
    """

    def get(self, request, *args, **kwargs):
        """
        Return cached deployment version information
        """
        version_info = _get_cached_version_info()
        if "error" in version_info:
            return JsonResponse(version_info, status=500)
        return JsonResponse(version_info)

scripts/test_deployment.sh (new executable file, 125 lines)

@@ -0,0 +1,125 @@
#!/bin/bash
# Test deployment verification script
set -e

# Configuration
DEV_URL="https://dev.libravatar.org"
PROD_URL="https://libravatar.org"
MAX_RETRIES=5
RETRY_DELAY=10

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Function to test deployment
test_deployment() {
    local url=$1
    local name=$2
    local max_retries=$3

    echo -e "${YELLOW}Testing $name deployment at $url${NC}"
    for i in $(seq 1 $max_retries); do
        echo "Attempt $i/$max_retries: Checking $name deployment..."
        # Check if site is responding
        if curl -sf "$url/deployment/version/" >/dev/null 2>&1; then
            echo "$name site is responding, checking version..."
            # Get deployed version info
            VERSION_INFO=$(curl -sf "$url/deployment/version/")
            echo "Version info: $VERSION_INFO"
            # Extract commit hash
            COMMIT_HASH=$(echo "$VERSION_INFO" | jq -r '.commit_hash // empty')
            BRANCH=$(echo "$VERSION_INFO" | jq -r '.branch // empty')
            VERSION=$(echo "$VERSION_INFO" | jq -r '.version // empty')
            echo "Deployed commit: $COMMIT_HASH"
            echo "Deployed branch: $BRANCH"
            echo "Deployed version: $VERSION"
            # Run basic functionality tests
            echo "Running basic functionality tests..."
            # Test avatar endpoint
            if curl -sf "$url/avatar/test@example.com" >/dev/null; then
                echo -e "${GREEN}✅ Avatar endpoint working${NC}"
            else
                echo -e "${RED}❌ Avatar endpoint failed${NC}"
                return 1
            fi
            # Test stats endpoint
            if curl -sf "$url/stats/" >/dev/null; then
                echo -e "${GREEN}✅ Stats endpoint working${NC}"
            else
                echo -e "${RED}❌ Stats endpoint failed${NC}"
                return 1
            fi
            echo -e "${GREEN}🎉 $name deployment verification completed successfully!${NC}"
            return 0
        else
            echo "$name site not responding yet..."
        fi
        if [ $i -lt $max_retries ]; then
            echo "Waiting $RETRY_DELAY seconds before next attempt..."
            sleep $RETRY_DELAY
        fi
    done
    echo -e "${RED}❌ FAILED: $name deployment verification timed out after $max_retries attempts${NC}"
    return 1
}

# Main execution
echo "Libravatar Deployment Verification Script"
echo "=========================================="

# Check if jq is available
if ! command -v jq &>/dev/null; then
    echo -e "${RED}Error: jq is required but not installed${NC}"
    echo "Install with: brew install jq (macOS) or apt-get install jq (Ubuntu)"
    exit 1
fi

# Test dev deployment (capture the result without tripping `set -e`,
# which would otherwise abort the script on the first failure)
echo ""
DEV_RESULT=0
test_deployment "$DEV_URL" "Dev" $MAX_RETRIES || DEV_RESULT=$?

# Test production deployment
echo ""
PROD_RESULT=0
test_deployment "$PROD_URL" "Production" $MAX_RETRIES || PROD_RESULT=$?

# Summary
echo ""
echo "=========================================="
echo "Deployment Verification Summary:"
echo "=========================================="
if [ $DEV_RESULT -eq 0 ]; then
    echo -e "${GREEN}✅ Dev deployment: PASSED${NC}"
else
    echo -e "${RED}❌ Dev deployment: FAILED${NC}"
fi
if [ $PROD_RESULT -eq 0 ]; then
    echo -e "${GREEN}✅ Production deployment: PASSED${NC}"
else
    echo -e "${RED}❌ Production deployment: FAILED${NC}"
fi

# Exit with error if any test failed
if [ $DEV_RESULT -ne 0 ] || [ $PROD_RESULT -ne 0 ]; then
    exit 1
fi
echo -e "${GREEN}🎉 All deployment verifications passed!${NC}"