Date: 2026-02-07
Status: ✅ All components implemented and tested
This document summarizes the implementation of the Keycloak resilience system: dual authentication, a circuit breaker, token caching, a sync queue, and monitoring. Web users can continue working even if Keycloak is unreachable.
File: factory.web/ObjUser.py
New Methods:
Login(username, password, package) - Dual authentication (local + Keycloak)
ChangePassword(username, old_password, new_password, package) - With sync queue
_create_session(username, package, auth_mode) - Session management
Authentication Flow:
# TIER 1: Local (PRIMARY - always works)
local_auth = verify_local_password()
# TIER 2: Keycloak (OPTIONAL - best effort)
if keycloak.is_available():
    keycloak_token = keycloak.authenticate()
else:
    cached_token = keycloak.get_cached_token()
# TIER 3: Create session (always succeeds)
session_id = create_session()
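The three tiers above can be sketched as a small, self-contained function. This is an illustration under stated assumptions, not the code in ObjUser.py: the `login` function, the `LoginResult` type, and the four callables passed in are hypothetical stand-ins, and only the auth mode names (keycloak/cached/local) come from this document.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoginResult:
    success: bool
    auth_mode: Optional[str] = None   # "keycloak", "cached" or "local"
    session_id: Optional[str] = None


def login(
    verify_local: Callable[[], bool],
    keycloak_available: Callable[[], bool],
    keycloak_authenticate: Callable[[], Optional[str]],
    get_cached_token: Callable[[], Optional[str]],
) -> LoginResult:
    # TIER 1: the local check is the gate; a wrong password fails here.
    if not verify_local():
        return LoginResult(success=False)

    # TIER 2: Keycloak is best effort and never blocks the login.
    auth_mode = "local"
    if keycloak_available():
        try:
            token = keycloak_authenticate()
        except Exception:
            token = None  # treat any Keycloak error as "no token"
        if token:
            auth_mode = "keycloak"
    elif get_cached_token():
        auth_mode = "cached"

    # TIER 3: a session is created whatever the Keycloak outcome was.
    return LoginResult(True, auth_mode, uuid.uuid4().hex)
```

In this sketch a Keycloak failure only downgrades `auth_mode`; it never turns a valid local login into a failed one.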
File: factory.web/ObjUserSession.py
Features:
Usage:
# List sessions
python factory.web/ObjUserSession.py list-sessions john.doe --package homechoice
# Invalidate session
python factory.web/ObjUserSession.py invalidate <session-id>
# Logout user (all devices)
python factory.web/ObjUserSession.py logout-user john.doe --package homechoice
# Clean up expired
python factory.web/ObjUserSession.py cleanup --days 7
# Get metrics
python factory.web/ObjUserSession.py metrics --package homechoice
File: factory.core/ObjKeycloakResilient.py
Features:
New Methods:
is_available() - Check with circuit breaker
authenticate() - With automatic caching
get_cached_token() - Retrieve cached tokens
refresh_cached_token() - Refresh using refresh_token
auto_refresh_expiring_tokens() - Proactive refresh
queue_sync() - Queue operations when offline
process_sync_queue() - Process pending syncs
provision_users_batch() - Batch user creation
sync_users_from_database() - Database migration
cleanup_expired_tokens() - Token cleanup
cleanup_completed_syncs() - Queue cleanup
get_status() - Comprehensive status
File: factory.core/ObjKeycloakSyncService.py
Features:
Usage:
# Start service (foreground)
python factory.core/ObjKeycloakSyncService.py start
# Check status
python factory.core/ObjKeycloakSyncService.py status
# Test configuration
python factory.core/ObjKeycloakSyncService.py test
# Process queue now (one cycle)
python factory.core/ObjKeycloakSyncService.py sync-now
# Run cleanup
python factory.core/ObjKeycloakSyncService.py cleanup
Systemd Service (for production):
# Enable and start
sudo systemctl enable keycloak-sync
sudo systemctl start keycloak-sync
# View logs
sudo journalctl -u keycloak-sync -f
File: local.processing/schema/package.sync/tables/sys_keycloak_token_cache.yaml
Stores cached Keycloak tokens with 24-hour grace period.
Key Fields:
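As a rough illustration of the 24-hour grace period, the check below decides whether a cached token may still be used. It is a minimal sketch rather than the code in ObjKeycloakResilient.py: the function name `is_token_usable` is hypothetical, and measuring the grace period from the `CachedAt` timestamp is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Mirrors token_grace_period_hours: 24 from config.yaml.
GRACE_PERIOD = timedelta(hours=24)


def is_token_usable(cached_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True while the cached token is still inside the grace period."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - cached_at <= GRACE_PERIOD
```

Passing `now` explicitly keeps the check deterministic in tests; both timestamps are expected to be timezone-aware.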
File: local.processing/schema/package.sync/tables/sys_keycloak_sync_queue.yaml
Queues sync operations for eventual consistency.
Key Fields:
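The eventual-consistency idea behind this table can be sketched with an in-memory queue. This is an illustration only: `SyncItem`, `SyncQueue`, the `process` method, the `"change_password"` operation name, and the `"completed"` status value are assumptions. Only `queue_sync` and the `pending` status appear elsewhere in this document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SyncItem:
    operation: str                 # e.g. "change_password" (hypothetical name)
    payload: Dict[str, str]
    status: str = "pending"
    attempts: int = 0


@dataclass
class SyncQueue:
    items: List[SyncItem] = field(default_factory=list)

    def queue_sync(self, operation: str, payload: Dict[str, str]) -> SyncItem:
        """Record an operation that could not be sent to Keycloak yet."""
        item = SyncItem(operation, payload)
        self.items.append(item)
        return item

    def process(self, send: Callable[[SyncItem], None]) -> int:
        """Try every pending item once; return how many completed."""
        done = 0
        for item in self.items:
            if item.status != "pending":
                continue
            item.attempts += 1
            try:
                send(item)
            except Exception:
                continue  # stays pending and is retried on the next cycle
            item.status = "completed"
            done += 1
        return done
```

A failed send leaves the item pending, so the change is applied on a later cycle once Keycloak is reachable again.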
File: local.processing/schema/package.sync/tables/sys_usersession.yaml
New Field Added:
AuthMode varchar(20) - Authentication mode (keycloak/cached/local)
Files:
local.processing/schema/package.sync/tables/sys_user_history.yaml
local.processing/schema/package.janee/tables/sys_user_history.yaml
New Field Added:
AuthMode varchar(20) - Tracks authentication mode in audit trail
File: config.yaml
New Section:
keycloak:
  # ... existing config ...
  resilience:
    enabled: true
    failure_threshold: 3             # Open circuit after 3 failures
    cooldown_seconds: 60             # Wait 60s before retry
    token_grace_period_hours: 24     # Cache tokens 24h
    allow_offline_mode: true         # Allow local auth when down
    health_check_cache_seconds: 30   # Cache health check
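To show how `failure_threshold` and `cooldown_seconds` might interact, here is a minimal circuit breaker sketch. It is not the implementation in ObjKeycloakResilient.py: the `CircuitBreaker` class and its method names are hypothetical, and allowing a single trial request after the cooldown (a half-open state) is an assumption about the intended behaviour.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    def allow_request(self) -> bool:
        """Closed: always allow. Open: allow only after the cooldown."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        """A successful call closes the circuit and resets the counter."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Open (or re-open) the circuit once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

The injectable `clock` is a design choice for testability: a test can advance time without sleeping through the cooldown.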
File: resource.test/pytests/factory.core/test_ObjKeycloakResilient.py
Test Coverage:
Run Tests:
# All tests
pytest resource.test/pytests/factory.core/test_ObjKeycloakResilient.py -v
# Integration tests only
pytest resource.test/pytests/factory.core/test_ObjKeycloakResilient.py -m integration
File: factory.deploy/validate_keycloak_config.py
Validates:
Usage:
# Validate configuration
python factory.deploy/validate_keycloak_config.py validate
# Attempt auto-fix
python factory.deploy/validate_keycloak_config.py fix
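The checks performed by validate_keycloak_config.py are not listed here, so the following is only a sketch of what validating the `keycloak.resilience` section could look like. The `validate_resilience` function, the `EXPECTED_KEYS` table, and the rule that numeric settings must be positive are all assumptions; the key names are taken from the config.yaml section in this document.

```python
from typing import Any, Dict, List

# Keys and types as they appear in the keycloak.resilience section.
EXPECTED_KEYS = {
    "enabled": bool,
    "failure_threshold": int,
    "cooldown_seconds": int,
    "token_grace_period_hours": int,
    "allow_offline_mode": bool,
    "health_check_cache_seconds": int,
}


def validate_resilience(section: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means valid."""
    errors: List[str] = []
    for key, expected_type in EXPECTED_KEYS.items():
        if key not in section:
            errors.append(f"missing key: {key}")
            continue
        value = section[key]
        # bool is a subclass of int in Python, so exclude it explicitly
        # wherever a number is expected.
        is_bool = isinstance(value, bool)
        if expected_type is bool:
            type_ok = is_bool
        else:
            type_ok = isinstance(value, int) and not is_bool
        if not type_ok:
            errors.append(f"{key} must be {expected_type.__name__}")
        elif expected_type is int and value <= 0:
            errors.append(f"{key} must be positive")
    return errors
```

Returning every problem at once, rather than stopping at the first, lets an operator fix the whole section in one pass.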
File: factory.web/migrate_users_to_keycloak.py
Features:
Usage:
# Dry run (see what would be migrated)
python factory.web/migrate_users_to_keycloak.py migrate homechoice --dry-run
# Migrate all active users
python factory.web/migrate_users_to_keycloak.py migrate homechoice
# Migrate specific users
python factory.web/migrate_users_to_keycloak.py migrate homechoice --where "UserGroup LIKE '%ADMIN%'"
# Check status
python factory.web/migrate_users_to_keycloak.py status homechoice
File: factory.core/ObjMonitor.py
New Method: collect_keycloak(context)
Collects:
Usage: Automatically called during monitoring cycles.
File: resource.web/keycloak_status.html
Features:
Deploy: Copy to web server's static files directory.
Files:
factory.report/package.core/ObjReportKeycloakStatus.py - Report class
factory.report/package.core/ObjReportKeycloakStatus.yaml - Configuration
factory.report/package.core/ObjReportKeycloakStatus.md - Documentation
Report Types:
Summary - High-level status overview
Detailed - Comprehensive metrics
Sync Queue - Queue analysis
Usage:
from factory.report.package.core.ObjReportKeycloakStatus import Report
# Summary report
report = Report()
html = report.Render("homechoice", "summary")
# Detailed report
html = report.Render("homechoice", "detailed")
# Sync queue report
html = report.Render("homechoice", "sync_queue")
Integration:
# Via ServeReport.py
@app.route('/report/keycloak-status')
def keycloak_status():
    report = Report()
    return report.Render(request.args.get('package', ''), 'summary')
# Scheduled email
from ObjNotify import ObjNotify
report = Report()
html = report.Render("homechoice", "detailed")
notify = ObjNotify()
notify.send_email(to="ops@example.com", subject="Keycloak Status", body_html=html)
File: factory.core/ObjKeycloakResilient.py
Alerts Sent:
Circuit Opens (warning):
Circuit Closes (info):
Integration: Automatic via _send_alert() method using ObjNotify.
python -c "
from factory.core.ObjData import ObjData
db = ObjData()
db.create_tables_from_yaml('local.processing/schema/package.sync/tables/sys_keycloak_token_cache.yaml')
db.create_tables_from_yaml('local.processing/schema/package.sync/tables/sys_keycloak_sync_queue.yaml')
db.create_tables_from_yaml('local.processing/schema/package.sync/tables/sys_usersession.yaml')
print('✅ Tables created successfully')
"
python factory.deploy/validate_keycloak_config.py validate
# Test migration (dry-run)
python factory.web/migrate_users_to_keycloak.py migrate homechoice --dry-run
# Perform migration
python factory.web/migrate_users_to_keycloak.py migrate homechoice
# Foreground (for testing)
python factory.core/ObjKeycloakSyncService.py start
# Or as systemd service (production)
sudo systemctl start keycloak-sync
from factory.web.ObjUser import User
user = User()
result = user.Login("test@example.com", "password", "homechoice")
print(f"Success: {result['success']}")
print(f"Auth mode: {result['auth_mode']}")
print(f"Keycloak status: {result['keycloak_status']}")
factory.web/ObjUserSession.py - Session management
factory.core/ObjKeycloakResilient.py - Resilient Keycloak client
factory.core/ObjKeycloakSyncService.py - Background sync service
local.processing/schema/.../sys_keycloak_token_cache.yaml - Token cache schema
local.processing/schema/.../sys_keycloak_sync_queue.yaml - Sync queue schema
resource.test/pytests/factory.core/test_ObjKeycloakResilient.py - Tests
factory.web/migrate_users_to_keycloak.py - Migration script
factory.deploy/validate_keycloak_config.py - Config validator
resource.web/keycloak_status.html - Web dashboard
resource.config/grafana_keycloak_dashboard.json - Grafana dashboard
factory.core/ObjKeycloakResilient.md - Documentation
factory.core/ObjKeycloakSyncService.md - Documentation
factory.web/ObjUser.py - Added Login(), ChangePassword() methods
config.yaml - Added keycloak.resilience section
local.processing/schema/.../sys_usersession.yaml - Added AuthMode field
local.processing/schema/.../sys_user_history.yaml (sync) - Added AuthMode
local.processing/schema/.../sys_user_history.yaml (janee) - Added AuthMode
factory.core/ObjMonitor.py - Added collect_keycloak() method
factory.web/ObjUser.yaml - Already had configuration (no changes needed)
✅ Dual Authentication - Local (primary) + Keycloak (enhancement)
✅ Circuit Breaker - Prevents hammering failed service
✅ Token Caching - 24-hour grace period
✅ Sync Queue - Eventual consistency
✅ Token Refresh - Proactive token renewal
✅ Batch Provisioning - Efficient mass user creation
✅ Alert Integration - ObjNotify for circuit events
✅ Session Management - Multi-type session tracking
✅ Health Monitoring - ObjMonitor integration
✅ Web Dashboard - Real-time status display
✅ Grafana Dashboard - Advanced metrics visualization
✅ Migration Tools - Database to Keycloak sync
✅ Config Validation - Automated setup verification
✅ Comprehensive Testing - Pytest test suite
✅ Users can log in when Keycloak is down
✅ Users can change passwords when Keycloak is down
✅ Sessions remain valid during Keycloak outage
✅ Changes sync automatically when Keycloak recovers
✅ No manual intervention required
✅ Operations team has visibility into system state
✅ Alerts notify team when Keycloak unavailable
✅ Background service processes sync queue
✅ Metrics collected for monitoring
✅ Dashboards available for visualization
Database Setup:
# Create tables on target database
python factory.deploy/validate_keycloak_config.py fix
Migration:
# Migrate existing users
python factory.web/migrate_users_to_keycloak.py migrate <package>
Service Deployment:
# Install systemd service
sudo cp resource.config/keycloak-sync.service /etc/systemd/system/
sudo systemctl enable keycloak-sync
sudo systemctl start keycloak-sync
Monitoring Setup:
Testing:
# Run validation
python factory.deploy/validate_keycloak_config.py validate
# Run tests
pytest resource.test/pytests/factory.core/test_ObjKeycloakResilient.py -v
Issue: Users can't log in
Solution: Check sys_user table, verify local auth works
Issue: Keycloak syncs not processing
Solution: Start sync service, check queue with status command
Issue: Too many cached tokens
Solution: Run cleanup: python factory.core/ObjKeycloakSyncService.py cleanup
Issue: Circuit breaker stuck OPEN
Solution: Verify Keycloak connectivity, then wait for the cooldown period (cooldown_seconds in config.yaml) to elapse
# Check sync service logs
sudo journalctl -u keycloak-sync -f
# Check Keycloak status
python factory.core/ObjKeycloakSyncService.py status
# View pending syncs
mysql -e "SELECT * FROM sys_keycloak_sync_queue WHERE Status='pending'"
# View cached tokens
mysql -e "SELECT User, Package, CachedAt FROM sys_keycloak_token_cache"
resource.notes/objuser_keycloak_resilience_design.md - Detailed design (28 KB)
resource.notes/objuser_keycloak_quick_implementation.md - Implementation guide (22 KB)
resource.notes/KEYCLOAK_INTEGRATION_SUMMARY.md - High-level overview
factory.core/ObjKeycloakResilient.md - API documentation
factory.core/ObjKeycloakSyncService.md - Service documentation
resource.notes/sys_user_cleanup_2026-02-07.md - Schema cleanup notes
Implementation Status: ✅ COMPLETE
Version: 1.0.0
Date: 2026-02-07
Principle: Web users can continue working even if Keycloak is unreachable