This document outlines recommended enhancements to the Credit Bureau Routing Framework to improve reliability, performance, monitoring, and operational efficiency.
Comprehensive command-line interface for testing and validation.
Commands:
# Test specific bureau with real data
python ObjBureauCLI.py test EXPERIAN --id 8001011234567 --firstname John --surname Doe --birthdate 19800101
# Show raw response data
python ObjBureauCLI.py test MYDATA --id 8001011234567 --firstname Jane --surname Smith --raw
# Health check all bureaus
python ObjBureauCLI.py health
# Compare multiple bureaus
python ObjBureauCLI.py compare --bureaus EXPERIAN,MYDATA --id 8001011234567 --firstname John --surname Doe
# Show usage statistics
python ObjBureauCLI.py stats
python ObjBureauCLI.py stats EXPERIAN --days 30
# Validate configuration
python ObjBureauCLI.py validate-config
# List registered bureaus
python ObjBureauCLI.py list-bureaus
Features:
- Exercise any registered bureau against real identity data
- Raw response inspection (--raw)
- Health checks across all registered bureaus
- Side-by-side comparison of multiple bureaus
- Usage statistics, overall or per bureau over a date range
- Configuration validation and bureau registry listing
Problem: Transient network issues or temporary API failures cause immediate failures without retry.
Solution: Implement automatic retry with exponential backoff.
Implementation:
# In ObjServiceBureau (requires: import time)
def resolve_with_retry(self, guid, max_retries=3, initial_delay=1.0):
    """
    Execute bureau API call with exponential backoff retry.

    Args:
        guid: Request GUID
        max_retries: Maximum retry attempts
        initial_delay: Initial delay in seconds (doubles each retry)

    Returns:
        int: 1 for success, 0 for failure after all retries
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        result = self.resolve(guid)
        if result == 1:
            if attempt > 0:
                self.debug(f"Succeeded on retry attempt {attempt}")
            return 1
        if attempt < max_retries:
            self.debug(f"Attempt {attempt + 1} failed, retrying in {delay}s...")
            time.sleep(delay)
            delay *= 2  # Exponential backoff
    self.debug(f"Failed after {max_retries + 1} attempts")
    return 0
Configuration:
INSERT INTO data_dim_constants (Varname, varvalue, Description)
VALUES ('BureauMaxRetries', '3', 'Maximum retry attempts for bureau API calls');
INSERT INTO data_dim_constants (Varname, varvalue, Description)
VALUES ('BureauRetryDelay', '1.0', 'Initial retry delay in seconds (exponential backoff)');
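To see the retry behaviour outside the framework, the same loop can be exercised as a free function against a stubbed resolver. The `flaky` resolver below is purely illustrative; in the framework the callable would be the bureau's own `resolve` method.

```python
import time

def resolve_with_retry(resolve, guid, max_retries=3, initial_delay=1.0):
    """Standalone sketch of the retry loop; `resolve` is any callable
    returning 1 on success and 0 on failure."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        if resolve(guid) == 1:
            return 1
        if attempt < max_retries:
            time.sleep(delay)
            delay *= 2  # exponential backoff: 1s, 2s, 4s, ...

    return 0

# Simulate a bureau that fails twice, then succeeds on the third call.
attempts = []
def flaky(guid):
    attempts.append(guid)
    return 1 if len(attempts) >= 3 else 0

result = resolve_with_retry(flaky, "guid-1", max_retries=3, initial_delay=0.01)
```

With `initial_delay=1.0` as configured above, the three retries would wait 1s, 2s, and 4s respectively before giving up.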
Benefits:
- Transient network or API failures recover automatically instead of failing the request outright
- Exponential backoff avoids hammering a bureau that is already struggling
Problem: Excessive API calls may exceed bureau quotas or trigger rate limits.
Solution: Implement rate limiting with token bucket algorithm.
Implementation:
# In ObjServiceBureau (requires: import time)
class BureauRateLimiter:
    """Token bucket rate limiter for bureau API calls."""

    def __init__(self, calls_per_minute=60, burst_size=10):
        self.calls_per_minute = calls_per_minute
        self.burst_size = burst_size
        self.tokens = burst_size
        self.last_update = time.time()

    def acquire(self, block=True, timeout=30):
        """Acquire permission to make an API call."""
        start_time = time.time()
        while True:
            now = time.time()
            elapsed = now - self.last_update
            # Refill tokens
            self.tokens = min(
                self.burst_size,
                self.tokens + (elapsed * self.calls_per_minute / 60)
            )
            self.last_update = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            if not block:
                return False
            if time.time() - start_time > timeout:
                return False
            time.sleep(0.1)

# Usage in resolve()
def resolve(self, guid, id_number=""):
    # Acquire rate limit permission
    if not self.rate_limiter.acquire(timeout=30):
        self.debug("Rate limit timeout - too many API calls")
        self.notify_failure(
            self.get_bureau_code(),
            "Rate limit exceeded",
            guid
        )
        return 0
    # Make API call...
Configuration:
INSERT INTO data_dim_constants (Varname, varvalue, Description)
VALUES ('BureauRateLimit_EXPERIAN', '60', 'Experian API calls per minute');
INSERT INTO data_dim_constants (Varname, varvalue, Description)
VALUES ('BureauRateLimit_MYDATA', '120', 'MyData API calls per minute');
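Per-bureau limiters can then be built from those constants. In the sketch below the limits are hard-coded to the same values as the SQL above; in the framework they would be read from data_dim_constants. The class is repeated so the example runs standalone.

```python
import time

class BureauRateLimiter:
    """Token-bucket limiter (same algorithm as above)."""
    def __init__(self, calls_per_minute=60, burst_size=10):
        self.calls_per_minute = calls_per_minute
        self.burst_size = burst_size
        self.tokens = burst_size
        self.last_update = time.time()

    def acquire(self, block=True, timeout=30):
        start_time = time.time()
        while True:
            now = time.time()
            elapsed = now - self.last_update
            # Refill proportionally to elapsed time, capped at burst size
            self.tokens = min(self.burst_size,
                              self.tokens + elapsed * self.calls_per_minute / 60)
            self.last_update = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            if not block or time.time() - start_time > timeout:
                return False
            time.sleep(0.05)

# One limiter per bureau, keyed by bureau code (values as configured above).
limits = {"EXPERIAN": 60, "MYDATA": 120}
limiters = {code: BureauRateLimiter(calls_per_minute=cpm, burst_size=3)
            for code, cpm in limits.items()}

# With a burst of 3, the fourth immediate non-blocking call is refused.
exp = limiters["EXPERIAN"]
granted = [exp.acquire(block=False) for _ in range(4)]
```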
Benefits:
- Bureau quotas and provider rate limits are respected automatically
- Burst capacity absorbs short spikes without rejecting calls
Problem: No visibility into bureau performance, response times, or degradation.
Solution: Track and store performance metrics for each bureau call.
Schema:
CREATE TABLE IF NOT EXISTS `bureau_metrics` (
`id` BIGINT AUTO_INCREMENT PRIMARY KEY,
`timestamp` DATETIME DEFAULT CURRENT_TIMESTAMP,
`bureau_code` VARCHAR(50) NOT NULL,
`request_guid` VARCHAR(50),
`success` TINYINT(1) NOT NULL,
`response_time_ms` INT,
`error_message` TEXT,
`source` VARCHAR(20), -- 'buffer', 'api', 'failover', 'dual'
`package` VARCHAR(50),
`deployment` VARCHAR(50),
INDEX `idx_bureau_time` (`bureau_code`, `timestamp`),
INDEX `idx_success` (`success`, `timestamp`)
) ENGINE=InnoDB;
Implementation:
def resolve(self, guid, id_number=""):
    start_time = time.time()
    source = "api"
    error_message = None
    result = 0  # default so the finally block can always record metrics
    try:
        # Check buffer first
        if self.check_buffer(id_number):
            source = "buffer"
            self.resolve_buffer(guid, id_number, self.get_buffer_time())
            result = 1
        else:
            result = self._make_api_call(guid, id_number)
        return result
    except Exception as e:
        error_message = str(e)
        return 0
    finally:
        # Record metrics
        elapsed_ms = int((time.time() - start_time) * 1000)
        self._record_metrics(
            guid,
            result == 1,
            elapsed_ms,
            error_message,
            source
        )
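The `_record_metrics` helper itself is not defined in this document. One possible shape, matching the bureau_metrics schema above, is sketched below; `execute_sql(query, params)` stands in for the framework's actual DB layer, and the function name, parameters, and defaults are assumptions rather than existing code.

```python
def _record_metrics(execute_sql, guid, success, response_time_ms,
                    error_message, source,
                    bureau_code="EXPERIAN", package=None, deployment=None):
    """Insert one row into bureau_metrics via the supplied DB callable."""
    query = (
        "INSERT INTO bureau_metrics "
        "(bureau_code, request_guid, success, response_time_ms, "
        "error_message, source, package, deployment) "
        "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
    )
    params = (bureau_code, guid, 1 if success else 0, response_time_ms,
              error_message, source, package, deployment)
    execute_sql(query, params)
    return params

# Exercise the helper with a capturing stub instead of a real database.
captured = []
recorded = _record_metrics(lambda q, p: captured.append((q, p)),
                           "guid-42", True, 137, None, "buffer")
```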
Dashboard Queries:
-- Average response time by bureau (last 24 hours)
SELECT
bureau_code,
AVG(response_time_ms) as avg_ms,
MIN(response_time_ms) as min_ms,
MAX(response_time_ms) as max_ms,
COUNT(*) as total_calls
FROM bureau_metrics
WHERE timestamp >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY bureau_code;
-- Success rate by bureau (last 7 days)
SELECT
bureau_code,
SUM(success) as successful,
COUNT(*) as total,
(SUM(success) / COUNT(*) * 100) as success_rate
FROM bureau_metrics
WHERE timestamp >= DATE_SUB(NOW(), INTERVAL 7 DAY)
GROUP BY bureau_code;
-- Buffer hit rate
SELECT
bureau_code,
SUM(CASE WHEN source = 'buffer' THEN 1 ELSE 0 END) as buffer_hits,
SUM(CASE WHEN source = 'api' THEN 1 ELSE 0 END) as api_calls,
(SUM(CASE WHEN source = 'buffer' THEN 1 ELSE 0 END) / COUNT(*) * 100) as buffer_hit_rate
FROM bureau_metrics
WHERE timestamp >= DATE_SUB(NOW(), INTERVAL 7 DAY)
GROUP BY bureau_code;
Benefits:
- Per-bureau visibility into response times, success rates, and degradation trends
- Buffer hit rate shows how effectively caching reduces paid API calls
Problem: Repeated calls to failing bureau waste time and resources.
Solution: Implement circuit breaker to fail fast when bureau is known to be down.
States:
- CLOSED: normal operation; calls pass through to the bureau
- OPEN: too many consecutive failures; calls fail fast without touching the API
- HALF_OPEN: after the timeout expires, one trial call is allowed; success closes the circuit, failure reopens it
Implementation:
# Requires: import time
class CircuitBreaker:
    """Circuit breaker for bureau API calls."""

    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = 0
        self.state = "CLOSED"

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if time.time() - self.last_failure_time > self.timeout:
                # Timeout expired - allow one trial call
                self.state = "HALF_OPEN"
            else:
                raise Exception("Circuit breaker OPEN - bureau unavailable")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record_failure()
            raise
        if result == 1:
            self.failures = 0
            self.state = "CLOSED"
        else:
            self._record_failure()
        return result

    def _record_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        # A failed trial call in HALF_OPEN reopens the circuit immediately
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
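The fail-fast behaviour can be demonstrated end to end with a stubbed resolver. The breaker below is a self-contained variant of the class above (with a failed HALF_OPEN trial reopening the circuit immediately); `failing_resolve` is purely illustrative.

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = 0.0
        self.state = "CLOSED"

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "HALF_OPEN"  # allow one trial call
            else:
                raise Exception("Circuit breaker OPEN - bureau unavailable")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record_failure()
            raise
        if result == 1:
            self.failures = 0
            self.state = "CLOSED"
        else:
            self._record_failure()
        return result

    def _record_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"

breaker = CircuitBreaker(failure_threshold=5, timeout=60)
def failing_resolve(guid):
    return 0  # simulate a bureau that is down

# Five failures trip the breaker; the sixth call fails fast,
# never reaching the bureau.
for _ in range(5):
    breaker.call(failing_resolve, "guid-1")
try:
    breaker.call(failing_resolve, "guid-1")
    fast_failed = False
except Exception:
    fast_failed = True
```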
Benefits:
- Calls to a bureau known to be down fail fast instead of wasting time and resources
- The HALF_OPEN trial call probes for recovery automatically once the timeout expires
Problem: Difficult to debug failed API calls without request/response data.
Solution: Optional detailed logging of API interactions.
Schema:
CREATE TABLE IF NOT EXISTS `bureau_api_log` (
`id` BIGINT AUTO_INCREMENT PRIMARY KEY,
`timestamp` DATETIME DEFAULT CURRENT_TIMESTAMP,
`bureau_code` VARCHAR(50) NOT NULL,
`request_guid` VARCHAR(50),
`request_payload` TEXT,
`response_payload` TEXT,
`response_code` INT,
`success` TINYINT(1),
`error_message` TEXT,
INDEX `idx_guid` (`request_guid`),
INDEX `idx_time` (`timestamp`)
) ENGINE=InnoDB;
Configuration:
INSERT INTO data_dim_constants (Varname, varvalue, Description)
VALUES ('BureauLogRequests', 'N', 'Log bureau API requests/responses (Y/N)');
INSERT INTO data_dim_constants (Varname, varvalue, Description)
VALUES ('BureauLogFailuresOnly', 'Y', 'Only log failed requests (Y/N)');
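A small gate in the resolve path can honour both constants before any row is written to bureau_api_log. The function name and signature below are assumptions for illustration, not existing framework code.

```python
def should_log_api_call(success, log_requests="N", failures_only="Y"):
    """Decide whether to write a bureau_api_log row, based on the
    BureauLogRequests / BureauLogFailuresOnly constants."""
    if log_requests != "Y":
        return False  # logging disabled entirely
    if failures_only == "Y" and success:
        return False  # only failures are of interest
    return True

decisions = [
    should_log_api_call(True,  log_requests="N"),                     # logging off
    should_log_api_call(True,  log_requests="Y", failures_only="Y"),  # success skipped
    should_log_api_call(False, log_requests="Y", failures_only="Y"),  # failure logged
    should_log_api_call(True,  log_requests="Y", failures_only="N"),  # everything logged
]
```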
Benefits:
- Failed API calls can be debugged with the exact request and response payloads
- Failures-only mode keeps log volume manageable in normal operation
Allow temporary disabling of bureaus without removing them from the registry.
Schema:
ALTER TABLE def_remoteconnections
ADD COLUMN Maintenance CHAR(1) DEFAULT 'N',
ADD COLUMN MaintenanceMessage TEXT;
Usage:
-- Put Experian in maintenance mode
UPDATE def_remoteconnections
SET Maintenance = 'Y',
MaintenanceMessage = 'Scheduled maintenance until 2026-02-05 10:00'
WHERE Remote LIKE 'NORMALSEARCHSERVICE%';
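A pre-flight check in the routing code could then skip bureaus that are under maintenance. Here `row` is a dict-shaped def_remoteconnections record; the function name and the dict access pattern are assumptions, since the framework's actual table access API is not shown in this document.

```python
def check_maintenance(row):
    """Return (available, message) for one def_remoteconnections row."""
    if row.get("Maintenance") == "Y":
        return (False, row.get("MaintenanceMessage") or "Bureau under maintenance")
    return (True, None)

available, message = check_maintenance({
    "Remote": "NORMALSEARCHSERVICE1",
    "Maintenance": "Y",
    "MaintenanceMessage": "Scheduled maintenance until 2026-02-05 10:00",
})
```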
Replace the simple RDG random selection with a weighted distribution based on bureau health.
Configuration:
ALTER TABLE def_bureau_strategy
ADD COLUMN Weight INT DEFAULT 100, -- Weight for load balancing
ADD COLUMN AutoAdjustWeight CHAR(1) DEFAULT 'N'; -- Auto-adjust based on success rate
Logic:
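One way the selection step could work is a plain weighted random draw over the Weight column values. The function and the hard-coded weights below are illustrative only; a degraded bureau whose Weight has been lowered is simply picked proportionally less often.

```python
import random

def pick_bureau(weights, rng=random.random):
    """Weighted random choice over {bureau_code: Weight}."""
    total = sum(weights.values())
    threshold = rng() * total
    cumulative = 0
    for code, weight in weights.items():
        cumulative += weight
        if threshold < cumulative:
            return code
    return code  # defensive fallback; the loop always returns in practice

# A degraded bureau (Weight lowered from 100 to 20) receives far less traffic.
weights = {"EXPERIAN": 100, "MYDATA": 20}
random.seed(7)
picks = [pick_bureau(weights) for _ in range(1000)]
share = picks.count("EXPERIAN") / len(picks)
```

With AutoAdjustWeight enabled, the weights themselves would be recomputed periodically from the success rates in bureau_metrics.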
Queue requests during peak load or bureau unavailability.
Implementation:
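As a starting point, a minimal in-process sketch is shown below. The class name and API are assumptions; a production version would likely persist the queue in a table or message broker so queued requests survive restarts.

```python
import queue

class BureauRequestQueue:
    """Bounded in-memory queue for deferred bureau requests."""
    def __init__(self, max_size=1000):
        self._q = queue.Queue(maxsize=max_size)

    def enqueue(self, guid, id_number):
        try:
            self._q.put_nowait((guid, id_number))
            return True
        except queue.Full:
            return False  # caller should fail over or reject the request

    def drain(self, resolve):
        """Replay queued requests once the bureau is healthy again."""
        processed = 0
        while not self._q.empty():
            guid, id_number = self._q.get_nowait()
            if resolve(guid, id_number) == 1:
                processed += 1
        return processed

# With capacity 2, the third enqueue is refused; both queued
# requests replay successfully against an always-succeeding resolver.
rq = BureauRequestQueue(max_size=2)
accepted = [rq.enqueue(f"guid-{i}", "8001011234567") for i in range(3)]
replayed = rq.drain(lambda guid, id_number: 1)
```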
Complete mock implementation for unit tests and CI/CD.
File: ObjServiceBureauMock.py
class ObjServiceApiMock(ObjServiceBureau):
    """Mock bureau for testing - returns canned responses."""

    def __init__(self, DB=0):
        super().__init__(DB)
        self.mock_responses = {}

    def set_mock_response(self, id_number, response_data):
        self.mock_responses[id_number] = response_data

    def resolve(self, guid, id_number=""):
        # Return canned data instead of making an API call;
        # succeed only when a mock response has been registered
        return 1 if id_number in self.mock_responses else 0
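A test would register canned responses and assert on the routing behaviour. The standalone class below drops the ObjServiceBureau base so the example runs outside the framework; the demo ID numbers and response payload are made up.

```python
class ObjServiceApiMockDemo:
    """Framework-free version of the mock above."""
    def __init__(self):
        self.mock_responses = {}

    def set_mock_response(self, id_number, response_data):
        self.mock_responses[id_number] = response_data

    def resolve(self, guid, id_number=""):
        # Succeed only for IDs with a registered canned response
        return 1 if id_number in self.mock_responses else 0

mock = ObjServiceApiMockDemo()
mock.set_mock_response("8001011234567", {"credit_score": 720, "risk_grade": "B"})
hit = mock.resolve("guid-1", "8001011234567")    # registered -> success
miss = mock.resolve("guid-2", "9001011234567")   # unknown -> failure
```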
Validate bureau responses against expected schema.
Implementation:
def validate_response(self, response_data):
    """Validate bureau response structure."""
    required_fields = ['credit_score', 'risk_grade']
    for field in required_fields:
        if field not in response_data:
            self.debug(f"Missing required field: {field}")
            return False
    return True
Built-in A/B testing for bureau comparisons.
Features:
Track API costs per bureau and optimize spending.
Schema:
CREATE TABLE IF NOT EXISTS `bureau_cost_config` (
`bureau_code` VARCHAR(50) PRIMARY KEY,
`cost_per_call` DECIMAL(10, 4),
`currency` VARCHAR(3) DEFAULT 'ZAR'
);
CREATE TABLE IF NOT EXISTS `bureau_cost_tracking` (
`month` DATE,
`bureau_code` VARCHAR(50),
`total_calls` INT,
`total_cost` DECIMAL(12, 2),
PRIMARY KEY (`month`, `bureau_code`)
);
Advanced caching beyond simple buffer.
Features:
Route to different bureau endpoints by region/country.
Schema:
ALTER TABLE def_remoteconnections
ADD COLUMN Region VARCHAR(50) DEFAULT 'ZA',
ADD COLUMN Priority INT DEFAULT 100;
Modern API interface for bureau queries.
Features:
Phase 1 (Immediate):
Phase 2 (Next Sprint):
5. Rate Limiting
6. Circuit Breaker
7. Bureau Maintenance Mode
8. Mock Bureau for Tests
Phase 3 (Future):
9. Weighted Load Distribution
10. Request Queuing
11. Response Validation
12. Cost Tracking
Phase 4 (Long-term):
13. A/B Testing Framework
14. Intelligent Caching
15. Multi-Region Support
16. GraphQL API
For each enhancement:
Set up monitoring for:
Update documentation when implementing:
Questions or Suggestions?
This is a living document. As bureau integrations evolve, add new enhancement ideas and update implementation status.