The ObjAlertIncident module provides incident management for the Axion
platform. It supports tracking, investigating, and resolving incidents
while maintaining a complete audit trail and communication history.
Primary incident tracking table.
| Column | Type | Description |
|---|---|---|
| IncidentGuid | VARCHAR(255) | Unique identifier (primary key) |
| Title | VARCHAR(255) | Short incident description |
| Severity | ENUM | CRITICAL, HIGH, MEDIUM, LOW |
| Status | ENUM | CREATED, INVESTIGATING, MONITORING, CLOSED |
| Package | VARCHAR(100) | Affected system package/module |
| Description | TEXT | Detailed incident description |
| AssignedTo | VARCHAR(255) | User/team responsible |
| CreatedBy | VARCHAR(255) | User who created incident |
| CreatedAt | DATETIME | Creation timestamp |
| ClosedBy | VARCHAR(255) | User who closed incident |
| ClosedAt | DATETIME | Closure timestamp |
| ResolutionSummary | TEXT | How incident was resolved |
| RootCause | TEXT | What caused the incident |
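To make the column list concrete, here is a minimal sketch of how one row of this table might be modeled in Python. The `Severity` and `Status` enums mirror the ENUM values listed above. The `Incident` dataclass, its snake_case field names, the UUID-based GUID, and the default status of CREATED are illustrative assumptions, not part of the module.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional
import uuid


class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


class Status(str, Enum):
    CREATED = "CREATED"
    INVESTIGATING = "INVESTIGATING"
    MONITORING = "MONITORING"
    CLOSED = "CLOSED"


@dataclass
class Incident:
    """Hypothetical in-memory mirror of one incident row."""
    title: str
    severity: Severity
    package: str
    status: Status = Status.CREATED  # assumed starting status
    description: Optional[str] = None
    assigned_to: Optional[str] = None
    created_by: Optional[str] = None
    # Assumed GUID scheme: a random UUID rendered as a string.
    incident_guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=datetime.now)
    closed_by: Optional[str] = None
    closed_at: Optional[datetime] = None
    resolution_summary: Optional[str] = None
    root_cause: Optional[str] = None


# Severity("CRITICAL") raises ValueError for values outside the enum.
row = Incident(title="Database Down", severity=Severity("CRITICAL"), package="DATABASE")
```

Using enums here means an invalid severity or status fails at construction time rather than at insert time.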
Links alerts to incidents (many-to-many).
| Column | Type | Description |
|---|---|---|
| IncidentGuid | VARCHAR(255) | Foreign key to data_incident |
| AlertGuid | VARCHAR(255) | Alert identifier |
| LinkedAt | DATETIME | When alert was linked |
| LinkedBy | VARCHAR(255) | User who linked alert |
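The many-to-many relationship can be illustrated with an in-memory SQLite database. This is only a sketch: the link table name `incident_alert_link` and the composite primary key are assumptions, the `data_incident` table is reduced to two columns, and SQLite types stand in for the VARCHAR and DATETIME types listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE data_incident (
    IncidentGuid TEXT PRIMARY KEY,
    Title TEXT NOT NULL
);
CREATE TABLE incident_alert_link (  -- hypothetical table name
    IncidentGuid TEXT NOT NULL REFERENCES data_incident(IncidentGuid),
    AlertGuid TEXT NOT NULL,
    LinkedAt TEXT DEFAULT CURRENT_TIMESTAMP,
    LinkedBy TEXT,
    PRIMARY KEY (IncidentGuid, AlertGuid)  -- assumed: one row per pair
);
""")

conn.executemany(
    "INSERT INTO data_incident (IncidentGuid, Title) VALUES (?, ?)",
    [("inc-A", "Database down"), ("inc-B", "API latency")],
)
# One incident can have many alerts, and one alert can appear on many incidents.
conn.executemany(
    "INSERT INTO incident_alert_link (IncidentGuid, AlertGuid, LinkedBy) VALUES (?, ?, ?)",
    [
        ("inc-A", "alert-123", "engineer1"),
        ("inc-A", "alert-456", "engineer1"),
        ("inc-B", "alert-123", "engineer2"),
    ],
)

alerts_for_a = [
    r[0] for r in conn.execute(
        "SELECT AlertGuid FROM incident_alert_link WHERE IncidentGuid = ? ORDER BY AlertGuid",
        ("inc-A",),
    )
]
incidents_for_alert = [
    r[0] for r in conn.execute(
        "SELECT IncidentGuid FROM incident_alert_link WHERE AlertGuid = ? ORDER BY IncidentGuid",
        ("alert-123",),
    )
]
```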
Investigation notes and documentation.
| Column | Type | Description |
|---|---|---|
| NoteGuid | VARCHAR(255) | Unique identifier |
| IncidentGuid | VARCHAR(255) | Foreign key to data_incident |
| NoteType | ENUM | ROOT_CAUSE, RESOLUTION, INVESTIGATION, MITIGATION, GENERAL |
| Content | TEXT | Note content |
| CreatedBy | VARCHAR(255) | User who created note |
| CreatedAt | DATETIME | Creation timestamp |
Audit trail of all incident events.
| Column | Type | Description |
|---|---|---|
| TimelineGuid | VARCHAR(255) | Unique identifier |
| IncidentGuid | VARCHAR(255) | Foreign key to data_incident |
| EventType | VARCHAR(100) | Type of event |
| Description | TEXT | Event description |
| CreatedBy | VARCHAR(255) | User who triggered event |
| Timestamp | DATETIME | Event timestamp |
Tracks all notifications and communications.
| Column | Type | Description |
|---|---|---|
| CommunicationGuid | VARCHAR(255) | Unique identifier |
| IncidentGuid | VARCHAR(255) | Foreign key to data_incident |
| ShortRef | VARCHAR(20) | Human-readable reference (REF-YYYYMMDD-XXX) |
| Channel | VARCHAR(50) | PAGERDUTY, SLACK, DISCORD, SMS, etc. |
| Direction | ENUM | OUTBOUND, INBOUND |
| Status | VARCHAR(50) | SENT, FAILED, PENDING |
| Message | TEXT | Message content |
| Recipient | TEXT | Notification recipient |
| ExternalId | VARCHAR(255) | External system reference |
| ErrorMessage | TEXT | Error details if failed |
| Timestamp | DATETIME | Send timestamp |
| Metadata | JSON | Additional data |
Creates a new incident record.
Parameters:
- title (str): Short incident description
- severity (str): CRITICAL, HIGH, MEDIUM, or LOW
- package (str): System package affected
- description (str, optional): Detailed description
- assigned_to (str, optional): User/team responsible
- created_by (str, optional): User creating incident
- alert_links (list, optional): Alert GUIDs to link

Returns:
- str: New incident GUID

Example:
from factory.core import ObjAlertIncident
incident = ObjAlertIncident.ObjAlertIncident()
incident_guid = incident.create_incident(
    title="Database Connection Pool Exhausted",
    severity="CRITICAL",
    package="DATABASE",
    description="All connection pool slots occupied",
    assigned_to="dba-team",
    alert_links=["alert-123", "alert-456"]
)
Updates incident status with validation.
Parameters:
- incident_guid (str): Incident identifier
- new_status (str): New status value
- updated_by (str, optional): User making update

Valid Transitions:

Example:
incident.update_incident_status(
    incident_guid=incident_guid,
    new_status="INVESTIGATING",
    updated_by="engineer1"
)
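Status validation of this kind is commonly implemented as a lookup in a transition map. The sketch below shows the general technique only. The map `ASSUMED_TRANSITIONS` and the function `validate_transition` are invented for illustration, and the allowed moves in the map are placeholders rather than the module's actual rules.

```python
# ASSUMPTION: this transition map is illustrative only. The module's real
# rules are not reproduced here and may differ.
ASSUMED_TRANSITIONS = {
    "CREATED": {"INVESTIGATING", "CLOSED"},
    "INVESTIGATING": {"MONITORING", "CLOSED"},
    "MONITORING": {"INVESTIGATING", "CLOSED"},
    "CLOSED": set(),
}


def validate_transition(current: str, new: str) -> None:
    """Raise ValueError if moving from `current` to `new` is not allowed."""
    if current not in ASSUMED_TRANSITIONS:
        raise ValueError(f"Unknown status: {current!r}")
    if new not in ASSUMED_TRANSITIONS:
        raise ValueError(f"Unknown status: {new!r}")
    if new not in ASSUMED_TRANSITIONS[current]:
        raise ValueError(f"Invalid transition: {current} -> {new}")


validate_transition("CREATED", "INVESTIGATING")  # allowed under the assumed map
```

Keeping the rules in a single dictionary makes them easy to audit and to change without touching the validation logic.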
Adds investigation notes to incident.
Parameters:
- incident_guid (str): Incident identifier
- note_type (str): ROOT_CAUSE, RESOLUTION, INVESTIGATION, MITIGATION, GENERAL
- content (str): Note content
- created_by (str, optional): User creating note

Returns:
- str: Note GUID

Example:
incident.add_incident_note(
    incident_guid=incident_guid,
    note_type="ROOT_CAUSE",
    content="Missing database index causing full table scans",
    created_by="dba1"
)
Closes an incident with resolution details.
Parameters:
- incident_guid (str): Incident identifier
- resolution_summary (str, optional): How it was resolved
- root_cause (str, optional): What caused it
- closed_by (str, optional): User closing incident

Example:
incident.close_incident(
    incident_guid=incident_guid,
    resolution_summary="Added missing index and restarted service",
    root_cause="Database index optimization oversight",
    closed_by="engineer1"
)
Sends notifications across configured channels.
Parameters:
- incident_guid (str): Incident identifier
- notify_code (str): Notification configuration code
- message_template (str, optional): Custom message template
- sent_by (str, optional): User sending notification

Returns:
- dict: Delivery statistics {"sent": int, "failed": int}

Features:

Template Placeholders:
- {incident_guid}
- {incident_title}
- {incident_severity}
- {incident_status}
- {incident_package}
- {incident_description}
- {incident_created_at}

Example:
stats = incident.notify_incident(
    incident_guid=incident_guid,
    notify_code="INCIDENT_CRITICAL",
    sent_by="SYSTEM"
)
print(f"Sent: {stats['sent']}, Failed: {stats['failed']}")
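How the placeholders are substituted is not described here. One plausible approach, shown as a sketch, is Python's built-in `str.format_map`. The helper `render_incident_message` is hypothetical, and it assumes the incident is available as a dict keyed by the column names of the incident table.

```python
def render_incident_message(template: str, incident: dict) -> str:
    """Fill the documented {incident_*} placeholders from an incident dict."""
    values = {
        "incident_guid": incident.get("IncidentGuid", ""),
        "incident_title": incident.get("Title", ""),
        "incident_severity": incident.get("Severity", ""),
        "incident_status": incident.get("Status", ""),
        "incident_package": incident.get("Package", ""),
        "incident_description": incident.get("Description", ""),
        "incident_created_at": incident.get("CreatedAt", ""),
    }
    # format_map raises KeyError for any placeholder outside the list above.
    return template.format_map(values)


template = "[{incident_severity}] {incident_title} ({incident_status}) in {incident_package}"
message = render_incident_message(
    template,
    {
        "Title": "Database Connection Pool Exhausted",
        "Severity": "CRITICAL",
        "Status": "INVESTIGATING",
        "Package": "DATABASE",
    },
)
```

With this approach a typo in a custom template surfaces as a `KeyError` instead of being sent out unrendered.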
Manually logs a communication record.
Parameters:
- incident_guid (str): Incident identifier
- channel (str): Communication channel
- direction (str, optional): OUTBOUND or INBOUND
- message (str, optional): Message content
- recipient (str, optional): Notification recipient
- status (str, optional): SENT, FAILED, or PENDING
- external_id (str, optional): External system reference
- metadata (dict, optional): Additional data

Returns:
- tuple: (communication_guid, short_ref)

Example:
comm_guid, short_ref = incident.log_communication(
    incident_guid=incident_guid,
    channel="PHONE",
    direction="OUTBOUND",
    message="Called on-call engineer to escalate",
    recipient="+1-555-0123",
    status="SENT",
    metadata={"duration_seconds": 180}
)
print(f"Logged as: {short_ref}")
Retrieves communication history for incident.
Parameters:
- incident_guid (str): Incident identifier
- channel (str, optional): Filter by channel
- limit (int, optional): Maximum records to return

Returns:
- list: Communication records (newest first)

Example:
# Get all communications
comms = incident.get_incident_communications(incident_guid)
# Filter by channel
slack_comms = incident.get_incident_communications(
    incident_guid,
    channel="SLACK"
)
# Limit to the 10 most recent records
recent = incident.get_incident_communications(
    incident_guid,
    limit=10
)
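The combined effect of `channel`, `limit`, and newest-first ordering can be sketched over plain dicts. The function `select_communications` is hypothetical, and it assumes timestamps are ISO 8601 strings, which sort chronologically as text. The module itself presumably applies these filters in its database query.

```python
from typing import Optional


def select_communications(
    records: list,
    channel: Optional[str] = None,
    limit: Optional[int] = None,
) -> list:
    """Filter by channel, order newest first, then cap the result size."""
    rows = [r for r in records if channel is None or r["Channel"] == channel]
    rows = sorted(rows, key=lambda r: r["Timestamp"], reverse=True)
    return rows if limit is None else rows[:limit]


records = [
    {"Channel": "SLACK", "Timestamp": "2025-12-29T10:00:00", "Message": "a"},
    {"Channel": "PAGERDUTY", "Timestamp": "2025-12-29T10:05:00", "Message": "b"},
    {"Channel": "SLACK", "Timestamp": "2025-12-29T10:10:00", "Message": "c"},
]
slack_only = select_communications(records, channel="SLACK")
latest_two = select_communications(records, limit=2)
```

Note that the limit is applied after filtering and sorting, so `limit=2` returns the two most recent matching records.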
Looks up communication by short reference.
Parameters:
- short_ref (str): Short reference (e.g., REF-20251229-A7K)

Returns:
- dict: Communication details including incident context
- None: If reference not found

Example:
comm = incident.get_communication_by_ref("REF-20251229-A7K")
if comm:
    print(f"Channel: {comm['Channel']}")
    print(f"Incident: {comm['IncidentTitle']}")
    print(f"Severity: {comm['IncidentSeverity']}")
Generates unique short reference for tracking.
Format: REF-YYYYMMDD-XXX
Returns:
- str: Short reference

Example:
ref = incident.generate_short_ref()
# Example output: "REF-20251229-A7K"
Note: Database unique constraint ensures no duplicates. With high
volume (>1000/day), some collision retries may occur.
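The retry behaviour mentioned in the note can be sketched as follows. This is an illustration under stated assumptions: the suffix is taken to be three characters drawn from uppercase letters and digits (consistent with the example `A7K`), and an in-memory set stands in for the database unique constraint. The function `make_short_ref` is hypothetical.

```python
import secrets
import string
from datetime import date

ALPHABET = string.ascii_uppercase + string.digits  # assumed character set


def make_short_ref(today: date, existing: set, max_attempts: int = 10) -> str:
    """Generate REF-YYYYMMDD-XXX, retrying when the candidate is already taken."""
    prefix = f"REF-{today:%Y%m%d}-"
    for _ in range(max_attempts):
        candidate = prefix + "".join(secrets.choice(ALPHABET) for _ in range(3))
        if candidate not in existing:  # stand-in for the unique constraint check
            existing.add(candidate)
            return candidate
    raise RuntimeError("Could not generate a unique short reference")


used = set()
ref = make_short_ref(date(2025, 12, 29), used)
```

Under the assumed alphabet there are 36**3 = 46,656 possible suffixes per day, so once roughly a thousand references exist for a given date, occasional collisions and retries are expected.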
Links an alert to an existing incident.
Parameters:
- incident_guid (str): Incident identifier
- alert_guid (str): Alert identifier
- linked_by (str, optional): User linking alert

Example:
incident.link_alert_to_incident(
    incident_guid=incident_guid,
    alert_guid="alert-789",
    linked_by="engineer1"
)
Merges source incident into target incident.
Parameters:
- source_incident_guid (str): Incident to merge from
- target_incident_guid (str): Incident to merge into
- merged_by (str, optional): User performing merge

Effect:

Example:
incident.merge_incidents(
    source_incident_guid="incident-B",
    target_incident_guid="incident-A",
    merged_by="admin"
)
Generates formatted incident report.
Parameters:
- incident_guid (str): Incident identifier
- format (str, optional): text, markdown, or html (default: text)

Returns:
- str: Formatted report

Report Includes:

Example:
# Text report
report = incident.generate_incident_report(
    incident_guid,
    format="text"
)
print(report)
# HTML report for email
html_report = incident.generate_incident_report(
    incident_guid,
    format="html"
)
Emails incident report to stakeholders.
Parameters:
- incident_guid (str): Incident identifier
- format (str, optional): html or text (default: html)
- additional_recipients (list, optional): Extra email addresses

Example:
incident.email_incident_report(
    incident_guid=incident_guid,
    format="html",
    additional_recipients=[
        "manager@example.com",
        "product-owner@example.com"
    ]
)
Calculates incident metrics and statistics.
Parameters:
- package (str, optional): Filter by package
- start_date (str, optional): Start of date range
- end_date (str, optional): End of date range

Returns:
- dict: Metrics including:
  - total_incidents: Total count
  - by_severity: Breakdown by severity level
  - by_status: Breakdown by status
  - avg_resolution_time_hours: Average time to close
  - incidents_by_day: Daily incident counts

Example:
metrics = incident.get_incident_metrics(
    package="DATABASE",
    start_date="2025-01-01",
    end_date="2025-01-31"
)
print(f"Total: {metrics['total_incidents']}")
print(f"Critical: {metrics['by_severity']['CRITICAL']}")
print(f"Avg resolution: {metrics['avg_resolution_time_hours']}h")
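To show how the returned keys relate to the underlying rows, here is a sketch that computes the same metrics from a list of incident dicts. The function `compute_incident_metrics` is hypothetical. It assumes that `CreatedAt` and `ClosedAt` are `datetime` objects and that only closed incidents count toward the average; the module may compute these differently.

```python
from collections import Counter
from datetime import datetime


def compute_incident_metrics(incidents: list) -> dict:
    """Aggregate incident rows into the documented metric keys."""
    resolution_hours = [
        (row["ClosedAt"] - row["CreatedAt"]).total_seconds() / 3600
        for row in incidents
        if row.get("ClosedAt") is not None
    ]
    return {
        "total_incidents": len(incidents),
        "by_severity": dict(Counter(row["Severity"] for row in incidents)),
        "by_status": dict(Counter(row["Status"] for row in incidents)),
        # Assumption: open incidents are excluded from the average.
        "avg_resolution_time_hours": (
            sum(resolution_hours) / len(resolution_hours) if resolution_hours else None
        ),
        "incidents_by_day": dict(
            Counter(row["CreatedAt"].date().isoformat() for row in incidents)
        ),
    }


sample = [
    {"Severity": "CRITICAL", "Status": "CLOSED",
     "CreatedAt": datetime(2025, 1, 5, 10, 0), "ClosedAt": datetime(2025, 1, 5, 13, 0)},
    {"Severity": "HIGH", "Status": "CLOSED",
     "CreatedAt": datetime(2025, 1, 5, 12, 0), "ClosedAt": datetime(2025, 1, 5, 13, 0)},
    {"Severity": "CRITICAL", "Status": "INVESTIGATING",
     "CreatedAt": datetime(2025, 1, 6, 9, 0), "ClosedAt": None},
]
sample_metrics = compute_incident_metrics(sample)
```

A breakdown built this way omits severities with no incidents, so code that reads a specific key may prefer `.get("CRITICAL", 0)` over direct indexing.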
The module provides a command-line interface (CLI):
Creates a new incident.
python factory.core/ObjAlertIncident.py create-incident \
--title "Database Down" \
--severity CRITICAL \
--package DATABASE \
--description "Primary database not responding" \
--assigned-to dba-team
Lists incidents with optional filtering.
# All incidents
python factory.core/ObjAlertIncident.py list-incidents
# Filter by status
python factory.core/ObjAlertIncident.py list-incidents \
--status INVESTIGATING
# Filter by severity
python factory.core/ObjAlertIncident.py list-incidents \
--severity CRITICAL
Shows detailed incident information.
python factory.core/ObjAlertIncident.py get-incident <incident-guid>
Sends incident notification.
python factory.core/ObjAlertIncident.py notify-incident \
<incident-guid> \
INCIDENT_CRITICAL
Shows communication history.
# All communications
python factory.core/ObjAlertIncident.py list-communications \
<incident-guid>
# Filter by channel
python factory.core/ObjAlertIncident.py list-communications \
<incident-guid> \
--channel PAGERDUTY
Looks up communication by short reference.
python factory.core/ObjAlertIncident.py lookup-ref REF-20251229-A7K
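A subcommand layout like the one above is typically built with `argparse` subparsers. The sketch below reproduces three of the commands using only the flags shown in the examples. How the module actually wires its CLI is not documented here, so the `build_parser` function, the required/optional split, and the `choices` restrictions are assumptions.

```python
import argparse

SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
STATUSES = ["CREATED", "INVESTIGATING", "MONITORING", "CLOSED"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subparser per CLI command."""
    parser = argparse.ArgumentParser(prog="ObjAlertIncident.py")
    sub = parser.add_subparsers(dest="command", required=True)

    create = sub.add_parser("create-incident", help="Create a new incident")
    create.add_argument("--title", required=True)
    create.add_argument("--severity", required=True, choices=SEVERITIES)
    create.add_argument("--package", required=True)
    create.add_argument("--description")
    create.add_argument("--assigned-to")  # exposed as args.assigned_to

    listing = sub.add_parser("list-incidents", help="List incidents")
    listing.add_argument("--status", choices=STATUSES)
    listing.add_argument("--severity", choices=SEVERITIES)

    lookup = sub.add_parser("lookup-ref", help="Look up a communication")
    lookup.add_argument("short_ref")
    return parser


args = build_parser().parse_args(
    ["create-incident", "--title", "Database Down",
     "--severity", "CRITICAL", "--package", "DATABASE"]
)
```

Restricting `--severity` and `--status` with `choices` lets the parser reject invalid values before any database call is made.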
The incident system integrates with ObjNotify for
multi-channel notifications:
When notify_incident() is called, the system:
ObjNotify receives incident context via private attributes:
- _Incidentguid: Current incident GUID
- _Incidentsentby: User sending notification

This enables ObjNotify to:
All notifications automatically include short reference:
CRITICAL INCIDENT ALERT
=======================
Database Connection Pool Exhausted
Status: INVESTIGATING
Package: DATABASE
[Ref: REF-20251229-A7K]
Recipients can reference this code when responding or escalating.
Typical workflow:
Merge incidents when:
from factory.core import ObjAlertIncident
incident = ObjAlertIncident.ObjAlertIncident()
# 1. Create incident
guid = incident.create_incident(
title="API Latency Degraded",
severity="HIGH",
package="WEBAPI",
description="95th percentile >5 seconds",
assigned_to="backend-team"
)
# 2. Send initial alert
incident.notify_incident(guid, "INCIDENT_HIGH_PRIORITY")
# 3. Begin investigation
incident.update_incident_status(guid, "INVESTIGATING")
# 4. Document findings
incident.add_incident_note(
guid,
"INVESTIGATION",
"Slow query on orders table detected"
)
# 5. Identify root cause
incident.add_incident_note(
guid,
"ROOT_CAUSE",
"Missing index on orders.created_at"
)
# 6. Apply fix
incident.add_incident_note(
guid,
"RESOLUTION",
"Added index, latency normalized"
)
# 7. Monitor
incident.update_incident_status(guid, "MONITORING")
# 8. Close
incident.close_incident(
guid,
resolution_summary="Database index added",
root_cause="Missing query optimization"
)
# 9. Send closure notification
incident.notify_incident(guid, "INCIDENT_RESOLVED")
# 10. Generate report
report = incident.generate_incident_report(guid, "html")
incident.email_incident_report(
guid,
additional_recipients=["manager@example.com"]
)
# User reports issue via Slack
# Message includes: [Ref: REF-20251229-A7K]
# Look up the communication
comm = incident.get_communication_by_ref("REF-20251229-A7K")
if comm:
    print(f"Incident: {comm['IncidentGuid']}")
    print(f"Title: {comm['IncidentTitle']}")
    print(f"Severity: {comm['IncidentSeverity']}")
    print(f"Status: {comm['IncidentStatus']}")
    # Get full incident details
    incident_details = incident.get_incident(comm['IncidentGuid'])
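Before the lookup can run, the reference has to be pulled out of the inbound message. A minimal sketch using a regular expression follows. The helper `extract_short_ref` is hypothetical, and the pattern assumes the `[Ref: REF-YYYYMMDD-XXX]` footer shown earlier, with a suffix of three uppercase letters or digits.

```python
import re
from typing import Optional

# Assumed footer shape: "[Ref: REF-YYYYMMDD-XXX]".
REF_PATTERN = re.compile(r"\[Ref:\s*(REF-\d{8}-[A-Z0-9]{3})\]")


def extract_short_ref(message: str) -> Optional[str]:
    """Return the first short reference found in `message`, or None."""
    match = REF_PATTERN.search(message)
    return match.group(1) if match else None


found = extract_short_ref("Still seeing timeouts. [Ref: REF-20251229-A7K]")
missing = extract_short_ref("Still seeing timeouts.")
```

The extracted value can then be passed to `get_communication_by_ref()` as in the example above.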
The ObjAlertIncident module has comprehensive test coverage for incident management functionality.
| Test Suite | Tests | Type | Purpose |
|---|---|---|---|
| test_ObjAlertIncident.py | 48 | Unit | Incident creation, severity tracking, status management, timeline tracking |
| Total | 48 | | |
# Run all ObjAlertIncident tests
pytest resource.test/pytests/factory.core/test_ObjAlertIncident.py -v