Common mistakes to avoid when writing code for the Axion platform.
When reviewing SQL or table creation code, flag any hardcoded collation/charset strings. These should always use self.get_collation() or the {collation} placeholder.
Bad:
" ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;"
".replace('latin1_general_ci', ...)" # fragile in-query substitution
Good:
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE={collation};
collation = self.get_collation()
sql = sql.replace("{collation}", collation)  # str.replace returns a new string
Every def_* table MUST include the Module column for system organization.
Bad:
CREATE TABLE def_MyNewTable (
Package VARCHAR(100) NOT NULL,
MyCode VARCHAR(255) NOT NULL,
Description TEXT,
PRIMARY KEY (Package, MyCode)
)
Good:
CREATE TABLE def_MyNewTable (
Package VARCHAR(100) NOT NULL,
Module VARCHAR(255) DEFAULT NULL COMMENT 'Module identifier for system organization',
MyCode VARCHAR(255) NOT NULL,
Description TEXT,
PRIMARY KEY (Package, MyCode)
)
Critical: Column name is capitalized: Module NOT module
The correct method name is escape_sql(), not sql_escape().
Bad:
value = self.sql_escape(user_input) # Wrong method name
Good:
value = self.escape_sql(user_input) # Correct method
Never hardcode package names. Always retrieve dynamically.
Bad:
sql = "SELECT * FROM config WHERE Package = 'HOMECHOICE'"
package = "HOMECHOICE"
Good:
package = self.get_package()
sql = f"SELECT * FROM config WHERE Package = '{package}'"
Silent exception handling masks errors and makes debugging impossible.
Bad:
try:
risky_operation()
except BaseException:
pass # Silent failure - debugging nightmare
Good:
try:
risky_operation()
except BaseException as e:
self.debug(f"Exception in risky_operation: {e}")
Every table should have a primary key for data integrity.
Bad:
CREATE TABLE data_something (
Package VARCHAR(100),
SomeCode VARCHAR(255),
Value TEXT
)
-- Missing PRIMARY KEY!
Good:
CREATE TABLE data_something (
Package VARCHAR(100),
SomeCode VARCHAR(255),
Value TEXT,
PRIMARY KEY (Package, SomeCode)
)
SQL queries should be centralized in YAML for maintainability, not embedded in Python code.
Bad:
class MyClass(ObjData):
def get_records(self):
sql = """
SELECT * FROM my_table
WHERE status = 'active'
AND created_date > DATE_SUB(NOW(), INTERVAL 7 DAY)
"""
return self.sql_get_dictionary_list(sql)
Good:
# In MyClass.yaml:
# database:
# queries:
# get_active_records: |
# SELECT * FROM my_table
# WHERE status = 'active'
# AND created_date > DATE_SUB(NOW(), INTERVAL 7 DAY)
class MyClass(ObjData):
def get_records(self):
sql = self.load_yaml_query("get_active_records")
return self.sql_get_dictionary_list(sql)
Always use the framework's debug method for consistent logging.
Bad:
print("Processing record:", record_id)
logging.debug("Starting process")
Good:
self.debug(f"Processing record: {record_id}")
self.debug("Starting process")
Critical: DO_DEBUG constant must always be present in modules, even if unused.
F-strings are the standard for string formatting in this codebase.
Bad:
message = "User " + username + " logged in at " + str(timestamp)
query = "SELECT * FROM " + table_name + " WHERE id = " + str(user_id)
Good:
message = f"User {username} logged in at {timestamp}"
query = f"SELECT * FROM {table_name} WHERE id = {user_id}"
Every module must have DO_DEBUG constant at the top, even if not used within the module.
Bad:
# Module missing DO_DEBUG constant
import os
import sys
class MyClass:
def __init__(self):
...
Good:
import os
import sys
DO_DEBUG = True # Must be present even if unused in module
class MyClass:
def __init__(self):
...
Why: The constant is read externally at runtime to control debug output.
Use descriptive variable names, not abbreviations.
Bad:
def process_data(dt, usr, pkg):
for rec in dt:
x = rec.get('val')
if x > 0:
usr_nm = usr.get('nm')
Good:
def process_data(data_table, user, package):
for record in data_table:
value = record.get('value')
if value > 0:
user_name = user.get('name')
Class member variables starting with underscore have special naming rules.
Bad:
class MyClass:
def __init__(self):
self._data_import_code = "" # Wrong
self._column_types = [] # Wrong
self._is_active = False # Wrong
Good:
class MyClass:
def __init__(self):
self._Dataimportcode = "" # First letter after _ is uppercase
self._Columntypes = [] # Remove underscores after first _
self._Isactive = False # Same pattern
Pattern: _data_import_code → _Dataimportcode (first letter after _ uppercase, rest lowercase, no underscores)
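A small helper (hypothetical illustration, not a framework method) makes the conversion explicit:
def to_member_name(snake_name: str) -> str:
    # "_data_import_code" -> "_Dataimportcode"
    stripped = snake_name.lstrip('_').replace('_', '')
    return '_' + stripped.capitalize()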
The codebase standard is typer for CLI interfaces.
Bad:
import argparse
parser = argparse.ArgumentParser(description='Process data')
parser.add_argument('--file', type=str, help='Input file')
args = parser.parse_args()
Good:
import typer
app = typer.Typer()
@app.command()
def process(file: str = typer.Option(..., help="Input file")):
# Process data
pass
if __name__ == "__main__":
app()
Do not wrap self.debug() calls in if DO_DEBUG conditions.
Bad:
if DO_DEBUG:
self.debug("Processing started")
self.debug(f"Record count: {count}")
Good:
self.debug("Processing started")
self.debug(f"Record count: {count}")
Why: The self.debug() method already checks DO_DEBUG internally.
Local module imports require sys.path setup when running from project root.
Bad:
import ObjData # Will fail if not in path
from ObjWorkflow import Workflow
Good:
import sys
import os
base_path = os.getcwd()
paths = ["", "/factory.core", "/factory.service"]
for relative_path in paths:
if (base_path + relative_path) not in sys.path:
sys.path.append(base_path + relative_path)
# Now safe to import
import ObjData
from ObjWorkflow import Workflow
Pattern: Always place this block after external imports, before local imports.
YAML files should contain a single document without separators.
Bad:
---
database:
schema:
my_table: |
CREATE TABLE ...
---
queries:
get_data: |
SELECT ...
Good:
database:
schema:
my_table: |
CREATE TABLE ...
queries:
get_data: |
SELECT ...
Why: Single document only, no --- separators.
Always initialize connection types, even for local transfers.
Bad:
# Only calls if connection exists
if source_remote_connection:
self.remote_connect_source(source_remote_connection)
# source_remote_connection_type is never set!
Good:
# ALWAYS call, even if empty string
self.remote_connect_source(source_remote_connection)
self.remote_connect_target(target_remote_connection)
# ALWAYS set connection types
self.source_remote_connection_type = self.get_connection_type(
self.source_connection
)
self.target_remote_connection_type = self.get_connection_type(
self.target_connection
)
Why: Methods like validate_structure() depend on these attributes existing. Conditional initialization causes AttributeError for local transfers.
Critical data folders must never be committed.
Bad:
# Accidentally commits sensitive data
git add .
git commit -m "Update configuration"
Good:
# Before EVERY commit, verify .gitignore includes:
# data.config
# local.documents
# data.documents
# archive.documents
git add <specific files>
git commit -m "Update configuration"
Critical: Check .gitignore before every commit.
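# One way to verify (standard git command):
git check-ignore -v data.config local.documents data.documents archive.documents
# Each folder should print the .gitignore rule that matches it;
# a folder with no output is NOT ignored and could be committed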
Never bypass git hooks unless absolutely necessary.
Bad:
git commit --no-verify -m "Quick fix"
git push --no-verify
Good:
# Fix the issues that hooks are catching
git commit -m "Quick fix"
git push
Why: Hooks exist to prevent problems. Bypassing them usually creates bigger issues.
Always escape user input, even if you trust the source.
Bad:
def get_user(self, user_id):
sql = f"SELECT * FROM users WHERE id = {user_id}" # Injection risk
return self.sql_get_dictionary_list(sql)
def search_users(self, search_term):
# Direct string interpolation - vulnerable!
sql = f"SELECT * FROM users WHERE name LIKE '%{search_term}%'"
return self.sql_get_dictionary_list(sql)
Good:
def get_user(self, user_id):
user_id = self.escape_sql(user_id)
sql = f"SELECT * FROM users WHERE id = '{user_id}'"
return self.sql_get_dictionary_list(sql)
def search_users(self, search_term):
search_term = self.escape_sql(search_term)
sql = f"SELECT * FROM users WHERE name LIKE '%{search_term}%'"
return self.sql_get_dictionary_list(sql)
Note from CLAUDE.md: "Do not automatically fix SQL injection vulnerabilities - document them instead"
Tests must clean up all resources they create.
Bad:
def test_feature():
obj.sql_execute("CREATE TABLE test_table (id INT, name VARCHAR(100))")
obj.sql_execute("INSERT INTO test_table VALUES (1, 'test')")
# Test code
result = obj.sql_get_value("SELECT COUNT(*) FROM test_table")
assert result == 1
# No cleanup - table remains!
Good:
def test_feature():
try:
obj.sql_execute("CREATE TABLE test_table (id INT, name VARCHAR(100))")
obj.sql_execute("INSERT INTO test_table VALUES (1, 'test')")
# Test code
result = obj.sql_get_value("SELECT COUNT(*) FROM test_table")
assert result == 1
finally:
# Always cleanup
obj.sql_execute("DROP TABLE IF EXISTS test_table")
Clean up first, then insert fresh data.
Bad:
# May update with wrong values if key already exists
obj.sql_execute("""
INSERT INTO config (key, value)
VALUES ('setting', 'new_value')
ON DUPLICATE KEY UPDATE value = 'old_value'
""")
Good:
# Clean up first for predictable state
obj.sql_execute("DELETE FROM config WHERE key = 'setting'")
obj.sql_execute("""
INSERT INTO config (key, value)
VALUES ('setting', 'new_value')
""")
Why: ON DUPLICATE KEY UPDATE can execute unexpected UPDATE path with stale values.
Maximum line length is 80 characters per CLAUDE.md.
Bad:
result = self.sql_get_dictionary_list("SELECT customer_id, customer_name, customer_email, customer_phone FROM customers WHERE status = 'active'")
Good:
query = """
SELECT customer_id, customer_name, customer_email, customer_phone
FROM customers
WHERE status = 'active'
"""
result = self.sql_get_dictionary_list(query)
Never remove framework-required attributes even if they seem unused.
Bad:
class MyClass(ObjData):
def __init__(self):
super().__init__()
# Removed _IsA, _Version, _Updatedate to "clean up"
Good:
class MyClass(ObjData):
def __init__(self):
super().__init__()
# Keep these even if unused in this class
self._IsA = "MyClass"
self._Version = "1.0"
self._Updatedate = "2026-02-21"
Why: These attributes are part of the framework contract.
Only document code you actually changed during a task.
Bad:
# During a bug fix, adding docstrings to surrounding functions
def existing_function(self, param):
"""
Process parameter.
Args:
param: The parameter to process
Returns:
Processed result
"""
return self.process(param) # Didn't change this, just added docs
Good:
# Only document code you actually changed
def existing_function(self, param):
return self.process(param) # Leave unchanged code as-is
Why: Per CLAUDE.md - "Don't add docstrings, comments, or type annotations to code you didn't change"
Reserve documentation updates for significant changes only.
Bad:
# Made 1-line bug fix in ObjNotify.py
# Then spent time updating ObjNotify.md with minor details
Good:
# Update .md only for significant changes
# Trivial fixes don't need documentation updates
When to update .md: Significant API changes, new features, architectural changes
Docstrings should explain purpose and logic, not reiterate implementation.
Bad:
def transfer_data(self, source, target):
"""
Transfer data from source to target.
Implementation:
1. Opens connection to source database using pymysql
2. Executes SELECT query with buffer_size=500
3. Iterates through results using cursor.fetchmany()
4. For each batch, escapes values using self.escape_sql
5. Inserts to target using INSERT IGNORE
6. Commits every 100 records
7. Closes connections
Args:
source: Source table name
target: Target table name
"""
Good:
def transfer_data(self, source, target):
"""Transfer data from source to target table."""
Don't register simulation or visualization tools as workflow nodes.
Bad:
@ObjNodeRegistry.register("SIMUL")
class ObjWorkflowSimul(ObjWorkflowNode):
# This is a simulation tool, NOT a workflow node!
Good:
# Don't register simulation/visualization tools
class ObjWorkflowSimul(ObjWorkflowNode):
# No @register decorator
Never register: ObjWorkflowSimul, ObjWorkflowVisual, FLOW node type
Use specific flow types, not legacy FLOW base type.
Bad:
# Using legacy FLOW instead of specific types
node_type = "FLOW" # Base/legacy
Good:
# Use specific flow types
node_type = "FORMFLOW" # or FORMGUI, REPORTFLOW, REPORTGUI
Cache YAML data outside loops to avoid repeated file I/O.
Bad:
for record in records:
yaml_data = self.yaml_read_file(file_path) # Re-reads every iteration!
process(yaml_data, record)
Good:
yaml_data = self.yaml_read_file(file_path) # Read once
for record in records:
process(yaml_data, record)
Create indexed cache for YAML sources to avoid repeated lookups.
Bad:
# In transfer_structure for YAML source
for guid in guid_list:
record = self.yaml_read_file(source_file) # Re-reads entire file!
filtered = [r for r in record if r['guid'] == guid]
Good:
# Cache YAML data and create index
yaml_cache = self.yaml_read_file(source_file)
yaml_index = {str(r['guid']): r for r in yaml_cache}
for guid in guid_list:
record = yaml_index.get(str(guid))  # keys were normalized to str above
Avoid executing queries in loops - use JOINs or batch queries instead.
Bad:
users = self.sql_get_dictionary_list("SELECT id, name FROM users")
for user in users:
# N+1 queries - one per user!
orders = self.sql_get_dictionary_list(
f"SELECT * FROM orders WHERE user_id = {user['id']}"
)
process(user, orders)
Good:
# Single query with JOIN
query = """
SELECT u.id, u.name, o.*
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
"""
data = self.sql_get_dictionary_list(query)
# Group and process
Use the framework's config access pattern, not direct file reading.
Bad:
import yaml
with open('config.yaml') as f:
config = yaml.safe_load(f)
value = config['section']['key']
Good:
# Use inherited get_ini_value from Objects
value = self.get_ini_value('section', 'key')
Why: Uses proper config access pattern with package resolution and placeholders
Leverage config placeholders for dynamic values.
Bad:
database:
host: "mysql.production.example.com"
name: "homechoice_db"
Good:
database:
host: "$terraform$mysql_host"
name: "$package$_db"
Available placeholders: $package$, $dns$, $terraform$, $environment_KEY$
Use constants from ObjConstants instead of hardcoded paths.
Bad:
file_path = "local.documents/uploads/file.pdf"
archive_path = "archive.documents/old/file.pdf"
Good:
from ObjConstants import LOCAL_DOCUMENTS, ARCHIVE_DOCUMENTS
file_path = f"{LOCAL_DOCUMENTS}/uploads/file.pdf"
archive_path = f"{ARCHIVE_DOCUMENTS}/old/file.pdf"
Always use context managers for file operations.
Bad:
def read_data(self):
f = open('data.txt', 'r')
data = f.read()
return data # File never closed!
Good:
def read_data(self):
with open('data.txt', 'r') as f:
data = f.read()
return data # File automatically closed
All service classes are named ObjServiceApi regardless of domain.
Bad:
# Suggesting to rename based on domain
class ObjServiceApi: # In factory.service/package.shopify
# "This should be ObjServiceShopify!" # WRONG
Good:
# Keep standard name
class ObjServiceApi: # Always ObjServiceApi regardless of domain
Why: Per CLAUDE.md - standard naming convention for all services
Classes in factory.text are always named ObjProcessText.
Bad:
# In factory.text/ObjTextEmoji.py
class ObjTextEmoji: # Wrong name
Good:
# In factory.text/ObjTextEmoji.py
class ObjProcessText: # Always ObjProcessText
Why: Module-specific convention for factory.text modules
Delete unused code completely instead of commenting or renaming.
Bad:
# Keeping old code around "just in case"
def new_method(self, param):
result = self.process(param)
return result
# Old method (removed)
# def old_method(self, param):
# return self.process(param)
# Renamed unused parameter
def other_method(self, data, _deprecated_param):
return self.transform(data)
Good:
# Just delete unused code
def new_method(self, param):
result = self.process(param)
return result
def other_method(self, data):
return self.transform(data)
Why: Per CLAUDE.md - "If you are certain that something is unused, you can delete it completely"
Always specify which factory module to compile.
Bad:
# Trying to compile without specifying target
python factory.deploy/ObjCompile.py
Good:
# Specify the factory
python factory.deploy/ObjCompile.py service core
python factory.deploy/ObjCompile.py all
python factory.deploy/ObjCompile.py service report --package core
LXD must be installed via snap, not apt.
Bad:
sudo apt install lxd # Wrong package manager
Good:
sudo snap install lxd
sudo lxd init --auto
sudo usermod -a -G lxd $(whoami)
# Log out and back in for group changes to take effect
Why: Per CLAUDE.md - LXD requires snap installation
Always validate critical transfers to ensure data integrity.
Bad:
count = obj.transfer('MY_TRANSFER')
print(f"Transferred {count} records")
# No validation - assume it worked!
Good:
count = obj.transfer('MY_TRANSFER')
# Validate results
validation = obj.validate_transfer(
track_guid=track_guid,
transfer_code='MY_TRANSFER',
validation_types=['ROW_COUNT', 'CHECKSUM']
)
if not validation['passed']:
raise Exception("Transfer validation failed!")
Always dry-run first to validate connectivity and estimate impact.
Bad:
# Running directly in production without testing
count = obj.transfer_incremental('CRITICAL_TRANSFER')
Good:
# Always dry-run first
dry_result = obj.transfer_dry_run('CRITICAL_TRANSFER')
if not dry_result['source_accessible']:
raise Exception("Source not accessible!")
if dry_result['estimated_records'] > 1000000:
print("Warning: Large transfer, consider batching")
# Then run actual transfer
count = obj.transfer_incremental('CRITICAL_TRANSFER')
Catch specific exceptions to provide better error handling.
Bad:
try:
result = dangerous_operation()
process(result)
save(result)
except Exception: # Too broad - can't tell which of the three calls failed, or why
self.debug("Something failed")
Good:
try:
result = dangerous_operation()
except ConnectionError as e:
self.debug(f"Connection failed: {e}")
return None
except ValueError as e:
self.debug(f"Invalid data: {e}")
return None
try:
process(result)
save(result)
except Exception as e:
self.debug(f"Processing failed: {e}")
Include relevant context in error messages for debugging.
Bad:
except Exception as e:
self.debug(f"Error: {e}") # What was being processed?
Good:
except Exception as e:
self.debug(
f"Error processing record {record_id} "
f"in transfer {transfer_code}: {e}"
)
Only commit when explicitly requested by the user.
Bad:
# After completing a task, automatically committing
# git add .
# git commit -m "Implemented feature"
# User never asked for commit!
Good:
# Only commit when user explicitly asks:
# "commit these changes" or "create a commit"
# Otherwise just report completion
Why: Per CLAUDE.md - "NEVER commit changes unless the user explicitly asks you to"
Always create pull requests to develop, not main.
Bad:
# Creating PR to main
gh pr create --base main
Good:
# Always PR to develop (default branch)
gh pr create --base develop
Why: Per CLAUDE.md - default branch for PRs is develop
Never force push to main branches without explicit user permission.
Bad:
git push --force origin main # DANGEROUS!
Good:
# Warn user and ask for confirmation if they request this
# Only proceed with explicit approval
Why: Per CLAUDE.md - "NEVER run force push to main/master, warn the user if they request it"
Always include audit timestamp fields in tables.
Bad:
CREATE TABLE def_MyTable (
Package VARCHAR(100),
Code VARCHAR(255),
Description TEXT,
PRIMARY KEY (Package, Code)
)
Good:
CREATE TABLE def_MyTable (
Package VARCHAR(100),
Code VARCHAR(255),
Description TEXT,
CreatedDate DATETIME DEFAULT CURRENT_TIMESTAMP,
UpdatedDate DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (Package, Code)
)
Why: Audit trail for when records were created/modified
Definition tables should include an Active flag for soft deletes.
Bad:
CREATE TABLE def_Something (
Package VARCHAR(100),
Code VARCHAR(255),
Description TEXT
)
Good:
CREATE TABLE def_Something (
Package VARCHAR(100),
Code VARCHAR(255),
Description TEXT,
Active CHAR(1) DEFAULT 'Y',
KEY idx_active (Active)
)
Why: Enables soft deletes and filtering active records efficiently
Add indexes to columns used in JOINs and WHERE clauses.
Bad:
CREATE TABLE data_orders (
OrderId VARCHAR(100),
CustomerId VARCHAR(100), -- No index!
ProductId VARCHAR(100), -- No index!
PRIMARY KEY (OrderId)
)
Good:
CREATE TABLE data_orders (
OrderId VARCHAR(100),
CustomerId VARCHAR(100),
ProductId VARCHAR(100),
PRIMARY KEY (OrderId),
KEY idx_customer (CustomerId),
KEY idx_product (ProductId)
)
Why: Dramatically improves JOIN and lookup performance
Use VARCHAR for IDs to support UUIDs and distributed systems.
Bad:
CREATE TABLE data_tracking (
TrackId INT AUTO_INCREMENT, -- Breaks in distributed systems
Package VARCHAR(100),
PRIMARY KEY (TrackId)
)
Good:
CREATE TABLE data_tracking (
Package VARCHAR(100),
TrackGuid VARCHAR(100), -- UUID-compatible
PRIMARY KEY (Package, TrackGuid)
)
Why: VARCHAR GUIDs work better for distributed systems, merging databases
Target tables for transfers must include metadata columns.
Bad:
CREATE TABLE target_data (
guid VARCHAR(100),
name VARCHAR(255),
value INT,
-- Missing transfer metadata!
PRIMARY KEY (guid)
)
Good:
CREATE TABLE target_data (
guid VARCHAR(100),
name VARCHAR(255),
value INT,
TransferSource VARCHAR(255),
TransferDate DATETIME,
PRIMARY KEY (guid)
)
Why: Required for tracking data lineage and enabling REPLACE mode
Use framework's transfer methods instead of manual transfers.
Bad:
def my_custom_transfer(self):
# Manual transfer without tracking
source_data = self.get_source_data()
for record in source_data:
self.insert_target(record)
Good:
def my_custom_transfer(self):
# Use framework's transfer_structure with tracking
return self.transfer_structure(
source_table='my_source',
target_table='my_target',
filter_guid='guid',
filter_query='1=1',
transfer_code='MY_TRANSFER',
enable_tracking=True
)
Why: Automatic progress tracking, validation, and error handling
Make buffer size configurable for different dataset sizes.
Bad:
# Fixed buffer for all transfers
def transfer_data(self):
buffer_size = 500 # Always 500
Good:
# Allow override for large/small datasets
def transfer_data(self, buffer_size: int = 500):
# Configurable per transfer
Why: Large datasets benefit from bigger buffers, small ones from smaller
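Illustrative call sites (values are examples only, assuming the signature above):
obj.transfer_data()                   # default 500 for typical transfers
obj.transfer_data(buffer_size=50)     # small lookup tables
obj.transfer_data(buffer_size=5000)   # large bulk loads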
Don't put notification codes in YAML init - handle them dynamically in code.
Bad:
# Notification codes in YAML init section
init:
notification_codes: |
INSERT INTO def_notify (Package, NotifyCode, Description)
VALUES ('{package}', 'TRANSFER_FAILED', 'Transfer operation failed')
ON DUPLICATE KEY UPDATE Description = VALUES(Description)
Good:
# Handle notifications dynamically in code
self.notify('TRANSFER_FAILED', {
'message': 'Transfer failed!',
'transfer_code': transfer_code,
'error': str(error)
})
Why: Notifications should be dynamic and contextual, not pre-defined in static YAML
Don't log every iteration in large loops.
Bad:
for record in records: # 1 million records
self.debug(f"Processing record: {record}") # 1 million log lines!
Good:
self.debug(f"Processing {len(records)} records...")
for i, record in enumerate(records):
if i % 1000 == 0: # Log every 1000 records
self.debug(f"Processed {i}/{len(records)} records")
Why: Reduces log volume and improves performance
Provide real-time progress for long-running operations.
Bad:
# Long-running transfer with no progress updates
for i in range(1000000):
transfer_record(i)
# User has no idea what's happening
Good:
# Use built-in progress tracking
if track_guid:
self.update_progress(
track_guid=track_guid,
records_processed=i,
total_records=1000000,
phase='WRITING'
)
Why: Users can monitor progress and estimate completion time
The Block field is not used in webhook tables - don't modify it.
Bad:
# Setting Block field in webhook
obj.sql_execute("""
UPDATE def_webhook
SET Block = 'some_value'
WHERE WebhookCode = 'MY_HOOK'
""")
Good:
# Block field is NOT used - don't set it
# Primary keys: (WebhookCode, Package) - no Block
Why: Per MEMORY.md - Block field is not used in webhook tables
Use the proper abstraction layer for webhook persistence.
Bad:
# Direct SQL manipulation of webhook tables
obj.sql_execute("""
INSERT INTO def_webhook (WebhookCode, Package, Url)
VALUES ('MY_HOOK', 'HOMECHOICE', 'https://...')
""")
Good:
# Use ObjHookEdit interface
from ObjHookEdit import ObjHookEdit
hook_edit = ObjHookEdit()
hook_edit.save_webhook({
'webhook_code': 'MY_HOOK',
'url': 'https://...',
...
})
Why: Proper layered architecture - ObjHookEdit handles validation and relationships
Include type hints for better code clarity and IDE support.
Bad:
def process_data(data, config, options):
# What types are these?
...
Good:
from typing import Dict, List, Any
def process_data(
data: List[Dict[str, Any]],
config: Dict[str, str],
options: Dict[str, bool]
) -> int:
...
Why: Improves code readability and enables better IDE autocomplete
Always validate inputs before processing.
Bad:
def transfer(self, transfer_code):
# Assume transfer_code is valid
sql = f"SELECT * FROM def_Transferbulk WHERE Transfercode = '{transfer_code}'"
Good:
def transfer(self, transfer_code: str) -> int:
if not transfer_code or not transfer_code.strip():
raise ValueError("transfer_code cannot be empty")
transfer_code = self.escape_sql(transfer_code)
sql = f"SELECT * FROM def_Transferbulk WHERE Transfercode = '{transfer_code}'"
Why: Fail fast with clear errors rather than cryptic failures later
Always check for None before using return values.
Bad:
result = self.sql_get_value("SELECT COUNT(*) FROM table")
# What if result is None?
total = result + 10 # TypeError if None!
Good:
result = self.sql_get_value("SELECT COUNT(*) FROM table")
total = int(result or 0) + 10
Why: Prevents TypeError from None operations
Mark tests that require external services.
Bad:
# Test requires database but no marker
def test_database_transfer():
# Runs even without database - fails
Good:
import pytest
@pytest.mark.integration
def test_database_transfer():
# Skipped if integration tests disabled
Why: Allows running unit tests without external dependencies
Check service availability before running integration tests.
Bad:
def test_mongo_transfer():
# Assumes MongoDB is running - fails otherwise
mongo_client = MongoClient('localhost', 27017)
...
Good:
def test_mongo_transfer(self, preflight_check):
# Automatically skipped if MongoDB not available
if not preflight_check['mongodb']:
pytest.skip("MongoDB not available")
Why: Tests should skip gracefully when dependencies missing
Generate test data dynamically instead of hardcoding.
Bad:
def test_transfer():
# Hardcoded test data
obj.sql_execute("INSERT INTO test_table VALUES ('ABC123', 'Test', 100)")
Good:
def test_transfer(self):
# Generate test data
test_data = obj.generate_test_data(
count=10,
guid_prefix='TEST'
)
for record in test_data:
obj.insert_record(record)
Why: More flexible, maintainable, and tests different data patterns
Keep controllers thin - delegate to service layer.
Bad:
# In ServeReport.py
@app.post("/generate")
def generate_report(request):
# Business logic here!
data = db.query("SELECT ...")
processed = [transform(row) for row in data]
formatted = format_output(processed)
return formatted
Good:
# In ServeReport.py
@app.post("/generate")
def generate_report(request):
# Delegate to service
report = ObjReport()
return report.generate(request.params)
# In ObjReport.py
class ObjReport:
def generate(self, params):
# Business logic here
...
Why: Separation of concerns - controllers handle routing, services handle logic
Split large classes into focused, single-purpose classes.
Bad:
class ObjMegaClass:
def transfer_data(self): ...
def send_email(self): ...
def generate_report(self): ...
def validate_user(self): ...
def process_payment(self): ...
# 50 more methods...
Good:
# Separate concerns into focused classes
class ObjDataTransfer:
def transfer(self): ...
class ObjEmail:
def send(self): ...
class ObjReport:
def generate(self): ...
Why: Single Responsibility Principle - easier to test and maintain
Follow the factory.* module structure, don't create util directories.
Bad:
# Creating utility files outside factory structure
# utils/data_helper.py
# helpers/transfer_utils.py
Good:
# Use proper factory structure
# factory.core/ObjDataHelper.py
# factory.service/ObjServiceTransfer.py
Why: Consistent organization pattern across the codebase
Validate cron expressions before saving them.
Bad:
UPDATE def_Transferbulk
SET TransferSchedule = '0 0 * * *' -- Not validated!
WHERE Transfercode = 'MY_TRANSFER'
Good:
from ObjDataTransfer import ObjDataTransfer
obj = ObjDataTransfer()
is_valid = obj.validate_cron_schedule('0 0 * * *')
if is_valid:
obj.sql_execute("""
UPDATE def_Transferbulk
SET TransferSchedule = '0 0 * * *'
WHERE Transfercode = 'MY_TRANSFER'
""")
Why: Prevents invalid schedules that will never run
Check schedule before executing scheduled transfers.
Bad:
# Running transfer without schedule check
obj.transfer('SCHEDULED_TRANSFER')
# May run too frequently!
Good:
if obj.should_run_transfer('SCHEDULED_TRANSFER'):
obj.transfer('SCHEDULED_TRANSFER')
else:
self.debug("Transfer not due to run yet")
Why: Respects cron schedule, prevents duplicate runs
Use environment variables for sensitive data.
Bad:
database:
password: "MySecretPassword123" # Plain text in version control!
Good:
database:
password: "$environment_DB_PASSWORD$" # Environment variable
Why: Keeps credentials out of version control
Never hardcode credentials in code.
Bad:
# In code
API_KEY = "sk-1234567890abcdef"
DB_PASSWORD = "password123"
Good:
# From environment or encrypted config
API_KEY = os.getenv('API_KEY')
DB_PASSWORD = self.get_ini_value('database', 'password')
Why: Prevents credential leaks in version control
Use framework's automatic encryption for sensitive fields.
Bad:
CREATE TABLE def_connections (
Password VARCHAR(255) -- Stored plain text!
)
Good:
CREATE TABLE def_connections (
Remotepassword VARCHAR(255) -- Auto-encrypted by ObjData layer
)
Why: Fields in datastore.encryption config are auto-encrypted by framework
Properly handle async execution in workflow nodes.
Bad:
async def execute(self, run_context, ...):
# Async method but sync calls
result = self.process_data() # Blocking!
Good:
async def execute(self, run_context, ...):
# Proper async/await
result = await self.process_data_async()
Why: Mixing sync/async incorrectly can cause blocking
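If only a synchronous implementation exists, one option (a sketch using the standard library, not a framework API) is to offload it to a worker thread so the event loop stays responsive:
import asyncio

class MyNode:
    def process_data(self):
        # existing blocking, synchronous implementation (placeholder)
        return "result"

    async def execute(self, run_context):
        # asyncio.to_thread (Python 3.9+) runs the blocking call in a thread pool
        return await asyncio.to_thread(self.process_data)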
This is one of the most important antipatterns to avoid in async code.
Never use blocking calls in async functions - they freeze the entire application.
Bad - All of these block the event loop:
import time
import requests
async def long_task():
time.sleep(10) # ❌ BLOCKS ENTIRE EVENT LOOP FOR 10 SECONDS!
async def fetch_data():
response = requests.get('https://api.example.com') # ❌ BLOCKING!
async def read_file():
with open('large_file.txt', 'r') as f:
data = f.read() # ❌ BLOCKING I/O!
async def database_query():
result = pymysql_connection.execute(query) # ❌ BLOCKING!
Good - Non-blocking alternatives:
import asyncio
import aiohttp
import aiofiles
async def long_task():
await asyncio.sleep(10) # ✅ Non-blocking sleep
async def fetch_data():
async with aiohttp.ClientSession() as session:
async with session.get('https://api.example.com') as response:
return await response.json() # ✅ Non-blocking HTTP
async def read_file():
async with aiofiles.open('large_file.txt', 'r') as f:
data = await f.read() # ✅ Non-blocking file I/O
async def database_query():
# Use async database drivers
result = await async_connection.execute(query) # ✅ Non-blocking DB
Common Blocking Calls to Avoid:
time.sleep() → use await asyncio.sleep()
requests.get() → use aiohttp
open().read() → use aiofiles
subprocess.run() → use await asyncio.create_subprocess_exec()
Other blocking library calls → use await asyncio.to_thread() or an executor
Impact of Blocking:
# If you have 1000 concurrent requests and one blocks for 10 seconds:
# - All 1000 requests wait 10 seconds
# - Server appears frozen
# - Timeouts cascade
# - User experience destroyed
Detection:
# Enable asyncio debug mode to surface calls that block the event loop
import asyncio
async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.1  # warn when a step blocks the loop > 100ms
    await run_application()
asyncio.run(main(), debug=True)
# Debug mode logs lines like "Executing <Task ...> took 10.003 seconds"
Why Critical: Blocking the event loop defeats the entire purpose of async programming and can bring down your entire application. A single blocking call can freeze thousands of concurrent operations.
Reuse database connections instead of creating new ones repeatedly.
Bad:
for i in range(1000):
conn = pymysql.connect(...) # New connection each time!
cursor = conn.cursor()
cursor.execute("SELECT ...")
conn.close()
Good:
# Framework handles connection pooling automatically
for i in range(1000):
result = self.sql_get_value("SELECT ...") # Reuses pooled connection
Why: Connection creation is expensive; pooling dramatically improves performance
Ensure failures are logged for debugging and monitoring.
Bad:
try:
count = obj.transfer('MY_TRANSFER')
except Exception:
pass # Failure never recorded!
Good:
try:
count = obj.transfer('MY_TRANSFER')
except Exception as e:
# Framework tracks failures in data_track_bulktransfer
self.debug(f"Transfer failed: {e}")
raise # Re-raise to ensure tracking
Why: Silent failures make debugging impossible
Alert on critical errors, don't just log them.
Bad:
if critical_error:
self.debug("Critical error occurred")
# No alert sent!
Good:
if critical_error:
self.debug("Critical error occurred")
self.notify('CRITICAL_ERROR', {
'error': str(critical_error),
'context': 'data_transfer',
'severity': 'HIGH'
})
Why: Critical errors need immediate attention, not buried in logs
Use proper time-series database for metrics, not text files.
Bad:
# Logging performance to text file
with open('perf.log', 'a') as f:
f.write(f"{timestamp},{duration},{count}\n")
Good:
# Use InfluxDB for time-series metrics
self.influx_write('transfer_metrics', {
'duration': duration,
'record_count': count,
'transfer_code': transfer_code
})
Why: InfluxDB provides querying, aggregation, and visualization
RULE: All constants ALWAYS go in ObjConstants, never in individual modules.
Bad:
# In ObjDataTransfer.py
DEFAULT_BUFFER_SIZE = 500
MAX_RETRIES = 3
TIMEOUT_SECONDS = 60
Good:
# In ObjConstants.py
DEFAULT_BUFFER_SIZE = 500
MAX_RETRIES = 3
TIMEOUT_SECONDS = 60
# In ObjDataTransfer.py
from ObjConstants import DEFAULT_BUFFER_SIZE, MAX_RETRIES
Why: Centralized constants are easier to find, update, and maintain
RULE: All type aliases ALWAYS go in ObjTypes, and type names MUST end in Type.
Bad:
# In ObjWorkflow.py
WorkflowConfig = Dict[str, Any] # Wrong location
NodeResult = Dict[str, str] # Wrong location
Response = Dict[str, Any] # Missing 'Type' suffix
Good:
# In ObjTypes.py
WorkflowConfigType = Dict[str, Any] # Correct location + suffix
NodeResultType = Dict[str, str] # Correct location + suffix
ResponseType = Dict[str, Any] # Correct location + suffix
# In ObjWorkflow.py
from ObjTypes import WorkflowConfigType, NodeResultType
Why: Centralized types improve consistency and type checking across modules
RULE: Any string values that can become Enums MUST go in ObjEnum using StrEnum.
Bad:
# In ObjDataTransfer.py - hardcoded strings
def transfer(self, mode):
if mode == "SUPPLEMENT": # Magic string
...
elif mode == "REPLACE": # Magic string
...
# Or using regular Enum in wrong location
class TransferMode(Enum):
SUPPLEMENT = "SUPPLEMENT"
REPLACE = "REPLACE"
Good:
# In ObjEnum.py - centralized StrEnum
from enum import StrEnum
class TransferMode(StrEnum):
SUPPLEMENT = "SUPPLEMENT"
REPLACE = "REPLACE"
class DatabaseType(StrEnum):
MYSQL = "MYSQL"
MSSQL = "MSSQL"
MONGO = "MONGO"
# In ObjDataTransfer.py
from ObjEnum import TransferMode
def transfer(self, mode: TransferMode):
if mode == TransferMode.SUPPLEMENT:
...
elif mode == TransferMode.REPLACE:
...
Why: StrEnum members are strings, so they compare equal to existing string values and drop straight into SQL, while centralizing them in ObjEnum removes magic strings and enables autocomplete and refactoring support.
RULE: Type names MUST end in Type.
Bad:
# In ObjTypes.py
Payload = Dict[str, Any] # Missing suffix
ConfigDict = Dict[str, str] # Wrong suffix
ResponseData = Dict[str, Any] # Wrong suffix
Good:
# In ObjTypes.py
PayloadType = Dict[str, Any] # Correct suffix
ConfigType = Dict[str, str] # Correct suffix
ResponseType = Dict[str, Any] # Correct suffix
Why: Consistent naming convention makes types easy to identify
Constants belong in ObjConstants, not scattered across modules.
Bad:
# In ObjEmail.py
SMTP_TIMEOUT = 30
# In ObjReport.py
MAX_ROWS = 10000
# In ObjApi.py
API_VERSION = "v1"
Good:
# In ObjConstants.py
SMTP_TIMEOUT = 30
MAX_REPORT_ROWS = 10000
API_VERSION = "v1"
# Import where needed
from ObjConstants import SMTP_TIMEOUT, MAX_REPORT_ROWS
Why: Single source of truth for all constants
Don't hardcode absolute paths - use relative paths from project root or config values.
Bad:
file_path = "/home/user/projects/axion/data/file.txt"
config_file = "/var/lib/axion/config.yaml"
Good:
base_path = os.getcwd()
file_path = os.path.join(base_path, "data", "file.txt")
# Or use constants
from ObjConstants import LOCAL_DOCUMENTS
file_path = f"{LOCAL_DOCUMENTS}/file.txt"
Why: Hardcoded paths break on different systems and deployments
Always use with statements for file handles, database connections, locks, etc.
Bad:
f = open('file.txt')
data = f.read()
f.close() # May not execute if exception occurs
lock = threading.Lock()
lock.acquire()
# Critical section
lock.release() # May not execute if exception occurs
Good:
with open('file.txt') as f:
data = f.read() # File automatically closed
with threading.Lock():
# Critical section
# Lock automatically released
Why: Context managers guarantee cleanup even when exceptions occur
Don't silently ignore return values from database operations, API calls, etc.
Bad:
self.sql_execute(query) # Did it succeed?
api_response = self.call_api(endpoint) # Any errors?
Good:
result = self.sql_execute(query)
if not result:
self.debug("Query execution failed")
raise Exception("Database operation failed")
api_response = self.call_api(endpoint)
if api_response.get('status') != 'success':
self.debug(f"API call failed: {api_response.get('error')}")
raise Exception("API operation failed")
Why: Silent failures make debugging impossible
Never use mutable objects (lists, dicts) as default parameter values.
Bad:
def process_items(items=[]): # ❌ Shared across all calls!
items.append('new')
return items
# First call returns ['new']
# Second call returns ['new', 'new'] # BUG!
Good:
def process_items(items=None):
if items is None:
items = []
items.append('new')
return items
# Each call gets fresh list
Why: Mutable defaults persist across function calls, causing unexpected state
Always validate and sanitize user input before using in queries.
Bad:
user_id = request.get('user_id')
sql = f"SELECT * FROM users WHERE id = '{user_id}'" # Injection risk!
Good:
user_id = request.get('user_id')
# Validate format
if not user_id or not user_id.isdigit():
raise ValueError("Invalid user ID format")
# Escape for SQL
user_id = self.escape_sql(user_id)
sql = f"SELECT * FROM users WHERE id = '{user_id}'"
Why: Prevents SQL injection attacks
If a query exists in YAML, don't recreate it in Python - load from YAML.
Bad:
# Query already exists in ObjDataTransfer.yaml
sql = """
SELECT Package, Module, Transfercode
FROM def_Transferbulk
WHERE Active = 'Y'
AND TransferSchedule IS NOT NULL
"""
Good:
# Load from YAML
sql = self.load_yaml_query("get_scheduled_transfers")
Why: Single source of truth, easier maintenance
Be explicit about UTC vs local time - don't mix naive and aware datetimes.
Bad:
now = datetime.now() # Local? UTC? Timezone-aware? Unclear!
timestamp = datetime.strptime(date_string, '%Y-%m-%d') # Naive datetime
Good:
from datetime import datetime, timezone
now_utc = datetime.now(timezone.utc) # Explicit UTC
# Or for local time with timezone
import pytz
now_local = datetime.now(pytz.timezone('America/New_York'))
Why: Prevents timezone-related bugs in distributed systems
Never use eval/exec on untrusted input - major security risk.
Bad:
user_code = request.get('code')
eval(user_code) # ❌ REMOTE CODE EXECUTION VULNERABILITY!
user_expression = request.get('expression')
result = exec(user_expression) # ❌ SECURITY HOLE!
Good:
# Use safe alternatives
import json
import ast
# For JSON data
user_data = json.loads(request.get('data'))
# For Python literals only (safe subset)
user_literal = ast.literal_eval(request.get('literal'))
Why: eval/exec can execute arbitrary code, including malicious commands
Always close connections or use connection pooling properly.
Bad:
def get_data(self):
conn = mysql.connector.connect(**config)
cursor = conn.cursor()
cursor.execute(query)
results = cursor.fetchall()
return results # Connection and cursor never closed!
Good:
def get_data(self):
# Use framework's connection pooling
return self.sql_get_dictionary_list(query)
# Or if manual connection needed:
with mysql.connector.connect(**config) as conn:
with conn.cursor() as cursor:
cursor.execute(query)
results = cursor.fetchall()
return results
Why: Connection leaks exhaust database connection pools
Don't create environment variables for single-use values - use config.yaml.
Bad:
os.environ['MY_TEMP_VALUE'] = 'something'
os.environ['SINGLE_USE_FLAG'] = 'true'
Good:
# config.yaml
mypackage:
temp_value: 'something'
feature_flag: true
# Access via framework
value = self.get_ini_value('mypackage', 'temp_value')
Why: Environment variables should be for deployment config, not runtime state
All public methods should have type hints for clarity and IDE support.
Bad:
def transfer_data(self, code, mode, validate):
"""Transfer data between tables."""
return self._do_transfer(code, mode, validate)
Good:
def transfer_data(
self,
code: str,
mode: str,
validate: bool = True
) -> int:
"""Transfer data between tables."""
return self._do_transfer(code, mode, validate)
Why: Type hints improve code clarity, enable better IDE autocomplete, and catch bugs
Don't use bare except: or overly broad except Exception: in production code.
Bad:
try:
result = critical_operation()
except: # Catches EVERYTHING including KeyboardInterrupt, SystemExit!
pass
try:
result = api_call()
except Exception: # Too broad - what exactly failed?
self.debug("Something failed")
Good:
try:
result = critical_operation()
except (ValueError, KeyError) as e:
self.debug(f"Invalid data: {e}")
raise
except ConnectionError as e:
self.debug(f"Connection failed: {e}")
# Handle connection-specific recovery
except Exception as e:
# Last resort - log and re-raise
self.debug(f"Unexpected error in critical_operation: {e}")
raise
Why: Specific exceptions enable proper error handling and recovery
Process data in batches, not one record at a time.
Bad:
# 10,000 individual INSERT statements!
for record in records:
self.sql_execute(
f"INSERT INTO table (id, name) VALUES ('{record['id']}', '{record['name']}')"
)
Good:
# Batch inserts
batch_size = 1000
for i in range(0, len(records), batch_size):
batch = records[i:i+batch_size]
values = []
for record in batch:
id_val = self.escape_sql(record['id'])
name_val = self.escape_sql(record['name'])
values.append(f"('{id_val}', '{name_val}')")
sql = f"INSERT INTO table (id, name) VALUES {','.join(values)}"
self.sql_execute(sql)
Why: Batch operations are 10-100x faster than individual operations
Keep business logic separate from formatting/presentation code.
Bad:
def get_user_report(self, user_id):
# Mixing data retrieval, calculation, and HTML generation
user = self.get_user(user_id)
orders = self.get_orders(user_id)
total = sum(o['amount'] for o in orders)
# HTML generation mixed in
return f"""
<h1>{user['name']}</h1>
<p>Email: {user['email']}</p>
<p>Total Orders: ${total}</p>
"""
Good:
# Separate business logic
def get_user_summary(self, user_id):
user = self.get_user(user_id)
orders = self.get_orders(user_id)
total = sum(o['amount'] for o in orders)
return {
'user': user,
'total': total,
'order_count': len(orders)
}
# Separate presentation
def format_user_report_html(self, summary):
return f"""
<h1>{summary['user']['name']}</h1>
<p>Email: {summary['user']['email']}</p>
<p>Total Orders: ${summary['total']}</p>
"""
Why: Separation of concerns - easier to test, reuse, and maintain
Complex logic needs explanation comments - don't assume it's self-evident.
Bad:
# Complex retry logic without explanation
for i in range(max_retries):
try:
return self._execute()
except Exception:
time.sleep(2 ** i)
Good:
# Exponential backoff retry: 1s, 2s, 4s, 8s...
# This prevents overwhelming the remote service while allowing
# transient errors (network glitches, temporary locks) to recover
for attempt in range(max_retries):
try:
return self._execute()
except Exception as e:
if attempt == max_retries - 1:
# Final attempt failed - re-raise exception
raise
backoff_seconds = 2 ** attempt
self.debug(
f"Retry {attempt + 1}/{max_retries} "
f"after {backoff_seconds}s delay: {e}"
)
await asyncio.sleep(backoff_seconds)
Why: Complex algorithms need context for future maintainers
Avoid loading entire datasets into memory when you only need a subset.
Bad:
# Loading millions of records into memory
all_records = self.sql_get_dictionary_list("SELECT * FROM huge_table")
for record in all_records[:100]: # Only need first 100!
process(record)
Good:
# Use LIMIT or pagination
records = self.sql_get_dictionary_list(
"SELECT * FROM huge_table LIMIT 100"
)
for record in records:
process(record)
Why: Memory efficiency - don't load data you won't use
Avoid circular imports between modules.
Bad:
# In ObjWorkflow.py
from ObjNode import ObjNode # ObjNode imports ObjWorkflow!
# In ObjNode.py
from ObjWorkflow import Workflow # Circular dependency!
Good:
# Use TYPE_CHECKING for type hints only
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from ObjWorkflow import Workflow
# Or restructure to break the circle
# Move shared code to a third module
Why: Circular imports cause import errors and make code harder to understand
Always sanitize filenames to prevent directory traversal attacks.
Bad:
filename = request.get('filename')
file_path = f"uploads/{filename}" # ../../etc/passwd ?
with open(file_path, 'w') as f:
f.write(data)
Good:
import os
filename = request.get('filename')
# Remove path separators and parent directory references
safe_filename = os.path.basename(filename).replace('..', '')
if not safe_filename:
raise ValueError("Invalid filename")
file_path = f"uploads/{safe_filename}"
Why: Prevents directory traversal and path injection attacks
Don't pass unsanitized user input to shell commands.
Bad:
import subprocess
user_file = request.get('file')
subprocess.call(f"cat {user_file}", shell=True) # Command injection!
Good:
import subprocess
user_file = request.get('file')
# Use list form without shell=True
subprocess.call(['cat', user_file])
# Or better: don't shell out at all
with open(user_file, 'r') as f:
content = f.read()
Why: Prevents command injection attacks
Wrap related operations in transactions to maintain consistency.
Bad:
# Non-atomic operations
self.sql_execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
# System crashes here - money disappears!
self.sql_execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
Good:
# Atomic transaction
self.sql_begin_transaction()
try:
self.sql_execute(
"UPDATE accounts SET balance = balance - 100 WHERE id = 1"
)
self.sql_execute(
"UPDATE accounts SET balance = balance + 100 WHERE id = 2"
)
self.sql_commit()
except Exception as e:
self.sql_rollback()
self.debug(f"Transaction failed: {e}")
raise
Why: Ensures data consistency - all operations succeed or all fail
Never log passwords, API keys, credit cards, or PII.
Bad:
self.debug(f"Connecting with password: {db_password}")
self.debug(f"User credit card: {card_number}")
self.debug(f"API Key: {api_key}")
self.debug(f"SSN: {user_ssn}")
Good:
self.debug("Connecting to database with configured credentials")
self.debug(f"Processing payment for user {user_id}")
self.debug("API authentication successful")
self.debug(f"User profile updated for user {user_id}")
Why: Prevents credential leaks in log files and monitoring systems
Use parameterized queries when supported by the driver.
Bad:
user_id = request.get('user_id')
user_id = self.escape_sql(user_id)
sql = f"SELECT * FROM users WHERE id = '{user_id}'"
Good:
# If using raw connection (not ObjData framework)
user_id = request.get('user_id')
cursor.execute(
"SELECT * FROM users WHERE id = %s",
(user_id,) # Parameterized - automatically escaped
)
Why: Prepared statements prevent SQL injection and can improve performance
Don't recreate functionality that already exists in the framework.
Bad:
# Creating custom GUID generator
import random
import string
def generate_guid():
return ''.join(
random.choices(string.ascii_uppercase + string.digits, k=32)
)
# Custom email validation
def is_valid_email(email):
return '@' in email # Incomplete validation
Good:
# Use framework's GUID generation
guid = self.generate_guid()
# Use framework's validation
is_valid = self.validate_email(email)
Why: Framework features are tested, optimized, and maintained
Stream large files instead of loading entirely into memory.
Bad:
# Loading entire file into memory
file_data = request.files['upload'].read() # 5GB file - OOM!
process_data(file_data)
Good:
# Stream file in chunks
chunk_size = 8192
with request.files['upload'].stream as stream:
while True:
chunk = stream.read(chunk_size)
if not chunk:
break
process_chunk(chunk)
Why: Streaming prevents out-of-memory errors with large files
Stay current with Python best practices.
Bad:
# Deprecated as of Python 3.9
from typing import Dict, List
my_dict: Dict[str, str] = {} # Old style
# Using old string formatting
message = "Hello %s" % name
# collections.abc moved
from collections import Mapping # Deprecated
Good:
# Python 3.12+ style - use built-in generics
my_dict: dict[str, str] = {}
my_list: list[int] = []
# F-strings
message = f"Hello {name}"
# Correct import location
from collections.abc import Mapping
Why: Modern Python is more readable and performant
Access enum members by attribute, not string lookup.
Bad:
from ObjEnum import TransferMode
# String lookup - fragile
mode = TransferMode['SUPPLEMENT']
# Or worse - using string values directly
if mode_string == "SUPPLEMENT": # Magic string
do_supplement()
Good:
from ObjEnum import TransferMode
# Direct attribute access
mode = TransferMode.SUPPLEMENT
# Type-safe comparison
if mode == TransferMode.SUPPLEMENT:
do_supplement()
Why: Type safety, autocomplete, refactoring support
Avoid global mutable variables - they create hidden dependencies.
Bad:
# Global mutable state
current_user = None
active_connections = []
def login(user):
global current_user
current_user = user # Modifies global state!
def add_connection(conn):
active_connections.append(conn) # Hidden side effect!
Good:
# Use class attributes or pass state explicitly
class SessionManager:
def __init__(self):
self.current_user = None
self.active_connections = []
def login(self, user):
self.current_user = user
def add_connection(self, conn):
self.active_connections.append(conn)
Why: Global mutable state makes testing difficult and creates hidden dependencies
Use pathlib for modern path manipulation.
Bad:
import os
# String concatenation for paths
path = os.getcwd() + "/" + "data" + "/" + "file.txt"
# Manual directory creation
if not os.path.exists(directory):
os.makedirs(directory)
# String-based path checking
if path.endswith('.txt'):
process_text()
Good:
from pathlib import Path
# Pathlib operations
path = Path.cwd() / "data" / "file.txt"
# Automatic parent creation
path.parent.mkdir(parents=True, exist_ok=True)
# Path-based checking
if path.suffix == '.txt':
process_text()
Why: Pathlib is more readable, cross-platform, and type-safe
Monitor and clean up resources in long-running services.
Bad:
class DataProcessor:
def __init__(self):
self.cache = {}
def process(self, data_id):
# Cache grows forever - memory leak!
if data_id not in self.cache:
self.cache[data_id] = expensive_load(data_id)
return self.cache[data_id]
Good:
from functools import lru_cache
class DataProcessor:
# LRU cache automatically evicts old entries
@lru_cache(maxsize=1000)
def _load_data(self, data_id):
return expensive_load(data_id)
def process(self, data_id):
return self._load_data(data_id)
# Or use time-based cache with Redis
# Or implement manual cache eviction based on TTL
Why: Unbounded caches cause memory leaks in long-running services
Validate critical configuration at startup, not at runtime.
Bad:
# Configuration errors discovered at runtime
class MyService:
def process_data(self):
# Only now discover config is missing!
db_host = self.get_ini_value('database', 'host')
if not db_host:
raise ValueError("Database host not configured")
Good:
class MyService:
def __init__(self):
# Validate configuration on startup
self.db_host = self.get_ini_value('database', 'host')
if not self.db_host:
raise ValueError(
"FATAL: Database host not configured in config.yaml"
)
self.api_key = self.get_ini_value('api', 'key')
if not self.api_key:
raise ValueError("FATAL: API key not configured")
self.debug("Configuration validated successfully")
def process_data(self):
# Safe to use self.db_host - validated on startup
...
Why: Fail fast at startup rather than discovering config errors in production
Use dataclasses instead of plain classes for data storage.
Bad:
class TransferConfig:
def __init__(self, source, target, mode, guid_field):
self.source = source
self.target = target
self.mode = mode
self.guid_field = guid_field
def __repr__(self):
return f"TransferConfig({self.source}, {self.target})"
def __eq__(self, other):
return (self.source == other.source and
self.target == other.target and
self.mode == other.mode)
Good:
from dataclasses import dataclass
@dataclass
class TransferConfig:
source: str
target: str
mode: str
guid_field: str
# __init__, __repr__, __eq__ auto-generated!
Why: Dataclasses reduce boilerplate and provide useful default methods
Use enumerate when you need both index and value.
Bad:
i = 0
for record in records:
process(i, record)
i += 1
Good:
for i, record in enumerate(records):
process(i, record)
# With custom start index
for i, record in enumerate(records, start=1):
process(i, record)
Why: More Pythonic and less error-prone than manual index tracking
Use perf_counter for accurate timing measurements.
Bad:
import time
start = time.time()
do_work()
duration = time.time() - start # Can jump with system clock changes
Good:
import time
start = time.perf_counter()
do_work()
duration = time.perf_counter() - start # Monotonic, accurate
Why: perf_counter is monotonic and unaffected by system clock adjustments
Use zip to iterate over multiple sequences together.
Bad:
for i in range(len(names)):
print(f"{names[i]}: {ages[i]}")
Good:
for name, age in zip(names, ages):
print(f"{name}: {age}")
# With strict length checking (Python 3.10+)
for name, age in zip(names, ages, strict=True):
print(f"{name}: {age}")
Why: More readable and prevents index errors
Use built-in functions for clarity.
Bad:
# Check if any record is active
has_active = False
for record in records:
if record['active']:
has_active = True
break
# Check if all records are valid
all_valid = True
for record in records:
if not record['valid']:
all_valid = False
break
Good:
# Check if any record is active
has_active = any(record['active'] for record in records)
# Check if all records are valid
all_valid = all(record['valid'] for record in records)
Why: More concise and expresses intent clearly
Avoid wildcard imports - be explicit about what you import.
Bad:
from ObjConstants import * # What did we import? Unclear!
from typing import * # Pollutes namespace
from ObjEnum import * # Which enums are available?
Good:
from ObjConstants import DEFAULT_BUFFER_SIZE, MAX_RETRIES
from typing import Dict, List, Optional
from ObjEnum import TransferMode, DatabaseType
Why: Explicit imports improve code readability and prevent naming conflicts
Use dict.get() with defaults instead of KeyError-prone access.
Bad:
try:
value = config['setting']
except KeyError:
value = 'default'
# Or checking existence first
if 'setting' in config:
value = config['setting']
else:
value = 'default'
Good:
value = config.get('setting', 'default')
Why: More concise and Pythonic
Use defaultdict when you need automatic initialization.
Bad:
# Manual initialization
counts = {}
for item in items:
if item not in counts:
counts[item] = 0
counts[item] += 1
# Or grouping items
groups = {}
for item in items:
key = item['category']
if key not in groups:
groups[key] = []
groups[key].append(item)
Good:
from collections import defaultdict
# Counting
counts = defaultdict(int)
for item in items:
counts[item] += 1 # Automatically initializes to 0
# Grouping
groups = defaultdict(list)
for item in items:
groups[item['category']].append(item) # Auto-initializes to []
Why: Eliminates boilerplate initialization code
Use Counter for counting occurrences.
Bad:
# Manual counting
status_counts = {}
for record in records:
status = record['status']
if status not in status_counts:
status_counts[status] = 0
status_counts[status] += 1
# Finding most common
most_common = max(status_counts.items(), key=lambda x: x[1])
Good:
from collections import Counter
status_counts = Counter(record['status'] for record in records)
most_common = status_counts.most_common(1)[0]
# Get top 3
top_three = status_counts.most_common(3)
Why: Counter is optimized for counting and provides useful methods
Leverage itertools for efficient iteration patterns.
Bad:
# Chunking manually
def chunks(lst, n):
for i in range(0, len(lst), n):
yield lst[i:i + n]
# Flattening manually
flat_list = []
for sublist in nested_list:
for item in sublist:
flat_list.append(item)
# Grouping consecutive items
groups = []
current_group = [items[0]]
for item in items[1:]:
if item == current_group[-1]:
current_group.append(item)
else:
groups.append(current_group)
current_group = [item]
groups.append(current_group)  # Easy to forget this final append
Good:
from itertools import islice, chain, groupby
# Chunking with itertools
def chunks(iterable, n):
it = iter(iterable)
while chunk := list(islice(it, n)):
yield chunk
# Flattening with chain
flat_list = list(chain.from_iterable(nested_list))
# Grouping consecutive items
groups = [list(g) for k, g in groupby(items)]
Why: itertools provides optimized, memory-efficient iteration tools
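On Python 3.12 and later the hand-rolled chunks helper can be dropped entirely, since itertools.batched covers the same pattern (insert_many and the batch size below are illustrative):
from itertools import batched  # Python 3.12+
for batch in batched(records, 500):
    insert_many(list(batch))   # batched yields tuples; convert if a list is needed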
Create context managers easily with contextlib.
Bad:
class DatabaseTransaction:
def __init__(self, db):
self.db = db
def __enter__(self):
self.db.begin_transaction()
return self.db
def __exit__(self, exc_type, exc_val, exc_tb):
if exc_type:
self.db.rollback()
else:
self.db.commit()
Good:
from contextlib import contextmanager
@contextmanager
def database_transaction(db):
db.begin_transaction()
try:
yield db
except Exception:
db.rollback()
raise
else:
db.commit()
# Usage
with database_transaction(db) as conn:
conn.execute(query)
Why: Simpler syntax for creating context managers
Use partial to create specialized function variants.
Bad:
def transfer_with_defaults(transfer_code):
return transfer_data(
transfer_code=transfer_code,
mode='SUPPLEMENT',
validate=True,
buffer_size=500
)
# Have to create wrapper for every variation
def transfer_with_replace(transfer_code):
return transfer_data(
transfer_code=transfer_code,
mode='REPLACE',
validate=True,
buffer_size=500
)
Good:
from functools import partial
# Create specialized functions
transfer_supplement = partial(
transfer_data,
mode='SUPPLEMENT',
validate=True,
buffer_size=500
)
transfer_replace = partial(
transfer_data,
mode='REPLACE',
validate=True,
buffer_size=500
)
# Use them
result = transfer_supplement(transfer_code='MY_TRANSFER')
Why: Reduces code duplication and creates clear specialized functions
Use operator module instead of lambda for simple operations.
Bad:
from functools import reduce
# Using lambda for simple operations
total = reduce(lambda a, b: a + b, numbers)
sorted_records = sorted(records, key=lambda r: r['date'])
max_value = max(items, key=lambda x: x['value'])
Good:
from functools import reduce
from operator import add, itemgetter, attrgetter
# Using operator
total = reduce(add, numbers)
sorted_records = sorted(records, key=itemgetter('date'))
max_value = max(items, key=itemgetter('value'))
# For object attributes
sorted_objs = sorted(objects, key=attrgetter('created_date'))
Why: operator functions are faster and more readable than lambdas
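Both helpers also accept several keys and then return tuples, which gives multi-level sorting for free (the field names below are illustrative):
from operator import itemgetter, attrgetter
# itemgetter with several keys returns a tuple, giving a multi-level sort
sorted_records = sorted(records, key=itemgetter('Package', 'Module'))
# attrgetter works the same way for object attributes
sorted_objs = sorted(objects, key=attrgetter('package', 'created_date'))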
Use bisect for efficient insertion into sorted lists.
Bad:
# Manual insertion into sorted list - O(n) search
def insert_sorted(sorted_list, item):
for i, existing_item in enumerate(sorted_list):
if item < existing_item:
sorted_list.insert(i, item)
return
sorted_list.append(item)
Good:
import bisect
# O(log n) search with bisect
def insert_sorted(sorted_list, item):
bisect.insort(sorted_list, item)
# Or get insertion index without inserting
index = bisect.bisect_left(sorted_list, item)
Why: bisect provides O(log n) binary search for sorted lists
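Since Python 3.10 the bisect functions also accept a key= argument, so a list kept sorted by a record field needs no separate key list. A sketch (sorted_records, new_record and the 'date' field are illustrative):
import bisect
from operator import itemgetter
# Keep records ordered by their 'date' field without a parallel key list
bisect.insort(sorted_records, new_record, key=itemgetter('date'))  # Python 3.10+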
Use heapq for efficient priority queue operations.
Bad:
# Manual priority queue - O(n log n) for every push
class PriorityQueue:
def __init__(self):
self.items = []
def push(self, item, priority):
self.items.append((priority, item))
self.items.sort() # O(n log n) - inefficient!
def pop(self):
return self.items.pop(0)[1]
Good:
import heapq
class PriorityQueue:
def __init__(self):
self.heap = []
def push(self, item, priority):
heapq.heappush(self.heap, (priority, item)) # O(log n)
def pop(self):
return heapq.heappop(self.heap)[1] # O(log n)
def peek(self):
return self.heap[0][1] if self.heap else None
Why: heapq provides O(log n) operations for priority queues
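For one-off top-N selection, heapq.nlargest and heapq.nsmallest avoid sorting the whole collection (items and the 'value' key are illustrative):
import heapq
from operator import itemgetter
# Top/bottom N without sorting the whole list
top_five = heapq.nlargest(5, items, key=itemgetter('value'))
bottom_five = heapq.nsmallest(5, items, key=itemgetter('value'))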
Use walrus operator (:=) for assignment expressions (Python 3.8+).
Bad:
# Duplicate computation
if len(data) > 0:
process(len(data))
# Assignment then check
match = pattern.search(text)
if match:
process(match.group(1))
Good:
# Walrus operator - assign and check
if (n := len(data)) > 0:
process(n)
if match := pattern.search(text):
process(match.group(1))
Why: Reduces code duplication and improves readability
Use structural pattern matching for complex branching (Python 3.10+).
Bad:
if isinstance(node_type, str) and node_type == 'FORMFLOW':
handle_form_flow(node)
elif isinstance(node_type, str) and node_type == 'REPORTFLOW':
handle_report_flow(node)
elif isinstance(node_type, str) and node_type == 'API':
handle_api(node)
else:
handle_default(node)
Good:
match node_type:
case 'FORMFLOW':
handle_form_flow(node)
case 'REPORTFLOW':
handle_report_flow(node)
case 'API':
handle_api(node)
case _:
handle_default(node)
Why: Cleaner syntax for complex branching logic
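match also destructures mappings and supports guards, which keeps more involved branching flat. A hedged sketch - the node dictionary shape and the handle_empty_form_flow / handle_large_form_flow handlers are illustrative, not framework contracts:
match node:
    case {'type': 'FORMFLOW', 'steps': []}:
        handle_empty_form_flow(node)
    case {'type': 'FORMFLOW', 'steps': steps} if len(steps) > 50:
        handle_large_form_flow(node)
    case {'type': 'API', 'method': method}:
        handle_api(node)
    case _:
        handle_default(node)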
Always check for None before string operations.
Bad:
def format_name(name):
return name.upper() # AttributeError if name is None
def get_domain(email):
return email.split('@')[1] # Error if None
Good:
def format_name(name):
if not name:
return ''
return name.upper()
def get_domain(email):
if not email or '@' not in email:
return None
return email.split('@')[1]
Why: Prevents AttributeError on None values
Be aware that os.path.join discards all earlier components when it encounters an absolute path.
Bad:
# If user_path is absolute, base_path is ignored!
user_path = request.get('path')
full_path = os.path.join(base_path, user_path) # Security risk!
Good:
# Sanitize user input first
user_path = request.get('path')
if os.path.isabs(user_path):
raise ValueError("Absolute paths not allowed")
# Use Path for better handling
from pathlib import Path
base = Path(base_path)
full_path = base / user_path.lstrip('/') # Safe joining
Why: Prevents directory traversal attacks
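Stripping a leading slash does not stop '..' segments; a fuller check, sketched here assuming Python 3.9+ for Path.is_relative_to, resolves the joined path and confirms it still sits under the base directory:
from pathlib import Path
base = Path(base_path).resolve()
candidate = (base / user_path).resolve()
if not candidate.is_relative_to(base):  # Python 3.9+; also rejects '../..' escapes
    raise ValueError("Path escapes the base directory")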
Use else clause for code that should run only if no exception occurred.
Bad:
try:
result = risky_operation()
success = True # Confusing - is this part of try?
process_success(result) # Should this be protected?
except Exception as e:
success = False
Good:
try:
result = risky_operation()
except Exception as e:
handle_error(e)
else:
# Only runs if no exception
process_success(result)
Why: Clarifies which code is protected and which runs on success
Use finally for cleanup that must happen regardless of exceptions.
Bad:
lock.acquire()
try:
critical_section()
lock.release() # Might not execute if exception!
except Exception:
lock.release() # Duplicate code
raise
Good:
lock.acquire()
try:
critical_section()
finally:
lock.release() # Always executes, even when an exception is raised
Why: Guarantees cleanup code runs even when exceptions occur
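When the resource supports the context manager protocol, as threading locks do, the with statement expresses the same acquire/release pairing in two lines:
with lock:  # acquires on entry, releases on exit, even when an exception is raised
    critical_section()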
Use generator expressions for memory efficiency.
Bad:
# Creates intermediate list
total = sum([x * 2 for x in huge_list])
# Creates full list then takes 10
first_ten = list([process(x) for x in items])[:10]
Good:
# Generator - no intermediate list
total = sum(x * 2 for x in huge_list)
# Use islice to limit processing
from itertools import islice
first_ten = list(islice((process(x) for x in items), 10))
Why: Generators are memory-efficient for large datasets
Use sets for O(1) membership testing instead of lists.
Bad:
# O(n) membership testing
excluded_codes = ['CODE1', 'CODE2', 'CODE3']
for record in records:
if record['code'] not in excluded_codes: # O(n) each time!
process(record)
Good:
# O(1) membership testing
excluded_codes = {'CODE1', 'CODE2', 'CODE3'}
for record in records:
if record['code'] not in excluded_codes: # O(1)!
process(record)
Why: Sets provide O(1) membership testing vs O(n) for lists
Use set operations for collection comparisons.
Bad:
# Find items in list1 not in list2
unique_items = []
for item in list1:
if item not in list2:
unique_items.append(item)
# Find common items
common = []
for item in list1:
if item in list2:
common.append(item)
Good:
# Set difference
unique_items = set(list1) - set(list2)
# Set intersection
common = set(list1) & set(list2)
# Union, symmetric difference
all_items = set(list1) | set(list2)
exclusive = set(list1) ^ set(list2)
Why: Set operations are concise and performant
Use slots to reduce memory usage in classes with many instances.
Bad:
# Creating millions of instances - high memory usage
class DataPoint:
def __init__(self, x, y, z):
self.x = x
self.y = y
self.z = z
# Each instance has a __dict__
Good:
# Reduced memory footprint
class DataPoint:
__slots__ = ['x', 'y', 'z']
def __init__(self, x, y, z):
self.x = x
self.y = y
self.z = z
# No __dict__, less memory per instance
Why: slots can reduce memory usage by 40-50% for simple classes
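On Python 3.10+ the same saving combines with dataclasses via slots=True, keeping the generated __init__/__repr__/__eq__ without a per-instance __dict__. A sketch:
from dataclasses import dataclass
@dataclass(slots=True)  # Python 3.10+: __slots__ is generated from the fields
class DataPoint:
    x: float
    y: float
    z: float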
Prefer f-strings for readability and performance.
Bad:
message = "User {} logged in at {}".format(username, timestamp)
query = "SELECT * FROM {} WHERE id = {}".format(table, user_id)
Good:
message = f"User {username} logged in at {timestamp}"
query = f"SELECT * FROM {table} WHERE id = {user_id}"
Why: f-strings are faster and more readable
Use string methods instead of regular expressions for simple patterns.
Bad:
import re
# Overkill for simple checks
if re.match(r'^https?://', url):
is_http = True
if re.search(r'@', email):
has_at = True
Good:
# Simple string methods
if url.startswith(('http://', 'https://')):
is_http = True
if '@' in email:
has_at = True
Why: String methods are simpler and faster than regex for basic patterns
Use removeprefix/removesuffix instead of manual slicing (Python 3.9+).
Bad:
# Manual prefix removal
if filename.startswith('temp_'):
filename = filename[5:] # Fragile - magic number
# Manual suffix removal
if url.endswith('.html'):
url = url[:-5] # Fragile - magic number
Good:
# Built-in methods (Python 3.9+)
filename = filename.removeprefix('temp_')
url = url.removesuffix('.html')
Why: More explicit and less error-prone than slicing
Use chained comparisons for readability.
Bad:
if x >= 0 and x <= 100:
in_range = True
if value > min_val and value < max_val:
in_bounds = True
Good:
if 0 <= x <= 100:
in_range = True
if min_val < value < max_val:
in_bounds = True
Why: Chained comparisons are more readable and Pythonic
Use is and is not for None comparisons, not ==.
Bad:
if value == None: # Wrong operator
handle_none()
if value != None:
handle_value()
Good:
if value is None: # Correct
handle_none()
if value is not None:
handle_value()
Why: is checks identity, not equality; None is a singleton
These 145 antipatterns represent common mistakes in the Axion codebase. Key principles:
Database & SQL:
Module VARCHAR(255) column in every def_* table
{collation} placeholder, never hardcoded collations
self.escape_sql(), not self.sql_escape()
Python Code Quality:
self.debug(), not print() or logging
DO_DEBUG constant in all modules
Framework Patterns:
self.get_package() for package retrieval
ObjServiceApi
Performance:
Sets for O(1) membership testing
Generator expressions for large datasets
__slots__ for classes with many instances
Git & Deployment:
Error Handling:
Architectural Rules:
Security:
Resource Management:
Code Quality:
Python Standard Library:
collections (defaultdict, Counter), itertools, functools, operator, bisect, heapq, contextlib
Python Idioms:
enumerate, zip, any/all, dict.get() with defaults
f-strings and chained comparisons
is/is not for None comparisons