Source: factory.core/ObjML.py
Handles Machine Learning operations, including data manipulation,
model training, and evaluation.

| Method | Signature | Description |
|---|---|---|
| preprocess_data | preprocess_data(df: pd.DataFrame, target_column: str, categorical_features: List[str], numeric_features: List[str]) -> Tuple[np.ndarray, np.ndarray, OneHotEncoder] | Preprocesses the data, including one-hot encoding. |
| train_model | train_model(model: ObjMLModel, X_train: np.ndarray, y_train: np.ndarray) | Trains the given machine learning model. |
| predict_with_model | predict_with_model(model: ObjMLModel, X: np.ndarray) -> np.ndarray | Makes predictions using the given machine learning model. |
| evaluate_model | evaluate_model(model: ObjMLModel, X_test: np.ndarray, y_test: np.ndarray) -> Any | Evaluates the given machine learning model. |
| get_data_from_mysql | get_data_from_mysql(query: str) -> Optional[pl.DataFrame] | Reads data from a MySQL database using the given query. |
| calculate_woe_iv | calculate_woe_iv(df: pd.DataFrame, feature: str, target: str) -> Tuple[Dict[str, float], float] | Calculates Weight of Evidence (WOE) and Information Value (IV) for a given feature. |
| calculate_woe_iv_multiclass | calculate_woe_iv_multiclass(df: pd.DataFrame, feature: str, target: str, target_class: str) -> Tuple[Dict[str, float], float] | Calculates Weight of Evidence (WOE) and Information Value (IV) for a given feature. |
| apply_woe_transformation | apply_woe_transformation(df: pd.DataFrame, feature: str, woe_map: Dict[str, float]) -> pd.DataFrame | Applies WOE transformation to a specified feature in the DataFrame. |
| scorecard_scaling | scorecard_scaling(model: LogisticRegression, pdo: int = 20, base_odds: float = 50.0, base_score: int = 500) -> Dict[str, Dict[str, float]] | Calculates scorecard points for each feature based on a trained Logistic Regression model. |
| scorecard_scaling_multiclass | scorecard_scaling_multiclass(models: Dict[str, LogisticRegression], pdo: int = 20, base_odds: float = 50.0, base_score: int = 500) -> Dict[str, Dict[str, float]] | Calculates scorecard points for multi-class classification using a one-vs-rest approach. |
| train_cost_sensitive_classifier | train_cost_sensitive_classifier(X: pd.DataFrame, y: pd.Series, class_weights: Dict[int, float], model_type: str = 'LogisticRegression', test_size: float = 0.3, random_state: int = 42) -> Tuple[Any, Dict[str, Any]] | Trains a cost-sensitive classification model and evaluates its performance. |
| create_table_if_not_exists | create_table_if_not_exists() -> None | Creates the def_ml_models table if it doesn't exist. |
| save_model_to_db | save_model_to_db(model: Any, model_name: str, version: str, feature_names: List[str], training_metrics: Dict, encoder: Any = None, model_type: str = None, created_by: str = None, is_active: bool = False) -> str | Saves a model with its metadata to the database. |
| load_model_from_db | load_model_from_db(model_name: str, version: str = 'latest') -> Tuple[Any, Dict] | Loads a model and its metadata from the database. |
| list_models | list_models(model_name: Optional[str] = None, limit: int = None) -> List[Dict] | Lists saved models from the database. |

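The table names a WOE/IV calculation and a WOE transformation but does not show the formulas. The following is a minimal sketch of one common convention, not the ObjML implementation: WOE per category is taken as ln(share of non-events / share of events), and IV is the sum over categories of (share of non-events - share of events) * WOE. The function name `woe_iv_sketch`, the `eps` smoothing parameter, and the use of plain `(category, label)` pairs instead of a DataFrame are all assumptions made so the example runs with the standard library only.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def woe_iv_sketch(
    rows: Iterable[Tuple[str, int]], eps: float = 0.5
) -> Tuple[Dict[str, float], float]:
    """Hypothetical WOE/IV helper for a categorical feature and a 0/1 target.

    `eps` is added to every count so that a category with no events
    (or no non-events) does not produce log(0). This smoothing is an
    assumption of the sketch, not something the source documents.
    """
    events: Counter = Counter()
    non_events: Counter = Counter()
    for category, label in rows:
        if label == 1:
            events[category] += 1
        else:
            non_events[category] += 1

    categories = sorted(set(events) | set(non_events))
    total_events = sum(events[c] + eps for c in categories)
    total_non_events = sum(non_events[c] + eps for c in categories)

    woe_map: Dict[str, float] = {}
    iv = 0.0
    for c in categories:
        dist_event = (events[c] + eps) / total_events
        dist_non_event = (non_events[c] + eps) / total_non_events
        woe = math.log(dist_non_event / dist_event)
        woe_map[c] = woe
        iv += (dist_non_event - dist_event) * woe
    return woe_map, iv


# Toy data: category "A" has 1 event in 4 rows, "B" has 3 events in 4 rows.
rows: List[Tuple[str, int]] = [
    ("A", 1), ("A", 0), ("A", 0), ("A", 0),
    ("B", 1), ("B", 1), ("B", 1), ("B", 0),
]

# eps=0.0 is safe here only because every category has both outcomes.
woe_map, iv = woe_iv_sketch(rows, eps=0.0)

# Applying the map replaces each category with its WOE value,
# which is the idea behind a WOE transformation.
transformed = [woe_map[category] for category, _ in rows]
```

On this toy data the two categories get WOE values of equal magnitude and opposite sign, because their event and non-event shares are mirror images of each other.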
Create the def_ml_models table if it doesn't exist.
List saved models from database.
Show detailed information about a specific model.
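The model-storage methods in the table (create_table_if_not_exists, save_model_to_db, load_model_from_db) can be pictured with a small sketch. It rests on several assumptions: it uses the standard-library `sqlite3` module as a stand-in for the MySQL database, the column layout of `def_ml_models` is hypothetical, the functions carry a `_sketch` suffix to mark them as illustrations, and "latest" is taken to mean the most recently inserted row. The source does not confirm any of these details.

```python
import json
import pickle
import sqlite3
import uuid
from typing import Any, Dict, List, Tuple

# Hypothetical column list, shared by both SELECT statements below.
_COLUMNS = "model_id, version, feature_names, training_metrics, model_blob"


def create_table_sketch(conn: sqlite3.Connection) -> None:
    """Create the def_ml_models table if it is missing (assumed schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS def_ml_models (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            model_id TEXT NOT NULL UNIQUE,
            model_name TEXT NOT NULL,
            version TEXT NOT NULL,
            feature_names TEXT NOT NULL,
            training_metrics TEXT NOT NULL,
            model_blob BLOB NOT NULL
        )
        """
    )


def save_model_sketch(
    conn: sqlite3.Connection,
    model: Any,
    model_name: str,
    version: str,
    feature_names: List[str],
    training_metrics: Dict[str, Any],
) -> str:
    """Pickle the model, store it with JSON metadata, return a new id."""
    model_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO def_ml_models "
        "(model_id, model_name, version, feature_names, "
        "training_metrics, model_blob) VALUES (?, ?, ?, ?, ?, ?)",
        (
            model_id,
            model_name,
            version,
            json.dumps(feature_names),
            json.dumps(training_metrics),
            pickle.dumps(model),
        ),
    )
    conn.commit()
    return model_id


def load_model_sketch(
    conn: sqlite3.Connection, model_name: str, version: str = "latest"
) -> Tuple[Any, Dict[str, Any]]:
    """Return (model, metadata); 'latest' picks the newest inserted row."""
    if version == "latest":
        query = (
            f"SELECT {_COLUMNS} FROM def_ml_models "
            "WHERE model_name = ? ORDER BY seq DESC LIMIT 1"
        )
        params: Tuple[str, ...] = (model_name,)
    else:
        query = (
            f"SELECT {_COLUMNS} FROM def_ml_models "
            "WHERE model_name = ? AND version = ? ORDER BY seq DESC LIMIT 1"
        )
        params = (model_name, version)
    row = conn.execute(query, params).fetchone()
    if row is None:
        raise KeyError(f"no stored model for {model_name!r} ({version})")
    model_id, found_version, features_json, metrics_json, blob = row
    metadata = {
        "model_id": model_id,
        "model_name": model_name,
        "version": found_version,
        "feature_names": json.loads(features_json),
        "training_metrics": json.loads(metrics_json),
    }
    # Only unpickle blobs that come from a trusted database.
    return pickle.loads(blob), metadata


conn = sqlite3.connect(":memory:")
create_table_sketch(conn)
# Plain dicts stand in for trained estimators in this example.
save_model_sketch(conn, {"coef": [0.4]}, "risk", "1.0", ["age"], {"auc": 0.71})
save_model_sketch(conn, {"coef": [0.5]}, "risk", "1.1", ["age"], {"auc": 0.74})
latest_model, latest_meta = load_model_sketch(conn, "risk")
```

Storing the metadata as JSON text keeps it readable from SQL tools, while the pickled blob stays opaque, so a listing function can return metadata rows without unpickling anything.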