Covert the code to pep8 standards in terms of spacing and tabs.
autopep8 -i ObjServiceEtl.py
XML ETL
Fetch Batch
Please note: Batch query not configurable, hard coded sql
Each row in batch processed record at a time
app_client_submission_no important guid
Schemas:
Context
Source: XML of Application Client Submissions
Source-schema:switchx_data
Source-tables:
Output
A set of tables one for each subject area extracted from XML String.
Note:
Scratch Area
-based on older versions of code
-used to asist in refactor
-update as refactored
Goal layout out current implementation of the 'global' sql replacements
Output fields that are common to every output table
Let call it global columns for reference
etl_mapped_flat_list
output_table.output_columnexpected simplified xml path like stringhaspcd, ispcdoutput_tableoutput_columnconfirm if still usedconfirm if still usedtext at xml nodenodes index relative to siblings, can be up to 2 levels, e.g. sibling applicant and sibling incomeexpense node per applicantapp_client_submission_nosorted_etl_mapped_flat_list
target_table, indexing (reverse), target_columnvaluesTemplateDict
xml_template_map
col1,col2,col3) VALUES"Section
Line-start : 536
Line-end : 812
For loop, over an enumerated sorted_etl_mapped_flat_list
Reset:
Line-start : 562
Line-end : 577
reset occurs when the target_table table changes
all the rows for previous target_table has been looped through
this is where we build sql_master_dict
structure of sql_master_dict = {
'insert into table_1 (col1, col2, col3) values ': '(1,2,3), (4,5,6), (7,8,9)',
'insert into table_2 (col1, col2) values ': '(2,3), (5,6)',
etc
}
we get the values for sql_master_dict from valuesTemplateDict
valuesTemplateDict contains the values portion of the sql statement for the previous target_table
set valuesTemplateDict to an empty dictionary ready for the next table
only trigger reset if not 1st iteration
no data in valuesTemplateDict on 1st iteration
#- subsequent reset operations
#-- if valuesTemplateDict exists append to finalSqlInsertTable
#-- not used, delete line finalSqlInsertTable = valuesTemplateDict
#-- not used, delete line finalSqlInsertTable = []
#-- define replacementsGlobal_previous
#-- this holds the collected global columns of previous table
#-- key is the template string to be replaced in valuesTemplateDict
#-- value is the node value detected and assigned in Global Field Block
#-- define replacementsGlobal same as replacementsGlobal_previous, one to be deleted
Store Global Column values:
Line-start : 579
Line-end : 580
Prepare Guids Index:
Line-start : 582
Line-end : 673
The is where the guids are created for the insert statement
Note that each loop process one value (one table cell) at a time
Constructing and using the guids as a dictionary key enforces the proper association of a value to a specific row in the a target table
Determination of the applicant and row_guid depends on the following
-- does the current table require an applicant_guid and/or row_guid
-- are there more than 1 applicants in the current XML string
-- are there more, less or equal applicants vs the lower level children
Four Guids
-- master_guid
--- application_client_submission_no (should have just been called application_client_submission_no)
-- applicant_guid
--- master_guid + an index
--- index indicating the nodes order as an applicant relative to its siblings
-- row_guid
--- master_guid + an index
--- an applicant might be a grand parent of many children that share the same direct parent, i.e. multiple nodes could be in the same context associated with a single applicant
--- this index indicating the nodes order relative to its siblings
-- guid
--- might be redundant, set to master_guid is adequate
--- carry over from previous implementation as its a required field in the output tables
Sql Template Replace Block:
Line-start : 797
Line-end : 812