Core Components
Frontend — Streamlit UI
Provides an interactive, step-by-step user interface. Users select their company, define filters, and upload files.
It supports two modes:
- Single Processing Mode — processes one company’s inputs for a specific location
- State Consolidation Mode — merges multiple locations/companies for a state into a single sheet, with Document Summaries (e.g. cancelled invoice counts)
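
As a rough illustration of State Consolidation Mode, the sketch below stacks per-location frames into one sheet and derives a small document summary. The column names (`invoice_no`, `status`), the `"Cancelled"` status value, and the helper name `consolidate_state` are assumptions made for this example, not the project's actual schema.

```python
import pandas as pd


def consolidate_state(frames_by_location: dict) -> tuple:
    """Stack per-location frames into one sheet and count cancelled invoices."""
    tagged = [
        df.assign(location=location)  # remember where each row came from
        for location, df in frames_by_location.items()
    ]
    # ignore_index gives the combined sheet a clean 0..n-1 index.
    sheet = pd.concat(tagged, ignore_index=True)
    cancelled = sheet.loc[sheet["status"] == "Cancelled", "invoice_no"].nunique()
    summary = {"total_rows": len(sheet), "cancelled_invoices": int(cancelled)}
    return sheet, summary


sheet, summary = consolidate_state({
    "Pune": pd.DataFrame({"invoice_no": ["P1", "P2"], "status": ["Active", "Cancelled"]}),
    "Nagpur": pd.DataFrame({"invoice_no": ["N1"], "status": ["Cancelled"]}),
})
```

Concatenation only appends rows, so the consolidated sheet always has as many rows as the inputs combined.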
Dynamic SOP Manager (sop_manager.py)
Instead of hardcoded rules, each company (e.g. Britannia, Godrej, Sunpure) has its own SOP configuration file inside the `sops/` directory. The SOP controls:
- Which files are required for upload
- Standard column definitions
- Custom logic rules
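
To make the per-company configuration idea concrete, here is a minimal sketch that assumes each SOP is a JSON file named after the company. The file format, the key `required_files`, and the helpers `load_sop` and `missing_uploads` are hypothetical; the real `sop_manager.py` may be organized differently.

```python
import json
from pathlib import Path
from typing import Iterable, List


def load_sop(company: str, sop_dir: Path) -> dict:
    """Load the SOP configuration for one company (hypothetical JSON layout)."""
    path = sop_dir / f"{company.lower()}.json"
    if not path.exists():
        raise FileNotFoundError(f"No SOP found for {company!r} in {sop_dir}")
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def missing_uploads(sop: dict, uploaded: Iterable[str]) -> List[str]:
    """Return the required file labels that have not been uploaded yet."""
    present = set(uploaded)
    return [name for name in sop.get("required_files", []) if name not in present]
```

With a layout like this, supporting a new company would mean adding one file under `sops/` rather than changing code.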
Data Processing Engine (Pandas)
- Fast, memory-efficient Excel/CSV parsing via `pd.read_excel` and `pd.read_csv`
- Column standardization: maps varying vendor column names to a standard set
- VLOOKUP-style matching using dictionaries and `Series.map()`, which avoids `JOIN` operations that could accidentally duplicate rows
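
The column standardization and lookup steps can be sketched as follows. The alias table, the sample data, and the names `COLUMN_ALIASES`, `standardize_columns`, and `lookup` are illustrative assumptions, not the project's actual mappings.

```python
import pandas as pd

# Hypothetical vendor-specific header variants mapped to one standard name.
COLUMN_ALIASES = {
    "Bill No": "invoice_no",
    "Invoice Number": "invoice_no",
    "Party": "party_name",
    "Customer Name": "party_name",
}


def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace from headers, then rename known aliases."""
    cleaned = df.rename(columns=lambda c: str(c).strip())
    return cleaned.rename(columns=COLUMN_ALIASES)


def lookup(keys: pd.Series, table: pd.DataFrame, key_col: str, value_col: str) -> pd.Series:
    """VLOOKUP-style match: the first hit wins, one result per input key."""
    # Only the lookup table is collapsed to one row per key;
    # the rows being looked up are never deduplicated.
    first_hits = table.drop_duplicates(subset=key_col, keep="first")
    mapping = dict(zip(first_hits[key_col], first_hits[value_col]))
    return keys.map(mapping)  # unmatched keys become NaN, never extra rows


bills = standardize_columns(pd.DataFrame({
    " Bill No ": ["B1", "B2", "B2", "B3"],
    "Party": ["Asha Stores", "Ravi Traders", "Ravi Traders", "New Shop"],
}))
parties = pd.DataFrame({
    "party_name": ["Asha Stores", "Ravi Traders", "Ravi Traders"],
    "gstin": ["GSTIN-A", "GSTIN-R1", "GSTIN-R2"],
})

bills["gstin"] = lookup(bills["party_name"], parties, "party_name", "gstin")
```

On this sample, `bills` keeps its four rows. A left merge against `parties` would return six, because "Ravi Traders" appears twice in the lookup table.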
Unit Standardizer (unit_standardizer.py)
Automatically maps shorthand units in raw data (e.g. `PCS, KG, Ltr`) to standardized full-form outputs required by downstream GST systems.
Design Rules
| Rule | Description |
|---|---|
| No Deduplication | Duplicate rows in the input Bill Details are strictly preserved in the output |
| Row Cardinality Check | Output row count must exactly match the input Bill Details file |
| Lookup Strategy | No table merges — missing lookup values result in empty or UNREGISTERED fields, never extra rows |
| Preservation of Totals | Taxable amounts and taxes are taken directly from inputs — no overriding calculations |
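
A minimal sketch of how the cardinality and lookup rules could be enforced in code. The helper names `fill_unregistered` and `assert_row_cardinality`, the sample columns, and the choice to raise a `ValueError` on mismatch are assumptions for illustration only.

```python
import pandas as pd


def fill_unregistered(values: pd.Series) -> pd.Series:
    """Replace missing lookup results with the UNREGISTERED marker."""
    return values.fillna("UNREGISTERED")


def assert_row_cardinality(bill_details: pd.DataFrame, output: pd.DataFrame) -> None:
    """Fail loudly if the output gained or lost rows relative to the input."""
    if len(output) != len(bill_details):
        raise ValueError(
            f"Row count mismatch: input has {len(bill_details)}, output has {len(output)}"
        )


bill_details = pd.DataFrame({
    "invoice_no": ["B1", "B2", "B2"],  # the duplicate B2 row is kept on purpose
    "party_name": ["Asha Stores", "New Shop", "New Shop"],
    "taxable_amount": [100.0, 250.0, 250.0],
})
gstin_by_party = {"Asha Stores": "GSTIN-A"}  # "New Shop" has no entry

output = bill_details.copy()
# Dictionary lookup instead of a merge: missing parties yield NaN, not extra rows.
output["gstin"] = fill_unregistered(output["party_name"].map(gstin_by_party))
assert_row_cardinality(bill_details, output)
```

Running the check as the last step before export means an accidental deduplication or join would stop processing instead of silently producing a wrong sheet.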