Map dimension account chart¶
Overview¶
This required mapping resources of dimension account chart
Procedures¶
Problems:
On the SSC side, they published datasets included nested of (account label, account class) but it's different between:
(a) On the business category (BASIC, BANK) or related mapping of (NON-FINANCIAL, FINANCIAL)
(b) On the fundamental statement type.
See the example here:
For the targeted website: SSC Information Disclosure Official Website
For the reference term: Go to Term
Goals:
-
Centralized the account mapping for each business-category.
-
Handle the mapping to fulfil (100% coverage) on fundamental resources
There are seperated 2 procedures with different on (a) Required handlers team (b) Domain, which included:
Procedure | Description |
---|---|
Procedure 1 - Export metadata | Get and export the account chart has not been coverage |
Procedure 2 - Map account | Mapping, validate dimension of account chart by domain expert |
Procedure 3 - Sync into lake | Sync dimensiosn into lake and trigger fundamental datasets |
For the overall process, validation dimension will use following SQL to verify resources:
/*
Engine: Bigquery
*/
SELECT *
FROM `$TARGETED_DATASET`.`fact_fundamental_report_statement`
WHERE
1=1
AND account.id IS NULL
for $TARGETED_DATASET
will map related to environment of the lake-prep project (which should be in production
mode)
For targeted strategy:
(a) Transition into following order: NON-FINANCIAL ~> FINANCIAL.
(b) The targeted coverage rate: 85-95% for each section
Detail¶
Procedure 1 - Export non-coverage of account chart¶
Property | Value |
---|---|
In charge | Data team members |
Step 1: Download latest updated of dimension in the account chart
Download latest inno_dim_fundamental_account_chart.xlsx Excel file on master branch
The dimension account chat has following metadata:
Component | Description |
---|---|
business_category_code | Business category code, included [BASIC, BANK] |
statement_type_code | Fundamental statement [BALANCE_SHEET, INCOME_STATEMENT, CASH_FLOW_DIRECT, CASH_FLOW_INDIRECT] |
account_id | Account ID that managed by Innotech |
account_level | Account level of record related to fundamental statement |
account_prefix | Account prefix of related label |
account_label | Account label (Namespace of account in Vietnamese language) |
account_class | Account class (Related to VAS) |
For the relationship:
(a) 1 Account ID can have mapping into multiple account labels
(b) 1 Account label related to 1 account class
Step 2: Go to inno-lake-prep
and transition into application
folder as working directory.
Run the following command in bash then use the compiled SQL statement to fetch datasets required into
Step 3: Go to the console of BigQuery
to load the dataset
Verify the target production and save data into the spreadsheet
Step 4: Mofiy dataset by copy and paste into spreadsheet, which has following metadata
Spreadsheet: [Data] - Fundamental - Snapshot - Non account mapping by date
Shared with: Related expert and data team can read the dataset
The spreadsheet will naming based on YYYMMDD
tab which mean the date of exported data.
Procedure 2 - Export non-coverage of account chart¶
Property | Value |
---|---|
In charge | Domain experts |
Follow up | Data team members |
Step 1: Go to spreadsheet declared: [Data] - Fundamental - Snapshot - Non account mapping by date
Find the missing mapping of nested term (account_id, account_level, account_prefix) then register the number of related (account_label, account_class)
Note: Based on the metadata of financial statement, you can go to SSC information disclosure to see the real datasets has been get.
Step 2: Based on the updated data of dimension then register to that file. This required domain knowledge.
Step 3: Send back the updated for data team to handle the file back
Procedure 3 - Sync with lake and trigger mapping¶
Property | Value |
---|---|
In charge | Data team members |
Step 1: Sync update in basement with buffer table then trigger run into dimension table
This will need to sync into 1 source of inno_dim_fundamental_account_chart.xlsx
Step 2: Sync trigger on lake
Appendix¶
Appendix A: Record of Changes¶
Table: Record of changes
Version | Date | Author | Description of Change |
---|---|---|---|
0.1.0 | 05/29/2024 | Bao Truong | Draft version of operation for mapping dimension account chart |