Introduction¶
Overview¶
Build the currency dataset
Pipeline to fetch
- Description: exchange rates from 10 banks.
Targeted coverage
Related to PR 185
Methodology¶
Schema¶
Target:

Field | Type |
---|---|
BankID | *(str, int) |
Name | str |
Security | str |

Field | Type |
---|---|
BankID | *(str, int) |
Currency | str |
Name | str |
Buy | float |
Transfer | float |
Sell | float |
Indirect Quote = 1 / Direct Quote
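The relation above can be sketched in Python as follows. The function name `indirect_quote` and the sample rate are illustrative assumptions, not part of the original notes.

```python
def indirect_quote(direct_quote: float) -> float:
    """Return the indirect quote, i.e. the reciprocal of the direct quote."""
    if direct_quote <= 0:
        raise ValueError("direct quote must be positive")
    return 1.0 / direct_quote


# Hypothetical sample rate: 25,000 VND per 1 USD (direct quote for USDVND).
usd_vnd = 25_000.0
vnd_usd = indirect_quote(usd_vnd)  # USD per 1 VND
```

Rejecting non-positive input avoids a division by zero when a bank publishes an empty or zero rate.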
Validate¶
- Check the bank ID
- Check the resource ID
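A minimal sketch of the bank ID check, assuming a bank ID is valid when it matches a member of the `Bank` enumeration described in these notes. The helper name `is_valid_bank_id` is hypothetical. The resource ID check is not sketched because the notes do not define what a resource ID is.

```python
import enum


class Bank(enum.Enum):
    # Only BIDV appears in the source notes; the other members are elided there.
    BIDV = "BIDV"


def is_valid_bank_id(bank_id: str) -> bool:
    """Return True if bank_id matches the value of a Bank member."""
    return bank_id in {bank.value for bank in Bank}
```

For example, `is_valid_bank_id("BIDV")` passes, while an unknown ID such as `"XYZ"` is rejected.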
Handler¶
Flow¶
Foreign Key BankID - BankID
[2] Layout
n Banks -> n functions -> pl.DataFrame(Schema)
[3] Functionals
- Type hints
- Enumeration class for Bank
from typing import Callable
import enum

class Bank(enum.Enum):
    BIDV = "BIDV"
    ...
(a) _get_currency_quote_from_<bank_id> function, where <bank_id> is the bank ID
n banks -> n formats (API | XML | HTML | Excel | ...) -> 1 structured dataset
Argument: at_date (date), the date for which the function fetches quotes
Check:
at_date <= datetime.now() (timezone-aware)
+ zoneinfo -> Asia time zone -> filter
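The date check can be sketched as below. This assumes that "Zone Asia" means `Asia/Ho_Chi_Minh`, which the notes do not state. The function name `is_fetchable` and the optional `now` parameter, added so the check can be tested with a fixed clock, are hypothetical.

```python
from datetime import date, datetime
from typing import Optional
from zoneinfo import ZoneInfo

# Assumption: "Zone Asia" in the notes is taken to mean Asia/Ho_Chi_Minh.
LOCAL_TZ = ZoneInfo("Asia/Ho_Chi_Minh")


def is_fetchable(at_date: date, now: Optional[datetime] = None) -> bool:
    """Return True if at_date is not later than today's date in LOCAL_TZ."""
    current = now if now is not None else datetime.now(tz=LOCAL_TZ)
    # Convert to the local zone before taking the calendar date.
    return at_date <= current.astimezone(LOCAL_TZ).date()
```

Comparing against the local calendar date rather than a naive `datetime.now()` matters around midnight: 18:00 UTC on one day is already the next day in Vietnam.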
What?
a) Request -> fetch and read into memory
b) Convert to a dataframe (pl.DataFrame as the engine)
c) Convert data types and column names
-> return: output dataframe
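b)
import polars as pl

# Map each Bank member to the function that fetches its quotes.
configuration: dict[Bank, Callable] = {
    # entries elided in the original notes
}
frames: list[pl.DataFrame] = []
for bank in Bank:
    for _date in range_date:  # range_date: the dates to fetch
        func = configuration.get(bank)
        if func is None:  # no fetcher registered for this bank
            continue
        data = func(_date)
        # Keep only results that exist and contain rows.
        if data is not None and not data.is_empty():
            frames.append(data)
# Combine all per-bank, per-date frames into one dataset.
CURRENCY: pl.DataFrame = pl.concat(frames) if frames else pl.DataFrame()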
c) Local storage
CURRENCY.write_excel(f"CURRENCY_{datetime.now().strftime('%Y%m%d')}.xlsx")
Specification¶
Website: Bank [ Vietcombank | Vietinbank | BIDV | Agribank ]
Bank | Ticker | URL | Status |
---|---|---|---|
VCB | VCB | ... | OK |
Format and status per bank:

Bank | Format | Status |
---|---|---|
VCB | API | ok |
BIDV | API | ok |
ACB | API | ok |
Techcombank | API | ok |
Sacombank | API | ok |
MBB | API | blocked |
Agribank | HTML | ok |
SHB | HTML | ok |
Vietinbank | | ok |
Note: What is an exchange rate?
USDVND ->
Schema¶
Exchange rate >
Process¶
Dataset: currency_rate
Dimensions:
Date, Currency Code
<foreign currency code>+<VND>
USDVND
EURVND
-> List[Record]
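A possible shape for the pair code and the record list is sketched below. The `Record` fields, the helper name `pair_code`, and the sample date are assumptions; the notes only give the `<foreign currency code>+<VND>` pattern and the two example codes.

```python
from dataclasses import dataclass
from datetime import date

QUOTE_CURRENCY = "VND"


@dataclass(frozen=True)
class Record:
    """One row of currency_rate, keyed by its two dimensions (assumed fields)."""
    at_date: date
    currency_code: str


def pair_code(foreign_code: str) -> str:
    """Build a pair code such as USDVND from a foreign currency code."""
    return f"{foreign_code.strip().upper()}{QUOTE_CURRENCY}"


# Hypothetical sample date; the codes are the two examples from the notes.
records = [Record(date(2024, 1, 2), pair_code(code)) for code in ("USD", "EUR")]
```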
-> Using polars >
Convert to a dataframe
- Rename: currencyCode -> currency_code (camelCase -> snake_case)
- Convert: string -> float
- <1 day>
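The rename and convert steps can be sketched with the standard library alone. The helper names, the sample record, and the assumption that thousands separators are commas are all illustrative. In a polars pipeline, a mapping built with `camel_to_snake` could be passed to `DataFrame.rename`.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase name to snake_case."""
    # Insert "_" at each lower-or-digit to upper boundary, then lowercase.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def to_float(text: str) -> float:
    """Parse a numeric string, dropping commas used as thousands separators."""
    return float(text.replace(",", "").strip())


# Hypothetical raw record with camelCase keys and string values.
raw = {"currencyCode": "USDVND", "buyRate": "25,000.50"}
clean = {camel_to_snake(key): value for key, value in raw.items()}
clean["buy_rate"] = to_float(clean["buy_rate"])
```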
For loop: < RANGE: 2000-7-28> ->
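One way to produce the dates for this loop is a small generator. This sketch assumes that 2000-7-28 is the start of the range and that `<1 day>` is the step size; neither is stated outright in the notes. The name `iter_dates` and the end date are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterator


def iter_dates(start: date, end: date) -> Iterator[date]:
    """Yield every date from start to end inclusive, one day at a time."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


# Start date taken from the notes; the end date is an arbitrary example.
range_date = list(iter_dates(date(2000, 7, 28), date(2000, 8, 1)))
```

In the real pipeline the end date would more likely be today's date in the local time zone.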