
SAD

Overview

Inno Data Operation is an internal portal that centralizes operations for the data system.

Table of contents:

SAD - System architecture design

Data operation fits into the overall data flow as follows:

flowchart LR

  %% Component
  oschestration["[1]" Orchestration]
  operation["[2]" Data Operation]
  api["[3]" Internal API endpoints]
  database["[4]" Internal Database]
  lake["[5]" Centralized Lake House]
  client["[6]" Client API]

  %% Flow
  oschestration & operation -- interactive --> api -- storage buffer database --> database -- streaming --> lake -- transform --> client

Logical View

The logical view shows the services in the system and how users reach them.

flowchart LR

  %% Component
  subgraph user[User]
    direction LR
    stakeholder[Stakeholder]
    data[Data Officer]
    subgraph operator[Operator]
      direction LR
      maker[Maker]
      checker[Checker]
    end
  end

  subgraph cloud[Cloud Environment]

    endpoint[Endpoint Website URL]

    subgraph auth[Auth Service]
      auth_idp[Auth Provider]
    end

    subgraph fe[Frontend Service]
      ui
    end

    subgraph be[Backend Service]
      subgraph database_layer[Database Layer]
        api
      end
    end

    subgraph internal_storage[Internal Storage]
      internal_bucket[Internal Bucket]
    end

  end

  %% Flow
  user -- access --> endpoint -- direct--> auth_idp
  auth_idp -- interactive --> ui[UI console] -- input/modify/apply --> api[Frontbase API] & internal_bucket

The targeted users for this platform:

  • Stakeholders, who view activity in the data platform system.

  • Operators (Maker and Checker), the pool of users who can act on data records (input, modify, change, apply).

  • Data officers, who manage the data system through the UI console.

There are 4 related services:

| Code | Services | Description |
|---|---|---|
| SERVICE-01 | UI Service | The web application that users interact with |
| SERVICE-02 | Auth Service | Handles authentication and authorization for the operation system |
| SERVICE-03 | Backend Service | The API that interacts with the database layer |
| SERVICE-04 | Internal Storage | The bucket of internal assets (files, images, ...) |

Physical View

Global view on UI service (SERVICE-01)

Overall

The physical view below shows how the UI service (SERVICE-01) is deployed in the cloud environment.

flowchart LR

  %% Service in Cloud Platform
  subgraph Google Cloud Platform

    cbuild[Cloud Build]
    lb[Load Balancing]
    iap[Cloud IAP]
    secret[Cloud Secret Manager]
    logging[Cloud Logging]
    iam[Cloud IAM]
    artifact[Cloud Artifact]

    subgraph ui[UI service]
      run[Cloud Run]
    end

    subgraph related[Related Component]
      be[Backend Endpoint]
      notification_service[Notification Center]
    end
  end

  %% Service for Code Storage
  subgraph gh[GitHub]
    repository[Repository]
  end

  %% Flow
  lb -- navigate --> iap -- handle authentication --> run
  gh <-- sync/trigger --> cbuild -- deploy --> run
  secret -- control secret --> cbuild
  iam -- control access --> cbuild
  cbuild -- store images --> artifact

  run -- invoke --> be
  run -- yield logs --> logging
  run -- notification --> notification_service
Component table:

| Code | Resource | Identifier | Description |
|---|---|---|---|
| COMPONENT-01 | GitHub Repository | inno-data-operation | Contains the source code of the UI project |
| COMPONENT-02 | GitHub Repository | inno-infra | Contains the configuration for infra components |
| COMPONENT-03 | Cloud Run | $ENV-inno-data-operation | Serverless deployment for the application service |
| COMPONENT-04 | CloudBuild | Private in asia-southeast1 | CICD platform |
| COMPONENT-05 | Load Balancing | lb-* | Load balancer routing service |
| COMPONENT-06 | Cloud IAP | iap-* | IAP controlling internal authentication |
| COMPONENT-07 | Cloud Secret Manager | related to Cloud Project | Controls secrets for the service |
| COMPONENT-08 | Cloud IAM | related to Cloud Project | Access management in the cloud |
| COMPONENT-09 | Cloud Artifact | related to Cloud Project | Stores image artifacts related to the project |
| COMPONENT-10 | Cloud Logging | related to Cloud Project | Stores build logs and service logs |

Note:

(*) $ENV is the environment prefix, for example prod or staging.
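As a minimal sketch of the naming convention, the helper below derives a Cloud Run service name from an environment. The prefixes `prod` and `staging` are taken from the Cloud Run names used in this document (`prod-inno-data-operation`, `staging-inno-data-operation`); the function name and the constants are hypothetical and only illustrate the pattern.

```python
# Base name shared by the Cloud Run services in this document.
CLOUD_RUN_BASE_NAME = "inno-data-operation"

# Assumed mapping from environment to the $ENV prefix.
ENV_PREFIXES = {
    "production": "prod",
    "staging": "staging",
}


def cloud_run_service_name(environment: str) -> str:
    """Return the `$ENV-inno-data-operation` name for an environment."""
    try:
        prefix = ENV_PREFIXES[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None
    return f"{prefix}-{CLOUD_RUN_BASE_NAME}"
```

Raising on an unknown environment, rather than guessing a prefix, keeps a typo from silently targeting a service that does not exist.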

Global Configurations
| No | Term | Description | Identity |
|---|---|---|---|
| 1 | PROJECT_ID | Project ID of Google Cloud Platform | Data Project ID |
| 2 | REGION | Project region of Google Cloud Platform | Data Project region |
| 3 | SERVICE_NAME | Service name of project | data-operation |
| 4 | Timezone | Timezone configuration for overall system | Asia/Ho_Chi_Minh |
Deployment Environment

Deployment is separated into 3 environments: Local Development, Staging, Production

Deployment is triggered by changes (pushes) to the GitHub repositories, which come from developer activity

flowchart LR

  %% Component
  developers[Developers]
  gh[GitHub]
  cicd[CICD Platform]
  resource[Targeted Resources]

  %% Flow
  developers <-- trigger --> gh <-- sync --> cicd <-- deploy --> resource

Details of each environment:

Staging:

Property Value
Deployment Strategy CICD
Deployment Triggers On push into master branch
Targeted resource Google Cloud Platform
Endpoint https://internal.staging-portal.data.innotech.vn

Production:

Property Value
Deployment Strategy CICD
Deployment Triggers On push into production branch
Targeted resource Google Cloud Platform
Endpoint https://internal.portal.data.innotech.vn
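The branch-to-environment rule in the two tables above can be sketched as a small lookup. The branch names and endpoints come from the tables; the `Environment` class and `environment_for_branch` function are hypothetical names used for illustration, not part of the actual CICD configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    name: str
    branch: str    # branch whose pushes trigger a deployment
    endpoint: str  # public endpoint of the deployed portal


ENVIRONMENTS = (
    Environment("staging", "master",
                "https://internal.staging-portal.data.innotech.vn"),
    Environment("production", "production",
                "https://internal.portal.data.innotech.vn"),
)


def environment_for_branch(branch: str) -> Optional[Environment]:
    """Return the environment deployed by a push to `branch`, if any."""
    for env in ENVIRONMENTS:
        if env.branch == branch:
            return env
    return None  # pushes to other branches do not deploy
```

A push to any other branch, such as a feature branch, resolves to `None`, which matches the tables: only `master` and `production` trigger deployments.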
Detail on component
1 | UI Service - Repository inno-data-operation
Property Value
ID COMPONENT-01
Resource Repository
Identifier Repository: Inno-Data-Operation
Role Contains the frontend and backend code related to the UI service
2 | UI Service - Repository inno-infra
Property Value
ID COMPONENT-02
Resource Repository
Identifier Repository: Inno-Infra
Role Contains the code for infrastructure and cloud permissions
3 | UI Service - Cloud Run
Property Value
ID COMPONENT-03
Resource Cloud Run
Identifier Based on environment
Role Serverless deployment of service

Each environment has its own deployment instructions and resource configuration for Cloud Run

Environment: Production:

Configuration Value
Environment Production
Project Reference to Data project ID
Region Reference to Data project region
Name prod-inno-data-operation
Description [Production] Data Operation - Internal Data Operation
Platform managed
allow-unauthenticated No
no-cpu-throttling True
image $LOCATION-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPOSITORY_NAME/$_IMAGE_NAME:latest
port 3000
service-account $_SERVICE_ACCOUNT_NAME@$PROJECT_ID.iam.gserviceaccount.com (Declared in permission part)
cpu 1
memory 2Gi
Min instance 1
Max instance 1
ingress all (Required check)
labels on_commit=$SHORT_SHA,
tag latest

Environment: Staging:

Configuration Value
Environment Staging
Project Reference to Data project ID
Region Reference to Data project region
Name staging-inno-data-operation
Description [Staging] Data Operation - Internal Data Operation
Platform managed
allow-unauthenticated No
no-cpu-throttling True
image $LOCATION-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPOSITORY_NAME/$_IMAGE_NAME:latest
port 3000
service-account $_SERVICE_ACCOUNT_NAME@$PROJECT_ID.iam.gserviceaccount.com (Declared in permission part)
cpu 1
memory 1Gi
Min instance null
Max instance 1
ingress all (Required check)
labels on_commit=$SHORT_SHA,
tag latest
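To make the two configuration tables concrete, the sketch below assembles a `gcloud run deploy` argument list from them. The mapping from table rows to flags is an assumption on my part (the document does not show the actual build step), and `build_deploy_args` is a hypothetical helper. Only the values that differ between the tables (memory, min instances) are parameters; the rest are fixed as listed above.

```python
from typing import List, Optional


def build_deploy_args(
    service: str,
    image: str,
    service_account: str,
    project: str,
    region: str,
    memory: str,
    min_instances: Optional[int],
    commit_sha: str,
) -> List[str]:
    """Build an argument list for `gcloud run deploy` (illustrative only)."""
    args = [
        "gcloud", "run", "deploy", service,
        "--project", project,
        "--region", region,
        "--platform", "managed",
        "--image", image,
        "--port", "3000",
        "--service-account", service_account,
        "--cpu", "1",
        "--memory", memory,
        "--max-instances", "1",
        "--ingress", "all",
        "--no-allow-unauthenticated",  # allow-unauthenticated: No
        "--no-cpu-throttling",         # no-cpu-throttling: True
        "--labels", f"on_commit={commit_sha}",
        "--tag", "latest",
    ]
    # Staging lists "Min instance null": leave the flag out in that case.
    if min_instances is not None:
        args += ["--min-instances", str(min_instances)]
    return args
```

With these assumptions, production would pass `memory="2Gi", min_instances=1` and staging `memory="1Gi", min_instances=None`. Returning a list rather than a shell string avoids quoting issues if it is later handed to `subprocess.run`.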
4 | UI Service - CloudBuild
Property Value
ID COMPONENT-04
Resource Cloud Build
Identifier Data project service
Role CICD platform
5 | UI Service - Load Balancing
Property Value
ID COMPONENT-05
Resource Load Balancing
Identifier [UPDATE LATER]
Role Load balancer to service

Owned by: Network Admin

Deployed and managed by: Terraform in inno-infra

For deployment guidance, follow the load balancing section of the infra deployment instructions

6 | UI Service - IAP
Property Value
ID COMPONENT-06
Resource Cloud IAP
Identifier Data project service
Role Controls the IAP access context based on the permissions of user entities

Owned by: Network Admin

Deployed and managed by: Terraform in inno-infra

7 | UI Service - Secret Manager
Property Value
ID COMPONENT-07
Resource Cloud Secret Manager
Identifier Data project service
Role Store secret related to project

Owned by: Maintainer of project

8 | UI Service - IAM
Property Value
ID COMPONENT-08
Resource Cloud IAM
Identifier Data project service
Role Access management in the cloud

Owned by: Infra Admin

9 | UI Service - Cloud Artifact
Property Value
ID COMPONENT-09
Resource Cloud Artifact
Identifier Data project service
Role Store image artifact of service
10 | UI Service - Cloud Logging
Property Value
ID COMPONENT-10
Resource Cloud Logging
Identifier Data project service
Role Service logs, build execution logs
Permission

For GitHub

| Runner | Permission | Reason |
|---|---|---|
| principal:innodatarunner@innotech.vn | Repository admin of repo inno-data-operation | Sync, config repository to cloud |
| principal:hung.doan@innotech.vn | Write to repo inno-data-operation | Control frontend service |
| principal:tho.nguyen@innotech.vn | Write to repo inno-data-operation | Develop frontend service |
| principal:bao.truong@innotech.vn | Repository admin of repo inno-data-operation | Maintainer of project |
| principal:dat.phan@innotech.vn | Repository admin of repo inno-data-operation | Maintainer of project |
| principal:tien.luong@innotech.vn | Repository admin of repo inno-data-operation | Network Admin |

For Google Cloud Platform:

Service accounts within the project:

| Service Account | Permission | Description |
|---|---|---|
| sa-muerta | SA Muerta | Muerta - Blasts fearsome trickshots and unleashes ethereal ruin |
| sa-visage | SA Visage | Visage - Scouts and attacks with his familiars |

sa-muerta: Project builder:

Service account: sa-muerta@$PROJECT_ID.iam.gserviceaccount.com

Alias: sa-muerta

| Permissions | Identifiers | Purpose |
|---|---|---|
| roles/cloudbuild.builds.builder | Cloud Build | Cloud Build builder |
| roles/iam.serviceAccountUser | IAM | Impersonate a service account on a targeted service |
| roles/secretmanager.secretAccessor | Secret | Access secrets |
| roles/storage.admin | GCS::inno-internal-cloudbuild | Write logs into the bucket |
| roles/run.developer | Cloud Run | Deploy Run service |
| PROJECT::roles/run.services.setIamPolicy | Cloud Run | Set IAM for Run service |

sa-visage: Runner of the service:

Service account: sa-visage@$PROJECT_ID.iam.gserviceaccount.com

Alias: sa-visage

| Permissions | Identifiers | Purpose |
|---|---|---|
| roles/run.invoker | Cloud Run | Invoke Run services |
| roles/secretmanager.secretAccessor | Secret | Access secrets |
| roles/monitoring.metricWriter | Cloud Monitoring | Write monitoring data to a metrics scope |
| roles/logging.logWriter | Cloud Logging | Write log entries |
| roles/errorreporting.writer | Cloud Error Reporting | Write errors into centralized error reporting |
Securities
Authentication

In the first stage, all users authenticate through Cloud IAP behind the load balancer

UI component with site-map

Design site-map and reference element for internal operation

Table of routes:

Component Sub-component Description
Homepage
Introduction Introduction user portal
Dataset
Introduction /home Datasets page
Dynamic /{dataset} Single view for dataset with view/update/delete
Pipelines
Introduction Internal pipelines
Dynamic /{pipeline} Single view for operation workload on datasets
Documentation
Introduction Reference to internal docs internal.docs.data.innotech.vn
Internal
Introduction Introduction on status dashboard
Status /status Infrastructure, Workload, Pipelines status
Incident /incident Incident component
Admin
Introduction Admin control privileges
Internal Role Setting roles
Version control Introduction about the company The philosophy that we follow

Template Design:

[1] The template related to the datasets and pipelines

Template console for datasets and pipelines

[2] The template on error block

Template console for errors

Site-map design:

For the syntax:

Component Dynamic path
Dataset /datasets/{dataset-id}
Pipeline /pipelines/{pipeline-id}
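A minimal sketch of how these dynamic paths could be built and parsed is shown below. The two templates come from the table above; the function names, the allowed identifier characters, and the treatment of `home` as a reserved segment (the site-map uses `/datasets/home` and `/pipelines/home` as landing pages) are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

ROUTE_TEMPLATES = {
    "dataset": "/datasets/{dataset-id}",
    "pipeline": "/pipelines/{pipeline-id}",
}

_PLACEHOLDER = re.compile(r"\{[^}]+\}")
_IDENTIFIER = r"([A-Za-z0-9][A-Za-z0-9-]*)"  # assumed identifier format
_RESERVED = {"home"}  # landing pages, not dataset or pipeline ids


def build_path(component: str, identifier: str) -> str:
    """Fill the template placeholder with a concrete identifier."""
    template = ROUTE_TEMPLATES[component]
    return _PLACEHOLDER.sub(lambda _match: identifier, template, count=1)


def parse_path(path: str) -> Optional[Tuple[str, str]]:
    """Return (component, identifier) for a dynamic path, else None."""
    for component, template in ROUTE_TEMPLATES.items():
        # The placeholder is the last segment in both templates.
        prefix = template[: template.index("{")]
        match = re.fullmatch(re.escape(prefix) + _IDENTIFIER, path)
        if match and match.group(1) not in _RESERVED:
            return component, match.group(1)
    return None
```

For example, `build_path("dataset", "settlement")` gives `/datasets/settlement`, and `parse_path("/datasets/home")` returns `None` because `home` is treated as the listing page rather than a dataset.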

For the overall picture of the site-map:

flowchart LR

  %% Component
  %% - Convention:
  %% (a) Paths have the prefix `p_*`
  %% (b) Dataset has prefix of `*_ds_*`
  %% (c) Pipeline has prefix of `*_pipe_*`
  root[Root page origin URL]
  p_home[`/homepage`]
  p_datasets[`/datasets/home`]
    p_ds_settlement[`datasets/settlement`]
    p_ds_quote[`datasets/quote`]
    p_ds_corporate_action[`datasets/corporate-action`]
  p_pipelines[`/pipelines/home`]
    p_pipe_adjustment[`pipelines/adjustment`]
  p_admin[`/admin/home`]

  %% Page site-map
  root --> p_home
  root --> p_datasets --> p_ds_quote & p_ds_settlement & p_ds_corporate_action
  root --> p_pipelines --> p_pipe_adjustment
  root --> p_admin

The site-map maps onto the view folder of the design components as shown in the following matrix table

Component Sub path Documentation
Dataset
Settlement View > Dataset > Settlement
Quote View > Dataset > Quote
Corporate Action View > Dataset > Corporate Action
Pipeline
Executions View > Dataset > Corporate Action

Technical stack

For each service, the stack is chosen by the team that handles the component

Technical stack entries use the prefix TS-*

Frontend - Website:

| Service | Component | Group | Stack | Identity |
|---|---|---|---|---|
| SERVICE-01 | TS-01 | Runtime | Javascript | NodeJS |
| | TS-02 | Language | Library | React |
| | TS-03 | Authentication | | |
| | TS-04 | Visualization | Timeseries | Trading View |
| | TS-05 | Dependencies | CSS | TailwindCSS |
| | TS-06 | Dependencies | Notification | notistack |

Backend - Auth service: Reference to SAD of the internal-auth service

Backend - API service: Reference to SAD of the submarine service

Internal Operation

Service Health Check

Cloud Monitoring monitors the related services through HTTP uptime checks

Each service exposes a path for the health check and returns the following metadata

| Service | On path | Response | Metadata |
|---|---|---|---|
| UI service | /api/service/heath | Status code: 200 | version, revision, timestamp |
| Auth service | REQUIRED LATER | Status code: 200 | version, revision, timestamp |
| Backend service | /service/heath | Status code: 200 | version, revision, timestamp |

Example of the JSON output that should be returned

// Response on health check
// Status code: 200
{
  "version": "2.3.1", // Current version of the project, in semantic versioning
  "revision": "2024-04-01", // Revision of the service, sometimes the date the service was deployed
  "timestamp": "2024-05-04T17:25:44Z" // Timestamp in UTC
}
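The response contract above can be sketched as a payload builder plus a validator that an uptime check or test could reuse. The three keys and the timestamp format come from the example; the function names are hypothetical, and the validator only checks what the example shows (it does not verify that `version` is a valid semantic version).

```python
import json
from datetime import datetime, timezone
from typing import Optional

REQUIRED_KEYS = ("version", "revision", "timestamp")
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # UTC, as in the example above


def build_health_payload(
    version: str, revision: str, now: Optional[datetime] = None
) -> dict:
    """Build the health-check body; `now` must be timezone-aware if given."""
    current = now if now is not None else datetime.now(timezone.utc)
    return {
        "version": version,
        "revision": revision,
        "timestamp": current.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT),
    }


def is_valid_health_payload(body: str) -> bool:
    """Check that a response body carries the three expected fields."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    if any(key not in payload for key in REQUIRED_KEYS):
        return False
    try:
        datetime.strptime(payload["timestamp"], TIMESTAMP_FORMAT)
    except (TypeError, ValueError):
        return False
    return True
```

Note that the comments in the example above are for documentation only; an actual JSON response cannot contain comments, so the validator expects a plain JSON body.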

Service health is monitored by Cloud Monitoring, and the monitoring code lives in inno-infra

CICD remove resource

CICD removes resources on holidays and on weekends (Saturday and Sunday).
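The decision rule behind this job could look like the sketch below. It assumes the day is evaluated in the system timezone from the global configuration (Asia/Ho_Chi_Minh, which is UTC+7 with no daylight saving time). The holiday set is a hypothetical sample, and `should_remove_resources` is an illustrative name; the document does not specify which resources are removed or where the holiday calendar comes from.

```python
from datetime import date, datetime, timedelta, timezone
from typing import AbstractSet

# Asia/Ho_Chi_Minh is a fixed UTC+7 offset, so no tz database is needed.
LOCAL_TZ = timezone(timedelta(hours=7))

# Hypothetical sample; the real holiday calendar is not defined here.
SAMPLE_HOLIDAYS = frozenset({date(2024, 9, 2)})


def should_remove_resources(
    moment: datetime, holidays: AbstractSet[date] = SAMPLE_HOLIDAYS
) -> bool:
    """True when `moment` (timezone-aware) falls on a weekend or holiday."""
    local_day = moment.astimezone(LOCAL_TZ).date()
    is_weekend = local_day.weekday() >= 5  # 5 = Saturday, 6 = Sunday
    return is_weekend or local_day in holidays
```

Converting to local time before taking the date matters near midnight: a Friday evening in UTC is already Saturday in Ho Chi Minh City.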

Appendix

Appendix A: Record of Changes

Table: Record of changes

Version Date Author Description of Change
0.1.0 01/04/2024 Bao Truong Initial documentation
0.2.0 10/04/2024 Bao Truong Added logical view of SAD
0.3.0 04/05/2024 Bao Truong Updated authentication flow for project
0.4.0 04/05/2024 Bao Truong First draft UI techstack
0.5.0 04/05/2024 Bao Truong Added physical view of SAD
0.6.0 04/05/2024 Bao Truong Updated the component in Cloud platform
0.7.0 04/05/2024 Bao Truong Updated overall identities of project
0.7.1 04/05/2024 Bao Truong Updated documentation in the internal docs
0.7.2 04/05/2024 Bao Truong Added table of contents
0.7.3 04/05/2024 Bao Truong Updated flow of documentation
0.8.0 04/05/2024 Bao Truong Updated site-map for documentation
0.9.0 04/05/2024 Bao Truong Updated deployment
0.10.0 04/05/2024 Bao Truong Updated the permission on repository and flow
0.11.0 04/05/2024 Bao Truong Added artifact, logging and docs for component
0.12.0 04/05/2024 Bao Truong Added configuration on Cloud Run service
0.13.0 04/05/2024 Bao Truong Added service health check on service
0.14.0 04/05/2024 Bao Truong Added element with site-map on UI service
0.15.0 04/06/2024 Bao Truong Updated UI templates, site-map
0.16.0 05/14/2024 Bao Truong Added template for error
0.16.1 05/14/2024 Bao Truong Updated the logic of data-operation
0.16.2 06/06/2024 Bao Truong Updated the permission of service account
0.16.2 06/10/2024 Bao Truong Updated the staging environment

Source Reference