
Tech Stack Landscape

Overview

This document covers the tech stack components that our team uses to pursue the landscape API.

TODO:

Logical view

flowchart LR

  %% component
  provider[Provider Source]
  ops[Operation]
  lakehouse[LakeHouse]
  warehouse[Warehouse]
  consume[Consume]

  %% On sync
  ops --> lakehouse
  provider --> lakehouse

  %% Transfer
  lakehouse --> warehouse --> consume
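The flow above can also be read as a small directed graph. The sketch below is only an illustration in Python: the node names are copied from the flowchart, while `EDGES` and the `downstream` helper are hypothetical names and not part of our codebase.

```python
from collections import deque

# Edges copied from the flowchart: both sources sync into the lakehouse,
# then data is transferred to the warehouse and on to consumers.
EDGES = {
    "ops": ["lakehouse"],
    "provider": ["lakehouse"],
    "lakehouse": ["warehouse"],
    "warehouse": ["consume"],
    "consume": [],
}


def downstream(node):
    """Return every component reachable from `node`, in breadth-first order."""
    seen = []
    queue = deque(EDGES[node])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.append(current)
            queue.extend(EDGES[current])
    return seen


print(downstream("provider"))  # ['lakehouse', 'warehouse', 'consume']
```

A helper like this is handy for answering "what is affected if the provider sync fails?" without reading the diagram by hand.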

Components

Physical

  • Server: On-prem server

  • Dataproc serverless [Spark, SparkNLP]

  • Cloud Run

Language:

  • Programming: Core Python

  • SQL across various backend databases: MySQL, Postgres, BigQuery

Next generation of our Tech Stack

We lack control of:

  • Servers: we have no control tower; Prometheus and Thanos can reduce this gap

  • Authentication, which uses both JWT and Google Identity

  • Acquiring new tools for the big picture
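To make the JWT side of the authentication point concrete, here is a minimal sketch of signing and verifying an HS256 token using only the Python standard library. It assumes a shared-secret setup, which may not match how our services actually issue tokens; `sign_hs256` and `verify_hs256` are hypothetical names. Google Identity tokens are typically signed with RS256 and would need a library such as google-auth, which this sketch does not cover.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims, secret):
    """Build a compact JWT (header.payload.signature) signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_hs256(token, secret, now=None):
    """Return the claims if the signature and `exp` are valid, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

In practice a maintained library is usually the safer choice; the sketch only shows which checks (algorithm, signature, expiry) need to happen in one place if we want to regain control of authentication.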

Prometheus, Thanos

Control tower for all systems

Google Kubernetes Engine

Deploy containers

Argo Workflows

Scheduling and orchestration
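One possible shape for the scheduling piece is an Argo `CronWorkflow` whose DAG mirrors the logical view. The sketch below only builds the manifest as a Python dict and prints it as JSON; the workflow name, schedule, image, step names, and the `cron_workflow` helper are all placeholders, not an agreed design.

```python
import json


def cron_workflow(name, schedule, image, dependencies):
    """Build a CronWorkflow manifest; `dependencies` maps each step to its upstream steps."""
    tasks = []
    for step, upstream in dependencies.items():
        task = {
            "name": step,
            "template": "run-step",
            "arguments": {"parameters": [{"name": "step", "value": step}]},
        }
        if upstream:
            task["dependencies"] = list(upstream)
        tasks.append(task)
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "CronWorkflow",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            "workflowSpec": {
                "entrypoint": "pipeline",
                "templates": [
                    {"name": "pipeline", "dag": {"tasks": tasks}},
                    {
                        "name": "run-step",
                        "inputs": {"parameters": [{"name": "step"}]},
                        "container": {
                            "image": image,
                            # Hypothetical entry point inside the image.
                            "command": ["python", "-m", "pipeline"],
                            "args": ["{{inputs.parameters.step}}"],
                        },
                    },
                ],
            },
        },
    }


manifest = cron_workflow(
    name="landscape-daily",
    schedule="0 2 * * *",
    image="example.invalid/landscape/pipeline:latest",  # placeholder image
    dependencies={
        "sync-ops": [],
        "sync-provider": [],
        "load-warehouse": ["sync-ops", "sync-provider"],
    },
)
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply` accepts JSON as well as YAML, the printed manifest could be applied directly in a test cluster.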

Argo CD

Deployment rollout and rollback

Vision AI

This Google package will annotate the elements.

Ref:

Vision AI | Cloud Vision API | Google Cloud

  • Document AI to extract entities from financial report PDFs. E.g.: Document AI | Google Cloud

  • Detect from Images: Detect Web entities and pages | Cloud Vision API | Google Cloud

  • Annotate Financial Entities
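For the image case, a web-detection call to the Cloud Vision REST API is a JSON POST to the `images:annotate` endpoint. The sketch below only builds the request body and does not send it; authentication is deliberately left out, and the image URI and the `web_detection_request` helper are placeholders for this example.

```python
import json

# Public REST endpoint for batch image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def web_detection_request(image_uri, max_results=10):
    """Build the JSON body asking Vision to detect web entities for one image."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": "WEB_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


body = web_detection_request("gs://example-bucket/report-cover.png")  # placeholder URI
print(json.dumps(body, indent=2))
```

Annotating financial entities would need a mapping from the returned web entities to our own entity list, which is not defined in this document.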

Trino Distributed SQL query engine for big data

Trino, a query engine that runs at ludicrous speed. Our target is its Postgres connector.
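Pointing Trino at Postgres is mostly a matter of one catalog properties file. The sketch below renders such a file in Python; the host, database, user, environment variable name, and the `postgres_catalog` helper are assumptions for illustration only.

```python
def postgres_catalog(host, database, user, password_env, port=5432):
    """Render a Trino catalog properties file for the PostgreSQL connector."""
    props = {
        "connector.name": "postgresql",
        "connection-url": f"jdbc:postgresql://{host}:{port}/{database}",
        "connection-user": user,
        # Trino resolves ${ENV:NAME} from the environment at startup,
        # so the password is not written into the file.
        "connection-password": f"${{ENV:{password_env}}}",
    }
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Would be saved as etc/catalog/landscape.properties (hypothetical catalog name).
print(postgres_catalog("pg.internal.example", "landscape", "trino_reader", "PG_PASSWORD"))
```

With a catalog named `landscape`, tables would then be addressed in SQL as `landscape.<schema>.<table>`.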

Sentry
