Real-time Snapshot World Index

Overview

Description: Get near real-time snapshots of world indices.

Specification

| Attribute | Value |
| --- | --- |
| Provider | Yahoo Finance |
| Data source | Yahoo Finance World Indices |
| Serving | API of Pluto, Spectrum |
| Timeframe | 5 minutes |

Example screenshot of the dataset:

[Screenshot: Yahoo World Indices]

SAD - System Architecture Design

Logical View

The logical view below shows the pipeline that syncs data from the provider (Yahoo) to the endpoint.

flowchart TB

  %% Component
  task_push[Task Declare]
  madillo[Madillo]
  postgres[Postgres]
  api[API]
  orchestration[Orchestration]

  subgraph pipeline[Extract pipeline]
    source_chart[Chart API]
    source_summary[Summary API]
    subgraph basement[Basement]
      flatten_chart[Flatten Chart]
      flatten_summary[Flatten Summary]
      join[Join by Ticker]
      dump[Dump Data]
    end

    source_chart -- Fetch (1) --> flatten_chart
    source_summary -- Fetch (2) --> flatten_summary
    flatten_chart -- (3) --> join
    flatten_summary -- (4) --> join
    join -- (5) --> dump
  end

  subgraph gcp[Google Cloud Platform]
    subgraph pubsub[PubSub]
      topic[Topic]
      sub[Subscription]
    end

    subgraph task_push[Workflow Task Push]
      payload[Payload] --> http[HTTP build Task]
    end
    subgraph sync_dataset[Workflow Sync Dataset]
      event[Event] -- call --> madillo[Pipeline Madillo] -- transfer --> postgres[Postgres]
    end

    subgraph task[Cloud Task]
      queue[Queue]
    end

    sub --> task_push <---> queue <---> sync_dataset --> api
  end

  %% Flow
  orchestration --> pipeline
  dump -- Publish message (6) --> topic
  topic -- Push subscribe into --> sub

Step 1: In the extract layer, fetch JSON data from the provider for two components (Snapshot and Timeseries Close), then merge them on the ticker primary key.

Step 2: Publish the payload to the registered topic (PubSub); the topic then dispatches it to the subscription.

Step 3: The subscription sends the payload to create a task in the queue.

Step 4: The queue dispatches the event to the Sync Dataset workflow.

Step 5: Users can then query the data through the API.
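The flatten-and-join part of Step 1 can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the payload shapes (`results`, `close`), the function names, and the choice of an inner join are all assumptions made for the example.

```python
from typing import Any


def flatten_chart(chart: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map ticker -> quote component (hypothetical chart payload shape)."""
    flat = {}
    for item in chart["results"]:
        flat[item["ticker"]] = {
            "quote": {
                "timestamp": list(item["timestamp"]),
                "data": list(item["close"]),
            }
        }
    return flat


def flatten_summary(summary: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map ticker -> snapshot fields (hypothetical summary payload shape)."""
    return {
        item["ticker"]: {k: v for k, v in item.items() if k != "ticker"}
        for item in summary["results"]
    }


def join_by_ticker(
    chart_rows: dict[str, dict[str, Any]],
    summary_rows: dict[str, dict[str, Any]],
) -> list[dict[str, Any]]:
    """Inner-join both maps on ticker (assumed join type)."""
    rows = []
    for ticker in sorted(chart_rows.keys() & summary_rows.keys()):
        rows.append({"ticker": ticker, **summary_rows[ticker], **chart_rows[ticker]})
    return rows


# Illustrative inputs; "^DJI" has no chart data, so the inner join drops it.
chart_payload = {
    "results": [
        {
            "ticker": "^GSPC",
            "timestamp": [1716384600, 1716384900, 1716385200],
            "close": [5318.33, 5318.32, 5320.53],
        }
    ]
}
summary_payload = {
    "results": [
        {"ticker": "^GSPC", "short_name": "S&P 500", "currency": "USD"},
        {"ticker": "^DJI", "short_name": "Dow Jones Industrial Average", "currency": "USD"},
    ]
}

rows = join_by_ticker(flatten_chart(chart_payload), flatten_summary(summary_payload))
```

Keying both flattened results by ticker keeps the join a simple dictionary lookup; whether unmatched tickers should be dropped or kept with empty fields is a design decision this sketch does not settle.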

Physical Plan

| Resource | Identifier |
| --- | --- |
| PubSub Topic | Topic: prod-yahoo-world-index |
| PubSub Subscription | Sub: prod-subscription-yahoo-world-index-workflow-sync-spectrum |
| Cloud Tasks | Queue: realtime-snapshot-provider-yahoo |
| Workflow | Workflow: prod-realtime-snapshot-provider-yahoo-task-declare |
| Workflow | Workflow: prod-realtime-snapshot-provider-yahoo-executor |
| Cloud SQL | Table: asset_security_index_global_snapshot |
| API | spectrum |
| API | pluto |

The IAM: sa-dragon-knight
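For reference, the short identifiers above expand into fully qualified Google Cloud resource names when used with the APIs. The sketch below builds those names; the project ID and location are hypothetical placeholders, since this document does not state them.

```python
PROJECT_ID = "my-project"  # hypothetical; the real project ID is not given here
LOCATION = "us-central1"   # hypothetical; the real location is not given here


def topic_path(project: str, topic: str) -> str:
    """Fully qualified Pub/Sub topic name."""
    return f"projects/{project}/topics/{topic}"


def subscription_path(project: str, subscription: str) -> str:
    """Fully qualified Pub/Sub subscription name."""
    return f"projects/{project}/subscriptions/{subscription}"


def queue_path(project: str, location: str, queue: str) -> str:
    """Fully qualified Cloud Tasks queue name."""
    return f"projects/{project}/locations/{location}/queues/{queue}"


def workflow_path(project: str, location: str, workflow: str) -> str:
    """Fully qualified Workflows resource name."""
    return f"projects/{project}/locations/{location}/workflows/{workflow}"


resources = {
    "topic": topic_path(PROJECT_ID, "prod-yahoo-world-index"),
    "subscription": subscription_path(
        PROJECT_ID, "prod-subscription-yahoo-world-index-workflow-sync-spectrum"
    ),
    "queue": queue_path(PROJECT_ID, LOCATION, "realtime-snapshot-provider-yahoo"),
    "task_declare": workflow_path(
        PROJECT_ID, LOCATION, "prod-realtime-snapshot-provider-yahoo-task-declare"
    ),
    "executor": workflow_path(
        PROJECT_ID, LOCATION, "prod-realtime-snapshot-provider-yahoo-executor"
    ),
}
```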

ERD

ERD for the target sync table:

Table: asset_security_index_global_snapshot

| Column | Type | Description |
| --- | --- | --- |
| ticker | String(200) | Global ticker of the index |
| short_name | Text | Short name of the index |
| language | String(200) | Language code |
| region | String(100) | Country or region associated with the ticker |
| quote_type | String(100) | Financial instrument type |
| currency | String(100) | Currency of the index |
| market | String(100) | Market where the index is traded |
| exchange | String(200) | Exchange name |
| exchange_full_name | Text | Full exchange name |
| exchange_timezone_name | String(200) | Exchange timezone name |
| open | Float(30, 5) | Open |
| high | Float(30, 5) | High |
| low | Float(30, 5) | Low |
| close | Float(30, 5) | Close |
| previous_close | Float(30, 5) | Previous close |
| volume | BigInteger | Volume |
| change | Float(30, 5) | Absolute change |
| percent_change | Float(30, 5) | Percent change |
| intraday_high_low | Text | Intraday high-low range |
| fifty_two_week_range | Text | 52-week range |
| quote | Text | Quote component |
| last_updated_ts | BigInteger | Timestamp of the last snapshot update |
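Several numeric columns are related, which makes a basic consistency check possible before a row is synced. The sketch below assumes that `change` equals `close - previous_close` and that `percent_change` equals `change / previous_close * 100`; both formulas are inferred from the sample response in this document rather than stated by it, and the function name and tolerances are illustrative.

```python
import math


def check_snapshot(row: dict) -> list[str]:
    """Return a list of consistency problems found in one snapshot row."""
    problems = []

    # The intraday high can never be below the intraday low.
    if row["high"] < row["low"]:
        problems.append("high is below low")

    # Assumed relation: change = close - previous_close.
    expected_change = row["close"] - row["previous_close"]
    if not math.isclose(row["change"], expected_change, abs_tol=0.01):
        problems.append("change does not match close - previous_close")

    # Assumed relation: percent_change = change / previous_close * 100.
    expected_percent = row["change"] / row["previous_close"] * 100
    if not math.isclose(row["percent_change"], expected_percent, abs_tol=0.001):
        problems.append("percent_change does not match change / previous_close")

    return problems


good_row = {
    "high": 5323.18,
    "low": 5286.01,
    "close": 5307.01,
    "previous_close": 5321.41,
    "change": -14.400391,
    "percent_change": -0.2706123,
}
swapped_row = {**good_row, "high": 5286.01, "low": 5323.18}

good_problems = check_snapshot(good_row)
swapped_problems = check_snapshot(swapped_row)
```

The tolerances absorb rounding in the provider's values; tighter or looser bounds may suit other instruments.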

Components of quote:

| Node | Type | Description |
| --- | --- | --- |
| @metadata | object | Metadata describing the attributes of the quote node |
| timestamp | list(int) | Timestamps |
| data | list(float) | Data values mapped to the timestamps |

Components of @metadata:

| Node | Type | Nullable | Description |
| --- | --- | --- | --- |
| time_granularity | str | False | Granularity of the quote's timeframe |
| range | str | False | Range requested for the quote |
| type | str | False | Type of data; in this case, the close quote |
| start | str | True | Start date |
| end | str | True | End date |
| timezone | str | True | Timezone of the global index |
| gmtoffset | str | True | GMT offset of the timezone |

Example JSON for @metadata:

"@metadata": {
  "time_granularity": "5m",
  "range": "1d",
  "type": "close",
  "start": 1716384600,
  "end": 1716385200,
  "timezone": "MTY",
  "gmtoffset": 14000
}
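A quote component can be checked against the nullability rules above before it is stored. The following is a minimal sketch: the function name is hypothetical, and the rule that `timestamp` and `data` must have equal length is an assumption drawn from the description of `data` as mapping to the timestamps.

```python
# Nodes marked Nullable = False in the @metadata table.
REQUIRED_METADATA = ("time_granularity", "range", "type")


def validate_quote(quote: dict) -> list[str]:
    """Return a list of validation errors for one quote component."""
    errors = []

    metadata = quote.get("@metadata") or {}
    for key in REQUIRED_METADATA:
        if metadata.get(key) is None:
            errors.append(f"@metadata.{key} must not be null")

    timestamps = quote.get("timestamp") or []
    data = quote.get("data") or []
    # Assumed rule: each data point pairs with exactly one timestamp.
    if len(timestamps) != len(data):
        errors.append("timestamp and data must have the same length")

    return errors


valid_quote = {
    "@metadata": {"time_granularity": "5m", "range": "1d", "type": "close"},
    "timestamp": [1716384600, 1716384900, 1716385200],
    "data": [5318.33, 5318.32, 5320.53],
}
invalid_quote = {
    "@metadata": {"time_granularity": "5m", "range": None, "type": "close"},
    "timestamp": [1716384600, 1716384900],
    "data": [5318.33],
}

valid_errors = validate_quote(valid_quote)
invalid_errors = validate_quote(invalid_quote)
```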

Return output

{
  "id": "a2cd773c-d12c-4a99-856e-8eaa30e2b0b7",
  "status": "ok",
  "data": [
    {
      "ticker": "^GSPC",
      "short_name": "S&P 500",
      "language": "en-US",
      "region": "US",
      "quote_type": "INDEX",
      "currency": "USD",
      "market": "us_market",
      "exchange": "SNP",
      "exchange_full_name": "SNP",
      "exchange_timezone_name": "America/New_York",
      "open": 5319.28,
      "high": 5323.18,
      "low": 5286.01,
      "close": 5307.01,
      "previous_close": 5321.41,
      "volume": 2079437000,
      "change": -14.400391,
      "percent_change": -0.2706123,
      "intraday_high_low": "5286.01 - 5323.18",
      "fifty_two_week_range": "4103.78 - 5325.49",
      "quote": {
        "@metadata": {
          "time_granularity": "5m",
          "range": "1d",
          "type": "close",
          "start": 1716384600,
          "end": 1716385200,
          "timezone": "MTY",
          "gmtoffset": 14000
        },
        "timestamp": [1716384600, 1716384900, 1716385200],
        "data": [5318.33, 5318.32, 5320.53]
      },
      "last_updated_ts": 1716434187759875
    }
  ],
  "pagination": {
    "total": 1,
    "page": 1,
    "limit": 100,
    "url": {
      "first": "http://<ORIGIN_URL>/security/index/global/snapshot?ticker=%5EGSPC&page=1&limit=100",
      "next": null,
      "previous": null,
      "last": "http://<ORIGIN_URL>/security/index/global/snapshot?ticker=%5EGSPC&page=1&limit=100"
    }
  }
}
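A client consuming this response might build the request URL and decode the time fields as sketched below. The helper names and the `example.invalid` origin are hypothetical. Treating `last_updated_ts` as microseconds since the Unix epoch is an assumption based on the magnitude of the sample value, and the quote timestamps are assumed to be epoch seconds.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_url(origin: str, ticker: str, page: int = 1, limit: int = 100) -> str:
    """Build the snapshot query URL; urlencode escapes '^' as %5E."""
    query = urlencode({"ticker": ticker, "page": page, "limit": limit})
    return f"{origin}/security/index/global/snapshot?{query}"


def quote_points(record: dict) -> list[tuple[datetime, float]]:
    """Pair each quote timestamp (assumed epoch seconds) with its value."""
    quote = record["quote"]
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc), value)
        for ts, value in zip(quote["timestamp"], quote["data"])
    ]


def last_updated(record: dict) -> datetime:
    """Decode last_updated_ts, assumed to be epoch microseconds."""
    seconds, micros = divmod(record["last_updated_ts"], 1_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(microseconds=micros)


# A trimmed copy of one record from the sample response.
record = {
    "ticker": "^GSPC",
    "quote": {
        "timestamp": [1716384600, 1716384900, 1716385200],
        "data": [5318.33, 5318.32, 5320.53],
    },
    "last_updated_ts": 1716434187759875,
}

url = build_url("http://example.invalid", record["ticker"])
points = quote_points(record)
updated_at = last_updated(record)
```

Consecutive points in the sample are 300 seconds apart, which matches the 5-minute timeframe given in the specification.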

Appendix

Appendix A: Record of Changes

Table: Record of changes

| Version | Date | Author | Description of changes |
| --- | --- | --- | --- |
| 0.0.1 | | Bao Truong | Initial documentation |
| 0.0.2 | 05/23/2024 | Thinh Luu | Provided output schema, updated data flow |
| 0.0.3 | 05/24/2024 | Bao Truong | Layout, description of script |
| 0.0.4 | 05/24/2024 | Bao Truong | Updated SAD component |
| 0.0.5 | 05/24/2024 | Bao Truong | Updated ERD |
| 0.0.6 | 05/27/2024 | Bao Truong | Updated physical component, namespace |
| 0.0.7 | 05/27/2024 | Bao Truong | Updated node elements for metadata |

| Repository | Description |
| --- | --- |
| inno-basement | Contains the logic for publishing messages and controlling orchestration |
| inno-infra | Provisions the physical resources |
| inno-docs | Documentation for the pipeline |
| inno-spectrum | Deploys the public endpoint |