Load Balancing for Data service

Overview

A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase capacity (concurrent users) and reliability of applications.

This setup uses GCP Load Balancing; see the GCP Load Balancing Overview [1].

Sub-domain

The sub-domains below are grouped into public and internal services.

Public services

Sub-domain register table

| Sub-domain | Description |
| --- | --- |
| api.data.innotech.vn | RESTful API endpoint of the Data service |
| documentation.data.innotech.vn | Public documentation for the Data service |
| asset.data.innotech.vn (CDN) | Publicly serves static resources (images, PDF/Excel/CSV files, ...) |

Table of sub-domain mapping

| Sub-domain | IP-type | Type | State | Service |
| --- | --- | --- | --- | --- |
| api.data.innotech.vn | Static | Public | Production | production::inno-spectrum::api::latest |
| documentation.data.innotech.vn | Static | Public | Production | production::inno-spectrum::documentaiton::latest |
| asset.data.innotech.vn | Static | Public | Production | storage::inno-public-asset |

Internal services

Sub-domain register table

| Sub-domain | Description |
| --- | --- |
| internal.api.data.innotech.vn | Internal RESTful API for internal workloads |
| internal.portal.data.innotech.vn | Portal website for internal workloads |
| internal.staging-portal.data.innotech.vn | Staging - Portal website for internal workloads |
| internal.docs.data.innotech.vn | Documentation for internal workloads |
| internal.notification.data.innotech.vn | Handles messages in the data platform |
| internal.workflow-orchestration.data.innotech.vn | Workflow orchestration of the Data Platform |
| internal.staging-workflow-orchestration.data.innotech.vn | Staging - Workflow orchestration of the Data Platform |
| internal.processor.data.innotech.vn | Internal data pipelines |
| internal.staging-processor.data.innotech.vn | Staging - Internal data pipelines |
| internal.transform.data.innotech.vn | API for data lake transformation |
| internal.staging-transform.data.innotech.vn | Staging - API for data lake transformation |
| internal.auth.data.innotech.vn | Internal authentication service |
| internal.staging-auth.data.innotech.vn | Staging - Internal authentication service |
| internal.keycloak.data.innotech.vn | KeyCloak server |
| internal.staging-keycloak.data.innotech.vn | Staging - KeyCloak server |

Table of sub-domain mapping

Production:

| Sub-domain | IP-type | Type | State | Service |
| --- | --- | --- | --- | --- |
| internal.api.data.innotech.vn | Static | Internal | Production | production::inno-submarine::latest |
| internal.portal.data.innotech.vn | Static | Internal | Production | production::inno-data-operation::latest |
| internal.docs.data.innotech.vn | Static | Internal | Production | production::inno-docs::latest |
| internal.notification.data.innotech.vn | Static | Internal | Production | production::inno-notification::latest |
| internal.transform.data.innotech.vn | Static | Internal | Production | production::inno-lake-prep::latest |
| internal.workflow-orchestration.data.innotech.vn | Static | Internal | Production | production::inno-basement::latest |
| internal.auth.data.innotech.vn | Static | Internal | Production | production::inno-internal-auth::run::latest |
| internal.keycloak.data.innotech.vn | Static | Internal | Production | production::inno-keycloak::run::latest |

Staging:

| Sub-domain | IP-type | Type | State | Service |
| --- | --- | --- | --- | --- |
| internal.staging-portal.data.innotech.vn | Static | Internal | Staging | staging::inno-data-operation::latest |
| internal.staging-workflow-orchestration.data.innotech.vn | Static | Internal | Staging | staging::inno-basement::latest |
| internal.staging-transform.data.innotech.vn | Static | Internal | Staging | staging::inno-lake-prep::latest |
| internal.staging-auth.data.innotech.vn | Static | Internal | Staging | staging::inno-internal-auth::run::latest |
| internal.staging-keycloak.data.innotech.vn | Static | Internal | Staging | staging::inno-keycloak::run::latest |

SAD - System architecture design

Logical

flowchart LR

  %% Component
  internet[Internet]
  forward_rule[Forwarding rule]
  https_proxy["Target HTTP(S) proxy
  (with SSL certificate)"]
  url-map[URL map]
  backend["Backend service
  with optional
  URL mask"]
  run-service[Cloud Run service]

  subgraph "External Application Load Balancer"
    forward_rule --> https_proxy --> url-map --> backend
  end

  subgraph "Region: asia-southeast1"
    direction TB
    subgraph "Serverless NEG"
        direction RL
        run-service
    end
  end

  %% Flow
  internet --> forward_rule
  backend --> run-service

Development

Diagram for deploying the service API

flowchart LR
  id1[Internet]
  id2[Forwarding rule]
  id3["Target HTTP(S) proxy
    (with SSL certificate)"]
  id4[URL map]
  id5["Backend service
    with optional
    URL mask"]
  id6[Cloud Run service]

  id1 --> id2

  subgraph "External Application Load Balancer"
    id2 --> id3 --> id4 --> id5
  end

  subgraph "Region: us-central1"
    direction TB
    subgraph "Serverless NEG"
      direction RL
      id6
    end
  end

  id5 --> id6

Naming convention rules

| Variable | Value | Example |
| --- | --- | --- |
| STATIC_REVERSE_IP_NAME | ip-<domain> | ip-internal-docs-data-innotech-vn |
| BACKEND_SERVICE_NAME | lb-be-<sub-domain>-endpoint | lb-be-internal-docs-endpoint |
| CERTIFICATE_NAME | ssl-<domain> | ssl-internal-docs-data-innotech-vn |
| SERVERLESS_NEG_NAME | prod-neg-<domain> | prod-neg-internal-docs-data-innotech-vn |
| CLOUD_RUN_SERVICE_NAME | <cloud-run-name> | prod-inno-docs |
| LOAD_BALANCER_NAME | lb-<domain> | lb-internal-docs-data-innotech-vn |
| TARGET_HTTP_PROXY_NAME | lb-fe-http-<domain> | lb-fe-http-internal-docs-data-innotech-vn |
| TARGET_HTTPS_PROXY_NAME | lb-fe-https-<domain> | lb-fe-https-internal-docs-data-innotech-vn |
| HTTP_FORWARDING_RULE_NAME | http-80-<domain> | http-80-internal-docs-data-innotech-vn |
| HTTPS_FORWARDING_RULE_NAME | http-443-<domain> | http-443-internal-docs-data-innotech-vn |
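The convention above can be applied mechanically from a single domain. The following is a minimal sketch, assuming `<domain>` means the full domain with dots replaced by hyphens and `<sub-domain>` means the part before `.data.innotech.vn` (both inferred from the examples in the table). `to_slug`, `DOMAIN_SLUG`, and `SUB_DOMAIN_SLUG` are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Sketch: derive the resource names from one domain, following the naming table.
DATA_TEAM_DOMAIN="internal.docs.data.innotech.vn"

# Hypothetical helper: replace dots with hyphens (a.b.c -> a-b-c).
to_slug() {
  printf '%s' "$1" | tr '.' '-'
}

DOMAIN_SLUG=$(to_slug "$DATA_TEAM_DOMAIN")
# Assumption: <sub-domain> is everything before ".data.innotech.vn".
SUB_DOMAIN_SLUG=$(to_slug "${DATA_TEAM_DOMAIN%.data.innotech.vn}")

STATIC_REVERSE_IP_NAME="ip-${DOMAIN_SLUG}"
BACKEND_SERVICE_NAME="lb-be-${SUB_DOMAIN_SLUG}-endpoint"
CERTIFICATE_NAME="ssl-${DOMAIN_SLUG}"
SERVERLESS_NEG_NAME="prod-neg-${DOMAIN_SLUG}"
LOAD_BALANCER_NAME="lb-${DOMAIN_SLUG}"
TARGET_HTTP_PROXY_NAME="lb-fe-http-${DOMAIN_SLUG}"
TARGET_HTTPS_PROXY_NAME="lb-fe-https-${DOMAIN_SLUG}"
HTTP_FORWARDING_RULE_NAME="http-80-${DOMAIN_SLUG}"
HTTPS_FORWARDING_RULE_NAME="http-443-${DOMAIN_SLUG}"
# CLOUD_RUN_SERVICE_NAME is not derived from the domain; set it by hand.

echo "$BACKEND_SERVICE_NAME"   # prints lb-be-internal-docs-endpoint
```

Deriving every name from one input keeps the resources of a load balancer consistently named and avoids hand-typed mismatches between steps.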

Deployment Workflow

Steps:

The following steps deploy a target load balancer:

flowchart TB
  Start --> | Step 1 |I(Create IP Address) --> |Step 2|C(Create Google-managed SSL certificate)
  C --> | Step 3 |B(Create Backend Service)
  B --> | Step 4 |S(Create serverless NEG) -- Step 5: Add to --> B
  S --> | Step 6 |L(Create Load Balancer)
  L --> | Step 7 |P(Create Path Matcher)
  P --> | Step 8 |HTTP(create an HTTPS target proxy) --> | Config Request to | L
  HTTP --> | Step 9 |R(Create a forwarding rule to route incoming requests to the proxy)
  R --> | Step 10 |AD(Contact Domain Admin to set up DNS records for the IP address created in Step 1)

Command line to deploy:

Following the workflow above, here are sample commands for each step.
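The commands below depend on a number of shell variables, some of which (for example `$CERTIFICATE_DESCRIPTION`, `$DATA_TEAM_DOMAIN`, `$PROJECT_REGION`, `$PATH_MATCHER`, `$PATH_SERVICE`) are not listed in the naming table. A small pre-flight check can catch an unset variable before any resource is created. This is only a sketch; `require_vars` is a hypothetical helper, not a gcloud feature.

```shell
#!/bin/sh
# Sketch: fail early when a variable used by the steps is unset or empty.
# require_vars is a hypothetical helper name.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (commented out so the snippet has no side effects):
# require_vars STATIC_REVERSE_IP_NAME CERTIFICATE_NAME CERTIFICATE_DESCRIPTION \
#   DATA_TEAM_DOMAIN BACKEND_SERVICE_NAME SERVERLESS_NEG_NAME PROJECT_REGION \
#   CLOUD_RUN_SERVICE_NAME LOAD_BALANCER_NAME PATH_MATCHER PATH_SERVICE \
#   TARGET_HTTPS_PROXY_NAME HTTPS_FORWARDING_RULE_NAME || exit 1
```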

Step 1: Reserve an external IP address

gcloud compute addresses create $STATIC_REVERSE_IP_NAME \
    --network-tier=PREMIUM \
    --ip-version=IPV4 \
    --global;

# Get the reserved IP address
STATIC_IP_ADDRESS=$(gcloud compute addresses describe "$STATIC_REVERSE_IP_NAME" \
    --format="get(address)" \
    --global)
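Running the create command again for an address that already exists returns an error, so a re-runnable script may want to check for the address first. The following is a minimal sketch that uses only the `describe` and `create` commands from this step; `ensure_address` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: reserve the global address only when it does not exist yet.
# Caveat: describe can also fail for other reasons (for example, missing
# credentials); this sketch treats every failure as "not found".
ensure_address() {
  if gcloud compute addresses describe "$1" --global >/dev/null 2>&1; then
    echo "exists: $1"
  else
    gcloud compute addresses create "$1" \
      --network-tier=PREMIUM \
      --ip-version=IPV4 \
      --global && echo "created: $1"
  fi
}

# Example call (commented out so the snippet has no side effects):
# ensure_address "$STATIC_REVERSE_IP_NAME"
```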

Step 2: Create Google managed certificate (SSL)

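# Quote the values: an unquoted description containing spaces would be split into separate arguments
gcloud compute ssl-certificates create "$CERTIFICATE_NAME" \
    --description="$CERTIFICATE_DESCRIPTION" \
    --domains="$DATA_TEAM_DOMAIN" \
    --global;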

# Show the certificate name, provisioning status, and per-domain status
gcloud compute ssl-certificates describe $CERTIFICATE_NAME \
    --global \
    --format="get(name,managed.status, managed.domainStatus)";

# Check the certificate
gcloud compute ssl-certificates list --global;
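A Google-managed certificate usually stays in `PROVISIONING` until it is attached to a target proxy (Step 8) and the DNS records (Step 10) resolve to the load balancer, so the status check is worth repeating after the last step. The following is a sketch of a polling loop built on the `describe` command above; `wait_for_certificate` and its default attempt count and interval are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: poll a Google-managed certificate until managed.status is ACTIVE.
# Usage: wait_for_certificate NAME [MAX_ATTEMPTS] [SLEEP_SECONDS]
# Returns 0 once ACTIVE, 1 if the attempts run out first.
wait_for_certificate() {
  cert_name="$1"
  max_attempts="${2:-30}"
  sleep_seconds="${3:-60}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(gcloud compute ssl-certificates describe "$cert_name" \
      --global --format="get(managed.status)")
    echo "attempt ${attempt}: ${status}"
    if [ "$status" = "ACTIVE" ]; then
      return 0
    fi
    sleep "$sleep_seconds"
    attempt=$((attempt + 1))
  done
  return 1
}

# Example call (commented out so the snippet has no side effects):
# wait_for_certificate "$CERTIFICATE_NAME" 30 60
```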

Step 3: Create a backend service

gcloud compute backend-services create $BACKEND_SERVICE_NAME \
    --load-balancing-scheme=EXTERNAL \
    --global;

Step 4: Create serverless NEG

gcloud compute network-endpoint-groups create $SERVERLESS_NEG_NAME \
    --region=$PROJECT_REGION \
    --network-endpoint-type=serverless  \
    --cloud-run-service=$CLOUD_RUN_SERVICE_NAME;

Step 5: Add the serverless NEG as a backend to the backend service

gcloud compute backend-services add-backend $BACKEND_SERVICE_NAME \
    --global \
    --network-endpoint-group=$SERVERLESS_NEG_NAME \
    --network-endpoint-group-region=$PROJECT_REGION;

Step 6: Create a URL map to route incoming requests to the backend service

# Note: the URL map is the resource that the Cloud Console lists as the load balancer
gcloud compute url-maps create $LOAD_BALANCER_NAME \
    --default-service $BACKEND_SERVICE_NAME;

Step 7: Create Path Matcher

gcloud compute url-maps add-path-matcher $LOAD_BALANCER_NAME \
    --default-service $BACKEND_SERVICE_NAME \
    --path-matcher-name $PATH_MATCHER \
    --new-hosts $DATA_TEAM_DOMAIN \
    --backend-service-path-rules "$PATH_SERVICE=$BACKEND_SERVICE_NAME" \
    --delete-orphaned-path-matcher;
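`$PATH_MATCHER` and `$PATH_SERVICE` are not defined elsewhere on this page. The `--backend-service-path-rules` flag takes comma-separated `PATH=SERVICE` pairs, so when several paths map to one backend the value can be assembled as sketched below. `build_path_rules` is a hypothetical helper, and the paths in the example are illustrative assumptions, not the team's actual routing.

```shell
#!/bin/sh
# Sketch: build a "PATH=SERVICE,PATH=SERVICE" string for
# --backend-service-path-rules from one backend service and a list of paths.
build_path_rules() {
  service="$1"
  shift
  rules=""
  for path in "$@"; do
    # Prepend a comma only when rules already holds an entry.
    rules="${rules:+$rules,}${path}=${service}"
  done
  printf '%s\n' "$rules"
}

# Illustrative paths only; prints:
# /*=lb-be-internal-docs-endpoint,/api/*=lb-be-internal-docs-endpoint
build_path_rules "lb-be-internal-docs-endpoint" '/*' '/api/*'
```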

Step 8: Create a target HTTP(S) proxy to route requests to your URL map.

[Retired] HTTP target proxy. Internal services are accessed only through port 443.

gcloud compute target-http-proxies create $TARGET_HTTP_PROXY_NAME \
    --url-map=$LOAD_BALANCER_NAME;

For an HTTPS load balancer, create an HTTPS target proxy. The proxy is the portion of the load balancer that holds the SSL certificate for HTTPS Load Balancing, so you also load your certificate in this step.

gcloud compute target-https-proxies create $TARGET_HTTPS_PROXY_NAME \
    --ssl-certificates=$CERTIFICATE_NAME \
    --global-ssl-certificates \
    --url-map=$LOAD_BALANCER_NAME;

Step 9: Create a forwarding rule to route incoming requests to the proxy.

For an HTTP load balancer [Retired]

gcloud compute forwarding-rules create $HTTP_FORWARDING_RULE_NAME \
    --load-balancing-scheme=EXTERNAL \
    --network-tier=PREMIUM \
    --address=$STATIC_IP_ADDRESS \
    --target-http-proxy=$TARGET_HTTP_PROXY_NAME \
    --global \
    --ports=80;

For an HTTPS load balancer

gcloud compute forwarding-rules create $HTTPS_FORWARDING_RULE_NAME \
    --load-balancing-scheme=EXTERNAL \
    --network-tier=PREMIUM \
    --address=$STATIC_IP_ADDRESS \
    --target-https-proxy=$TARGET_HTTPS_PROXY_NAME \
    --global \
    --ports=443;

Step 10: Contact the Domain Admin to set up DNS records

cat <<-EOL
NAME                  TYPE     DATA
www                   A        $STATIC_IP_ADDRESS
@                     A        $STATIC_IP_ADDRESS
EOL
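The template above uses `www` and `@` as record names. For a sub-domain such as `internal.docs.data.innotech.vn`, the record name sent to the Domain Admin would presumably be the part of the domain relative to the DNS zone. The following sketch assumes the zone is `innotech.vn`, which this page does not state; `dns_record_line` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print one A record for a sub-domain, relative to its DNS zone.
# dns_record_line is a hypothetical helper; the zone name is an assumption.
dns_record_line() {
  domain="$1"
  zone="$2"
  ip="$3"
  name="${domain%".$zone"}"   # strip the trailing ".<zone>"
  printf '%-24s %-8s %s\n' "$name" "A" "$ip"
}

# 203.0.113.10 is a documentation-only address standing in for $STATIC_IP_ADDRESS.
dns_record_line "internal.docs.data.innotech.vn" "innotech.vn" "203.0.113.10"
```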

Source Reference

[1]: Google Load Balancing Overview