SAD

Version:

0.0.1 Initiation Document

Overview

The MLOps project provides a process for continuous integration (CI), continuous delivery (CD), and continuous training (CT) of machine learning models.

Change

  • Added compute.googleapis.com in service.tf
  • Added firewall_rule allow tcp:5000
  • Added compute address
  • Added compute vm instance
  • Configured the MLflow tracking server
  • Updated secret infra

SAD - System Architecture Design

Logical View

flowchart LR
    subgraph orchestration[Orchestration]
        pipeline[Pipeline]
    end
    pipeline --submit job--> cluster[Cluster]
    cluster --log--> tracking_server[Tracking Server]
    cluster --register model--> model_registry[Model Registry]
    model_registry -- serving --> serving_api[Serving API]

Physical View

flowchart LR
  subgraph orchestration[Prefect]
      flow[Flow]
  end
  subgraph gcs[Google Cloud Storage]
  end
  subgraph dataproc[Dataproc]
      cluster[Hadoop Cluster]
  end
  subgraph mlflow[MLflow]
      tracking_server[Tracking Server]
      model_registry[Model Registry]
  end
  flow --submit job--> cluster
  flow --push code--> gcs
  gcs --pull code--> cluster
  cluster --log--> tracking_server
  cluster --register model--> model_registry
  model_registry -- serving --> serving_api[FastAPI]
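
The edges of the physical view can also be read as ordering constraints on a single pipeline run. The sketch below is only an illustration of that reading: it copies the diagram's edges into a plain dictionary and uses the standard library's `graphlib` to derive one valid order. The node names come from the diagram; the helper `run_order` is hypothetical, and nothing here calls Prefect, Dataproc, or MLflow.

```python
from graphlib import TopologicalSorter

# Edges copied from the physical view: source -> [(label, target), ...].
EDGES = {
    "flow": [("push code", "gcs"), ("submit job", "cluster")],
    "gcs": [("pull code", "cluster")],
    "cluster": [("log", "tracking_server"), ("register model", "model_registry")],
    "model_registry": [("serving", "serving_api")],
}


def run_order(edges):
    """Return the components in an order that respects every edge."""
    sorter = TopologicalSorter()
    for source, targets in edges.items():
        for _label, target in targets:
            # add(node, *predecessors): the target can only act after its source.
            sorter.add(target, source)
    return list(sorter.static_order())


if __name__ == "__main__":
    print(" -> ".join(run_order(EDGES)))
```

Any order the function returns starts with the flow and places the cluster after the code has reached Google Cloud Storage, which matches the push/pull arrows in the diagram.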

Network

flowchart LR

  subgraph GCP[GCP]
    direction RL
    subgraph vpc[VPC network]
      direction LR
      subgraph region[Region]
        direction TB
        subgraph subnet[Subnet]
          direction TB
          subgraph zone[Zone]
            direction TB
            vm["VM instance\nExternal IP: 1.2.3.4\nnetwork_tag: allow-ingress-mlflow"]
          end
        end
      end
      fw_rule["Firewall rule\ntarget_tags: allow-ingress-mlflow\ndirection: ingress\naction: allow\nprotocol: tcp\nports   : 5000\nsource_ranges: vpn_innotech"]
    end
  end

  check{"Authentication"}
  internet[Internet] --- |http://1.2.3.4:5000| check --- |Yes| vm
  check --x |No| vm

Network and subnet

  • Every VM is part of a VPC network. VPC networks provide connectivity for your VM instance to other Google Cloud products and to the internet.

  • Each subnet in a VPC network is associated with a region and contains one or more IP address ranges. Each network interface of a VM must be connected to a subnet.

  • When you create a VM, you can specify a VPC network and subnet. If you omit this configuration, the default network and subnet are used. Google Cloud assigns an internal IPv4 address to the new VM from the primary IPv4 address range of the selected subnet.
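
As a rough illustration of the last point, the sketch below picks an internal IPv4 address from a subnet's primary range with Python's `ipaddress` module. The range `10.0.0.0/24`, the function name, and the "first free address" strategy are assumptions made for the example; Google Cloud performs the real allocation. The only platform fact relied on is that four addresses in each primary range are reserved (network, default gateway, second-to-last, and broadcast).

```python
import ipaddress


def next_internal_ip(primary_range, in_use):
    """Return the first usable address in the range that is not already taken."""
    network = ipaddress.ip_network(primary_range)
    addresses = list(network)  # every address, including network and broadcast
    # Skip the network address and default gateway at the start,
    # and the second-to-last and broadcast addresses at the end.
    usable = addresses[2:-2]
    taken = {ipaddress.ip_address(a) for a in in_use}
    for candidate in usable:
        if candidate not in taken:
            return candidate
    raise RuntimeError(f"no free address left in {primary_range}")


if __name__ == "__main__":
    # Hypothetical subnet with one address already assigned.
    print(next_internal_ip("10.0.0.0/24", ["10.0.0.2"]))
```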

IP address

  • Each VM interface has an internal IPv4 address, which is allocated from the subnet.

  • VMs use these IP addresses to communicate with other Google Cloud resources and external systems. External IP addresses are publicly routable IP addresses that can communicate with the internet. Both external and internal IP addresses can be either ephemeral or static.

  • To communicate with the internet, use an external IPv4 or external IPv6 address configured on the instance.

Firewall rule

  • VPC firewall rules let you allow or deny connections to or from VM instances based on a configuration that you specify. Google Cloud always enforces enabled VPC firewall rules, protecting VMs regardless of their configuration and operating system, even if the VM has not started.
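
The rule drawn in the network diagram can be sketched as a small matching function. This is not how Google Cloud implements firewalling; it only mirrors the fields shown in the diagram (target tags, protocol, port, source ranges). The CIDR `203.0.113.0/24` is a placeholder for the `vpn_innotech` source range, whose real value is not given in this document, and the class and function names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressAllowRule:
    target_tags: frozenset
    protocol: str
    ports: frozenset
    source_ranges: tuple


def is_allowed(rule, vm_tags, protocol, port, source_ip):
    """Return True if an ingress connection matches the allow rule."""
    if not rule.target_tags & set(vm_tags):
        return False  # the rule does not target this VM
    if protocol != rule.protocol or port not in rule.ports:
        return False
    source = ipaddress.ip_address(source_ip)
    return any(source in ipaddress.ip_network(r) for r in rule.source_ranges)


MLFLOW_RULE = IngressAllowRule(
    target_tags=frozenset({"allow-ingress-mlflow"}),
    protocol="tcp",
    ports=frozenset({5000}),
    source_ranges=("203.0.113.0/24",),  # placeholder for vpn_innotech
)

if __name__ == "__main__":
    print(is_allowed(MLFLOW_RULE, ["allow-ingress-mlflow"], "tcp", 5000, "203.0.113.10"))
```

Traffic that matches no allow rule is treated as denied in this sketch, in line with the implied deny-ingress rule that every VPC network has.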

Service Account

Prerequisites for Deployment

Prefect

  • A server to host the Prefect management process
  • A server to host the Prefect worker process
  • A DBMS for metadata storage (PostgreSQL or SQLite)
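
The three prerequisites can be tied together with a short sketch, assuming Prefect 2.x or later, where the server reads its database from `PREFECT_API_DATABASE_CONNECTION_URL` and a worker finds the server through `PREFECT_API_URL`. The function name, host names, credentials, and work pool name below are hypothetical placeholders; the code only assembles the environment and command line for each process and does not start anything.

```python
def prefect_processes(db_url, api_url, work_pool):
    """Return the environment and argv for the Prefect server and worker."""
    return {
        # Management process: stores its metadata in the given database.
        "server": (
            {"PREFECT_API_DATABASE_CONNECTION_URL": db_url},
            ["prefect", "server", "start"],
        ),
        # Worker process: polls the server's API for runs in one work pool.
        "worker": (
            {"PREFECT_API_URL": api_url},
            ["prefect", "worker", "start", "--pool", work_pool],
        ),
    }


if __name__ == "__main__":
    processes = prefect_processes(
        db_url="postgresql+asyncpg://prefect:CHANGE_ME@db.example.internal:5432/prefect",
        api_url="http://prefect.example.internal:4200/api",
        work_pool="example-pool",
    )
    for name, (env, argv) in processes.items():
        print(name, env, " ".join(argv))
```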

MLflow

  • A server to host the MLflow management process
  • A DBMS for metadata storage (PostgreSQL, MySQL, SQLite)
  • An object store for artifact storage (file system, HDFS, GCS, ...)
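
These three pieces map onto the options of the `mlflow server` command: the backend store URI points at the DBMS and the default artifact root points at the object store. The sketch below only builds that command line; the function name, database URI, and bucket name are placeholders rather than values from this project, while port 5000 follows the firewall rule described earlier.

```python
import shlex


def mlflow_server_command(backend_store_uri, artifact_root, host="0.0.0.0", port=5000):
    """Build the argv for an MLflow tracking server process."""
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_store_uri,   # DBMS for metadata
        "--default-artifact-root", artifact_root,   # object store for artifacts
        "--host", host,
        "--port", str(port),
    ]


if __name__ == "__main__":
    cmd = mlflow_server_command(
        "postgresql://mlflow:CHANGE_ME@db.example.internal:5432/mlflow",
        "gs://example-mlflow-artifacts/",
    )
    print(shlex.join(cmd))
```

Returning a list instead of a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.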

Resource

Infra

  • 1 Compute Engine instance (hosts the Prefect and MLflow management processes)
      • Memory: 4 GB
      • CPU: 2 vCPU
      • Disk: 10 GB
  • 1 Dataproc cluster
  • 1 SQL instance (DBMS for Prefect and MLflow)
  • 1 storage object on GCS (artifact storage)
  • Server 2 on-premise (hosts the Prefect worker)

Appendix

Appendix A: Record of Changes

Table: Record of changes

Version | Date       | Author     | Description of Change
0.1.0   | 06/04/2024 | Bao Truong | Initiation document