This section assumes that you have a working Kubernetes environment.
The external platform dependencies of Flyte are:
- An S3-compatible object storage used for task metadata and to retrieve data to be processed by workflows.
- A relational database.
In this tutorial, we'll use Minio with a single bucket as the object storage provider and Postgres as the relational database. These two elements are configured to retain data even if the corresponding Pod is deleted.
NOTE: if you plan to run Flyte on a K8s environment with multiple nodes, the instructions in this section should be generally useful regardless of the number of K8s worker and control plane nodes. Also, to provide shared storage for your environment, make sure to check out the supported minio topologies and supported backend storage systems.
- Prepare your K8s cluster to provision Persistent Volumes:
microk8s enable hostpath-storage
NOTE: for other K8s distributions, verify the provisioner available for local storage, typically associated with a StorageClass (kubectl get storageclass). If there isn't any, consider using this implementation of the hostpath provisioner. To learn more about how Kubernetes handles data persistence, go to the docs.
PersistentVolumeClaims created by the hostpath storage provisioner are bound to the local node, so it is impossible to move them to a different node. For multi-node K8s environments, use the StorageClass surfaced by your shared storage backend.
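As a rough illustration of the StorageClass check described above, the following sketch summarizes the output of kubectl get storageclass -o json. The sample document is illustrative (it mirrors what a MicroK8s hostpath setup is assumed to report), and the helper name summarize_storage_classes is hypothetical. The only Kubernetes-specific detail it relies on is the storageclass.kubernetes.io/is-default-class annotation.

```python
import json

# Annotation Kubernetes uses to mark the default StorageClass.
DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"


def summarize_storage_classes(sc_list):
    """Return (name, provisioner, is_default) for every StorageClass in a
    parsed `kubectl get storageclass -o json` document."""
    summary = []
    for item in sc_list.get("items", []):
        meta = item.get("metadata", {})
        annotations = meta.get("annotations") or {}
        is_default = annotations.get(DEFAULT_ANNOTATION) == "true"
        summary.append((meta.get("name"), item.get("provisioner"), is_default))
    return summary


# Illustrative sample only; your cluster's output will differ.
SAMPLE = json.loads("""
{
  "items": [
    {
      "metadata": {
        "name": "microk8s-hostpath",
        "annotations": {"storageclass.kubernetes.io/is-default-class": "true"}
      },
      "provisioner": "microk8s.io/hostpath"
    }
  ]
}
""")

if __name__ == "__main__":
    for name, provisioner, is_default in summarize_storage_classes(SAMPLE):
        marker = " (default)" if is_default else ""
        print(f"{name}: {provisioner}{marker}")
```

An empty result would suggest that no provisioner is installed yet, which is the case the NOTE above addresses.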
- Download the manifest that will deploy the Flyte dependencies:
curl -sL https://raw.githubusercontent.com/davidmirror-ops/flyte-the-hard-way/main/docs/on-premises/single-node/manifests/onprem-flyte-dependencies.yaml > onprem-flyte-dependencies.yaml
- Make sure to adjust sensitive values like MINIO_ROOT_PASSWORD and POSTGRES_PASSWORD before submitting the manifest:
kubectl apply -f onprem-flyte-dependencies.yaml
Example output:
namespace/flyte created
persistentvolumeclaim/postgresql-pvc created
persistentvolumeclaim/minio-pvc created
service/postgres created
deployment.apps/postgres created
deployment.apps/minio created
service/minio created
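One possible way to adjust the sensitive values before the manifest is applied is to generate them instead of typing them by hand. The sketch below is text-based and assumes that the manifest defines each value as a container environment entry of the form "- name: KEY" followed by "value: ..." on the next line. That layout is an assumption about onprem-flyte-dependencies.yaml, so inspect the file first. The helper name rotate_env_values is hypothetical.

```python
import re
import secrets

SENSITIVE_KEYS = ("MINIO_ROOT_PASSWORD", "POSTGRES_PASSWORD")


def rotate_env_values(manifest_text, keys=SENSITIVE_KEYS):
    """Replace the `value:` that follows `- name: <KEY>` with a random string.

    Returns the updated manifest text and a dict of the generated values.
    Assumes (not verified against the real manifest) that each key is a
    plain env entry with `name:` and `value:` on consecutive lines.
    """
    new_values = {}
    for key in keys:
        new_values[key] = secrets.token_urlsafe(24)
        pattern = re.compile(
            r"(-\s*name:\s*" + re.escape(key) + r"\s*\n\s*value:\s*).*"
        )
        # A function replacement avoids any backslash interpretation.
        manifest_text = pattern.sub(
            lambda m, k=key: m.group(1) + '"' + new_values[k] + '"',
            manifest_text,
        )
    return manifest_text, new_values


if __name__ == "__main__":
    with open("onprem-flyte-dependencies.yaml", encoding="utf-8") as fh:
        updated, generated = rotate_env_values(fh.read())
    with open("onprem-flyte-dependencies.yaml", "w", encoding="utf-8") as fh:
        fh.write(updated)
    print("Generated values for:", ", ".join(generated))
```

If you change these values, whatever consumes them later must use the same ones: the database password in the secret and the storage credentials in the Helm values file have to match what the manifest sets.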
- Verify that both the minio and postgres Pods are in Running state:
kubectl get pods -n flyte
Example output:
NAME READY STATUS RESTARTS AGE
postgres-6f6bb8bff7-9sjnj 1/1 Running 0 75s
minio-7d795cd5d8-dlk54 1/1 Running 0 75s
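The check above can also be scripted. This minimal sketch inspects a parsed kubectl get pods -n flyte -o json document and reports any dependency Pod that is missing, not in the Running phase, or not ready. The helper name pods_not_ready and the name-prefix matching are illustrative assumptions. The fields it reads (metadata.name, status.phase, status.containerStatuses[].ready) are standard Pod fields.

```python
def pods_not_ready(pod_list, required_prefixes=("minio", "postgres")):
    """Return a dict describing problems; an empty dict means all is well.

    `pod_list` is the parsed output of `kubectl get pods -n flyte -o json`.
    Pods are matched by name prefix, which is an assumption of this sketch.
    """
    problems = {}
    items = pod_list.get("items", [])
    for prefix in required_prefixes:
        matches = [p for p in items if p["metadata"]["name"].startswith(prefix)]
        if not matches:
            problems[prefix] = "no Pod found"
            continue
        for pod in matches:
            status = pod.get("status", {})
            phase = status.get("phase")
            containers = status.get("containerStatuses", [])
            all_ready = bool(containers) and all(c.get("ready") for c in containers)
            if phase != "Running" or not all_ready:
                problems[pod["metadata"]["name"]] = f"phase={phase}, ready={all_ready}"
    return problems


if __name__ == "__main__":
    import json
    import subprocess

    raw = subprocess.run(
        ["kubectl", "get", "pods", "-n", "flyte", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    print(pods_not_ready(json.loads(raw)) or "minio and postgres are Running")
```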
- Add the Flyte Helm repo:
helm repo add flyteorg https://flyteorg.github.io/flyte
At this point the dependencies required by Flyte are ready. You can now choose which form factor to deploy:
- Single binary: all Flyte components (flyteadmin, flytepropeller, flyteconsole, etc.) packaged into a single Pod. This is useful for environments with limited resources and a need for quick setup.
- Core: all components run as standalone Pods, potentially with a different number of replicas each. This is required for multi-K8s-cluster environments.
You can only have one of these form factors on a single K8s cluster. The following sections guide you through the setup process for each.
- To avoid saving the DB password in plain text in the values file, we leverage a feature of the flyte-binary chart that allows it to consume pre-created secrets:
- Create an external secret containing the DB password:
cat <<EOF >local-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: flyte-binary-inline-config-secret
  namespace: flyte
type: Opaque
stringData:
  202-database-secrets.yaml: |
    database:
      postgres:
        password: "postgres"
EOF
- Submit the manifest:
kubectl create -f local-secret.yaml
- Describe the secret:
kubectl describe secret flyte-binary-inline-config-secret -n flyte
Example output:
Name: flyte-binary-inline-config-secret
Namespace: flyte
Labels: <none>
Annotations: <none>
Type: Opaque
Data
====
202-database-secrets.yaml: 48 bytes
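kubectl describe only reports the size of each key because Kubernetes stores the stringData entries base64-encoded under the Secret's data field. The sketch below shows that relationship for the inline configuration fragment used above. The helper names are hypothetical, and the fragment layout simply mirrors the manifest in this tutorial.

```python
import base64
import json


def inline_db_config(password):
    """Build the YAML fragment stored under `202-database-secrets.yaml`.

    json.dumps produces a double-quoted string, which is also a valid
    YAML scalar, so special characters in the password stay escaped.
    """
    return (
        "database:\n"
        "  postgres:\n"
        f"    password: {json.dumps(password)}\n"
    )


def encode_secret_value(text):
    """Encode text the way it appears under `data:` in a Secret."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Decode a value read from a Secret's `data:` field."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    fragment = inline_db_config("postgres")
    encoded = encode_secret_value(fragment)
    print(encoded)
    print(decode_secret_value(encoded))
```

Keep in mind that base64 is an encoding, not encryption: anyone who can read the Secret can recover the password.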
- Download the values file:
curl -sL https://raw.githubusercontent.com/davidmirror-ops/flyte-the-hard-way/main/docs/on-premises/single-node/manifests/onprem-flyte-binary-values.yaml > onprem-flyte-binary-values.yaml
- Install Flyte:
helm install flyte-binary flyteorg/flyte-binary --values onprem-flyte-binary-values.yaml -n flyte
Example output:
NAME: flyte-binary
LAST DEPLOYED: Wed Aug 23 19:12:23 2023
NAMESPACE: flyte
STATUS: deployed
REVISION: 1
TEST SUITE: None
- Verify the flyte-binary Pod is in Running state:
kubectl get pods -n flyte
Example output:
NAME READY STATUS RESTARTS AGE
postgres-6f6bb8bff7-9sjnj 1/1 Running 0 30m
minio-7d795cd5d8-dlk54 1/1 Running 0 30m
flyte-binary-58d779b9d8-z2hzs 1/1 Running 0 23s
Congratulations!
You have set up Flyte single binary. Now, learn how to connect to your Flyte instance.
The following configuration requests about 3 CPU cores and 3 GB of memory for the different Flyte components without accounting for workflow executions.
- Download the values file:
curl -sL https://raw.githubusercontent.com/davidmirror-ops/flyte-the-hard-way/main/docs/on-premises/single-node/manifests/onprem-flyte-core-values.yaml > onprem-flyte-core-values.yaml
- Review the values file if you need to change anything in the userSettings section.
- Install the flyte-core Helm chart:
helm install flyte-core flyteorg/flyte-core --values onprem-flyte-core-values.yaml -n flyte
Example output:
NAME: flyte-core
LAST DEPLOYED: Fri Mar 8 11:09:10 2024
NAMESPACE: flyte
STATUS: deployed
REVISION: 1
TEST SUITE: None
- Wait for the Pods to come up:
kubectl get po -n flyte
NAME READY STATUS RESTARTS AGE
postgres-d56745848-7dkhl 1/1 Running 4 (3d ago) 16d
minio-758b9b5d86-s2tnl 1/1 Running 3 (3d ago) 7d21h
syncresources-7cdd9f468c-kzndm 1/1 Running 0 58s
flyteconsole-856d9c594b-qmjv8 1/1 Running 0 58s
datacatalog-bddddcc47-lnmhk 1/1 Running 0 58s
flytepropeller-6dbb9f8cb5-w7wsn 1/1 Running 0 58s
flyte-pod-webhook-867c44bdd4-thrth 1/1 Running 0 58s
flyteadmin-66cb66764d-j7cx2 1/1 Running 0 58s
flytescheduler-579b6cb648-jmmgm 1/1 Running 0 58s
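Waiting for the Pods can be automated with a small polling loop. The following sketch parses the plain-text table printed by kubectl get pods and retries until every Pod reports Running with all containers ready. It assumes that every Pod in the namespace is expected to stay in Running state, as in the output above. The function names, timeout, and interval are illustrative choices.

```python
import subprocess
import time


def parse_pod_table(table_text):
    """Map Pod name -> (ready, status) from `kubectl get pods` plain output."""
    pods = {}
    for line in table_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "NAME":
            continue  # skip the header and malformed lines
        pods[fields[0]] = (fields[1], fields[2])
    return pods


def all_running(pods):
    """True when at least one Pod exists and all are Running and fully ready."""
    def fully_ready(ready):
        have, _, want = ready.partition("/")
        return want != "" and have == want

    return bool(pods) and all(
        status == "Running" and fully_ready(ready)
        for ready, status in pods.values()
    )


def kubectl_get_pods(namespace="flyte"):
    """Run kubectl and return its table output as text."""
    return subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace],
        check=True, capture_output=True, text=True,
    ).stdout


def wait_for_pods(fetch=kubectl_get_pods, timeout_s=300, interval_s=5,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch` until all Pods are Running or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        pods = parse_pod_table(fetch())
        if all_running(pods):
            return pods
        if clock() >= deadline:
            raise TimeoutError(f"Pods not ready after {timeout_s}s: {pods}")
        sleep(interval_s)


if __name__ == "__main__":
    for name, (ready, status) in wait_for_pods().items():
        print(f"{name}: {ready} {status}")
```

The fetch, sleep, and clock parameters are injectable so the loop can be exercised without a cluster.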
- Configure your Flyte config file for local connections (typically located at $HOME/.flyte/config.yaml):
If you haven't done so, install flytectl and run flytectl config init so the config file is created. See the flytectl installation instructions for details.
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: localhost:8089
  authType: Pkce
  insecure: true
logger:
  show-source: true
  level: 6
- Create a local DNS entry so the Flyte CLI connects to the minio service using its FQDN:
- In an OSX environment:
sudo vi /etc/hosts
- Add a new entry with the minio service name:
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 minio.flyte.svc.cluster.local
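To confirm the entry was saved correctly, a quick check like the following can help. It is a minimal sketch that parses hosts-file text and returns the address mapped to a given name. The function name resolve_in_hosts is hypothetical, and the /etc/hosts path applies to macOS and Linux.

```python
def resolve_in_hosts(hosts_text, hostname):
    """Return the first IP address mapped to `hostname`, or None.

    Comments (everything after '#') and blank lines are ignored, and
    host names are compared case-insensitively.
    """
    wanted = hostname.lower()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        if wanted in (name.lower() for name in names):
            return ip
    return None


if __name__ == "__main__":
    with open("/etc/hosts", encoding="utf-8") as fh:
        ip = resolve_in_hosts(fh.read(), "minio.flyte.svc.cluster.local")
    print(ip or "entry not found")
```

For this tutorial the expected result is 127.0.0.1, since the minio service is reached through a local port-forward.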
- In three different terminal windows, start three port-forwarding sessions. As each Helm chart uses different Services and ports, the commands are different.
For flyte-binary:
kubectl -n flyte port-forward service/minio 9000:9000
kubectl -n flyte port-forward service/flyte-binary-grpc 8089:8089
kubectl -n flyte port-forward service/flyte-binary-http 8088:8088
For flyte-core:
kubectl -n flyte port-forward service/minio 9000:9000
kubectl -n flyte port-forward service/flyteadmin 8089:81
kubectl -n flyte port-forward service/flyteconsole 8088:80
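Before running a workflow, it can be useful to confirm that all three port-forwarding sessions are actually listening. This is a minimal sketch using only the Python standard library. The local ports (9000, 8089, 8088) are taken from the commands above, while the labels and function names are illustrative.

```python
import socket

# Local ports opened by the port-forward commands in this tutorial.
FORWARDED_PORTS = {
    "minio": 9000,
    "flyte gRPC": 8089,
    "flyte HTTP/console": 8088,
}


def port_is_open(host, port, timeout_s=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_forwards(ports=FORWARDED_PORTS, host="127.0.0.1"):
    """Return {label: bool} indicating which forwarded ports accept connections."""
    return {label: port_is_open(host, port) for label, port in ports.items()}


if __name__ == "__main__":
    for label, is_open in check_forwards().items():
        print(f"{label}: {'listening' if is_open else 'NOT reachable'}")
```

A successful TCP connection only shows that kubectl is listening locally. It does not prove that the Service behind it is healthy.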
- Save the following "hello world" workflow definition:
cat <<EOF >hello_world.py
from flytekit import task, workflow


@task
def say_hello() -> str:
    return "hello world"


@workflow
def my_wf() -> str:
    res = say_hello()
    return res


if __name__ == "__main__":
    print(f"Running my_wf() {my_wf()}")
EOF
- Execute the workflow on the Flyte cluster:
pyflyte run --remote hello_world.py my_wf
Example output:
Go to http://localhost:8089/console/projects/flytesnacks/domains/development/executions/f0c602e28c5c84d46b22 to see execution in the console.
NOTE: contrary to what the CLI output indicates, use port 8088 instead of 8089 to connect to the UI.
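Because the printed link points at the gRPC port, it has to be adjusted before it opens in a browser. The sketch below rewrites the port with the standard library. The function name console_url is hypothetical, and 8088 is the local HTTP port used by the port-forward commands in this tutorial.

```python
from urllib.parse import urlsplit, urlunsplit


def console_url(cli_url, console_port=8088):
    """Return `cli_url` with its port replaced by the console's local port."""
    parts = urlsplit(cli_url)
    host = parts.hostname or "localhost"
    return urlunsplit(
        (parts.scheme, f"{host}:{console_port}", parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    printed = (
        "http://localhost:8089/console/projects/flytesnacks/"
        "domains/development/executions/f0c602e28c5c84d46b22"
    )
    print(console_url(printed))
```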
Congratulations!
You have a working Flyte instance running on a local Kubernetes environment. Head over to the next sections to productionize your deployment.