Install Kubeflow
This guide describes how to use the kfctl.sh script to deploy Kubeflow on Amazon Web Services (AWS).
Note: Amazon Web Services (AWS) is moving from kfctl.sh to a command line interface (CLI), which gives you more control over your configuration and better reliability.
Prerequisites
- Install kubectl
- Install and configure the AWS Command Line Interface (AWS CLI):
  - Install the AWS Command Line Interface.
  - Configure the AWS CLI by running the following command: aws configure
    - Enter your Access Keys (Access Key ID and Secret Access Key).
    - Enter your preferred AWS Region and default output options.
- Install eksctl (version 0.1.27 or newer) and the aws-iam-authenticator.
- Install jq.
- Install ksonnet.
You do not need to have an existing Amazon Elastic Container Service for Kubernetes (Amazon EKS) cluster. The deployment process will create a cluster for you.
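As a quick sanity check, the prerequisite tools listed above can be looked up on your PATH before you start. This is a minimal sketch: check_tools is a hypothetical helper, not part of kfctl.sh, and it assumes the usual binary names (aws for the AWS CLI, ks for ksonnet). It does not check versions.

```shell
# Hypothetical helper: print each tool that is not found on PATH.
# Returns 0 when every tool is present and 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Binary names are assumptions: "aws" is the AWS CLI and "ks" is ksonnet.
check_tools kubectl aws eksctl aws-iam-authenticator jq ks \
  || echo "Install the missing tools before continuing."
```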
Understanding the deployment process
The deployment process is controlled by four commands:
- init - The initial one-time setup.
- generate - Creates the configuration files that define your various resources.
- apply - Creates or updates the resources.
- delete - Deletes the resources.
With the exception of init, all commands take an argument that describes the set of resources to apply the command to. This argument can be one of the following:
- platform - All AWS resources; that is, anything that doesn’t run on Kubernetes, such as IAM policy attachments and Amazon EKS cluster creation.
- k8s - All Kubernetes resources, such as Kubeflow packages and add-on packages like fluentd or istio.
- all - Both AWS and Kubernetes resources.
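The command and argument rules above can be summarized as a small validation function. This is only an illustrative sketch: kfctl_args_ok is a hypothetical helper, and kfctl.sh performs its own argument handling, which may differ.

```shell
# Hypothetical helper: succeeds when COMMAND [RESOURCE_SET] follows the rules
# described above. init takes no resource set; the other commands need one.
kfctl_args_ok() {
  cmd=$1
  target=${2:-}
  case "$cmd" in
    init)
      return 0 ;;
    generate|apply|delete)
      case "$target" in
        platform|k8s|all) return 0 ;;
        *) return 1 ;;
      esac ;;
    *)
      return 1 ;;
  esac
}

kfctl_args_ok generate platform && echo "generate platform: ok"
kfctl_args_ok apply pods || echo "apply pods: rejected"
```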
App layout
Your Kubeflow app directory contains the following files and directories:
- app.yaml - Defines the configuration related to your Kubeflow deployment.
  - These values are set when you run kfctl init.
  - These values are snapshotted inside app.yaml to make your app self-contained.
- ${KFAPP}/aws_config - A directory that contains a sample eksctl cluster configuration file that defines the AWS cluster, and policy files to attach to your node group roles.
  - This directory is created when you run kfctl.sh generate platform.
  - You can modify the cluster_config.yaml and cluster_features.sh files to customize your AWS infrastructure.
- ${KFAPP}/k8s_specs - A directory that contains YAML specifications for daemons deployed on your Kubernetes cluster.
- ${KFAPP}/ks_app - A directory that contains the ksonnet application for Kubeflow.
  - The directory is created when you run kfctl generate k8s.
  - You can use ksonnet to customize Kubeflow.
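To see how far a deployment has progressed, you can check which of the entries described above already exist in the app directory. This is a minimal sketch; show_app_layout is a hypothetical helper, and the fallback directory name kfapp is simply the example value used in this guide.

```shell
# Hypothetical helper: report which entries of the app layout exist so far.
show_app_layout() {
  app_dir=$1
  for entry in app.yaml aws_config k8s_specs ks_app; do
    if [ -e "$app_dir/$entry" ]; then
      echo "present: $entry"
    else
      echo "not yet generated: $entry"
    fi
  done
}

# Uses $KFAPP when it is set, otherwise the example directory name.
show_app_layout "${KFAPP:-kfapp}"
```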
The provisioning scripts can either bring up a new cluster and install Kubeflow on it, or you can install Kubeflow on your existing cluster. We recommend that you create a new cluster for better isolation.
If you experience any issues running these scripts, see the troubleshooting guidance for more information.
Kubeflow installation
- Run the following commands to download the latest kfctl.sh:
  export KUBEFLOW_SRC=/tmp/kubeflow-aws
  export KUBEFLOW_TAG=master
  mkdir -p ${KUBEFLOW_SRC} && cd ${KUBEFLOW_SRC}
  curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
  - KUBEFLOW_SRC - Full path to your preferred download directory. Use an absolute path, for example /tmp/kubeflow-aws.
  - KUBEFLOW_TAG - The branch or tag of the kubeflow/kubeflow repository to download from, as used in the URL above.
- Run the following commands to set up your environment and initialize the cluster.
  Note: If you would like to install Kubeflow on your existing EKS cluster, skip this step and follow the setup instructions for an existing cluster instead. When you are finished, return here and resume with the next step.
  export KFAPP=kfapp
  export REGION=us-west-2
  export AWS_CLUSTER_NAME=kubeflow-aws
  ${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform aws \
  --awsClusterName ${AWS_CLUSTER_NAME} \
  --awsRegion ${REGION}
  - AWS_CLUSTER_NAME - Specify a unique name for your Amazon EKS cluster.
  - KFAPP - Use a relative directory name here rather than an absolute path, such as kfapp.
  - REGION - Use the AWS Region you want to create your cluster in.
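Because KUBEFLOW_SRC is expected to be an absolute path while KFAPP is expected to be a relative directory name, it can help to check both before running init. This is a sketch only: is_absolute_path is a hypothetical helper, and the fallback values are the examples from this guide.

```shell
# Hypothetical helper: succeeds when the argument starts with "/".
is_absolute_path() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Fall back to the example values from this guide when the variables are unset.
KUBEFLOW_SRC=${KUBEFLOW_SRC:-/tmp/kubeflow-aws}
KFAPP=${KFAPP:-kfapp}

is_absolute_path "$KUBEFLOW_SRC" \
  || echo "KUBEFLOW_SRC should be an absolute path: $KUBEFLOW_SRC"
if is_absolute_path "$KFAPP"; then
  echo "KFAPP should be a relative directory name: $KFAPP"
fi
```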
- Generate and apply platform changes.
  You can customize your cluster configuration, control plane logging, and private cluster endpoint access before you apply platform. See Customizing Kubeflow on AWS for more information.
  cd ${KFAPP}
  ${KUBEFLOW_SRC}/scripts/kfctl.sh generate platform
  # Customize your Amazon EKS cluster configuration before following the next step
  ${KUBEFLOW_SRC}/scripts/kfctl.sh apply platform
- Generate and apply the Kubernetes changes.
  ${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s
  Important: By default, these scripts create an AWS Application Load Balancer for Kubeflow that is open to the public. This is acceptable for development testing and for short-term use, but we do not recommend this configuration for production workloads.
  To secure your installation, you have two options:
  - Disable ingress before you apply k8s. Open ${KUBEFLOW_SRC}/${KFAPP}/env.sh and edit the KUBEFLOW_COMPONENTS environment variable. Delete ,\"alb-ingress-controller\",\"istio-ingress\" and save the file.
  - Follow the instructions to add authentication before you apply k8s.
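The first option, deleting the two ingress components from env.sh, can also be scripted with sed instead of a manual edit. This is a sketch under assumptions: disable_ingress is a hypothetical helper, and the sample line in the demonstration is made up to show the substitution, not copied from a real env.sh. Review the file afterwards, since the match depends on the exact text being present.

```shell
# Hypothetical helper: delete the literal text
#   ,\"alb-ingress-controller\",\"istio-ingress\"
# from the given file. A backup copy is kept next to it with a .bak suffix.
disable_ingress() {
  env_file=$1
  sed -i.bak 's/,\\"alb-ingress-controller\\",\\"istio-ingress\\"//' "$env_file"
}

# Demonstration on a scratch file. The line format below is an assumption.
demo=$(mktemp)
printf '%s\n' 'KUBEFLOW_COMPONENTS=\"ambassador\",\"alb-ingress-controller\",\"istio-ingress\",\"jupyter\"' > "$demo"
disable_ingress "$demo"
cat "$demo"
```

For the real file, the call would be disable_ingress ${KUBEFLOW_SRC}/${KFAPP}/env.sh, run before apply k8s.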
  Once your customization is done, run this command to deploy Kubeflow:
  ${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s
- Wait for all the resources to become ready in the kubeflow namespace.
  kubectl -n kubeflow get all
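Rather than reading the full listing by eye, you can filter pod output down to the pods that are not ready yet. This is a rough sketch: not_ready_pods is a hypothetical helper that assumes the default NAME READY STATUS RESTARTS AGE column layout, and the pod names in the sample input are made up.

```shell
# Hypothetical helper: read pod listing lines (without a header row) on stdin
# and print the names of pods that are neither Completed nor fully ready,
# where "fully ready" means STATUS is Running and READY shows n/n.
not_ready_pods() {
  awk '{
    if ($3 == "Completed") next
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Made-up sample input in the NAME READY STATUS RESTARTS AGE layout.
not_ready_pods <<'EOF'
example-a-0   1/1   Running     0   5m
example-b-0   0/1   Pending     0   5m
example-c-0   1/2   Running     0   5m
example-d-0   0/1   Completed   0   5m
EOF
```

Against a live cluster, the input would come from kubectl -n kubeflow get pods --no-headers piped into not_ready_pods. An empty result suggests that every pod is ready.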
- Open Kubeflow Dashboard
  - If you chose to use a load balancer, you can retrieve the public DNS name with the following command.
    kubectl get ingress -n istio-system
    NAMESPACE      NAME            HOSTS   ADDRESS                                                              PORTS   AGE
    istio-system   istio-ingress   *       a743484b-istiosystem-istio-2af2-xxxxxx.us-west-2.elb.amazonaws.com   80      1h
    This deployment may take 3-5 minutes to become ready. Verify that the address works by opening it in your preferred Internet browser. You can also run kubectl delete ingress istio-ingress -n istio-system to remove the load balancer entirely.
  - If you didn’t create a load balancer, use port-forwarding to visit your cluster. Run the following command and visit localhost:8080.
    kubectl port-forward -n kubeflow `kubectl get pods -n kubeflow --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}'` 8080:80
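If you use the load balancer, the address can also be read directly from the ingress status instead of the table output. This is a sketch, assuming the ingress is named istio-ingress in the istio-system namespace as in the output above, and that it is served over plain HTTP on port 80. dashboard_url is a hypothetical helper.

```shell
# Hypothetical helper: print an http:// URL for the Kubeflow load balancer.
# Fails when kubectl fails or when no hostname has been assigned yet.
dashboard_url() {
  host=$(kubectl get ingress istio-ingress -n istio-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') || return 1
  [ -n "$host" ] || return 1
  echo "http://$host"
}
```

Calling dashboard_url a few minutes after apply k8s should print the address once the load balancer has been provisioned. Until then it returns a non-zero status.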