DSL Overview
The Kubeflow Pipelines DSL is a set of Python libraries that you can use to specify machine learning (ML) workflows, including pipelines and their components. (If you’re new to pipelines, see the conceptual guides to pipelines and components.)
The DSL compiler compiles your Python DSL code into a single static configuration (YAML) that the Pipeline Service can process. The Pipeline Service, in turn, converts the static configuration into a set of Kubernetes resources for execution.
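The two-stage flow described above, where Python code is compiled to one static document that a service later turns into runtime resources, can be illustrated with a toy model. This sketch does not use the Kubeflow Pipelines SDK: `ToyPipeline`, `add_step`, and `compile_to_static` are hypothetical names, the image paths are placeholders, and the output is JSON rather than YAML only so that the example stays within the Python standard library.

```python
import json


class ToyPipeline:
    """Hypothetical stand-in for a pipeline definition (not a kfp class)."""

    def __init__(self, name):
        self.name = name
        self.steps = []

    def add_step(self, name, image, depends_on=()):
        # Record the step declaratively; nothing is executed here.
        self.steps.append(
            {"name": name, "image": image, "depends_on": list(depends_on)}
        )
        return name


def compile_to_static(pipeline):
    """Serialize the recorded steps into one static document."""
    document = {"pipeline": pipeline.name, "steps": pipeline.steps}
    return json.dumps(document, indent=2)


toy = ToyPipeline("my-pipeline")
prep = toy.add_step("preprocess", "example.com/preprocess:latest")
toy.add_step("train", "example.com/train:latest", depends_on=[prep])

static_config = compile_to_static(toy)
print(static_config)
```

The point of the sketch is the separation of concerns: the Python code only describes the workflow, and everything needed to run it later is captured in the serialized document.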
Installing the DSL
The DSL is part of the Kubeflow Pipelines software development kit (SDK), which also includes Python libraries for interacting with the Kubeflow Pipelines APIs.
Follow the guide to installing the Kubeflow Pipelines SDK.
Introduction to main DSL functions and classes
This section introduces the DSL functions and classes that you use most often. You can see all classes and functions in the Kubeflow Pipelines DSL.
Pipelines
To create a pipeline, write your own pipeline function and use the DSL’s pipeline(name, description) function as a decorator.
Usage:
import kfp
from kfp.dsl import PipelineParam

@kfp.dsl.pipeline(
    name='My pipeline',
    description='My machine learning pipeline'
)
def my_pipeline(a: PipelineParam, b: PipelineParam):
    ...
Note: The Pipeline() class is not useful for creating pipelines. Instead, you should define your pipeline function and decorate it with @kfp.dsl.pipeline as described above. The class is useful for getting a pipeline object and its operations when implementing a compiler.
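For readers unfamiliar with decorators that take arguments, the following toy sketch shows the general pattern behind a call such as pipeline(name, description). It is not the SDK's implementation: the `pipeline` function below is a local stand-in, and the attribute names `pipeline_name` and `pipeline_description` are assumptions chosen for illustration.

```python
def pipeline(name, description):
    """Toy decorator factory (hypothetical; not the kfp implementation)."""

    def decorator(func):
        # Attach the metadata to the function and return it unchanged.
        func.pipeline_name = name
        func.pipeline_description = description
        return func

    return decorator


@pipeline(name="My pipeline", description="My machine learning pipeline")
def my_pipeline(a, b):
    return (a, b)


print(my_pipeline.pipeline_name)
```

Because the decorator returns the original function, the decorated function can still be called as usual, while tooling can read the attached metadata.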
Components
To create a component for your pipeline, write your own component function and use the DSL’s component(func) function as a decorator.
Usage:
import kfp
from kfp import dsl

@kfp.dsl.component
def my_component(my_param):
    ...
    return dsl.ContainerOp()
The above component decorator requires the function to return a ContainerOp instance. The main purpose of using this decorator is to enable DSL static type checking.
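The return-value requirement can be pictured with a toy decorator that checks what the wrapped function returns. This is only a sketch of the idea under stated assumptions: `ToyContainerOp` is a hypothetical stand-in for ContainerOp, the local `component` function is not the SDK's decorator, and the SDK may enforce the requirement differently.

```python
import functools


class ToyContainerOp:
    """Hypothetical stand-in for dsl.ContainerOp."""

    def __init__(self, name, image):
        self.name = name
        self.image = image


def component(func):
    """Toy decorator that rejects functions not returning a ToyContainerOp."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if not isinstance(result, ToyContainerOp):
            raise TypeError(
                f"{func.__name__} must return a ToyContainerOp, "
                f"got {type(result).__name__}"
            )
        return result

    return wrapper


@component
def good_component(my_param):
    return ToyContainerOp(name="echo", image="example.com/echo:latest")


@component
def bad_component(my_param):
    # Returns a plain string, so calling it raises TypeError.
    return "not an op"
```

Calling good_component("x") returns the op object, whereas calling bad_component("x") raises a TypeError at the point where the component is used.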
Pipeline parameters
The PipelineParam(object) class represents a data type that you can pass between pipeline components.
You can use a PipelineParam object as an argument in your pipeline function. The object is then a pipeline parameter that shows up in the Kubeflow Pipelines UI. A PipelineParam can also represent an intermediate value that you pass between components.
Usage as an argument in a pipeline function:
import kfp
from kfp import dsl

@kfp.dsl.pipeline(
    name='My pipeline',
    description='My machine learning pipeline'
)
def my_pipeline(
    my_num=dsl.PipelineParam(name='num-foos', value=1000),
    my_name=dsl.PipelineParam(name='my-name', value='some text'),
    my_url=dsl.PipelineParam(name='foo-url', value='http://example.com')):
    ...
The DSL supports auto-conversion from string to PipelineParam. You can therefore write the same function like this:
import kfp

@kfp.dsl.pipeline(
    name='My pipeline',
    description='My machine learning pipeline'
)
def my_pipeline(
    my_num='1000',
    my_name='some text',
    my_url='http://example.com'):
    ...
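One way to picture the auto-conversion is a helper that inspects a function's signature and wraps each plain default value in a parameter object. The sketch below is an illustration only, not the SDK's mechanism: `ToyParam` is a hypothetical stand-in for PipelineParam, and `collect_params` is an assumed helper name.

```python
import inspect
from dataclasses import dataclass


@dataclass
class ToyParam:
    """Hypothetical stand-in for dsl.PipelineParam."""

    name: str
    value: object = None


def collect_params(func):
    """Return a ToyParam for every argument of func."""
    params = {}
    for arg_name, arg in inspect.signature(func).parameters.items():
        default = arg.default
        if isinstance(default, ToyParam):
            # Already a parameter object; keep it as written.
            params[arg_name] = default
        elif default is inspect.Parameter.empty:
            # No default given; create a parameter without a value.
            params[arg_name] = ToyParam(name=arg_name)
        else:
            # Plain default (for example a string); wrap it.
            params[arg_name] = ToyParam(name=arg_name, value=default)
    return params


def my_pipeline(my_num="1000", my_name="some text", my_url="http://example.com"):
    ...


params = collect_params(my_pipeline)
print(params["my_num"])
```

In this toy version the parameter name is taken from the argument name, which is why the short form needs no explicit name for each parameter.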
See more about PipelineParam objects in the guide to building a component.
Types
The types module contains a list of types defined by the Kubeflow Pipelines SDK. Types include basic types like String, Integer, Float, and Bool, as well as domain-specific types like GCPProjectID and GCRPath.
See the guide to DSL static type checking.
Next steps
- See how to build a pipeline.
- Build a reusable component for sharing in multiple pipelines.
- Read about writing recursive functions in the DSL.