Building and Deploying a Pipeline on VAST DataEngine


Overview

A pipeline may feature any number of sequences connecting triggers and functions. A sequence typically begins with a trigger, which invokes a function. That function can in turn invoke another function, and so on. Your pipeline can feature multiple sequences, and each function deployment within the pipeline has an independent deployment configuration.

Building a pipeline involves configuring triggers and functions, connecting them in sequences, configuring the deployment of each function, and finally deploying the pipeline.

Workflow

  1. Review the types of triggers that you can use:

    • Element triggers. These triggers watch for events where elements are added or removed from a view or where tags are added or removed from elements in a view.  

    • Schedule triggers. These triggers issue events on a schedule.

  2. Design your pipeline, which will consist of deployment sequences of triggers and functions.

  3. Optionally, create triggers and functions before you create the pipeline. You can also do this after creating the pipeline resource or, if applicable, use existing triggers and functions.

  4. Create the Pipeline Resource

  5. Build the Pipeline. At this stage, you can select triggers and functions that are already available, or create new ones. This stage includes configuring the deployment of each function. You can save a draft of the pipeline at any stage if it is not ready to be deployed.

  6. Deploy the Pipeline. Once it is deployed, you can observe logs and traces through the VAST DataEngine web user interface. You can also leverage the DataEngine API for telemetry observation.

Create the Pipeline Resource

  1. From the left navigation menu, select Pipeline Management.

  2. Click Create New Pipeline.

  3. Complete the fields:

    Pipeline Name

    Enter a name for the pipeline.

    Description

    Enter a description for the pipeline.

    Kubernetes Cluster

    From the dropdown, select the Kubernetes cluster that you want to use to execute the pipeline.

    Namespace

    Select which namespace to use.

    Pipeline Secret Keys

    Use this field to pass a secret to all function deployments in the pipeline. The secret can contain multiple key-value pairs, such as access keys for accessing data and metadata of S3 buckets.

    Note

    You can also specify secrets for individual function deployments in their individual configurations.

    You can enable functions to access these secret keys by using the ctx.secrets function of the VAST DataEngine runtime SDK in your function code.

    You can either enter the secrets manually or upload them as a YAML file.

    To enter secrets manually:

    1. Click Create secret.

    2. Enter the secret name in the Secret Name field.

    3. In the Key and Value fields provided, enter one of the keys of the secret and a valid value for it.

    4. Click the add button to add another key-value pair row, repeating as needed until you have added all the keys for the secret.

    5. To add another secret, click Create Another secret and repeat the above steps.

    To upload a secret in a .yaml file, click Import Yaml File.

    To remove all secrets, click Clear All Secrets.

    Environment Variables

    Environment variables are key–value pairs that you can define for your functions to access at runtime. They are part of the execution environment provided to your functions.

    You will be able to define environment variables per function deployment when you build the pipeline.

    In this field, define any environment variables that you want to provide to all functions in the pipeline sequence. Function-level environment variables override pipeline-level ones when both define the same key with different values.

    To add environment variables, do one of the following:

    • Enter the variables:

      1. Click Add Variable.

      2. Enter a key in the Key field and a value in the Value field.

      3. If you want to add another variable, click the Add button and enter another pair.

      4. Repeat until you have added all the variables you want to add.

    • Import environment variables from a file: click Import ConfigMap.

  4. Click Create Pipeline.

    The pipeline resource is created and the Visual Builder opens, enabling you to build the pipeline.
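The exact file schema expected by Import Yaml File is not specified here. As an assumption only, a secrets file would name each secret and list its key-value pairs, along these lines (the secret name, keys, and values below are placeholders):

```yaml
# Placeholder sketch only -- confirm the exact schema expected by
# Import Yaml File in the product documentation before using this.
name: s3-credentials
keys:
  access_key: AKIAEXAMPLE
  secret_key: EXAMPLESECRETKEY
```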

Build the Pipeline

The Visual Builder enables you to add the triggers and functions that you want to build into the pipeline and to connect them to each other. The pipeline can consist of multiple pipeline flows, with triggers invoking functions and functions invoking functions.

Note

You cannot connect a trigger to another trigger. You cannot connect elements in a loop.

During the process, you can click Save Draft to save a draft at any time. When you are done, you can deploy the pipeline.

The following actions are available to help you build the pipeline:

Drag Pipeline Elements into the Builder

Drag and drop triggers and functions into the visual builder to build your pipeline:

  • To the left of the Visual Builder, select Triggers or Functions to see the library of existing triggers and functions.

  • To add a trigger or function to the pipeline, drag it into the Visual Builder from the list on the left.

Inspect a Trigger or Function in the Visual Builder

To see the properties of any pipeline element in the visual builder, select the element and click the Inspect button to the right of the builder. The element's properties appear on the right.

For a function, the following information is shown:  the container registry where the image is stored, the image source, and the image tag.

For a trigger, the following information is shown: the source view, broker, topic, and the type of trigger.

Search Triggers and Functions

To the left of the Visual Builder, select Triggers or Functions, click the search box and enter a string to search for triggers or functions by name.

Create New Triggers and Functions

  • To create a new trigger, select the Triggers tab at the left of the Visual Builder and click Create New Trigger.  

    Follow the procedure described here.

    The new trigger is now visible in the library of triggers in the left panel.  

  • To create a new function, select the Functions tab at the left of the Visual Builder and click Create New Function.  

    Follow the procedure here.  

Connect Triggers and Functions

You can build into the pipeline a flow in which a trigger invokes a function. When deployed, the pipeline will execute the function on each event of the trigger. You can also connect a function to another function so that the pipeline executes the additional function on the output of the first function.  Within a pipeline, you might have multiple flows, each starting with a different trigger.

Connecting Elements to Each Other

To connect a trigger or function to a function, drag the open handle (a) of the first element to the solid handle (b) of the target function.

[Image: handles.png]

A connector appears, showing the direction of invocation (the selected trigger will invoke the selected function):

[Image: connectedtriggerfunction.png]

Disconnecting Elements from Each Other

To disconnect an element from another element, select the arrow and either drag it into the open space or press Backspace or Delete on your keyboard. The link disappears.

Removing Elements from the Builder

To remove a trigger or function from the builder, select the trigger or function and press the Backspace key on your keyboard.

Configure Function Deployment

Each function deployment included in the pipeline has its own configuration that determines the execution environment provided to the function. Every function deployment starts with a default configuration, which you can edit as needed.

To edit the configuration of a function deployment:

  1. Select (click) the function in the Visual Builder. The function deployment configuration appears on the right.

  2. If you want to change which revision of the function to deploy, select an alternate revision from the Revision Number dropdown.

  3. To edit other parameters, click the Edit button next to the section heading, and edit the values in the fields.

    You can configure the following parameters:

Secret Keys

Add secret keys to pass a secret to the specific function deployment. The secret can contain multiple key-value pairs, such as access keys for accessing data and metadata of S3 buckets.

Note

Secret keys that are specified in the pipeline configuration are passed to all function deployments.

You can enable functions to access these secret keys by using the ctx.secrets function of the VAST DataEngine runtime SDK in your function code. See The VAST DataEngine Runtime SDK Guide for details.

To add secrets:

  1. Click Create secret.

  2. Enter the secret name in the Secret Name field.

  3. In the Key and Value fields provided, enter one of the keys of the secret and a valid value for it.

  4. Click the add button to add another key-value pair row, repeating as needed until you have added all the keys for the secret.

  5. To add another secret, click Create Another secret and repeat the above steps.
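The steps above make a secret such as S3 credentials available to the function at runtime through ctx.secrets. The real context object is supplied by the VAST DataEngine runtime; the sketch below is illustrative only, and it assumes ctx.secrets takes a secret name and returns the secret's key-value pairs as a dict (check The VAST DataEngine Runtime SDK Guide for the actual signature). The secret name "s3-credentials" is a placeholder, and the stub context stands in for the real runtime:

```python
# Illustrative sketch only: the real context object is provided by the
# VAST DataEngine runtime. This stub assumes ctx.secrets(name) returns
# a dict of the named secret's key-value pairs.
class StubContext:
    def __init__(self, secrets):
        self._secrets = secrets

    def secrets(self, name):
        # Return the key-value pairs of the named secret.
        return self._secrets[name]


def handler(ctx, event):
    # Read the hypothetical "s3-credentials" secret configured on the
    # pipeline or on this function deployment.
    creds = ctx.secrets("s3-credentials")
    return f"using access key {creds['access_key']}"


ctx = StubContext({"s3-credentials": {"access_key": "AKIAEXAMPLE",
                                      "secret_key": "EXAMPLESECRETKEY"}})
print(handler(ctx, event=None))
```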

Deployment Resources

Most of these resources have minimum and maximum values. Minimum values are resources guaranteed to the function deployment. Maximum values are limits on the resources the function deployment can consume.

Concurrency

The minimum and maximum number of function instances (pods) to deploy on the Kubernetes cluster.

CPU

The minimum CPU to guarantee for the function and the maximum CPU to allow the function.

Memory

The minimum memory to guarantee for the function and the maximum memory to allow the function.

Autoscaling RPS factor

The rate of requests per second at which autoscaling should begin.

Disk (Ephemeral)

This is not provided by default. If the function needs ephemeral storage, enter an amount of capacity to provide to the function.  
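The minimum/maximum semantics above mirror Kubernetes resource requests (guaranteed minimum) and limits (enforced maximum). Conceptually, the CPU, Memory, and Disk fields correspond to a Kubernetes pod-spec fragment like the following; the values are examples only, and how the product renders them internally is not specified here:

```yaml
# Conceptual mapping only: the UI fields correspond to Kubernetes-style
# requests (guaranteed minimum) and limits (enforced maximum).
resources:
  requests:
    cpu: "500m"       # minimum CPU guaranteed to the function
    memory: "256Mi"   # minimum memory guaranteed
  limits:
    cpu: "2"          # maximum CPU the function may consume
    memory: "1Gi"     # maximum memory allowed
    ephemeral-storage: "1Gi"  # only if the function needs ephemeral disk
```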

Deployment Configuration

Timeout

The timeout for the function.

Retries

The number of retries if the function deployment fails or times out.

Log level

The log level in case of failure. Possible values:

  • NOTSET

  • DEBUG

  • INFO (default)

  • WARNING

  • ERROR

  • CRITICAL

Method of Delivery

Method of processing trigger events. Possible values:

  • ordered. Processes events in the same sequence that they are produced.

  • unordered. Events may be processed concurrently.

Environment variables

Environment variables are key-value pairs that you can define for your function to access at runtime. They are part of the execution environment provided to your function.
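Inside the function code, environment variables defined at either the pipeline or the function-deployment level are read from the process environment in the usual way. For example, in Python (the variable name MODEL_PATH and the fallback path are hypothetical; if the same key is defined at both levels with different values, the function-level value wins):

```python
import os

# MODEL_PATH is a hypothetical variable set in the pipeline or
# function deployment configuration. os.environ.get returns the
# fallback when the variable is not defined.
model_path = os.environ.get("MODEL_PATH", "/models/default")
print(model_path)
```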

Deploy the Pipeline

When you are done building a pipeline with the Visual Builder, click Deploy.

To deploy a pipeline that you saved as a draft, find the draft in the Pipelines tab of the Pipelines Management page; its status shows as Draft. Right-click the pipeline and select Deploy.

You can view the status of the pipeline in the Pipelines tab of the Pipelines Management page. Initially, while the pipeline is being deployed, the status is In Progress. When deployed, the status changes to Running.