Seven Useful DigDag Tips and Tricks

DigDag is a pretty simple tool to install and run. There are quite a few tips and tricks you can use to make working with it more productive. Here are my favourite ones.

Set Up DigDag Config

The digdag command takes quite a few parameters. Instead of remembering to enter them every time, you can create a properties file. It's a standard Java properties file, and you can have as many as you want under different names. I keep several client files (dev.properties, test.properties, prod.properties) and one for running a local server (server.properties). This makes me more productive and also reduces mistakes. You can find the list of properties that can go into server and client properties files on the documentation site.

Enable Authentication

Authentication is not enabled by default, so it's important to enable it on the server even if it is behind a firewall. The simplest option is basic authentication. The easiest way to enable it is to add the following parameters to your server config:

server.authenticator-class = io.digdag.standards.auth.basic.BasicAuthenticator
basicauth.username = admin
basicauth.password = password
basicauth.admin = true

You will have to pass the --config flag to the DigDag server start command:

digdag server --database . --config server.properties

Run a different version of Python

You can choose between Python 2 and Python 3 by specifying the interpreter path in the .dig file. In this example I am forcing the job to use Python 3. Similarly, you can point it at a Conda installation of Python.

timezone: UTC
_export:
  py:
    python: /usr/bin/python3

+step1:
  echo>: start ${session_time}

+step2:
  py>: example.MyWorkflow.my_task
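
The py>: example.MyWorkflow.my_task line above expects a module named example containing a class MyWorkflow with a method my_task. That file isn't shown here, so the following is only a minimal sketch, assuming it is saved as example.py next to the .dig file. The method body and the helper interpreter_banner are illustrative names of my own. Printing the interpreter version is a quick way to check that the python: setting took effect.

```python
# example.py -- hypothetical module matching `py>: example.MyWorkflow.my_task`
import sys


def interpreter_banner():
    """Return a short string describing the interpreter running this task."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return "Python %s at %s" % (version, sys.executable)


class MyWorkflow(object):
    def my_task(self):
        # Anything printed here should appear in the task's log output.
        print(interpreter_banner())
```

If the log shows a Python 2 version after you set python: /usr/bin/python3, the _export block is probably not being picked up by that task.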

Use other programming languages or call a binary

DigDag supports Python and Ruby by default. This should be enough to get most things done, but there could be cases where you want to use another language or a binary. Using the shell operator sh>: you can call any accessible command or shell script. In the digfile below I am calling a Lua program, test.lua, using the shell operator.

timezone: UTC
_export:
  py:
    python: /usr/bin/python3

+step1:
  echo>: start ${session_time}

+step2:
  py>: example.MyWorkflow.my_task

+step3:
  sh>: lua test.lua

Use RESTful APIs

DigDag exposes a set of RESTful APIs that you can use to control DigDag programmatically. I have written a fairly long blog post about using the DigDag APIs. For a quick start, pass the --enable-swagger flag to the DigDag server. This will expose the Swagger UI for you to explore the APIs.

digdag server --database . --config server.properties --enable-swagger
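
Once the server is up, the same APIs can be called from a script. Here is a rough sketch using only the Python standard library. It assumes the server listens on 127.0.0.1:65432 and that the project list lives under /api/projects, which is worth confirming in the Swagger UI. build_api_request is a hypothetical helper, not part of DigDag.

```python
import urllib.request


def build_api_request(path, endpoint="http://127.0.0.1:65432", authorization=None):
    """Build (but do not send) a GET request for a DigDag REST API path.

    The endpoint default and the /api/ prefix are assumptions; adjust them
    to match your own server.
    """
    url = endpoint.rstrip("/") + "/api/" + path.lstrip("/")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Accept", "application/json")
    if authorization:
        # Only needed when the server has authentication enabled.
        request.add_header("Authorization", authorization)
    return request
```

Passing the result to urllib.request.urlopen against a running server should give you a JSON response. With basic authentication enabled, supply the full header value through the authorization argument.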

Use Docker

It's a good idea to run tasks inside a Docker container. This keeps DigDag separate from the environment the tasks run in. You can do this by adding a reference to a Docker image under _export.

_export:
  docker:
    image: ubuntu:latest
    pull_always: true

+step1:
  py>: example.MyWorkflow.my_task

By default DigDag caches the Docker image. You can set the pull_always flag to force a pull of the latest image every time the task starts.

Make HTTP Requests

You can easily make HTTP requests using the http>: operator. You can use it to send updates to remote servers or to fetch values. I use it to send status updates, push messages via Pushover, and so on.

timezone: UTC

+run_task_1:
  http>: http://webhook.site/1a1fe11a-fa5d-42ca-a961-ab5b56b39164
  method: POST
  content:
    status: STARTED
    time: ${session_time}
  content_format: json
  headers:
    - Authorization: taytduywtuQYW

These are my top tips. What are yours?

1 Response

  1. Thejesh GN says:

    Two more:

    1. Set up local_digdag_server.properties to access a server with basic authentication enabled. This assumes the server is running on the local machine and has been set up with username=admin and password=password.

    client.http.endpoint = 127.0.0.1:65432
    client.http.headers.authorization = Basic YWRtaW46cGFzc3dvcmQ=

    # Then call the digdag client

    digdag attempts --config local_digdag_server.properties

    2. If you are using a browser to access a server with auth enabled, use a header modifier plugin such as SimpleModifyHeaders. For example, in the SimpleModifyHeaders setup:

    url: 127.0.0.1:65432
    Action: add
    header field: Authorization
    field value: Basic YWRtaW46cGFzc3dvcmQ=
    apply on: request
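
The Basic value used in both tips above is the Base64 encoding of username:password. If your credentials differ from admin/password, a few lines of Python will generate the matching header value. basic_auth_header is just an illustrative helper name.

```python
import base64


def basic_auth_header(username, password):
    """Return the value for an HTTP Basic `Authorization` header."""
    credentials = "%s:%s" % (username, password)
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token


print(basic_auth_header("admin", "password"))  # → Basic YWRtaW46cGFzc3dvcmQ=
```

Keep in mind that Base64 is an encoding, not encryption, so treat the properties file holding this value like a password file.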
