Development workflow

Prerequisites

Required software

In order to code on Orchest, you need to have the following installed on your system (at a minimum, the tools used throughout this guide):

  • Docker and minikube

  • kubectl

  • Python 3 and pre-commit

  • npm and pnpm

  • jq

  • Go (only needed to run the orchest-controller locally)
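
To quickly check that the core tools are available on your PATH, you can run (a minimal sanity check, not an exhaustive list):

minikube version
kubectl version --client
python3 --version && pre-commit --version
node --version && pnpm --version
jq --version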

Cluster for development

Currently, the development scripts/tools assume that you have Orchest installed in minikube. It is recommended, but not mandatory, to mount the Orchest repository in minikube, which allows redeploying services and incremental development:

# Delete any existing cluster
minikube delete

# Start minikube with the repository mounted in the required place.
# Run this command while you are in the Orchest repository directory.
minikube start \
  --cpus 6 \
  --addons ingress \
  --mount-string="$(pwd):/orchest-dev-repo" --mount

After the minikube cluster is created, follow the steps of a regular installation.
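
For reference, a regular installation boils down to installing the orchest-cli and running its install command (a sketch; see the installation docs for the full steps):

pip install --upgrade orchest-cli
orchest install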

Installing Orchest for development

Development environment

Run the code below to install all dependencies needed for incremental development, building the docs, running tests, and automatically running pre-commit hooks:

# Make sure you are inside the orchest root directory

# pre-commit hooks
pre-commit install

# Dependencies to run unit tests
sudo apt-get install -y default-libmysqlclient-dev

# Frontend dependencies for incremental development
npm run setup --install && pnpm i

# Dependencies to build the docs
python3 -m pip install -r docs/requirements.txt

Building services locally

To easily test code changes of an arbitrary service, you will need to 1) rebuild the service image and 2) make it available to the k8s deployment. The procedure changes slightly depending on the deployment type.

Single node

Generally speaking, single node deployments make it far easier to test changes. For example, to test changes to the orchest-api service, do the following:

# Verify if in-node docker engine is active
[[ -n "${MINIKUBE_ACTIVE_DOCKERD}" ]] && echo $MINIKUBE_ACTIVE_DOCKERD || echo "Not active"

# If not active, set it
eval $(minikube -p minikube docker-env)

# Save the Orchest version in use
export TAG=$(orchest version --json | jq -r .version)

# Build the desired image
scripts/build_container.sh -i orchest-api -t $TAG -o $TAG

# Kill the pods of the orchest-api, so that the new image gets used
# when new pods are deployed
kubectl delete pods -n orchest -l "app.kubernetes.io/name=orchest-api"

Alternatively, you can run scripts/build_container.sh -m -t $TAG -o $TAG to rebuild the minimal required set of images.

Multi node

The procedure above is not possible in multi node deployments, and it is also error prone when it comes to setting the right tag, label, etc. For this reason, we provide the following scripts:

# Redeploy a service after building the image using the repo code.
# This is the script that you will likely use the most. This script
# assumes Orchest is installed and running, since it interacts with
# an Orchest service.
bash scripts/redeploy_orchest_service_on_minikube.sh orchest-api

# Remove an image from minikube. Can be useful to force a pull from
# a registry.
bash scripts/remove_image_from_minikube.sh orchest/orchest-api

# Build an image with a given tag, on all nodes.
bash scripts/build_image_in_minikube.sh orchest-api v2022.03.7

# Run arbitrary commands on all nodes.
bash scripts/run_in_minikube.sh echo "hello"

Warning

The redeploy and build_image scripts require the Orchest repository to be mounted in minikube. However, note that multi node mounting might not be supported by all minikube drivers. We have tested with docker, the default driver.
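
For reference, a multi node development cluster using the tested docker driver could be created as follows (a sketch that mirrors the single node example above; only --nodes and --driver are new):

minikube delete
minikube start \
  --nodes 2 \
  --driver docker \
  --cpus 6 \
  --addons ingress \
  --mount-string="$(pwd):/orchest-dev-repo" --mount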

Incremental development (hot reloading)

The steps above allow you to rebuild the images for the services. In addition, you can set Orchest to run in dev mode with orchest patch --dev so that code changes are instantly reflected, without having to build the containers again. The services that support dev mode are:

  • orchest-webserver

  • orchest-api

  • auth-server

Note

It is good practice to rebuild all containers before committing your changes.

# In case any new dependencies were changed or added they need to
# be installed.
pnpm i

# Run the client dev server for hot reloading of client (i.e. FE) files.
pnpm run dev &

orchest start

orchest patch --dev

Don’t forget to disable cache (DevTools -> Disable cache) or force reload (Command/Ctrl + Shift + R) to see frontend changes propagate.

Note

🎉 Awesome! Everything is set up now and you are ready to start coding. Have a look at our best practices and our GitHub to find interesting issues to work on.

Testing

Unit tests

Unit tests are being ported to k8s, stay tuned :)!

Integration tests

Integration tests are being ported to k8s, stay tuned :)!

Making changes

Before committing

Make sure your development environment is set up correctly (see prerequisites) so that pre-commit can automatically run the appropriate formatters and linters when you git commit. Lastly, it is good practice to rebuild all containers (and restart Orchest) to do some manual testing and to run the unit tests to make sure your changes didn't break anything:

# Rebuild containers to do manual testing.
scripts/build_container.sh

# Run unit tests.
scripts/run_tests.sh
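
You can also run all pre-commit checks by hand, without creating a commit:

# Run the formatters and linters on the entire codebase.
pre-commit run --all-files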

In our CI we also run all of these checks together with integration tests to make sure the codebase remains stable. To read more about testing, check out the testing section.

IDE & language servers

Note

👉 This section is for VS Code and pyright users.

If you use VS Code (or, to be more precise, the pyright language server), note that the different services contain their own pyrightconfig.json file that configures smart features such as auto complete, go to definition, find all references, and more. For this to work, you need to install the dependencies of the services in the correct virtual environments by running:

scripts/run_tests.sh

Next you can create a workspace file that sets up VS Code to use the right Python interpreters (do note that this won’t include all the files defined in the Orchest repo), e.g.:

{
    "folders": [
        {
            "path": "services/orchest-api"
        },
        {
            "path": "services/orchest-webserver"
        },
        {
            "path": "services/base-images/runnable-shared"
        },
        {
            "path": "services/session-sidecar"
        },
        {
            "path": "services/memory-server"
        },
        {
            "name": "orchest-sdk",
            "path": "orchest-sdk/python"
        },
        {
            "name": "internal lib Python",
            "path": "lib/python/orchest-internals/"
        }
    ],
    "settings": {}
}
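
Save this as, e.g., orchest.code-workspace in the repository root (the filename is just a suggestion) and open it with the code CLI:

code orchest.code-workspace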

Python dependencies

Python dependencies for the microservices are specified using pip’s requirements.txt files. Those files are automatically generated by pip-tools from requirements.in files by calling pip-compile, which locks all the transitive dependencies. After a locked requirements.txt file is in place, subsequent calls to pip-compile will not upgrade any of the dependencies unless the constraints in requirements.in are modified.
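
For example, regenerating the lock file of a service looks like this (assuming, as for the orchest-api, that the requirements.in sits next to the requirements.txt it generates):

cd services/orchest-api
# Locks all transitive dependencies into requirements.txt
pip-compile requirements.in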

To manually upgrade a dependency to a newer version, there are several options:

pip-compile -P <dep>  # Upgrades <dep> to latest version
pip-compile -U  # Try to upgrade everything

As a general rule, avoid writing exact pins in requirements.in unless there are known incompatibilities. In addition, avoid manually editing requirements.txt files, since they will be automatically generated.
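
As an illustration, a hypothetical requirements.in entry would constrain loosely and leave the exact pin to the generated requirements.txt:

# requirements.in (hypothetical entry)
Flask>=2.0  # loose constraint; pip-compile locks a concrete version
# Flask==2.0.3  # avoid exact pins like this unless there is a known incompatibility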

Warning

A bug in pip-tools affects local dependencies: older versions of pip-tools are not affected, but those are not compatible with modern pip. At the time of writing, the best way forward is to install this fork (see this PR for details):

pip install -U "pip-tools @ git+https://github.com/richafrank/pip-tools.git@combine-without-copy"

Database schema migrations

Whenever one of the services' database models (in their respective models.py) has been changed, a database migration has to be performed so that existing users are unaffected by the schema change on update (their data is then automatically migrated to the latest schema version).

# Depending on the service that requires schema changes.
scripts/migration_manager.sh orchest-api migrate
scripts/migration_manager.sh orchest-webserver migrate

# For more options run:
scripts/migration_manager.sh --help

Run Orchest Controller locally

For easier debugging it is possible to run the orchest-controller locally with a debugger. We will explain how to do so using VSCode. Make sure your cluster is set up and you’ve installed Go, then follow the steps below:

Run the orchest-controller with a debugger in VSCode, using a launch.json like the following:

{
    "configurations": [
        {
            "name": "Launch ctrl",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}/cmd/controller/main.go",
            "args": [
                "--inCluster=false",
                "--defaultVersion=<INSERT VERSION, e.g. v2022.05.0>",
                "--assetsDir=${workspaceFolder}/deploy",
                "--endpoint=:5000"
            ],
            "env": {
                "KUBECONFIG":"~/.kube/config",
            },
        },
    ]
}

Next, install Orchest; afterwards you can issue other commands to test the controller:

# Assuming you are in the root of the orchest git repository
orchest install --dev

# Delete orchest-controller deployment so that the one started with
# the debugger does everything
kubectl delete -n orchest deploy orchest-controller

The Orchest Controller should now be running inside a debugger session.

Without using VSCode

Build the orchest-controller binary via the Makefile in services/orchest-controller and run the orchest-controller by passing the following command line arguments:

# Build the controller binary ("make controller"), then run it outside
# the cluster; run from the services/orchest-controller directory
cd services/orchest-controller
make controller
./bin/controller --inCluster=false --defaultVersion=v2022.05.3 \
  --endpoint=:5000 --assetsDir=./deploy

Building the docs

Our docs are built using Read the Docs with Sphinx and written in reStructuredText.

To build the docs, run:

cd docs
make html
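
The rendered HTML ends up in docs/_build/html (the default Sphinx output location), which you can serve locally to inspect the result:

python3 -m http.server --directory _build/html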

Tip

👉 If you didn't follow the prerequisites, then make sure you've installed the requirements needed to build the docs:

python3 -m pip install -r docs/requirements.txt

Opening a PR

Note

When opening a PR, please change the base branch into which you want to merge from master to dev. The GitHub docs describe how this can be done.

We use gitflow as our branching model with master and dev being the described master and develop branches respectively. Therefore, we require PRs to be merged into dev instead of master.

When opening the PR a checklist will automatically appear to guide you to successfully completing your PR 🏁

Testing environment base image changes

By default, the image builder will pull a base image from Docker Hub based on the version of the cluster. For example, when building an environment image using the provided “python” base image, the builder will pull docker.io/orchest/base-kernel-py:<cluster version>. This makes it difficult to test changes to environment base images.

When running Orchest in development mode (orchest patch --dev), the docker socket of the cluster node will be exposed to the builder. When that’s the case, it’s possible to instruct the builder to pull from the local daemon by adding # LOCAL IMAGE to the first line of the custom build script.

Example:

  • orchest patch --dev

  • eval $(minikube -p minikube docker-env)

  • bash scripts/build_container.sh -i base-kernel-py -o v2022.05.3 -t v2022.05.3

  • select the image of choice or specify a custom one like orchest/base-kernel-new-language

  • add # LOCAL IMAGE to the first line of the custom build script and build
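
Putting it together, the first lines of a custom build script flagged to use the locally built base image could look like this (the pip install line is just a hypothetical example of regular build script contents):

# LOCAL IMAGE
pip install my-package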

Note

As you rebuild, the image builder will pull the newest image.

Note

When you specify a custom image you can also specify the image tag to avoid the back-end making assumptions for you.

Testing jupyter base image changes

Required reading: testing environment base image changes. Again, simply add # LOCAL IMAGE to the first line of the custom build script.

Example:

  • orchest patch --dev

  • eval $(minikube -p minikube docker-env)

  • bash scripts/build_container.sh -i jupyter-server -o v2022.05.3 -t v2022.05.3

  • add # LOCAL IMAGE to the first line of the custom build script and build

Note

It’s currently not possible to specify a custom tag, the back-end will always try to pull an image with a tag equal to the cluster version.