Development workflow¶
Prerequisites¶
Required software¶
In order to code on Orchest, you need to have the following installed on your system:

- Python 3.x
- helm (if you intend to develop files in /deploy)
- kubectl (you might want to try out a tool like k9s in the long run)
- Go (if you work on the controller)
- jq (useful when working with JSON in your terminal)
- Google Chrome (integration tests only)
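To quickly check that everything is available, you can run the following (the exact Chrome binary name may differ per system):
# Sanity check that the prerequisites are on your PATH
python3 --version
helm version --short
kubectl version --client
go version
jq --version
google-chrome --version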
Cluster for development¶
Currently, the development scripts/tools assume that you have Orchest installed in minikube. It is recommended, but not mandatory, to mount the Orchest repository in minikube, which allows redeploying services and incremental development:
# Delete any existing cluster
minikube delete
# Start minikube with the repository mounted in the required place.
# Run this command while you are in the Orchest repository directory.
minikube start \
--cpus 6 \
--addons ingress \
--mount-string="$(pwd):/orchest-dev-repo" --mount
After the minikube cluster is created, follow the steps of a regular installation.
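To verify that the repository mount worked, you can, for example, list the mount target from inside the minikube node:
# Should show the contents of your local Orchest checkout
minikube ssh -- ls /orchest-dev-repo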
Installing Orchest for development¶
Development environment¶
Run the code below to install all dependencies needed for incremental development, building the docs, running tests, and automatically running pre-commit hooks:
# Make sure you are inside the orchest root directory
# pre-commit hooks
pre-commit install
# Dependencies to run unit tests
sudo apt-get install -y default-libmysqlclient-dev
# Frontend dependencies for incremental development
npm run setup --install && pnpm i
# Dependencies to build the docs
python3 -m pip install -r docs/requirements.txt
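To verify the hooks were installed correctly, you can run them once against the entire repository:
# Run all configured pre-commit hooks on all files
pre-commit run --all-files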
Building services locally¶
To easily test code changes of an arbitrary service, you will need to 1) rebuild the service image and 2) make it available to the k8s deployment. The procedure changes slightly depending on the deployment type.
Single node¶
Generally speaking, single node deployments make it far easier to test changes.
For example, to make changes to the orchest-api service, do the following:
# Verify if in-node docker engine is active
[[ -n "${MINIKUBE_ACTIVE_DOCKERD}" ]] && echo $MINIKUBE_ACTIVE_DOCKERD || echo "Not active"
# If not active, set it
eval $(minikube -p minikube docker-env)
# Save the Orchest version in use
export TAG=$(orchest version --json | jq -r .version)
# Build the desired image
scripts/build_container.sh -i orchest-api -t $TAG -o $TAG
# Kill the pods of the orchest-api, so that the new image gets used
# when new pods are deployed
kubectl delete pods -n orchest -l "app.kubernetes.io/name=orchest-api"
Alternatively, you can run scripts/build_container.sh -m -t $TAG -o $TAG to rebuild the minimal required set of images.
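After deleting the pods, you can watch the replacement pods come up running the rebuilt image:
# Watch the new orchest-api pods being created (Ctrl+C to stop)
kubectl get pods -n orchest -l "app.kubernetes.io/name=orchest-api" --watch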
Multi node¶
However, the procedure above is not possible in multi node deployments, and it is also error prone when it comes to setting the right tag, label, etc. For this reason, we provide the following scripts:
# Redeploy a service after building the image using the repo code.
# This is the script that you will likely use the most. This script
# assumes Orchest is installed and running, since it interacts with
# an Orchest service.
bash scripts/redeploy_orchest_service_on_minikube.sh orchest-api
# Remove an image from minikube. Can be useful to force a pull from
# a registry.
bash scripts/remove_image_from_minikube.sh orchest/orchest-api
# Build an image with a given tag, on all nodes.
bash scripts/build_image_in_minikube.sh orchest-api v2022.03.7
# Run arbitrary commands on all nodes.
bash scripts/run_in_minikube.sh echo "hello"
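As an example of combining these, you could use run_in_minikube.sh to check that a freshly built image is present on every node (assuming the nodes run the docker runtime):
# List the orchest-api images present on each node
bash scripts/run_in_minikube.sh docker images orchest/orchest-api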
Warning
The redeploy and build_image scripts require the Orchest repository to be mounted in minikube. However, note that multi node mounting might not be supported by all minikube drivers. We have tested with docker, the default driver.
Incremental development (hot reloading)¶
The steps above allow you to rebuild the images for the services. In addition, you can also set Orchest to run in dev mode with orchest patch --dev so that code changes are instantly reflected, without having to build the containers again. The services that support dev mode are:

- orchest-webserver
- orchest-api
- auth-server
Note
It is good practice to rebuild all containers before committing your changes.
# In case any new dependencies were changed or added they need to
# be installed.
pnpm i
# Run the client dev server for hot reloading of client (i.e. FE) files.
pnpm run dev &
orchest start
orchest patch --dev
Don’t forget to disable cache (DevTools -> Disable cache) or force reload (Command/Ctrl + Shift + R) to see frontend changes propagate.
Note
🎉 Awesome! Everything is set up now and you are ready to start coding. Have a look at our best practices and our GitHub to find interesting issues to work on.
Testing¶
Unit tests¶
Unit tests are being ported to k8s, stay tuned :)!
Integration tests¶
Integration tests are being ported to k8s, stay tuned :)!
Making changes¶
Before committing¶
Make sure your development environment is set up correctly (see prerequisites) so that pre-commit can automatically run the appropriate formatters and linters when you git commit. Lastly, it is good practice to rebuild all containers (and restart Orchest) to do some manual testing, and to run the unit tests to make sure your changes didn't break anything:
# Rebuild containers to do manual testing.
scripts/build_containers.sh
# Run unit tests.
scripts/run_tests.sh
In our CI we also run all of these checks together with integration tests to make sure the codebase remains stable. To read more about testing, check out the testing section.
IDE & language servers¶
Note
👉 This section is for VS Code and pyright users.
If you use VS Code (or the pyright language server to be more precise) the different services contain their own pyrightconfig.json file that configures smart features such as auto complete, go to definition, find all references, and more. For this to work, you need to install the dependencies of the services in the correct virtual environment by running:
scripts/run_tests.sh
Next you can create a workspace file that sets up VS Code to use the right Python interpreters (do note that this won’t include all the files defined in the Orchest repo), e.g.:
{
"folders": [
{
"path": "services/orchest-api"
},
{
"path": "services/orchest-webserver"
},
{
"path": "services/base-images/runnable-shared"
},
{
"path": "services/session-sidecar"
},
{
"path": "services/memory-server"
},
{
"name": "orchest-sdk",
"path": "orchest-sdk/python"
},
{
"name": "internal lib Python",
"path": "lib/python/orchest-internals/"
}
],
"settings": {}
}
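You could, for instance, save this as orchest.code-workspace (the filename is up to you) and open it from a terminal:
# Open the multi-root workspace in VS Code
code orchest.code-workspace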
Python dependencies¶
Python dependencies for the microservices are specified using pip's requirements.txt files. Those files are automatically generated by pip-tools from requirements.in files by calling pip-compile, which locks all the transitive dependencies. After a locked requirements.txt file is in place, subsequent calls to pip-compile will not upgrade any of the dependencies unless the constraints in requirements.in are modified.
To manually upgrade a dependency to a newer version, there are several options:
pip-compile -P <dep> # Upgrades <dep> to latest version
pip-compile -U # Try to upgrade everything
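For example, a typical single-dependency upgrade for one of the services might look like this (the service and package name are purely illustrative):
# Upgrade only the Flask pin of the orchest-api service
cd services/orchest-api
pip-compile -P flask requirements.in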
As a general rule, avoid writing exact pins in requirements.in unless there are known incompatibilities. In addition, avoid manually editing requirements.txt files, since they are automatically generated.
Warning
A bug in pip-tools affects local dependencies. Older versions are not affected, but they are not compatible with modern pip. At the time of writing, the best way forward is to install this fork (see this PR for details):
pip install -U "pip-tools @ git+https://github.com/richafrank/pip-tools.git@combine-without-copy"
Database schema migrations¶
Whenever one of the services' database models (in their respective models.py) has changed, a database migration has to be performed so that existing users are unaffected by the schema change on update (they can then be automatically migrated to the latest version).
# Depending on the service that requires schema changes.
scripts/migration_manager.sh orchest-api migrate
scripts/migration_manager.sh orchest-webserver migrate
# For more options run:
scripts/migration_manager.sh --help
Run Orchest Controller locally¶
For easier debugging it is possible to run the orchest-controller locally with a debugger. We will explain how to do so using VSCode. Make sure your cluster is set up and you've installed Go, then follow the steps below.
Run the orchest-controller with a debugger in VSCode, example launch.json:
{
    "configurations": [
        {
            "name": "Launch ctrl",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}/cmd/controller/main.go",
            "args": [
                "--inCluster=false",
                "--defaultVersion=<INSERT VERSION, e.g. v2022.05.0>",
                "--assetsDir=${workspaceFolder}/deploy",
                "--endpoint=:5000"
            ],
            "env": {
                "KUBECONFIG": "~/.kube/config"
            }
        }
    ]
}
Next, install Orchest and afterwards issue other commands to test the controller with:
# Assuming you are in the root of the orchest git repository
orchest install --dev
# Delete orchest-controller deployment so that the one started with
# the debugger does everything
kubectl delete -n orchest deploy orchest-controller
The Orchest Controller should now be running inside a debugger session.
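To double-check that your local controller is the one doing the work, confirm that the in-cluster deployment is indeed gone:
# orchest-controller should no longer be listed
kubectl get deployments -n orchest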
Without using VSCode¶
Build the orchest-controller binary via the Makefile in services/orchest-controller and run the orchest-controller by passing the following command line arguments:
# Assuming you have built the controller via the "make controller" command
./bin/controller --inCluster=false --defaultVersion=v2022.05.3 \
--endpoint=:5000 --assetsDir=./deploy
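If you haven't built the binary yet, that step would look like (using the "make controller" command mentioned above):
# Build the controller binary into ./bin
cd services/orchest-controller
make controller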
Building the docs¶
Our docs are built using Read the Docs with Sphinx and written in reStructuredText.
To build the docs, run:
cd docs
make html
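To preview the result, you can serve the generated HTML locally (assuming Sphinx's default output directory):
# Serve the built docs at http://localhost:8000
python3 -m http.server --directory _build/html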
Tip
👉 If you didn’t follow the prerequisites, then make sure you’ve installed the needed requirements to build the docs:
python3 -m pip install -r docs/requirements.txt
Opening a PR¶
Note
When opening a PR please change the base in which you want to merge from master to dev. The GitHub docs describe how this can be done.
We use gitflow as our branching model with master and dev being the described master and develop branches respectively. Therefore, we require PRs to be merged into dev instead of master.
When opening the PR a checklist will automatically appear to guide you to successfully completing your PR 🏁
Testing environment base image changes¶
By default, the image builder will pull a base image from Docker Hub based on the version of the cluster. For example, when building an environment image using the provided “python” base image, the builder will pull docker.io/orchest/base-kernel-py:<cluster version>. This makes it difficult to test changes to environment base images.
When running Orchest in development mode (orchest patch --dev), the docker socket of the cluster node will be exposed to the builder. When that’s the case, it’s possible to instruct the builder to pull from the local daemon by adding # LOCAL IMAGE to the first line of the custom build script.
Example:
orchest patch --dev
eval $(minikube -p minikube docker-env)
bash scripts/build_container.sh -i base-kernel-py -o v2022.05.3 -t v2022.05.3
Then:
- select the image of choice or specify a custom one like orchest/base-kernel-new-language
- add # LOCAL IMAGE to the first line of the custom build script and build
Note
As you rebuild, the image builder will pull the newest image.
Note
When you specify a custom image you can also specify the image tag to avoid the back-end making assumptions for you.
Testing jupyter base image changes¶
Required reading: testing environment base image changes.
Again, simply add # LOCAL IMAGE to the first line of the custom build script.
Example:
orchest patch --dev
eval $(minikube -p minikube docker-env)
bash scripts/build_container.sh -i jupyter-server -o v2022.05.3 -t v2022.05.3
Then, add # LOCAL IMAGE to the first line of the custom build script and build.
Note
It’s currently not possible to specify a custom tag; the back-end will always try to pull an image with a tag equal to the cluster version.