Projects

A project is the main container for organizing related pipelines, jobs, environments and code.

A project is essentially a git repository. For example, a project might be organized like this:

.
├── .git/
├── .orchest
│   ├── environments/
│   └── pipelines/
├── california_housing.orchest
├── collect-results.ipynb
└── get-data.py
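
As a rough illustration of this layout, the sketch below checks whether a directory contains the two hidden directories shown in the tree. The function name looks_like_project is hypothetical and the check is derived only from the example above; it is not part of any Orchest API.

```python
from pathlib import Path


def looks_like_project(path):
    """Return True if `path` contains both a .git and a .orchest directory.

    Hypothetical helper based only on the example layout; not an Orchest API.
    """
    root = Path(path)
    return (root / ".git").is_dir() and (root / ".orchest").is_dir()
```

Calling looks_like_project(".") from the root of a checkout would be one way to sanity-check a repository before importing it.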

Projects also contain jobs; however, these are not stored in the project’s filesystem.

You can access project files from code running inside environments using relative paths. For absolute paths, all files of a project are mounted at the /project-dir directory.
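
A minimal sketch of the two addressing styles, assuming the get-data.py file from the example layout. The helper project_path and its root parameter are hypothetical names introduced only for illustration.

```python
from pathlib import Path

# Mount point for project files, as stated in the text above.
PROJECT_DIR = Path("/project-dir")


def project_path(relative, root=PROJECT_DIR):
    """Return the absolute path of a project file (hypothetical helper)."""
    return Path(root) / relative


# Relative path: resolved against the working directory of the running code.
relative = Path("get-data.py")

# Absolute path: the same file addressed through the mount point.
absolute = project_path("get-data.py")
print(absolute.as_posix())
```

Absolute paths can be useful when code changes its working directory, since they do not depend on where the process was started.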

Getting started

You can get started with Projects by:

  • Creating a new project.

  • Importing an existing project using its git repository URL (see how to import a project).

  • Importing Orchest curated or community contributed examples.

Tip

👉 See quickstart tutorial.

Project versioning

A Project’s .orchest directory should be versioned since it defines the environment in use. This enables the Project to run on every machine.

The /data directory can be used to store data locally. It is accessible to all pipelines across all projects, including jobs.
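
The sketch below shows one way code in different pipelines could exchange a small result through /data. The helper names save_result and load_result, the data_dir parameter, and the JSON format are assumptions made for illustration, not an Orchest convention.

```python
import json
from pathlib import Path

# Shared directory described in the text above.
DATA_DIR = Path("/data")


def save_result(name, payload, data_dir=DATA_DIR):
    """Write `payload` as JSON to a file in the shared directory."""
    path = Path(data_dir) / name
    path.write_text(json.dumps(payload))
    return path


def load_result(name, data_dir=DATA_DIR):
    """Read a JSON file previously written with save_result."""
    return json.loads((Path(data_dir) / name).read_text())
```

One pipeline could call save_result("metrics.json", {...}) and another could later call load_result("metrics.json"). Passing data_dir explicitly makes the helpers easy to try outside Orchest.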

Set secrets through environment variables so that they are never committed to the repository.
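
A minimal sketch of reading such a secret in code, assuming it has already been defined as an environment variable. The helper get_secret and the variable name API_TOKEN are hypothetical.

```python
import os


def get_secret(name, environ=os.environ):
    """Read a secret from the environment, failing loudly if it is unset."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Usage (API_TOKEN is a hypothetical variable name):
# token = get_secret("API_TOKEN")
```

Failing early on a missing variable tends to give a clearer error than letting an empty credential reach an external service.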

Using git inside Orchest

Tip

👉 See video tutorial: versioning using git in Orchest.

You can use git inside Orchest with the pre-installed jupyterlab-git extension. Get started by setting your user.name and user.email in configure JupyterLab. For example:

git config --global user.name "John Doe"
git config --global user.email "john@example.org"

Use the following commands to add a private SSH key to your terminal session in JupyterLab:

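# Restrict the key's permissions and load it into the agent in every new shell.
echo "chmod 400 /data/id_rsa" >> ~/.bashrc
echo "ssh-add /data/id_rsa 2>/dev/null" >> ~/.bashrc
# Start an ssh-agent for the session if one is not already running.
echo "if [ -z \$SSH_AGENT_PID ]; then exec ssh-agent bash; fi" >> ~/.bashrc
# Use the key for github.com and record its host key.
mkdir -p ~/.ssh
printf "%s\n" "Host github.com" " IdentityFile /data/id_rsa" >> ~/.ssh/config
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts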

Make sure the id_rsa private key file has been uploaded to the data/ folder through the pipeline file manager.

Warning

🚨 Adding a private key file to the /data folder exposes it to everyone using your Orchest instance.

You can then version your project with git through either of the following:

  • JupyterLab terminal.

  • JupyterLab git extension UI.