Projects¶
A project is the main container for organizing related pipelines, jobs, environments and code.
A project is essentially a git repository. For example, a project might be organized like:
.
├── .git/
├── .orchest
│ ├── environments/
│ └── pipelines/
├── california_housing.orchest
├── collect-results.ipynb
└── get-data.py
Projects also contain jobs; however, these are not stored in the project’s filesystem.
You can access project files from code running inside environments using relative paths. For absolute paths, all files of a project are mounted under the /project-dir directory.
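As an illustration of the absolute form, here is a minimal Python sketch that builds paths under the mount point. The helper name `project_path` and the `PROJECT_DIR` constant are hypothetical and not part of Orchest; only the /project-dir location and the `get-data.py` file name come from the text above.

```python
from pathlib import PurePosixPath

# Mount point for project files inside an environment, as described above.
PROJECT_DIR = PurePosixPath("/project-dir")


def project_path(relative, root=PROJECT_DIR):
    """Return the absolute path of a project file from its path relative to the project root.

    Hypothetical helper for illustration; it only joins paths and does not touch the filesystem.
    """
    return PurePosixPath(root) / relative


# Absolute form of the example file from the tree above.
print(project_path("get-data.py"))  # /project-dir/get-data.py
```

The absolute form can be handy when the working directory of the running code is not known in advance, whereas a relative path such as `get-data.py` depends on where the code is executed from.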
Getting started¶
You can get started with Projects by:
Creating a new project.
Importing an existing project using its git repository URL (see how to import a project).
Importing Orchest curated or community contributed examples.
Tip
👉 See quickstart tutorial.
Project versioning¶
A project’s .orchest directory should be versioned since it defines the environment in use. This allows the project to run on any machine.
The /data directory can be used to store data locally. It can be accessed by all pipelines across all projects, and even by jobs.
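A minimal Python sketch of sharing a file through /data, assuming the code runs inside Orchest where that directory is mounted. The helpers `save_shared` and `load_shared`, and the file names used with them, are hypothetical; only the /data location comes from the text above.

```python
import json
from pathlib import Path


def save_shared(name, payload, data_dir="/data"):
    """Write `payload` as JSON under the shared data directory and return the file path.

    Hypothetical helper for illustration; `data_dir` defaults to the /data mount.
    """
    target = Path(data_dir) / name
    # Create intermediate folders such as "results/" if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload))
    return target


def load_shared(name, data_dir="/data"):
    """Read back a JSON file previously written under the shared data directory."""
    return json.loads((Path(data_dir) / name).read_text())
```

Under these assumptions, one pipeline could call `save_shared("results/metrics.json", {...})` and a pipeline in another project could read the same file with `load_shared("results/metrics.json")`.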
Secrets should be set with environment variables to avoid them being versioned.
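For instance, code in a pipeline step could read a secret from its environment rather than from a file in the repository. This is a minimal sketch: `get_secret` is a hypothetical helper, and the variable name `EXAMPLE_API_TOKEN` is an assumption chosen for illustration.

```python
import os


def get_secret(name):
    """Return the value of environment variable `name`, failing loudly if it is unset or empty.

    Hypothetical helper for illustration, not part of Orchest.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Example usage (assumes EXAMPLE_API_TOKEN has been defined as an environment variable):
# token = get_secret("EXAMPLE_API_TOKEN")
```

Failing early when the variable is missing makes a misconfigured run easier to diagnose than a later authentication error.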
Using git inside Orchest¶
Tip
👉 See video tutorial: versioning using git in Orchest.
You can use git inside Orchest with the pre-installed jupyterlab-git extension. Get started by adding your user.name and user.email under "configure JupyterLab". For example:
git config --global user.name "John Doe"
git config --global user.email "john@example.org"
Use the following commands to add a private SSH key to your terminal session in JupyterLab:
echo "chmod 400 /data/id_rsa" >> ~/.bashrc
echo "ssh-add /data/id_rsa 2>/dev/null" >> ~/.bashrc
echo "if [ -z \$SSH_AGENT_PID ]; then exec ssh-agent bash; fi" >> ~/.bashrc
mkdir -p ~/.ssh
printf "%s\n" "Host github.com" " IdentityFile /data/id_rsa" >> ~/.ssh/config
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
Ensure the id_rsa private key file is uploaded through the pipeline file manager to the data/ folder.
Warning
🚨 Adding a private key file to the /data folder exposes it to everyone using your Orchest instance.
You can then version your project with git using:
JupyterLab terminal.
JupyterLab git extension UI.