# Running experiments

The main class is {py:class}`~experimaestro.experiment` - context manager for running experiments, handling job submission and monitoring.

When using the command line interface to run experiment, the main object
of interaction is {py:class}`~experimaestro.experiments.cli.ExperimentHelper` - helper class for CLI-based experiment execution.

## Experiment services

- {py:class}`~experimaestro.scheduler.services.Service` - Base class for experiment services.
- {py:class}`~experimaestro.scheduler.services.WebService` - Web-based service with HTTP endpoint.
- {py:class}`~experimaestro.scheduler.services.ServiceListener` - Listener for service state changes.


## Experiment configuration

The module `experimaestro.experiments` contain code factorizing boilerplate for
launching experiments. It allows to setup the experimental environment and
read ``YAML`` configuration files to setup some experimental parameters.

This can be extended to support more specific experiment helpers (see e.g.
experimaestro-ir for an example).

{py:class}`~experimaestro.experiments.ConfigurationBase` should be the parent class of any configuration.

### Example

An `experiment.py` file:

```python
from experimaestro.experiments import ExperimentHelper, configuration, ConfigurationBase

@configuration
class Configuration(ConfigurationBase):
    #: Default learning rate
    learning_rate: float = 1e-3

def run(
    helper: ExperimentHelper, cfg: Configuration
):
    # Experimental code
    ...
```

With `full.yaml` located in the same folder as `experiment.py`

```yaml
    file: experiment
    learning_rate: 1e-4
```

The experiment can be started with

```sh
    experimaestro run-experiment --run-mode NORMAL full.yaml
```

See the [CLI documentation](cli.md#running-experiments) for more details

### Experiment code in a module

The Python path can be set by the configuration file, and module be used instead
of a file:

```yaml
    # Module name containing the "run" function
    module: first_stage.experiment

    # Python paths relative to the directory containing this YAML file
    # By default, the python path is based on the hypothesis that
    # the YAML file is in the same folder as the loaded python module.
    # For instance, for `first_stage.experiment`, the python path
    # would be set automatically to the parent folder `..`. For `first_stage.sub.experiment`,
    # this would be set to `../..`
    pythonpath:
        - ..
```

### YAML Configuration Reference

The YAML configuration file supports the following options from {py:class}`~experimaestro.experiments.ConfigurationBase`:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `id` | string | **required** | Unique identifier for the experiment |
| `file` | string | `"experiment"` | Relative path to the Python file containing the `run` function |
| `module` | string | `None` | Python module containing the `run` function (mutually exclusive with `file`) |
| `pythonpath` | list | `None` | List of paths to add to Python path (relative to YAML file directory) |
| `imports` | list | `None` | List of YAML file paths to import and merge (current file takes priority) |
| `parent` | string | `None` | *(Deprecated, use `imports`)* Relative path to a parent YAML file to inherit from |
| `pre_experiment` | string | `None` | Python file path or module name to execute before importing the experiment |
| `title` | string | `""` | Short description of the experiment |
| `subtitle` | string | `""` | Additional details about the experiment |
| `description` | string | `""` | Full description of the experiment |
| `paper` | string | `""` | Source paper reference for this experiment |
| `add_timestamp` | bool | `False` | Append timestamp (`YYYYMMDD-HHMM`) to experiment ID |
| `dirty_git` | string | `"warn"` | Action when git has uncommitted changes: `ignore`, `warn`, or `error` |

#### Configuration inheritance

YAML files can import other configuration files using the `imports` option. Imported files are merged in order, with the current file taking priority:

```yaml
# base.yaml
id: base-experiment
learning_rate: 1e-3
batch_size: 32
```

```yaml
# optimizer.yaml
optimizer: adam
weight_decay: 0.01
```

```yaml
# experiment.yaml
imports:
  - base.yaml
  - optimizer.yaml
id: my-experiment
learning_rate: 1e-4  # Override value from base.yaml
```

#### Multiple YAML files

You can also merge multiple YAML files using CLI options:

```bash
# Pre-yaml files are loaded first, then main file, then post-yaml files
experimaestro run-experiment --pre-yaml defaults.yaml --post-yaml overrides.yaml main.yaml
```

#### Inline configuration overrides

Override specific values from the command line using `-c` with [OmegaConf dotlist syntax](https://omegaconf.readthedocs.io/):

```bash
# Simple values
experimaestro run-experiment -c learning_rate=1e-5 -c batch_size=64 experiment.yaml

# Nested values using dot notation
experimaestro run-experiment -c model.hidden_size=512 -c model.num_layers=6 experiment.yaml

# List items by index
experimaestro run-experiment -c data.transforms.0.name=resize experiment.yaml
```

#### Previewing the merged configuration

Use `--show` to output the final merged configuration as JSON without running the experiment. This is useful for debugging configuration inheritance and overrides:

```bash
experimaestro run-experiment --show experiment.yaml

# With overrides - see the final result
experimaestro run-experiment --show -c learning_rate=1e-5 --pre-yaml base.yaml experiment.yaml
```

### Pre-experiment Setup

The `pre_experiment` option allows you to run Python code **before** the experiment module is imported. It can be specified as:

- **A file path**: Relative path to a Python file (e.g., `pre_experiment.py`)
- **A module name**: Python module to import (e.g., `mypackage.pre_experiment`)

This is useful for:

- Setting environment variables to control library behavior
- Mocking heavy modules to speed up the experiment setup phase (the actual job execution will use real modules)
- Configuring logging or other global state

#### Example: Mock heavy modules with mock_modules

For experiments that import heavy libraries like PyTorch or transformers, you can use {py:func}`~experimaestro.experiments.mock_modules` to mock these modules during the experiment setup phase. This significantly speeds up configuration parsing while the actual job execution still uses the real modules.

```python
# pre_experiment.py
import os
from experimaestro.experiments import mock_modules

# Set environment variables
os.environ["OMP_NUM_THREADS"] = "4"

# Mock PyTorch and related modules (submodules are automatically included)
mock_modules(['torch', 'pytorch_lightning', 'transformers', 'huggingface_hub'])
```

```yaml
id: my-experiment
pre_experiment: pre_experiment.py
file: experiment
```

The `mock_modules` function provides:

- **Module mocking**: Any import of the specified modules returns fake objects that silently accept attribute access, method calls, and instantiation
- **Automatic decorator support**: All mocked objects work as decorators, supporting both `@decorator` and `@decorator(args)` patterns (e.g., `@torch.compile`, `@torch.no_grad()`, `@torch.jit.script`)
- **Inheritance support**: Code that inherits from mocked classes (like `torch.nn.Module` or `torch.autograd.Function`) works correctly without metaclass conflicts
- **Generic type support**: Subscript notation like `Tensor[int]` or `Module[str, Tensor]` works correctly

This is particularly useful for large codebases with many PyTorch modules where importing takes significant time during experiment configuration.

#### Example: Using a module name

If you have a package with a pre-experiment module, you can reference it by module name:

```yaml
id: my-experiment
pre_experiment: mypackage.pre_experiment
module: mypackage.experiment
```

This is useful when:
- Your pre-experiment code is part of an installed package
- You want to share pre-experiment setup across multiple experiments

### Dirty Git Check

Experimaestro can check whether your project's git repository has uncommitted changes when starting an experiment. This helps ensure reproducibility by warning you (or preventing you) from running experiments with uncommitted code changes.

The `dirty_git` option controls the behavior:

| Value | Description |
|-------|-------------|
| `ignore` | Don't check or warn about uncommitted changes |
| `warn` | Log a warning if there are uncommitted changes (default) |
| `error` | Raise a {py:class}`~experimaestro.DirtyGitError` exception and abort the experiment |

#### YAML configuration

```yaml
id: my-experiment
dirty_git: error  # Fail if git is dirty
```

#### Python API

When using the experiment context manager directly, you can pass the `dirty_git` parameter:

```python
from experimaestro import experiment, DirtyGitAction

# Using the enum
with experiment(workdir, "my-experiment", dirty_git=DirtyGitAction.ERROR) as xp:
    ...

# Or using the string value
with experiment(workdir, "my-experiment", dirty_git=DirtyGitAction.WARN) as xp:
    ...
```

#### Handling DirtyGitError

When `dirty_git` is set to `error`, a {py:class}`~experimaestro.DirtyGitError` exception is raised if the repository has uncommitted changes:

```python
from experimaestro import experiment, DirtyGitAction, DirtyGitError

try:
    with experiment(workdir, "my-experiment", dirty_git=DirtyGitAction.ERROR) as xp:
        ...
except DirtyGitError as e:
    print(f"Cannot run experiment: {e}")
```

### Common handling

See {py:class}`~experimaestro.experiments.cli.ExperimentHelper` for the CLI helper class.

## Experiment metadata

### Hostname tracking

Experimaestro automatically records the hostname where each experiment run is launched. This information is useful for identifying which machine was used when running experiments across multiple hosts.

The hostname is:
- Recorded when a new experiment run starts
- Stored in both the workspace database and on disk (in `xp/{experiment_id}/informations.json`)
- Displayed in the experiments list in both CLI and TUI
- Preserved during database resync operations

To view the hostname for experiments:

```bash
# CLI - shows hostname in brackets
experimaestro experiments --workdir /path/to/workdir list
# Output: my-experiment [hostname.local] (5/10 jobs)

# TUI - hostname shown in "Host" column
experimaestro experiments --workdir /path/to/workdir monitor --console
```

with workdir one of the directories defined in the [Settings](settings.md)