How it works
🎓 Step by Step
Wetlands leverages Pixi, a package management tool for developers, or Micromamba, a fast, native reimplementation of the Conda package manager.
- Pixi or Micromamba Setup: When
EnvironmentManageris initialized, it checks for apixiormicromambaexecutable at the specified path (e.g.,"micromamba/"). If not found, it downloads a self-contained Pixi or Micromamba binary suitable for the current operating system and architecture into that directory. This means Wetlands doesn't require a pre-existing Conda/Mamba installation. - Environment Creation:
create(envName, dependencies)uses Pixi or Micromamba commands (pixi init /path/to/envNameormicromamba create -n envName -c channel package ...) to build a new, isolated Conda environment within the Pixi or Micromamba prefix (e.g.,pixi/workspaces/envName/envs/default/ormicromamba/envs/envName). When using Pixi, Wetlands also creates a workspace for the environment (e.g.pixi/workspace/envName/). Note that the main environment is returned if it already satisfies the required dependencies. - Dependency Installation: Dependencies (Conda packages, Pip packages) are installed into the target environment using
pixi add ...ormicromamba install ...andpip install ...(executed within the activated environment). - Launching Workers (
launch):launch(max_workers=N)starts one or moremodule_executorworker processes within the activated target environment usingsubprocess.Popen. All workers share the same Conda environment on disk — no duplication.- Each worker listens on its own local socket using
multiprocessing.connection.Listener. - The main process connects to each worker using
multiprocessing.connection.Client. - A dedicated IPC reader daemon thread is started per worker to receive messages asynchronously.
- A health monitor daemon thread is started to periodically check all workers for liveness and inactivity timeouts (see below).
- Execution (
submit/execute/import_module):submit(module, func, args)creates aTask[T]object, dispatches the function call to an idle worker, and returns theTaskimmediately. If all workers are busy, the task is queued internally and dispatched when the next worker becomes available.execute(module, func, args)is a blocking shortcut: it submits the call and waits for the result before returning.import_module(module)creates a proxy object in the main process. When methods are called on this proxy, it triggers theexecutemechanism described above.- Each worker imports the target module, executes the function with the provided arguments, and sends the result (or exception) back to the main process via its IPC connection.
- Task Lifecycle:
- A
Taskgoes through states:PENDING → RUNNING → COMPLETED(orFAILED/CANCELED). - The worker sends typed IPC messages:
execution finished,error,update(progress), andcanceled. - The reader thread dispatches these messages to the
Taskobject, which notifies registered listeners and resolves its internalconcurrent.futures.Future. - When a worker finishes a task, it is returned to the idle pool and the next queued task (if any) is dispatched to it.
- A
- Progress and Cancellation:
- Remote code can report progress by declaring a
taskparameter in the function signature. Wetlands detects it viainspect.signature()and injects aRemoteTaskHandleautomatically. - The handle provides
task.update()for progress,task.set_output()for intermediate results,task.cancel_requestedfor cooperative cancellation, andtask.log()for remote logging.
- Remote code can report progress by declaring a
- Worker Health Monitoring:
- A background daemon thread monitors all workers every few seconds.
- If a worker process has exited (crash, OOM kill, etc.), the monitor fails the active task, removes the worker, and launches a replacement with the same configuration.
- If
worker_timeoutis set and a worker has not sent any IPC message within that duration, it is treated as hung: the active task is failed, the worker is killed and replaced. - On
env.exit(), the health monitor stops and any tasks still in the queue are failed with a descriptive error.
- Direct Execution (
execute_commands): This method directly activates the target environment and runs the provided shell commands usingsubprocess.Popen(no worker processes involved here). The user is responsible for managing the launched process and any necessary communication. - Isolation: Each environment created by Wetlands is fully isolated, preventing dependency conflicts between different environments or with the main application's environment.
🔀 Worker Pool Architecture
When max_workers > 1 is passed to launch(), Wetlands starts multiple module_executor processes, all activating the same conda environment. This provides true process-level parallelism without duplicating the environment on disk.
┌─ module_executor (pid 1001) ─ port 5001
env.launch(max_workers=4) ──────├─ module_executor (pid 1002) ─ port 5002
(one conda env on disk) ├─ module_executor (pid 1003) ─ port 5003
└─ module_executor (pid 1004) ─ port 5004
Each worker holds its own subprocess, port, IPC connection, and a dedicated reader thread. Tasks are dispatched to idle workers from an internal pool; when all workers are busy, tasks queue and are dispatched as workers become available. A health monitor thread runs alongside the pool, detecting crashed or hung workers and replacing them transparently.
Multi-process is preferred over multi-thread because Wetlands' primary use case is scientific computing (numpy, torch, cellpose, stardist). Separate processes provide true parallelism (no GIL), failure isolation (one crash doesn't kill other tasks), and separate sys.path/sys.argv per worker. The only cost is memory (~200–400 MB per worker for a typical scientific stack), controlled via max_workers.
With max_workers=1 (the default), the pool is a trivial pass-through: one worker, one connection, one reader thread. Behavior is identical to a single-worker setup.
⚙️ Under the Hood
Wetlands uses the EnvironmentManager.execute_commands() for different operations (to create environments, install dependencies, etc).
Behind the scenes, this method creates and executes a temporary script (a bash script on Linux and Mac, and a PowerShell script on Windows) which looks like the following:
# Initialize Micromamba
cd "/path/to/examples/micromamba"
export MAMBA_ROOT_PREFIX="/path/to/examples/micromamba"
eval "$(micromamba shell hook -s posix)"
# Create the cellpose environment
cd "/Users/amasson/Travail/wetlands/examples"
micromamba --rc-file "/path/to/examples/micromamba/.mambarc" create -n cellpose python=3.12.7 -y
# Activate the environment
cd "/path/to/examples/"
micromamba activate cellpose
# Install the dependencies
echo "Installing conda dependencies..."
micromamba --rc-file "/path/to/examples/micromamba/.mambarc" install "cellpose==3.1.0" -y
# Execute optional custom commands
python -u example_module.py