Compute

Hubify Labs gives you on-demand access to high-end GPU compute for running experiments. GPU pods are provisioned on demand through RunPod.

Supported Hardware

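GPU  | VRAM   | Best For                                                          | Cost Range
-----|--------|-------------------------------------------------------------------|-----------
H200 | 141 GB | Large-scale MCMC, foundation model inference, multi-survey sweeps | $$$
H100 | 80 GB  | Training runs, medium MCMC chains, anomaly detection              | $$
A100 | 80 GB  | General GPU compute, smaller models                               | $
CPU  | N/A    | Data preprocessing, analysis, lightweight tasks                   | Free tier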

Pod Lifecycle

  1. Provision: When an experiment needs GPU, Hubify provisions a pod on RunPod. The system selects the optimal GPU type based on the experiment’s memory and compute requirements.
  2. Initialize: The pod boots with your lab’s environment: dependencies installed, data mounted, SSH keys configured.
  3. Execute: Your experiment runs on the pod. Logs stream in real time. Intermediate results checkpoint to persistent storage.
  4. Teardown: When the experiment completes (or fails), the pod is torn down automatically. Results are saved to your lab before teardown.
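
The ordering guarantee in the lifecycle above (results are saved and the pod is torn down whether the experiment succeeds or fails) can be illustrated with a small Python sketch. This is not Hubify's implementation: `pod_session`, `run_experiment`, and the event names are hypothetical and only record the order of the four stages.

```python
from contextlib import contextmanager


@contextmanager
def pod_session(events):
    """Hypothetical stand-in for a pod: records lifecycle stages in order."""
    events.append("provision")
    events.append("initialize")
    try:
        yield
    finally:
        # Runs on success and on failure, mirroring "completes (or fails)".
        events.append("save_results")
        events.append("teardown")


def run_experiment(experiment, events):
    """Run a callable inside the pod session and return its result."""
    with pod_session(events):
        events.append("execute")
        return experiment()
```

If the experiment raises, the exception still propagates to the caller, but only after the save and teardown steps have been recorded.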

Cost Optimization

Hubify automatically optimizes for cost:
total_cost = runtime_hours * cost_per_hour

If H200 finishes in 1 hour at $4/hr = $4
   H100 finishes in 3 hours at $2/hr = $6
   → System picks H200 (cheaper overall)
You can set a monthly budget cap per lab. When you approach the limit, experiments queue instead of launching, and you get a notification.
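
The comparison above can be reproduced with a few lines of Python. This is a minimal sketch, not Hubify's scheduler: `pick_cheapest` and `launch_or_queue` are hypothetical names, and the budget rule (queue when the estimated cost would push the month's spend past the cap) is an assumption about what "approaching the limit" means.

```python
def pick_cheapest(options):
    """Return the GPU with the lowest total cost, plus all totals.

    options maps a GPU name to (runtime_hours, cost_per_hour).
    """
    totals = {gpu: hours * rate for gpu, (hours, rate) in options.items()}
    best = min(totals, key=totals.get)
    return best, totals


def launch_or_queue(month_spend, estimated_cost, budget_cap):
    """Assumed budget rule: queue if this run would exceed the monthly cap."""
    if month_spend + estimated_cost <= budget_cap:
        return "launch"
    return "queue"


# The figures from the example above: H200 at $4/hr for 1 h, H100 at $2/hr for 3 h.
best, totals = pick_cheapest({"H200": (1.0, 4.0), "H100": (3.0, 2.0)})
print(best, totals)  # H200 {'H200': 4.0, 'H100': 6.0}
```

The point of the calculation is that the hourly rate alone is misleading: a pricier GPU wins whenever its speedup exceeds its price ratio.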

GPU Inference Playbook

Always use torch.utils.data.DataLoader with num_workers=16, pin_memory=True, and prefetch_factor=4 for image/data inference. According to the playbook, this gives a 32x speedup over serial processing.
Key rules from the playbook:
  • Never use serial PIL decoding for batch image processing
  • Never use ProcessPoolExecutor for GPU-bound work
  • Never use HuggingFace streaming for production inference
  • Always pin memory and prefetch for GPU DataLoaders
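
As a rough illustration of the DataLoader settings above, the sketch below builds the keyword arguments and passes them to `torch.utils.data.DataLoader`. The helper names are hypothetical. Capping `num_workers` at the visible CPU count is an assumption added here for pods with fewer than 16 cores, not a playbook rule. `prefetch_factor` is omitted when there are no worker processes because DataLoader rejects it in that case.

```python
import os


def inference_loader_kwargs(num_workers=16, prefetch_factor=4, pin_memory=True):
    """Build DataLoader keyword arguments using the playbook defaults."""
    cores = os.cpu_count() or 1
    # Assumption: never request more workers than there are CPU cores.
    workers = max(0, min(num_workers, cores))
    kwargs = {"num_workers": workers, "pin_memory": pin_memory, "shuffle": False}
    if workers > 0:
        # prefetch_factor is only valid with worker processes.
        kwargs["prefetch_factor"] = prefetch_factor
    return kwargs


def build_inference_loader(dataset, batch_size=256):
    """Wrap a dataset in a DataLoader; batch_size here is illustrative."""
    # Imported lazily so the helper above can be used without PyTorch.
    from torch.utils.data import DataLoader

    return DataLoader(dataset, batch_size=batch_size, **inference_loader_kwargs())
```

Image decoding then happens inside the dataset's `__getitem__`, so it runs in parallel across the worker processes rather than serially in the main process.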

Persistent Storage

Each lab gets persistent storage that survives pod teardowns:
  • /workspace/ on pods maps to your lab’s persistent volume
  • Experiment outputs are automatically synced back to the lab
  • Datasets can be pre-staged in persistent storage for fast access
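
Because /workspace/ survives teardown, an experiment can write intermediate checkpoints there and resume from the latest one. The following is a minimal sketch under stated assumptions: the `checkpoints` subdirectory, the JSON format, and the function names are hypothetical and not part of Hubify. Each file is written to a temporary name and then renamed, so an interrupted write does not leave a partial checkpoint.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(state, step, root="/workspace/checkpoints"):
    """Atomically write a JSON checkpoint for the given step."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    target = root / f"step_{step:08d}.json"
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_name, target)
    return target


def load_latest_checkpoint(root="/workspace/checkpoints"):
    """Return the state from the highest-numbered checkpoint, or None."""
    # Zero-padded step numbers make lexicographic order match numeric order.
    paths = sorted(Path(root).glob("step_*.json"))
    if not paths:
        return None
    with open(paths[-1]) as handle:
        return json.load(handle)
```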

SSH Access

Every running pod is accessible via SSH for debugging:
# Get SSH command for a running pod
hubify pod ssh EXP-054

# Direct SSH (shown in pod details)
ssh root@205.196.19.52 -p 11452

Idle Pod Detection

An idle GPU is a violation. Hubify monitors pod utilization and alerts you when a pod is sitting idle. The system will suggest the next experiment to deploy on an idle pod.
If a pod finishes its assigned experiment and no follow-up is queued, the system:
  1. Alerts you that the pod is idle
  2. Suggests experiments from the queue that could use this pod
  3. Auto-deploys the next experiment if you have auto-schedule enabled
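
The three steps above amount to a small decision procedure, sketched below in Python. Everything here is hypothetical: `QueuedExperiment`, `handle_idle_pod`, and the rule that an experiment "could use" a pod when its VRAM requirement fits are illustrative assumptions, not Hubify's actual matching logic.

```python
from dataclasses import dataclass


@dataclass
class QueuedExperiment:
    name: str
    vram_gb: int  # assumed requirement used to decide whether a pod fits


def handle_idle_pod(pod_vram_gb, queue, auto_schedule):
    """Return the actions taken for an idle pod, in order."""
    actions = ["alert: pod is idle"]
    # Assumption: an experiment fits if it needs no more VRAM than the pod has.
    candidates = [exp for exp in queue if exp.vram_gb <= pod_vram_gb]
    if candidates:
        actions.append("suggest: " + ", ".join(exp.name for exp in candidates))
        if auto_schedule:
            # Deploy the first fitting experiment in queue order.
            actions.append("deploy: " + candidates[0].name)
    return actions
```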

CLI

# List active pods
hubify pod list

# Launch a pod manually
hubify pod create --gpu h100 --hours 4

# Check pod status
hubify pod status pod-abc123

# SSH into a pod
hubify pod ssh pod-abc123

# Terminate a pod
hubify pod stop pod-abc123

# View cost summary
hubify pod cost --month current

AI Experiment Runner

When no GPU pod is running, Hubify can execute experiments in AI runner mode. In this mode, Claude generates a plausible scientific result from the experiment hypothesis and metric, so the experiment completes without GPU costs. Toggle the mode with the EXPERIMENT_RUNNER_MODE variable in the orchestrator environment. It is useful for development and dry runs.