Use Neptune in HPO jobs
When running a hyperparameter optimization job, you can use Neptune Scale to track all the metadata from the study and each trial.
The Training setup section contains a sample model configuration. You can then choose between two approaches:
- Option A: Log metadata from multiple trials to a single run. Useful for analyzing end results of the overall study, not individual trials.
  - Faster, as no Neptune syncing is needed after each trial.
  - Convenient display of an entire study's metadata in a single-run dashboard.
  - Trials can't be compared across different studies.
- Option B: Log each trial to its own run. Useful for comparing multiple trials across multiple studies.
  - Trials can be compared across different studies.
  - Slower, as Neptune needs to sync after each trial.
  - Since trial metadata is spread across separate runs, it can't be displayed in a single-run dashboard.
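To make the difference concrete, here is a sketch of the attribute layout each option produces. It's illustrative only; the exact paths follow from the code later on this page.
# Option A: one run, trials namespaced inside it
#   config/..., trials/0/parameters/lr, trials/0/metrics/batch/loss, ...
#   best/trial, best/metrics/acc, best/parameters/lr
#
# Option B: one run per trial, linked by a shared sweep tag
#   sweep-level run:  config/..., best/trial, best/metrics/acc, ...
#   trial-level runs: trial_num, parameters/lr, metrics/batch/loss, ...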
Before you start
- Configure your Neptune API token and project. For details, see Get started.
- Install Neptune Scale and dependencies:
pip install -U neptune-scale torch torchvision tqdm "numpy<2.0"
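To confirm that the packages are importable before you begin, you can run a quick, optional check (importlib.metadata is used here only to print the installed versions):
from importlib.metadata import version

# Optional sanity check: print the installed versions of the key packages
for pkg in ("neptune-scale", "torch", "torchvision"):
    print(pkg, version(pkg))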
Training setup
Import libraries:
from neptune_scale import Run
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from tqdm.auto import trange, tqdm
import math
from datetime import datetime
ALLOWED_DATATYPES = [int, float, str, datetime, bool, list, set]
Set the hyperparameters and search space:
parameters = {
"batch_size": 128,
"input_size": (1, 28, 28),
"n_classes": 10,
"epochs": 3,
"device": torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
}
input_size = math.prod(parameters["input_size"])
learning_rates = [0.025, 0.05, 0.075] # learning rate choices
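This example sweeps a single hyperparameter. If you want to search over several, one option (a sketch that goes beyond the setup used on this page; the names below are hypothetical) is to enumerate the grid with itertools.product:
from itertools import product

# Hypothetical multi-parameter grid (not used in the rest of this example)
search_space = {
    "lr": [0.025, 0.05, 0.075],
    "batch_size": [64, 128],
}
trial_params = [dict(zip(search_space, values)) for values in product(*search_space.values())]
# trial_params[0] -> {"lr": 0.025, "batch_size": 64}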
Set up the model and dataset:
class BaseModel(nn.Module):
    def __init__(self, input_size, num_classes):
        super(BaseModel, self).__init__()
        self.fc1 = nn.Linear(input_size, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, num_classes)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        return x
criterion = nn.CrossEntropyLoss()
data_tfms = {
"train": transforms.Compose(
[
transforms.ToTensor(),
]
)
}
trainset = datasets.MNIST(
root="mnist",
train=True,
download=True,
transform=data_tfms["train"],
)
trainloader = torch.utils.data.DataLoader(
trainset,
batch_size=parameters["batch_size"],
shuffle=True,
num_workers=0,
)
Initialize the model:
model = BaseModel(
input_size,
parameters["n_classes"],
).to(parameters["device"])
Option A: Log all trials to a single run
In this approach, we create a global Neptune run that tracks metadata from all trials:
from random import random
run = Run(run_id=f"hpo-{random()}")
The run identifier must be unique within the project. The above is one way to generate a unique ID.
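If you prefer not to rely on a random float, a UUID-based identifier works just as well (Option B below uses the same idea for its sweep ID):
import uuid

# Alternative: a UUID-based run identifier
run = Run(run_id=f"hpo-{uuid.uuid4()}")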
Passing credentials without environment variables
Although not recommended, you can also pass your Neptune API token and project name directly in the code:
from neptune_scale import Run
run = Run(
project="team-alpha/project-x", # your full project name here
api_token="h0dHBzOi8aHR0cHM6...Y2MifQ==", # your API token here
)
Instead, consider providing your API token and project path through the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables, respectively.
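For example, with placeholder values, you can set the variables from Python before creating the run (setting them in your shell or CI configuration works the same way):
import os

# Placeholders only; replace with your own project path and API token
os.environ["NEPTUNE_PROJECT"] = "team-alpha/project-x"
os.environ["NEPTUNE_API_TOKEN"] = "<your-api-token>"
# Run() reads these when project and api_token aren't passed explicitly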
Let's log the configuration that's common across all trials:
for key in parameters:
    if type(parameters[key]) not in ALLOWED_DATATYPES:
        run.log_configs({f"config/{key}": str(parameters[key])})
    else:
        run.log_configs({f"config/{key}": parameters[key]})
This creates a config namespace and, inside it, an attribute for each hyperparameter.
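For the parameters defined earlier, the loop is equivalent to a single explicit call along these lines (shown for illustration; the tuple and the torch.device are stringified because they're not in ALLOWED_DATATYPES):
# Equivalent explicit call for the parameters defined earlier (illustration only)
run.log_configs(
    {
        "config/batch_size": 128,
        "config/input_size": "(1, 28, 28)",  # tuple cast to str
        "config/n_classes": 10,
        "config/epochs": 3,
        "config/device": str(parameters["device"]),  # e.g. "cuda:0" or "cpu"
    }
)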
Next, we define a training loop:
# Initialize the tracker for best values across all trials
best_acc = None

for trial, lr in tqdm(
    enumerate(learning_rates),
    total=len(learning_rates),
    desc="Trials",
):
    # Log trial hyperparameters
    run.log_configs({f"trials/{trial}/parameters/lr": lr})

    optimizer = optim.SGD(model.parameters(), lr=lr)

    step = 0
    for epoch in trange(parameters["epochs"], desc=f"Trial {trial} - lr: {lr}"):
        run.log_metrics(data={f"trials/{trial}/epochs": epoch}, step=epoch)
        for x, y in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            x = x.view(x.size(0), -1)
            outputs = model(x)
            loss = criterion(outputs, y)
            _, preds = torch.max(outputs, 1)
            acc = (torch.sum(preds == y.data)) / len(x)

            # Log trial metrics
            run.log_metrics(
                data={
                    f"trials/{trial}/metrics/batch/loss": float(loss),
                    f"trials/{trial}/metrics/batch/acc": float(acc),
                },
                step=step,
            )

            # Log best values across all trials
            if best_acc is None or acc > best_acc:
                best_acc = acc
                run.log_configs(
                    {
                        "best/trial": trial,
                        "best/metrics/loss": float(loss),
                        "best/metrics/acc": float(acc),
                        "best/parameters/lr": lr,
                    }
                )

            loss.backward()
            optimizer.step()
            step += 1
Finally, we close the run:
run.close()
To explore the logged metadata, open your project in the Neptune app and navigate to the Runs section.
In the run metadata, the best namespace contains the best trial, with its metrics and parameters. The trials namespace contains metadata across all trials.
- To organize all relevant metadata in one view, create a custom dashboard.
See example dashboard
- To view the best trials across different runs, create saved table views.
See example table view
Option B: Log each trial to a separate run
Create a sweep-level identifier:
import uuid
sweep_id = str(uuid.uuid4())
Create a sweep-level Neptune run:
sweep_run = Run(run_id=f"sweep-{sweep_id}")
sweep_run.add_tags(["sweep"])
To connect the sweep-level and trial-level runs, add the sweep ID as a tag:
sweep_run.add_tags([sweep_id], group_tags=True)
Log the configuration that's common across all trials:
for key in parameters:
    if type(parameters[key]) not in ALLOWED_DATATYPES:
        sweep_run.log_configs({f"config/{key}": str(parameters[key])})
    else:
        sweep_run.log_configs({f"config/{key}": parameters[key]})
This creates a config namespace and, inside it, an attribute for each hyperparameter.
Define the training loop:
# Initialize attributes for best values across all trials
best_acc = None

for trial, lr in tqdm(
    enumerate(learning_rates),
    total=len(learning_rates),
    desc="Trials",
):
    # Create a trial-level run
    with Run(run_id=f"trial-{sweep_id}-{trial}") as trial_run:
        trial_run.add_tags(["trial"])

        # Add sweep_id to the trial-level run
        trial_run.add_tags([sweep_id], group_tags=True)

        # Log trial number and hyperparameters
        trial_run.log_configs({"trial_num": trial, "parameters/lr": lr})

        optimizer = optim.SGD(model.parameters(), lr=lr)

        step = 0
        for epoch in trange(parameters["epochs"], desc=f"Trial {trial} - lr: {lr}"):
            trial_run.log_metrics(data={"epochs": epoch}, step=epoch)
            for x, y in trainloader:
                x, y = x.to(parameters["device"]), y.to(parameters["device"])
                optimizer.zero_grad()
                x = x.view(x.size(0), -1)
                outputs = model(x)
                loss = criterion(outputs, y)
                _, preds = torch.max(outputs, 1)
                acc = (torch.sum(preds == y.data)) / len(x)

                # Log trial metrics
                trial_run.log_metrics(
                    data={
                        "metrics/batch/loss": float(loss),
                        "metrics/batch/acc": float(acc),
                    },
                    step=step,
                )

                # Log best values across all trials to the sweep-level run
                if best_acc is None or acc > best_acc:
                    best_acc = acc
                    sweep_run.log_configs(
                        {
                            "best/trial": trial,
                            "best/metrics/loss": float(loss),
                            "best/metrics/acc": float(acc),
                            "best/parameters/lr": lr,
                        }
                    )

                loss.backward()
                optimizer.step()
                step += 1
Each trial-level run is automatically stopped upon exiting the context. To stop the sweep-level run, use:
sweep_run.close()
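Alternatively, the sweep-level run can be opened as a context manager, just like the trial-level runs, so it closes automatically. A minimal sketch of that structure:
# Sketch: manage the sweep-level run with a context manager instead of close()
with Run(run_id=f"sweep-{sweep_id}") as sweep_run:
    sweep_run.add_tags(["sweep"])
    sweep_run.add_tags([sweep_id], group_tags=True)
    # ... log the shared config and run the trial loop shown above ...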
To explore the logged metadata, open your project in the Neptune app and navigate to the Runs section.
The best trial, with its metrics and parameters, is available in the best namespace of each sweep-level run.
- If you have multiple sweeps, you can group runs by sweep ID.
See example table view
- To compare trials within or across sweeps, select runs by toggling their eye icons and create a compare dashboard.
See example compare dashboard
- To see both sweep-level and trial-level comparisons together, export charts or dashboards to a report.
See example report
- To compare the average of trials across different sweeps, enable Average grouped runs in chart controls.