Fetch metadata from a run or experiment
To connect to your project, first initialize a project object to perform the fetching on. Then, to access individual experiments and fetch their metadata, you can either:
- Initialize a ReadOnlyRun object directly.
- Fetch and iterate over multiple read-only runs with the fetch_read_only_experiments() or fetch_read_only_runs() method.
What's the connection between a run and an experiment?
In the code, experiments are represented as runs. An experiment run has the experiment name stored in its sys/name attribute.
In the example below, a run is created as the head of the experiment gull-flying-skills:
from neptune_scale import Run
run = Run(
    experiment_name="gull-flying-skills",
    run_id="vigilant-kittiwake-1",
)
If a new run is created with the same experiment name, it becomes the new representative run for the experiment:
run = Run(
    experiment_name="gull-flying-skills",
    run_id="vigilant-kittiwake-2",
)
The vigilant-kittiwake-1 run is still accessible as part of the experiment history, but it's no longer considered an experiment.
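The relationship between runs and experiments can be pictured with a small toy model. This is a plain-Python sketch and not Neptune's implementation: the ExperimentRegistry class and its methods are hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ExperimentRegistry:
    """Toy model: each experiment name points to its most recently created run."""

    def __init__(self) -> None:
        # Maps experiment name -> run IDs in creation order
        self._history: Dict[str, List[str]] = defaultdict(list)

    def create_run(self, experiment_name: str, run_id: str) -> None:
        self._history[experiment_name].append(run_id)

    def head(self, experiment_name: str) -> Optional[str]:
        # The latest run is the one that represents the experiment
        runs = self._history.get(experiment_name)
        return runs[-1] if runs else None

    def history(self, experiment_name: str) -> List[str]:
        return list(self._history.get(experiment_name, []))


registry = ExperimentRegistry()
registry.create_run("gull-flying-skills", "vigilant-kittiwake-1")
registry.create_run("gull-flying-skills", "vigilant-kittiwake-2")

print(registry.head("gull-flying-skills"))
print(registry.history("gull-flying-skills"))
```

In this model, the first run stays in the history while the second one becomes the head of the experiment.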
Before you start
Install neptune-fetcher:
pip install -U neptune-fetcher
Initialize read-only project
To create a read-only project to operate on, use:
from neptune_fetcher import ReadOnlyProject
project = ReadOnlyProject()
If you haven't set your Neptune credentials as environment variables, you can pass the project name or API token as arguments:
project = ReadOnlyProject(
    project="team-alpha/project-x",  # your full project name here
    api_token="h0dHBzOi8aHR0cHM6...Y2MifQ==",  # your API token here
)
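If your own scripts need to support both approaches, one option is to resolve the credentials before creating the project. The sketch below is a hypothetical helper, not part of neptune-fetcher. It assumes the environment variables are named NEPTUNE_PROJECT and NEPTUNE_API_TOKEN, so verify the names against your Neptune setup.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    project: Optional[str] = None,
    api_token: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Hypothetical helper: prefer explicit arguments, fall back to environment variables."""
    # Assumed variable names; they are not taken from the text above
    resolved_project = project or env.get("NEPTUNE_PROJECT")
    resolved_token = api_token or env.get("NEPTUNE_API_TOKEN")
    missing = [
        name
        for name, value in (
            ("NEPTUNE_PROJECT", resolved_project),
            ("NEPTUNE_API_TOKEN", resolved_token),
        )
        if not value
    ]
    if missing:
        raise ValueError(f"Missing Neptune credentials: {', '.join(missing)}")
    return resolved_project, resolved_token


# Example values only; the token here is a placeholder, not a real credential
example_env = {"NEPTUNE_PROJECT": "team-alpha/project-x", "NEPTUNE_API_TOKEN": "example-token"}
project_name, token = resolve_credentials(env=example_env)
print(project_name)
```

The resolved values could then be passed to the project and api_token arguments shown above.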
Option A: Initialize read-only run
You can create a read-only run based on its experiment name, run ID, or Neptune ID.
Experiment name
The name provided to the experiment_name argument at creation. Stored in the sys/name attribute.
from neptune_fetcher import ReadOnlyProject, ReadOnlyRun
project = ReadOnlyProject()
run = ReadOnlyRun(project, experiment_name="seagull-flying-skills")
Run ID
The identifier provided to the run_id argument at run creation. Stored in the sys/custom_run_id attribute.
project = ReadOnlyProject()
run = ReadOnlyRun(project, custom_id="seagull-156xc4f")
Neptune ID
The identifier auto-generated by Neptune. Stored in the sys/id attribute.
project = ReadOnlyProject()
run = ReadOnlyRun(project, with_id="TES-1")
Option B: Fetch read-only experiments or runs
The following methods return experiments or runs in the form of ReadOnlyRun objects:
Fetch experiments
project = ReadOnlyProject()
for experiment in project.fetch_experiments():
    print(experiment)
Fetch runs
project = ReadOnlyProject()
for run in project.fetch_runs():
    print(run)
You can then fetch metadata from the individual run objects.
Accessing run attributes
Use the syntax run[attribute_name] to look up attributes and fetch the logged values:
run = ReadOnlyRun(project, experiment_name="seabird-flying-skills")
run_id = run["sys/custom_run_id"].fetch()
To list the attributes of a run, use the .attribute_names property:
run = ReadOnlyRun(...)
print(list(run.attribute_names))
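Attribute names are slash-separated paths, so a long listing is easier to read when grouped by top-level namespace. The following is a minimal sketch with a hypothetical helper and made-up attribute names. With a real run, you would pass run.attribute_names instead of the example list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_namespace(attribute_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group attribute paths such as 'metrics/loss' by their top-level namespace."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in attribute_names:
        # Everything before the first slash is treated as the namespace
        namespace, _, _rest = name.partition("/")
        groups[namespace].append(name)
    return {namespace: sorted(names) for namespace, names in groups.items()}


# Hypothetical attribute names for illustration
example_names = ["sys/name", "metrics/loss", "sys/custom_run_id", "metrics/accuracy"]
grouped = group_by_namespace(example_names)
print(grouped["metrics"])
```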
Pre-fetch attributes to cache
To improve performance, you can pre-fetch metadata attributes to the internal cache. This way, access to the attributes' values is more efficient.
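To see why pre-fetching helps, consider a toy model of a cache that sits in front of a slow backend. This is only an illustration of the general idea. The CachedAttributes class is hypothetical and makes no claim about how Neptune's internal cache is implemented.

```python
from typing import Any, Dict, Iterable


class CachedAttributes:
    """Toy model of a read-through cache in front of a slow backend."""

    def __init__(self, backend: Dict[str, Any]) -> None:
        self._backend = backend
        self._cache: Dict[str, Any] = {}
        self.backend_calls = 0

    def prefetch(self, paths: Iterable[str]) -> None:
        # One batched backend call covers all requested paths
        self.backend_calls += 1
        self._cache.update({path: self._backend[path] for path in paths})

    def fetch(self, path: str) -> Any:
        if path not in self._cache:
            # Cache miss: one extra backend call for this attribute
            self.backend_calls += 1
            self._cache[path] = self._backend[path]
        return self._cache[path]


backend = {"parameters/optimizer": "adam", "parameters/init_lr": 0.001}
attrs = CachedAttributes(backend)
attrs.prefetch(["parameters/optimizer", "parameters/init_lr"])
optimizer = attrs.fetch("parameters/optimizer")
lr = attrs.fetch("parameters/init_lr")
print(attrs.backend_calls)
```

In this model, both values are served from the cache after a single batched call, instead of one call per attribute.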
Pre-fetching metrics
To pre-fetch metric attributes to the cache, use prefetch_series_values():
run = ReadOnlyRun(...)
run.prefetch_series_values(["metrics/loss", "metrics/accuracy"])
# No more calls to the API
Then, fetch the values using one of the available methods:
print(run["metrics/loss"].fetch_values())
print(run["metrics/accuracy"].fetch_values())
For details, see Fetching series values.
How are metrics logged?
A metric attribute is created using the log_metrics() function. It's of type FloatSeries.
For details, see Log metrics.
Step range
To limit the step range to pre-fetch, pass a 2-tuple to the step_range argument:
run.prefetch_series_values(
    paths=["metrics/loss", "metrics/accuracy"],
    step_range=(100.0, None),
)
Neptune currently supports only a left boundary.
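As an illustration of what a left-bounded step range means, the sketch below filters a list of (step, value) pairs locally. The filter_by_step_range function is hypothetical, and treating the left boundary as inclusive is an assumption rather than documented behavior.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (step, value)


def filter_by_step_range(
    points: List[Point],
    step_range: Tuple[Optional[float], Optional[float]],
) -> List[Point]:
    """Keep points at or after the left boundary; reject a right boundary."""
    lower, upper = step_range
    if upper is not None:
        raise ValueError("Only a left boundary is supported in this sketch")
    if lower is None:
        return list(points)
    # Assumption: the left boundary is inclusive
    return [(step, value) for step, value in points if step >= lower]


points = [(99.0, 0.52), (100.0, 0.50), (101.0, 0.47)]
kept = filter_by_step_range(points, (100.0, None))
print(kept)
```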
Progress bar
By default, a tqdm-based progress bar is used to indicate the download progress.
- To disable it, set the progress_bar argument to None.
- To use a custom progress bar, define a subclass of ProgressBarCallback and pass it to the progress_bar argument:
run.prefetch_series_values(
    ...,
    progress_bar=MyProgressBar,
)
Defining a custom progress bar
To use a custom callback to visualize the download progress, define a class that inherits from ProgressBarCallback. Then, when fetching metadata, pass the type (not an instance) to the progress_bar argument.
Example callback definition, using click
from types import TracebackType
from typing import Any, Optional, Type
from neptune.typing import ProgressBarCallback
class MyProgressBar(ProgressBarCallback):
    def __init__(self, *, description: Optional[str] = None, **_: Any) -> None:
        super().__init__()
        from click import progressbar
        ...
        self._progress_bar = progressbar(iterable=None, length=1, label=description)
        ...
    def update(self, *, by: int, total: Optional[int] = None) -> None:
        if total:
            self._progress_bar.length = total
        self._progress_bar.update(by)
        ...
    def __enter__(self) -> "MyProgressBar":
        self._progress_bar.__enter__()
        return self
        ...
    def __exit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_val: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> None:
        self._progress_bar.__exit__(exc_type, exc_val, exc_tb)
Using the callback
some_fetching_method(progress_bar=MyProgressBar)
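To experiment with the callback shape without installing click or Neptune, a standalone stand-in can help. The TextProgress class below is a hypothetical sketch that mirrors the update, __enter__, and __exit__ methods of the example above. It does not inherit from ProgressBarCallback, so it can't be passed to Neptune as is, and the stream parameter is an addition for testing.

```python
import io
import sys
from typing import Any, Optional, TextIO


class TextProgress:
    """Standalone stand-in that writes one line of text per update."""

    def __init__(
        self,
        *,
        description: Optional[str] = None,
        stream: Optional[TextIO] = None,  # not part of the Neptune callback; added for testing
        **_: Any,
    ) -> None:
        self.description = description or "progress"
        self.stream = stream if stream is not None else sys.stderr
        self.done = 0
        self.total: Optional[int] = None

    def update(self, *, by: int, total: Optional[int] = None) -> None:
        # Remember the total once it is known, then advance the counter
        if total:
            self.total = total
        self.done += by
        suffix = f"/{self.total}" if self.total else ""
        self.stream.write(f"{self.description}: {self.done}{suffix}\n")

    def __enter__(self) -> "TextProgress":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        self.stream.flush()


buffer = io.StringIO()
with TextProgress(description="fetch", stream=buffer) as bar:
    bar.update(by=5, total=10)
    bar.update(by=5)
print(buffer.getvalue(), end="")
```

To turn this into a usable callback, the class would need to subclass ProgressBarCallback as in the click-based example.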
Multi-threading
To speed up the fetching process of metric attributes, you can use multithreading. To enable it, set the use_threads parameter to True:
run.prefetch_series_values(
    paths=["metrics/loss", "metrics/accuracy"],
    use_threads=True,
)
By default, the maximum number of workers is 10. You can change this number with the NEPTUNE_FETCHER_MAX_WORKERS environment variable.
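The general pattern of a worker limit read from the environment can be sketched with the standard library. This is a rough illustration, not the library's code: max_workers_from_env, fetch_all, and fake_fetch are hypothetical names, and how neptune-fetcher validates the variable is not covered here.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Mapping


def max_workers_from_env(env: Mapping[str, str] = os.environ, default: int = 10) -> int:
    """Read the worker limit from the environment, falling back to the default of 10."""
    raw = env.get("NEPTUNE_FETCHER_MAX_WORKERS")
    if raw is None:
        return default
    value = int(raw)
    if value < 1:
        raise ValueError("NEPTUNE_FETCHER_MAX_WORKERS must be a positive integer")
    return value


def fetch_all(
    paths: List[str],
    fetch_one: Callable[[str], List[float]],
    workers: int,
) -> Dict[str, List[float]]:
    """Fetch several series concurrently; results keep the order of paths."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch_one, paths))
    return dict(zip(paths, results))


def fake_fetch(path: str) -> List[float]:
    # Stand-in for a network call; returns a dummy one-point series
    return [float(len(path))]


workers = max_workers_from_env({"NEPTUNE_FETCHER_MAX_WORKERS": "4"})
series = fetch_all(["metrics/loss", "metrics/accuracy"], fake_fetch, workers)
print(workers, series)
```

Threads suit this kind of workload because each fetch spends most of its time waiting on the network.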
Pre-fetching configs
To pre-fetch config attributes to the cache, use prefetch():
run = ReadOnlyRun(...)
run.prefetch(["parameters/optimizer", "parameters/init_lr"])
# No more calls to the API
optimizer = run["parameters/optimizer"].fetch()
lr = run["parameters/init_lr"].fetch()
For more examples, see Fetching single values.
How are configs logged?
A config attribute is created using the log_configs() function. It can be one of several simple types, such as Float, Datetime, or String.
For details, see Log configs.
Fetch metadata values
Once you've obtained a read-only run object, access its attributes with run[attribute_name] and then use a suitable fetching method to download the metadata from the attribute.
run = ReadOnlyRun(...)
# for faster access, optionally pre-fetch attributes to cache
run.prefetch_series_values(["metrics/accuracy", "metrics/loss"])
Fetching metrics (series values)
To fetch metric values as a pandas DataFrame, use fetch_values():
acc_values = run["metrics/accuracy"].fetch_values()
loss_values = run["metrics/loss"].fetch_values()
Timestamps and inherited metrics are included by default. To leave them out, use:
limited_loss_values = run["metrics/loss"].fetch_values(
    include_timestamp=False,
    include_inherited=False,
)
To fetch only the last logged value, use fetch_last():
loss_final = run["metrics/loss"].fetch_last()
Limit step range
To fetch values from a limited step range, pass a 2-tuple to the step_range argument:
loss_values_from_step_100 = run["metrics/loss"].fetch_values(step_range=(100.0, None))
Neptune currently supports only a left boundary.
Configure progress bar
By default, a tqdm-based progress bar is used to indicate the download progress.
- To disable it, set the progress_bar argument to None.
- To use a custom progress bar, define a subclass of ProgressBarCallback and pass the type (not an instance) to the progress_bar argument:
acc_values = run["metrics/accuracy"].fetch_values(
    ...,
    progress_bar=MyProgressBar,
)
For an example callback definition using click, see Defining a custom progress bar in the Pre-fetching metrics section.
Fetching configs (single values)
To fetch a value of a simple type, call fetch() on the attribute:
f1 = run["scores/f1"].fetch()
created_at = run["sys/creation_time"].fetch()
Fetching tags
To fetch tags as a set of strings, use:
tagset = run["sys/tags"].fetch()