Fetch runs or experiments

Similar to displaying and filtering runs in the experiments table, you can fetch runs meeting certain criteria and choose which attributes to include as columns.

Before you start

Install neptune-fetcher:

pip install -U neptune-fetcher

Step 1: Initialize read-only project

To create a read-only project to perform the fetching on, use:

from neptune_fetcher import ReadOnlyProject

project = ReadOnlyProject()

If you haven't set your Neptune credentials as environment variables, you can pass the project name or API token as arguments:

project = ReadOnlyProject(
    project="team-alpha/project-x",  # your full project name here
    api_token="h0dHBzOi8aHR0cHM6...Y2MifQ==",  # your API token here
)
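Before calling ReadOnlyProject() without arguments, it can help to check which credentials are actually set. The following is a minimal sketch, assuming the environment variables are named NEPTUNE_PROJECT and NEPTUNE_API_TOKEN. The helpers credentials_from_env and missing_credentials are hypothetical and not part of neptune-fetcher.

```python
import os


def credentials_from_env(env=os.environ):
    """Return (project, api_token) as found in env; unset values are None.

    Assumption: the variables are named NEPTUNE_PROJECT and NEPTUNE_API_TOKEN.
    """
    return env.get("NEPTUNE_PROJECT"), env.get("NEPTUNE_API_TOKEN")


def missing_credentials(env=os.environ):
    """List the variables that are unset or empty.

    Anything listed here would have to be passed to ReadOnlyProject()
    as an argument instead.
    """
    project, token = credentials_from_env(env)
    missing = []
    if not project:
        missing.append("NEPTUNE_PROJECT")
    if not token:
        missing.append("NEPTUNE_API_TOKEN")
    return missing


if __name__ == "__main__":
    print(missing_credentials())
```

If the returned list is empty, the no-argument form should be enough. Otherwise, pass the missing values explicitly as shown above.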

Step 2: Use fetching methods

Each fetching method has one variant for experiments and one for runs:

  • Fetching experiments: Only runs that represent current experiments are returned.

    When fetching experiments that have a history of forked or restarted runs, the historical runs are not included.

  • Fetching runs: All runs, including those that no longer represent experiments, are returned.

What's the connection between a run and experiment?

In the code, experiments are represented as runs. An experiment run has the experiment name stored in its sys/name attribute.

In the following example, a run is created as the head of the experiment gull-flying-skills:

from neptune_scale import Run

run = Run(
    experiment_name="gull-flying-skills",
    run_id="vigilant-kittiwake-1",
)

If a new run is created with the same experiment name, it becomes the new representative run for the experiment:

run = Run(
    experiment_name="gull-flying-skills",
    run_id="vigilant-kittiwake-2",
)

The vigilant-kittiwake-1 run is still accessible as part of the experiment history, but it's no longer considered an experiment.
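The relationship between runs and experiments can be sketched with a toy model in plain Python. This is only an illustration of the rule described above, not how Neptune implements it. The function split_heads_and_history is hypothetical, and the model assumes that the most recently created run with a given experiment name represents the experiment.

```python
def split_heads_and_history(runs):
    """Separate current experiment runs from superseded ones.

    runs: list of (run_id, experiment_name) tuples in creation order.
    Returns (heads, history), where heads maps each experiment name to the
    run ID that currently represents it, and history lists the run IDs
    that were replaced by a newer run with the same experiment name.
    """
    heads = {}
    history = []
    for run_id, experiment_name in runs:
        if experiment_name in heads:
            # The previous head no longer represents the experiment.
            history.append(heads[experiment_name])
        heads[experiment_name] = run_id
    return heads, history


runs = [
    ("vigilant-kittiwake-1", "gull-flying-skills"),
    ("vigilant-kittiwake-2", "gull-flying-skills"),
]
heads, history = split_heads_and_history(runs)
print(heads)
print(history)
```

In this model, fetching experiments corresponds to looking only at the values in heads, while fetching runs corresponds to looking at every entry in runs.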

Fetch metadata as data frame

To fetch the metadata of experiments as a pandas DataFrame, use fetch_experiments_df():

project = ReadOnlyProject()

all_experiments_df = project.fetch_experiments_df()

Filter by name or ID

You can use a regular expression to match experiment names:

# Include experiments that match regex
specific_experiments_df = project.fetch_experiments_df(
    names_regex=r"astute-.+-135"
)

# Exclude experiments that match regex
specific_experiments_df = project.fetch_experiments_df(
    names_exclude_regex=r"experiment-\d{2,4}"
)

Neptune uses the RE2 regular expression library. For supported regex features and limitations, see the RE2 syntax guide.
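To get a rough idea of what a pattern matches before sending it to Neptune, you can try it locally. The sketch below uses Python's built-in re module, which is not RE2: the two engines agree on basic syntax like the patterns above, but RE2 does not support backreferences or lookaround assertions. The keep helper and the sample names are hypothetical, and treating a pattern as matching anywhere in the name (search semantics) is an assumption.

```python
import re

INCLUDE = r"astute-.+-135"
EXCLUDE = r"experiment-\d{2,4}"


def keep(name, include=None, exclude=None):
    """Return True if name passes the include and exclude patterns.

    Assumption: a pattern counts as a match if it is found anywhere
    in the name (re.search), not only when it spans the whole name.
    """
    if include is not None and not re.search(include, name):
        return False
    if exclude is not None and re.search(exclude, name):
        return False
    return True


# Made-up experiment names for illustration only.
names = ["astute-kittiwake-135", "astute-135", "experiment-042", "baseline"]

included = [n for n in names if keep(n, include=INCLUDE)]
not_excluded = [n for n in names if keep(n, exclude=EXCLUDE)]
print(included)
print(not_excluded)
```

Note that astute-135 does not match the include pattern, because .+ requires at least one character between the two hyphens.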

You can also fetch experiments by custom run ID:

# Match an explicit list of custom IDs
specific_experiments_df = project.fetch_experiments_df(
    custom_ids=["astute-kittiwake-14", "bombastic-seagull-2", "regal-xeme-18"]
)

# Match custom IDs against a regex
specific_experiments_df = project.fetch_experiments_df(
    custom_id_regex=r"[a-e]{2}_.+"
)

The custom ID is the identifier set with the run_id argument when the run is created.

Filter by metadata value

To construct a custom filter, use the query argument and the Neptune Query Language:

experiments_df = project.fetch_experiments_df(
    query="(last(`accuracy`:floatSeries) > 0.88) AND (`f1`:float > 0.9)",
)
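When a query has several conditions, assembling the string by hand gets error-prone. The following sketch builds the query shown above from parts. The helpers nql_condition and nql_and are hypothetical and not part of neptune-fetcher. They cover only the syntax visible in this example (a backtick-quoted attribute, a type suffix, an optional aggregate such as last, and AND) and do no escaping or validation.

```python
def nql_condition(attribute, attr_type, operator, value, aggregate=None):
    """Format one parenthesized condition in the style of the example above."""
    field = f"`{attribute}`:{attr_type}"
    if aggregate is not None:
        # For example last(...) applied to a series attribute.
        field = f"{aggregate}({field})"
    return f"({field} {operator} {value})"


def nql_and(*conditions):
    """Join conditions so that all of them must hold."""
    return " AND ".join(conditions)


query = nql_and(
    nql_condition("accuracy", "floatSeries", ">", 0.88, aggregate="last"),
    nql_condition("f1", "float", ">", 0.9),
)
print(query)
```

The resulting string can then be passed as the query argument of fetch_experiments_df().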

Limit columns

To limit the number of returned columns, you can:

  • specify columns with the columns argument
  • retrieve extra columns that match a regex pattern with the columns_regex argument

For example:

experiments_df = project.fetch_experiments_df(
    columns=["sys/modification_time", "scores/f1"],
    columns_regex=r"tree/.*",
)
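How the two arguments interact can be sketched with a toy model: the explicitly listed columns come first, and any further attributes matching the regex are added to them. The select_columns function and the attribute names are hypothetical, search semantics for the regex is an assumption, and the model ignores any identifier columns that the library may add on its own.

```python
import re


def select_columns(available, columns=None, columns_regex=None):
    """Return explicitly requested columns plus extra regex matches.

    available: all attribute names that exist in the project.
    The result keeps the explicit columns first, followed by regex
    matches in the order they appear in available, without duplicates.
    """
    selected = [name for name in (columns or []) if name in available]
    if columns_regex is not None:
        for name in available:
            if re.search(columns_regex, name) and name not in selected:
                selected.append(name)
    return selected


# Made-up attribute names for illustration only.
available = [
    "sys/creation_time",
    "sys/modification_time",
    "scores/f1",
    "tree/depth",
    "tree/leaves",
    "learning_rate",
]
selected = select_columns(
    available,
    columns=["sys/modification_time", "scores/f1"],
    columns_regex=r"tree/.*",
)
print(selected)
```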

Combine filters

If you combine multiple criteria, they're joined by the logical AND operator.

The following example returns experiments that meet all of these criteria:

  • The name matches the regular expression tree/.*
  • The last logged accuracy value is higher than 0.9
  • The logged learning_rate value is less than 0.01

Additionally, the returned data frame only includes the creation and modification times as columns.

experiments_df = project.fetch_experiments_df(
    names_regex=r"tree/.*",
    query=r'(last(`accuracy`:floatSeries) > 0.9) AND (`learning_rate`:float < 0.01)',
    columns=["sys/creation_time", "sys/modification_time"],
)

List project experiments or runs

To list the identifiers of all the experiments or runs of a project, use:

project = ReadOnlyProject()

for experiment in project.list_experiments():
    print(experiment)

The list_experiments() method returns the identifiers as an iterator of dictionaries. To list runs instead of experiments, use the runs variant, list_runs().
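Because the result is an iterator, it can be consumed only once, so it is often convenient to index the dictionaries by one of their keys. The sketch below uses a stand-in generator instead of a real project so that it runs offline. The key names sys/id, sys/name, and sys/custom_run_id are assumptions about the shape of each dictionary, the values are made up, and index_by is a hypothetical helper.

```python
def fake_list_experiments():
    """Stand-in for project.list_experiments(); keys and values are assumed."""
    yield {
        "sys/id": "PROJ-2",
        "sys/name": "gull-flying-skills",
        "sys/custom_run_id": "vigilant-kittiwake-2",
    }
    yield {
        "sys/id": "PROJ-3",
        "sys/name": "tern-diving-skills",
        "sys/custom_run_id": "astute-kittiwake-14",
    }


def index_by(identifiers, key):
    """Consume an iterator of dictionaries into a lookup table keyed by key."""
    return {entry[key]: entry for entry in identifiers}


by_name = index_by(fake_list_experiments(), "sys/name")
print(sorted(by_name))
print(by_name["gull-flying-skills"]["sys/custom_run_id"])
```

With a real project, you would pass project.list_experiments() in place of the stand-in, after printing one entry to confirm which keys it actually contains.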

To instead get the identifiers as a data frame, use:

project = ReadOnlyProject()

df = project.fetch_experiments()

Fetch read-only experiments or runs

To download metadata from individual experiments or runs, fetch them as ReadOnlyRun objects.

For details, see Fetch metadata from a run or experiment.