Neptune Query Language (NQL)

note

This feature is experimental. We're happy to hear your feedback through GitHub.

When fetching runs from your project, use the Neptune Query Language (NQL) to filter the runs by attribute values and other criteria.

NQL usage

Use the query argument of the fetch_runs_df() and fetch_experiments_df() methods to pass a raw NQL string:

from neptune_fetcher import ReadOnlyProject

project = ReadOnlyProject("workspace/project")

project.fetch_runs_df(
query='(last(`accuracy`:floatSeries) > 0.88) AND (`learning_rate`:float < 0.01)'
)

How is NQL different from the app search?

The search query builder in the web app adds extra functionality on top of NQL to make query building more convenient. Queries are converted to raw NQL under the hood.

In this first version of the API's querying capabilities, NQL is exposed without modifications.

NQL syntax

An NQL query has the following parts:

`<attribute name>`:<attributeType> <OPERATOR> <zero or more values>

For example:

`scores/f1`:float >= 0.60
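
As a rough sketch of how the parts fit together, the snippet below assembles a clause from its components. The nql_clause() helper is hypothetical and not part of neptune_fetcher; it only formats a string, and it assumes that string values contain no double quotes.

```python
def nql_clause(name: str, attr_type: str, operator: str, value) -> str:
    """Format one NQL clause: `name`:type OPERATOR value (hypothetical helper)."""
    if isinstance(value, bool):
        # Booleans are written as True/False, as in the sys/failed examples.
        rendered = "True" if value else "False"
    elif isinstance(value, str):
        # String values go inside double quotes; embedded quotes are not escaped here.
        rendered = f'"{value}"'
    else:
        rendered = str(value)
    # The attribute name is always wrapped in backquotes, which is the safe choice.
    return f"`{name}`:{attr_type} {operator} {rendered}"


query = nql_clause("scores/f1", "float", ">=", 0.6)
print(query)
```

The resulting string can be passed as the query argument. Python renders 0.60 as 0.6, which is numerically the same value.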

Building a query

A valid query is constructed part by part, starting with the attribute name:

Your query
`scores/f1`

Use the attribute name that you specified when assigning the metadata to the run. In the above example, the corresponding logging call is run.log_configs({"scores/f1": f1_score}).

While usually not necessary, it's safest to enclose the attribute name in single backquotes (`).

Multi-clause queries

You can build a complex query, in which multiple conditions are joined by logical operators.

Surround the clauses with () and use AND or OR to join them:

(`attribute1`:attributeType = value1) AND (`attribute2`:attributeType = value2)
Example: Particular learning rate and high enough final accuracy
query='(last(`metrics/acc`:floatSeries) >= 0.85) AND (`learning_rate`:float = 0.002)'

Note that each run is matched against the full query individually.
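
When the clauses are generated in code, the joining rule can be sketched as follows. The join_clauses() helper is hypothetical (it is not a neptune_fetcher function) and only wraps each clause in parentheses before joining.

```python
def join_clauses(clauses, operator="AND"):
    """Wrap each clause in () and join them with AND or OR (hypothetical helper)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(f"({clause})" for clause in clauses)


query = join_clauses(
    [
        "last(`metrics/acc`:floatSeries) >= 0.85",
        "`learning_rate`:float = 0.002",
    ]
)
print(query)
```

This reproduces the query string from the example above, so it can be passed to the query argument unchanged.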

Negation

You can use NOT in front of operators or clauses.

The following are equivalent and both exclude runs that have "blobfish" in their name:

`sys/name`:string NOT CONTAINS "blobfish"
NOT `sys/name`:string CONTAINS "blobfish"

You can also negate joined clauses by enclosing them with parentheses:

Example: Exclude failed runs whose names contain "blobfish"
NOT (`sys/name`:string CONTAINS "blobfish" AND `sys/failed`:bool = True)
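
To see which runs a negated compound clause keeps, the same boolean logic can be mimicked on plain Python dictionaries. This is only a local illustration with made-up run records, not how Neptune evaluates queries: a run is dropped only if both conditions hold.

```python
# Made-up run records for illustration only.
runs = [
    {"name": "cunning-blobfish", "failed": True},
    {"name": "cunning-blobfish-2", "failed": False},
    {"name": "astute-otter", "failed": True},
]


def keep(run):
    # Mirrors: NOT (name CONTAINS "blobfish" AND failed = True)
    return not ("blobfish" in run["name"] and run["failed"])


kept = [run["name"] for run in runs if keep(run)]
print(kept)  # ['cunning-blobfish-2', 'astute-otter']
```

A failed run without "blobfish" in its name and a successful "blobfish" run both pass the filter.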

Aggregate functions of numerical series

You can use the following statistical (aggregate) functions on FloatSeries attributes:

  • average()
  • last()
  • max()
  • min()
  • variance()

For example, to filter by the last logged score of a float series attribute with the path metrics/accuracy, use:

last(`metrics/accuracy`:floatSeries) >= 0.80
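
If such clauses are built programmatically, it can help to reject unsupported function names early. The following is a minimal sketch; aggregate_clause() and AGGREGATES are hypothetical names, not part of neptune_fetcher.

```python
# The aggregate functions listed above for FloatSeries attributes.
AGGREGATES = {"average", "last", "max", "min", "variance"}


def aggregate_clause(func: str, name: str, operator: str, value: float) -> str:
    """Format an aggregate clause such as last(`path`:floatSeries) >= 0.8."""
    if func not in AGGREGATES:
        raise ValueError(f"unsupported aggregate function: {func}")
    return f"{func}(`{name}`:floatSeries) {operator} {value}"


query = aggregate_clause("last", "metrics/accuracy", ">=", 0.8)
print(query)
```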

Examples

Models small enough to be used on mobile that have decent test accuracy

NQL query
(`model_info/size_MB`:float <= 50) AND (last(`test/acc`:floatSeries) > 0.90)
What was logged
run = Run(...)
run.log_configs({"model_info/size_MB": 45})
for epoch in epochs:
    # training loop
    acc = ...
    run.log_metrics({"test/acc": acc})

All of Jackie's runs from the current exploration task

NQL query
(`sys/owner`:string = "jackie") AND (`sys/tags`:stringSet CONTAINS "exploration")
What was logged
run.add_tags(tags=["exploration", "pretrained"])

All failed runs from the start of the year

NQL query
(`sys/creation_time`:datetime > "2024-01-01T00:00:00Z") AND (`sys/failed`:bool = True)
What was logged
# Date is in 2024
run = Run(...)
# Exception was raised during execution

Supported data types

The following sections show example queries for each supported data type.

Float

To query float values, use:

Retrieve runs with F1 score lower than 0.5
project.fetch_runs_df(
query="`f1_score`:float < 0.50"
)

In this case, the logging code could be something like run.log_configs({"f1_score": 0.48}) for a run matching the expression.

Float series

To obtain a value that characterizes a series of values, use an aggregate function:

Filter by last appended accuracy score
last(`metrics/accuracy`:floatSeries) >= 0.80

The following statistical functions are supported:

  • average()
  • last()
  • max()
  • min()
  • variance()

String

You can filter either by the full string, or use the CONTAINS operator to match substrings.

Exact match
project.fetch_runs_df(
query='`sys/name`:string = "cunning-blobfish"'
)
Partial match (contains substring)
project.fetch_runs_df(
query='`sys/name`:string CONTAINS "blobfish"'
)

See also Name.

To match against a regular expression, use the operators MATCHES and NOT MATCHES:

Matches regex
project.fetch_runs_df(
query=r'`parameters/optimizer`:string MATCHES "Ada\\w+"'
)
Doesn't match regex
project.fetch_runs_df(
query=r'`parameters/optimizer`:string NOT MATCHES "Ada\\w+"'
)

Note: When using regex with the query argument, you must escape backslashes and quotes in the pattern. In this case, using a raw Python string is less cluttered than passing a regular string:

Escaping characters in a regular Python string
project.fetch_runs_df(
query='`parameters/optimizer`:string MATCHES "Ada\\\\w+"'
)
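
A quick local check, using only plain Python string literals, shows that a raw string with two backslashes and a regular string with four backslashes produce the same query text:

```python
# Raw string: each backslash is kept literally, so the text contains \\w+.
raw_query = r'`parameters/optimizer`:string MATCHES "Ada\\w+"'

# Regular string: every pair \\ collapses to one backslash, so four are needed.
regular_query = '`parameters/optimizer`:string MATCHES "Ada\\\\w+"'

same = raw_query == regular_query
print(same)       # True
print(raw_query)  # `parameters/optimizer`:string MATCHES "Ada\\w+"
```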

Tags

Tags are stored as a StringSet in the auto-created sys/tags attribute. To filter by one or more tags, query this attribute.

Query by single tag
project.fetch_runs_df(
query='`sys/tags`:stringSet CONTAINS "tag-name"'
)
Query by multiple tags: Matches at least one tag (OR)
(`sys/tags`:stringSet CONTAINS "tag1") OR (`sys/tags`:stringSet CONTAINS "tag2")
Query by multiple tags: Matches all tags (AND)
(`sys/tags`:stringSet CONTAINS "tag1") AND (`sys/tags`:stringSet CONTAINS "tag2")
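
For a longer list of tags, the OR and AND variants can be generated rather than typed out. The tags_query() helper below is a hypothetical sketch, not a neptune_fetcher function, and assumes tag names contain no double quotes.

```python
def tags_query(tags, match="any"):
    """Build a sys/tags filter: match="any" joins with OR, match="all" with AND."""
    joiners = {"any": " OR ", "all": " AND "}
    if match not in joiners:
        raise ValueError('match must be "any" or "all"')
    clauses = [f'(`sys/tags`:stringSet CONTAINS "{tag}")' for tag in tags]
    return joiners[match].join(clauses)


any_query = tags_query(["tag1", "tag2"])
all_query = tags_query(["tag1", "tag2"], match="all")
print(any_query)
print(all_query)
```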

System metadata

The system namespace (sys) automatically stores basic metadata about the environment and run. Most of the values are simple string, float, or Boolean values.

Date and time

Neptune automatically creates three timestamp attributes:

  • sys/creation_time: When the run object was first created.
  • sys/modification_time: When the object was last modified. For example, a tag was removed or some metadata was logged.
  • sys/ping_time: When the object last interacted with the Python client library. That is, something was logged or modified through the code.

For the value, enter a combined date and time representation with a time-zone specification, in ISO 8601 format:

YYYY-MM-DDThh:mm:ssZ

Here, Z denotes UTC (zero offset). You can use a different offset.

Pinged by the Python client after 5 AM UTC on a specific date
`sys/ping_time`:datetime > "2024-02-06T05:00:00Z"
Pinged by the Python client after 5 AM Japanese time on a specific date
`sys/ping_time`:datetime > "2024-02-06T05:00:00+09"
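
If the timestamp comes from a Python datetime object, one way to produce the YYYY-MM-DDThh:mm:ssZ form is to convert the value to UTC first. The nql_timestamp() helper is a hypothetical name used for this sketch; only the standard library is involved.

```python
from datetime import datetime, timedelta, timezone


def nql_timestamp(moment: datetime) -> str:
    """Convert a timezone-aware datetime to UTC and format it with a Z suffix."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 5 AM Japanese time (UTC+9) is 8 PM UTC on the previous day.
jst = timezone(timedelta(hours=9))
stamp = nql_timestamp(datetime(2024, 2, 6, 5, 0, 0, tzinfo=jst))
query = f'`sys/ping_time`:datetime > "{stamp}"'
print(query)  # `sys/ping_time`:datetime > "2024-02-05T20:00:00Z"
```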

You can also enter relative time values:

  • -2h (last 2 hours)
  • -5d (last 5 days)
  • -1M (last month)

Created more than 3 months ago
`sys/creation_time`:datetime < "-3M"

Description

To filter runs by the description, use:

Exact match
project.fetch_runs_df(
query='`sys/description`:string = "test run on new data"'
)
Partial match (contains substring)
project.fetch_runs_df(
query='`sys/description`:string CONTAINS "new data"'
)

ID

To filter runs by their Neptune ID, use:

Single run
project.fetch_runs_df(
query='`sys/id`:string = "NLI-345"'
)

To fetch multiple specific runs at once, use the OR operator:

Multiple runs
project.fetch_runs_df(
query='(`sys/id`:string = "NLI-35") OR (`sys/id`:string = "NLI-36")'
)

Name

To filter experiments by their name, use:

Exact match
project.fetch_runs_df(
query='`sys/name`:string = "cunning-blobfish"'
)
Partial match (contains substring)
project.fetch_runs_df(
query='`sys/name`:string CONTAINS "blobfish"'
)

You can also use a regular expression to match experiment names. In this case, instead of query, use the names_regex or names_exclude_regex parameter:

Include experiments that match regex
specific_experiments_df = project.fetch_experiments_df(
names_regex=r"astute-.+-135"
)
Exclude experiments that match regex
specific_experiments_df = project.fetch_experiments_df(
names_exclude_regex=r"experiment-\d{2,4}"
)

Neptune uses the RE2 regular expression library. For supported regex features and limitations, see the RE2 syntax guide.
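
Before sending a pattern, it can be handy to try it against a few sample names locally. The sketch below uses Python's built-in re module as a stand-in, which is an assumption: re is not RE2 (for example, RE2 does not support backreferences or lookarounds), and whether Neptune matches anywhere in the name or against the whole name is not stated here. The sample names are made up.

```python
import re

# Made-up experiment names for a local dry run.
names = ["astute-blobfish-135", "astute-otter-7", "experiment-042"]

include = re.compile(r"astute-.+-135")
exclude = re.compile(r"experiment-\d{2,4}")

# Assumption: the pattern may match anywhere in the name (re.search semantics).
included = [name for name in names if include.search(name)]
remaining = [name for name in names if not exclude.search(name)]

print(included)   # ['astute-blobfish-135']
print(remaining)  # ['astute-blobfish-135', 'astute-otter-7']
```

Patterns that stick to basic character classes and quantifiers, like the two above, behave the same way in both engines.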

Owner

To filter by the user or service account that created the run, use:

By owner: Regular username
project.fetch_runs_df(
query='`sys/owner`:string = "jackie"'
)
By one of the workspace service accounts
project.fetch_runs_df(
query='`sys/owner`:string CONTAINS "@ml-team"'
)

In this case, the expression matches all service account names that belong to the workspace ml-team.

State

To fetch only closed runs, use:

Fetch inactive runs
project.fetch_runs_df(
query='`sys/state`:experimentState = "inactive"'
)

Failed status

If an exception occurs during a run, the run is marked as "Failed". In practice, this means that the sys/failed attribute is set to True.

Fetch failed runs
project.fetch_runs_df(
query='`sys/failed`:bool = True'
)