
Metadata

RiverBench includes rich RDF metadata for each dataset, profile, schema, and the suite itself. This metadata is used to generate the website, and can also be used by other tools. The metadata is permissively licensed.

Accessing metadata

On each page of a RiverBench resource (e.g., dataset, task, profile) you will find a box with links to the RDF metadata. You can also use HTTP content negotiation on permanent URLs (starting with https://w3id.org/riverbench/) to request the machine-readable metadata instead of the HTML page.

You can find the permanent URL in the Info box with metadata download links, or by copying the Permanent URL link in the top right corner of the page.

Examples of URLs that will return the metadata with content negotiation:

To request a metadata file in a given format explicitly, you can also append .nt, .ttl, .rdf, or .jelly to these URLs.

The following metadata formats are supported:

N-Triples (.nt, content type application/n-triples)
Turtle (.ttl, content type text/turtle)
RDF/XML (.rdf, content type application/rdf+xml)
Jelly (.jelly, content type application/x-jelly-rdf)
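The two access methods described above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the function names are made up for this example, and the resource path used in the usage note is a hypothetical placeholder rather than a real RiverBench URL.

```python
from urllib.request import Request, urlopen

# File extension -> content type, as listed above.
FORMATS = {
    "nt": "application/n-triples",
    "ttl": "text/turtle",
    "rdf": "application/rdf+xml",
    "jelly": "application/x-jelly-rdf",
}


def negotiated_request(permanent_url: str, extension: str) -> Request:
    """Build a request that asks for a format through the Accept header."""
    return Request(permanent_url, headers={"Accept": FORMATS[extension]})


def explicit_url(permanent_url: str, extension: str) -> str:
    """Build a URL that names the format explicitly by appending the extension."""
    if extension not in FORMATS:
        raise ValueError(f"unsupported metadata format: {extension}")
    return f"{permanent_url}.{extension}"


def fetch(request_or_url) -> bytes:
    """Perform the HTTP request and return the raw response body."""
    with urlopen(request_or_url, timeout=30) as response:
        return response.read()
```

For example, `fetch(negotiated_request(url, "ttl"))` and `fetch(explicit_url(url, "ttl"))` should both return Turtle for the same permanent URL `url`, such as one copied from the Info box. Note that Jelly is a binary format, so its response body should be kept as bytes rather than decoded as text.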

If you are curious, you can find the rules that make this work here.

Embedded JSON-LD metadata

The HTML pages of RiverBench resources include embedded JSON-LD metadata in the <script type="application/ld+json"> tag. This metadata is used by search engines and other tools to understand the content of the page.

The embedded metadata is only a subset of the full metadata, and it uses schema.org terms instead of DCAT / DCMI terms. The translation from DCAT is done automatically, roughly using this mapping.
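As a rough sketch of how a tool might read the embedded JSON-LD, the following uses Python's standard html.parser and json modules. The sample HTML and its field values are hypothetical and only stand in for a real RiverBench page; the class and function names are likewise invented for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script contents arrive here as raw text, possibly in several chunks.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_json_ld(html: str) -> list:
    """Return every embedded JSON-LD document found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page content, for illustration only.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset", "name": "Hypothetical dataset"}
</script>
</head><body></body></html>"""

documents = extract_json_ld(SAMPLE_HTML)
```

Because the embedded JSON-LD is only a partial view, tools that need the complete metadata are better served by the RDF formats described in the previous section.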

Metadata dumps

Starting with RiverBench version 2.0.0, the entire metadata of RiverBench is published in easily accessible dumps. The dump for a given RiverBench release can be downloaded from the main page of RiverBench. The download links are in the "Info" box near the top of the page.

The dumps can also be downloaded directly using the following URLs, where {version} is the version tag of the suite release (e.g., dev or 2.0.0):

https://w3id.org/riverbench/dumps/{version}.{extension}.gz
- Metadata dump without benchmark results (a single RDF graph).
- Available since RiverBench 2.0.0.
- Supported extensions: nt, ttl, rdf, jelly.
https://w3id.org/riverbench/dumps-with-results/{version}.{extension}.gz
- Metadata dump with community-reported benchmark results. The default graph contains the RiverBench metadata. The benchmark results are in named graphs, using the nanopublication structure.
- Available since RiverBench 2.1.0.
- Supported extensions: nq, trig, jelly.
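The URL patterns above can be turned into a small helper. This is a sketch using only the Python standard library; the function names are assumptions made for this example, and the helper does not check whether a given version actually provides a dump (for instance, dumps with results are only available since 2.1.0).

```python
import gzip
from urllib.request import urlopen

BASE_URL = "https://w3id.org/riverbench"

# Supported extensions for each dump type, as listed above.
PLAIN_EXTENSIONS = {"nt", "ttl", "rdf", "jelly"}
RESULTS_EXTENSIONS = {"nq", "trig", "jelly"}


def dump_url(version: str, extension: str, with_results: bool = False) -> str:
    """Build the download URL of a metadata dump for a suite release."""
    allowed = RESULTS_EXTENSIONS if with_results else PLAIN_EXTENSIONS
    if extension not in allowed:
        raise ValueError(
            f"extension {extension!r} is not supported, expected one of {sorted(allowed)}"
        )
    kind = "dumps-with-results" if with_results else "dumps"
    return f"{BASE_URL}/{kind}/{version}.{extension}.gz"


def download_dump(version: str, extension: str, with_results: bool = False) -> bytes:
    """Download a dump and return its gzip-decompressed contents."""
    with urlopen(dump_url(version, extension, with_results), timeout=60) as response:
        return gzip.decompress(response.read())
```

For example, `dump_url("2.0.0", "ttl")` builds the URL of the Turtle dump without results, and `dump_url("dev", "trig", with_results=True)` builds the URL of the TriG dump that includes the benchmark results in named graphs.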

Editing metadata

A large portion of the metadata is automatically generated. However, the rest is written manually in Turtle files in various repositories:

RiverBench main repo / metadata.ttl – metadata about the suite itself
{Dataset repo} / metadata.ttl – metadata about the dataset
{Category repo} / metadata.ttl – metadata about the benchmark category
{Category repo} / profiles / {profile name}.ttl – metadata about the profile
{Category repo} / tasks / {task name} / metadata.ttl – metadata about the benchmark task

All of these files can be conveniently accessed and edited using the Edit this page or Edit metadata button at the top of the page.

Feel free to submit pull requests to these files to fix errors or add new information. After the pull request is accepted, the changes will be reflected automatically in the website and the READMEs.

Used ontologies

The metadata mainly uses these ontologies:

DCAT 3
DCMI Metadata Terms
FOAF
RDF Stream Taxonomy (RDF-STaX) – for RDF stream type annotations
EuroVoc – for dataset themes
VoID
RiverBench metadata ontology
RiverBench documentation ontology