Creating a new benchmark task

This guide explains how to propose a new benchmark task to be included in RiverBench.

Step 0: What is a benchmark task?

A benchmark task is a description of a concrete benchmarking procedure, for example: measuring deserialization throughput. Each task can be associated with several metrics, for example average throughput, 95th percentile throughput, and memory usage.

Each task belongs to one benchmark category. A benchmark category is a group of tasks that share the same requirements for datasets. For example, the stream category contains tasks that require a grouped RDF stream as input. Have a look at the list of benchmark categories to see the existing categories.

Step 1: Create a task proposal

Open a new task proposal in the RiverBench repository: New task proposal

Fill in the fields with the required information, using the instructions embedded in the form.

Note

If you have trouble filling in any of the fields, you can leave them blank and ask a maintainer for help.

Does your task require more than just RDF data? SPARQL, anyone?

Attaching SPARQL queries, RML mappings, or other additional files to datasets is not supported yet. However, there are plans to implement this feature in the future. If you need this functionality, please leave a comment on this issue to let us know.

Step 2: Wait for approval

RiverBench curators will be notified of your proposal and will review the form and the task description. The curators may ask for additional information or clarifications.

Step 3: Create a pull request

Once the task proposal is approved, you will be able to create a pull request to add the task to the category repository. The pull request should:

  • Create a new subfolder under the tasks folder of the category repository. The name of the folder must match the task's identifier.
  • Create a metadata.ttl file in the task's folder, using this template (a hedged sketch of such a file is shown after this list).
  • Fill out the metadata in the metadata.ttl file using the information from the task proposal.
  • The description of the task in dcterms:description should contain only enough information to understand what the task is about. Details about metrics, specific procedures, etc., should go in the task's documentation (see below).
  • Example of a completed metadata.ttl file: stream-latency-end-to-end.
  • Create an index.md file that will contain the task's elaborated description. Example: stream-latency-end-to-end.
  • (optional) Create any number of additional documentation pages for the task, for example, a page with the task's results. You can also include images in the task's folder.
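For orientation, here is a minimal, hypothetical sketch of what a metadata.ttl file might look like. Only dcterms:description is named by this guide; the rb: prefix, the rb:Task class, the subject IRI, and the other properties shown are illustrative assumptions. Copy the authoritative structure from the linked template and the completed example above.

```turtle
# Hypothetical sketch of a task's metadata.ttl; follow the linked
# template for the authoritative property list. The rb: namespace,
# class, and subject IRI below are assumptions, not the confirmed schema.
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rb:      <https://w3id.org/riverbench/schema/metadata#> .

<#task> a rb:Task ;
    # Assumed property; the identifier must match the task's folder name.
    dcterms:identifier "stream-example-task" ;
    dcterms:title "Example stream task" ;
    # Keep dcterms:description brief; elaborate details go in index.md.
    dcterms:description "Measures deserialization throughput of a grouped RDF stream." .
```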

Step 4: Wait for merge

An admin will review your pull request (if they don't, remind them with a comment in your task proposal). Once the pull request is approved and merged, the task will be added to the category repository and will be available on the RiverBench website. 🎉
