Dataset: muziekweb (development version)

The dataset consists of the main graph of Muziekweb, a high-quality Dutch knowledge base about music. It contains information about artists, CDs, LPs, and more. The knowledge base is richly annotated and contains many links to external resources.

Info

Download this metadata in RDF: Turtle, N-Triples, RDF/XML, Jelly
Source repository: dataset-muziekweb
Permanent URL: https://w3id.org/riverbench/datasets/muziekweb/dev

Stream preview
0000000000.ttl
<https://data.muziekweb.nl/Link/00001178b1ca0a9390b0ea37d5e5b4dfaad63657>
        a                        <https://data.muziekweb.nl/vocab/ExternalLink>;
        <http://www.w3.org/2000/01/rdf-schema#label>
                "Listen to \"Lukas Passion 1748\" on \"Spotify\""@nl;
        <http://schema.org/url>  "https://open.spotify.com/album/13O0k9nDMic8IsZXjlB2Bl"^^<http://www.w3.org/2001/XMLSchema#anyURI>;
        <https://data.muziekweb.nl/vocab/provider>
                "Spotify" .
0000000010.ttl
<https://data.muziekweb.nl/Link/0000ac289c77969f6c9365d43ba089a8b779e0ab>
        a                              <https://data.muziekweb.nl/vocab/OrderInformation>;
        <http://www.w3.org/2000/01/rdf-schema#label>
                "order ‘Odeon 10/14’ from BERTUS"@nl;
        <http://schema.org/offeredBy>  "BERTUS";
        <https://data.muziekweb.nl/vocab/ean>
                "3596972402623";
        <https://data.muziekweb.nl/vocab/label>
                <https://data.muziekweb.nl/Link/L00000005556>;
        <https://data.muziekweb.nl/vocab/labelNumber>
                "324062" .
0000000100.ttl
<https://data.muziekweb.nl/Link/0004dde0f50630fe85984f480665eb6ad5c87cab>
        a                              <https://data.muziekweb.nl/vocab/OrderInformation>;
        <http://www.w3.org/2000/01/rdf-schema#label>
                "order ‘Shi qing, hua yi’ from UNIVERSAL - DIGITAL"@nl;
        <http://schema.org/offeredBy>  "UNIVERSAL - DIGITAL";
        <https://data.muziekweb.nl/vocab/ean>
                "602517321144";
        <https://data.muziekweb.nl/vocab/label>
                <https://data.muziekweb.nl/Link/L00000021678>;
        <https://data.muziekweb.nl/vocab/labelNumber>
                "602517321144" .
0000001000.ttl
<https://data.muziekweb.nl/Link/003980fc34a52eba8ea58b5b83d7d54552d9bc31>
        a                        <https://data.muziekweb.nl/vocab/ExternalLink>;
        <http://www.w3.org/2000/01/rdf-schema#label>
                "Listen to \"Iberia\" on \"Spotify\""@nl;
        <http://schema.org/url>  "https://open.spotify.com/album/4d6QpbCGzljwYE6BtrOoD5"^^<http://www.w3.org/2001/XMLSchema#anyURI>;
        <https://data.muziekweb.nl/vocab/provider>
                "Spotify" .
0000010000.ttl
<https://data.muziekweb.nl/Link/0236038d60cf58c5652e053296cffe229dc0ab59>
        a                              <https://data.muziekweb.nl/vocab/OrderInformation>;
        <http://www.w3.org/2000/01/rdf-schema#label>
                "order ‘Chicken rhythm ; vol.2’ from SOUND PRODUCTS"@nl;
        <http://schema.org/offeredBy>  "SOUND PRODUCTS";
        <https://data.muziekweb.nl/vocab/ean>
                "31287017016";
        <https://data.muziekweb.nl/vocab/label>
                <https://data.muziekweb.nl/Link/L00000006328>;
        <https://data.muziekweb.nl/vocab/labelNumber>
                "LP 3055" .

General information

Technical metadata

  • Has stream type usage:
    • RDF stream type usage (1)
      • Type: RDF stream type usage (stax:RdfStreamTypeUsage)
      • Comment: The dataset can be viewed as a stream of graphs corresponding to items in the knowledge base. Each graph is uniquely identified by its subject IRI. (en)
      • Has stream type: RDF subject graph stream (stax:subjectGraphStream)
    • RDF stream type usage (2)
  • Has stream element count: 2,450,357
  • Has stream element split:
    • Type: Stream elements split by topic (rb:TopicStreamElementSplit)
    • Comment: Each stream element corresponds to a different item in the knowledge base. The size of an element varies with the amount of information available about the item. (en)
    • Has subject shape:
      • Has subject shape (1)
        • Comment: Target instances of any class. (en)
        • Target subjects of: Type (rdf:type)
      • Has subject shape (2)
  • Uses vocabulary:
  • Conforms to W3C RDF 1.1 specification: yes
  • Conforms to W3C RDF-star draft specification as of December 17, 2021: yes
  • Uses generalized triples: no
  • Uses generalized RDF datasets: no
  • Uses RDF-star: no
  • Language: nl
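
The subject graph stream described above can be illustrated with a minimal sketch that groups N-Triples lines into stream elements by subject. It assumes that all triples of one element are contiguous in the input, which this page does not state explicitly; the function names are hypothetical and the sample lines are adapted from the stream preview.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def subject_of(line: str) -> str:
    """Return the subject term of one N-Triples line (its first token)."""
    return line.split(None, 1)[0]


def subject_graphs(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (subject, triples) for each run of consecutive lines sharing a subject.

    Assumption: triples belonging to one stream element are contiguous.
    """
    # Drop blank lines and N-Triples comments before grouping.
    stripped = (ln.strip() for ln in lines)
    triples = (ln for ln in stripped if ln and not ln.startswith("#"))
    for subject, group in groupby(triples, key=subject_of):
        yield subject, list(group)


RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
VOCAB = "https://data.muziekweb.nl/vocab/"
LINK_A = "<https://data.muziekweb.nl/Link/00001178b1ca0a9390b0ea37d5e5b4dfaad63657>"
LINK_B = "<https://data.muziekweb.nl/Link/0000ac289c77969f6c9365d43ba089a8b779e0ab>"

# A few triples from the first two preview elements, rewritten as N-Triples.
sample = [
    f"{LINK_A} {RDF_TYPE} <{VOCAB}ExternalLink> .",
    f'{LINK_A} <{VOCAB}provider> "Spotify" .',
    f"{LINK_B} {RDF_TYPE} <{VOCAB}OrderInformation> .",
    f'{LINK_B} <{VOCAB}ean> "3596972402623" .',
    f'{LINK_B} <{VOCAB}labelNumber> "324062" .',
]

for subject, triples in subject_graphs(sample):
    print(subject, len(triples))
```

Grouping only consecutive lines keeps memory use bounded by the largest element, which matters because the largest element in the full distribution has 6,660 statements.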

Distributions

The dataset is published in four size variants (10K, 100K, 1M, and Full), each containing a specific number of stream elements. For each size, three distribution types are available: flat (a single N-Triples/N-Quads file), streaming (a .tar.gz archive with Turtle/TriG files, one file per stream element), and Jelly (a native binary format for streaming RDF). See the documentation for more details.
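
As an illustration of the streaming distribution, the sketch below reads a .tar.gz archive sequentially with Python's standard tarfile module and yields one stream element per file. The function names are hypothetical, and the demo archive built in memory only stands in for a real download; the directory layout inside the real archives is an assumption.

```python
import io
import tarfile
from typing import BinaryIO, Iterator, Tuple


def iter_stream_elements(archive: BinaryIO) -> Iterator[Tuple[str, str]]:
    """Yield (member name, file text) for each regular file in a .tar.gz archive."""
    # Mode "r|gz" reads the archive as a forward-only stream, so members
    # are processed one by one without unpacking the archive to disk.
    with tarfile.open(fileobj=archive, mode="r|gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            extracted = tar.extractfile(member)
            if extracted is None:
                continue
            yield member.name, extracted.read().decode("utf-8")


def build_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory archive that stands in for a real download."""
    files = [
        ("0000000000.ttl", '<https://example.org/a> <https://example.org/p> "Spotify" .\n'),
        ("0000000001.ttl", '<https://example.org/b> <https://example.org/p> "BERTUS" .\n'),
    ]
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, text in files:
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


for name, text in iter_stream_elements(build_demo_archive()):
    print(name, len(text))
```

For a downloaded file, the same function could be called with a file object opened in binary mode, for example open(path, "rb").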

Distribution size   Statements   Flat       Streaming   Jelly
10K                 51,721       865.7 KB   861.0 KB    770.1 KB
100K                517,454      8.5 MB     8.4 MB      7.8 MB
1M                  6,916,692    91.7 MB    86.8 MB     82.9 MB
Full                36,195,263   294.6 MB   244.4 MB    261.4 MB

The full metadata of all distributions can be found below.

Full flat distribution

Full stream distribution

Full Jelly distribution

1M elements flat distribution

1M elements stream distribution

1M elements Jelly distribution

100K elements flat distribution

100K elements stream distribution

100K elements Jelly distribution

10K elements flat distribution

10K elements stream distribution

10K elements Jelly distribution

Statistics

Statistics for full distributions

  • Title: Statistics for full distributions
Statistic            Sum          Unique       Mean     St. dev.   Min.     Max.
IRIs                 44,452,562   ~3,452,546   18.14    18.46      3        6,638
Blank nodes          0            N/A          0.00     0.00       0        0
Literals             15,040,430   ~4,914,385   6.14     5.00       0        338
Simple literals      4,975,319    ~2,229,891   2.03     0.91       0        336
Datatype literals    5,383,476    ~1,138,339   2.20     3.03       0        7
Language literals    4,681,635    ~1,531,542   1.91     2.04       0        12
Datatypes            4,645,137    7            1.90     2.58       0        6
ASCII control chars  0            N/A          0.00     0.00       0        0
Quoted triples       0            N/A          0.00     0.00       0        0
Subjects             2,450,357    ~2,451,259   1.00     0.00       1        1
Predicates           22,909,902   ~46          9.35     6.79       1        25
Objects              34,132,733   ~7,980,006   13.93    16.91      1        6,659
Graphs               2,450,357    ~1           1.00     0.00       1        1
Statements           36,195,263   N/A          14.77    17.50      1        6,660
Bytes per statement  N/A          N/A          142.55   16.47      104.00   695.90
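
The Mean column appears to be the Sum divided by the stream element count (2,450,357 for the full distribution). This is an inference from the published numbers, not something the page states. A short check under that assumption, with hypothetical variable names:

```python
# Stream element count of the full distribution, from the technical metadata.
ELEMENTS = 2_450_357

# Sum values copied from the statistics for the full distributions.
SUMS = {
    "IRIs": 44_452_562,
    "Literals": 15_040_430,
    "Predicates": 22_909_902,
    "Objects": 34_132_733,
    "Statements": 36_195_263,
}

# Mean values reported in the same table.
REPORTED_MEANS = {
    "IRIs": 18.14,
    "Literals": 6.14,
    "Predicates": 9.35,
    "Objects": 13.93,
    "Statements": 14.77,
}


def mean_per_element(total: int, elements: int = ELEMENTS) -> float:
    """Average number of occurrences per stream element."""
    return total / elements


for name, total in SUMS.items():
    computed = round(mean_per_element(total), 2)
    print(f"{name}: computed {computed:.2f}, reported {REPORTED_MEANS[name]:.2f}")
```

For these five rows the computed values match the reported means to two decimal places.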

Statistics for 1M distributions

  • Title: Statistics for 1M distributions
Statistic            Sum          Unique       Mean     St. dev.   Min.     Max.
IRIs                 9,658,163    ~1,108,872   9.66     7.13       3        57
Blank nodes          0            N/A          0.00     0.00       0        0
Literals             4,125,961    ~2,306,863   4.13     2.56       0        18
Simple literals      2,136,904    ~1,090,693   2.14     0.89       0        10
Datatype literals    783,604      ~388,659     0.78     1.71       0        7
Language literals    1,205,453    ~821,840     1.21     1.07       0        8
Datatypes            714,783      6            0.71     1.47       0        6
ASCII control chars  0            N/A          0.00     0.00       0        0
Quoted triples       0            N/A          0.00     0.00       0        0
Subjects             1,000,000    ~999,698     1.00     0.00       1        1
Predicates           6,078,202    ~33          6.08     3.60       1        24
Objects              6,705,922    ~2,483,034   6.71     6.08       1        50
Graphs               1,000,000    ~1           1.00     0.00       1        1
Statements           6,916,692    N/A          6.92     6.56       1        52
Bytes per statement  N/A          N/A          155.95   11.88      113.13   217.25

Statistics for 100K distributions

  • Title: Statistics for 100K distributions
Statistic            Sum          Unique       Mean     St. dev.   Min.     Max.
IRIs                 778,178      ~110,449     7.78     1.38       5        9
Blank nodes          0            N/A          0.00     0.00       0        0
Literals             345,232      ~242,124     3.45     0.60       2        12
Simple literals      219,197      ~119,367     2.19     0.88       1        9
Datatype literals    33,557       ~33,438      0.34     0.47       0        1
Language literals    92,478       ~89,509      0.92     0.29       0        3
Datatypes            33,557       1            0.34     0.47       0        1
ASCII control chars  0            N/A          0.00     0.00       0        0
Quoted triples       0            N/A          0.00     0.00       0        0
Subjects             100,000      ~99,960      1.00     0.00       1        1
Predicates           513,894      ~10          5.14     0.93       3        6
Objects              509,516      ~252,891     5.10     0.96       3        14
Graphs               100,000      ~1           1.00     0.00       1        1
Statements           517,454      N/A          5.17     1.01       3        14
Bytes per statement  N/A          N/A          157.67   10.26      140.10   202.00

Statistics for 10K distributions

  • Title: Statistics for 10K distributions
Statistic            Sum          Unique       Mean     St. dev.   Min.     Max.
IRIs                 77,802       ~12,509      7.78     1.38       5        9
Blank nodes          0            N/A          0.00     0.00       0        0
Literals             34,504       ~24,689      3.45     0.60       2        8
Simple literals      21,909       ~12,094      2.19     0.87       1        6
Datatype literals    3,332        ~3,332       0.33     0.47       0        1
Language literals    9,263        ~9,199       0.93     0.29       0        2
Datatypes            3,332        1            0.33     0.47       0        1
ASCII control chars  0            N/A          0.00     0.00       0        0
Quoted triples       0            N/A          0.00     0.00       0        0
Subjects             10,000       ~10,009      1.00     0.00       1        1
Predicates           51,364       ~10          5.14     0.93       3        6
Objects              50,942       ~27,082      5.09     0.96       3        10
Graphs               10,000       ~1           1.00     0.00       1        1
Statements           51,721       N/A          5.17     1.01       3        11
Bytes per statement  N/A          N/A          157.64   10.25      141.75   200.40