Dataset: linked-spending (development version)
This is a subset of the LinkedSpending dataset (LS package 2013-9), which contains government spending information from around the world. The dataset uses the RDF Data Cube vocabulary. Only the spending observations were kept in this subset; extra contextual information was discarded. See the website and the paper for more details.
Info
Download this metadata in RDF: Turtle, N-Triples, RDF/XML, Jelly
Source repository: dataset-linked-spending
Permanent URL: https://w3id.org/riverbench/datasets/linked-spending/dev
Stream preview
<http://linkedspending.aksw.org/resource/observation-2011saiki_budget-/935a15a2195545d040e64b8b486a3f4a833cbb7c>
a <http://purl.org/linked-data/cube#Observation>;
<http://www.w3.org/2000/01/rdf-schema#label>
"2011saiki_budget, observation /935a15a2195545d040e64b8b486a3f4a833cbb7c";
<http://dbpedia.org/ontology/currency>
<http://dbpedia.org/resource/Japanese_yen>;
<http://dublincore.org/documents/2012/06/14/dcmi-terms/source>
[] ;
<http://linkedspending.aksw.org/ontology/amount>
"0.0";
<http://linkedspending.aksw.org/ontology/category>
<http://openspending.org/2011saiki_budget/category/8>;
<http://linkedspending.aksw.org/ontology/subcategory>
<http://openspending.org/2011saiki_budget/subcategory/8-1>;
<http://linkedspending.aksw.org/ontology/uniquekey>
"627";
<http://purl.org/linked-data/cube#dataSet>
<http://linkedspending.aksw.org/resource/2011saiki_budget>;
<http://purl.org/linked-data/sdmx/2009/attribute#refArea>
<http://linkedgeodata.org/triplify/node424313451>;
<http://purl.org/linked-data/sdmx/2009/dimension#refPeriod>
"2011-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://linkedspending.aksw.org/resource/observation-2011saiki_budget-/93b403bcc1b4323e65a08e0a941116ddb35d971e>
a <http://purl.org/linked-data/cube#Observation>;
<http://www.w3.org/2000/01/rdf-schema#label>
"2011saiki_budget, observation /93b403bcc1b4323e65a08e0a941116ddb35d971e";
<http://dbpedia.org/ontology/currency>
<http://dbpedia.org/resource/Japanese_yen>;
<http://dublincore.org/documents/2012/06/14/dcmi-terms/source>
[] ;
<http://linkedspending.aksw.org/ontology/amount>
"23800.0";
<http://linkedspending.aksw.org/ontology/category>
<http://openspending.org/2011saiki_budget/category/2>;
<http://linkedspending.aksw.org/ontology/subcategory>
<http://openspending.org/2011saiki_budget/subcategory/2-2>;
<http://linkedspending.aksw.org/ontology/uniquekey>
"2730";
<http://purl.org/linked-data/cube#dataSet>
<http://linkedspending.aksw.org/resource/2011saiki_budget>;
<http://purl.org/linked-data/sdmx/2009/attribute#refArea>
<http://linkedgeodata.org/triplify/node424313451>;
<http://purl.org/linked-data/sdmx/2009/dimension#refPeriod>
"2011-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://linkedspending.aksw.org/resource/observation-2011saiki_budget-/165df673595e4987e590097a4b307bad389a864a>
a <http://purl.org/linked-data/cube#Observation>;
<http://www.w3.org/2000/01/rdf-schema#label>
"2011saiki_budget, observation /165df673595e4987e590097a4b307bad389a864a";
<http://dbpedia.org/ontology/currency>
<http://dbpedia.org/resource/Japanese_yen>;
<http://dublincore.org/documents/2012/06/14/dcmi-terms/source>
[] ;
<http://linkedspending.aksw.org/ontology/amount>
"4701946.0";
<http://linkedspending.aksw.org/ontology/category>
<http://openspending.org/2011saiki_budget/category/2>;
<http://linkedspending.aksw.org/ontology/subcategory>
<http://openspending.org/2011saiki_budget/subcategory/2-2>;
<http://linkedspending.aksw.org/ontology/uniquekey>
"3478";
<http://purl.org/linked-data/cube#dataSet>
<http://linkedspending.aksw.org/resource/2011saiki_budget>;
<http://purl.org/linked-data/sdmx/2009/attribute#refArea>
<http://linkedgeodata.org/triplify/node424313451>;
<http://purl.org/linked-data/sdmx/2009/dimension#refPeriod>
"2011-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://linkedspending.aksw.org/resource/observation-2011saiki_budget-/b410f194c507d51d7fa43d983cec8fa3eb76f921>
a <http://purl.org/linked-data/cube#Observation>;
<http://www.w3.org/2000/01/rdf-schema#label>
"2011saiki_budget, observation /b410f194c507d51d7fa43d983cec8fa3eb76f921";
<http://dbpedia.org/ontology/currency>
<http://dbpedia.org/resource/Japanese_yen>;
<http://dublincore.org/documents/2012/06/14/dcmi-terms/source>
[] ;
<http://linkedspending.aksw.org/ontology/amount>
"11313.0";
<http://linkedspending.aksw.org/ontology/category>
<http://openspending.org/2011saiki_budget/category/2>;
<http://linkedspending.aksw.org/ontology/subcategory>
<http://openspending.org/2011saiki_budget/subcategory/2-2>;
<http://linkedspending.aksw.org/ontology/uniquekey>
"3161";
<http://purl.org/linked-data/cube#dataSet>
<http://linkedspending.aksw.org/resource/2011saiki_budget>;
<http://purl.org/linked-data/sdmx/2009/attribute#refArea>
<http://linkedgeodata.org/triplify/node424313451>;
<http://purl.org/linked-data/sdmx/2009/dimension#refPeriod>
"2011-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://linkedspending.aksw.org/resource/observation-aide-publique-au-developpement-france-2011-/757555294af72fb53f0434bba6a71b8a9ae9692c>
a <http://purl.org/linked-data/cube#Observation>;
<http://www.w3.org/2000/01/rdf-schema#label>
"aide-publique-au-developpement-france-2011, observation /757555294af72fb53f0434bba6a71b8a9ae9692c";
<http://dbpedia.org/ontology/currency>
<http://dbpedia.org/resource/Euro>;
<http://dublincore.org/documents/2012/06/14/dcmi-terms/source>
[] ;
<http://linkedspending.aksw.org/ontology/amount>
"504520.0";
<http://linkedspending.aksw.org/ontology/amount-paid>
"168810";
<http://linkedspending.aksw.org/ontology/bimulti>
"bilatérale";
<http://linkedspending.aksw.org/ontology/canal>
"Public sector";
<http://linkedspending.aksw.org/ontology/code>
"2011-6627";
<http://linkedspending.aksw.org/ontology/date-debut>
"26/09/2011";
<http://linkedspending.aksw.org/ontology/date-fin>
"30/06/2017";
<http://linkedspending.aksw.org/ontology/description>
"Systèmes de production agricoles durables à Madagascar";
<http://linkedspending.aksw.org/ontology/from>
<http://openspending.org/aide-publique-au-developpement-france-2011/from/afd>;
<http://linkedspending.aksw.org/ontology/nature-operation>
"activité déjà notifiée antérieurement (augmentation/diminution d'un engagement antérieur, versement d'un engagement antérieur)";
<http://linkedspending.aksw.org/ontology/secteur>
"INDUSTRIES MANUFACTURIERES";
<http://linkedspending.aksw.org/ontology/titre>
"EXTENS.SECT.AGRO-IND.PALM.HUIL ";
<http://linkedspending.aksw.org/ontology/to>
<http://openspending.org/aide-publique-au-developpement-france-2011/to/cate-d-ivoire>;
<http://linkedspending.aksw.org/ontology/type-aide>
"Fonds communs/financements groupés";
<http://linkedspending.aksw.org/ontology/type-financement>
"Prêt d'aide sauf réorganisation de la dette";
<http://linkedspending.aksw.org/ontology/type-ressource>
"APD (aide publique au développement)";
<http://purl.org/linked-data/cube#dataSet>
<http://linkedspending.aksw.org/resource/aide-publique-au-developpement-france-2011>;
<http://purl.org/linked-data/sdmx/2009/attribute#refArea>
<http://linkedgeodata.org/triplify/node1363947712>;
<http://purl.org/linked-data/sdmx/2009/dimension#refPeriod>
"1986-12-06"^^<http://www.w3.org/2001/XMLSchema#date> .
General information
- Title: LinkedSpending (en)
- Identifier: linked-spending
- Version: dev
- Theme:
- Financial statistics (eurovoc:4263)
- Official statistics (eurovoc:4267)
- Public expenditure (eurovoc:403)
- Public finance (eurovoc:1018)
- Creator:
- Konrad Höffner (1)
- Name: Konrad Höffner
- Comment: Creator and maintainer of the LinkedSpending dataset. (en)
- Homepage: https://www.imise.uni-leipzig.de/Mitarbeiter/Konrad_Hoeffner
- AKSW team (2)
- Name: AKSW team
- Homepage: http://aksw.org/Team
- Piotr Sowiński (3)
- Name: Piotr Sowiński
- Nickname: Ostrzyciel
- Comment: Processing the dataset
- Homepage:
- License: https://spdx.org/licenses/PDDL-1.0
- Source:
- http://linkedspending.aksw.org/
- Höffner, K., Martin, M., & Lehmann, J. (2015). LinkedSpending: OpenSpending becomes Linked Open Data. Semantic Web, 7(1), 95–104. https://doi.org/10.3233/sw-150172 (1)
- Date Issued: 2023-05-01
- Date Modified: 2026-06-20
- Landing page: linked-spending (dev)
- BibTeX citation:
@article{H_ffner_2015,
  title={LinkedSpending: OpenSpending becomes Linked Open Data},
  volume={7},
  ISSN={1570-0844},
  url={http://dx.doi.org/10.3233/SW-150172},
  DOI={10.3233/sw-150172},
  number={1},
  journal={Semantic Web},
  publisher={SAGE Publications},
  author={Höffner, Konrad and Martin, Michael and Lehmann, Jens},
  editor={Noy, Natasha},
  year={2015},
  month=Mar,
  pages={95–104}
}
Technical metadata
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 2,477,552
- Has stream element split:
- Type: Stream elements split by topic (rb:TopicStreamElementSplit)
- Comment: Each stream element corresponds to one subject (usually an observation). (en)
- Has subject shape:
- Has subject shape (1)
- Comment: Some elements have their class missing. (en)
- Target subjects of: Label (rdfs:label)
- Has subject shape (2)
- Comment: Target instances of any class. (en)
- Target subjects of: Type (rdf:type)
- Uses vocabulary:
- Conforms to W3C RDF 1.1 specification: yes
- Conforms to W3C RDF-star draft specification as of December 17, 2021: yes
- Uses generalized triples: no
- Uses generalized RDF datasets: no
- Uses RDF-star: no
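The relationship between the two stream types listed above can be illustrated with a small sketch. It assumes, for illustration only, that all triples sharing a subject sit next to each other in the flat stream. The prefixed names and the `subject_graphs` helper are hypothetical and are not part of any RiverBench tooling.

```python
from itertools import groupby
from typing import Iterable, Iterator

Triple = tuple[str, str, str]  # (subject, predicate, object) as plain strings


def subject_graphs(triples: Iterable[Triple]) -> Iterator[tuple[str, list[Triple]]]:
    """Group consecutive triples that share a subject into one stream element.

    Assumption: triples of the same subject are adjacent in the input.
    """
    for subject, group in groupby(triples, key=lambda triple: triple[0]):
        yield subject, list(group)


# Hypothetical flat triple stream with two observations.
flat = [
    ("obs:1", "rdf:type", "qb:Observation"),
    ("obs:1", "ls:amount", '"0.0"'),
    ("obs:2", "rdf:type", "qb:Observation"),
    ("obs:2", "ls:amount", '"23800.0"'),
    ("obs:2", "qb:dataSet", "ls:2011saiki_budget"),
]

elements = list(subject_graphs(flat))
for subject, graph in elements:
    print(subject, len(graph))
```

Read this way, the flat view and the subject graph view carry the same statements. Only the grouping into stream elements differs.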
Distributions
Download links
The dataset is published in a few size variants, each containing a specific number of stream elements. For each size, there are three distribution types available: flat (an N-Triples/N-Quads file in the RDF Message Log format), streaming (a .tar.gz archive with Turtle/TriG files, one file per stream element), and Jelly (a native binary format for streaming RDF). See the documentation for more details.
| Distribution size | Statements | Flat | Streaming | Jelly |
|---|---|---|---|---|
| 10K | 158,342 | 2.0 MB | 1.3 MB | 1.2 MB |
| 100K | 1,716,898 | 17.0 MB | 10.0 MB | 11.4 MB |
| 1M | 23,371,403 | 226.2 MB | 139.9 MB | 131.9 MB |
| Full | 55,097,866 | 561.0 MB | 346.0 MB | 326.6 MB |
The full metadata of all distributions can be found below.
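As a rough sketch of how the streaming variant described above might be consumed, the following reads a `.tar.gz` archive sequentially and yields one Turtle document per file. The function name `iter_stream_elements` is hypothetical. The only assumption about the archive layout is the one stated above: one regular file per stream element.

```python
import tarfile
from typing import Iterator


def iter_stream_elements(archive_path: str) -> Iterator[tuple[str, str]]:
    """Yield (member name, Turtle text) for each regular file in a .tar.gz archive."""
    # "r|gz" reads the archive as a forward-only stream, so members are
    # processed one at a time instead of loading the whole archive.
    with tarfile.open(archive_path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            handle = archive.extractfile(member)
            if handle is None:
                continue
            yield member.name, handle.read().decode("utf-8")
```

With a locally downloaded archive, `for name, turtle in iter_stream_elements("stream_10K.tar.gz"): ...` would visit the elements in archive order. Parsing each Turtle document is left to an RDF library of choice.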
Full flat distribution
- Title: Full flat distribution
- Identifier: flat-full
- Has file name: flat_full.nt.gz
- Has distribution type:
- Flat distribution (RDF Messages) (rb:flatDistribution)
- Full distribution (rb:fullDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 2,477,552
- Byte size: 561.0 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 2b316147ff3541daa70a1365ac94ad1e
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 9b1d0d020e238201e8abffb6d0c8d1859dfa1f36
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-full
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/flat_full.nt.gz
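The published checksums can be used to check a downloaded file. The following is a minimal sketch using only the Python standard library. The helper names `file_digest` and `matches_published` are hypothetical, and the file is assumed to have been downloaded to a local path beforehand.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha1", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str, algorithm: str = "sha1") -> bool:
    """Compare a file's digest with a published checksum value, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For the full flat distribution, a call such as `matches_published("flat_full.nt.gz", "9b1d0d020e238201e8abffb6d0c8d1859dfa1f36")` would be expected to return `True` for an intact download. Passing `algorithm="md5"` together with the MD5 value checks the other published digest.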
Full stream distribution
- Title: Full stream distribution
- Identifier: stream-full
- Has file name: stream_full.tar.gz
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 2,477,552
- Byte size: 346.0 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: e8136920dd68f17c7b683ff50daa09a7
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 4f63e300fa1024b3ecf5bdc84c44bfee44e5d33f
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-full
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/stream_full.tar.gz
Full Jelly distribution
- Title: Full Jelly distribution
- Identifier: jelly-full
- Has file name: jelly_full.jelly.gz
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Jelly distribution (rb:jellyDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 2,477,552
- Byte size: 326.6 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 4d8ef86efdbe63d8614bdb217dc502e6
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 5a3974cac8acd190de151e24aa33565f10b6f606
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-full
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/jelly_full.jelly.gz
1M elements flat distribution
- Title: 1M elements flat distribution
- Identifier: flat-1m
- Has file name: flat_1M.nt.gz
- Has distribution type:
- Flat distribution (RDF Messages) (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 1,000,000
- Byte size: 226.2 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 3b4ed62e5f6a46a1f472edf00ffbac60
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 859941427787238146fd57deafd60b21ddaf20b2
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-1m
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/flat_1M.nt.gz
1M elements stream distribution
- Title: 1M elements stream distribution
- Identifier: stream-1m
- Has file name: stream_1M.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 1,000,000
- Byte size: 139.9 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 621a14a0cc9fe45a5ffc75531851ac67
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 2778232a32af5aa32c0c885146de00c5c340bcaa
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-1m
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/stream_1M.tar.gz
1M elements Jelly distribution
- Title: 1M elements Jelly distribution
- Identifier: jelly-1m
- Has file name: jelly_1M.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 1,000,000
- Byte size: 131.9 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 38c83c9b9af5f8a33cf8aa0572e04d48
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: f190d02bc4cc4f243bb6b9e77a80a846661ea185
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-1m
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/jelly_1M.jelly.gz
100K elements flat distribution
- Title: 100K elements flat distribution
- Identifier: flat-100k
- Has file name: flat_100K.nt.gz
- Has distribution type:
- Flat distribution (RDF Messages) (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 100,000
- Byte size: 17.0 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 3083d66ca7d4cecc2d01cbd8c647ea5e
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 6b7750210c101a58cd18b623808a9ebb458a751d
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-100k
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/flat_100K.nt.gz
100K elements stream distribution
- Title: 100K elements stream distribution
- Identifier: stream-100k
- Has file name: stream_100K.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 100,000
- Byte size: 10.0 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 18c6057f69cde7657d0b2ceb6cf2a43e
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 9f642efa9d262a706656c8c9a4b1f84073a8aaa2
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-100k
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/stream_100K.tar.gz
100K elements Jelly distribution
- Title: 100K elements Jelly distribution
- Identifier: jelly-100k
- Has file name: jelly_100K.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 100,000
- Byte size: 11.4 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 03a4fe12e7d683e8239c78f2a2ea3a67
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: a2721a1bc9e2836e285ea63128dc3a3fce1d7363
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-100k
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/jelly_100K.jelly.gz
10K elements flat distribution
- Title: 10K elements flat distribution
- Identifier: flat-10k
- Has file name: flat_10K.nt.gz
- Has distribution type:
- Flat distribution (RDF Messages) (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 10,000
- Byte size: 2.0 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: a662f93908f409f133eb1fda3a75493d
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 3726c6110c3d612beb7f0986783b0b322647e3aa
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10k
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/flat_10K.nt.gz
10K elements stream distribution
- Title: 10K elements stream distribution
- Identifier: stream-10k
- Has file name: stream_10K.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 10,000
- Byte size: 1.3 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 09cf0d59155b29dc18e97237f4140da6
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: e0cc192f3cb853fa35924e9afc858bbe0d190481
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10k
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/stream_10K.tar.gz
10K elements Jelly distribution
- Title: 10K elements Jelly distribution
- Identifier: jelly-10k
- Has file name: jelly_10K.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs corresponding to statistical observations or other subjects (e.g., statistical properties). Each graph is uniquely identified by its subject IRI. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 10,000
- Byte size: 1.2 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: fdcc5fbfa52ae1566d577814ed473524
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: af0343168389d422c258e86d4c86f32889d273d4
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10k
- Download URL: https://w3id.org/riverbench/datasets/linked-spending/dev/files/jelly_10K.jelly.gz
Statistics
Statistics for full distributions
- Title: Statistics for full distributions
| | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 83,873,394 | ~3,232,432 | 33.85 | 12.84 | 3 | 84 |
| Blank nodes | 2,583,713 | N/A | 1.04 | 0.21 | 0 | 2 |
| Literals | 18,740,789 | ~4,836,067 | 7.56 | 3.87 | 0 | 43 |
| Simple literals | 15,143,397 | ~4,843,751 | 6.11 | 3.69 | 0 | 43 |
| Datatype literals | 3,597,392 | ~7,209 | 1.45 | 0.55 | 0 | 4 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 3,409,585 | 3 | 1.38 | 0.49 | 0 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 2,477,552 | ~2,468,698 | 1.00 | 0.00 | 1 | 1 |
| Predicates | 51,106,554 | ~708 | 20.63 | 7.50 | 2 | 49 |
| Objects | 51,613,790 | ~8,039,135 | 20.83 | 7.97 | 1 | 86 |
| Graphs | 2,477,552 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 55,097,866 | N/A | 22.24 | 9.61 | 2 | 86 |
| Bytes per statement | N/A | N/A | 215.17 | 50.49 | 2.09 | 16,200.00 |
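The Mean column in the table above appears to be a per-stream-element average, that is, the Sum divided by the stream element count of 2,477,552. This reading is an inference from the numbers rather than something the page states. The short check below recomputes a few rows under that assumption, with values copied from the table.

```python
# Stream element count of the full distribution (from the metadata above).
ELEMENTS = 2_477_552

# "Sum" and "Mean" values copied from the table for a few rows.
sums = {
    "IRIs": 83_873_394,
    "Blank nodes": 2_583_713,
    "Literals": 18_740_789,
    "Statements": 55_097_866,
}
published_mean = {
    "IRIs": 33.85,
    "Blank nodes": 1.04,
    "Literals": 7.56,
    "Statements": 22.24,
}

# Assumption: mean = sum / number of stream elements, rounded to two decimals.
derived = {name: round(total / ELEMENTS, 2) for name, total in sums.items()}

for name, value in derived.items():
    status = "matches" if value == published_mean[name] else "differs from"
    print(f"{name}: derived {value:.2f} {status} published {published_mean[name]:.2f}")
```

If this reading is right, the figure of roughly 22 statements per element reflects the size of a typical observation, such as those shown in the stream preview.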
Statistics for 1M distributions
- Title: Statistics for 1M distributions
| | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 35,194,519 | ~1,336,113 | 35.19 | 13.13 | 3 | 60 |
| Blank nodes | 1,020,633 | N/A | 1.02 | 0.15 | 0 | 2 |
| Literals | 7,242,453 | ~2,005,026 | 7.24 | 3.04 | 0 | 26 |
| Simple literals | 5,666,659 | ~1,995,706 | 5.67 | 2.79 | 0 | 26 |
| Datatype literals | 1,575,794 | ~5,841 | 1.58 | 0.62 | 0 | 4 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 1,481,292 | 3 | 1.48 | 0.50 | 0 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 1,000,000 | ~998,186 | 1.00 | 0.00 | 1 | 1 |
| Predicates | 19,928,203 | ~383 | 19.93 | 5.05 | 2 | 32 |
| Objects | 22,529,402 | ~3,328,758 | 22.53 | 9.84 | 1 | 52 |
| Graphs | 1,000,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 23,371,403 | N/A | 23.37 | 10.26 | 2 | 52 |
| Bytes per statement | N/A | N/A | 218.21 | 67.92 | 2.09 | 16,200.00 |
Statistics for 100K distributions
- Title: Statistics for 100K distributions
| | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 2,497,664 | ~210,282 | 24.98 | 2.52 | 3 | 30 |
| Blank nodes | 99,849 | N/A | 1.00 | 0.04 | 0 | 1 |
| Literals | 809,395 | ~187,264 | 8.09 | 2.49 | 0 | 17 |
| Simple literals | 709,140 | ~185,894 | 7.09 | 2.49 | 0 | 17 |
| Datatype literals | 100,255 | ~2,538 | 1.00 | 0.07 | 0 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 100,255 | 3 | 1.00 | 0.07 | 0 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 100,000 | ~100,086 | 1.00 | 0.00 | 1 | 1 |
| Predicates | 1,716,279 | ~56 | 17.16 | 2.45 | 2 | 23 |
| Objects | 1,590,629 | ~396,718 | 15.91 | 2.41 | 1 | 34 |
| Graphs | 100,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 1,716,898 | N/A | 17.17 | 2.44 | 2 | 34 |
| Bytes per statement | N/A | N/A | 228.15 | 30.31 | 2.09 | 2,517.50 |
Statistics for 10K distributions
- Title: Statistics for 10K distributions
| | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 226,552 | ~10,827 | 22.66 | 6.07 | 3 | 30 |
| Blank nodes | 9,907 | N/A | 0.99 | 0.10 | 0 | 1 |
| Literals | 86,396 | ~32,579 | 8.64 | 5.27 | 0 | 16 |
| Simple literals | 76,087 | ~31,639 | 7.61 | 5.28 | 0 | 15 |
| Datatype literals | 10,309 | ~976 | 1.03 | 0.22 | 0 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 10,309 | 3 | 1.03 | 0.22 | 0 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 10,000 | ~9,970 | 1.00 | 0.00 | 1 | 1 |
| Predicates | 157,827 | ~48 | 15.78 | 5.84 | 2 | 23 |
| Objects | 155,028 | ~43,291 | 15.50 | 5.43 | 1 | 23 |
| Graphs | 10,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 158,342 | N/A | 15.83 | 5.82 | 2 | 23 |
| Bytes per statement | N/A | N/A | 226.19 | 72.83 | 2.09 | 2,517.50 |