Dataset: officegraph (development version)
OfficeGraph is a real-world dataset of measurements from 444 IoT devices, collected over 11 months. The devices comprise 17 different sensor models, which measure different properties. The data was collected in a 7-story Dutch office building and consists of about 90 million RDF triples. See the paper listed under Source for more details.
The elements in the dataset are ordered from oldest to newest by the measurement time (saref:hasTimestamp predicate).
Additional data, such as descriptions of the devices taking the measurements and of the rooms in the building, as well as a copy of this dataset, can be found here.
Info
Download this metadata in RDF: Turtle, N-Triples, RDF/XML, Jelly
Source repository: dataset-officegraph
Permanent URL: https://w3id.org/riverbench/datasets/officegraph/dev
Stream preview
PREFIX ic: <https://interconnectproject.eu/example/>
PREFIX om: <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ic:property_R5_56__co2_
a ic:CO2Level .
ic:measurement_R5_56__co2__0
a saref:Measurement;
saref:hasTimestamp "2022-02-28T23:59:00"^^xsd:dateTime;
saref:hasValue "504"^^xsd:float;
saref:isMeasuredIn om:partsPerMillion;
saref:relatesToProperty ic:property_R5_56__co2_ .
PREFIX ic: <https://interconnectproject.eu/example/>
PREFIX om: <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ic:property_R5_157__humidity_
a saref:Humidity .
ic:measurement_R5_157__humidity__0
a saref:Measurement;
saref:hasTimestamp "2022-03-01T00:00:00"^^xsd:dateTime;
saref:hasValue "23"^^xsd:float;
saref:isMeasuredIn om:percent;
saref:relatesToProperty ic:property_R5_157__humidity_ .
PREFIX ic: <https://interconnectproject.eu/example/>
PREFIX om: <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ic:property_R5_15__humidity_
a saref:Humidity .
ic:measurement_R5_15__humidity__0
a saref:Measurement;
saref:hasTimestamp "2022-03-01T00:02:00"^^xsd:dateTime;
saref:hasValue "22"^^xsd:float;
saref:isMeasuredIn om:percent;
saref:relatesToProperty ic:property_R5_15__humidity_ .
PREFIX ic: <https://interconnectproject.eu/example/>
PREFIX om: <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ic:property_R5_73__co2_
a ic:CO2Level .
ic:measurement_R5_73__co2__1
a saref:Measurement;
saref:hasTimestamp "2022-03-01T00:31:00"^^xsd:dateTime;
saref:hasValue "430"^^xsd:float;
saref:isMeasuredIn om:partsPerMillion;
saref:relatesToProperty ic:property_R5_73__co2_ .
PREFIX ic: <https://interconnectproject.eu/example/>
PREFIX om: <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ic:property_R5_159__humidity_
a saref:Humidity .
ic:measurement_R5_159__humidity__11
a saref:Measurement;
saref:hasTimestamp "2022-03-01T05:47:00"^^xsd:dateTime;
saref:hasValue "26"^^xsd:float;
saref:isMeasuredIn om:percent;
saref:relatesToProperty ic:property_R5_159__humidity_ .
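The chronological ordering stated in the dataset description can be spot-checked on the preview. The following is a minimal Python sketch and not part of RiverBench: the helper names `extract_timestamp` and `is_chronological` are invented for illustration, and the regular expression assumes the literal is written exactly as in the preview (prefixed name `saref:hasTimestamp`, datatype `xsd:dateTime`). A real pipeline would more likely use an RDF parser.

```python
import re
from datetime import datetime

# Assumption: timestamps appear in the same textual form as in the preview.
TIMESTAMP_RE = re.compile(r'saref:hasTimestamp\s+"([^"]+)"\^\^xsd:dateTime')


def extract_timestamp(turtle_text: str) -> datetime:
    """Return the measurement time found in one stream element (hypothetical helper)."""
    match = TIMESTAMP_RE.search(turtle_text)
    if match is None:
        raise ValueError("no saref:hasTimestamp literal found")
    return datetime.fromisoformat(match.group(1))


def is_chronological(elements) -> bool:
    """True if the timestamps never decrease from one element to the next."""
    stamps = [extract_timestamp(element) for element in elements]
    return all(earlier <= later for earlier, later in zip(stamps, stamps[1:]))


# Timestamp lines copied from the five preview elements, in stream order.
preview = [
    'saref:hasTimestamp "2022-02-28T23:59:00"^^xsd:dateTime;',
    'saref:hasTimestamp "2022-03-01T00:00:00"^^xsd:dateTime;',
    'saref:hasTimestamp "2022-03-01T00:02:00"^^xsd:dateTime;',
    'saref:hasTimestamp "2022-03-01T00:31:00"^^xsd:dateTime;',
    'saref:hasTimestamp "2022-03-01T05:47:00"^^xsd:dateTime;',
]

print(is_chronological(preview))
```

The comparison uses `<=` rather than `<` because the temporal resolution is one minute, so several measurements may plausibly share a timestamp.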
General information
- Title: OfficeGraph (en)
- Identifier: officegraph
- Version: dev
- Theme:
- Data collection (eurovoc:6030)
- Internet of Things (eurovoc:c_b12a760a)
- Office space (eurovoc:c_4b5a18f8)
- Creator:
- Adam Skaskiewicz (1)
- Name: Adam Skaskiewicz
- Nickname: adamskas
- Comment: Author of benchmark dataset (en)
- Roderick van der Weerdt (2)
- Name: Roderick van der Weerdt
- Comment: Co-author of original dataset (en)
- Victor de Boer (3)
- Name: Victor de Boer
- Comment: Co-author of original dataset (en)
- Ronald Siebes (4)
- Name: Ronald Siebes
- Comment: Co-author of original dataset (en)
- Ronnie Groenewold (5)
- Name: Ronnie Groenewold
- Comment: Co-author of original dataset (en)
- Frank van Harmelen (6)
- Name: Frank van Harmelen
- Comment: Co-author of original dataset (en)
- License: https://spdx.org/licenses/CC-BY-4.0
- Source:
- van der Weerdt, R., de Boer, V., Siebes, R., Groenewold, R., & van Harmelen, F. (2024). OfficeGraph: A Knowledge Graph of Office Building IoT Measurements. The Semantic Web, 94–109. https://doi.org/10.1007/978-3-031-60635-9_6 (1)
- https://github.com/RoderickvanderWeerdt/OfficeGraph/tree/main
- Date Issued: 2025-01-18
- Date Modified: 2025-01-27
- Landing page: officegraph (dev)
- BibTeX citation:
@inbook{van_der_Weerdt_2024,
  title={OfficeGraph: A Knowledge Graph of Office Building IoT Measurements},
  ISBN={9783031606359},
  ISSN={1611-3349},
  url={http://dx.doi.org/10.1007/978-3-031-60635-9_6},
  DOI={10.1007/978-3-031-60635-9_6},
  booktitle={The Semantic Web},
  publisher={Springer Nature Switzerland},
  author={van der Weerdt, Roderick and de Boer, Victor and Siebes, Ronald and Groenewold, Ronnie and van Harmelen, Frank},
  year={2024},
  pages={94–109}
}
Technical metadata
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 14,930,478
- Has stream element split:
- Type:
- Stream elements split by time (rb:TimeStreamElementSplit)
- Stream elements split by topic (rb:TopicStreamElementSplit)
- Comment: Each stream element corresponds to one measurement from one sensor in the office building. (en)
- Has temporal property: http://www.w3.org/ns/sosa/resultTime
- Has subject shape:
- Comment: Target instances of class saref:Measurement. (en)
- Target class: https://saref.etsi.org/core/Measurement
- Uses vocabulary:
- Conforms to W3C RDF 1.1 specification: yes
- Conforms to W3C RDF-star draft specification as of December 17, 2021: yes
- Uses generalized triples: no
- Uses generalized RDF datasets: no
- Uses RDF-star: no
- Temporal resolution: PT1M
Distributions
Download links
The dataset is published in five size variants (10K, 100K, 1M, 10M, and Full), each containing a specific number of stream elements. For each size, three distribution types are available: flat (a single N-Triples/N-Quads file), streaming (a .tar.gz archive with Turtle/TriG files, one file per stream element), and Jelly (a native binary format for streaming RDF). See the documentation for more details.
Distribution size | Statements | Flat | Streaming | Jelly |
---|---|---|---|---|
10K | 61,674 | 315.9 KB | 289.7 KB | 180.6 KB |
100K | 612,689 | 2.8 MB | 2.5 MB | 1.8 MB |
1M | 6,154,979 | 31.4 MB | 27.8 MB | 20.6 MB |
10M | 61,173,473 | 335.7 MB | 293.0 MB | 222.4 MB |
Full | 91,378,858 | 506.2 MB | 441.2 MB | 338.3 MB |
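Because a streaming distribution is a .tar.gz archive with one Turtle file per stream element, it can be read element by element without unpacking it to disk. The sketch below uses only Python's standard `tarfile` module. The function names and the member file names in the demo are hypothetical, and it is an assumption that the order of members in the archive matches the stream order. Parsing the Turtle text is left to an RDF library of your choice.

```python
import io
import tarfile


def iter_stream_elements(archive):
    """Yield (member name, Turtle text) for each regular file in a .tar.gz file object."""
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directory entries
            extracted = tar.extractfile(member)
            yield member.name, extracted.read().decode("utf-8")


def build_demo_archive(files):
    """Build a small in-memory .tar.gz so the sketch runs without a download."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, text in files:
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


# Hypothetical member names; the real archive layout may differ.
demo = build_demo_archive([
    ("0000.ttl", "ic:measurement_R5_56__co2__0 a saref:Measurement ."),
    ("0001.ttl", "ic:measurement_R5_157__humidity__0 a saref:Measurement ."),
])

for name, text in iter_stream_elements(demo):
    print(name, len(text))
```

For a downloaded file, pass an open binary handle instead of the demo buffer, for example `open("stream_10K.tar.gz", "rb")`.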
The full metadata of all distributions can be found below.
Full flat distribution
- Title: Full flat distribution
- Identifier: flat-full
- Has file name: flat_full.nt.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Full distribution (rb:fullDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 14,930,478
- Byte size: 506.2 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: f84c1a062b2c327f7161162b714670bd
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 8959eaf3ec27c8e014830c92c00717f3eec21658
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-full
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/flat_full.nt.gz
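Each distribution lists an MD5 and a SHA-1 checksum, which can be used to confirm that a download is intact. The following is a minimal sketch with Python's standard `hashlib`. The function names `file_digest` and `verify_download` are illustrative, and the expected values are the two checksums listed for the full flat distribution.

```python
import hashlib


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Return True if every listed checksum matches; expected maps algorithm name to hex digest."""
    return all(
        file_digest(path, algorithm) == value.lower()
        for algorithm, value in expected.items()
    )


# Checksums of flat_full.nt.gz as listed in the distribution metadata.
EXPECTED_FLAT_FULL = {
    "md5": "f84c1a062b2c327f7161162b714670bd",
    "sha1": "8959eaf3ec27c8e014830c92c00717f3eec21658",
}
```

After downloading the file, `verify_download("flat_full.nt.gz", EXPECTED_FLAT_FULL)` should return `True` if the file matches the published checksums. The local path is an assumption about where the file was saved.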
Full stream distribution
- Title: Full stream distribution
- Identifier: stream-full
- Has file name: stream_full.tar.gz
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 14,930,478
- Byte size: 441.2 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 6436f8708cb21fef440f7f908b315166
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 2e6229c04ff32addcc7e9eb31dbfb4873e8d425c
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-full
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/stream_full.tar.gz
Full Jelly distribution
- Title: Full Jelly distribution
- Identifier: jelly-full
- Has file name: jelly_full.jelly.gz
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Jelly distribution (rb:jellyDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 14,930,478
- Byte size: 338.3 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 7a17888a928aa223673c51fbab855b33
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: bbc923718e9f7813087777044ecd3059a15f1c57
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-full
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/jelly_full.jelly.gz
10M elements flat distribution
- Title: 10M elements flat distribution
- Identifier: flat-10m
- Has file name: flat_10M.nt.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 10,000,000
- Byte size: 335.7 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 41f57e353027e86d0f1b1a33bbf5f0f1
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: c2670aea911731b5643f79576b54684d17215e1a
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10m
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/flat_10M.nt.gz
10M elements stream distribution
- Title: 10M elements stream distribution
- Identifier: stream-10m
- Has file name: stream_10M.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 10,000,000
- Byte size: 293.0 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: b6e2d46bebbd72f6e51591e21300480f
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: a0589a50c7f4de9037f1432478a0378b86054071
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10m
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/stream_10M.tar.gz
10M elements Jelly distribution
- Title: 10M elements Jelly distribution
- Identifier: jelly-10m
- Has file name: jelly_10M.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 10,000,000
- Byte size: 222.4 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 98862b38e86265d9e2476031abd383ba
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 7b6338f7cbea6db14e02abe20d38bbd0e00397a3
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10m
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/jelly_10M.jelly.gz
1M elements flat distribution
- Title: 1M elements flat distribution
- Identifier: flat-1m
- Has file name: flat_1M.nt.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 1,000,000
- Byte size: 31.4 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 041d029147c494fa780f47fd9aedc51b
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 13ab2f59bf7cfecfec3ff2237a393459440c1b83
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-1m
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/flat_1M.nt.gz
1M elements stream distribution
- Title: 1M elements stream distribution
- Identifier: stream-1m
- Has file name: stream_1M.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 1,000,000
- Byte size: 27.8 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 814c44756910cfe531a2c9f78f8f8ac7
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 3990a8f7d21acb050ccb690bc23fd2da95518274
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-1m
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/stream_1M.tar.gz
1M elements Jelly distribution
- Title: 1M elements Jelly distribution
- Identifier: jelly-1m
- Has file name: jelly_1M.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 1,000,000
- Byte size: 20.6 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 8d4b97c615dc6c195abedab62e9dbfd3
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: ada0027c4c482ef44a7e43cd673b211546e16de5
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-1m
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/jelly_1M.jelly.gz
100K elements flat distribution
- Title: 100K elements flat distribution
- Identifier: flat-100k
- Has file name: flat_100K.nt.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 100,000
- Byte size: 2.8 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: f2c14112c127d06c27685b53307a0e30
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: fd4ac9e3050739dc8f9a5c132c56b7a56ed30335
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-100k
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/flat_100K.nt.gz
100K elements stream distribution
- Title: 100K elements stream distribution
- Identifier: stream-100k
- Has file name: stream_100K.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 100,000
- Byte size: 2.5 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 87dd6a0f1d6303a319605e6a2415926b
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: c2e9b33b612fdc3a7ffe532a35dfd8fece077580
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-100k
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/stream_100K.tar.gz
100K elements Jelly distribution
- Title: 100K elements Jelly distribution
- Identifier: jelly-100k
- Has file name: jelly_100K.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 100,000
- Byte size: 1.8 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 137035ccc17adbfc3618bc619c010790
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 744cb07f62a178c4469e9e0e7bb8511879f8c50d
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-100k
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/jelly_100K.jelly.gz
10K elements flat distribution
- Title: 10K elements flat distribution
- Identifier: flat-10k
- Has file name: flat_10K.nt.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has stream element count: 10,000
- Byte size: 315.9 KB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: d807758f2c4af735170ff51f7133ece6
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 898da0d5d15cb0cf460dbacbc3d70e15ec1fc783
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10k
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/flat_10K.nt.gz
10K elements stream distribution
- Title: 10K elements stream distribution
- Identifier: stream-10k
- Has file name: stream_10K.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 10,000
- Byte size: 289.7 KB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 27e0a18ae9ecc8b60254412665c72edc
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 05a56507437c136371d091784335f1647ca67591
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10k
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/stream_10K.tar.gz
10K elements Jelly distribution
- Title: 10K elements Jelly distribution
- Identifier: jelly-10k
- Has file name: jelly_10K.jelly.gz
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to one measurement from one sensor in the office building. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 10,000
- Byte size: 180.6 KB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 8c87e9c6323f65ae54b76acf09b9d2b5
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 63bafe99dbf287c414685f46652ca3abfb9344f6
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Statistics: statistics-10k
- Download URL: https://w3id.org/riverbench/datasets/officegraph/dev/files/jelly_10K.jelly.gz
Statistics
Statistics for full distributions
- Title: Statistics for full distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
---|---|---|---|---|---|---|
IRIs | 152,298,088 | ~14,920,502 | 10.20 | 1.40 | 10 | 20 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Literals | 29,860,956 | ~1,099,129 | 2.00 | 0.00 | 2 | 2 |
Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatype literals | 29,860,956 | ~1,099,129 | 2.00 | 0.00 | 2 | 2 |
Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatypes | 29,860,956 | 3 | 2.00 | 0.00 | 2 | 2 |
ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Subjects | 30,459,628 | ~14,920,458 | 2.04 | 0.28 | 2 | 4 |
Predicates | 76,448,380 | ~11 | 5.12 | 0.84 | 5 | 11 |
Objects | 91,079,522 | ~1,406,627 | 6.10 | 0.70 | 6 | 11 |
Graphs | 14,930,478 | ~1 | 1.00 | 0.00 | 1 | 1 |
Statements | 91,378,858 | N/A | 6.12 | 0.84 | 6 | 12 |
Bytes per statement | N/A | N/A | 180.36 | 12.49 | 166.67 | 216.83 |
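The Mean column appears to be the per-stream-element average, that is, the Sum divided by the stream element count of 14,930,478. This reading is an inference from the numbers and is not stated on the page. The short Python sketch below recomputes the means for several rows of the table above under that assumption. The variable and function names are illustrative.

```python
# Stream element count of the full distributions, from the metadata above.
ELEMENTS = 14_930_478

# (Sum, reported Mean) pairs copied from the statistics table for full distributions.
ROWS = {
    "IRIs": (152_298_088, 10.20),
    "Literals": (29_860_956, 2.00),
    "Subjects": (30_459_628, 2.04),
    "Predicates": (76_448_380, 5.12),
    "Objects": (91_079_522, 6.10),
    "Statements": (91_378_858, 6.12),
}


def mean_per_element(total: int, elements: int = ELEMENTS) -> float:
    """Average count per stream element, rounded to two decimals like the table."""
    return round(total / elements, 2)


for name, (total, reported) in ROWS.items():
    computed = mean_per_element(total)
    status = "matches" if computed == reported else "differs"
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f} ({status})")
```

For example, 91,378,858 statements over 14,930,478 elements gives about 6.12 statements per measurement, which agrees with the six triples shown for each element in the stream preview plus occasional extra triples.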
Statistics for 10M distributions
- Title: Statistics for 10M distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
---|---|---|---|---|---|---|
IRIs | 101,955,784 | ~10,030,838 | 10.20 | 1.38 | 10 | 20 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Literals | 20,000,000 | ~739,656 | 2.00 | 0.00 | 2 | 2 |
Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatype literals | 20,000,000 | ~739,656 | 2.00 | 0.00 | 2 | 2 |
Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatypes | 20,000,000 | 3 | 2.00 | 0.00 | 2 | 2 |
ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Subjects | 20,391,162 | ~10,030,793 | 2.04 | 0.28 | 2 | 4 |
Predicates | 51,173,473 | ~11 | 5.12 | 0.83 | 5 | 11 |
Objects | 60,977,892 | ~938,944 | 6.10 | 0.69 | 6 | 11 |
Graphs | 10,000,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
Statements | 61,173,473 | N/A | 6.12 | 0.83 | 6 | 12 |
Bytes per statement | N/A | N/A | 181.37 | 13.11 | 166.67 | 216.83 |
Statistics for 1M distributions
- Title: Statistics for 1M distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
---|---|---|---|---|---|---|
IRIs | 10,258,294 | ~997,886 | 10.26 | 1.59 | 10 | 20 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Literals | 2,000,000 | ~67,938 | 2.00 | 0.00 | 2 | 2 |
Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatype literals | 2,000,000 | ~67,938 | 2.00 | 0.00 | 2 | 2 |
Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatypes | 2,000,000 | 3 | 2.00 | 0.00 | 2 | 2 |
ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Subjects | 2,051,664 | ~997,839 | 2.05 | 0.32 | 2 | 4 |
Predicates | 5,154,979 | ~11 | 5.15 | 0.95 | 5 | 11 |
Objects | 6,129,147 | ~96,542 | 6.13 | 0.79 | 6 | 11 |
Graphs | 1,000,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
Statements | 6,154,979 | N/A | 6.15 | 0.95 | 6 | 12 |
Bytes per statement | N/A | N/A | 185.10 | 14.13 | 166.67 | 216.58 |
Statistics for 100K distributions
- Title: Statistics for 100K distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
---|---|---|---|---|---|---|
IRIs | 1,021,144 | ~101,446 | 10.21 | 1.44 | 10 | 20 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Literals | 200,000 | ~5,751 | 2.00 | 0.00 | 2 | 2 |
Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatype literals | 200,000 | ~5,751 | 2.00 | 0.00 | 2 | 2 |
Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatypes | 200,000 | 3 | 2.00 | 0.00 | 2 | 2 |
ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Subjects | 204,234 | ~101,401 | 2.04 | 0.29 | 2 | 4 |
Predicates | 512,689 | ~11 | 5.13 | 0.86 | 5 | 11 |
Objects | 610,572 | ~9,491 | 6.11 | 0.72 | 6 | 11 |
Graphs | 100,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
Statements | 612,689 | N/A | 6.13 | 0.86 | 6 | 12 |
Bytes per statement | N/A | N/A | 186.99 | 14.66 | 166.67 | 215.50 |
Statistics for 10K distributions
- Title: Statistics for 10K distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
---|---|---|---|---|---|---|
IRIs | 102,786 | ~10,906 | 10.28 | 1.64 | 10 | 20 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Literals | 20,000 | ~723 | 2.00 | 0.00 | 2 | 2 |
Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatype literals | 20,000 | ~723 | 2.00 | 0.00 | 2 | 2 |
Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
Datatypes | 20,000 | 3 | 2.00 | 0.00 | 2 | 2 |
ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Subjects | 20,562 | ~10,858 | 2.06 | 0.33 | 2 | 4 |
Predicates | 51,674 | ~11 | 5.17 | 0.99 | 5 | 11 |
Objects | 61,393 | ~1,928 | 6.14 | 0.82 | 6 | 11 |
Graphs | 10,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
Statements | 61,674 | N/A | 6.17 | 0.99 | 6 | 12 |
Bytes per statement | N/A | N/A | 175.36 | 10.15 | 166.67 | 213.17 |