Dataset: officegraph (development version)

OfficeGraph is a real-world dataset of measurements from 444 IoT devices, collected over 11 months. The devices comprise 17 different sensor models, which measure different properties. The data was collected in a 7-story Dutch office building and consists of about 90 million RDF triples. See also the paper for more details.

The elements in the dataset are ordered from oldest to newest by the measurement time (saref:hasTimestamp predicate).
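
As a small illustration of this ordering, the sketch below parses the `saref:hasTimestamp` values of the five elements shown in the stream preview on this page and checks that they never decrease. The helper name `is_chronological` is hypothetical and not part of any RiverBench tooling.

```python
from datetime import datetime

# Timestamps copied from the stream preview on this page
# (elements 0, 10, 100, 1000 and 10000).
PREVIEW_TIMESTAMPS = [
    "2022-02-28T23:59:00",
    "2022-03-01T00:00:00",
    "2022-03-01T00:02:00",
    "2022-03-01T00:31:00",
    "2022-03-01T05:47:00",
]


def is_chronological(timestamps):
    """Return True if the ISO 8601 timestamps are in non-decreasing order."""
    parsed = [datetime.fromisoformat(ts) for ts in timestamps]
    return all(a <= b for a, b in zip(parsed, parsed[1:]))


print(is_chronological(PREVIEW_TIMESTAMPS))
```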

Additional data, such as information about the devices taking the measurements and the rooms in the building, as well as a copy of this dataset, can be found here.

Info

Download this metadata in RDF: Turtle, N-Triples, RDF/XML, Jelly
Source repository: dataset-officegraph
Permanent URL: https://w3id.org/riverbench/datasets/officegraph/dev

Stream preview
0000000000.ttl
PREFIX ic:    <https://interconnectproject.eu/example/>
PREFIX om:    <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

ic:property_R5_56__co2_
        a       ic:CO2Level .

ic:measurement_R5_56__co2__0
        a                        saref:Measurement;
        saref:hasTimestamp       "2022-02-28T23:59:00"^^xsd:dateTime;
        saref:hasValue           "504"^^xsd:float;
        saref:isMeasuredIn       om:partsPerMillion;
        saref:relatesToProperty  ic:property_R5_56__co2_ .
0000000010.ttl
PREFIX ic:    <https://interconnectproject.eu/example/>
PREFIX om:    <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

ic:property_R5_157__humidity_
        a       saref:Humidity .

ic:measurement_R5_157__humidity__0
        a                        saref:Measurement;
        saref:hasTimestamp       "2022-03-01T00:00:00"^^xsd:dateTime;
        saref:hasValue           "23"^^xsd:float;
        saref:isMeasuredIn       om:percent;
        saref:relatesToProperty  ic:property_R5_157__humidity_ .
0000000100.ttl
PREFIX ic:    <https://interconnectproject.eu/example/>
PREFIX om:    <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

ic:property_R5_15__humidity_
        a       saref:Humidity .

ic:measurement_R5_15__humidity__0
        a                        saref:Measurement;
        saref:hasTimestamp       "2022-03-01T00:02:00"^^xsd:dateTime;
        saref:hasValue           "22"^^xsd:float;
        saref:isMeasuredIn       om:percent;
        saref:relatesToProperty  ic:property_R5_15__humidity_ .
0000001000.ttl
PREFIX ic:    <https://interconnectproject.eu/example/>
PREFIX om:    <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

ic:property_R5_73__co2_
        a       ic:CO2Level .

ic:measurement_R5_73__co2__1
        a                        saref:Measurement;
        saref:hasTimestamp       "2022-03-01T00:31:00"^^xsd:dateTime;
        saref:hasValue           "430"^^xsd:float;
        saref:isMeasuredIn       om:partsPerMillion;
        saref:relatesToProperty  ic:property_R5_73__co2_ .
0000010000.ttl
PREFIX ic:    <https://interconnectproject.eu/example/>
PREFIX om:    <http://www.wurvoc.org/vocabularies/om-1.8/>
PREFIX saref: <https://saref.etsi.org/core/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

ic:property_R5_159__humidity_
        a       saref:Humidity .

ic:measurement_R5_159__humidity__11
        a                        saref:Measurement;
        saref:hasTimestamp       "2022-03-01T05:47:00"^^xsd:dateTime;
        saref:hasValue           "26"^^xsd:float;
        saref:isMeasuredIn       om:percent;
        saref:relatesToProperty  ic:property_R5_159__humidity_ .

General information

  • Title: OfficeGraph (en)
  • Identifier: officegraph
  • Version: dev
  • Theme:
  • Creator:
    • Adam Skaskiewicz (1)
      • Name: Adam Skaskiewicz
      • Nickname: adamskas
      • Comment: Author of benchmark dataset (en)
    • Roderick van der Weerdt (2)
      • Name: Roderick van der Weerdt
      • Comment: Co-author of original dataset (en)
    • Victor de Boer (3)
      • Name: Victor de Boer
      • Comment: Co-author of original dataset (en)
    • Ronald Siebes (4)
      • Name: Ronald Siebes
      • Comment: Co-author of original dataset (en)
    • Ronnie Groenewold (5)
      • Name: Ronnie Groenewold
      • Comment: Co-author of original dataset (en)
    • Frank van Harmelen (6)
      • Name: Frank van Harmelen
      • Comment: Co-author of original dataset (en)
  • License: https://spdx.org/licenses/CC-BY-4.0
  • Source:
  • Date Issued: 2025-01-18
  • Date Modified: 2025-01-27
  • Landing page: officegraph (dev)
  1. BibTeX citation:
    @inbook{van_der_Weerdt_2024,
      title={OfficeGraph: A Knowledge Graph of Office Building IoT Measurements},
      ISBN={9783031606359},
      ISSN={1611-3349},
      url={http://dx.doi.org/10.1007/978-3-031-60635-9_6},
      DOI={10.1007/978-3-031-60635-9_6},
      booktitle={The Semantic Web},
      publisher={Springer Nature Switzerland},
      author={van der Weerdt, Roderick and de Boer, Victor and Siebes, Ronald and Groenewold, Ronnie and van Harmelen, Frank},
      year={2024},
      pages={94–109}
    }

Technical metadata

Distributions

The dataset is published in a few size variants, each containing a specific number of stream elements. For each size, there are three distribution types available: flat (just an N-Triples/N-Quads file), streaming (a .tar.gz archive with Turtle/TriG files, one file per stream element), and Jelly (a native binary format for streaming RDF). See the documentation for more details.
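
To make the streaming layout concrete, here is a minimal sketch of reading such an archive with Python's standard `tarfile` module, one stream element at a time. The function name `iter_stream_elements` is hypothetical, the demo archive is synthetic, and the sketch assumes that members are stored in stream order and end in `.ttl` or `.trig`; check the real archive's layout before relying on either assumption.

```python
import io
import tarfile


def iter_stream_elements(fileobj):
    """Yield (member_name, text) for each Turtle/TriG file in a .tar.gz archive.

    `fileobj` is a binary file object, e.g. open(path, "rb").
    Members are yielded in archive order (assumed to match stream order).
    """
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith((".ttl", ".trig")):
                yield member.name, tar.extractfile(member).read().decode("utf-8")


def make_demo_archive(count=3):
    """Build a tiny in-memory .tar.gz with placeholder files (not real data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for i in range(count):
            data = f"# placeholder for stream element {i}\n".encode("utf-8")
            info = tarfile.TarInfo(name=f"{i:010d}.ttl")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


for name, text in iter_stream_elements(make_demo_archive()):
    print(name, len(text))
```

Reading members lazily like this avoids unpacking millions of small files to disk, which matters for the larger size variants.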

| Distribution size | Statements | Flat | Streaming | Jelly |
|---|---|---|---|---|
| 10K | 61,674 | 315.9 KB | 289.7 KB | 180.6 KB |
| 100K | 612,689 | 2.8 MB | 2.5 MB | 1.8 MB |
| 1M | 6,154,979 | 31.4 MB | 27.8 MB | 20.6 MB |
| 10M | 61,173,473 | 335.7 MB | 293.0 MB | 222.4 MB |
| Full | 91,378,858 | 506.2 MB | 441.2 MB | 338.3 MB |
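
For a rough comparison of the three distribution types, the sketch below expresses the streaming and Jelly sizes from the table above as a fraction of the flat size. The numbers are copied from the table; the function name `relative_to_flat` is hypothetical.

```python
# Distribution sizes copied from the table above: (flat, streaming, Jelly).
# The 10K row is in KB and the others in MB; ratios within a row are unit-free.
SIZES = {
    "10K": (315.9, 289.7, 180.6),
    "100K": (2.8, 2.5, 1.8),
    "1M": (31.4, 27.8, 20.6),
    "10M": (335.7, 293.0, 222.4),
    "Full": (506.2, 441.2, 338.3),
}


def relative_to_flat(sizes):
    """Return {variant: (streaming / flat, jelly / flat)} for each size variant."""
    return {
        variant: (streaming / flat, jelly / flat)
        for variant, (flat, streaming, jelly) in sizes.items()
    }


for variant, (streaming_ratio, jelly_ratio) in relative_to_flat(SIZES).items():
    print(f"{variant:>4}: streaming = {streaming_ratio:.0%} of flat, "
          f"Jelly = {jelly_ratio:.0%} of flat")
```

For the full distribution, the Jelly file comes out at roughly two-thirds the size of the flat file.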

The full metadata of all distributions can be found below.

Full flat distribution

Full stream distribution

Full Jelly distribution

10M elements flat distribution

10M elements stream distribution

10M elements Jelly distribution

1M elements flat distribution

1M elements stream distribution

1M elements Jelly distribution

100K elements flat distribution

100K elements stream distribution

100K elements Jelly distribution

10K elements flat distribution

10K elements stream distribution

10K elements Jelly distribution

Statistics

Statistics for full distributions

  • Title: Statistics for full distributions
| Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 152,298,088 | ~14,920,502 | 10.20 | 1.40 | 10 | 20 |
| Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Literals | 29,860,956 | ~1,099,129 | 2.00 | 0.00 | 2 | 2 |
| Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatype literals | 29,860,956 | ~1,099,129 | 2.00 | 0.00 | 2 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 29,860,956 | 3 | 2.00 | 0.00 | 2 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 30,459,628 | ~14,920,458 | 2.04 | 0.28 | 2 | 4 |
| Predicates | 76,448,380 | ~11 | 5.12 | 0.84 | 5 | 11 |
| Objects | 91,079,522 | ~1,406,627 | 6.10 | 0.70 | 6 | 11 |
| Graphs | 14,930,478 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 91,378,858 | N/A | 6.12 | 0.84 | 6 | 12 |
| Bytes per statement | N/A | N/A | 180.36 | 12.49 | 166.67 | 216.83 |
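
The Mean column appears to be a per-stream-element average, that is, Sum divided by the number of stream elements. This is an inference from the numbers rather than something stated on this page. The sketch below checks it for the full distribution, taking the "Graphs" sum (one graph per element) as the element count; the names used are illustrative.

```python
# Number of stream elements, assumed equal to the "Graphs" sum
# because every element contains exactly one graph (Min. = Max. = 1).
ELEMENTS = 14_930_478

# (Sum, reported Mean) pairs copied from "Statistics for full distributions".
ROWS = {
    "IRIs": (152_298_088, 10.20),
    "Literals": (29_860_956, 2.00),
    "Subjects": (30_459_628, 2.04),
    "Predicates": (76_448_380, 5.12),
    "Objects": (91_079_522, 6.10),
    "Statements": (91_378_858, 6.12),
}


def mean_per_element(total, elements=ELEMENTS):
    """Average count per stream element."""
    return total / elements


for name, (total, reported) in ROWS.items():
    computed = mean_per_element(total)
    # Reported means are rounded to two decimals, so allow half a unit
    # in the last reported digit.
    status = "ok" if abs(computed - reported) < 0.005 else "MISMATCH"
    print(f"{name:<10} computed={computed:.4f} reported={reported:.2f} {status}")
```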

Statistics for 10M distributions

  • Title: Statistics for 10M distributions
| Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 101,955,784 | ~10,030,838 | 10.20 | 1.38 | 10 | 20 |
| Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Literals | 20,000,000 | ~739,656 | 2.00 | 0.00 | 2 | 2 |
| Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatype literals | 20,000,000 | ~739,656 | 2.00 | 0.00 | 2 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 20,000,000 | 3 | 2.00 | 0.00 | 2 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 20,391,162 | ~10,030,793 | 2.04 | 0.28 | 2 | 4 |
| Predicates | 51,173,473 | ~11 | 5.12 | 0.83 | 5 | 11 |
| Objects | 60,977,892 | ~938,944 | 6.10 | 0.69 | 6 | 11 |
| Graphs | 10,000,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 61,173,473 | N/A | 6.12 | 0.83 | 6 | 12 |
| Bytes per statement | N/A | N/A | 181.37 | 13.11 | 166.67 | 216.83 |

Statistics for 1M distributions

  • Title: Statistics for 1M distributions
| Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 10,258,294 | ~997,886 | 10.26 | 1.59 | 10 | 20 |
| Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Literals | 2,000,000 | ~67,938 | 2.00 | 0.00 | 2 | 2 |
| Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatype literals | 2,000,000 | ~67,938 | 2.00 | 0.00 | 2 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 2,000,000 | 3 | 2.00 | 0.00 | 2 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 2,051,664 | ~997,839 | 2.05 | 0.32 | 2 | 4 |
| Predicates | 5,154,979 | ~11 | 5.15 | 0.95 | 5 | 11 |
| Objects | 6,129,147 | ~96,542 | 6.13 | 0.79 | 6 | 11 |
| Graphs | 1,000,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 6,154,979 | N/A | 6.15 | 0.95 | 6 | 12 |
| Bytes per statement | N/A | N/A | 185.10 | 14.13 | 166.67 | 216.58 |

Statistics for 100K distributions

  • Title: Statistics for 100K distributions
| Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 1,021,144 | ~101,446 | 10.21 | 1.44 | 10 | 20 |
| Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Literals | 200,000 | ~5,751 | 2.00 | 0.00 | 2 | 2 |
| Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatype literals | 200,000 | ~5,751 | 2.00 | 0.00 | 2 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 200,000 | 3 | 2.00 | 0.00 | 2 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 204,234 | ~101,401 | 2.04 | 0.29 | 2 | 4 |
| Predicates | 512,689 | ~11 | 5.13 | 0.86 | 5 | 11 |
| Objects | 610,572 | ~9,491 | 6.11 | 0.72 | 6 | 11 |
| Graphs | 100,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 612,689 | N/A | 6.13 | 0.86 | 6 | 12 |
| Bytes per statement | N/A | N/A | 186.99 | 14.66 | 166.67 | 215.50 |

Statistics for 10K distributions

  • Title: Statistics for 10K distributions
| Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| IRIs | 102,786 | ~10,906 | 10.28 | 1.64 | 10 | 20 |
| Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Literals | 20,000 | ~723 | 2.00 | 0.00 | 2 | 2 |
| Simple literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatype literals | 20,000 | ~723 | 2.00 | 0.00 | 2 | 2 |
| Language literals | 0 | ~0 | 0.00 | 0.00 | 0 | 0 |
| Datatypes | 20,000 | 3 | 2.00 | 0.00 | 2 | 2 |
| ASCII control chars | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Quoted triples | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
| Subjects | 20,562 | ~10,858 | 2.06 | 0.33 | 2 | 4 |
| Predicates | 51,674 | ~11 | 5.17 | 0.99 | 5 | 11 |
| Objects | 61,393 | ~1,928 | 6.14 | 0.82 | 6 | 11 |
| Graphs | 10,000 | ~1 | 1.00 | 0.00 | 1 | 1 |
| Statements | 61,674 | N/A | 6.17 | 0.99 | 6 | 12 |
| Bytes per statement | N/A | N/A | 175.36 | 10.15 | 166.67 | 213.17 |