nanopubs (development version)
Nanopublications are small units of publishable information, used for scientific results and more. This dataset is a subset of a dump of all nanopublications available as of April 5, 2018: only the first 5 million freely licensed nanopublications are included. Each nanopublication consists of several RDF graphs and is therefore an RDF dataset. The included data is primarily from the biomedical domain. More information: paper, website.
General information
- Title: Nanopublications
- Identifier: nanopubs
- Has version: dev
- Theme:
- Bibliographical (rbt:bibliographical)
- Biomedical (rbt:biomedical)
- Scientific (rbt:scientific)
- Creator:
- Authors of the included nanopublications (cited within the dataset) (1)
- Name: Authors of the included nanopublications (cited within the dataset)
- Tobias Kuhn (2)
- Name: Tobias Kuhn
- Homepage: https://orcid.org/0000-0002-1267-0234
- Comment: Author of the nanopublications dump
- Piotr Sowiński (3)
- Name: Piotr Sowiński
- Nickname: Ostrzyciel
- Homepage:
- License: https://spdx.org/licenses/CC-BY-SA-3.0
- Source: https://doi.org/10.5281/zenodo.1213293
- Rights: This dataset includes only freely licensed nanopublications (CC BY, CC BY-SA, or ODbL). Each nanopublication includes information about its original authors and is self-citing. The dataset as a whole is marked as CC BY-SA, as this is the most restrictive license in the dataset.
- Date Issued: 2023-04-30
- Date Modified: 2023-05-08
- Landing page: nanopubs (dev)
- Conforms To: Metadata (https://w3id.org/riverbench/schema/metadata)
Technical metadata
- Has stream element type: Quads (rb:quads)
- Has stream element count: 5,000,000
- Has stream element split:
- Type: Stream elements split by topic (rb:TopicStreamElementSplit)
- Comment: Each stream element is one nanopublication.
- Uses ontology:
- Conforms to W3C RDF 1.1 specification: yes
- Conforms to W3C RDF-star draft specification as of December 17, 2021: yes
- Uses generalized triples: no
- Uses generalized RDF datasets: no
- Uses RDF-star: no
Distributions
Full quad stream distribution
- Title: Full quad stream distribution
- Identifier: stream-full
- Has file name: stream_full.tar.gz
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Quad stream distribution (rb:quadStreamDistribution)
- Has stream element count: 5,000,000
- Byte size: 1.02 GB
- Media type: application/trig
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: e89ece1e97fb2fbad5d2bcd0176877db
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 785c60e617925cc046c2be39753ca1ab097f63bb
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/stream_full.tar.gz
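The two checksums listed for this distribution can be used to check a downloaded copy of `stream_full.tar.gz`. The following is a minimal sketch in Python, assuming the file has already been saved locally. The helper names `file_digest` and `verify` and the local path are illustrative and not part of the dataset or its tooling.

```python
import hashlib

# Values copied from the checksum entries of this distribution.
EXPECTED = {
    "md5": "e89ece1e97fb2fbad5d2bcd0176877db",
    "sha1": "785c60e617925cc046c2be39753ca1ab097f63bb",
}


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED):
    """Return True only if every expected digest matches the file."""
    return all(
        file_digest(path, algorithm) == value
        for algorithm, value in expected.items()
    )
```

Calling `verify("stream_full.tar.gz")` on an intact download should return `True`. Reading in chunks keeps memory use flat, which matters for an archive of roughly 1 GB.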
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 236,997,015
- Unique count (estimated): 47,162,247
- Mean: 47.40
- Standard deviation: 5.68
- Minimum: 25
- Maximum: 142
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 35,643,551
- Unique count (estimated): 3,277,169
- Mean: 7.13
- Standard deviation: 0.60
- Minimum: 3
- Maximum: 34
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 18,591,732
- Unique count (estimated): 2,222,348
- Mean: 3.72
- Standard deviation: 2.05
- Minimum: 1
- Maximum: 20
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 8,423,981
- Unique count (estimated): 21,459
- Mean: 1.68
- Standard deviation: 0.59
- Minimum: 0
- Maximum: 4
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 8,627,838
- Unique count (estimated): 1,033,269
- Mean: 1.73
- Standard deviation: 1.48
- Minimum: 0
- Maximum: 32
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 46,588,350
- Mean: 9.32
- Standard deviation: 4.43
- Minimum: 3
- Maximum: 68
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 97,782,581
- Mean: 19.56
- Standard deviation: 1.81
- Minimum: 11
- Maximum: 22
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 159,866,280
- Mean: 31.97
- Standard deviation: 6.03
- Minimum: 14
- Maximum: 157
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 20,000,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 171,885,662
- Mean: 34.38
- Standard deviation: 9.93
- Minimum: 16
- Maximum: 196
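The figures above appear to be computed per stream element: dividing each Sum by the stream element count (5,000,000) reproduces the published Mean, and the three literal subtypes add up to the literal Sum. The sketch below re-derives these values from the numbers listed in this section. The per-element interpretation is inferred from the numbers themselves, not stated in the metadata, and the names used are illustrative.

```python
ELEMENT_COUNT = 5_000_000  # "Has stream element count" of the full distribution

# (Sum, published Mean) pairs copied from the statistics above.
STATS = {
    "IRI": (236_997_015, "47.40"),
    "Literal": (35_643_551, "7.13"),
    "Simple literal": (18_591_732, "3.72"),
    "Datatype literal": (8_423_981, "1.68"),
    "Language string": (8_627_838, "1.73"),
    "Subject": (46_588_350, "9.32"),
    "Predicate": (97_782_581, "19.56"),
    "Object": (159_866_280, "31.97"),
    "Graph": (20_000_000, "4.00"),
    "Statement": (171_885_662, "34.38"),
}


def mean_per_element(total, count=ELEMENT_COUNT):
    """Format Sum / element count with two decimals, as in the listing."""
    return f"{total / count:.2f}"


for name, (total, published) in STATS.items():
    print(f"{name}: computed {mean_per_element(total)}, published {published}")

# Simple, datatype, and language-tagged literals together make up all literals.
literal_parts = sum(
    STATS[key][0]
    for key in ("Simple literal", "Datatype literal", "Language string")
)
print(literal_parts == STATS["Literal"][0])
```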
Full flat distribution
- Title: Full flat distribution
- Identifier: flat-full
- Has file name: flat_full.nq.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Full distribution (rb:fullDistribution)
- Has stream element count: 5,000,000
- Byte size: 1.68 GB
- Media type: application/n-quads
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 387dc1e1adc92f2fd91b6c315fdf9436
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 110d9c128ca0d9e48f806086a373ea3639cf94f4
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/flat_full.nq.gz
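Because N-Quads is a line-based format, the flat distribution can be processed as a stream without loading it into memory. The following is a minimal sketch, assuming the file has been downloaded locally and that every non-blank, non-comment line holds exactly one statement. The helper name `count_statements` is illustrative.

```python
import gzip


def count_statements(path):
    """Count N-Quads statements in a gzip-compressed file, one line at a time."""
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            stripped = line.strip()
            # Skip blank lines and comment-only lines.
            if stripped and not stripped.startswith("#"):
                count += 1
    return count
```

For `flat_full.nq.gz`, the result can be compared against the Sum reported under the statement count statistics of this distribution.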
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 236,997,015
- Unique count (estimated): 47,162,247
- Mean: 47.40
- Standard deviation: 5.68
- Minimum: 25
- Maximum: 142
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 35,643,551
- Unique count (estimated): 3,277,169
- Mean: 7.13
- Standard deviation: 0.60
- Minimum: 3
- Maximum: 34
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 18,591,732
- Unique count (estimated): 2,222,348
- Mean: 3.72
- Standard deviation: 2.05
- Minimum: 1
- Maximum: 20
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 8,423,981
- Unique count (estimated): 21,459
- Mean: 1.68
- Standard deviation: 0.59
- Minimum: 0
- Maximum: 4
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 8,627,838
- Unique count (estimated): 1,033,269
- Mean: 1.73
- Standard deviation: 1.48
- Minimum: 0
- Maximum: 32
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 46,588,350
- Mean: 9.32
- Standard deviation: 4.43
- Minimum: 3
- Maximum: 68
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 97,782,581
- Mean: 19.56
- Standard deviation: 1.81
- Minimum: 11
- Maximum: 22
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 159,866,280
- Mean: 31.97
- Standard deviation: 6.03
- Minimum: 14
- Maximum: 157
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 20,000,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 171,885,662
- Mean: 34.38
- Standard deviation: 9.93
- Minimum: 16
- Maximum: 196
1M elements quad stream distribution
- Title: 1M elements quad stream distribution
- Identifier: stream-1m
- Has file name: stream_1M.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Quad stream distribution (rb:quadStreamDistribution)
- Has stream element count: 1,000,000
- Byte size: 277.17 MB
- Media type: application/trig
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 8494894f7c70ea8ac325f5c785a04599
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 8ba64f3e9fafa34d3f81b535213fe20f4962485b
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/stream_1M.tar.gz
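The stream distributions are gzip-compressed tar archives with TriG as the media type. The following sketch walks such an archive member by member, under the assumption (not confirmed by the metadata on this page) that each regular file in the archive holds one stream element, that is, one nanopublication serialized as TriG. The helper name `iter_stream_elements` is illustrative.

```python
import tarfile


def iter_stream_elements(path):
    """Yield (member name, raw bytes) for each regular file, in archive order."""
    with tarfile.open(path, mode="r:gz") as archive:
        for member in archive:
            # Directories and other non-file entries carry no payload.
            if member.isfile():
                with archive.extractfile(member) as f:
                    yield member.name, f.read()
```

The yielded bytes can then be passed to any TriG parser. Processing one member at a time avoids unpacking the whole archive to disk.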
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 48,861,863
- Unique count (estimated): 7,017,131
- Mean: 48.86
- Standard deviation: 3.03
- Minimum: 40
- Maximum: 100
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 7,030,328
- Unique count (estimated): 620,603
- Mean: 7.03
- Standard deviation: 0.24
- Minimum: 7
- Maximum: 22
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 2,321,606
- Unique count (estimated): 86,968
- Mean: 2.32
- Standard deviation: 1.11
- Minimum: 2
- Maximum: 20
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 1,944,837
- Unique count (estimated): 2,895
- Mean: 1.94
- Standard deviation: 0.27
- Minimum: 1
- Maximum: 3
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 2,763,885
- Unique count (estimated): 530,740
- Mean: 2.76
- Standard deviation: 0.81
- Minimum: 0
- Maximum: 3
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 9,350,387
- Mean: 9.35
- Standard deviation: 2.33
- Minimum: 6
- Maximum: 52
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 20,738,344
- Mean: 20.74
- Standard deviation: 0.94
- Minimum: 17
- Maximum: 22
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 32,151,723
- Mean: 32.15
- Standard deviation: 3.09
- Minimum: 27
- Maximum: 89
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 4,000,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 33,423,542
- Mean: 33.42
- Standard deviation: 4.64
- Minimum: 28
- Maximum: 135
1M elements flat distribution
- Title: 1M elements flat distribution
- Identifier: flat-1m
- Has file name: flat_1M.nq.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 1,000,000
- Byte size: 384.61 MB
- Media type: application/n-quads
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 1e2dec57fc8ceb1c212c994d0c8ea04c
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 9171cabf2cca60d5180de3ddfb0438115b65121f
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/flat_1M.nq.gz
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 48,861,863
- Unique count (estimated): 7,017,131
- Mean: 48.86
- Standard deviation: 3.03
- Minimum: 40
- Maximum: 100
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 7,030,328
- Unique count (estimated): 620,603
- Mean: 7.03
- Standard deviation: 0.24
- Minimum: 7
- Maximum: 22
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 2,321,606
- Unique count (estimated): 86,968
- Mean: 2.32
- Standard deviation: 1.11
- Minimum: 2
- Maximum: 20
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 1,944,837
- Unique count (estimated): 2,895
- Mean: 1.94
- Standard deviation: 0.27
- Minimum: 1
- Maximum: 3
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 2,763,885
- Unique count (estimated): 530,740
- Mean: 2.76
- Standard deviation: 0.81
- Minimum: 0
- Maximum: 3
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 9,350,387
- Mean: 9.35
- Standard deviation: 2.33
- Minimum: 6
- Maximum: 52
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 20,738,344
- Mean: 20.74
- Standard deviation: 0.94
- Minimum: 17
- Maximum: 22
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 32,151,723
- Mean: 32.15
- Standard deviation: 3.09
- Minimum: 27
- Maximum: 89
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 4,000,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 33,423,542
- Mean: 33.42
- Standard deviation: 4.64
- Minimum: 28
- Maximum: 135
100K elements quad stream distribution
- Title: 100K elements quad stream distribution
- Identifier: stream-100k
- Has file name: stream_100K.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Quad stream distribution (rb:quadStreamDistribution)
- Has stream element count: 100,000
- Byte size: 25.59 MB
- Media type: application/trig
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 66e0d00b260e50c5079ca2f1966a34e8
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 506fc730a0d79db55238fcd1344be40d191795e7
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/stream_100K.tar.gz
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 4,907,266
- Unique count (estimated): 671,463
- Mean: 49.07
- Standard deviation: 1.94
- Minimum: 44
- Maximum: 50
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 700,000
- Unique count (estimated): 60,262
- Mean: 7.00
- Standard deviation: 0.00
- Minimum: 7
- Maximum: 7
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 200,000
- Unique count (estimated): 4
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 200,000
- Unique count (estimated): 154
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 300,000
- Unique count (estimated): 60,104
- Mean: 3.00
- Standard deviation: 0.00
- Minimum: 3
- Maximum: 3
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 925,880
- Mean: 9.26
- Standard deviation: 1.55
- Minimum: 6
- Maximum: 10
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 2,100,000
- Mean: 21.00
- Standard deviation: 0.00
- Minimum: 21
- Maximum: 21
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 3,207,266
- Mean: 32.07
- Standard deviation: 1.94
- Minimum: 27
- Maximum: 33
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 400,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 3,307,350
- Mean: 33.07
- Standard deviation: 1.94
- Minimum: 29
- Maximum: 34
100K elements flat distribution
- Title: 100K elements flat distribution
- Identifier: flat-100k
- Has file name: flat_100K.nq.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 100,000
- Byte size: 35.73 MB
- Media type: application/n-quads
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 7180fd388408b9e48075afa559dacf89
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: b38c9d95416a30a353ac46192f6ab18accc77084
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/flat_100K.nq.gz
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 4,907,266
- Unique count (estimated): 671,463
- Mean: 49.07
- Standard deviation: 1.94
- Minimum: 44
- Maximum: 50
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 700,000
- Unique count (estimated): 60,262
- Mean: 7.00
- Standard deviation: 0.00
- Minimum: 7
- Maximum: 7
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 200,000
- Unique count (estimated): 4
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 200,000
- Unique count (estimated): 154
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 300,000
- Unique count (estimated): 60,104
- Mean: 3.00
- Standard deviation: 0.00
- Minimum: 3
- Maximum: 3
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 925,880
- Mean: 9.26
- Standard deviation: 1.55
- Minimum: 6
- Maximum: 10
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 2,100,000
- Mean: 21.00
- Standard deviation: 0.00
- Minimum: 21
- Maximum: 21
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 3,207,266
- Mean: 32.07
- Standard deviation: 1.94
- Minimum: 27
- Maximum: 33
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 400,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 3,307,350
- Mean: 33.07
- Standard deviation: 1.94
- Minimum: 29
- Maximum: 34
10K elements quad stream distribution
- Title: 10K elements quad stream distribution
- Identifier: stream-10k
- Has file name: stream_10K.tar.gz
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Quad stream distribution (rb:quadStreamDistribution)
- Has stream element count: 10,000
- Byte size: 2.55 MB
- Media type: application/trig
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 690afb9f81ffeae3f1cc4bfef18adbcf
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 7d9261549b0f712a37822d2a7cbb73bc1346c0e3
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/stream_10K.tar.gz
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 500,000
- Unique count (estimated): 73,219
- Mean: 50.00
- Standard deviation: 0.00
- Minimum: 50
- Maximum: 50
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 70,000
- Unique count (estimated): 6,874
- Mean: 7.00
- Standard deviation: 0.00
- Minimum: 7
- Maximum: 7
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 20,000
- Unique count (estimated): 2
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 20,000
- Unique count (estimated): 15
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 30,000
- Unique count (estimated): 6,857
- Mean: 3.00
- Standard deviation: 0.00
- Minimum: 3
- Maximum: 3
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 100,000
- Mean: 10.00
- Standard deviation: 0.00
- Minimum: 10
- Maximum: 10
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 210,000
- Mean: 21.00
- Standard deviation: 0.00
- Minimum: 21
- Maximum: 21
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 330,000
- Mean: 33.00
- Standard deviation: 0.00
- Minimum: 33
- Maximum: 33
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 40,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 340,000
- Mean: 34.00
- Standard deviation: 0.00
- Minimum: 34
- Maximum: 34
10K elements flat distribution
- Title: 10K elements flat distribution
- Identifier: flat-10k
- Has file name: flat_10K.nq.gz
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 10,000
- Byte size: 3.47 MB
- Media type: application/n-quads
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: 6d6e1ec6871a259250b61bab84d0d810
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue: e2415dd8ec272c94922beb79f2a025146d2db319
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/nanopubs/dev/files/flat_10K.nq.gz
Has statistics
IRI count statistics
- Type: IRI count statistics (rb:IriCountStatistics)
- Sum: 500,000
- Unique count (estimated): 73,219
- Mean: 50.00
- Standard deviation: 0.00
- Minimum: 50
- Maximum: 50
Blank node count statistics
- Type: Blank node count statistics (rb:BlankNodeCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Literal count statistics
- Type: Literal count statistics (rb:LiteralCountStatistics)
- Sum: 70,000
- Unique count (estimated): 6,874
- Mean: 7.00
- Standard deviation: 0.00
- Minimum: 7
- Maximum: 7
Simple literal count statistics
- Type: Simple literal count statistics (rb:SimpleLiteralCountStatistics)
- Sum: 20,000
- Unique count (estimated): 2
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Datatype literal count statistics
- Type: Datatype literal count statistics (rb:DatatypeLiteralCountStatistics)
- Sum: 20,000
- Unique count (estimated): 15
- Mean: 2.00
- Standard deviation: 0.00
- Minimum: 2
- Maximum: 2
Language string count statistics
- Type: Language string count statistics (rb:LanguageLiteralCountStatistics)
- Sum: 30,000
- Unique count (estimated): 6,857
- Mean: 3.00
- Standard deviation: 0.00
- Minimum: 3
- Maximum: 3
Quoted triple count statistics
- Type: Quoted triple count statistics (rb:QuotedTripleCountStatistics)
- Sum: 0
- Mean: 0.00
- Standard deviation: 0.00
- Minimum: 0
- Maximum: 0
Subject count statistics
- Type: Subject count statistics (rb:SubjectCountStatistics)
- Sum: 100,000
- Mean: 10.00
- Standard deviation: 0.00
- Minimum: 10
- Maximum: 10
Predicate count statistics
- Type: Predicate count statistics (rb:PredicateCountStatistics)
- Sum: 210,000
- Mean: 21.00
- Standard deviation: 0.00
- Minimum: 21
- Maximum: 21
Object count statistics
- Type: Object count statistics (rb:ObjectCountStatistics)
- Sum: 330,000
- Mean: 33.00
- Standard deviation: 0.00
- Minimum: 33
- Maximum: 33
Graph count statistics
- Type: Graph count statistics (rb:GraphCountStatistics)
- Sum: 40,000
- Mean: 4.00
- Standard deviation: 0.00
- Minimum: 4
- Maximum: 4
Statement count statistics
- Type: Statement count statistics (rb:StatementCountStatistics)
- Sum: 340,000
- Mean: 34.00
- Standard deviation: 0.00
- Minimum: 34
- Maximum: 34