yago-annotated-facts (1.0.2)
This is a subset of the YAGO 4 knowledge base (paper), based on Wikidata, version from February 24, 2020. This dataset includes only the fact annotations in RDF-star, that is, facts about facts. Each stream element corresponds to one item in Wikidata.
Download this metadata in RDF: Turtle, N-Triples, RDF/XML, Jelly
Source repository: yago-annotated-facts
Stream preview
<< <http://yago-knowledge.org/resource/_Q56236170> <http://schema.org/dissolutionDate> "2009"^^<http://www.w3.org/2001/XMLSchema#gYear> >>
    <http://schema.org/endDate> "2009-12"^^<http://www.w3.org/2001/XMLSchema#gYearMonth>;
    <http://schema.org/startDate> "2009-06"^^<http://www.w3.org/2001/XMLSchema#gYearMonth> .
<< <http://yago-knowledge.org/resource/Open_Science_Radio_Q18744554> <http://schema.org/creator> <http://yago-knowledge.org/resource/Matthias_Fromm_Q18748012> >>
    <http://schema.org/startDate> "2013-01-02"^^<http://www.w3.org/2001/XMLSchema#date> .
<< <http://yago-knowledge.org/resource/Open_Science_Radio_Q18744554> <http://schema.org/creator> <http://yago-knowledge.org/resource/Konrad_Förstner_Q18744528> >>
    <http://schema.org/startDate> "2014-01-19"^^<http://www.w3.org/2001/XMLSchema#date> .
<< <http://yago-knowledge.org/resource/Carnaval_na_avenida_Central,_atual_avenida_Rio_Branco_Q65621070> <http://schema.org/dateCreated> "1906-06-22"^^<http://www.w3.org/2001/XMLSchema#date> >>
    <http://schema.org/endDate> "1906"^^<http://www.w3.org/2001/XMLSchema#gYear>;
    <http://schema.org/startDate> "1906"^^<http://www.w3.org/2001/XMLSchema#gYear> .
<< <http://yago-knowledge.org/resource/Margherita_Cagol> <http://schema.org/nationality> <http://yago-knowledge.org/resource/Kingdom_of_Italy> >>
    <http://schema.org/endDate> "1946-06-18"^^<http://www.w3.org/2001/XMLSchema#date>;
    <http://schema.org/startDate> "1945-04-08"^^<http://www.w3.org/2001/XMLSchema#date> .
<< <http://yago-knowledge.org/resource/Margherita_Cagol> <http://schema.org/nationality> <http://yago-knowledge.org/resource/Italy> >>
    <http://schema.org/endDate> "1975-06-05"^^<http://www.w3.org/2001/XMLSchema#date>;
    <http://schema.org/startDate> "1946-06-18"^^<http://www.w3.org/2001/XMLSchema#date> .
<< <http://yago-knowledge.org/resource/Mihrengiz_Kadın> <http://schema.org/nationality> <http://yago-knowledge.org/resource/Ottoman_Empire> >>
    <http://schema.org/endDate> "1923"^^<http://www.w3.org/2001/XMLSchema#gYear>;
    <http://schema.org/startDate> "1869"^^<http://www.w3.org/2001/XMLSchema#gYear> .
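The shape of the preview entries (one quoted triple followed by its annotations) can be pulled apart with a few regular expressions. The following is a minimal sketch, not a Turtle-star parser: it assumes every term is either an IRI in angle brackets or a quoted literal with an optional datatype, as in the preview above. The names `TERM`, `QUOTED`, `ANNOTATION`, and `parse_annotated_fact` are hypothetical and exist only for this illustration.

```python
import re

# Assumption: a term is an <IRI> or a "literal" with an optional ^^<datatype>.
# Prefixed names, blank nodes, and language tags are deliberately not handled.
TERM = r'(?:<[^>]*>|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>)?)'
QUOTED = re.compile(rf'<<\s*({TERM})\s+({TERM})\s+({TERM})\s*>>')
ANNOTATION = re.compile(rf'({TERM})\s+({TERM})\s*[;.]')


def parse_annotated_fact(block):
    """Split one preview entry into (quoted_triple, [(predicate, object), ...])."""
    text = block.strip()
    match = QUOTED.match(text)
    if match is None:
        raise ValueError("entry does not start with a quoted triple")
    fact = match.groups()
    # Everything after '>>' is a list of predicate-object pairs about the fact.
    annotations = [m.groups() for m in ANNOTATION.finditer(text[match.end():])]
    return fact, annotations


entry = (
    '<< <http://yago-knowledge.org/resource/_Q56236170> '
    '<http://schema.org/dissolutionDate> '
    '"2009"^^<http://www.w3.org/2001/XMLSchema#gYear> >>\n'
    '<http://schema.org/endDate> "2009-12"^^<http://www.w3.org/2001/XMLSchema#gYearMonth>;\n'
    '<http://schema.org/startDate> "2009-06"^^<http://www.w3.org/2001/XMLSchema#gYearMonth> .'
)
fact, annotations = parse_annotated_fact(entry)
```

For real processing, a library with RDF-star support is the safer choice; the regular expressions here only cover the term forms visible in the preview.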
General information
- Title: YAGO annotated facts (en)
- Identifier:
- Has version: 1.0.2
- Theme:
- Encyclopaedia (eurovoc:4137)
- Metadata (eurovoc:c_40f54e0c)
- Open data (eurovoc:c_5ea6e5c4)
- Creator:
- The creators and contributors of Wikidata (1)
- Name: The creators and contributors of Wikidata
- Homepage: https://www.wikidata.org/
- The YAGO team of Télécom Paris and the Max Planck Institute for Informatics (2)
- Name: The YAGO team of Télécom Paris and the Max Planck Institute for Informatics
- Homepage: https://yago-knowledge.org/contributors
- Piotr Sowiński (3)
- Name: Piotr Sowiński
- Nickname: Ostrzyciel
- Homepage:
- License: https://spdx.org/licenses/CC-BY-SA-3.0
- Source:
- Date Issued: 2023-04-30
- Date Modified: 2024-06-07
- Landing page: yago-annotated-facts (1.0.2)
- Conforms To: Metadata (https://w3id.org/riverbench/schema/metadata)
Technical metadata
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has stream element count: 617,768
- Has stream element split:
- Type: Stream elements split by topic (rb:TopicStreamElementSplit)
- Has subject shape:
- Comment: Custom target – subject of any quoted triple in the subject position. (en)
- Target custom: YAGO annotated facts target (rb:yagoTarget)
- Comment: Every stream element corresponds to one Wikidata item. (en)
- Uses vocabulary: http://schema.org/
- Conforms to W3C RDF 1.1 specification: no
- Conforms to W3C RDF-star draft specification as of December 17, 2021: yes
- Uses generalized triples: no
- Uses generalized RDF datasets: no
- Uses RDF-star: yes
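The two stream type usages listed above describe the same data from two angles. As a rough illustration, assuming a hypothetical in-memory representation in which each stream element is a list of `(subject, predicate, object)` tuples, the flat triple stream is simply the graph stream with the element boundaries removed:

```python
from itertools import chain

# Hypothetical stand-in data: two stream elements (graphs), each holding the
# annotations of one item. The term strings are shortened placeholders.
graph_stream = [
    [("<< s1 p1 o1 >>", "schema:startDate", '"2013-01-02"')],
    [
        ("<< s2 p2 o2 >>", "schema:endDate", '"1923"'),
        ("<< s2 p2 o2 >>", "schema:startDate", '"1869"'),
    ],
]


def flatten(graphs):
    """View a stream of graphs as a flat stream of triples, preserving order."""
    return chain.from_iterable(graphs)


flat_stream = list(flatten(graph_stream))
```

Going the other way (flat stream back to graphs) needs extra information about where each element ends, which is why the two views are declared separately.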
Full stream distribution
- Title: Full stream distribution
- Identifier:
- Has file name:
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream element count: 617,768
- Byte size: 36.17 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/stream_full.tar.gz
- Statistics: statistics-full
Full Jelly distribution
- Title: Full Jelly distribution
- Identifier:
- Has file name:
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has distribution type:
- Full distribution (rb:fullDistribution)
- Jelly distribution (rb:jellyDistribution)
- Has stream element count: 617,768
- Byte size: 28.89 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/jelly_full.jelly.gz
- Statistics: statistics-full
Full flat distribution
- Title: Full flat distribution
- Identifier:
- Has file name:
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Full distribution (rb:fullDistribution)
- Has stream element count: 617,768
- Byte size: 28.74 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/flat_full.nt.gz
- Statistics: statistics-full
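The flat distribution is a gzip-compressed, line-based file, so it can be streamed one statement at a time without unpacking it first. The sketch below assumes one statement per line, as in N-Triples, with quoted triples written inline; `iter_ntriples_lines` is a hypothetical helper, and the sample data is a placeholder rather than content from the dataset.

```python
import gzip
import io


def iter_ntriples_lines(fileobj):
    """Yield non-empty, non-comment lines from a gzip-compressed N-Triples file."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line and not line.startswith("#"):
                yield line


# Small in-memory stand-in for flat_full.nt.gz (placeholder statements).
sample = io.BytesIO(gzip.compress(b"<a> <b> <c> .\n\n# comment\n<d> <e> <f> .\n"))
statements = list(iter_ntriples_lines(sample))
```

`gzip.open` accepts either a file name or an open binary file object, so the same function works on a local path and on a downloaded stream.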
100K elements stream distribution
- Title: 100K elements stream distribution
- Identifier:
- Has file name:
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream element count: 100,000
- Byte size: 3.57 MB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/stream_100K.tar.gz
- Statistics: statistics-100k
100K elements Jelly distribution
- Title: 100K elements Jelly distribution
- Identifier:
- Has file name:
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 100,000
- Byte size: 2.72 MB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/jelly_100K.jelly.gz
- Statistics: statistics-100k
100K elements flat distribution
- Title: 100K elements flat distribution
- Identifier:
- Has file name:
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 100,000
- Byte size: 2.38 MB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/flat_100K.nt.gz
- Statistics: statistics-100k
10K elements stream distribution
- Title: 10K elements stream distribution
- Identifier:
- Has file name:
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has distribution type:
- Partial distribution (rb:partialDistribution)
- Stream distribution (rb:streamDistribution)
- Has stream element count: 10,000
- Byte size: 376.46 KB
- Media type: text/turtle
- Packaging format: application/tar
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/stream_10K.tar.gz
- Statistics: statistics-10k
10K elements Jelly distribution
- Title: 10K elements Jelly distribution
- Identifier:
- Has file name:
- Has stream type usage:
- RDF stream type usage (1)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- RDF stream type usage (2)
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a stream of graphs. Each graph corresponds to the RDF-star annotations of one Wikidata item. (en)
- Has stream type: RDF subject graph stream (stax:subjectGraphStream)
- Has distribution type:
- Jelly distribution (rb:jellyDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 10,000
- Byte size: 260.84 KB
- Media type: application/x-jelly-rdf
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/jelly_10K.jelly.gz
- Statistics: statistics-10k
10K elements flat distribution
- Title: 10K elements flat distribution
- Identifier:
- Has file name:
- Has stream type usage:
- Type: RDF stream type usage (stax:RdfStreamTypeUsage)
- Comment: The dataset can be viewed as a flattened stream of triples. (en)
- Has stream type: Flat RDF triple stream (stax:flatTripleStream)
- Has distribution type:
- Flat distribution (rb:flatDistribution)
- Partial distribution (rb:partialDistribution)
- Has stream element count: 10,000
- Byte size: 256.72 KB
- Media type: application/n-triples
- Compression format: application/gzip
- Checksum:
- Checksum (1)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_md5 (spdx:checksumAlgorithm_md5)
- Checksum (2)
- Type: Checksum (spdx:Checksum)
- ChecksumValue:
- Algorithm: ChecksumAlgorithm_sha1 (spdx:checksumAlgorithm_sha1)
- Download URL: https://w3id.org/riverbench/datasets/yago-annotated-facts/1.0.2/files/flat_10K.nt.gz
- Statistics: statistics-10k
Statistics for full distributions
- Title: Statistics for full distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
IRIs | 3,631,687 | 594,855 | 5.88 | 3.22 | 3 | 853 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Graphs | 617,768 | 1 | 1.00 | 0.00 | 1 | 1 |
Statements | 2,484,547 | N/A | 4.02 | 6.10 | 1 | 1,455 |
Literals | 1,736,327 | 57,578 | 2.81 | 2.50 | 1 | 66 |
Simple literals | 211 | 174 | 0.00 | 0.02 | 0 | 3 |
Datatype literals | 1,736,116 | 57,405 | 2.81 | 2.50 | 1 | 66 |
Language literals | 0 | 0 | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 2,484,547 | N/A | 4.02 | 6.10 | 1 | 1,455 |
Subjects | 2,009,932 | 1,896,569 | 3.25 | 3.04 | 2 | 850 |
Predicates | 1,622,855 | 75 | 2.63 | 0.48 | 2 | 3 |
Objects | 3,127,393 | 166,380 | 5.06 | 5.06 | 1 | 853 |
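The Mean column in the table above is consistent with the Sum column divided by the stream element count of the full distributions (617,768), which suggests that the mean, standard deviation, minimum, and maximum are computed per stream element. A short check of that reading, using only figures from this page (the helper name `per_element_mean` is hypothetical):

```python
# Stream element count of the full distributions, as listed on this page.
ELEMENTS = 617_768

# (Sum, reported Mean) pairs copied from the table for the full distributions.
rows = {
    "IRIs": (3_631_687, 5.88),
    "Statements": (2_484_547, 4.02),
    "Literals": (1_736_327, 2.81),
    "Subjects": (2_009_932, 3.25),
    "Predicates": (1_622_855, 2.63),
    "Objects": (3_127_393, 5.06),
}


def per_element_mean(total, elements=ELEMENTS):
    """Average count per stream element, rounded to two decimals like the table."""
    return round(total / elements, 2)


consistent = {name: per_element_mean(total) == mean for name, (total, mean) in rows.items()}
```

Under this reading, the value 4.02 for statements means that an average stream element carries about four annotated statements.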
Statistics for 100K distributions
- Title: Statistics for 100K distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
IRIs | 502,972 | 102,657 | 5.03 | 5.30 | 3 | 853 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Graphs | 100,000 | 1 | 1.00 | 0.00 | 1 | 1 |
Statements | 226,648 | N/A | 2.27 | 9.28 | 1 | 1,455 |
Literals | 187,612 | 37,329 | 1.88 | 0.98 | 1 | 49 |
Simple literals | 66 | 66 | 0.00 | 0.03 | 0 | 3 |
Datatype literals | 187,546 | 37,263 | 1.88 | 0.97 | 1 | 49 |
Language literals | 0 | 0 | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 226,648 | N/A | 2.27 | 9.28 | 1 | 1,455 |
Subjects | 246,103 | 237,937 | 2.46 | 5.24 | 2 | 850 |
Predicates | 257,646 | 32 | 2.58 | 0.49 | 2 | 3 |
Objects | 332,939 | 52,847 | 3.33 | 5.46 | 1 | 853 |
Statistics for 10K distributions
- Title: Statistics for 10K distributions
Statistic | Sum | Unique | Mean | St. dev. | Min. | Max. |
IRIs | 49,533 | 10,233 | 4.95 | 0.93 | 3 | 10 |
Blank nodes | 0 | N/A | 0.00 | 0.00 | 0 | 0 |
Graphs | 10,000 | 1 | 1.00 | 0.00 | 1 | 1 |
Statements | 22,977 | N/A | 2.30 | 1.34 | 1 | 10 |
Literals | 19,576 | 7,332 | 1.96 | 0.90 | 1 | 8 |
Simple literals | 0 | 0 | 0.00 | 0.00 | 0 | 0 |
Datatype literals | 19,576 | 7,332 | 1.96 | 0.90 | 1 | 8 |
Language literals | 0 | 0 | 0.00 | 0.00 | 0 | 0 |
Quoted triples | 22,977 | N/A | 2.30 | 1.34 | 1 | 10 |
Subjects | 23,762 | 23,762 | 2.38 | 0.53 | 2 | 7 |
Predicates | 26,100 | 12 | 2.61 | 0.49 | 2 | 3 |
Objects | 33,009 | 7,553 | 3.30 | 1.39 | 1 | 13 |