
Dataset: politiquices (development version)

Support and opposition relations extracted from news articles archived in Arquivo.pt. The dataset describes Portuguese-language news articles and the political stances they express. See also the dataset source and more information about the project (in Portuguese).

Stream preview
0000000000.ttl
PREFIX ns1: <http://purl.org/dc/elements/1.1/>
PREFIX ns2: <http://www.politiquices.pt/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

[ ns2:ent1      <http://www.wikidata.org/entity/Q745134>;
  ns2:ent1_str  "Mota Amaral";
  ns2:ent2      <http://www.wikidata.org/entity/Q57398>;
  ns2:ent2_str  "Cavaco";
  ns2:score     "0.6829476952552795"^^xsd:float;
  ns2:type      "ent1_other_ent2";
  ns2:url       <https://www.linguateca.pt/CHAVE?PUBLICO-19940819-122>
] .

<https://www.linguateca.pt/CHAVE?PUBLICO-19940819-122>
        ns1:date   "1994-08-19"^^xsd:date;
        ns1:title  "Mota Amaral com Cavaco"@pt .
0000000010.ttl
PREFIX ns1: <http://www.politiquices.pt/>
PREFIX ns2: <http://purl.org/dc/elements/1.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

[ ns1:ent1      <http://www.wikidata.org/entity/Q737410>;
  ns1:ent1_str  "Manuel Alegre";
  ns1:ent2      <http://www.wikidata.org/entity/Q1688029>;
  ns1:ent2_str  "Jerónimo de Sousa";
  ns1:score     "1.0"^^xsd:float;
  ns1:type      "ent1_opposes_ent2";
  ns1:url       <https://publico.pt/1238588>
] .

<https://publico.pt/1238588>
        ns2:date   "2005-11-12"^^xsd:date;
        ns2:title  "Manuel Alegre critica campanha agressiva de Jerónimo de Sousa"@pt .
0000000100.ttl
PREFIX ns1: <http://www.politiquices.pt/>
PREFIX ns2: <http://purl.org/dc/elements/1.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

[ ns1:ent1      <http://www.wikidata.org/entity/Q76>;
  ns1:ent1_str  "Obama";
  ns1:ent2      <http://www.wikidata.org/entity/Q567>;
  ns1:ent2_str  "Merkel";
  ns1:score     "0.9981821775436401"^^xsd:float;
  ns1:type      "ent1_other_ent2";
  ns1:url       <https://arquivo.pt/wayback/20141127055917/http://www.publico.pt/mundo/noticia/o-ebola-e-a-mais-grave-urgencia-sanitaria-dos-ultimos-anos-dizem-obama-e-merkel-1673060>
] .

<https://arquivo.pt/wayback/20141127055917/http://www.publico.pt/mundo/noticia/o-ebola-e-a-mais-grave-urgencia-sanitaria-dos-ultimos-anos-dizem-obama-e-merkel-1673060>
        ns2:date   "2014-11-27"^^xsd:date;
        ns2:title  "O ébola é \"a mais grave urgência sanitária dos últimos anos\", dizem Obama e Merkel"@pt .
0000001000.ttl
PREFIX ns1: <http://www.politiquices.pt/>
PREFIX ns2: <http://purl.org/dc/elements/1.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

[ ns1:ent1      <http://www.wikidata.org/entity/Q2571494>;
  ns1:ent1_str  "Carlos César";
  ns1:ent2      <http://www.wikidata.org/entity/Q57398>;
  ns1:ent2_str  "Cavaco";
  ns1:score     "0.9920841455459595"^^xsd:float;
  ns1:type      "ent1_opposes_ent2";
  ns1:url       <https://arquivo.pt/wayback/20151119204457/http://observador.pt/2015/11/18/carlos-cesar-responsabiliza-cavaco-por-incontinencia-verbal-entre-os-partidos/>
] .

<https://arquivo.pt/wayback/20151119204457/http://observador.pt/2015/11/18/carlos-cesar-responsabiliza-cavaco-por-incontinencia-verbal-entre-os-partidos/>
        ns2:date   "2015-11-19"^^xsd:date;
        ns2:title  "Carlos César responsabiliza Cavaco por \"incontinência verbal\" entre os partidos"@pt .
0000010000.ttl
PREFIX ns1: <http://purl.org/dc/elements/1.1/>
PREFIX ns2: <http://www.politiquices.pt/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

[ ns2:ent1      <http://www.wikidata.org/entity/Q456034>;
  ns2:ent1_str  "Isabel dos Santos";
  ns2:ent2      <http://www.wikidata.org/entity/Q1318666>;
  ns2:ent2_str  "Ulrich";
  ns2:score     "1.0"^^xsd:float;
  ns2:type      "ent1_opposes_ent2";
  ns2:url       <https://arquivo.pt/wayback/20151003185118/http://economico.sapo.pt/noticias/isabel-dos-santos-vai-oporse-ao-plano-de-ulrich-para-angola_230444.html>
] .

<https://arquivo.pt/wayback/20151003185118/http://economico.sapo.pt/noticias/isabel-dos-santos-vai-oporse-ao-plano-de-ulrich-para-angola_230444.html>
        ns1:date   "2015-10-03"^^xsd:date;
        ns1:title  "Isabel dos Santos vai opor-se ao plano de Ulrich para Angola"@pt .

General information

Technical metadata

Distributions

The dataset is published in a few size variants, each containing a specific number of stream elements. For each size, there are three distribution types available: flat (just an N-Triples/N-Quads file), streaming (a .tar.gz archive with Turtle/TriG files, one file per stream element), and Jelly (a native binary format for streaming RDF). See the documentation for more details.

Distribution size | Statements | Flat   | Streaming | Jelly
10K               | 90,000     | 1.6 MB | 1.4 MB    | 1.4 MB
Full              | 159,957    | 2.9 MB | 2.4 MB    | 2.5 MB
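
As a rough illustration of how the streaming distribution could be consumed, the sketch below walks a .tar.gz archive and yields the text of each Turtle file in archive order. It uses only the Python standard library. The helper names (`iter_stream_elements`, `build_demo_archive`) and the example.org triples are hypothetical, and the archive layout (one `.ttl` member per stream element, named like the files in the preview) is an assumption based on the description above. An in-memory archive stands in for a real download so the example runs as-is.

```python
import io
import tarfile


def iter_stream_elements(archive):
    """Yield (member_name, turtle_text) for each .ttl file, in archive order.

    `archive` is a binary file object holding a .tar.gz archive.
    Assumption: each .ttl member is one stream element.
    """
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(".ttl"):
                data = tar.extractfile(member).read()
                yield member.name, data.decode("utf-8")


def build_demo_archive(elements):
    """Build a small in-memory .tar.gz (hypothetical stand-in for a download)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, text in elements:
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


# Two placeholder stream elements; real elements look like the preview above.
demo = build_demo_archive([
    ("0000000000.ttl", '<https://example.org/a> <https://example.org/p> "first" .\n'),
    ("0000000001.ttl", '<https://example.org/b> <https://example.org/p> "second" .\n'),
])

elements = list(iter_stream_elements(demo))
for name, text in elements:
    print(name, len(text), "characters")
```

Each yielded string is a complete Turtle document, so it could be handed to whichever RDF parser is in use. For a real archive, a file opened with `open(path, "rb")` would replace the in-memory buffer.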

The full metadata of all distributions can be found below.

10K elements Jelly distribution

10K elements flat distribution

Full Jelly distribution

Full flat distribution

10K elements stream distribution

Full stream distribution

Statistics

Statistics for 10K distributions

Statistic           | Sum     | Unique  | Mean   | St. dev. | Min.   | Max.
IRIs                | 119,998 | ~10,704 | 12.00  | 0.01     | 11     | 12
Blank nodes         | 10,000  | N/A     | 1.00   | 0.00     | 1      | 1
Literals            | 60,000  | ~22,255 | 6.00   | 0.00     | 6      | 6
Simple literals     | 30,000  | ~1,064  | 3.00   | 0.00     | 3      | 3
Datatype literals   | 20,000  | ~11,259 | 2.00   | 0.00     | 2      | 2
Language literals   | 10,000  | ~9,953  | 1.00   | 0.00     | 1      | 1
Datatypes           | 20,000  | 2       | 2.00   | 0.00     | 2      | 2
ASCII control chars | 5       | N/A     | 0.00   | 0.03     | 0      | 2
Quoted triples      | 0       | N/A     | 0.00   | 0.00     | 0      | 0
Subjects            | 20,000  | ~20,027 | 2.00   | 0.00     | 2      | 2
Predicates          | 90,000  | ~9      | 9.00   | 0.00     | 9      | 9
Objects             | 89,998  | ~32,878 | 9.00   | 0.01     | 8      | 9
Graphs              | 10,000  | ~1      | 1.00   | 0.00     | 1      | 1
Statements          | 90,000  | N/A     | 9.00   | 0.00     | 9      | 9
Bytes per statement | N/A     | N/A     | 142.15 | 17.60    | 104.67 | 218.67

Statistics for full distributions

Statistic           | Sum     | Unique  | Mean   | St. dev. | Min.   | Max.
IRIs                | 213,274 | ~18,562 | 12.00  | 0.01     | 11     | 12
Blank nodes         | 17,773  | N/A     | 1.00   | 0.00     | 1      | 1
Literals            | 106,638 | ~36,211 | 6.00   | 0.00     | 6      | 6
Simple literals     | 53,319  | ~1,295  | 3.00   | 0.00     | 3      | 3
Datatype literals   | 35,546  | ~17,272 | 2.00   | 0.00     | 2      | 2
Language literals   | 17,773  | ~17,642 | 1.00   | 0.00     | 1      | 1
Datatypes           | 35,546  | 2       | 2.00   | 0.00     | 2      | 2
ASCII control chars | 21      | N/A     | 0.00   | 0.04     | 0      | 3
Quoted triples      | 0       | N/A     | 0.00   | 0.00     | 0      | 0
Subjects            | 35,546  | ~35,498 | 2.00   | 0.00     | 2      | 2
Predicates          | 159,957 | ~9      | 9.00   | 0.00     | 9      | 9
Objects             | 159,955 | ~54,678 | 9.00   | 0.01     | 8      | 9
Graphs              | 17,773  | ~1      | 1.00   | 0.00     | 1      | 1
Statements          | 159,957 | N/A     | 9.00   | 0.00     | 9      | 9
Bytes per statement | N/A     | N/A     | 142.10 | 17.69    | 104.67 | 218.67
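
The Mean column in both tables appears to be the Sum divided by the number of stream elements. The short check below works through that arithmetic for a few rows. It is a sketch under two assumptions not stated on the page: that the element count equals the Sum of the Graphs row (10,000 and 17,773), and that means are rounded to two decimals. The function and variable names are illustrative only.

```python
def per_element_mean(total, elements):
    """Mean count per stream element, rounded to two decimals like the tables."""
    return round(total / elements, 2)


# Element counts, assumed equal to the Sum of the "Graphs" row in each table.
ELEMENTS_10K = 10_000
ELEMENTS_FULL = 17_773

# Sums copied from the statistics tables above.
sums_10k = {"IRIs": 119_998, "Literals": 60_000, "Objects": 89_998, "Statements": 90_000}
sums_full = {"IRIs": 213_274, "Literals": 106_638, "Objects": 159_955, "Statements": 159_957}

means_10k = {name: per_element_mean(total, ELEMENTS_10K) for name, total in sums_10k.items()}
means_full = {name: per_element_mean(total, ELEMENTS_FULL) for name, total in sums_full.items()}

# Shortfall relative to "every element has the maximum count".
# 12 IRIs and 9 objects are the Max. values in both tables.
iri_shortfall_10k = 12 * ELEMENTS_10K - sums_10k["IRIs"]
object_shortfall_10k = 9 * ELEMENTS_10K - sums_10k["Objects"]
iri_shortfall_full = 12 * ELEMENTS_FULL - sums_full["IRIs"]
object_shortfall_full = 9 * ELEMENTS_FULL - sums_full["Objects"]

print(means_10k)
print(means_full)
print(iri_shortfall_10k, object_shortfall_10k, iri_shortfall_full, object_shortfall_full)
```

Under these assumptions the computed means match the tables, and the IRI and object sums each fall short of the maximum by the same small amount. That agrees with the Min. values of 11 IRIs and 8 objects, which suggests a handful of elements carry a non-IRI or missing value where an IRI is usually present.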