Saturday, March 25, 2023

Wikidata SPARQL examples

Research in programming Wikidata/Cities - Wikiversity

Which country has the most sister cities?
#defaultView:BubbleChart
SELECT ?countryLabel (COUNT(?sister) as ?sisterCount) WHERE {       # Selecting number of distinct sister cities of particular country cities which are ... 
  SELECT DISTINCT ?countryLabel ?sister WHERE {                           
    VALUES ?cityTypes {wd:Q3957 wd:Q515 wd:Q1549591 wd:Q1637706}
    ?city wdt:P31 ?cityTypes.                                       # ... instances of different types of cities ...
    ?city wdt:P17 ?country.                                         # ... with filled property "country" ...
    ?city wdt:P190 ?sister.                                         # ... with filled property "sister city"
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }                                 
}
GROUP BY ?countryLabel

ORDER BY DESC(?sisterCount)

Wikidata:SPARQL tutorial - Wikidata

# Items (films) whose "cast member" (P161) statements include Tom Hanks (Q2263)

SELECT DISTINCT ?item ?itemLabel WHERE {
 SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
 {
  SELECT DISTINCT ?item WHERE {
   ?item p:P161 ?statement0. # cast member
   ?statement0 (ps:P161/(wdt:P279*)) wd:Q2263. # tom hanks
  }
  LIMIT 100
 }
}

# Simplified version of the same query
SELECT DISTINCT ?item ?itemLabel 
WHERE {
  ?item p:P161 ?statement0. # cast member
  ?statement0 (ps:P161/(wdt:P279*)) wd:Q2263. # tom hanks
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
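
Queries like the ones above can also be sent to the Wikidata Query Service from a script instead of the web editor. The following is a minimal Python sketch, not an official client: it assumes the public endpoint https://query.wikidata.org/sparql with the `query` and `format=json` parameters, and the helper names (`build_url`, `rows_from_json`, `run_query`) and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint (assumed to accept GET requests).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The simplified Tom Hanks query from above, unchanged.
QUERY = """
SELECT DISTINCT ?item ?itemLabel
WHERE {
  ?item p:P161 ?statement0. # cast member
  ?statement0 (ps:P161/(wdt:P279*)) wd:Q2263. # Tom Hanks
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
"""


def build_url(query, endpoint=WDQS_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return endpoint + "?" + params


def rows_from_json(payload):
    """Flatten SPARQL JSON results into one dict per row (variable -> value)."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_query(query):
    """Send the query over the network and return the flattened rows."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical User-Agent; replace it with one identifying your script.
        headers={"User-Agent": "sparql-notes-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return rows_from_json(json.load(response))
```

Calling `run_query(QUERY)` needs network access and should return a list of dicts with the keys `item` and `itemLabel`. The URL building and result parsing are kept in separate functions so they can be tried without contacting the server.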


data: Freebase, Wikidata, DBpedia (from Wikipedia)

Freebase (database) - Wikipedia

Data Dumps  |  Freebase API (Deprecated)  |  Google Developers

1.9B RDF triples, 31 GB

Freebase Easy - Dataset Download

3.3 GB

Freebase Easy (Cities in Europe)

DBpedia and Wikidata both publish RDF data about entities, but the source of the data is very different: DBpedia extracts it from Wikipedia infoboxes, whereas Wikidata collects data entered through its own interfaces.

Data in Wikidata is also annotated with its provenance: it does not simply state the population of Germany, it also expects a source to be given for that figure. The two data repositories co-exist.
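
The provenance annotation can be seen by querying the population statements of Germany (Q183, property P1082) together with their references. The sketch below is an illustration under assumptions: it uses the `p:`/`ps:`/`pq:`/`pr:` and `prov:` prefixes predefined on the Wikidata Query Service, it looks only at references given as a reference URL (P854), and the helper `split_by_source` is a made-up name that works on the `bindings` list of a SPARQL JSON result.

```python
# SPARQL query for the Wikidata Query Service: population statements of
# Germany with their optional date qualifier and optional reference URL.
POPULATION_WITH_SOURCES = """
SELECT ?population ?pointInTime ?referenceUrl WHERE {
  wd:Q183 p:P1082 ?statement.                    # Germany, population statements
  ?statement ps:P1082 ?population.               # the stated value
  OPTIONAL { ?statement pq:P585 ?pointInTime. }  # qualifier: point in time
  OPTIONAL {
    ?statement prov:wasDerivedFrom ?reference.   # provenance (reference) node
    ?reference pr:P854 ?referenceUrl.            # reference URL
  }
}
"""


def split_by_source(bindings):
    """Separate result rows that carry a reference URL from those that do not.

    `bindings` is the list found under results -> bindings in a SPARQL JSON
    response; variables left unbound by OPTIONAL are simply missing from a row.
    """
    sourced, unsourced = [], []
    for row in bindings:
        value = row["population"]["value"]
        if "referenceUrl" in row:
            sourced.append((value, row["referenceUrl"]["value"]))
        else:
            unsourced.append(value)
    return sourced, unsourced
```

The reference block is OPTIONAL because a reference may be expressed with properties other than a reference URL, so a row without `referenceUrl` does not necessarily mean the statement has no source at all.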



DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project.


The Freebase Developers page lists the data dump as 22 GB gzip-compressed and 250 GB uncompressed, although a more recent download exceeds these sizes (a May 2016 download amounted to more than 30 GB compressed and more than 400 GB uncompressed).