Learning SPARQL with Wikidata

SPARQL is similar to SQL in many ways and very different in others. Here I am going to fetch the English works of Rabindranath Tagore from Wikidata, and use that as an exercise to learn SPARQL. I have not tried to explain every term, as I expect some knowledge of SQL and semantic triples. Since I go step by step, I think it's easy to follow.

This blog post continues my earlier posts, "Learning about Semantic Web and Triple" and "Trying WikiData".

Let's start by getting all the works from Wikidata. You can run the queries on Wikidata's query interface here. Paste the query and press the play button to execute it.

Wikidata Query Interface

SELECT and LIMIT

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?author
WHERE {
   ?work wdt:P50 ?author .
}
LIMIT 10

Here ?work and ?author are variables. We are trying to get any ?work that has an ?author. So the statement

?work wdt:P50 ?author

is like any other semantic triple:

a subject, a predicate, and an object

Each triple pattern ends with a period (.). The period is optional after the last pattern in a block, but adding it every time is a safe habit.

wdt:P50 denotes the author property. We LIMIT the result set to 10, since we are just exploring.
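If it helps to see what the engine is doing with that pattern, here is a minimal Python sketch of triple-pattern matching. It is only a toy model under my own assumptions: the ex:* names and the match helper are made up for illustration, and a real SPARQL engine is far more sophisticated.

```python
from itertools import islice

# Toy triples (subject, predicate, object). The ex:* names are invented;
# only wdt:P50 and wd:Q7241 are real Wikidata identifiers.
TRIPLES = [
    ("ex:workA", "wdt:P50", "wd:Q7241"),
    ("ex:workA", "ex:someOtherProperty", "ex:thing"),
    ("ex:workB", "wdt:P50", "ex:authorX"),
    ("ex:workC", "wdt:P50", "wd:Q7241"),
]

def match(pattern, triples):
    """Yield one binding dict per triple that fits the pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break  # same variable would need two different values
            elif term != value:
                break  # a fixed term does not match this triple
        else:
            yield binding

# Roughly: SELECT ?work ?author WHERE { ?work wdt:P50 ?author . } LIMIT 2
rows = list(islice(match(("?work", "wdt:P50", "?author"), TRIPLES), 2))
print(rows)
```

Each row is one "solution": a set of variable bindings, which is what the query interface shows as a table row. LIMIT simply stops after the first few solutions.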

FILTER

So our query lists anything that has an author property. But what if we want just the works of Rabindranath Tagore? We can do that by comparing the author property to Rabindranath Tagore, whose Wikidata ID is Q7241.

We can add a FILTER to keep only those rows:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?author
WHERE {
    ?work wdt:P50 ?author .
    FILTER ( ?author = wd:Q7241 )
}

VALUES

Another way is to use the VALUES keyword. It supplies the values for a variable inline, so it becomes easy to substitute. The query gets simpler:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?work ?author
WHERE {
  ?work wdt:P50 ?author .
  VALUES ( ?author ) {
    ( wd:Q7241 )
  } 
  
}
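To convince myself that the FILTER and VALUES versions return the same rows, I find it useful to model them in a few lines of Python. This is a sketch under my own assumptions: the rows and the ex:* identifiers are invented, and with_filter and with_values are hypothetical helpers, not part of any SPARQL library.

```python
# Toy solutions for `?work wdt:P50 ?author`. The ex:* identifiers are invented;
# wd:Q7241 is Tagore's Wikidata ID.
MATCHES = [
    {"?work": "ex:workA", "?author": "wd:Q7241"},
    {"?work": "ex:workB", "?author": "ex:authorX"},
    {"?work": "ex:workC", "?author": "wd:Q7241"},
]

def with_filter(rows):
    # FILTER ( ?author = wd:Q7241 ): test each solution, keep those that pass.
    return [r for r in rows if r["?author"] == "wd:Q7241"]

def with_values(rows, table):
    # VALUES ( ?author ) { ( wd:Q7241 ) }: join the solutions with a small
    # inline table, keeping rows that agree on every shared variable.
    return [{**r, **v} for r in rows for v in table
            if all(r.get(k, val) == val for k, val in v.items())]

filtered = with_filter(MATCHES)
joined = with_values(MATCHES, [{"?author": "wd:Q7241"}])
print(filtered == joined)  # True
```

The difference is in how you think about it: FILTER is a test applied to each row, while VALUES is a tiny table that gets joined in. That is why VALUES extends so naturally to more than one variable.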

Now let's go one step further and get only English works. That's done with an attribute of the work: its language, wdt:P407. The Wikidata entity for the English language is wd:Q1860, so we can add it to the VALUES block and match on it.

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?author ?pubLang
WHERE {
  VALUES ( ?author ?pubLang) {
    ( wd:Q7241 wd:Q1860)
  }
  ?work wdt:P50 ?author .
  ?work wdt:P407 ?pubLang .  
}
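Because both triple patterns share the ?work variable, a work has to satisfy both of them to show up in the result. A small Python sketch makes that concrete. The data is a toy set I made up (the ex:* names are not real Wikidata entities), and subjects is a hypothetical helper.

```python
# Toy triples; ex:* names are invented. wdt:P50 = author, wdt:P407 = language,
# wd:Q7241 = Tagore, wd:Q1860 = English.
TRIPLES = {
    ("ex:workA", "wdt:P50", "wd:Q7241"),
    ("ex:workA", "wdt:P407", "wd:Q1860"),
    ("ex:workB", "wdt:P50", "wd:Q7241"),
    ("ex:workB", "wdt:P407", "ex:otherLanguage"),
    ("ex:workC", "wdt:P50", "ex:authorX"),
    ("ex:workC", "wdt:P407", "wd:Q1860"),
}

def subjects(predicate, obj):
    """All subjects ?work for which (?work, predicate, obj) is in the data."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}

# Both patterns use ?work, so a solution must satisfy both: a set intersection.
english_tagore = subjects("wdt:P50", "wd:Q7241") & subjects("wdt:P407", "wd:Q1860")
print(english_tagore)  # {'ex:workA'}
```

In SQL terms this is an inner join on the work. Every extra pattern can only narrow the result, never widen it.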

Let's get the title of the work (wdt:P1476), along with its publication date (wdt:P577).

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?title ?pubLang ?pubDate
WHERE {
  VALUES ( ?author ?pubLang ) {
    ( wd:Q7241 wd:Q1860  )
  }  

  ?work wdt:P50 ?author .
  ?work wdt:P1476 ?title .
  ?work wdt:P407 ?pubLang .
  ?work wdt:P577 ?pubDate .
}

OPTIONAL

Now let's get the published date, wdt:P577, but make it optional. OPTIONAL works like an outer join: the work is included even if the property doesn't exist or has no value.

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?title ?pubLang ?pubDate
WHERE {
  VALUES ( ?author ?pubLang ) {
    ( wd:Q7241 wd:Q1860  )
  }  
  
  ?work wdt:P50 ?author .
  ?work wdt:P1476 ?title .
  ?work wdt:P407 ?pubLang .
  OPTIONAL { ?work wdt:P577 ?pubDate } . 
}
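The outer-join comparison can be sketched in Python. This is a toy model only: the ex:* names and the dates are placeholders rather than real Wikidata values, and join and optional are hypothetical helpers I wrote for illustration.

```python
# Toy data: the ex:* names and the dates are placeholders, not Wikidata values.
WORKS = [{"?work": "ex:workA"}, {"?work": "ex:workB"}, {"?work": "ex:workC"}]
PUB_DATES = [
    {"?work": "ex:workA", "?pubDate": "1910-01-01"},
    {"?work": "ex:workC", "?pubDate": "1920-01-01"},
]

def join(left, right):
    """Required pattern: keep only rows that have a match on ?work."""
    return [{**l, **r} for l in left for r in right if l["?work"] == r["?work"]]

def optional(left, right):
    """OPTIONAL pattern: like join, but unmatched left rows are kept as they are."""
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if l["?work"] == r["?work"]]
        out.extend(matches or [l])
    return out

required_rows = join(WORKS, PUB_DATES)
optional_rows = optional(WORKS, PUB_DATES)
print(len(required_rows))  # 2: ex:workB has no date and is dropped
print(len(optional_rows))  # 3: ex:workB is kept, with ?pubDate left unbound
```

An unbound variable simply shows up as an empty cell in the Wikidata results table.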

Similarly, let's get the label of the language. In Wikidata, labels can exist in many languages, so we keep only the English-language labels.

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?title ?pubLang ?pubDate ?language
WHERE {
  VALUES ( ?author ?pubLang ) {
    ( wd:Q7241 wd:Q1860 )
  }

  ?work wdt:P50 ?author .
  ?work wdt:P1476 ?title .
  ?work wdt:P407 ?pubLang .
  # The FILTER sits inside OPTIONAL so that the label stays optional.
  OPTIONAL {
    ?pubLang rdfs:label ?language .
    FILTER(LANG(?language) = 'en')
  }
  OPTIONAL { ?work wdt:P577 ?pubDate } .
}
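Labels are language-tagged strings, and LANG() returns the tag. Here is a small Python sketch of what that filter does. The label list is illustrative and typed in by hand, not pulled from Wikidata, and label_in is a hypothetical helper.

```python
# Toy labels for one entity as (text, language tag) pairs. wd:Q1860 is the
# entity for English; the list itself is illustrative.
LABELS = {
    "wd:Q1860": [("English", "en"), ("anglais", "fr"), ("Englisch", "de")],
}

def label_in(entity, lang, labels=LABELS):
    """Mimic `?e rdfs:label ?l . FILTER(LANG(?l) = lang)`: keep labels with that tag."""
    return [text for text, tag in labels.get(entity, []) if tag == lang]

english_labels = label_in("wd:Q1860", "en")
print(english_labels)  # ['English']
```

Without the filter, the query would return one row per label language for every work, which multiplies the result set quickly.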

ORDER BY

And then sort by ?pubDate in descending order.

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?work ?title ?pubLang ?pubDate ?language
WHERE {
  VALUES ( ?author ?pubLang ) {
    ( wd:Q7241 wd:Q1860 )
  }

  ?work wdt:P50 ?author .
  ?work wdt:P1476 ?title .
  ?work wdt:P407 ?pubLang .
  # The FILTER sits inside OPTIONAL so that the label stays optional.
  OPTIONAL {
    ?pubLang rdfs:label ?language .
    FILTER(LANG(?language) = 'en')
  }
  OPTIONAL { ?work wdt:P577 ?pubDate } .
}

ORDER BY DESC(?pubDate)
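One thing worth knowing when sorting on an optional variable: as I read the SPARQL ordering rules, unbound values rank lowest, so with DESC the works without a date end up at the bottom. A Python sketch of that behaviour follows. The rows are toy data with placeholder dates, and order_by_desc is a hypothetical helper.

```python
# Toy result rows; a missing "?pubDate" key models an unbound OPTIONAL variable.
ROWS = [
    {"?work": "ex:workA", "?pubDate": "1910-01-01"},
    {"?work": "ex:workB"},
    {"?work": "ex:workC", "?pubDate": "1920-01-01"},
]

def order_by_desc(rows, var):
    # Sort key: (is bound, value). Unbound rows get (False, "") and rank lowest,
    # which puts them last once the order is reversed. ISO date strings of the
    # same length compare correctly as plain text.
    return sorted(rows, key=lambda r: (var in r, r.get(var, "")), reverse=True)

ordered = [r["?work"] for r in order_by_desc(ROWS, "?pubDate")]
print(ordered)  # ['ex:workC', 'ex:workA', 'ex:workB']
```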

When we run that query on Wikidata, we get a table of results. (I have embedded the results iframe from Wikidata.)

I will try to include other advanced features in the next part. What do you think? Was this useful?

