Factgrid Federated: How to retrieve data from Wikidata and DBpedia from the Factgrid SPARQL endpoint

Unfortunately, becoming an official source for federated queries from Wikidata is not as easy as putting Factgrid on a waiting list. Moreover, the time frame is unclear.

But turning Factgrid into a starting point for federated queries to Wikidata and DBpedia is much easier. Thanks to Lucas Werkmeister from Wikimedia Deutschland, the Factgrid SPARQL endpoint can now request data from Wikidata as well as DBpedia, the linked open data generated from Wikipedia articles.

So Factgrid is no longer an isolated database, but is integrated into the linked open data universe. An easy example: every Factgrid item that has a Wikidata QID can be queried for property-value statements from both Wikidata and DBpedia.

How does it work? Let me explain it quickly:

Use Prefixes as a gateway to other data sources

The main difference from a query against a single source is that you have to account for the data sources with prefixes at the top of your query. A prefix is basically a signpost or gateway to other ontologies and data sources.

Since Factgrid uses the same software as Wikidata, the default in Factgrid is to use the prefix wd for items and wdt for properties. These are Wikidata's defaults, hence the wd. This works fine if only one data source is used.

When integrating Wikidata into Factgrid queries, it makes sense to change the prefixes so that the data source is visible at first glance. I decided to use fg_ for Factgrid, wd_ for Wikidata and db_ for DBpedia. Of course, you can use any other prefix as a signpost, such as factgrid_item, as long as you declare it as a prefix at the top, e.g. `PREFIX factgrid_item: <https://database.factgrid.de/entity/>`.

Possible prefixes:

# Factgrid
PREFIX fg: <https://database.factgrid.de/entity/>
PREFIX fgt: <https://database.factgrid.de/prop/direct/>
# DBpedia categories
PREFIX dbc: <http://dbpedia.org/resource/Category:>
# DBpedia ontology
PREFIX dbo: <http://dbpedia.org/ontology/>
# DBpedia resources
PREFIX dbr: <http://dbpedia.org/resource/>
# Wikidata
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
# Standard prefixes
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dct: <http://purl.org/dc/terms/>

Structure of federated queries

Many roads lead to Rome, and to federated queries. The ones I have written so far share the following basic structure:

  • set the prefixes
  • look up a specific item in Factgrid named fg_item
  • look for the Wikidata QID of the Factgrid item and convert this string to an IRI called wd_item in order to use it for querying Wikidata or DBpedia
  • get the item in DBpedia that carries that specific Wikidata QID, using the OWL ontology: ?db_item owl:sameAs ?wd_item
  • define the properties of interest as [VALUES](https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial#VALUES) and give them a name, e.g. relations_db
  • get the resulting value
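
The steps above can be sketched as a small Python helper that only assembles the query text. Treat it as an illustration under assumptions, not as the query behind the examples below: the function name and the parameters fg_item_id and fg_qid_property are placeholders, because the Factgrid item ID and the ID of Factgrid's property holding the Wikidata QID have to be looked up in Factgrid first. The DBpedia part is left out to keep the sketch short.

```python
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_federated_query(fg_item_id: str, fg_qid_property: str, wd_property: str) -> str:
    """Return SPARQL text for the Factgrid endpoint that also asks Wikidata.

    fg_item_id and fg_qid_property are placeholders (assumptions): look up
    the real Factgrid IDs before running the resulting query.
    """
    return f"""\
# 1. set the prefixes
PREFIX fg: <https://database.factgrid.de/entity/>
PREFIX fgt: <https://database.factgrid.de/prop/direct/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?fg_item ?wd_item ?relations_wd ?value_wd WHERE {{
  # 2. look up a specific item in Factgrid
  VALUES ?fg_item {{ fg:{fg_item_id} }}
  # 3. read its Wikidata QID (a string) and convert it to an IRI
  ?fg_item fgt:{fg_qid_property} ?qid .
  BIND(IRI(CONCAT("http://www.wikidata.org/entity/", ?qid)) AS ?wd_item)
  # 4. define the properties of interest
  VALUES ?relations_wd {{ wdt:{wd_property} }}
  # 5. get the resulting value from Wikidata
  SERVICE <{WIKIDATA_ENDPOINT}> {{
    ?wd_item ?relations_wd ?value_wd .
  }}
}}
"""
```

Calling build_federated_query with a Factgrid item ID, the Factgrid property ID for the Wikidata QID, and P451 (Wikidata's property for unmarried partners) returns text that could be pasted into the Factgrid query interface. A DBpedia lookup would presumably follow the same pattern, with a second SERVICE block and the owl:sameAs step from the list.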

Examples for federated queries

Magnus Hirschfeld's partners

Time for an example. Let’s search for unmarried partners of the famous German sexologist Magnus Hirschfeld in Factgrid, Wikidata and DBpedia. The query behind this link looks up the specific properties for unmarried partners in all three sources and delivers the name as well as an image. The technical parts are explained in comments.

As you can see, the result has three rows. One is empty: it belongs to the first data source, Factgrid, which has nothing to offer for this request. The second row comes from DBpedia, as its _db columns are filled, and the first comes from Wikidata (_wd).

So there are two partners in those three data sources: Karl Giese in DBpedia and Li Shiu Tong in Wikidata. Both are correct, but neither Wikidata nor DBpedia has the complete information. Only by combining the sources do we get the full picture.

A note on property labels: I don’t know yet how to include property labels from both sources (Factgrid and Wikidata), because both rely on the PREFIX wikibase: <http://wikiba.se/ontology#>.

Get all persons that are mentioned in a Factgrid item’s Wikipedia article and show their image

As before, the example is Magnus Hirschfeld.

Link to the query

A tool to make mass comparisons

Relying on these query mechanisms, I built a tool to compare Factgrid and Wikidata statements in bulk. It makes use of Factgrid’s P343, which provides the mapping between Factgrid and Wikidata properties. In its first iteration, the app only works for properties relating to people.

The goal is to find relations that are missing in either Factgrid or Wikidata and to automatically create triples that can be imported into Wikidata or Factgrid, including a source and timestamp.

For example, Factgrid’s property for unmarried partners is P117, and its corresponding Wikidata property, P451, is entered on Factgrid. This lets us search for all Factgrid items that have a Wikidata QID and compare the partners on Factgrid with the partners on Wikidata.
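
The comparison itself boils down to set operations. Here is a minimal sketch, assuming each statement has already been reduced to a pair of Wikidata QIDs (person, partner) so that both sources are directly comparable. The function name and the IDs in the usage lines are made up for illustration and say nothing about how the app is implemented.

```python
Pair = tuple[str, str]  # (person QID, partner QID), both as Wikidata QIDs


def compare_statements(factgrid: set[Pair], wikidata: set[Pair]) -> dict[str, set[Pair]]:
    """Split statements into those found in both sources or in only one."""
    return {
        "both": factgrid & wikidata,
        "only_factgrid": factgrid - wikidata,
        "only_wikidata": wikidata - factgrid,
    }


# Illustrative IDs only, not real items.
fg_pairs = {("Q1", "Q2"), ("Q3", "Q4")}
wd_pairs = {("Q1", "Q2"), ("Q5", "Q6")}
result = compare_statements(fg_pairs, wd_pairs)
```

The three keys correspond to the three tables discussed in this section: statements in both sources, statements only in Factgrid, and statements only in Wikidata.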

As of today, June 27, 2022, there are six relations that are in both Factgrid and Wikidata. For example, the fact that Lida Gustava Heymann was Anita Augspurg’s partner can be found on Factgrid and Wikidata.

Seven statements are only in Factgrid, but not in Wikidata, e.g. that Amalie Zephyrine von Salm-Kyrburg’s partner was Alexandre François Marie de Beauharnais. Since both Salm-Kyrburg and de Beauharnais have a Wikidata QID in Factgrid, we can automatically build the import statements for Wikidata’s QuickStatements tool. You get those import statements when you click “Download data for Wikidata import”. Copy the content of the downloaded file and paste it into the QuickStatements tool. Besides the statement, it includes a source and timestamp.
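
To give an idea of what such an import statement can look like, here is a small sketch that builds one line in the tab-separated QuickStatements format. Which reference properties the downloaded file actually uses is not stated here, so the choice of P854 (reference URL) and P813 (retrieved) is an assumption, as are the function name and the IDs in the usage lines.

```python
from datetime import date


def quickstatement_line(subject_qid: str, property_id: str, value_qid: str,
                        source_url: str, retrieved: date) -> str:
    """Build one tab-separated QuickStatements line with a source and a date.

    Reference properties are written with an S prefix instead of P.
    S854 and S813 are assumptions about which references are wanted.
    """
    # Day precision is expressed by the /11 suffix.
    timestamp = f"+{retrieved.isoformat()}T00:00:00Z/11"
    fields = [
        subject_qid,
        property_id,
        value_qid,
        "S854", f'"{source_url}"',  # string values go in double quotes
        "S813", timestamp,
    ]
    return "\t".join(fields)


# Illustrative IDs only, not the real items.
line = quickstatement_line(
    "Q1", "P451", "Q2",
    "https://database.factgrid.de/entity/Q3",
    date(2022, 6, 27),
)
```

One such line per missing statement, joined with line breaks, gives a block of text that can be pasted into the tool.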

It works the other way, too. Wikidata has three unmarried partnerships of Factgrid items that Factgrid has not covered yet. If you want those statements in Factgrid as well, download the file and import it with Factgrid’s QuickStatements tool.

But this only works for items that already exist in Factgrid. You can recognize them by the column value, which then reads Factgrid, Wikidata. If it reads only Wikidata, the statement cannot be imported right away, because the item so far exists only in Wikidata. This is the case for Anne Louise Germaine de Staël’s partner Louis Marie de Narbonne-Lara in row 1. So even though the tool detects three partnerships in Wikidata that are missing in Factgrid, only two can be imported quickly.
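
If you post-process the table yourself, this check is easy to automate. The sketch below assumes each row is a dictionary with a value key holding the text of that column. The row structure, the partner key and the function name are assumptions for illustration.

```python
def split_by_importability(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate rows whose partner already exists in Factgrid from the rest."""
    importable = [row for row in rows if row["value"] == "Factgrid, Wikidata"]
    not_yet = [row for row in rows if row["value"] == "Wikidata"]
    return importable, not_yet


# Made-up rows mirroring the situation described above.
rows = [
    {"partner": "A", "value": "Wikidata"},
    {"partner": "B", "value": "Factgrid, Wikidata"},
    {"partner": "C", "value": "Factgrid, Wikidata"},
]
importable, not_yet = split_by_importability(rows)
```

Rows in the second list first need a new Factgrid item for the partner before the statement can be imported.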

Unlike the tables showing the statements in both data sources and the statements only in Factgrid, this table has no labels, just links for the values. It would be possible to fetch the labels from Wikidata, but loading them would take quite some time, so they are left out in favor of loading speed.

You might only be interested in a certain subset of Factgrid items. In that case, you can add a filter in the text field in the left sidebar.

Try it yourself

The link to the app is apps.katharinabrunner.de/compare-factgrid-wikidata/. Try it yourself, I am looking forward to your feedback!

Soon, I want to extend the app so that it also works for all other properties that allow a comparison between Factgrid and Wikidata.

Want to know more about federated queries?

The technical documentation of Wikidata is excellent and offers many, many examples. A starting point for federated queries can be found here.
