The (sobering) status report of Friday, 13 April 2018

[A version of this was originally posted here]

[Postscript Friday, 4 May 2018: SPARQL is on, and we are in the middle of our first more massive data input]

Four months have passed since the kick-off workshop, and the FactGrid project has run into its first unexpected problems. We are confident that we will solve the – primarily technical – issues, but one of the lessons we have learned so far is that we will need the support of a larger community in order to give the FactGrid project more impact within the Wikidata community.

What do we want to achieve? We are still trying to launch a Wikibase installation with the aim of offering a platform for original research. Data hosted on the FactGrid will be free to be used by Wikidata. Data will leave the FactGrid database, however, with the personal authorisation of researchers, which Wikidata is not able to generate.

Digital humanities projects interested in working on the collective FactGrid platform will sponsor software development with their respective DH funding. The cooperation with Wikimedia should make sure that tools sponsored by us will become part of the wider Wikibase software package. We want to prevent island solutions.

What kinds of problems have we been facing? And where do we need you?

Problem 1: The software is free but the vital tools do not work outside the Wikidata environment.

Wikibase is – relatively – easy to install, but the central tools you need in order to get data into and out of the database – QuickStatements and SPARQL – proved to be hard-wired to the original Wikidata environment. Lucas Werkmeister has managed to free QuickStatements from these ties. SPARQL remains on his agenda. We have no idea how tools that use SPARQL (in order to create visualisations, for instance) will work once we have the independent SPARQL version. The software problems have blown our entire schedule.

Problem 2: Getting the first sets of data into the FactGrid.

We have four larger spreadsheets of data from the Gotha Illuminati project which we want to use in order to create an attractive showcase.

Our data sets are intriguing and able to attract wider public interest without further advertisement. They should create steam for the engine if we manage to convince all the parties involved (Freemasons, Berlin State Archive, Gotha Research Centre and the Wikimedia community) to risk a crowdsourced identification of the roughly 6,000 digitised documents which we have been gathering over the last four years. We have underestimated, however, the problems that an empty database (a database without any properties and any primary items) is causing.

If you feel at home with QuickStatements and if you think an empty Wikibase installation is just the free space you have been dreaming of, join the team and help us learn how we can use our data with this brilliant software.
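To give an idea of what moving a spreadsheet into an empty Wikibase could look like, here is a minimal sketch that turns CSV rows into QuickStatements (version 1, tab-separated) commands. The column names, the sample row and the property IDs `P1` and `P2` are assumptions made for illustration only; in a fresh installation the properties would have to be created first and their real IDs substituted.

```python
import csv
import io

# Hypothetical mapping from spreadsheet columns to property IDs.
# An empty Wikibase has no properties yet, so these IDs are placeholders.
PROPERTY_FOR_COLUMN = {
    "instance_of": "P1",  # assumed ID for "instance of"
    "birth_year": "P2",   # assumed ID for "date of birth"
}


def quote(text):
    """Wrap a label in double quotes, replacing inner double quotes."""
    return '"' + text.replace('"', "'") + '"'


def year_to_time(year):
    """Format a year as a QuickStatements time value with year precision (/9)."""
    return f"+{int(year):04d}-00-00T00:00:00Z/9"


def row_to_commands(row, language="en"):
    """Build the commands that create one item from one spreadsheet row."""
    lines = ["CREATE", "\t".join(["LAST", "L" + language, quote(row["name"])])]
    if row.get("instance_of"):
        prop = PROPERTY_FOR_COLUMN["instance_of"]
        lines.append("\t".join(["LAST", prop, row["instance_of"]]))
    if row.get("birth_year"):
        prop = PROPERTY_FOR_COLUMN["birth_year"]
        lines.append("\t".join(["LAST", prop, year_to_time(row["birth_year"])]))
    return lines


def sheet_to_quickstatements(csv_text):
    """Convert a whole CSV sheet into one block of QuickStatements commands."""
    reader = csv.DictReader(io.StringIO(csv_text))
    commands = []
    for row in reader:
        commands.extend(row_to_commands(row))
    return "\n".join(commands)


# Made-up sample data; "Q7" stands in for whatever item means "human" locally.
sample = "name,instance_of,birth_year\nExample Person,Q7,1748\n"
print(sheet_to_quickstatements(sample))
```

The resulting text block would be pasted into QuickStatements. Rows with an empty cell simply produce no statement for that column, so incomplete spreadsheets do not break the batch.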

Problem 3: We will need a more massive Wikidata and/or GND input.

We will need our own landscape of information, ready to be improved, if we want to attract other projects of historical research and regular internet users (with a wider genealogical project, for instance). A strategist is needed here, someone with ideas about how we could (for instance) acquire all the names of people with birth dates between 1400 and 1800 from Wikidata and/or the GND for our database. (To keep the database clean we might focus on basic data like names, birth dates, places of birth and death, and family connections.) Wikibase fans who feel they could organise such an import, feel inspired! We would offer you all the freedom of the experiment you would ask for.
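As a starting point for discussion, the following sketch shows how such a selection could be phrased as SPARQL queries against Wikidata. It uses Wikidata's own identifiers (`Q5` human, `P569` date of birth, `P570` date of death, `P19` and `P20` places of birth and death). Everything else is an assumption: the function names are made up, and splitting the period into smaller year windows is only a guess at how to keep each query small enough for the public endpoint, which may well time out on very broad requests. A dump-based import might turn out to be the more realistic route.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def year_windows(first_year, last_year, step):
    """Yield (start, end) year pairs covering first_year..last_year inclusive."""
    start = first_year
    while start <= last_year:
        end = min(start + step - 1, last_year)
        yield start, end
        start = end + 1


def build_person_query(first_year, last_year, limit=1000):
    """Build a SPARQL query for humans born within the given years."""
    if first_year > last_year:
        raise ValueError("first_year must not be later than last_year")
    return f"""
SELECT ?person ?personLabel ?birth ?birthPlace ?death ?deathPlace WHERE {{
  ?person wdt:P31 wd:Q5 ;
          wdt:P569 ?birth .
  FILTER(YEAR(?birth) >= {first_year} && YEAR(?birth) <= {last_year})
  OPTIONAL {{ ?person wdt:P19 ?birthPlace . }}
  OPTIONAL {{ ?person wdt:P570 ?death . }}
  OPTIONAL {{ ?person wdt:P20 ?deathPlace . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


def request_url(query):
    """Return the GET URL that asks the endpoint for JSON results."""
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# One query per 50-year window; nothing is sent over the network here.
queries = [build_person_query(a, b) for a, b in year_windows(1400, 1800, 50)]
print(len(queries), "queries prepared")
print(queries[0])
```

Family connections are left out of the sketch; they would need further properties and a decision on how relatives outside the 1400 to 1800 range should be handled.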

Problem 4: We will need something like forms that users can fill in to create standardised CVs with the Wikibase software.

Adrian Heine has taken the first steps on this project. Our aim is a Wikibase environment which normal people can interact with as they have been interacting with the Wikipedia software so far. You pick a person of interest and you get a questionnaire with modules (on places and addresses where the person has lived, on employments he or she has held, on the person’s genealogy, on personal contacts we can prove). Wikibase is presently not exactly ready to be edited by normal people.

Problem 5 (a project for the future): Wikibase needs something like a standard Wikibase-Interpreter.

Magnus Manske’s Reasonator has been the cool thing in all my presentations of the Wikibase software in DH circles. You can pick your language and you get an organised data sheet.

Things get difficult if you want to correct or augment the Reasonator’s information sheet, and things get even more difficult if you want to run the Reasonator on your own platform. The development of an immediate interface that produces smooth pages of structured information will be necessary in order to motivate people to gather information for Wikidata (or any affiliate). The Wikibase-Interpreter would be ready to offer the complete knowledge on any field of interest. It would be ready to list all the places a person is known to have visited and all the contacts he or she is known to have had, whether face to face or through letters. Think of the thousands of contacts in Leibniz’s correspondence – a problem to be solved with pages that give an overview and offer “more” at the user’s particular request. So far, the Reasonator does not read Wikidata directly, nor is it part of the Wikimedia software development. Wikidata will need its own Interpreter in order to become an independent source of information – an independent source that also serves all the Wikipedias around the globe.

We need to change the way we are organising all this

We were able to offer a couple of grants in 2017 in order to get the project going. We should be able to use the further funding of DH projects interested in the software and the collaborative platform to inspire, if not to fully finance, future tools. DH projects will, however, only risk a cooperation with the FactGrid project and with Wikimedia as the software and data partner if we manage to offer an attractive showcase of what can be done. The Illuminati files are an immensely cool project to begin with. The global interest in these files is huge: everyone has heard of the Illuminati, and here you get their most secret files. Visualisations of networks and of the geographical spread of the secret order will find a good test case here. If we should be able to inspire a crowdsourced annotation of all the known documents, that would stir up global press attention.

We are, however, still far from having a showcase that we could present anywhere at this moment.