We are proud to announce a new and massive Wikibase project that should keep a large community busy for far more than a year. Last month the president of the University of Erfurt, Prof. Dr. Walter Bauer-Wabnegg, and Dr. Elisabeth Niggemann, director-general of the German National Library in Frankfurt and Leipzig (DNB), signed a memorandum of understanding that aims to bring GND data into the FactGrid – on a grand scale.
The GND, the German Integrated Authority File, is an authority file of millions of persons plus corporate bodies, conferences and events, geographic information, topics and works – designed to shape the exchange between libraries, archives and academic projects in the DACH countries of Germany, Austria and Switzerland.
Integrating the GND into the FactGrid has been a constant topic of discussion over the past year. A Wikibase instance becomes a cool thing to contribute to as soon as it becomes the research tool that you would use in your own research. GND data link into the world of open data; they clarify who or what you are speaking of in all German-language contexts – and they reach out to the other global authority files and to the universe of library data.
In April 2018 it became clearer that the FactGrid would eventually be one of several Wikibase instances, which could and should aim for a larger federation. Early in June it transpired that the German National Library was about to test Wikibase in a software evaluation, with the aim of possibly running about ten Wikibase instances in constant exchange with each other. That was when we contacted the DNB with our own agenda of importing their data. Our proposal was that the FactGrid could become a platform for “original research” – a platform without GND or Wikidata criteria of notability – in the evolving network of Wikibase platforms. Users will be allowed to create Q-Numbers on FactGrid for infants who died right after birth, and the GND and Wikidata will be free to decide, under their own criteria of notability and relevance, whether they would like to use our information – information they can now quote as original research from the FactGrid platform (with detailed information on the projects behind this research).
Whilst the GND is CC0 and free to be copied, the open joint venture with the German National Library aims to bring transparency into the data input. The more transparency we can bring into all the design decisions at this early stage, the better the Wikibase platforms we are heading towards will eventually be able to communicate with each other.
Now a team has to be formed. The German National Library and the Gotha research institutions of the University of Erfurt will send members into the team. The question is: Will we be able to broaden this team? We should have experts from the Wikimedia communities on board – people who know Wikibase and Wikidata, people who are used to community work on a regular wiki.
- We would like to attract people who know how to formulate SPARQL searches and who will be able to test data models and make suggestions for the improved data models we should use, in order to handle the massive data sets we are expecting.
- We’re looking for Wikibase experts who know how to bring tens of millions of records into a Wikibase installation, and who know how to interconnect these records with genealogical and geographical links.
- We do not yet know how we will keep the FactGrid manageable in the face of the wave of duplicates and name parallels we are facing: the GND has these name parallels in unprecedented numbers. We will have to find ways to quickly tell researchers whether a person they have found in a document is already on the FactGrid or whether they will have to create the item. The hunt for items to be merged will become a permanent issue, and we do not yet know how to technically support a community on this collective quest.
- We will create new and complex fields of expertise: millions of personal data sets will come with career statements. The FactGrid will turn all these statements into Q-Items, which we will have to organise in order to allow, for instance, sociological searches. The FactGrid project on historical jobs and their evolution will be only one of these projects.
- We need players with Wikipedia experience: though we will restrict ourselves to real-name accounts, we invite users with ambitions ranging from the professional to the private to join the platform with their projects – whether they are focused on private genealogy or on publicly funded historical research.
- We will have to provide a simplified FactGrid user interface that will bypass the SPARQL QueryService and the mushrooming Wikibase input pages. Magnus Manske’s Reasonator might become our standard interface for regular users, who will access the FactGrid as if they were accessing library catalogues – through organised input forms.
- We will eventually need help with database maintenance. It is particularly unfortunate that our project is primarily the work of historians, who do not always have a keen eye for how best to run this technology.
The FactGrid will grow – and it will offer plenty of space for people to develop their own projects within this growth.
Scan of the Memorandum of Understanding (in German)
- Barbara Fischer & Jens Ohlig, “Neues Testfeld für Wikibase: Eine Bundesbehörde geht auf Expedition im Wikiversum.” 2019-05-09 at https://blog.wikimedia.de