The data set in basic queries:
The links above give access to our first controlled vocabulary on FactGrid: “The FactGrid vocabulary of types of functional texts.”
Types of functional texts are not a matter of course. The corresponding genres of literary texts are, with all the problems discussed in the literary debate, far better known. The genres of functional texts are less controversial, but also less comprehensive. You find them in any mass of public records in the form of “insurance policies,” “interrogation protocols,” and “school reports,” to the odd “delousing certificate.” They do their jobs – so why collect the terms?
The technical answer is that Wikibase is software that invites you to use very specific vocabularies. You can run SPARQL queries on these specific terms and they will retrieve the “delousing certificate” in the mass of data, and you can, with very simple switches, bring far broader fields into view. The reduction to fields is the first step into statistics. If you can bring the variety under broader headings you can get a quick view of any vast production in your table. The vocabularies you need for this purpose have to be more than just lists of words. They need categorisations, common denominators above the words, ontologies.
A linguist’s perspective and data model
Our “Vocabulary of types of functional texts” has the structural depth required for statistical analysis and for the broader analysis of larger bodies of texts. So far it is based primarily on the two Properties P894 “Eckard Rolf class of functional text types” and P912 “Speech act qualities.” Under both properties the analysis is based on Eckard Rolf’s Die Funktionen der Gebrauchstextsorten (Berlin/New York, 1993). Tobias Christ asked for the import of this vocabulary and its inherent structure for a project on functional texts of Germany’s Nazi era. He will explore handbooks for the organisers of Hitler Youth camps, official directives on the insignia of uniforms, etc. Rolf’s book is immensely practical with its in-depth analysis of 2055 terms, arranged here in the five branches of illocutionary acts. Types of functional texts, under this premise, are essentially illocutionary speech acts as proposed by Austin and Searle in the 1950s and 1960s in their five branches:
- assertives = speech acts that commit a speaker to the truth of the expressed proposition,
- directives = speech acts that are to cause the hearer to take a particular action, e.g. requests, commands and advice,
- commissives = speech acts that commit a speaker to some future action, e.g. promises and oaths,
- expressives = speech acts that express the speaker’s attitudes and emotions towards the proposition, e.g. congratulations, excuses and thanks,
- declarations = speech acts that change reality in accordance with the proposition of the declaration, e.g. baptisms, pronouncing someone guilty or pronouncing someone husband and wife
…according to the Wikipedia article on illocutionary acts. Rolf deployed three further layers underneath this basic differentiation: two layers of more or less specific options on how the respective aims can be achieved, and a fourth layer of situational conditions.
The “delousing certificate” is under this matrix a “declarative statement” (a person is “declared” to be free of lice after the required treatment). The certificate will add a “personal dimension” to the bearer of the certificate: he or she will be free again to interact with others with the legitimation of the certificate. The statement is finally “body related.” Rolf created 100 groups under these four layers. The “delousing certificate” is in group “DECLA 12” together with the “allergy passport” and the “vaccination certificate.”
Other types of texts do different things differently: A “doctoral thesis” (ASS 24) is an “assertive”: it commits the speaker to the truth of his or her exploration. The work is supposed to be “descriptive” and “argumentative.” The author will hand in this work with the “intention to gain a specific qualification.” Neighbouring types of texts such as the “book review” share some but not all features: Book reviews are again “assertives” and “descriptive,” but their authors do not intend to gain a specific qualification with them. Their focus lies on the “judgment” they pass.
The following search gives the entire vocabulary in the four languages that are presently fully supported with Eckard Rolf’s primary classes and their basic categorisation:
- Types of functional texts, generic terms in German, English, French, Spanish with Eckard Rolf’s bottom-line classification https://tinyurl.com/27ctrfam
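For readers who would rather build such a search themselves, here is a minimal sketch in Python that assembles a comparable SPARQL query as a string. It is not the script behind the short link above. The URI prefixes (`fg:`, `fgt:`) and the variable names are assumptions that should be checked against the FactGrid query service; only the Property number P894 and the four languages come from the text.

```python
# Assumed FactGrid URI prefixes; verify them in the FactGrid query service.
PREFIXES = (
    "PREFIX fg: <https://database.factgrid.de/entity/>\n"
    "PREFIX fgt: <https://database.factgrid.de/prop/direct/>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
)


def build_vocabulary_query(languages=("de", "en", "fr", "es"),
                           class_property="P894"):
    """Return a SPARQL query that lists every item carrying a Rolf class
    (P894 by default), with one optional label column per language."""
    select_vars = " ".join(f"?label_{lang}" for lang in languages)
    optional_labels = "\n".join(
        f"  OPTIONAL {{ ?item rdfs:label ?label_{lang} . "
        f'FILTER(LANG(?label_{lang}) = "{lang}") }}'
        for lang in languages
    )
    return (
        f"{PREFIXES}"
        f"SELECT ?item ?rolfClass {select_vars} WHERE {{\n"
        f"  ?item fgt:{class_property} ?rolfClass .\n"
        f"{optional_labels}\n"
        f"}}\n"
        f"ORDER BY ?rolfClass"
    )


query = build_vocabulary_query()
print(query)
```

The printed query can be pasted into the query editor. Labels are fetched with `OPTIONAL` so that an item that still lacks, say, a Spanish label is not dropped from the result.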
The actual set of words is – especially on its German side – larger than the set of items. Rolf had separated terms like “Jagdschein” and “Jagdkarte” – in this case to have the German and the Austrian terms. In English both things are “hunting permits” unless we decide to offer individual hunting permits all around the world. In other cases the differences were stylistic, created by registers that could not be reproduced in English, French, or Spanish. We eventually reduced the set to objects of essentially the same meaning. About 200 words are now variants in the alias sections and on the P34 “naming” Property where they can attract explanations of their proper use.
Eckard Rolf presented his original 2055 terms together with a series of structural overviews. Tobias Christ offered a condensed PDF version of these visualisations in a single tree structure. You can use the EntiTree App to show this structure on your screen:
Any of the nodes in the representation can determine a specific query of the terms at the end of the ensuing ramification. This is the complete list of speech act qualities searchable on the P912 Property of “Qualities of speech acts”:
- All FactGrid “Qualities of speech acts” https://tinyurl.com/25uf6h4d
It is just as easy to generate statistics on each structural level. Here is the visualisation of the top level in a bar chart:
The following four searches give the scripts for each level (the level difference is determined with the Q-Item in line 8):
- The five illocutionary purposes of speech acts Q538467
- Division by general way to achieve the purpose Q538468
- Division by specific way to achieve the purpose Q538469
- Division by primary conditions of speech acts Q538470
The searches above cover the entire terminological set so they can now be run on any specific body of texts.
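The counting step behind such statistics can be sketched in a few lines of Python. The four terms and their group codes are the examples named earlier in this post. The sketch assumes that the first word of a group code (“DECLA”, “ASS”) identifies the top-level branch; a real run would use the full result of one of the level queries instead of this short list.

```python
from collections import Counter

# Example terms and group codes as quoted in the text; illustrative only.
TERMS = [
    ("delousing certificate", "DECLA 12"),
    ("allergy passport", "DECLA 12"),
    ("vaccination certificate", "DECLA 12"),
    ("doctoral thesis", "ASS 24"),
]


def top_level_branch(group_code):
    """Return the branch abbreviation of a group code,
    e.g. 'DECLA' for 'DECLA 12' (assumed naming scheme)."""
    return group_code.split()[0]


def count_by_branch(terms):
    """Count how many terms fall under each top-level branch."""
    return Counter(top_level_branch(code) for _, code in terms)


counts = count_by_branch(TERMS)
for branch, n in counts.most_common():
    print(f"{branch}: {n}")
```

Replacing `TERMS` with the types of text found in a specific archive or collection gives the same tally for that body of texts.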
The open tool
The FactGrid database version of Eckard Rolf’s structural analysis should turn the book’s considerations into an immensely practical tool ready to download into any other software environment and ready to be expanded on FactGrid. New types will not compromise the original set – it remains intact through the statement P124+Q514322 (“listed in Eckard Rolf, Die Funktionen der Gebrauchstextsorten”) that is made on every individual word:
- The 2055 German generic categories Eckard Rolf listed in 1993 https://tinyurl.com/2elvdm7u
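The marker statement can also be used as a filter in your own queries. The following Python sketch builds a SPARQL query that counts the items carrying P124 with the value Q514322. The URI prefixes are assumptions to be checked against the FactGrid query service, and the query is not the one behind the short link above.

```python
def build_rolf_set_query(listing_property="P124", rolf_list_item="Q514322"):
    """Return a SPARQL query counting the items that carry the statement
    'listed in Eckard Rolf, Die Funktionen der Gebrauchstextsorten'."""
    return (
        # Assumed FactGrid URI prefixes; verify before use.
        "PREFIX fg: <https://database.factgrid.de/entity/>\n"
        "PREFIX fgt: <https://database.factgrid.de/prop/direct/>\n"
        "SELECT (COUNT(DISTINCT ?item) AS ?terms) WHERE {\n"
        f"  ?item fgt:{listing_property} fg:{rolf_list_item} .\n"
        "}"
    )


rolf_query = build_rolf_set_query()
print(rolf_query)
```

Items added later simply lack this statement, so they never appear in the result and the original set stays separable.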
The best way to add a new term is to find neighbouring terms and to adopt their statements. FactGrid already had a couple of candidates, such as the popular “Briefsteller” (the “letter writer’s guide”) of the German 18th century, or the “Quibus Licet” (the letter which Illuminati had to hand in every month to stay in contact with the “unknown superiors”).
Even with these preliminary remarks, our first “controlled vocabulary” is still very much an experiment:
- It will be interesting to offer the generic terms also as “Lexemes.” Wikibase Lexemes are special entities that organise individual words in their languages.
- The French and Spanish labels in particular are still very artificial translations – we should have original terms for each of these items as referenced in historical documents.
- The present vocabulary is not yet matched with external databases. Wikidata and the GND are the two most urgent data partners here. The following search gives the matching so far: https://tinyurl.com/284jbjhc
- The linguist’s categorisation should be seen as one option to make sense of all these terms. One can easily think of other qualities of speech acts. In our preliminary talk Rolf proposed to explore, for instance, the assumed-sincerity dimension in many of these speech acts. “Lip service” was his example – a speech act where the “honesty of the emittent is unclear or doubtful.” One can just as well create completely independent properties on features of genres beyond the linguist’s interest.
- We should eventually expand this work. A vocabulary of genres in all the arts and literature would be of interest here. One would balance such a vocabulary with a particular vocabulary of “historical generic terms.” (I remember, I once wrote a 700-page book with a plea to explore these terminologies in all their “deficiencies.” The deficiencies, so I proposed back then, were usually the first indications that people were not doing the things we are doing with works of “art” and “literature” in our debates. We might question our keenness on succinct definitions in these particular fields, so I thought ages ago; we do not really define words in order to settle debates, we are always far more interested in the destabilisation the definition will actually produce – but that is already a topic for a very different blog post.)
Published as part of the NFDI4Memory Task Area “Data Connectivity”, Historical Data Center Halle, project number 501609550.