Online digital sources in international economic history: a frustrated view
The post that follows is only a snapshot of some reflections I presented at the recent Digital Humanities Luxembourg (DHLU) conference on 5 December 2013. I picked a couple of examples to illustrate my personal research experience and, hopefully, to convey my line of thought on the use of digital sources in international economic history – a frustrated experience, but clearly not a depressed one!
The online digital sources available to the historian are constantly increasing, and represent formidable research opportunities and challenges. Ian Milligan and Frédéric Clavert have most recently reflected on their uses, misuses and biases. For an international economic historian who, like me, has worked on topics more specifically linked to European integration, these online digital sources are many: the Margaret Thatcher Archive, the European Economic Community’s (EEC) Committee of Governors’ papers on the European Central Bank’s website, the papers of François-Xavier Ortoli (a former president, and then vice-president, of the EEC Commission), the Gerald R. Ford Presidential Digital Library – to name but a few.
These sources are not just numerous; they are also of very high quality. All Prime Minister’s (PREM) files are online on the Margaret Thatcher Archive’s website, which makes any archival research focused on her premiership considerably easier. The agendas and, most importantly, the full records of the EEC Committee of Governors’ sessions are available on the European Central Bank’s (ECB) website – these were among the most important sources for my research on the creation of the European Monetary System (EMS). The Ford Digital Library comprises a systematic digitisation of presidential correspondence with foreign leaders and of memoranda of conversations linked to foreign affairs, which is a fantastic resource for any international historian.
But if such online digital sources are numerous and of high quality, the use I could make of them was in fact limited – hence my frustration. Two examples can help make sense of this paradoxical situation. When writing A Europe Made of Money, I was looking for precedents to the EMS. To cut a long story short: the standard way of presenting the EMS’ creation is that it was a bold, unprecedented step. I argue the opposite: many plans proposed before the EMS negotiations suggested similar ideas, and the EMS is less ambitious than it seems at first sight.
How did I carry out my research in the archives and come to this conclusion? (Regardless of whether the documents were digitised and available online, or only accessible in dusty and remote archives.) I was not searching for a specific name or plan, as by definition I could not know what had or had not been discussed in advance; I could not guess what my conclusions would be before having actually seen the primary sources. I had instead to go through thousands of pages of documents (memoranda, records of meetings, letters) and hope to find some grist for my mill – or not. As a consequence, a digital record was not necessarily helpful. It could make my research easier in that it spared me a long and expensive trip to an archive; but it did not spare me the need to do a close reading of the documents – as opposed to the “distant reading” that can be performed with digital tools – in order to find what I could not guess in advance.
Duisenberg and OCR quality
Where digitised documents could be helpful was when I knew the name of the specific piece of information I was looking for. To take but one example: one of the plans that bears some resemblance to one of the EMS’ central features (for those interested in the EMS’ mechanics, I’m thinking of the so-called “divergence indicator”) is called the Duisenberg memorandum. The then Dutch finance minister Wim Duisenberg – who is best known for later becoming the first ECB president – put this plan forward in 1976. Once I had discovered the existence of this memorandum, and realised its links with the EMS, I could then perform systematic searches in the documents that were in digital format in order to find – or not – an occurrence in the text.
The quality of the Optical Character Recognition (OCR) was hence key. If letters are missed or wrongly recognised, crucial information can be overlooked. And unfortunately the OCR’s quality is not always very high – if any OCR has been carried out at all. If, for whatever technical reason, an occurrence of “Duisenberg” has been recognised as “Duisen berg”, “D ui s e n ber g” or “Duisemb erq”, there is little hope of finding this occurrence through a standard text search in which I would neatly type “Duisenberg”. This brought me back to the classic “close reading.”
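One way to make a text search more forgiving of such OCR damage is approximate matching. The following is only a minimal sketch, not a tool I used for the research described here: the function name `fuzzy_find` and the 0.75 threshold are hypothetical choices, and the matching relies on Python's standard `difflib` module. It ignores whitespace and tolerates a few misread letters, at the price of returning only an approximate position, so each hit still has to be checked by eye.

```python
from difflib import SequenceMatcher


def fuzzy_find(text, target, threshold=0.75, context=15):
    """Return (score, snippet) pairs for passages of text that resemble target.

    Whitespace is ignored, so "Duisen berg" and "D ui s e n ber g" are both
    compared as "duisenberg". The threshold is an arbitrary, hypothetical
    default: lower values find more damaged words but add false positives.
    """
    # Non-space characters of the text, and where each one sits in the original.
    positions = [i for i, ch in enumerate(text) if not ch.isspace()]
    squeezed = [text[i].lower() for i in positions]
    needle = [ch.lower() for ch in target if not ch.isspace()]
    width = len(needle)
    if width == 0:
        return []

    # Similarity of every window of len(needle) characters to the needle.
    scores = [
        SequenceMatcher(None, squeezed[p:p + width], needle).ratio()
        for p in range(max(len(squeezed) - width + 1, 0))
    ]

    hits = []
    p = 0
    while p < len(scores):
        if scores[p] < threshold:
            p += 1
            continue
        # Neighbouring windows overlap, so keep only the best one of each run.
        run_end = p
        while run_end < len(scores) and scores[run_end] >= threshold:
            run_end += 1
        best = max(range(p, run_end), key=scores.__getitem__)
        start = positions[best]
        end = positions[best + width - 1] + 1
        snippet = text[max(start - context, 0):end + context]
        hits.append((round(scores[best], 2), snippet))
        p = run_end
    return hits


for sample in ("plan de M. Duisen berg sur les zones",
               "M. Duisemb erq a parlé"):
    print(fuzzy_find(sample, "Duisenberg"))
```

The snippets are deliberately padded with surrounding context, because the window boundaries can be off by a character or two when several windows score equally.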
One specific example is probably better than a long speech: during the Committee of Governors’ meeting that took place on 9 November 1976 in Basle, the EEC governors discussed the ideas advanced by Duisenberg. If you run a text search in the record of the meeting, no occurrence of the word “Duisenberg” will be found. But if instead you type “zone” (rather than the full “zones-cibles”, so as to minimise the risk of a letter not being recognised), you will find five instances on two different pages, showing that the governors did discuss the Duisenberg memorandum, yet without calling it so.
I should add that while doing this search, I caught myself by surprise. I was looking at my own text (page 115 in my book), where I write that Gordon Richardson, governor of the Bank of England and president of the Committee, had made a noteworthy remark during this meeting. I wanted to locate this remark in the record and thus typed “Richardson”, but this gave no result. I was puzzled for a few moments, and started a “close reading” of the text, thinking I had made a mistake and was looking at the wrong file, until I realised that Richardson must have appeared in the text as “Le Président”, as he was presiding over the Committee… And indeed, if you type “Président”, four results come up on two different pages. Interestingly, if you type “sident” (so as to avoid the “é”, which may have been wrongly recognised), more occurrences of the actual word “Président” appear, together with unrelated phrases such as “ne résident pas”; in particular, there is one on page 5 of the record, where “Président” has been recognised as “Priisident.”
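The trick of searching for a short, "safe" fragment such as “zone” or “sident” can be sketched in a few lines. This is an illustration under stated assumptions only: the function names are hypothetical, and the three sample pages are invented toy sentences, not the actual record of the 9 November 1976 meeting. The sketch lists every word containing the fragment, page by page, and ignores accents so that a misread “é” does not hide a match.

```python
import re
import unicodedata


def strip_accents(s):
    """Remove diacritics, so that "Président" is compared as "President"."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fragment_hits(pages, fragment):
    """Return (page_number, word) for every word that contains fragment.

    pages is a list of page texts; page numbers start at 1. Returning the
    whole word makes false positives such as "résident" easy to spot.
    """
    needle = strip_accents(fragment).lower()
    hits = []
    for number, page in enumerate(pages, start=1):
        # NFC keeps accented letters as single characters, so words stay whole.
        page = unicodedata.normalize("NFC", page)
        for word in re.findall(r"\w+(?:-\w+)*", page):
            if needle in strip_accents(word).lower():
                hits.append((number, word))
    return hits


# Invented toy pages, for illustration only.
pages = [
    "Le Président ouvre la séance.",
    "Les zones-cibles sont examinées; les capitaux ne résident pas ici.",
    "Le Priisident remercie les gouverneurs.",
]
print(fragment_hits(pages, "Président"))
print(fragment_hits(pages, "sident"))
```

On these toy pages the full word finds only the cleanly recognised occurrence, while the shorter fragment also catches the damaged “Priisident” along with the unrelated “résident”, which mirrors the trade-off described above.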
Naming names
The Richardson/Président mistake I made highlights a somewhat more classic issue, namely the terminology that was used. Not everyone involved in the discussions used the exact phrase “Duisenberg memorandum.” Some talked of “Dutch proposals” or “plan”, and others of a “target zones proposal” (as creating “target zones” for currency fluctuations was one of the central features of Duisenberg’s ideas). Some policymakers even focused on the man who was behind much of the writing of the proposal, Conrad Oort, the then head of the Dutch treasury, and as a consequence talked of the “Oort proposals.” These multiple expressions made text searches considerably more complicated, especially as the OCR was poor on top of this.
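Once such variants are known, they can at least be searched for together. A minimal sketch, assuming a hand-built alias table: the names `ALIASES`, `build_pattern` and `find_concepts` are hypothetical, the variants are those quoted above (with “Dutch plan” as my reading of the shorter form), and the list can of course only contain expressions one has already come across in the close reading.

```python
import re

# Hand-built table: one concept, the expressions used for it in the sources.
ALIASES = {
    "Duisenberg memorandum": [
        "Duisenberg memorandum",
        "Duisenberg plan",
        "Dutch proposals",
        "Dutch plan",
        "target zones proposal",
        "Oort proposals",
    ],
}


def build_pattern(variants):
    """Compile one case-insensitive pattern matching any of the variants."""
    parts = []
    # Longest variants first, so that a longer expression wins over a shorter one.
    for variant in sorted(variants, key=len, reverse=True):
        # Allow any run of whitespace (including line breaks) between words.
        parts.append(r"\s+".join(re.escape(word) for word in variant.split()))
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


def find_concepts(text, aliases=ALIASES):
    """Return (concept, matched expression) pairs in order of appearance."""
    found = []
    for concept, variants in aliases.items():
        for match in build_pattern(variants).finditer(text):
            # Collapse whitespace so the reported expression is easy to read.
            found.append((concept, " ".join(match.group(0).split())))
    return found


sample = "The Dutch  proposals and the Oort proposals revive the target zones proposal."
for concept, expression in find_concepts(sample):
    print(concept, "<-", expression)
```

This handles neither OCR damage nor singular and plural forms on its own; it would have to be combined with some form of approximate matching to cope with a poorly recognised text.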
Nor was my research made easier by apparently well-defined European institutions. The centrality of the European Council – the regular meeting, three times a year, of the EEC’s heads of government – in the monetary discussions made my research quite problematic. As the European Council had only been created in 1974, not all the actors involved in the policy process used the name “European Council” right from the beginning. Some talked of an “EEC summit meeting”, others of “summit conferences”, and so on – not to mention those who mixed it up with the “Council of Ministers,” the “Council of Europe” or simply “the Council” (take your own guess at which Council they were thinking of…)! Of course, this is far from being limited to online digital sources, and is admittedly a valid observation for all types of sources, including classic paper inventories (where the situation could potentially be worse, as the inventory could still be provisional, and hence use some phrasings peculiar to the 1970s that have since lost currency).
There are many other similar examples that I could develop. The two examples presented above are in no way intended to convey a depressed experience of research using (online) digital sources. These sources made my research considerably easier than it would have been without them, thanks to more straightforward data retrieval, useful – if limited – ad hoc searches for some keywords, and in general better working conditions (digital cameras!), with everything stored on my laptop and on various hard drives and USB sticks. Yet I still felt very far from being able to make meaningful use of distant reading tools that could have allowed me to carry out a search for a “Duisenberg plan” throughout my whole corpus of documents – and, most importantly, to make use of much more sophisticated text mining software. The time when historical research in international economic relations will rely fully on online digital sources therefore seems, unfortunately, still quite remote to me. My current research on the development of international financial regulation and supervision brings me to many commercial banks’ archives, which are not (yet?) digitised, and hence the situation is in a way even “worse.” The opportunities offered by digital tools and sources are already in place, and lead me to think that the frustration I feel is merely a symptom of the transitory stage we are going through – a stage where historical sources are sometimes in digital format, sometimes in a poor digital format, and still all too often in no digital format at all. For the sake of contradicting myself, I am currently writing another blog post making full use of an online digital resource and text mining tools – hopefully I’ll be able to post it here soon.