20th Century Press Archives – history in newspaper clippings, made accessible by ZBW and Wikimedia


Before the time of the internet, newspapers were the main source of information about current events. Large institutions – publishers, banks, government agencies – maintained clipping archives to provide quick, random access to the collected information. The Hamburger Welt-Wirtschafts-Archiv, founded in 1908 and maintained until 2005, held some 19 million articles, organized in thematic folders on persons, on companies and institutions, on commodities and wares, and on countries and events/topics – the latter by far the largest of the four archives. From the outset, the archives were open to trade and industry, academic research and the general public. The thematic coverage was broad, and more than 1,500 sources from all over the world were evaluated – for tens of thousands of topics such as “Social position of women in the Ottoman Empire” or the first Channel Tunnel Company.

Folders of the persons archive (2005)

Fortunately, the ZBW – Leibniz Information Centre for Economics had, with funding from the German Research Foundation (DFG), digitized all the clippings and other material up to 1960. However, the publication of the material was slow, mainly because the intellectual property rights status of each clipping had to be checked individually. In 2019, the focus of ZBW as a research infrastructure institution for economics did not allow it to allocate resources to further indexing of the holdings of the press archives (PM20). Therefore, ZBW decided to place all metadata of the archives under a CC0 licence, and to start a cooperation with Wikimedia Deutschland. In a data donation to Wikidata, the archives’ metadata was integrated into existing or newly created Wikidata items – for example, more than 5,000 items for 20th-century companies were created, while 3,900 pre-existing ones were enriched with facts like “headquarters location”, “industry” or “board member”. By the end of 2022, all of the archives’ existing folders were connected to items in Wikidata, as were the subject and geographical classifications. In turn, Wikidata adds value to the ZBW’s PM20 website, in particular by providing a search function that makes use of synonyms, some of which were contributed as aliases during the data donation, while others were added by other Wikidata users.

Adding value in a Wikipedia project

In early 2024, changes in European intellectual property law made it possible for ZBW to publish all digitized material up to 1949 (though sadly only within the EU legal area). Very different from the well-prepared, neatly organized folders published earlier, this meant fresh access to 3.8 million digitized pages from raw microfilm. The Wikipedia Projekt Pressearchiv was set up not only to make the press archives better known to Wikipedians and to help them use the archives, but also to provide additional in-depth indexing of the material. Combined with earlier work by participants, the project has already added more than 15,000 links to the material, sometimes only at the top level of countries, wares or companies, sometimes deeper down the folder hierarchy. The project uses the Wikitech infrastructure to maintain code and the PM20 master database. In addition, it has established a workflow to return dataset enhancements to ZBW, where they are integrated into the actual PM20 website in a largely automated process.

Handwritten classification scheme of the countries and subjects archive (excerpt)

The collaboration has proven to be a win-win situation: the ZBW has been able to replace its previous highly sophisticated but hard-to-maintain web application with a much more sustainable site of static pages, interlinked with Wikidata and continually improved by volunteer community work. The Wikimedia projects gain convenient access to an enormous amount of well-organized contemporary material from the 20th century, often unavailable elsewhere. While many other press archives have already quietly disappeared, this collaboration ensures that these unique public archives will be preserved and kept open to Wikipedia authors as well as scholarly researchers and the general public.


4 Comments

I think there’s a typo above, in “founded in 2008 and maintained until 2005”. I mean, archives are time-travelling institutions, I know, but still… 🙂

Great article and a really amazing project! I just have a couple of questions.
For the Wikipedia Projekt Pressearchiv, when you say you’ve “added more than 15,000 links to the material” – are those links as statements in Wikidata? Or is it something on the PM20 side…?
Also, could you say something more about the Wikitech infrastructure you’re using? Is there somewhere we can read more about it?
Thanks!