


ISyReADeT
Integrated System for Recovery and Archiving Degraded Texts
(imaging)

Written texts, documents and manuscripts are a very important part of European cultural and historical heritage, held in national libraries, museums and private collections. The deterioration of these texts over time presents a challenge both for those who rely on the material for historical studies and for those responsible for conserving it for future generations.
Documents can be damaged in outbreaks of fire, flooding, and sometimes in deliberate acts of vandalism. Poor restoration techniques used in the past can obscure the text. Widespread problems arise from the damage to paper caused by iron gall ink, which was commonly used in the 16th and 17th centuries, and from the deterioration of poor-quality paper used in the 19th and early 20th centuries. The problem extends to government and commercial records, and its scale is such that Europe stands to lose large quantities of important documents over the next 20 to 30 years.
Modern sensitive restoration can help save damaged texts, but where restoration is not possible, multi-spectral imaging can sometimes reveal words or other features of a document which have faded, or are obscured by smoke damage or poor restoration.
Multi-spectral imaging is well known in forensic science and has been developed by project partner Art Innovation for the analysis of paintings for conservation purposes. Previous work on multi-spectral imaging of degraded texts has used experimental systems which are not always appropriate for a standardised service, for example the fibre-optic backlighting used by Kiernan, which was effective in revealing hidden letters but extremely labour-intensive.
In contrast, ISyReADeT will aim to develop a system which is suitable for large-scale routine use and is capable of extracting the hidden features from a range of common degraded texts.
Advanced image analysis techniques developed in satellite remote sensing, medical imaging and forensic science can be used to help analyse degraded texts. There have been several notable studies to date, but implementation on a wide scale has not yet been undertaken. Pioneering studies were performed by Benton, Gillespie and Soha on an ancient manuscript of Arnald of Villanova. More recently, Kiernan at the University of Kentucky collaborated with the British Library on the analysis of the Beowulf codex, which was damaged by fire in 1731.
The Electronic Beowulf is now published as two CD-ROMs by the British Library and University of Michigan Press. As well as containing digital images of the manuscript, it contains transcriptions encoded in SGML (Standard Generalised Markup Language) with tagging to enhance the searchability of the text. Prototypes of intelligent Optical Character Recognition programs specifically designed for degraded texts have also been developed.
ISyReADeT aims to build on this previous research to develop a working, commercial system which can be used on a routine basis to digitise and virtually restore degraded and damaged texts across Europe.
Partners
• T.E.A. sas di E. Console & C., Catanzaro, Italy (coordinator)
• Art Conservation BV, Vlaardingen, The Netherlands
• Quillet SA, Loix, France
• Art Innovation, The Netherlands
• Acciss Bretagne SA, Plouzane, France
• Transmedia Technology Ltd., Swansea, Wales
• Consiglio Nazionale delle Ricerche - Istituto per i Processi chimico-fisici e Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy
• École Nationale Supérieure des Télécommunications de Bretagne, Brest, France