Degraded Document Restoration

Bleed-through Removal

Sponsor: IRCSET

Reduction in legibility due to progressive degradation is often encountered in the study of documents, particularly handwritten ones. Many libraries hold large collections of manuscripts and documents that are especially vulnerable to such degradation because of the fragile media on which they were written. Physical restoration of degraded documents is costly and time-consuming, and may affect the integrity of the original. Restoration methods based on automatic image processing have therefore become increasingly popular, since they allow any number of alterations to the document's appearance while leaving the original intact.

Loss of textual information in documents may be classified into four categories:

(i) Fading of text due to light exposure or flaking ink.
(ii) Obscured or missing text due to degradation of the writing medium.
(iii) Bleed-through interference, where ink has seeped through from one side of a page to the other.
(iv) Noise artefacts introduced during digitisation, which may degrade the textual information further.

The main focus of this project is degradation caused by bleed-through interference. Details of our ground truth bleed-through database, and results of the bleed-through removal algorithm, may be found here.

