Dimosthenis Karatzas

Publications

Refereed Papers

Text Extraction from Web Images Based on A Split-and-Merge Segmentation Method Using Color Perception

D. Karatzas, A. Antonacopoulos

Proceedings of the 17th International Conference on Pattern Recognition (ICPR2004), Vol.2, Cambridge, UK, August 23-26, 2004, IEEE-CS Press, pp. 634-637

Abstract

This paper describes a complete approach to the segmentation and extraction of text from Web images for subsequent recognition, to ultimately achieve both effective indexing and presentation by non-visual means (e.g., audio). The method described here (the first in the authors’ systematic approach to exploit human colour perception) enables the extraction of text in complex situations such as in the presence of varying colour (characters and background). More precisely, in addition to using structural features, the segmentation follows a split-and-merge strategy based on the Hue-Lightness-Saturation (HLS) representation of colour as a first approximation of an anthropocentric expression of the differences in chromaticity and lightness. Character-like components are then extracted as forming textlines in a number of orientations and along curves.
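As a rough illustration of comparing colours in HLS terms, the sketch below converts two RGB colours with Python's standard `colorsys` module and checks whether their hue and lightness are close enough for two regions to be candidates for merging. It is not the paper's algorithm: the function names `hls_difference` and `similar` and both threshold values are assumptions made for this example only.

```python
import colorsys


def hls_difference(rgb_a, rgb_b):
    """Return (hue_diff, lightness_diff) for two 8-bit RGB colours.

    colorsys reports hue, lightness and saturation in the range [0, 1].
    Hue is circular, so the difference wraps around and lies in [0, 0.5].
    Note: hue carries no information for greys (saturation 0); colorsys
    reports 0.0 in that case, which this sketch does not treat specially.
    """
    h_a, l_a, _ = colorsys.rgb_to_hls(*(c / 255.0 for c in rgb_a))
    h_b, l_b, _ = colorsys.rgb_to_hls(*(c / 255.0 for c in rgb_b))
    dh = abs(h_a - h_b)
    dh = min(dh, 1.0 - dh)  # shortest distance around the hue circle
    return dh, abs(l_a - l_b)


def similar(rgb_a, rgb_b, max_hue_diff=0.05, max_lightness_diff=0.1):
    """Hypothetical merge test: both differences fall under a threshold.

    The threshold defaults are illustrative assumptions, not values
    taken from the paper.
    """
    dh, dl = hls_difference(rgb_a, rgb_b)
    return dh <= max_hue_diff and dl <= max_lightness_diff


if __name__ == "__main__":
    print(similar((255, 0, 0), (250, 10, 10)))  # two shades of red
    print(similar((255, 0, 0), (0, 0, 255)))    # red against blue
```

Handling hue as a circular quantity matters here: two reds that sit on either side of the 0/1 boundary would otherwise appear maximally different.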


