Skew Angle Estimation of Scanned Handwritten Arabic Documents Using a Time-Frequency Analysis of the Projection Histograms

Estimation de L’Inclinaison D’un Document Arabe Manuscrit Numérisé par Analyse Temps-Fréquence des Histogrammes de Projection

Nazih Ouwayed, Abdel Belaïd, François Auger

Université Nancy 2, LORIA, équipe READ, Vandoeuvre-Lès-Nancy, France

Université de Nantes, IREENA site de Saint-Nazaire, France

30 September 2009
Ancient Arabic textual archives contain a large volume of handwritten documents that need to be scanned and indexed. Some of these documents are skewed, which makes their recognition and indexing difficult, because recognition systems extract words more reliably from straight text lines. We are looking for a method that can robustly estimate this orientation, whatever the size of the document.

The scientific literature already proposes several solutions for document image skew angle estimation. Projection techniques seem the most appropriate, but they need to be adapted to Arabic documents. In Arabic script, words are made of PAWs (Parts of Arabic Words), which are almost vertical or oblique and may distort the calculation of the local orientation. This rules out local techniques such as nearest neighbors, because of the alignment irregularity, and global techniques such as the Hough transform, because of the difficulty of locating voting points. Although these techniques work well on printed documents, they remain inadequate for handwritten documents, in which the interline distance varies randomly and the skew angle can be large.

Kavallieratou et al. applied Cohen's class distributions to Latin documents. Cohen's class contains all the quadratic time-frequency distributions that are covariant under time and frequency shifts. Each member of this class is identified by a particular kernel φ_dD(τ, ξ), which determines its theoretical properties and its practical readability.
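For readers who want the kernel statement in symbols, one common way of writing Cohen's class is sketched below. The sign and normalization conventions, and the symbols $A_x$, $C_x$ and $W_x$, are assumptions made for this illustration and are not taken from the abstract; the article body may use a different but equivalent convention.

```latex
% Ambiguity function of the signal x (assumed convention)
A_x(\tau,\xi) = \int x\!\left(s+\tfrac{\tau}{2}\right) x^{*}\!\left(s-\tfrac{\tau}{2}\right) e^{-j2\pi\xi s}\, ds

% A member of Cohen's class, identified by its kernel \varphi_{dD}
C_x(t,\nu) = \iint \varphi_{dD}(\tau,\xi)\, A_x(\tau,\xi)\, e^{\,j2\pi(\xi t-\nu\tau)}\, d\tau\, d\xi

% Setting \varphi_{dD}(\tau,\xi) = 1 gives the Wigner-Ville distribution
W_x(t,\nu) = \int x\!\left(t+\tfrac{\tau}{2}\right) x^{*}\!\left(t-\tfrac{\tau}{2}\right) e^{-j2\pi\nu\tau}\, d\tau
```

With this convention, choosing a kernel other than 1 smooths the Wigner-Ville distribution. This is how the kernel trades theoretical properties against readability.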

In Kavallieratou's paper, the relationship between the properties of the distributions and the experimental results is not highlighted. In this article, we look for the properties most relevant to the skew angle estimation problem and use them to find the best distribution.

To estimate the orientation angle, we compute a time-frequency representation of the analytic signal x_a(t) of the centered square root of the projection histogram x(t) of the document. The projection angle whose histogram yields the highest maximum value of the time-frequency representation is taken as the estimate of the document orientation. To study the effectiveness of our approach, we tested it on 864 handwritten Arabic documents. These documents have different sizes and contain several types of writing, layouts (with 1 or 2 columns), mixes of text and tables, etc. For the experiments, the documents were manually rotated to different angles ranging from −75° to +90°. We found that the Wigner-Ville distribution reaches the highest estimation rate (100%). The other distributions yield a lower estimation rate for one of three reasons: they do not satisfy properties that are important for the skew angle estimation problem, such as scale invariance and support conservation; their localization of the signal components is not precise enough for the maximum of the representation to provide a skew angle estimate; or their parameters are not fitted to the analysed histogram profiles. The skew angle estimator based on the Wigner-Ville distribution is also compared with the projection analysis and Fourier transform methods.
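The estimation procedure described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`projection_histogram`, `analytic_signal`, `wigner_ville_max`, `estimate_skew`), the `n_bins` parameter, the representation of the document as a list of foreground pixel coordinates, the angle sign convention, and the unnormalized discrete Wigner-Ville formula are all assumptions made for illustration.

```python
import cmath
import math


def projection_histogram(points, angle_deg, n_bins):
    """Count foreground points per bin along the normal to direction angle_deg."""
    t = math.radians(angle_deg)
    d = [-x * math.sin(t) + y * math.cos(t) for x, y in points]
    mid = (min(d) + max(d)) / 2.0  # centre the profile in the fixed-size histogram
    hist = [0] * n_bins
    for v in d:
        b = int(round(v - mid)) + n_bins // 2
        if 0 <= b < n_bins:  # points falling outside the histogram are ignored
            hist[b] += 1
    return hist


def analytic_signal(x):
    """Analytic signal via a plain O(N^2) DFT; assumes len(x) is even."""
    n = len(x)
    spec = [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]
    for k in range(n):
        if 0 < k < n // 2:
            spec[k] *= 2   # double the positive frequencies
        elif k > n // 2:
            spec[k] = 0    # cancel the negative frequencies
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * i / n) for k in range(n)) / n
            for i in range(n)]


def wigner_ville_max(z):
    """Largest value of an unnormalized discrete Wigner-Ville distribution of z."""
    n = len(z)
    best = float("-inf")
    for t in range(n):
        m_max = min(t, n - 1 - t)  # largest lag that stays inside the signal
        lags = range(-m_max, m_max + 1)
        prods = [z[t + m] * z[t - m].conjugate() for m in lags]
        for k in range(n // 2):    # bin k corresponds to frequency k / n
            w = sum(p * cmath.exp(-4j * math.pi * k * m / n)
                    for p, m in zip(prods, lags))
            best = max(best, w.real)
    return best


def estimate_skew(points, angles_deg, n_bins=64):
    """Return the candidate angle with the highest Wigner-Ville maximum, and all scores."""
    scores = {}
    for a in angles_deg:
        hist = projection_histogram(points, a, n_bins)
        root = [math.sqrt(h) for h in hist]
        mean = sum(root) / n_bins
        z = analytic_signal([r - mean for r in root])  # centered square root
        scores[a] = wigner_ville_max(z)
    return max(scores, key=scores.get), scores
```

In this sketch `n_bins` must be even and at least as large as the extent of the projected points, and the cost grows roughly as the cube of `n_bins` per candidate angle. A practical version would use an FFT and a finer angle grid.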


In this article, we present a new method for determining the skew of a handwritten Arabic document using an energy time-frequency representation from Cohen's class. The method first computes the projection histograms obtained for different angles, then determines the maximum value of the time-frequency representation of the square root of these histograms. The orientation of the document is then estimated as the projection angle that yields the highest maximum value. The proposed method was tested on 864 skewed documents with 9 different time-frequency representations. The results are presented and analysed at the end of this article.


Handwritten documents, energy distributions, Cohen’s class, projection histograms, skew angle estimation.

Mots clés

Documents manuscrits, distributions d'énergie, classe de Cohen, histogramme de projection, estimation de l'angle d'orientation.

1. Introduction
2. Orientation Computation by Projection Histogram Analysis
3. Time-Frequency Distributions
4. Application of Time-Frequency Distributions to Orientation Estimation
5. Experimental Results and Discussion
6. Conclusion and Perspectives

