Writer Identification Using Allograph Distributions
Identification de Scripteurs Utilisant les Distributions d’Allographes
Abstract
A method is proposed to identify the writer of an unconstrained handwritten text by matching it against reference handwritten documents. The matching is based on a metric computed on the distributions of the letter allographs that characterize a writing style. An automatic system segments the text into characters and assigns each character a partial membership to the representative prototypes of the corresponding letter of the Roman alphabet. Two different datasets are used to assess this system. A writer identification rate of 99.2% is obtained when the reference dataset is composed of 120 French documents. On the other dataset, with 200 English texts, the identification rate reaches 87%. The method is developed for online handwriting.
Résumé
Ce papier propose une méthode permettant d’identifier le scripteur d’un texte quelconque de quelques lignes en le comparant à des écritures de référence. La comparaison est basée sur une mesure de mise en correspondance des distributions des allographes de lettres représentatifs des styles d’écriture. Un système automatique segmente le texte en lettres, puis classe chaque lettre de manière probabiliste parmi les prototypes disponibles pour cette lettre. Deux bases de complexité différente sont utilisées pour évaluer ce système. Un taux d’identification de 99,2 % est obtenu sur une base de recherche de 120 textes écrits en français, tandis qu’il se situe à 87 % sur une base de recherche de 200 textes écrits en anglais. Cette méthode est développée sur de l’écriture en ligne.
Keywords
Writer identification, information retrieval, online handwriting, k-nearest neighbor, allograph.
Mots clés
Identification de scripteur, recherche d’information, écriture manuscrite en-ligne, k-plus-proches-voisins, allographe.
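The matching pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state which metric or membership function is used, so everything below is an assumption made for illustration: partial memberships are obtained with a softmax over distances to prototypes, distributions are compared with a chi-square distance, and the writer is chosen by a 1-nearest-neighbor rule. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def soft_membership(distances, temperature=1.0):
    """Convert distances to a letter's prototypes into partial memberships
    that sum to 1 (assumed softmax form, not taken from the paper)."""
    weights = [math.exp(-d / temperature) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def allograph_distribution(characters, n_prototypes):
    """Build one averaged membership histogram per letter.

    characters: iterable of (letter, distances_to_prototypes) pairs,
    as a segmentation and recognition front end might produce them.
    """
    sums = defaultdict(lambda: [0.0] * n_prototypes)
    counts = defaultdict(int)
    for letter, distances in characters:
        for k, m in enumerate(soft_membership(distances)):
            sums[letter][k] += m
        counts[letter] += 1
    return {
        letter: [v / counts[letter] for v in hist]
        for letter, hist in sums.items()
    }


def chi_square(p, q, eps=1e-12):
    """Chi-square distance between two histograms (assumed metric)."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))


def document_distance(doc_a, doc_b):
    """Average chi-square distance over the letters both documents contain."""
    shared = set(doc_a) & set(doc_b)
    if not shared:
        return float("inf")
    return sum(chi_square(doc_a[l], doc_b[l]) for l in shared) / len(shared)


def identify_writer(query, references):
    """Return the reference writer whose distribution is nearest (1-NN)."""
    return min(references, key=lambda w: document_distance(query, references[w]))


# Toy data: two letters, three prototypes per letter, two reference writers.
references = {
    "alice": allograph_distribution(
        [("a", [0.1, 2.0, 3.0]), ("e", [0.2, 1.5, 2.5])], n_prototypes=3),
    "bob": allograph_distribution(
        [("a", [3.0, 0.1, 2.0]), ("e", [2.5, 2.0, 0.2])], n_prototypes=3),
}
query = allograph_distribution(
    [("a", [0.2, 1.8, 2.9]), ("e", [0.3, 1.4, 2.6])], n_prototypes=3)
best_match = identify_writer(query, references)
```

In this toy setup the query text favors the same prototypes as the hypothetical writer "alice", so the nearest-neighbor rule selects that reference. Averaging over shared letters only is a design choice of this sketch, made so that letters absent from a short text do not distort the comparison.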