Incremental Learning with Few Data for Online Handwritten Character Recognition. Apprentissage Incrémental avec Peu de Données pour la Reconnaissance de Caractères Manuscrits En-Ligne

Abdullah Almaksour, Harold Mouchère, Eric Anquetil

Laboratoire IRISA/INSA, Campus Universitaire de Beaulieu, Avenue du Général Leclerc, 35042 RENNES Cedex

Laboratoire IRCCyN, Rue Christian Pauc, BP 50609, 44306 Nantes Cedex 03

Pages: 323-338 | Received: 15 December 2009 | Accepted: N/A | Published: 31 December 2009

OPEN ACCESS

Abstract: 

The experiments are based on the recognition of the 26 isolated lowercase Latin letters. The writer-specific datasets were written on a PDA by 18 writers. Each writer entered each character 40 times in random order, i.e. 1040 characters per writer. To estimate the performance of the incremental learning strategy for each writer, we use 4-fold cross-validation. Three quarters of the dataset (780 letters) are used to train the system incrementally, and one quarter (260 letters) is used to evaluate how the system's performance evolves during the learning process. The results presented in the figures are averaged over the 18 tests (one per writer). Each pattern in our system is described by a set of 21 features. In each learning cycle, one new example of each class is presented to the system.
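The protocol above can be sketched in a few lines of Python. This is only an illustration of the data split, not the authors' code: the function names (`make_folds`, `cross_validation_splits`, `learning_cycles`) are hypothetical, samples are stood in for by `(class_index, sample_index)` pairs, and stratifying each fold by class is an assumption, chosen because it gives exactly 30 training examples per class.

```python
import random

N_CLASSES = 26          # lowercase Latin letters (from the text)
SAMPLES_PER_CLASS = 40  # samples of each letter per writer (from the text)
N_FOLDS = 4             # 4-fold cross-validation (from the text)


def make_folds(seed=0):
    """Split one writer's 1040 samples into 4 folds of 260 samples.

    Each sample is a (class_index, sample_index) pair. Stratifying by
    class is an assumption: the source only gives the fold sizes.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(N_FOLDS)]
    per_fold = SAMPLES_PER_CLASS // N_FOLDS  # 10 samples per class per fold
    for c in range(N_CLASSES):
        idx = list(range(SAMPLES_PER_CLASS))
        rng.shuffle(idx)
        for f in range(N_FOLDS):
            chunk = idx[f * per_fold:(f + 1) * per_fold]
            folds[f].extend((c, s) for s in chunk)
    return folds


def cross_validation_splits(seed=0):
    """Yield (train, test) pairs: 780 training and 260 test samples each."""
    folds = make_folds(seed)
    for i in range(N_FOLDS):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def learning_cycles(train):
    """Yield one learning cycle at a time: one new example of each class."""
    by_class = {}
    for c, s in train:
        by_class.setdefault(c, []).append((c, s))
    n_cycles = min(len(v) for v in by_class.values())
    for k in range(n_cycles):
        yield [by_class[c][k] for c in sorted(by_class)]
```

Under these assumptions, each training split provides 30 learning cycles of 26 examples, and the classifier would be evaluated on the held-out 260 letters after each cycle. Repeating this for the 18 writers and averaging gives curves of the kind the text describes.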

In these experiments, we compare the performance of the two incremental learning strategies in terms of the complexity and the quality of the classifier. We also evaluate the impact of artificial character generation on the quality and complexity of the classifier in our incremental learning system.

We note that the confusion-driven strategy yields a classifier whose quality is equal to or greater than that obtained with the two-phase strategy, while creating fewer prototypes. Using the confusion-driven strategy with artificial character generation, a recognition rate of about 90% is reached after only 5 learning examples per class; this rate improves rapidly, reaching 94% after 10 examples and about 97% after 30 examples. We also note that the recognition error rate decreases by 40% when artificial character generation is used.
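To make the last figure concrete, the short sketch below computes a relative error reduction, assuming the 40% refers to a reduction relative to the error rate obtained without synthesis. The function name and the two error rates are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_error_reduction(baseline_error, new_error):
    """Return the fraction by which the error rate dropped relative to baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates for illustration only (not results from the paper).
baseline = 0.05        # error rate without artificial character generation
with_synthesis = 0.03  # error rate with artificial character generation
reduction = relative_error_reduction(baseline, with_synthesis)
print(f"{reduction:.0%}")  # prints "40%"
```

With these illustrative values, a drop of two percentage points in the error rate corresponds to a 40% relative reduction.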

We conclude that the confusion-driven strategy achieves a better recognition rate than the two-phase strategy for the same number of prototypes. The quality/complexity ratio of the classifier is also improved by artificial character generation. Future work will focus on reducing the number of prototypes in the system, either by deleting "useless" prototypes or by merging redundant ones. We also plan to explore other approaches to handwriting synthesis, inspired in particular by the Sigma-lognormal model.

Summary

In this paper, we present a new incremental learning algorithm for an online handwritten character recognition system. The goal is to learn any new character class "on the fly" from very few examples, while optimizing the classes already modeled as new examples are entered. The proposed system overcomes the lack of training data when a new character class is introduced by synthesizing artificial characters. The tests were carried out in a single-writer incremental learning setting on cursive lowercase letters, using a database of 18 writers. The results show that a good recognition rate (about 90%) is reached using only 5 training examples per class. Moreover, this rate rises rapidly to 94% with 10 examples and about 97% with 30. Character synthesis yields a 40% error reduction compared with a strategy without synthesis.

Keywords: 

Incremental learning, online handwritten character recognition, handwritten data synthesis, ambiguity rejection, fuzzy inference systems.

1. Introduction
2. General Principles of the Proposed Approach
3. Incremental Learning Strategies
4. Accelerating Learning through Synthesis
5. Experiments
6. Conclusion and Future Work