A Symbol Spotting Approach in Graphic Documents
An Approach to Spotting Non-Segmented Symbols in Graphic Documents
This paper addresses the problem of symbol spotting in graphic documents. We propose an approach in which each graphic document is indexed as a text document, using the vector model and an inverted file structure. The method relies on a visual vocabulary built from a shape descriptor adapted to the document level and invariant under classical geometric transforms (rotation, scaling and translation). Regions of interest (ROI) selected with a high degree of confidence by a voting strategy are considered occurrences of the query symbol.
The symbol spotting problem consists in locating all instances of a symbol embedded in documents. Representing these symbols with a standard shape (symbol) descriptor is not straightforward, because they are not isolated from their context. A common strategy for symbol spotting therefore consists in decomposing documents into components and applying a shape descriptor to each of them. Most approaches need a vectorization step, and usually only symbols that satisfy certain conditions are retrieved (e.g. convexity, connectivity, closure). Our objective is to tackle the problem from a point of view where neither a symbol hypothesis nor a vectorization step is needed. First, we propose a descriptor to represent graphic symbols and extend it to the document level. Then, we exploit a technique based on the concept of visual words to index graphic documents and to spot non-segmented symbols in documents. Finally, we introduce a voting process on the detected ROI in order to locate instances of a query symbol.
To represent graphic symbols, we propose an adaptive solution based on shape contexts: the shape context is adapted to points of interest. We use the DoG (difference of Gaussians) detector to extract points of interest that lie near the junctions of the object model at different resolutions. The shape context at each point of interest (CFPI) is normalized by the dominant orientation of the point of interest and by the mean distance between the point of interest and the contour points, which makes the CFPI invariant under rotation and scaling. A symbol S is therefore described by the set of CFPIs at its interest points. We also use this descriptor to extract local information in a graphic document, by computing the CFPI in the neighbouring region of each point of interest. The neighbouring region of each point of interest is defined according to its resolution.
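The two normalizations described above can be illustrated with a minimal sketch. It assumes that the interest point, its dominant orientation and the contour points are already available (for instance from a DoG detector and an edge map). The function name `cfpi`, the log-polar binning and the bin counts `n_r` and `n_theta` are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def cfpi(point, orientation, contour, n_r=5, n_theta=12):
    """Log-polar histogram of contour points around one interest point.

    Hypothetical helper: the bin layout is an assumption for illustration.
    """
    d = np.asarray(contour, dtype=float) - np.asarray(point, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    keep = r > 0                      # ignore a contour point equal to the interest point
    d, r = d[keep], r[keep]

    # Scale normalization: divide by the mean point-to-contour distance.
    r = r / r.mean()
    # Rotation normalization: measure angles relative to the dominant orientation.
    theta = (np.arctan2(d[:, 1], d[:, 0]) - orientation) % (2 * np.pi)

    # Log-spaced radial bins (assumed range) and uniform angular bins.
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    r_bin = np.clip(np.digitize(r, r_edges) - 1, 0, n_r - 1)
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    hist = np.zeros((n_r, n_theta))
    np.add.at(hist, (r_bin, t_bin), 1)   # accumulate counts, handling repeated bins
    return hist / hist.sum()
```

Under these assumptions, rotating and scaling the contour together with the interest point (and shifting the orientation by the same angle) leaves the histogram unchanged, which is the invariance the descriptor is designed for.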
To reduce the complexity of on-line matching (for searching and spotting), our system uses the concept of visual words, relying on information pre-computed in the off-line step. A clustering technique is run on the set of CFPI descriptors computed from all documents in the database to create the visual words. Once the CFPIs are matched with visual words, indexing and retrieval techniques for text documents can be applied to graphic documents. When matching CFPIs with visual words, we propose to associate one CFPI with several visual words according to its similarity to them. The objective is to reduce the ambiguity for CFPIs that lie near cluster boundaries.
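The association of one CFPI with several visual words can be sketched as a soft assignment. The sketch assumes that the visual words are cluster centres in descriptor space, that similarity is measured with the Euclidean distance, and that weights come from a Gaussian kernel. The function `soft_assign` and the parameters `k` and `sigma` are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_assign(descriptors, words, k=3, sigma=1.0):
    """Return (indices, weights) of the k nearest visual words per descriptor.

    Hypothetical helper: the Gaussian weighting is an assumption.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    words = np.asarray(words, dtype=float)

    # Pairwise Euclidean distances, shape (n_descriptors, n_words).
    dist = np.linalg.norm(descriptors[:, None, :] - words[None, :, :], axis=2)

    # Keep the k closest words for each descriptor (closest first).
    idx = np.argsort(dist, axis=1)[:, :k]
    near = np.take_along_axis(dist, idx, axis=1)

    # Gaussian weights, shifted by the smallest distance to avoid underflow.
    w = np.exp(-(near ** 2 - near[:, :1] ** 2) / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    return idx, w
```

A descriptor that lies halfway between two cluster centres then contributes equally to both words, instead of being forced into one of them by a hard nearest-centre rule.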
To spot the instances of a symbol in a document, we first detect the ROI corresponding to the query symbol in the document and then run a voting process on these regions. ROI detection is based on the relation between the keypoint under consideration and the bounding box of the query. The centre of each region of interest receives a vote derived from the similarity between this region and the query, computed with a text retrieval technique (the vector model). This technique uses the frequencies of occurrence of visual words in each region to compare the query with the ROI in the document. Regions with high values are considered potential regions containing a symbol instance.
We have tested the adaptation of the CFPI to graphic symbols on the GREC’03 dataset, which is composed of isolated symbols. This dataset contains symbols of 50 models with different sizes and orientations. Results are reported for the CFPI and the R-signature. The two descriptors perform similarly when querying with complete symbols; however, the CFPI is more robust when the goal is to retrieve incomplete symbols. To evaluate the spotting system, we run tests on a collection of synthetic documents from the SESYD project. In spite of errors, the results are very promising and show the feasibility of our approach.
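The scoring of regions of interest with the vector model can be illustrated by a simplified sketch. It assumes that each keypoint carries a single visual word identifier, that the ROI is a box of the query's size centred on the keypoint, and that similarity is the cosine between raw word-frequency vectors. These are simplifying assumptions for illustration; the names `tf_vector`, `cosine` and `vote` are hypothetical.

```python
import numpy as np

def tf_vector(word_ids, vocab_size):
    """Frequency of each visual word in a region."""
    return np.bincount(word_ids, minlength=vocab_size).astype(float)

def cosine(a, b):
    """Cosine similarity between two frequency vectors (0 if one is empty)."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return float(a @ b / (na * nb)) if na > 0 and nb > 0 else 0.0

def vote(positions, word_ids, query_words, box, vocab_size):
    """Score the ROI centred on each keypoint against the query.

    positions: (n, 2) keypoint coordinates in the document
    word_ids:  (n,) visual word identifier of each keypoint
    box:       (width, height) of the query bounding box (assumed ROI size)
    """
    positions = np.asarray(positions, dtype=float)
    word_ids = np.asarray(word_ids)
    q = tf_vector(np.asarray(query_words), vocab_size)
    half = np.asarray(box, dtype=float) / 2.0

    scores = np.zeros(len(positions))
    for i, centre in enumerate(positions):
        # Keypoints falling inside the box centred on the current keypoint.
        inside = np.all(np.abs(positions - centre) <= half, axis=1)
        scores[i] = cosine(tf_vector(word_ids[inside], vocab_size), q)
    return scores
```

Keypoints whose region scores highest would then be kept as candidate centres of a symbol instance. An inverted file would avoid scanning every region, which this brute-force loop does not attempt.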
In this article, we propose a method for spotting symbols in graphic documents. Occurrences of the symbol in a document are detected through a voting process over candidate regions. The approach relies on a visual vocabulary and, to reduce the cost of matching one symbol against others, we use the vector model and inverted-file indexing. The method builds on a descriptor defined from the concept of shape context adapted to points of interest. This descriptor is invariant to rotation, translation and changes of scale. Experimental results on the retrieval of isolated symbols and on the spotting of non-segmented symbols in documents are very promising.
Symbol descriptor, graphic symbol, visual word, symbol spotting.
Symbol descriptor, graphic symbol, visual word, content-based retrieval, symbol spotting.