Supervised and unsupervised recognition of probability laws from finite samples (original title: Reconnaissance supervisée et non supervisée de lois à partir d’échantillons finis)

Olivier Alata, Christian Olivier, Zhan Jin, Yannis Pousset

Lab. H. Curien, UMR CNRS 5516, Université Jean Monnet Saint-Etienne Bât. F 18, rue du Prof. Benoît Lauras - F-42000, Saint-Etienne

Institut XLIM, dépt. SIC, UMR CNRS 7252, Université de Poitiers Bât. SP2MI, BP 30179 - 86962 Futuroscope-Chasseneuil cedex

Corresponding Author Email: {yannis.pousset,christian.olivier}@univ-poitiers.fr
Pages: 101-121
DOI: https://doi.org/10.3166/ts.29.101-121
Published: 30 April 2012

OPEN ACCESS

Abstract: 

In this paper, we address the problem of recognizing probability laws from samples whose size ranges from 100 to 10,000 or more. The application context is the modelling of mobile radio channels in configurations where the transmitter and receiver are in either “line of sight” or “non line of sight” situations. This problem is crucial for improving digital communications. In the digital transmission community, the Kolmogorov-Smirnov distance is commonly used. More rarely in this context, a kernel method is used to approximate the probability laws before the comparative test. We propose using information criteria (IC), on the one hand to approximate the probability laws by a histogram, and on the other hand to select the best law model. We study the supervised and unsupervised cases and compare the methods in these realistic situations. The results show the benefit of IC-based methods.
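To make the two selection strategies mentioned in the abstract concrete, here is a minimal Python sketch of supervised law recognition on a synthetic sample. It is not the authors' implementation. The candidate laws (Rayleigh, exponential, log-normal), the choice of BIC as the information criterion, the maximum-likelihood fit performed before the Kolmogorov-Smirnov distance is computed, and all function names are assumptions made for illustration. The histogram and kernel density estimation steps are left out.

```python
import math
import random

# Each fit_* function returns (number of parameters, log-pdf, cdf)
# for a candidate law fitted by maximum likelihood (illustrative helpers).

def fit_rayleigh(x):
    s2 = sum(v * v for v in x) / (2 * len(x))  # MLE of sigma^2
    return (1,
            lambda v: math.log(v / s2) - v * v / (2 * s2),
            lambda v: 1.0 - math.exp(-v * v / (2 * s2)))

def fit_exponential(x):
    m = sum(x) / len(x)  # MLE of the mean
    return (1,
            lambda v: -math.log(m) - v / m,
            lambda v: 1.0 - math.exp(-v / m))

def fit_lognormal(x):
    logs = [math.log(v) for v in x]
    mu = sum(logs) / len(logs)
    var = sum((l - mu) ** 2 for l in logs) / len(logs)  # MLE of the variance
    sd = math.sqrt(var)
    return (2,
            lambda v: -math.log(v * sd * math.sqrt(2 * math.pi))
                      - (math.log(v) - mu) ** 2 / (2 * var),
            lambda v: 0.5 * (1.0 + math.erf((math.log(v) - mu) / (sd * math.sqrt(2)))))

CANDIDATES = {
    "rayleigh": fit_rayleigh,
    "exponential": fit_exponential,
    "lognormal": fit_lognormal,
}

def bic(x, k, logpdf):
    # BIC = -2 log-likelihood + k log(n); lower is better.
    return -2.0 * sum(logpdf(v) for v in x) + k * math.log(len(x))

def ks_distance(x, cdf):
    # Sup distance between the empirical CDF and the fitted CDF.
    xs = sorted(x)
    n = len(xs)
    return max(max((i + 1) / n - cdf(v), cdf(v) - i / n)
               for i, v in enumerate(xs))

def recognise(x):
    scores = {}
    for name, fit in CANDIDATES.items():
        k, logpdf, cdf = fit(x)
        scores[name] = (bic(x, k, logpdf), ks_distance(x, cdf))
    best_bic = min(scores, key=lambda name: scores[name][0])
    best_ks = min(scores, key=lambda name: scores[name][1])
    return best_bic, best_ks, scores

# Synthetic Rayleigh sample: envelope of two independent Gaussian components.
rng = random.Random(0)
sample = [math.hypot(rng.gauss(0.0, 2.0), rng.gauss(0.0, 2.0)) for _ in range(1000)]

best_bic, best_ks, scores = recognise(sample)
print("law selected by BIC:", best_bic)
print("law selected by KS distance:", best_ks)
```

Both rules score the same fitted candidates, so they can be compared on the same sample. The BIC adds a penalty that grows with the number of parameters, whereas the KS distance measures only the fit of the cumulative distribution function.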

RÉSUMÉ (English translation)

In this article, we address the problem of recognizing probability laws from samples whose size ranges from 100 to 10,000 or more. The application context is the modelling of mobile radio channels with or without direct visibility between transmitter and receiver. This problem is crucial for improving digital communications. In the digital transmission community, the Kolmogorov-Smirnov distance is commonly used. More rarely, a kernel method is considered before the comparative test. We propose using information criteria (IC), on the one hand to approximate the probability laws by a histogram, and on the other hand to select the best law model. We study the supervised and unsupervised cases and compare the methods in these realistic situations. The results show the benefit of the methods that exploit IC.

Keywords: 

law recognition, supervised and unsupervised methods, information criteria, kernel density estimator, digital communications

MOTS-CLÉS (English translation)

law recognition, supervised and unsupervised approaches, information criteria, kernel density estimator, digital communications

Extended abstract
1. Introduction
2. Methodologies
3. Experimental results
4. Conclusion