
Rééchantillonnage de l’échelle dans les algorithmes MCMC pour les problèmes inverses bilinéaires

Scale Resampling within MCMC Algorithms for Bilinear Inverse Problems

Thomas Veit, Jérôme Idier, Saïd Moussaoui

Équipe Perception, LIVIC, bâtiment 824 – 14, route de la Minière, 78000 Versailles

IRCCyN (UMR CNRS 6597), Équipe ADTSI, 1 rue de la Noë, BP 92101, 44321 Nantes Cedex 3, France

Corresponding Author Email: thomas.veit@inrets.fr
Pages: 329-343
Received: 22 April 2008
Open Access

Abstract: 

This article presents a method for improving the behavior of the MCMC algorithms involved in solving bilinear inverse problems such as blind deconvolution and source separation. More specifically, the question addressed is the handling of the scale ambiguity intrinsic to bilinear inverse problems. The Bayesian framework removes the scale ambiguity (provided that the distributions involved are proper), but the samples produced by Gibbs sampling are highly correlated. Introducing an additional sampling step for a scale parameter radically improves the behavior of the sampler: by allowing an extra move, the exploration of the state space becomes more efficient, which shortens the burn-in period, reduces the correlation between samples and speeds up convergence. Compared with other possible solutions, this approach is mathematically rigorous and entails no significant increase in computational cost. Its benefit is illustrated first on a simple example, then on a source separation application.
Introduction

This work focuses on the Markov chain Monte Carlo (MCMC) algorithms involved in the resolution of bilinear inverse problems. Such problems are written z = x * h + noise, where z stands for the observations and x and h stand for the unknown quantities (matrices or vectors) to be estimated. The symbol * denotes any bilinear operator, for example a convolution or a matrix product. Source separation [11, 2] and blind deconvolution [3] are among the most important instances of bilinear inverse problems in the signal processing field.
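
As a minimal numerical sketch (the signal length, kernel and noise level are arbitrary choices, not taken from the paper), the forward model with convolution as the bilinear operator can be simulated as follows:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(64)         # unknown source signal
    h = np.array([1.0, 0.5, 0.25])      # unknown convolution kernel
    b = 0.01 * rng.standard_normal(x.size + h.size - 1)  # additive noise
    z = np.convolve(x, h) + b           # observations: z = x * h + noise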

Bilinear inverse problems are characterized by an intrinsic scale ambiguity: namely, if (x, h) is a solution, then for any scalar s ≠ 0, (s·x, h/s) is also a solution.
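
Continuing the sketch above, this invariance is easy to check numerically:

    # For any s != 0, the rescaled pair (s*x, h/s) reproduces the same
    # noiseless data, so the observations cannot discriminate between them.
    s = 3.7
    print(np.allclose(np.convolve(x, h), np.convolve(s * x, h / s)))  # True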

Bayesian framework and MCMC

The Bayesian framework introduces statistical information for solving under-constrained problems: prior distributions are assigned to the unknowns. The problem modeled in this way is written Z = X * H + B, where the capital letters stand for random variables. A usual Bayesian estimator of the unknown quantities x and h is the posterior expectation, Equation (3). Let us emphasize that the choice of prior distributions in the Bayesian framework removes the scale ambiguity on the unknowns, at least when only proper priors are considered, which will be the case here.
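
Equation (3) itself is not reproduced in this summary. Under the model above, the posterior-mean estimator it refers to presumably takes the standard form (a reconstruction, not a quotation from the paper):

    \hat{x} = \mathbb{E}[X \mid Z = z]
            = \iint x \, f_{X,H \mid Z}(x, h \mid z) \,\mathrm{d}h \,\mathrm{d}x,
    \qquad
    \hat{h} = \mathbb{E}[H \mid Z = z].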

In the usual case where the posterior distributions are not explicitly available, MCMC algorithms are well suited to producing random realizations (x(t), h(t)) and to computing the estimates as empirical averages.

In the bilinear case, an appealing way to perform the sampling step is to rely on a Gibbs sampler. Its two main steps are then the alternate sampling of x and h according to their conditional posterior distributions fX|H,Z,Θ(x|h,z,θ) and fH|X,Z,Θ(h|x,z,θ), respectively.
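
A skeleton of this alternating scheme (Algorithm 1 in the text) might look as follows; sample_x_given and sample_h_given are placeholders for the conditional draws, whose exact form depends on the priors and on the bilinear operator:

    import numpy as np

    def gibbs_sampler(z, sample_x_given, sample_h_given, h0,
                      n_iter=5000, burn_in=500):
        """Plain alternating Gibbs sampler (Algorithm 1 in the text)."""
        h = h0
        xs, hs = [], []
        for t in range(n_iter):
            x = sample_x_given(h, z)   # draw x ~ f(x | h, z, theta)
            h = sample_h_given(x, z)   # draw h ~ f(h | x, z, theta)
            if t >= burn_in:           # discard the burn-in samples
                xs.append(x)
                hs.append(h)
        # Posterior-mean estimates as empirical averages of retained samples
        return np.mean(xs, axis=0), np.mean(hs, axis=0)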

Bad mixing properties and existing solutions

In practice, a slow convergence of the Markov chain produced by the Gibbs sampler is observed: successive samples are highly correlated. More precisely, the scale evolves slowly. To correct this behavior, two types of solutions have been proposed so far, but neither is satisfactory, at least from a theoretical point of view. The first approach consists in estimating the unknown quantities from Markov chains (produced by Algorithm 1) that have not yet converged toward their stationary distribution. This implies that only quantities normalized with respect to scale are considered. One can reasonably expect the convergence to remain fast enough, but the decoupling of scale and shape remains hypothetical. The second strategy (Algorithm 2) consists in introducing an arbitrary scaling, which is in general incompatible with the prior distributions on the unknown variables. Moreover, this scaling strategy modifies the target distribution. The proposed solution instead introduces a rescaling step within the Gibbs sampler that is consistent with the prior distributions. One attractive feature of this approach compared to the others is that it is theoretically sound.

Additional sampling step and resulting Gibbs sampler

The proposed approach is inserted seamlessly into the Gibbs sampler. It speeds up the convergence toward the target distribution without modifying it. The main idea is to insert an additional resampling step in order to improve the exploration of the state space (Equation (4)). In Section 3.1, it is shown that the scale parameter S has to be resampled according to Equation (5). The resulting modified Gibbs sampler is then Algorithm 3.
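
Under the same assumptions as the sketch above (placeholder conditional samplers, and a sample_scale routine standing in for a draw from Equation (5)), the modified sampler adds one move to the loop:

    def gibbs_sampler_rescaled(z, sample_x_given, sample_h_given, sample_scale,
                               h0, n_iter=5000, burn_in=500):
        """Gibbs sampler with the extra scale-resampling step (Algorithm 3)."""
        h = h0
        xs, hs = [], []
        for t in range(n_iter):
            x = sample_x_given(h, z)
            h = sample_h_given(x, z)
            s = sample_scale(x, h)     # draw S according to Equation (5)
            x, h = s * x, h / s        # move along the scale direction;
                                       # the product x * h is unchanged
            if t >= burn_in:
                xs.append(x)
                hs.append(h)
        return np.mean(xs, axis=0), np.mean(hs, axis=0)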

From Equation (5), the posterior distribution of the scale parameter can be computed analytically for various prior distributions, for example Gaussian or gamma priors. In these cases, it turns out that the posterior distribution of S² (Gaussian case) and of S (gamma case) is a generalized inverse Gaussian (GIG) distribution. Let us also remark that the Gaussian case extends nicely to continuous or discrete Gaussian mixture models. In the general case, let us emphasize that this step boils down to sampling a scalar random variable, which is a fairly tractable task.
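
The exact GIG parameters implied by Equation (5) depend on the priors and are not reproduced here; the paper points to Dagpunar's generator [8], but SciPy's geninvgauss (available since SciPy 1.4) is an off-the-shelf alternative. A generic GIG sampler, with hypothetical parameters lam, chi and psi, might read:

    import numpy as np
    from scipy.stats import geninvgauss  # requires SciPy >= 1.4

    def sample_gig(lam, chi, psi, rng=None):
        """Draw from the GIG density proportional to
        x**(lam - 1) * exp(-(chi / x + psi * x) / 2), for x > 0.

        SciPy's geninvgauss(p, b) has density proportional to
        x**(p - 1) * exp(-b * (x + 1 / x) / 2), so the three-parameter
        case is recovered with b = sqrt(chi * psi), scale = sqrt(chi / psi).
        """
        return geninvgauss.rvs(lam, np.sqrt(chi * psi),
                               scale=np.sqrt(chi / psi), random_state=rng)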

Experimental results

The improvements achieved by Algorithm 3 are illustrated on a low-dimensional toy example: x = (x1, x2) ∈ R^2, h = (h1, h2) ∈ R^2 and z = x h^T + noise ∈ R^{2×2}. With Gaussian priors on the unknowns and Gaussian noise, it is possible to derive the theoretical posterior distributions. The contour lines of these distributions are plotted in Figure 1.
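
For illustration only (the noise level and random seed are arbitrary, not taken from the paper), such a rank-one observation can be generated as:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.standard_normal(2)          # x = (x1, x2) in R^2
    h = rng.standard_normal(2)          # h = (h1, h2) in R^2
    z = np.outer(x, h) + 0.1 * rng.standard_normal((2, 2))  # z = x h^T + noise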

Figures 2, 3 and 4 show the first iterations of Algorithms 1, 2 and 3, respectively. The slow convergence of Algorithm 1 can be observed in Figure 2. Figure 3 shows that Algorithm 2 does not converge toward the theoretical posterior distribution, while Figure 4 illustrates the satisfactory convergence of the proposed scheme (Algorithm 3).

The improvements brought by the proposed method are also illustrated in the context of non-negative source separation [2] in Figures 6 to 9. A better exploration of the state space (Figures 7 and 8) and a faster convergence (Figure 9) are observed.

When focusing on estimators normalized with respect to scale, Figure 10 shows that the variance of the estimators is underestimated when using Algorithm 1.

Conclusion

This article proposes to insert an additional step into the Gibbs sampling algorithm used to solve bilinear inverse problems within the MCMC approach. This additional step corrects the bad mixing properties that are generally observed. It consists in sampling a scale parameter, for which the posterior distribution is established theoretically. In practice, faster convergence is observed at almost no extra computational cost. The performance of the proposed solution has been compared with two common alternatives. Normalizing the scale during the sampling process (Algorithm 2) is definitely not recommended, since it modifies the target distribution and biases the variance of the estimates. The basic Gibbs sampler (Algorithm 1) requires, at the very least, the use of scale-normalized estimators.

In contrast, only the proposed method (Algorithm 3) ensures fast convergence without restrictions on the estimators. Many applications could benefit from this scale resampling step, most notably source separation and blind deconvolution.

Keywords: 

Inverse problems, Bayesian framework, MCMC algorithms, Gibbs sampler, Scale ambiguity, Blind deconvolution, Source separation.


1. Introduction
2. Bayesian Framework
3. Additional Scale Sampling Step
4. Application to Source Separation Using a Bayesian Approach
5. Conclusion
References

[1] H. SNOUSSI and J. IDIER, "Bayesian blind separation of generalized hyperbolic processes in noisy and underdeterminate mixtures", IEEE Transactions on Signal Processing, vol. 54, no. 9, pp. 3257-3269, September 2006.

[2] S. MOUSSAOUI, D. BRIE, A. MOHAMMAD-DJAFARI and C. CARTERET, "Separation of non-negative mixture of non-negative sources using a Bayesian approach and MCMC sampling", IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4133-4145, November 2006.

[3] Q. CHENG, R. CHEN and T.-H. LI, "Simultaneous wavelet estimation and deconvolution of reflection-seismic signals", IEEE Transactions on Geoscience and Remote Sensing, vol. 34, no. 2, pp. 377-384, March 1996.

[4] P. CIUCIU, J. IDIER, T. VEIT and T. VINCENT, "Application du rééchantillonnage stochastique de l'échelle en détection-estimation de l'activité cérébrale par IRMf", in Actes du 21e colloque GRETSI, Troyes, France, pp. 373-376, 2007.

[5] Y.-L. YOU and M. KAVEH, "A regularization approach to joint blur identification and image restoration", IEEE Transactions on Image Processing, vol. 5, no. 3, pp. 416-428, March 1996.

[6] M. STEPHENS, "Dealing with label switching in mixture models", Journal of the Royal Statistical Society B, vol. 62, pp. 795-809, 2000.

[7] R. WAAGEPETERSEN and D. SORENSEN, "A tutorial on reversible jump MCMC with a view toward applications in QTL-mapping", International Statistical Review, vol. 69, pp. 49-61, 2001.

[8] J. DAGPUNAR, "An easily implemented generalized inverse Gaussian generator", Communications in Statistics - Simulation and Computation, vol. 18, no. 2, pp. 703-710, 1989.

[9] T. VEIT and J. IDIER, "Échantillonnage de l'échelle dans les algorithmes MCMC pour les problèmes inverses bilinéaires", in Actes du 21e colloque GRETSI, Troyes, France, pp. 1233-1236, September 2007.

[10] S. SÉNÉCAL and P.-O. AMBLARD, "Bayesian separation of discrete sources via Gibbs sampling", in Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation (ICA'2000), pp. 566-572, 2000.

[11] H. SNOUSSI and A. MOHAMMAD-DJAFARI, "Fast joint separation and segmentation of mixed images", Journal of Electronic Imaging, vol. 13, no. 2, pp. 349-361, 2004.

[12] M. ICHIR and A. MOHAMMAD-DJAFARI, "Hidden Markov models for wavelet-based blind source separation", IEEE Transactions on Image Processing, vol. 15, no. 7, pp. 1887-1899, July 2006.

[13] C. FÉVOTTE and S. J. GODSILL, "A Bayesian approach for blind separation of sparse sources", IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 6, pp. 2174-2188, November 2006.

[14] S. MOUSSAOUI, C. CARTERET, D. BRIE and A. MOHAMMAD-DJAFARI, "Bayesian analysis of spectral mixture data using Markov Chain Monte Carlo methods", Chemometrics and Intelligent Laboratory Systems, vol. 81, no. 2, pp. 137-148, April 2006.

[15] A. GELMAN and D. RUBIN, "Inference from iterative simulation using multiple sequences", Statistical Science, vol. 7, pp. 457-472, 1992.