
Reducing the number of predicates for approaches to distribution of data warehouses

Mourad Ghorbel | Karima Tekaya | Abdelaziz Abdellatif

URAPOP, Université de Tunis El Manar, Faculté des Sciences de Tunis, El Manar 2092, Tunisie

Université de Tunis, École supérieure des sciences économiques et commerciales de Tunis, Montfleury 1089, Tunisie

LIPAH, Université de Tunis El Manar, Faculté des Sciences de Tunis, El Manar 2092, Tunisie

Corresponding Author Email:

ghorbel.fst@gmail.com, Montfleury 1089, Tunisie, abdelaziz.abdellatif@fst.rnu.tn

Received: N/A

Accepted: N/A


OPEN ACCESS

Abstract:

In data warehousing, most distribution approaches rely on table fragmentation and allocation techniques. These approaches take as input the predicates extracted from the most frequently used OLAP queries and use them in the partitioning process. Given the continual increase in the number of predicates and its negative impact, it is increasingly worthwhile to curb this growth before the fragmentation process. In this paper, we propose a solution based on a classification algorithm to reduce the number of predicates in data warehouse allocation approaches. The proposed solution comprises four phases: a preliminary predicate selection phase, a phase that encodes the predicates as binary matrices, a phase that classifies these predicates using the k-means algorithm, and a final phase that reduces the number of predicates. We validate our solution on a real data warehouse based on the APB1 and TPC-H benchmarks.

Keywords:

data warehouse, classification, distribution
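
The four phases named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of the coding or the reduction rule, so the sketch assumes that each row of the binary matrix is a predicate and each column a query, and that reduction keeps the most frequently used predicate of each k-means cluster. The function names (`encode`, `kmeans`, `reduce_predicates`) and the predicate labels are hypothetical.

```python
from typing import Dict, List, Sequence


def encode(predicates: Sequence[str], queries: Sequence[Sequence[str]]) -> List[List[int]]:
    """Assumed coding: cell (i, j) is 1 when predicate i appears in query j."""
    return [[1 if p in q else 0 for q in queries] for p in predicates]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows: List[List[int]], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means with a deterministic start: the first k distinct rows."""
    centroids: List[List[float]] = []
    for r in rows:
        if list(r) not in centroids:
            centroids.append(list(r))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("fewer distinct rows than clusters")
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def reduce_predicates(predicates: Sequence[str], queries: Sequence[Sequence[str]], k: int) -> List[str]:
    """Assumed reduction rule: keep the most used predicate of each cluster."""
    matrix = encode(predicates, queries)
    labels = kmeans(matrix, k)
    best: Dict[int, int] = {}
    for i, label in enumerate(labels):
        if label not in best or sum(matrix[i]) > sum(matrix[best[label]]):
            best[label] = i
    return [predicates[i] for i in sorted(best.values())]


# Hypothetical selected predicates and the queries that use them.
predicates = ["p1", "p2", "p3", "p4"]
queries = [["p1", "p2"], ["p1", "p2"], ["p3", "p4"], ["p3", "p4"], ["p3"]]
print(reduce_predicates(predicates, queries, k=2))  # → ['p1', 'p3']
```

In this toy input, `p1` and `p2` are used by exactly the same queries and fall into one cluster, while `p3` and `p4` fall into the other, so four predicates are reduced to two before fragmentation would begin.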

1. Introduction

2. State of the art

3. Proposed solution

4. Experimental validation

5. Conclusion

