A deep neural network-based algorithm for safe release of big data under random noise disturbance

Jian Yu | Hui Wang^*

Liuzhou Vocational and Technical College, School of Electronic Information Engineering, Liuzhou 545005, China

Liuzhou Vocational and Technical College, School of Art, Liuzhou 545005, China

Corresponding Author Email: Huiwang.liuzhou@gmail.com

OPEN ACCESS

Abstract:

Despite its huge benefits, the release of big data carries a severe risk of privacy leakage. To address this problem, this paper proposes a deep neural network (DNN)-based algorithm for the safe release of big data under random noise disturbance. Specifically, random noise drawn from a given probability distribution was added to the released data, so that the public output does not change significantly whether or not an individual record is in the dataset, while the published data remain essentially the same as the original dataset. The algorithm was then optimized according to the attributes of the correlated datasets in big data. Finally, the proposed algorithm was shown to outperform the traditional algorithm in large-scale searches of correlated datasets and to ensure privacy at a lower privacy budget.
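To make the idea of noise-perturbed release concrete, the following is a minimal sketch of the standard Laplace mechanism from the differential privacy literature, applied to a counting query. It is an illustration only and is not taken from the paper: the abstract does not name the noise distribution, so the Laplace choice, the function names `laplace_noise` and `noisy_count`, and the example data are all assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng=None):
    """Release a count under the Laplace mechanism (illustrative sketch).

    `epsilon` is the privacy budget: a smaller value means more noise and
    stronger privacy. All names here are hypothetical, not from the paper.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity of this query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 52, 38, 47, 31]  # made-up example records
    released = noisy_count(ages, lambda a: a >= 40, epsilon=0.5)
    print(f"noisy count of records with age >= 40: {released:.2f}")
```

Because the noise scale is the query sensitivity divided by the privacy budget, lowering `epsilon` widens the noise and makes the presence or absence of any single record harder to infer from the output, at the cost of accuracy. This sketch assumes independent records; it does not model the correlated-dataset setting that the paper addresses.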

Keywords:

deep neural network (DNN), big data, privacy preserving, differential privacy

1. Introduction

2. Definition of privacy in the release of big data

3. Random noise addition mechanism in the release of big data

4. Privacy analysis of correlated datasets

5. Noise addition mechanism of correlated datasets

6. Experiments and results analysis

7. Conclusions

Acknowledgement

This work is supported by {2018,2019} Foundation of Improving Academic Ability in University for Young Scholars of Guangxi.
