DFRMIdroid: A Comprehensive Fusion Approach Utilizing Permissions and Intents Analysis with the DFR-MI Algorithm for Enhanced Malware Detection on Android Devices

ABSTRACT


INTRODUCTION
Smartphones have become a routine part of daily life and are used for tasks such as sending SMS, online shopping, entertainment, and financial transactions. The popularity of Android has made it easier for developers to build apps and offer services, but it has also made it easier for intruders to build apps that cause harm [1]. Malicious apps include viruses, worms, backdoors, spyware, Trojan horses, and rootkits. These apps can misuse device resources, steal data, damage file systems, commit SMS fraud, and place calls to premium-rate numbers [2]. The number of malicious apps has increased significantly in recent years, with the AV-TEST security institute reporting that Kaspersky detected 5.7 million malicious Android packages in 2020 [3].
To prevent attacks, many security vendors provide tools such as antivirus software and firewalls to protect mobile devices and user data. However, these tools are signature-based and can only detect known malware. They require frequent database updates to detect new malware; this process can be resource-intensive and may degrade overall system performance [4]. To identify recent attacks, researchers have proposed detecting malware with machine-learning algorithms [5].
Android apps expose several kinds of features, such as permissions, intents, API calls, hardware features, system calls, and network traffic. Based on these features, an app can be identified as malicious [6]. In general, three main techniques are used to analyze the behavior of an Android app and extract essential features: static, dynamic, and hybrid analysis. Static analysis extracts features such as permissions, intents, and API calls without executing the application. Dynamic analysis, in contrast, requires running the application on a virtual machine to extract features such as system calls and network traffic [5][6][7]. Hybrid analysis combines the features of both static and dynamic analysis [8].
Earlier research has shown that feature selection is crucial in building machine-learning models. Researchers use feature selection to remove duplicate or irrelevant features and retain the essential ones to enhance model performance [9]. However, attackers upgrade their applications along with the Android API levels, while many researchers still analyze applications written for older API levels. The features available differ between API levels. For example, API level 28 includes 325 permissions, while API level 15 contains only 166 permissions.
Additionally, several distributors create custom permissions to access data, hardware resources, and Web APIs. An increasing number of permissions opens up more opportunities for abuse. Several studies rely on the MalGenome and Drebin datasets for malware analysis, which consist of samples collected between 2010 and 2012 that fall between API levels 9 and 18 [10].
In this research, we introduce a feature selection technique designed to help machine-learning algorithms distinguish between malicious and benign Android applications. Furthermore, our dataset was collected from diverse sources, ensuring coverage of both older and newer samples across a wide range of API levels. The main contributions of this work can be summarized as follows:
1. Feature Selection Method: The study introduces a novel feature selection method named Discriminative Feature Ranking-Mutual Information (DFR-MI), which selects features that help construct a model capable of distinguishing between malicious and benign Android applications. This method improves upon the mutual information algorithm in terms of both time and accuracy. DFR-MI selects the informative features from among the most frequent features in the dataset, whereas the mutual information algorithm selects informative features regardless of their frequency in the dataset.
2. Comprehensive Dataset: The study develops a large and well-balanced dataset consisting of 32,170 samples and 209 columns (208 features and a class label). The dataset includes permissions (both native and custom) along with intents, providing a comprehensive representation of Android applications.
3. Improved Detection Accuracy: The proposed approach, combining permissions and intents with machine-learning techniques, achieves high accuracy in detecting malicious Android apps. The results surpass those obtained using the Drebin dataset, demonstrating the effectiveness of the proposed methodology.
4. Random Forest Algorithm: The study evaluates nine machine-learning algorithms and identifies the random forest algorithm as the best-performing choice. It achieves high accuracy, precision, recall, and F1 score when combined with the proposed feature selection method.
5. Lightweight Models: The study highlights the importance of using lightweight models for mobile devices. The proposed models offer faster training and inference times, addressing the battery life concerns and time constraints associated with mobile devices.
The remainder of the paper is structured as follows: Section 2 reviews related work on Android malware detection and the techniques used. Section 3 describes the methodology of the proposed system. Section 4 presents the experiments performed in this study. Section 5 provides the conclusion.

RELATED WORK
The surge in mobile device usage and mobile applications has brought a parallel increase in malicious apps. Malicious apps pose many threats, such as stealing sensitive user data, misusing applications, and sending short messages to premium numbers. Therefore, researchers have developed many models to defend against and mitigate the risk of malicious apps. Generally, the analysis of app behavior can be categorized into static, dynamic, and hybrid analysis. Static methods extract features from an app without executing it, such as permissions [11,12], intents [13], API calls [14,15], and opcodes [16,17]. Dynamic methods detect an app's behavior by monitoring system calls [18,19], memory utilization, CPU utilization [20], and network traffic [21]. Hybrid methods combine static and dynamic features [22]. These features are used to distinguish malicious apps from benign ones.
The authors of [23] developed a hybrid malware detection system named NTPDroid that extracts network traffic features and permissions from applications. The model employed the FP-Growth technique to generate frequent patterns in malicious and benign datasets. The results showed that combining network traffic features with permissions improved detection rates compared to using either network traffic features or permissions alone. The authors of [24] presented an intrusion detection system that detects and classifies malicious applications by analyzing permissions. The method works in three steps: i) extracting features from Android apps, ii) training machine-learning models on the extracted features, and iii) assessing the model's performance on testing datasets such as the Drebin and AndroTracker datasets. Various ML algorithms were evaluated for detecting and classifying malicious applications. For malware detection, kernel logistic regression achieved the highest accuracy, 98.2%. In [25], the authors developed Deep-Intent, an online Intrusion Detection System (IDS) that uses an end-to-end deep learning (E2E DL) implementation for supervised learning and unsupervised feature engineering and uses only implicit intents as features. The experimental findings reveal that the intent-based IDS could detect malware with an AUC of 81% and an accuracy of 77.2%.

In [20], the authors developed a new Android host-based IDPS (HIDROID) that runs entirely on a mobile device. HIDROID periodically gathers feature samples at run time from many resources that reflect the utilization of the device, such as CPU, battery, and memory. The detection engine uses machine learning and statistical methods to build a data-driven model of benign behavior. Any observation that does not fit this model raises an alert, and the prevention agent takes adequate countermeasures to reduce the risk. Experimental findings reveal that HIDROID can learn regular activity and distinguish it from abnormal activity with a precision of up to 91%. The researchers in [26] proposed DroidLight, a lightweight intrusion detection system that detects zero-day attacks efficiently. DroidLight is based on the authors' probability distribution and one-class classification. For every mobile application, the classification models learn its regular CPU use and network traffic. If there is a significant deviation from the normal pattern, the model raises an intrusion alarm. DroidLight was evaluated on a real device by a real user interacting with three self-developed apps, and it identified mobile malware with an accuracy between 93.3% and 100%.

The study [13] proposed a static method for detecting Android malware based on intents and permissions. The model first used Information Gain to rank both permissions and intents and then combined permissions and intents to find the set that provides the best accuracy across various machine-learning algorithms. The experiments showed that combining permissions and intents improved detection accuracy over using permissions or intents separately. The study [27] proposed a classification mechanism for Android applications that combines dynamic packet analysis with static permissions. First, through static analysis, the system collects static information from Android applications and classifies them as benign or malicious using machine learning. Excessive dynamic data-gathering time is avoided by filtering out safe apps. The malware's network traffic is then used in the dynamic analysis phase to extract features, and machine learning is used to classify the malware family. The model's objective is to limit the number of apps requiring dynamic data collection, which minimizes the overall analysis time. The experimental results show that the model reaches an accuracy of 96%. The authors of [28] proposed the DATDroid method for Android malware detection. DATDroid collects dynamic features such as CPU usage, memory usage, network traffic, system call errors, and system call time, and achieved an accuracy of 91.7%.

The study [29] presented an approach based on Recurrent Neural Networks (RNN) for identifying malware in Android applications. The method extracts two sets of features, API calls and permissions, from the Android application. According to the experimental results, the RNN achieved an accuracy of 98.2% on the CICAndMal2017 dataset. The researchers in [30] proposed a model based on permissions extracted from APK files, which detects malicious apps based on suspicious permissions. The system extracts features such as permissions, permission rates, and small file sizes from 10,000 applications collected from VirusShare and Google Play. With SVM, the model achieved 89.2% accuracy. The study [31] developed a technique for identifying malware in Android apps using the frequent pattern (FP) growth algorithm, which is used to find frequent patterns of feature co-existence at different levels. The authors also built several datasets of co-existing features: a permission co-existence dataset, an API co-existence dataset, and a permission-API co-existence dataset. Several machine-learning algorithms were tested, and the results show that random forest, support vector machine, and decision tree reached an accuracy of 98% using the permission-API co-existence dataset built from CIC_MALDROID2020.
Despite the progress made by prior researchers in employing diverse methods and achieving promising results, certain limitations persist in their studies. Common limitations include reliance on outdated, imbalanced, and non-comprehensive datasets. Furthermore, the methods used to select informative features often have high processing demands and are time-consuming. To overcome these challenges, we develop a balanced and comprehensive dataset that includes samples from various API levels for both malicious and benign applications. Additionally, we devise a feature selection method that aims to maximize accuracy while minimizing the time required for feature selection.

METHODOLOGY
The primary objective of this study is to construct a robust and efficient machine-learning model that uses permissions and intents to detect malware on Android devices. Figure 1 illustrates the general methodology of the study.

Preparing dataset
A dataset is a structured collection of data samples assembled and prepared for training and testing machine-learning models, here to classify Android applications as either benign or malicious. Figure 2 illustrates the main steps for creating the dataset.

Sample analysis
After collecting samples of both malware and benign apps, we used the Static Dynamic Hybrid Feature Extraction (SDHFE) tool to reverse engineer and decompile them into their source files. SDHFE is a lightweight, automated tool developed by the authors to analyze Android applications and extract features from them. It runs on the Linux operating system and is easy to use, allowing researchers to generate profiles of the analyzed applications based on selected features. It can extract permissions and intents from manifest files, APIs and opcodes from source code, and system calls from the application's behavior during execution. Notably, it can analyze applications in bulk and generate their profiles without human intervention. Furthermore, the tool can generate profiles that include features from several sources at the same time, such as profiles based on permissions and APIs together. For this study, we use the tool to extract static features, namely permissions and intents.

Feature extraction
The feature extraction process begins immediately after each sample is decompiled. The SDHFE tool extracts features primarily from the AndroidManifest.xml file, which serves as the main source, and generates a profile for each analyzed sample. Our analysis focuses on two sets of static features: FS1, representing permissions (both native and custom), and FS2, representing intents. Throughout our study, we extracted over 500,000 features associated with both benign and malware samples. The process of analyzing and extracting features from a single sample using the SDHFE tool is depicted in Figure 3.
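To make the extraction step concrete, the following is a minimal sketch of reading permissions (FS1) and intent actions (FS2) from an already decoded, plain-text AndroidManifest.xml. It is not the SDHFE tool: the function name `extract_profile`, the profile layout, and the sample manifest are hypothetical, and the sketch assumes that permissions are taken from `uses-permission` elements and intents from `action` elements.

```python
import xml.etree.ElementTree as ET

# Namespace that qualifies android:* attributes in a decoded manifest.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"


def extract_profile(manifest_xml: str) -> dict:
    """Return the permissions (FS1) and intent actions (FS2) found in a manifest."""
    root = ET.fromstring(manifest_xml)
    permissions = set()
    # Assumption: requested permissions, native or custom, appear as <uses-permission>.
    for node in root.iter("uses-permission"):
        name = node.get(NAME_ATTR)
        if name:
            permissions.add(name)
    intents = set()
    # Assumption: intents are the <action> entries of the declared intent filters.
    for node in root.iter("action"):
        name = node.get(NAME_ATTR)
        if name:
            intents.add(name)
    return {"permissions": sorted(permissions), "intents": sorted(intents)}


# Hypothetical manifest used only for illustration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <application>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

profile = extract_profile(SAMPLE_MANIFEST)
print(profile)
```

A binary manifest taken directly from an APK would first have to be decoded to text XML by a decompilation step, which this sketch does not cover.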

Preprocessing
We applied two functions to the extracted data. The first function scans all profiles generated in the feature extraction phase, collects their features, and keeps them in a single list called Featurelist. The second function reads the Featurelist and eliminates redundant features from it. Redundant features are either duplicates of entries already in the Featurelist or features that rarely appear in Android samples. The remaining features are saved into a new list called FilterList and appended to the dataset as the column header.
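The two preprocessing functions can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper name `build_filter_list` and the cutoff `min_count` are hypothetical, and "rarely appears" is read here as "occurs in fewer than `min_count` profiles".

```python
from collections import Counter


def build_filter_list(profiles, min_count):
    """Collect all features, then drop duplicates and rarely used features."""
    # First function: gather the features of every profile into one list.
    feature_list = [feature for profile in profiles for feature in profile]
    # Each profile lists a feature at most once, so this counts profiles per feature.
    counts = Counter(feature_list)
    # Second function: dict.fromkeys removes duplicates while keeping first-seen order.
    return [f for f in dict.fromkeys(feature_list) if counts[f] >= min_count]


# Hypothetical profiles used only for illustration.
profiles = [
    ["android.permission.INTERNET", "android.permission.SEND_SMS"],
    ["android.permission.INTERNET", "android.intent.action.BOOT_COMPLETED"],
    ["android.permission.INTERNET", "android.permission.SEND_SMS",
     "com.example.permission.RARE"],
]
filter_list = build_filter_list(profiles, min_count=2)
print(filter_list)
```

With `min_count=2`, the two features that occur in only one profile are dropped, and the remaining names would become the column header of the dataset.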

Feature vector
After combining the two feature sets FS1 ∪ FS2, a binary feature vector Fv = (f1, f2, …, fn) is generated for each sample according to Eq. (1).

\[ f_i = \begin{cases} 1, & \text{if the } i\text{-th feature is present in the application} \\ 0, & \text{otherwise} \end{cases} \tag{1} \]

The features (permissions and intents) are encoded as 1 to signify their presence in an Android application and as 0 if they are absent. For classification, a class label is added to each feature vector: 1 denotes the "malware" class, while 0 denotes the "benign" class. These binary feature vectors are stored in a CSV file for efficient data organization and processing. Figure 4 provides an example of such a feature vector, displaying the binary representations of the features along with their class labels. The dataset preparation process, encompassing feature extraction and feature vector creation, is detailed in Algorithm 1.

[...] overfitting problem. This study proposes a novel approach called Discriminative Feature Ranking-Mutual Information (DFR-MI), which combines the mutual information algorithm with the discriminative feature ranking method to further improve the accuracy of the model. The approach works in two levels and focuses on selecting the most significant features that differentiate between malicious and benign applications. The following definitions are used throughout both levels.

Let S be the set of malware and benign samples in the dataset, denoted by:

\[ S = \{ s_1, s_2, \ldots, s_{|S|} \} \tag{2} \]

where $s_r$ represents the r-th sample in S and |S| represents the total number of samples in the dataset.
Let F be the set of features used by malware and benign samples in the dataset, denoted by:

\[ F = \{ f_1, f_2, \ldots, f_{|F|} \} \tag{3} \]

where $f_r$ represents the r-th feature and |F| represents the total number of features in the dataset.
Let C be the set of class labels in the dataset, denoted by C = {c1, c2}; here there are only two class labels: malware and benign.
Let CF represent the candidate features selected at level one and passed to level two.
Definition 1 (Feature Frequency, FF): the frequency of each feature in the malware and benign samples of the dataset. Because the presence of a feature in a sample is encoded as 1 and its absence as 0, the frequency of each feature in each class is the sum of its values over the samples of that class:

\[ FF_M(f_r) = \sum_{s \in S_M} f_r(s) \tag{4} \]

\[ FF_B(f_r) = \sum_{s \in S_B} f_r(s) \tag{5} \]

where $S_M$ and $S_B$ denote the malware and benign samples in S, and $f_r(s) \in \{0, 1\}$ is the value of feature $f_r$ in sample s.
Definition 2 (Frequency Difference): to determine whether a feature appears more in malware or in benign samples, the difference is calculated for each feature according to the following equation:

\[ Df_r = FF_M(f_r) - FF_B(f_r) \tag{6} \]

where $Df_r$ is the frequency difference of the feature at index r. The result is a positive, negative, or zero value.
Positive value: the feature is more present in the malware samples.
Negative value: the feature is more present in the benign samples.
Zero value: the feature is equally present in malware and benign samples.

Level one: Discriminative Feature Ranking (DFR)
At this level, the proposed algorithm uses a statistical method to identify, in two steps, whether a specific feature is used more frequently in benign or in malware apps. In the first step, Eqs. (4) and (5) are used to count the frequency of each feature in malware and benign apps separately. Although this step provides valuable insight into which features are used in each category of apps, it may not help to distinguish malicious apps from benign ones. For example, Table 2 shows the top 10 features that appeared most often in malicious and benign samples in our dataset. The 'android.permission.INTERNET' feature appeared 15,594 times in malicious samples, which is nearly 97% of the malicious samples, and 15,536 times in benign samples, which is nearly 97% of the benign samples. The same applies to the 'android.permission.ACCESS_WIFI_STATE' feature, whose percentage is very close in both categories. Such features are unlikely to be among the most useful for distinguishing malicious from benign apps. Therefore, the features need further analysis.
In the second step, Eq. (6) is used to select the more important features: for each feature, its frequency in benign apps is subtracted from its frequency in malicious apps, and the result is stored together with the feature name in a new dataframe called the Discriminative Feature (DF) dataframe. This procedure identifies whether a particular feature is found more frequently in benign or in malicious applications. The ranking of features changes in this step and differs from step one, as shown in Table 3.
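The two steps of level one can be illustrated with a short sketch. It assumes the sign convention of Definition 2 (a positive difference means the feature is more frequent in malware) and uses a tiny made-up dataset; the function names and the shortened feature names are hypothetical.

```python
def feature_frequencies(rows, labels, label):
    """Step one: column sums of the binary rows that carry the given class label."""
    selected = [row for row, y in zip(rows, labels) if y == label]
    return [sum(column) for column in zip(*selected)]


def frequency_difference(rows, labels):
    """Step two: malware frequency minus benign frequency for every feature."""
    ff_malware = feature_frequencies(rows, labels, 1)
    ff_benign = feature_frequencies(rows, labels, 0)
    return [m - b for m, b in zip(ff_malware, ff_benign)]


# Made-up binary feature vectors; label 1 = malware, 0 = benign.
features = ["INTERNET", "SEND_SMS", "VIBRATE"]
rows = [
    [1, 1, 0],  # malware
    [1, 1, 0],  # malware
    [1, 0, 1],  # benign
    [1, 0, 0],  # benign
]
labels = [1, 1, 0, 0]

diff = dict(zip(features, frequency_difference(rows, labels)))
# Rank by absolute difference so strongly class-specific features come first.
ranking = sorted(diff, key=lambda name: abs(diff[name]), reverse=True)
print(diff, ranking)
```

In this toy data INTERNET is present in every sample, so its difference is zero and it drops to the bottom of the ranking, which mirrors the observation about 'android.permission.INTERNET' in step one.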
Generally, level one provides insight into the features that are used more by malicious apps and by benign apps. However, frequent features are not necessarily the most informative for the prediction model: feature counting considers only the frequency of individual features across the dataset and does not take into account the relationships or dependencies between features. This can lead to the inclusion of irrelevant or redundant features in the model. Therefore, we treat level one as a filter that uses specific thresholds to narrow the features down to the most common ones in both categories. This reduces the search space and the computational complexity at the next level. In this study, we use two thresholds, α1 and α2. The value of α1 represents the number of features in F that appear more in malware apps than in benign apps. The value of α2 represents the average frequency of all features that occur most frequently in benign samples. The DF dataframe was filtered based on α1 and α2, and the remaining features were saved to a new list known as the Candidate Features (CF) list. Finally, the CF list became the input to the next level. Figure 5 illustrates the candidate features from malware and benign samples passed to level two.

Level two: Mutual Information (MI)
Mutual information is a technique that can be used for weighing variables. It is widely used in machine-learning problems to assess the mutual dependence of two random variables. MI is non-negative; for two binary variables and a base-2 logarithm, it ranges from 0 to 1. The maximum value of MI indicates a strong dependence between the two variables, and a value of 0 indicates that the two variables are independent. The mutual information formula, in its standard form, is:

\[ MI(CF; C) = \sum_{cf_i \in CF} \sum_{c_j \in C} p(cf_i, c_j) \log \frac{p(cf_i, c_j)}{p(cf_i)\, p(c_j)} \tag{7} \]

In our study, MI is used to measure the relevance of the features received from the CF list of the first level. The variable CF indicates whether the feature $cf_i$ appears in an application, and C represents the class label of the application (malware or benign). Here, $p(CF = cf_i)$ is the probability that the variable CF takes the value $cf_i$, $p(C = c_j)$ is the probability that C takes the value $c_j$, and $p(cf_i, c_j)$ is their joint probability. Based on this formula, the value MI(CF, C) of each feature is obtained. Table 4 presents the top 10 features selected by the mutual information algorithm.
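As a worked illustration of the mutual information score, the sketch below estimates MI between one binary feature and the class label from empirical frequencies, using the standard definition with a base-2 logarithm. The function name and the toy vectors are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """MI in bits between two discrete sequences, estimated from joint frequencies."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_feature = Counter(feature)
    count_label = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = count_feature[x] / n
        p_y = count_label[y] / n
        # Only observed (x, y) pairs are visited, so p_xy is never zero here.
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


labels = [1, 1, 1, 1, 0, 0, 0, 0]    # 1 = malware, 0 = benign
perfect = [1, 1, 1, 1, 0, 0, 0, 0]   # present exactly in the malware samples
useless = [1, 0, 1, 0, 1, 0, 1, 0]   # equally frequent in both classes

mi_perfect = mutual_information(perfect, labels)
mi_useless = mutual_information(useless, labels)
print(mi_perfect, mi_useless)
```

A feature that fully determines the class reaches the maximum of 1 bit for a balanced binary label, while a feature that is equally frequent in both classes scores 0.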
The combination of DFR and MI can improve the overall performance of the feature selection process by eliminating redundant features and selecting only significant features for the model. Consequently, this refinement results in more precise predictions and reduces the computational complexity of the model.
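How the two levels compose can be sketched as below. This is a simplified illustration, not the authors' implementation: the thresholds α1 and α2 are replaced by a single hypothetical cutoff `min_abs_diff` on the absolute frequency difference, and the function names, feature names, and data are made up.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """MI in bits between two discrete sequences (standard empirical estimate)."""
    n = len(ys)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def dfr_mi_select(names, rows, labels, min_abs_diff, top_k):
    """Level one filters by frequency difference; level two ranks survivors by MI."""
    columns = list(zip(*rows))
    candidates = []
    for name, column in zip(names, columns):
        ff_malware = sum(v for v, y in zip(column, labels) if y == 1)
        ff_benign = sum(v for v, y in zip(column, labels) if y == 0)
        # Hypothetical stand-in for the thresholds alpha1 and alpha2.
        if abs(ff_malware - ff_benign) >= min_abs_diff:
            candidates.append((name, column))
    scored = [(mutual_information(column, labels), name) for name, column in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


names = ["INTERNET", "SEND_SMS", "VIBRATE", "READ_SMS"]
rows = [
    [1, 1, 0, 1], [1, 1, 0, 1], [1, 1, 0, 1], [1, 0, 1, 1],  # malware
    [1, 0, 1, 0], [1, 0, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0],  # benign
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

selected = dfr_mi_select(names, rows, labels, min_abs_diff=2, top_k=2)
print(selected)
```

The point of the design is that MI is computed only for the candidates that survive level one, which is where the reduction in search space and selection time comes from.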

EXPERIMENTAL ENVIRONMENT AND RESULT ANALYSIS
In our study, we used a machine running 64-bit Windows 10 with an Intel(R) Core(TM) i5-2320 CPU @ 3.00 GHz, an NVIDIA Quadro GPU with 4 GB of memory, and 16 GB of RAM. The GPU was used to accelerate the execution of the machine-learning algorithms. We implemented our code on the Anaconda platform using Python 3; the main libraries used in this work are pandas, NumPy, scikit-learn, TensorFlow, matplotlib, and seaborn.

Machine learning and splitting dataset
We used nine machine-learning algorithms (RF, DT, SVM, KNN, LR, NB, AdaBoost, Gradient Boosting, and ANN) for training and testing on our dataset to find a good model for detecting malware on smartphones. During the learning phase, the hyperparameters of each algorithm were set to specific values. Table 5 lists the hyperparameters used for each algorithm.
To improve the performance of these models and lower the risk of overfitting, the mutual information (MI) and DFR-MI algorithms were used to choose important features. On this basis, we conducted two experiments to compare the performance of the two feature selection algorithms and their effect on the predictive models. In the first experiment, the mutual information algorithm chose 75 features and passed them to the nine machine-learning algorithms for training.
In the second experiment, the same process was followed, but the DFR-MI algorithm was used instead of the mutual information algorithm to choose the same number of features. The scores of each experiment are shown in Table 6.
The dataset was divided into two sets: 80% of the dataset was used for training and 20% for testing the models. In general, it is recommended to use as much data as possible for training to maximize the performance of the models while still reserving enough data for testing to obtain reliable estimates of their performance. The performance of each model was evaluated with common metrics, namely accuracy, precision, recall, and F1 score, to determine the best model for Android malware detection. Comparing the results of the two experiments, we found that RF and DT with the DFR-MI algorithm achieved the highest training accuracy, 98.9%. On the test set, RF with the DFR-MI algorithm outperformed all other algorithms in terms of accuracy, precision, recall, and F1 score, with 98.52%, 98.62%, 98.41%, and 98.52%, respectively. This gives the RF algorithm the best average score, 98.52%. The worst result for most evaluation metrics was obtained by NB with the mutual information algorithm. The average scores of each algorithm in Table 6 are plotted in Figure 6.
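The 80/20 split and the four evaluation metrics can be reproduced in outline with scikit-learn, which is among the libraries listed above. The sketch below is illustrative only: it uses synthetic binary data with a made-up labelling rule instead of our dataset, and scikit-learn's default random forest settings rather than the hyperparameters of Table 5.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for the binary permission/intent matrix (not the real data).
X = rng.integers(0, 2, size=(1000, 20))
# Made-up rule: label 1 ("malware") if features 0 and 1 co-occur or feature 2 is set.
y = (X[:, 0] & X[:, 1]) | X[:, 2]

# 80% of the samples for training, 20% for testing, keeping the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(scores)
```

Swapping `RandomForestClassifier` for another estimator with the same `fit`/`predict` interface is enough to repeat the comparison for the other algorithms.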

Performance comparison between our dataset and the benchmark dataset
During our literature review, we observed that the Drebin dataset is favored by many researchers and consistently yields high accuracy in the detection of malicious Android applications. Building on this precedent, we selected the Drebin dataset as a benchmark and conducted a comparative analysis with our own dataset. Visualizations of our dataset and the Drebin dataset are shown in Figures 8 and 9.
The Drebin dataset dates back to 2012 and contains four feature types: API call signatures, permissions, command signatures, and intents. It has 15,036 samples, of which 5,560 are malware and 9,476 are benign. Our dataset consists of 32,170 samples, distributed equally between malware and benign samples. It has 209 columns: 208 columns represent features and 1 column represents the class label. The 208 features are distributed between permissions and intents, and permissions can be native or custom: there are 102 native permissions, 60 custom permissions, and 46 intents. To compare our dataset with the Drebin dataset, we removed the API call signature and command signature feature categories from the Drebin dataset, because our dataset contains only two feature categories: permissions and intents. After this removal, 136 features remain in the Drebin dataset (23 intents and 113 permissions). In this way, we can make a fair comparison between the two datasets with respect to their effectiveness in detecting Android malware based on permissions and intents. We then applied the MI and DFR-MI feature selection algorithms to select the top 75 features in both datasets and passed them to the nine machine-learning algorithms. The scores each algorithm achieved on the Drebin dataset are shown in Table 8. According to these scores, most algorithms achieved high accuracy, precision, recall, and F1 score with the DFR-MI feature selection algorithm on the Drebin dataset. To compare the performance on our dataset with that on the Drebin dataset, we summarize the average scores of each machine-learning algorithm with the DFR-MI algorithm on both datasets in Table 9. The average scores in Table 9 are plotted in Figure 10.

We observed that the samples in the Drebin dataset are outdated and imbalanced and lack features used by modern malicious samples. These issues were addressed in the development of the new dataset, resulting in improved outcomes when the machine-learning algorithms were trained on it. For example, the feature android.permission.REQUEST_INSTALL_PACKAGES ranked among the top fifteen features in the developed dataset and can be abused by malicious applications to deceive users into installing harmful apps on their devices. Notably, this feature is absent in the older API levels covered by the Drebin dataset.
Overall, we found the random forest algorithm with DFR-MI feature selection to be the best performer on our dataset. We therefore chose this algorithm to build the model for identifying malicious apps on smartphones.

Analyzing the confusion matrix of the random forest algorithm
A classification model's performance can be assessed by counting the number of testing samples that the model predicts correctly and incorrectly. A confusion matrix is a table that displays these counts. Figure 11 shows the confusion matrix of the random forest algorithm for binary classification. The total number of correctly classified samples equals the sum of the main diagonal of the matrix, while the total number of incorrectly classified samples equals the sum of the secondary diagonal.
As illustrated in Figure 11, 3,165 malware samples are correctly classified as malware, while the predictive model misclassifies 44 benign samples as malware. On the other hand, 3,173 samples are correctly classified as benign, while 51 malware samples are incorrectly classified as benign. Overall, the error rate of the proposed model is about 0.0148 (95 misclassified samples out of 6,433).
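The metrics follow directly from the four counts read from Figure 11. The short calculation below uses the standard definitions; the variable names are illustrative.

```python
# Counts reported for the random forest confusion matrix (Figure 11).
tp = 3165  # malware classified as malware
fp = 44    # benign classified as malware
tn = 3173  # benign classified as benign
fn = 51    # malware classified as benign

total = tp + fp + tn + fn
accuracy = (tp + tn) / total            # main diagonal over all samples
error_rate = (fp + fn) / total          # secondary diagonal over all samples
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("f1", f1), ("error rate", error_rate)]:
    print(f"{name}: {value:.4f}")
```

These counts give an accuracy, recall, and F1 score that round to the reported 98.52%, 98.41%, and 98.52%; precision comes out at roughly 98.63%, which agrees with the reported 98.62% up to rounding.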

CONCLUSION
This study contributes to the field of Android malware detection. By fusing native and custom permissions with intents, we created a new dataset that is extensive and comprehensive and that covers samples developed from API level 1 to API level 32. Extensive experimentation and evaluation with nine machine-learning algorithms were conducted to compare the performance of this dataset against the Drebin benchmark dataset. Owing to its comprehensiveness, the developed dataset consistently outperformed the Drebin dataset across all predictive models by at least 2.22%. Additionally, a novel feature selection algorithm, DFR-MI, was proposed, which outperformed the mutual information algorithm in both accuracy and time efficiency across the nine predictive models. The DFR-MI algorithm markedly reduced the training and testing time during the model construction phase. The findings of this study have significant implications for enhancing mobile security. Precise identification of malicious apps protects user privacy and defends against threats; developers working on Android security can benefit from the developed model by feeding it the permissions and intents extracted from real applications to predict whether they are malicious. Additionally, the dataset and feature selection algorithm introduced in this study have the potential to advance the development of more effective malware detection systems. Despite these contributions, the proposed model may produce false alarms when the tested application does not include any permissions or intents. Future research should address this limitation by exploring additional feature sets or integrating dynamic analysis techniques to further enhance Android malware detection accuracy.

Figure 1. Schematic diagram of the proposed methodology

Figure 3. Process of sample analysis and feature extraction

Figure 5. Number of candidate features

Figure 6. Average scores of the two experiments

Training and testing model duration time
In this subsection, we measured the time needed by each model for training and testing on 75 features twice: once with the DFR-MI algorithm and once with the mutual information algorithm for feature selection. The DFR-MI algorithm not only increased the model's accuracy but also helped reduce the time needed for training and testing the model. Table 7 shows the duration in seconds for each algorithm for training and testing the model with the DFR-MI and MI algorithms. The data in Table 7 are plotted in Figure 7, which shows that the time taken to train and test any model with the proposed DFR-MI algorithm is reduced by at least 14 seconds compared to the MI algorithm.

Figure 7. Duration time in seconds for training and testing models with DFR-MI and MI

Figure 10. Average scores on both datasets based on the DFR-MI

Figure 11. Confusion matrix for random forest

Table 1. Details of collected samples
App categories listed in the table: Art and design, Beauty, Book, Business, Education, Financial, Communication, Entertainment, Health, Medical, Music and Audio, News, Shopping, Social.

Algorithm 1. Dataset preparation (feature extraction and feature vector creation)
For each sample ∈ Malware, Benign do:
    Decompile sample using the SDHFE tool
    ProfileList.Append(profile for sample based on features extracted from the AndroidManifest.xml)
End For
For each profile ∈ ProfileList do:
    For each feature ∈ profile do:
        Featurelist[index] = feature
        Increment index by one
    End For
End For
// Remove duplicated features from Featurelist and filter out the features that rarely appear in samples
For each feature ∈ Featurelist do:
    // th is a threshold representing the number of times the feature appears in all samples
    FilterList.Append(feature) if feature Not in FilterList And counting(feature) >= th
End For
DSCSV[rowindex].Append(FilterList)
// Generate a feature vector for each profile and append it to the dataset
For each profile ∈ ProfileList do:
    Fvlist.Clear; indexfv = 0; increment rowindex by one
    For each feature ∈ FilterList do:
        If feature ∈ profile
            Fvlist[indexfv] = 1
        Else
            Fvlist[indexfv] = 0
        End If
        Increment indexfv by one
    End For
    // indexfv now points at the slot after the last feature, which holds the class label
    If profile ∈ malware sample
        Fvlist[indexfv] = 1
    Else
        Fvlist[indexfv] = 0
    End If
    DSCSV[rowindex].Append(Fvlist)
End For
End
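The last part of Algorithm 1, turning each profile into a binary feature vector with a trailing class label and writing the rows to CSV, might look as follows in Python. This is a sketch under assumptions: the function names, the "class" column name, and the example profiles are hypothetical, and an in-memory buffer stands in for the CSV file.

```python
import csv
import io


def to_feature_vector(profile, filter_list, is_malware):
    """Binary vector over filter_list plus a trailing class label (1 = malware)."""
    vector = [1 if feature in profile else 0 for feature in filter_list]
    vector.append(1 if is_malware else 0)
    return vector


def write_dataset(profiles, labels, filter_list, handle):
    """Write the header (FilterList plus label column) and one row per profile."""
    writer = csv.writer(handle)
    writer.writerow(list(filter_list) + ["class"])
    for profile, is_malware in zip(profiles, labels):
        writer.writerow(to_feature_vector(profile, filter_list, is_malware))


# Hypothetical inputs used only for illustration.
filter_list = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.intent.action.BOOT_COMPLETED",
]
profiles = [
    {"android.permission.INTERNET", "android.permission.SEND_SMS"},  # malware
    {"android.permission.INTERNET"},                                 # benign
]
labels = [True, False]

buffer = io.StringIO()
write_dataset(profiles, labels, filter_list, buffer)
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

Replacing the buffer with a file opened via `open(path, "w", newline="")` would produce the CSV file on disk.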

Table 2. Top 10 features in malware and benign samples after the first step of DFR

Table 3. Top 10 features in malware and benign samples after the second step of DFR

Table 4. Top 10 features selected by MI algorithm

Table 5. Hyperparameters used by each algorithm in the experiments

Table 6. Results of two experiments with MI and DFR-MI feature selection algorithms

Table 7. Duration time in seconds for training models with DFR-MI and MI

Table 8. Results of nine ML algorithms with MI and DFR-MI feature selection algorithms on the Drebin dataset

Table 9. Average scores on both datasets based on the DFR-MI algorithm