Sentiment Analysis: Classifying Public Comments on YouTube in Disaster Management Simulation in Indonesia Using Naïve Bayes and Support Vector Machine



INTRODUCTION
A disaster is a sudden event caused by natural, non-natural, or social factors that affects the community and the surrounding environment. In general, disasters arise from several components, namely triggers, threats or hazards, and vulnerabilities, that work together systematically to create risk for the community [1,2]. Disaster risk can be reduced through systematic efforts to analyze and reduce the causal factors of disasters, one of which is building an understanding of disaster risk reduction. Outreach on disaster awareness and preparedness is one of the non-structural disaster mitigation efforts that must be strengthened by increasing awareness, preparedness, and public education on disaster mitigation so that it becomes entrenched in society. Social media, a product of the development of information and communication technology, is widely used today and can serve as an effective tool for education and outreach to the public. It is increasingly used for socialization, simulation, learning, and collaboration, and it can offer helpful information for containing catastrophe risk and responding to emergencies [3,4]. Of the various types of social media sites, online video-sharing applications have been shown to have the highest interactive rate [5].
YouTube is the second-largest search engine behind Google and one of the most significant video-sharing platforms in the world [6]. YouTube offers dynamic new opportunities for practical learning models compared to traditional ones [7].
YouTube is the largest online digital video channel, with over two billion users, and more than one billion hours of YouTube videos are watched daily [8]. To reduce the risks caused by disasters in Indonesia, the government, through BNPB (the National Agency for Disaster Countermeasures), regularly provides educational outreach to the public in various ways, such as counseling, training or workshops, and direct simulations, as well as through mass, electronic, and social media [9].
One form of social media outreach that BNPB has carried out is its YouTube channel, which can be accessed at https://www.youtube.com/@BNPBIndonesia/video. In addition to covering BNPB's activities, the channel contains procedures and steps related to disaster preparedness for the community. Dissemination of disaster management through the YouTube platform can effectively convey information and education to the public. However, to make this dissemination more effective, it is necessary to see how the public responds to the content presented. For this reason, this research classifies public comments on YouTube channels about disaster management simulations in Indonesia through sentiment analysis. Sentiment analysis is an approach that makes it possible to obtain condensed viewpoints from particular sources that contain substantial volumes of data [10]. It identifies, recognizes, and/or categorizes user emotions or thoughts about any service, such as movies, product issues, events, or individual attributes, as positive, negative, or neutral [11,12]. Many techniques can be employed to classify each attribute, such as the Naïve Bayes Classifier, Support Vector Machine, K-NN, RNN, C4.5, Lexicon, and LDA-Based Topic Modeling [13]. Existing literature on sentiment analysis, especially related to social media, explains that the Naïve Bayes Classifier (NBC) is a simple method that yields highly accurate classification results, with the accuracy level influenced by the amount of test data [14]. The NBC algorithm is well suited to text analysis but not to data in the form of images, whereas the Support Vector Machine (SVM) is reported to produce high accuracy on data in the form of text or images.
In dynamic disaster emergencies, understanding community reactions and perceptions is key to designing a rapid and effective response. Sentiment analysis can help policy makers identify and understand how society responds to disaster situations. Maharani [15] used sentiment analysis to classify the relevance of tweet data for disaster emergency response situations during floods. Li et al. [16] used sentiment analysis to collect public opinion via social media in order to inform decision makers in urban disaster management and sustainable development.
Social media involvement has a significant impact on disaster management. Several studies have demonstrated the value and usefulness of YouTube as a data storage and dissemination platform, but no research has examined the extent to which video content presented on YouTube can give the community an effective understanding that improves disaster preparedness.
Against this background, this research performs sentiment analysis using two types of approaches, namely in the form of text and images, with NBC and SVM to classify public comments submitted via YouTube channels in order to observe the effectiveness of video content presented on YouTube in disaster preparedness simulations.

LITERATURE REVIEW AND RELATED WORKS
This section reviews the literature on the basic concepts of sentiment analysis and the classification methods used, i.e., Naïve Bayes (NBC) and Support Vector Machine (SVM).

Sentiment analysis
Sentiment analysis, also called opinion mining, is a computational study that identifies and expresses subjectivity, judgments, opinions, feelings, assessments, attitudes, and emotions in a text [17]. According to Parabhoi and Saha [18], sentiment analysis is related to the automatic extraction of information from text, and these sentiments can be categorized as positive, negative, or neutral, or on an n-point scale [19,20]. Sentiment analysis studies user opinions, assessments, attitudes, and feelings expressed on social media or other online platforms [21]. Its purpose is to extract the attributes and components of comments on social media and assign them to positive, negative, and neutral classes. The results of sentiment analysis can provide an overview of customers so that a company can determine the next steps in developing brands and products. In addition, in the context of government policy, the results can be used to develop strategies so that policies are accepted and/or refined to improve public services.

Classification method
Classification is one of the primary subjects of data mining and machine learning [22]. When data is used for classification, it is grouped according to a label or target. Classification is the process of finding a model (or function) that distinguishes classes of data or concepts so that it can be used to predict the class of an object whose class label is unknown. The model is found by analyzing training data (data objects whose class is known). The classification process can be carried out after a relevance analysis that determines which attributes are relevant to the process. The predictive accuracy of a classifier can be stated in a contingency table or confusion matrix. As part of supervised learning, classification involves analyzing a data set and then using the patterns found to classify the test data. The data classification process has two steps, learning and classification: in the learning step, a classification algorithm analyzes the training data, and in the classification step, test data is used to verify the accuracy of the classification rules. Based on distinctions in their mathematical ideas, classification approaches are categorized into five groups: rule-based, statistical-based, distance-based, decision tree-based, and neural network-based. There are many algorithms in each of these categories, but popular and frequently used ones include naive Bayes, nearest neighbor, decision tree, and support vector machine [23].

Naïve Bayes Classifier (NBC)
Naïve Bayes is a data classification method based on probability [24]. It relies on Bayes' theorem, which predicts future probabilities based on previous experience. The theorem is combined with the "naïve" assumption that the attributes are mutually independent given the class. The advantages of this method are that it can be used to classify quantitative and qualitative data, does not require a large amount of data, and can be used for classification into two or more classes (multiclass). The algorithm uses the following equation:

$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)} \quad (1)$$

In Eq. (1), c is the class label and x is the applied attribute; P(c) is the prior probability of the class, P(x|c) is the probability of the attribute given the class, and P(c|x) is the resulting posterior probability of the class. In the naïve Bayes classification, the data set to be processed is categorized into three classes: positive, negative, and neutral [25].
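
To make the Bayes rule above concrete, the following is a minimal sketch of a multinomial naïve Bayes classifier over tokenized comments. It is an illustration under stated assumptions, not the implementation used in this study: the function names `train_nb` and `predict_nb`, the add-one (Laplace) smoothing, and the toy Indonesian comments are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Estimate log P(c) and Laplace-smoothed log P(x|c) from tokenized comments."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_counts.items()}
    log_like = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        # Add-one smoothing (an assumption) keeps unseen (word, class) pairs
        # from driving the whole product to zero.
        log_like[c] = {
            w: math.log((word_counts[c][w] + 1) / (total + len(vocab)))
            for w in vocab
        }
    return log_prior, log_like


def predict_nb(tokens, log_prior, log_like):
    """Pick the class maximizing log P(c) + sum_i log P(x_i|c).

    P(x) is the same for every class, so it can be dropped from the comparison.
    Words outside the training vocabulary are ignored.
    """
    scores = {
        c: log_prior[c] + sum(log_like[c][w] for w in tokens if w in log_like[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Hypothetical toy comments (already tokenized); not taken from the study's dataset.
docs = [
    ["video", "bagus", "bermanfaat"], ["simulasi", "bagus"],
    ["video", "buruk"], ["suara", "buruk", "jelek"],
    ["kapan", "simulasi", "berikutnya"], ["lokasi", "simulasi"],
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

log_prior, log_like = train_nb(docs, labels)
print(predict_nb(["bagus"], log_prior, log_like))  # positive
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which is why the sketch sums logarithms instead of multiplying probabilities directly.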

Support Vector Machine (SVM)
The Support Vector Machine algorithm belongs to the supervised learning category, which means that the machine learns from data that has already been labeled. During decision-making, the machine assigns labels to the test data based on its attributes [26,27]. The main idea of the SVM method is to optimize a hyperplane, a boundary that separates the support vectors of one class from those of the other class [28]. The support vectors, particularly those that lie closest to the other class, serve as the benchmark for the classification limits so that the hyperplane is optimal [29]. These vectors come from a dataset that is converted into vector values through vectorization after the feature extraction process. For example, the training dataset consists of x and y in the form $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, where x is called a vector and y is the class label.

Kernel in Support Vector Machine (SVM)
The kernel in the Support Vector Machine method defines the separator between one class and another. Several kernels are available for support vector machines, including Linear, Polynomial, and Radial Basis Function (RBF) [30].

a. Linear
Linear kernels use straight lines as the boundary/hyperplane between classes. The linear kernel only requires two variables, $x_i$ and $x_j$, which are vectors resulting from the vectorization of the feature extraction weight values. In the calculation, the vector $x_i$ is transposed before being multiplied by $x_j$:

$$K(x_i, x_j) = x_i^{\top} x_j \quad (2)$$

The same holds when the linear kernel is used for labeling results in the target class.

b. RBF
The RBF kernel requires the parameters gamma and C. Gamma controls the decision boundary and decision area. For example, if gamma is small, the decision boundary will be small, but the decision area will be vast, and vice versa. The gamma value used must be greater than zero. C serves as a penalty against errors in classification. The vectors x are taken from the vectorization results, and exp denotes the exponential function applied to the calculation involving x and gamma:

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right) \quad (3)$$

c. Polynomial
The polynomial kernel has two parameters that the other kernels do not. Parameter r is an independent (constant) term; the kernel is called homogeneous if r is set to zero. The parameter d is the degree, which is generally set to 2:

$$K(x_i, x_j) = \left(x_i^{\top} x_j + r\right)^d \quad (4)$$
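
The three kernels described in this subsection can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names and default parameter values are assumptions, and the polynomial form shown here uses just the r and d parameters named in the text (some libraries also scale the inner product by gamma).

```python
import math


def dot(x, y):
    """Inner product x^T y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """Linear kernel: the plain inner product of the two feature vectors."""
    return dot(x, y)


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance); gamma must be > 0."""
    if gamma <= 0:
        raise ValueError("gamma must be greater than zero")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, r=0.0, d=2):
    """Polynomial kernel (x^T y + r)^d; r = 0 gives the homogeneous case."""
    return (dot(x, y) + r) ** d


# Hypothetical two-dimensional feature vectors, e.g. two term weights per comment.
u, v = [1, 2], [3, 4]
print(linear_kernel(u, v))               # 11
print(polynomial_kernel(u, v, r=1, d=2)) # 144
print(rbf_kernel(u, u))                  # 1.0
```

The RBF value is 1 for identical vectors and decays toward 0 as they move apart, with gamma setting how quickly that decay happens.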

Confusion matrix
The confusion matrix is a performance indicator for machine learning classification that allows the output to contain more than two classes [31]. It is a table for recording the results of classification work [32,33]. A classification model's performance is assessed using an n × n confusion matrix, where n is the number of target classes, by comparing the actual values with the predicted values produced by the model. Table 1 below is an example of a confusion matrix.
Precision is the level of accuracy of the model's predictions for the positive class. It is the number of true positives (TP) compared to the total number of instances classified as positive by the model. The precision value can be obtained with the following equation:

$$\text{Precision} = \frac{TP}{TP + FP} \quad (6)$$

Recall is the number of true positive values (TP) compared to the number of actual positive values. Eq. (7) below is the formula used to get the recall value.

$$\text{Recall} = \frac{TP}{TP + FN} \quad (7)$$
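
As a worked illustration of these measures, the sketch below counts TP, FP, FN, and TN for one class treated as "positive" against all others, then derives accuracy, precision, and recall from the counts. The function names and the six example labels are hypothetical and are not results from this study.

```python
def binary_counts(actual, predicted, positive):
    """Count TP, FP, FN, TN, treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1  # predicted positive, actually another class
        elif a == positive:
            fn += 1  # actually positive, predicted as another class
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall); ratios with a zero denominator become 0.0."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical labels for six comments.
actual = ["positive", "positive", "positive", "negative", "neutral", "negative"]
predicted = ["positive", "positive", "negative", "positive", "neutral", "negative"]

counts = binary_counts(actual, predicted, positive="positive")
print(counts)  # (2, 1, 1, 2)
accuracy, precision, recall = metrics(*counts)
```

With these counts, accuracy is 4/6 and both precision and recall are 2/3. For a three-class problem such as this one, the same counting can be repeated with each class in turn as the class of interest.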

Related works
Several previous studies have discussed sentiment analysis of comments on social media using various methods. Rajkumar et al. [34] used supervised learning to detect feelings on Twitter, comparing two algorithms, K-nearest neighbor (KNN) and Naïve Bayes (NB), and found that Naïve Bayes achieved higher accuracy than KNN. Bamane et al. [35] conducted a sentiment analysis based on the number of likes and dislikes. Khan et al. [36] used the Naïve Bayes algorithm for multi-label classification to understand the behavior and responses of individuals after viewing specific videos on YouTube. Bhuiyan et al. [37] focused on improving the YouTube video capture process by analyzing the sentiment of YouTube comments to find the most relevant videos for a search. Barzenji [38] used three algorithms (support vector machine, random forest, and Naïve Bayes) to study the subjective polarity (positive, negative, and neutral) of Twitter tweets; SVM had an accuracy of 82%, compared to 72% for random forest and Naïve Bayes. Kavitha et al. [39] categorized user comments on YouTube based on video relevance using a comparative analysis with the Bag of Words and Association List approaches. Tanesab et al. [40] conducted a sentiment analysis of netizen comments on YouTube using a Support Vector Machine (SVM) with a positive rate of 91.1%. These studies share the objective of conducting sentiment analysis, whereas the present research looks at the effectiveness of the media used for the socialization of disaster management in Indonesia, based on comments from the public.

METHODOLOGY
This section discusses the stages and steps taken to realize the proposed research. The stages are data collection, labeling, preprocessing, and classification. The research began by collecting data using a crawling technique to gather community comments on various disaster simulation videos on YouTube, with the aim of determining how effective the media is in the educational process, for both simulation and socialization. A complete description of the stages is shown in Figure 1.

Data collection
Our main goal is to identify video content that contains guidelines, simulations, and education about the disaster preparedness process. The initial step is to enter a search in the browser with the following query: https://www.youtube.com/results?search_query=bnpb+simulasi+bencana+mitigation+disaster+animation+simulation+disaster. After obtaining the search results, we reviewed the videos one by one, selecting those whose type matched the topic of this research and taking into account the number of public comments on each video. For the selection process, we designed a web-based sentiment analysis application to facilitate crawling, labeling, classification, and analysis. Crawling is done by entering the videoID in the application; after the request is sent, YouTube responds by sending the requested web page to the crawler, which then extracts the information needed for the next stage. Table 2 below shows the dataset obtained from crawling various types of disaster simulation videos.
Based on the web crawling results, there were 33 videos related to disaster simulation, with a total of 696 comments. Figure 2(a) shows the main interface of the sentiment analysis application, while Figure 2(b) shows the train dataset page.

Labelling
The labeling process in sentiment analysis involves assigning specific sentiment labels, such as positive, negative, or neutral, to the texts to be analyzed. The goal of this process is to classify the text based on the sentiment expressed within it. Out of the 696 comments obtained, 204 comments were successfully labeled. This was due to the presence of many redundant comments. The labeled comments consist of 112 positive comments, 43 negative comments, and 49 neutral comments. Figure 3 illustrates an example of the labeling process.

Preprocessing
Before sentiment analysis is carried out on the YouTube comment data, the data must be processed so that it is ready for analysis. Preprocessing is a data mining technique that transforms unprocessed data into a processable data structure. Freshly retrieved raw data is often incomplete, inconsistent, and contains many errors. The preprocessing stages are cleaning, case folding, tokenizing, stopword removal, and stemming.

Cleaning
The comment data on each video needs to be fully cleaned and prepared before applying any classification algorithm. Within each text, many elements (mentions, hashtags, emoticons, punctuation, extra spaces, unconventional symbols) have no value for classification and must be removed (filtered). One of the biggest advantages of this step is that it makes the data smaller, saving storage capacity.

Case folding
Case folding helps ensure that the same words written with different uppercase and lowercase letters are treated as one entity. It avoids situations where the same word is perceived as two different words because of differences in letter case.

Stopword removal
Stopword removal is the process of removing words that appear in the stopword list. Stopwords, such as prepositions, interjections, and pronouns, appear in large numbers and serve a grammatical function but carry little meaning.

Stemming
Stemming is the process of reducing words to their root words by removing prefixes or suffixes, including affixed conjunctions, pronouns, and prepositions. The main purpose of stemming is to reduce the variation of words with the same root word so that they can be counted as one entity in text analysis.
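
The preprocessing stages described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions rather than the application built for this study: the stopword set is a tiny hypothetical sample, and the `stem` function is a toy suffix stripper standing in for a proper Indonesian stemmer.

```python
import re

# Hypothetical, deliberately tiny stopword list (Indonesian and English samples).
STOPWORDS = {"yang", "dan", "di", "ini", "itu", "the", "is", "a"}
# Toy suffix list; a real stemmer would also handle prefixes and more suffixes.
SUFFIXES = ("nya", "kan", "lah")


def clean(text):
    """Remove URLs, mentions, hashtags, and every non-letter character."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[@#]\w+", " ", text)
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def case_fold(text):
    """Lowercase the text so that 'BAGUS' and 'bagus' are one entity."""
    return text.lower()


def tokenize(text):
    """Split the text into word tokens on whitespace."""
    return text.split()


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def stem(token):
    """Strip one known suffix if at least three characters remain (toy rule)."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Cleaning -> case folding -> tokenizing -> stopword removal -> stemming."""
    tokens = tokenize(case_fold(clean(text)))
    return [stem(t) for t in remove_stopwords(tokens)]


# Hypothetical raw comment.
comment = "Videonya BAGUS!!! @BNPB #siaga https://t.co/x dan bermanfaat :)"
print(preprocess(comment))  # ['video', 'bagus', 'bermanfaat']
```

Cleaning runs before case folding here so that URLs and mentions are removed as whole units before any letters inside them are altered.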

Sentiment analysis classification model
Following data preprocessing, the next step is the classification process, in which comments from the selected videos are classified. Classification was carried out using the Naïve Bayes Classifier (NBC) and Support Vector Machine (SVM) algorithms to predict positive, negative, and neutral values. Figure 4 shows the flowchart of the classification process using NBC and SVM. The classification results were then presented using a confusion matrix with accuracy and precision parameters. To detect sentiment classes in this study, the polarity score s of each attribute was mapped as follows: positive if s > 0, neutral if s = 0, and negative if s < 0, as per Eq. (8):

$$\text{sentiment}(s) = \begin{cases} \text{positive}, & s > 0 \\ \text{neutral}, & s = 0 \\ \text{negative}, & s < 0 \end{cases} \quad (8)$$

Feature extraction
Each word found in the comments is assigned a weight. In this feature extraction phase, we perform term weighting, which is the process of assigning a value to each term in the preprocessed comment data. The Term Frequency-Inverse Document Frequency (TF-IDF) method is employed to assign the weight values [41]. The resulting word weights then serve as inputs for the classification process.
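
A minimal sketch of TF-IDF term weighting is given below. TF-IDF has several variants, and the text does not state which one was used, so the choices here are assumptions: term frequency is the count divided by the comment length, and inverse document frequency is the natural logarithm of N/df. The function name `tf_idf` and the toy comments are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized comment.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = ln(N / df(t)), where df(t) is the number of comments containing t
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per comment
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical preprocessed comments.
docs = [["video", "bagus"], ["video", "buruk"]]
weights = tf_idf(docs)
print(weights[0]["video"])  # 0.0
```

A term that appears in every comment, such as "video" here, receives a weight of zero under this variant, while terms that distinguish one comment from another ("bagus", "buruk") receive positive weights.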

EXPERIMENTAL RESULTS AND DISCUSSION
This section presents the results of sentiment analysis experiments conducted on a dataset of user comments from various disaster simulation videos on YouTube. Testing was performed under two scenarios: a performance evaluation scenario and a sensitivity evaluation scenario. Two datasets were used in this study: a preprocessed dataset of 204 successfully labeled comments, and a dataset without preprocessing containing 696 comments. The testing process was conducted six times with the data distribution compositions shown in Tables 3 and 4 below.
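
The idea of testing with different training-to-test compositions can be sketched as follows. This is an assumption-laden illustration: the function name `split_dataset`, the fixed random seed, and the use of integer stand-ins for comments are hypothetical, and only ratios mentioned in the discussion (20:80, 30:70, 80:20) are shown.

```python
import random


def split_dataset(items, train_ratio, seed=0):
    """Shuffle a copy of the items and split it into (train, test) parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatable splits
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Integer stand-ins for the 204 labeled comments.
comments = list(range(204))

for ratio in (0.2, 0.3, 0.8):
    train, test = split_dataset(comments, ratio)
    print(f"{round(ratio * 100)}:{round((1 - ratio) * 100)} ->", len(train), len(test))
```

With 204 comments, an 80:20 composition yields 163 training and 41 test comments. Fixing the seed keeps the splits repeatable, so that differences between runs reflect the ratio rather than the shuffle.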
Based on the performance evaluation test results, the composition ratio of training data to test data significantly affects the performance of each method. The composition of training and test data has an impact on the validation of machine learning models [42]. The best accuracy score was achieved when the training data to test data ratio was set at 80:20. NBC obtained an accuracy score of 80.4% with the best execution time of 0.0097 seconds, while SVM could only achieve a maximum accuracy score of 72.3% with the worst execution time of 193.48 seconds. A notable finding is that testing the NBC method with different training data compositions (20:80 and 30:70) resulted in an execution time of 0.0030 seconds. This is because the SVM model is more suitable for classifying long documents, while NBC is better suited for snippets or short documents [43]. Naïve Bayes is an appropriate and effective algorithm to improve machine learning model performance [44].
For precision, NBC achieved a score of 64.9%, while SVM obtained a score of 51%. For recall, NBC achieved a score of 69.3%, while SVM achieved the highest score of 71.2%. A high recall score for an SVM model indicates that the model is good at identifying most of the true positives from the actual positive class in the dataset. This result is also attributable to the C parameter (penalty parameter), set to C = 1; adjusting the value of C can enhance identification ability [45].
In classifying positive, negative, and neutral comments, neither method produced results that matched the actual sentiment data (manual labeling). The NBC predictions found 116 positive comments, 19 negative comments, and 36 neutral comments, while SVM found 110 positive comments, 9 negative comments, and 36 neutral comments. The differences may be due to comments whose initial, middle, and final parts are categorized as positive but which actually contain negative words in the middle of the sentence, or vice versa. This naturally results in ambiguous sentences, and ambiguous sentences can lead to errors in the classification process [46]. Other constraints were caused by the imbalance of data in the labeling process. The difference in the number of labeled positive and negative data affects the results obtained [47] and may cause minority class data to be misclassified as majority class data [48]. A further sensitivity evaluation with a larger dataset produced the results shown in Table 4.
According to Table 4, SVM has an excellent accuracy rate of 100%, compared to NBC, which can only achieve a maximum accuracy of 82.8%. The accuracy, precision, recall, and F1-Score values obtained by SVM tend to remain constant when the training data is larger than the test data. This is because SVM aims to maximize the distance to the nearest training points of either class in order to achieve better classification performance on test data [49]. The more data is provided for training, the higher the accuracy achieved [50], but using larger datasets also increases the execution time. Furthermore, in measuring sensitivity, each method was able to show the best results with a value of 100%. A higher sensitivity value indicates that the classification system classifies the positive class better.

CONCLUSIONS
In this paper, we have analyzed the sentiment of user comments on disaster simulation videos found on various YouTube channels. We used the Naïve Bayes (NBC) and Support Vector Machine (SVM) methods to classify the dataset of user comments, aiming to assess the effectiveness of using YouTube as a medium to educate the public. We collected a total of 696 comments, and after preprocessing, we obtained a dataset consisting of 204 comments. The labeling process resulted in 112 comments with a positive sentiment, 43 with a negative sentiment, and 49 with a neutral sentiment. Testing was conducted using two evaluation scenarios: performance measurement evaluation and sensitivity measurement evaluation. In the performance measurement evaluation using the preprocessed dataset, NBC achieved the highest accuracy rate of 0.804, or 80.4%, with an execution time of 0.0097 seconds.
Meanwhile, the Support Vector Machine (SVM) approach obtained an accuracy rate of 0.723, or 72.3%, with an execution time of 193.48 seconds. Furthermore, in the sensitivity measurements using the dataset that had not gone through the preprocessing stages, each method was able to show the best results with a value of 100%. A higher sensitivity value indicates better classification of the positive class. The choice of a classification method therefore depends on several factors, including the dataset size, data type, data characteristics, execution time, classification objectives, and data dimensions. In conclusion, this research recommends the NBC method for sentiment analysis of public comments on YouTube related to disaster mitigation simulations. This recommendation is based on its fast execution time and the relatively small dataset of comments collected. The sentiment analysis results suggest that video-based media is still effective for simulation and learning processes, although there is room for further development of simulation media that can seamlessly integrate virtual and real-world simulation processes.
The practical implication of this research is that sentiment analysis can help disaster management institutions design more effective counseling and communication messages via YouTube channels, with content that better reflects community needs and is relevant to disaster management steps, including:
(1) Understanding People's Emotions and Perceptions: Sentiment analysis allows governments and related institutions to understand people's emotions, perceptions, and attitudes towards disasters. This helps in determining appropriate and effective responses.
(2) Evaluate Response Effectiveness: Through sentiment analysis, disaster management agencies can evaluate the effectiveness of their response and refine strategies based on public feedback obtained from social media and other online platforms.
(3) Encourage Public Participation: Through sentiment analysis, public participation in disaster management can be increased by facilitating easier feedback and giving the public a more active role in the decision-making process.
(4) Community-Based Approach: Sentiment analysis enables governments and related institutions to better adopt a community-based approach to disaster management, taking into account the immediate needs, hopes, and aspirations of affected communities.

Figure 1. Research stages for sentiment analysis

Figure 2. Sentiment analysis application and train dataset pages

Table 1. Example of confusion matrix
True Positive (TP) is the number of instances whose actual class is positive and that the classification model predicts as positive; False Positive (FP) is the number of instances whose actual class is negative but that the model predicts as positive; False Negative (FN) is the number of instances whose actual class is positive but that the model predicts as negative; True Negative (TN) is the number of instances whose actual class is negative and that the model predicts as negative. Eq. (5) gives the accuracy value, which measures how accurate the classification model results are:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (5)$$

Table 2. The dataset from crawling disaster simulation videos

Table 3. Results of performance evaluation of NBC and SVM methods (with preprocessing stages)

Table 4. Results of sensitivity evaluation of NBC and SVM methods (without preprocessing stages)