Forecasting Ginger Harvest Yields: A Comparative Study of Double Exponential Smoothing and Long Short-Term Memory Models

ABSTRACT


INTRODUCTION
The agricultural sector has considerable economic potential. However, the potential for agricultural products, especially ginger, is still relatively low. Therefore, farmers need to plan their yields to avoid substantial losses and meet market demand. This calls for systems that support decision-making and future planning, such as a forecasting system based on historical data [1]. Forecasting is the prediction of future results to meet precise and appropriate targets [2]. This research on forecasting the yield of ginger plants is intended to help farmers make plans and meet market demand.
To overcome the above problems, this research requires a forecasting method that produces accurate predictions. Several previous studies have applied forecasting methods. Chung and Kim [3] compared the Single Exponential Smoothing (SES) and Artificial Neural Network (ANN) methods, and found that the Mean Square Error (MSE) of SES was lower than that of the ANN method. In addition, they developed the DES method for efficient jitter compensation, which showed that DES-based schemes ran about 100 times faster than Extended Kalman Filter (EKF)-based methods and 19 times faster than Kalman Filter (KF)-based methods. Siregar and Wibawa [4] compared the DES and ANN methods with the SES process on input data for foreign currency exchange, and found that the MAPE was 53% with an execution time of 561 seconds. They concluded that the DES method is better than SES for improving ANN performance in forecasting foreign currency exchange rates. In 2020, Regina and Jodiawan [5] forecasted Fast Moving Consumer Goods using time series forecasting and compared the Autoregressive Integrated Moving Average (ARIMA), moving average (MA), DES, and linear regression (LR) methods. The results of that study indicate that the DES method, with optimal alpha and gamma, has the smallest MAPE among the compared methods.
In addition to statistical forecasting methods, machine learning methods have also been studied. Machine learning refers to algorithms that allow users to find and describe structural patterns in data so that those patterns can be used [6]. LSTM is a deep learning method for time-series forecasting based on the Recurrent Neural Network (RNN), and it achieves good accuracy. Several studies have used LSTM for time series forecasting. Shankar et al. [7] compared the LSTM method with seven other time series forecasting methods for container throughput forecasting, including ARIMA, simple exponential smoothing, Holt-Winters, error-trend-seasonality, trigonometric regressors (TBATS), neural networks, and hybrid ARIMA. Their study shows that LSTM outperforms the seven methods in long-term forecasting. In addition, Bathla [8] predicted stock prices and found that the LSTM method provided better accuracy than Support Vector Regression (SVR). Agarwal and Tarar [9] also found that deep learning methods, such as LSTM, yielded higher accuracy than machine learning algorithms for crop prediction, and provided clear information on the amounts of soil ingredients required.
Some of these studies indicate that the LSTM and DES methods produce small forecasting errors [10]. However, both methods have weaknesses: they require a lot of data and regular maintenance, so producing accurate predictions takes considerable time. Therefore, it is necessary to compare the two to find which one gives the smaller error rate. This study uses seasonal data to compare the LSTM and DES methods for predicting ginger yields in Madura. Data is seasonal when changes recur over a certain period. By comparing the two methods, the more accurate forecasting model for ginger yields can be selected. This study aims to find the method that provides the most accurate forecasting results, comparing LSTM and DES via error measures such as MAPE and RMSE. Once the suitable model for predicting ginger yields is known, forecasts can be obtained that help reduce the accumulation of surplus caused by erratic consumer demand and reduce the losses experienced by ginger farmers in Madura.

Literature review
The crop yield forecasting literature is quite extensive. Therefore, it is possible to find forecasting methods for systematic and pragmatic prediction based on previous relevant data. Forecasting methods can be used to recognize the elements of data that affect the amount of deviation due to unexpected factors. Many studies have compared several forecasting models to find a suitable model to predict crop yields, such as least squares [11,12], SES [13], Winters' exponential smoothing [14], weighted moving average [15], and ARIMA [16]. However, these models are not accurate at predicting variables. In 2022, Asrul [17] used the DES method to predict potato yields, taking annual potato production as the basis for recommending the amount of production needed for subsequent market demand. This method can improve forecasting by smoothing the average of previous values of the time series in an exponentially decreasing way. It can provide accurate short-term forecasts and adapts easily to changes in the data without requiring a lot of data [18].
In addition to the DES method, the LSTM method also has advantages in forecasting time series data, as it can be used to overcome long-term dependencies on the input [19]. In addition, LSTM has a memory block that determines which value is selected as the output relevant to the given input [20]. Several studies have been conducted in the domain of agricultural forecasting, specifically ginger yield prediction. Elpawati et al. [21] analyzed the relationship between ginger-exporting countries, such as Indonesia, China, India, and the Netherlands, using the Value at Risk/Vector Error Correction Model (VaR/VECM) method. It was found that Indonesia's ginger exports increased by 92%, followed by the Netherlands (7%), China (0.2%), and India (0.8%). Apart from that, Das et al. [22] forecasted ginger production in Bangladesh, and tested the performance of eight trend models using the coefficient of determination ($R^2$) and adjusted $R^2$. This research shows that the compound growth rate of ginger production is 1.019 per year.
This study compares the accuracy of the DES and LSTM methods in predicting ginger yields based on the nature of the composition, forecasting time, and data patterns. The MAPE and RMSE measures are used to assess accuracy. The best forecasting result is determined by the level of prediction error. The smaller the error rate, the more precise a method is in prediction [23].

Data collection
This study uses data on ginger crop yields in the Madura Region in 2015-2019, obtained from the Department of Agriculture and Food Security in Pamekasan, Madura, after the quantity of the ginger harvest was determined through observation. The ginger harvesting process in Madura is carried out once a week from January to December each year, so that approximately 250 data points are obtained, as shown in Table 1. The yield of the ginger commodity fluctuates from year to year, as shown in Figure 1. The original data consists of yields, years and names of the ginger-producing areas. Before data processing, the data was plotted to determine its pattern, which makes forecasting easier. This processing aims to ensure that the raw data can be analyzed and conclusions drawn more easily. Based on the data plot in Figure 1, which shows actual harvest data, it can be concluded that the data is seasonal. Because the data was obtained in raw form, it was preprocessed first to clean it of distorted and missing values. This step is essential for providing accurate output when forecasting ginger harvest yields. To predict ginger yields, a time-series forecasting method was used. This means that the data is presented in order of occurrence without indicating the factors that influence it. The time series method is a quantitative prediction based on the analysis of the pattern of relationships between the variables to be searched (dependent) and the variables that affect them (independent), and of changes over time, such as weeks, months, quarters, semesters, and years. According to the study of Estiningtyas et al. [24], ginger production is influenced by the biotic environment, including all living things, such as pests, pathogens, and weeds, which interfere with ginger cultivation. Furthermore, the collected data was analyzed by comparing forecasting methods.

Forecasting method
Forecasting is an approach to predicting possible future situations by examining data from the past [25]. It draws on historical data of events, for example in the business sector. Forecasting is important because it influences decision-making and can also be used as a basis for long-term planning in an organization [26]. Forecasting is usually divided into three types, namely, short-term, medium-term, and long-term [27]. Short-term forecasting is usually used to predict events over the next day, week, or month [28]. Medium-term forecasting covers a horizon of one to two years into the future [29]. Finally, long-term forecasting aims to predict events more than two years in the future. Usually, this forecasting uses a time series method, which is based on historical data and predicts future event data as output [30]. The purpose of forecasting is to reduce risk under uncertainty about what might happen in the future, thereby minimizing this uncertainty. According to the study of Wang and Chaovalitwongse [31], forecasting methods are divided into two groups, i.e., quantitative and qualitative methods. Qualitative methods are used when sufficient previous data is not available. In other words, they can be used as a basis for decision-making in forecasting cases without previous data. However, if a lot of previous data is available and meets the criteria, then forecasting with quantitative methods is more effective than with qualitative ones [32]. Forecasting using quantitative methods has several requirements. For example, previous data is available, can be quantified, and has the same trend as future data [33]. To find patterns in the past series and extrapolate these patterns into the future by analyzing the data, this study compares the DES and LSTM methods as a reference for predicting future values.

DES (Holt)
DES is a linear model described by Afiyah et al. [18]. In the DES method, the smoothing process is carried out twice. In principle, Holt's linear exponential smoothing does not use the double smoothing formula directly [34]. Instead, Holt smooths the trend value with a parameter different from the one used on the original series. The trend is a smoothed estimate of the average growth at the end of each period [35]. Forecasting with Holt's linear exponential smoothing has several steps [36]:
(i) The first smoothing value is determined.
(ii) The second smoothing value is determined.
(iii) The constant value is determined.
(iv) The trend coefficient is determined.
(v) The forecasting value is determined.
In these steps, $X_t$ is the demand data in period $t$, $S'_t$ is the SES value, $T_t$ is the trend value in period $t$, $\alpha$ and $\beta$ are the smoothing parameters ranging between 0 and 1, $F_{t+m}$ is the forecast $m$ periods ahead, and $m$ is the number of future periods to be forecast.
The first smoothing value $S'_t$ is a basic and pragmatic technique for time series forecasting that estimates only the level component. Meanwhile, the second smoothing value $S''_t$ is specifically for trends in univariate time series and can track trends that change over time, either linearly or exponentially. $\alpha$ is the level smoothing factor and $\beta$ is the trend smoothing factor, which gives the best trend estimate at time $t$.
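The equations for these steps are not reproduced in this text, so the following is only a minimal sketch of the standard two-parameter Holt recursion, not the authors' implementation. The function name `holt_forecast` and the initialization (first observation as level, first difference as trend) are assumptions made for illustration. Note also that the $S'_t$/$S''_t$ wording above resembles Brown's single-parameter variant, whereas this sketch uses Holt's level-and-trend form with both $\alpha$ and $\beta$.

```python
def holt_forecast(series, alpha, beta, horizon=1):
    """Holt's linear exponential smoothing; returns `horizon` forecasts.

    Assumed initialization (not from the paper): level = first value,
    trend = first difference.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")

    level = series[0]
    trend = series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        # Level update: blend the new observation with the previous projection.
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        # Trend update: blend the latest level change with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend

    # Forecast m periods ahead: F_{t+m} = level + m * trend.
    return [level + m * trend for m in range(1, horizon + 1)]
```

For a perfectly linear series such as `[10, 12, 14, 16, 18]`, calling `holt_forecast(series, alpha=0.4, beta=0.1, horizon=3)` simply continues the straight line, which is a quick sanity check on the recursion.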

Deep learning LSTM
Deep learning is a branch of machine learning that uses deep neural networks to solve problems [37]. Neural networks are inspired by how neurons work in the human brain. Each neuron in the human brain is interconnected, and information flows from and between these neurons [38]. In deep learning, the network consists of several layers, which are collections of nodes [39]. A node is where calculations occur. Compared to ordinary neural networks, deep learning has more layers, such as more than three (including input and output), or even up to hundreds. LSTM is often used to overcome a deficiency found in RNNs, namely the vanishing gradient problem [40]. LSTMs use larger data sets and all data information as input to build deep networks. LSTM has three gates that control the use and updating of past information, namely, the input gate, forget gate, and output gate [41]. LSTM involves several steps, among others [42]:
(i) Calculation of the forget gate ($f_t$). The value of the forget gate is between 0 and 1, as shown in Eq. (7). Information that is not needed for the case being handled is removed using the sigmoid function ($\sigma$).
(ii) Calculation of the candidate cell state ($\tilde{C}_t$). The tanh activation function is used, which forms a new context candidate, as shown in Eq. (8).
(iii) Calculation of the output gate ($o_t$). The sigmoid function ($\sigma$) is used to generate the output value, which is combined with the cell state ($C_t$) passed through the tanh activation, as shown in Eq. (9).
Before the model was trained, the parameters needed for the LSTM model were determined. The hidden state computation in LSTM goes through four components, namely, the forget gate, the input gate, the cell state and the output gate. The forget gate is the first gate that the input passes through, using a sigmoid activation function. The hidden state determines the gate input value. At the input gate, the value of the new candidate cell state is calculated using the tanh activation function. Meanwhile, the output gate uses the sigmoid activation function. The output gate value is used together with the cell state value to generate a new hidden state value.
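Eqs. (7)-(9) are not reproduced in this text. For orientation, the widely used textbook formulation of an LSTM cell is given below; the weight matrices $W_f, W_i, W_C, W_o$ and biases $b_f, b_i, b_C, b_o$ are notation assumed here, and the paper's own equations may be written differently.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(hidden state)}
\end{aligned}
```

Here $[h_{t-1}, x_t]$ denotes the concatenation of the previous hidden state and the current input, and $\odot$ is element-wise multiplication.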

Evaluation method
According to the study of Peñaloza et al. [43], the accuracy of forecasting results can be judged from the size of the difference between the actual and forecast values. This difference is the residual, and it is summarized here using two measures, i.e., MAPE and RMSE. MAPE is the mean absolute percentage error, used to evaluate the precision or accuracy of a prediction [44]. MAPE is calculated using the absolute error for each period divided by the real observed value for that period. The general formula for MAPE can be seen in Eq. (10) [45]. The smaller the MAPE value, the more accurate the forecasting model [46]. RMSE is the square root of the average squared difference between actual and predicted data, as shown in Eq. (11) [7].

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{X_t - F_t}{X_t}\right| \qquad (10)$$

where $X_t$ is the actual value in period $t$, $F_t$ is the forecast value in period $t$, and $n$ is the number of data periods.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (11)$$

where $n$ is the amount of data, $\hat{y}_i$ is the predicted data, and $y_i$ is the actual data.
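The two error measures can be sketched in a few lines of Python. This is a minimal illustration based on the verbal definitions above; the function names `mape` and `rmse` are chosen here for illustration, and the MAPE function assumes that no actual value is zero.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Assumes no actual value is 0."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def rmse(actual, forecast):
    """Root mean square error, in the same unit as the data."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))
```

Because MAPE is relative and RMSE is expressed in the unit of the data (here, the harvest quantity), the two measures can rank models differently when large and small observations are mixed, which is one reason for reporting both.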

RESULTS
This section explains several test scenarios for the ginger yield forecasting system. The choice of the α and β parameters determines whether the DES method produces the smallest MAPE and RMSE values and forecasts close to the actual values. Meanwhile, the LSTM results are influenced by the input, forget and output gates in achieving the smallest MAPE and RMSE. This section first discusses the flow of the ginger yield forecasting system using the DES method, as shown in Figure 2. The figure outlines the system flow and the steps required to obtain forecasting values using the DES method, as follows:

System flow and DES test scenario
(i) The magnitude of the parameters α and β is determined between 0 and 1 to examine the accuracy of the prediction results obtained by calculating the MAPE value.
(ii) The first smoothing value is calculated using $S'_t$.
(iii) The second smoothing value is calculated using $S''_t$ by paying attention to the first smoothing value.
(iv) The value of the constant $a_t$ is determined by adjusting the SES value with the difference between SES and DES.
(v) The value of the trend coefficient $b_t$ is determined, thereby determining the estimated trend from one time period to the next.
(vi) The forecasting results $F_{t+m}$ are calculated after calculating the first and second smoothing values and the $a_t$ and $b_t$ values using the best α parameter.
This forecasting process began by inputting the normalized weekly harvest results from 2015 to 2019. Then the harvest for the coming period was predicted by searching for the best α and β values, with the accuracy of the prediction results examined by calculating the MAPE and RMSE values. In the prediction calculation using the DES method, the dataset was split into 74% training data and 26% testing data. The results can be seen in Table 2. Meanwhile, Table 3 shows the search for α and β and the resulting MAPE and RMSE measurements using the DES method. As shown in the table, starting from α=0.2 and β=0.1, the MAPE and RMSE values decrease markedly and reach their lowest point at α=0.4 and β=0.1. Therefore, the values of MAPE and RMSE are influenced by the value of the constant α in forecasting.
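The parameter search described above can be sketched as a simple grid search. This is a hedged illustration, not the authors' procedure: the helper names, the grid of 0.1 to 0.9 in steps of 0.1, the Holt-style recursion, and the choice to score multi-step forecasts over the whole test portion by MAPE are all assumptions. Only the 74%/26% chronological split is taken from the text.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear smoothing with an assumed simple initialization."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + m * trend for m in range(1, horizon + 1)]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def grid_search(series, train_fraction=0.74, grid=None):
    """Return (best_mape, alpha, beta) over a grid of smoothing parameters."""
    if grid is None:
        grid = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1 ... 0.9 (assumed)
    split = int(len(series) * train_fraction)
    train, test = series[:split], series[split:]  # chronological split, no shuffling

    best = None
    for alpha in grid:
        for beta in grid:
            forecast = holt_forecast(train, alpha, beta, len(test))
            score = mape(test, forecast)
            if best is None or score < best[0]:
                best = (score, alpha, beta)
    return best
```

Keeping the split chronological matters for time series: shuffling before splitting would let the smoother see observations from the period it is supposed to forecast.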

System flow and LSTM test scenarios
The previous section discusses the DES flow. This section describes the system flow for forecasting crop yields using the LSTM method, as shown in Figure 3. Model identification was carried out by creating a time-series plot. By plotting a time series, data patterns and trends in the observation series can be seen. The model identification process examined stationarity in both the variance and the mean. This step aims to examine whether the collected data is normally distributed or taken from a normal population. Stationarity in the variance was investigated using the Box-Cox transformation: the data is considered stationary in variance when the lambda (λ) value obtained in the Box-Cox plot is 1. If the lambda value is not 1, then a Box-Cox transformation must be carried out. The results of the Box-Cox test on the ginger harvest data can be seen in Figure 4. It can be seen from the figure that the data is not stationary in variance because λ = -0.50, i.e., λ ≠ 1 on the Box-Cox plot. For stationarity in the mean, the probability value of the unit root test is smaller than the specified 5% significance level, leading to the rejection of H0. This means that there is no unit root problem, or in other words that the data is stationary. Therefore, parameter estimation is carried out by trial and error with the LSTM model. LSTM is a form of RNN designed to avoid long-term dependency problems. The model filters information through gate structures to maintain and update the state of memory cells. Then the model's performance is evaluated using MAPE. In this research, three main stages were carried out, as described below.
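For reference, the Box-Cox transformation for a given λ can be sketched as follows. This is a minimal illustration with a hypothetical function name `box_cox`; it applies the transform for a λ that has already been estimated (such as the λ = -0.50 read from the Box-Cox plot) and does not estimate λ itself.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform: (x**lam - 1) / lam, or log(x) when lam == 0.

    Requires strictly positive data.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]
```

With λ = 1 the transform only shifts the data by one unit, which is why λ = 1 is read as "no transformation needed", while λ = -0.5 corresponds to a reciprocal square root scale that compresses large values.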

Preprocessing data
The data in this study is used in two stages, i.e., the training stage for the input and the stage for testing the built architecture. Before the data was processed by the LSTM model, it was normalized as part of the preprocessing using the Min-Max Scaler method. The preprocessing was divided into several phases as follows:
(i) The range of each attribute was examined. The attribute range was converted to equal intervals, so that the minimum feature value became zero and the maximum feature value became one.
(ii) The data was scaled to the desired range. The range was provided as a tuple of the minimum and maximum feature values.
(iii) If the copy option was false, the scaling was performed in place; if it was true, a copy was created instead of scaling in place. The scaled data was then clipped to the provided feature range.
The data preprocessing aims to examine the effect on the MAPE value generated by the LSTM method. The results can be seen in Table 4.
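The min-max normalization in phases (i) and (ii) can be sketched in plain Python as below. The function names are hypothetical and this is not the library scaler used in the study; it only illustrates the arithmetic, including the inverse mapping needed to turn scaled predictions back into harvest quantities.

```python
def min_max_fit(values, feature_range=(0.0, 1.0)):
    """Record the data minimum/maximum and the target range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return lo, hi, feature_range


def min_max_scale(values, params):
    """Map values linearly so that lo -> range minimum and hi -> range maximum."""
    lo, hi, (a, b) = params
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]


def min_max_inverse(scaled, params):
    """Undo min_max_scale, returning values in the original unit."""
    lo, hi, (a, b) = params
    return [lo + (s - a) * (hi - lo) / (b - a) for s in scaled]
```

One design choice worth noting, as an assumption rather than something stated in the text: fitting the minimum and maximum on the training portion only and reusing those parameters for the test portion avoids leaking information from the test period into the model.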

LSTM model
The LSTM architecture consists of an input layer, an output layer, and a hidden layer. The hidden layer consists of memory cells, with three gates in one cell, namely, the input gate, forget gate, and output gate, as shown in Figure 3. The input gate controls how much information must be stored in the cell state. The forget gate controls how long a value remains in the memory cell. The output gate decides how much of the content of a memory cell is used to calculate the output. The forecasting values were obtained using the LSTM method in the following steps: (i) The value of $f_t$ was determined. The forget gate was the first gate passed, and it determined how much of the cell state $C_{t-1}$ from the previous time step was carried over to the cell state $C_t$ of the current time step. The gate produced a value between 0 and 1 based on $h_{t-1}$ and $x_t$. If the output was 0, the information was considered no longer useful and was deleted. Conversely, if the output was 1, the information was stored for future use.
(ii) How much of the network input at the current time ($x_t$) was stored in the cell state $C_t$ was determined, using two activation functions (sigmoid and tanh) to select the part to be updated.
(iii) The information was updated with the new candidate cell state $\tilde{C}_t$ created via the tanh layer, aiming to control how much new information was added.
(iv) The old cell state $C_{t-1}$ was updated to become the new cell state $C_t$ by multiplying the old state by $f_t$ to delete the information determined on the $f_t$ layer, and then adding the selected new candidate values.
(v) The output gate value $o_t$ was determined, deciding which part of the cell state, after passing through the tanh layer, was output. The result was $h_t$, which subsequently influenced the cell state at the next time step.
(vi) After the value of $o_t$ was obtained, the cell state was passed through tanh and multiplied by the output of the sigmoid output gate to produce the value $h_t$.
The modeling of the LSTM network training process can be seen in Table 5. The forget gate, cell state, output gate, and sigmoid values can be seen from the table.
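Steps (i) to (vi) can be traced in a minimal single-unit LSTM cell written in plain Python. This is an illustrative sketch only: a scalar input and a scalar hidden state are assumed for readability (the study reports 100 hidden neurons), and the function name `lstm_step` and the parameter layout are hypothetical. It implements the standard LSTM forward step, not the authors' trained model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a single-unit LSTM cell.

    `params` maps each of "forget", "input", "candidate", "output"
    to a tuple (w_x, w_h, b) of scalar weights and bias (assumed layout).
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre_activation("forget"))        # (i) how much of C_{t-1} to keep
    i_t = sigmoid(pre_activation("input"))         # (ii) how much new input to store
    c_hat = math.tanh(pre_activation("candidate")) # (iii) new candidate cell state
    c_t = f_t * c_prev + i_t * c_hat               # (iv) updated cell state
    o_t = sigmoid(pre_activation("output"))        # (v) output gate
    h_t = o_t * math.tanh(c_t)                     # (vi) new hidden state
    return h_t, c_t
```

To process a window of scaled observations, the step is applied in order, carrying `h_t` and `c_t` forward from one observation to the next; the final `h_t` would then feed a dense output layer in a full model.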

DISCUSSION
This study uses actual data on ginger yields from 2015 to 2019 to produce precise and accurate predictions for the future. A comparison between the DES and LSTM methods shows that both methods share the same advantage in forecasting, i.e., high accuracy. Before making predictions, the actual data was converted into a time series by preprocessing the data. Then ginger yields were predicted using both DES and LSTM. Several trials were carried out, including testing the DES model in two stages, i.e., testing the α value when the β value was fixed and vice versa, to find the best α and β. This is shown in Table 3. The test results show that the MAPE value tends to be smaller when the α parameter value is large and the β parameter value is small. This shows that the DES method has the advantage on relatively little data. However, its weakness is that it should be maintained continuously, checked routinely and updated if there are bugs or new important features are added. Meanwhile, the tests carried out on the LSTM model in this study are shown in Table 6, where there are several parameters, including learning rate, epoch, and LSTM units. The initial weight at each LSTM gate was determined randomly, and the initial bias was updated with the learning rate so that the loss function reached a local minimum. Meanwhile, the epoch and LSTM unit settings specify the number of training iterations and the number of hidden units in LSTM training, and were tuned to produce a smaller MAPE. The analysis results using the LSTM model show that LSTM can be applied to long-term time series data because it has memory to store information that will be reused in the calculation process for the next gate. In addition, the prediction results cannot be sufficiently compared with the original data. However, LSTM has a weakness, i.e., reliance on gradient values, where the weights cannot be updated for the next process if the gradient value is lost. The forecasting results can also be seen in Figure 5, with actual ginger yield data shown on the graph with a yellow line, DES prediction data with a blue line and LSTM prediction data with a red line. This shows that the LSTM method provides very good performance in this crop yield prediction model, even better than DES. However, LSTM requires quite a long time for data processing, as is evidenced in Table 8, because it uses gates to regulate the flow of information into and out of the cell. This shows that LSTM has a high complexity. After processing the data, the model was evaluated using MAPE and RMSE. The evaluation results show that the LSTM model produces an error rate of 38.99% in 8.726145 seconds, while the DES model produces an error rate of 43.49% in 0.00004 seconds, as shown in Table 3. Apart from that, it can be seen from the graph that the ginger harvest in July-December has increased, i.e., in the range of 31,000 tons. Meanwhile, the lowest demand occurs in January-June, with around 27,000 tons for both methods. Therefore, farmers are advised to plant ginger before June so that it can be harvested in July to meet market demand. According to the results of the trials, it can be concluded that this forecasting system provides more information for ginger farmers about the estimated future ginger harvest, thereby enabling farmers to handle the ginger planting in certain months more effectively.
According to the trial results in this study and the literature review, both statistical and deep learning methods can be used to process time series data. Each has a relative level of effectiveness, depending on the complexity of the data used. The main difference between the two methods is the time needed to process the data: deep learning methods tend to require more time than statistical methods. Therefore, further research could utilize several statistical methods to predict data with a short processing time. The LSTM deep learning algorithm and the DES statistical method are two different prediction models, and both are suitable for time series analysis to improve performance and accuracy in forecasting.

CONCLUSIONS
Based on the research results and the discussion of ginger crop yield prediction comparing the LSTM and DES methods, several conclusions can be drawn. Based on the prediction results using the DES method on ginger crop yield data for 2015-2019, the best value was obtained with a MAPE of 43.49% and an RMSE of 12997.34261 at parameters α=0.4 and β=0.1. The predicted results increased from July 2020 to December 2020 compared to the previous period. Based on the LSTM forecasting results, with 74% training data, a time step of 10, and 100 hidden layer neurons, the MAPE obtained is 38.99% at a learning rate of 0.0002, with an RMSE of 1244.85432. Comparing the two methods, LSTM is the better one for this data. This can be seen from the smaller MAPE and RMSE values produced by LSTM compared to DES. The forecasting results from both the DES and LSTM models, together with the actual data, show an upward trend. Therefore, it can be concluded that ginger production from 2019 to 2020 will increase by around 14% per year. Farmers can thus plan future harvests by paying attention to the availability and price trends of ginger seeds in a given month, or at least three months beforehand. This affects the enthusiasm of farmers to continue to increase the productivity of ginger plants, especially in Madura.

Figure 1. Data on ginger commodity yields

Figure 2. A block diagram of the DES method for forecasting ginger yields

Figure 3. Model of LSTM memory cells

Figure 4. A time-series plot of ginger harvest data for Box-Cox stationary variance transformation

Figure 5. Graph of forecasting result comparison using the DES and LSTM methods with the actual data

Table 1. Sample dataset of ginger plant harvesting

Table 2. DES test result for ginger yields forecasting with α=0.4 and β=0.1

Table 3. Test scenario of the DES method

Table 4. Results of data preprocessing

Table 5. Result of the training process with LSTM forecasting ginger crop yields

The learning rate test aims to obtain the optimal learning rate in terms of the MAPE and RMSE values from training and testing. The results of testing the learning rate parameter with a max epoch of 10 are shown in Table 7. Meanwhile, the number of max epochs was tested to find out whether the MAPE had exceeded the limit before the max epoch or training stopped at the specified max epoch. The max epoch numbers used are 10, 50, 100, 500 and 1000. The max epoch test results are shown in Table 8. The optimal max epoch number is 100, which gives the smallest MAPE and RMSE, 38.99% and 1244.85432, with an estimated time of 8.726145 seconds. After obtaining an initial estimate of the LSTM from model identification, the parameters of the candidate LSTM model were estimated using the significant parameters generated at the model identification stage. A p-value is considered significant if it falls below the 5% significance level. The MAPE and RMSE values are the smallest compared with other models.

Table 6. Trial scenario of the LSTM method

Table 7. Scenario results of LSTM based on learning rate

Table 8. Scenario results of LSTM based on epoch and stationarity test results

Table 9. Comparison between the results of the DES and LSTM methods and the actual data