Transfer Learning Approach - An Efficient Method to Predict Rainfall Based on Ground-Based Cloud Images


Geeta Mahadeo Ambildhuke, Barnali Gupta Banik

Department of CSE, Koneru Lakshmaiah Education Foundation, Deemed to be University, Hyderabad, Telangana 500075, India

Corresponding Author Email:
23 July 2021
26 August 2021
31 August 2021



Clouds play a vital role in climate prediction, and rainfall prediction depends heavily on the state and type of the clouds present in the sky. Cloud identification is therefore an important topic in meteorology that also attracts researchers from other areas. This paper presents a transfer learning technique to predict rainfall from ground-based images of rain-bearing clouds. The model takes a cloud image as input, identifies the type of cloud, and predicts the estimated rainfall. The cloud images in the dataset are divided into three classes, labeled no-rain to very low rain, low to medium rain, and medium to high rain, based on the precipitation associated with each cloud type. Such a model can help farmers manage irrigation by knowing the expected rainfall before every irrigation cycle, and it can also support decisions about outdoor events. The model is trained on the three classes, first with a CNN built from scratch. To improve performance, the experiment is repeated with the pretrained models VGG16, Inception-V3, and Xception using transfer learning, and the results are compared with those of the regular CNN model. Because the dataset is small, transfer learning outperforms the CNN trained from scratch and gives the best results among the models tested. Google Colab with a GPU runtime makes training fast enough to obtain results in reasonable time, and the performance achieved by transfer learning is good enough to meet real-time requirements.


rainfall prediction, ground-based cloud images, image classification, deep neural network, convolution neural network, transfer learning

1. Introduction

In the last few years, weather has been observed to be very unpredictable worldwide. Many people, such as farmers who depend entirely on the weather, have suffered heavy losses in crop production, with all their effort wasted because of unexpected bad weather. People used to predict the weather with their own senses, by looking at the sky and observing the clouds, the behavior of animals, the smell of the soil, and so on. Technology has now reached a state where machines, once trained, can make such predictions like humans [1]. It would be a boon to farmers if technology could give advance knowledge of weather changes so that they can make appropriate decisions to protect their crops. To handle such situations, climate risk information must be provided locally at the village level, where farming is the primary occupation, especially in India. Large public events can also be spoiled by unexpected weather, which makes crowds difficult to manage and may lead to significant loss of money and even of life. Automatic rainfall prediction is therefore important in many areas, especially in agriculture [2].

Rain plays an essential role in agricultural practice, from sowing seeds to harvesting crops. Irrigation also depends on rainfall: water is a scarce resource and must be supplied carefully, in the right amount and at the right time, during the growth period of the crop [3]. Using resources such as water, fertilizers, and pesticides in this controlled way is called precision agriculture. If rainfall can be estimated from a cloud image, farmers can irrigate their crops accordingly, since both too little and too much water can damage the crops.

Clouds play an essential role in predicting whether the weather will be rainy, cloudy, or clear, and cloud images can be used to predict the amount of rainfall locally. Once a farmer knows whether to expect no rain, a drizzle, or heavy rain, a decision can be made on whether to irrigate the field. Much work is under way in meteorology departments on detecting clouds, their type, and their movement from cloud images in order to improve weather analysis and forecasting systems [4]. Cloud images fall into two categories: satellite-based and ground-based. Satellite images show the top view of the clouds and are mainly used for observing global atmospheric weather. Ground-based images are more useful to the common person because they are readily available and suitable for predicting local atmospheric conditions by analyzing characteristics of the cloud base such as cloud height, cloud cover, and cloud type [5]. Deep learning has made image classification highly successful and has become attractive in research because of its high performance. Cloud image classification is one of the crucial research areas and can support weather-dependent decisions for activities such as outdoor events and irrigation in agriculture.

However, training CNNs is costly in time and resources. It can take days, weeks, or even months for some datasets; it is very slow on a CPU and faster on a GPU, but GPUs are expensive. Training deep neural networks from scratch requires high computational power and a large amount of data. In some areas, such as ground-based cloud images, datasets are not readily available, and those that exist contain only a few thousand images. When the dataset is this small, transfer learning offers an alternative to training a CNN from scratch for thousands of epochs: it reuses the weights of existing pre-trained architectures, which were trained on millions of images, so that a model can be trained efficiently on a new, small dataset and reach good accuracy in less time [6].

In this paper, the ten main categories of clouds are grouped into three classes according to the amount of rainfall they produce: no rain to very low rain, low to medium rain, and medium to high rain. Because the cloud image dataset is very small, transfer learning is used to train the model on ground-based images of the clouds responsible for rain. Here cloud types are related only to rainfall and not to phenomena such as thunderstorms or cyclones. The model is trained on these images and produces its output from the cloud image passed as input. Transfer learning is applied with the pretrained models VGG16, Inception-V3, and Xception, which train quickly and efficiently, and the results obtained are better than those of the CNN model trained from scratch. The following sections cover previous work in this field, background on cloud types, data collection, the models and methods used, and the results, followed by the conclusion and future scope.

2. Literature Survey

Researchers are actively working on cloud image classification and rainfall prediction. Rainfall is highly nonlinear and hard to anticipate; it affects the agriculture sector the most and can lead to natural disasters such as floods and droughts, so predicting it is very much needed. In the last few years, many researchers have worked on different types of cloud images, both ground-based and satellite, and have presented methods for image pre-processing, image segmentation, cloud screening, cloud cover and cloud movement estimation, classification, and more. Some of the most closely related work is discussed below.

Gogoi and Devi [7] used image processing techniques on cloud images to obtain information such as cloud type, cloud status, and sky status using image segmentation or cloud screening algorithms. Output parameters of these methods, such as height, color, altitude, classification, and appearance, are used to identify the status of rainfall. Different methods for predicting rainfall, namely traditional, statistical, and numerical methods, were also discussed.

Tuominen and Tuononen [8] worked on ground-based cloud images and introduced a method for extracting information about cloud coverage and cloud movement using neural networks (NN) and the Lucas-Kanade method. Cloud detection was demonstrated successfully under a range of conditions, including clear sky, sky covered with dark clouds, and various degrees of cloudiness.

Dev et al. [9] introduced an extensive public sky/cloud image segmentation database named SWIMSEG, consisting of 1013 images captured by a high-resolution sky imager. They also proposed a fully learning-based framework for sky image segmentation, and evaluated and selected suitable color channels on two different datasets of cloud/sky images.

Zhang et al. [10] proposed CloudNet, a new CNN-based algorithm for the classification of ground-based cloud images. The CCSN (Cirrus Cumulus Stratus Nimbus) dataset consists of digital images of clouds taken from the ground and contains more images than previously existing datasets such as SWIMCAT. This dataset introduces a new cloud type called contrails, line-shaped clouds produced by aircraft engine exhaust, to explore their effect on global warming.

Lagunas and Garces [11] applied transfer learning to the VGG19 model on new datasets consisting of natural images and illustrations. Their model with an adaptive layer, based on an optimization strategy, modifies only a few layers to train better on new content. They also created a new dataset of illustration images and proposed two transfer learning models, one for a small dataset and one for a large dataset.

Lai et al. [12] classified eight types of clouds using CNNs with ten layers and with seven layers, trained for 1400 epochs, and found that test accuracy increased by 10% after reducing the number of layers. Batch normalization was also applied after the ReLU function to avoid overfitting.

García Fernández [13] presents a detailed analysis of CNN performance on ground-based sky images for automatic cloud classification, for which two datasets are created from the original one: one with two types of cloud images and another with six types. Experiments on activating and deactivating subsampling layers and on the optimizers SGD and Adagrad are carried out to achieve reasonable accuracy.

Kaviarasu et al. [14] used digital cloud image processing techniques and described a rainfall estimation model in which a cloud mask algorithm finds the status of the captured cloud and K-means clustering classifies the cloud type.

Salot and Swaminarayan [15] worked on digital images of clouds and sky stored in graphic format, applying a cloud screening algorithm and an image segmentation technique to identify sky status, cloud status, and cloud type. Parameters such as height, altitude, and appearance of clouds are observed as part of the result. Characteristics such as color, texture, shape, and size are also used to determine the status of rainfall by identifying the cloud type. Three feature extraction methods, Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), and Principal Component Analysis (PCA), are discussed for extracting and recognizing features to predict the status of rainfall.

Shi et al. [16] proposed deep convolutional activations-based features (DCAF), an algorithm based on a deep convolutional network for ground-based cloud classification, in which features are extracted using various pooling strategies and labels are classified using a Support Vector Machine (SVM). Experiments use two datasets, the Kiel database and SWIMCAT, and the results are compared with the earlier LBP and texton-based approaches.

Tang et al. [17] describe an improvement in ground-based cloud classification that uses region covariance descriptors (RCovDs) to extract pixel-level features from the image, the Riemannian bag-of-feature (BoF) method to extract image-level features, and finally a multiclass SVM for classification. The method was tested on two datasets, Zenithal and SWIMCAT, and was observed to produce good accuracy on a small dataset.

Manzo and Pellino [18] proposed a framework for cloud image recognition that combines multiple CNN models through transfer learning. A voting system manages the classification phase: the prediction probabilities from each model are considered and combined to produce the best possible result.

Boonyuen et al. [19] worked on satellite images to predict the level of daily rainfall over three days, training the Inception-V3 model from scratch. They predicted four rainfall classes, 'not rain', 'light rain', 'moderate rain', and 'heavy rain', and concluded that a batch of satellite images gives better accuracy than a single satellite image.

3. Existing Knowledge

Clouds form when water evaporates: warm, moist air expands and rises, and the water vapor condenses when it comes into contact with cool air. In this process, called condensation, the water vapor accumulates to form water droplets or ice crystals. The clouds formed differ in color, size, and shape. Once droplets have accumulated into a cloud, they may grow or shrink. If the droplets shrink because of warm air, they turn back into water vapor; if they keep growing, they become heavy and eventually fall from the sky as precipitation in the form of rain or snow.

3.1 Cloud information

Some of the clouds formed evaporate, while others produce precipitation, so whether a cloud will produce precipitation cannot be predicted by simply looking at it. Instead, features such as color, texture, shape, size, and altitude are observed and learned to predict whether the cloud will produce rain and to what extent.

Rain clouds are mainly classified as Cirrus, Stratus, and Cumulus [20].

Cirrus: These clouds are white, long, thin, and wispy, and look like tufts of hair. They are found at high altitudes, above 20,000 feet (6,000 meters). They indicate fair weather and are usually made of ice crystals.

Stratus: These clouds cover the whole sky in horizontal layers and form when a large air mass cools at the same time. They usually occur at low levels, up to 6,500 feet above the earth. Stratus clouds bring chilly weather with drizzle or light snow and are mostly seen around coasts and mountains.

Cumulus: These clouds are lumpy and puffy, sometimes looking like floating cotton balls, and are found very close to the ground at a height of just 1,000 meters (3,300 feet). They often have flat bases and grow vertically in a heap or pile, which is the meaning of 'cumulo' in Latin. Cumulus clouds mainly indicate fair weather, but if one keeps growing vertically it becomes a towering cumulus and can produce rain.

3.2 Classification of clouds based on height

Clouds can be classified based on their altitude, shape, and Precipitation [21].

High-level clouds: These clouds mainly appear above 20,000 feet (6,000 meters) and are referred to by the prefix "cirro". Cirrocumulus, cirrus, and cirrostratus are examples; images of some high-level clouds are shown in Figure 1.

Figure 1. High-level clouds

Mid-level clouds: These clouds mainly appear between 6,500 and 20,000 feet (2,000 to 6,000 meters) above the ground. They consist mainly of water droplets but may contain some ice crystals at cold temperatures. They carry the prefix 'alto', meaning mid-level; examples are altostratus and altocumulus. Nimbostratus forms from the thickening of altostratus clouds; it is dark grey and can produce continuous precipitation lasting many hours. It is often called a low-level cloud because its base lowers under continuous precipitation. Some mid-level cloud images are shown in Figure 2.

Figure 2. Mid-level clouds

Low-level clouds: These clouds appear very close to the ground, below 6,500 feet (2,000 meters), and are indicated by the prefix "nimbo" or the suffix "nimbus". They mostly contain water droplets, are full of moisture, and produce rain, or snowfall in cold weather. "Nimbus" is a Latin word meaning 'rain', so most rain clouds are nimbostratus or cumulonimbus. Some images of low-level clouds are shown in Figure 3.

Figure 3. Low-level clouds
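The height bands described above can be written as a small lookup. The following is a minimal sketch, not part of the original work: the function name `cloud_level` is hypothetical, the thresholds are the ones quoted in this section (6,500 and 20,000 feet), and treating exactly 6,500 and 20,000 feet as mid-level is an assumption.

```python
def cloud_level(base_height_ft: float) -> str:
    """Return the height class for a cloud base given in feet.

    Thresholds follow the text: below 6,500 ft is low-level,
    6,500 to 20,000 ft is mid-level, above 20,000 ft is high-level.
    The handling of the exact boundary values is an assumption.
    """
    if base_height_ft < 0:
        raise ValueError("height must be non-negative")
    if base_height_ft < 6500:
        return "low"
    if base_height_ft <= 20000:
        return "mid"
    return "high"
```

A cumulus base at 3,300 feet falls in the low band, while a cirrus base at 25,000 feet falls in the high band. Multi-level clouds such as nimbostratus do not fit a single band and would need separate handling.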

Not all clouds produce rain. Ten cloud types are mainly responsible for rain: altostratus, cirrostratus, nimbostratus, stratus, cumulus, altocumulus, stratocumulus, cumulonimbus, cirrocumulus, and cirrus. Rainfall can therefore be predicted from images of these rain clouds [22]. If the input image of the sky matches any of the ten cloud types, the amount of rainfall can be estimated from the type of cloud. All cloud types responsible for rainfall are described in Table 1.

Table 1. Description of clouds

| Type of cloud | Appearance of the cloud | Level of the cloud | Composition |
| --- | --- | --- | --- |
| Cirrus (Ci) | White, hair-like, thin and wispy clouds with a silky sheen. | | Composed of tiny ice crystals. |
| Cirrostratus (Cs) | High, milky-white appearance. Transparent, covering almost the whole sky like a sheet or blanket and producing a halo effect (a ring of light around the sun or moon). | | Mainly ice crystals. |
| Cirrocumulus (Cc) | Small white patches spread like grains arranged in rows. | | Mostly made up of ice crystals. |
| Altocumulus (Ac) | Bumpy rounded masses with a cotton-ball appearance, white and layered with shading. | | Mostly made up of water droplets; may contain ice crystals at low temperatures. |
| Altostratus (As) | Generally covers the whole sky and looks like a transparent grey sheet. | | Made up of liquid droplets or ice crystals. |
| Nimbostratus (Ns) | Thick, dark clouds resembling a wet blanket, associated with continuous precipitation. | Mid-level/Low-level (Multi-level) | Contains rain, snowflakes, or ice crystals. |
| Stratocumulus (Sc) | Textured and puffy, like stretched-out cotton. | | Composed of liquid droplets. If the weather becomes very cold, ice crystals may form. |
| Stratus (St) | Mist-like or fog-like appearance with long horizontal layers. | | Made up of liquid droplets. |
| Cumulus (Cu) | Cauliflower-like appearance with bulging upper parts. | | Mainly composed of liquid water. |
| Cumulonimbus (Cb) | Large, puffy clouds with strong vertical development. | | Mainly water droplets, with ice crystals at the top. |

4. Data Collection

The Cirrus Cumulus Stratus Nimbus (CCSN) dataset [10] is used for the experiment. It contains a total of 2543 cloud images in 11 categories, including contrails. As required for this work, the 200 contrail images were removed from the original dataset, and the final dataset consists of 2343 ground-based cloud images in 10 categories. The images in the CCSN dataset are described in Table 2. All images in the CCSN dataset are in JPEG format with a resolution of 256 × 256.

Table 2. CCSN dataset containing ten categories of clouds

| Cloud name | No. of images in CCSN dataset |
| --- | --- |
| Altocumulus (Ac) | |
| Altostratus (As) | |
| Cirrus (Ci) | |
| Cirrostratus (Cs) | |
| Cirrocumulus (Cc) | |
| Cumulus (Cu) | |
| Cumulonimbus (Cb) | |
| Nimbostratus (Ns) | |
| Stratocumulus (Sc) | |
| Stratus (St) | |
4.1 Clouds and associated precipitation

Rainfall can be estimated if the amount of precipitation associated with a cloud type is known. Clouds with no associated precipitation will not produce any rain, though the weather may still be cold. Some clouds produce drizzle, light rain, or rain lasting many hours, and some produce intense, heavy rain, which may be brief and accompanied by lightning and strong winds [23]. The clouds and their associated precipitation are described in Table 3.

Table 3. Classification of clouds based on associated precipitation


Associated Precipitation








No Precipitation but occasionally sprinkles or showers


Produces light showers or sprinkles


Bring a light or moderate Rainfall of long duration




Light Rain or drizzle


Showers or snow


Heavy Rain with lightning, hail, or snow

4.2 Categorization of clouds based on precipitation

Based on the precipitation and the amount of rainfall associated with each cloud type, as shown in Table 3, all cloud images are grouped into three classes.

No Rain to Very Low Rain: Cirrus (Ci), Cirrostratus (Cs), Cirrocumulus (Cc), Altocumulus (Ac).

Low to Medium Rain: Altostratus (As), Stratocumulus (Sc), Stratus (St), Nimbostratus (Ns).

Medium to Heavy Rain: Cumulonimbus (Cb), Cumulus (Cu).
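The grouping above amounts to a fixed mapping from cloud type to rainfall class. A minimal sketch of that mapping follows; the dictionary name, the label strings, and the helper `rain_class` are illustrative assumptions rather than identifiers from the original experiment.

```python
# Hypothetical label strings; the grouping itself follows Section 4.2.
RAIN_CLASS_BY_CLOUD = {
    "Ci": "no_rain_to_very_low_rain",
    "Cs": "no_rain_to_very_low_rain",
    "Cc": "no_rain_to_very_low_rain",
    "Ac": "no_rain_to_very_low_rain",
    "As": "low_to_medium_rain",
    "Sc": "low_to_medium_rain",
    "St": "low_to_medium_rain",
    "Ns": "low_to_medium_rain",
    "Cb": "medium_to_heavy_rain",
    "Cu": "medium_to_heavy_rain",
}


def rain_class(cloud_code: str) -> str:
    """Map a CCSN cloud abbreviation (e.g. "Cb") to its rainfall class."""
    try:
        return RAIN_CLASS_BY_CLOUD[cloud_code]
    except KeyError:
        raise ValueError(f"unknown cloud type: {cloud_code!r}") from None
```

With such a mapping, the ten CCSN folders could be relabeled into three classes before training, for example `rain_class("Ns")` gives the low-to-medium class.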

4.3 Division of dataset

The dataset is categorized into three classes and assigned labels where each label represents the approximate amount of Rainfall. The total dataset is divided in the ratio 80:10:10 as train set, test set, and validation set, respectively. The number of images in each set is shown in Table 4.
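An 80:10:10 division of this kind can be sketched with a seeded shuffle. This is only an illustration under stated assumptions: the function name `split_dataset`, the fixed seed, and the rounding rule (remainder goes to the validation set) are not taken from the original experiment, and the sketch splits the whole list rather than each class separately.

```python
import random


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle items and split them into train, test, and validation lists.

    ratios is (train, test, validation), matching the 80:10:10 order in the
    text. Any remainder left by integer rounding goes to the validation set.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * ratios[0])
    n_test = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation
```

Applied to 2343 file names, this rounding rule gives 1874 training, 234 test, and 235 validation items; the per-class counts used in the paper may differ.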

Table 4. Division of images in the dataset

Labels of Classes




No_Rain to very_low_Rain




Low_Rain to Medium Rain




Medium_rain to High_Rain








5. Model and Methods

Assuming there are clouds in the sky, the task is to identify them. A human observer looks at the color and thickness of the cloud, whether sunshine comes through it, whether the cloud is stationary or moving, and the intensity of the wind, the temperature, and the weather, and plans activities accordingly. Weather today is very unpredictable and influences many activities. In agriculture, for example, if a field has already been irrigated and unexpected rain falls, the crop will be damaged. At other times rain may seem likely, but the clouds drift away.

Similarly, crop harvesting depends on rain, and many outdoor events are planned for which advance information about the weather is needed. Weather forecasting does a great job, but conditions differ from location to location: sometimes it rains in one area while there is no rain a kilometer away. A model or device is therefore needed that can predict the weather at any time, from the current situation at a particular location, so that quick decisions can be made.

The idea presented here is to build a model that takes an image of the sky and predicts the amount of rainfall (low, medium, or high) at the current location based on the type of clouds present at that time. The experiments are carried out on the same CCSN dataset with a CNN and with transfer learning using the popular models VGG16, Inception-V3, and Xception, pretrained on the ImageNet dataset.

5.1 Models used

Basic CNN:

Convolutional neural networks work like human vision and are thus very effective for image classification. The basic structure of a CNN consists of an input layer, hidden layers, and an output layer. The hidden layers usually consist of convolutional layers, activation functions, pooling layers, and fully connected layers, also called dense layers [24]. The basic architecture of the CNN model is shown in Figure 4.

Convolutional layers are the main layer that applies a convolution operation using various filters to the input and extracts various features. The information is passed forward to the next layer.

The ReLU activation function is simple and fast: it outputs zero if the input is negative and outputs the input directly if it is positive.

Pooling layers are used in between convolution layers and contribute by reducing the size of images by keeping essential features and removing the unnecessary area, thus reducing the computational cost. Max-pooling and average-pooling are the most popular pooling strategies used in CNN.
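To make the size reduction concrete, the sketch below applies 2×2 max-pooling with stride 2 to a small single-channel image stored as nested lists. It is an illustration only; the function name `max_pool_2x2` is hypothetical, and a trailing odd row or column is simply dropped, which is one common convention.

```python
def max_pool_2x2(image):
    """Apply 2x2 max-pooling with stride 2 to a 2D list of numbers.

    Each output value is the maximum of one non-overlapping 2x2 window,
    so height and width are halved. An odd trailing row or column is dropped.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]
```

A 4×4 input becomes a 2×2 output that keeps the strongest response of each window, which is why pooling reduces computation while retaining the dominant features.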

In Fully connected layers, every neuron in one layer gets connected to every neuron in the next layer. These are also called dense layers, where the actual learning takes place by adjusting the weights controlled by the optimizer used.

The SoftMax activation function is used for multiclass classification; the output layer then predicts the class of the input image based on what the previous layers learned from the training dataset.
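The two activation functions mentioned in this section can be written in a few lines. The following is a minimal plain-Python sketch for illustration, not the framework implementation used in the experiment; subtracting the maximum logit before exponentiating is a standard numerical-stability step.

```python
import math


def relu(x: float) -> float:
    """ReLU: zero for negative inputs, the input itself otherwise."""
    return max(0.0, x)


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    largest = max(logits)  # shift for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

For a three-class output such as the rainfall classes used here, the predicted class is the index of the largest SoftMax probability.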

The CNN model designed for this experiment consists of 12 convolution layers, each followed by ReLU as the activation function, 6 max-pooling layers with a pool size of 2×2, and 3 batch normalization layers followed by dropout with a rate of 0.2. After flattening the feature map, two dense layers with 2048 and 1024 units are used. The SoftMax activation function handles the multi-class output. The Adam optimizer is used with a learning rate of 0.0001, and categorical cross-entropy is used as the loss function.

Figure 4. Basic CNN structure

Figure 5. Concept of transfer learning

Transfer learning:

Transfer learning is the technique of reusing the knowledge learned by a model trained on an extensive dataset for one task to solve another task on a new dataset, as shown in Figure 5. It gives the new model a starting point instead of starting from scratch and is especially helpful when the dataset is very small. A model built from scratch will not train properly on a small dataset: it may produce poor results because of insufficient data and may take a long time to train. In such cases, the knowledge acquired by pre-trained models with deep and broad architectures can be used to train a model on the new dataset for the new task with minimal data [25], producing good results in significantly less time than a model developed from scratch.

In a CNN, the first few layers (bottom layers) learn only generalized features such as edges, shapes, and noise, while the top layers learn more complex patterns such as color, texture, and special features like eyes, nose, or smile. With transfer learning, the lower layers are reused for the general features. The last layers (top layers) are removed and new layers (a new classifier) are added on top, so that the model learns the special features of the new dataset, as shown in Figure 6.

Figure 6. Working model of transfer learning


ImageNet: ImageNet is a very large dataset containing images of almost 22,000 different categories of objects for computer vision research. It has more than 1.2 million images and is designed to train a model to assign an input image to one of 1000 diverse object categories [26]. ImageNet consists of objects encountered in day-to-day life, such as cats, dogs, vehicles, and household items. Neural networks are organized so that the bottom layers detect very general features such as edges, the next layers detect somewhat more specific features such as shape and color, and the top layers detect specific and complex features of the images [27]. The main idea of transfer learning is to reuse the work done by the bottom layers, since the detection of generalized features is the same for every image; instead of making the model learn from scratch, the basic learning of the bottom layers can be used efficiently for the new image dataset [28]. Only the top layers need modification: they are replaced by new layers and a new classifier that learn specialized features from the images in the new dataset. To implement transfer learning for new images, well-known models such as VGG16, VGG19, ResNet, Inception V3, and Xception are publicly available. They are already trained on the ImageNet dataset, have very powerful architectures, and can classify images into 1000 categories; this existing knowledge is used to quickly train a new classifier for the specific classes of an entirely new and even small dataset.

Description of some well-known pre-trained models on the ImageNet dataset

VGG-16: VGG-16 is a deep CNN model with 16 weight layers, proposed by Oxford University, and was 1st runner-up in ILSVRC 2014. It consists of 13 convolution layers and three fully connected layers [29]. The convolution layers use multiple small 3 × 3 filters, an improvement over AlexNet's large convolution filters (sizes 7 × 7 and 11 × 11).

Inception V3: This model, 1st runner-up in ILSVRC 2015, requires very little computational power compared with VGG16. It combines the ideas of various researchers and has far fewer parameters than VGG16, around 7 million, which makes it computationally fast. It consists of building blocks such as convolution layers, max-pooling layers, average-pooling layers, batch normalization, and dropout, along with fully connected layers and SoftMax as a classifier [30]. The performance metrics of some well-known pretrained models on the ImageNet dataset are presented in Table 5.

Xception model: This model is an improvement on the Inception model; its name stands for an extreme version of Inception, built on depth-wise separable convolutions. Batch normalization is used after every convolution and separable convolution. The order of the pointwise and depth-wise convolutions is also reversed: the depth-wise convolution follows the pointwise convolution. In the Xception model, data pass through three main stages: the entry flow, the middle flow (repeated eight times), and the exit flow [31]. The basic structure of the Xception model is shown in Figure 7.
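One reason depth-wise separable convolutions are attractive is their low weight count, which a short calculation illustrates. The sketch below compares a standard convolution with a depth-wise separable one for the same kernel size and channel counts; the function names and the example sizes are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depth-wise separable convolution.

    Depth-wise part: one k x k filter per input channel.
    Pointwise part: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise
```

For a 3 × 3 kernel with 128 input and 256 output channels, the standard layer has 294,912 weights and the separable layer has 33,920, roughly 8.7 times fewer.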

Table 5. Performance of models on ImageNet dataset


Top-1 Accuracy

Top-5 Accuracy






528 MB





92 MB





88 MB


5.2 Methods used

Transfer learning and fine-tuning: The common workflow for transfer learning is as follows:

  1. Take layers from a model previously trained on a large dataset.
  2. Freeze all these layers so that the information they contain is not destroyed during training. This stage is called feature extraction; the weights of the frozen layers remain unchanged.
  3. Add new dense layers on top of the frozen layers to turn the old features into predictions on the new dataset.
  4. Train the new layers on the new dataset.
  5. Optionally, fine-tune: unfreeze the entire model or part of it and re-train it on the new dataset with a very low learning rate. This step can give meaningful improvements by adapting the pretrained features to the new dataset.

Figure 7. XCeption architecture

Figure 8. Working model of proposed work

Fine-tuning: The concept of Fine-Tuning is used to train the model with the VGG16, Inception-V3, and XCeption models. First, the features are extracted from the pretrained model by freezing its layers, and then the model is fine-tuned, i.e., implemented the learned knowledge on the new dataset by unfreezing the model. Thus, it helps to achieve good performance with less training time. The weights of unfrozen layers and newly added fully connected layers are trained and updated [32]. The working of the proposed model is described in Figure 8.

Initially, the weights of the pretrained model are downloaded. All the layers are then kept frozen except the top layer, which is used for classification on the new dataset, and the model is trained. For fine-tuning, all the layers are unfrozen and the model is trained again on the new dataset with a very low learning rate. Once trained, the model is ready for prediction with good accuracy. The same flow is used for all three pre-trained models: VGG16, Inception-V3, and XCeption.

5.3 Some important hyper-parameters experimented

Data augmentation: Data augmentation is very helpful when the dataset is small. Models train better on a large number of images, but in some domains, such as cloud images, it is hard to find a large dataset. Image augmentation virtually expands the training set by forming new images from existing ones, so that the classifier is exposed to more variation and performs better, especially when few images are available. It artificially creates images by applying combinations of processing techniques such as zooming, left rotation, right rotation, vertical or horizontal flip, brightness adjustment, and many more [33].
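As a rough illustration of how such transformations create new training samples, the sketch below applies flips, a 90-degree rotation, and a brightness change to a tiny grayscale image stored as a nested list. The helper names and pixel values are hypothetical; a real pipeline would normally rely on an image library rather than hand-written functions.

```python
def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]


def rotate_right(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_brightness(image, factor):
    """Scale pixel values and clip them to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def augment(image):
    """Return several artificial variants of one input image."""
    return [
        horizontal_flip(image),
        vertical_flip(image),
        rotate_right(image),
        adjust_brightness(image, 1.5),
    ]


image = [[10, 20, 30],
         [40, 50, 60]]  # hypothetical 2 x 3 grayscale image
for variant in augment(image):
    print(variant)
```

One original image yields four additional samples here; combining the transformations multiplies that number further.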

Optimizer: Optimizers improve the performance of the model by adjusting its weights so as to minimize the loss function. The loss function measures the difference between the predicted value and the known value, and the optimizer adjusts the weights of the neural network to reduce it. In this experiment, the Adam optimizer is used. Adam stands for Adaptive Moment Estimation; it is a popular improvement on stochastic gradient descent that combines the advantages of AdaGrad and RMSprop and is based on adaptive estimation of first-order and second-order moments. Categorical cross-entropy is used as the loss function, and the Adam optimizer is used with a learning rate of 0.001.
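The Adam update rule can be written in a few lines. The following is a minimal single-parameter sketch with the commonly used default constants (beta1 = 0.9, beta2 = 0.999); the quadratic objective and the function name `adam_minimize` are hypothetical and serve only to show the moment estimates at work.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a one-dimensional function with Adam, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment (uncentered variance) estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2 * (x - 3)


one_step = adam_minimize(grad, x0=0.0, lr=0.001, steps=1)
converged = adam_minimize(grad, x0=0.0, lr=0.1, steps=2000)
print(one_step, converged)
```

Because the update divides by the square root of the second-moment estimate, the very first step has a size close to the learning rate regardless of the gradient's magnitude.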

Learning rate: The learning rate is a significant factor for any model, and choosing the correct value is always a difficult task that needs many rounds of trial and error. Learning rates smaller than the optimal value take much longer to reach an ideal state, whereas learning rates that are too large may prevent the algorithm from converging. According to the Keras documentation, the recommended learning rates are 0.001, 0.01, 0.1, and 1, which are appropriate for training a CNN model from scratch. In transfer learning, however, the dataset is small, as in this case. When a pre-trained CNN is fine-tuned on a small dataset, only part of the model needs to be updated, which may require a considerable number of epochs, so it is always recommended to use very small learning rates (i.e., about 0.00001 or even less) when fine-tuning a model.
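The effect of the learning rate can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w^2 with three hypothetical learning rates; it is only an illustration of the trade-off described above, not a recipe for choosing the rate of a real network.

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2*w."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w
    return w


tiny = gradient_descent(lr=0.00001)     # fine-tuning-sized rate: weights barely move
moderate = gradient_descent(lr=0.1)     # converges quickly towards the minimum at 0
too_large = gradient_descent(lr=1.1)    # overshoots on every step and diverges
print(tiny, moderate, too_large)
```

The tiny rate leaves the starting weight almost untouched after 50 steps, which is exactly the behavior wanted during fine-tuning, where the pretrained weights should change only slightly.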

Batch size: Hyperparameters play a crucial role in training a CNN, and their values need to be chosen carefully to obtain the expected results. Parameters such as the batch size must be set before training starts, because they affect the resource requirements of the training process, its speed, and the number of iterations, which in turn affect the accuracy of the network and the time taken to converge [34]. The batch size determines how many images from the training set take part in each gradient estimation.
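The relation between batch size and the number of iterations per epoch is simple arithmetic, sketched below. The dataset size of 1000 images and the batch size of 32 are hypothetical values, not the ones used in this experiment.

```python
import math


def iterations_per_epoch(num_images, batch_size):
    """Number of weight updates needed to see every training image once."""
    return math.ceil(num_images / batch_size)


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


image_ids = list(range(1000))            # hypothetical training set of 1000 images
all_batches = list(batches(image_ids, 32))
print(iterations_per_epoch(1000, 32), len(all_batches), len(all_batches[-1]))
```

A larger batch reduces the number of iterations per epoch but needs more memory for each gradient estimate.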

Number of epochs: Depending on the validation error, the number of epochs must be adjusted to decrease the loss and ultimately increase the model's accuracy. A CNN model trained from scratch requires a large number of epochs, while a model built on pretrained models through transfer learning needs far fewer epochs, even fewer than 100, to be trained on the new dataset [35].
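One common way to adjust the number of epochs from the validation error is early stopping, which is named here as an illustrative technique and is not stated to be the procedure used in this work. The sketch below stops once the validation loss has not improved for a given number of epochs; the loss values and the `patience` setting are hypothetical.

```python
def early_stopping(val_losses, patience=3):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    stopped_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        stopped_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:   # no improvement for `patience` epochs in a row
                break
    return best_epoch, stopped_epoch


# Hypothetical validation losses, one value per epoch.
val_losses = [1.10, 0.95, 0.80, 0.72, 0.70, 0.71, 0.73, 0.74, 0.69]
print(early_stopping(val_losses, patience=3))
```

With these hypothetical values, training would halt before the last epoch is ever run, and the weights from the best epoch would be kept.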

Dropout: Dropout is a fundamental concept and one of the most convenient ways of reducing overfitting; it sometimes outperforms other regularization methods [36]. Overfitting arises when the training accuracy is considerably higher than the validation accuracy, which means that the model fits the training data too closely but does not perform well during the validation phase. To control or reduce this problem, some neurons are randomly dropped from the layers (hidden or dense) by temporarily disconnecting them from the network during training.
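A minimal sketch of dropout on a list of activations is given below. It uses the "inverted" variant, in which the kept activations are rescaled during training so that nothing has to change at prediction time; the function signature and the drop rate of 0.5 are assumptions made for illustration.

```python
import random


def dropout(activations, rate, training, rng):
    """Randomly zero a fraction `rate` of activations during training (rate < 1)."""
    if not training or rate == 0.0:
        return list(activations)           # dropout is disabled at prediction time
    keep = 1.0 - rate
    # Kept units are scaled by 1/keep so the expected activation stays the same.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)                     # fixed seed for a reproducible example
train_out = dropout([1.0] * 10000, rate=0.5, training=True, rng=rng)
test_out = dropout([1.0, 2.0, 3.0], rate=0.5, training=False, rng=rng)
dropped_fraction = train_out.count(0.0) / len(train_out)
print(round(dropped_fraction, 2), test_out)
```

About half of the units are silenced in each training pass, so the network cannot rely on any single neuron.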

5.4 Platform used

Training the model on a CPU took hours, so the code was instead run on Google Colab Pro, a platform provided by Google for research purposes. Google Colab supports GPU-enabled hardware (Python 3 Google Compute Engine backend (GPU)), which gives training a massive boost and allows it to be completed in far less time than on a local CPU.

6. Results and Analysis

This section presents the results obtained from the different models: CNN, and transfer learning using VGG16, Inception-V3, and Xception.

6.1 CNN results

Figure 9. Training and validation loss and accuracy with CNN

A typical CNN model with convolution layers, max-pooling layers, and Batch Normalization was trained on the dataset from scratch. Because the dataset is small, various data augmentation techniques were used to obtain good accuracy and overcome overfitting. A model is said to be overfitting when the training accuracy increases but the testing accuracy remains the same. The results with the plain CNN were only average, as shown in Figure 9: around 59% test accuracy and 57% validation accuracy. The work therefore moved to transfer learning for better performance.

6.2 Results of transfer learning using VGG16, Inception-V3, and XCeption model

Since the results with the regular CNN were not satisfactory, the model was then trained using transfer learning. Three models were chosen for implementation, VGG16, Inception-V3, and XCeption, and the results obtained are summarized in Table 6.

The loss and accuracy obtained from VGG16, Inception-V3, and XCeption are shown in Figures 10-12, respectively, as (a) without fine-tuning and (b) with fine-tuning.

  • Results obtained from VGG16 (Figure 10 and Table 6)
  • Results obtained from Inception-V3 (Figure 11 and Table 6)
  • Results obtained from XCeption (Figure 12 and Table 6)

The Xception model was found to give the best accuracy in less time than the VGG16 and Inception-V3 models. The predictions are therefore made with the Xception model.
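The training times reported in Table 6 can be compared directly by converting them to seconds. The sketch below does this for the durations listed there; the assignment of rows to VGG16 and XCeption is an assumption based on the order in which the models are discussed, since only the Inception V3 row label is legible in the table, and the helper `to_seconds` is hypothetical.

```python
import re


def to_seconds(duration):
    """Convert a string such as '42m 45 s' or '20m 21s' to seconds."""
    minutes, seconds = re.fullmatch(r"\s*(\d+)\s*m\s*(\d+)\s*s\s*", duration).groups()
    return int(minutes) * 60 + int(seconds)


# (time without fine-tuning, time with fine-tuning), as listed in Table 6.
# The VGG16 and XCeption row labels are assumed from the order of the models.
table6_times = {
    "VGG16": ("42m 45 s", "20m 21s"),
    "Inception-V3": ("43m 37s", "20m 32s"),
    "XCeption": ("36m 44 s", "15m 25s"),
}

totals = {model: sum(to_seconds(d) for d in pair) for model, pair in table6_times.items()}
fastest = min(totals, key=totals.get)
print(totals, fastest)
```

Under that row assignment, XCeption needs the least combined training time of the three models, which agrees with the observation above.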

The results are shown in Table 7, where the images are randomly selected from the validation set and test set of each class. Table 8 shows the accuracy in predicting images downloaded from the internet, outside the validation set and test set.

Figure 13 shows some cloud images passed as input together with the predicted output. A total of 12 images are used as input, four from each class. Of these 12 images, 11 were predicted correctly by the trained Xception model.
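The accuracy on these 12 images follows directly from the counts. In the sketch below the class names come from the dataset description, but the position of the single wrong prediction is hypothetical, because the text does not say which image or class was misclassified.

```python
CLASSES = ["no-rain to very low-rain", "low to medium-rain", "medium to high-rain"]


def accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def correct_per_class(true_labels, predicted_labels):
    """Count correctly predicted images for every class."""
    counts = {name: 0 for name in CLASSES}
    for t, p in zip(true_labels, predicted_labels):
        if t == p:
            counts[t] += 1
    return counts


true_labels = [name for name in CLASSES for _ in range(4)]   # four images per class
predicted_labels = list(true_labels)
predicted_labels[5] = CLASSES[2]   # hypothetical placement of the one wrong prediction

print(round(accuracy(true_labels, predicted_labels), 3))
print(correct_per_class(true_labels, predicted_labels))
```

Eleven correct predictions out of twelve correspond to an accuracy of about 91.7% on this small sample.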

Table 6. Results obtained with VGG16, Inception-V3, and Xception model

Model | Time Taken without FT (Epochs=50) | Training Accuracy without FT | Validation Accuracy without FT | Time Taken with FT (Epochs=35) | Training Accuracy with FT | Validation Accuracy with FT
VGG16 | 42m 45s | | | 20m 21s | |
Inception V3 | 43m 37s | | | 20m 32s | |
Xception | 36m 44s | | | 15m 25s | |

Table 7. Prediction results on images from test and validation set

Class | Total Images (Test Set + Validation Set) | Predicted Correct | Predicted Wrong
No_Rain to very_low_Rain | | |
Low to Medium Rain | | |
Medium to High Rain | | |

Figure 10. Results of VGG16

Figure 11. Results of Inception-V3

Figure 12. Results of Xception

Table 8. Prediction results on images outside test and validation set

Class | Images outside test and validation set | Predicted Correct | Predicted Wrong
No_Rain to very_low_Rain | | |
Low to Medium Rain | | |
Medium to High Rain | | |

Figure 13. Predicted output with random images

7. Conclusion and Future Scope

Transfer learning with the Xception model achieves good accuracy in predicting rainfall from cloud images taken from the ground, classifying each image as no rain to very low rain, low to medium rain, or medium to high rain. The experiment was first carried out with a CNN, which took many epochs to train and required tuning of hyperparameters such as learning rate, optimizer, batch size, and dropout. Even after much trial and error, the accuracy obtained was only average. The transfer learning technique was therefore implemented to improve the performance of the model. According to the results, transfer learning is fast and efficient and requires less time to achieve reasonable accuracy, and the XCeption model achieved better accuracy than the VGG16 and Inception-V3 models. The accuracy obtained is around 80% for training and 79% for validation, which is good enough for the model to be used in real-world settings. Training accuracy can be raised above 90% by increasing the number of training cycles (epochs), but this gives rise to overfitting.

This model will help people to predict the status of rainfall at any location and at any time from an image of the current sky, so that farmers and others can take essential decisions based on the clouds present in the sky. In the future, accuracy can be improved by enriching the database with more images. The model can also be made more robust by including atmospheric parameters such as temperature, humidity, dew point, and wind speed, which are the essential meteorological parameters to be monitored continuously for short-term rainfall prediction. Combining both approaches, cloud images and the atmospheric parameters responsible for rain, can make rainfall prediction more accurate.


[1] Tsukahara, J., Fujimoto, Y., Fudeyasu, H. (2019). Rainfall forecasting by using residual network with cloud image and humidity. 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), Helsinki, Finland, pp. 331-336.

[2] Polisetty, K., Paidipati, K.K., Bodapati, J.D. (2019). Modelling of monthly rainfall patterns in the north-west India using SVM. Ingénierie des Systèmes d'Information, 24(4): 391-395.

[3] Ayalew, D., Tesfaye, K., Mamo, G., Yitaferu, B., Bayu, W. (2012). Variability of rainfall and its current trend in Amhara region, Ethiopia. African Journal of Agricultural Research, 7(10): 1475-1486.

[4] Shou, Y., Li, S., Shou, S., Zhao, Z. (2006). Application of a cloud-texture analysis scheme to the cloud cluster structure recognition and rainfall estimation in a mesoscale rainstorm process. Advances in Atmospheric Sciences, 23(5): 767-774.

[5] Ye, L., Cao, Z., Xiao, Y. (2017). DeepCloud: Ground-based cloud image categorization using deep convolutional features. IEEE Transactions on Geoscience and Remote Sensing, 55(10): 5729-5740.

[6] Ezzat, D., Hassanien, A.E., Taha, M.H.N., Bhattacharyya, S., Vaclav, S. (2020). Transfer learning with a fine-tuned CNN model for classifying augmented natural images. In International Conference on Innovative Computing and Communications, pp. 843-856.

[7] Gogoi, M., Devi, G. (2015). Cloud Image Analysis for Rainfall Prediction: A Survey. Advanced Research in Electrical and Electronic Engineering, 2(13): 13-17. 

[8] Tuominen, P., Tuononen, M. (2017). Cloud detection and movement estimation based on sky camera images using neural networks and the Lucas-Kanade method. In AIP Conference Proceedings, 1850(1): 140020.

[9] Dev, S., Lee, Y.H., Winkler, S. (2016). Color-based segmentation of sky/cloud images from ground-based cameras. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(1): 231-242.

[10] Zhang, J., Liu, P., Zhang, F., Song, Q. (2018). CloudNet: Ground-based cloud classification with deep convolutional neural network. Geophysical Research Letters, 45(16): 8665-8672.

[11] Lagunas, M., Garces, E. (2018). Transfer learning for illustration classification. arXiv preprint arXiv:1806.02682.

[12] Lai, C., Liu, T., Mei, R., Wang, H., Hu, S. (2019). The cloud images classification based on convolutional neural network. 2019 International Conference on Meteorology Observations (ICMO), Chengdu, China, pp. 1-4.

[13] García Fernández, A. (2019). Cloud Classification Using Convolutional Neural Networks (Bachelor's thesis).

[14] Kaviarasu, K., Sujith, P., Ayaappan, G. (2010). Prediction of Rainfall using image processing. 2010 IEEE International Conference on Computational Intelligence and Computing Research (Vol. 20).

[15] Salot, N., Swaminarayan, P.R. (2015). Classification of cloud types for rainfall forecasting. International Journal of Advanced Networking and Applications, 7(1): 2626. 

[16] Shi, C., Wang, C., Wang, Y., Xiao, B. (2017). Deep convolutional activations-based features for ground-based cloud classification. IEEE Geoscience and Remote Sensing Letters, 14(6): 816-820.

[17] Tang, Y., Yang, P., Zhou, Z., Pan, D., Chen, J., Zhao, X. (2021). Improving cloud type classification of ground-based images using region covariance descriptors. Atmospheric Measurement Techniques, 14(1): 737-747.

[18] Manzo, M., Pellino, S. (2021). Voting in Transfer Learning System for Ground-Based Cloud Classification. arXiv preprint arXiv:2103.04667.

[19] Boonyuen, K., Kaewprapha, P., Weesakul, U., Srivihok, P. (2019). Convolutional neural network inception-v3: A machine learning approach for leveling short-range rainfall forecast model from satellite image. International Conference on Swarm Intelligence, pp. 105-115.

[20] Houze Jr, R.A. (2014). Types of clouds in earth's atmosphere. In International Geophysics, 104: 3-23.

[21] Panchgani, A., Doshi, H., Limbasiya, N. (2014). Prediction of rainfall using image processing. International Journal of Advanced Engineering and Research Development, 1(12): 61-67.

[22] Liu, S., Li, M., Zhang, Z., Cao, X., Durrani, T.S. (2020). Ground-based cloud classification using task‐based graph convolutional network. Geophysical Research Letters, 47(5): e2020GL087338.

[23] Sinabutar, J.J., Sasmito, B., Sukmono, A. (2020). Studi cloud masking menggunakan band quality assessment, function of mask dan multi-temporal cloud masking pada citra landsat 8. Jurnal Geodesi Undip, 9(3): 51-60.

[24] Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., Alsaadi, F. E. (2017). A survey of deep neural network architectures and their applications. Neurocomputing, 234: 11-26.

[25] Shu, M. (2019). Deep learning for image classification on very small datasets using transfer learning.

[26] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F. (2009). ImageNet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, pp. 20-21.

[27] Simonyan, K., Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

[28] Liu, S., Deng, W. (2015). Very deep convolutional neural network based image classification using small training sample size. 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Kuala Lumpur, Malaysia, pp. 730-734.

[29] Chen-McCaig, Z., Hoseinnezhad, R., Bab-Hadiashar, A. (2017). Convolutional neural networks for texture recognition using transfer learning. 2017 International Conference on Control, Automation and Information Sciences (ICCAIS), Chiang Mai, Thailand, pp. 187-192.

[30] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z. (2016). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 2818-2826.

[31] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 1251-1258.

[32] Wolf, T., Sanh, V., Chaumond, J., Delangue, C. (2019). TransferTransfo: A transfer learning approach for neural network based conversational agents. arXiv preprint arXiv:1901.08149.

[33] Han, D., Liu, Q., Fan, W. (2018). A new image classification method using CNN transfer learning and web data augmentation. Expert Systems with Applications, 95: 43-56.

[34] Kandel, I., Castelli, M. (2020). The effect of batch size on the generalizability of the convolutional neural networks on a histopathology dataset. ICT Express, 6(4): 312-315.

[35] Palakodati, S.S.S., Chirra, V.R.R., Yakobu, D., Bulla, S. (2020). Fresh and rotten fruits classification using CNN and transfer learning. Revue d'Intelligence Artificielle, 34(5): 617-622.

[36] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1): 1929-1958.