Cast Shadow Angle Detection in Morphological Aerial Images Using Faster R-CNN

Sana Pavan Kumar Reddy, Jonnadula Harikiran

School of Computer Science and Engineering, VIT-AP University, Inavolu, Amaravati 522237, Andhra Pradesh, India

Corresponding Author Email: pavansana8@gmail.com

Page: 1313-1321 | DOI: https://doi.org/10.18280/ts.390424

Received: 2 May 2022 | Revised: 16 July 2022 | Accepted: 23 July 2022 | Available online: 31 August 2022

© 2022 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

With the tremendous advancements in digital image processing technology over the last few years, it is now possible to resolve many challenging issues. In light of this, this study proposes that digital image processing can be used to detect shadows in photographs. Unmanned aerial vehicles and satellite devices have become common image-generating devices, and a significant issue in the images they produce is shadow. Shadows are inevitable in remote sensing photographs, particularly in metropolitan environments, due to occlusion by high-rise objects and the influence of the sun's altitude. This results in missing information in the shadow zone. State-of-the-art shadow detection algorithms require manual alignment and predefined specific parameters, and most of the existing algorithms fail to deliver precise results under varied lighting and environmental conditions. To overcome these limitations, we propose a framework called the Multi Layered Linked approach with Tagged Feature Model for Shadow Angle Detection (MLTFM-SAD). The aim of the proposed model is to detect the shadows in aerial photographs and the angle of those shadows. The proposed framework is a three-step approach. Initially, image segmentation is applied to the input images. Second, a hybrid of a Gaussian mixture model and Otsu's method is applied to produce a segmented shadow mask map and its corresponding pixel set; the initial shadow mask map is then refined using object spectral attributes and spatial correlations between objects. Finally, the angle at which the shadow appears in the given image is recognised and analysed. The proposed method's performance is compared with that of current approaches, and the results show that the proposed model performs better.

Keywords: 

shadow detection, shadow angle prediction, image processing, segmentation, multi-layer approach, tagged features

1. Introduction

It is not uncommon for photos and videos to feature shadows. Shadows are formed when direct light from an illumination source is blocked by an opaque object. In the past, shadows were treated as noise that might interfere with detecting and tracking objects. Image features fall into four main categories: shape, appearance, motion, and depth. Shape features are used in many detection methods because they are invariant to changes in perspective [1]. There are very few shape features with which to identify or model objects; as a result, shape-based detection algorithms can be quite efficient [2]. However, because of clutter in the background, it is difficult to extract precise shape details. Image appearance features have been found useful in sliding-window detection systems [3]. Contrast information is computed and described using a variety of descriptors [4]. Histograms of oriented gradients (HOG) have performed well in discriminative detection models. Detecting severely occluded items with appearance-based approaches remains difficult.

Stationary cameras are common in visual surveillance applications. Understanding a scene begins with removal of the background [5]. The results of background removal can be used to perform object detection and tracking [6]. Motion blobs can be useful in a cluttered setting because they can be compared against a previously learned background. Background subtraction is also performed on video sequences in this study. Subtraction results can be inaccurate; background modelling and subtraction are useful, but recognising ambiguous pixels remains a challenge [7]. Multiple objects may be merged in the background subtraction results, and one object may be split into several blobs [8]. Due to motion ambiguity, erroneous background model updates introduce inaccurate models [9]. When the background and foreground appear to be the same colour, the situation becomes even more challenging.

Shadow processing involves two stages: detecting shadows and correcting them. Detection approaches fall into two categories: those based on models and those based on shadow features [10]. Preliminary information about the scene and moving targets is used to build shadow models. Feature-based approaches use shadow characteristics such as grey scale [11], brightness, saturation, and texture to identify shadow areas; shadows appear in a photograph as regions of low grey value. Invariant colour spaces, such as HSV, HCV, YIQ, and C1C2C3, can also be used to detect shadows. Another method uses an adaptive Debye radiator model [12]. Shadow correction techniques include multisource data fusion, filtering, and radiometric enhancement. Three distinct techniques are applied to radiometrically reconstruct the observed shadow areas: gamma correction, linear correlation, and histogram matching. Most shadow detection algorithms aim at pixel-level detection, but shadows can also be detected in an object-oriented manner; object-based approaches focus on image objects rather than individual pixels [13].

Physically-based and image-neighbourhood-based lighting invariants have been used to identify shadows in a single image [14]. To be reliably calculated, these invariants require high-quality images with a wide dynamic range and high intensity resolution, in which the lens radiometry and colour transformations are precisely measured and accounted for. Even the tiniest flaws can badly corrupt these invariants [15]. The noise and distortions introduced by automatic gain control and colour balancing in consumer-grade photos, such as those from Flickr and Google, prevent these filters from working well in such situations. There is a pressing need for a shadow detector that can function on low-quality consumer photos [16].

Occlusion identification remains challenging, however. As an example, Figure 1 shows two groups of five persons. Some occlusion can be seen for the two girls on the left of the group, and the last three people in the group on the right are difficult to see [17]. The number and location of people cannot be deduced from their outward appearance alone. Shadow information is not random noise [18]; it is useful in visual tasks such as monitoring and detection. The shadow detection process is shown in Figure 1.

Figure 1. Shadow detection process

Physical shadow models were used in the early stages of shadow removal research. A popular technique formulates shadow removal with an image generation model that represents the picture in terms of material qualities and a light source-occluder system that casts the shadows. Once the source-occluder system's parameters are estimated, the shadow effects on the image can be reversed and a shadow-free image obtained. When dealing with the perplexing issue of shadows [19], one option is to create shadow-free images: photos are processed so that shadows are erased while the other important information in the photo is kept. This strategy rests on a scientifically sound foundation, using a method for removing the colour and intensity of the predominant illumination from an image [20]. A one-dimensional invariant image that depends only on reflectance is used to determine a single scalar function of an RGB image that is invariant to changes in light colour and intensity. The invariant image contains no shadows, because shadows are visible only through a change in the colour and intensity of the light. Importantly, unlike prior invariant calculations, the scalar function is not affected by features such as occlusion edges, which can alter invariants calculated over a larger area of an image [21].

While most existing methods of shadow detection rely on manually extracted, hand-crafted features, Fast R-CNN provides robust automatically extracted features for shadow modelling with many distinguishing characteristics. R-CNN requires no background subtraction, since it achieves precise object detection from the raw input frame and so eliminates numerous background subtraction concerns such as noise. On top of the aforesaid issues, this research suggests an improved method built on a more efficient region-based convolutional neural network. It replaces the prior region proposal network (RPN), selective search, and other candidate region methods with the guided-anchor approach. Because of object occlusion, a context feature is also used to access the network module. Multi-feature fusion using the skip pooling approach [17] is utilised to tackle the problem of small object scale, so that Faster R-CNN can perform better in complicated scenarios. Object detection in general situations can be improved by using Faster R-CNN, but the objects recognised in these scenes often suffer from occlusion, distortion, and large variations in scale. A Faster R-CNN detector's performance is hampered when these issues are present.

Recently, object detectors have been widely used in artificial intelligence, face recognition, autonomous driving, and other areas. Both traditional object detection methods and deep learning-based object detection algorithms exist. In the past, object detection algorithms relied mostly on sliding windows or feature points. Sliding-window region selection can produce good results, but its lack of pertinence leads to high time complexity and window redundancy, and manual feature selection methods are often insufficiently robust. The advancement of deep learning has shifted object detection from hand-crafted feature selection to neural network-based methods. There are two primary types of object detection algorithms based on deep neural networks: two-stage approaches such as R-CNN, which combine a classification model with convolutional neural networks (CNNs), and one-stage techniques that transform object detection into a regression problem.

Rather than training separate SVM classifiers as in R-CNN, Fast R-CNN employs a softmax layer; it keeps all of the features in video memory, consuming less storage while increasing detection speed. Selective search and similar methods for choosing region proposals yield a high number of invalid regions, a problem that neither R-CNN nor Fast R-CNN solves; as a result, computing resources are wasted. An RPN can make full use of feature maps and learn to generate region proposals on its own using neural networks. For faster detection, the RPN replaces time-consuming selective search and comparable algorithms. To obtain better accuracy than R-CNN and plain CNNs, an image-based multi-layered linked approach with a tagged feature model is proposed for accurate shadow angle detection in images.

When a light source is obscured, a shadow is cast. There are two sorts of shadow: the cast shadow and the self-shadow [22]. Here, we concentrate on shadows cast in remote sensing photos. Figure 2 depicts object-oriented shadow detection as a block diagram. Image objects are created by dividing the image into smaller parts, and segmentation is used to extract each individual image object. Each object is checked against the actual image's histogram to see whether it is influenced by shadow [23]. Once false detections are deleted from the identified shadows, the shadow detection procedure is complete. Before radiometric correction is applied [24], the shadow boundary's inner and outer lines are extracted for shadow angle detection [25]. Points derived from the inner and outer characteristic lines are then corrected using relative radiometric correction. Figure 2 shows how shadows are identified in high-resolution images [26].

Figure 2. Block diagram of shadow detection

As shadow angle monitoring is one of the main technologies in slope monitoring, this study examines in depth the actual status of shadow detection in current slope monitoring [27]. Slope monitoring is hampered by the fact that current practice relies mostly on classical detection technologies, resulting in low accuracy and numerous errors [28]. By using digital image processing to improve current slope shadow detection methods, this work aims to increase slope monitoring capabilities as a whole and to advance the field's understanding of digital image detection [29]. The study provides a specific optimization and enhancement plan, including the construction of an image enhancement algorithm based on saturation and the calculation of the primary detection indicators [30]. The ability of slope monitors to detect shadows is further enhanced, and a technological breakthrough is achieved through the improvement plan outlined in this study. Traditional slope monitoring suffers from a number of technological flaws, including a lack of shadow detection, high error values, and low accuracy.

2. Literature Survey

Depending on the setting, learning samples acquired from a single photograph may generalise very poorly to photographs taken in other locations or at different times [1]. A significant amount of effort is necessary to select training photos from which to identify shadows [2]. Model-based techniques, which leverage an existing 3D model or DSM, use ray tracing or x- and y-coordinate projection [3] to produce shadows from camera data such as the sun position [6]. The z-buffer approach uses two depth maps, one for the sun and one for the camera, to determine whether a DSM pixel is visible; this technique is not without flaws [4]. Ray tracing can render 3D models in linear or voxel format [5], but it is a time-consuming procedure. Large-scale 3D ray tracing requires dedicated data structures for 3D models, which are rarely investigated in remote sensing [6].

Fang et al. [1] suggested a morphological erosion filter that uses the interior of large shadows or semi-transparent areas in reconstructed images as training examples; an SVM classifier then improves the shadow detection. Mohajerani and Saeedi [3] used skeletons of non-shadow patches in image matting to detect shadows mislabeled because of moving cars, trees, and rivers. The model does not assess the robustness of the approaches or the number of mislabels. Both studies failed to account for the fact that models and images are frequently captured on different days or even at different times; many examples of mislabeling arise from structures and trees that have changed over a period of more than a year.

Motion information can also help with object detection. Kim and Kim [4] used normalised motion information in video frames to recognise pedestrians. Other investigations looked at video sequences of pedestrians, and a few studies prior to Zhang et al. [6] exploited motion cues. Bo et al. [7] identified pedestrians using motion and appearance patterns, employing Haar-like features to model both. Alvarado-Robles et al. [8] suggested a detection method based on long-term periodicity, examining long video sequences to find periodic motion; their method can process image sequences with extremely low resolutions. Hu et al. [9] extended the pedestrian detection system, analysing a modest number of frames in batch processing. That model considers the elimination of entire shadows rather than shadow sections, which affects its accuracy.

To detect shadows in photographs successfully, supervised learning should be able to learn the attributes of shadows in more complex settings. Supervised learning has been extensively applied in aerial VHR and natural image computer vision research [10]. All four of the aforementioned characteristics are present in the shadows in these images. Natural RGB images can be accurately classified using an SVM with a three-degree polynomial kernel. Texton histograms are used in conjunction with SVM, in addition to RGB features, to categorise natural photographs [12]. The texture information in VHR images is extracted using a wavelet transform, and SVM is then utilised to identify shadows [13]. Because textures in shadows do not provide as much information as RGB characteristics, shadow detection depends less on textures and more on RGB characteristics [14]. Supervised processes require a lot of manual labour, so it is difficult to generate a large number of relevant training instances [15].

To detect shadows, Tian et al. [11] employed a modest amount of training data. Two GMMs were constructed from hand-drawn strokes: one for shadows and one for non-shadows. Mean-shift clustering was used to segment the entire image, and the resulting groups were classified according to the GMM to which they were most similar. Clustering can reduce the number of training samples needed to capture the smallest of nuances; the settings for the clustering and the GMMs, however, have a substantial impact on the final result, so the method cannot be utilised in all circumstances. An enhanced closed-form method proposed by Wang et al. [13] makes it easier to use image matting to locate shadows in ordinary photos. Local smoothness constraints are employed to propagate shadow and background attributes from user input to other unknown locations. This technique still requires human labour and a wide range of user input, but it works effectively in a variety of circumstances with few manual samples.

3. Proposed Model

Detecting shadows in input photos is one of the challenges of employing shadow information. For example, a normalised RGB colour space and a brightness measure are used in region detection [18]. As an invariant colour attribute, pixels with similar hue and saturation values but lower luminance in the Hue-Saturation-Value (HSV) colour space are categorised as cast shadows [19]. The shadows cast by an object in surveillance settings can be used to identify the object. Because the direct light is blocked, shadow areas tend to be darker. Many shadow detection methods assume that the colour vector under a cast shadow preserves the direction of the original, shadow-free colour vector [20]. Because of the blue cast of natural skylight, this does not hold outdoors: different colour channels are attenuated by different amounts [21].

The background subtraction results include both foreground and shadow objects. Images are processed using morphology, a broad family of image processing operations that focus on the shape of structures in an image. Applying a structuring element to an image, morphological procedures produce an output of the same size; the operations include erosion and dilation, opening and closing, as well as outlining and skeletonization. Mathematical morphology can be applied to image processing problems such as edge recognition, image segmentation, noise reduction, feature extraction, and more. To differentiate shadows from background blobs, we apply a morphological closing filter to the background subtraction results to fill gaps [22]. HSV is a colour system that explicitly separates chromaticity and brightness channels [23]. A background subtraction result pixel is regarded as a probable shadow pixel when it has reduced brightness and a hue similar to the corresponding pixel in the background model. We then count the pixels classified as shadows once the classification process is complete [31]. The shadow pixels are counted so that the exact shadow outline can be recognized; a later step may remove the shadow from the image. The proposed model identifies the objects and shadows, and the shadow angle in the image is then recognized. For edge detection, we employ the Canny method: comparing hue with luminosity values reveals the margins of shadow boundaries. If a shadow is cast on a textured surface, several edges, including texture edges, will be seen; these pixels' gradient orientations resemble those of the background model. To perform this three-stage process, this research work proposes a Multi Layered Linked approach with Tagged Feature Model for Shadow Angle Detection (MLTFM-SAD). A minimal sketch of the HSV shadow-pixel test and the morphological closing is given below.
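The following Python/OpenCV sketch illustrates the HSV heuristic and closing step described above: pixels with hue and saturation close to the background model but noticeably lower brightness are marked as probable shadow pixels, and the mask is then closed to fill small gaps. The tolerance values are illustrative assumptions, not parameters taken from this paper.

```python
import cv2
import numpy as np

def candidate_shadow_mask(frame_bgr, background_bgr,
                          hue_tol=10, sat_tol=40, val_ratio=0.6):
    """Mark pixels with hue/saturation close to the background model but
    reduced brightness as probable shadow pixels (HSV heuristic sketch)."""
    frame = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    bg = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)

    hue_close = np.abs(frame[..., 0] - bg[..., 0]) < hue_tol   # similar hue
    sat_close = np.abs(frame[..., 1] - bg[..., 1]) < sat_tol   # similar saturation
    darker = frame[..., 2] < val_ratio * bg[..., 2]            # reduced brightness

    mask = (hue_close & sat_close & darker).astype(np.uint8) * 255
    # Morphological closing fills small holes so shadow blobs stay connected.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
```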

Figure 3. Convolution network structure

For each residual unit in the proposed shadow detection model, the primary convolution path has two 1×1 layers and one 3×3 convolution layer. An asymmetric activation function is used between the convolution layers, and regulating the number of output channels with the 1×1 convolutions lowers the computational effort of extracting shadow features. When the input to the main and bypass branches is down-sampled by the 3×3 convolution, the two paths are no longer identical. Existing convolutional neural networks required a pooling layer for down-sampling, but the proposed model instead uses a convolution step size of 2, resulting in a more compact and regular network. The network structure is shown in Figure 3; an illustrative sketch of such a residual unit follows.
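The residual unit can be sketched in PyTorch as below, under the assumptions that the "asymmetric" activation is a LeakyReLU and that batch normalisation follows each convolution; neither detail is specified above, so this is an illustration of the 1×1-3×3-1×1 bottleneck with stride-2 down-sampling rather than the paper's exact network.

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """1x1 -> 3x3 -> 1x1 bottleneck; stride-2 convolution (not pooling)
    performs the down-sampling on both the main and bypass paths."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),        # 1x1: reduce channels
            nn.BatchNorm2d(mid_ch), nn.LeakyReLU(0.1),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride,
                      padding=1, bias=False),                # 3x3: spatial features
            nn.BatchNorm2d(mid_ch), nn.LeakyReLU(0.1),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),        # 1x1: restore channels
            nn.BatchNorm2d(out_ch),
        )
        # Bypass must match the main path's shape when it down-samples
        # or changes the channel count.
        self.bypass = (nn.Identity() if stride == 1 and in_ch == out_ch
                       else nn.Sequential(
                           nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                           nn.BatchNorm2d(out_ch)))
        self.act = nn.LeakyReLU(0.1)

    def forward(self, x):
        return self.act(self.main(x) + self.bypass(x))
```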

The single-scale output of the Faster R-CNN feature extraction network, which uses only the last feature layer as output after the shadow image is fed into the network, is further improved. Without this improvement, the detection rate of small target items and of noise at shadow edges would be significantly lowered, since the convolutional layers down-sample the image, making the pixels covered by a target object smaller and smaller. Denoising removes noise from a noisy image in order to recover the original image. High-frequency noise, edges, and textures are difficult to distinguish during denoising, so the denoised photos may lose some information. The filtering model identifies dissimilar-valued pixels with out-of-range values, which are normalized by removing the noisy data.

Deep neural networks contain multiple nonlinear processing units, and the output of the lower layers feeds the higher layers; the most effective features are learned and represented from the numerous data inputs in accordance with reality. Upper-layer maps are the fundamental outputs of each convolutional layer: the features are convolved with an associated learned convolutional kernel, and the corresponding target maps are created by applying the relevant activation functions to the result.

The process of shadow detection and angle identification is described step by step in the following algorithm.

Algorithm MLTFM-SAD

{

Input: Image Dataset {ID[N]}

Output: Shadow Detection and Angle Prediction

Step-1: Load the image from the dataset and then store it in a buffer for analysis and shadow identification. The image loading from the dataset is performed as

$Img(i)=\left( \sum\limits_{i=1\,\,}^{N\,\,}{getimage(ID(i))+getsize(ID(i))} \right)$

The dataset can contain many thousands of images; every image is loaded, and each image's size is also considered.

Step-2: The image that is loaded will be segmented into multiple parts so that each part from the image is considered and analysed for border detection for accurate object and shadow detection. The image segmentation is performed as

$\begin{align}& Seg(Img(p,q),L)=\frac{1}{\left( 2p+1 \right)*\left( 2q+1 \right)} \\& +\frac{\sqrt{\sum\limits_{p=1}^{M}{\sum\limits_{q=1}^{M}{maxsize(Img(i))+getintensity(Img(i))}}}}{sizeof(ID)} \\\end{align}$

The principles of segmentation are considered and applied: the image is segmented first, and the regions are then re-segmented. Next, better features, such as shape, texture, and intensity, are extracted during feature extraction. Finally, objects have a specific shape, which can be deduced through clustering, as sketched below.
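A hedged sketch of the hybrid Otsu/GMM step mentioned in the abstract, applied at this stage: Otsu's threshold yields an initial dark-region mask, a two-component Gaussian mixture labels pixels by intensity, and the two results are intersected. The library choices (OpenCV, scikit-learn) and the agreement rule are assumptions for illustration.

```python
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def initial_shadow_mask(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Otsu's method picks the threshold separating dark (shadow) pixels.
    _, otsu_mask = cv2.threshold(gray, 0, 255,
                                 cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # A 2-component GMM over intensity models shadow vs. non-shadow pixels.
    gmm = GaussianMixture(n_components=2, random_state=0)
    labels = gmm.fit_predict(gray.reshape(-1, 1))
    shadow_comp = np.argmin(gmm.means_.ravel())      # darker component = shadow
    gmm_mask = (labels == shadow_comp).reshape(gray.shape).astype(np.uint8) * 255
    # Keep only the pixels both methods agree on.
    return cv2.bitwise_and(otsu_mask, gmm_mask)
```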

Step-3: The edge detection of each part of an image is performed so that only object and its shadow region pixels will be extracted for further processing. The edge detection of segments is performed as

$\begin{align}& Seg(Img(i))=\sum\limits_{i=1}^{N}{maxintensity(Seg(Img(i))+} \\& \frac{1}{\sqrt{2\pi {{\delta }_{k}}}}exp\left( \frac{\left( {{p}_{i}}-{{\delta }_{i}} \right)+{{q}_{i}}^{N}}{2\pi *sizeof(seg(i))} \right) \\\end{align}$

Here, $\delta$ is the threshold intensity value taken from the input image, and p and q are the coordinate values of the image. A short edge-detection sketch for this step is given below.
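As described earlier, the Canny method is used for edge detection. A minimal per-segment sketch, with Gaussian smoothing standing in for the Gaussian term of the formula above, is given here; the thresholds are illustrative assumptions.

```python
import cv2

def segment_edges(segment_gray, low=50, high=150):
    """Return the edge map of one image segment, so that only object and
    shadow boundary pixels are kept for further processing."""
    # Gaussian smoothing suppresses noise before gradient computation.
    blurred = cv2.GaussianBlur(segment_gray, (5, 5), 0)
    return cv2.Canny(blurred, low, high)
```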

Step-4: The layers in the segments are considered by extracting the features from the segments. The features are used for the accurate object and its shadow region detection. The feature extraction procedure is applied as

$FeatSet(ID)=\frac{\sum\limits_{i=1}^{N}{Seg(i)}+max\,pixel\,intensity(x,y)+grey(Img(i))+Th}{\sum\nolimits_{(x,y)\in A}{ID\left( x,y \right)}+sizeof(ID)}$

$Img(x,y)=\begin{cases} Img(i)+Seg{{(i)}_{N}}\,cos\varphi (x,y), & illuminated\ area \\ Img(i)+Seg{{(i+1)}_{N}}\,cos\varphi (x,y), & penumbra\ area \\ min(Seg{{(i,i+1)}_{N}}), & umbra\ area \end{cases}$

Interpreting detections as layers in a layered model can aid segmentation precision as well as depth ordering. Two independent object segments that are separated by an occluder can be connected by estimating the appearance of the layer. Different convolution kernels can be found in extracted feature map sets, such as map x and map y, if they are obtained by summing their convolutions. It is common practice to use the image as the starting input and then sum all the pixels that make up each block.

The object border similarity levels are compared with the feature set, the accurate outline is taken from the extracted layers, and the shadow regions are considered. The similarity computation is performed as

$Sim(FeatSet(i))(x,y)=\sum\limits_{p=1}^{M}{\sum\limits_{q=1}^{N}{\left( size(seg(i))+max(FeatSet(p+i,q+i+L))-avg(p,q)+getrange(FeatSet(i)) \right)}}$

$Sim(p,q)=\begin{cases} 1, & if\ Sim\left( {{p}_{i}},{{q}_{i}} \right)>Th \\ 0, & otherwise \end{cases}$

Dissimilar values are treated as noisy values and can be removed to reduce the model's training time; similar values are retained for detecting single or multiple objects and their shadows, as in the sketch below.
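A minimal sketch of this similarity filter, assuming the similarity scores and feature set are NumPy arrays with matching leading dimensions:

```python
import numpy as np

def filter_by_similarity(sim_scores, feat_set, th=0.5):
    """Keep features whose similarity exceeds Th (Sim = 1); drop the rest
    (Sim = 0), treating them as noise to shorten training."""
    keep = sim_scores > th
    return feat_set[keep]
```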

Step-5: From the multiple layers extracted, multiple objects from an image can be recognized and their shadows can be considered as

$Shad\_Set(Seg(i))=\frac{\sum\limits_{i=1}^{N}{getgreylevel(FeatSet(i))}+min(p(x,y),q(x,y))+q(x,y)}{sizeof(FeatSet)}$

The initial step in achieving feature fusion is to standardise the size of all feature maps: the shape of each layer is resized using up-sampling and down-sampling. The goal is to merge the features of the lower and higher layers into the main layer. In this research, feature maps are combined by conducting an element-by-element sum with particular weights. This technique is used because the target layer's feature maps are critical to the success of the feature fusion process; otherwise those features would be lost. An illustrative fusion sketch follows.
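An illustrative PyTorch sketch of this fusion: lower- and higher-layer maps are resampled to the target layer's spatial size and merged by a weighted element-wise sum. The weights are assumed hyperparameters, and all three maps are assumed to share the same channel count.

```python
import torch
import torch.nn.functional as F

def fuse_features(target, lower, higher, w_target=0.6, w_low=0.2, w_high=0.2):
    """Resize lower/higher feature maps to the target layer's size and merge
    by weighted element-wise sum (maps must share the same channel count)."""
    size = target.shape[-2:]
    lower = F.interpolate(lower, size=size, mode="bilinear", align_corners=False)
    higher = F.interpolate(higher, size=size, mode="bilinear", align_corners=False)
    # A dominant target weight keeps the main layer's features from being
    # washed out during fusion.
    return w_target * target + w_low * lower + w_high * higher
```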

Step-6: The features that are related to multiple objects and their shadows are tagged with unique labels. This sequence helps to identify the connectivity of the objects and their shadows. The multi-layer linking is performed as

$MLink(N)=\sum\limits_{i=1}^{N}{\begin{align}& \frac{Shad\_Set(Seg(p,q))}{max(FeatSet(i))} \\& +\frac{Shadow\_Set(Seg(p+i,q+i))}{min(FeatSet(i))} \\\end{align}}$

The shadow-related features are linked to identify the angle between the object and its shadow. The shadow features are linked as

$Sfeat(MLink(i))=\frac{\delta (Shad\_Seg(i))+max(MLink(i))+maxsize(FeatSet(i))}{\sum\limits_{k=1}^{K}{min(Sim(p,q)+\theta \left( {{p}_{i}},{{q}_{i}} \right))}}+Th$

In ConvNet training, an early-stopping condition determines the number of epochs. After every epoch, the trained network is evaluated on a small validation set; once performance on the validation set stops improving over consecutive epochs, training is halted, and the network that performs best on the validation set is used for testing. The base learning rate is decided heuristically by selecting the highest rate that yields convergence of the training error in shadow detection. A minimal sketch of this early-stopping rule follows.
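A minimal sketch of the early-stopping rule just described; the training and evaluation callables are placeholders, and the patience window is an assumed hyperparameter.

```python
def train_with_early_stopping(model, train_one_epoch, evaluate,
                              train_loader, val_loader,
                              max_epochs=100, patience=5):
    """Halt when the validation score stops improving for `patience`
    consecutive epochs; restore and return the best-performing weights."""
    best_score, best_state, stall = -float("inf"), None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model, train_loader)        # one pass over training data
        score = evaluate(model, val_loader)         # score on the validation set
        if score > best_score:                      # improvement: remember weights
            best_score, stall = score, 0
            best_state = {k: v.detach().clone()
                          for k, v in model.state_dict().items()}
        else:
            stall += 1
            if stall >= patience:                   # no improvement: stop early
                break
    model.load_state_dict(best_state)               # best network used for testing
    return best_score
```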

The linked features of objects and their shadows are considered and the feature subset is generated for the detection of angles between objects and shadows.

$ang(sfeat(p,q))={{\left( \sum\limits_{i=0}^{N}{\sum\limits_{j=0}^{N}{\frac{min(sfeat({{p}_{ij}},{{q}_{ij}}))}{cos\varphi (max(Shad\_Seg(i))+sizeof(seg(i+1)))}}}+\sum\limits_{i=0}^{N}{\sum\limits_{j=0}^{N}{\frac{max(sfeat({{p}_{ij}},{{q}_{ij}}))}{cos\varphi (min(Shad\_Seg(i))+sizeof(seg(i+1)))}}} \right)}^{N}}$
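One concrete way to realise this angle step: once the object and shadow masks are linked, the cast-shadow angle can be estimated from the vector between their centroids. This centroid reading is an illustrative interpretation of the formula above, not the paper's exact computation, and assumes both masks are non-empty.

```python
import numpy as np

def shadow_angle_degrees(object_mask, shadow_mask):
    """Estimate the cast-shadow angle from the object-to-shadow centroid
    vector (masks are binary arrays of the same shape, assumed non-empty)."""
    oy, ox = np.argwhere(object_mask).mean(axis=0)   # object centroid (row, col)
    sy, sx = np.argwhere(shadow_mask).mean(axis=0)   # shadow centroid (row, col)
    # atan2 over the centroid offset gives the shadow direction; image rows
    # grow downward, hence the negated vertical component.
    return np.degrees(np.arctan2(-(sy - oy), sx - ox))
```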

Step-7: Display the shadow detection and angle prediction set.

}

4. Results

The goal of the shadow angle detection model is to produce a shadow-free image in which the original shadow area's texture, colour, and other characteristics are maintained. To remove shadow regions, existing approaches typically entail two steps: detecting shadows and removing them. Shadow detection or shadow marking is the first step in building the model, and the model is then rebuilt to remove the shadows. Even once the shadow region is recognised, eliminating it entirely remains difficult, and it is easy to see how shadow detection has a significant impact on the final output: the subsequent removal technique cannot produce a high-quality, shadow-free image if the shadow detection results are subpar. The proposed shadow angle detection technique is implemented in Python using Google Colab, with images taken from the Image Shadow Triplets dataset. The proposed Multi Layered Linked approach with Tagged Feature Model for Shadow Angle Detection (MLTFM-SAD) model is compared with the traditional Shadow Detection and Removal using Machine Learning techniques (SDR-ML) model.

The proposed model is compared with the traditional models in terms of image segmentation time, image edge detection accuracy, feature extraction accuracy, shadow detection accuracy, shadow angle identification accuracy, and false prediction rate. The images and the considered shadows, which are to be improved without affecting the image quality, are shown in Figure 4.

Figure 4. Shadows detected in image

Shadows are cast when light from a source is partially or completely obstructed by an opaque object. Many applications, such as gesture detection, object recognition, image segmentation, and traffic surveillance, are hampered by shadows. An object's shape, size, and colour can be distorted by its shadow, making it harder to process, so a pre-processing step is necessary to detect shadows in photos. Segmentation is accomplished using a thresholding methodology based on a subdivision approach with filtering techniques: foreground, background, and to-be-determined regions are subdivided into three groups in an iterative process. The proposed segmentation procedure exhibits better performance than the traditional model. The segmentation time levels of the proposed and existing models are shown in Figure 5.

Figure 5. Image segmentation time levels

Edge detection is an image analysis approach for detecting shadow boundaries within images; it works by sensing discontinuities in brightness. In domains like image processing, machine learning, and machine vision, edge detection of shadows is used for image segmentation and data extraction, followed by removal of shadows from the image. Line recognition is an image processing approach that takes a set of n edge points and detects all the lines on which those edge points lie within a shadow range; shadow angle detection is then applied. The image shadow edge detection accuracy levels of the proposed and traditional models are shown in Figure 6.

Figure 6. Image edge detection accuracy levels

By selecting features from the image, feature extraction improves the accuracy of shadow detection models. By deleting unnecessary data from the image, this stage of the general framework decreases the dimensionality of the pixel data. Feature extraction is helpful when the resources needed for processing must be minimised without losing significant or relevant pixel information; it also helps reduce redundant data and eliminate irrelevant features from the image. The feature extraction accuracy levels of the traditional and proposed models are shown in Figure 7.

Figure 7. Feature extraction accuracy levels

Figure 8. Image used for shadow detection

Figure 8 shows the image used for shadow detection, which undergoes segmentation and feature extraction for accurate shadow detection.

After the image processing technique is performed, the shadow and the object are recognised, as represented in Figure 9.

Figure 9. Shadow and object detection

When shadows are detected and removed, computer vision applications like image segmentation, object recognition, and tracking can be improved significantly. In computer vision applications, detecting and removing shadows from pictures and videos prevents unwanted consequences: a shadow is a severe problem since it distorts the shape of the object, merges objects together, and can even remove an object from the image. The shadow detection accuracy levels of the proposed and traditional models are shown in Figure 10.

Figure 10. Shadow detection accuracy levels

The shadow angle depends purely on the light emitted onto the object by the light source. Shadow angle identification helps in accurate removal of the shadow from the image. The shadow angle identification accuracy levels of the proposed and traditional models are shown in Figure 11; the proposed model's shadow angle identification levels are high compared with the traditional models.

Figure 11. Shadow angle identification accuracy levels

The training and testing rates of the proposed model are shown in Figure 12. The proposed model trains in less time for accurate shadow detection from the image.

Figure 12. Training and testing levels

Accuracy is measured by the false prediction rate, which can be applied to image processing tests when performing segmentation and removal of shadow pixels. The false prediction rate means the likelihood of removing original, useful pixels from the image. The false prediction rate of the proposed model is lower than that of the existing models. The false prediction rates of the proposed and traditional models are shown in Figure 13.
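A minimal sketch of this metric, reading the false prediction rate as the fraction of true non-shadow pixels wrongly removed as shadow (an assumed formalisation of the description above):

```python
import numpy as np

def false_prediction_rate(pred_mask, gt_mask):
    """Fraction of non-shadow (useful) pixels wrongly predicted as shadow."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    false_pos = np.logical_and(pred, ~gt).sum()   # useful pixels removed
    return false_pos / max((~gt).sum(), 1)        # guard against empty masks
```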

Figure 13. False prediction rate

The entropy minimization levels are shown in Figure 14. The entropy levels are used for the shadow and angle prediction in the input image.

The above results are available at the following GitHub link: https://github.com/spkreddy/SHADOW-ANGLEDETECTION.

Figure 14. Entropy minimization levels

5. Conclusion

The elimination of shadows from high-resolution remote sensing photos is accomplished using a Multi Layered Linked approach with a Tagged Feature Model. Images are first segmented using an active contour model. Spectral properties are then used to narrow down the list of possible shadows, and the shadow angle is identified. The detected shadows are cleaned up to remove any erroneous ones; false shadows are detected by comparing the grayscale averages of the false shadowing in the green and blue wavebands. For shadow detection, a multilayer model predicts shadow masks at various scales under a weighted loss function. A Euclidean loss and a colour compensation technique are introduced to handle colour and brightness inconsistency, and the network is then used to remove shadows from photos. For both shadow detection and removal, we evaluate our system on standard datasets, comparing it with several state-of-the-art methods and demonstrating its superiority over previous methods. Our method can accurately detect ground shadows, but shadows that are not on the ground vary far more in appearance, making them difficult to detect. When utilised as a stand-alone shadow detector, the proposed method can be neatly integrated into more complex scene understanding and angle detection tasks. There is a significant relationship between the height of an object and the appearance of its shadow, captured by detecting the shadow's angle with respect to the original object. In future work, shadow removal techniques can be applied to the image to enhance image quality and support data recognition from images.

References

[1] Fang, H., Wei, Y., Luo, H., Hu, Q. (2019). Detection of building shadow in remote sensing imagery of urban areas with fine spatial resolution based on saturation and near-infrared information. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(8): 2695-2706. https://doi.org/10.1109/JSTARS.2019.2917605

[2] Wei, H., Liu, Y., Xing, G., Zhang, Y., Huang, W. (2019). Simulating shadow interactions for outdoor augmented reality with RGBD data. IEEE Access, 7: 75292-75304. https://doi.org/10.1109/ACCESS.2019.2920950

[3] Mohajerani, S., Saeedi, P. (2019). Shadow detection in single RGB images using a context preserver convolutional neural network trained by multiple adversarial examples. IEEE Transactions on Image Processing, 28(8): 4117-4129. https://doi.org/10.1109/TIP.2019.2904267

[4] Kim, J., Kim, W. (2020). Attentive feedback feature pyramid network for shadow detection. IEEE Signal Processing Letters, 27: 1964-1968. https://doi.org/10.1109/LSP.2020.3034527

[5] Liu, Z., An, D., Huang, X. (2019). Moving target shadow detection and global background reconstruction for VideoSAR based on single-frame imagery. IEEE Access, 7: 42418-42425. https://doi.org/10.1109/ACCESS.2019.2907146

[6] Zhang, H., Qu, S., Li, H., Luo, J., Xu, W. (2020). A moving shadow elimination method based on fusion of multi-feature. IEEE Access, 8: 63971-63982. https://doi.org/10.1109/ACCESS.2020.2984680

[7] Bo, P., Fenzhen, S., Yunshan, M. (2020). A cloud and cloud shadow detection method based on fuzzy c-means algorithm. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 13: 1714-1727. https://doi.org/10.1109/JSTARS.2020.2987844

[8] Alvarado-Robles, G., Osornio-Rios, R.A., Solis-Munoz, F.J., Morales-Hernandez, L.A. (2021). An approach for shadow detection in aerial images based on multi-channel statistics. IEEE Access, 9: 34240-34250. https://doi.org/10.1109/ACCESS.2021.3061102

[9] Hu, X., Fu, C.W., Zhu, L., Qin, J., Heng, P.A. (2019). Direction-aware spatial context features for shadow detection and removal. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(11): 2795-2808. https://doi.org/10.1109/TPAMI.2019.2919616

[10] Hu, X., Wang, T., Fu, C.W., Jiang, Y., Wang, Q., Heng, P.A. (2021). Revisiting shadow detection: A new benchmark dataset for complex world. IEEE Transactions on Image Processing, 30: 1925-1934. https://doi.org/10.1109/TIP.2021.3049331

[11] Tian, X., Zheng, P., Huang, J. (2021). Robust privacy-preserving motion detection and object tracking in encrypted streaming video. IEEE Transactions on Information Forensics and Security, 16: 5381-5396. https://doi.org/10.1109/TIFS.2021.3128817

[12] Wang, C., Xu, H., Zhou, Z., Deng, L., Yang, M. (2020). Shadow detection and removal for illumination consistency on the road. IEEE Transactions on Intelligent Vehicles, 5(4): 534-544. https://doi.org/10.1109/TIV.2020.2987440

[13] Wang, B., Zhao, Y., Chen, C.P. (2019). Moving cast shadows segmentation using illumination invariant feature. IEEE Transactions on Multimedia, 22(9): 2221-2233. https://doi.org/10.1109/TMM.2019.2954752

[14] Hou, L., Vicente, T.F.Y., Hoai, M., Samaras, D. (2019). Large scale shadow annotation and detection using lazy annotation and stacked CNNs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(4): 1337-1351. https://doi.org/10.1109/TPAMI.2019.2948011

[15] Sultana, M., Mahmood, A., Jung, S.K. (2020). Unsupervised moving object detection in complex scenes using adversarial regularizations. IEEE Transactions on Multimedia, 23: 2005-2018. https://doi.org/10.1109/TMM.2020.3006419

[16] Kim, D.S., Arsalan, M., Park, K.R. (2018). Convolutional neural network-based shadow detection in images using visible light camera sensor. Sensors, 18(4): 960. https://doi.org/10.3390/s18040960

[17] Tatar, N., Saadatseresht, M., Arefi, H., Hadavand, A. (2018). A robust object-based shadow detection method for cloud-free high resolution satellite images over urban areas and water bodies. Advances in Space Research, 61(11): 2787-2800. https://doi.org/10.1016/j.asr.2018.03.011

[18] Hu, X., Zhu, L., Fu, C.W., Qin, J., Heng, P.A. (2018). Direction-aware spatial context features for shadow detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7454-7462.

[19] Zhu, L., Deng, Z., Hu, X., Fu, C.W., Xu, X., Qin, J., Heng, P.A. (2018). Bidirectional feature pyramid network with recurrent attention residual modules for shadow detection. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 121-136.

[20] Le, H., Vicente, T.F.Y., Nguyen, V., Hoai, M., Samaras, D. (2018). A+ D Net: Training a shadow detector with adversarial shadow attenuation. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 662-678. https://doi.org/10.1007/978-3-030-01216-8_41

[21] Li, Z., Yang, J., Liu, Z., Yang, X., Jeon, G., Wu, W. (2019). Feedback network for image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3867-3876.

[22] Li, Q., Li, Z., Lu, L., Jeon, G., Liu, K., Yang, X. (2019). Gated multiple feedback network for image super-resolution. arXiv preprint arXiv:1907.04253. https://doi.org/10.48550/arXiv.1907.04253

[23] Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y. (2018). Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2472-2481. https://doi.org/10.1109/CVPR.2018.00262

[24] Zhao, T., Wu, X. (2019). Pyramid feature attention network for saliency detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3085-3094. https://doi.org/10.48550/arXiv.1903.00179

[25] Hu, J., Shen, L., Sun, G. (2018). Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132-7141. https://doi.org/10.1109/TPAMI.2019.2913372

[26] Gao, H., Tao, X., Shen, X., Jia, J. (2019). Dynamic scene deblurring with parameter selective sharing and nested skip connections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3848-3856.

[27] Leibe, B., Leonardis, A., Schiele, B. (2008). Robust object detection with interleaved categorization and segmentation. International Journal of Computer Vision, 77(3): 259-289. https://doi.org/10.1007/s11263-007-0095-3

[28] Zhao, T., Nevatia, R., Wu, B. (2008). Segmentation and tracking of multiple humans in crowded environments. IEEE Transaction on Pattern Analysis and Machine Intelligence, 30(7): 1198-1211. https://doi.org/10.1109/TPAMI.2007.70770

[29] Toyama, K., Blake, A. (2001). Probabilistic tracking in a metric space. Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, pp. 50-57. https://doi.org/10.1109/ICCV.2001.937599

[30] Guo, R., Dai, Q., Hoiem, D. (2011). Single-image shadow detection and removal using paired regions. CVPR 2011, pp. 2033-2040. https://doi.org/10.1109/CVPR.2011.5995725

[31] Guan, Y.R., Aamir, M., Hu, Z.H., Dayo, Z.A., Rahman, Z., Abro, W.A., Soothar, P. (2021). An object detection framework based on deep features and high-quality object locations. Traitement du Signal, 38(3): 719-730. https://doi.org/10.18280/ts.380319