A Study and Analysis on Pedestrian Detection and Tracking Through Rear-View Images

Damineni Sree Lakshmi* | Adusumilli Divya | Emandi Sreedevi | Ravikiran Kolagani | Prasanthi Gottumukkala | Akhilnath Muddana

Dept. of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Kanuru, Vijayawada 520007, A.P., India

Dept. of C.S.E, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, A.P., India

Dept. of Information Technology, Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad 500090, Telangana, India

Network Engineer IV, Charter Communications Inc., Lone Tree, Colorado 80124, USA

15 August 2022
1 October 2022
13 October 2022
Available online: 28 February 2023

© 2023 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).



According to the Road Safety in India Report 2020 of the Transportation Research and Injury Prevention Programme (TRIPP), 33% of road accident fatalities are pedestrians. Drivers of heavy vehicles and cars are often unable to track pedestrian movements in time, and most accidents involving children occur while a vehicle is reversing. This problem motivates rear-view pedestrian tracking for heavy vehicles as well as cars. Certain machine learning and deep learning approaches are well suited to the particular problems of rear-view pedestrian detection. This work briefly surveys the literature on pedestrian detection and tracking methods and their constraints. Most camera applications concentrate mainly on picture visibility and tracking. If pedestrian detection were provided as a built-in technique, many accidents, especially those involving children, could be avoided. Such an application uses cameras fixed on the vehicle to track pedestrian movements on heavy-traffic roads and highways, or while the vehicle is reversing, and raises alerts. The camera can thereby extract more features, which helps future applications. The paper reviews the techniques of various researchers and compares performance measures such as accuracy, sensitivity, false alarm rate, and detection rate. The reported experimental results are compared across methodologies and contrasted with present technology.


TRIPP, rear-view, pedestrian detection and tracking, false alarm rate

1. Introduction

The probability of pedestrian death is estimated at under 10% at impact speeds of 30 km/h and greater than 80% at 50 km/h, and the relationship between the increase in fatalities and the increase in impact speed is governed by a power of four, according to the Literature Review on Vehicle Travel Speeds and Pedestrian Injuries. In the cities, deaths of vulnerable road users (pedestrians, bicyclists, and motorised two-wheeler riders) account for between 84% and 93% of the total, car occupant fatalities for between 2% and 7%, and occupants of three-wheeled scooter taxis (TSTs) for less than 5%, except in Vishakhapatnam, where the proportion for the latter is 8%. The total share of vulnerable road user deaths remains relatively stable across cities of different sizes, and the proportion of pedestrian deaths appears to be higher in cities with larger populations.

The proportion of pedestrian fatalities associated with motorised two-wheeler (MTW) impacts ranges from 8% to 25% of the total; the highest proportion was observed in Bhopal. The involvement of MTWs as impacting vehicles in vulnerable road user (VRU) fatalities may be due to the fact that pedestrians and bicyclists do not have adequate facilities on the arterial roads of these cities and have to share the road space (the curb-side lane) with MTW riders.

Pedestrian and bicycle fatalities are high early in the morning. This may be because this class of road users starts for work earlier than those using motorised transport, and vehicle speeds may be higher at this time.

Because pedestrians and bicyclists have to share the curb-side lane of arterial roads with MTW riders, the provision of separate and adequate pedestrian and bicycle lanes in all cities is a prerequisite for the control of road traffic injuries (RTI).

MTW and pedestrian deaths are relatively high at 20:00-23:00, when traffic volumes would be expected to be low. Surveys done in Agra and Ludhiana suggest that, because of lower volumes, vehicle speeds can be higher at night, adequate street lighting is absent, and there is very limited checking of drivers under the influence of alcohol. This suggests that traffic calming, better street lighting, and alcohol control would be needed to control RTI at night.

On national highways, traffic safety is of the utmost importance for avoiding accidents, and advanced rear-camera applications are therefore required. Although initiatives have been undertaken and various traffic safety improvement programmes are being implemented, the overall situation revealed by the data is far from satisfactory. Many developed nations have seen a drop in road accidents and casualty numbers by adopting a multi-pronged approach to road safety that encompasses a broad range of measures, such as traffic management, design and quality of road infrastructure, safer vehicles, policing, and post-accident care. Considerable experience has been accumulated in the formulation and implementation of traffic safety measures. However, the government alone cannot tackle traffic safety problems. The active involvement of all stakeholders in promoting policy reform and implementing traffic safety measures is a must. Addressing road safety comprehensively requires the involvement of various agencies and sectors, such as health, transport, and police; a coordinated response to the problem is therefore essential. Experience in developed nations shows that deaths and injuries can be prevented through affirmative action.

TRIPP states that “Rear end collisions (including collisions with parked vehicles) are high on all types of highways including 4-lane highways. This shows that even though more space is available on wider roads rear-end crashes do not reduce. This is probably due to poor visibility of vehicles rather than road design itself.” For camera-based pedestrian and object detection applications such as accident avoidance, parking assistance, and automated steering, detecting pedestrians from an on-board camera is therefore critical. Many highly efficient architectures for pedestrian detection have emerged in recent years [1-3]. The reliability conditions set out in [1] depend on the direction of the pedestrian and on whether the pedestrian is stationary or moving sideways. With the capacity to combine static, temporal, and camera-specific cues over time, the false positive rate of the main pedestrian detection device for automotive applications [1] is approximately six orders of magnitude higher than that of standard object recognition systems [3], while the false rejection rate is comparable. One possible explanation for this achievement is the development of highly sophisticated training procedures for the chosen algorithms, which extract discriminative details and features from very large datasets. Previous experiments on pedestrian detection have, however, relied on special-purpose cameras designed for the application of interest. Either a monocular [1] or a stereo [2] imager is fixed to the rear-view mirror; with high contrast and 640x480 pixel resolution, it covers a wide field of view with minimal distortion. This setup enables efficient classification of pedestrians in difficult urban traffic and adverse weather conditions [1, 2]. This article discusses in detail the research of various authors on pedestrian detection and tracking using built-in rear-view cameras mounted on vehicles.
The study also examines the false acceptance and false rejection rates of pedestrian detection and how they are affected by the distance between the vehicle and the pedestrian, the acceleration of the vehicle, and the lighting conditions. Modest contrast, 320x240 pixel resolution, a narrow angle of view, and distortion in parts of the image distinguish the built-in camera used in the research. Figure 1 gives sample images of pedestrian detection.

The aim of a pedestrian detection system of this kind is to detect hundreds of object types automatically, reconstruct 3D maps of the environment, and then explore the semantic relationships among all components of the scene. These capabilities are required to provide a real-time overview, derived from digital images, of what is happening outside the vehicle. Based on a database with more than 300 standard intervals of 30sec to 30m with short drive time, the classification accuracy of single-frame detection varied from 0.596 to 0.584 for the in-path and out-path positions shown in Figure 1. The median false positive rate ranged from 10 to 20 false positives per 1-hour session. The sliding-window processing time was approximately 0.056 seconds.

Figure 1. In-path and out-path positions on roads

To support real-time operation in object detection systems, scene geometry from known camera calibration information is used to search a feature pyramid more efficiently. For better detection accuracy, a multiresolution pedestrian model is used to detect small (pixel-sized) pedestrians as well as normally sized ones. A series of experiments on the Caltech Pedestrian Dataset quantitatively evaluated the detection system and showed real-time operation (14 fps on 640x480 images) while maintaining state-of-the-art detection accuracy, i.e., an 80% detection rate [4].
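The geometric pruning described above can be illustrated with a minimal sketch. It assumes a flat road and a pinhole camera whose optical axis is parallel to the ground; the function names, the 1.7 m pedestrian height, the 128-pixel model height, and the tolerance are illustrative assumptions, not values taken from [4].

```python
def expected_pedestrian_height_px(foot_row, horizon_row, camera_height_m,
                                  pedestrian_height_m=1.7):
    """Expected pixel height of a pedestrian whose feet are at image row foot_row.

    Flat-ground pinhole model: foot_row - horizon_row = f * camera_height / Z and
    height_px = f * pedestrian_height / Z, so focal length f and distance Z cancel.
    """
    if foot_row <= horizon_row:
        return 0.0  # rows at or above the horizon cannot contain feet on the road
    return (foot_row - horizon_row) * pedestrian_height_m / camera_height_m


def plausible_scales(foot_row, horizon_row, camera_height_m, scales,
                     model_height_px=128, tolerance=0.25):
    """Keep only the pyramid scales whose window height fits the scene geometry."""
    expected = expected_pedestrian_height_px(foot_row, horizon_row, camera_height_m)
    keep = []
    for scale in scales:
        window_height = model_height_px * scale
        if expected > 0 and abs(window_height - expected) <= tolerance * expected:
            keep.append(scale)
    return keep
```

For a camera mounted 1 m above the road with the horizon at row 240, a window whose bottom edge lies at row 340 only needs to be evaluated at the scales returned by `plausible_scales(340, 240, 1.0, [0.5, 1.0, 1.25, 1.5, 2.0])`, which is how calibration information can shrink the search over the feature pyramid.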

PROTECTOR is a multi-cue vision system for real-time pedestrian detection and tracking from a moving vehicle. Its detection component involves a cascade of modules, each using complementary visual criteria to focus on relevant image regions. Tight integration ensures that valuable information (constraints) is passed on between successive modules. A novel mixture-of-experts architecture, involving texture-based component classifiers weighted by the outcome of shape matching, was shown to outperform the single texture classifier approach [5].

The remainder of the paper is organized as follows. Section 2 presents the literature survey, Section 3 describes the observations of the study, Section 4 outlines future research work, Section 5 presents a comparison of results, and Section 6 gives the conclusion.

2. Literature Survey

This section discusses research articles in the area of pedestrian detection and tracking through rear-view images. It considers the machine learning algorithms adopted, the measures taken, and the results achieved. Most researchers have adopted the general framework shown in Figure 2.

Figure 2. Block diagram of pedestrian detection

Alon et al. [1] observed that, in the automotive industry, performance evaluation and comparison of vision-based automotive modules is becoming increasingly important. They presented a pedestrian detection evaluation framework that was used to test multiple detection modules and algorithms proposed by different vendors.

Bar-Hillel et al. [2] proposed an iterative feature generation and pruning process. Part-based features are built into a feature hierarchy using operators for part detection, part refinement, and part combination within the feature generation procedure. The study demonstrated the utility of new part-based feature learning methods on well-known pedestrian detection benchmarks.

Cho et al. [4] used geometric constraints to search feature pyramids efficiently and improved detection accuracy with a multiresolution pedestrian model, which identifies small (pixel-sized) pedestrians missed by a single-resolution design alongside normally sized ones.

Dalal et al. [6] studied edge- and gradient-based descriptors and showed experimentally that grids of Histograms of Oriented Gradient (HOG) descriptors significantly outperform existing feature sets for human detection. The study demonstrated that locally normalized histograms of gradient orientations, similar to SIFT descriptors, computed on a dense overlapping grid give excellent results for person detection.
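To make the descriptor concrete, the following minimal sketch computes the orientation histogram of a single cell. It is a simplification under stated assumptions: centred differences on interior pixels only, nine unsigned orientation bins, hard bin assignment instead of interpolated voting, and a plain L2 normalization of one vector rather than the overlapping block normalization of a full HOG pipeline. The function names are illustrative.

```python
import math


def hog_cell_histogram(cell, bins=9):
    """Orientation histogram of one grayscale cell (a list of rows of numbers).

    Gradients use centred [-1, 0, 1] differences on interior pixels; orientation is
    unsigned (0 to 180 degrees) and each pixel votes with its gradient magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(vector, eps=1e-6):
    """Scale a descriptor to (approximately) unit length for contrast invariance."""
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]
```

A cell containing a single vertical edge places all of its votes in the first bin, since its gradient points along the horizontal axis; a full descriptor concatenates such histograms over a dense grid of cells.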

Dean et al. [7] combined locality-sensitive hashing (LSH) with convolution-based models. Detection systems are computationally constrained when they must convolve a target image with a bank of filters that encode different aspects of an object's appearance, such as the presence of component parts. The study's contribution is a scalable approach to object detection that replaces linear convolution with ordinal convolution by using efficient LSH schemes.
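One hashing scheme that depends only on the ordering of feature values, and is therefore "ordinal", is winner-take-all (WTA) hashing. The sketch below illustrates the idea; treating it as the scheme used in [7] is an assumption, and the function names, the window size k, and the seed are illustrative.

```python
import random


def make_wta_permutations(dim, num_hashes, seed=0):
    """Draw the fixed random permutations shared by every vector to be hashed."""
    rng = random.Random(seed)
    permutations = []
    for _ in range(num_hashes):
        order = list(range(dim))
        rng.shuffle(order)
        permutations.append(order)
    return permutations


def wta_hash(vector, permutations, k=4):
    """For each permutation, record which of the first k permuted entries is largest."""
    codes = []
    for order in permutations:
        window = [vector[i] for i in order[:k]]
        codes.append(window.index(max(window)))
    return codes


def hash_similarity(codes_a, codes_b):
    """Fraction of matching codes, a proxy for rank-order agreement."""
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)
```

Because only rank order matters, any strictly increasing rescaling of a feature vector leaves its codes unchanged, and filters can be compared with image windows by counting matching codes in hash tables instead of computing dot products.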

According to Dollar et al. [8], pedestrian detection is a key problem in computer vision, with many applications that have the potential to change people's lives. The number of techniques for detecting pedestrians in monocular images has grown steadily, but direct comparisons are difficult owing to the use of different datasets and widely differing evaluation protocols. The work aimed to establish how far pedestrian detection technology has progressed. Automatically detecting pedestrians from moving vehicles may have a direct economic impact as well as the potential to reduce pedestrian accidents and deaths.

Dubout et al. [9] present a general and exact method for significantly speeding up sliding-window, multiscale linear object detection systems, such as the part detectors in part-based models. Because the Fourier transform is linear, the authors compute the convolutions across feature planes in Fourier space and perform only one inverse Fourier transform.
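The role of linearity can be checked numerically. The sketch below is a one-dimensional illustration with circular boundary handling and a plain O(n^2) DFT written out for self-containment; a real system would use two-dimensional FFTs, and all names here are illustrative. Per-plane products are accumulated in the Fourier domain and a single inverse transform recovers the summed filter response.

```python
import cmath


def dft(signal, inverse=False):
    """Plain discrete Fourier transform; the inverse includes the 1/n factor."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def score_direct(planes, filters):
    """Sum over feature planes of the circular cross-correlation with each filter."""
    n = len(planes[0])
    scores = [0.0] * n
    for plane, filt in zip(planes, filters):
        for shift in range(n):
            scores[shift] += sum(filt[j] * plane[(shift + j) % n]
                                 for j in range(len(filt)))
    return scores


def score_fourier(planes, filters):
    """Same scores: accumulate products in Fourier space, invert only once."""
    n = len(planes[0])
    accumulator = [0j] * n
    for plane, filt in zip(planes, filters):
        padded = list(filt) + [0.0] * (n - len(filt))
        plane_f, filt_f = dft(plane), dft(padded)
        for k in range(n):
            # correlation with a real filter = product with the conjugate spectrum
            accumulator[k] += plane_f[k] * filt_f[k].conjugate()
    return [value.real for value in dft(accumulator, inverse=True)]
```

Both functions return the same scores up to rounding error, but the Fourier version needs one inverse transform regardless of the number of feature planes.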

The study of Enzweiler et al. [10] provides a detailed overview of the current state of the art from both methodological and experimental perspectives. The first part of the article is a survey covering the main components of a pedestrian detection system and the underlying models. The experimental study considers two evaluation settings to strike a good compromise between generality and specificity: a generic setting with no scenario or processing constraints, and a setting specific to an application on board a moving vehicle in traffic.

Everingham et al. [11] describe the dataset and evaluation procedure. The study reviews the state of the art in the evaluated classification and detection methods, examining whether the methods are statistically different, what they learn from the images (e.g., the object or its context), and what they find easy or confusing. Object category detection has come a long way in the last decade.

Felzenszwalb et al. [12] proposed a framework for object recognition and evaluated it on the PASCAL dataset. The framework can represent highly variable object classes and achieves state-of-the-art results. Deformable part models are widely used, and their value is shown on challenging benchmarks such as the PASCAL datasets.

Fergus et al. [13] model objects as flexible constellations of parts. A probabilistic representation is used for all aspects of the object, including shape, appearance, occlusion, and relative scale. An entropy-based feature detector is used, and the parameters of the scale-invariant object model are estimated during learning. The recognition results convincingly illustrate the strength of the constellation model and the learning algorithm.

Geronimo et al. [14] identified the implementation of accurate on-board pedestrian detection systems as a major challenge. The robustness required of such systems is hard to achieve because of the varying appearance of pedestrians (e.g., different clothing, changing size, aspect ratio, and dynamic shape) and the unstructured environment. The shortage of public benchmarks and the difficulty of reproducing many proposed techniques are two further issues in this research field that make it hard to compare approaches. Intelligent vehicles are an essential technology for reducing the number of pedestrian-vehicle collisions.

Levi et al. [15] introduced a new part-based object detection method that detects hundreds of objects in real time. Because of their capacity to represent large appearance variations, part-based models are currently state of the art for object detection. However, such techniques are restricted to a few parts because of their high computational demands and are too slow for real-time execution. The work demonstrated the AFS, a real-time system for multiple-part based object detection, and presented KD-Ferns, a new approximate nearest neighbour (ANN) search algorithm that performs very well on small multi-dimensional datasets.

Lowe et al. [16] describe image features that are invariant to image scale and rotation and show that they provide reliable matching across a broad range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This distinctiveness is achieved by assembling a high-dimensional vector representing the image gradients within a local region of the image.

In the study of Mazzae and Garrott [17], the National Highway Traffic Safety Administration (NHTSA) examined three sensor-based rear object detection systems, one rear object detection system combining sensors with a rear-view camera, one rear-view video-only rear object detection system, and one rear-view mirror system. NHTSA also assessed four sensor-based devices, rear-view camera systems, and one lower rear-view mirror. The sensor and rear-view video components of the combined designs were tested independently.

In the work of Prioletti et al. [19], detecting pedestrians remains a difficult challenge for automotive vision systems owing to the high variability of targets, lighting conditions, occlusion, and high-speed vehicle motion. Much research has addressed this challenge in recent years, and classifier-based detectors have taken a prominent position among the approaches proposed. Although feature-based tracking works well at higher frame rates, a low-level reimplementation of the two-stage classifier that fully exploits multicore processors (or graphics processing units) should bring sizeable performance gains [20].

Shashua et al. [21] note that the rapid growth of deep learning network models for detection tasks has greatly improved the efficiency of object detectors. To give a thorough understanding of the development status of the object detection pipeline, the study first reviews traditional detection methods and identifies the benchmark datasets. The need for high-accuracy real-time solutions is becoming increasingly critical for deployment in more demanding applications.

Tsishkou et al. [22] showed that the main achievements include the combination of single-frame and temporal cues and the addition of further object categories such as vehicles and stationary background structures. However, most detailed field experiments were conducted using cameras with high contrast, good resolution, a wide field of view, and no blur. The main aim of the study was to compare pedestrian detection performance using a conventional built-in rear-view parking camera mounted on a standard passenger car with previous studies that used special-purpose installations, in terms of vehicle speed, distance from the vehicle to the pedestrian, and lighting conditions.

Bar-Hillel et al. [23] presented techniques for learning part-based object recognition through feature synthesis. The approach is based on an iterative feature generation and pruning process. Basic part-based features are combined into a feature hierarchy using operators for part detection, part refinement, and part combination in this feature generation process. The study demonstrated the usefulness of a modern part-based feature learning approach on well-known pedestrian detection benchmarks. The proposed model is highly flexible, allowing new feature types to be discovered rapidly.

Silberstein et al. [24] compiled and annotated a large dataset of videos featuring pedestrians in various environments for a particular application. Using this dataset, the work illustrates the advantages of part-based detection for identifying people in multiple poses and under occlusion. It demonstrates a method for detecting pedestrians in rear-view cameras, trained and evaluated on a dataset gathered for this purpose. The system is unusual in that it employs a part-based detection process, which increases its robustness when detecting pedestrians who are nearby, partly occluded, or not upright.

Molineros et al. [25, 26] used a monocular high-angle sensor and introduced a method for automatically detecting obstacles from a moving car. The framework was created to locate obstacles, especially children, while backing up. The camera's perspective is transformed into a virtual bird's-eye view. To obtain ego-motion, the authors created a novel image registration algorithm that, when combined with variational dense optical flow, produces a residual motion map with respect to the ground. As a result, improved detection and prediction are possible. By analysing the residual motion map, obstacles can be segmented using fast variational optical flow. The algorithm may be used, for example, as a focus-of-attention tool for child recognition.

Park et al. [27] presented a method that combines coarse-scale flow with fine-scale temporal difference features. Weak motion stabilization is achieved by factoring out camera motion and coarse object motion while preserving non-rigid motions that serve as useful recognition cues. Results are reported on video sequences for pedestrian detection and human pose estimation, with state-of-the-art results in both. Using weakly stabilized video clips, the authors define a family of temporal features. Thanks to weak stabilization, the detector can reliably capture part-centric information while eliminating much of the image- and object-centric motion.

Alon et al. [28] suggested that, in terms of cost, repeatability, and the ability to evaluate various modules under the same conditions, off-vehicle evaluation using a database of video streams has many advantages over on-vehicle evaluation. Their technique provides an off-vehicle evaluation platform for camera-based pedestrian detection, which allows both industrial modules and internally built algorithms to be tested. They demonstrated a pedestrian detection evaluation tool used to test multiple detection modules and algorithms proposed by various vendors. The platform computes detection statistics and accuracy on different data partitions, resulting in a characterization of the evaluated system's capabilities.

Dollár et al. [29] note that the quality of the image features available to a vision system largely determines its reliability and robustness. Data mining usually involves working with massive amounts of raw data, necessitating efficient algorithms to explore the data space. Just as data mining searches a large space of meaningful patterns, image processing offers a very large number of candidate features. Recent advances in machine learning, together with a sustained effort in manual feature design by domain experts, have made the problems in these areas more tractable. Feature mining is intended to reduce the time and effort required for feature design, laying the groundwork for applications that outperform those with manually designed features.

Felzenszwalb et al. [30] present a discriminatively trained, multiscale, deformable part model for object detection. Compared with the best result in the 2006 PASCAL person detection challenge, the system achieves a two-fold improvement in average precision. In ten of the twenty categories, it also outperforms the best results from the 2007 challenge. The system relies heavily on deformable parts. Although deformable part models have become popular, their value had not previously been demonstrated on difficult benchmarks such as the PASCAL challenge. The system also depends strongly on new methods for discriminative training. A general method for training SVMs with latent structure was introduced and used to build a multiscale, deformable model-based recognition system. The experimental results show that the proposed methodology is a good detection algorithm.

Maji et al. [31] observe that straightforward classification using kernelized SVMs requires evaluating the kernel between a test vector and each of the support vectors. They show that this can be done far more efficiently for a class of kernels. For histogram intersection kernel SVMs (IKSVMs), a classifier can be built whose runtime complexity is logarithmic in the number of support vectors, rather than linear as in the standard approach. This exact evaluation method is the theoretical contribution of the study. The authors further demonstrate that an approximation of the IKSVM classifier can be built with similar classification performance but with a runtime that is constant in the number of support vectors.
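The logarithmic evaluation follows from the form of the intersection kernel, which sums min(z_i, x_i) over dimensions. With c_l the signed coefficient of support vector x_l, the decision function splits per dimension into h_i(s) = sum of c_l x_{l,i} over support values not exceeding s, plus s times the sum of c_l over the remaining ones. The sketch below is one way to realize this with sorted values, prefix sums, and binary search; the class and function names are illustrative and the support vectors are toy inputs, not a trained model.

```python
from bisect import bisect_right


class FastIntersectionKernelSVM:
    """Exact intersection-kernel decision function in O(d log m) per test vector."""

    def __init__(self, support_vectors, coefficients, bias=0.0):
        self.bias = bias
        self.tables = []
        for dim in range(len(support_vectors[0])):
            pairs = sorted((sv[dim], c) for sv, c in zip(support_vectors, coefficients))
            values = [v for v, _ in pairs]
            prefix_weighted = [0.0]  # running sum of c * value over sorted values
            prefix_coeff = [0.0]     # running sum of c over sorted values
            for v, c in pairs:
                prefix_weighted.append(prefix_weighted[-1] + c * v)
                prefix_coeff.append(prefix_coeff[-1] + c)
            self.tables.append((values, prefix_weighted, prefix_coeff))

    def decision(self, z):
        total = self.bias
        for s, (values, prefix_weighted, prefix_coeff) in zip(z, self.tables):
            r = bisect_right(values, s)  # number of support values <= s
            total += prefix_weighted[r] + s * (prefix_coeff[-1] - prefix_coeff[r])
        return total


def naive_decision(support_vectors, coefficients, bias, z):
    """Reference implementation, linear in the number of support vectors."""
    return bias + sum(c * sum(min(a, b) for a, b in zip(z, sv))
                      for sv, c in zip(support_vectors, coefficients))
```

The two functions agree up to rounding error; the fast version pays a one-time sorting cost and afterwards needs only one binary search per feature dimension.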

Dollár et al. [32] note that pedestrian detection is a key problem in computer vision, with applications in robotics, security, and automotive safety. The availability of challenging public datasets has fuelled much of the progress made in recent years. To keep up with the rapid pace of progress, the study introduces the Caltech Pedestrian Dataset, which is two orders of magnitude larger than existing datasets. The dataset contains annotated video captured from a moving vehicle, with challenging low-quality footage and frequently occluded people. Pedestrians at lower resolution and pedestrians under partial occlusion are two understudied cases that stand out as especially frequent and important in the data collected. Pedestrians at the medium and far scales account for more than 80% of the data; moreover, it is critical to detect pedestrians early to give the driver enough warning in automotive applications.

Vedaldi et al. [33] show that a sufficiently powerful classifier cannot be evaluated on all image sub-windows in a reasonable time. They therefore propose a three-stage classifier that combines linear, quasi-linear, and non-linear kernel SVMs. The work demonstrates that increasing the non-linearity of the kernels increases their discriminative power at the cost of increased computational complexity. To contain this cost, the authors chose a three-stage cascade, which is applied in both training and testing. They also considered refinements to the whole cascade, such as using the first-stage candidates as regions to search rather than as the only candidates for further consideration.
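The cost argument behind a cascade can be sketched in a few lines. The scoring functions below are hypothetical stand-ins for the three kernel families (a linear score, a linear score on square-rooted features as a quasi-linear example, and an RBF score), with made-up thresholds; they are not the trained SVMs of [33].

```python
import math


def run_cascade(windows, stages):
    """Apply (score_function, threshold) stages in order, cheapest first.

    Returns the surviving windows and how many windows each stage had to score.
    """
    survivors = list(windows)
    evaluations = []
    for score, threshold in stages:
        evaluations.append(len(survivors))
        survivors = [w for w in survivors if score(w) >= threshold]
    return survivors, evaluations


# Hypothetical stand-in scores on toy 2-D feature vectors (assumptions, untrained).
def linear_score(x):
    return sum(x)


def quasi_linear_score(x):
    return sum(math.sqrt(v) for v in x)


def rbf_score(x, prototype=(0.9, 0.9), gamma=10.0):
    return math.exp(-gamma * sum((v - p) ** 2 for v, p in zip(x, prototype)))


STAGES = [(linear_score, 0.8), (quasi_linear_score, 1.5), (rbf_score, 0.95)]
```

Running `run_cascade` on a handful of toy windows shows the intended effect: the most expensive stage scores only the windows that the cheaper stages could not reject.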

Bar-Hillel and Weinshall [34] present a method for learning part-based object class models that is both efficient and accurate. The models use part position and scale relations as well as part appearance. Models are learned from raw object and background images, represented as an unordered set of features extracted with an interest point detector. The object class is modelled generatively by a simple Bayesian network with a central hidden node containing location and scale information and nodes representing object parts. The work introduces an object class recognition system based on discriminative, boosting-based optimization of a simple relational generative model. The approach combines the principled spatial part relations of generative classifiers with the efficiency and feature selection ability of discriminative systems.

Xie et al. [35] studied Support Vector Machine Recursive Feature Elimination (SVM-RFE), a feature selection process that can eliminate irrelevant features but ignores redundant ones. The study explains that the approach fails to remove redundant features and proposes an improved method in which the correlation coefficient is used with SVM-RFE to measure the redundancy of the selected subset. Support vector machines are commonly used to select informative feature subsets. The work shows that the standard SVM-RFE feature selection algorithm considers only the relationship between condition attributes and the decision, and ignores the relationships among condition attributes.
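The redundancy check can be sketched as a post-filter on a feature ranking. The example below assumes that a ranking (best feature first) has already been produced, for instance by an SVM-RFE run, and keeps a feature only if its absolute Pearson correlation with every feature already kept stays below a threshold. The function names and the 0.9 threshold are illustrative assumptions, not the procedure of [35].

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant feature carries no linear relationship
    return cov / math.sqrt(var_a * var_b)


def drop_redundant(ranked_features, columns, max_abs_corr=0.9):
    """Walk the ranking and skip features too correlated with one already kept.

    ranked_features: feature names, best first. columns: name -> list of values.
    """
    selected = []
    for name in ranked_features:
        if all(abs(pearson(columns[name], columns[kept])) <= max_abs_corr
               for kept in selected):
            selected.append(name)
    return selected
```

With three toy features where the second is almost a multiple of the first, the filter keeps the first and third, which is the behaviour that plain ranking by relevance alone does not provide.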

3. Observation of the Study

This section explains the limitations of existing models. Many pedestrian detection models have been implemented using conventional methods.

These techniques face many limitations and research problems, such as image hazing, uneven detection, false pedestrian detections, and unclassified results.

Because their experimental results are not sufficiently effective, existing models cannot be used as built-in applications in rear-view cameras.

Most of the work has been carried out on benchmark image datasets, and very little research is available on real-time video tracking. Video tracking poses significant problems and difficulties in the construction of applications. These problems arise at three distinct stages of object identification: video acquisition, human recognition, and tracking. Video acquisition problems include changes in lighting, sudden motion, dynamic background, and object estimation; object or pedestrian identification and tracking problems are likewise diverse.

Therefore, future applications require effective pedestrian detection models built on efficient techniques.

Examples from image datasets are shown in Figure 3.

Figure 3. Part-based feature synthesis for human detection [2]

Left: Feature types currently supported in our generation process. An arrow between A and B stands for ’A can be generated from B’. Center: Examples of features from our learned classifiers. a, b) Localized features. The rectangle denotes the fragment. The circle marks the 1-std of its location Gaussian. c) Detection example for a spatial “AND” feature composed of fragments a, b, d) Two fragments composing together a semantic “OR” feature. e) A subpart feature. The blue rectangle is the emphasized fragment quarter. Right: Typical images from the Children dataset.

Figure 4. Pedestrian detection and tracking from a moving vehicle [5]

Driver inattention can instantly lead to dangerous situations with pedestrians as shown in Figure 4.

Figure 5. Stereo verification

Figure 5 shows examples of the removal of background-corrupted detections (accepted solutions are shown in white, rejected ones in black).



Figure 6. Part-based pedestrian detection system [19]

In Figure 6, (a): The base image is from the DaimlerDB data set [36, 37]. The red dashed line is the Haar bounding box and the blue continuous line is the HOG bounding box; (b): Detection stage output. Several false positives are contained, but these will be removed in the verification stage.

Figure 7. Automotive accidents involve cars backing up [25]

In Figure 7, (a): An image taken from a camera rigidly mounted on a vehicle; (b): Result of virtually rotating the camera view to a bird's-eye view.
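The virtual camera rotation mentioned for Figure 7 can be illustrated with the standard pure-rotation homography H = K R K^-1, where K is the camera intrinsic matrix and R the rotation applied to the viewing direction. The sketch below is not taken from [25]; the focal length, principal point, pitch angle and the sign convention of the rotation are all assumed values chosen for illustration, and the code maps pixel coordinates only (a full bird's-eye image would also need resampling).

```python
from math import cos, sin, radians


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_homography(f, cx, cy, pitch_rad):
    """Homography H = K * R * K^-1 for a virtual rotation about the camera x-axis.

    f is the focal length in pixels and (cx, cy) the principal point; a simple
    pinhole model with square pixels and no distortion is assumed.
    """
    K = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    K_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    c, s = cos(pitch_rad), sin(pitch_rad)
    R = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    return matmul3(matmul3(K, R), K_inv)


def warp_pixel(H, u, v):
    """Map pixel (u, v) through H using homogeneous coordinates."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    # w approaches zero for pixels on the horizon of the virtual view;
    # callers should skip such pixels.
    return x / w, y / w


# Hypothetical intrinsics: 500 px focal length, principal point (320, 240).
H = rotation_homography(500.0, 320.0, 240.0, radians(45))
u, v = warp_pixel(H, 320.0, 240.0)
print(round(u, 3), round(v, 3))
```

With a zero pitch angle the homography reduces to the identity, which is a quick sanity check before applying it to real calibration data.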

4. Future Research Work

This section discusses future objectives for the design and verification of pedestrian detection.

·A real-time and accurate pedestrian detection model for the rear view that uses the Vector Feature Synthesis Technique to classify objects.

·A deep learning based pedestrian detection model to classify unusual events in real-time applications.

·A machine learning and segment-based pedestrian detection model for dynamic detection and hybrid classification.

5. Comparison of Results

This study considers camera-based detection from both the rear end and the front end of the vehicle. The suggested architecture aims to provide stable and reliable classification for any object detected through the camera. The deployed model will automatically detect the base body, foot, upper body and head regions, which can be identified quickly and yield reliable results. The proposed models will be evaluated with classification metrics such as true positive rate (also called sensitivity or recall) and precision. Table 1 tabulates the results of a few of the researchers' works.

Table 1. Earlier models comparison

| S. No. | | RFO [6] | Residual flow [7] | Fused DNN [14] | DT [15] |
| | Prediction probability | | [3.12e-4 1.2987] | [4.12e-4 1.1987] | [3.11e-4 1.1987] |

Figure 8. Comparison of results

Table 1 and Figure 8 give a straightforward comparison of the pedestrian detection models. The existing techniques achieve low detection rates, which can be improved further through successive thresholds and advanced methodologies or learning models.
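The classification metrics named in this section can be computed from the four confusion-matrix counts of a detector. The following minimal sketch shows the standard definitions; the function name and the example counts are hypothetical and are not taken from Table 1 or from any of the surveyed papers.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # same quantity as sensitivity / true positive rate
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "false_alarm_rate": ratio(fp, fp + tn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=40, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity, recall and true positive rate are three names for the same ratio, so reporting one of them is sufficient when models are compared.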

6. Conclusions

This work presents a brief review that analyses implemented models and their limitations. According to this review, all implemented models suffer from inaccurate pedestrian detection and low accuracy rates, and many conventional methods are not suitable for real-time applications. Pedestrian detection rate, elapsed time and performance evaluations score poorly. Moreover, measures such as accuracy, sensitivity (recall), F1 score, throughput and detection rate need to be improved for real-time applications. Based on this study, a few limitations are identified, and objectives are defined to overcome them. We conclude that such pedestrian detection applications can help heavy vehicles, cars, jeeps and buses, as well as real-time CCTV observation, to avoid accidents.


[1] Alon, Y., Bar-Hillel, A. (2012). Off-vehicle evaluation of camera-based pedestrian detection. In 2012 IEEE Intelligent Vehicles Symposium, Madrid, Spain, pp. 352-358. https://doi.org/10.1109/IVS.2012.6232160

[2] Bar-Hillel, A., Levi, D., Krupka, E., Goldberg, C. (2010). Part-based feature synthesis for human detection. In European conference on computer vision, pp. 127-142. https://doi.org/10.1007/978-3-642-15561-1_10

[3] Broggi, A., Bertozzi, M., Fascioli, A., Sechi, M. (2000). Shape-based pedestrian detection. In Proceedings of the IEEE Intelligent Vehicles Symposium 2000 (Cat. No. 00TH8511), Dearborn, MI, USA, pp. 215-220. https://doi.org/10.1109/IVS.2000.898344

[4] Cho, H., Rybski, P.E., Bar-Hillel, A., Zhang, W. (2012). Real-time pedestrian detection with deformable part models. In 2012 IEEE Intelligent Vehicles Symposium, Madrid, Spain, pp. 1035-1042. https://doi.org/10.1109/IVS.2012.6232264

[5] Gavrila, D.M., Munder, S. (2007). Multi-cue pedestrian detection and tracking from a moving vehicle. International Journal of Computer Vision, 73(1): 41-59. https://doi.org/10.1007/s11263-006-9038-7

[6] Dalal, N., Triggs, B. (2005). Histograms of oriented gradients for human detection. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), San Diego, CA, USA, pp. 886-893. https://doi.org/10.1109/CVPR.2005.177

[7] Dean, T., Ruzon, M.A., Segal, M., Shlens, J., Vijayanarasimhan, S., Yagnik, J. (2013). Fast, accurate detection of 100,000 object classes on a single machine. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1814-1821. 

[8] Dollar, P., Wojek, C., Schiele, B., Perona, P. (2011). Pedestrian detection: An evaluation of the state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4): 743-761. https://doi.org/10.1109/TPAMI.2011.155

[9] Dubout, C., Fleuret, F. (2012). Exact acceleration of linear object detectors. In European Conference on Computer Vision, pp. 301-311. https://doi.org/10.1007/978-3-642-33712-3_22

[10] Enzweiler, M., Gavrila, D.M. (2008). Monocular pedestrian detection: Survey and experiments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12): 2179-2195. https://doi.org/10.1109/TPAMI.2008.260

[11] Everingham, M., Winn, J. (2011). The pascal visual object classes challenge 2012 (voc2012) development kit. Pattern Analysis, Statistical Modelling and Computational Learning, Tech. Rep., 8(5). http://www.pascalnetwork.org/challenges/VOC/voc2009/workshop/index.html

[12] Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine intelligence, 32(9): 1627-1645. https://doi.org/10.1109/TPAMI.2009.167

[13] Fergus, R., Perona, P., Zisserman, A. (2003). Object class recognition by unsupervised scale-invariant learning. In 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. II-II. https://doi.org/10.1109/CVPR.2003.1211479

[14] Geronimo, D., Lopez, A.M., Sappa, A.D., Graf, T. (2009). Survey of pedestrian detection for advanced driver assistance systems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(7): 1239-1258. https://doi.org/10.1109/TPAMI.2009.122

[15] Levi, D., Silberstein, S., Bar-Hillel, A. (2013). Fast multiple-part based object detection using kd-ferns. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 947-954.

[16] Lowe, D.G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2): 91-110. https://doi.org/10.1023/B:VISI.0000029664.99615.94

[17] Mazzae, E.N., Garrott, W.R. (2006). Experimental evaluation of the performance of available backover prevention technologies (No. DOT HS 810 634).

[18] Papageorgiou, C., Poggio, T. (2000). A trainable system for object detection. International Journal of Computer Vision, 38(1): 15-33. https://doi.org/10.1023/A:1008162616689

[19] Prioletti, A., Møgelmose, A., Grisleri, P., Trivedi, M.M., Broggi, A., Moeslund, T.B. (2013). Part-based pedestrian detection and feature-based tracking for driver assistance: real-time, robust algorithms, and evaluation. IEEE Transactions on Intelligent Transportation Systems, 14(3): 1346-1359. https://doi.org/10.1109/TITS.2013.2262045

[20] Lasmika, A., Kumaresan, M. (2022). A smart car parking system based on IoT with Gray Wolf Optimization-probability correlated neural network recognition methods. Ingénierie des Systèmes d’Information, 27(5): 807-814. https://doi.org/10.18280/isi.270514

[21] Shashua, A., Gdalyahu, Y., Hayun, G. (2004). Pedestrian detection for driving assistance systems: Single-frame classification and system level performance. In IEEE Intelligent Vehicles Symposium, 2004, pp. 1-6. https://doi.org/10.1109/IVS.2004.1336346

[22] Tsishkou, D., Bougnoux, S. (2007). Experimental evaluation of multi-cue monocular pedestrian detection system using built-in rear view camera. In 2007 7th International Conference on ITS Telecommunications, pp. 1-6. https://doi.org/10.1109/ITST.2007.4295885

[23] Bar-Hillel, A., Levi, D., Krupka, E., Goldberg, C. (2010). Part-based feature synthesis for human detection. In European conference on computer vision, pp. 127-142. https://doi.org/10.1007/978-3-642-15561-1_10

[24] Silberstein, S., Levi, D., Kogan, V., Gazit, R. (2014). Vision-based pedestrian detection for rear-view cameras. In 2014 IEEE Intelligent Vehicles Symposium Proceedings, Dearborn, MI, USA, pp. 853-860. https://doi.org/10.1109/IVS.2014.6856399

[25] Molineros, J., Cheng, S.Y., Owechko, Y., Levi, D., Zhang, W. (2012). Monocular rear-view obstacle detection using residual flow. In European Conference on Computer Vision, pp. 504-514. https://doi.org/10.1007/978-3-642-33868-7_50

[26] Cheng, S.Y., Molineros, J., Owechko, Y., Levi, D., Zhang, W. (2012). Parts-based object recognition seeded by frequency-tuned saliency for child detection in active safety. In 2012 15th International IEEE Conference on Intelligent Transportation Systems, pp. 1155-1160. https://doi.org/10.1109/ITSC.2012.6338883

[27] Park, D., Zitnick, C.L., Ramanan, D., Dollár, P. (2013). Exploring weak stabilization for motion feature extraction. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2882-2889.

[28] Alon, Y., Bar-Hillel, A. (2012). Off-vehicle evaluation of camera-based pedestrian detection. In 2012 IEEE Intelligent Vehicles Symposium, pp. 352-358. https://doi.org/10.1109/IVS.2012.6232160

[29] Dollár, P., Tu, Z., Tao, H., Belongie, S. (2007). Feature mining for image classification. In 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8.

[30] Felzenszwalb, P., McAllester, D., Ramanan, D. (2008). A discriminatively trained, multiscale, deformable part model. In 2008 IEEE conference on computer vision and pattern recognition, pp. 1-8.

[31] Maji, S., Berg, A.C., Malik, J. (2008). Classification using intersection kernel support vector machines is efficient. In 2008 IEEE conference on computer vision and pattern recognition, pp. 1-8.

[32] Dollár, P., Wojek, C., Schiele, B., Perona, P. (2009). Pedestrian detection: A benchmark. In 2009 IEEE conference on computer vision and pattern recognition, pp. 304-311.

[33] Vedaldi, A., Gulshan, V., Varma, M., Zisserman, A. (2009). Multiple kernels for object detection. In 2009 IEEE 12th international conference on computer vision, pp. 606-613.

[34] Bar-Hillel, A., Weinshall, D. (2008). Efficient learning of relational object class models. International Journal of Computer Vision, 77(1): 175-198. https://doi.org/10.1007/s11263-007-0091-7

[35] Xie, Z.X., Hu, Q.H., Yu, D.R. (2006). Improved feature selection algorithm based on SVM and correlation. In International symposium on neural networks, pp. 1373-1380.

[36] Li, Y. (2018). Design and implementation of intelligent travel recommendation system based on internet of things. Ingénierie des Systèmes d’Information, 23(5): 159-173. https://doi.org/10.3166/ISI.23.5.159-173

[37] Franke, U., Gavrila, D., Gorzig, S., Lindner, F., Puetzold, F., Wohler, C. (1998). Autonomous driving goes downtown. IEEE Intelligent Systems and Their Applications, 13(6): 40-48. https://doi.org/10.1109/5254.736001