Motion-Frames Based Video Watermarking Scheme for Copyright Protection Using Guided Filtering in Wavelet Domain

Purnima* Rakesh Ahuja Nidhi Gautam

Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura 140401, Punjab, India

University Institute of Applied Management Sciences, Department of Computer Science, Panjab University, Chandigarh 160014, India

Corresponding Author Email:
5 December 2022
20 January 2023
5 February 2023
Available online: 28 February 2023

© 2023 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license.



With the arrival of high-bandwidth internet, the exchange of multimedia data has exploded and made everyday life considerably easier. This ease also raises several security issues, namely ownership identification, copyright protection, and illegal access to digital content. This article proposes a video watermarking technique for copyright protection that works in the wavelet transform domain and incorporates an explicit frame filter known as the guided filter. The guided filter is derived from a linear model and uses a guidance image to generate the final filtered output. It is widely accepted for its edge-preserving and detail-enhancing characteristics. Including a guided filter in the watermarking technique can therefore efficiently remove video frame noise and bring out the descriptive details of the video object, resulting in an enhanced version of the watermarked video frame. Furthermore, the guided filter is regarded as one of the fastest edge-preserving filters, because it has a fast and fairly accurate algorithm whose running time does not depend on the intensity range or the kernel size. The results of the investigation show that the video watermarking approach, which combines the wavelet transform with a guided filter on the motion frames of the video object, is effective in a range of video watermarking applications, including copyright protection, ownership identification and video authentication.


video watermarking, copyright protection, motion-frames, edge-preserving filter, bilateral filter, edge-smoothening

1. Introduction

The massive use of multimedia data, made possible by the increased bandwidth of the Internet, has made life far more convenient through numerous services. These include information exchange regardless of the geographical location of the users while maintaining quality and protection, teaching in virtual mode, and the use of diverse social media platforms. At the same time, it affects the owners of multimedia objects adversely by causing huge financial losses and denying them due recognition and identification. Furthermore, copying, unauthorized handling, alteration and illicit circulation of multimedia content are unlawful and treated as offenses [1]. Of all forms of multimedia data, videos are considered the most vulnerable: illegal distribution is at its peak and unauthorized parties profit from it, causing massive losses to the entertainment industry. Thus, to safeguard the rights of the originator, there is a demand for a defensible technique such as digital watermarking to identify the illegitimate use and distribution of video objects [2]. Digital watermarking inserts secret information, known as a watermark, inside the video. The watermark can be a logo, an image or a pattern, and it can be retrieved afterwards to protect the copyright and establish ownership against illegal sharing and manipulation of the video content.

Numerous video watermarking techniques are available, based on scene-change detection, key frames, identical frames and motionless frames. In scene-change based watermarking schemes, the watermark is inserted in motionless frames of the video object, which sometimes results in disintegration or removal of the independent watermark [3]. Moreover, when scene changes are frequent or a scene consists of many dissimilar short sequences, such a scheme is not advisable because the watermark has not been shown to be strong and safe. The remedy is an efficient video watermarking scheme that uses motion frames and adds encryption for greater security and robustness.

Considering these issues, the proposed technique adds secret information for authentication and to safeguard the content from illegal use. The prime challenges in inserting the secret content in the video are the detection and extraction of motion frames and the preservation of the perceptual quality of the object. After the motion regions of the video object are identified, the watermark is inserted in them instead of in all the video frames. This protects the secret data; otherwise, it would be very easy to detect and discard, leaving the object unsafe. The vital concern is to minimize quality degradation, since the techniques used affect the quality of the video content. The scheme presented in this study combines the wavelet transform with a guided filter and scrambling of the watermark to enhance security. This combination is applied to the watermark.

The validity of the given approach is examined by applying diverse categories of attacks to the watermarked video. The main contributions of this approach are summarized as follows:

(1) A new technique is proposed in the wavelet domain, based on a guided filter applied to selected frames of the video object.

(2) Motion frames are extracted from the video for the implementation of the technique.

(3) The performance evaluators and quality parameters used in this study are Peak Signal to Noise Ratio (PSNR), Normalized Correlation (NC), and Mean Square Error (MSE).

(4) Numerous signal processing, geometric and video-specific attacks are executed on the video content.

The approach presented in this paper is built on the two-dimensional wavelet transform combined with guided filtering, which not only preserves the perceptual quality but also maintains robustness.

This paper is arranged in six sections. Following this introduction, Section 2 surveys the existing work in the area, Section 3 gives brief information about the video preprocessing method, Section 4 describes the proposed approach in detail, Section 5 presents the results of the study and compares them with recent research in the field, and Section 6 presents the conclusion of the research work.

2. Related Work

Various watermarking approaches for video content have been proposed in the literature for copyright protection. Much work has been done, and a number of novel schemes have been proposed by researchers in the field of digital video watermarking in the non-compressed domain, covering the spatial domain and the frequency domain, as shown in Figure 1.

Figure 1. Classification of video watermarking techniques

Spatial domain techniques modify pixel values to protect the multimedia object. While designing an algorithm using these schemes, one should have some knowledge of the three major conflicting requirements of the process, so that the resultant watermarked video strikes a balance between them. To enhance robustness, blind video watermarking schemes [4-10] embed the watermark at different positions, including the luminance component of each frame, every frame, and frames selected by a scene-change based approach. These schemes can effectively resist transcoding attacks [4], frame averaging attacks [6], bit errors of the watermark (by means of cyclic error correction codes) and frame dropping attacks [10]. Because they are easy to implement, spatial domain algorithms are widely used, but their robustness is not up to the mark and their scope of application is limited.

The other category is frequency domain techniques, in which the content of several frequency components is altered and the watermark is inserted in an appropriately chosen component. Some of the well-known transform-based methods are the cosine transform, the wavelet transform and the Fourier transform. These methods are considered to perform better and give more accurate results than spatial domain methods against various categories of attacks on the watermarked video object. The DCT is one of the most widely used frequency domain techniques. It concentrates the energy of the video frames in the low-frequency sub-band, where human perception and robustness are better than in the high-frequency sub-bands. Liu et al. [11], Thanh et al. [12], and Nguyen and Nguyen [13] proposed blind robust video watermarking algorithms in the DCT domain based on the Arnold transform and code division multiple access (CDMA) to counter illegal copying. The watermark embedding positions in these methods are the high-frequency coefficients of the R, G, B components and the low-frequency DCT coefficients of the Y component.

Owing to the multiresolution property and the wider frequency distribution of the wavelet transform, various DWT-based techniques are widely accepted [14, 15]. To achieve watermark invisibility and resilience against various attacks, the watermark was inserted in the second-level medium- and high-frequency bands using restricted information and a quantization method, with error-correction codes additionally used to enhance the performance of the algorithm. Gupta et al. [16] present a DWT-based security scheme that includes resizing of frames. This technique is intended to prevent collusion attacks and uses the group search optimization (GSO) algorithm to optimize the positions selected for watermark insertion. Agarwal and Husain [17] propose a digital video watermarking approach that examines and exploits the important outlines and attributes of the video object. The method is based on a circle symmetry algorithm that uses the two-dimensional Lifting Wavelet Transform (2D-LWT) and speeded-up robust features. These help to obtain stable attribute points from the 2D-LWT of the luminance part of each frame of the video object. To find the specific location for inserting the watermark, the attribute points with the highest intensity on the circumference of half of a quadrant of a circle are used. Various subjective and objective parameters are evaluated by testing this technique against temporal, compression and noise-addition attacks. The outcome shows visibly better performance than other techniques, with good metric values while maintaining the quality of the video object.

Singular Value Decomposition (SVD) represents the intrinsic attributes of an image rather than its visual attributes. This feature makes SVD popular in video watermarking: in [18-20] the information is inserted in the singular value matrices, which results in better robustness and quality of the video object. Himeur and Boukabou [19] adopted a fast gradient magnitude similarity deviation (GMSD) algorithm to identify the shot boundaries of the video sequence, from which representative key-frames can be extracted. Sharma et al. [20] suggest a hybrid and secure video watermarking technique that uses a graph-based transform, hyperchaotic encryption and SVD to address issues related to ownership and copyright protection.

To combine the benefits of the individual techniques and obtain better results, combinations of methods, popularly called hybrid techniques, are used [21-25]. Here DWT and SVD are merged to insert the secret image into selected key-frames after the watermark has been encrypted with chaotic encryption. Adul and Mwangi [26] present a blind approach based on an SVD/DWT technique, considered a hybrid method of video watermarking. The DWT is applied to the G components of the identified frames to obtain the diagonal detail coefficients, to which SVD is applied, and the watermark image is inserted into the singular value matrices. Panda and Garg [27] compare four different insertion methods for the watermark embedding process to obtain better robustness. The important concerns in this area are watermark dependency, faster insertion techniques, selection of an appropriate host and the compressed format. Additionally, encryption can be added to the scheme for extra security [3].

3. Preliminaries

This section presents the preliminaries used in the proposed approach, covering the discrete wavelet transform, the extraction of motion frames and a brief account of the filtering techniques that help to preserve and smooth the edges of video objects.

3.1 Scrambling of watermark

Typically, the watermark is transformed into a scrambled form as a way of encrypting it, so that the plain watermark image cannot be recognized. To enhance safety and secrecy, an encryption tool [4] ciphers the binary watermark image so that the naked eye cannot discern the original form of the watermark. Even if the scrambled watermark has been extracted, an attacker cannot recover the watermark from the watermarked video without sufficient knowledge of the scrambling algorithm and the key. Re-shuffling the image therefore provides a second layer of security for the multimedia content. The watermark image is resized and partitioned into as many parts as there are elements in the array Key. The key for encrypting the watermark information is as follows:

Key=[16, 01, 14, 03, 12, 05, 10, 08, 15, 07, 06, 11, 04, 13, 02, 09]

The encryption process places the 16th part of the original watermark image in the 1st position, the 1st part in the 2nd position, the 14th part in the 3rd position, and so on, following the array Key. In this way, all 16 parts of the image are reshuffled according to the key, and the shape of the original watermark becomes meaningless. The scrambled image is formed by concatenating all the sub-images in the sequence given by the elements of the array Key. The obtained image is transposed and the process is executed 16 times to obtain the encrypted one. An example watermark image of the cameraman and the corresponding scrambled image are shown in Figure 2. The scrambled watermark image is further divided into a number of sub-images as given:

No. of partitioned images $=\frac{\text { No. of motion frames }}{\text { No. of Key elements } / 4}$          (1)

For instance, if the number of motion frames is 14 and the number of key elements is 16, Eq. (1) gives 14/(16/4) = 3.5, i.e. 4 partitioned images of the watermark object after rounding up. The scrambled image is therefore partitioned into four parts, to be embedded into different motion frames as per Eq. (1), as shown in Figure 3.


Figure 2. Watermark image at different stages

Figure 3. Histograms corresponding to motion frames
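The key-based scrambling and the partition count of Eq. (1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 4 × 4 row-major block layout, the helper names and the rounding up in `num_partitions` are assumptions, and the repeated transposition step mentioned in the text is omitted.

```python
import math
import numpy as np

# Array Key from the text (1-based block indices).
KEY = [16, 1, 14, 3, 12, 5, 10, 8, 15, 7, 6, 11, 4, 13, 2, 9]

def split_blocks(img, grid=4):
    """Split img into grid*grid equal blocks in row-major order (assumed layout)."""
    h, w = img.shape[0] // grid, img.shape[1] // grid
    return [img[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(grid) for c in range(grid)]

def join_blocks(blocks, grid=4):
    """Reassemble a row-major list of blocks into one image."""
    rows = [np.hstack(blocks[r * grid:(r + 1) * grid]) for r in range(grid)]
    return np.vstack(rows)

def scramble(img, key=KEY):
    """Position i of the scrambled image receives block key[i] of the original."""
    blocks = split_blocks(img)
    return join_blocks([blocks[k - 1] for k in key])

def unscramble(img, key=KEY):
    """Invert scramble() using the same key."""
    blocks = split_blocks(img)
    restored = [None] * len(key)
    for pos, k in enumerate(key):
        restored[k - 1] = blocks[pos]
    return join_blocks(restored)

def num_partitions(n_motion_frames, n_key_elements=len(KEY)):
    """Eq. (1), rounded up to a whole number of sub-images (assumption)."""
    return math.ceil(n_motion_frames / (n_key_elements / 4))
```

The sketch requires image dimensions divisible by 4, which is consistent with the resizing step described above. Without the key, the block order cannot be inverted, which is the property the scheme relies on.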

3.2 Extraction of motion frames

Researchers have faced diverse issues while working on motionless regions or scene-change detection in video watermarking. Hence, the presented scheme watermarks a video object using its motion frames. The red-component histogram method is employed to retrieve the motion frames from a video object. The main task in this technique is to set a threshold that differentiates motionless frames from motion frames. The procedure is as follows:

Step 1: Read the next two consecutive frames from the original video. If the second frame is the last frame, terminate the process; otherwise, use the red channel to estimate the histogram of both video frames and store the histograms in two separate matrices.

Step 2: Find the difference between the histograms and store the result in a newly constructed matrix.

Step 3: Calculate the sum of the elements in the newly constructed matrix. If the total is larger than the given threshold, the current frame is considered a motion frame and its frame number is stored in an array.

Step 4: Repeat Step 1 to Step 3 with the next adjacent frames from the original video. The approach results in the detection and selection of 14 frames from the original video data. The threshold was set to 5000 in this case, and the histograms corresponding to the selected frames are shown in Figure 3.
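Steps 1 to 4 can be sketched as below. This is a hedged illustration under stated assumptions: frames are RGB arrays with the red channel at index 0, the histogram difference is taken as an absolute difference (a signed difference of two histograms over equally sized frames always sums to zero), and the second frame of each pair is treated as the "current" frame. The function names are hypothetical.

```python
import numpy as np

def red_histogram(frame, bins=256):
    """Histogram of the red channel (assumes RGB order, 8-bit values)."""
    hist, _ = np.histogram(frame[..., 0], bins=bins, range=(0, 256))
    return hist.astype(np.int64)

def motion_frame_indices(frames, threshold=5000):
    """Return indices of frames whose red histogram differs from the
    previous frame's histogram by more than `threshold`."""
    motion = []
    for k in range(len(frames) - 1):
        h1 = red_histogram(frames[k])       # Step 1: histograms of the pair
        h2 = red_histogram(frames[k + 1])
        diff = np.abs(h2 - h1)              # Step 2: absolute difference (assumption)
        if diff.sum() > threshold:          # Step 3: compare the total with the threshold
            motion.append(k + 1)            # second frame of the pair (assumption)
    return motion
```

The default threshold of 5000 is the value reported in the text; in practice it would depend on the frame size and content.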

3.3 Wavelet transform based watermarking

This is one of the most popular and efficient transform-domain methods, commonly employed to insert secret data in a video object using simple filters. The wavelet approach decomposes a frame into several components that express the details of the object: the lower-resolution approximation component, represented by LL; the horizontal and vertical detail components, denoted HL and LH; and the diagonal detail component, denoted HH. For a two-scale decomposition, the procedure is repeated to compute multiple wavelet levels.

(a) One level decomposition

(b) Two level decomposition

Figure 4. DWT decomposition levels

In contrast to the other available techniques in this domain, the wavelet method closely models the major aspects of the Human Visual System [26]. It can also analyse signals in both the frequency and the time domain. These advantages make the wavelet transform well suited to watermark insertion at different decomposition levels, as shown in Figure 4. The parameters used to judge the various aspects of watermarking can be maintained using this method based on wavelet decompositions [25]. In this technique, both lower- and higher-resolution components are used to insert the watermark, exploiting the properties of the decomposed components after applying the 1D and 2D wavelet transform at the lower and higher sub-bands, represented by LL and HHHH respectively.
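The decomposition into LL, LH, HL and HH can be illustrated with a one-level 2D Haar transform written directly in NumPy. This is a sketch assuming the orthonormal Haar filter pair and even frame dimensions; sub-band naming conventions for LH and HL differ between libraries, and the function names are illustrative rather than taken from the paper.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One-level 2D Haar DWT; returns (LL, LH, HL, HH), each half the size of x."""
    x = np.asarray(x, dtype=float)
    lo = (x[:, 0::2] + x[:, 1::2]) / SQRT2   # low-pass along each row
    hi = (x[:, 0::2] - x[:, 1::2]) / SQRT2   # high-pass along each row
    LL = (lo[0::2] + lo[1::2]) / SQRT2       # approximation
    LH = (lo[0::2] - lo[1::2]) / SQRT2       # detail across rows
    HL = (hi[0::2] + hi[1::2]) / SQRT2       # detail across columns
    HH = (hi[0::2] - hi[1::2]) / SQRT2       # diagonal detail
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2."""
    lo = np.empty((LL.shape[0] * 2, LL.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2], lo[1::2] = (LL + LH) / SQRT2, (LL - LH) / SQRT2
    hi[0::2], hi[1::2] = (HL + HH) / SQRT2, (HL - HH) / SQRT2
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x
```

A two-level decomposition, as in Figure 4(b), is obtained by calling `haar_dwt2` again on the LL band returned by the first call.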

3.4 Guided filter

3.4.1 Definition

The guided filter is considered one of the simplest, most popular and most efficient filters for preserving and smoothing the edges of a video object. Its time complexity does not depend on the filter size and is O(N), where N is the number of pixels. It also produces satisfying visual results, especially at the edges of the object, and successfully minimizes the artifacts caused by the gradient-reversal problem. These features led to its inclusion in the toolbox of the research programming software MATLAB, and it is extensively accepted in this field. One of the major reasons for its recognition is its efficiency: an accelerated variant for joint upsampling subsamples the input frame and the guidance frame to estimate the local coefficients, as shown in Figure 5.

These linear coefficients are then upsampled and applied to the original guidance frame to generate the result. For a subsampling ratio s, this faster approach reduces the time complexity to $O(N / s^{2})$ and is suited to mega-pixel frames, where the filter size is comparable to the video frame size. The method is widely accepted because it involves almost no visible degradation, and its use in image processing has grown over the years.

Figure 5. Guided filter

3.4.2 Mathematical foundation

The mathematics of the guided filter is given by Eqs. (2)-(4). Here, the guidance image is denoted by I, and the output q is a linear transform of I in a window W centred at pixel f. The linear model of the guided filter is:

$q_{i}=a_{f} I_{i}+b_{f}, \quad \forall i \in W$         (2)

where $a_f$ and $b_f$ are linear coefficients assumed constant in the window W, which has radius r. The model ensures that q has an edge only if I has an edge, because $\nabla q=a \nabla I$. The same model is capable of enhancing the resolution of an object and supports matting as well as dehazing.

To determine the coefficients $a_f$ and $b_f$, constraints from the filtering input p are required. The output q is modelled as the input p minus the unwanted disturbance or noise n:

$q_{i}=p_{i}-n_{i}$               (3)

The difference between p and q should be minimised to obtain an effective solution. Specifically, the following cost function in the window W is minimised:

$E\left(a_{f}, b_{f}\right)=\sum_{i \in W}\left(\left(a_{f} I_{i}+b_{f}-p_{i}\right)^{2}+\epsilon a_{f}^{2}\right)$                  (4)

where $\epsilon$ is a regularization parameter that penalizes large values of $a_f$.

3.4.3 Pseudo code of guided filter

This section presents the pseudocode [2] of the guided filter, where fmean(x, r) denotes the mean filter of image x with window radius r. The other variable names (correlation, covariance, variance) are self-explanatory, and `.*` and `./` denote element-wise multiplication and division. The inputs are the filtering input image p, the guidance image I, the radius r and the regularization value sigma; the result is the filtered image output.

  1. meanI = fmean(I, r); meanp = fmean(p, r)
  2. correlationI = fmean(I .* I, r); correlationIp = fmean(I .* p, r)
  3. varianceI = correlationI - meanI .* meanI; covarianceIp = correlationIp - meanI .* meanp
  4. a = covarianceIp ./ (varianceI + sigma); b = meanp - a .* meanI
  5. meana = fmean(a, r); meanb = fmean(b, r)
  6. output = meana .* I + meanb
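The six pseudocode steps translate almost line by line into NumPy. The sketch below is one possible realisation, not the implementation used in the paper: the mean filter is built from an integral image with edge-replicated borders (an assumption; other border treatments are equally valid), and `eps` plays the role of the regularization value in step 4.

```python
import numpy as np

def box_mean(x, r):
    """Mean over a (2r+1) x (2r+1) window, using an integral image.
    Borders are handled by edge replication (assumption)."""
    k = 2 * r + 1
    c = np.cumsum(np.cumsum(np.pad(x, r, mode="edge"), axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # leading zero row and column
    h, w = x.shape
    total = c[k:k + h, k:k + w] - c[:h, k:k + w] - c[k:k + h, :w] + c[:h, :w]
    return total / (k * k)

def guided_filter(I, p, r, eps):
    """Guided filter with guidance image I and filtering input p."""
    I = np.asarray(I, dtype=float)
    p = np.asarray(p, dtype=float)
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)             # step 1
    corr_I, corr_Ip = box_mean(I * I, r), box_mean(I * p, r)    # step 2
    var_I = corr_I - mean_I * mean_I                            # step 3
    cov_Ip = corr_Ip - mean_I * mean_p
    a = cov_Ip / (var_I + eps)                                  # step 4
    b = mean_p - a * mean_I
    mean_a, mean_b = box_mean(a, r), box_mean(b, r)             # step 5
    return mean_a * I + mean_b                                  # step 6
```

Because every step is a box mean or an element-wise operation, the cost is linear in the number of pixels and independent of r, which matches the O(N) behaviour described in Section 3.4.1. With a small `eps`, a sharp step edge in I passes through almost unchanged, while flat regions are averaged.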

3.4.4 Enriching a video object using guided filters

One of the major applications of the guided filter is image/video enhancement [28]. The following steps are executed to improve the quality of the video. A difference of Gaussians (DoG) is used to extract the details of the object because it can produce the detail description while suppressing noise. The DoG is created by combining two Gaussian filters with different variances, so the difference between the outputs of these filters gives the details of the data. After observation, the scale value is adjusted within the range 2 to 8, and users can set the value themselves in their applications. In Eq. (5), the input image and output image are represented by B and A, and the scale factor by s:

$A=\operatorname{GF}(B+s \cdot \operatorname{DoG}(B))$          (5)

This methodology uses images and videos from the Atmospheric Imaging Assembly (AIA), the part of the Solar Dynamics Observatory (SDO), launched in 2010, that observes under ultraviolet light. The results are computed by continuously observing the solar chromosphere and corona in ultraviolet channels, which yields high-resolution, high-dynamic-range images of these regions. They contain vital structural information that is very helpful in revealing astronomical events. The presented algorithmic steps help to enhance the structural details of the object so that its visual analysis can be carried out by future researchers. The guided filter can suppress disturbances such as Gaussian noise while preserving and smoothing edges. The two different Gaussian filters can flatten the noise and produce the final description of the object. Therefore, the algorithm suppresses disturbances, preserves edges and enriches the details of the object.
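Eq. (5) can be sketched as follows. Several choices here are assumptions rather than details from the source: the guided filter is applied with the image as its own guidance, the two Gaussian standard deviations (`sigma1`, `sigma2`) are hypothetical values, borders are edge-replicated, and DoG(B) is read as the difference of two Gaussian-blurred copies of B.

```python
import numpy as np

def box_mean(x, r):
    """Mean over a (2r+1) x (2r+1) window with edge-replicated borders."""
    k = 2 * r + 1
    c = np.cumsum(np.cumsum(np.pad(x, r, mode="edge"), axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))
    h, w = x.shape
    return (c[k:k + h, k:k + w] - c[:h, k:k + w] - c[k:k + h, :w] + c[:h, :w]) / (k * k)

def self_guided_filter(p, r=2, eps=1e-2):
    """Guided filter using p as both guidance and input (assumption)."""
    mean_p = box_mean(p, r)
    var_p = box_mean(p * p, r) - mean_p * mean_p
    a = var_p / (var_p + eps)
    b = mean_p - a * mean_p
    return box_mean(a, r) * p + box_mean(b, r)

def gaussian_blur(x, sigma):
    """Separable Gaussian blur with a kernel truncated at about 3 sigma."""
    radius = max(1, int(round(3 * sigma)))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-(t ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    xp = np.pad(x, radius, mode="edge")
    conv = lambda v: np.convolve(v, kernel, mode="valid")
    return np.apply_along_axis(conv, 0, np.apply_along_axis(conv, 1, xp))

def enhance(B, s=4.0, sigma1=1.0, sigma2=2.0):
    """A = GF(B + s * DoG(B)), with s in the 2 to 8 range given in the text."""
    B = np.asarray(B, dtype=float)
    dog = gaussian_blur(B, sigma1) - gaussian_blur(B, sigma2)
    return self_guided_filter(B + s * dog)
```

On a perfectly flat frame the DoG term vanishes and the output equals the input; on textured frames, larger values of `s` amplify the band-pass detail before the guided filter smooths the result.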

4. Proposed Methodology

The proposed approach is implemented on the original video 'engine'. After the motion frames are extracted, the watermark is embedded at an appropriate location determined using the wavelet transform. As shown in Figure 6, a guided filter is applied before retrieving the watermark, which results in smooth and preserved edges of the resultant video.

Figure 6. Block diagram

The proposed algorithm is a novel method for securing the intellectual property rights of a video object. The complete watermarked file is constructed by embedding the watermark in the motion frames and then placing these watermarked motion frames back in the same serial positions as the corresponding original video frames.

4.1 Watermark insertion algorithm

Step 1: Extract each RGB frame from the original video data, two consecutive frames at a time. Check whether the first frame is a motion frame. If it is not a 'Motion Frame', add it to the watermarked video file unchanged; otherwise, start the insertion procedure by converting the motion frame into its luminance and chrominance components, represented by Y and Cb, Cr, with the help of Eq. (6) [24]:


$C_{b}=0.596 R-0.275 G-0.321 B$       (6)


Step 2: A two-dimensional Haar wavelet transform (DWT) [117] is applied twice to the luminance component (Y), resulting in coarse-scale DWT coefficients in the LL sub-band and fine-scale coefficients in the LH, HL and HH sub-bands. Extract the high-frequency sub-band (HH), which is used to insert the watermark [29]. The low-frequency coefficients (LL) are not used because modifying them reduces the quality of the watermarked video.

Step 3: A DCT is applied to the extracted HH sub-band to generate the DCT image. The purpose of this transformation is to provide more robustness, as it can resist compression attacks.

Step 4: The DCT image (say 'A', of size M × N) is passed as input through a two-level discrete wavelet transform (DWT).

Step 5: Resize the watermark image (say W) to the size of the generated DCT image (A), and then apply a two-level DWT to the watermark as well.

Step 6: To regenerate the luminance component (Y), apply the inverse DWT two times followed by the inverse DCT.

Step 7: The reconstructed watermarked Y component is combined with the original chrominance components (Cb and Cr) to regenerate the YCbCr frame, which is then converted into the RGB watermarked frame.

4.2 Watermark extraction process

Step 1: Extract the consecutive RGB frames from the selected video object, without the presence of the watermark, acting as the input.

Step 2: Check whether the initial frame is a motion frame. If it is, execute the retrieval process by converting it into luminance and chrominance channels; otherwise, continue the motion frame extraction.

Step 3: A two-dimensional wavelet transform (DWT) is applied twice to the luminance component, resulting in coarse-scale DWT coefficients in the LL sub-band and fine-scale coefficients in the LH, HL and HH sub-bands. Extract the coefficients of the HH sub-band and apply the DCT to obtain the DCT image.

Step 4: The DCT image (say A, of size M × N) is passed as input through the discrete wavelet transform (DWT) two times.

Step 5: Apply guided filtering to the watermark image.

Step 6: Reconstruct the watermark object.

Step 7: Repeat Step 1 to Step 6 to extract the watermark images from all motion frames.

Step 8: To obtain the final averaged watermark image, sum all the extracted images and divide by the total number of motion frames.

5. Discussion

The experiments were executed on the color static video object 'engine', which consists of 210 video frames. The frame size of the selected video object is 364 × 264, as shown in Figure 7(a). An image of 'cameraman' of size 256 × 198, shown in Figure 7(b), is used as the watermark, i.e. the secret information inserted in the video object. The Stirmark benchmark was used to measure the robustness of the inserted watermark, and the peak signal-to-noise ratio (PSNR) was used to check the quality of the watermarked video frame.

Figure 7. Video frames

A higher PSNR value indicates better quality. Before the performance is evaluated with these parameters, a threshold value for validating the computations should be set. Just as PSNR measures perceptibility, the NC value represents the robustness of the watermark embedded in the object. This is depicted in Table 1, which compares watermarking techniques implemented using diverse methods and objects, with their performance evaluated by calculating PSNR and NC. The results indicate that our method is superior to the other methods with respect to several aspects of the video object. Finally, robustness is computed against different categories of attacks from the Stirmark benchmark [14] and compared with previous work in the area. For the record, the watermarked frames produced by the comparison methods had a PSNR of approximately 35 dB.

Table 1. Evaluation of robustness

| Name of attack | Type of attack |
|---|---|
| No attack | Without attack |
| Signal processing | Salt n Pepper noise |
| | Speckle noise |
| | Gaussian noise |
| Geometric attacks | |
| Frame attacks | Frame averaging |
| | Frame dropping |
| | Frame swapping |

5.1 Perceptibility

The performance evaluators PSNR and NCC are used to measure perceptibility and robustness, which are checked by executing several tests on the presented watermarking process. The major goal is the retrieval of the watermark from the watermarked video object after diverse categories of attacks have been applied.

Here, PSNR is the performance evaluator used to check the perceptibility of the video object and is calculated as:

$PSNR=20 \log _{10}\left(\frac{\max _{i}}{\sqrt{MSE}}\right)$         (7)

where $\max_{i}=\max (WF(i, j))$, with i and j ranging from 1 to M and 1 to N respectively, and MSE is the mean square error between the original frame and the watermarked frame. PSNR is expressed in dB. The watermarked frame and the watermark extracted from the Engine video are shown in Figure 8 (a) and 8 (b) respectively.

Figure 8. Watermark image
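Eq. (7) can be evaluated with a few lines of NumPy. This is a minimal sketch: the peak value defaults to the maximum of the watermarked frame, following the definition of max_i in the text, while the optional `peak` argument (for example 255 for 8-bit frames) is an added convenience and not part of the source.

```python
import numpy as np

def psnr(original, watermarked, peak=None):
    """PSNR in dB between an original frame and its watermarked version."""
    o = np.asarray(original, dtype=float)
    w = np.asarray(watermarked, dtype=float)
    mse = np.mean((o - w) ** 2)          # mean square error over all pixels
    if mse == 0:
        return float("inf")              # identical frames
    if peak is None:
        peak = w.max()                   # max_i = max(WF(i, j)) as in Eq. (7)
    return float(20.0 * np.log10(peak / np.sqrt(mse)))
```

For example, a frame of constant value 100 and a watermarked version of constant value 110 give an MSE of 100 and a peak of 110, so the PSNR is 20·log10(11) dB.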

5.2 Robustness evaluation

The robustness is computed with the comparison of retrieved watermark and the actual watermark to determine the affinity between them that is evaluated by:

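$NC=\frac{\sum_{i} \sum_{j} W_{i j} W_{i j}^{\prime}}{\sqrt{\sum_{i} \sum_{j}\left(W_{i j}\right)^{2}} \sqrt{\sum_{i} \sum_{j}\left(W_{i j}^{\prime}\right)^{2}}}$            (8)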

In Eq. (8), i and j index the rows and columns, respectively, of the original watermark W and the extracted watermark W'. The suggested technique achieves a high peak signal-to-noise ratio (PSNR) of 50.88 dB, where a PSNR above 30 dB is considered acceptable, and a normalized correlation coefficient (NCC) of 0.98 when no attack is applied. Several experiments have been conducted to measure the robustness of the proposed scheme. For this purpose, several attacks have been applied to the watermarked video.

Some of these attacks exploit the properties of video. The most notable property of video is its temporal nature, that is, the ordered arrangement of still images, also known as video frames. If a malicious user swaps a few adjacent frames, the perceptual quality of the watermarked video is not affected, but the information embedded in it is certainly corrupted. The second distinguishing feature of video is the large redundancy that exists between frames. Consequently, frame-level processing such as frame dropping, frame averaging, frame insertion, frame deletion and frame swapping becomes the primary source of attacks. The embedded watermark is disrupted by replacing watermarked frames with the original ones, or with the average of neighboring frames, without compromising the quality of the watermarked video.

Some of the attacks on watermarked video are inherited from image watermarking. These include geometric attacks, further categorized as rotation, scaling and cropping of the watermarked object. Another category consists of noise attacks such as Salt and Pepper noise, Gaussian noise, Speckle noise and Poisson noise. Although marginally rotating each watermarked frame preserves the quality of the watermarked video, it degrades the quality of the extracted watermark.

In view of this, the strength of the proposed watermarking system is assessed by executing the attacks explained above.
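The two quality measures used throughout this section can be computed in a few lines. The following is a minimal sketch in Python, not the authors' implementation. It assumes that watermarks and frames are given as equally sized 2-D lists of numbers, implements NC as written in Eq. (8), and uses the standard PSNR definition with an assumed peak value of 255 for 8-bit frames. The function names are illustrative only.

```python
import math

def normalized_correlation(w, w_ext):
    """NC of Eq. (8) between the original watermark w and the extracted one w_ext."""
    num = sum(a * b for ra, rb in zip(w, w_ext) for a, b in zip(ra, rb))
    energy_w = sum(a * a for row in w for a in row)
    energy_e = sum(b * b for row in w_ext for b in row)
    if energy_w == 0 or energy_e == 0:
        return 0.0  # NC is undefined for an all-zero watermark; 0 is an assumed convention
    return num / (math.sqrt(energy_w) * math.sqrt(energy_e))

def psnr(frame, marked, peak=255.0):
    """Standard PSNR in dB between an original frame and its watermarked version."""
    n = sum(len(row) for row in frame)
    mse = sum((a - b) ** 2
              for ra, rb in zip(frame, marked)
              for a, b in zip(ra, rb)) / n
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)
```

Under these definitions, an extraction that reproduces the watermark exactly gives NC = 1, and a watermarked frame is considered acceptable in this section when its PSNR exceeds 30 dB.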

5.2.1 Geometric attacks

A malicious user alters the coordinates of the embedded watermark by introducing rotation, cropping and scaling attacks in order to corrupt it. Video watermarking schemes must therefore always be tested to confirm that the watermark can still be extracted successfully. In the experiment, each watermarked frame is rotated anticlockwise by 0.1 to 0.6 degrees. It is observed that the watermarking process resists rotation of all watermarked video frames up to 0.3 degrees, with 70% of the watermark extracted. The perceptibility of the watermarked video is sustained up to 0.5 degrees, with a PSNR of more than 30 dB.

Figure 9a. Perceptibility against rotation attack

Figure 9b. Robustness against rotation attack

To balance quality and robustness, the scheme performs satisfactorily on both parameters up to 0.3 degrees, as shown in Figures 9a and 9b, respectively.

The other part of the experiment crops 60 to 160 columns of the watermarked video frames by substituting zeros in these columns, so that the width of the frames is not altered. This maintains the perceptibility of the watermarked video and makes the PSNR easy to calculate. Satisfactory quality measures are obtained when up to 65 columns of the watermarked frames are cropped, as shown in Figure 10a.
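The column-cropping attack can be sketched as follows. This is an illustrative Python fragment under stated assumptions: a frame is a 2-D list of pixel values, and the `start` parameter (defaulting to the left edge) is hypothetical, since the text does not state which columns are zeroed.

```python
def crop_columns(frame, n_cols, start=0):
    """Zero out n_cols columns beginning at index start; the frame width is unchanged."""
    width = len(frame[0])
    stop = min(start + n_cols, width)
    return [[0 if start <= c < stop else v for c, v in enumerate(row)]
            for row in frame]
```

Because the cropped frame keeps its original dimensions, it can be compared pixel by pixel with the unattacked frame when computing the PSNR.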

Figure 10a. Effect of cropping frame columns on perceptibility

As far as robustness is concerned, the watermark is successfully retrieved (70%) when around 130 columns are cropped, as shown in Figure 10b.

Figure 10b. Effect of cropping frame columns on robustness

5.2.2 Addition of noise and filtering attack

The embedded watermark may also be damaged by adding noise to the watermarked video. Hence, robustness against noise must also be evaluated. The available categories of noise are Speckle, Poisson, Gaussian and Salt and Pepper, which can be simulated to degrade the quality and robustness of the watermarked object. The insertion of even a small amount of noise of any type may affect the extracted watermark. The intensities for Speckle, Gaussian and Salt and Pepper noise are varied from 0.002 to 0.006 at the interval of 0.1. It is observed that the watermarking system can withstand intensities up to 0.003 for Speckle noise and 0.005 for Salt and Pepper noise, but the results are not adequate for the Gaussian noise attack because the distortion level is too high to maintain robustness. The effect of noise insertion on robustness and perceptibility is shown in Figures 11a and 11b, respectively.
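As an illustration of one of these noise attacks, the sketch below adds Salt and Pepper noise to a frame in Python. It is not the authors' code. It assumes that the stated intensity is the fraction of pixels corrupted (the noise density), which the text does not define explicitly, and that frames are 2-D lists of 8-bit values. Gaussian and Speckle noise would instead typically be parameterized by a variance.

```python
import random

def salt_and_pepper(frame, density, rng=None, low=0, high=255):
    """Corrupt roughly a fraction `density` of the pixels, half to `high`, half to `low`."""
    if rng is None:
        rng = random.Random()
    noisy = []
    for row in frame:
        new_row = []
        for value in row:
            if rng.random() < density:
                # Corrupted pixel: salt (high) or pepper (low) with equal probability
                new_row.append(high if rng.random() < 0.5 else low)
            else:
                new_row.append(value)
        noisy.append(new_row)
    return noisy
```

Passing a seeded `random.Random` instance makes the attack repeatable, so the same noisy frames can be reused when comparing NC and PSNR across intensities.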

Figure 11a. Effects of different kinds of noises on the robustness

Figure 11b. Effect of perceptibility after inserting various noises

5.2.3 Frame dropping attack

A malicious user exploits the redundant nature of video by selecting a few frames from the watermarked object and dropping them in one of two ways. In the first, the selected frames are removed from the video permanently. In the second, the selected frames are replaced with the corresponding original frames. In either case, a smart attacker preserves the perceptibility but aims to distort the watermark. To simulate this, 10 to 50 frames are dropped from all three watermarked videos. During the dropping process, some motion frames, such as the 32nd and 16th, are also dropped, which directly affects the NC value, so the strength of the algorithm drops sharply. Figure 12a shows that the presented approach offers fine robustness against frame dropping attacks with acceptable perceptibility of the watermarked video.
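The two dropping variants can be expressed compactly. The following Python sketch is illustrative only: it treats a video as a list of frames, and the choice of which indices to drop is left to the caller, since the text does not specify how the frames are selected.

```python
def drop_frames(watermarked, indices):
    """Variant 1: remove the selected frames permanently (the video gets shorter)."""
    dropped = set(indices)
    return [f for k, f in enumerate(watermarked) if k not in dropped]

def replace_with_original(watermarked, original, indices):
    """Variant 2: substitute the selected frames with the matching original frames."""
    replaced = set(indices)
    return [original[k] if k in replaced else f
            for k, f in enumerate(watermarked)]
```

If a dropped index coincides with a motion frame that carries the watermark, that part of the watermark is lost, which is consistent with the sharp NC drop reported for such frames.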

Figure 12a. Effects of frame dropping and insertion attack on the robustness

5.2.4 Frame insertion attack

This type of attack is unintentional in nature and very common in video watermarking systems. In most cases, it becomes necessary to insert commercial breaks at several places during video playback. There are two methods for inserting frames from another video into the watermarked video. In the first, the commercial clip replaces an equal number of frames of the video object, so that the length of the object is unchanged. In the second, the selected clip is inserted into the watermarked video without regard to the increase in its length. In both cases, a lot of watermarked video data is lost. Yet, an attacker maintains the quality of the object so that it does not deteriorate substantially. This attack is performed by inserting 10 to 50 frames into the watermarked video. It is confirmed that the proposed scheme achieves good robustness, as shown in Figure 12a, and acceptable perceptibility against frame insertion and frame replacement attacks, as shown in Figure 12b.
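A minimal Python sketch of the two insertion methods is given below. It is an interpretation of the description rather than the authors' procedure: a video is modeled as a list of frames, and the `position` and `keep_length` parameters are hypothetical names introduced for illustration.

```python
def insert_clip(watermarked, clip, position, keep_length=False):
    """Insert `clip` at `position`.

    keep_length=True : the clip overwrites an equal number of frames,
                       so the video length is unchanged (first method).
    keep_length=False: the clip is spliced in and the video grows (second method).
    """
    frames = list(watermarked)
    head = frames[:position]
    if keep_length:
        tail = frames[position + len(clip):]
        # Truncate in case the clip runs past the end of the video
        return (head + list(clip) + tail)[:len(frames)]
    return head + list(clip) + frames[position:]
```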

Figure 12b. Effects of frame insertion/deletion attacks on the perceptibility

5.2.5 Frame swapping attack

An active part of the watermarked video can also be destroyed by a frame swapping attack. It is performed by exchanging $F_W$ and $F_{W-1}$, where $F_W$ denotes the current watermarked frame and $F_{W-1}$ the immediately preceding watermarked frame. For this purpose, 10% to 60% of the frames are swapped in all three watermarked videos. 20% swapping means that every fifth watermarked frame of the watermarked video is swapped.
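The swapping rule can be sketched in Python as follows. The example follows the text's reading that 20% swapping corresponds to every fifth frame, so the attack is parameterized by a `step` (a hypothetical name) rather than a percentage. How the other percentages map to frame positions is not stated in the text and is left open here.

```python
def swap_every(frames, step):
    """Exchange every step-th frame with the frame just before it."""
    if step < 2:
        raise ValueError("step must be at least 2")
    out = list(frames)
    for k in range(step - 1, len(out), step):
        out[k - 1], out[k] = out[k], out[k - 1]
    return out
```

For a ten-frame sequence and `step=5`, the fifth and tenth frames are exchanged with their predecessors, and all other frames keep their positions.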

5.2.6 Frame averaging attack

Frame averaging is another critical category of attack on watermarked video objects. The structure of the watermark may be completely destroyed by taking the average of several frames, as given by the equation below:

$$F_W = \frac{F_{W-1} + F_W + F_{W+1}}{3} \qquad (9)$$

where $F_{W-1}$ is the immediately preceding frame, $F_W$ is the current frame and $F_{W+1}$ is the next frame. For this purpose, 10% to 60% of the frames are averaged in all three watermarked videos. 20% averaging means that every fifth watermarked frame of the watermarked video is averaged. The resulting robustness and perceptibility are illustrated in Figures 13a and 13b, respectively.
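Eq. (9) translates directly into code. The sketch below is a hedged Python illustration, assuming that frames are equally sized 2-D lists of numbers and that the attack is applied to every `step`-th frame (so `step=5` would correspond to the 20% case). The name `step` and the choice to average the unmodified neighbors are assumptions, not details taken from the text.

```python
def average_every(frames, step):
    """Replace every step-th frame by the mean of itself and its two neighbors (Eq. 9)."""
    if step < 2:
        raise ValueError("step must be at least 2")
    out = list(frames)
    # Stop before the last frame so that a next frame always exists
    for k in range(step - 1, len(frames) - 1, step):
        prev_f, cur_f, next_f = frames[k - 1], frames[k], frames[k + 1]
        out[k] = [[(a + b + c) / 3 for a, b, c in zip(ra, rb, rc)]
                  for ra, rb, rc in zip(prev_f, cur_f, next_f)]
    return out
```

Since neighboring frames are highly redundant, the averaged frame looks similar to the original one, while the watermark signal embedded in it is attenuated.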

The robustness of the presented approach is contrasted with the outcomes of methods previously reported by other researchers. The scheme proposed by Chetan and Raghavendra [30] is based on the DWT technique. Yassin et al. [31] described a method for embedding four different fragments of a watermark object into four disjoint PCA blocks generated from two sub-bands of the DWT. Table 2 shows that the robustness of the proposed process against several signal-processing attacks is higher than that of the algorithms reported by previous researchers.

Figure 13a. Effects of frame replacement, swapping and averaging attack on the robustness

Figure 13b. Effects of frame replacement, swapping and averaging attack on the perceptibility

Table 2. Comparison of signal-processing attacks

Columns: Name of attack, Reference. Attacks compared: Salt and Pepper noise, Gaussian noise, Speckle noise, Poisson noise.

Table 3. Comparison of geometric attacks

Columns: Name of attack, Reference. Attack compared: Lossy Compression.

To validate the obtained results, the results for the video ‘engine.avi’ are compared with those of recently published state-of-the-art schemes [20, 31-36] that show resilience under different attacks. Table 3 shows the results of the comparison for several geometric attacks. Lossy compression attacks discard the stationary parts of the video. In view of this, the suggested algorithm inserts the watermark only in the moving part of the video. Therefore, a key outcome of the proposed scheme is its robustness against different varieties of lossy compression.

Finally, the performance of the proposed video watermarking scheme against various video-specific attacks is compared with the results of the related video watermarking schemes given in [17, 30-32]. Table 4 displays the results of this comparison under frame averaging and frame dropping attacks. It is clear that the proposed scheme achieves better robustness.

Table 4. Comparison of video-specific attacks


Ref. No.

In some places, values in Tables 2-4 are not given and are marked with a dash because the referred studies did not test their approaches against those attacks.

6. Conclusions

The presented approach has been tested by evaluating the robustness of the video object against diverse categories of attacks, including various noise attacks, geometric attacks and compression-related attacks. The method handles the video object in two levels. The first level detects and extracts the motion frames. The second level executes the watermarking process on those selected frames using the wavelet transform combined with an explicit frame filter, the guided filter. The frames of the video object are afterwards passed to the guided filter, which preserves and smooths each frame after the watermark has been extracted from it. The results indicate that the guided filter enhances the quality of the watermarked object. The approach has been applied to the selected motion object, and the visual quality of the resulting embedded video is high, so that there is only a very minor difference between the actual video object and the embedded video. The proposed algorithm is considered suitable for private video watermarking applications, owing to the non-availability of the actual object and watermark image during retrieval of the watermark. As future work, it is suggested to evaluate the strength of the watermarked object against complex attacks, including ambiguity and collusion attacks, and to design a blind watermarking scheme that does not require the video object and watermark image.


Acknowledgment

This work is supported by Chitkara University, Rajpura, Punjab and Sri Guru Gobind Singh College, Chandigarh. The authors are thankful to the anonymous researchers for their contribution to the subjective assessment of the method. The authors are also thankful to the reviewers for their constructive suggestions and comments which have improved the quality of the paper considerably.



Nomenclature

W    Original watermark
W'   Extracted watermark
i    Pixel element of ith row
j    Pixel element of jth column


References

[1] Doerr, G., Dugelay, J.L. (2003). A guide tour of video watermarking. Signal Processing: Image Communication, 18(4): 263-282.

[2] Ahuja, R., Bedi, S.S. (2016). Compressed domain based review on digital video watermarking techniques. Information Technology of Linear Networks and Systems, Wadsworth, Belmont, 123-135.

[3] Ahuja, R. (2019). Design of digital video watermarking technique based on motion frames. Journal of Computational and Theoretical Nanoscience, 16(10): 4328-4338.

[4] Bayoudh, I., Ben Jabra, S., Zagrouba, E. (2018). Online multi-sprites based video watermarking robust to collusion and transcoding attacks for emerging applications. Multimedia Tools and Applications, 77: 14361-14379.

[5] Masoumi, M., Rezaei, M., Hamza, A.B. (2015). A blind spatio-temporal data hiding for video ownership verification in frequency domain. AEU-International Journal of Electronics and Communications, 69(12): 1868-1879.

[6] Tokar, T., Kanocz, T., Levicky, D. (2009). Digital watermarking of uncompressed video in spatial domain. In 2009 19th International Conference Radioelektronika, pp. 319-322.

[7] Preda, R.O., Vizireanu, N. (2011). New robust watermarking scheme for video copyright protection in the spatial domain. UPB Sci Bull, 73(1): 93-104.

[8] Venugopala, P.S., Sarojadevi, H., Chiplunkar, N.N., Bhat, V. (2014). Video watermarking by adjusting the pixel values and using scene change detection. In 2014 Fifth International Conference on Signal and Image Processing, pp. 259-264.

[9] Bahrami, Z., Akhlaghian Tab, F. (2018). A new robust video watermarking algorithm based on SURF features and block classification. Multimedia Tools and Applications, 77: 327-345.

[10] Li, X., Wang, X., Yang, W., Wang, X. (2016). A robust video watermarking scheme to scalable recompression and transcoding. In 2016 6th International Conference on Electronics Information and Emergency Communication (ICEIEC), Beijing, China, pp. 257-260.

[11] Liu, G.Q., Zheng, X.S., Zhao, Y.L., Li, N. (2010). A robust digital video watermark algorithm based on DCT domain. In Proceedings of the International Conference on Computer Application and System Modeling, Taiyuan, China, pp. V2-202.

[12] Thanh, T.M., Hiep, P.T., Tam, T.M., Tanaka, K. (2014). Robust semi-blind video watermarking based on frame-patch matching. AEU-International Journal of Electronics and Communications, 68(10): 1007-1015.

[13] Nguyen, T.T., Nguyen, D.D. (2015). A robust blind video watermarking in DCT domain using even-odd quantization technique. In 2015 International Conference on Advanced Technologies for Communications (ATC), Ho Chi Minh City, Vietnam, pp. 439-444.

[14] Preda, R.O., Vizireanu, D.N. (2010). A robust digital watermarking scheme for video copyright protection in the wavelet domain. Measurement, 43(10): 1720-1726.

[15] Preda, R.O., Vizireanu, D.N. (2011). Robust wavelet-based video watermarking scheme for copyright protection using the human visual system. Journal of Electronic Imaging, 20(1): 013022.

[16] Gupta, G., Gupta, V.K., Chandra, M. (2018). An efficient video watermarking based security model. Microsystem Technologies, 24: 2539-2548.

[17] Agarwal, H., Husain, F. (2021). Development of payload capacity enhanced robust video watermarking scheme based on symmetry of circle using lifting wavelet transform and SURF. Journal of Information Security and Applications, 59: 102846.

[18] Kerbiche, A., Jabra, S.B., Zagrouba, E., Charvillat, V. (2017). Robust video watermarking approach based on crowdsourcing and hybrid insertion. In 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1-8.

[19] Himeur, Y., Boukabou, A. (2018). A robust and secure key-frames based video watermarking system using chaotic encryption. Multimedia Tools and Applications, 77: 8603-8627.

[20] Sharma, C., Amandeep, B., Sobti, R., Lohani, T.K., Shabaz, M. (2021). A secured frame selection based video watermarking technique to address quality loss of data: combining graph based transform, singular valued decomposition, and hyperchaotic encryption. Security and Communication Networks, 2021: 1-19.

[21] Kerbiche, A., Jabra, S.B., Zagrouba, E., Charvillat, V. (2018). A robust video watermarking based on feature regions and crowdsourcing. Multimedia Tools and Applications, 77: 26769-26791.

[22] Gaj, S., Rathore, A.K., Sur, A., Bora, P.K. (2017). A robust watermarking scheme against frame blending and projection attacks. Multimedia Tools and Applications, 76: 20755-20779.

[23] Joshi, A.M., Gupta, S., Girdhar, M., Agarwal, P., Sarker, R. (2017). Combined DWT–DCT-based video watermarking algorithm using Arnold transform technique. In Proceedings of the International Conference on Data Engineering and Communication Technology: ICDECT 2016, pp. 455-463.

[24] Kunhu, A., Nisi, K., Sabnam, S., Majida, A., Saeed, A.M. (2016). Index mapping based hybrid DWT-DCT watermarking technique for copyright protection of videos files. In 2016 Online International Conference on Green Engineering and Technologies (IC-GET), pp. 1-6.

[25] Jiang, D., Kim, J. (2015). A spread spectrum zero video watermarking scheme based on dual transform domains and log-polar transformation. International Journal of Multimedia and Ubiquitous Engineering, 10(4): 367-378.

[26] Adul, V., Mwangi, E. (2017). A robust video watermarking approach based on a hybrid SVD/DWT technique. In 2017 IEEE AFRICON, pp. 309-313.

[27] Panda, J., Garg, P. (2016). An efficient video watermarking approach using scene change detection. In 2016 1st India International Conference on Information Processing (IICIP), pp. 1-5.

[28] He, K., Sun, J., Tang, X. (2012). Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(6): 1397-1409.

[29] Rathore, A.K., Kumar, P. (2016). Block based video watermarking scheme using wavelet transform and principle component analysis. International Research Journal of Engineering and Technology (IRJET), 3(6): 1140-1144.

[30] Chetan, K.R., Raghavendra, K. (2010). DWT based blind digital video watermarking scheme for video authentication. International Journal of Computer Applications, 4(10): 19-26.

[31] Yassin, N.I., Salem, N.M., El Adawy, M.I. (2012). Block based video watermarking scheme using wavelet transform and principle component analysis. International Journal of Computer Science Issues (IJCSI), 9(1): 296.

[32] Wang, Y.G., Lu, Z.M., Fan, L., Zheng, Y. (2009). Robust dual watermarking algorithm for AVS video. Signal Processing: Image Communication, 24(4): 333-344.

[33] Nouioua, I., Amardjia, N., Belilita, S. (2018). A novel blind and robust video watermarking technique in fast motion frames based on SVD and MR-SVD. Security and Communication Networks, 2018: 1-17.

[34] Singh, R., Mittal, H., Pal, R. (2022). Optimal keyframe selection-based lossless video-watermarking technique using IGSA in LWT domain for copyright protection. Complex & Intelligent Systems, 8(2): 1047-1070.

[35] Rupa, C., MidhunChakkarvarthy, D., Patan, R., Prakash, A.B., Pradeep, G.G. (2022). Knowledge engineering–based DApp using blockchain technology for protract medical certificates privacy. IET Communications, 16(15): 1853-1864.

[36] Kishore, D.R., Syeda, N., Suneetha, D., Kumari, C.S., Ghantasala, G.P. (2021). Multi scale image fusion through Laplacian Pyramid and deep learning on thermal images. Annals of the Romanian Society for Cell Biology, 3728-3734.