A Coverless Information Hiding Algorithm Based on Grayscale Gradient Co-occurrence Matrix

ABSTRACT In this paper, a coverless information hiding algorithm is introduced, in which the grayscale gradient co-occurrence matrix is used to encode images and the mapping relationship between the images and random numbers is used to express the payload information. The algorithm has three steps. First, the grayscale gradient co-occurrence matrix of the cover image is calculated, and a descriptor is derived from it. Second, the descriptor is quantized into a binary sequence to construct a mapping relationship between the cover image and binary random numbers. Finally, the binary secret information sequence is divided into segments, and the images corresponding to those segments are selected from the image database according to the mapping relationship. Moreover, the secret information is encoded by a Turbo encoder to improve security. The experimental results show that the proposed algorithm tolerates JPEG compression and low-pass filter attacks well. The algorithm can be applied to remote sensing satellites and has practical value for high-security covert communication.


INTRODUCTION
Nowadays, satellite technology enables long-range communication over a large coverage area. Strengthening the security of satellite communication systems is of great significance to confidential communication at the national defense and military levels. At present, the secure transmission of important data on satellites mainly relies on encryption, which directly converts the original signal into meaningless ciphertext through encryption keys and encryption functions. However, ciphertext exposes the existence of the secret communication, which may attract the attention of adversaries. If the communication itself cannot be detected, its security is greatly improved. Owing to the increasing demand for communication security, information hiding technology has been taking shape. Unlike cryptography, information hiding uses multimedia information as a host to conceal digital signals.
In this paper, we introduce a coverless information hiding algorithm based on coding the feature descriptor of remote sensing images. First, the grayscale-gradient co-occurrence matrix (GGCM) of an image is calculated by using a modified extraction algorithm. Then the GGCM is processed and characterized to establish an index relationship between images and random binary sequences without modifying the original image. Last, a mapping library of many-to-one correspondences between carrier images and secret messages is built to support image-based coverless information hiding. In addition, the secret message is encoded with a Turbo encoder to improve security.
The rest of this paper is organized as follows. Section 2 describes the related work. Section 3 presents the proposed method in detail. Section 4 shows the experimental results and analysis. The last section concludes the paper.

RELATED WORK
Cryptography conceals the content of messages, while steganography aims at hiding the existence of messages. According to the way of hiding, the existing image steganography methods can be classified into two categories: spatial domain based and transform domain based. One of the most common spatial domain methods is to replace the least significant bits (LSB) with secret data [1]. Reference [2] proposed a data hiding method based on pixel value difference (PVD), least significant bit (LSB) substitution, PVD shift, and modification of prediction error (MPE). Steganography in the transform domain modifies the statistical features of the cover image to achieve data hiding [3]. Many transform domain methods have been proposed to hide data in the DCT (Discrete Cosine Transform) domain [4], the DFT (Discrete Fourier Transform) domain [5], and the DWT (Discrete Wavelet Transform) domain [6]. Reference [7] proposed a novel multi-watermarking scheme in the DWT domain based on hybrid multi-bit multiplicative rules controlled by secret keys. Reference [8] proposed a method that combines signal processing, cryptography and steganography to enhance the security of secret information. Reference [9] proposed a novel method that joins traditional data hiding with a compression scheme to enhance information hiding. In addition, some researchers have applied steganography in the field of satellite communication, embedding low-speed data into the images transferred by the high-speed transmission system [10,11].
Steganography, in general, is the process of embedding the secret information into the cover media, including images, texts, audio, video and so on, which can be described as follows [12]: In this way, an image is chosen to be a cover image, and the secret information is embedded into this image by a specific embedding algorithm. This inevitably changes the original image itself and leaves some traces of modification. As a result, steganalysis algorithms based on statistical analysis can detect the existence of secret information. Reference [13] proposed a steganalysis method aimed at HUGO steganography, which can detect the stego-images reliably and extract the embedded message as well. Reference [14] proposed a general steganalysis feature selection method based on decision rough set α-positive region reduction, improving the efficiency of the steganalysis algorithm. Overall, steganalysis is a great threat to these traditional steganography methods.
To avoid modifying the original image, the zero-steganography proposed in [12] establishes a relationship between the cover image and the secret data by using the logistic map to generate a chaotic matrix, and derives the stego-keys from the cover image, the chaotic matrix and the secret data. Reference [15] proposed a zero-watermarking-like steganography that generates an information extraction file, which is indispensable for the receiver to recover the secret message. Reference [16] proposed a zero-steganography algorithm based on chaotic sequences and the DCT, using the DC components of the DCT coefficients of the cover image to represent the secret message, together with a related document.
A mathematical model for basic zero-steganography, presented in the following equation, constructs a functional relationship F between the cover image and the secret data to obtain the corresponding stego-key.
$$F(\text{cover image}, \text{secret data}) = \text{stego key} \qquad (2)$$
Zero-steganography indeed has good imperceptibility; however, it produces some auxiliary data, such as stego-keys, extraction files or other related documents, which must be transmitted through another channel. Moreover, there is a security risk, because attacks or cryptanalysis by adversaries on the auxiliary data cannot be avoided. In terms of satellite data communication, additional secure channels are occupied to transmit the auxiliary data, resulting in a waste of resources.
Rather than the traditional information hiding method that embeds the secret data into a specified image, the coverless information hiding method without other carriers was proposed by Zhou et al. [17]. In this method, a cover image can be driven by the secret information and yield an encryption vector, so that the secret message is transmitted without any modification. Besides, the process of the coverless information hiding method does not generate any auxiliary information, which saves the signal channel resources.
From the above analysis, there are three advantages of the coverless information hiding method [18]: no embedding, no additional message, and anti-steganalysis. The main idea of the coverless information hiding method is to construct the mapping relationship between the information contained in the image itself and the secret information. The mapping relationship M between the cover image and secret data is shown as follows: Reference [18] established the correspondence between the Chinese vocabulary and cover image by using the coverless information hiding method. Reference [19] proposed the IEBVER (image-entropy-based visual expression of random) algorithm, which uses the entropy of each image block and quantifies the eigenvalues of the entropy matrix to express the random numbers. Reference [20] utilizes the MSIM (molecular structure images of material) to achieve coverless information hiding. Reference [21] proposed a coverless image steganography based on DCT and LDA (Latent Dirichlet Allocation). Reference [22] introduced a dynamic content selection framework (DCSF) for coverless information hiding. Reference [23] proposed a coverless steganography method by leveraging a generative model. Besides, the progress in the technology of data sharing and cloud computing [24][25][26] has indeed facilitated the coverless information hiding.

PROPOSED METHOD
The framework of the proposed method is illustrated in Figure 1. A complete image database is constructed, in which the grayscale gradient co-occurrence matrix (GGCM) of each image is employed to carry out the image feature coding. At the sending end, the secret information is processed through Turbo coding, increasing the security of the covert information.
To obtain the images corresponding to the binary sequences, the sender searches the database and then sends the selected images to the receiver through the open channel. Similarly, at the receiving end, the information expressed by each cover image is extracted and then decoded by the Turbo decoder to recover the secret information.

Establishment of Image Database Using GGCM
An image has many features, such as brightness, colour, entropy and so on. A binary sequence can be expressed by coding and quantizing those features. The GGCM, namely the grayscale gradient co-occurrence matrix, has proved to be a powerful approach for image texture analysis: it describes the joint distribution of gray level and gradient over the points of an image, and shows the spatial relationship between each point and its neighboring points. Let h(i, j) be an element of the GGCM; its value is defined as the total number of pixels that have gray value i in the normalized gray image F(x, y) and gradient value j in the normalized gradient image G(x, y).
In this paper, we calculate the GGCM of an image as follows:
(1). Extract the grayscale matrix F(x, y) of the image.
(2). Extract the gradient matrix G(x, y) of the image by using the improved Sobel operator [27]. In this way, the outermost edge of the gray matrix, which carries less information, is deliberately omitted, because the size of the gradient matrix along each dimension is one less than that of the grayscale matrix.
The GGCM can then be calculated from these two matrices. To give an explicit illustration in the 3-D plot, the gradient level (X-axis) is set to 64 and the gray level (Y-axis) is set to 256. For the standard test image, the three-dimensional plot of the GGCM is shown in Figure 3.
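The counting step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the standard 3 × 3 Sobel operator stands in for the improved operator of [27], the gradient is normalized linearly to 64 levels, indices are 0-based, and the names `sobel_magnitude` and `ggcm` are hypothetical.

```python
def sobel_magnitude(gray):
    """Gradient magnitude for the interior pixels of a 2-D list of gray values.

    The standard Sobel kernels are an assumption; the paper uses an improved
    operator [27] that is not reproduced here. The outermost border is skipped.
    """
    rows, cols = len(gray), len(gray[0])
    grad = []
    for y in range(1, rows - 1):
        line = []
        for x in range(1, cols - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            line.append((gx * gx + gy * gy) ** 0.5)
        grad.append(line)
    return grad


def ggcm(gray, gray_levels=256, grad_levels=64):
    """Return h where h[i][j] counts pixels with gray value i and gradient level j."""
    grad = sobel_magnitude(gray)
    g_max = max(max(line) for line in grad) or 1.0  # avoid dividing by zero on flat images
    h = [[0] * grad_levels for _ in range(gray_levels)]
    for y, line in enumerate(grad):
        for x, g in enumerate(line):
            i = gray[y + 1][x + 1]  # gray value of the matching interior pixel (0..255)
            j = min(grad_levels - 1, int(g * (grad_levels - 1) / g_max))
            h[i][j] += 1
    return h
```

Because every interior pixel contributes exactly one count, the entries of `h` sum to the number of interior pixels, which is a quick sanity check on any implementation.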
To describe the GGCM effectively, a "feature descriptor" is introduced so that the matrix can be represented as a binary sequence. First, the 3-D plot is projected onto the Y-axis to obtain H1, as shown in Figure 4 (a). Second, the part of H1 on $[t_1, t_2]$ is extracted to obtain H2, where $t_1 = 2$ and $t_2 = 63$; in this way, the projection lines corresponding to the highest gradient value (64) and the lowest gradient value (1) are excluded, as shown in Figure 4 (b). Third, the average value at each abscissa in the projection drawing is calculated to obtain H3, as shown in Figure 4 (c). Finally, H3 is truncated to the interval $[t_3, t_4]$, where $t_3 = 1$ and $t_4 = 255$, and divided into 17 segments; the average of all elements in each segment gives the vector H4, which is the low-distortion feature descriptor.
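The descriptor steps can be sketched as follows. This is one possible reading of the text, not a confirmed implementation: it assumes the GGCM is stored as `h[gray][gradient]`, that "average value of each abscissa" means averaging over the retained gradient columns for each gray level, and that the 17 segments have equal length (255 / 17 = 15 gray levels each). The function name `feature_descriptor` is hypothetical.

```python
def feature_descriptor(h, t1=2, t2=63, t3=1, t4=255, segments=17):
    """Reduce a 256 x 64 GGCM to a 17-element vector H4.

    t1..t4 are 1-based, as in the text; Python slices convert them to 0-based.
    """
    # H2 and H3: drop the lowest and highest gradient levels, then average
    # the remaining columns for every gray level.
    h3 = [sum(row[t1 - 1:t2]) / (t2 - t1 + 1) for row in h]
    # Keep gray levels t3..t4 and average each equal-length segment.
    window = h3[t3 - 1:t4]
    size = len(window) // segments
    return [sum(window[k * size:(k + 1) * size]) / size for k in range(segments)]
```

Averaging over 15 neighbouring gray levels is what makes the descriptor tolerant to small pixel-level changes: a mild attack moves counts between adjacent bins far more often than between segments.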
To generate the binary sequence $\{a_1, a_2, \ldots, a_n\}$, the quantization criteria are as follows: According to this equation, each image denotes a 16-bit binary sequence. To make full use of the large number of remote sensing images from the satellite, each image is coded into a binary sequence and then stored in a subpool that represents that single binary sequence, building a mapping library of many-to-one correspondences between carrier images and secret messages. A complete image database is constructed when every binary sequence is represented by at least one image.
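The quantization rule itself is not reproduced in this text, so the sketch below uses a stand-in: since 17 descriptor values yield 16 bits, it assumes each bit compares two adjacent elements of H4. That rule, and the names `quantize` and `build_subpools`, are assumptions for illustration only; the grouping into subpools follows the description above.

```python
from collections import defaultdict


def quantize(h4):
    """Map a 17-element descriptor to a 16-character bit string.

    ASSUMED rule (the paper's equation is missing here): bit k is 1 when
    H4[k] > H4[k + 1], otherwise 0.
    """
    return "".join("1" if a > b else "0" for a, b in zip(h4, h4[1:]))


def build_subpools(descriptors):
    """Group image identifiers by their 16-bit code (many images to one code)."""
    pools = defaultdict(list)
    for image_id, h4 in descriptors.items():
        pools[quantize(h4)].append(image_id)
    return pools
```

A dictionary keyed by the bit string gives constant-time lookup of a subpool when the sender later searches for an image that expresses a given segment.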

Design for QPP Interleaver in Turbo Code
The Turbo code, which can achieve near-Shannon-limit performance through an iterative decoding process, is an impressive forward error correction technique. It is widely used in mobile, deep space and satellite communications. The Turbo encoder is formed by two RSC (recursive systematic convolutional) encoders in parallel concatenation, joined by a random QPP (quadratic permutation polynomial) interleaver, along with a puncturing mechanism. A typical Turbo encoder block diagram is shown in Figure 5. The interleaver rearranges the order of the input symbols, which converts a series of consecutive burst errors into discrete random errors in the message. Hence, solving for the coefficients of the polynomial is the crux of the matter. Some values of the coefficients for the QPP can be calculated through computer-aided search by using the theory in [28]. Table 1 gives the appropriate values of the coefficients and the length k of the interleaver.
The operating parameters in this paper are as follows: the interleaving mode for QPP is pseudo-random; the generator matrix is [7,5] for RSC1 and [15,17] for RSC2; the puncture matrix is $P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$; the code rate is 0.5; the decoding mode is Log-MAP; the number of iterations is 5.
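As an illustration of how a QPP interleaver works, the sketch below uses the standard form $\pi(i) = (f_1 i + f_2 i^2) \bmod k$. The coefficients in the usage note (k = 40, f1 = 3, f2 = 10) are taken from the LTE Turbo code interleaver table as a well-known valid example; they are not claimed to be the values in Table 1, and the function names are hypothetical.

```python
def qpp_permutation(k, f1, f2):
    """Return the QPP index map pi(i) = (f1*i + f2*i^2) mod k for i = 0..k-1."""
    return [(f1 * i + f2 * i * i) % k for i in range(k)]


def interleave(symbols, perm):
    """Output position n carries the input symbol at index perm[n]."""
    return [symbols[p] for p in perm]


def deinterleave(symbols, perm):
    """Invert interleave(): put each received symbol back at its source index."""
    out = [None] * len(symbols)
    for position, source in enumerate(perm):
        out[source] = symbols[position]
    return out
```

For example, `qpp_permutation(40, 3, 10)` starts with the indices 0, 13, 6, 19, so neighbouring input symbols end up far apart after interleaving. Not every coefficient pair yields a valid permutation, which is why the coefficients have to be searched for rather than chosen freely.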

Process of Sending
Step 1: The secret information is first converted into a secret binary sequence M. The length of the secret binary sequence is L ($L \le 2^{16}$). Then M is divided into 8-bit segments, namely $M = \{M_1, M_2, \cdots, M_i, \cdots, M_n\}$. If the last segment is shorter than 8 bits, zeros are appended to the end of this segment to reach 8 bits.
Step 2: Each segment is coded by the Turbo encoder. As a result, the length of each segment becomes 16 bits because the code rate is 0.5.
Step 3: Randomly select an image from the subpool corresponding to the 16-bit segment according to the mapping relationship.
Step 4: Repeat step 3 until all of the segments are expressed.
Step 5: Select an image whose binary sequence equals the binary form of L to represent the length of M. Then, append it to the end of the selected images.
Step 6: Transmit all images consecutively to the receiving end.
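The sending steps can be sketched as below. The Turbo encoder is replaced by a toy rate-1/2 stand-in that simply repeats each bit, purely so the example runs; it is not a Turbo code. The names `split_and_pad`, `toy_rate_half` and `select_images`, and the subpool dictionary layout, are assumptions. Note that a 16-bit field can hold lengths only up to $2^{16} - 1$, so the sketch rejects longer messages.

```python
import random


def split_and_pad(bits, size=8):
    """Step 1: cut the bit string into 8-bit segments, zero-padding the last one."""
    segments = [bits[i:i + size] for i in range(0, len(bits), size)]
    if segments and len(segments[-1]) < size:
        segments[-1] = segments[-1].ljust(size, "0")
    return segments


def toy_rate_half(segment):
    """Stand-in for the Turbo encoder (NOT a Turbo code): repeat every bit once."""
    return "".join(bit * 2 for bit in segment)


def select_images(bits, encode, subpools, rng=random):
    """Steps 2-5: encode each segment, pick one image per 16-bit code,
    then append an image whose code is the message length L."""
    if len(bits) >= 2 ** 16:
        raise ValueError("length does not fit in a 16-bit code")
    codes = [encode(segment) for segment in split_and_pad(bits)]
    codes.append(format(len(bits), "016b"))
    return [rng.choice(subpools[code]) for code in codes]
```

Choosing at random within a subpool means that the same message can be carried by different image sequences on different transmissions.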

Process of Receiving
At the receiving end, all images are received one by one. First, the 16-bit binary sequence is extracted from each image by using the algorithm described in Section 3.1. Second, the binary sequence of the last image is converted into a decimal number L, which is the length of the binary secret message. Third, each 16-bit sequence is decoded into an 8-bit binary sequence. Then all 8-bit binary sequences are joined. Finally, any zeroes that were appended to the end of the original sequence are deleted; the number of padded zeroes is (8 − L mod 8) mod 8, so only the first L bits are kept.
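A minimal sketch of the receiving side follows, assuming the 16-bit codes have already been extracted from the images. The decoder here is a toy inverse of a bit-repetition code and only stands in for the Turbo decoder; `toy_decode`, `padding_length` and `recover_message` are hypothetical names.

```python
def toy_decode(code):
    """Stand-in for the Turbo decoder: keep every second bit of a 16-bit code."""
    return code[::2]


def padding_length(length, size=8):
    """Number of zeros that were appended to fill the last segment."""
    return (size - length % size) % size


def recover_message(codes, decode):
    """The last code carries the length L; the others carry 8 payload bits each."""
    *payload, length_code = codes
    length = int(length_code, 2)
    bits = "".join(decode(code) for code in payload)
    return bits[:length]  # truncating to L bits removes the padding
```

For a 10-bit message the last segment holds 2 payload bits and 6 padded zeros, so `padding_length(10)` is 6, and truncating the joined sequence to L bits removes exactly those zeros.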

Robustness to Common Attack
Robustness is an important measure of a steganography algorithm, representing its ability to resist attacks by an adversary. During transmission, the algorithm may fail under common attacks such as JPEG compression, noise, low-pass filtering and common image processing operations. We verify the robustness of the algorithm through the four experiments below.
Bit Error Rate (BER) is introduced to measure the robustness of the algorithm in the communication process. If the binary vector of the original image is $P = \{p_1, p_2, \cdots, p_n\}$, and $Q = \{q_1, q_2, \cdots, q_n\}$ represents the binary vector of the image after being attacked, then the BER is defined as follows: The BER value is the average of the results for 1000 images.
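For reference, the usual definition of BER is the fraction of positions at which P and Q differ. The sketch below assumes that definition and represents the binary vectors as strings; the function names are hypothetical.

```python
def bit_error_rate(p, q):
    """Fraction of positions where the two equal-length bit strings differ."""
    if len(p) != len(q):
        raise ValueError("bit vectors must have the same length")
    return sum(a != b for a, b in zip(p, q)) / len(p)


def mean_bit_error_rate(pairs):
    """Average BER over (original, attacked) pairs, e.g. one pair per image."""
    rates = [bit_error_rate(p, q) for p, q in pairs]
    return sum(rates) / len(rates)
```

With 16-bit codes, a single flipped bit in one image gives a BER of 1/16 for that image.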

JPEG Compress
The compression standard for continuous-tone still images (ISO 10918-1), known as JPEG, is the most popular and basic compression format. JPEG is a lossy compression method: it discards information to which human vision is insensitive at certain frequencies, and it is commonly applied to digital images before transmission between mobile devices [24]. As such, the cover image is likely to be damaged during transmission if it undergoes this kind of compression. In this test, the robustness of the proposed algorithm against the JPEG compression attack is measured by the bit error rate. The quality factor (Q) of JPEG compression varies from 1 to 100 with an increment of 5, where Q = 1 represents the highest compression rate and Q = 100 represents the lowest compression rate. As shown in Figure 6, the abscissa represents the compression quality while the ordinate represents the bit error rate. Table 2 provides the comparison among the CBD (chaos based DCT steganography) algorithm [29], the CBZS (chaos based zero-steganography) algorithm [12], the CSD (chaotic sequences and image DCT) algorithm [16], the CIHRIH (coverless information hiding based on robust image hashing) algorithm [30] and the proposed algorithm. The results show that the proposed algorithm achieves the smallest BER at the same compression quality.

Noise Attack
Salt & pepper noise, also known as double pulse noise, is caused by the strength of the signal pulse. The position of the noise is random, but its amplitude is fixed. This noise includes salt noise, which is high-grayscale noise, and pepper noise, which is low-grayscale noise. Generally, the two types of noise appear at the same time. The proposed algorithm is analysed under salt & pepper noise whose density ranges from 0 to 0.1 with an increment of 0.01. Figure 7 shows the error rate curve of the algorithm under salt & pepper noise, where the X-axis represents the noise density and the Y-axis represents the BER. Table 3 provides the comparison of the BER among the CIHWE (coverless information hiding without embedding) algorithm [17], the CIHRIH (coverless information hiding based on robust image hashing) algorithm [30] and the proposed algorithm when attacked by salt & pepper noise. The results show that the proposed algorithm achieves the smallest BER at the same noise density.
Additive white Gaussian noise (AWGN) is a basic interference model in which white noise is linearly added to the signal. The amplitude of AWGN obeys the Gaussian distribution, while its power spectral density is uniform. The mean μ and the variance σ² are the two parameters of AWGN. In our test, the mean μ is 0 and the variance σ² varies from 0 to 0.1 with a step of 0.01. Figure 8 shows the BER curve of the proposed algorithm under the AWGN attack.
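A salt & pepper attack of a given density can be simulated with a sketch like the one below. It assumes 8-bit gray values stored as a list of rows and an equal split between salt (255) and pepper (0); the function name `salt_and_pepper` and the `rng` parameter are hypothetical.

```python
import random


def salt_and_pepper(gray, density, rng=None):
    """Return a copy of the image in which each pixel is replaced, with
    probability `density`, by 255 (salt) or 0 (pepper) with equal chance."""
    rng = rng or random.Random()
    noisy = [row[:] for row in gray]  # leave the input image untouched
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < density:
                row[x] = 255 if rng.random() < 0.5 else 0
    return noisy
```

Passing a seeded `random.Random` makes a robustness experiment repeatable, which is helpful when BER values are averaged over many images.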
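The AWGN attack can be sketched in the same style. The variance range of 0 to 0.1 suggests that intensities are normalized to [0, 1] before the noise is added; that normalization is an assumption here, as are the function name `add_awgn` and the clipping to the 8-bit range.

```python
import random


def add_awgn(gray, variance, mean=0.0, rng=None):
    """Add Gaussian noise to an 8-bit image on an ASSUMED [0, 1] intensity scale."""
    rng = rng or random.Random()
    sigma = variance ** 0.5
    noisy = []
    for row in gray:
        line = []
        for value in row:
            scaled = value / 255 + rng.gauss(mean, sigma)
            line.append(min(255, max(0, round(scaled * 255))))  # clip to 0..255
        noisy.append(line)
    return noisy
```

On this scale a variance of 0.1 corresponds to a standard deviation of about 0.32, or roughly 80 gray levels, so the upper end of the tested range is a severe attack.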

Low-pass Filter
The low-pass filter removes the high-frequency components of the image. In other words, the filter blurs and smooths the image, and weakens the visible changes at the edges of objects. Some conventional steganography methods embed secret information into the high-frequency region because human vision is insensitive to it, but those methods cannot resist the low-pass filter attack. In our tests, the cover image is attacked by the mean filter and the Gaussian low-pass filter.
The mean filter has sizes of 3 × 3, 5 × 5, 7 × 7 and 9 × 9, where 3 × 3 is the weakest filtering and 9 × 9 is the strongest. The result is shown in Figure 9, in which the X-axis represents the size of the filter and the Y-axis represents the BER. Table 4 shows the BER of the CBZS (chaos based zero steganography) algorithm [12], the SCZS (satellite communication zero steganography) algorithm [31] and the CSD (chaotic sequences and image DCT) algorithm [16] when attacked by the mean filter.
Figure 10: Gaussian low-pass filter attack
The Gaussian low-pass filter has two parameters, namely size and Gaussian variance. In our test, the size is the default and the Gaussian variance varies from 0 to 1 with a step of 0.1. As shown in Figure 10, the proposed method is robust to attacks from the Gaussian low-pass filter. Table 5 provides the comparison among the CBD (chaos based DCT steganography) algorithm [29], the SCZS (satellite communication zero steganography) algorithm [31] and the CSD (chaotic sequences and image DCT) algorithm [16].
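The mean-filter attack described above can be sketched as follows. Border handling is an assumption: the window is clipped at the image edge and the average is taken over the pixels that remain. The function name `mean_filter` is hypothetical.

```python
def mean_filter(gray, size=3):
    """Replace each pixel by the rounded average of its size x size neighbourhood."""
    radius = size // 2
    rows, cols = len(gray), len(gray[0])
    smoothed = []
    for y in range(rows):
        line = []
        for x in range(cols):
            window = [gray[yy][xx]
                      for yy in range(max(0, y - radius), min(rows, y + radius + 1))
                      for xx in range(max(0, x - radius), min(cols, x + radius + 1))]
            line.append(round(sum(window) / len(window)))
        smoothed.append(line)
    return smoothed
```

A larger window averages over more pixels and therefore removes more detail, which matches the ordering from 3 × 3 (weakest) to 9 × 9 (strongest).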

Common Image Processing Operation
It is necessary to carry out a robustness analysis for the case in which images are attacked by common image processing operations, which can negatively affect the result of information hiding. Some kinds of common image processing attacks, which are applied to the cover images, are described below. In this test, we compare the RCIS (robust coverless image steganography based on DCT and LDA classification) algorithm [21] with the proposed algorithm. Tables 6 and 7 show the BER of the two algorithms under rotation and scaling attacks, respectively.

Security Analysis
One of the most important properties of any steganography algorithm is imperceptibility, which can be measured by the PSNR (peak signal-to-noise ratio). For the proposed algorithm, the stego image and the cover image are exactly the same; therefore, the stego image has the highest possible PSNR because the cover image is not modified. As a result, existing steganalysis techniques [13,14] fail to detect the secret information. During the covert communication, no auxiliary information is generated, which reduces the suspicion of a latent adversary.
Security is compromised when the covert communication is monitored or broken. An attacker who wants to acquire the secret information has to detect the existence of the covert communication and read the hidden message. According to the analysis above, it is impossible for others to detect the use of steganography or to attack a stego-key, because there is no additional information. Therefore, the proposed algorithm offers strong imperceptibility and security. In addition, this algorithm can be applied to CG (computer graphics) and PG (photographic) images [25].
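The PSNR argument can be made concrete with the standard definition $\mathrm{PSNR} = 10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$. The sketch below assumes 8-bit images stored as lists of rows; the function name `psnr` is hypothetical.

```python
import math


def psnr(original, modified, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized gray images."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(original, modified)
               for a, b in zip(row_a, row_b)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(peak ** 2 / mse)
```

Since a coverless scheme transmits the cover image unchanged, the mean squared error is zero and the PSNR is unbounded, whereas any embedding-based method yields a finite value.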

Capacity Analysis
For an n-bit binary encoding of all images in the database, there must be at least $2^n$ images to cover all binary sequences. If the encoding is too long, an image library that satisfies this condition is hard to establish; conversely, if the encoding is too short, the image library is small and too simple to ensure the secure transmission of covert communication.
The capacity of coverless information hiding is clearly lower than that of traditional image steganography, which remains a difficult problem to be solved. Table 8 provides the comparison among Zhou's method [17], Yuan's method [32], Zheng's method [30] and the proposed method in this paper.
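The arithmetic behind these capacity statements can be sketched as follows, assuming (as in the sending process) that each image carries 8 message bits after rate-0.5 coding and that one extra image carries the length. The function names are hypothetical.

```python
import math


def min_database_size(code_bits):
    """At least one image is needed for every possible code of this length."""
    return 2 ** code_bits


def images_per_message(length_bits, payload_bits_per_image=8):
    """Images needed for a message: one per 8-bit segment plus the length image."""
    return math.ceil(length_bits / payload_bits_per_image) + 1
```

For the 16-bit codes used here, `min_database_size(16)` is 65536 images, and a 10-bit message needs `images_per_message(10)` = 3 images, which illustrates why capacity is the main cost of the coverless approach.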

CONCLUSION
The method proposed in this paper targets high-security satellite communication and improves the quality of confidential data transmission. Compared with conventional steganography and zero-steganography, the proposed algorithm makes no changes to the cover image and generates no auxiliary data in the process of information hiding. The algorithm, which has been tested in MATLAB 2016b, resists common image attacks well. Therefore, the proposed algorithm offers good robustness and excellent resistance to steganalysis. Where communication links are closely monitored, this method has strong practical value for covert satellite communication.