Evolutionary algorithms deceive humans and machines at image classification: An extended proof of concept on two scenarios

The range of applications of Neural Networks encompasses image classification. However, Neural Networks are vulnerable to attacks, and may misclassify adversarial images, leading to potentially disastrous consequences. Pursuing some of our previous work, we provide an extended proof of concept of a black-box, targeted, non-parametric attack using evolutionary algorithms to fool both Neural Networks and humans at the task of image classification. Our feasibility study is performed on VGG-16 trained on CIFAR-10. For any category c A of CIFAR-10, one chooses an image A classified by VGG-16 as belonging to c A . From there, two scenarios are addressed. In the first scenario, a target category c t ≠ c A is fixed a priori. We construct an evolutionary algorithm that evolves A to a modified image that VGG-16 classifies as belonging to c t . In the second scenario, we construct another evolutionary algorithm that evolves A to a modified image that VGG-16 is unable to classify. In both scenarios, the obtained adversarial images remain so close to the original one that a human would likely classify them as still belonging to c A .


Introduction
Fast and accurate image classification has become an important topic in a series of industrial (automation, robots, self-driving cars, etc.), security and military (monitoring, face recognition, etc.) domains, just to mention a few [12]. Neural Network (NN) approaches to image classification have outperformed traditional image processing techniques, becoming today's most effective tool for these tasks [29]. Concretely, a NN is trained on a large set of examples. During this process, the NN is given both the input image and the output category it is expected to associate with the input image. Based on this accumulated "knowledge", the NN can then classify new images into categories with high confidence.
However, the process leading to the classification by a NN still seems to differ significantly from the way humans perform the same task. It is therefore tempting to exploit this difference and cause the NN to make classification errors. Indeed, trying to fool NNs has become an intensive research topic [19,18,11], since the vulnerabilities of NNs used for critical applications can have disastrous consequences, and lead to harmful decisions. It is hence important to better understand these weaknesses.
The present work is a contribution to this understanding. It provides an effective method to construct adversarial images without prior knowledge of the features of the NN. The proof of concept of this method is achieved by a test against a concrete NN. More concretely, our work further substantiates our research program [3] to construct Evolutionary Algorithms (EA) which create images to deceive machines and humans at image classification. The present article, extending significantly [4], describes the construction of two such EAs (that actually differ only by their fitness function, and are variants of those used in [2]) in the following context. Trained on an image dataset, a given NN sorts images into categories. For any category c A , one chooses an image A classified by the NN as belonging to c A . From there, two scenarios (see [3] for more) are considered. In the first "target" scenario, a target category c t ≠ c A is fixed a priori. We construct the evolutionary algorithm EA target d , which evolves A to a modified image that the NN classifies as belonging to c t . In the second "flat" scenario, we construct the evolutionary algorithm EA flat d , which evolves A to a modified image that the NN is unable to classify with certainty into any of the categories. In both scenarios, the EAs are used with two different similarity measures d (L 2 and SSIM), which assess differently the proximity between the created adversarial images and the image A. In each case, a human would likely classify the obtained adversarial images as still belonging to c A .
The strategy being set, experiments are performed with the neural network VGG-16 [25] trained on the image dataset CIFAR-10 [14] to sort images into 10 categories. The outcome is that the algorithms EA target d and EA flat d do create adversarial images that VGG-16 misclassifies in the former case, and that VGG-16 cannot classify with certainty into any category in the latter case. The evolved images remain similar to the original ones. Our results give substantial reasons to believe that humans would not make the same misclassification as VGG-16, and would hence label the modified images as belonging to the original category, both for the "target" scenario and for the "flat" scenario.
The remainder of this article is organised as follows. Section 2 summarizes the necessary background about NNs and EAs, and positions our evolutionary algorithm approach as a black-box, targeted, non-parametric attack. Section 3 details the two scenarios considered, and the strategy to address them. In particular, it details the generic fitness functions used in EA target d and in EA flat d .

Neural Networks, Evolutionary Algorithms and typologies of attacks

Neural Networks and Evolutionary Algorithms in a nutshell
We give here the very minimum background about Neural Networks (NNs) and Evolutionary Algorithms (EAs) needed to position our contribution (section 3) in the context of some related work (2.2).
A Neural Network can be considered here as a black box (see [10] for a detailed description of NNs in the context of Deep Learning), which takes an image p as input and produces the corresponding output, consisting of a vector o p whose length equals the number of categories defined in the dataset. Each position i in the output vector represents a category c i of objects. Since the vector components are probabilities, one can interpret o p [i] as the confidence with which the NN assigns p to c i . A NN is first trained on images of the dataset. Once trained, the NN produces a label for an image p by extracting the category for which the probability is highest, i.e. the category c k with k = argmax i o p [i].
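To make this classification step concrete, the following is a minimal sketch (not necessarily our implementation) of how a trained Keras model's output vector is turned into a label; the model object and the list of category names are placeholders.

```python
# Minimal sketch: turning a NN output vector into a label.
# `model` is any trained Keras classifier and `categories` a list of category names;
# both are placeholders for illustration.
import numpy as np

def classify(model, image, categories):
    # The model expects a batch, so add a leading batch dimension.
    o_p = model.predict(image[np.newaxis, ...])[0]  # output vector: one probability per category
    k = int(np.argmax(o_p))                         # index of the highest probability
    return categories[k], float(o_p[k])             # predicted label and its confidence
```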
Evolutionary Algorithms search for solutions to optimisation problems by imitating natural evolution (see [26, sec. 3] for a quick introduction; for a general introduction to evolutionary algorithms, one can consult [13] and [35]). Whereas NNs are trained on a set of examples, EAs are heuristic and search for solutions in the entire image space. Each generation of the algorithm presents a new, generally improved population g i+1 , compared to the previous one g i . An EA mainly consists of a loop in which 1) an initial population of individuals is created or chosen; 2) each member of the population is evaluated through a fitness function; 3) a new population is created from the existing one, taking into account each individual's fitness.
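As a reference for the rest of the paper, here is a minimal, generic sketch of such a loop; the three callbacks are placeholders, not our implementation.

```python
# Generic EA loop, sketched for illustration only; the three callbacks
# (population creation, fitness, generation update) are placeholders.
import numpy as np

def evolve(init_population, fitness, next_generation, n_generations):
    population = init_population()                                   # 1) initial population
    for g in range(n_generations):
        scores = np.array([fitness(ind, g) for ind in population])   # 2) evaluation
        population = next_generation(population, scores)             # 3) selection, mutation, cross-over
    scores = np.array([fitness(ind, n_generations) for ind in population])
    return population[int(np.argmax(scores))]                        # fittest individual found
```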

Context and typologies of attacks
Consider a sample x, which is classified correctly by M , a machine learning model (MLM), as M (x) = c true (for instance the classification of an image by a NN). To fool this MLM, one can create an adversarial sample x′ (this concept was introduced by Szegedy et al. [28]) as the result of a tiny (and, ideally, unnoticed) perturbation of the original sample x, such that M misclassifies it, i.e. M (x′) ≠ c true . Let us classify these attacks here according to two criteria: white-box vs black-box attacks and targeted vs non-targeted attacks.
White-box and black-box attacks depend on the amount of information about the MLM available to the adversaries. In white-box attacks, the adversaries have full access to the MLM. For instance, if the MLM is a NN, they know the training algorithm, the training set and the parameters of the NN. The adversaries analyse the model constraints and identify the most vulnerable feature space, altering the input accordingly [21,28]. In black-box attacks, the adversaries have no knowledge about the MLM (the NN in our context) and they instead analyse the vulnerabilities by using information about the past input/output pairs. More specifically, the adversaries attack a model by inputting a series of adversarial samples and observing corresponding outputs.
Previous attempts [23,7] to deceive NNs dealt primarily with untargeted attacks. In these, the task is to take some input x, which is correctly labeled as c true by a NN, and to modify it towards an x′ with distance d(x, x′) < ε, which is labeled as a different class than c true . For example, a small change imperceptible to humans misleads the network into classifying an image of a tabby cat as guacamole [24]. A targeted attack, while usually harder to perform, is able to produce adversarial samples for a specific target class. Given classes c a and c b , with an input x classified as belonging to c a , a targeted attack produces a similar x′ , which the NN classifies as belonging to c b .

Evolutionary Algorithms leading to black-box, targeted, non-parametric attacks
Within the category of black-box attacks, which are attempted in this paper, there are multiple methods to fool a NN. Most of these approaches, however, still rely on the model and its parameters [30,22]. Countering them might be achieved merely by training a NN to recognise the product of these methods. By contrast, EAs are extremely flexible and adaptive. Our contribution positions EAs as an alternative approach that has the advantage of being non-parametric (see [34], [16] for other non-parametric models for adversarial attacks). Our non-parametric model presents no bias, unless of course such a bias is programmed into the EA.
For EAs, among the most important aspects are the definition of their target and of the surrounding landscape, both done through the fitness function. The fitness function also reveals the main difference between our paper and previous work on adversarial attacks using evolutionary algorithms. While Su et al. [26] use an EA to produce misclassifications by changing only one pixel per input image, our method does not limit the number of pixels which can be modified. Limiting the number of modified pixels leads to higher perturbation values for a few pixels, which become very different from their surroundings. Our focus, however, is on making the modifications less visible to humans, and hence the goal is to limit the overall value change, rather than the pixel count.
The following section describes our black-box, targeted, non-parametric attack. More precisely, it provides the construction of our evolutionary algorithms, which create adversarial images that aim to fool NNs and human beings.

Scenarios, Strategy and Implementation
Our algorithm is an EA that aims to deceive both NNs and human beings. Clearly, the design of the EA depends on the scenario governing this dual deception. The two scenarios addressed here (see [3] for others) start the same way. One is initially given an image, the "ancestor" A, labelled by the NN as belonging to c A .
The first scenario is the "target" scenario. A target category c t ≠ c A is chosen. The task of the EA is to evolve A to a new image D (a "descendant") that the NN classifies into c t , but in such a way that the evolved adversarial image D remains very similar to the ancestor A. With perturbations kept as unnoticeable as possible, a human being should still consider D as obviously belonging to category c A .
In the second "flat" scenario, the task of the EA is to evolve A into a descendant D, that the NN is unable to classify with certainty to any specific category, in the sense that the NN ranges D to all categories with the same plausibility modulo a tiny and controlled margin. The same constraint of similarity between D and A as in the first scenario remains: a human being should still consider D as belonging to c A .  [2]. While their fitness functions differ, and hence so does the evaluation step, the population initialization and the evolution steps are similar for EA target d and for EA flat d .
Population initialisation. A population size being fixed, the initial population is set to a number of copies of the ancestor A equal to the chosen population size.
The Evaluation step consists in running the fitness function on all population individuals to measure how well each of them is approaching the goal of the evolution. Even if the fitness functions of EA target d and of EA flat d differ, they are conceptually similar, and the evolution aims at maximising their values. In both cases, as shown in sections 3.3 and 3.4, the fitness function is the sum of two components: one deals with deceiving machines (and differs according to the "target" or "flat" scenario), the other with deceiving humans. This latter aspect is addressed in section 3.2.
Evolution encompasses multiple steps:

- Segregation. After evaluation, the scores are used to segregate the population into three classes:
  • the elite consists of the top ten individuals, which pass unchanged to the next generation;
  • the "didn't make it" class consists of the lower-scored half of the population, which is discarded. It is replaced by the same number of mutated individuals from the elite and middle class;
  • the middle class contains the remaining individuals.
- Mutations. Two types of mutation are considered, namely small and large scale ones (a sketch of both operators and of the cross-over is given after this list):
  • For pixel mutations, a power law is used to randomly select the number of pixels to be mutated. By following a power law, this number is often small, encouraging exploitation. However, larger values also occur, encouraging exploration and offering ergodicity properties. Once this number is selected, the pixels are randomly chosen and modified by a random ±1, in order to maintain a high similarity between the images.
  • For circle intensifying mutations, the intensifying factor is chosen with a normal law centred on 1, with a standard deviation decreasing from 0.6 to 0.1 as the generation number increases. The radius and location of the circle are chosen uniformly at random.
Individuals in the elite are not mutated. The members replacing the "didn't make it" group are all mutated, while half of the middle class members are mutated.
- Cross-overs occur after the mutation step. Two children are created simply by swapping a randomly selected rectangular area between two parents. The number of parent pairs, as well as the individuals forming them, are selected randomly. After the crossover, the parents are discarded and replaced by the children. This step is applied to all but the elite class.
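Below is an illustrative sketch of the two mutation operators and of the rectangular cross-over on 32 × 32 × 3 images; the numerical choices (power-law exponent, circle radius range, standard-deviation schedule) are assumptions made for illustration, not our exact settings.

```python
# Illustrative sketch of the mutation and cross-over operators; the numerical
# choices below are assumptions, not the exact settings used in our experiments.
import numpy as np

rng = np.random.default_rng()

def pixel_mutation(img):
    out = img.copy()
    n = min(int(rng.pareto(2.0)) + 1, out.size)        # power law: usually few pixels, sometimes many
    idx = rng.choice(out.size, size=n, replace=False)  # pixels (and channels) chosen at random
    out.flat[idx] = np.clip(out.flat[idx].astype(np.int16) + rng.choice([-1, 1], size=n), 0, 255)
    return out

def circle_mutation(img, generation, max_generations):
    out = img.astype(np.float32)
    h, w = out.shape[:2]
    sigma = 0.6 - 0.5 * min(generation / max_generations, 1.0)  # std dev decreasing from 0.6 to 0.1
    factor = rng.normal(1.0, sigma)                             # intensifying factor centred on 1
    cy, cx = rng.integers(h), rng.integers(w)                   # circle location, uniform at random
    r = rng.integers(1, max(h, w) // 2)                         # circle radius, uniform at random
    yy, xx = np.ogrid[:h, :w]
    out[(yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2] *= factor    # intensify pixels inside the circle
    return np.clip(out, 0, 255).astype(img.dtype)

def crossover(parent_a, parent_b):
    h, w = parent_a.shape[:2]
    y0, y1 = sorted(rng.integers(0, h, size=2))
    x0, x1 = sorted(rng.integers(0, w, size=2))
    child_a, child_b = parent_a.copy(), parent_b.copy()
    child_a[y0:y1, x0:x1] = parent_b[y0:y1, x0:x1]              # swap a random rectangular area
    child_b[y0:y1, x0:x1] = parent_a[y0:y1, x0:x1]
    return child_a, child_b
```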
As in [2], an equivalence is made between the population of the EA and a batch of the NN so that the NN can process the EA population in parallel, as a single batch, using a GPGPU.
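With Keras, this batched evaluation can be sketched as follows; the division by 255 is an assumed preprocessing step, not necessarily the one expected by a given pre-trained model.

```python
# Evaluating the whole EA population as one NN batch on the GPU.
# The /255 normalisation is an assumed preprocessing step.
import numpy as np

def population_outputs(model, population):
    batch = np.stack(population).astype(np.float32) / 255.0   # population of images -> one batch
    return model.predict(batch, batch_size=len(population))   # one output vector per individual
```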

Image similarity
The difference between two images (of the same size) i and i′ can be evaluated in many ways, but only some of them give a hint at the similarity between two images as a human being would perceive it. We explore here two of them, namely d = L 2 and d = SSIM , which assess proximity in different ways. The former belongs to the family of L k norms acting on vector spaces. In the present context, the L k norms assess the performed modifications pixel by pixel. Based on a series of experiments, we found no convincing advantage in using larger k's than k = 2. On the other hand, it is useful to consider an alternative measure, like SSIM, that assesses modifications performed on more structural components of a picture, rather than pixel by pixel.
- The L 2 -distance calculates the difference between the initial and modified pixel values:

$$ L_2(i, i') = \sqrt{\sum_{j} \left( i[p_j] - i'[p_j] \right)^2 }, $$

where p j is the pixel of the image in j th position, and 0 ≤ i[p j ] ≤ 255 is the corresponding pixel value of the image i. A minimisation of the L 2 -norm in the fitness function would lead to a minimisation of the overall value change in the images' pixels.

- The structural similarity (SSIM [33]) method attempts to quantify the perceived change in the structural information of the image, rather than simply the perceived change. The Structural Similarity Index compares pairs of sliding windows (sub-samples of the images) W x and W y of size N × N:

$$ SSIM(W_x, W_y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}. $$

The quantities µ x and µ y are the mean pixel intensities of W x and W y , σ 2 x and σ 2 y the variances of intensities of W x and W y , and σ xy their covariance. The purpose of c 1 and c 2 is to ensure that the denominator remains far enough from 0 when both µ 2 x and µ 2 y and/or both σ 2 x and σ 2 y are small. The number N W of window pairs to consider equals the number of pixels (times the number of colour channels if appropriate) of the picture cropped by a frame that prevents the windows from "getting out" of the picture. With pictures of size h × w × c and windows of size N × N, one gets:

$$ N_W = (h - N + 1)(w - N + 1)\, c. $$

The SSIM value for two images i and i′ is the mean of the values obtained for the N W window pairs (i k , i′ k ):

$$ SSIM(i, i') = \frac{1}{N_W} \sum_{k=1}^{N_W} SSIM(i_k, i'_k). $$

Unlike the L 2 -norm, the SSIM value ranges from −1 to 1, where 1 indicates perfect similarity.
Whether for d = L 2 or d = SSIM , the EA might consider it preferable to slightly modify many pixels, as opposed to changing fewer pixels in a more apparent way. This contrasts with the approach of Su et al. [26], where only one pixel is modified, possibly being assigned a very different colour that can make it stand out.
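For completeness, here is a minimal sketch of both measures, using NumPy and Scikit-image (the libraries used in our implementation); the channel-handling argument is an assumption that depends on the installed Scikit-image version.

```python
# L2-distance and SSIM between two images of identical shape.
# structural_similarity is Scikit-image's SSIM; recent versions take channel_axis=-1
# for colour images (older ones used multichannel=True), which is assumed here.
import numpy as np
from skimage.metrics import structural_similarity

def l2_distance(img_a, img_b):
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))

def ssim_value(img_a, img_b):
    return float(structural_similarity(img_a, img_b, channel_axis=-1))
```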

The fitness function of EA target d
The fitness function performing the evaluation in the "target" scenario combines the two following factors. On the one hand, the evolution is directed towards increasing the probability with which the images are classified as belonging to c t . On the other hand, similarity between the evolved and ancestor images is highly encouraged. Our fitness function therefore depends on the type of similarity measure that is used.
If using the L 2 -norm, it can be written as

$$ fit^{target}_{L_2}(ind, g_i) = A_{L_2}(g_i)\, o_{ind}[c_t] - B_{L_2}(g_i)\, L_2(ind, A), $$

where ind designates a given individual (an image), g i the i th generation, and o ind [c t ] designates the value assigned by the NN to ind in the target category c t . The quantities A L2 (g i ), B L2 (g i ) > 0 are coefficients to weight the two members and balance them. They vary with the generations dealt with by the EA, since we chose to assign different priorities to the different generations. The first generations were thus assigned the task of evolving the image to the target category, while the later generations focused on increasing the similarity to the ancestor, while remaining in c t .
In the case of structural similarity, a higher value translates into a higher fitness of the individual. Therefore, mutatis mutandis, the difference in the fitness function becomes a sum:

$$ fit^{target}_{SSIM}(ind, g_i) = A_{SSIM}(g_i)\, o_{ind}[c_t] + B_{SSIM}(g_i)\, SSIM(ind, A). $$

The fitness function of EA flat d

In the "flat" scenario, the fitness function also combines two factors. On the one hand, the evolution is directed towards a "flat" classification of the image over all the categories. The measure D flat of the "flatness" of a classification is defined by equation (7) as a distance between the NN output vector for ind and a vector flat whose components all equal the inverse of the number of categories (0.1 in the CIFAR-10 case).
A larger value of D flat (ind) means that ind is further away from the desired "flatness". Similarity between the evolved and ancestor images is highly encouraged. For d = L 2 , our fitness function is given by

$$ fit^{flat}_{L_2}(ind, g_i) = -A^{flat}_{L_2}(g_i)\, D_{flat}(ind) - B^{flat}_{L_2}(g_i)\, L_2(ind, A), $$

and for d = SSIM by

$$ fit^{flat}_{SSIM}(ind, g_i) = -A^{flat}_{SSIM}(g_i)\, D_{flat}(ind) + B^{flat}_{SSIM}(g_i)\, SSIM(ind, A). $$
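A minimal sketch of both fitness functions follows, assuming the reconstructed forms above; the weights A and B, the similarity callable d, and the concrete form of D flat are placeholders, not the values used in our experiments.

```python
# Sketch of the two fitness functions, assuming the forms given above.
# A and B are placeholder weights; the L2 norm used for D_flat is an assumption
# about equation (7), not necessarily its exact definition.
import numpy as np

N_CATEGORIES = 10
FLAT = np.full(N_CATEGORIES, 1.0 / N_CATEGORIES)      # the "flat" target vector (all 0.1)

def fitness_target(o_ind, c_t, ind, ancestor, A, B, d, ssim_like):
    # d(ind, ancestor) is the chosen similarity measure (L2-distance or SSIM).
    machine_term = A * o_ind[c_t]                      # push the NN towards the target category
    human_term = B * d(ind, ancestor)                  # keep the image close to the ancestor
    return machine_term + human_term if ssim_like else machine_term - human_term

def fitness_flat(o_ind, ind, ancestor, A, B, d, ssim_like):
    d_flat = float(np.linalg.norm(o_ind - FLAT))       # assumed form of D_flat (equation (7))
    human_term = B * d(ind, ancestor)
    return -A * d_flat + (human_term if ssim_like else -human_term)
```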
Dataset, Neural Network Architecture and Parameters of the two EAs

VGG-16 trained on CIFAR-10
The feasibility study regarding the two scenarios is tested against one concrete example: CIFAR-10 and VGG-16. The dataset CIFAR-10 [14] contains 50,000 training images and 10,000 test images of size 32 × 32 × 3. The 10 categories of CIFAR-10 (see Table 1) are sorted into two groups: 6 categories (c 3 , c 4 , c 5 , c 6 , c 7 and c 8 ) form the group of animals, and 4 categories (c 1 , c 2 , c 9 and c 10 ) the group of objects.

Table 1. CIFAR-10.-For 1 ≤ i ≤ 10, the 2 nd row specifies the category ci of CIFAR-10. The 3 rd row specifies the numbering of the image belonging to ci, taken from the test set of CIFAR-10, and used as ancestor in our experiments. These images are pictured on the diagonal in Figures 10, 11, or on the first row of Figure 12 in the Appendix.

VGG-16 [25] is a convolutional neural network (CNN) that passes input images through 16 layers to produce a classification output. As shown in Figure 1, the model consists of 5 groups of convolution layers and 1 group of fully-connected layers. Each convolution filter has a kernel size of 3 × 3 and a stride of 1. Meanwhile, pooling is applied on regions of size 2 × 2, with no overlap. Since VGG-16 was initially designed for the ImageNet dataset [6], a series of adjustments were necessary for its use with the CIFAR-10 dataset [9]. The CNN used here was therefore an adapted VGG-16 architecture obtained through the steps described in [17]. Specifically, VGG-16's input size was adjusted to 32 × 32 × 3, Batch Normalization layers were added before every nonlinearity, and the first two fully-connected layers were reduced in size from 4096 to 100. Moreover, dropout was added to all 6 groups of the network with the following rates: 0.3 for the first 3 convolution groups, 0.4 for the fourth group, 0.5 for the fifth group and 0.5 for the fully-connected layers.
We made use of this adjusted VGG-16 pre-trained on CIFAR-10 with a validation accuracy of 93.56% [9]. The same pre-trained model was used throughout the evolutionary algorithms EA target d and EA flat d .
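As an illustration, the following sketch loads CIFAR-10 and such a pre-trained model with Keras and checks its test accuracy; the weights file name is hypothetical and the preprocessing may differ from the one used in [9].

```python
# Hypothetical sketch: loading CIFAR-10 and a pre-trained adapted VGG-16 with Keras.
# "vgg16_cifar10.h5" is a placeholder file name; the /255 preprocessing is an assumption.
import numpy as np
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import load_model

(_, _), (x_test, y_test) = cifar10.load_data()            # 10,000 test images of size 32x32x3
model = load_model("vgg16_cifar10.h5")                    # placeholder path to the pre-trained model

probs = model.predict(x_test.astype(np.float32) / 255.0, batch_size=256)
accuracy = float(np.mean(np.argmax(probs, axis=1) == y_test.flatten()))
print(f"test accuracy: {accuracy:.4f}")                   # should be close to the reported 93.56%
```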

EA Parameters
The implementation of EA target d and of EA flat d was done from scratch, using Python 3.7 with the NumPy [20] library. Keras [5] was used to load and run the VGG-16 [25] model, and Scikit-image [32] to compute SSIM (using Scikit-image's default options). We ran our experiments on nodes with Nvidia Tesla V100 GPGPUs of the Iris HPC cluster at the University of Luxembourg [31].
Both EAs run with a population of size 160, the ancestor images being chosen from the CIFAR-10 [14] test set. For any source category out of the 10 categories of CIFAR-10, a random image was selected from the 1000 test images belonging to that category. This image was then set as the ancestor for both EAs. The specific ancestor images used in our experiments are referred to in Table 1's last row, and pictured in the appendix section A.
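The following sketch shows one way to select such a random ancestor from the CIFAR-10 test set; the seeding is arbitrary, and the specific ancestors actually used are those listed in Table 1.

```python
# Picking a random ancestor image of a given category from the CIFAR-10 test set.
# The RNG seed is arbitrary; the actual ancestors used are those listed in Table 1.
import numpy as np
from tensorflow.keras.datasets import cifar10

def random_ancestor(category_index, seed=0):
    rng = np.random.default_rng(seed)
    (_, _), (x_test, y_test) = cifar10.load_data()
    candidates = np.where(y_test.flatten() == category_index)[0]   # 1000 test images per category
    return x_test[rng.choice(candidates)]
```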
In the "flat" scenario, A flat d (g i ) and B flat d (g i ) take constant values, independent of g i . In the "target" scenario, the values of A target d (g i ) and of B target d (g i ) vary, depending on the generation, although they do so in a different way.
"Target" scenario: the target category was selected from all labels, excluding the source category. This led to 90 (source, target) couples of categories al- together (10 source categories and 9 different target categories for each source category), and therefore to 90 adversarial images. For any g i , we set B target d (g i ) = 10 − log 10 (d(ind,A)) for d = L 2 or d = SSIM (in this latter case, one assumes that SSIM (ind, A) > 0, meaning that ind and A are close enough). The value of The goal for the target class probability was set to 0.95, meaning the algorithm would stop when the fittest evolved image reached this value. To illustrate, Figure 2 shows an original image in the category "dog" and evolved images classified by VGG-16 as the target category "horse" with probabilities 0.5, 0.9, and 0.95. They were created by EA target  Figure 3 shows the graph of the source "dog" and target "horse" class probabilities obtained along the evolution referred to in Figure 2. The paths of the two probabilities appear to be the inverse of each other, with their sum remaining almost constant at a value of about 1.0 throughout the evolution process. This suggests that the increase of the target class probability and the decrease of the source class probability take place at the same pace.   Table 2.

Results
Depending on the chosen source and target categories, reaching the probability of 0.95 required between 5 and 124 generations for the 90 evolutions altogether, as specified in the Appendix (in Figure 10 with L 2 and Figure 11 with SSIM). The average computation time per generation is 0.03 ± 0.01s for L 2 and 0.15 ± 0.02s for SSIM. Organizing the 10 categories of the CIFAR-10 dataset into the group of animal categories and the group of object categories, the animal→animal and object→object evolutions took fewer generations than the animal→object and object→animal ones. The required number of generations varied significantly not only with the category of an ancestor, but also with the particular image extracted from a given category. Table 3 presents the mean number of generations for the 4 combination types over the 90 evolutions of Figure 10 and of Figure 11 in the Appendix. When one compares the adversarial images created by EA target L2 and by EA target SSIM , neither appears in general to be clearly better than the other for the human eye (see Figures 2 and 4 for example, as well as Figures 10 and 11 in the Appendix). Both EAs were able to produce misclassification with high confidence, while keeping the adversarial image very close to the original. In some image areas, the similarity to the original is higher with L 2 , while in other image areas it is higher with SSIM . One should note, however, that the final aspect of the evolved image does not only depend on the used similarity measure. The randomness inherent to the evolution process may indeed contribute to the observed differences.
To summarize, the algorithm EA target d (for d = L 2 and d = SSIM ) achieved the objective set in the "target" scenario. Adversarial descendant images were constructed, which VGG-16 labeled in the target category with a probability exceeding 0.95, while simultaneously remaining highly similar to their ancestors for the human eye. However, they are not entirely indistinguishable, the modified image being usually noisier than the unchanged one. Section 5.2 addresses this issue specifically.

Visualising the performed modifications: SSIM vs L 2
In order to visualise the modifications that were performed, the difference between the descendant and ancestor images was computed. An example is given in Figure 5, where this difference is displayed, both spatially as an image and as a plot of the difference between original and evolved pixels for the descendant.
Experiments show that the difference consists of noise distributed almost evenly across the image, with no particular area of focus. This noise is naturally not random, but fine-tuned by our EA (it has already been proven that random noise is not sufficient to deceive a NN, since NNs are only vulnerable to targeted noise [8]). The majority of pixels were modified by an absolute value lower than 10. Histograms of the pixel modification values were computed for both the L 2 -norm and SSIM for all 90 source-target combinations (see Figure 6 for one example; note that the figures were averaged over six runs to reduce the possible impact of random fluctuations, whereas the Kullback-Leibler values given in Table 6 are computed on one run only; they nevertheless remain small enough to indicate that the patterns are similar). The most prominent value is 0 in both histograms, corresponding to unchanged pixels.
Normalising the two histograms into probability densities, the Kullback-Leibler divergence [15] KL between them indicates the proximity of the patterns of the two sampled distributions. Since KL(p a ||p b ) ≥ 0 is not a symmetric function of the probability distributions p a and p b , one needs to compute two values. It turns out that, in the case of Figure 6, the Kullback-Leibler divergence between the probability distributions associated with the L 2 and SSIM histograms is 0.015, while the vice-versa value is 0.016, hence providing evidence that the patterns are indeed very similar.

Figure 4. The image on the left gives an idea of the spatial distribution of the changes, each pixel being a combination of three channels, the range of each being given by the scales in the middle. The plot on the right ignores the 2D structure of an image to show more clearly by how much each of the 1024 pixels is changed on the three different channels. Despite the goal-oriented nature of the evolution, these changes look like noise, almost evenly distributed across the image. L2 and SSIM exhibit slightly different patterns, with noise peaking at different pixels. For this particular combination of ancestor and descendant images, an increase in the contribution of the blue channel is observed when replacing L2 with SSIM.
However, the SSIM results are more symmetrical than the L 2 -norm ones (see Figure 6 for an example of this phenomenon). Although a human being is unlikely to perceive a difference between the descendant images obtained either with L 2 or with SSIM , the histograms (see again Figure 6 for an example) indicate that EA target L2 tends to leave more pixels unchanged than EA target SSIM .
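The histogram and divergence computations can be sketched as follows; the binning range is an assumption.

```python
# Sketch: histogram of per-pixel modifications and Kullback-Leibler divergence
# between two such (normalised) histograms. The bin range [-30, 30] is an assumption.
import numpy as np

def modification_density(ancestor, descendant, bins=np.arange(-30.5, 31.5)):
    diff = descendant.astype(np.int16) - ancestor.astype(np.int16)   # per-pixel, per-channel change
    hist, _ = np.histogram(diff, bins=bins)
    return hist / hist.sum()                                         # normalise to a probability density

def kl_divergence(p, q, eps=1e-12):
    p, q = p + eps, q + eps                                          # avoid division by / log of zero
    return float(np.sum(p * np.log(p / q)))
```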
Running EA flat d : Examples, Results and Discussion

The label values predicted by VGG-16 for the 10 "flattened" images are given in Table 4 for L 2 and in Table 5 for SSIM. These label values are 0.100 with extremal variations of +0.007 and −0.017 with L 2 , and +0.023 and −0.015 for SSIM. Note that these extremal values occur for both d's for the same categories c 4 and c 5 . The image pixel modifications necessary to reach these label values are visualised in Figure 8 for both L 2 and SSIM. The probability densities of L 2 and SSIM are highly similar, leading to Kullback-Leibler divergences of 0.004. Figure 9 shows the evolution of the class probabilities output by VGG-16 during the "flattening" of the dog ancestor image with L 2 and SSIM. One sees that the label value of a first category, namely the category c 4 , takes off (around 20 generations) while the label value of c A = c 6 decreases, so that the sum of both label values is around 1, while the label values of the other categories remain insignificant. Note that in this process, the label value of c 4 largely exceeds that of c 6 . Then the label value of a second category, namely c 8 , takes off (around 80 generations), while the label values of c 6 and c 4 decrease (with a similar phenomenon as before, namely the label value of the newcomer c 8 exceeds the label values of c 6 and of c 4 ).
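The quantities reported in Tables 4 and 5 can be summarised from a single output vector as in the sketch below; the L 2 form of D flat is again an assumption about equation (7).

```python
# Summarising a "flattened" classification: deviations of the label values from 0.100
# and the D_flat value (the L2 norm used here is an assumed form of equation (7)).
import numpy as np

def flatness_summary(o_ind, n_categories=10):
    flat = np.full(n_categories, 1.0 / n_categories)
    delta_plus = float(np.max(o_ind - flat))       # maximal deviation above 0.100
    delta_minus = float(np.min(o_ind - flat))      # maximal deviation below 0.100
    d_flat = float(np.linalg.norm(o_ind - flat))   # assumed D_flat
    return d_flat, delta_plus, delta_minus
```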

Results
Depending on the chosen source category, reaching near-flatness required between 142 and 552 generations for the 10 evolutions altogether, as specified in the Appendix (Table 9). The average time required per generation was 0.04 ± 0.01s with L 2 and 0.17 ± 0.02s with SSIM. For the "flattening" process, the "horse" category took the fewest generations, and the "deer" category the largest number of generations, at least with the ancestor pictures taken in these categories.

Table 4. "Flat" scenario: Label values predicted by VGG-16 for the 10 different "flattened" images, using L2. For any row 1 ≤ i ≤ 10 one considers the adversarial descendant image created by EA flat L 2 and pictured in the i th position on the 2 nd row of Figure 12. For 1 ≤ j ≤ 10, the value given in the j th column is the label value for the category cj output by VGG-16 for this adversarial image. The column D flat gives the value of the function D flat for the descendant "flat" image obtained by EA flat L 2 . The columns ∆ + and ∆ − indicate the maximal deviation exceeding 0.100 from above or from below in the row.

Table 5. "Flat" scenario: Label values predicted by VGG-16 for the 10 different "flattened" images, using SSIM. For any row 1 ≤ i ≤ 10 one considers the adversarial descendant image created by EA flat SSIM and pictured in the i th position on the 3 rd row of Figure 12. For 1 ≤ j ≤ 10, the value given in the j th column is the label value for the category cj output by VGG-16 for this adversarial image. The column D flat gives the value of the function D flat for the descendant "flat" image obtained by EA flat SSIM . The columns ∆ + and ∆ − indicate the maximal deviation exceeding 0.100 from above or from below in the row.
When one compares the adversarial images created with EA flat L2 and those created with EA flat SSIM , neither appears better than the other (see Figure 7 for an example of the flattening of an ancestor in the "dog" category, and Figure 12 in the Appendix). On the other hand, when one takes into account the starting points, about 10^-6 in most cases, of the label values of the categories distinct from the ancestor category, it is fair to consider that reaching label values so close to 0.1 modulo ∆ + and ∆ − indeed makes our point. We nevertheless come back to this aspect in the conclusion part.

Visualising the performed modifications
As for the target scenario, we studied the way noise is distributed. Histograms of the pictures' modification values exhibit a bell shape, for both the L 2 -norm and SSIM (see Figure 8 for one example, again with numbers averaged over six runs to reduce the potential impact of random fluctuations). The Kullback-Leibler divergence values computed provide again evidence that the patterns are indeed very similar. Note that the Kullback-Leibler values given in Table 8 are computed for all 10 possibilities of the "flat" scenario, but on one run only and not on six. The values are therefore larger than they would be on an average over six runs. Nevertheless, they remain small enough to lead to the same conclusion, namely that the patterns are similar.
Although the evolutions of the class probabilities corresponding to the 10 "flattened" images have different patterns, it is a general rule that during the initial generations only a few classes dominate, interchanging their order. More precisely, similar to what happens in Figure 9 for the "flat" scenario with the dog ancestor pictured in Figure 7 (where the label values that successively "take off" are first those of animal categories), the first label values to take off are those of the categories which, excluding the ancestor class, rank highest in the classification of the corresponding ancestor image, and thus have a higher starting point in both the L 2 and SSIM evolutions. They typically belong to the same group (animals or objects) as the ancestor class.

Conclusions and Future Work
Pursuing the research program announced in [3], this work substantially complements [4] by demonstrating the validity of an approach using evolutionary algorithms to produce adversarial samples that deceive neural networks performing image recognition, and that are likely to deceive humans as well. Our two evolutionary algorithms, EA target d and EA flat d , which differ by their fitness functions, successfully fool, in the "target" and "flat" scenarios respectively, the neural network VGG-16 [25], trained on the dataset CIFAR-10 [14] to label images in 10 categories. The similarity between the adversarial images and the original ones, aiming at ensuring that humans would still classify the modified image as belonging to the original category, is measured by two "distances" d, namely d = L 2 and d = SSIM. These distances differ conceptually, since they assess different quality features of pairs of images. An outcome of our experiments (based on the computation of the Kullback-Leibler divergence values, the number of generations required, etc.) is that neither seems qualitatively better than the other. Furthermore, experiments performed with L 2 tend to be 4 to 5 times faster than those with SSIM. The choice of L 2 rather than SSIM therefore seems a reasonable trade-off. The study shows that taking advantage of SSIM would require at least introducing mutations that do not impact the values of L 2 and those of SSIM in the same way.
While we consider that our point is fully made by EA target d in the context of the "target" scenario, at least for VGG-16 and images from CIFAR-10, we intend to perform a more in-depth study of the "flat" scenario, not least because this latter scenario seems harder to fulfill than the former one. We would like to further explore the balance between the three following components: the size of the ∆ + and ∆ − values given in Tables 4 and 5, the proximity between the evolved pictures and the ancestor pictures, and the size of the pictures. This study, which may lead to different penalty values A flat d (g i ) and B flat d (g i ), will not be limited to pictures of CIFAR-10 size, but will consider larger pictures as well. In other words, such a study may provide an indication about the maximum amplitudes of the ∆ + and ∆ − values around the flat value that one cannot diminish without compromising the proximity of the descendant pictures with the ancestor pictures. It would be interesting to get a heuristic bound on this amplitude, not only with respect to this flat value, but also in terms of the size n × n of the pictures considered, for combinations of picture size and number of categories other than the (32, 10) case of VGG-16 trained on CIFAR-10 considered in this paper.
For both scenarios, our current confidence that the similarity aspect between the original and adversarial images is indeed satisfied is limited in the following sense. While all three authors of this paper classified the modified images as belonging to their original categories, three people represent a small sample. Part of an on-going work is therefore to conduct a statistically significant study to check whether our evolutionary algorithms do what we expect from them, even if the three of us believe that they do.
Although our algorithms successfully managed to deceive the neural network, while producing adversarial images similar to the original, a closer comparison of the original and modified images reveals the noisiness of the evolved images. This noisiness is noticeable here because CIFAR-10 images have a low resolution. We observed in this paper that the noise, although appearing to be random and evenly distributed across the image, is targeted. This leads to three additional research directions that we plan to explore.
The first direction aims at explaining why these small perturbations of the input are able to produce misclassifications with high confidence. In particular, results for the "target" scenario do not indicate that EA target d creates a shape pattern to fool VGG-16, but rather that it acts on texture. Note that this hypothesis is also supported by the phenomenon observed on the evolved images created by EA flat d in the "flat" scenario (with the order of the successive "rising categories" being close to the ancestor category from a texture point of view).
The second direction aims at producing adversarial images that are almost entirely indistinguishable from the original for a human eye. Towards this goal, several steps could be taken, such as optimizing the EAs with different mutations or using other datasets, notably with larger images. As a complement to this, one of the main advantages of our evolutionary algorithms is the ability to treat the neural network as a black box, with no required knowledge of its architecture or parameters. We hence intend to extend the present study to more CNNs (Inception [27], etc.) trained on larger datasets (CIFAR-100 [14], ImageNet [6], etc.) with images of larger resolution.
The third direction aims at comparing the resistance to filters (e.g. denoising filters like the median filter) of the adversarial images constructed by our EAs with that of adversarial images obtained by other methods, for instance those of [26].
Once these studies are performed, we will be equipped to conduct a thorough comparison between our evolutionary algorithm approach and other adversarial image generation approaches, listed for instance in the survey [1].

A Appendix
A.1 "Target" scenario Table 6. "Target" scenario.-For i = j, the element at the intersection of the i th row and j th column is 10 3 KL pL 2 (ci → cj)||pSSIM (ci → cj) , 10 3 KL pSSIM (ci → cj)||pL 2 (ci → cj) , where KL pL 2 (ci → cj)||pSSIM (ci → cj) is the Kullback-Leibler divergence computed between the L2 and the SSIM probability densities of the normalisation of the histograms representing the changes in pixel intensities through the ci → cj evolution of the ancestor Ai on i th diagonal position in Figure 10 (and Figure  11). Mutatis mutandis KL pSSIM (ci → cj)||pL 2 (ci → cj) .  Table 7. "Target" scenario.-The pair of integers at the intersection of the i th row and j th column (for i = j) represents the number of generations necessary to create the adversarial image with in the evolution ci → cj, as specified in Figure 10 with L2 (left-hand side of the pair) and in Figure 11 with SSIM (right-hand side of the pair). A.2 "Flat" scenario Table 8. "Flat" scenario.-For 1 ≤ i ≤ 10, the element in i th position in the 2 nd row is 10 3 KL pL 2 (ci)||pSSIM (ci) , 10 3 KL pSSIM (ci)||pL 2 (ci) , where KL pL 2 (ci)||pSSIM (ci) is the Kullback-Leibler divergence computed between the L2 and the SSIM probability densities of the normalisation of the histograms representing the changes in pixel intensities through the ci → "flat" evolution of the ancestor Ai on i th position on the first row in Figure 12. Mutatis mutandis KL pSSIM (ci)||pL 2 (ci) .  Table 9. "Flat" scenario.-The pair of integers on the 2 nd row represents the number of generations necessary to create the adversarial image in the evolution ci → cj, as specified in Figure 12 with L2 (left-hand side) and SSIM (right-hand side).  to Ai, that VGG-16 classifies as belonging to cj. Fig. 11. "Target" scenario, case SSIM .-Pictures on the diagonal are the ancestors Ai belonging to the category cA i = ci, for 1 ≤ i ≤ 10. On each row 1 ≤ i ≤ 10, the picture on the j th column, with j = i, is the descendant picture Dij, obtained by applying EA target SSIM to Ai, that VGG-16 classifies as belonging to cj.  Figures 10 and 11). For 1 ≤ i ≤ 10, the picture in i th position on the 2 nd row is the adversarial descendant picture obtained by applying EA flat L2 to Ai, and that VGG-16 is unable to classify with certainty. Mutatis mutandis 3 rd row with EA flat SSIM .