Facelikeness matters: A parametric multipart object set to understand the role of spatial configuration in visual recognition

ABSTRACT There is a view that faces and objects are processed by different brain mechanisms. Different factors may modulate the extent to which face mechanisms are used for objects. To distinguish these factors, we present a new parametric multipart three-dimensional object set that provides researchers with a rich degree of control of important features for visual recognition such as individual parts and the spatial configuration of those parts. All other properties being equal, we demonstrate that perceived facelikeness in terms of spatial configuration facilitated performance at matching individual exemplars of the new object set across viewpoint changes (Experiment 1). Importantly, facelikeness did not affect perceptual discriminability (Experiment 2) or similarity (Experiment 3). Our findings suggest that perceptual resemblance to faces based on spatial configuration of parts is important for visual recognition even after equating physical and perceptual similarity. Furthermore, the large parametrically controlled object set and the standardized procedures to generate additional exemplars will provide the research community with invaluable tools to further understand visual recognition and visual learning.

There is a long-standing view that the adult human brain processes faces with neural mechanisms that are different from those which process other objects (see reviews and arguments in Biederman & Kalocsai, 1997; Ellis & Young, 1989; Farah, Wilson, Drain, & Tanaka, 1998; Kanwisher, 2000; McKone & Robbins, 2011; Nachson, 1995). This face-specific view is rooted in neuropsychology (i.e., the study of prosopagnosia following brain damage; Bodamer, 1947) and corroborated by behavioural and neural evidence (e.g., Allison, Puce, Spencer, & McCarthy, 1999; Bentin, Allison, Puce, Perez, & McCarthy, 1996; Busigny, Graf, Mayer, & Rossion, 2010a; Jeffreys, 1996; Sergent, Otha, & MacDonald, 1992; Tanaka & Farah, 1993; Yin, 1969). However, some researchers propose that faces are processed by different mechanisms predominantly because faces form a category whose members are more physically similar to each other compared to most other categories, and faces need to be discriminated at a finer-grain level than objects in most other categories. According to this similarity-based view, other objects that have these constraints will recruit the same mechanisms used to process faces (Faust, 1955; see Damasio, Damasio, & Van Hoesen, 1982; Gauthier, Behrmann, & Tarr, 1999a, for more recent evidence and a clear articulation of this view). This view can be traced to the early neuropsychological observation of brain-damaged individuals with prosopagnosia who had difficulties differentiating various exemplars of chairs for instance (Faust, 1955; see Bornstein, 1963; Clarke, Lindemann, Maeder, Borruat, & Assal, 1997; Cole & Perez Cruet, 1964; De Renzi, Faglioni, & Spinnler, 1968). In line with this neuropsychological work, neuroimaging studies have shown increased activation in face-selective brain areas when observers discriminate visually similar objects ("subordinate level of categorization"; Gauthier, Anderson, Tarr, Skudlarski, & Gore, 1997).
The similarity-based hypothesis has been criticized because previous studies have not systematically equated physical and perceptual similarity between faces and other objects (Busigny et al., 2010a). Moreover, a possible explanation to account for some of the "facelike" results observed, particularly with novel stimuli such as the widely used "Greebles", comes from a suggestion by Biederman and Kalocsai (1997). These authors suggested that the physical resemblance between nonface objects and faces may play a role in visual recognition and visual learning. For example, human adults can spontaneously see faces in everyday objects predominantly by virtue of features arranged into a spatial configuration that can be mistaken for facial parts such as eyes, nose and mouth; as in face pareidolia (e.g., a cloud appearing like a face; Hadjikhani, Kveraga, Naik, & Ahlfors, 2009; Meng, Cherian, Singal, & Sinha, 2012). Thus, adults may use prior representations of familiar stimuli (e.g., faces) to help them process novel stimuli that physically resemble familiar stimuli in some way (e.g., parts arranged in a facial configuration). As Biederman and Kalocsai phrased it to describe the Greebles:
… a set of stimuli composed of three rounded parts - a base, body, and head - one on top of the other, with protrusions that are readily labelled penis, nose, and ears. Unfortunately, these rounded, bilaterally symmetrical creatures closely resembled humanoid characters … This characteristic of the stimuli is termed unfortunate because even if face- or body-like results were obtained from the training, it would be unclear whether the stimuli engaged face or body processing because of their physical resemblance to people. (Biederman & Kalocsai, 1997, p. 1205, italics added)
In line with this suggestion, there is evidence that the physical resemblance of objects to faces can lead to facelike behavioural and neural responses for those nonface objects (e.g., Brants, Wagemans, & Op de Beeck, 2011; Caharel et al., 2013; Churches, Nicholls, Thiessen, Kohler, & Keage, 2014; Davidenko, Remus, & Grill-Spector, 2012; Gauthier, Behrmann, & Tarr, 2004; Hadjikhani et al., 2009; Liu et al., 2014; Meng et al., 2012; Rossion, Dricot, Goebel, & Busigny, 2011; Shafto, Pyles, Jew, & Tarr, 2015; Yue, Pourladian, Tootell, & Ungerleider, 2014; for evidence of this in infants, see Cassia, Turati, & Simion, 2004; Morton & Johnson, 1991). To date, however, no study has systematically manipulated physical resemblance to faces while equating physical and perceptual similarity as well as prior experience with the tested objects, therefore leaving open several interpretations of the effects. The reason for this gap in knowledge is the lack of an appropriate object set to manipulate these factors.
Novel objects can provide such a set. Researchers often use novel objects because they afford a tight control of stimulus properties and they minimize any influence of individuals' prior experience. Both factors are important in shaping the representations that support visual recognition in adulthood. For example, with novel objects, researchers can systematically manipulate the desired features (e.g., parts, spatial relationship between parts, etc.) while keeping others constant, and measure behavioural and neural responses to these manipulations. They can also systematically manipulate how physically similar one object is to other objects. By physical similarity we mean similarity based on physical measurements of the stimuli (e.g., comparing pixel values of images of objects). Several studies have shown that physical similarity can moderate performance on perceptual discrimination at different levels of abstraction (e.g., discriminating exemplars of a category to discriminating different superordinate-level categories; Cutzu & Edelman, 1996; Nederhouser, Yue, Mangini, & Biederman, 2007; Op de Beeck, Baker, DiCarlo, & Kanwisher, 2006; Schultz, Chuang, & Vuong, 2008; Vuong, Friedman, & Read, 2012; Yue, Biederman, Mangini, von der Malsburg, & Amir, 2012) and that physical similarity can moderate neural responses (e.g., Brants et al., 2011; Davidenko et al., 2012; Schultz et al., 2008). These behavioural and neural studies further show that the perceptual similarity between two objects is related to their physical similarity.
Our purpose in this paper is twofold. First and primarily, we present a new parametric multipart three-dimensional (3D) object set that provides researchers with a rich degree of control over a wide variety of features such as individual parts, global structure, spatial configuration of parts and texture patterns, all of which have been shown to be important for visual recognition. Second, we demonstrate with our novel objects that perceived facelikeness, particularly in terms of the spatial configuration of parts, can influence visual recognition even after equating physical and perceptual similarity.
The Greeble set, in particular, has been widely used to test the face-specific and similarity-based hypotheses, and to investigate the acquisition of perceptual expertise (Gauthier et al., 1999a, 1999b). Greebles have physical aspects similar to faces: a dominant vertical axis, the same number of similar parts, a similar spatial configuration of those parts and so on. In most studies, Greebles are symmetric (for non-symmetric Greebles, see Rossion et al., 2004). All Greebles have one of five possible body shapes (i.e., reflecting the different families). However, critically, each Greeble has a unique set of individual parts (i.e., different from all other Greebles). Consequently, individual Greebles may be learned and recognized on the basis of a single diagnostic part, or the independent processing of each of these parts. Moreover, there is no systematic way to manipulate the physical similarity between Greebles. Thus, despite their wide use, it is not clear whether the behavioural and neural findings with Greebles were dependent on their perceived facelikeness (Brants et al., 2011) or other factors, such as having discriminative individual parts or varying levels of physical similarity. This may also have led to reported differences in neural findings using similar paradigms (e.g., compare Brants et al., 2011; Gauthier et al., 1999b). Our procedure to create novel objects addresses many of the limitations of Greebles and other novel object sets used in the past.
We first present a standardized procedure to create a high-dimensional feature space that can be used to parametrically generate novel objects. This parameterization allows us to generate an arbitrarily large set of multipart objects in which the physical similarity between objects can be equated. Importantly, we use parameters that define the shape of the body and parts so that we can independently and flexibly control the spatial configuration of the parts and the surface properties of the objects. Figure 1 presents an array of objects generated with this procedure to illustrate the range of objects that can be generated. As with the Greebles, we initially aimed at creating novel objects that capture many physical aspects of faces, particularly a systematic spatial configuration of parts, but which, nevertheless, would not look like faces. Quite serendipitously, we noticed that our objects appeared facelike in one of the picture-plane orientations. To illustrate, the objects in Figure 1 are presented in their nonfacelike orientation. We encourage readers to turn this figure upside down so that the perceived facelikeness of the objects becomes immediately apparent. This perceived facelikeness seems to be predominantly driven by the spatial configuration of parts which "resembles" the spatial configuration of an upright face. However, the objects are not perceived as facelike in the nonfacelike orientation, unlike inverted photographs of faces, for instance.
Figure 1. Exemplars of the novel 3D parametric objects. These exemplars are shown in their nonfacelike orientation. We encourage the readers to turn the page upside down to see the same objects in their facelike orientation.
Thus, in our critical Experiment 1, we were able to use this inversion manipulation to test whether performance on a demanding perceptual matching task differed for perceptually facelike compared to nonfacelike objects, in which the two object types differed only in terms of their picture-plane orientation. Although the stimuli are physically identical in the two orientations, their perceived similarity may still differ because participants perceive them to be facelike in only one of the orientations. We therefore next report two control experiments to measure the perceptual similarity for these objects in both orientations. In Experiment 2, we use a discrimination task to implicitly measure perceptual similarity; in Experiment 3, we use an explicit similarity rating task. To help us better understand the physical properties that may drive perceptual similarity, we follow previous work and correlate perceptual similarity with different measures of physical similarity (e.g., Yue et al., 2012). We hypothesized that facelikeness will facilitate performance on the perceptual matching task (Experiment 1) but that the perceptual similarity of the objects at the two orientations will not differ (Experiments 2 and 3).

Overview
To generate the objects illustrated in Figure 1, we created a high-dimensional feature space in which the dimensions of the space are defined by a large number of 3D shape parameters (e.g., Biederman, 1987; Blanz & Vetter, 1999; Giese & Poggio, 2000; Op de Beeck et al., 2006; Schultz et al., 2008; Vuong et al., 2012; Wong et al., 2009). Objects can then be represented as points in this high-dimensional parameter space; that is, they vary in the values for each parameter. By sampling this space, researchers can generate an object set of arbitrary size and physical similarity. There are, however, a very large number of ways to sample this space. To constrain the sampling procedure, a small number of prototypes within the space are first generated; each prototype is defined by a vector of parameter values. New objects can then be generated by taking a weighted average of the prototypes' parameter values (i.e., morphing; Giese & Poggio, 2000).

Initial object specification
For our object set, we constrained the objects to be mono-oriented, to be roughly symmetric about their vertical midline, to have the same number of parts and to have the same spatial configuration of these parts. These constraints meant that these objects share general physical characteristics with faces. As shown in Figure 2A, each object comprised a large central body with five smaller parts attached. The volume of the parts was approximately 79-95% smaller than that of the body. Three parts (1-3) were attached below each other in the top half (Part 3 extended below the midline); these three parts defined the "frontal" (0°) viewpoint. Parts 4 and 5 were attached to the bottom half next to each other at the left and right edges, respectively.

Parameter space definition
The body and parts were defined by 19 parameters, such as the two-dimensional (2D) shape of the cross-section (from circle to square), amount of bending, and amount of tapering, thereby creating a 19-dimensional parameter space. Table 1 lists the 19 parameters and their (arbitrary) value range. Figure 2B shows a three-parameter subspace to illustrate how varying the parameter values affects the 3D shape of the body. Additional manipulations were made to these objects so that they would further share physical properties with faces. To emulate the fact that metric spatial relationships between facial parts differ between human individuals (Sheehan & Nachman, 2014) and that faces have small and variable bilateral asymmetries along the vertical midline, the position of the parts (1-3 individually and 4 and 5 together) can be slightly jittered randomly in the x- and y-dimensions along the surface of the body (-0.4 to 0.4 arbitrary units) and fractal noise (roughness: 0.7; iterations = 6) can be added to the body and parts to introduce asymmetry. The edges of the objects were smoothed using the relax modifier (value 0.5; iterations = 1) to create curvilinear contours (Shafto et al., 2015; Yue et al., 2014).

Prototype generation
We next generated prototypes within our 19-dimensional parameter space. For this purpose, we first normalized all parameters to the range 0-1. The average object was at the centre of this space, with 0.5 (normalized) as a value for all 19 parameters. We then created 12 prototypes that met the following constraints: (1) The parameters for the body, Part 1, and across Parts 2-4/5 were equidistant to the average object; and (2) the shape of the body and the parts was unique for each prototype.

Parameter space morphing
We paired the 12 prototypes in all 66 possible combinations and morphed the body and each part from 0% (i.e., Prototype A) to 100% (i.e., Prototype B) in 5% increments. That is, we took a weighted average of corresponding parameter values between the two prototypes (Giese & Poggio, 2000). For example, given two prototypes, $P_A$ and $P_B$, a morph, $M$, can be created by the weighted average:

$$M = c\,P_A + (1 - c)\,P_B$$

The weight, $c$, which ranges from 0 to 1, represents the relative contribution of each prototype to the final morphed object. In this case, if $c$ is close to 1, the morphed object will appear physically more like $P_A$ than $P_B$. Figure 3 presents an example morph continuum between two prototypes.
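The morphing step can be illustrated with a minimal Python sketch. It assumes the weighted average takes the form M = c·P_A + (1 − c)·P_B, so that a morph of k% towards Prototype B corresponds to c = 1 − k/100. The function names and the prototype values are hypothetical placeholders, not the scripts or parameter values used to build the object set.

```python
from typing import Dict, List, Sequence


def morph(p_a: Sequence[float], p_b: Sequence[float], c: float) -> List[float]:
    """Weighted average of two prototype parameter vectors.

    c is the weight on prototype A: c = 1 returns A, c = 0 returns B.
    """
    if len(p_a) != len(p_b):
        raise ValueError("prototypes must have the same number of parameters")
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie in [0, 1]")
    return [c * a + (1.0 - c) * b for a, b in zip(p_a, p_b)]


def morph_continuum(p_a: Sequence[float], p_b: Sequence[float],
                    step_percent: int = 5) -> Dict[int, List[float]]:
    """Morphs from 0% (prototype A) to 100% (prototype B) in fixed steps."""
    # A morph of k% towards B puts weight c = 1 - k/100 on A.
    return {k: morph(p_a, p_b, 1.0 - k / 100.0)
            for k in range(0, 101, step_percent)}


# Hypothetical 19-parameter prototypes in normalized [0, 1] units.
proto_a = [0.2] * 19
proto_b = [0.8] * 19
continuum = morph_continuum(proto_a, proto_b)
```

With 5% steps the continuum contains 21 parameter vectors per prototype pair, including the two prototypes themselves.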

Texture mapping and image rendering
To add surface properties such as colour and texture patterns, we used the texture images from Vuong et al. (2005). The textures were randomly selected and randomly phase scrambled for each object before they were texture-mapped onto the body and the parts so that each object had a unique texture pattern. The body and Part 2 had a light texture pattern, whereas the remaining parts had a dark texture pattern (see Figures 1-3). The objects were rendered from 24 viewpoints (in increments of 15°) against a uniform black background as 500 pixels × 500 pixels images. Figure 3 also shows an object from five different viewpoints.
Note (Table 1): The units are arbitrary unless specified otherwise. For the body and parts, the order of the modifiers will affect the final 3D shape of that body/part. The modifiers are applied in the order from top to bottom. See also Figure 2.

3D Models and scripts
The objects were constructed using 3D Studio Max (Autodesk Entertainment Creation Suite, Ultimate 2013; San Rafael, CA, USA). We used custom scripts in MATLAB (MathWorks, Natick, MA, USA) and the native scripting language in 3D Studio Max to facilitate stimulus generation. All 3D Studio Max models, rendered images, and scripts to automate some of the steps are freely available from the first author (QCV) upon request (they are also available at http://reshare.ukdataservice.ac.uk/852397/).

Experiment 1
In the main experiment (Experiment 1), we used the four-alternative forced-choice (4AFC) delayed perceptual matching task from Laguesse, Dormal, Biervoye, Kuefner, and Rossion (2012) to test whether perceived facelikeness influenced performance on a perceptual matching task. The task was made demanding by having participants match a target object to four possible probe objects at the same picture-plane orientation as the target but from different viewpoints. It is important to stress that the target and probe objects on a given trial were always presented at the same orientation so that, a priori, there was no reason to expect performance differences between the facelike and nonfacelike orientation. However, if facelike objects recruit to some extent face mechanisms (for instance), we predicted that performance would be better for facelike objects compared to nonfacelike objects.

Participants
Forty-six naïve participants from the University of Louvain and Newcastle University participated in Experiment 1. Their age ranged between 21 and 38 years. All participants in this and subsequent experiments provided informed consent and were naïve to the stimuli and purpose of the study. The ethics were approved by both the University of Louvain and Newcastle University local ethics committees.

Stimuli
For this experiment, we used the 25% and 75% morphs between prototypes to create four sets of 33 stimuli each. We randomly split the 25% morphs to create Sets 1 and 2 and the 75% morphs to create Sets 3 and 4. We randomly selected 26 stimuli in each set to be used on experimental trials, and the remaining stimuli to be used on practice trials. Each participant was randomly tested with one of these four sets.
Figure 3. Exemplars can be systematically created by morphing between two prototype objects (top). The morph percentage is applied to all 19 shape parameters. Each exemplar can be rendered from different viewpoints (bottom).

Design and procedure
In the 4AFC delayed matching task, participants were shown a target stimulus followed by four probe stimuli. Their task was to select which probe stimulus matched the target stimulus. On each trial, the target and probe stimuli were always shown at the same stimulus orientation (facelike or nonfacelike) and from one of three possible viewpoints (−30° ["left facing"], 0° ["front facing"], and +30° ["right facing"]). The four probe stimuli were always shown from the same viewpoint on a given trial. To avoid image matching and to increase the difficulty of the task, the target and probe stimuli were always shown from different viewpoints (e.g., target stimulus at 0° and all probe stimuli at +30°, or target stimulus at +30° and all probe stimuli at −30°). Thus, there were six possible non-matching view combinations. There was a total of 260 trials (26 stimuli × 2 stimulus orientations × 5 repetitions).
Although there were six view combinations, we randomly selected five of the six possible view combinations for each stimulus × orientation condition and for each participant to reduce the overall duration of the experiment. Figure 4 illustrates the trial sequence. Each trial began with a white fixation cross at the centre of the screen for 500 ms followed by a blank screen for another 500 ms. A target stimulus was then presented for 500 ms at the centre of the screen. After a 500 ms blank screen, the target stimulus was followed by four probe stimuli arranged in each quadrant of a 2 × 2 matrix. The participants' task was to decide which of the four probe stimuli matched the previously seen target stimulus (ignoring viewpoint changes). The probe matrix remained on the screen until participants responded. Participants were instructed to respond as quickly and as accurately as possible. Responses were made on a standard computer keyboard (v = bottom left, n = bottom right, f = top left, j = top right). All participants used their left index/middle finger to make f/v responses and their right index/middle finger to make j/n responses. There were 20 practice trials with feedback (i.e., a 1500 Hz tone occurred when an incorrect response was made) to familiarize participants with the stimuli, procedure, and task. The practice trials were followed by the experimental trials in which no feedback was provided. Participants could take a short break after every 32 trials. The experiment took approximately 25-30 min to complete.
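The construction of the trial list described above (six non-matching target/probe view combinations, five of which are drawn for each stimulus × orientation cell) can be sketched in Python as follows. This is only an illustrative sketch: the experiments themselves were programmed in MATLAB, and the function names, the dictionary layout and the final shuffling of trial order are assumptions rather than details reported in the text.

```python
import itertools
import random

VIEWPOINTS = (-30, 0, 30)                    # degrees: left, front, right facing
ORIENTATIONS = ("facelike", "nonfacelike")


def view_combinations():
    """All ordered (target_view, probe_view) pairs with different viewpoints."""
    return [(t, p) for t, p in itertools.product(VIEWPOINTS, repeat=2) if t != p]


def build_trials(stimuli, n_combos=5, rng=None):
    """One trial per stimulus x orientation x sampled view combination."""
    rng = rng or random.Random()
    trials = []
    for stimulus in stimuli:
        for orientation in ORIENTATIONS:
            # Draw 5 of the 6 view combinations without replacement.
            for target_view, probe_view in rng.sample(view_combinations(), n_combos):
                trials.append({
                    "stimulus": stimulus,
                    "orientation": orientation,
                    "target_view": target_view,
                    "probe_view": probe_view,
                })
    rng.shuffle(trials)  # assumption: trial order is randomized
    return trials


trials = build_trials(range(26), rng=random.Random(0))
```

With 26 stimuli this yields 26 × 2 × 5 = 260 trials, matching the total reported for Experiment 1.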
All experiments were conducted in a quiet, dimly lit room. They were programmed using MATLAB with the Psychtoolbox extension (Brainard, 1997; Kleiner, Brainard, & Pelli, 2007; Pelli, 1997). The stimuli were presented on a 19-in flat-panel monitor with a 1280 pixels × 1024 pixels resolution. The participants sat approximately 50 cm from the screen. At this distance, the objects subtended maximally 10.3° × 10.3° of visual angle.

Results and discussion
We analysed proportion correct (chance = .25), correct median response times (RT) and efficiency scores (correct median RT/proportion correct; Townsend & Ashby, 1983). Table 2 presents the mean and standard error of the means (SEMs) for proportion correct and correct RT as a function of stimulus orientation. Figure 5 presents the efficiency scores for the facelike and nonfacelike orientations. Strikingly, participants were significantly faster and more efficient for facelike compared to nonfacelike orientation (RT: t(45) = 2.59, p = .01; efficiency score: t(45) = 2.88, p = .01). There was no significant difference in proportion correct between the two stimulus orientations (t(45) = 1.31, p = .20). Thus, this is the first study to demonstrate that perceived facelikeness facilitated the efficiency with which naïve participants matched target and probe stimuli across changes in viewpoints.
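As an illustration of the efficiency score defined above (correct median RT divided by proportion correct), the following Python sketch computes it for one participant in one condition. The function name and the toy data are hypothetical; lower scores indicate more efficient performance.

```python
import statistics


def efficiency_score(rts_ms, correct):
    """Median RT on correct trials divided by proportion correct."""
    if len(rts_ms) != len(correct) or not rts_ms:
        raise ValueError("need one RT and one accuracy flag per trial")
    correct_rts = [rt for rt, ok in zip(rts_ms, correct) if ok]
    if not correct_rts:
        raise ValueError("no correct trials: efficiency is undefined")
    proportion_correct = len(correct_rts) / len(correct)
    return statistics.median(correct_rts) / proportion_correct


# Toy data: four trials, three of them correct.
rts = [800.0, 900.0, 1000.0, 1200.0]
acc = [True, True, True, False]
score = efficiency_score(rts, acc)  # median 900 ms / 0.75 = 1200 ms
```

Dividing by accuracy penalizes fast but error-prone responding, so the score combines speed and accuracy in a single measure.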

Experiment 2
Experiment 1 demonstrated that perceived facelikeness facilitated the efficiency with which participants matched novel objects. The facelike and nonfacelike objects differed only by a 180° rotation in the picture plane but were otherwise physically identical. Moreover, participants could not perform this task solely by low-level image matching because we changed the viewpoints between the target and probe objects. However, it may still be possible that the picture-plane rotation led to differences in perceptual similarity between objects at each stimulus orientation. Therefore, in Experiments 2 and 3, we measured the relationship between perceptual and physical similarity for these objects at these two orientations.

Participants
Twelve volunteers from Newcastle University participated in Experiment 2. Their age ranged between 20 and 38 years.

Design and procedure
The participants performed a same-different discrimination task. There were four blocked conditions: two stimulus orientations (facelike, nonfacelike) and two noise levels (noise, no-noise). In the no-noise condition, we did not jitter the position of the parts and we did not add fractal noise to make the body or parts asymmetric. We included the noise manipulation in this experiment to test whether small asymmetries (as seen with faces) would influence perceptual discriminability. The four blocks were run in a Latin square design across participants to counterbalance the order of the conditions. For this experiment, we arbitrarily selected six prototype pairs (of the 66 possible). For each pair, there were seven morph difference levels between the two prototypes (0% [same], 10%, 20%, 30%, 40%, 50% and 60%). There were six repetitions of the 6 × 7 conditions for a total of 252 trials per block. All the trials within a block were run in a random order for each participant.
For each morph difference level, we first randomly selected a morph stimulus from a prototype pair. For the second stimulus, we selected the morph stimulus in that prototype pair that provided the corresponding morph difference level. For example, suppose we randomly select a stimulus that was a 35% morph between prototypes A and B. In this case, for a 10% morph difference level, we would then select a 25% morph between A and B (or equally possible, a 45% morph between A and B). For a 0% morph difference level (i.e., same object), we would select the same 35% morph between A and B. Across participants, we sampled the entire morph continuum (from 0% to 100% in 5% increments) for each prototype pair and each morph difference level.
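The pair-selection rule described above can be sketched in Python. The sketch assumes that the first stimulus is drawn only from morph levels that have at least one valid partner on the 0-100% continuum at the requested difference; the text does not specify how such edge cases were handled, so this restriction and the function name are assumptions.

```python
import random

MORPH_LEVELS = list(range(0, 101, 5))  # 0% to 100% in 5% increments


def select_pair(difference, rng):
    """Return (first, second) morph levels that differ by `difference` percent."""
    # Assumption: only consider first stimuli that have a valid partner.
    valid_firsts = [m for m in MORPH_LEVELS
                    if m - difference >= 0 or m + difference <= 100]
    first = rng.choice(valid_firsts)
    # The partner lies `difference` below or above the first stimulus.
    partners = sorted({m for m in (first - difference, first + difference)
                       if 0 <= m <= 100})
    second = rng.choice(partners)
    return first, second


rng = random.Random(1)
example_pair = select_pair(10, rng)  # e.g., (35, 25) or (35, 45)
```

For a 0% difference the two candidate partners coincide with the first stimulus, so the same morph is returned twice, which corresponds to a "same" trial.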
At the beginning of each trial, a white fixation cross was presented at the centre of the screen for 500 ms. The two stimuli were then presented sequentially for 300 ms each separated by a 1000 ms black screen. They were both shown from the 0° ("front facing") viewpoint. To prevent image matching, each stimulus was spatially shifted randomly along the x- and y-axes by up to ±50 pixels relative to the centre of the screen and each stimulus was randomly scaled by up to ±10% on each trial. The participants were instructed to respond as accurately as possible by pressing the "same" or "different" key following the presentation of the second stimulus. They were instructed to ignore any changes to position and size between the two stimuli. The response mapping was counterbalanced across participants. There was a self-timed break after every 42 trials. Prior to each block, the participants completed 20 practice trials to familiarize themselves with the procedure and stimuli. Feedback was provided on practice trials. Each block took approximately 20 min to complete.

Results and discussion
Figure 6 shows the sensitivity (d' or dprime) as a function of the morph difference level for the four different conditions, averaged across participants. False alarms were defined as responding "different" when the morph difference level was 0% and hits were defined as responding "different" for all other morph difference levels. The false-alarm and hit rates were corrected by replacing rate = 0.0 with 0.01 (1/2N, where N = 36) and replacing rate = 1.0 with 0.99 (1-1/2N). For each condition, d' was computed as the difference between the z-transform of the hit rate and z-transform of the false-alarm rate. The same false-alarm rate for a given condition was used for all morph difference levels within that condition. Table 3 shows the proportion of "different" responses as a function of morph difference level for the four different conditions, averaged across participants. The sensitivity data were submitted to a repeated measures analysis of variance (ANOVA) with stimulus orientation (facelike, nonfacelike), noise level (noise, no-noise) and morph difference level (10% to 60% in 10% steps) as within-subjects factors. There was only a main effect of morph difference level, F(6,66) = 147.0, p < .001, $\eta_p^2$ = .93. Moreover, there was a significant linear trend in the sensitivity data, F(1,11) = 182.7, p < .001, $\eta_p^2$ = .94. No other main effects or interactions were significant. As evident in Figure 6, participants were sensitive to subtle changes in 3D shape (i.e., 10% morph differences). However, stimulus orientation or jittering the position of the parts did not influence their responses.
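The sensitivity computation described above can be sketched in Python as follows. The sketch assumes the correction replaces rates of exactly 0 or 1 with 1/(2N) and 1 − 1/(2N), which for N = 36 is approximately the 0.01 and 0.99 quoted in the text; the function names are hypothetical.

```python
from statistics import NormalDist


def corrected_rate(rate, n):
    """Replace extreme rates so that the z-transform stays finite."""
    if rate <= 0.0:
        return 1.0 / (2 * n)        # assumption: 1/(2N) correction
    if rate >= 1.0:
        return 1.0 - 1.0 / (2 * n)  # assumption: 1 - 1/(2N) correction
    return rate


def d_prime(hit_rate, false_alarm_rate, n=36):
    """d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf        # inverse of the standard normal CDF
    return z(corrected_rate(hit_rate, n)) - z(corrected_rate(false_alarm_rate, n))


# The same false-alarm rate (from 0% difference trials) is reused for every
# morph difference level within a condition; these rates are made up.
fa_rate = 0.10
hit_rates = {10: 0.25, 20: 0.50, 30: 0.75, 40: 0.90, 50: 0.97, 60: 1.00}
sensitivity = {level: d_prime(h, fa_rate) for level, h in hit_rates.items()}
```

Because the false-alarm rate is fixed within a condition, differences in d' across morph difference levels reflect differences in hit rate only.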

Experiment 3

Participants
Fifteen new volunteers from Newcastle University participated in Experiment 3. Their age ranged between 21 and 59 years.

Design and procedure
The participants in Experiment 3 performed a pairwise similarity rating task. Picture-plane orientation (facelike, nonfacelike) was blocked with block order counterbalanced across participants. Twelve morph stimuli were randomly selected from the 198 possible stimuli. There were four repetitions of the 66 possible pairs of these 12 stimuli for a total of 264 trials per block. All the trials were run in a random order for each participant. At the beginning of each trial, a white fixation cross was presented at the centre of the screen for 500 ms. After the fixation cross disappeared, two stimuli appeared simultaneously side by side. One stimulus was shifted 450 pixels to the left of fixation, and the other stimulus was shifted by the same amount to the right. The participants were instructed to rate how similar the two stimuli appeared to them using a rating scale of 1 (very similar) to 7 (very different). The stimuli remained on the screen until they responded. Prior to each experimental block, the participants completed six practice trials using a different set of stimuli to familiarize themselves with the procedure and rating scale. Each block (including practice trials) took approximately 20 min to complete.
Figure 6. The sensitivity data (dprime) from Experiment 2 averaged across participants as a function of morph difference level.

Results and discussion
Comparison of similarity ratings between facelike and nonfacelike objects
For each participant, we averaged the similarity ratings for each of the 66 stimulus pairs, separately for the facelike and nonfacelike orientations. We then averaged the mean rating for each stimulus pair across participants. For visualization purposes, the resulting group means were then normalized to the range 0 to 1 (1 = most similar). Figure 7 shows the normalized pairwise mean similarity ratings for the two stimulus orientations.
The average pairwise similarity ratings did not differ between the two stimulus orientations (facelike: M = 3.9, SE = 0.2, range: 3.1 to 5.0; nonfacelike: M = 4.0, SE = 0.2, range: 3.2 to 5.0; t(14) = .95, p = .36). To compare the pattern of participants' similarity ratings between the facelike and nonfacelike orientations, we computed, for each participant, the Pearson correlation between the average similarity ratings for corresponding stimulus pairs in the facelike and nonfacelike orientations. The Pearson correlation averaged across all participants was r = 0.89 (SE = 0.01). This was significantly greater than r = 0, t(14) = 73.6, p < .001. Thus, stimulus pairs in the facelike orientation that were rated as highly similar tended to be also rated as highly similar in the nonfacelike orientation.
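The per-participant correlation analysis described above can be sketched in Python. The sketch assumes that each participant contributes one mean rating per stimulus pair in each orientation, with pairs listed in the same order in both; the function names and the toy ratings are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def mean_orientation_correlation(facelike, nonfacelike):
    """Mean over participants of each participant's facelike/nonfacelike r."""
    rs = [pearson_r(f, n) for f, n in zip(facelike, nonfacelike)]
    return sum(rs) / len(rs), rs


# Toy data: 2 hypothetical participants, 4 stimulus pairs each.
facelike = [[1.0, 3.0, 5.0, 7.0], [2.0, 2.5, 6.0, 6.5]]
nonfacelike = [[1.5, 3.5, 5.5, 7.5], [2.0, 3.0, 5.5, 7.0]]
mean_r, per_participant_r = mean_orientation_correlation(facelike, nonfacelike)
```

The individual correlations in `per_participant_r` are the values that would then be compared against zero with a one-sample t-test across participants.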

Comparison of perceptual similarity with physical-similarity measures
We also compared participants' similarity ratings to two different measures of physical similarity. The first physical-similarity measure is the Euclidean distance in parameter space between each stimulus pair. To compute this measure of 3D shape similarity, we took the vector of parameter values for Object A ($[p_{A1}, p_{A2}, p_{A3}, \ldots, p_{A19}]$) and Object B ($[p_{B1}, p_{B2}, p_{B3}, \ldots, p_{B19}]$) and computed the Euclidean distance as:

$$d(A, B) = \sqrt{\sum_{i=1}^{19} \left(p_{Ai} - p_{Bi}\right)^{2}}$$

Figure 7. The similarity judgment results for the facelike and nonfacelike orientations from Experiment 3. The pairwise similarity matrix represents all possible pairings of the 12 objects used in Experiment 3. Each row and corresponding column represents a single object. The 12 objects are arbitrarily ordered along the rows and columns. The colour scale in this and subsequent figures represents the similarity rating (scaled to between 0 and 1 [highly similar]) averaged across participants.

Participants do not have direct access to an object's 3D shape parameters but only to the resulting rendered image of that object (see Figure 2). Therefore, we also computed a physical-similarity measure based on 2D images. A common image-based measure of physical similarity is to compare the response of Gabor filters to pairs of images (e.g., Lades et al., 1993; Yue et al., 2012). Figure 8 illustrates some of the steps in this computation. First, for ease of computing this measure of 2D image similarity, we converted all images to 256-level greyscale images. Second, we segmented the object from the background and created binary masks. Third, we placed Gabor jets on a uniform grid covering the entire image (Figure 8A). Each jet consists of four spatial scales (8, 16, 32, and 64 pixels) and six spatial orientations (0°, 30°, 60°, 90°, 120°, and 150°). Figure 8B illustrates the Gabor filters comprising a jet, arranged in a 4 scales × 6 orientations layout. Fourth, to get the responses of the filters to an image, we convolved each image with each filter. Fifth, we extracted the responses only from Gabor jets which fall in the union of the two binary masks (Figure 8A). The extracted jets form a high-dimensional feature vector (number of elements = 4 scales × 6 orientations × number of included jets), one vector for each image. Lastly, the similarity between two feature vectors, $J_A$ and $J_B$, is computed as:

$$\text{Gabor similarity}(J_A, J_B) = 1 - \frac{J_A \cdot J_B}{\lVert J_A \rVert \, \lVert J_B \rVert}$$

where the fraction is the cosine of the angle between the two vectors. If two vectors are identical, the angle between them is 0° and so the cosine of the angle will be 1 (Gabor similarity = 0). If the two vectors are maximally different, the vectors will be orthogonal to each other (i.e., the angle between them would be 90°) and so the cosine of the angle will be 0 (Gabor similarity = 1). The correlation between the 2D and 3D measures of physical similarity for the 66 unique stimulus pairs was r = 0.54. To compare perceptual similarity and these two physical-similarity measures, we first computed the correlation between similarity ratings and each measure separately for each participant and each stimulus orientation.
There were thus four correlation coefficients for each participant (2 stimulus orientations × 2 physical-similarity measures). For each condition, we averaged these correlations across participants and tested whether they differed significantly from r = 0. Table 4 presents the correlation between similarity ratings and the 2D and 3D similarity measures, averaged across participants for the facelike and nonfacelike orientations. Table 5 shows the pairwise t-tests between all conditions in Table 4. Figure 9 shows one representative participant's similarity matrices for the facelike and nonfacelike orientations, along with the similarity matrices derived from the two physical-similarity measures. All the correlations between perceptual similarity and the two physical-similarity measures were significantly different from zero. Furthermore, there were no significant differences in the pairwise correlations across the conditions. Thus, participants' judgements of perceptual similarity were linearly related to both the 3D shape parameters and 2D images but, importantly, these judgements were not influenced by stimulus orientation.

Figure 8. Illustration of computing the Gabor similarity measure. The silhouettes of two objects are superimposed (A). The yellow region represents their overlap whereas the red and green regions represent non-overlapping regions for each object. The Gabor jets are placed on a uniform grid (circles). Only jets that fall within a coloured region are used to compute similarity. Each Gabor jet is represented by four spatial scales and six spatial orientations (B).
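The two physical-similarity measures used in this section can be sketched in a few lines. The 3D measure is a plain Euclidean distance between parameter vectors. For the 2D measure, the sketch assumes the feature vectors have already been extracted from the Gabor jets and that the measure is one minus the cosine of the angle between them, which matches the endpoints given in the text (0 for identical vectors, 1 for orthogonal vectors). The function names and example vectors are illustrative, and the Gabor filtering itself is not shown.

```python
import math


def euclidean_distance(p_a, p_b):
    """Euclidean distance between two shape-parameter vectors."""
    if len(p_a) != len(p_b):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_a, p_b)))


def gabor_dissimilarity(j_a, j_b):
    """One minus the cosine of the angle between two feature vectors.

    Assumption: j_a and j_b hold non-negative filter response magnitudes
    taken from the same set of jets, so the result lies in [0, 1].
    """
    if len(j_a) != len(j_b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(j_a, j_b))
    norm_a = math.sqrt(sum(a * a for a in j_a))
    norm_b = math.sqrt(sum(b * b for b in j_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


# Toy vectors (the study used 19 shape parameters per object).
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))    # 5.0
print(gabor_dissimilarity([1.0, 0.0], [0.0, 1.0]))   # 1.0 (orthogonal)
```

Because both functions return larger values for less similar pairs, they can be correlated directly with ratings on the 1 (very similar) to 7 (very different) scale.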

Post-experiment facelikeness rating
We presented six participants from Experiment 2 and seven participants from Experiment 3 with two objects at the end of their experimental task. One object was presented in the nonfacelike orientation and the other object was presented in the facelike orientation. For each object, participants were instructed to judge how facelike the stimulus appeared (with no further instruction) on a scale of 1 (very facelike) to 7 (not facelike at all). The specific objects were randomly selected from those they had seen during the main experiment and each object was randomly assigned to one of the orientations. The presentation order was counterbalanced across participants. Participants rated the facelike orientation (M = 2.5, SE = .9) as significantly more facelike than the nonfacelike orientation (M = 6.1, SE = .9), t(12) = 13.6, p < .001.

Figure 9. Representative similarity matrices from a single participant. The participant's pairwise similarity matrices for the facelike and nonfacelike orientations are shown (top). The correlations between the participant's similarity judgments and the 2D and 3D similarity measures are also presented (r 2D = correlation between perceptual similarity and 2D image similarity; r 3D = correlation between perceptual similarity and 3D shape similarity). The pairwise similarity matrices based on the 2D and 3D similarity measures are also shown (bottom).

General discussion
We introduce a new parametric multipart object set which allows researchers to test different features for visual recognition and visual learning. Here we used our object set to test the role of the spatial configuration of parts in visual recognition. Critically, we demonstrated that perceived facelikeness facilitated performance on a perceptual matching task (Experiment 1). That is, we demonstrated that physical and perceptual resemblance to upright faces (particularly in the configuration of parts) can play an important role in visual recognition when other factors such as physical and perceptual similarity are equated (Experiments 2 and 3).
The findings of Experiment 1 are consistent with the suggestion that physical and perceptual resemblance to faces is an important factor both in visual recognition and visual learning (Biederman & Kalocsai, 1997). They are also consistent with previous studies which found that facelikeness for a range of nonface objects can lead to behavioural and neural responses that are similar to responses to faces (Brants et al., 2011;Caharel et al., 2013;Churches et al., 2014;Davidenko et al., 2012;Gauthier et al., 2004;Hadjikhani et al., 2009;Liu et al., 2014;Meng et al., 2012;Rossion et al., 2011;Shafto et al., 2015). However, our study is the first to use novel objects and equate perceptual similarity and low-level features that may drive physical similarity (e.g., for novel objects: curvilinear edges; Shafto et al., 2015; see also Davidenko et al., 2012;Yue et al., 2014). In particular, we showed that stimulus orientation did not influence performance in a discrimination task (Experiment 2) and in an explicit similarity-rating task (Experiment 3).
We note that there are alternative interpretations of the same objects in the two orientations other than their perceived facelikeness. For instance, objects in the nonfacelike orientation can appear creature-like with a "face" restricted to the upper half of the object. This interpretation may affect how observers parse the parts and derive their spatial configuration for the same objects at each orientation. The two orientations may also lead to the objects having different expressiveness (e.g., some objects in the facelike orientation appear to be "smiling" while the same objects in the nonfacelike orientation appear to be "frowning"). However, there remains an element of facelikeness in both of these alternative interpretations of the two orientations. It is also possible that objects in the facelike orientation are perceived to be less physically stable (i.e., can fall over) than objects in the nonfacelike orientation. This perceived instability has been shown to automatically attract observers' attention (Firestone & Scholl, 2016).
Compared to most nonface objects, faces form a visually homogeneous category in that individual faces share a high degree of physical similarity, both in terms of the facial parts and, particularly, in terms of the spatial configuration of those parts. Not surprisingly, several researchers have proposed that physical similarity is a critical difference between faces and nonface objects and can be a strong factor that drives the difference in the way that the two stimulus categories are represented (Brants et al., 2011; Cutzu & Edelman, 1996; Damasio et al., 1982; Faust, 1955; Gauthier et al., 1999b; Op de Beeck et al., 2006; Vuong et al., 2012). However, neuropsychological studies have shown that physical similarity is not the only stimulus property that can lead to these differences in representation. For instance, brain-damaged patients with prosopagnosia can discriminate visually similar items such as cars, but not faces (Busigny et al., 2010a; Busigny et al., 2014; Busigny, Joubert, Felician, Ceccaldi, & Rossion, 2010b). Moreover, when physical similarity is parametrically manipulated within categories (faces or cars), such patients show deviations from normal range performance only for faces (Busigny et al., 2010a, 2010b, 2014).

Our object set currently has 12 prototypes and 198 objects derived from these prototypes rendered from 24 viewpoints. This is a particularly large set compared to novel object sets used in previous studies (e.g., 40 non-parametrized individual Greebles;  and subsequent studies). Moreover, one of the aims of this study was to present a procedure to allow researchers to create new object sets according to their research questions. We have therefore set up the 3D models so that researchers can easily generate these objects in a flexible yet systematic way.
For example, they can create more prototypes, change the range of parameter values, add additional shape parameters, or add more parts. Researchers can also change the spatial configuration of the parts, for example, by systematically shifting the spatial position of the parts along the surface of the body (see Experiment 2). Lastly, researchers can manipulate the texture that is mapped onto the body and parts, and they can create random individual variations between objects by adding 3D shape noise to the body and parts (see Experiment 2). Their selection of objects can further be guided by calculating 3D shape and 2D image similarity measures between object pairs during stimulus generation.
Our findings suggest that physical and perceptual resemblance to faces based on the spatial configuration of parts is an important factor in visual recognition (Biederman & Kalocsai, 1997). Furthermore, the rich degree of flexibility in creating an arbitrarily large object set afforded by the procedure presented here will allow researchers to address issues such as these and further understand the representations that support visual recognition and visual learning.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by the Economic and Social Research Council [grant number ES/J009075/1] to QCV and BR, and by the Belgian Science Policy Office [grant number IAP P7/33] to BR and AL.