Do capuchin monkeys (Sapajus apella) use exploration to form intuitions about physical properties?

ABSTRACT Humans’ flexible innovation relies on our capacity to accurately predict objects’ behaviour. These predictions may originate from a “physics-engine” in the brain which simulates our environment. To explore the evolutionary origins of intuitive physics, we investigated whether capuchin monkeys’ object exploration supports learning. Two capuchin groups experienced exploration sessions involving multiple copies of two objects: one object was easily opened (functional), the other was not (non-functional). We used two within-subject conditions (enrichment-then-test, and test-only) with two object sets per group. Monkeys then underwent individual test sessions where the objects contained rewards, and they chose one to attempt to open. The monkeys spontaneously explored, performing actions which yielded functional information. At test, both groups chose functional objects above chance. While high performance of the test-only group precluded us from establishing learning during exploration, this study reveals the promise of harnessing primates’ natural exploratory tendencies to understand how they see the world.

It has been suggested that one way in which humans make accurate predictions about how objects will behave is by having a "physics engine" in the brain (Battaglia et al., 2013; Fischer et al., 2016). There is evidence that humans make predictions about how objects will behave in a way that is remarkably similar to the kind of physics engines employed in computer games (Fischer et al., 2016; Ullman et al., 2017). From an evolutionary perspective, an ability to simulate the environment, to make predictions and design interventions, would seem to be of particular adaptive value to organisms like humans, which exploit the environment in innovative and flexible ways. There is evidence to suggest that our primate relatives are also capable of making inferences about object-object interactions, for example, to locate food rewards based on indirect information such as an inclined board or a noisy shaken cup (Völter & Call, 2017), or to anticipate how one object will affect another to avoid a trap or choose an effective tool (e.g., Jordan et al., 2020). However, in the context of a task where good performance is rewarded with food, it can be difficult to rule out other explanations for successful performance, such as learning which perceptual features are associated with reward. To understand the evolutionary origins of intuitive physics, it is therefore desirable to study how non-human primates learn about the physical environment outside of the context of locating or releasing food rewards. One possibility is to examine their natural tendency to explore objects.
From the child development literature, there is evidence to suggest that some of our basic intuitions about the physics of the world are present from birth (Spelke & Kinzler, 2007). However, our intuitions about how objects in the environment will behave are constantly fine-tuned by the information gathered via our life experiences. Owing to our seemingly innate desire to explore, human curiosity drives us towards new information in the environment from the moment we are born. It has been proposed that children are able to generate their own learning opportunities (Weisberg et al., 2016) and, almost from birth, human infants naturally show systematic attention towards the most learnable parts of an environment. In response to novel environments and objects, an infant's motor activity will increase and become more variable (de Almeida Soares et al., 2013; Molina & Jouen, 2004; Ruff, 1984). When shown events in which objects act in unexpected ways, infants will look longer than at events without irregular or unexpected movements (Spelke, 1991; Wang et al., 2016) and in some cases even tailor their object manipulation to investigate the strange behaviour (Stahl & Feigenson, 2015). By three years of age, humans appear to seek out explanations, leading some researchers to describe children as "mini scientists" (Gopnik et al., 1999). There is now a large body of evidence to show that when provided opportunities to freely explore, pre-schoolers will spontaneously choose to explore confounded or belief-violating objects (Bonawitz et al., 2012; Cook et al., 2011; Gopnik et al., 2001; Schulz & Bonawitz, 2007; Sim & Xu, 2017; Stahl & Feigenson, 2015; van Schijndel et al., 2015) and that the new information is incorporated into their knowledge of the world (Bonawitz et al., 2012; McCormack et al., 2016; Sim & Xu, 2017).
Yet humans are not the only animals living in complex environments, nor are we the only species with a strong tendency to explore. Most animals exhibit some form of exploration, and although in many instances non-human animals' exploration behaviour can be explained by either a foraging strategy or some other search for valuable resources, in some cases their exploration appears to be intrinsically motivated. Within captive environments, studies have shown that the complexity of an enclosure has a more significant effect on welfare than its size (Kerl & Rothe, 1996) and that primates prefer to interact with novel and complex objects (Boinski et al., 1999; Brunon et al., 2014; Snowdon & Savage, 1989; Woolverton et al., 1989). Ballesta et al. (2014) showed that, independently of any food or social motivation, macaques treat novel objects as valuable resources. Further work has shown that both Barbary macaques and capuchin monkeys will spend more time exploring responsive than unresponsive objects (Polizzi di Sorrentino et al., 2014; Vick et al., 2000). In chimpanzees, after spending time interacting with sticks, all the individuals tested were better able to use similar sticks to solve a tool-use problem (Birch, 1945). It seems that, in line with the findings from human infants, primates preferentially interact with objects which provide the greatest potential for learning. Perhaps, as has been suggested to be the case for children, their intrinsically motivated exploration is driven by a desire to gather information about the world.
Despite the evidence that, in humans, intrinsically motivated exploration plays a key role in how an individual builds up their intuitive understanding of the world, animal exploration has rarely been tested for learning. Many animal cognition paradigms revolve around food rewards in order to promote participation. In addition, exploration under controlled conditions can be difficult to elicit. There is some work to suggest that non-human animals are not intrinsically motivated to explore objects and their properties and will only do so in response to either perceptual novelty or an extrinsic motivator (capuchin monkeys: Edwards et al., 2014; chimpanzees: Povinelli & Dunphy-Lelii, 2001). However, a recent study showed that macaques will sacrifice rewards in order to obtain counterfactual information (Wang & Hayden, 2019), suggesting that they were motivated to gather information or explanations. Moreover, recent studies with apes and capuchin monkeys have shown that these non-human primates are not only intrinsically motivated to explore objects and discover their properties, but that they are then able to use the information discovered during problem solving (Ebel & Call, 2018; Polizzi di Sorrentino et al., 2014; Taffoni et al., 2017). When given the opportunity to explore an unbaited collapsible platform box, individuals from all four species of great ape explored it and discovered the mechanism. Consequently, when a food reward was placed into the box, those who had had the opportunity for exploration retrieved the reward faster (Ebel & Call, 2018). Similarly, when capuchin monkeys were presented with a mechatronic board which they could freely explore, they were motivated to spontaneously engage with the board and discovered that manipulation of specific modules of the board led to a reward chamber opening (despite no rewards being present inside the chamber).
Once rewards were introduced to the board in the test phase, the monkeys were then able to retrieve the rewards faster than a control group (Taffoni et al., 2017). Similarly, in an experiment with capuchin monkeys, Polizzi di Sorrentino et al. (2014) showed that after experiencing action-outcome contingencies in an unrewarded setting, capuchin monkeys were able to act on these contingencies to retrieve rewards during a test phase. These studies provide support for the idea that reinforcement is not always necessary for exploration and learning, and that primates can use exploration to update their knowledge. However, in these studies it is difficult to know what has been learned by the participants, as the dependent variable is often speed of action production (for an exception, see Polizzi di Sorrentino et al. (2014), who also recorded the number of rewards retrieved). One study attempted to address this by including both an exploration and a problem-solving component involving discrimination (Lambert et al., 2017). Kea and New Caledonian crows were given objects to explore before experiencing a problem-solving task where some of the objects could be used as tools. Following this experience, the birds were presented with two tests: one where they could use the objects explored to solve a tool-use task, and a second where they had to solve the tool-use task using novel objects. The authors found that the birds performed better on the test when it involved the objects they had explored compared to novel objects. These results suggest that Kea and New Caledonian crows are able to learn about object properties during their exploration and then apply their knowledge to problem solving. In this experiment, we aimed to use a similar method with capuchin monkeys to examine whether exploration could lead to an ability to discriminate between objects on the basis of their physical properties.
The experiment consisted of two within-subject conditions (enrichment-then-test, and test-only). During the enrichment stage, we presented two groups of capuchin monkeys with unrewarded objects which differed in their physical properties and analysed their exploration behaviour. Following exploration, we then presented the monkeys with a test in which rewards were added to the objects, and we investigated whether or not monkeys that had had the opportunity to explore the objects (those in the enrichment-then-test condition) would be more successful in selecting the object from which a reward could easily be extracted than monkeys to whom the objects were novel (those in the test-only condition). Capuchin monkeys are renowned for their tool use in the wild and exhibit object manipulations as complex as those of the great apes (Torigoe, 1985; Truppa et al., 2019). This combination of high rates of object manipulation alongside their ability to exploit the functional properties of the objects involved in their tool use (e.g., the weight of stones to crack nuts, Visalberghi et al., 2009; Visalberghi & Neel, 2003) makes capuchins a good candidate species to look for evidence of learning via intrinsically motivated exploration. If, like humans, the capuchin monkeys are capable of some level of intuitive physics, then it is likely that this intuition is built up via similar mechanisms of using exploration to gather information which leads to updated intuitions and predictions. Therefore, we predict that the capuchin monkeys in this study will be motivated to explore the objects and that their exploration behaviour will enable them to discover and learn the objects' properties required to retrieve a reward.

Subjects and housing
All the monkeys participating in the study were housed at the University of St Andrews' "Living Links to Human Evolution" research centre located within the Royal Zoological Society of Scotland's Edinburgh Zoo. At the centre, the monkeys live in two mixed-species communities made up of common squirrel monkeys (Saimiri sciureus) and brown tufted capuchin monkeys (Sapajus apella). Both groups' enclosures consist of an indoor capuchin area (7 m by 4.5 m by 6 m high), to which both species have access; an indoor squirrel monkey enclosure (5.5 m by 4.5 m by 6 m high), to which only the squirrel monkeys have access; and a large shared outdoor area (approximately 900 m²), consisting of natural vegetation and climbing structures. Situated between the indoor areas is a research room where, at specified research times, the monkeys have access to their testing cubicles which they enter voluntarily and are able to leave at any time.
Enrichment sessions took place within the monkeys' outdoor area once a week between 11.15 am and 12.45 pm on either a Monday, Tuesday, or Wednesday. Test sessions took place in their testing cubicles up to five days a week, twice a day between 11.15 am-12.45 pm and 2.15 pm-4 pm Monday to Friday. Participants came from both of the groups at the research centre: the East group and the West group. The groups live in adjacent enclosures that are mirror images of each other, under identical housing conditions and in similarly sized social groups. All participation in experiments was voluntary and all food rewards provided (peanuts, raisins, and sunflower seeds) were supplemental to the monkeys' daily diet.
We tested both groups of capuchin monkeys in March 2019, with the entirety of both groups participating in enrichment sessions in their outdoor enclosures. Participation in the test sessions was voluntary and so we tested only those monkeys who chose to enter the testing room and participate. The sample size per condition and object type was as follows: enrichment-then-test: bottle-bunting: n = 10, water-bottles: n = 14, plastic-tubs: n = 10, cardboard-boxes: n = 14; test-only: bottle-bunting: n = 11, water-bottles: n = 12, plastic-tubs: n = 14, cardboard-boxes: n = 11. Fifteen monkeys from each group (N = 30; mean age: 10.8 years, range: 5-23 years; 12 females, 18 males) participated individually in at least one test session in the testing cubicles.

Apparatus
We designed 4 paired sets of objects (Figure 1): "bottle-bunting", "water-bottles", "plastic-tubs", and "cardboard-boxes". The objects were filled with non-food material (sawdust, straw and blue stones). Each pair contained one set of "functional" objects from which the material could be extracted easily, and one set of "non-functional" objects from which the material could not be extracted easily.
For the bottle-bunting and water-bottle sets, the functional objects had holes in either the top or sides of the bottles so that spinning the objects enabled the monkeys to retrieve their contents. The non-functional objects had no holes, so the filling was difficult to access. The bottle-bunting set consisted of 1 L plastic bottles filled with sawdust and strung up with 4 bottles of the same type per string and two strings of each type provided in the enrichment session (a total of 16 objects: 8 functional and 8 non-functional). The functional and non-functional bottles differed in their texture, shape and markings (Figure 1). The water-bottle set consisted of 5 L water containers filled with straw and skewered onto a bamboo cane with 2 containers of the same type per cane, and 2 canes of each type provided in the enrichment session (a total of 8 objects: 4 functional and 4 non-functional). The functional and non-functional bottles differed in their transparency and flexibility, the colour of the lids, and the shape of the bottles (Figure 1).
For the plastic-tub and cardboard-box sets, the functional objects were built to be easy to rip open, so that pulling the objects open enabled the monkeys to retrieve their contents. The non-functional objects were made of stronger materials, so the filling was difficult to access. The plastic-tub set consisted of clear plastic tubs with a volume of 500 ml, filled with sawdust and attached to a bamboo cane, with 6 tubs of the same type per cane and 2 canes of each type provided in the enrichment session (a total of 24 objects: 12 functional and 12 non-functional). The functional and non-functional tubs differed in their shape and markings (Figure 1). The cardboard-box set consisted of cardboard containers with a diameter of 10 cm, filled with straw and attached to a bamboo cane, with 6 boxes of the same type per cane and 2 canes of each type provided in the enrichment session (a total of 24 objects: 12 functional and 12 non-functional). The functional and non-functional boxes differed in their shape and colour (Figure 1). Although the number of individual objects provided varied between the object sets, the objects were always presented to the monkeys on 4 canes (2 functional and 2 non-functional). The size of the objects determined how many copies were present on the cane, as we wanted to ensure that all monkeys had access to the objects, either because there were multiple copies of the smaller objects or because the objects were large enough for multiple monkeys to interact with each object at one time.
During enrichment phases, the objects were presented in the outdoor enclosure with the addition of blue stones inside them in order to attract the attention of the monkeys. During test phases, the objects were presented in the research rooms, on the experimenter's side of the research cubicles, with the addition of a raisin and flaked maize inside them to act as a reward for the monkeys.

Experimental design & procedure
The experiment consisted of two conditions (enrichment-then-test, and test-only) with each of the monkey groups receiving each condition twice with different object sets. For the East group, the enrichment-then-test conditions involved the bottle-bunting and plastic-tub object pairs and the test-only conditions involved the water-bottles and cardboard-boxes object pairs. The object pairs were reversed for the West group.
The enrichment-then-test condition involved an initial morning enrichment session (11.30 am-12.30 pm), followed by test sessions at 2.15 pm-3.45 pm the same day and 11.15 am-12.45 pm the following morning. The test-only condition involved just one test session without the monkeys having any prior experience of the objects. This allowed us to compare learning after experience with the objects to a baseline without this prior experience. This was important because the monkeys may have had biases for one object of the pair, or may have been able to learn over the 10 test trials which one was rewarded. Using multiple object pairs meant that we could make this comparison without confounding the question with cohort differences.
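The counterbalancing described above can be written down as a small lookup table. The following is a minimal sketch only: the dictionary layout and the helper names `condition_for` and `is_counterbalanced` are illustrative assumptions, not part of the study's materials.

```python
# Hypothetical encoding of the counterbalanced design described in the text.
# Group names, object sets and conditions come from the text; the data
# structure and helper functions are illustrative assumptions.
OBJECT_SETS = ["bottle-bunting", "water-bottles", "plastic-tubs", "cardboard-boxes"]
CONDITIONS = ["enrichment-then-test", "test-only"]

DESIGN = {
    "East": {
        "enrichment-then-test": ["bottle-bunting", "plastic-tubs"],
        "test-only": ["water-bottles", "cardboard-boxes"],
    },
    "West": {
        "enrichment-then-test": ["water-bottles", "cardboard-boxes"],
        "test-only": ["bottle-bunting", "plastic-tubs"],
    },
}


def condition_for(design, group, object_set):
    """Return the condition under which a group met a given object set."""
    for condition, sets in design[group].items():
        if object_set in sets:
            return condition
    raise KeyError(f"{object_set!r} is not assigned for group {group!r}")


def is_counterbalanced(design):
    """Check that every object set appears once in each condition across
    groups, and that every group receives each condition exactly twice."""
    for object_set in OBJECT_SETS:
        seen = sorted(condition_for(design, g, object_set) for g in design)
        if seen != sorted(CONDITIONS):
            return False
    return all(
        len(design[g][c]) == 2 for g in design for c in CONDITIONS
    )
```

Because each object set is met under both conditions (by different groups) and each group contributes to both conditions, a difference between conditions cannot be attributed to one group or one object set alone.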
Enrichment sessions consisted of one of the object sets being set up in the monkeys' outdoor enclosure at 4 pre-determined locations. Locations were chosen based on their visibility from the observation balcony whilst making sure locations were spread equally between areas where low- and high-ranking individuals were often found; see supplementary material S1 for the location of each enrichment site in the enclosures as well as the location of the cameras, based on a figure from Leonardi et al. (2010). The cameras were all started before the outdoor enclosure was entered in order to capture the monkeys' initial reactions to the objects. The monkeys' behaviour was recorded for a maximum of 1 h; however, if a period of 10 min passed with no individuals interacting with the objects, the session ended early. The enrichment objects were removed at the end of every session.
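The stopping rule for enrichment sessions (a 1 h cap, or 10 min with no interaction) can be sketched as a short function. This is an illustration under stated assumptions: interaction bouts are pooled over all monkeys as hypothetical `(start, end)` pairs in seconds, and the idle clock is assumed to start at the beginning of the session, which the text does not specify.

```python
def session_end(interactions, max_s=3600, idle_s=600):
    """Return the time (seconds from session start) at which recording stops.

    interactions: list of (start_s, end_s) bouts pooled over all monkeys.
    max_s: maximum session length (1 h in the text).
    idle_s: idle period that ends the session early (10 min in the text).
    """
    last_active = 0  # assumption: the idle clock starts at session start
    for start, end in sorted(interactions):
        if start >= max_s:
            break  # this bout would begin after the 1 h cap
        if start - last_active >= idle_s:
            # 10 idle minutes elapsed before this bout began
            return last_active + idle_s
        last_active = max(last_active, min(end, max_s))
    return min(max_s, last_active + idle_s)
```

For example, with bouts at 0-100 s and 800-900 s, the gap after the first bout exceeds 10 min, so this sketch ends the session at 700 s and the second bout is never recorded.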
Test sessions were identical for the enrichment-then-test and test-only conditions. At the beginning of a session, the monkey voluntarily entered the testing cubicles and was isolated by the experimenter in a cubicle with two holes in the window located at equal distances from the centre, at hip height for a sitting monkey. A table was positioned in front of the window and the experimenter held up one functional and one non-functional object (of the pair). The monkey's name was called and the objects were then shaken 3 times to show the monkey that they both contained food. The objects were then placed on the table simultaneously and in one smooth movement so that they were now within reach of the monkey, one in front of each of the window's holes. The experimenter said the command "choosing" and the monkey reached out for one of the objects. The verbal command was used because, in a previous task, the monkeys had been trained that this command signalled that a demonstration phase was over and that they were now able to interact with objects on the table. In most cases the chosen object was then moved to the next window (with a bigger hole) to enable the monkey to interact with the object to try and retrieve the rewards (in some cases the monkeys were so enthusiastic that they refused to let go of the objects for them to be placed at the larger-hole window and so continued to manipulate them through the smaller hole). If the functional object was chosen, the monkeys were able to retrieve the rewards; however, if the non-functional object was chosen, the monkeys were given up to 30 s to interact and experience that they could not retrieve the reward before moving on to the next trial.
Although a small number of individual monkeys had managed to persevere and open these non-functional objects during the enrichment sessions, the setup of the testing cubicles prevented them from getting the correct grip during testing and all monkeys gave up trying to retrieve rewards from these objects within the 30 s interaction time. Subjects were given one session of 10 trials for each of the object pairs.

Coding
All enrichment sessions were videotaped and coded by a single experimenter using the same coding scheme (see supplementary material S2 for the coding scheme and video examples of each of the behaviours). We coded 11 different behaviours which fell into 4 broader categories of behaviour. (1) Sensory exploration. This included visual inspection, licking and sniffing the objects. (2) Object manipulation. This included banging, spinning, biting, and pulling apart the objects. (3) Filling manipulation. This included removing or attempting to remove the filling material from the objects. (4) Other. This included transporting the objects and any other behaviour the monkeys directed towards the objects. Of the object manipulation behaviours, the critical manipulation for gaining access to the objects' filling differed between objects. For the bottle-bunting and water-bottles the monkeys were required to spin the bottles to retrieve the filling, whereas for the cardboard-boxes and plastic-tubs the monkeys needed to pull open the objects to access the contents. A second coder scored 25% of all trials from the recorded video material to establish inter-observer reliability. Fleiss' kappa was calculated, and according to Landis and Koch (1977), inter-observer reliability was "almost perfect" (correct choice: K = 0.97, p < 0.001).
All test sessions were videotaped and coded by a single experimenter. Trials were coded as correct if the monkey reached out towards the functional object, and incorrect if they reached out towards the non-functional object. As with enrichment sessions, a second coder scored 25% of all trials from the recorded video material to establish inter-observer reliability. Fleiss' kappa was calculated, and according to Landis and Koch (1977), inter-observer reliability was "almost perfect" (correct choice: K = 0.97, p < 0.001).
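For readers unfamiliar with the reliability statistic, the following is a minimal sketch of Fleiss' kappa for categorical codes. It is not the authors' script: the function name and the input layout (one tuple of rater codes per trial) are assumptions, and the example codes ("F" for functional, "N" for non-functional) are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for items each coded by the same number of raters.

    ratings: list of tuples, one per item, holding each rater's category.
    Returns NaN when every rating falls in a single category, because
    chance agreement is then 1 and kappa is undefined.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item
        agreement_sum += (
            sum(c * c for c in counts.values()) - n_raters
        ) / (n_raters * (n_raters - 1))

    p_observed = agreement_sum / n_items
    p_chance = sum(
        (total / (n_items * n_raters)) ** 2
        for total in category_totals.values()
    )
    if p_chance == 1.0:
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical codes from two coders for four trials
example = [("F", "F"), ("F", "N"), ("N", "N"), ("N", "N")]
kappa = fleiss_kappa(example)
```

On the Landis and Koch (1977) scale cited in the text, values above 0.80 are labelled "almost perfect".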

Analysis
The analysis was carried out in RStudio (R Studio Team, 2019) using R (version 3.6.2; R Core Team, 2019). To analyse the monkeys' exploration, we looked at the average length of time spent interacting with the objects, and what proportion of the time spent manipulating the monkeys spent doing each of the manipulation actions.
To assess whether the monkeys had learnt anything during the enrichment sessions we took the scores from the test sessions and ran one-sample t-tests to compare their performance to chance. Following this, we investigated the effects of condition (enrichment-then-test and test-only), object type and trial number on the performance of the monkeys using generalized linear mixed models (GLMMs). All GLMMs had binomial error structure and logit link function with choosing the functional object as the dependent variable (DV) and included monkey ID as a random effect. In all cases, trial number was z-transformed to a mean of zero and a standard deviation of 1. GLMMs 1-3 used the data from all four object types and included the test predictor variables condition (test-only, or enrichment-then-test), object type (bottle-bunting, water-bottles, plastic-tubs or cardboard-boxes), and trial number, as well as the random slopes of object type within monkey ID. We first attempted to fit the maximal model with random slopes of trial number and condition; however, after encountering convergence issues these were removed. GLMM 1 included a three-way interaction between the predictors; however, this interaction was not significant and so we fitted GLMM 2, which included only the two-way interactions between the predictors. Again, these two-way interactions were not all significant and so we fitted GLMM 3, which included only the interaction between object type and trial number, with condition included as a main effect. GLMMs 4-7 acted as post hoc tests to investigate the effect of condition within each object type. We subset the data into the four object types and fitted a model with condition and trial number as the test predictors, as well as monkey ID as a random effect.
All the models were fitted using the R function lmer of the package lme4 (Bates et al., 2015) and we tested the effect of the fixed effects using likelihood ratio tests comparing the full model with reduced models lacking the respective fixed effect. We assessed model stability by comparing the estimates of the model that was based on the complete data set with estimates obtained from models with each subject excluded one at a time. All models were stable for all fixed effects and the variance inflation factors confirmed there were no problems of collinearity (for all models VIF = 1 for all fixed effects).
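Two of the preprocessing and testing steps above are simple enough to illustrate directly. The sketch below is not the authors' R code; it shows, in Python with the standard library only, a z-transformation of trial number and the t statistic of a one-sample test against chance (0.5). The function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which was used.

```python
import math
import statistics


def z_transform(values):
    """Rescale values to mean 0 and standard deviation 1.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def one_sample_t(scores, chance=0.5):
    """Return (t statistic, degrees of freedom) for a one-sample t-test
    comparing the mean of per-monkey scores with a chance level."""
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    t_stat = (statistics.mean(scores) - chance) / standard_error
    return t_stat, n - 1


# Trial numbers 1..10, as in a 10-trial test session
trial_z = z_transform(list(range(1, 11)))

# Hypothetical proportions of functional-object choices for two monkeys
t_stat, df = one_sample_t([0.6, 0.8])
```

Centring and scaling trial number puts its coefficient on a per-standard-deviation scale and tends to help mixed models converge, which is the usual motivation for this step.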

Exploration
In all enrichment sessions, the monkeys were motivated to interact with the objects, with 16/17 monkeys interacting with the bottle-bunting, 17/18 monkeys interacting with the water-bottles, 14/17 monkeys interacting with the plastic-tubs, and all 18 monkeys interacting with the cardboard-boxes. In all cases, the monkeys first performed some form of sensory exploration (e.g., sniffing, licking, or visually inspecting the objects) before some individuals went on to perform object manipulations and interact with the contents of the objects. However, in all enrichment sessions the average amount of time individuals spent interacting with the objects was low, as only a small number of individuals spent a large amount of time with the objects (interaction times: bottle-bunting: mean 242 s, range 2-1575 s; water-bottles: mean 190 s, range 2-551 s; plastic-tubs: mean 66 s, range 6-247 s; cardboard-boxes: mean 344 s, range 33-1697 s).
Whilst interacting with the objects the monkeys performed several different manipulations: biting, banging, spinning and pulling apart the object. However, although biting the object occurred at high rates in all four enrichment sessions, the occurrence of the other object manipulations varied between the enrichment sessions (Figure 2). In both the bottle-bunting and water-bottle sessions, biting was accompanied by spinning and pulling, whereas in the plastic-tubs and cardboard-boxes sessions, biting was accompanied by banging and pulling.
The affordances of the objects differed between the enrichment sessions, and for each object pair there was a "critical action" which the monkeys needed to perform to release the contents of the functional object. For the bottle-bunting and water-bottles, the critical action was to spin the objects, whilst for the plastic-tubs and the cardboard-boxes, the critical action was to pull the objects apart. Of the monkeys who manipulated the enrichment objects, spinning was seen in 67% (n = 8) of monkeys in the bottle-bunting sessions and 53% (n = 8) of monkeys in the water-bottle sessions, and pulling was seen in 90% (n = 9) of monkeys in the plastic-tubs sessions and 67% (n = 12) of monkeys in the cardboard-boxes sessions.
GLMM1 contained the 3-way interaction between condition (test-only, or enrichment-then-test), object type (bottle-bunting, water-bottles, plastic-tubs or cardboard-boxes), and trial number, as well as the random effect of ID and the random slope of object type within ID. The model fitted the data significantly better than a null model containing only the random effects (LRT: χ2 = 56.99, df = 15, p < 0.001). However, the three-way interaction was not significant (LRT: χ2 = 0.88, df = 3, p = 0.83).
For GLMM2, we removed the three-way interaction and ran a reduced model containing only the two-way interactions between the main effects, along with the random effect of ID and the random slope of object type within ID. The model fitted the data significantly better than a null model containing only the random effects (LRT: χ2 = 56.11, df = 12, p < 0.001) and revealed a significant interaction between object type and trial number (LRT: χ2 = 11.28, df = 3, p = 0.01; see supplementary data S4 for the GLMM2 output). The interaction between condition and object type was not significant (LRT: χ2 = 4.24, df = 3, p = 0.24), nor was the interaction between condition and trial number (LRT: χ2 = 3.33, df = 1, p = 0.07).
To check for any main effect of condition, we ran a third GLMM (GLMM3) which included condition as a main effect as well as the object type by trial number interaction and the random effect of ID and the random slope of object type within ID. The model was significant when compared to a null model containing only the random effects (LRT: χ2 = 48.88, df = 8, p < 0.001) and revealed a significant interaction between object type and trial number (LRT: χ2 = 9.72, df = 3, p = 0.02; see supplementary data S5 for the GLMM3 output), but no significant effect of condition (LRT: χ2 = 0.91, df = 1, p = 0.34).
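The likelihood ratio tests reported above convert a χ2 statistic and its degrees of freedom into a p-value through the upper tail of the chi-square distribution. As an illustration (not the authors' code), the sketch below computes that tail probability with the Python standard library, using the closed forms for 1 and 2 degrees of freedom and a standard recurrence for higher ones; the function name `chi2_sf` is an assumption.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with a
    positive integer number of degrees of freedom.

    Uses Q(x; 1) = erfc(sqrt(x / 2)), Q(x; 2) = exp(-x / 2) and the
    recurrence Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) exp(-x/2) / Gamma(k/2 + 1).
    """
    if x < 0 or df < 1 or int(df) != df:
        raise ValueError("x must be >= 0 and df a positive integer")
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Statistics taken from the text: (chi-square, df, reported p)
reported = [
    (0.88, 3, 0.83),   # GLMM1 three-way interaction
    (11.28, 3, 0.01),  # GLMM2 object type x trial number
    (4.24, 3, 0.24),   # GLMM2 condition x object type
    (3.33, 1, 0.07),   # GLMM2 condition x trial number
    (0.91, 1, 0.34),   # GLMM3 main effect of condition
]
recomputed = [(chi2, df, chi2_sf(chi2, df)) for chi2, df, _ in reported]
```

Comparing `recomputed` with the reported values is a quick consistency check on the statistics as printed; it says nothing about the underlying model fits.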
To investigate the interaction between object type and trial number we subset the data into the separate object types and ran four further GLMMs (one for each object type). All four models contained the predictor variables condition and trial number with monkey ID as a random effect.
Bottle-bunting & plastic tubs
GLMM 4 used the data from the bottle-bunting test sessions and GLMM 5 used the data from the plastic-tubs test sessions. Neither model fitted the data significantly better than their respective null models containing only the random effects (LRTs: Bottle-bunting: χ2 = 1.61, df = 2, p = 0.45; Figure 3(a); Plastic-tubs: χ2 = 3.61, df = 2, p = 0.16; Figure 3(c)). For both bottle-bunting and plastic-tubs neither test condition nor trial number had a significant effect on performance.

Discussion
The monkeys showed spontaneous exploration of all the enrichment objects provided and chose the functional objects above chance in the test sessions in both the enrichment-then-test and test-only conditions. This finding suggests that they are intrinsically motivated to explore objects and provides empirical support in line with the findings of the enrichment literature (e.g., Clark & Smith, 2013; Dubois et al., 2005; Vick et al., 2000). The monkeys did not behave in a way which suggested they were following an innate exploration routine, as they tailored their exploratory manipulations to the affordances of the different objects (spinning the bottle-bunting and water-bottles but pulling the plastic-tubs and cardboard-boxes). However, performance in the test sessions following an opportunity to explore was not different from performance in the baseline sessions. In the cardboard-boxes and water-bottles sessions the monkeys appeared to be learning during the test in both the post-enrichment and baseline sessions, suggesting that the monkeys may not have learnt about the object properties during their exploration. In the bottle-bunting and plastic-tubs sessions, by contrast, high performance in the baseline sessions may have prevented us from detecting any benefit from the enrichment session, possibly due to the monkeys having had prior experience with similar objects. Future research should consider that when dealing with visible differences, prior learning is likely to have occurred, and attempt to use objects which are harder to disambiguate and/or hidden properties that the participants are less familiar with in order to keep the baseline low.
Figure 3. Average score at test for each enrichment object, split into the different enrichment conditions. For each enrichment object type, the graph shows the average score across trials split into the two conditions (enrichment-then-test, and test-only). Error bars show the mean ± SD, with the dashed red line indicating chance.
It is common knowledge that many animals, especially primates, will explore and manipulate objects. From just a few months old, baboons, capuchin monkeys and chimpanzees have been documented to show object manipulation (Hayashi, 2007;Westergaard, 1993;Westergaard et al., 1999) with chimpanzees and bonobos performing a wide array of different manipulation actions (Takeshita & Walraven, 1996). Capuchin monkeys have been classified as showing object manipulations as complex as those exhibited by the apes (Torigoe, 1985;Truppa et al., 2019), which includes combining objects with their environment or other objects and performing object manipulations which involve almost all body parts. In research with human children, object play has been used as a window on children's understanding of the world and the processes by which they build up their world knowledge (e.g., Butler & Markman, 2012;Cook et al., 2011). Despite this, researchers interested in uncovering how nonhuman primates learn about their physical world have given little attention to their natural exploratory tendencies. Most research into how nonhuman primates see the physical world is done via presenting individuals with problem-solving tasks that mirror a foraging context in which they work for rewards. These methods have yielded many fruitful results in helping us understand how primates use their physical knowledge in problem solving (e.g., Jordan et al., 2020;Völter & Call, 2017). Often in these studies, the subjects' performance is good from the first trial onwards and shows no improvement over trials (e.g., Völter & Call, 2014), suggesting that the tasks tap into intuitive physics rather than being learned over trials in the reinforced setting. However, these tasks fail to uncover how primates acquired this information. In a natural setting nonhuman primates could build their physical knowledge from their own un-reinforced interactions with their environment without concepts being scaffolded with increasingly difficult trials, or associative cues.
In this study we provided the primates with objects in a more ecologically valid setting and showed that by harnessing their natural exploratory tendencies it is possible to begin to investigate how they learn about the physical environment outside of the context of locating or releasing food rewards.
Examining this natural behaviour and its consequences for learning can yield important insights into non-human primates' cognition, as it removes the need to reward good performance with food and rules out any explanation based on associative learning. Within research into nonhumans' spatial cognition, animals' natural exploratory tendencies have long been exploited. For example, Tolman and Honzik's (1930) latent learning study showed that rats could learn the layout of a maze without any food reinforcement. Latent learning is the acquisition of information in the absence of a reward (Shaw & Waters, 1950). It can be seen as incidental acquisition of information, and it does not require an individual to be paying attention to the information they are learning (Jiang & Leung, 2005). In their classic study, Tolman and Honzik (1930) placed rats in an unbaited maze for ten days and left them to explore. On the eleventh day, a food reward was placed at the end of the maze, and after just one day the rats could navigate the maze to get to the reward as quickly as rats who had been trained with a reward for the whole ten days. This study showed that learning was possible without reinforcement, and since the 1930s many variations have been carried out to show that during unrewarded exploration of a space, both humans and other animals are able to learn its layout. This method has only sparsely been applied to learning non-spatial information; however, there is evidence to suggest that children can learn the properties of objects during exploration (Stevenson, 1954). When children had to explore objects whilst finding a key, they were able to latently learn about these objects and find them faster than objects they had not explored whilst finding the key. Similarly, capuchin monkeys have been shown to learn action-outcome contingencies during unrewarded interactions with an object (Polizzi di Sorrentino et al., 2014).
Capuchins were presented with a mechatronic board which they could explore unbaited before the board was baited and the capuchins could work to release rewards. The monkeys were split into two groups, with one group receiving a board where their actions had an effect on components of the board (e.g., pressing a button caused a box to open) whilst for the other group the movement of the board's components was not contingent on the monkeys' actions. The group who experienced contingent action-outcome relationships retrieved more rewards and were faster at retrieving rewards during the baited stage than the group who had not experienced contingency between their actions and the movement of the board. This innovative study suggests that even under unrewarded conditions, experiencing action-outcome contingencies can facilitate learning in capuchin monkeys. The current study builds on this work by using objects that could be presented to capuchin monkeys within their enclosure, allowing us to get a deeper insight into their natural exploration behaviour and their intrinsic motivation to learn during exploration. In addition, the test phase in Polizzi di Sorrentino et al. (2014) contrasts monkeys who had experienced that their actions caused an effect on the board with monkeys who had learnt that their actions did not cause an effect on the board. This means that rather than having an inexperienced group, they had a group who had experienced that the movements of the board were random, which could explain why during the test stage these monkeys were slower at retrieving the rewards. By comparison, in the current study we compared the performance of experienced vs inexperienced monkeys, allowing us to investigate any pre-existing biases the monkeys had. Both Polizzi di Sorrentino et al. (2014) and the current study show that combining ethological approaches to investigating behaviour with carefully controlled cognitive paradigms has the potential to uncover insights into the ways in which non-human primates learn the physics of the world around them.
In conclusion, this study reveals the promise of harnessing primates' natural exploratory tendencies as a tool for understanding how they see the world. When presented with unrewarded objects that had differing functional properties, the capuchin monkeys seemed to be intrinsically motivated to explore and performed actions that would yield functional information. However, due to their high performance in the discrimination test with and without the experience of the exploration phase, we were unable to conclusively determine whether they indeed gathered functional information through exploration. Although it remains an open question as to whether capuchin monkeys can learn about objects in their environments via unrewarded exploration, this study highlights that by combining ecology and cognition we can gather rich data that will provide insight into how knowledge is built in a real-world context.