Help or punishment: acute stress moderates basal testosterone's association with prosocial behavior

Abstract The gonadal hormone testosterone is well recognized to facilitate various behaviors for obtaining social status. A good reputation (i.e. as competitive, generous, and trustworthy) is of crucial importance for acquiring high social status. It is unclear which type of reputation is preferred by individuals under the influence of testosterone. Given that the recent dual-hormone hypothesis emphasizes the modulating effect of stress (cortisol) on the influence of testosterone, it would be intriguing to test the role of stress-induced cortisol in testosterone-related reputation seeking. To this end, we induced acute stress in 93 participants with the cold pressor test (CPT) paradigm (vs. a control condition) and then instructed them to play a third-party intervention game, in which they decided, as an uninvolved third party, whether to punish a violator, help a victim, or do nothing. Salivary samples were obtained to assess participants’ testosterone and cortisol levels. We split participants at the median testosterone concentration into low endogenous testosterone (LT) and high endogenous testosterone (HT) groups. We found that HT individuals’ prosocial preferences were not affected by acute stress: they were more likely to choose punishment than helping under both stress and control conditions. In contrast, LT individuals were more inclined to help than punish under control conditions. Interestingly, acute stress brought the behavior patterns of LT individuals closer to those of HT individuals; that is, they reduced their helping behavior and increased the intensity of their punishments. In this preliminary study of how testosterone shapes preferences among different types of prosocial behavior, we discuss the physiological mechanism of the relationship between testosterone and reputation and the implications of these results for the dual-hormone hypothesis.

HIGHLIGHTS
Low testosterone (LT) individuals were more inclined to help than punish.
High testosterone (HT) individuals were more inclined to punish than help.
The HT individuals' preferences for prosocial types were not affected by acute stress.
Acute stress brought the behavior patterns of LT individuals closer to those of HT individuals.


Introduction
Testosterone, a steroid hormone released by the hypothalamic-pituitary-gonadal (HPG) axis, is known to play a critical role in the development and maintenance of physical masculinization (Mazur & Booth, 1998). Studies in animals and humans have shown that testosterone is associated with antisocial, egoistic, or aggressive behavior in social contexts (Björkqvist et al., 1994; Carré et al., 2011; Mazur & Booth, 1998). Recently, however, several studies have linked testosterone with prosocial behavior, such as increased fair bargaining behavior, honesty, altruistic punishment, social cooperation, and generosity (Burnham, 2007; Dreher et al., 2016; Eisenegger et al., 2010; Mehta & Josephs, 2010; Van Honk et al., 2012). Studies have shown that prosocial behavior accompanying increased testosterone can help individuals establish and maintain a higher social status (Dreher et al., 2016; Eisenegger et al., 2010). However, it is not clear which kind of prosocial behavior testosterone makes individuals more likely to prefer as a means of obtaining social status.
When faced with a situation that violates social rules (e.g. unfair events or crimes), a person could punish violators of social norms or help the victim mitigate the damage, at the expense of self-interest. Studies have found that both costly help and costly punishment can earn a good reputation (Ohtsubo et al., 2018), but there are still some reputational differences between the two behaviors. Helping is mainly driven by empathy (Leliveld et al., 2012) and compassion (McCall et al., 2014), and it is a more informative signal of trustworthiness and warmth (Li et al., 2018; Przepiorka & Liebe, 2016; Raihani & Bshary, 2015b). The reputation of punishment may be more complicated: some studies suggest that punishment signals competitiveness and power (Raihani & Bshary, 2015a), whereas others suggest that punishers are more trustworthy, group-focused, and worthy of respect than non-punishers (Barclay, 2006). However, up to now, most empirical studies have demonstrated that punishment can give people an impression of emotional instability and irritability, which induces fear in bystanders (dos Santos et al., 2013; Jordan et al., 2016).
Through a third-party intervention task in which a participant chose freely, as an uninvolved third party, among punishing the violator, helping the victim, or doing nothing, we can infer the motivation behind the two intervention methods and test the impact of testosterone on the preference. Evidence from fair-bargaining studies suggests that testosterone increases sensitivity to aversive events, particularly those that challenge an individual's high social status (Eisenegger et al., 2011), and points to circumstances in which high testosterone may lead to punishment (Eisenegger et al., 2010). In contrast, one study found that when the targets of punishment were poor proposers, participants showed greater empathy for poor norm violators in highly unfair trials (Ouyang et al., 2021). Similarly, although warmly helping victims can also contribute to social status, it requires more victim-focused empathy (David et al., 2017), which has been shown to be relatively deficient in individuals with high testosterone (Van Honk et al., 2013). Therefore, we put forward our first hypothesis (H1): HT individuals may be more inclined to punish than help, while LT individuals may be more inclined to help than punish. In particular, the reason behind HT individuals' punishment bias may be (1) the pursuit of a reputation for competitiveness (status-seeking), or (2) simply a lack of empathy. The reason behind LT individuals' helping bias may be the pursuit of a reputation for trustworthiness (status-seeking).
In addition, there is now increasing evidence that testosterone's influence on status-relevant behavior (i.e. status, dominance, risk-taking, aggression, and psychopathy) is modulated by cortisol, a hormone released by the hypothalamic-pituitary-adrenal (HPA) axis in response to physical and psychological stress (Dekkers et al., 2019; Dickerson & Kemeny, 2004; Mehta & Prasad, 2015). According to the dual-hormone hypothesis, higher testosterone leads to more status-seeking behavior when cortisol levels are low, whereas testosterone's impact on status-seeking behavior is inhibited when cortisol levels are high (Mehta & Josephs, 2010; Mehta & Prasad, 2015). Some studies provide initial support for the hypothesis that high cortisol blocks the testosterone/behavior relationship (Edwards & Casto, 2013; Ponzi et al., 2016; Sherman et al., 2016). Given that cortisol fluctuates in real stress situations (Dickerson & Kemeny, 2004), two recent studies found that the cortisol fluctuation induced by psychological stress, manipulated with the Trier Social Stress Task (TSST), suppressed the association between basal testosterone and fair bargaining behavior in the ultimatum game (UG) and dictator game (DG) (Prasad et al., 2017). However, it is unclear whether the social nature of the stressor used in such studies (i.e. being evaluated while speaking publicly) was responsible for the subsequent social behaviors (FeldmanHall et al., 2015). Public speaking is itself a way of improving social status (Eisenegger et al., 2011), which may have a priming or interfering effect on subsequent social behavior. Here, using a systemic and physiological stressor (cold pressor test, CPT), we directly manipulated acute stress and measured stress-induced cortisol levels. We aimed to probe whether acute stress has domain-specific effects on the testosterone-prosocial preference relationship.
We put forward our second hypothesis (H2): (1) if, under stress compared with control, the punishment bias of HT individuals is inhibited (helping increases and punishment decreases), our results would support the dual-hormone hypothesis and indicate that the punishment bias of HT individuals is driven by the pursuit of a competitive reputation; (2) if the punishment bias of HT individuals is not affected by stress, our results would not support the dual-hormone hypothesis and would indicate that the punishment bias of HT individuals is driven by lower empathy.

Participants
Ninety-three healthy males were recruited and randomly assigned to the stress condition (CPT) or the control condition. Three subjects were excluded due to the suspicion about the authenticity of their partners in the third-party intervention task (TPI), and five subjects were removed from the analysis due to a failed saliva collection. The final sample included 85 subjects (control: n = 42, mean age = 22.95, SD = 2.19; stress: n = 43, mean age = 22.02, SD = 2.23). To control for the potential influence of non-experimental factors on the reactivity of the HPA axis to stress, screening criteria were enforced as follows: (1) no alcohol and nicotine abuse; (2) no chronic diseases or mental disorders; (3) no medication use within 2 weeks; (4) no current periodontitis; (5) no major examination within 2 weeks; (6) no circadian disruption (i.e. adequate sleep and no chronic overnight work). All subjects were right-handed with normal or corrected-to-normal vision. They were required to have adequate sleep and were forbidden high-intensity exercise the day before the experiment.
This study was approved by the Institutional Review Board of the State Key Laboratory of Cognitive Neuroscience and Learning at Beijing Normal University. All participants provided written informed consent.

Experimental procedure
To control for circadian rhythms, experiments were carried out between 1:30 pm and 6:00 pm (Dickerson & Kemeny, 2004). Participants were asked not to eat, drink, or work out for at least 2 h before the experiment. Upon arrival at the lab, participants were taken to the testing room, where the outline and procedure of the experiment were explained to them and they completed questionnaires for 20 min. Then, the heart rate (HR1), the first saliva sample (S1), and the Positive and Negative Affect Scale [PANAS (PA1 and NA1)] were collected. Participants were then randomly assigned either to the CPT or to the control condition. Heart rate was recorded throughout the experiment. Immediately after the CPT, a second saliva sample (S2) and PANAS (PA2 and NA2) were collected. Then, the subjects completed the Third-Party Intervention task (TPI). The third and fourth saliva samples (S3, S4) and PANAS (PA3 and NA3; PA4 and NA4) were collected 20 and 30 min after the CPT, respectively. The entire study took 1 h. See Figure 1 for the sequence of the experimental protocol.

Stress induction
After completing the baseline saliva sample, participants were randomly assigned to either a stress or control condition. Stress induction involved a cold pressor task (CPT) wherein subjects submerged their left hand to the wrist in 0-4 °C ice water for 3 consecutive minutes (Riccio et al., 1992). If a participant failed to complete the CPT, they were excused from the study. The control participants submerged their left hand in warm water (35-37 °C) for 3 consecutive minutes. The CPT is extensively documented to reliably induce activation of the HPA axis, as evidenced by elevated endocrine (i.e. cortisol) responses, and it has been used to elicit a stress response (Lovallo, 1975; McRae et al., 2006). Essentially, the CPT excludes lasting psychological effects typically associated with other types of laboratory stressors (McRae et al., 2006), letting us isolate an increased neurohormonal stress response exclusive of ancillary effects that could bias social behaviors.

Third-party intervention task
The third-party intervention task (TPI) was adapted from the classic third-party punishment game (TPP) (Fehr & Fischbacher, 2004), in which unaffected observers punish selfishness at a cost to themselves in order to promote fairness; this has been recognized as a form of prosocial behavior. In the version used in our previous research, a potential punisher has an equal opportunity to help, so choice preferences can be tested within one task. Our results show that subjects' choices in this task are highly correlated with those in a scenario task with greater ecological validity (Zhen et al., 2021). Participants were seated in separate cubicles and informed that they would stay anonymous during and after the experiment. They believed that they were randomly assigned as player C (the third party) through a large drawing procedure. All participants (player C) received an endowment of 50 money units (MUs, 10 MUs = 1 Chinese yuan) per round and were told to observe a set of allocations of 100 MUs between several pairs of player A (the proposer) and player B (the recipient) (i.e. the dictator game), which would take place in other rooms simultaneously. In fact, they played with a computer. The proposer (player A) received an endowment of 100 MUs per round and could decide how to distribute these between him-/herself and the recipient (player B) with 10 MUs as the minimum unit (i.e. 0, 10, 20, 30, 40, 50), and the recipient had to accept the allocation. The participants were told that the number of trials was determined by how many pairs of player A and player B there were, though in reality this was preprogrammed to be three trials in total, and the offers were specifically selected by us (i.e. 90/10, 70/30, and 50/50 for the three respective trials). The participants were asked to choose from three options: transferring MUs to deduct from player A's MUs, transferring MUs to add to player B's MUs, or keeping their own allotted MUs for themselves. If the subject chose to deduct or add, they then decided how many of their own 50 MUs to transfer, with 1 MU as the minimum unit.
There are several details we need to highlight: (1) When player C chose to deduct from A's MUs or add to B's MUs, the cost ratio was 1:3, as in previous studies (Hu et al., 2015); that is, every 1 MU player C transferred deducted or added 3 MUs from player A or to player B, respectively. (2) In the instructions, we used "players A, B, and C" instead of "dictator," "recipient," and "observer," and the words "deduct" and "add" were used to replace the terms "punish" and "help." (3) Players A and B were not real. We preprogrammed the allocation chosen by player A. (4) The three trials with different offers were randomly presented to the participants.
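The payoff rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the task software used in the study: the names `Payoffs` and `intervene` are hypothetical, the allowed transfer range of 1 to 50 MUs is inferred from the 50 MU endowment and the 1 MU minimum unit, and the text does not say whether player A's payoff is floored at zero, so no floor is applied here.

```python
from dataclasses import dataclass

COST_RATIO = 3    # each MU player C spends changes the target's payoff by 3 MUs
ENDOWMENT_C = 50  # player C's endowment per round, in MUs


@dataclass
class Payoffs:
    a: int  # player A (proposer)
    b: int  # player B (recipient)
    c: int  # player C (third party)


def intervene(offer_a: int, offer_b: int, choice: str, transfer: int = 0) -> Payoffs:
    """Return the payoffs of players A, B and C for one trial.

    choice is "deduct" (punish A), "add" (help B) or "keep" (do nothing).
    """
    if choice == "keep":
        return Payoffs(offer_a, offer_b, ENDOWMENT_C)
    # Assumption: a deduct/add decision transfers between 1 and 50 MUs.
    if not 1 <= transfer <= ENDOWMENT_C:
        raise ValueError("transfer must be between 1 and 50 MUs")
    if choice == "deduct":
        # No floor at zero is applied; the source does not specify one.
        return Payoffs(offer_a - COST_RATIO * transfer, offer_b, ENDOWMENT_C - transfer)
    if choice == "add":
        return Payoffs(offer_a, offer_b + COST_RATIO * transfer, ENDOWMENT_C - transfer)
    raise ValueError(f"unknown choice: {choice!r}")


print(intervene(90, 10, "deduct", 10))  # C pays 10 MUs, A loses 30 MUs
print(intervene(90, 10, "add", 10))     # C pays 10 MUs, B gains 30 MUs
```

In the 90/10 trial, for example, a 10 MU transfer leaves player C with 40 MUs and either lowers player A's payoff to 60 MUs or raises player B's payoff to 40 MUs.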

Physiological and endocrinal measures
Heart rate was continuously recorded with a Polar WearLink+ heart rate monitor (POLAR RCX3) to assess the effects of the high-stress induction vs. the low-stress induction. Heart rate was monitored for 3 min as the baseline and was recorded throughout the subsequent tasks.
Saliva samples were collected to assess the effects of the high-stress induction vs. the low-stress induction on cortisol concentrations. Samples were collected with Salivette sampling devices (Sarstedt, Rommelsdorf, Germany) using absorbent swabs placed under the tongue for 2 min. Saliva samples were stored at −20 °C until analysis, and samples from participants who reported any sickness (i.e. periodontitis, fever, or endocrine diseases) or a related medication regimen (especially hormone medicines) within the last two weeks were not analyzed further. The samples were thawed and centrifuged at 3500 rpm for 5 min. The concentrations of salivary cortisol and testosterone were analyzed by electrochemiluminescence immunoassay (Cobas e 601, Roche Diagnostics, Numbrecht, Germany) with a sensitivity of 0.500 nmol/L (lower limit) and a standard range in the assay of 0.5-1750 nmol/L for cortisol and testosterone. The intra- and inter-assay coefficients of variation (CVs) for cortisol and testosterone were below 10%.

Psychological and personality measures
The Positive and Negative Affect Scale (PANAS) (Watson et al., 1988) was used to measure the subjective affective states of participants at each designated time point. The scale has a total of 20 items describing different feelings and emotions, including 10 items for positive affect (e.g. "interested," "excited") and 10 items for negative affect (e.g. "nervous," "scared"). The participants were asked to score each item on a 5-point scale based on their current affective state, from 1 (very slightly or not at all) to 5 (extremely). The average scores of positive affect (PA) and negative affect (NA) were calculated. In addition, because studies have shown that some personality traits (such as impulsiveness) and empathy (Hu et al., 2015; Wood et al., 2013) also have an important impact on prosocial behavior, some personality factors were also considered in this study. All participants completed the following Chinese versions of personality inventories: the State-Trait Anxiety Inventory (STAI) (Shek, 1993; Spielberger et al., 1970); Barratt's Impulsiveness Scale (BIS) (Li et al., 2011; Patton et al., 1995); and the Interpersonal Reactivity Index (IRI) (Davis, 1983; Siu & Shek, 2005).

Data management and analysis
To examine whether stress was induced successfully, a two-way mixed analysis of variance (ANOVA) was conducted on salivary cortisol, heart rate, and subjective affective state, with group (stress, control) as the between-subject variable and acquisition time as the within-subject variable.
For the third-party intervention task, we mainly wanted to investigate how acute stress and basal testosterone jointly affect the choice of third-party altruistic behavior (the number of people who made each choice and the number of MUs transferred). We implemented median splits on basal testosterone, separately within the stress and control groups, to classify participants as high or low in basal testosterone (Brannon et al., 2019; De Berker et al., 2016; Van Honk et al., 2012). This resulted in four groupings: high-testosterone stress group (S-HT), low-testosterone stress group (S-LT), high-testosterone control group (C-HT), and low-testosterone control group (C-LT). The testosterone concentrations (nmol/L, mean ± SD) of the four subgroups were S-HT: 10.4 ± 3.61, S-LT: 7.60 ± 4.38, C-HT: 11.6 ± 4.28, C-LT: 7.03 ± 2.32 (Table S1). The numbers of subjects who chose to punish offenders, to help victims, or to serve themselves were compared with the chi-square test. Repeated-measures ANOVA with group (stress/control; high testosterone/low testosterone) and decision (punish A, help B) as factors was used to analyze the average number of MUs transferred to deduct from player A or add to player B, separately under fair and unfair conditions (the latter being A:B = 90:10 and 70:30). The Greenhouse-Geisser correction was used when the requirement of sphericity in the ANOVA for repeated measures was violated. We report partial η² (F test), Cohen's d (t-test), and Cramer's V (χ² test) as effect sizes, and power (1 − β) is included where appropriate.
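The within-condition median split can be illustrated with a short Python sketch. The function name `median_split` and the testosterone values are hypothetical and serve only as an example. The text does not state how a value exactly equal to the median was classified, so assigning such values to the LT group is an assumption.

```python
from statistics import median


def median_split(participants):
    """Label each participant S-HT, S-LT, C-HT or C-LT.

    participants: list of (participant_id, condition, testosterone) tuples,
    where condition is "S" (stress) or "C" (control).
    """
    labels = {}
    for condition in ("S", "C"):
        group = [(pid, t) for pid, cond, t in participants if cond == condition]
        if not group:
            continue
        # The cutoff is computed within each condition, not across the whole sample.
        cutoff = median(t for _, t in group)
        for pid, t in group:
            # Assumption: values above the cutoff are HT, all others are LT.
            level = "HT" if t > cutoff else "LT"
            labels[pid] = f"{condition}-{level}"
    return labels


# Hypothetical basal testosterone values in nmol/L, for illustration only.
sample = [
    ("p1", "S", 12.1), ("p2", "S", 6.4), ("p3", "S", 9.8), ("p4", "S", 7.2),
    ("p5", "C", 13.0), ("p6", "C", 5.9), ("p7", "C", 8.1), ("p8", "C", 10.5),
]
print(median_split(sample))
```

Because the cutoff is computed per condition, the same testosterone value can fall in the HT group in one condition and the LT group in the other.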
In general, these results indicate that our stress manipulation successfully induced elevations of cortisol level, heart rate, and negative emotion in the expected directions, consistent with prior research (Kudielka et al., 2007). We tested whether the changes in heart rate and negative affect vary as a function of cortisol responses. In a linear regression of the increase in heart rate (heart rate during CPT minus baseline) on the increase in cortisol (cortisol after CPT minus baseline), the heart rate increase tended to rise with the cortisol increase, although the association was only marginally significant (β = 0.202, p = 0.064, 95% CI: −0.01 to 0.34). In addition, we also report the results after grouping according to the median testosterone. Again, we found only a main effect of stress and no significant difference between high and low testosterone within either the stress or the control group (see Figure S1 in Supplementary Information).
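For readers who wish to reproduce this kind of check, the following is a minimal sketch of a one-predictor regression in plain Python. The function name `simple_regression` and the input values are hypothetical. The text does not state whether the reported coefficient is standardized, so the sketch returns both the raw slope and the standardized slope, which equals Pearson's r when there is a single predictor.

```python
from math import sqrt


def simple_regression(x, y):
    """Return (slope, intercept, standardized_slope) for y regressed on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    standardized = sxy / sqrt(sxx * syy)  # Pearson's r for a single predictor
    return slope, intercept, standardized


# Hypothetical change scores (post minus baseline), for illustration only.
cortisol_increase = [0.5, 1.2, 2.0, 3.1, 4.4]
heart_rate_increase = [3.0, 2.0, 6.0, 5.0, 9.0]
print(simple_regression(cortisol_increase, heart_rate_increase))
```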

Preliminary analyses
After confirming that our stress manipulation was successful, we conducted another analysis to verify that there were no differences in the decision-making-related personality factors between the four groups (ps > .05, Table S1 in Supplementary Information). First, we carried out an analysis to determine whether differences in prosocial behavior were attributable to the unfairness of the offers. We averaged the punishment and helping choice rates to create an overall index of the percentage of prosocial behavior and compared the prosocial and selfish choices as a function of the condition (fair or unfair). As expected, we found that as the degree of unfairness increased, the proportion of prosocial behavior increased, and the proportion of selfish behavior decreased correspondingly (χ² = 70.24, p < .001, Cramer's V = 0.525). We found no preference differences between the two

Basal testosterone, stress, and third-party intervention
Next, we tested the hypothesis that basal testosterone's role in prosocial preference for unfair offers would depend on stress. According to the χ² test, we found a significant interaction of stress × testosterone × decision (χ² = 11.83, p = .008, Cramer's V = 0.321) for the number of subjects who chose to punish violators or help victims in unfair situations (Figure 3(a), Table S2 in Supplementary Information). In the control condition, we found a reduced helping preference and an increased punishment preference in the high-testosterone group, while the low-testosterone group was more willing to help and less willing to punish (χ² = 7.34, p = .007, Cramer's V = 0.359). Additionally, we found that acute stress tended to reduce the helping preference of low-testosterone individuals, although this effect was only marginally significant (χ² = 3.47, p = .063, Cramer's V = 0.242), whereas high-testosterone individuals maintained a high preference for punishment (χ² = .09, p = .77, Cramer's V = 0.039). Moreover, we found a significant interaction of testosterone × decision intensity (the number of MUs transferred) in the control condition [F(1,40) = 4.57, p < .05, partial η² = 0.102, power = 0.568]. A simple main effect test showed a greater punishment intensity in the high-testosterone group than in the low-testosterone group [F(1,40) = 8.30, p < .01, partial η² = 0.172, power = 0.822], while we did not find a lower helping intensity in the high-testosterone group relative to the low-testosterone group [F(1,40) = 1.01, p = .32, partial η² = 0.025, power = 0.173] (Figure 3(b)). We also did not find a significant interaction of testosterone × decision intensity (the number of MUs transferred) in the stress condition [F(1,41) = 0.631, p = .431, partial η² = 0.015, power = 0.124]. In addition, we found a significant interaction of stress × decision type on the magnitude/intensity of the MUs transferred in the low-testosterone group [F(1,41) = 4.76, p < .05, partial η² = 0.104, power = 0.588]. Specifically, we found that the low-testosterone individuals increased their punishment intensity under acute stress relative to the control condition [F(1,41) = 5.75, p < .05, partial η² = 0.123, power = 0.669], whereas their helping intensity did not differ significantly between the stress and control conditions [F(1,41) = 1.70, p = .20, partial η² = 0.040, power = 0.257] (Figure 3(b)).
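As a consistency check on the effect sizes reported above, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the F values given in the text. The function name `partial_eta_squared` is hypothetical, and small discrepancies in the third decimal are to be expected because the reported F values are themselves rounded.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), copied from the text.
reported = [
    (4.57, 1, 40, 0.102),
    (8.30, 1, 40, 0.172),
    (1.01, 1, 40, 0.025),
    (0.631, 1, 41, 0.015),
    (4.76, 1, 41, 0.104),
    (5.75, 1, 41, 0.123),
    (1.70, 1, 41, 0.040),
]

for f_value, df1, df2, eta_reported in reported:
    eta = partial_eta_squared(f_value, df1, df2)
    print(f"F({df1},{df2}) = {f_value}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

Each recomputed value lies within 0.001 of the reported one, so the reported effect sizes are internally consistent with the reported F statistics.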
We further explored possible mechanisms through which acute stress may alter the testosterone/behavior association. We investigated the effects of cortisol on prosocial behavior and found results consistent with those for stress (see Supplementary Information).

Discussion
Based on the dual-hormone hypothesis, previous research has extended the study of the testosterone/status-seeking relationship from the influence of basal cortisol to the influence of acute stress, using costly punishment in an economic bargaining game (Prasad et al., 2017, 2019). In the present study, we further extended previous findings to a third party without a direct conflict of interest. Consistent with previous studies, the changes in the HPA axis induced by acute stress tended to be positively associated with physiological markers of stress, such as increased heart rate (Isowa et al., 2006). We found that, under control conditions, individuals with high testosterone exhibited a lower helping rate and a higher punishment rate than the low-testosterone group. Moreover, the high-testosterone group showed a greater punishment magnitude than the low-testosterone group in the control condition. These effects are consistent with the idea that testosterone can enhance both reactive aggression and generosity for increased social status (Dreher et al., 2016). It is worth emphasizing that, beyond the previous findings that individuals with high testosterone are inclined to bear the cost of personal retaliation, we found that individuals were willing to enforce fairness norms at the cost of their own benefit, even though they were not victims of the norm violation.
If testosterone can promote prosocial behavior (for example, punishing rule violators in our experimental scenario), how can we explain the difference in preference between high-testosterone and low-testosterone individuals in helping and punishment? According to costly signaling theory (CST), third-party punishment and third-party help are both costly signals of cooperativeness and fairness (Barclay, 2006; Kurzban et al., 2007). However, recent research has indicated that compared with a punisher, a helper has a more trustworthy reputation (Jordan et al., 2016). Punishment entails a cost for both the punisher and the punished; it is expensive and inefficient, and the threat of retaliation and vengeance from the target might lead individuals to avoid punishment when other, non-confrontational options are available (Grimalda et al., 2016; Ohtsubo et al., 2018; Przepiorka & Liebe, 2016). In addition, costly helping behavior requires a certain level of empathy (Decety & Cowell, 2015; Raboteg-Saric & Hoffman, 2001); that is, a person must be able to recognize the emotional dynamics of others, such as identifying the victim's grievance and the anger of the offender after being sanctioned. A large number of studies have shown that higher testosterone is associated with poorer empathic accuracy (Bos et al., 2016; Nitschke & Bartz, 2020; Van Honk et al., 2013). Individuals with high testosterone, in addition to possibly having lower emotion-recognition ability, might have attempted to create a reputation for competitiveness, bravery, courage, and a willingness to retaliate out of concern for victims' welfare (Raihani & Bshary, 2015b).
Research has also provided some support for these ideas, showing that a leader could command a certain amount of respect via assertive or forceful behaviors that enhance social cohesion (Price et al., 2002). In addition, people may engage in prosocial behaviors without any conscious awareness of their reputations; the punishment by high-testosterone individuals may be motivated by emotions such as moral outrage or anger and annoyance at the offender (Fehr & Gächter, 2002; Gordon & Frank, 1990). Some research found that police recruits with relatively high testosterone showed excessive aggression with reduced anterior prefrontal cortex control over the amygdala during emotion regulation (Kaldewaij et al., 2019). Interestingly, we found that acute stress brought the prosocial behavior of LT individuals closer to that of HT individuals. However, the punishment preference of HT individuals was not affected by stress. Our results did not support the dual-hormone hypothesis (Mehta & Josephs, 2010). Why is the competitive enthusiasm of high-testosterone individuals stable under stress? Why does stress make low-testosterone individuals less calm? First, this pattern is consistent with the stress-buffering effect of status, in which high-status leadership roles decrease the stress response compared with subordinate roles (Akinola & Mendes, 2014; Knight & Mehta, 2017). Recent studies have found that testosterone decreases social anxiety and may help to modulate the effects of stress in socially challenging situations (Knight & Mehta, 2017; Kutlikova et al., 2020). High testosterone may itself be a buffer against acute stress, which enables individuals to perform as steadily as usual; that is, they tend to compete and dominate when faced with a choice among different prosocial behaviors. However, under stress, individuals with low testosterone increased their competitiveness-oriented behavior and reduced the behavior that would gain a more favorable reputation.
Lacking the stress-buffering effect shown by those with high testosterone, individuals with low testosterone are more susceptible to stress. We attempt to explain the increased punishment intensity of low-testosterone individuals under stress in terms of the "dual-process" model of stress (Sanfey & Chang, 2008; Seeley et al., 2007). Two neurocognitive systems work in our brain: System 1 runs fast with the "hot" emotional neural circuitry of the salience network, while System 2 operates slowly and employs the neural circuitry of the executive control network. In general, people make an optimal choice by balancing the two systems. Under stress, stress-related hormones and neurotransmitters strengthen salience network activity during the acute stress phase at the cost of executive control network function (Hermans et al., 2014), which leads to a shift from goal-directed behavior to a more emotion-driven response. Driven by anger, the intensity of punishment increases among low-testosterone individuals under stress. In addition, we also found a decreased helping tendency in low-testosterone individuals under stress. Humans have a prototypic "fight or flight" response to stress, which has been described as an essential survival mechanism (Cannon, 1914). Whether a person fights or flees is thought to depend on the individual's cognitive appraisal of the stressor. If one perceives that the challenge can realistically be overcome, then a fight is likely. In circumstances in which the threat is perceived to be more formidable, flight is more probable. In the current scenario of third-party intervention, helping victims and acting selfishly can both be regarded as flight behaviors, as neither involves a conflict of interest with the violator of the social rules nor incurs any retaliation or threat (Dreber et al., 2008; Rockenbach & Milinski, 2006).
We therefore suggest that individuals with low testosterone see the stress as a challenge, and they seem to want to gain social status through the same behavior pattern (punishment) as those with high testosterone.
There are several limitations to the present study. First, we recruited only healthy young males, because in our pilot experiment more than half of the women stopped the experiment prematurely because they could not stand the pain of the ice water. Initial evidence for sex differences shows that men are more likely to behave aggressively and punitively, whereas women tend to behave warmly and in a friendly manner under stress (Nickels et al., 2017; Taylor et al., 2000; Youssef et al., 2018). In addition, some research related to the dual-hormone hypothesis found that in women, compared to the low-stress condition, the high-stress condition reduced retaliation, whereas men showed the opposite pattern (Prasad et al., 2017). However, other research suggests that gender differences may not be robust (Knight & Mehta, 2017; Mehta & Prasad, 2015; Prasad et al., 2019). We hope further research will focus on any gender differences to better understand the role of testosterone in social decision-making. Second, the relatively small sample size might have curtailed the sensitivity of our analyses, especially because we divided the groups dichotomously based on testosterone and cortisol median splits, which may reduce effect sizes and experimental power (Lagakos, 1988). Future researchers can recruit sufficiently large samples to offset this reduction in power. In addition, comparing conclusions obtained with the classification method against those obtained with the linear regression method is not fully rigorous. We suggest that, as research evidence accumulates, future researchers should consider the comparability of research methods when drawing conclusions. Third, in addition to acute stress, some personality variables, genes, or other hormones may regulate the testosterone/status-seeking relationship; these were not studied here and need to be followed up in the future. For instance, testosterone caused an increase in aggressive behavior among those who scored relatively high in trait dominance or low in trait self-control (Carré et al., 2017). Fourth, in the current study, only basal testosterone was measured, and the changes in testosterone levels under stress are unknown. Future studies can simultaneously examine the changes in the HPA axis and HPG axis and test their influence on social behavior.

Conclusions
In conclusion, our study demonstrates that individuals with low testosterone were more aggression-oriented under acute stress; that is, they reduced their helping behavior and increased their punishment intensity. Individuals with high testosterone, in turn, exhibited less helping behavior and more punishment behavior than the low-testosterone group in the control condition. This suggests that other individual variables, such as testosterone level, should also be included in future studies of stress effects and stress treatment.

Disclosure statement
The authors declare no competing financial interests.