Volume 14: pp. 43–50

The Case for a Heuristic Approach to Account for Suboptimal Choice

Thomas R. Zentall

University of Kentucky



The commentaries appropriately mention boundary conditions for the less is more effect (Beran, this issue; Carvalho et al., this issue) and the caution that choice behavior that seems suboptimal in the laboratory may be optimal in nature (Vasconcelos et al., this issue). Pisklak et al. (this issue) object to my definition of contrast to describe the difference between the probability (or magnitude) of reinforcement expected and obtained, but they focus on only one kind of contrast, behavioral contrast. Carvalho et al. question how impulsivity can account for the failure to choose optimally in the ephemeral reward task. The justification comes from research on delay discounting (a measure of impulsivity), in which further delaying both the smaller-sooner and the larger-later reward can shift preference in the direction of optimality. The same occurs with the ephemeral reward task. With regard to the midsession reversal task, Carvalho et al. question our interpretation of the positive effect on accuracy of reducing the probability of reinforcement for correct choice of S2 (the correct stimulus during the second half of the session). They argue that, according to our attentional account, reducing the probability of reinforcement for correct choice of S1 (the correct stimulus during the first half of the session) should have a similar effect. However, during that half of the session, choice of S2 would be an anticipatory error and thus not very helpful as a cue. Instead, we suggest that any manipulation that shifts attention from S2 to S1 (e.g., increasing the response requirement to S2) should improve task accuracy, and it does. Finally, I suggest that evolved heuristics may account for an animal’s suboptimal choice, but that an animal’s flexibility in dealing with a changing environment may be a useful ability to have and may be worth studying.

Keywords: less is better effect, ephemeral reward task, midsession reversal task, heuristics

Author Note: Thomas R. Zentall, Department of Psychology, University of Kentucky, Lexington, KY 40506, USA

Correspondence concerning this article should be addressed to Thomas R. Zentall at zentall@uky.edu.

Acknowledgments: I would like to thank the coeditor of Comparative Cognition and Behavior Reviews, Marcia Spetch, for suggesting that my contribution to the journal should be a target article that might be commented on by several researchers.

Science is best advanced when a forum for discussion of articles can be provided within such a format. When critiques are presented in a collegial manner, they can lead to better definitions of the phenomena being studied, better formulations of the theories that account for them, and better specification of the conditions under which the phenomena will occur. With the present response I hope to clarify the theoretical accounts and identify some of the boundary conditions under which the several phenomena covered in the target article occur (i.e., the less is better effect, the ephemeral reward effect, and the midsession reversal effect).

The less is better effect is found when there are two items of different value (e.g., A is preferred over B, but the animal will work for either one); however, when given a choice between A alone or A in combination with B, the organism prefers A alone. Beran (2019), who has done much of the ape research on the less is better effect, notes that the effect does not occur when A and B are similar in value (Sánchez-Amaro, Peretó, & Call, 2016). More important, the effect does not occur when the ape is required to consume both A and B before the next trial will start, thus delaying the next opportunity to consume A. When a delay is inserted between trials, the apes choose the A plus B option. Pigeons, too, show the suboptimal preference for A alone, but raising their motivational level shifts their preference to the A plus B option (Zentall, Laude, Case, & Daniels, 2014). Thus, as Beran notes, there are clearly boundary conditions for the phenomenon that need to be further explored.

Beran (2019) also mentions the apes’ difficulty with the reverse-reward contingency test, in which they must learn to point to the smaller amount of reward in order to obtain the larger amount. Of interest, the task is actually easier when the apes can point at the (already learned) Arabic numeral for the smaller amount of reward to get the larger amount. This result is consistent with Beran’s suggestion, and with findings reported in the target article, that inserting a delay between the choice response and the reinforcement often facilitates the acquisition of several tasks (e.g., Zentall, Andrews, & Case, 2017; Zentall, Case, & Berry, 2017; Zentall & Raley, 2019).

Beran (2019) also notes that perceptual factors may be important in identifying the conditions under which animals will choose suboptimally. For example, in studies of the less is better effect, apes undervalue two pieces of food that are separate compared with the same two pieces stacked (Beran, Evans, & Ratliff, 2009), and they undervalue food items that are broken relative to those that are whole (Parrish, Evans, & Beran, 2015).

Beran (2019) nicely summarizes an idea that the target article mentions only in passing: that choice behavior reflects a kind of heuristic-based bounded rationality. On this view, human and nonhuman animal choice behavior is often driven by the extent to which the environment triggers behavior that evolved to be satisfactory in nature but becomes suboptimal under certain laboratory conditions (see commentary by Vasconcelos, González, & Macías, 2019).

Pisklak, McDevitt, and Dunn (2019) note in their commentary that the hypothesis that organisms are motivated to maximize reinforcement should not be attributed to Skinner (1938); perhaps Hull (1952) would have been a better reference. In that book, Hull noted that magnitude of reinforcement is associated with magnitude of reaction potential, so a stronger response would be expected to a larger magnitude or probability of reinforcement.

More important, in their critique they objected to my use of the term contrast to describe the preference for 50% signaled reinforcement over 100% reinforcement. They indicated that my use of the term is inconsistent with positive behavioral contrast as defined by Reynolds (1961) and Williams (1983): the contrast one observes when a change in the rate of reinforcement in one component of a multiple schedule produces an opposite change in the rate of responding in another component. However, there are several other definitions of contrast.

One of these, incentive contrast, occurs when there is a shift in the magnitude of reinforcement, for example, from low to high, such that the magnitude of the response increases beyond that of a control group for which the higher magnitude of reinforcement was experienced from the start (Crespi, 1942; Mellgren, 1972). Similarly, Bower (1961) found that rats ran slower to a signaled low magnitude of reinforcement (when, on other signaled trials, there was a higher magnitude of reinforcement) compared with control rats that ran to the low magnitude of reinforcement in the presence of both signals.

Another kind of contrast, consummatory anticipatory contrast, occurs when an animal consumes less of a weak saccharin solution if it has learned that the weak saccharin solution will be followed by a strong sucrose solution, relative to a control group for which saccharin is followed by more saccharin (Flaherty, 1982). Williams (1992) reported evidence for an operant version of anticipatory contrast.

Still another kind of contrast, within-trial contrast, occurs when there is a series of paired events, the second of which is better than the first (Clement, Feltus, Kaiser, & Zentall, 2000). This within-trial contrast is implied by the procedure in which the initial-link stimulus is associated with 50% reinforcement but the positive terminal-link stimulus is associated with 100% reinforcement. Dunn and Spetch (1990) explained this effect in terms of the reduction in the delay to reinforcement. Of course, delay reduction is not an actual reduction in the delay to reinforcement, because in most of these procedures the delay between choice and reinforcement is held constant (e.g., 10 s in most of our research). Instead, it is a reduction in the delay to reinforcement relative to what would have been expected had the signal for the absence of reinforcement appeared. But of course, the same can be said for the within-trial contrast effect. If the point of the critique is that the term contrast does not explain anything that delay reduction or a signal for good news (SiGN) does not already provide, the point may be well taken. However, contrast was introduced by Case and Zentall (2018) as a mechanism to complement Smith and Zentall’s (2016) explanation of suboptimal choice, according to which choice depends solely on the probability of reinforcement associated with the positive conditioned reinforcer in the terminal link. The important point is not whether one calls the mechanism contrast, delay reduction, or SiGN but how one accounts for the difference in findings between Smith and Zentall (indifference between 50% and 100% reinforcement) and Case and Zentall (a significant preference for 50% over 100% reinforcement).

The strongest criticism of the target article comes from Carvalho, Santos, Soares, and Machado (2019). Their commentary focuses on each of the phenomena covered in the target article (i.e., the less is better effect, the ephemeral reward effect, and the midsession reversal effect), and I respond to the critique of each phenomenon in turn.

The Less Is Better Effect

Carvalho et al. (2019) note that consistent effects have not always been found. When inconsistent effects occur, they may be taken as evidence of the unreliability of the effect, or they may define the boundary conditions of the phenomenon. For example, Beran et al. (2009) reported that chimpanzees prefer a single large portion of banana to a similar large portion together with a smaller portion, yet if the smaller portion is stacked on top of the larger portion, they then prefer the larger combined amount. Carvalho et al. suggest that the animals may be avoiding fragmented or nonuniform items, a choice that might protect the animal from food that has been interfered with and discarded. That explanation may account for the human broken-plate experiment (Hsee, 1998) and possibly even for the smaller piece of banana for the chimpanzee (Beran et al., 2009). But it does not explain the results with pigeons involving two different kinds of grain, both intact (Zentall et al., 2014), the finding with dogs involving a piece of carrot and a piece of cheese (Pattison & Zentall, 2014), or the results with monkeys involving a slice of cucumber and a grape (Kralik, Xu, Knight, Khan, & Levine, 2012).

Carvalho et al. (2019) would also like an explanation of the individual differences. Given that the pigeons showing the less is better effect were only minimally deprived of food, some of the individual differences may result from motivational differences. That hypothesis is consistent with the data from the one dog in the Pattison and Zentall (2014) study that had been rescued as a stray: This dog preferred the cheese together with the carrot to the cheese by itself. Certainly, I concur that further research is needed to determine the conditions under which the less is better effect can be found.

The Ephemeral Reward Task

Carvalho et al. (2019) question how impulsivity can account for the absence of an association between the first and second reinforcement by most primates, pigeons, and rats, and how the absence of impulsivity allows such associations to develop in wrasse and parrots. At a descriptive level, the animals that fail to choose optimally do not appear to base their choice on the events that follow the first reinforcement.

At a more theoretical level, however, impulsivity appears to be the mechanism responsible for suboptimal choice in the delay discounting procedure (Ainslie, 1974). The rate at which humans discount delayed rewards is significantly correlated with measures of impulsivity (e.g., Blaszczynski, Steel, & McConaghy, 1997; Petry, 2001; Vitaro, Arseneault, & Tremblay, 1997). When organisms prefer a small immediate reinforcer to a larger delayed reinforcer, the preference can be attributed to the ratio of the delay to the smaller-sooner reinforcer to the delay to the larger-later reinforcer. Forcing the organism to make its choice earlier (see Rachlin & Green, 1972) brings that ratio closer to 1:1. Imagine that the smaller-sooner reward is delayed by 1 s and the larger-later reward by 10 s; the ratio of the delays would be 1:10. However, if the choice had to be made 10 s earlier, the ratio would be 11:20, a ratio much closer to 1:1. Assuming that a similar mechanism is involved in the ephemeral reward task, if the rapid availability of the first reinforcement is responsible for the indifference between the two alternatives, then increasing the delay to the first reinforcement should make the ratio of the delays from the choice to the first and second reinforcements almost 1:1. Thus, the choice would be almost one reinforcement versus two reinforcements, both delayed by about 10 s.
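The arithmetic above can be made concrete with a minimal sketch. Assuming a standard hyperbolic discounting function, V = A / (1 + kD), with an illustrative discounting parameter k = 1.0 (the function form and parameter value are assumptions for illustration, not values from the target article), adding a common 10-s delay to both options reverses the preference from the smaller-sooner to the larger-later reward:

```python
# Sketch: preference reversal under hyperbolic discounting when a common
# delay is added to both options. V = A / (1 + k*D) and k = 1.0 are
# illustrative assumptions, not parameters from the target article.

def discounted_value(amount, delay, k=1.0):
    """Hyperbolic discounted value of `amount` received after `delay` seconds."""
    return amount / (1.0 + k * delay)

def preferred(small, large):
    """Return which of two (amount, delay) options has the higher discounted value."""
    if discounted_value(*small) > discounted_value(*large):
        return "smaller-sooner"
    return "larger-later"

# One reinforcer after 1 s vs. two reinforcers after 10 s (delay ratio 1:10):
# V = 1/2 = 0.50 vs. V = 2/11 ~= 0.18, so the impulsive choice wins.
print(preferred((1, 1), (2, 10)))    # prints "smaller-sooner"

# Force the choice 10 s earlier: delays become 11 s and 20 s (ratio 11:20):
# V = 1/12 ~= 0.083 vs. V = 2/21 ~= 0.095, so preference reverses.
print(preferred((1, 11), (2, 20)))   # prints "larger-later"
```

The same logic applies to the ephemeral reward task: delaying the first reinforcement pushes both delay terms toward equality, so the discounted values come to reflect mainly the number of reinforcers rather than their immediacy.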

The Midsession Reversal Task

We have suggested that some errors occur because the pigeons have difficulty remembering the previously chosen stimulus and, importantly, the consequence of that choice (Smith, Beckmann, & Zentall, 2017). But Carvalho et al. (2019) question that account because the pigeons rarely make errors during many of the early and late trials in the session, so memory cannot be a factor on those trials. Of course, there is no need to remember the last chosen stimulus and the outcome of that choice early and late in the session because, after many sessions of training, the appropriate early and late choices would be well established in reference memory. The only ambiguity would be in about the middle third of the session, when the estimated time to the reversal would be least accurate, and there a reminder of the last choice made and its consequence would be most helpful. Smith et al. (2017) found that a reminder of the last choice made (using distinctive houselights) and its consequence (maintaining the feeder light following reinforcement) resulted in a significant reduction in errors. Such memory loss does not explain all of the errors, however, because even with the reminders, pigeons continue to make both anticipatory and perseverative errors, just fewer of them.

Carvalho et al. (2019) also suggest that timing cannot be the only cue, because when the duration of the intertrial interval is halved, pigeons switch shortly after the reversal, not at the end of the session, so clearly local cues based on feedback from reinforcement and its absence do play a role. But as we have shown (Rayburn-Reeves, Molet, & Zentall, 2011), pigeons respond to the feedback from an unpredictable reversal much more slowly when it occurs early in the session than when it occurs later in the session.

Carvalho et al. (2019) question my interpretation of why reducing the probability of reinforcement for correct S2 responses (from 100% to 20%) reduces errors (especially anticipatory errors) but reducing the probability of reinforcement for correct S1 responses (from 100% to 20%) actually increases errors (especially anticipatory errors; Santos, Soares, Vasconcelos, & Machado, 2019). In my target article, I suggested that reducing the probability of reinforcement for correct S2 responses from 100% to 20% reduces the response competition between S1 and S2 and encourages the pigeons to choose based on the consequences of choice of S1 alone. Carvalho et al. note that if response competition accounts for the effect, it should be symmetrical, but it is not.

Clearly, more is needed to account for the absence of symmetry. When correct choices of S1 are reinforced only 20% of the time during the first half of the session, the problem is that the feedback from choice of S1 is ambiguous (nonreinforcement occurs most of the time prior to the reversal), so it is difficult to respond to the reversal on the basis of those local cues. Feedback from choice of S2 is not ambiguous, but choice of S2 represents an anticipatory error. Thus, the only reliable source of local cues is correct choice of S2 following the reversal, and for this reason the pigeons make a large number of anticipatory errors to test whether the reversal has occurred. That is, in the 20% reinforcement of correct S1 choices condition, the usual local cues for the reversal are virtually absent and only global cues remain; the major source of local cues is therefore the feedback from anticipatory errors. Of course, the pigeons are also timing, so they do not make anticipatory errors early in the session.

There is convergent evidence for the hypothesis that the reduction of reinforcement for correct choice of S2 reduces response competition by shifting the pigeons’ attention from reinforcement associated with choice of S2 to nonreinforcement associated with choice of S1. We have recently found that increasing the response requirement for choice of S2 (from one peck to 10 pecks) has an effect similar to the reduction of reinforcement for correct choice of S2 (Zentall, Andrews, Case, & Peng, 2019). Increasing the response requirement for all S2 choices results in a decrease in anticipatory errors without a concomitant increase in perseverative errors. In this case, the added response requirement for choosing S2 delays (rather than omits) reinforcement, but it also shifts attention from the consequence of choice of S2 to the consequence of choice of S1, especially as the reversal approaches and shortly thereafter.

According to Carvalho et al. (2019), an alternative explanation is that the difference in reinforcement probability between S1 (20%) and S2 (100%) biases the pigeons’ time-estimate of the reversal moment. But to be consistent, the S1 (100%) and S2 (20%) condition should bias the pigeons’ time estimation in the opposite direction. If that were the case, however, the decrease in anticipatory errors should be accompanied by an increase in perseverative errors, but there was no evidence of an increase in perseverative errors. According to Carvalho et al., the idea of “cue competition remains so vague that its empirical test is virtually impossible” (p. 30). To adequately test this hypothesis, one would have to propose a reasonable (discriminable) alternative. As already noted, a bias in the pigeons’ time estimation does not account for the asymmetrical effects of 20% reinforcement of correct S2 choices and 20% reinforcement of correct S1 choices. Furthermore, why the absence of reinforcement for correct choices of either kind should bias the pigeons’ time estimation is not clear.

Carvalho et al. (2019) suggest that we need to specify the conditions under which pigeons use global cues (e.g., timing) and local cues (e.g., responses and their outcomes). In fact, Rayburn-Reeves, Qadri, Brooks, Keller, and Cook (2017, Experiment 2a) attempted to distinguish between global and local cues by providing pigeons with a cue during the intertrial interval that indicated whether the trials were from the first or second half of the session. When they occasionally miscued the pigeons either early or late in the session by presenting the wrong cue (Rayburn-Reeves et al., 2017, Experiment 2c), the miscue had little effect. That is, early and late in the session, the global timing cue controlled choice, whereas as the reversal approached and immediately after the reversal, the intertrial interval cue exerted control over choice.

In the conclusion section of the target article, I suggested that evolved heuristics may account for some instances of suboptimal choice. Carvalho et al. (2019) noted that “the evidence for the specific heuristics is conspicuously missing” (p. 30). The hypothesis that evolved heuristics may predispose animals to make choices that in nature, on average, would be more rewarding or less dangerous is meant as a challenge to look for such mechanisms in nature. If such mechanisms could be identified, it would help identify the origins of the suboptimal behavior found in the laboratory.

Furthermore, the generalization of such a predisposition to laboratory conditions suggests the relative inflexibility of the animal’s behavior in the face of altered contingencies. Carvalho et al. (2019) feel that it may be inappropriate to hypothesize about evolved heuristics to explain suboptimal choice in laboratory experiments. They are not reluctant, however, to hypothesize that chimpanzees may prefer a 20 g piece of banana to that piece together with a 5 g piece because they have an aversion to “fragmented” or “discarded” food. In both cases, evolved heuristics may provide a useful starting point for explaining certain suboptimal choice phenomena, but once one has imagined a naturally occurring event that might account for the behavior, as Carvalho et al. note, one should not be satisfied that one understands it.

Vasconcelos et al. (2019) make an even stronger case than I make in the target article for the evolution of behavior in the natural environment. They argue that the degree to which the current circumstances match the organism’s typical ecology provides the determining factor in the degree of optimality of the animal’s behavior. As Vasconcelos et al. state, “It all comes to the match between the domain of selection and the domain of testing (Houston, McNamara, & Steer, 2007; Stevens & Stephens, 2010)” (p. 39).

I have no argument with this position except to note that ontogenetic adaptability might be considered an important attribute to possess, especially in a rapidly changing environment of the kind that many species are now encountering (e.g., with climate change and the reduction of habitat). It may be species-centric to favor behavioral flexibility as a trait that readily accommodates novel reinforcement contingencies, but it is certainly worthy of study.

Carvalho et al.’s (2019) conclusion that “we need to define [heuristics] clearly, identify the conditions that activate them, and coordinate them with currently known behavioral processes” (p. 31) is certainly correct. However, calling the mechanisms that result in suboptimal choices heuristics is not just an ad hoc term that pretends to explain the behavior. Instead, it suggests certain testable hypotheses about the conditions under which the suboptimal behavior should occur.

In general, heuristics are decision rules, triggered by environmental cues, that reflect natural predispositions or well-learned behaviors and are generally appropriate under conditions that favor a rapid response. Heuristics are responses under the control of what Kahneman (2011) referred to as System 1 (less cognitive and more automatic than System 2). What is the evidence that heuristics are responsible for suboptimal choice? The use of heuristics is assumed to result in appropriate choices under most (naturally occurring) conditions, especially when there is a cost to acquiring additional information. If this is correct, it should be possible to reduce suboptimal choice by imposing a delay prior to experiencing the consequences of one’s choice, thus allowing one to obtain additional information. We have found that adding a delay following choice reduces suboptimal choice in several cases in which it otherwise occurs: ephemeral rewards (Zentall, Case, & Berry, 2017), unskilled gambling-like tasks (Zentall, Andrews, & Case, 2017; see also McDevitt, Spetch, & Dunn, 1997), and object permanence (Zentall & Raley, 2019). The hypothesis that heuristics are responsible for suboptimal choice is not merely “ad hoc speculation” (Carvalho et al., 2019, p. 31). Instead, it provides a direction for further research, as Carvalho et al. (2019) suggest.


  1. Ainslie, G. W. (1974). Impulse control in pigeons. Journal of the Experimental Analysis of Behavior, 21, 485–489. doi:10.1901/jeab.1974.21-485

  2. Beran, M. J. (2019). All hail suboptimal choice! Now, can we “fix” it? Comparative Cognition and Behavior Reviews, 14, 19–23. doi:10.3819/CCBR.2019.140002

  3. Beran, M. J., Evans, T. A., & Ratliff, C. L. (2009). Perception of food amounts by chimpanzees (Pan troglodytes): The role of magnitude, continuity, and wholeness. Journal of Experimental Psychology: Animal Behavior Processes, 35, 516–524. doi:10.1037/a0015488

  4. Blaszczynski, A., Steel, Z., & McConaghy, N. (1997). Impulsivity in pathological gambling: the antisocial impulsivist. Addiction, 92, 75–87. doi:10.1111/j.1360-0443.1997.tb03639.x

  5. Bower, G. H. (1961). A contrast effect in differential conditioning. Journal of Experimental Psychology, 62, 196–199. doi:10.1037/h0048109

  6. Carvalho, M., Santos, C., Soares, C., & Machado, A. (2019). Meliorating the suboptimal-choice argument. Comparative Cognition and Behavior Reviews, 14, 25–32. doi:10.3819/CCBR.2019.140003

  7. Case, J. P., & Zentall, T. R. (2018). Suboptimal choice in pigeons: Does the predictive value of the conditioned reinforcer alone determine choice? Behavioural Processes, 157, 320–326. doi:10.1016/j.beproc.2018.07.018

  8. Clement, T. S., Feltus, J., Kaiser, D. H., & Zentall, T. R. (2000). “Work ethic” in pigeons: Reward value is directly related to the effort or time required to obtain the reward. Psychonomic Bulletin & Review, 7, 100–106. doi:10.3758/BF03210727

  9. Crespi, L. P. (1942). Quantitative variation in incentive and performance in the white rat. American Journal of Psychology, 55, 467–517. doi:10.2307/1417120

  10. Dunn, R., & Spetch, M. (1990). Conditioned reinforcement on schedules with uncertain outcomes. Journal of the Experimental Analysis of Behavior, 53, 201–218. doi:10.1901/jeab.1990.53-201

  11. Flaherty, C. F. (1982). Incentive contrast: A review of behavioral changes following shifts in reward. Animal Learning & Behavior, 10, 409–440. doi:10.3758/BF03212282

  12. Houston, A. I., McNamara, J. M., & Steer, M. D. (2007). Do we expect natural selection to produce rational behaviour? Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 362, 1531–1543. doi:10.1098/rstb.2007.2051

  13. Hsee, C. K. (1998). Less is better: When low-value options are valued more highly than high-value options. Journal of Behavioral Decision Making, 11, 107–121. doi:10.1002/(SICI)1099-0771(199806)11

  14. Hull, C. L. (1952). A behavioral system: An introduction to behavior theory concerning the individual organism. New Haven, CT: Yale University Press.

  15. Kahneman, D. (2011). Thinking, fast and slow. London: Macmillan.

  16. Kralik, J. D., Xu, E. R., Knight, E. J., Khan, S. A., & Levine, W. J. (2012). When less is more: Evolutionary origins of the affect heuristic. PLoS ONE, 7, e46240. doi:10.1371/journal.pone.0046240

  17. McDevitt, M., Spetch, M., & Dunn, R. (1997). Contiguity and conditioned reinforcement in probabilistic choice. Journal of the Experimental Analysis of Behavior, 68, 317–327. doi:10.1901/jeab.1997.68-317

  18. Mellgren, R. L. (1972). Positive and negative contrast effects using delayed reinforcement. Learning and Motivation, 3, 185–193. doi:10.1016/0023-9690(72)90038-0

  19. Parrish, A. E., Evans, T. A., & Beran, M. J. (2015). Defining value through quantity and quality—Chimpanzees (Pan troglodytes) undervalue food quantities when items are broken. Behavioural Processes, 111, 118–126. doi:10.1016/j.beproc.2014.11.004

  20. Pattison, K. F., & Zentall, T. R. (2014). Suboptimal choice by dogs: When less is better than more. Animal Cognition, 17, 1019–1022. doi:10.1007/s10071-014-0735-2

  21. Petry, N. M. (2001). Pathological gamblers, with and without substance use disorders, discount delayed rewards at high rates. Journal of Abnormal Psychology, 110, 482–487. doi:10.1037/0021-843X.110.3.482

  22. Pisklak, J. M., McDevitt, M. A., & Dunn, R. M. (2019). Clarifying contrast, acknowledging the past, and expanding the focus. Comparative Cognition and Behavior Reviews, 14, 33–38. doi:10.3819/CCBR.2019.140004

  23. Rachlin, H., & Green, L. (1972). Commitment, choice and self-control. Journal of the Experimental Analysis of Behavior, 17, 15–22. doi:10.1901/jeab.1972.17-15

  24. Rayburn-Reeves, R. M., Molet, M., & Zentall, T. R. (2011). Simultaneous discrimination reversal learning in pigeons and humans: Anticipatory and perseverative errors. Learning & Behavior, 39, 125–137. doi:10.3758/s13420-010-0011-5

  25. Rayburn-Reeves, R. M., Qadri, M. A. J., Brooks, D. I., Keller, A. M., & Cook, R. G. (2017). Dynamic cue use by pigeons in a midsession reversal. Behavioural Processes, 137, 53–63. doi:10.1016/j.beproc.2016.09.002

  26. Reynolds, G. S. (1961). Behavioral contrast. Journal of the Experimental Analysis of Behavior, 4, 57–71. doi:10.1901/jeab.1961.4-57

  27. Sánchez-Amaro, A., Peretó, M., & Call, J. (2016). Differences in between-reinforcer value modulate the selective-value effect in great apes (Pan troglodytes, Pan paniscus, Gorilla gorilla, Pongo abelii). Journal of Comparative Psychology, 130, 1–12. doi:10.1037/com0000014

  28. Santos, D. C., Soares, C., Vasconcelos, M., & Machado, A. (2019). The effect of reinforcement probability on time discrimination in the midsession reversal task. Journal of the Experimental Analysis of Behavior. Advance online publication. doi:10.1002/jeab.513

  29. Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century.

  30. Smith, A. P., Beckmann, J. S., & Zentall, T. R. (2017). Gambling-like behavior in pigeons: “Jackpot” signals promote maladaptive risky choice. Scientific Reports, 7, 6625. doi:10.1038/s41598-017-06641-x

  31. Smith, A. P., & Zentall, T. R. (2016). Suboptimal choice in pigeons: Choice is based primarily on the value of the conditioned reinforcer rather than overall reinforcement rate. Journal of Experimental Psychology: Animal Behavior Processes, 42, 212–220. doi:10.1037/xan0000092

  32. Stevens, J. R., & Stephens, D. W. (2010). The adaptive nature of impulsivity. In G. J. Madden & W. K. Bickel (Eds.), Impulsivity: The behavioral and neurological science of discounting (pp. 361–387). Washington, DC: American Psychological Association. doi:10.1037/12069-013

  33. Vasconcelos, M., González, V. V., & Macías, A. (2019). Evolved psychological mechanisms as constraints on optimization. Comparative Cognition and Behavior Reviews, 14, 39–42. doi:10.3819/CCBR.2019.140005

  34. Vitaro, F., Arseneault, L., & Tremblay, R. E. (1997). Dispositional predictors of problem gambling in male adolescents. American Journal of Psychiatry, 154, 1769–1770. doi:10.1176/ajp.154.12.1769

  35. Williams, B. A. (1983). Another look at contrast in multiple schedules. Journal of the Experimental Analysis of Behavior, 39, 345–384. doi:10.1901/jeab.1983.39-345

  36. Williams, B. A. (1992). Inverse relations between preference and contrast. Journal of the Experimental Analysis of Behavior, 58, 303–312. doi:10.1901/jeab.1992.58-303

  37. Zentall, T. R., Andrews, D. M., & Case, J. P. (2017). Prior commitment: Its effect on suboptimal choice in a gambling-like task. Behavioural Processes, 145, 1–9. doi:10.1016/j.beproc.2017.09.008

  38. Zentall, T. R., Andrews, D. M., Case, J. P., & Peng, D. (2019). Less information results in better midsession reversal accuracy by pigeons. Manuscript submitted for publication.

  39. Zentall, T. R., Case, J. P., & Berry, J. R. (2017). Early commitment facilitates optimal choice by pigeons. Psychonomic Bulletin & Review, 24, 957–963.

  40. Zentall, T. R., Laude, J. R., Case, J. P., & Daniels, C. W. (2014). Less means more for pigeons but not always. Psychonomic Bulletin & Review, 21, 1623–1628. doi:10.3758/s13423-014-0626-1

  41. Zentall, T. R., & Raley, O. L. (2019). Object permanence in the pigeon (Columba livia): Insertion of a delay prior to choice facilitates visible- and invisible-displacement accuracy. Journal of Comparative Psychology, 133, 132–139. doi:10.1037/com0000134