The Critical Human Elements in Using Artificial Intelligence in Comparative Cognition Studies
Abstract
The past 5 years have seen a revolution in the use of artificial intelligence (AI) across a wide range of disciplines, from the sciences to education to the performing arts (Haenlein & Kaplan, 2019). Its ability to change how society collects, distills, and interprets information offers immense promise for potential breakthroughs while requiring reflection on the ethics and limitations of its capacity. For nonspecialists, AI is the ability of computer programs to emulate human decision making and perform tasks in everyday environments. Within the scope of AI falls machine learning (ML), which refers to the specific technologies or algorithms that enable computer systems to identify patterns, make decisions, and improve accuracy through increased experience and interaction with the data of interest (Kok et al., 2009). In this article, we provide examples of how AI is transforming the field of animal cognition and behavior, especially within the discipline of animal communication. We discuss several of AI’s contributions to deciphering how animals exchange and interpret information and consider what we risk losing along this path. We hope to begin an evolving conversation about the use of AI in studies of animal behavior and reflect on whether AI can truly enhance meaningful outcomes in our pursuit of understanding nature.
Keywords: animal behavior, communication, observations, algorithms, validation
One need not think too hard before realizing the potential contribution of artificial intelligence (AI) to the field of comparative cognition. Researchers within this discipline are motivated primarily by the desire to understand how animals think and behave within their environment and the evolutionary consequences of different approaches to decision making. To do so, we must identify individuals, observe and record their behavior, categorize several behavioral metrics of interest that can be easily defined and interpreted, and test hypotheses. A compelling benefit of AI is its inherent objectivity, enabling researchers to overcome human subjective biases and detect patterns that are difficult to perceive and categorize. Although the field has developed ways to address these biases (e.g., observer reliability tests), learning algorithms can be used to cluster behaviors (especially those that may be more nuanced) without human input. AI approaches have enabled researchers to capture rare but important behaviors and increase the number of observations within their study. For example, deep neural networks can provide frame-by-frame tracking of animals from video data. Barrett et al. (2020) used machine learning to characterize the manual dexterity of mice during food handling. They found that mice exhibit multiple distinct microstructural features that can be generalized across food types. Such automated methods may improve the type of data that can be collected in the field or the laboratory and can analyze vast quantities of information that would take researchers months or even years to sift through by hand (e.g., Hoffman et al., 2023).
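The observer reliability tests mentioned above are commonly summarized with a chance-corrected agreement statistic. As a minimal sketch, assuming two observers have independently coded the same sequence of video frames, Cohen's kappa could be computed as follows. The function name `cohens_kappa`, the behavior labels, and the example codes are hypothetical illustrations and are not drawn from any study cited here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two observers' categorical codes."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both observers must code the same, non-empty set of samples.")
    n = len(labels_a)

    # Proportion of samples on which the two observers assigned the same code.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each observer's marginal code frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both observers used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for eight video frames (assumed data, for illustration only).
observer_1 = ["groom", "feed", "feed", "rest", "groom", "rest", "feed", "rest"]
observer_2 = ["groom", "feed", "rest", "rest", "groom", "rest", "feed", "feed"]

kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 3))  # 0.619
```

Here the observers agree on six of eight frames (0.75), but roughly a third of that agreement would be expected by chance, so kappa is lower than the raw agreement. The same statistic could in principle be used to compare an algorithm's behavioral categories against a human coder's, which is one way to check automated output against human judgment.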
Additionally, AI has expanded how experiments with animals can be conducted. Virtual reality programs, including FreemoVR, enable controlled interactions between animals (including mice and fish) and simulated environments (Stowers et al., 2017). For example, real-time head position and eye tracking data from zebrafish, collected through deep neural networks, have been used to improve the accuracy of experiments using virtual reality to expose fish to dynamic environments and social interactions with virtual fish (Forkosh, 2021). Similarly, researchers have developed a tiny dancing honeybee robot named RoboBee that was programmed to effectively communicate with other bees inside the hive (Landgraf et al., 2018). Machine learning has also been used to improve the lives of animals living in captivity. Machine learning systems that track the behavioral and physiological state of animals can detect changes in body temperature, sound production, and even deviations in the physical appearance of animals, which may indicate the presence of parasites or disease (Cuan et al., 2020; Neethirajan, 2020; Patel et al., 2022).
Perhaps the topic that has captured the most attention in our field is the use of AI to decode nonhuman communication systems. This rekindled interest in “speaking with animals” follows groundbreaking success in using machine learning and large language models to evaluate and expand our understanding of human language (for a review, see Hadi et al., 2023). Linguists, biologists, and cognitive scientists have deep-rooted motivation to understand whether the capacity for language is truly unique to humans or is shared by different species. Although our desire to understand and communicate with animals dates back to the earliest evidence of human culture (Fögen, 2014), many researchers have proposed that we can now harness the power of AI and large language models to accomplish interspecies communication. This idea, referred to as the “Doctor Dolittle Challenge” (aptly named by Yovel & Rechavi, 2023), has spurred multiple AI-based multidisciplinary initiatives that aim to break the communicative barriers between humans and other species. This effort is driven by engineers and programmers as much as biologists and behavioral scientists and includes groups such as Earth Species Project (ESP); Communication and Coordination Across Scales (CCAS); Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR); Interspecies Internet; and Project CETI (Cetacean Translation Initiative). Each collective varies with respect to what it aims to achieve (from the extreme undertaking of understanding the language of whales, dolphins, and porpoises to simply using AI as an overall tool to better understand information transfer and decision making among individuals), but they all propose that machine learning will revolutionize the field of animal communication.
Numerous challenges are inherent in unraveling what animals are saying to each other. Yovel and Rechavi (2023) outlined several criteria that must be fulfilled to communicate with animals, including developing machines that communicate with animals using their own signals in a variety of contexts and proving that individual animals respond to the machines as if they were responding to a conspecific. One fundamental component of this effort is deciphering where communicative units (songs, chirps, or buzzes) begin and end. Many of the aforementioned initiatives attempt to tackle this issue. For example, Project CETI’s primary mission is to achieve advancements and breakthroughs in interspecies communication, with a focus on sperm whales. Its members provide a scientific road map toward understanding sperm whale communication by processing massive data sets, detecting basic communication units, searching for language-like higher-level structures, and validating models through interactive playback experiments (Andreas et al., 2022). Given the state of the field of marine bioacoustics and the scarcity of information available on the functionality of most cetacean communication signals, this initiative is of great interest to those who study signal production and reception among nonhuman animals. Projects like these have captured the attention of the public and motivated investors who hope to be a part of the effort that accomplishes interspecies communication and dismantles the idea that language systems are uniquely human. Ironically, millions of dollars are pouring into such initiatives at a time when attaining funding for long-term foundational studies of animal behavior is increasingly challenging.
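To make the unit-boundary problem described above concrete, the following is a deliberately simple sketch of one classical approach: thresholding an amplitude envelope and merging detections separated by very short gaps. It is not the method used by Project CETI or any other initiative named here. The function `segment_units`, its parameters, and the envelope values are hypothetical, and real recordings with overlapping callers or variable background noise would require far more than a fixed threshold.

```python
def segment_units(envelope, threshold, min_gap=2):
    """Return (start, end) sample indices (end exclusive) of candidate units.

    A unit is a run of samples whose amplitude exceeds `threshold`. Runs
    separated by fewer than `min_gap` quiet samples are merged into one unit.
    """
    # Pass 1: find every contiguous run of samples above the threshold.
    segments = []
    start = None
    for i, value in enumerate(envelope):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(envelope)))

    # Pass 2: merge runs that are separated by only a brief dip in amplitude.
    merged = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], seg[1])
        else:
            merged.append(seg)
    return merged


# Hypothetical amplitude envelope (assumed values, for illustration only).
envelope = [0.0, 0.1, 0.8, 0.9, 0.1, 0.7, 0.0, 0.0, 0.0, 0.6, 0.9, 0.8, 0.1]

units = segment_units(envelope, threshold=0.5, min_gap=2)
print(units)  # [(2, 6), (9, 12)]
```

Note that both `threshold` and `min_gap` are choices made by the researcher. Whether the brief dip at sample 4 separates two units or falls within one is exactly the kind of question that depends on knowledge of the species and cannot be settled by the algorithm alone.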
The theoretical road maps laid out by these groups include exciting new tools to detect and classify animal vocalizations, especially within large and arduous data sets. When our research group first began to explore these efforts in earnest, we were sifting through tens of thousands of overlapping common dolphin whistles recorded near the Channel Islands and trying our best to imagine some reasonable path forward. Very little is known about the behavior of this species, and we were desperate to learn more about what these signals meant, who they were intended for, and how they influenced group-level behavior (Casey et al., in press). Members of our group even met with a representative from ESP to ask about potential new information we might garner from these recordings. It became clear that while AI-based tools could assist with parsing, measuring, and even classifying discrete signals, there were also broad limitations of automated approaches. Without more knowledge of group composition and the behavioral context of individuals, deciphering the meaning of a given whistle would be impossible. As lifelong students of marine mammal behavior, we can appreciate the enormous challenges of observing and interpreting the behavior of animals that spend most of their lives beneath the water’s surface. The media surrounding AI-centric approaches to decoding animal communication systems often oversimplifies the connection between the machine-learning-generated groupings of potentially important signals and the behavioral experiments needed to test their significance. The ability to conduct the latter still depends heavily on our understanding of a given animal’s biology and the behavioral context in which information exchange occurs. Our capacity to decipher what common dolphins were saying to each other still began and ended with foundational experiences with these animals in nature.
To date, some of the most significant discoveries in the field of animal communication have come from researchers having immersed themselves in the lives of animals to understand their behavior. Only after years of careful observation have we isolated specific signals and experimentally tested whether animals respond differently and appropriately to these stimuli in different contexts. Through a series of elegant experiments that spanned several decades, Karl von Frisch and his collaborators showed that bees communicate the distance and direction of food sources via their figure-eight waggle dances (von Frisch & Lindauer, 1956). Fifteen years of observational study led to the discovery that Japanese tits emit a unique alarm call that elicits an antipredator response specific to the presence of snakes (Suzuki, 2018). Long-term studies of bottlenose dolphins have shown that each animal produces its own individually distinctive signature whistle that is learned early in life and helps individuals to recognize and maintain contact with conspecifics (Caldwell & Caldwell, 1965; Caldwell et al., 1990). These efforts were successful because they included carefully curated data collection gained over a long period, and their scientific relevance was augmented by the depth of multimodal knowledge gleaned throughout the course of each study. Moving forward, how will we teach emerging students in our field? Will they be compelled to go outside and spend countless hours tracking and observing songbirds, or will they sit behind a computer and use big data to deduce vocal patterns and infer their meaning? We hope it is a combination of both. 
Although our field has always tried to remove our human-centric lens when interpreting the way in which animals perceive the world around them, ironically the only way we may ever accomplish this is to continue to immerse ourselves in their environment and, to the best of our abilities, try to imagine and empathize with what it may be like to live in their world. This perspective has been validated by Hoeschele et al. (2023), who advocate for the incorporation of the objective human-centric approach to understanding animal cognition. Paired with rigorous experimentation, our human experiences and patterns can inform and even improve our ability to interpret animal behavior (e.g., Mann et al., 2021).
AI is a dynamic, multidisciplinary tool that is certain to enhance the field of comparative cognition. However, like any effective analytical tool, it is only as good as the data that it is fed and the concepts that it is testing. In the field’s attempt to produce a “human-free” interpretation of behavior using AI, we risk losing the essential human element of thoughtful and meticulous observation that leads to creative data collection and rigorous hypothesis development—both of which are essential inputs for any analytical testing. How do we reconcile the capacity of powerful new AI-based tools with the human experience that researchers gain through long-term field studies? The power of AI lies in its ability to illuminate the dark corners of our data that we may not see. Its results are maximized through their interpretation, which comes from wisdom gleaned through those who dedicate their lives to studying animals in nature.
References
Andreas, J., Beguš, G., Bronstein, M. M., Diamant, R., Delaney, D., Gero, S., Goldwasser, S., Gruber, D. F., de Haas, S., Malkin, P., & Pavlov, N. (2022). Toward understanding the communication in sperm whales. iScience, 25(6), Article 104393. https://doi.org/10.1016/j.isci.2022.104393
Barrett, J. M., Raineri Tapies, M. G., & Shepherd, G. M. (2020). Manual dexterity of mice during food-handling involves the thumb and a set of fast basic movements. PLOS ONE, 15(1), Article e0226774. https://doi.org/10.1371/journal.pone.0226774
Caldwell, M. C., & Caldwell, D. K. (1965). Individualized whistle contours in bottle-nosed dolphins (Tursiops truncatus). Nature, 207(4995), 434–435. https://doi.org/10.1038/207434a0
Caldwell, M. C., Caldwell, D. K., & Tyack, P. L. (1990). Review of the signature-whistle hypothesis for the Atlantic bottlenose dolphin. In S. Leatherwood & R. R. Reeves (Eds.), The bottlenose dolphin (pp. 199–234). Academic Press. https://doi.org/10.1016/B978-0-12-440280-5.50014-7
Casey, C., Fregosi, S., Oswald, J., Janik, V., Visser, F., & Southall, B. (in press). Common dolphin whistle response to experimental mid-frequency sonar. PLOS ONE.
Cuan, K., Zhang, T., Huang, J., Fang, C., & Guan, Y. (2020). Detection of avian influenza-infected chickens based on a chicken sound convolutional neural network. Computers and Electronics in Agriculture, 178, Article 105688. https://doi.org/10.1016/j.compag.2020.105688
Fögen, T. (2014). Animal communication. In G. L. Campbell (Ed.), The Oxford handbook of animals in classical thought and life (pp. 216–232). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199589425.013.013
Forkosh, O. (2021). Animal behavior and animal personality from a non-human perspective: Getting help from the machine. Patterns, 2(3), Article 100194. https://doi.org/10.1016/j.patter.2020.100194
Hadi, M. U., Al-Tashi, Q., Qureshi, R., Shah, A., Muneer, A., Irfan, M., Zafar, A., Shaikh, M. B., Akhtar, N., Wu, J., & Mirjalili, S. (2023). A survey on large language models: Applications, challenges, limitations, and practical usage. TechRxiv, July 10, 2023. https://doi.org/10.36227/techrxiv.23589741.v1
Haenlein, M., & Kaplan, A. (2019). A brief history of artificial intelligence: On the past, present, and future of artificial intelligence. California Management Review, 61(4), 5–14. https://doi.org/10.1177/0008125619864925
Hoeschele, M., Wagner, B., & Mann, D. C. (2023). Lessons learned in animal acoustic cognition through comparisons with humans. Animal Cognition, 26(1), 97–116. https://doi.org/10.1007/s10071-022-01735-0
Hoffman, B., Cusimano, M., Baglione, V., Canestrari, D., Chevallier, D., DeSantis, D. L., Jeantet, L., Ladds, M. A., Maekawa, T., Mata-Silva, V., & Moreno-González, V. (2023). A benchmark for computational analysis of animal behavior, using animal-borne tags. arXiv, abs/2305.10740.
Kok, J. N., Boers, E. J., Kosters, W. A., Van der Putten, P., & Poel, M. (2009). Artificial intelligence: Definition, trends, techniques, and cases. Artificial Intelligence, 1, 270–299.
Landgraf, T., Bierbach, D., Kirbach, A., Cusing, R., Oertel, M., Lehmann, K., Greggers, U., Menzel, R., & Rojas, R. (2018). Dancing honey bee robot elicits dance-following and recruits foragers. arXiv, abs/1803.07126.
Mann, D. C., Fitch, W. T., Tu, H.-W., & Hoeschele, M. (2021). Universal principles underlying segmental structures in parrot song and human speech. Scientific Reports, 11, Article 776. https://doi.org/10.1038/s41598-020-80340-y
Neethirajan, S. (2020). The role of sensors, big data and machine learning in modern animal farming. Sensing and Bio-Sensing Research, 29, Article 100367. https://doi.org/10.1016/j.sbsr.2020.100367
Patel, H., Samad, A., Hamza, M., Muazzam, A., & Harahap, M. K. (2022). Role of artificial intelligence in livestock and poultry farming. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 7(4), 2425–2429. https://doi.org/10.33395/sinkron.v7i4.11837
Stowers, J., Hofbauer, M., Bastien, R., Griessner, J., Higgins, P., Farooqui, S., Fischer, R. M., Nowikovsky, K., Haubensak, W., Couzin, I. D., Tessmar-Raible, K., & Straw, A. D. (2017). Virtual reality for freely moving animals. Nature Methods, 14, 995–1002. https://doi.org/10.1038/nmeth.4399
Suzuki, T. N. (2018). Alarm calls evoke a visual search image of a predator in birds. Proceedings of the National Academy of Sciences, 115(7), 1541–1545. https://doi.org/10.1073/pnas.1718884115
von Frisch, K., & Lindauer, M. (1956). The “language” and orientation of the honey bee. Annual Review of Entomology, 1(1), 45–58. https://doi.org/10.1146/annurev.en.01.010156.000401
Yovel, Y., & Rechavi, O. (2023). AI and the Doctor Dolittle challenge. Current Biology, 33(15), R783–R787. https://doi.org/10.1016/j.cub.2023.06.063