By Debbie Sniderman
Behavioral ecologists study ecological and evolutionary aspects of animal behavior and their adaptation to surrounding environments. Traditionally, they had to visit the field, take notes, and make observations when gathering behavioral data. This was a sparse form of observation fraught with limitations. Today, an abundance of data about wild populations is available at scales that are orders of magnitude richer than before, thanks to photos, videos, sensors, and new collection technologies such as GPS, high-definition cameras, unmanned aerial vehicles (UAVs), genotyping, and crowdsourcing. Computational investigation of this data is fundamentally changing the way biologists study nature through analysis, hypothesis formation, and visualization of complex data sets.
The computational work of Tanya Berger-Wolf (University of Illinois at Chicago) has facilitated the scientific process of understanding animal sociality at individual, group, and interaction levels in the context of animals’ own environments.
“Even the questions being asked are changing because of the data,” Berger-Wolf says. “Visualizations need to change to provide answers that aren’t only in terms of text, but also appeal to the visual ways through which humans process information in an immersive dynamic environment where an analysis can be overlaid into a virtual world, such as a construction of an African Savannah with individual animals moving around it.”
Berger-Wolf is pioneering the analysis of high-resolution data for behavioral scientists, who have struggled with its use over the last decade. Computational techniques are exploratory and offer predictive models and tools to explain why animals are social and how they move. Such techniques also offer insights about how to find and identify key individuals in a group, how the group makes decisions, and whom it decides to follow. Computational pattern recognition techniques identify specific animals with unique markings, helping answer questions about population dynamics and ranges.
Berger-Wolf uses data from a wide variety of sources to provide scientific insight into collective behavior of animals such as zebras, baboons, and humans. Analyzing animal trajectories and movement patterns allows her group to identify long-term affiliates, group leaders, and individuals that initiate changes, for example from browsing to moving in coordinated groups.
Behavioral ecologists use video cameras to track insects and small animals, such as ants and frogs, that are tagged with Quick Response (QR) codes, color codes, or fluorescent numbers. Larger animals wear GPS collars with solar batteries, which record location and allow proximity within a network to be inferred. Computer vision provides even more detailed information, such as the direction in which animals are looking. In the future, tiny radio antennas may help observe bird and insect migration [1].
Animal signatures such as stripes, spots, notches, or wrinkles are unique and easily identifiable from any angle through image data. Berger-Wolf and her colleagues have developed an algorithm (HotSpotter) for automatically recognizing individual animals from images using these visual signatures, and built an Image-Based Ecological Information System (IBEIS) that allows tracking of individuals and populations using this data. The identification algorithm uses algebraic scale-invariant feature transform (SIFT) features to find key pixels in images that are invariant to photo angles and scales. It then matches those pixels across different images to determine whether the images show the same individual.
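The matching step can be illustrated with a minimal sketch. It is not HotSpotter's actual scoring, which this article does not detail; it swaps in the classic nearest-neighbor ratio test that is commonly paired with SIFT descriptors. Toy three-dimensional tuples stand in for real descriptors, and the function names, the `ratio` threshold, and `min_matches` are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_test_matches(query_desc, db_desc, ratio=0.75):
    """Return (query_index, db_index) pairs that pass the ratio test.

    A query descriptor is matched to its nearest database descriptor only
    if that neighbor is clearly closer than the second-nearest one.
    """
    matches = []
    for qi, q in enumerate(query_desc):
        ranked = sorted((euclidean(q, d), di) for di, d in enumerate(db_desc))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_i), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((qi, best_i))
    return matches

def same_individual(query_desc, db_desc, min_matches=2, ratio=0.75):
    """Hypothetical decision rule: enough distinctive matches means same animal."""
    return len(ratio_test_matches(query_desc, db_desc, ratio)) >= min_matches

# Toy descriptors (made up): zebra_a resembles the query, zebra_b does not.
query = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
zebra_a = [(0.9, 0.1, 0.0), (0.0, 0.95, 0.05), (0.5, 0.5, 0.5)]
zebra_b = [(0.5, 0.5, 0.0), (0.5, 0.4, 0.1), (0.4, 0.5, 0.1)]

print(same_individual(query, zebra_a))  # True
print(same_individual(query, zebra_b))  # False
```

The ratio test rejects ambiguous matches: in `zebra_b` every query descriptor has two almost equally close candidates, so none is accepted.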
Margaret Crofoot (University of California, Davis), a collaborator of Berger-Wolf, tracked entire baboon troops for 30 days using GPS collars at one-second intervals. International Space Station (ISS) receiver transmitters then collected location data, animal position, animals in close proximity, and orientations, creating new GPS data that didn’t previously exist for baboons or other animal populations.
Using data from the 30-day observations, Berger-Wolf and her colleagues are creating a dictionary translating GPS and accelerometer data to labeled behaviors such as hanging out, coordinated pauses, coordinated progression, startlement, transition, and unknown. The work relies on new active learning techniques, which involve exploring the space, optimizing, fitting models, and eventually inferring and predicting behaviors.
These approaches reinforce the importance of social networks in predicting behavior. Predicting a baboon’s future locations from its own past history alone implicitly assumes that social interactions don’t matter, and it doesn’t provide accurate results. The best predictors are simple nearest-neighbor spatial affiliates at short timescales, or animals most frequently near an individual at long timescales (see Figure 1). The number of neighbors needed to make a prediction is between four and six for both timescales, possibly defining Dunbar’s number for baboons, which is the suggested cognitive limit to the number of individuals with whom one can maintain stable relationships. Coordinated movement of a community is also significant. In higher-order animals, evidence shows that shared decision-making takes place while animals shift from uncoordinated to coordinated movements.
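The two kinds of predictors can be sketched in a few lines. This is an illustration under stated assumptions, not the model from the cited study: positions are made-up planar coordinates, the prediction is simply the centroid of the neighbors' next positions, and the function names and the parameters `k` and `top` are hypothetical.

```python
import math
from collections import Counter

def k_nearest(positions, target, k):
    """Names of the k animals closest to `target` at one time step."""
    tx, ty = positions[target]
    others = [(math.hypot(x - tx, y - ty), name)
              for name, (x, y) in positions.items() if name != target]
    return [name for _, name in sorted(others)[:k]]

def predict_from_neighbors(now, nxt, target, k=4):
    """Short-timescale guess: centroid of where the target's k current
    nearest neighbors are at the next step (needs at least one neighbor)."""
    neighbors = k_nearest(now, target, k)
    xs = [nxt[n][0] for n in neighbors]
    ys = [nxt[n][1] for n in neighbors]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def long_term_affiliates(history, target, k, top):
    """Long-timescale affiliates: animals most often among the target's
    k nearest neighbors across all recorded time steps."""
    counts = Counter()
    for positions in history:
        counts.update(k_nearest(positions, target, k))
    return [name for name, _ in counts.most_common(top)]

# Made-up positions for four animals at two consecutive time steps.
now = {"a": (0, 0), "b": (1, 0), "c": (0, 2), "d": (10, 10)}
nxt = {"a": (1, 1), "b": (2, 1), "c": (1, 3), "d": (10, 11)}

print(predict_from_neighbors(now, nxt, "a", k=2))      # (1.5, 2.0)
print(long_term_affiliates([now, nxt], "a", k=1, top=1))  # ['b']
```

Neither function uses the target's own past trajectory; the guess comes entirely from its social surroundings, which is the point the paragraph above makes.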
Berger-Wolf’s method offers much insight into animal community dynamics, indicating which members change affiliations, where they go, when they go, and for how long. As groups split and merge, the model accounts for these real biological events in the form of costs to the community.
Time stamps for networks that change over time must be chosen carefully. When sensor data is sampled too frequently, such as every second, the data becomes noisy and reveals networks that are too sparse or don’t change enough. Aggregating over periods of time that are too long loses important information about the order and causation of interactions. ‘Just right’ time slices correspond to the temporal scale of the network. Sampling time is a critical part of creating and inferring networks from animal data.
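The effect of the time slice can be seen in a small sketch that groups time-stamped pairwise observations into windowed networks. The observation format `(time, animal_a, animal_b)`, the function name, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def windowed_networks(observations, window):
    """Group (time, a, b) observations into consecutive windows of length
    `window`; return {window_index: {(a, b): number_of_observations}}."""
    networks = defaultdict(lambda: defaultdict(int))
    for t, a, b in observations:
        edge = tuple(sorted((a, b)))  # undirected: (a, b) equals (b, a)
        networks[int(t // window)][edge] += 1
    return {w: dict(edges) for w, edges in sorted(networks.items())}

# Made-up observations: times in seconds, pairs of animals seen together.
obs = [(0, "z1", "z2"), (1, "z2", "z1"), (5, "z2", "z3"), (12, "z1", "z3")]

print(windowed_networks(obs, 1))    # four networks, one edge each (too sparse)
print(windowed_networks(obs, 10))   # two networks, ordering between them kept
print(windowed_networks(obs, 100))  # one network, all ordering lost
```

With a one-second window every network holds a single edge; with a 100-second window all edges collapse into one network and the order of interactions disappears.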
Communities are clusters or subgroups of individuals with relatively strong, direct, frequent ties. The definition of dynamic communities, identities or cohesive groupings that persist over time but with changing members in their clusters, is a little harder to pin down. Individuals are mostly seen with their own community, and members of dynamic communities interact more frequently among themselves than with individuals outside the community.
Changing membership in a dynamic community comes with costs: \(\alpha\), \(\beta_1\), and \(\beta_2\). The cost \(\alpha\), for switching communities, occurs because individuals are reluctant to shift affiliations. Switching increases stress hormones, decreases access to resources and the ability to socially share those resources, and lowers status. \(\beta_1\) and \(\beta_2\) represent loyalty and loss of social opportunity respectively: \(\beta_1\) is the cost of visiting other communities, where more harassment can occur, and \(\beta_2\) is the cost of being absent from one’s own community.
Animal observation data of physical positions, switches, and visits can yield graphs in which the various costs are assigned according to the number of individuals switching, visiting, or absent. Graph coloring is an approximable way to model communities, where the colors of individual vertices denote affiliation and the colors of group vertices indicate community structure. The algorithm finds the most parsimonious dynamic communities, minimizing the overall cost across all individuals. The resulting problem is the following: for a given cost setting (\(\alpha\), \(\beta_1\), \(\beta_2\)), find the vertex coloring that minimizes total cost. A solution close to the optimum can be found by approximation.
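To make the objective concrete, the sketch below evaluates the total cost of one candidate coloring. It is a reading of the switching, visiting, and absence costs described above, not the authors' implementation, and it only scores a coloring rather than searching for the best one. The data layout, the function name `coloring_cost`, and the rule that absence is charged only when a group of the individual's own color is observed at that time step are assumptions.

```python
def coloring_cost(groups, group_color, ind_color, alpha, beta1, beta2):
    """Total cost of a coloring.

    groups[t]      -- list of sets of individuals observed together at step t
    group_color[t] -- community color of each group in groups[t]
    ind_color[t]   -- dict mapping each individual to its color at step t
    """
    total = 0
    for t, groups_t in enumerate(groups):
        colors_present = set(group_color[t])
        for ind, color in ind_color[t].items():
            # Switching: the individual changed community since the last step.
            if t > 0 and ind in ind_color[t - 1] and ind_color[t - 1][ind] != color:
                total += alpha
            # Color of the group the individual was observed in, if any.
            seen_with = None
            for g, members in enumerate(groups_t):
                if ind in members:
                    seen_with = group_color[t][g]
            # Visiting: observed inside a group of another community.
            if seen_with is not None and seen_with != color:
                total += beta1
            # Absence: own community was observed, but without this individual.
            if color in colors_present and seen_with != color:
                total += beta2
    return total

# Made-up observations: b is seen with a, then with c.
groups = [[{"a", "b"}, {"c"}], [{"a"}, {"b", "c"}]]
group_color = [["red", "blue"], ["red", "blue"]]
stay = [{"a": "red", "b": "red", "c": "blue"}, {"a": "red", "b": "red", "c": "blue"}]
switch = [{"a": "red", "b": "red", "c": "blue"}, {"a": "red", "b": "blue", "c": "blue"}]

print(coloring_cost(groups, group_color, stay, alpha=3, beta1=1, beta2=1))    # 2
print(coloring_cost(groups, group_color, switch, alpha=3, beta1=1, beta2=1))  # 3
```

With an expensive switch (`alpha=3`), treating b as a visitor is cheaper; lowering `alpha` to 1 makes the switching interpretation cheaper instead, which shows how the cost setting shapes the inferred communities.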
Berger-Wolf uses traditional graph theory algorithms to solve these social network problems. “We put standard algorithmic techniques together in a way that’s not standard, since the graphs to analyze are non-standard,” she says. Fast flow-based constant factor approximations and cost optimal coloring are also proven maximum likelihood solutions for a dynamic community model.
The graph coloring problem, built from nine months of aggregated data from observing an entire population of zebras once or twice a day, yielded a network with definite communities that minimize the overall cost. Principal component analysis (PCA) of the dynamic zebra communities shows four clusters marked by \(L\), \(N\), \(M\), and \(B\), indicating that lactating females, non-lactating females, stallion males, and bachelors, which have diverse resource needs and ways of trading resources, all play different parts in the community structure. This conclusion is biologically meaningful.
Intelligent data collection from GPS collars, drones, accidental photography, cell phone tourist photos, data sensors, and behavior comparison (with genetics and genomics) is creating large data sets that humans can no longer process or find patterns in. Computational techniques, such as those developed in Berger-Wolf’s work, are already helping analyze and visualize this data to help biologists answer complex questions.
This article is based on an invited lecture by Tanya Berger-Wolf at the SIAM Annual Meeting, which was held in Boston this July.
 Farine, D.R., Strandburg-Peshkin, A., Berger-Wolf, T.Y., Ziebart, B., Brugere, I., Li, J., & Crofoot, M.C. (2016). Both Nearest Neighbours and Long-term Affiliates Predict Individual Locations During Collective Movement in Wild Baboons. Sci. Rep., 6, 27704.
 Tantipathananandh, C., & Berger-Wolf, T.Y. (2009). Constant-Factor Approximation Algorithms for Identifying Dynamic Communities. Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 827-836). New York, NY: Association for Computing Machinery.
 Crall, J.P., Stewart, C.V., Berger-Wolf, T.Y., Rubenstein, D.I., & Sundaresan, S.R. (2013). Hotspotter – Patterned species instance recognition. Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision (WACV) (pp. 230-237). Washington, DC: IEEE Computer Society.
 Kempe, D., Tantipathananandh, C., & Berger-Wolf, T.Y. (2007). A Framework For Community Identification in Dynamic Social Networks. Proceedings of the 14th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 717-726). New York, NY: Association for Computing Machinery.
 Rubenstein, D.I., Sundaresan, S.R., Fischhoff, I.R., Tantipathananandh, C., & Berger-Wolf, T. Y. (2015). Similar but Different: Dynamic Social Network Analysis Highlights Fundamental Differences between the Fission-Fusion Societies of Two Equid Species, the Onager and Grevy’s Zebra. PLOS One, 10(10), e0138645.
 Strandburg-Peshkin, A., Farine, D.R., Couzin, I.D., & Crofoot, M.C. (2015). Shared decision-making drives collective movement in wild baboons. Science, 348(6241), 1358-1361.
Debbie Sniderman is an applied physics engineer, materials scientist, and CEO of VI Ventures LLC, an engineering consulting company. She can be reached at email@example.com.