August 03, 2022

What Makes Intelligent Visual Analytics Tools Truly Intelligent?

Humans are innately curious creatures. We constantly seek connection, certainty, and clarification in the world around us — when looking at signs while driving, for instance, or attempting to understand pointed gestures when asking for directions. And beyond pattern recognition, we seek to find meaning in our surroundings.

Understanding the various relationships and patterns in the world is second nature to many people, but describing them in a formalized way is rather difficult and abstract. This is where semantics—the study of the way in which individuals draw meaning in communication—comes into play [1]. Semantics allows us to explore how words and/or signs combine to convey a concept, why icons denote specific ideas, and which gestures effectively support a message.

As machine learning and artificial intelligence (AI) algorithms become more sophisticated, researchers must develop visual analysis tools that genuinely help users understand and reason with data. Analytical thinking is a structured approach wherein one answers questions and makes decisions based on facts and data. So how do we use data to assist with analytical thinking? This process involves a number of steps—including cleaning, shaping, transforming, and visualizing the data at hand—to ultimately create a story for analysis and communication.

Visual analytics tools need to ease the friction in each of these steps while empowering humans to do what they do best: use their visual systems and analytical reasoning to make sense of the world. Specifically, I find that tools must possess five main characteristics in order to be intelligent:

Support the cycle of data analysis
Answer a question by composing a meaningful picture
Understand the user’s intent and the context of the task
Provide the best visualization to help with the task
Remain easy to use

Let’s begin with an example of how the application of semantics in computer techniques can help create meaningful data encodings in charts.

Statistical Graphics Are Inherently Abstract

Charts abstract information, making it easier for viewers to identify patterns, conduct comparisons, and extrapolate. Icon encodings are graphical elements that often visually represent the semantic meanings of marks for categorical data. Assigning meaningful icons to display elements can help users perceive and interpret the visualizations with greater ease. Such encodings effectively enable visual analysis because the preattentive visual system processes them rapidly and efficiently [5]. The human visual system then spatially categorizes these icons to generate a meaningful understanding of the visualization.

Carl Sagan’s 1977 book, The Dragons of Eden, displays a chart that tracks the brain-to-body mass ratios for various animals (see Figure 1a) [2]. To make sense of this visualization, your eyes must follow the dots and labels to recognize a pattern. But by associating each animal with a semantically meaningful icon, the chart suddenly becomes much easier to understand and follow (see Figure 1b).

Figure 1. Sample charts that display the brain-to-body mass ratios for various animals on a log-log scale. 1a. Carl Sagan’s scatterplot includes dots with labels for each animal. 1b. Icons that depict each animal effectively indicate the semantics of marks in the chart. Figure courtesy of [5].
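
To make the idea of an icon encoding concrete, here is a minimal sketch in Python. It is not the automatic icon-generation technique of [5]; it swaps in a plain lookup table with a fallback dot. The table `ICONS`, the helper names, and the row layout are all assumptions made for illustration, and any numbers passed in are placeholders rather than measurements from Sagan's chart.

```python
import math

# Hypothetical icon table: category label -> Unicode glyph (an assumption).
ICONS = {"dolphin": "🐬", "elephant": "🐘", "crow": "🐦"}
FALLBACK = "●"  # plain dot, as in a conventional labeled scatterplot


def icon_for(category: str) -> str:
    """Return a semantically meaningful glyph, or a plain dot if none is known."""
    return ICONS.get(category.strip().lower(), FALLBACK)


def encode_marks(rows):
    """Turn (label, body_mass, brain_mass) rows into log-log marks with icons."""
    marks = []
    for label, body_mass, brain_mass in rows:
        marks.append({
            "label": label,
            "icon": icon_for(label),
            "x": math.log10(body_mass),   # log scale on the horizontal axis
            "y": math.log10(brain_mass),  # log scale on the vertical axis
        })
    return marks
```

A category without a known icon simply falls back to the dot, so the chart degrades to the labeled scatterplot of Figure 1a rather than failing.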

Next, let’s discuss the various ways in which tools can help people ask questions about data in simple language form.

Supporting an Analytical Conversation

Pragmatics—a concept that originated within the linguistics community—explores how one can interpret a user’s intent in a particular context, such as where something was stated, who said it, and what had been previously asserted. This concept smoothly carries over into analytical conversation, during which an individual converses with data via an interactive chart or dashboard as the medium. The user’s current and past interactions and the context of the data support a pragmatic approach for handling their intent.

A “smart” system can attempt to effect a match between the concepts in a user’s utterances and the concepts that the system knows. While follow-up repair communication can help resolve any ambiguities that a natural language interface may encounter, such systems are often constrained by the domain of their knowledge base or the context in which the interaction occurs. In addition, analytical concepts may not map directly from utterances to the underlying information.

Eviza is a research prototype that supports vague queries in the context of a visualization [3]. Figure 2a illustrates a map of earthquake data in the U.S. with marks that indicate locations that Eviza selected in response to the query, “Find large earthquakes near California.” Eviza finds two ambiguities in this query: “large” and “near,” which are fuzzy terms for size and distance. The system semantically associates the size descriptor “large” with the attribute “magnitude” at values of five or more and decides that “near” is a 100-mile radius around the border of California.
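
The mapping from fuzzy terms to concrete filters can be sketched roughly as follows. This is an illustrative Python sketch, not Eviza's implementation: the `Filter` class, the `FUZZY_TERMS` table, and the `distance_miles` field (a precomputed distance from the region's border) are assumptions, while the two thresholds come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    attribute: str
    op: str       # ">=" or "<="
    value: float


# Thresholds follow the example in the text; the table itself is hypothetical.
FUZZY_TERMS = {
    "large": Filter("magnitude", ">=", 5.0),
    "near": Filter("distance_miles", "<=", 100.0),
}


def resolve_fuzzy_terms(query: str) -> list[Filter]:
    """Replace each fuzzy word in the query with a concrete filter."""
    words = query.lower().replace(".", " ").replace(",", " ").split()
    return [FUZZY_TERMS[w] for w in words if w in FUZZY_TERMS]


def matches(record: dict, filters: list[Filter]) -> bool:
    """Check one record against every filter."""
    for f in filters:
        value = record[f.attribute]
        if f.op == ">=" and not value >= f.value:
            return False
        if f.op == "<=" and not value <= f.value:
            return False
    return True
```

For the query “Find large earthquakes near California.” the sketch yields one filter on magnitude and one on distance; a real system would also need geometry to measure distance from a border, which is left out here.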

One could then present a follow-up question: “How about near Texas.” Though this phrase does not explicitly ask for “large earthquakes,” the system can infer the intent from the previous context (see Figure 2b).

Figure 2. Eviza’s response to vague queries. 2a. Eviza’s interface maps large earthquakes near California. 2b. Eviza maps large earthquakes near Texas. Figure courtesy of [3].
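
One simple way to picture how a follow-up inherits intent from the previous turn is slot filling: anything that the new utterance leaves unsaid is copied from the prior one. The Python sketch below assumes a three-slot `Intent` and a toy keyword parser; both are hypothetical and much cruder than the pragmatics model described in [3].

```python
from dataclasses import dataclass
from typing import Optional

KNOWN_REGIONS = {"california", "texas"}  # hypothetical vocabulary


@dataclass(frozen=True)
class Intent:
    size: Optional[str] = None
    relation: Optional[str] = None
    region: Optional[str] = None


def parse(utterance: str) -> Intent:
    """Toy keyword parser; a real system would use a grammar or a model."""
    words = utterance.lower().strip(".?! ").split()
    size = "large" if "large" in words else None
    relation = "near" if "near" in words else None
    region = next((w for w in words if w in KNOWN_REGIONS), None)
    return Intent(size, relation, region)


def merge(previous: Intent, current: Intent) -> Intent:
    """Fill slots missing from the follow-up with values from the prior turn."""
    return Intent(
        size=current.size or previous.size,
        relation=current.relation or previous.relation,
        region=current.region or previous.region,
    )
```

Merging the parse of “How about near Texas.” into the parse of “Find large earthquakes near California.” keeps the size slot “large” from the first turn and replaces only the region.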

Helping Users Formulate Analytical Inquiry

When users find themselves clarifying their queries again and again, their questions are often too broad, too narrow, or simply not formed in a way that the system can comprehend. They therefore need guidance to judge whether the visualization they are interacting with is yielding new insights; without it, they can lose an accurate sense of progress toward the analytical goal. Autocompletion in these natural language systems tends to be rather basic, focusing on syntactic completion of the users’ queries without any suggestions or helpful previews of the data. How can intelligent tools help users ask questions about the data that may be interesting?

Sneak Pique is an interactive visual autocompletion tool that addresses this problem [4]. It aims to help anyone—regardless of skillset—interact with data using natural language. Sneak Pique brings the fluidity of in-situ suggestions to the analytical expressions that are typical of visual analysis tasks. Figure 3 offers several examples of autocompletion suggestions that Sneak Pique generates as a user explores a dataset of COVID-19 cases around the world. The user types the query “show me cases in” and is subsequently prompted with map and calendar autocompletion widgets that respectively provide previews of the geospatial and temporal data frequencies, since “in” can reflect queries that focus on both place and time. If the user then clicks on China on the map, the widget proceeds to identify a range of cases by prompting the user to consider time with the word “between.” Sneak Pique displays a pair of date and numerical range widgets with corresponding histograms of data frequencies to help users select a valid range based on the underlying data.

Figure 3. Sneak Pique displays map and calendar autocompletion widgets. Figure courtesy of [4].
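
The behavior described above, where the last word typed selects which widgets to offer and each widget carries a preview of data frequencies, might be sketched as follows. This is a simplified Python illustration rather than Sneak Pique's code; the token-to-widget table and the row fields `country`, `date`, and `cases` are assumptions.

```python
from collections import Counter
from datetime import date

# Hypothetical mapping from a trailing token to the widgets it may introduce.
WIDGETS_FOR_TOKEN = {
    "in": ["map", "calendar"],                    # "in" can mean place or time
    "between": ["date_range", "numeric_range"],
}


def preview_for(widget: str, rows: list[dict]) -> dict:
    """Summarize the underlying data so the widget can show what is available."""
    if widget == "map":
        return dict(Counter(r["country"] for r in rows))
    if widget in ("calendar", "date_range"):
        return dict(Counter(r["date"].strftime("%Y-%m") for r in rows))
    if widget == "numeric_range" and rows:
        values = [r["cases"] for r in rows]
        return {"min": min(values), "max": max(values)}
    return {}


def suggest(partial_query: str, rows: list[dict]) -> list[dict]:
    """Return widget suggestions, each with a data preview, for a partial query."""
    tokens = partial_query.lower().split()
    if not tokens:
        return []
    return [
        {"widget": w, "preview": preview_for(w, rows)}
        for w in WIDGETS_FOR_TOKEN.get(tokens[-1], [])
    ]
```

Because every suggestion is computed from the rows themselves, the previews can only offer places, dates, and ranges that actually occur in the data.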

Moving Beyond the Desktop

Canadian communication theorist Marshall McLuhan coined the phrase “The medium is the message” to indicate the way in which the medium shapes the message that is being communicated. People no longer consume and analyze data solely via traditional business intelligence desktop tools; media like Slack and chatbots have significantly changed how such tools support users. The resulting multimodal conversations create a proliferation of new potential entry points, platforms, and styles of interaction.

Figure 4. A snippet of conversation with an analytical chatbot in Slack. Figure courtesy of [6].

One emerging interaction modality is the analytical chatbot [6], a software application that engages in a back-and-forth natural language dialog with the user about data. Like other types of chatbots, analytical chatbots are meant to simulate a human’s actions as a conversational partner and must therefore employ natural language as both an input and output mechanism.

Figure 4 illustrates a user’s interaction with an analytical chatbot in Slack. A conversation evolves about Titanic-related data in which the chatbot provides a bar chart with a text description as well as a filter for repair and refinement.
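
The shape of such a reply (a bar chart, a short text description, and a handle for refinement) can be sketched in Python as below. The function `answer`, its return structure, and the field names are hypothetical; this is not the chatbot from [6], and any rows passed in are assumed to be simple dictionaries.

```python
from collections import Counter


def answer(rows, group_by, filters=None):
    """Build a chatbot-style reply: bar chart spec, text summary, refinement options."""
    filters = filters or {}
    # Keep only the rows that satisfy every active filter.
    selected = [r for r in rows if all(r.get(k) == v for k, v in filters.items())]
    counts = Counter(r[group_by] for r in selected)
    # Largest bars first; ties broken alphabetically for a stable order.
    bars = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    if bars:
        top, n = bars[0]
        text = f"{len(selected)} records; the largest {group_by} group is {top} ({n})."
    else:
        text = "No records match the current filters."
    # Offer the remaining fields as filters the user can apply in a follow-up.
    fields = rows[0].keys() if rows else []
    return {
        "chart": {"type": "bar", "x": group_by, "bars": bars},
        "text": text,
        "refine_with": sorted(f for f in fields if f != group_by),
    }
```

A follow-up turn would call `answer` again with one of the `refine_with` fields added to `filters`, which mirrors the repair-and-refinement step in the conversation.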

Summary

In this article, I provided examples of scenarios wherein visual analysis tools are intelligent enough to help humans do what they do best. I aimed to provide a sense of the ways in which machine intelligence can augment human intelligence during visual analysis. There are numerous data enthusiasts in the world, many of whom may not possess the deep statistical skills to navigate the data and tools that we provide. Nevertheless, they are all excellent analytical thinkers. Here is an opportunity for us as a community to determine how to best support people through the thoughtful intelligence and cooperation that allow them to excel. With better and more powerful AI algorithms comes the added responsibility of understanding biases that might emerge during the communication of insights. I encourage readers to brainstorm new techniques that could help ease the friction of mundane tasks and ensure that the process of seeing and understanding data can be both useful and delightful.

Vidya Setlur presented this research during a keynote lecture at the Women in Data Science (WiDS) Worldwide Conference 2022, which took place earlier this year.

References
[1] Cann, R., Kempson, R., & Gregoromichelaki, E. (2009). Semantics: An introduction to meaning in language. New York, NY: Cambridge University Press.
[2] Sagan, C. (1977). The Dragons of Eden: Speculations on the evolution of human intelligence. New York, NY: Ballantine Books.
[3] Setlur, V., Battersby, S.E., Tory, M., Gossweiler, R., & Chang, A.X. (2016). Eviza: A natural language interface for visual analysis. In Proceedings of the 29th annual symposium on user interface software and technology (UIST’16) (pp. 365-377). Tokyo, Japan: Association for Computing Machinery.
[4] Setlur, V., Hoque, E., Kim, D.H., & Chang, A.X. (2020). Sneak Pique: Exploring autocompletion as a data discovery scaffold for supporting visual analysis. In Proceedings of the 33rd annual ACM symposium on user interface software and technology (UIST’20) (pp. 966-978). Association for Computing Machinery.
[5] Setlur, V., & Mackinlay, J. (2014). Automatic generation of semantic icon encodings for visualizations. In Proceedings of the SIGCHI conference on human factors in computing systems (CHI’14) (pp. 541-550). Toronto, Canada: Association for Computing Machinery.
[6] Setlur, V., & Tory, M. (2022). How do you converse with an analytical chatbot? Revisiting Gricean maxims for designing analytical conversational behavior. In Proceedings of the 2022 SIGCHI conference on human factors in computing systems (CHI’22) (pp. 1-17). New Orleans, LA: Association for Computing Machinery.

Vidya Setlur is the director of Tableau Research. She leads an interdisciplinary team of research scientists in areas such as data visualization, multimodal interaction, statistics, applied machine learning, and natural language processing (NLP). Setlur earned her Ph.D. in computer graphics from Northwestern University in 2005. Her personal research interests lie at the intersection of NLP and computer graphics.