Abstract
Diversity metrics are used to compare sites and to track trends within sites. Many metrics have been proposed, from simple species counts to complex indices like the Floristic Quality Index (FQI). Using a large dataset (n = 452 quadrats) from high-quality remnant prairie, degraded remnant prairie, and restoration sites, we examined correlations among ten diversity metrics and explored which metrics provide useful information about trends and differences among site types. We also created a conservatism list for 492 plant species of the Texas Blackland Prairie. Shannon diversity, Simpson diversity, and FQI were strongly correlated with species richness (r² > 0.53). Average conservatism varied greatly at low levels of richness and less at high levels of richness. This metric, which is usually considered to be independent of sample area, should be used with caution in sites with low richness. When assessing change in high-quality remnant prairie plots, native species richness increased the most. Average conservatism changed little, because an increase in specialist species (coefficient of conservatism ≥ 7) was matched by an increase in generalist forbs (coefficients 4–6). FQI increased, driven by changes in species richness rather than conservatism. When comparing site types, only FQI differed among all three site types; average conservatism and the number of specialist species also differed between remnant and restoration sites. We recommend using the number of specialist species and native species richness to examine trends within a site. For differences among sites, we recommend the number of specialist species, average conservatism, or FQI.
- abundance-weighted Floristic Quality Index
- coefficient of conservatism
- Floristic Quality Index
- modified Floristic Quality Index
Restoration Recap
The best plant diversity metric depends on the purpose of the analysis.
To examine trends within a site, we recommend using native species richness and the number of specialist species (species characteristic of high-quality, unaltered habitats). Species richness is very sensitive to how much area is sampled, so we recommend using permanent plots for trend analyses. In areas without conservatism lists, specialists can be identified using floras and expert knowledge.
To compare sites of different quality, we recommend using the number of specialist species, along with average conservatism and Floristic Quality Index if conservatism lists are available.
Tallgrass prairie was once the dominant vegetation in the Great Plains. Because most prairie has been converted to agriculture (McIndoe et al. 2008), restoration of former agricultural lands is needed to increase the total area of prairie and to reconnect remnant sites (Rowe et al. 2013). Restoration projects often use remnant prairies as reference sites, aiming to restore similar plant communities (Polley et al. 2005, Hansen and Gibson 2014). Various metrics of diversity have been used to compare restored and reference sites, but it is unclear which metrics provide the most useful information.
The number of species in an area, or species richness, is a basic metric of plant communities. Richness is usually highest at intermediate stages of disturbance or succession (Fleishman et al. 2006). Species richness numbers are simple to collect, but depend on the sample area (Arrhenius 1921). Richness can be separated into native and non-native species (e.g., Taft et al. 2006); counts of specialized species can also be used (e.g., Brudvig et al. 2007). However, richness does not take plant abundance into account.
Two metrics that include abundance are the Shannon index (or Shannon-Wiener Index; Shannon and Weaver 1949, Spellerberg and Fedor 2003) and Simpson index (or Gini index, Simpson 1949). Evenness, which is derived from the Shannon index, uses both abundance and species richness (Pielou 1966). The Shannon index is sensitive to sample size if there are many rare species; the Simpson index depends much less on sample size (Buckland et al. 2011). For this reason, the two indices can show opposite trends for samples with similar species richness but varying evenness (Nagendra 2002).
The original formulas for the Shannon and Simpson indices measure uncertainty rather than diversity, but both can be converted into true diversity metrics (Jost 2006). Differences in diversity as measured by species richness, Shannon diversity, and Simpson diversity demonstrate the increasing influence of dominance (Jost 2006). However, these metrics ignore species identity and treat all species as equivalent: if undesirable disturbances add ruderal plants to a site, overall diversity still increases.
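The conversions described above can be illustrated with a short Python sketch. It assumes the usual forms: the exponential of the Shannon index and the inverse of the Simpson index as true diversities, and Pielou's evenness as the Shannon index divided by the log of richness. The function name, species labels, and cover values are hypothetical, and the definitions in Table 1 take precedence wherever they differ.

```python
import math


def diversity_metrics(cover):
    """Return richness, Shannon diversity, Simpson diversity, and evenness.

    cover: dict mapping species name -> cover (any positive abundance measure).
    """
    values = [c for c in cover.values() if c > 0]
    if not values:
        raise ValueError("need at least one species with positive cover")
    total = sum(values)
    p = [c / total for c in values]                       # proportional abundances
    richness = len(p)
    shannon_index = -sum(pi * math.log(pi) for pi in p)   # entropy H'
    simpson_index = sum(pi ** 2 for pi in p)              # dominance D
    return {
        "richness": richness,
        "shannon_diversity": math.exp(shannon_index),     # exp(H')
        "simpson_diversity": 1.0 / simpson_index,         # 1 / D
        # Evenness is undefined for a single species because log(1) = 0
        "evenness": shannon_index / math.log(richness) if richness > 1 else None,
    }


# Hypothetical quadrats with the same richness but different dominance
even = diversity_metrics({"sp_a": 25, "sp_b": 25, "sp_c": 25, "sp_d": 25})
dominated = diversity_metrics({"sp_a": 85, "sp_b": 5, "sp_c": 5, "sp_d": 5})
```

In the evenly distributed quadrat all three diversities equal 4. In the dominated quadrat richness is still 4, but Shannon diversity is lower and Simpson diversity lower again, which is the increasing influence of dominance noted by Jost (2006).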
The Floristic Quality Index (FQI, also known as Floristic Quality Assessment Index or FQAI) was designed to address this shortcoming (Swink and Wilhelm 1994). Species are assigned a coefficient of conservatism (conservatism hereafter), ranging from 0 to 10, with higher numbers assigned to “conservative” species characteristic of remnant or unaltered sites. FQI for a sample is calculated as average conservatism adjusted for species richness; average conservatism can also be used as a stand-alone metric (Taft et al. 1997, Rooney and Rogers 2002). Both metrics are widely adopted and have been used to assess the value of potential conservation areas, measure effects of land management, and evaluate the success of restoration projects (Jog et al. 2006, Taft et al. 2006, McIndoe et al. 2008, Smart et al. 2011, Hansen and Gibson 2014).
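As a minimal sketch of these calculations, the Python function below assumes the standard form of the index, in which FQI is average conservatism multiplied by the square root of species richness. It also counts specialist species using the threshold of conservatism ≥ 7 applied in this study. The function name and the example coefficients are hypothetical.

```python
import math


def conservatism_metrics(coefficients, specialist_threshold=7):
    """Summarize one sample from its species' coefficients of conservatism.

    coefficients: list of values from 0 to 10, one entry per species.
    """
    n = len(coefficients)
    if n == 0:
        raise ValueError("need at least one species")
    mean_c = sum(coefficients) / n
    return {
        "richness": n,
        "mean_c": mean_c,
        # Assumed standard form: average conservatism x sqrt(richness)
        "fqi": mean_c * math.sqrt(n),
        "specialists": sum(1 for c in coefficients if c >= specialist_threshold),
    }


# Hypothetical quadrat with nine species
quadrat = conservatism_metrics([0, 3, 4, 4, 5, 5, 6, 7, 9])
```

For this hypothetical quadrat, average conservatism is about 4.8, FQI is about 14.3, and two species count as specialists.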
Coefficients of conservatism are subjectively assigned and opinions about the appropriate value for a species can differ among regional experts (Landi and Chiarucci 2010, Bried et al. 2012), so assigning valid coefficients requires consulting a large group of experts (Matthews et al. 2015). At least in tallgrass prairie species, conservatism is related to life history traits such as growth rates and mycorrhizal responsiveness, indicating that the values are ecologically useful (Bauer et al. 2017).
The relationship of FQI and average conservatism to site quality appears to vary with vegetation type and site. Both average conservatism and FQI show predictable increases with old-field succession (Spyreas et al. 2012). In wetlands, average conservatism may be a better measure of disturbance (Bell et al. 2017, Kutcher and Forrester 2018). In tallgrass prairies, FQI was more sensitive than average conservatism to differences among restored prairies (Hansen and Gibson 2014) and among remnant prairies (Bowles and Jones 2006).
Whether and how to include non-native plants in conservatism calculations is unsettled. Non-native species have been excluded, assigned negative coefficients of conservatism, or given values of 0 (Allain et al. 2004, Smart et al. 2011, Spyreas et al. 2012). Including non-native species may help distinguish among sites of different quality (Kutcher and Forrester 2018) and some invasive species may be indicators of moderately undisturbed sites (Matthews et al. 2015). Miller and Wardrop (2006) adjusted the FQI formula to reflect the percentage of the maximum attainable score for each site (richness-corrected FQI). Their formula uses only native species to calculate average conservatism and adjusts that average by the proportion of species that are native, which accounts for the influence of non-native plants.
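The richness-corrected calculation can be sketched as follows. This is an illustration under a stated assumption, not a restatement of Table 1: it uses the form in which mean native conservatism is expressed as a percentage of the maximum coefficient (10) and multiplied by the square root of the proportion of species that are native. The function name and example values are hypothetical.

```python
import math


def richness_corrected_fqi(native_coefficients, n_nonnative):
    """Return (score, adjustment factor) for one sample.

    native_coefficients: coefficients of conservatism for native species only.
    n_nonnative: number of non-native species in the same sample.
    """
    n_native = len(native_coefficients)
    if n_native == 0:
        raise ValueError("need at least one native species")
    mean_c_native = sum(native_coefficients) / n_native
    # Assumed adjustment: sqrt(native richness / total richness)
    adjustment = math.sqrt(n_native / (n_native + n_nonnative))
    score = 100 * (mean_c_native / 10) * adjustment
    return score, adjustment


# Hypothetical sample: six native species and two non-native species
score, adjustment = richness_corrected_fqi([3, 4, 5, 6, 7, 8], n_nonnative=2)
```

Here the adjustment factor is about 0.87 and the score about 47.6, compared with 55 if no non-native species were present. Because the adjustment depends only on species counts, the cover of the non-native species has no effect on the result.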
Conservatism and FQI formulas can incorporate plant abundance. Average conservatism can be weighted by frequency, proportion of cover, or total plant cover (Cohen et al. 2004, Bourdaghs et al. 2006, Cretini et al. 2012). Similarly, in the FQI formula, conservatism can be weighted by abundance (Rocchio 2007, DeBerry et al. 2015). In these formulas, more abundant species are given greater weight in determining site quality.
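A short sketch of one such weighting is given below, assuming that conservatism is weighted by each species' proportion of total cover and that abundance-weighted FQI multiplies the weighted average by the square root of richness. Other weightings (frequency, total cover) are equally possible, and the function name and example values are hypothetical.

```python
import math


def cover_weighted_metrics(records):
    """Return (cover-weighted conservatism, abundance-weighted FQI).

    records: list of (cover, coefficient) pairs, one per species.
    """
    total_cover = sum(cover for cover, _ in records)
    if total_cover <= 0:
        raise ValueError("total cover must be positive")
    weighted_c = sum(cover * c for cover, c in records) / total_cover
    # Assumed richness adjustment: multiply by sqrt(number of species)
    return weighted_c, weighted_c * math.sqrt(len(records))


# Hypothetical quadrat: a dominant ruderal species (coefficient 0) plus
# two conservative species with low cover
records = [(80, 0), (10, 8), (10, 7)]
weighted_c, weighted_fqi = cover_weighted_metrics(records)
unweighted_c = sum(c for _, c in records) / len(records)
```

In this example unweighted average conservatism is 5.0, but the cover-weighted value is 1.5 because the dominant species carries 80% of the weight.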
We used monitoring data to investigate which diversity metrics (listed in Table 1) provide the most useful information about prairie diversity. Our analysis includes data from a high-quality remnant prairie, degraded remnant prairies, and restoration sites. We demonstrate how diversity metrics can track long-term change in a single site and compare diversity among sites of different quality. Because our data are from a single preserve complex, our analyses should be considered a case study.
Methods
Site Description
Clymer Meadow Prairie Preserve (Hunt County, Texas, 33°18′50″ N, 96°14′32″ W, Figure 1) is in the Blackland Prairie, the southern extension of the tallgrass prairie (Diggs et al. 1999). Clymer Meadow is 456 ha with approximately 283 ha of remnant tallgrass prairie. The remainder of the preserve, which includes multiple disjunct tracts, contains former crop fields or pastures undergoing restoration. Average annual precipitation is 1028 mm; average maximum temperatures are 12.5°C in January and 34.9°C in July (Greenville weather station, 1897–2014).
For most of the 19th century, the high-quality remnant prairie was hayed once or twice during the growing season and, prior to the 1950s, burned occasionally in October. Since 1986, the site has been protected by The Nature Conservancy. In 1986, this remnant was dominated by warm-season perennial grasses, including Schizachyrium scoparium (little bluestem), Sorghastrum nutans (yellow Indiangrass), and Panicum virgatum (switchgrass; J. Eidson, personal observation). During the study period, the remnant was managed with haying, prescribed fire in multiple seasons, and periodic grazing with cattle and bison.
The study also included degraded remnant prairies and restoration sites. The degraded remnant prairies have had less active management than the high-quality remnant and are partially encroached by native woody plants. The restoration sites were previously farm fields or pastures. Since their acquisition by the Conservancy, they have been grazed intermittently, burned multiple times, and treated with herbicides for invasive plants; some have also been seeded with native prairie species.
Vegetation sampling
We used two datasets to include a range of vegetation quality. The first dataset (“remnant”) resulted from long-term monitoring data of high-quality prairie. For this monitoring, we placed ten 60 × 60-m plots in the unplowed portion of Clymer Meadow Preserve (Figure 1). Plots were sampled in June of 1996, 1999, 2002, 2006, and 2013–2018. Not all plots were sampled in each year; individual plots were sampled 1–6 times.
During sampling, a 60-m long base transect, which formed one edge of the plot, was measured using a tape. Five 60-m transects were laid out perpendicular to this base transect (except for one plot that had six transects in 1996 and 1999). In the first year, we randomly determined transect locations within each 10-m segment of the base transect, starting 10 m from the base transect starting point. In subsequent years, we placed transects at 10-m intervals, starting at 10 m. Along each transect, we sampled six 0.25-m² quadrats. In early years of sampling, we randomly determined the location of the quadrats within each 10-m segment of the transect; in later years, we placed quadrats every 10 m, starting at the 10-m point along the transect. We recorded all plant species rooted in the quadrat (Diggs et al. 1999). Before 2016, we estimated cover using Daubenmire cover classes (< 1%, 1–4%, 5–24%, 25–49%, 50–74%, 75–94%, and 95–100%, Daubenmire 1959). Starting in 2016, we estimated cover to the closest 1%.
Data from two plots, one on an east-facing slope (E1) and one on a summit (S1), were used for the long-term change analysis (Figure 1). Prescribed burns were conducted in the slope unit in March 1995, March 1997, March 2001, March 2005, July 2008, and March 2013. The summit unit was burned in March 1995, December 2000, March 2005, July 2008, and March 2013. Sites were hayed in July 1997 (summit) and July 1998 (both). The sites were grazed by bison in spring 2002 (both), summer 2004 (both), and summer 2006 (slope); both sites were grazed with cattle in the winters of 2012–2014.
The second dataset used information collected in restoration sites and degraded remnant prairies. The restoration sites are in the early stages of being restored through woody plant removal, prescribed fire, and limited seeding. Degraded remnant prairies have been encroached by woody plants after long periods without management. We established 17 plots (3 plots in degraded remnants, 14 in restoration sites), spaced regularly throughout each site and separated by at least 100 m (Figure 1). The sample size for degraded remnants is small because such prairies are uncommon in the preserve. In each plot, we sampled five 0.25-m² quadrats along a 50-m long transect, starting 5 m from the starting point and spaced 10 m apart. We recorded all plant species rooted in the quadrat and estimated cover to the closest 1%. We sampled plots in June 2017 and 2018.
Statistical analyses
We assigned coefficients of conservatism to species encountered in our sampling based on their fidelity to remnant prairies in the northern Blackland Prairie of Texas (Supplemental Material, Table S1). Plants that could only be identified to the genus level were assigned a coefficient if most species in the genus had similar values. These genera were included in all diversity calculations as species, since each quadrat usually had only one unidentifiable species per genus. Several genera (Aristida [threeawn], Dalea [prairie clover], Coreopsis [tickseed], Erigeron [fleabane], Eupatorium [thoroughwort], Liatris [blazing star], Monarda [beebalm], and Symphyotrichum [aster]) included species with a larger range of values and could not be assigned a genus-level coefficient. Quadrats containing these genera were excluded from further calculations.
We used regression to compare diversity metrics to each other (R 3.5, R Core Team, Vienna, Austria). We included 2016–2018 data from all site types (452 quadrats) and used quadrat as the sample unit. For each quadrat, we calculated species richness, Shannon diversity, evenness, Simpson diversity, average conservatism, cover-weighted conservatism, and the number of specialist species (conservatism ≥ 7, Table 1). We calculated three versions of the Floristic Quality Index (FQI) for each quadrat: FQI, abundance-weighted FQI, and richness-corrected FQI (Table 1). Metrics were calculated with all species and with only native species, since the inclusion of non-native species is debated (Matthews et al. 2015). For regressions involving evenness, quadrats with only one species (n = 6 in native species only dataset) were excluded because evenness cannot be calculated for sample units with one species. For the native species analyses, we excluded one quadrat with only non-native species. We compared linear, quadratic, and cubic regressions for each pair of metrics and selected the regression with the lowest AICc (corrected Akaike Information Criterion) using the AICcmodavg package, version 2.1-1 (Mazerolle 2017).
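The model selection step can be sketched in Python as follows. This is not the R code used for the analysis: it is a dependency-free illustration that fits polynomials by ordinary least squares and computes AICc from the residual sum of squares, counting the residual variance as a parameter. Additive constants of the log-likelihood are dropped, which does not affect the ranking of models fitted to the same data. The function names and the synthetic data are hypothetical.

```python
import math


def polynomial_rss(x, y, degree):
    """Least-squares fit of y on 1, x, ..., x^degree; returns residual sum of squares."""
    k = degree + 1
    # Normal equations (X'X) b = X'y built from power sums
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - tail) / a[i][i]
    fitted = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def aicc(rss, n, degree):
    """Corrected AIC for a least-squares polynomial of the given degree."""
    k = degree + 2  # polynomial coefficients (degree + 1) plus residual variance
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Synthetic, deterministic example: a curved relationship with small wiggles
x = list(range(1, 21))
y = [2 + 1.5 * xi - 0.05 * xi ** 2 + 0.3 * math.sin(7 * xi) for xi in x]
rss = {d: polynomial_rss(x, y, d) for d in (1, 2, 3)}
scores = {d: aicc(rss[d], len(x), d) for d in (1, 2, 3)}
best_degree = min(scores, key=scores.get)
```

Because the synthetic relationship is curved, the linear model scores much worse than the quadratic one; whether the cubic term earns its extra parameter depends on the data.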
We evaluated change over time for the two most frequently sampled remnant prairie plots, an east-facing slope plot (E1) and a summit plot (S1, Figure 1). The slope plot was sampled six times between 1996 and 2018; the summit plot was sampled five times. In years where cover was recorded in classes (1996–2015), we used the cover class midpoint as the cover value (e.g., 1–4% = 2.5%). For data collected after 2015, we used the recorded cover; exploratory analyses indicated that assigning 2016–2018 data into cover classes would not change our results. The two plots were analyzed separately, since each has a unique management history (described earlier). We examined changes in species richness, evenness, Simpson diversity, average conservatism, number of specialist species, and FQI using linear regression. All metrics except species richness and Simpson diversity included non-native plants.
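The midpoint substitution can be written as a small lookup. The sketch below assumes arithmetic midpoints of the class bounds listed in the vegetation sampling methods, which reproduces the 1–4% = 2.5% example; midpoints for the other classes are inferred in the same way rather than taken from the source. It also shows how a cover value recorded to the closest 1% could be assigned back to a class, as in the exploratory check. All names are hypothetical.

```python
# Daubenmire cover classes as (lower %, upper %) bounds, in the order listed
# in the methods; the first class (< 1%) is treated as 0-1%
COVER_CLASSES = [(0, 1), (1, 4), (5, 24), (25, 49), (50, 74), (75, 94), (95, 100)]


def class_midpoint(index):
    """Arithmetic midpoint of a cover class, used as its cover value."""
    lower, upper = COVER_CLASSES[index]
    return (lower + upper) / 2


def percent_to_class(percent):
    """Index of the class containing a cover value recorded to the closest 1%."""
    if percent < 1:
        return 0
    for index, (lower, upper) in enumerate(COVER_CLASSES[1:], start=1):
        if lower <= percent <= upper:
            return index
    raise ValueError(f"cover value {percent} does not fall in any class")


midpoints = [class_midpoint(i) for i in range(len(COVER_CLASSES))]
# A species recorded at 30% cover falls in the 25-49% class
reclassified = class_midpoint(percent_to_class(30))
```

Under these assumptions the class midpoints are 0.5, 2.5, 14.5, 37, 62, 84.5, and 97.5%.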
Finally, we compared high-quality remnant, degraded remnant, and restoration sites using the following diversity metrics: species richness, Simpson diversity, evenness, average conservatism, number of specialist species, and FQI. All metrics except species richness and Simpson diversity included non-native plants. We used 2017 data from the degraded remnant and restoration sites and included the most recent sample for each of the 10 remnant plots (sampled between 2016 and 2018). Because the number of quadrats varied among plots, we averaged quadrat-level metrics for each plot and used plot as the analysis unit. Data were analyzed with linear models, specifying site type as the independent variable. We used multiple comparisons with a Tukey adjustment to compare site types (multcomp package, Hothorn et al. 2008). The number of plots in degraded remnants was much lower than in the other two categories, so comparisons with this group should be interpreted cautiously.
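The aggregation step, in which quadrat-level values are averaged within each plot before site types are compared, can be sketched as below. The example uses only the Python standard library and invented values; it does not reproduce the linear models or Tukey-adjusted comparisons, which were run in R. Function names, plot labels, and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def plot_means(quadrats):
    """Average quadrat-level values within each plot.

    quadrats: iterable of (site_type, plot_id, value) tuples.
    """
    by_plot = defaultdict(list)
    for site_type, plot_id, value in quadrats:
        by_plot[(site_type, plot_id)].append(value)
    return {key: mean(values) for key, values in by_plot.items()}


def site_type_summary(plot_values):
    """Mean, standard error, and number of plots for each site type."""
    by_type = defaultdict(list)
    for (site_type, _plot_id), value in plot_values.items():
        by_type[site_type].append(value)
    summary = {}
    for site_type, values in by_type.items():
        se = stdev(values) / len(values) ** 0.5 if len(values) > 1 else None
        summary[site_type] = (mean(values), se, len(values))
    return summary


# Hypothetical richness values; plots differ in their number of quadrats
quadrats = [
    ("remnant", "R1", 12), ("remnant", "R1", 10), ("remnant", "R1", 11),
    ("remnant", "R2", 9), ("remnant", "R2", 13),
    ("restoration", "T1", 6), ("restoration", "T1", 8),
    ("restoration", "T2", 5), ("restoration", "T2", 7), ("restoration", "T2", 6),
]
plots = plot_means(quadrats)
summary = site_type_summary(plots)
```

Averaging within plots first gives each plot equal weight, so plots with more quadrats do not dominate the site-type means.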
Throughout the rest of the paper, diversity metrics calculated with all species are noted with a superscript all, while metrics calculated with only native species are noted with a superscript native. Metrics with no superscript refer to both calculations.
Results
Selecting informative diversity metrics
Species richness, Shannon diversity, and Simpson diversity were all moderately to strongly correlated (Table 2). Average conservatism was not strongly correlated with species richness, but the variation in conservatism was greater at lower richness (Figure 2); evenness and richness-corrected FQI exhibited a similar pattern (figures available at charlottereemts.shinyapps.io/diversityapp). Evenness was correlated with Simpson diversity and with Shannon diversity when including non-native species. All the conservatism and FQI metrics were moderately to strongly correlated. The number of specialist species (coefficient of conservatism ≥ 7) in a quadrat was also correlated with most of the conservatism and FQI metrics (Figure 2), except for cover-weighted conservatism.
Applying and Interpreting Diversity Metrics: Change over Time
Native species richness increased in both plots (slope: 7.6 ± 0.3 to 11.8 ± 0.4 species per quadrat; summit: 6.8 ± 0.4 to 11.4 ± 0.7 species, mean ± standard error, Figure 3). Simpson diversityⁿᵃᵗⁱᵛᵉ also increased (slope: 3.9 ± 0.2 to 5.2 ± 0.3; summit: 3.4 ± 0.3 to 5.1 ± 0.3). Average conservatismᵃˡˡ remained stable in the slope plot (4.2 ± 0.1 to 4.4 ± 0.1) and decreased slightly in the summit plot (4.3 ± 0.1 to 3.7 ± 0.2). The number of specialist species increased in the slope plot (1.6 ± 0.2 to 2.3 ± 0.3) and remained stable in the summit plot (1.6 ± 0.2 to 1.9 ± 0.2). FQIᵃˡˡ increased in both plots with a larger increase in the slope plot (slope: 11.6 ± 0.3 to 15.0 ± 0.4; summit: 11.2 ± 0.5 to 12.7 ± 0.7). Evennessᵃˡˡ remained stable in the slope plot (0.74 ± 0.02 to 0.77 ± 0.02) and increased in the summit plot (0.68 ± 0.03 to 0.78 ± 0.01).
Applying and Interpreting Diversity Metrics: Comparing Sites
Native species richness was higher in the high-quality remnant prairie (11.3 ± 0.6 species per quadrat, mean ± standard error) compared to degraded remnant prairies (7.4 ± 0.9 species, Tukey’s range test; t = 3.98, p = 0.04) and restoration sites (6.5 ± 0.7 species, Tukey’s range test; t = 4.85, p < 0.01). Richness was similar in degraded remnants and restoration sites (Tukey’s range test; t = 0.87, p = 0.83; Figure 4, Table 3).
Simpson diversityⁿᵃᵗⁱᵛᵉ followed a similar pattern. Diversity was higher in the high-quality remnant (5.1 ± 0.2) compared to the degraded remnants (3.6 ± 0.2, Tukey’s range test; t = 1.46, p = 0.01) and restoration sites (3.6 ± 0.3, Tukey’s range test; t = 1.79, p < 0.01). Simpson diversityⁿᵃᵗⁱᵛᵉ was similar in degraded remnants and restoration sites (Tukey’s range test; t = 0.32, p = 0.73; Figure 4, Table 3).
Average conservatismᵃˡˡ did not differ in the two remnant types (high-quality remnant: 4.1 ± 0.1, degraded remnant: 3.2 ± 0.6, Tukey’s range test; t = 0.96, p = 0.21). Conservatism was lower in the restoration sites (1.4 ± 0.3) compared to high-quality remnants (Tukey’s range test; t = 2.78, p < 0.01) and degraded remnants (Tukey’s range test; t = 1.82, p = 0.01; Figure 4, Table 3).
The number of specialist species was higher in high-quality remnant prairie (2.1 ± 0.3 species per quadrat) compared to restoration sites (restoration: 0.2 ± 0.1 species, Tukey’s range test; t = 1.87, p < 0.01). Degraded remnants (1.1 ± 0.4 species) contained slightly fewer specialists compared to the high-quality remnants (Tukey’s range test; t = 0.94, p = 0.06) and slightly more specialists compared to restoration sites (Tukey’s range test; t = 0.94, p = 0.05; Figure 4, Table 3).
FQIᵃˡˡ differed among all three site types. FQI was higher in high-quality remnants (14.0 ± 0.6) compared to degraded remnants (8.7 ± 1.1, Tukey’s range test; t = 5.26, p = 0.01) and restoration sites (4.1 ± 0.8, Tukey’s range test; t = 9.84, p < 0.01). FQI was also higher in degraded remnants compared to restoration sites (Tukey’s range test; t = 4.58, p = 0.02; Figure 4, Table 3).
Evennessᵃˡˡ in high-quality prairies (0.76 ± 0.01) was marginally higher compared to restoration sites (0.67 ± 0.03, Tukey’s range test; t = 2.42, p = 0.06), but did not differ from degraded remnants (0.73 ± 0.03, Tukey’s range test; t = 0.52, p = 0.86). Evennessᵃˡˡ also did not differ in degraded remnants and restoration sites (Tukey’s range test; t = 1.03, p = 0.56; Figure 4, Table 3).
Discussion
Selecting Informative Diversity Metrics
We used regression to calculate correlations among diversity metrics and identify metrics that provide unique information, using data from 452 quadrats. All metrics were positively correlated, but the degree of correlation varied greatly (Table 2, Figure 2; additional figures: charlottereemts.shinyapps.io/diversityapp).
The strong correlations among species richness, Shannon diversity, and Simpson diversity are not surprising, since Shannon and Simpson diversity are designed to increase with higher richness. Of these three metrics, species richness is the most intuitive measure of the plant community, although the difference between richness and Simpson diversity shows the influence of dominance on the plant community (Jost 2006).
Variation in average conservatism was greatest when species richness was low (Figure 2). As richness increases, quadrat-level conservatism will converge on the average conservatism of the overall plant community because the influence of a single species becomes more diluted. This statistical phenomenon is known as regression to the mean (Stigler 1997). Spyreas (2016) noted that the variation in average conservatism was about ten times greater in plots of 0.01 m² (average richness of 5) compared to plots of 10 m² (average richness of 17). Similarly, variation in conservatism decreased with increasing species richness in tallgrass prairies and woodland ground flora (Bowles and Jones 2006, Maginel et al. 2016). Interestingly, the relationship between species richness and average conservatism is sometimes negative (Bowles and Jones 2006, Manning et al. 2017). Negative correlations may be caused by the greater variability in average conservatism at lower richness: if low richness plots are biased towards high conservatism (e.g., in sites with many conservative species), the addition of even slightly less conservative species in high richness plots will decrease average conservatism. Many authors consider average conservatism to be independent of sample size (Bourdaghs et al. 2006, Kutcher and Forrester 2018); these results suggest that conservatism is instead sensitive to species richness. Average conservatism should be used cautiously in areas with low richness and should be calculated only with adequate plot sizes (> 1 m²) in diverse sites (Spyreas 2016).
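The narrowing of average conservatism with increasing richness can be illustrated with a small simulation. The sketch below assumes that the species in a quadrat are a random draw, without replacement, from a fixed species pool; real quadrats are not random draws, so this shows only the statistical dilution effect and not the field data. The species pool, function name, and settings are hypothetical.

```python
import random
from statistics import mean, pstdev


def sd_of_mean_conservatism(pool, richness, n_quadrats=2000, seed=1):
    """Spread of average conservatism across simulated quadrats.

    pool: coefficients of conservatism for every species in the community.
    richness: number of species drawn (without replacement) per quadrat.
    """
    rng = random.Random(seed)
    means = [mean(rng.sample(pool, richness)) for _ in range(n_quadrats)]
    return pstdev(means)


# Hypothetical pool of 55 species: five species at each coefficient from 0 to 10
pool = [c for c in range(11) for _ in range(5)]
spread = {s: sd_of_mean_conservatism(pool, s) for s in (2, 5, 10, 20)}
```

In this simulation the standard deviation of average conservatism falls steadily as simulated richness rises from 2 to 20 species, even though every quadrat is drawn from the same pool.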
In our data, Floristic Quality Index (FQI) was more strongly correlated with average conservatism than with species richness. In many other studies, species richness is a stronger driver of FQI than average conservatism (Miller and Wardrop 2006, Spyreas 2016, Manning et al. 2017), but the relationship between FQI and its components may be complicated. Maginel et al. (2016) found that conservatism had greater influence on FQI values at high richness or when comparing sites with a narrow range of richness; richness in our quadrat-level data varied from 1–20. These results suggest that using species richness and average conservatism may provide more useful information than FQI itself, because the relationship of FQI with its components depends on the characteristics of the dataset.
The number of specialist species (species with conservatism ≥ 7) was correlated with average conservatism and all variations of FQI. However, average conservatism and FQI increase very little as the number of specialists rises (Figure 2), suggesting that those metrics are relatively insensitive to greater numbers of specialists. Prairie management often aims to increase the number and abundance of specialist species. However, management practices like prescribed fire and grazing that benefit specialists by maintaining open grassland can also increase generalist species (Brudvig et al. 2007), leading to no change in average conservatism. For this reason, we recommend tracking the presence and abundance of highly conservative species or using the number of specialist species separately from overall species diversity. A further advantage of focusing on specialist species is that managers do not need coefficients of conservatism for all species in their region. Such lists are lacking for many regions (Freyman et al. 2016). Instead, lists of species characteristic of high-quality native habitats can be assembled from regional floras and local knowledge.
Evenness was highly correlated with the Shannon and Simpson diversity indices. These correlations are expected: the Shannon index is a component of evenness, and the Simpson index is very similar to the Shannon index. Given that the Shannon and Simpson diversity indices are also highly correlated with species richness, evenness may be the most informative metric of the three. However, many vegetation types are naturally uneven (i.e., with a few dominant and many less abundant species), so managers must decide whether evenness is a relevant metric for their objectives.
Cover-weighted conservatism was poorly correlated with diversity metrics that do not incorporate conservatism and highly correlated with average conservatism and variations of FQI. Interpretation of this metric is complicated because a ruderal species with high cover can have greater influence than a highly conservative species with low cover. Furthermore, cover-weighted conservatism can be strongly correlated with cover (Cretini et al. 2012). In our dataset, the correlation between cover-weighted conservatismᵃˡˡ and coverᵃˡˡ was very low (r² = 0.02). In Louisiana marshes, only sites dominated by non-native or ruderal species (12 out of 405) had lower cover-weighted conservatism than expected, given their cover (Cretini et al. 2012). Cover-weighted conservatism may be useful in limited situations, but tracking cover of problem species individually, especially if they are management targets, would be more productive.
Abundance-weighted FQI is simply cover-weighted conservatism adjusted for species richness. Like cover-weighted conservatism, this metric can be difficult to interpret. Correlation between abundance-weighted FQI and human disturbance in Colorado varied from poor to moderate in different vegetation types (r² = 0–0.44, Rocchio 2007). Rocchio (2007) concluded that the extra effort to collect cover did not justify the minor improvements in discrimination between pristine and disturbed sites. Furthermore, abundance-weighted FQI was moderately or strongly correlated with all other versions of average conservatism and FQI, suggesting that it does not provide any additional information.
Richness-corrected FQI was very strongly correlated with average conservatismᵃˡˡ (r² = 0.98), but poorly correlated with species richnessᵃˡˡ (r² = 0.17). Richness-corrected FQI was designed to reduce the influence of species richness on FQI by adjusting average conservatism by the proportion of non-native species in a sample (Miller and Wardrop 2006). However, the metric does not take dominance of those non-native species into account. For this reason, richness-corrected FQI in sites with only a few non-native species will not differ much from average conservatism, even if those non-native species dominate the plant community. In our restoration sites, the average adjustment factor was 0.87 ± 0.01: average conservatism was decreased by an average of only 13%, even where non-native species were the dominant cover. Unless a large proportion of the species in a site are not native, richness-corrected FQI and average conservatism provide redundant information.
In summary, many diversity metrics are highly correlated and provide similar information to less complex metrics. Species richness is the most basic and intuitive metric of diversity; separating native and non-native species is often helpful. The difference between richness and Simpson diversity shows the influence of dominance; evenness captures similar information but may not be relevant in all situations. Average conservatism is highly variable when richness is low; the number of specialist species provides equally useful information and does not require a complete list of coefficients of conservatism. The relationship between FQI and its components is complicated and sometimes contradictory. Derivatives of FQI that include cover can be difficult to interpret and provide information similar to FQI.
Applying and Interpreting Diversity Metrics: Change Over Time
We used linear regression to investigate which diversity metrics changed over 22 years of active management in our high-quality remnant prairie, focusing on two plots (30 quadrats each) that were sampled 5–6 times. Most diversity metrics were stable or increasing.
Native richness (species per quadrat) increased dramatically between the first and last samples. Simpson diversity of native species increased in both plots, but rates of increase were lower than for species richness (Figure 3). In Wisconsin, quadrat-level species richness in a remnant prairie almost doubled in 30 years after management changed from annual haying to mostly annual prescribed fire (Rooney and Leach 2010). In Illinois, Simpson diversity also increased in remnant prairies over 25 years (Bowles and Jones 2013). In contrast, species richness in Oklahoma prairies managed with bison and fire did not change over 11 years (Spyreas 2016). These results indicate that species richness can change dramatically over time in some sites, but also that Simpson diversity provides the same information about vegetation trends as species richness.
Even though species richness increased at Clymer Meadow, average conservatism (all species) remained relatively stable in one plot and declined slightly in the other (Figure 3). Prairies in Illinois showed a similar pattern: plot-level species richness increased significantly over 30 years, while average conservatism remained stable (Bowles and Jones 2006). As discussed earlier, plot-level conservatism approaches the overall community average as species richness increases; variability in conservatism also decreases with increasing richness. Management of the Clymer Meadow remnant has changed from annual haying to a more diverse regime. This varied disturbance regime has increased the richness of generalist forbs (coefficients of 4–6), even as the number of highly conservative species increased slightly. Other scenarios could create a similar result. For example, management that simultaneously reduced the number of disturbance-dependent (low coefficient) and highly conservative species would also not change average conservatism over time. For this reason, average conservatism does not provide useful information about long-term changes in a plant community. Instead, comparisons of the distributions of coefficients (e.g., by comparing histograms) and examinations of specialist species (discussed below) will be more useful.
The number of prairie specialists (conservatism ≥ 7) increased slightly in one plot and remained stable in the other (Figure 3). Similarly, specialist species in a Wisconsin remnant prairie increased with a change in management, even as less conservative species also became more abundant (Rooney and Leach 2010). Such increases are not reflected in average conservatism or FQI, because the influence of specialist species on average conservatism in quadrats with high richness is diluted. For this reason, we recommend examining the number of specialist species or even focusing on individual species of interest.
FQI (all species) increased in both plots (Figure 3). Given that average conservatism was stable or decreasing, this increase in FQI is driven by the large increases in species richness. Bowles and Jones (2006) found a similar pattern in Illinois prairies: changes in FQI mirrored changes in species richness, not average conservatism. These similarities suggest that FQI and species richness are redundant when examining trends in high-quality sites.
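The dependence of FQI on richness can be made concrete with a brief sketch. It assumes the standard form of the index, FQI = mean coefficient of conservatism × √(number of species); the coefficient lists are hypothetical and do not represent the plots in Figure 3.

```python
from math import sqrt

def mean_conservatism(coefficients):
    """Average coefficient of conservatism for a species list."""
    return sum(coefficients) / len(coefficients)

def fqi(coefficients):
    """FQI assumed here as mean C multiplied by sqrt(species richness)."""
    return mean_conservatism(coefficients) * sqrt(len(coefficients))

# Hypothetical coefficients for the first and last samples of one plot.
first_sample = [2, 4, 6, 8]
last_sample = [2, 3, 4, 4, 5, 5, 5, 6, 8]

for label, sample in (("first", first_sample), ("last", last_sample)):
    print(
        f"{label}: richness={len(sample)}, "
        f"mean C={mean_conservatism(sample):.2f}, FQI={fqi(sample):.1f}"
    )
```

Here average conservatism falls from 5 to about 4.67, but FQI rises from 10 to 14 because richness increases from 4 to 9 species.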
Evenness (all species) was relatively stable in both plots (Figure 3). Evenness in another Blackland Prairie remnant increased by 0.13 over three years with no change in management, while evenness in a remnant previously treated with herbicide increased by 0.18 after herbicide application stopped (Hickman and Derner 2007). Because native tallgrass prairie is very diverse, evenness is usually high and is unlikely to change dramatically.
In summary, for our remnant prairie data, species richness and the number of specialist species appear to be the most useful metrics of change in diversity. Trends in Simpson diversity and FQI are very similar to trends in species richness. Average conservatism obscures trends in groups of interest, such as prairie specialists, and similar patterns of conservatism can be generated by different management outcomes.
Applying and Interpreting Diversity Metrics: Comparing Sites
We used linear model contrasts to compare three types of sites: high-quality remnant (10 plots), degraded remnants (3 plots), and restoration sites (14 plots). All metrics except evenness differed between the high-quality remnant and restoration sites; few metrics differed between degraded remnants and restoration sites. The sample size in degraded remnants was small, so comparisons with this group should be interpreted cautiously.
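As a rough illustration of a pairwise contrast in a linear model, the sketch below fits a one-way model (one mean per site type, with a pooled residual variance) and computes the t statistic for the difference between two site types. This is an assumption about the general approach, not the exact model specification used in this study, and the richness values are hypothetical.

```python
from math import sqrt

def pairwise_contrast(groups, a, b):
    """Estimate, t statistic, and residual df for mean(a) - mean(b)
    in a one-way linear model with a pooled residual variance."""
    means = {name: sum(v) / len(v) for name, v in groups.items()}
    n_total = sum(len(v) for v in groups.values())
    df_resid = n_total - len(groups)
    # Residual sum of squares around each group's own mean.
    sse = sum((x - means[name]) ** 2 for name, v in groups.items() for x in v)
    mse = sse / df_resid
    estimate = means[a] - means[b]
    std_error = sqrt(mse * (1 / len(groups[a]) + 1 / len(groups[b])))
    return estimate, estimate / std_error, df_resid

# Hypothetical native richness per plot for each site type (assumed values).
plots = {
    "high-quality remnant": [22, 25, 24, 27, 23],
    "degraded remnant": [15, 18, 16],
    "restoration": [12, 14, 11, 15, 13, 13],
}

estimate, t_stat, df_resid = pairwise_contrast(
    plots, "high-quality remnant", "restoration"
)
print(f"difference={estimate:.1f}, t={t_stat:.2f}, residual df={df_resid}")
```

Because the standard error depends on both group sizes, contrasts involving a small group (such as the three degraded remnant plots) are estimated less precisely than contrasts between larger groups.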
Native species richness in the high-quality remnant was higher than in the other site types (Figure 4, Table 3). In the northern Great Plains, species richness was higher in lightly grazed remnant prairies than in heavily grazed remnants (Smart et al. 2011). Remnant hay meadows in Kansas had higher richness than grazed remnants; both remnant types had more species than restored cropfields (Jog et al. 2006). In contrast, Taft et al. (2006) found similar native species richness in remnant prairies and in restored cropfields, but species richness varied among remnants. Blackland Prairie remnants with different herbicide histories had similar species richness (Hickman and Derner 2007). Based on the range of responses in our data and other studies, species richness does not consistently distinguish sites of different quality.
Simpson diversity of native species was higher in the remnant than in the other site types. Degraded remnants and restoration sites did not differ (Figure 4, Table 3). In Kansas, Simpson diversity of forbs, which account for most of the species diversity in prairies, was also higher in remnants than in restored sites (Denning and Foster 2017). In our data, differences in Simpson diversity among site types follow the same pattern as species richness, suggesting that this metric does not provide additional information.
Average conservatism (all species) was one of the few metrics to differ significantly between restoration and degraded remnant plots, but it did not differ between remnant types (Figure 4, Table 3). In Illinois, average native conservatism differed between remnants and restorations, as well as among remnants of varying quality (Taft et al. 2006). In South Dakota and Minnesota, average conservatism was higher in infrequently grazed remnants than in frequently grazed sites (Smart et al. 2011). Similarly, remnant hayfields in Kansas had higher average conservatism than grazed remnants. Both remnant types had higher average conservatism than restored cropfields (Jog et al. 2006). In most cases, average conservatism can measure site quality.
Specialist species were most common in the high-quality prairie and almost completely absent from the restoration sites; degraded prairies differed marginally from the other site types. Brudvig et al. (2007) found no difference in the proportion of specialist species under different management regimes. The proportion of generalist species (conservatism ≤ 2) also did not change. Changes in the number of species of different conservatism levels may obscure proportional changes for any one group. Using the number of specialist species, rather than their proportion, may be a useful measure of site quality.
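The difference between counting specialists and expressing them as a proportion can be shown with a minimal sketch. The coefficient lists are hypothetical, and the function name is an assumption for illustration; the threshold of 7 follows the definition of specialists used in this article.

```python
def specialist_summary(coefficients, threshold=7):
    """Return (count, proportion) of species with coefficient >= threshold."""
    count = sum(1 for c in coefficients if c >= threshold)
    return count, count / len(coefficients)

# Hypothetical coefficients of conservatism for two sites (assumed values).
site_a = [1, 3, 5, 8]
site_b = [1, 2, 3, 4, 5, 6, 8, 9]

for label, site in (("site A", site_a), ("site B", site_b)):
    count, proportion = specialist_summary(site)
    print(f"{label}: specialists={count}, proportion={proportion:.2f}")
```

Site B has twice as many specialists as site A, but the proportion is identical because the other groups of species are also more numerous.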
FQI (all species) differed significantly among site types (Figure 4, Table 3). In Illinois, FQI was higher in remnant prairies than in planted prairies and also differed among remnants of varying quality (Taft et al. 2006). In Kansas, FQI was highest in hayed remnants; grazed remnants and restoration sites had lower FQI (Jog et al. 2006). FQI was influenced by both grazing and herbicide use in the northern Great Plains, with patterns in FQI more closely matching patterns of species richness than average conservatism (Smart et al. 2011). In Missouri, FQI changed less than species richness after grazing and burning (Briggler et al. 2017). FQI can reflect site quality but is strongly influenced by species richness in some cases.
Evenness did not differ among site types but varied the most in restoration plots (Figure 4, Table 3). Evenness in Illinois remnants and reconstructed prairies also did not differ (Taft et al. 2006), while evenness in Blackland Prairie remnants with different herbicide histories differed by only 0.09 (Hickman and Derner 2007). These results suggest that evenness is not a useful measure of prairie quality in the Great Plains.
In summary, all diversity metrics except evenness could distinguish the high-quality remnant from restoration sites. Species richness and FQI differed between high-quality and degraded remnants; the number of specialist species was also slightly different. Average conservatism and FQI differed between degraded remnants and restoration sites; the number of specialist species was also marginally different.
Conclusions
We created a conservatism list for 492 species based on their fidelity to remnant prairies in the Texas Blackland Prairie region. We then compared ten diversity metrics to identify which provided unique information, could detect trends over time in high-quality sites, and distinguished among sites of different quality. In our dataset, the number of specialist species worked well for detecting change over time in high-quality sites. Specialists also differed between a high-quality remnant and restoration sites (Table 4). This metric is intuitive, easy to collect, and does not require a complete list of regional coefficients of conservatism. It should be examined in more datasets to test its applicability to other vegetation types.
Native species richness showed the largest change over time in remnant prairie. In our data, richness differed between the highest and lowest quality sites; other studies suggest that richness does not vary consistently with site quality. Like the number of specialist species, richness is easy to measure and understand. However, comparisons between different sampling schemes, especially if their sample areas vary greatly, can be deceiving. The estimation of species richness in different datasets is an active area of research (e.g., Magurran and McGill 2011, Chao and Chin 2016). We encourage the use of long-term monitoring plots to make trend detection simpler.
Average conservatism and the Floristic Quality Index can be used to measure site quality, but neither was ideal for trend detection. Average conservatism is less sensitive to sample size than FQI, but it is sensitive to species richness and may be biased in low diversity sites. FQI could distinguish among all quality levels in our data. Both average conservatism and FQI require the development of a regional list of coefficients of conservatism. Lacking such a list, the number of specialist species can provide a useful proxy.
Acknowledgements
We thank Marcia Hackett for help with the initial design of the study. David Diamond and Fred Smeins conducted vegetation sampling in the 1980s, which also informed our study design. Brandon Belcher, Jacqueline Ferrato, Chan Glidewell, and Tom Phillips helped with field data collection. Bill Carr, Jason Singhurst, David Diamond, and Matt White contributed to the development of the coefficients of conservatism. Jorge Brenner, Mike Duran, Jacqueline Ferrato, Rich Kostecke, Ryan Smith, an anonymous reviewer, and members of the Notorious Ecological Restoration Discussion provided valuable feedback on the manuscript.
This open access article is distributed under the terms of the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0) and is freely available online at: http://jhr.uwpress.org