This vignette captures the analyses in a forthcoming manuscript (preprint + peer-reviewed paper) evaluating the utility of remotely sensed data for monitoring compliance with section 7 of the US Endangered Species Act. The section 7 consultation data - work types, coordinates, etc. - are from the TAILS database from the US Fish and Wildlife Service (FWS); see Malcom and Li 2015, PNAS for more information.
The dataset includes 364 consultations randomly selected from among over 44,000 consultations recorded with coordinates by FWS from 2008-2015.
head(dat)
dim(dat)
## [1] 364 35
Although FWS’s consultation data shows that formal consultations only make up ~7% of all consultations, actions only undergo formal consultation if an ESA- listed species is likely to be harmed. That is, we are generally more interested in actions that have undergone formal consultation. The 182 formal consultations we investigated are therefore a “denser” sampling of all formal consultations than the 182 samples of informal consultations.
The locations of the consultations in our sample generally reflected the distribution of consultations observed by Malcom and Li (2015): formal consultations were more likely to come from the western US, informal consultations more likely from the East. The bias in which offices record coordinates is apparent with the lack of formal consultations from some states, like Colorado (n ~ 500 formal consultations, but rarely records coordinates) and Washington (n ~ 500 formal consultations, high data recording rate).
### A static map in case we want one for Supp. Info.
#
# to_map <- dplyr::filter(dat, dat$lat_dec_deg.x > 20 & dat$lat_dec_deg.x < 60)
# map <- qmplot(
# data = to_map,
# x = long_dec_deg.x,
# y = lat_dec_deg.x,
# maptype = "toner-lite",
# color = formal_in,
# size = I(2),
# alpha = I(0.5),
# extent = "normal",
# zoom = 5) +
# theme_bw() +
# labs(x = "Longitude",
# y = "Latitude") +
# scale_color_viridis(discrete = TRUE, name = "Type")
# ggsave(plot = map, "inst/figs/map.pdf")
cols <- substr(
viridis(3), 0, 7
)
dat$formal_colors <- ifelse(
dat$formal_in == "formal",
cols[1],
cols[3]
)
tags$div(
style = "background-color: #e9e9e9",
leaflet(dat) %>%
setView(lng=-95, lat=38, zoom = 4) %>%
addProviderTiles("Stamen.TonerLite") %>%
mapOptions(zoomToLimits = "first") %>%
addCircleMarkers(
lng= ~long_dec_deg.x,
lat= ~lat_dec_deg.x,
stroke = FALSE,
fillOpacity = 0.5,
fillColor = ~formal_colors,
popup = ~activity_code
)
)
Top lead agencies
head(sort(table(as.character(dat$lead_agency)), decreasing = TRUE), 10)
##
## Army Corps of Engineers
## 90
## -- OTHER: CONSULTANT --
## 53
## Federal Highway Administration
## 40
## Forest Service
## 16
## Bureau of Indian Affairs
## 15
## Fish and Wildlife Service
## 12
## Federal Communications Commission
## 11
## California Dept. of Transportation District 6
## 10
## Community Planning and Development (HUD)
## 10
## California Dept. of Transportation District 3
## 9
Top 10 lead ES offices
head(sort(table(as.character(dat$ESOffice.x)), decreasing = TRUE), 10)
##
## SACRAMENTO FISH AND WILDLIFE OFFICE
## 69
## PENNSYLVANIA ECOLOGICAL SERVICES FIELD OFFICE
## 50
## SOUTH FLORIDA ECOLOGICAL SERVICES FIELD OFFICE
## 26
## WASHINGTON FISH AND WILDLIFE OFFICE
## 23
## VENTURA FISH AND WILDLIFE OFFICE
## 21
## MISSISSIPPI ECOLOGICAL SERVICES FIELD OFFICE
## 18
## TENNESSEE ECOLOGICAL SERVICES FIELD OFFICE
## 17
## IDAHO FISH AND WILDLIFE OFFICE
## 11
## BLOOMINGTON ECOLOGICAL SERVICES FIELD OFFICE
## 9
## RALEIGH ECOLOGICAL SERVICES FIELD OFFICE
## 9
Fiscal years
table(dat$FY)
##
## 2008 2009 2010 2011 2012 2013 2014 2015
## 70 65 62 44 39 33 34 17
Before starting to explore the remotely sensed data, two authors (TK and JM) independently scored each action type for expected observability, with:
After independent scoring, we met and reconciled the expected scores. The authors - primarily TK - then attempted to find all 364 consultations in aerial images using Google Earth and the coordinates provided by FWS.
First, the overall expected and observed observability for formal and informal consultations:
expected <- aggregate(reconcile ~ formal_in,
data = dat,
FUN = mean, na.rm = TRUE)
names(expected)[2] <- "mean"
expected$OE <- "Exp"
observed <- aggregate(action_found ~ formal_in,
data = dat,
FUN = mean, na.rm = TRUE)
names(observed)[2] <- "mean"
observed$OE <- "Obs"
overall <- rbind(expected, observed)
overall <- select(overall, OE, formal_in, mean)
names(overall) <- c("Obs/Exp", "Formal/Informal", "Mean")
kable(overall, digits = 3)
Obs/Exp | Formal/Informal | Mean |
---|---|---|
Exp | formal | 0.538 |
Exp | informal | 0.657 |
Obs | formal | 0.389 |
Obs | informal | 0.397 |
Next, we break the categories down to identify which classes changed between the expected and observed stages, e.g., if “maybe”s were, in reality, not observable:
values <- c("No", "Maybe", "Yes")
for_exp <- table(filter(dat, formal_in == "formal")$reconcile)
for_obs <- table(filter(dat, formal_in == "formal")$action_found)
inf_exp <- table(filter(dat, formal_in == "informal")$reconcile)
inf_obs <- table(filter(dat, formal_in == "informal")$action_found)
new <- data_frame(
OE = c(rep("expected", 6), rep("observed", 6)),
`in/formal` = c(rep("formal", 3), rep("informal", 3),
rep("formal", 3), rep("informal", 3)),
observed = rep(values, 4),
frequency = c(for_exp, inf_exp, for_obs, inf_obs)
)
ggplot(data = new, aes(x = observed, y = frequency)) +
geom_bar(stat = "identity") +
labs(x = "Action observability",
y = "# consultations") +
facet_grid(OE ~ `in/formal`) +
theme_hc()
The “Maybe” categories for expected were mostly “No” when observed, for both formal and informal consultations.
new
Determining the types of work that are most amenable to monitoring from the standpoint of observability - whether the effects of the action are observable in aerial or satellite imagery - is a key challenge. Before showing observability rate, we the sample size by work type and by formal/informal consultation:
n_obs <- as_data_frame(table(dat$work_category, dat$formal_in))
names(n_obs) <- c("work_cat", "formal_in", "n")
marg <- ggplot(n_obs, aes(x = n, y = work_cat)) +
geom_lollipop(horizontal = TRUE, point.size = 3) +
facet_grid(. ~ formal_in) +
labs(x = NULL, y = "Sample size") +
theme_hc()
marg
# Remove the y-axis labels to make compound fig and save for the ms.
# marg <- marge +
# theme(axis.title.y=element_blank(),
# axis.text.y=element_blank(),
# axis.ticks.y=element_blank())
# ggsave("inst/figs/work_cat_counts.pdf", height = 9.75, width = 3.25)
And now the mean observability by work category and formal/informal consultation:
type_obs <- aggregate(
action_found ~ work_category + formal_in,
data = dat,
FUN = mean, na.rm = TRUE
)
names(type_obs) <- c("category", "in/formal", "observability")
fig <- ggplot(type_obs, aes(y = category, x = `in/formal`, fill = observability)) +
geom_tile(color = "white", size = 0.1) +
labs(x = "Consultation Type", y = "") +
scale_y_discrete(limits = levels(type_obs$category)) +
scale_fill_viridis(name="Mean Observability") +
theme_hc()
# fig
# ggsave("inst/figs/work_cat_obs.pdf", height = 9.75, width = 6.5)
# Use ggplotly for this HTML vignette:
ggplotly(fig)
There has been a slight decrease in observability over time, but that decrease is not statistically significant:
cor.test(dat$action_found, dat$FY)
##
## Pearson's product-moment correlation
##
## data: dat$action_found and dat$FY
## t = -0.91304, df = 358, p-value = 0.3618
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1508101 0.0554384
## sample estimates:
## cor
## -0.04819961
get_observabilities <- function(dat, formal_in) {
type_mean <- tapply(dat$action_found,
INDEX = dat$work_type,
FUN = mean, na.rm = TRUE)
type_median <- tapply(dat$action_found,
INDEX = dat$work_type,
FUN = median, na.rm = TRUE)
if (formal_in == "formal") {
type_count <- tapply(dat$N_formal,
INDEX = dat$work_type,
FUN = mean, na.rm = TRUE)
} else {
type_count <- tapply(dat$N_consultations,
INDEX = dat$work_type,
FUN = mean, na.rm = TRUE)
}
expect_to_see_all <- type_mean * type_count
# And these are the results at the highest level:
tot_num_formal <- sum(type_count, na.rm = TRUE)
exp_num_see <- sum(expect_to_see_all, na.rm = TRUE)
obs_rate <- round(exp_num_see / tot_num_formal, 2)
in_set <- round(tot_num_formal, 0)
exp_obs <- round(exp_num_see, 0)
return(c(obs_rate, in_set, exp_obs))
}
rows <- c("Observability | work type",
"# consultations in set",
"# consultations expect to see effects")
exp_df <- data_frame(
value = rows,
formal = get_observabilities(formal, "formal"),
informal = get_observabilities(informal, "informal")
)
exp_df
Note: The number of consultations in each set is the number in the work categories we examined, not the total number of consultations with coordinates. There is a long tail of work categories with few consultations.
Formal consultations:
make_scatter_df <- function(dat, cons_dat, formal_in) {
type_mean <- tapply(dat$action_found,
INDEX = dat$work_type,
FUN = mean, na.rm = TRUE)
type_median <- tapply(dat$action_found,
INDEX = dat$work_type,
FUN = median, na.rm = TRUE)
if (formal_in == "formal") {
type_count <- tapply(dat$N_formal,
INDEX = dat$work_type,
FUN = mean, na.rm = TRUE)
} else {
type_count <- tapply(dat$N_consultations,
INDEX = dat$work_type,
FUN = mean, na.rm = TRUE)
}
expect_to_see_all <- type_mean * type_count
tmp_dat <- data.frame(type_count = as.vector(type_count),
type_mean = as.vector(type_mean),
work = names(type_count))
work_cat_type <- data.frame(cat = cons_dat$work_category,
work = as.character(cons_dat$work_type))
work_cat_type$uniq <- duplicated(work_cat_type$work)
work_cat_type <- work_cat_type[work_cat_type$uniq == FALSE, ]
tmp_dat <- inner_join(tmp_dat, work_cat_type, by = "work")
return(tmp_dat)
}
form_obs_dat <- make_scatter_df(formal, form_cons, "formal")
tags$div(
plot_ly(data = form_obs_dat,
type = "scatter",
mode = "markers",
y = form_obs_dat$type_count,
x = round(form_obs_dat$type_mean, 2),
text = paste("Work type:",
form_obs_dat$work,
"<br>Work category:",
form_obs_dat$cat),
marker = list(
color = substr(
viridis(n = length(unique(form_obs_dat$cat))), 0, 7
),
opacity = 0.6,
size = 20)
) %>%
layout(yaxis = list(title = "# actions"),
xaxis = list(title = "Prop. observed"))
)
tags$br()
tags$br()
Informal consultations:
inform_obs_dat <- suppressWarnings(
make_scatter_df(informal, inform_cons, "informal")
)
tags$div(
plot_ly(data = inform_obs_dat,
type = "scatter",
mode = "markers",
y = inform_obs_dat$type_count,
x = round(inform_obs_dat$type_mean, 2),
text = paste("Work type:",
inform_obs_dat$work,
"<br>Work category:",
inform_obs_dat$cat),
marker = list(
color = substr(
viridis(n = length(unique(inform_obs_dat$cat))), 0, 7
),
opacity = 0.6,
size = 20)
) %>%
layout(yaxis = list(title = "# actions"),
xaxis = list(title = "Prop. observed"))
)
We recorded the type of habitat change observed at each found action site to evaluate the types of change that are consulted on:
trans_hab <- function(x) {
if (is.na(x) | is.null(x)) NA
else if (x == 1) "natural -> natural"
else if (x == 2) "natural -> agriculture"
else if (x == 3) "natural -> development"
else if (x == 4) "agriculture -> development"
else "development -> development"
}
tmp <- sapply(dat$hab_chg, trans_hab)
conv_type <- table(tmp, dat$formal_in)
conv_type <- data_frame(
conversion = row.names(conv_type),
formal = conv_type[, 1],
informal = conv_type[, 2]
)
conv_type
Knowing the distribution of the oldest average aerial images is useful for anticipating how far we can “turn the clock back” to track habitat changes at consultation sites.
summary(dat$earliest_date, na.rm = T)
## Min. 1st Qu. Median Mean 3rd Qu.
## "1939-12-01" "1993-03-01" "1994-02-01" "1992-09-24" "1995-02-01"
## Max. NA's
## "2005-03-01" "3"
ggplot(dat, aes(earliest_date)) +
geom_histogram(colour="white") +
labs(x = "Earliest Aerial Image Date") +
theme_hc()