For events reported by forecast zone, use regular expressions to match as many as possible to counties.
Arguments
- storm_data_z
A dataframe of storm events reported by forecast zone (i.e., cz_type == "Z") rather than county. This dataframe should include the columns:
  - state: State name, in lowercase
  - cz_name: Location name, in lowercase
  - cz_fips: Forecast zone FIPS
Value
The dataframe of events input to the function, with county FIPS added in the fips column for events matched to a county. Events that could not be matched are kept in the dataframe, but the fips code is set to NA.
Details
This function tries to match the cz_name of each event to a state and county name from the county.fips dataframe that comes with the maps package. The following steps are taken to try to match each cz_name to a state and county name from county.fips:
1. Try to match cz_name to the county name in county.fips after removing any periods or apostrophes in cz_name.
2. Next, for cz_name values with "county" in them, try to match the word before "county" to the county name in county.fips. Then check the two words before "county", then the one and two words before "counties".
3. Next, pull out the last word in cz_name and try to match it to the county name in county.fips. Then check the last two words in cz_name, then check the last three words in cz_name.
4. Next, pull any words right before a slash and check them against the county name.
5. Finally, try removing anything in parentheses in cz_name before matching.
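The matching steps above can be sketched in base R as follows. This is only an illustration of the described cascade, not the package's actual implementation: the helper names candidate_county_names() and first_match() are hypothetical, and the county_names argument stands in for the county names of a single state taken from county.fips.

```r
# Hypothetical helper: list the strings that the steps above would try,
# in order, for a single cz_name. Not the package's real code.
candidate_county_names <- function(cz_name) {
  # Lowercase, then remove periods and apostrophes
  x <- gsub("[.']", "", tolower(cz_name))

  # The n words immediately before a keyword ("county" or "counties")
  words_before <- function(s, keyword, n) {
    pattern <- sprintf("((?:[a-z]+ ){%d})%s\\b", n, keyword)
    m <- regmatches(s, regexec(pattern, s, perl = TRUE))[[1]]
    if (length(m) == 0) NA_character_ else trimws(m[2])
  }

  # The last n words of the name
  last_words <- function(s, n) {
    w <- strsplit(s, " +")[[1]]
    if (length(w) < n) NA_character_ else paste(tail(w, n), collapse = " ")
  }

  # Text before the first slash, if any
  before_slash <- if (grepl("/", x, fixed = TRUE)) sub("/.*$", "", x) else NA_character_

  # Name with any parenthetical removed
  no_parens <- trimws(gsub("\\s*\\([^)]*\\)", "", x))

  candidates <- c(
    x,
    words_before(x, "county", 1), words_before(x, "county", 2),
    words_before(x, "counties", 1), words_before(x, "counties", 2),
    last_words(x, 1), last_words(x, 2), last_words(x, 3),
    before_slash,
    no_parens
  )
  unique(candidates[!is.na(candidates)])
}

# Hypothetical helper: first candidate that is a known county name, else NA
first_match <- function(cz_name, county_names) {
  candidates <- candidate_county_names(cz_name)
  hits <- candidates[candidates %in% county_names]
  if (length(hits) == 0) NA_character_ else hits[1]
}

first_match("Ventura County Mountains", c("ventura", "shasta"))
first_match("Kings (Brooklyn)", c("kings", "queens"))
```

Because every candidate is tried in a fixed order, an earlier and more specific step takes priority over a later and looser one, which is the reason the steps are described as a sequence.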
Note
This function does not provide any matches for events outside of the continental U.S.
You may want to hand-check that event listings with names like "Lake", "Mountain", and
"Park" have not been unintentionally linked to a county like "Lake County". While such
examples seem rare in the example data used to develop this function (NOAA Storm Events
for 2015), they can sometimes occur. To check, you can use the str_detect function from the stringr package.
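A minimal sketch of such a hand-check, assuming the stringr package is installed. The small data frame here is built by hand from three rows of the example on this page and only stands in for real output of the function; the search pattern is one possible choice, not a recommended list.

```r
library(stringr)

# Stand-in for matched output (values copied from the example on this page)
matched <- data.frame(
  cz_name = c("Central & Eastern Lake County",
              "Yellowstone National Park",
              "Lower Bucks"),
  state = c("Oregon", "Wyoming", "Pennsylvania"),
  fips = c(41037L, NA, 42017L)
)

# Flag names containing a geographic word that is also a county name
geo_word <- regex("\\b(lake|mountains?|park)\\b", ignore_case = TRUE)
flagged <- str_detect(matched$cz_name, geo_word)

# Only events that were actually linked to a county need a manual look
to_check <- matched[flagged & !is.na(matched$fips), ]
to_check
```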
Examples
counties_to_parse <- dplyr::tibble(
  event_id = 1:19,
  cz_name = c("Suffolk",
              "Eastern Greenbrier",
              "Ventura County Mountains",
              "Central And Southeast Montgomery",
              "Western Cape May",
              "San Diego County Coastal Areas",
              "Blount/Smoky Mountains",
              "St. Mary's",
              "Central & Eastern Lake County",
              "Mountains Southwest Shasta County To Northern Lake County",
              "Kings (Brooklyn)",
              "Lower Bucks",
              "Central St. Louis",
              "Curry County Coast",
              "Lincoln County Except The Sheep Range",
              "Shasta Lake/North Shasta County",
              "Coastal Palm Beach County",
              "Larimer & Boulder Counties Between 6000 & 9000 Feet",
              "Yellowstone National Park"),
  state = c("Virginia",
            "West Virginia",
            "California",
            "Maryland",
            "New Jersey",
            "California",
            "Tennessee",
            "Maryland",
            "Oregon",
            "California",
            "New York",
            "Pennsylvania",
            "Minnesota",
            "Oregon",
            "Nevada",
            "California",
            "Florida",
            "Colorado",
            "Wyoming"))
match_forecast_county(counties_to_parse)
#> # A tibble: 19 × 4
#>    event_id cz_name                                                  state  fips
#>       <int> <chr>                                                    <chr> <int>
#>  1        1 Suffolk                                                  Virg… 51800
#>  2        2 Eastern Greenbrier                                       West… 54025
#>  3        3 Ventura County Mountains                                 Cali…  6111
#>  4        4 Central And Southeast Montgomery                         Mary… 24031
#>  5        5 Western Cape May                                         New … 34009
#>  6        6 San Diego County Coastal Areas                           Cali…  6073
#>  7        7 Blount/Smoky Mountains                                   Tenn… 47009
#>  8        8 St. Mary's                                               Mary… 24037
#>  9        9 Central & Eastern Lake County                            Oreg… 41037
#> 10       10 Mountains Southwest Shasta County To Northern Lake Coun… Cali…  6089
#> 11       11 Kings (Brooklyn)                                         New … 36047
#> 12       12 Lower Bucks                                              Penn… 42017
#> 13       13 Central St. Louis                                        Minn… 27137
#> 14       14 Curry County Coast                                       Oreg… 41015
#> 15       15 Lincoln County Except The Sheep Range                    Neva… 32017
#> 16       16 Shasta Lake/North Shasta County                          Cali…  6089
#> 17       17 Coastal Palm Beach County                                Flor… 12099
#> 18       18 Larimer & Boulder Counties Between 6000 & 9000 Feet      Colo…  8013
#> 19       19 Yellowstone National Park                                Wyom…    NA
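In the output above, fips is an integer column, so codes for states such as California print without their leading zero (6111 rather than 06111), and unmatched events carry NA. The following sketch shows one way to pull out unmatched events and build a five-character FIPS string. It uses a small hand-built data frame with values copied from the output above rather than a call to the function, and the column name fips_chr is made up for illustration.

```r
# Stand-in for a few rows of the result shown above
matched <- data.frame(
  cz_name = c("Suffolk", "Ventura County Mountains", "Yellowstone National Park"),
  fips = c(51800L, 6111L, NA)
)

# Events that could not be matched keep their row, with fips set to NA
unmatched <- matched[is.na(matched$fips), ]

# Zero-pad to five characters, keeping NA as a true missing value
matched$fips_chr <- ifelse(is.na(matched$fips),
                           NA_character_,
                           sprintf("%05d", matched$fips))
matched
```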