Chapter 2 The Covid19R Package
The covid19R package provides access to the data collected by the packages that are part of the Covid19R project. It has minimal dependencies and does not require you to install any of the other data access packages. Instead, it queries our growing database of standardized tidy datasets, allowing you to search what is available and then directly download the datasets that have been updated in the previous 6 hours.
2.1 Finding out what data we have with get_covid19_data_info()
To see what data is available, call get_covid19_data_info(). It returns a tibble of all available datasets, along with information about each. This is the same information that the individual data packages return with their own get_info_*() functions, discussed in Standardized Package Functions. In addition, it reports when the data was last updated and whether the data package is currently able to acquire data or is failing.
Here is what we have at the time of writing this documentation.
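library(covid19R)
library(dplyr)  # provides %>%, filter(), and select()

# Fetch the table describing every available dataset
dat_info <- get_covid19_data_info()

# Render it as a scrollable HTML table
dat_info %>%
  knitr::kable() %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive")) %>%
  kableExtra::scroll_box(width = "100%", height = "100%")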
data_set_name | package_name | function_to_get_data | data_details | data_url | license_url | data_types | location_types | spatial_extent | has_geospatial_info | get_info_passing | refresh_status | last_refresh_update |
---|---|---|---|---|---|---|---|---|---|---|---|---|
covid19nytimes_states | covid19nytimes | refresh_covid19nytimes_states | Open Source data from the New York Times on distribution of confirmed Covid-19 cases and deaths in the US States. For more, see https://www.nytimes.com/article/coronavirus-county-data-us.html or the readme at https://github.com/nytimes/covid-19-data. | https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv | https://github.com/nytimes/covid-19-data/blob/master/LICENSE | cases_total, deaths_total | state | country | FALSE | TRUE | Passed | 2020-06-20 16:13:11 |
covid19nytimes_counties | covid19nytimes | refresh_covid19nytimes_counties | Open Source data from the New York Times on distribution of confirmed Covid-19 cases and deaths in the US by County. For more, see https://www.nytimes.com/article/coronavirus-county-data-us.html or the readme at https://github.com/nytimes/covid-19-data. | https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv | https://github.com/nytimes/covid-19-data/blob/master/LICENSE | cases_total, deaths_total | state | country | FALSE | TRUE | Passed | 2020-06-20 16:13:17 |
covid19france | covid19france | refresh_covid19france | Open Source data from opencovid19-fr on distribution of confirmed Covid-19 cases and deaths in the US States. For more, see https://github.com/opencovid19-fr/data. | https://raw.githubusercontent.com/opencovid19-fr/data/master/dist/chiffres-cles.csv | https://github.com/opencovid19-fr/data/blob/master/LICENSE | confirmed, dead, icu, hospitalized, recovered, discovered | county, region, country, overseas collectivity | country | FALSE | TRUE | Passed | 2020-06-20 16:13:36 |
CanadaC19_cases | CanadaC19 | refresh_CanadaC19_cases | Open Source data from multiple public reporting data throughout Canada. For more, see https://github.com/ishaberry/Covid19Canada. | https://raw.githubusercontent.com/ishaberry/Covid19Canada/master/cases.csv | https://github.com/debusklaneml/CanadaC19/blob/master/LICENSE | cases_new | state | state | FALSE | TRUE | Passed | 2020-06-20 16:13:39 |
covid19us | covid19us | refresh_covid19us | Open Source data from COVID Tracking Project on the distribution of Covid-19 cases and deaths in the US. For more, see https://github.com/opencovid19-fr/data. | https://covidtracking.com/api | https://github.com/aedobbyn/covid19us/blob/master/LICENSE.md | positive, negative, pending, hospitalized_currently, hospitalized_cumulative, in_icu_currently, in_icu_cumulative, on_ventilator_currently, on_ventilator_cumulative, recovered, death, hospitalized, total_tests_viral, positive_tests_viral, negative_tests_viral, positive_cases_viral, positive_increase, negative_increase, total, total_test_results, total_test_results_increase, death_increase, hospitalized_increase, negative_regular_score, negative_score, positive_score | state | country | FALSE | TRUE | Passed | 2020-06-20 16:13:45 |
VirginiaC19 | VirginiaC19 | refresh_VirginiaC19 | Open Source data from Virginia Department of Health. For more information, see https://www.vdh.virginia.gov/coronavirus/. | https://www.vdh.virginia.gov/content/uploads/sites/182/2020/03/VDH-COVID-19-PublicUseDataset-Cases.csv | https://github.com/debusklaneml/VirginiaC19/blob/master/LICENSE | cases_new,deaths_new,hosp_new | county | state | FALSE | TRUE | Passed | 2020-06-20 16:13:45 |
covid19tunisia | covid19tunisia | refresh_covid19tunisia | Open Source data on distribution of confirmed Covid-19 cases, recovered ones and deaths in Tunisia. For more, https://github.com/MounaBelaid/covid19datatunisia | https://raw.githubusercontent.com/MounaBelaid/covid19datatunisia/master/dist/data.csv | https://github.com/MounaBelaid/covid19datatunisia/blob/master/LICENSE | cases_new, deaths_new, recovered_new | state | country | FALSE | TRUE | Passed | 2020-06-20 16:13:45 |
covid19mobility_apple_country | covid19mobility | refresh_covid19mobility_apple_country | Data reflects relative volume of directions requests compared to a baseline volume on January 13th, 2020 for multiple transportation modes aggregated at the country level. | https://www.apple.com/covid19/mobility | https://www.apple.com/covid19/mobility | driving, walking, transit | country | global | FALSE | TRUE | Passed | 2020-06-20 16:13:47 |
covid19mobility_apple_subregion | covid19mobility | refresh_covid19mobility_apple_subregion | Data reflects relative volume of directions requests compared to a baseline volume on January 13th, 2020 for multiple transportation modes aggregated at the subregion (state) level. | https://www.apple.com/covid19/mobility | https://www.apple.com/covid19/mobility | driving, walking, transit | state | global | FALSE | TRUE | Passed | 2020-06-20 16:13:53 |
covid19mobility_apple_city | covid19mobility | refresh_covid19mobility_apple_city | Data reflects relative volume of directions requests compared to a baseline volume on January 13th, 2020 for multiple transportation modes aggregated at the city level. | https://www.apple.com/covid19/mobility | https://www.apple.com/covid19/mobility | driving, walking, transit | city | global | TRUE | TRUE | Passed | 2020-06-20 16:14:00 |
covid19mobility_google_country | covid19mobility | refresh_covid19mobility_google_country | Changes for each day are compared to a baseline value for that day of the week as compared to the 5-week period Jan 3-Feb 6, 2020 for visits to places falling in to certain categories. | https://www.google.com/covid19/mobility/ | https://www.google.com/covid19/mobility/ | retail_and_recreation_percent_change_from_baseline,grocery_and_pharmacy_percent_change_from_baseline,parks_percent_change_from_baseline,transit_stations_percent_change_from_baseline,workplaces_percent_change_from_baseline,residential_percent_change_from_baseline | country | global | FALSE | TRUE | Passed | 2020-06-20 16:14:08 |
covid19mobility_google_subregions | covid19mobility | refresh_covid19mobility_google_subregions | Changes for each day are compared to a baseline value for that day of the week as compared to the 5-week period Jan 3-Feb 6, 2020 for visits to places falling in to certain categories. Data is aggregated at the state or subdivision level. | https://www.google.com/covid19/mobility/ | https://www.google.com/covid19/mobility/ | retail_and_recreation_percent_change_from_baseline,grocery_and_pharmacy_percent_change_from_baseline,parks_percent_change_from_baseline,transit_stations_percent_change_from_baseline,workplaces_percent_change_from_baseline,residential_percent_change_from_baseline | state | global | FALSE | TRUE | Passed | 2020-06-20 16:14:39 |
covid19mobility_google_us_counties | covid19mobility | refresh_covid19mobility_google_us_counties | Changes for each day are compared to a baseline value for that day of the week as compared to the 5-week period Jan 3-Feb 6, 2020 for visits to places falling in to certain categories. Data is aggregated at the county level for the USA only. | https://www.google.com/covid19/mobility/ | https://www.google.com/covid19/mobility/ | retail_and_recreation_percent_change_from_baseline,grocery_and_pharmacy_percent_change_from_baseline,parks_percent_change_from_baseline,transit_stations_percent_change_from_baseline,workplaces_percent_change_from_baseline,residential_percent_change_from_baseline | county | country | FALSE | TRUE | Passed | 2020-06-20 16:15:11 |
This table can easily be filtered to find the datasets most relevant to your work. For example, to list the datasets whose location_types include "state":
dat_info %>%
  filter(stringr::str_detect(location_types, "state")) %>%
  select(data_set_name, data_details) %>%
  knitr::kable() %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive")) %>%
  kableExtra::scroll_box(width = "100%", height = "100%")
data_set_name | data_details |
---|---|
covid19nytimes_states | Open Source data from the New York Times on distribution of confirmed Covid-19 cases and deaths in the US States. For more, see https://www.nytimes.com/article/coronavirus-county-data-us.html or the readme at https://github.com/nytimes/covid-19-data. |
covid19nytimes_counties | Open Source data from the New York Times on distribution of confirmed Covid-19 cases and deaths in the US by County. For more, see https://www.nytimes.com/article/coronavirus-county-data-us.html or the readme at https://github.com/nytimes/covid-19-data. |
CanadaC19_cases | Open Source data from multiple public reporting data throughout Canada. For more, see https://github.com/ishaberry/Covid19Canada. |
covid19us | Open Source data from COVID Tracking Project on the distribution of Covid-19 cases and deaths in the US. For more, see https://github.com/opencovid19-fr/data. |
covid19tunisia | Open Source data on distribution of confirmed Covid-19 cases, recovered ones and deaths in Tunisia. For more, https://github.com/MounaBelaid/covid19datatunisia |
covid19mobility_apple_subregion | Data reflects relative volume of directions requests compared to a baseline volume on January 13th, 2020 for multiple transportation modes aggregated at the subregion (state) level. |
covid19mobility_google_subregions | Changes for each day are compared to a baseline value for that day of the week as compared to the 5-week period Jan 3-Feb 6, 2020 for visits to places falling in to certain categories. Data is aggregated at the state or subdivision level. |
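The same filtering pattern extends to the other columns of the info table. As a minimal sketch, the block below rebuilds a few rows of the table above by hand, so that it runs without network access, and keeps only the datasets that refreshed successfully and include county-level data. The data frame `info_subset` is an illustrative stand-in for `dat_info`, not something the package provides, and only base R is used.

```r
# Hand-copied subset of the info table shown above; a stand-in for
# the tibble returned by get_covid19_data_info().
info_subset <- data.frame(
  data_set_name = c("covid19nytimes_states", "covid19france",
                    "VirginiaC19", "covid19mobility_apple_city"),
  location_types = c("state",
                     "county, region, country, overseas collectivity",
                     "county", "city"),
  has_geospatial_info = c(FALSE, FALSE, FALSE, TRUE),
  refresh_status = c("Passed", "Passed", "Passed", "Passed"),
  stringsAsFactors = FALSE
)

# Keep datasets that refreshed successfully and include county-level data.
is_county <- grepl("county", info_subset$location_types, fixed = TRUE)
is_passing <- info_subset$refresh_status == "Passed"
county_sets <- info_subset$data_set_name[is_county & is_passing]

print(county_sets)
```

Because location_types can hold several comma-separated values, a substring match is used here instead of testing for equality.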
2.2 Downloading a dataset with get_covid19_dataset()
Once you have determined which dataset you want, you can download it with get_covid19_dataset(). For example:
nytimes_states <- get_covid19_dataset("covid19nytimes_states")

nytimes_states %>%
  head() %>%
  knitr::kable(format = "html") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive"))
date | location | location_type | location_code | location_code_type | data_type | value |
---|---|---|---|---|---|---|
2020-06-17 | Alabama | state | 01 | fips_code | cases_total | 27312 |
2020-06-17 | Alabama | state | 01 | fips_code | deaths_total | 790 |
2020-06-17 | Alaska | state | 02 | fips_code | cases_total | 776 |
2020-06-17 | Alaska | state | 02 | fips_code | deaths_total | 10 |
2020-06-17 | Arizona | state | 04 | fips_code | cases_total | 41159 |
2020-06-17 | Arizona | state | 04 | fips_code | deaths_total | 1252 |
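The downloaded data is in long format, with one row per date, location, and data_type. Some analyses are easier with one column per data type. The following is a minimal sketch of that reshaping, using only base R and a hand-built data frame containing the six rows shown above rather than a live download. The name `nytimes_head` is chosen here for illustration.

```r
# Rows copied from the head() output above (a subset of the columns).
nytimes_head <- data.frame(
  date = as.Date("2020-06-17"),
  location = rep(c("Alabama", "Alaska", "Arizona"), each = 2),
  data_type = rep(c("cases_total", "deaths_total"), times = 3),
  value = c(27312, 790, 776, 10, 41159, 1252),
  stringsAsFactors = FALSE
)

# Reshape to one row per date and location, one column per data_type.
wide <- reshape(
  nytimes_head,
  idvar = c("date", "location"),
  timevar = "data_type",
  direction = "wide"
)

# reshape() prefixes the new columns with "value."; strip that prefix.
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

If you already work with the tidyverse, tidyr::pivot_wider() is a common alternative for the same step.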
2.3 Examining controlled vocabulary with get_covid19_controlled_vocab()
In our data standard, we have three types of controlled vocabulary: location_type, location_code_type, and data_type. To see the current controlled vocabulary (both to understand the fields and to see whether we should add more terms), use get_covid19_controlled_vocab() as below:
get_covid19_controlled_vocab("location_type") %>%
  knitr::kable(format = "html") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive"))
location_type | description |
---|---|
continent | continental scale |
country | a country with sovereign borders |
state | a spatial area inside that country such as a state, province, canton, etc. |
county | a spatial area demarcated within a state |
city | a single municipality - the smallest spatial grain of government in a country |
canton | the cantons of Switzerland and Principality of Liechtenstein (FL) |
get_covid19_controlled_vocab("location_code_type") %>%
  knitr::kable(format = "html") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive"))
location_code_type | description | URL |
---|---|---|
fips_code | The federal code in US designating regions of governance | https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt |
iso_3166 | ISO 3166 is a standard published by the International Organization for Standardization (ISO) that defines codes for the names of countries, dependent territories, special areas of geographical interest, and their principal subdivisions (e.g., provinces or states). | https://en.wikipedia.org/wiki/ISO_3166 |
iso_3166_2 | ISO 3166-2 is part of the ISO 3166 standard published by the International Organization for Standardization (ISO), and defines codes for identifying the principal subdivisions (e.g., provinces or states) of all countries coded in ISO 3166-1. | https://en.wikipedia.org/wiki/ISO_3166-2 |
un_locode | The United Nations Code for Trade and Transport Locations. Covers cities, airports, train stations and other smaller areas of transportation significance. | https://www.unece.org/cefact/locode/welcome.html |
get_covid19_controlled_vocab("data_type") %>%
  knitr::kable(format = "html") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive"))
data_type | description |
---|---|
cases_new | new confirmed Covid-19 cases on the current date |
cases_total | cumulative confirmed Covid-19 cases as of the current date |
recovered_total | cumulative number of patients recovered as of the current date |
recovered_new | new number of patients recovered on the current date |
deaths_new | new deaths on the current date |
deaths_total | cumulative deaths due to Covid-19 as of the current date |
tested_total | cumulative number of tests performed as of the date |
hosp_new | new hospitalizations on the current date |
hosp_current | current number of hospitalized patients as of the current date |
icu_current | number of hospitalized patients in ICUs as of the current date |
vent_current | number of hospitalized patients requiring ventilation as of the current date |
driving_req_rel_volume | relative volume of driving direction requests from a platform such as Apple Maps |
walking_req_rel_volume | relative volume of walking direction requests from a platform such as Apple Maps |
transit_req_rel_volume | relative volume of transit direction requests from a platform such as Apple Maps |
retail_and_recreation_perc_ch | percent change in visits to retail and recreation within a given area from a geotracking source such as Google |
grocery_and_pharmacy_perc_ch | percent change in visits to groceries and pharmacies within a given area from a geotracking source such as Google |
parks_perc_ch | percent change in visits to parks within a given area from a geotracking source such as Google |
transit_stations_perc_ch | percent change in visits to transit stations within a given area from a geotracking source such as Google |
workplaces_perc_ch | percent change in visits to places of work within a given area from a geotracking source such as Google |
residential_perc_ch | percent change in visits to residential locations within a given area from a geotracking source such as Google |
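One use of the controlled vocabulary is to check which of a dataset's data types already follow the standard. The sketch below is illustrative only: `vocab` is a partial list copied by hand from the table above, `non_standard_types()` is a hypothetical helper rather than a covid19R function, and the input strings are the data_types entries for VirginiaC19 and covid19france from the info table in section 2.1.

```r
# Terms copied from the data_type vocabulary table above (partial list).
vocab <- c("cases_new", "cases_total", "recovered_total", "recovered_new",
           "deaths_new", "deaths_total", "tested_total", "hosp_new",
           "hosp_current", "icu_current", "vent_current")

# Hypothetical helper: split a comma-separated data_types string and
# return the entries that are not in the vocabulary.
non_standard_types <- function(data_types, vocab) {
  types <- trimws(strsplit(data_types, ",", fixed = TRUE)[[1]])
  setdiff(types, vocab)
}

virginia <- non_standard_types("cases_new,deaths_new,hosp_new", vocab)
france <- non_standard_types(
  "confirmed, dead, icu, hospitalized, recovered, discovered", vocab)

print(virginia)  # empty: every type is in the vocabulary
print(france)    # all six types fall outside the vocabulary
```

With a live connection, the hand-built `vocab` vector could presumably be replaced by the data_type column returned by get_covid19_controlled_vocab("data_type").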