Chapter 10 Standardized Package Functions
Every package in the Covid19R project has two functions at its core. These functions are what hook the package up to the Covid19R data aggregator. They also mean that users who work with an individual package, rather than pulling data through the covid19R package, will know their way around your data package quickly.
10.1 Getting Information
The get_info_*() function returns all of the salient information about each data set in a package. Its name ends in the name of the package, e.g., get_info_covid19nytimes() for the covid19nytimes package. Each package has only one get_info_*() function. It returns the following fields:
data_set_name - The name of the data set.
package_name - The name of the package the data set lives in.
function_to_get_data - The function in the package that is used to get the data.
data_details - A detailed description of the data set.
data_url - A URL for where the data comes from.
license_url - A URL for the license for usage of the data. PLEASE READ THIS.
data_types - What kinds of data are included: cases, deaths, hospital beds, etc.
location_types - What types of locations are represented in the data, e.g., states or countries.
spatial_extent - How large an area does the data set cover? A single country? A continent? The globe?
has_geospatial_info - Does the data set have explicit geospatial information (e.g., latitude and longitude) so that it can easily be converted into an sf object or similar?
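For example:
library(magrittr)  # provides the %>% pipe used below

# Fetch the metadata for the covid19us package and render it as an HTML table
covid19us::get_info_covid19us() %>%
  knitr::kable(format = "html") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive"))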
data_set_name | package_name | function_to_get_data | data_details | data_url | license_url | data_types | location_types | spatial_extent | has_geospatial_info |
---|---|---|---|---|---|---|---|---|---|
covid19us | covid19us | refresh_covid19us | Open Source data from COVID Tracking Project on the distribution of Covid-19 cases and deaths in the US. For more, see https://github.com/opencovid19-fr/data. | https://covidtracking.com/api | https://github.com/aedobbyn/covid19us/blob/master/LICENSE.md | positive, negative, pending, hospitalized_currently, hospitalized_cumulative, in_icu_currently, in_icu_cumulative, on_ventilator_currently, on_ventilator_cumulative, recovered, death, hospitalized, total_tests_viral, positive_tests_viral, negative_tests_viral, positive_cases_viral, positive_increase, negative_increase, total, total_test_results, total_test_results_increase, death_increase, hospitalized_increase, negative_regular_score, negative_score, positive_score | state | country | FALSE |
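For package authors, a get_info_*() function can be as simple as a function that returns a table with one row per data set and the fields listed above. The following is a minimal sketch, not the implementation used by any Covid19R package: the package name covid19example, its URLs, and all field values are hypothetical placeholders, and base R's data.frame() is used only to keep the example free of dependencies.

```r
# Hypothetical get_info_*() for an imaginary package called "covid19example".
# Every value below is a placeholder for illustration only.
get_info_covid19example <- function() {
  data.frame(
    data_set_name = "covid19example",
    package_name = "covid19example",
    function_to_get_data = "refresh_covid19example",
    data_details = "Placeholder description of the data set.",
    data_url = "https://example.com/data",
    license_url = "https://example.com/license",
    data_types = "cases_total, deaths_total",
    location_types = "state",
    spatial_extent = "country",
    has_geospatial_info = FALSE,
    stringsAsFactors = FALSE
  )
}

info <- get_info_covid19example()

# Check that all ten standard fields are present, in the documented order.
expected_fields <- c(
  "data_set_name", "package_name", "function_to_get_data", "data_details",
  "data_url", "license_url", "data_types", "location_types",
  "spatial_extent", "has_geospatial_info"
)
stopifnot(identical(names(info), expected_fields))
```

A package with several data sets would return one such row per data set, each naming its own refresh function in function_to_get_data.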
10.2 Obtaining Fresh Data
Each data set has its own refresh_*() function. If the data package supplies only one data set, the function is named refresh_PACKAGENAME(), for example refresh_covid19france(). If a package contains multiple data sets, each function is named refresh_PACKAGENAME_MOREINFO(), where PACKAGENAME_MOREINFO is the full name of the data set. For example, refresh_covid19nytimes_states() and refresh_covid19nytimes_counties() return two different data sets from the covid19nytimes data package.
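library(magrittr)  # provides the %>% pipe used below

# Fetch the state-level data and show the first six rows as an HTML table
covid19nytimes::refresh_covid19nytimes_states() %>%
  head() %>%
  knitr::kable(format = "html") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed", "responsive"))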
date | location | location_type | location_code | location_code_type | data_type | value |
---|---|---|---|---|---|---|
2020-06-19 | Alabama | state | 01 | fips_code | cases_total | 29002 |
2020-06-19 | Alabama | state | 01 | fips_code | deaths_total | 822 |
2020-06-19 | Alaska | state | 02 | fips_code | cases_total | 821 |
2020-06-19 | Alaska | state | 02 | fips_code | deaths_total | 10 |
2020-06-19 | Arizona | state | 04 | fips_code | cases_total | 46914 |
2020-06-19 | Arizona | state | 04 | fips_code | deaths_total | 1322 |
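The output above is in long format: each row holds one value for one date, location, and data_type. As a rough sketch of how a refresh_*() function might produce that shape, the example below reshapes a small wide table into the seven columns shown. The function name refresh_covid19example(), the hard-coded input (copied from the Alabama and Alaska rows of the output above), and the use of base R rather than a tidyverse reshaping function are all assumptions made for illustration; a real refresh_*() function would download its data from the source.

```r
# Hypothetical refresh_*() for an imaginary package called "covid19example".
refresh_covid19example <- function() {
  # Stand-in for downloaded source data, in wide format
  # (values copied from the example output above).
  wide <- data.frame(
    date = as.Date(c("2020-06-19", "2020-06-19")),
    location = c("Alabama", "Alaska"),
    fips = c("01", "02"),
    cases = c(29002, 821),
    deaths = c(822, 10),
    stringsAsFactors = FALSE
  )

  # Build one block of rows per measure in the standardized columns.
  to_long <- function(column, data_type) {
    data.frame(
      date = wide$date,
      location = wide$location,
      location_type = "state",
      location_code = wide$fips,
      location_code_type = "fips_code",
      data_type = data_type,
      value = wide[[column]],
      stringsAsFactors = FALSE
    )
  }
  long <- rbind(
    to_long("cases", "cases_total"),
    to_long("deaths", "deaths_total")
  )

  # Order rows by location, then data type, to match the layout above.
  long <- long[order(long$location, long$data_type), ]
  rownames(long) <- NULL
  long
}

fresh <- refresh_covid19example()
print(fresh)
```

Keeping every data set in the same seven columns is what lets data from different packages be stacked together without further reshaping.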