Chapter 10 Standardized Package Functions

All packages that are part of the Covid19R project have two functions at their core. These functions are required to hook a package up to the Covid19R data aggregator. They also mean that users who work with an individual package directly, rather than pulling data through the covid19R package, can quickly find their way around your data package.

10.1 Getting Information

The get_info_*() function returns all of the salient info about each data set in a package. The function name ends in the name of the package, e.g., get_info_covid19nytimes() for the covid19nytimes package. Each package has only one get_info_*() function. It returns the following fields:

data_set_name - The name of the data set.
package_name - The name of the package the data set lives in.
function_to_get_data - The function in the package that is used to get the data.
data_details - A detailed description of the data set.
data_url - A URL for where the data comes from.
license_url - A URL for the license for usage of the data. PLEASE READ THIS.
data_types - What kinds of data are here? Cases, deaths, hospital beds, etc.
location_types - What types of locations are represented in the data, e.g., States, Countries, etc.
spatial_extent - How large of an area does the data set cover? A single country? A continent? The globe?
has_geospatial_info - Does the data set have explicit geospatial information (e.g., latitude and longitude) such that it can be easily converted into an sf object or otherwise?

For example, for the covid19us package:

data_set_name - covid19us
package_name - covid19us
function_to_get_data - refresh_covid19us
data_details - Open Source data from COVID Tracking Project on the distribution of Covid-19 cases and deaths in the US. For more, see https://github.com/opencovid19-fr/data.
data_url - https://covidtracking.com/api
license_url - https://github.com/aedobbyn/covid19us/blob/master/LICENSE.md
data_types - positive, negative, pending, hospitalized_currently, hospitalized_cumulative, in_icu_currently, in_icu_cumulative, on_ventilator_currently, on_ventilator_cumulative, recovered, death, hospitalized, total_tests_viral, positive_tests_viral, negative_tests_viral, positive_cases_viral, positive_increase, negative_increase, total, total_test_results, total_test_results_increase, death_increase, hospitalized_increase, negative_regular_score, negative_score, positive_score
location_types - state
spatial_extent - country
has_geospatial_info - FALSE
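To make the field list concrete, here is a minimal sketch of what a get_info_*() function could look like in base R. The package name covid19example, the URLs, and every field value are hypothetical placeholders, and returning a one-row data frame is an assumption for illustration, not necessarily how the existing Covid19R packages implement it.

```r
# Hypothetical package "covid19example"; all values below are placeholders.
get_info_covid19example <- function() {
  data.frame(
    data_set_name        = "covid19example",
    package_name         = "covid19example",
    function_to_get_data = "refresh_covid19example",
    data_details         = "Placeholder description of the data set.",
    data_url             = "https://example.org/data",
    license_url          = "https://example.org/license",
    data_types           = "cases_total, deaths_total",
    location_types       = "state",
    spatial_extent       = "country",
    has_geospatial_info  = FALSE,
    stringsAsFactors     = FALSE
  )
}

# One row per data set, one column per field listed above
info <- get_info_covid19example()
names(info)
```

A package with several data sets would return one row per data set from the same function, each row naming its own refresh function.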

10.2 Obtaining fresh data

Each data set has its own refresh_*() function. If the data package supplies only one data set, the function is named refresh_PACKAGENAME(), for example refresh_covid19france(). If there are multiple data sets in a package, each function is named refresh_PACKAGENAME_MOREINFO(), where PACKAGENAME_MOREINFO is the full name of a data set. For example, refresh_covid19nytimes_states() and refresh_covid19nytimes_counties() return two different data sets from the covid19nytimes data package. The returned data looks like this, for example:

date        location  location_type  location_code  location_code_type  data_type     value
2020-06-19  Alabama   state          01             fips_code           cases_total   29002
2020-06-19  Alabama   state          01             fips_code           deaths_total  822
2020-06-19  Alaska    state          02             fips_code           cases_total   821
2020-06-19  Alaska    state          02             fips_code           deaths_total  10
2020-06-19  Arizona   state          04             fips_code           cases_total   46914
2020-06-19  Arizona   state          04             fips_code           deaths_total  1322
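A refresh_*() function that produces rows shaped like those above might be sketched as follows. This is an illustration under stated assumptions: the package name covid19example is hypothetical, the raw wide table is an in-memory stand-in for whatever a real package would download, and the reshaping uses base R only. The numbers are copied from the Alabama and Alaska rows above.

```r
# Hypothetical refresh function for a hypothetical package "covid19example".
refresh_covid19example <- function() {
  # Stand-in for the raw source data: one row per state, one column per measure
  raw <- data.frame(
    date   = as.Date(c("2020-06-19", "2020-06-19")),
    state  = c("Alabama", "Alaska"),
    fips   = c("01", "02"),
    cases  = c(29002, 821),
    deaths = c(822, 10),
    stringsAsFactors = FALSE
  )

  # Map each raw column to the data_type label used in the long format
  measures <- c(cases = "cases_total", deaths = "deaths_total")

  # Build one long block per measure, then stack the blocks
  long <- do.call(rbind, lapply(names(measures), function(col) {
    data.frame(
      date               = raw$date,
      location           = raw$state,
      location_type      = "state",
      location_code      = raw$fips,
      location_code_type = "fips_code",
      data_type          = measures[[col]],
      value              = raw[[col]],
      stringsAsFactors   = FALSE
    )
  }))

  # Order rows by location, then data type
  long <- long[order(long$location, long$data_type), ]
  rownames(long) <- NULL
  long
}

fresh <- refresh_covid19example()
fresh
```

Keeping one measurement per row means a new measure becomes a new data_type value rather than a new column, so data sets with different measures can share the same seven columns.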