Compile a specialized data frame based on the `rvtable` package

Compile a specialized data frame based on the rvtable package using distribution data frames of SNAP downscaled climate data.

dist_data(data, variable, margin = NULL, seed = NULL, metric = NULL,
  year_range, rcp_min_yr, base_max_yr, all_models, baseline_model = NULL,
  composite = "Composite GCM", baseline_scenario = "Historical",
  general_scenario = "Projected", margin_drop = c(baseline_scenario,
  baseline_model), density_size = 200, margin_size = 100,
  sample_size = margin_size, limit_sample = TRUE, baseline_only = FALSE,
  progress = TRUE)

Arguments

data	a data frame. It does not need to be an `rvtable`-class data frame in advance, but it must be coercible to one.
variable	character, a valid random variable. See details for currently available options.
margin	variable to marginalize over. Defaults to `NULL`.
seed	numeric or `NULL` (default), set random seed for reproducible sampling in app.
metric	`NULL` or logical. Output data in metric units, otherwise in US Standard. Input data in `data` is assumed metric. If `NULL` (default), no conversion or climate variable-specific rounding is performed.
year_range	full range of years in data set.
rcp_min_yr	minimum year for RCP, e.g., for CMIP5 data this is 2006.
base_max_yr	maximum year for the baseline historical comparison data set that sometimes accompanies GCM data, e.g., for CRU 4.0 observation-based data this is 2015.
all_models	character, vector of climate model names in data set, including the baseline model if present.
baseline_model	character, name of baseline model in data set, e.g., `"CRU 4.0"`.
composite	character, name to use for composite climate models after marginalizing over models.
baseline_scenario	character, defaults to `"Historical"`.
general_scenario	character, defaults to `"Projected"`.
margin_drop	levels of variables to exclude from marginalizing operations on those variables. Defaults to the baseline scenario and baseline model.
density_size	numeric, sample size for density estimations. Defaults to `200`.
margin_size	numeric, sample size for marginalizing operations. Defaults to `100`.
sample_size	numeric, sample size for density estimations. Defaults to `margin_size`.
limit_sample	logical, see details.
baseline_only	logical, process only the baseline data set. Useful for climatology data.
progress	logical, include progress bar in app.

Value

a specialized data frame

Details

This is a specialized function suited to preparing reactive data frames for an app where the upstream source data is an rvtable-class probability density data frame from the rvtable package. Many such data frames of SNAP data are available.

This function assumes the presence of certain data frame columns: Val, Prob, Var, RCP, Model, and Year. It will insert a Decade column. It will check to ensure a valid Var column, meaning a data frame can contain only one unique variable in its Var ID column and it must currently be one of "pr", "tas", "tasmin", "tasmax". This is because the current implementation makes certain assumptions about the data based on presently existing realistic use cases.
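The column and variable checks described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `check_dist_input` is a hypothetical helper, and labelling a decade by its first year (2013 becomes 2010) is an assumption about how the Decade column is built.

```r
# Hypothetical helper, not part of the package: validate input and add a Decade column.
check_dist_input <- function(data, valid_vars = c("pr", "tas", "tasmin", "tasmax")) {
  required <- c("Val", "Prob", "Var", "RCP", "Model", "Year")
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  # The Var column must hold exactly one unique, supported variable.
  vars <- unique(as.character(data$Var))
  if (length(vars) != 1 || !vars %in% valid_vars) {
    stop("`Var` must contain exactly one of: ", paste(valid_vars, collapse = ", "))
  }
  # Assumption: a decade is labelled by its first year, e.g. 2013 -> 2010.
  data$Decade <- data$Year - data$Year %% 10
  data
}

# Toy data with made-up values and a placeholder model name.
toy <- data.frame(Val = c(1.2, 1.5), Prob = c(0.4, 0.6), Var = "tas",
                  RCP = "RCP 6.0", Model = "Model A", Year = c(2013L, 2029L))
out <- check_dist_input(toy)
```

Here `out` carries the original columns plus `Decade`, and a data frame lacking any required column stops with an error before any further work is done.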

A powerful feature of this function, given an appropriate rvtable data frame, is the ability to marginalize over categorical variables (and meaningfully discrete numeric variables such as year) using the margin argument. The current implementation allows marginalizing over RCPs and/or climate models.

Arguments such as variable and year_range could be determined internally from data, but in the app context these values are already known in the session environment, so there is no need to rescan large data frame columns with every call to dist_data.

Note that during marginalizing operations, baseline historical data sets are not integrated with climate models when integrating over models, and historical climate model years are not integrated with future projections when integrating over RCPs. All categorical variables are factors with explicit levels, not character.
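As a rough illustration of marginalizing over climate models while leaving the baseline untouched, the sketch below pools sampled values across models by relabelling them with the composite name. It is a simplification under stated assumptions: `marginalize_models` is a hypothetical helper, it operates on sampled values rather than on rvtable densities, it weights every sampled value equally, and the model names are placeholders.

```r
# Hypothetical sketch: pool samples across models, leaving baseline rows untouched.
marginalize_models <- function(samples, composite = "Composite GCM",
                               margin_drop = "CRU 4.0") {
  samples$Model <- as.character(samples$Model)
  is_dropped <- samples$Model %in% margin_drop
  kept <- samples[!is_dropped, , drop = FALSE]
  # Relabelling every GCM with one name pools their samples within each RCP and
  # year (assumption: equal weight per sampled value).
  kept$Model <- composite
  out <- rbind(samples[is_dropped, , drop = FALSE], kept)
  # Restore a factor with explicit levels, baseline first.
  out$Model <- factor(out$Model,
                      levels = c(intersect(margin_drop, out$Model), composite))
  out
}

# Toy samples with made-up values.
toy <- data.frame(
  Val = c(-2.1, -1.4, -0.9, -1.8),
  RCP = c("Historical", "RCP 6.0", "RCP 6.0", "RCP 6.0"),
  Model = c("CRU 4.0", "Model A", "Model B", "Model A"),
  Year = c(2010, 2010, 2010, 2011)
)
pooled <- marginalize_models(toy)
```

The baseline row keeps its own model label, while the three GCM rows now share the composite label and can be summarized together by RCP and year.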

If `limit_sample = TRUE` (default), the final sample size is reduced by a factor proportional to the number of unique RCP-GCM pairs. This helps prevent massive in-app samples when users select large amounts of data from many RCPs and models. A minimum sample size per group is still maintained regardless of how much data is requested. Detailed progress is provided for sampling from distributions and for calculating marginal distributions.
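The effect of limiting the sample can be sketched as follows. The exact reduction rule is not stated here, so this is an assumption-laden illustration: `limited_sample_size` is a hypothetical helper, dividing by the number of RCP-GCM pairs is one possible "proportional" rule, and the minimum of 10 is an arbitrary placeholder.

```r
# Hypothetical sketch of the idea; the actual reduction rule in dist_data may differ.
limited_sample_size <- function(sample_size, n_pairs, min_size = 10) {
  # Assumption: shrink the per-group sample in proportion to the number of
  # unique RCP-GCM pairs, but never go below a minimum size.
  max(min_size, ceiling(sample_size / max(1, n_pairs)))
}

one_pair   <- limited_sample_size(100, n_pairs = 1)   # no reduction
four_pairs <- limited_sample_size(100, n_pairs = 4)   # reduced to a quarter
many_pairs <- limited_sample_size(100, n_pairs = 25)  # held at the minimum
```

With these placeholder settings a single pair keeps all 100 draws, four pairs get 25 each, and 25 pairs are held at the floor of 10 rather than dropping to 4.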

Examples

#not run
