| Title: | Toolkit for Data Processing, Quality, and Statistical Models |
|---|---|
| Description: | Offers tools for data formatting, anomaly detection, and classification of tree-ring data using spatial comparisons and cross-correlation. Supports flexible detrending and climate–growth modeling via generalized additive mixed models (Wood 2017, ISBN:978-1498728331) and the 'mgcv' package (<https://CRAN.R-project.org/package=mgcv>), enabling robust analysis of non-linear trends and autocorrelated data. Provides standardized visual reporting, including summaries, diagnostics, and model performance. Compatible with '.rwl' files and tailored for the Canadian Forest Service Tree-Ring Data (CFS-TRenD) repository (Girardin et al. (2021) <doi:10.1139/er-2020-0099>), offering a comprehensive and adaptable framework for dendrochronologists working with large and complex datasets. |
| Authors: | Xiao Jing Guo [aut, cre], Martin Girardin [aut], Juha Metsaranta [aut], David Gervais [aut], Elizabeth Campbell [aut] |
| Maintainer: | Xiao Jing Guo <[email protected]> |
| License: | GPL-3 |
| Version: | 0.3.0.9000 |
| Built: | 2026-05-31 13:29:51 UTC |
| Source: | https://github.com/cfs-treering/growthtrendr |
To address the computational limitations of GAMMs for large datasets, this function offers a hybrid solution combining the efficiency of the mgcv::bam() function
bam_spatial(data, resp_scale = "resp_gamma", m.candidates, sp.option = "AICc")bam_spatial(data, resp_scale = "resp_gamma", m.candidates, sp.option = "AICc")
data |
data containing all necessary columns to run the model |
resp_scale |
Character. Specifies how the response variable is treated in the model.
|
m.candidates |
the list of candidate equations. |
sp.option |
Character. Spatial autocorrelation selection method.
|
This function accounts for within-site variability and temporal autocorrelation by including series identity as random effects and a first-order autoregressive (AR1) correlation structures, respectively. Among-site variability and spatial effects are captured by incorporating site identity as random effects. The model is refitted automatically by introducing a smooth term for latitude and longitude using the thin plate ("tp") basis if significant spatial autocorrelation persists. “Normalized” residuals are provided for future analysis.
This function supports parallel computation for the large-scale, geographically distributed datasets.
If users specify multiple candidate models through the m.candidates argument, the function will fit each candidate model using the maximum likelihood (ML) method. The corrected Akaike Information Criterion (AICc) will then be compared to determine the best-fitting model. Once the optimal model is identified, it will be refitted using the restricted maximum likelihood (REML) method and output the results.
If users specify only 1 candidate model through the m.candidates argument, the model is fitted with "REML" method.
list including model, fitting statistics, ptable, stable and prediction table
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # bam_spatial model m.sp <-bam_spatial(data = dt.m, resp_scale = "resp_log", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"))# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # bam_spatial model m.sp <-bam_spatial(data = dt.m, resp_scale = "resp_log", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"))
calculate basal area (cm2) and basal area increment (cm2)
calc_bai(data)calc_bai(data)
data |
data in long format containing at least 3 columns: id, year, rw |
add 3 columns to the original input data, ageC for cambial age, ba_cm2_t_1 for basal area of the previous year in cm2, and bai_cm2 for annual basal area increment in cm2
# generate data dt.rw <- data.table::data.table( uid_radius = rep(paste0("R", 1:3), each = 5), year = rep(2001:2005, times = 3), rw_mm = round(runif(15, 0.5, 3.5),2) ) data.table::setorder(dt.rw, uid_radius, year) # calculate bai dt.rw <- calc_bai(dt.rw)# generate data dt.rw <- data.table::data.table( uid_radius = rep(paste0("R", 1:3), each = 5), year = rep(2001:2005, times = 3), rw_mm = round(runif(15, 0.5, 3.5),2) ) data.table::setorder(dt.rw, uid_radius, year) # calculate bai dt.rw <- calc_bai(dt.rw)
converts tree-ring data from various formats into a format compatible with hierarchical structure of the CFS-TRenD data collection (Girardin et al., 2021).
CFS_format(data, usage, out.csv = NULL)CFS_format(data, usage, out.csv = NULL)
data |
a list, first is input data in wide format; second is a flat sequence referring to the column indices of ring measurement variables |
usage |
1: for submission data, 2: for cfstrend id structure to perform the analyse |
out.csv |
output csv file (default is NULL, otherwise to specify the directory to output) |
A list of 3 elements: 1) A list containing seven tables compatible with CFS-TRenD data structure; 2) A data table containing all the meta-data and ring width measurement in wide format; 3) A data table for the percentage of completeness of each variable.
Girardin, M.P., Guo, X.J., Metsaranta, J., Gervais, D., Campbell, E., Arsenault, A., Isaac-Rentone, M., Harvey, J.E., Bhatti, J., Hogg, E.A. 2021. A national tree-ring repository for Canadian forests (CFS-TRenD): structure, synthesis and applications. Environmental Reviews, 29 (999), 1-17. https://doi.org/10.1139/er-2020-0099
# ring measurement dt.samples <- data.table::fread( system.file("extdata", "dt.samples.csv", package = "growthTrendR")) # formatting the users' data conformed to CFS-TRenD data structure dt.samples_trt <- CFS_format(data = list(dt.samples, 39:68), usage = 1, out.csv = NULL) # save it to extdata for further use # saveRDS(dt.samples_trt, file = "inst/extdata/dt.samples_trt.rds")# ring measurement dt.samples <- data.table::fread( system.file("extdata", "dt.samples.csv", package = "growthTrendR")) # formatting the users' data conformed to CFS-TRenD data structure dt.samples_trt <- CFS_format(data = list(dt.samples, 39:68), usage = 1, out.csv = NULL) # save it to extdata for further use # saveRDS(dt.samples_trt, file = "inst/extdata/dt.samples_trt.rds")
frequency distributions by geo-location per species
CFS_freq( data, freq.label_data = "", freq.uid_level = "uid_radius", freq.cutoff_year = -999, freq.geo_resolution = NULL )CFS_freq( data, freq.label_data = "", freq.uid_level = "uid_radius", freq.cutoff_year = -999, freq.geo_resolution = NULL )
data |
meta table from function CFS_format() |
freq.label_data |
description of data |
freq.uid_level |
which uid level to count(uid_project, uid_site, uid_tree, uid_meas, uid_sample, uid_radius) |
freq.cutoff_year |
cut-off year for a subset which series were recorded on or after |
freq.geo_resolution |
Numeric vector (lon, lat) giving spatial resolution in degrees (e.g., c(5, 3)). If NULL, each value defaults to one quarter of the corresponding coordinate range. |
a data table of counts of uid by latitude-longitude per species
# treated ring measurement dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # Compute frequency statistics at radius level dt.freq <- CFS_freq( dt.samples_trt$tr_all_wide, freq.label_data = "demo-samples", freq.uid_level = "uid_radius" ) class(dt.freq)# treated ring measurement dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # Compute frequency statistics at radius level dt.freq <- CFS_freq( dt.samples_trt$tr_all_wide, freq.label_data = "demo-samples", freq.uid_level = "uid_radius" ) class(dt.freq)
This function performs inverse distance weighting (IDW) interpolation of tree-ring data across a spatial grid, either for all species combined or by individual species. It generates yearly interpolated raster maps over a user-defined extent or the extent of the input data.
CFS_mapping( data, year.span = c(1801, 2017), extent.lim = NULL, grid.step = 0.1, by.spc = FALSE )CFS_mapping( data, year.span = c(1801, 2017), extent.lim = NULL, grid.step = 0.1, by.spc = FALSE )
data |
input in wide format. |
year.span |
Numeric vector of length 2 giving the range of years to include. |
extent.lim |
Optional numeric vector defining the spatial extent
( |
grid.step |
Numeric value specifying the grid spacing in degrees. |
by.spc |
Logical; if |
An object of class cfs_map, a list of interpolated raster layers
by species and year.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) cols.meta = c("uid_tree", "uid_site", "longitude", "latitude", "species") dt.mapping <- dt.samples_trt$tr_all_wide[ , c(..cols.meta, as.character(1991:1995)), with = FALSE] results_mapping <- CFS_mapping(dt.mapping, year.span = c(1991,1993))# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) cols.meta = c("uid_tree", "uid_site", "longitude", "latitude", "species") dt.mapping <- dt.samples_trt$tr_all_wide[ , c(..cols.meta, as.character(1991:1995)), with = FALSE] results_mapping <- CFS_mapping(dt.mapping, year.span = c(1991,1993))
Performs comprehensive quality assurance analysis for tree-ring crossdating using pairwise cross-correlation functions and iterative chronologies refinement with automatic memory-efficient batch processing.
CFS_qa( dt.input, qa.label_data = "", qa.label_trt = "", qa.max_lag = 10, qa.max_iter = 100, qa.min_nseries = 100, qa.blcrit = 0.1, qa.mem_target = 0.6 )CFS_qa( dt.input, qa.label_data = "", qa.label_trt = "", qa.max_lag = 10, qa.max_iter = 100, qa.min_nseries = 100, qa.blcrit = 0.1, qa.mem_target = 0.6 )
dt.input |
A data.table containing tree-ring measurements with required columns:
|
qa.label_data |
Character. Label identifier for the dataset (required) |
qa.label_trt |
Character. Label identifier for the treatment method (required) |
qa.max_lag |
Integer. Maximum lag for cross-correlation analysis (default: 10) |
qa.max_iter |
Integer. Maximum iterations for chronologies refinement (default: 100) |
qa.min_nseries |
Integer. Minimum number of series required (default: 100) |
qa.blcrit |
Numeric. Borderline threshold criterion for quasi-pass classification (default: 0.1) |
qa.mem_target |
Numeric. Proportion of free memory to use for batch processing (0-1, default: 0.6). Higher values use more memory but may be faster. |
The function performs a two-step quality assurance process:
Step 1: Pairwise Cross-Correlation
Computes CCF for all pairs of treated series
Uses auto-batching to manage memory efficiently
Identifies initial qualified samples with max CCF at lag 0
Step 2: Iterative chronologies Refinement
Computes chronologies from qualified samples
Evaluates each sample by running CCF with the chronologies
Iteratively refines the qualified samples until convergence
QA Code Classification:
pass: Maximum correlation at lag 0
borderline: Second highest correlation at lag 0 (within threshold)
pm1: Maximum correlation at lag +/- 1 (slight misalignment)
highpeak: Maximum at non-zero lag, >2x second highest
fail: All other cases
Auto-batching: The function automatically determines optimal batch size based on:
Available system memory
Number of CPU cores
Estimated memory per CCF operation
Safety margins to prevent out-of-memory errors
An object of class cfs_qa containing:
data.table with CCF results and QA codes (qa_code) for each series
data.table with chronologies statistics
data.table with summary statistics per radius
List of data.tables formatted for plotting (raw and treated series, CCF plots)
List of parameters used in the analysis
ccf for cross-correlation function
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # data processing dt.samples_long <- prepare_samples_clim(dt.samples_trt, calbai = FALSE) # rename to the reserved column name data.table::setnames(dt.samples_long, c("sample_id", "year", "rw_mm"), c("SampleID", "Year" ,"RawRing")) # assign treated series # users can decide their own treated series # for rhub::rhub_check() on macos VECTOR_ELT issues data.table::setorder(dt.samples_long, SampleID, Year) dt.samples_long$RW_trt <- ave( as.numeric(dt.samples_long$RawRing), dt.samples_long$SampleID, FUN = function(x) if (length(x) > 1L) c(NA_real_, diff(x)) else NA_real_ ) # quality check on radius alignment based on the treated series dt.qa <-CFS_qa(dt.input = dt.samples_long, qa.label_data = "demo-samples", qa.label_trt = "difference", qa.min_nseries = 5)# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # data processing dt.samples_long <- prepare_samples_clim(dt.samples_trt, calbai = FALSE) # rename to the reserved column name data.table::setnames(dt.samples_long, c("sample_id", "year", "rw_mm"), c("SampleID", "Year" ,"RawRing")) # assign treated series # users can decide their own treated series # for rhub::rhub_check() on macos VECTOR_ELT issues data.table::setorder(dt.samples_long, SampleID, Year) dt.samples_long$RW_trt <- ave( as.numeric(dt.samples_long$RawRing), dt.samples_long$SampleID, FUN = function(x) if (length(x) > 1L) c(NA_real_, diff(x)) else NA_real_ ) # quality check on radius alignment based on the treated series dt.qa <-CFS_qa(dt.input = dt.samples_long, qa.label_data = "demo-samples", qa.label_trt = "difference", qa.min_nseries = 5)
Apply a k-nearest neighbors (k-NN) method based on geographic location for the same species. It compares the median tree-ring measurements of a specific site to those of nearby sites
CFS_scale( target_site, ref_sites, scale.label_data_ref = "", scale.max_dist_km = 20, scale.N_nbs = 10 )CFS_scale( target_site, ref_sites, scale.label_data_ref = "", scale.max_dist_km = 20, scale.N_nbs = 10 )
target_site |
data.table with columns uid_site, site_id, species, longitude and latitude |
ref_sites |
reference sites including species, uid_site, latitude, longitude, uid_radius, year, rw_mm in long format |
scale.label_data_ref |
description of ref_sites |
scale.max_dist_km |
maximum distance to search the neighbors in km |
scale.N_nbs |
number of nearest-neighbors (maximum) |
A data table containing the median ring-width measurements of the involved sites, along with the distances from the specific site
# loading formatted dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) all.sites <- dt.samples_trt$tr_all_wide[,.N, by = c("species", "uid_site", "site_id")][, N:=NULL] dupes <- all.sites[, .N, by = .(species, site_id)][N > 1] # e.g. taking the target sites target_site <- all.sites[c(1,2), -"uid_site"] ref.sites <- merge( dt.samples_trt$tr_all_wide[,c("species", "uid_site", "site_id", "latitude","longitude", "uid_radius")], dt.samples_trt$tr_all_long$tr_7_ring_widths, by = c("uid_radius")) dt.scale <- CFS_scale( target_site = target_site, ref_sites = ref.sites, scale.label_data_ref = "demo-samples", scale.max_dist_km = 200, scale.N_nbs = 2)# loading formatted dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) all.sites <- dt.samples_trt$tr_all_wide[,.N, by = c("species", "uid_site", "site_id")][, N:=NULL] dupes <- all.sites[, .N, by = .(species, site_id)][N > 1] # e.g. taking the target sites target_site <- all.sites[c(1,2), -"uid_site"] ref.sites <- merge( dt.samples_trt$tr_all_wide[,c("species", "uid_site", "site_id", "latitude","longitude", "uid_radius")], dt.samples_trt$tr_all_long$tr_7_ring_widths, by = c("uid_radius")) dt.scale <- CFS_scale( target_site = target_site, ref_sites = ref.sites, scale.label_data_ref = "demo-samples", scale.max_dist_km = 200, scale.N_nbs = 2)
This function computes predicted values and confidence intervals from a fitted GAM model
with a log-link (or other link) and optionally back-transforms predictions to the response scale.
Five methods are supported:
1. **delta_link**: classic delta method on the linear predictor (link) scale; back-transformed CI is asymmetric.
2. **delta_resp**: delta method applied directly on the response scale using
.
3. **bootstrap_link**: parametric bootstrap on the linear predictor; quantiles back-transformed.
4. **bootstrap_resp**: parametric bootstrap on the response scale; quantiles computed after exponentiating.
5. **posterior**: Bayesian posterior simulation using the model covariance matrix; quantiles on response scale.
ci_resp( model, newdata, nboot = 100, model.ci_method = c("posterior", "delta_link", "delta_resp", "bootstrap_link", "bootstrap_resp"), level = 0.95 )ci_resp( model, newdata, nboot = 100, model.ci_method = c("posterior", "delta_link", "delta_resp", "bootstrap_link", "bootstrap_resp"), level = 0.95 )
model |
A fitted GAM object from |
newdata |
A data.frame containing values at which to predict. |
nboot |
Integer. Number of bootstrap or posterior samples. Default 1000. |
model.ci_method |
Character. One of |
level |
Numeric. Confidence level, default 0.95. |
References: - Wood, S.N. (2017) *Generalized Additive Models: An Introduction with R, 2nd Edition*. CRC Press. - Efron, B., & Tibshirani, R. (1993) *An Introduction to the Bootstrap*. Chapman & Hall. - Gelman, A., et al. (2013) *Bayesian Data Analysis, 3rd Edition*. CRC Press.
A data.table with columns: fit, lwr, upr.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # using gamm_spatial model as an example m.sp <-gamm_spatial(data = dt.m, resp_scale = "resp_log", m.candidates = "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)") dt.m[, uid_site.fac:= as.factor(as.character(uid_site))] dt.ci <- ci_resp(m.sp$model$gam, newdata = dt.m)# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # using gamm_spatial model as an example m.sp <-gamm_spatial(data = dt.m, resp_scale = "resp_log", m.candidates = "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)") dt.m[, uid_site.fac:= as.factor(as.character(uid_site))] dt.ci <- ci_resp(m.sp$model$gam, newdata = dt.m)
generalized additive model that captures non-linear relationships between a response variable and predictors using smooth functions, while allowing inclusion of random effects or complex structures.
gam_mod(data, resp_scale = "", m.candidates, sp.option = "AICc")gam_mod(data, resp_scale = "", m.candidates, sp.option = "AICc")
data |
data containing all necessary columns to run the model |
resp_scale |
Character. Specifies how the response variable is treated in the model.
|
m.candidates |
the list of candidate equations. |
sp.option |
Character. Spatial autocorrelation selection method.
|
This function models the generalized additive model using mcgv::gam.
If users specify multiple candidate models through the m.candidates argument, the function will fit each candidate model using the maximum likelihood (ML) method. The corrected Akaike Information Criterion (AICc) will then be compared to determine the best-fitting model. Once the optimal model is identified, it will be refitted using the restricted maximum likelihood (REML) method and output the results.
If users specify only 1 candidate model through the m.candidates argument, the model is fitted with "REML" method.
list including model, fitting statistics, ptable, stable and prediction table
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # pre-data for model dt.samples_long <- prepare_samples_clim(dt.samples_trt) dt.samples_long$uid_site.fac <- as.factor(as.character(dt.samples_long$uid_site)) dt.m <- dt.samples_long # gam_mod m.gam <-gam_mod(data = dt.m, resp_scale = "resp_log", m.candidates = c( "rw_mm ~ s(year, by = uid_site.fac)", "rw_mm ~ year:uid_site.fac"))# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # pre-data for model dt.samples_long <- prepare_samples_clim(dt.samples_trt) dt.samples_long$uid_site.fac <- as.factor(as.character(dt.samples_long$uid_site)) dt.m <- dt.samples_long # gam_mod m.gam <-gam_mod(data = dt.m, resp_scale = "resp_log", m.candidates = c( "rw_mm ~ s(year, by = uid_site.fac)", "rw_mm ~ year:uid_site.fac"))
models the biological growth trends in individual tree-ring width series using mgcv::gamm
gamm_radius(data, resp_scale = "resp_gamma", m.candidates)gamm_radius(data, resp_scale = "resp_gamma", m.candidates)
data |
data containing all necessary columns to run the model |
resp_scale |
Character. Specifies how the response variable is treated in the model.
|
m.candidates |
the list of candidate equations. |
This function models the biological growth trends in individual tree-ring width series using mcgv::gamm. By integrating a first-order autoregressive (AR1) component, it accounts for temporal autocorrelation. This method can provide “normalized” residuals, which are adjusted to reflect deviations after considering the AR1 correlation structure. 'Normalized' residuals are valuable for further analyses, such as investigating relationships with climatic variables. If users specify multiple candidate models through the m.candidates argument, the function will fit each candidate model using the maximum likelihood (ML) method. The corrected Akaike Information Criterion (AICc) will then be compared to determine the best-fitting model. Once the optimal model is identified, it will be refitted using the restricted maximum likelihood (REML) method and output the results.
If users specify only 1 candidate model through the m.candidates argument, the model is fitted with "REML" method.
list including model, fitting statistics, ptable, stable and prediction table
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # pre-data for model dt.samples_long <- prepare_samples_clim(dt.samples_trt) dt.m <- dt.samples_long[uid_site == 1][ageC >1] # gamm_radius model m.radius <-gamm_radius(data = dt.m, resp_scale = "resp_log", m.candidates = c( "rw_mm ~ s(year)", "rw_mm ~ s(year)"))# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # pre-data for model dt.samples_long <- prepare_samples_clim(dt.samples_trt) dt.m <- dt.samples_long[uid_site == 1][ageC >1] # gamm_radius model m.radius <-gamm_radius(data = dt.m, resp_scale = "resp_log", m.candidates = c( "rw_mm ~ s(year)", "rw_mm ~ s(year)"))
models the growth trend or climate-growth relationship per site.
gamm_site(data, resp_scale = "resp_gamma", m.candidates)gamm_site(data, resp_scale = "resp_gamma", m.candidates)
data |
data containing all necessary columns to run the model |
resp_scale |
Character. Specifies how the response variable is treated in the model.
|
m.candidates |
the list of candidate equations. |
This function accounts for within-site variability and temporal autocorrelation by including series identity as random effects and a first-order autoregressive (AR1) correlation structures, respectively. using mgcv::gamm()
If users specify multiple candidate models through the m.candidates argument, the function will fit each candidate model using the maximum likelihood (ML) method. The corrected Akaike Information Criterion (AICc) will then be compared to determine the best-fitting model. Once the optimal model is identified, it will be refitted using the restricted maximum likelihood (REML) method and output the results.
If users specify only 1 candidate model through the m.candidates argument, the model is fitted with "REML" method.
list including model, fitting statistics, ptable, stable and prediction table
#' @examples # loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[uid_site == 1][ageC >1] # gamm_site model m.site <-gamm_site(data = dt.m, resp_scale = "resp_log", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"))#' @examples # loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[uid_site == 1][ageC >1] # gamm_site model m.site <-gamm_site(data = dt.m, resp_scale = "resp_log", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"))
models the growth trend or climate-growth relationship at regional-level with multiple sites
gamm_spatial(data, resp_scale = "resp_gamma", m.candidates, sp.option = "AICc")gamm_spatial(data, resp_scale = "resp_gamma", m.candidates, sp.option = "AICc")
data |
data containing all necessary columns to run the model |
resp_scale |
Character. Specifies how the response variable is treated in the model.
|
m.candidates |
the list of candidate equations. |
sp.option |
Character. Spatial autocorrelation selection method.
|
This function accounts for within-site variability and temporal autocorrelation by including series identity as random effects and a first-order autoregressive (AR1) correlation structures, respectively. Among-site variability and spatial effects are captured by incorporating site identity as random effects. The model is refitted automatically by introducing a smooth term for latitude and longitude using the thin plate ("tp") basis if significant spatial autocorrelation persists. “Normalized” residuals are provided for future analysis.
If users specify multiple candidate models through the m.candidates argument, the function will fit each candidate model using the maximum likelihood (ML) method. The corrected Akaike Information Criterion (AICc) will then be compared to determine the best-fitting model. Once the optimal model is identified, it will be refitted using the restricted maximum likelihood (REML) method and output the results.
If users specify only 1 candidate model through the m.candidates argument, the model is fitted with "REML" method.
list including model, fitting statistics, ptable, stable, prediction table and spatial effect(moranI)
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # gamm_spatial model m.sp <-gamm_spatial(data = dt.m, resp_scale = "resp_gamma", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"))# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # gamm_spatial model m.sp <-gamm_spatial(data = dt.m, resp_scale = "resp_gamma", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD", "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"))
Creates HTML reports from various analysis objects using predefined R Markdown templates. The function automatically selects the appropriate template based on the input object's class and renders a comprehensive report with visualizations and analysis results.
generate_report( robj, data_report.reports_sel = c(1, 2, 3, 4), output_file = NULL, ... )generate_report( robj, data_report.reports_sel = c(1, 2, 3, 4), output_file = NULL, ... )
robj |
An R object containing analysis results from functions in this package. The object's class determines which report template is used. Supported classes depend on available templates in the package. |
data_report.reports_sel |
Numeric vector. Specifies which sections of the data report to include in the output: 1 = project level; 2 = species level; 3 = site level; 4 = radius level. The default is 'c(1, 2, 3, 4)', which includes all sections. |
output_file |
Character string. Optional path and filename for the output HTML file. If NULL (default), the report is generated with an automatic filename and opened in RStudio viewer. |
... |
Additional parameters passed to the R Markdown template. Available parameters vary by template type and are filtered to only include those recognized by the selected template. |
Invisibly returns the file path of the generated report. The function is primarily called for its side effect of generating the report file.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # genereate data summary report at project level outfile_data <- tempfile(fileext = ".html") generate_report(robj = dt.samples_trt, data_report.reports_sel = c(1), output_file = outfile_data)# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # genereate data summary report at project level outfile_data <- tempfile(fileext = ".html") generate_report(robj = dt.samples_trt, data_report.reports_sel = c(1), output_file = outfile_data)
This function plots the frequency distribution by geo-location for each species
plot_freq(dt.freq, out.species = "all")plot_freq(dt.freq, out.species = "all")
dt.freq |
a table with class "cfs_freq", resulting from function CFS_freq() |
out.species |
species list, default is 'all' to output the frequency distribution for all species |
A list of ggplot objects corresponding to the species
requested via out.species. Each element of the list contains
a faceted spatial plot of tree locations, with point size proportional
to the number of samples.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) dt.freq <- CFS_freq( dt.samples_trt$tr_all_wide, freq.label_data = "demo-samples", freq.uid_level = "uid_radius") plots.lst <- plot_freq(dt.freq)# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) dt.freq <- CFS_freq( dt.samples_trt$tr_all_wide, freq.label_data = "demo-samples", freq.uid_level = "uid_radius") plots.lst <- plot_freq(dt.freq)
Visualizes spatial interpolation results produced by
CFS_mapping and optionally exports raster files,
static maps, and animations.
plot_mapping( mapping_results, crs.src = "EPSG:4326", parms.out = c("csv", "tif", "png", "gif"), dir.shp = NULL, dir.out = NULL, animation_fps = 1, ... )plot_mapping( mapping_results, crs.src = "EPSG:4326", parms.out = c("csv", "tif", "png", "gif"), dir.shp = NULL, dir.out = NULL, animation_fps = 1, ... )
mapping_results |
A list returned by |
crs.src |
Coordinate Reference System of the input data, specified as a string in the format 'EPSG:<ID>' (for example, 'EPSG:4326'). The function will stop with an error if 'crs.src' is not provided in this format. |
parms.out |
Character vector indicating output formats to generate.
Supported values are |
dir.shp |
Character or NULL. Path to the folder containing shapefiles for cropping data to the Canadian boreal regions. Only used for specific research purposes; if NULL (default), no cropping is applied and all data are included. |
dir.out |
Output directory used to save generated files.
Required when |
animation_fps |
Frames per second used when creating GIF animations. |
... |
Additional arguments passed to
|
This function assumes that spatial interpolation has already
been performed using CFS_mapping. The input
object is iterated by species and year to generate maps and
optional exports.
When "gif" is requested in parms.out, yearly
PNG images are combined into animated GIFs.
A magick-image object representing an animated GIF composed of the generated frames.
# Load processed demo data dt.samples_trt <- readRDS( system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR") ) # prepare data for IDW model cols.meta = c("uid_tree", "uid_site", "longitude", "latitude", "species") dt.mapping <- dt.samples_trt$tr_all_wide[ , c(..cols.meta, as.character(1991:1995)), with = FALSE] # Run spatial interpolation mapping_results <- CFS_mapping( dt.mapping, year.span = c(1991, 1993) ) # generate png plots img_ani <- plot_mapping( mapping_results = mapping_results, parms.out = NULL, dir.shp = NULL, dir.out = NULL, png.text = list( text_top = "Ring width measurement - ", text_bott = "Source: demo-samples", text_side = "ring width (mm)" ) )# Load processed demo data dt.samples_trt <- readRDS( system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR") ) # prepare data for IDW model cols.meta = c("uid_tree", "uid_site", "longitude", "latitude", "species") dt.mapping <- dt.samples_trt$tr_all_wide[ , c(..cols.meta, as.character(1991:1995)), with = FALSE] # Run spatial interpolation mapping_results <- CFS_mapping( dt.mapping, year.span = c(1991, 1993) ) # generate png plots img_ani <- plot_mapping( mapping_results = mapping_results, parms.out = NULL, dir.shp = NULL, dir.out = NULL, png.text = list( text_top = "Ring width measurement - ", text_bott = "Source: demo-samples", text_side = "ring width (mm)" ) )
Plotting the time series and cross-correlation to contrast each radius with the chronologies, using both raw ring-width measurements and treated series.
plot_qa(qa.trt, qa.out_series = "all")plot_qa(qa.trt, qa.out_series = "all")
qa.trt |
object "cfs_qa" from CFS_qa() function. |
qa.out_series |
series_id list to be plotted, default "all" for plotting the graphs for all. |
A named list of tree-ring series, where each element corresponds to
a series requested via qa.out_series. Each element is itself a list
containing four ggplot objects:
Raw tree-ring series vs year.
Treated (detrended/standardized) series vs year.
Cross-correlation function of the raw series.
Cross-correlation function of the treated series.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # data processing dt.samples_long <- prepare_samples_clim( dt.samples_trt = dt.samples_trt, calbai = FALSE ) # rename to the reserved column name data.table::setnames( dt.samples_long, c("sample_id", "year", "rw_mm"), c("SampleID", "Year" ,"RawRing")) # assign treated series # users can decide their own treated series # for rhub::rhub_check() on macos VECTOR_ELT issues data.table::setorder(dt.samples_long, SampleID, Year) dt.samples_long$RW_trt <- ave( as.numeric(dt.samples_long$RawRing), dt.samples_long$SampleID, FUN = function(x) if (length(x) > 1L) c(NA_real_, diff(x)) else NA_real_ ) # quality check on radius alignment based on the treated series dt.qa <-CFS_qa(dt.input = dt.samples_long, qa.label_data = "demo-samples", qa.label_trt = "difference", qa.min_nseries = 5) plots.lst <- plot_qa(dt.qa, qa.out_series = "X003_101_005")# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # data processing dt.samples_long <- prepare_samples_clim( dt.samples_trt = dt.samples_trt, calbai = FALSE ) # rename to the reserved column name data.table::setnames( dt.samples_long, c("sample_id", "year", "rw_mm"), c("SampleID", "Year" ,"RawRing")) # assign treated series # users can decide their own treated series # for rhub::rhub_check() on macos VECTOR_ELT issues data.table::setorder(dt.samples_long, SampleID, Year) dt.samples_long$RW_trt <- ave( as.numeric(dt.samples_long$RawRing), dt.samples_long$SampleID, FUN = function(x) if (length(x) > 1L) c(NA_real_, diff(x)) else NA_real_ ) # quality check on radius alignment based on the treated series dt.qa <-CFS_qa(dt.input = dt.samples_long, qa.label_data = "demo-samples", qa.label_trt = "difference", qa.min_nseries = 5) plots.lst <- plot_qa(dt.qa, qa.out_series = "X003_101_005")
This function generates ggplot objects for each smooth term in a GAM model.
Predictions vary one smooth term at a time, keeping all other terms fixed at reference values.
Confidence intervals are computed using ci_resp(), supporting multiple methods:
"delta_link", "delta_resp", "bootstrap_link", "bootstrap_resp", and "posterior".
Small samples automatically trigger the "posterior" method for more robust CIs.
plot_resp(robj, model.ci_method = "posterior", ...)plot_resp(robj, model.ci_method = "posterior", ...)
robj |
R object from the modelling functions. |
model.ci_method |
Character. One of |
... |
Additional arguments passed to |
A list of ggplot objects, one per smooth term.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # gamm_site model m.spatial <-gamm_spatial( data = dt.m, resp_scale = "resp_log", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)")) plots.lst <- plot_resp(m.spatial, model.ci_resp = "posterior")# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # gamm_site model m.spatial <-gamm_spatial( data = dt.m, resp_scale = "resp_log", m.candidates = c( "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)")) plots.lst <- plot_resp(m.spatial, model.ci_resp = "posterior")
This function plots the time series and the geographical distribution of the median ring-width measurements to contrast the target site with its neighbouring sites.
plot_scale(dt.scale)plot_scale(dt.scale)
dt.scale |
a table with class "cfs_scale", resulting from function CFS_scale() |
A named list with two ggplot objects showing the contrast
between the target site and its neighboring sites:
A time-series plot of median ring width by year.
A spatial plot showing the geographic location of the sites, with point size proportional to the magnitude of site-level ring-width measurements.
# loading formatted dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) all.sites <- dt.samples_trt$tr_all_wide[,.N, by = c("species", "uid_site", "site_id")][, N:=NULL] # e.g. taking the target sites target_site <- all.sites[c(1,2), -"uid_site"] ref.sites <- merge( dt.samples_trt$tr_all_wide[,c("species", "uid_site", "site_id", "latitude","longitude", "uid_radius")], dt.samples_trt$tr_all_long$tr_7_ring_widths, by = c("uid_radius")) dt.scale <- CFS_scale( target_site = target_site, ref_sites = ref.sites, scale.label_data_ref = "demo-samples", scale.max_dist_km = 200, scale.N_nbs = 2) plots.lst <- plot_scale(dt.scale[[1]])# loading formatted dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) all.sites <- dt.samples_trt$tr_all_wide[,.N, by = c("species", "uid_site", "site_id")][, N:=NULL] # e.g. taking the target sites target_site <- all.sites[c(1,2), -"uid_site"] ref.sites <- merge( dt.samples_trt$tr_all_wide[,c("species", "uid_site", "site_id", "latitude","longitude", "uid_radius")], dt.samples_trt$tr_all_long$tr_7_ring_widths, by = c("uid_radius")) dt.scale <- CFS_scale( target_site = target_site, ref_sites = ref.sites, scale.label_data_ref = "demo-samples", scale.max_dist_km = 200, scale.N_nbs = 2) plots.lst <- plot_scale(dt.scale[[1]])
This function prepares tree-ring data including ring-width measurements, and optionally computes basal area increment (BAI) and merges climate variables.
prepare_samples_clim(dt.samples_trt, dt.clim = NULL, calbai = TRUE)prepare_samples_clim(dt.samples_trt, dt.clim = NULL, calbai = TRUE)
dt.samples_trt |
A list of tree-ring data formatted by
|
dt.clim |
An optional data frame containing climate variables,
joined by |
calbai |
Logical. If |
A data frame or data.table containing tree-ring measurements, optionally including basal area increment and merged climate variables.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim)# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim)
Import raw ring-width measurements from the ITRDB
read_rwl(dir.src, rwl)read_rwl(dir.src, rwl)
dir.src |
Character string specifying the directory path or URL where the RWL file is located. Users typically provide a local directory containing a user-downloaded ITRDB file, but an online source may also be used. |
rwl |
Character string giving the file name of the RWL file. |
The International Tree-Ring Data Bank (ITRDB) is maintained by the NOAA National Centers for Environmental Information (NCEI) as part of the World Data Service for Paleoclimatology. See: https://www.ncei.noaa.gov/products/paleoclimatology/tree-ring
Raw ring-width measurement files ('.rwl') distributed through the ITRDB follow the *Tucson fixed-width format*. In this convention, header records encode site-level metadata (e.g., site identifier, site name, species code, geographic information, and temporal coverage), followed by ring-width measurements stored in a decadal layout (ten annual values per line).
Record 1
Columns 1–6: Site ID
Columns 10–61: Site name
Columns 62–65: Species code
Optional ID fields
Record 2
Columns 1–6: Site ID
Columns 10–22: State / country
Columns 23–30: Species
Columns 41–45: Elevation
Columns 48–57: Latitude–longitude
Columns 62–63: Measurement type code
Columns 68–76: First and last year
Latitude–longitude values are expressed in degrees and minutes ('ddmm' or 'dddmm').
Record 3
Columns 1–6: Site ID
Columns 10–72: Investigators
Columns 73–80: Optional completion date
A list with two elements:
header: a table containing site-level metadata
rwl: a table containing raw ring-width measurements
## Online example (not run to avoid timeout and internet dependency) dir.src <- "https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada" rwl <- "cana615.rwl" dt.rwl <- read_rwl(dir.src, rwl) ## Local example using packaged data file <- system.file("extdata", "cana615.rwl", package = "growthTrendR") stopifnot(file != "") dir.src <- dirname(file) rwl <- basename(file) dt.rwl <- read_rwl(dir.src, rwl)## Online example (not run to avoid timeout and internet dependency) dir.src <- "https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada" rwl <- "cana615.rwl" dt.rwl <- read_rwl(dir.src, rwl) ## Local example using packaged data file <- system.file("extdata", "cana615.rwl", package = "growthTrendR") stopifnot(file != "") dir.src <- dirname(file) rwl <- basename(file) dt.rwl <- read_rwl(dir.src, rwl)
Evaluates the relative influence of each smooth term in a GAM model by computing
its contribution to the fitted values using the linear predictor matrix
(type = "lpmatrix"). Three summary methods are available: sum of squares,
variance, and mean absolute value across all observations.
#'
sterm_imp(gam_model, method = c("ssq", "var", "meanabs"))sterm_imp(gam_model, method = c("ssq", "var", "meanabs"))
gam_model |
A GAM model object. |
method |
A character string specifying the method to compute importance.
One of |
A data.table with columns:
Name of the smooth term.
Relative importance as a percentage.
The method used for calculating the importance.
# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # using gamm_spatial model as an example m.sp <-gamm_spatial(data = dt.m, resp_scale = "resp_log", m.candidates = "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)") dt.m[, uid_site.fac:= as.factor(as.character(uid_site))] dt.imp <- sterm_imp(m.sp$model$gam)# loading processed data dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR")) # climate dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR")) # pre-data for model dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim) dt.m <- dt.samples_clim[ageC >1] # using gamm_spatial model as an example m.sp <-gamm_spatial(data = dt.m, resp_scale = "resp_log", m.candidates = "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)") dt.m[, uid_site.fac:= as.factor(as.character(uid_site))] dt.imp <- sterm_imp(m.sp$model$gam)