Biostatistics Data Tools

spBayes: Univariate and Multivariate Spatial-Temporal Modeling

About

spBayes fits univariate and multivariate models with Markov chain Monte Carlo (MCMC). Core functions include:

  • spLM – univariate Gaussian regression with spatial random effects
  • spMvLM – multivariate Gaussian regression with spatial random effects
  • spMisalignLM – multivariate Gaussian regression with spatial random effects for misaligned data
  • spGLM – univariate logistic and Poisson regression with spatial random effects
  • spMvGLM – multivariate logistic and Poisson regression with spatial random effects
  • spMisalignGLM – multivariate logistic and Poisson regression with spatial random effects for misaligned data
  • spDynLM – univariate Gaussian regression with dynamic space-time random effects

FSPH faculty member Sudipto Banerjee, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

ToxTrees

About

This software suite implements tree-based analysis of quantitative structure-activity relationships associated with exposure or escalation experiments in toxicology and nanotoxicology. Given a library of chemical or nanomaterial stressors, their physicochemical characteristics are related to biological outcomes through regression trees with exposure-response models on the tree leaves. Inference is based on Bayesian model averaging, providing full predictive distributions of exposure-response relationships. The outcome can be represented as dose escalation only or as escalation protocols involving both dose and time of exposure. The software supports estimation of variable importance as well as marginal influence functions. Details about models and applications can be found in Low-Kam et al. (2015).

FSPH faculty member Donatello Telesca, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

MIC2 R Package

About

The MIC2 R package implements Multilevel Integrative Clustering (MIC) of brain cortical regions imaged via electroencephalography (EEG). The package aims to produce statistical inference about patterns of synchronous brain activity, combining and borrowing strength from data collected longitudinally on multiple subjects. Given a longitudinal measure of electrode similarity (coherence, cross-correlation, etc.), MIC formulates a hierarchical model for grouping electrodes on the cortex. The model jointly clusters electrodes over three levels of a hierarchy, so that a representative map of clustered cortical regions is available both at the level of specific individuals and as a group summary. The package includes some data preprocessing routines that estimate similarity matrices longitudinally from segmented multivariate time series. Graphical tools for visualization of results and inference are also implemented.

While the application context is brain imaging, the technique generalizes readily to other scientific contexts that require integrating data sources collected in a hierarchical fashion. For details, see the manuscript at https://arxiv.org/abs/1609.09532.
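The preprocessing step described above (one similarity matrix per time segment) can be illustrated with a small Python sketch. It is not the MIC2 implementation: it assumes absolute zero-lag Pearson correlation as the similarity measure, and the function names `pearson`, `similarity_matrix`, and `longitudinal_similarity` are hypothetical.

```python
import math

def pearson(x, y):
    """Zero-lag Pearson correlation of two equal-length series (neither constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def similarity_matrix(segment):
    """segment: list of channels, each a list of samples.
    Returns the channel-by-channel matrix of absolute correlations."""
    k = len(segment)
    return [[abs(pearson(segment[i], segment[j])) for j in range(k)]
            for i in range(k)]

def longitudinal_similarity(segments):
    """One similarity matrix per segment, in time order."""
    return [similarity_matrix(seg) for seg in segments]
```

A clustering model would then take the resulting sequence of matrices as input. Coherence or lagged cross-correlation could replace `pearson` without changing the surrounding structure.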

This tool was created by Damla Senturk, Catherine Sugar, Donatello Telesca, and Qian Li, department of Biostatistics.

Get the Data Tool

 

Unimodal Curve Registration (ucr)

About

This R package implements joint curve registration and regression techniques for functional and longitudinal data. Functional variability is related to a set of predictors through regression models of function amplitude (summarizing the overall strength of a functional signal) and phase (summarizing the average timing of functional features). Supported models include Gaussian, Poisson, and censored Gaussian sampling. Inference is based on Markov chain Monte Carlo (MCMC) simulation. Details about modeling capabilities can be found in Telesca et al. (2012), Erosheva et al. (2014), and Telesca (2015).

FSPH faculty member Donatello Telesca, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

 

mombf: Moment and Inverse Moment Bayes Factors

About

The mombf R package implements Bayesian model selection (BMS) and model averaging (BMA) for linear, asymmetric linear, median and quantile regression. This is the main package implementing the family of non-local prior (NLP) distributions (see Johnson and Rossell (2010, 2012) for a more detailed treatment), although other priors (mainly Zellner’s) are also implemented. The main features are:

  • Density, cumulative density, quantiles and random numbers for NLPs.
  • BMS in linear regression (Section 1, Johnson and Rossell (2010, 2012)).
  • BMA in linear regression (Section 4, Rossell and Telesca (2016)).
  • Exact BMS and BMA under orthogonal and block-diagonal regression (Section 5, Papaspiliopoulos and Rossell (2016)).
  • BMS and BMA for certain generalized linear models (Section 6, Johnson and Rossell (2012); Rossell et al. (2013)).
  • BMS in linear regression with non-normal residuals (Rossell and Rubio, 2016).

Particular cases are Bayesian versions of asymmetric least squares, median and quantile regression. The package manual introduces some basic notions underlying NLPs and illustrates the use of the R functions implementing the main operations required for model selection and averaging. Most of these are implemented internally in C++, so while they are not optimal in any sense, they are designed to be minimally scalable to high dimensions (large p).
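To make the idea of a non-local prior concrete, here is a minimal Python sketch of the first-order product moment (pMOM) density from Johnson and Rossell, which equals θ²/v times a N(0, v) density with v = τσ². It is an illustration only and does not call mombf; the function names and the default values of `tau` and `sigma2` are assumptions.

```python
import math

def normal_pdf(theta, var):
    """Density of N(0, var) evaluated at theta."""
    return math.exp(-theta * theta / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def pmom_pdf(theta, tau=1.0, sigma2=1.0):
    """First-order pMOM density: (theta^2 / v) * N(theta; 0, v), with v = tau * sigma2."""
    v = tau * sigma2
    return (theta * theta / v) * normal_pdf(theta, v)

def integrate(f, lo, hi, steps=20000):
    """Midpoint rule, used only to check that the density is proper."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))
```

The density is exactly zero at θ = 0, which is what separates a non-local prior from a local one such as the normal, and it has two modes at ±sqrt(2v).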

FSPH faculty member Donatello Telesca, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

 

MBA: Multilevel B-spline Approximation

About

MBA generates surfaces interpolated from scattered data using Multilevel B-Splines.

FSPH faculty member Sudipto Banerjee, department of Biostatistics, created this tool in collaboration with others.

 

Get the Data Tool

B2Z: Bayesian Two-Zone Model

About

This package fits Bayesian two-zone models.

FSPH faculty member Sudipto Banerjee, department of Biostatistics, created this tool in collaboration with others.

 

Get the Data Tool

 

BrokenAdaptiveRidge

About

BrokenAdaptiveRidge is an R package for large-scale, high-dimensional L_0-based variable selection in generalized linear models (GLMs) and Cox’s proportional hazards model.
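The following Python sketch illustrates the general idea behind broken adaptive ridge: an L_0 penalty is approximated by repeatedly solving a ridge problem whose penalty weight is 1/β² from the previous iterate. It covers only the one-predictor least-squares case, it is not the package's implementation, and the name `bar_scalar` and the least-squares starting value are assumptions.

```python
def bar_scalar(xtx, xty, lam, n_iter=200):
    """Iteratively reweighted ridge for a single coefficient.

    Each step minimizes (y - x*b)^2 + lam * b^2 / b_old^2, whose solution is
    b = xty / (xtx + lam / b_old^2). The equivalent form used below avoids
    dividing by zero once the iterate collapses to 0.
    """
    beta = xty / xtx  # least-squares start (an assumption of this sketch)
    for _ in range(n_iter):
        beta = xty * beta * beta / (xtx * beta * beta + lam)
    return beta
```

In this scalar case the iteration has two possible limits: coefficients with weak signal (`xty**2 < 4 * xtx * lam`) shrink to exactly zero, while stronger ones settle at a nonzero value slightly below the least-squares estimate. That thresholding behaviour is what produces sparsity.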

 

References
  1. Li, N., Peng, X., Suchard, M.A., Li, G. (2017), “Scalable Sparse Regression for Massive Generalized Linear Models via Broken Adaptive Ridge”. Submitted.
  2. Kawaguchi, E.S., Suchard, M.A., Liu, Z., Li, G. (2017), “Scalable Sparse Cox’s Regression for Large Scale Survival Data via Broken Adaptive Ridge”. Submitted.

 

FSPH faculty member Gang Li, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

controlTest

About

controlTest is an R package for two-sample nonparametric comparison of survival quantiles. The main features are:

  • quantileSE: Finding the estimated standard error for a survival quantile for one sample via bootstrap.
  • quantileControlTest: Comparing the survival quantiles of two independent samples based on the control median test, control percentile test, and a supremum-type test.
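As an illustration of the first feature, the Python sketch below estimates a survival quantile from a Kaplan-Meier curve and bootstraps its standard error. It is a simplified stand-in and not the package's code: the names `km_quantile` and `bootstrap_quantile_se`, the number of resamples, and the decision to skip resamples in which the quantile is not reached are all assumptions.

```python
import random
import statistics

def km_quantile(times, events, p=0.5):
    """Smallest time t at which the Kaplan-Meier survival estimate is <= 1 - p.
    events[i] is 1 for an observed event and 0 for a censored time.
    Returns None if the curve never drops that far."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            if surv <= 1.0 - p:
                return t
        at_risk -= removed
    return None

def bootstrap_quantile_se(times, events, p=0.5, n_boot=500, seed=0):
    """Standard deviation of the quantile estimate over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(times)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        q = km_quantile([times[k] for k in idx], [events[k] for k in idx], p)
        if q is not None:  # skip resamples where the quantile is not reached
            estimates.append(q)
    return statistics.stdev(estimates)
```

Resampling (time, event) pairs together preserves the censoring pattern within each bootstrap sample.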

 

References
  1. Li, G., Tiwari, R.C., and Wells, M. (1996). "Quantile Comparison Functions in Two-Sample Problems: With Applications to Comparisons of Diagnostic Markers." Journal of the American Statistical Association, 91, 689-698.
  2. Chakraborti, S., and Mukerjee, R. (1989), "A Confidence Interval for a Measure Associated With the Comparison of a Treatment With a Control," South African Statistical Journal, 23, 219-230.
  3. Gastwirth, J. L., and Wang, J. L. (1988), "Control Percentile Test for Censored Data," Journal of Statistical Planning and Inference, 18, 267-276.

 

FSPH faculty member Gang Li, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

powerCompRisk

About

powerCompRisk is a power analysis tool for joint testing of cause-specific hazard and overall hazard with competing risks data using R.

 

References
  1. Yang, Q., Fung, W.K., Li, G. (2017) Sample size determination for jointly testing a cause-specific hazard and the any-cause hazard in the presence of competing risks. Statistics in Medicine, in press.

 

FSPH faculty member Gang Li, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

JMcmprsk

About

JMcmprsk is an R package to fit joint models of continuous or ordinal longitudinal data and time-to-event data with competing risks.

 

References
  1. Robert Elashoff, Gang Li and Ning Li (2016, ISBN: 9781439807828).
  2. Robert M. Elashoff, Gang Li and Ning Li (2008).
  3. Ning Li, Robert Elashoff, Gang Li and Jeffrey Saver (2010).

 

FSPH faculty member Gang Li, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool

ZIBseq

About

ZIBseq is an R package for detecting abundance differences across clinical conditions. It takes the sparse nature of metagenomic data into account and handles compositional data efficiently.
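The sparsity and compositional structure mentioned above are what a zero-inflated beta model is built for: a point mass at zero covers absent taxa, and a beta density covers nonzero proportions. The Python sketch below writes out such a log-likelihood. It assumes the common mean/precision parameterization of the beta distribution (shape parameters μφ and (1 − μ)φ), which may differ from the package's internals, and the function names are hypothetical.

```python
import math

def zib_log_density(y, pi0, mu, phi):
    """Log density of a zero-inflated beta observation y in [0, 1).

    pi0: probability of an exact zero
    mu:  mean of the beta component, in (0, 1)
    phi: precision of the beta component, > 0
    """
    if y == 0:
        return math.log(pi0)
    a, b = mu * phi, (1.0 - mu) * phi
    log_beta = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))
    return math.log(1.0 - pi0) + log_beta

def zib_log_likelihood(ys, pi0, mu, phi):
    """Sum of log densities over independent observations."""
    return sum(zib_log_density(y, pi0, mu, phi) for y in ys)
```

In a regression setting, `pi0` and `mu` would each be linked to covariates such as clinical condition (for example through a logit link), and differential abundance would be assessed through the corresponding coefficients.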

 

References
  1. Peng, X.L., Li, G., and Liu, Z.Q. (2015). "Zero-inflated Beta Regression for Differential Abundance Analysis with Metagenomics Data." Journal of Computational Biology, 2015 Dec 16 (Epub ahead of print). PMID: 26675626.

 

FSPH faculty member Gang Li, department of Biostatistics, created this tool in collaboration with others.

Get the Data Tool