pLASSO

pLASSO is a statistical method that incorporates prior information into L1-penalized generalized linear models. We distribute two R functions here (function_linear.R and function_logistic.R), for linear regression and logistic regression, respectively. Both functions compute all six estimators compared in Jiang, He, and Zhang (2014): LASSO, p, and pLASSO, together with LASSO-A, p-A, and pLASSO-A. The functions use cross-validation to select the optimal tuning parameters. See the following paper for details.

  • Jiang Y, He Y, and Zhang H. (2014). Variable selection with prior information for generalized linear models via the pLASSO method.

Input Files:

  • Design matrix: A numeric matrix x containing the observations of the predictors.
  • Response vector: A numeric vector y containing the observations of the response. For linear regression, y is continuous; for logistic regression, y is binary (0/1).
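
The input requirements above can be checked before fitting. The following is a minimal sketch in base R; the helper name check_plasso_inputs is hypothetical and is not part of the distributed functions, and the simulated data only illustrate the expected shapes, not the contents of the sample files.

```r
# Hypothetical helper (not part of function_linear.R or function_logistic.R):
# verifies that x and y match the input description above.
check_plasso_inputs <- function(x, y, family = c("linear", "logistic")) {
  family <- match.arg(family)
  stopifnot(is.matrix(x), is.numeric(x))      # x: numeric design matrix
  stopifnot(is.numeric(y), is.null(dim(y)))   # y: plain numeric vector
  stopifnot(nrow(x) == length(y))             # one response per observation
  if (family == "logistic") {
    stopifnot(all(y %in% c(0, 1)))            # binary response coded 0/1
  }
  invisible(TRUE)
}

# Simulated inputs for illustration only
set.seed(1)
n <- 50
p <- 5
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y_linear <- as.numeric(x %*% c(1, -1, 0, 0, 0.5) + rnorm(n))
y_logistic <- rbinom(n, size = 1, prob = plogis(x[, 1] - x[, 2]))

check_plasso_inputs(x, y_linear, "linear")
check_plasso_inputs(x, y_logistic, "logistic")
```

A failed check stops with an error that names the violated condition, which is usually easier to diagnose than an error raised inside the fitting code.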

Sample Program:

You can find a sample R program (sample_linear.R) for linear regression on simulated x (x_linear.txt) and y (y_linear.txt), as well as a sample R program (sample_logistic.R) for logistic regression on simulated x (x_logistic.txt) and y (y_logistic.txt).

Instructions:

  • Prepare the data files for x and y in the same format as the sample data sets. Run the corresponding sample R program (sample_linear.R or sample_logistic.R) to obtain all six estimators.
  • For logistic regression, install the R package grplasso from the provided file (Linux: grplasso_0.4-2.tar.gz; Windows: grplasso_0.4-2.zip) before running the sample R program.
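
The logistic-regression steps can be sketched in R as follows. This assumes the grplasso archive and sample_logistic.R have been downloaded into the current working directory; the helper grplasso_archive is hypothetical and only selects the file name listed above for the current platform.

```r
# Hypothetical helper: pick the grplasso archive that matches the operating system.
grplasso_archive <- function(os_type = .Platform$OS.type) {
  if (os_type == "windows") "grplasso_0.4-2.zip" else "grplasso_0.4-2.tar.gz"
}

archive <- grplasso_archive()

# Install from the local file only if grplasso is missing and the archive is present.
if (!requireNamespace("grplasso", quietly = TRUE) && file.exists(archive)) {
  pkg_type <- if (.Platform$OS.type == "windows") "win.binary" else "source"
  install.packages(archive, repos = NULL, type = pkg_type)
}

# Run the sample program once it is available in the working directory.
if (file.exists("sample_logistic.R")) {
  source("sample_logistic.R")
}
```

Setting repos = NULL tells install.packages to install from the local file rather than from a repository. For linear regression the installation step is not needed, and sample_linear.R can be run with source() in the same way.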