Title: | Fast Sequential Pleiotropy Test |
---|---|
Description: | It performs a fast multi-trait genome-wide association analysis based on seemingly unrelated regressions. It tests for pleiotropic effects based on a series of Intersection-Union Wald tests. The package can handle large and unbalanced data and plot results. |
Authors: | Fernando Aguate [aut, cre] , Gustavo de los Campos [aut] , Alexander Grueneberg [ctb] |
Maintainer: | Fernando Aguate <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2024-11-01 04:54:00 UTC |
Source: | https://github.com/feraguate/pleiotest |
This function is used internally in pleioR.
identify_subsets(trait, id)
trait |
character indicating traits. |
id |
character indicating IDs. |
list with an ID matrix and ID subsets.
Original code by Fernando Aguate.
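To make the idea of "ID subsets" in unbalanced data more concrete, the base R sketch below groups individuals by which traits they have records for. It is only an illustration under assumed inputs, not the package's implementation: the toy data, the object names (`id_matrix`, `pattern`, `subsets`), and the `min_size` filter are all hypothetical.

```r
# Toy long-format records: one row per (id, trait) observation (made-up data).
trait <- c("t1", "t2", "t1", "t1", "t2", "t2")
id    <- c("a",  "a",  "b",  "c",  "c",  "d")

ids    <- unique(id)
traits <- unique(trait)

# Logical ID-by-trait matrix: TRUE where an individual has a record for a trait.
id_matrix <- matrix(FALSE, nrow = length(ids), ncol = length(traits),
                    dimnames = list(ids, traits))
id_matrix[cbind(id, trait)] <- TRUE

# Label each individual by its pattern of observed traits, then group by pattern.
pattern <- apply(id_matrix, 1, function(r) paste(colnames(id_matrix)[r], collapse = "+"))
subsets <- split(names(pattern), pattern)

# Hypothetical filter: keep only patterns shared by at least min_size individuals.
min_size <- 2
kept <- subsets[lengths(subsets) >= min_size]
```

Here individuals "a" and "c" share the complete pattern, while "b" and "d" each form a one-member fragment that the hypothetical `min_size` filter drops.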
Manhattan plot of the results returned by mt_gwas().
manhattan_plot(mt_gwas_results, trait, bp_positions, ...)
mt_gwas_results |
Object returned by mt_gwas |
trait |
integer indicating the position of the trait (see: names(mt_gwas_results)) to be plotted. |
bp_positions |
dataframe with SNP base pair positions. colnames must be 'chr' and 'position', rownames must be SNP identifiers matching names in mt_gwas. |
... |
further graphical parameters. Options include: title=, bty=, pch=, cex.lab=, and cex.main=. |
Manhattan plot
Performs a multi-trait model with correlated errors (seemingly unrelated regressions), and generates results by trait in a list.
mt_gwas(pleio_results, save_at = NULL)
pleio_results |
object of class pleio_class (returned by pleioR() function). |
save_at |
character with directory and/or file name (.rdata) to save the results. This is useful when handling multiple results such as in parallel jobs. |
list of by-trait dataframes containing the results of the multi-trait model.
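As a rough illustration of what a seemingly unrelated regression with correlated errors involves, the base R sketch below fits two traits on one SNP by generalized least squares on the stacked system. It is a textbook construction on simulated data, not the package's code; all object names are hypothetical. With balanced data and identical regressors for every trait, the GLS estimates coincide with trait-by-trait OLS, which the last lines check. The two differ once data are unbalanced.

```r
set.seed(1)
n   <- 200
snp <- rbinom(n, size = 2, prob = 0.3)       # simulated genotypes coded 0/1/2
X1  <- cbind(intercept = 1, snp = snp)       # same design matrix for both traits

# Simulated errors with correlation 0.6 between the two traits.
R_true <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
E <- matrix(rnorm(n * 2), n, 2) %*% chol(R_true)
Y <- cbind(t1 = 0.5 + 0.2 * snp, t2 = -0.1 + 0 * snp) + E

# Step 1: OLS by trait, then estimate the residual covariance between traits.
B_ols <- solve(crossprod(X1), crossprod(X1, Y))
R_hat <- crossprod(Y - X1 %*% B_ols) / (n - ncol(X1))

# Step 2: stack the traits. The stacked error covariance is R (x) I_n.
X <- kronecker(diag(2), X1)                  # block-diagonal design
y <- as.vector(Y)                            # trait 1 records, then trait 2
Omega_inv <- kronecker(solve(R_hat), diag(n))

# Step 3: GLS solution and its covariance matrix.
lhs   <- crossprod(X, Omega_inv %*% X)       # left-hand side  X' Omega^-1 X
rhs   <- crossprod(X, Omega_inv %*% y)       # right-hand side X' Omega^-1 y
b_gls <- solve(lhs, rhs)
V_gls <- solve(lhs)

# Balanced data with identical regressors: GLS equals OLS.
same_as_ols <- isTRUE(all.equal(as.vector(b_gls), as.vector(B_ols)))
```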
Plots genomic segments that contain significant pleiotropic SNPs using results of pleio_test(). It also returns a dataframe with segment information.
pleio_ideogram( pleio_res, alpha = "bonferroni05", n_traits = 2, bp_positions, window_size = 1e+06, centromeres = NULL, color_bias = 1, set_plot = T, set_legend = T, set_ylim_prop = 1.1, ... )
pleio_res |
list returned by pleio_test(). |
alpha |
numeric threshold for the significance level. The default, "bonferroni05", applies a Bonferroni correction. |
n_traits |
integer indicating the level of pleiotropy to test (a.k.a. number of traits). |
bp_positions |
dataframe with colnames 'chr' and 'pos' indicating the chromosome and position for each SNP. Rownames must contain SNP names matching results of pleio_test. |
window_size |
numeric value indicating the minimum size (in base pairs) of the genomic region that contains significant SNPs. |
centromeres |
string 'human' or dataframe (or matrix) with the chromosome and position (in Mbp) of the centromeres in the first and second columns. If NULL (default), centromeres are not plotted. |
color_bias |
number for the bias of the color scale. See help(colorRampPalette). By default color_bias = 1. |
set_plot |
logical indicating whether to plot the ideogram (TRUE by default). |
set_legend |
logical indicating whether to plot a legend (TRUE by default). |
set_ylim_prop |
numeric proportion of upper margin to fit the legend (1.1 by default). 1 = no margin, 1.1 = 10% left for margin, etc. |
... |
more plot arguments. |
Ideogram plot and a dataframe with genomic segments information.
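The sketch below shows one way to prepare a `bp_positions` table in the layout described above and to pick out SNPs under a Bonferroni threshold. It uses made-up p-values rather than real pleio_test() output, and it assumes that a Bonferroni correction at the 0.05 level means dividing 0.05 by the number of tests. The names `p_values`, `significant`, and `sig_table` are hypothetical.

```r
# Hypothetical p-values for five SNPs (made-up numbers, not package output).
p_values <- c(snp1 = 1e-9, snp2 = 0.03, snp3 = 2e-6, snp4 = 0.5, snp5 = 0.004)

# Positions table: 'chr' and 'pos' columns, SNP names as rownames.
bp_positions <- data.frame(
  chr = c(1, 1, 2, 2, 3),
  pos = c(150000, 980000, 42000, 7300000, 560000),
  row.names = names(p_values)
)

# Sanity checks on the layout before passing the table to a plotting function.
stopifnot(all(c("chr", "pos") %in% colnames(bp_positions)))
stopifnot(all(names(p_values) %in% rownames(bp_positions)))

# Assumption: Bonferroni at the 0.05 level divides 0.05 by the number of tests.
alpha <- 0.05 / length(p_values)

# SNPs below the threshold, with their chromosome and position.
significant <- names(p_values)[p_values < alpha]
sig_table <- cbind(bp_positions[significant, ], p_value = p_values[significant])
```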
Plots the p-values that test the hypothesis of pleiotropic effects on n_traits. This function also returns a dataframe with information of the significant SNPs.
pleio_plot( pleio_res, alpha = "bonferroni05", n_traits = 2, bp_positions = NULL, set_colors = NULL, set_text = NULL, set_plot = TRUE, chr_spacing = 1e+05, ... )
pleio_res |
object returned by pleio_test(). |
alpha |
numeric threshold for the significance level. The default, "bonferroni05", applies a Bonferroni correction. |
n_traits |
integer indicating the level of pleiotropy to test (a.k.a. number of traits). |
bp_positions |
dataframe with colnames 'chr' and 'pos' indicating the chromosome and position for each SNP. Rownames must contain SNP names matching results of pleio_test. |
set_colors |
character vector with 3 colors to use in the plot (by default: c('goldenrod4', 'brown4', 'royalblue2')). |
set_text |
dataframe or matrix with strings to add as text to identify SNPs or genes. Rownames must be SNP names matching results of pleio_test. The first column of the dataframe must have strings to plot as text. |
set_plot |
logical indicating whether to return the Manhattan plot (TRUE by default). |
chr_spacing |
integer indicating the spacing (in base pair positions) between chromosomes. 1e5 by default. |
... |
additional graphic parameters for the plot. |
Manhattan plot and dataframe with information related to significant SNPs.
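To show what spacing between chromosomes means for the x-axis of a Manhattan plot, the sketch below converts per-chromosome positions into one cumulative coordinate with a gap between chromosomes. This is a generic base R construction under assumed inputs, not necessarily how pleio_plot() computes its axis. The toy positions and the small `chr_spacing` value were chosen only to keep the arithmetic easy to follow.

```r
# Toy positions, sorted by chromosome (made-up values).
bp_positions <- data.frame(
  chr = c(1, 1, 2, 2, 3),
  pos = c(100, 900, 50, 400, 10),
  row.names = paste0("snp", 1:5)
)
chr_spacing <- 1000   # gap inserted between consecutive chromosomes

# Largest observed position per chromosome, used here as the chromosome length.
chr_len <- tapply(bp_positions$pos, bp_positions$chr, max)

# Offset of each chromosome: total length plus spacing of all previous ones.
offset <- c(0, cumsum(chr_len + chr_spacing))[seq_along(chr_len)]
names(offset) <- names(chr_len)

# Cumulative x coordinate for every SNP.
x <- unname(bp_positions$pos + offset[as.character(bp_positions$chr)])
```

The resulting `x` can be plotted against -log10 p-values with `plot()`; a larger `chr_spacing` widens the gaps between chromosomes.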
Example function to create simulations with no effects.
pleio_simulate(n_traits, n_individuals, n_snp, percentage_mv = 0)
n_traits |
number of traits to simulate. |
n_individuals |
number of individuals to simulate. |
n_snp |
number of SNPs to simulate. |
percentage_mv |
proportion of missing values. By default = 0. |
a list with pheno and geno to test the pleioR function.
Original code by Fernando Aguate.
sim1 <- pleio_simulate(n_traits = 3, n_individuals = 1e4, n_snp = 1e3, percentage_mv = 0.1)
Performs the sequential test of pleiotropic effects using results of pleioR().
pleio_test( pleio_results, loop_breaker = 1, save_at = NULL, contrast_matrices_list = NULL )
pleio_results |
pleio_class object returned by pleioR(). |
loop_breaker |
numeric value for a maximum p-value used to stop the sequence if a higher p-value is obtained. This saves computation time if there are many tests to perform. |
save_at |
character with directory and/or file name (.rdata) to save the results. This is useful when handling multiple results such as in parallel jobs. |
contrast_matrices_list |
user-specified contrast matrices within a list of lists, or a single contrast matrix (see example). All matrices must have the same number of columns, equal to the number of traits. |
list of p-values, indices, and trait numeric identifier.
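The intersection-union principle named in the package description can be illustrated with a small base R sketch. To claim that a SNP affects every trait in a set, each single-trait null hypothesis must be rejected, so the intersection-union p-value is the largest of the individual Wald p-values. The estimates and standard errors below are made up, and the sketch covers only this simplest case. It is not the sequential procedure that pleio_test() implements.

```r
# Hypothetical SNP effect estimates and standard errors for three traits.
beta <- c(t1 = 0.30, t2 = 0.25, t3 = 0.02)
se   <- c(t1 = 0.05, t2 = 0.06, t3 = 0.05)

# Single-trait Wald statistics, each chi-square with 1 degree of freedom under H0.
wald <- (beta / se)^2
p    <- pchisq(wald, df = 1, lower.tail = FALSE)

# Intersection-union test for "the SNP affects all three traits":
# reject only if every single-trait test rejects, so the p-value is the maximum.
p_iut_all <- max(p)

# The same test restricted to traits t1 and t2.
p_iut_t1_t2 <- max(p[c("t1", "t2")])
```

In this toy example the weak effect on `t3` dominates the three-trait test, while the two-trait test on `t1` and `t2` gives a small p-value.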
# Example of user-specified contrast matrices with 3 traits
cm1 <- matrix(c(-1, 0, 1), ncol = 3)
cm2 <- matrix(c(0, -1, 1), ncol = 3)
contrast_matrices <- list('1vs3' = list(cm1), '2vs3' = list(cm2))
# or a single contrast matrix as:
contrast_matrices <- cm1
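For readers unfamiliar with contrast matrices, the sketch below shows how a standard Wald test uses one: the contrast `cm1` compares the SNP effect on trait 3 with the effect on trait 1. The effect estimates `b` and their covariance matrix `V` are made-up numbers. In practice they would come from a fitted model, and how pleio_test() combines such tests internally is not shown here.

```r
# Hypothetical SNP effects on three traits and their covariance matrix.
b <- c(0.10, 0.25, 0.40)
V <- matrix(c(0.010, 0.002, 0.001,
              0.002, 0.010, 0.002,
              0.001, 0.002, 0.010), nrow = 3, byrow = TRUE)

# Contrast: effect on trait 3 minus effect on trait 1.
cm1 <- matrix(c(-1, 0, 1), ncol = 3)

# Wald statistic for H0: cm1 %*% b = 0.
est     <- cm1 %*% b                   # estimated contrast
var_est <- cm1 %*% V %*% t(cm1)        # variance of the contrast
W       <- as.numeric(t(est) %*% solve(var_est) %*% est)

# Degrees of freedom equal the number of rows of a full-row-rank contrast matrix.
p_value <- pchisq(W, df = nrow(cm1), lower.tail = FALSE)
```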
Fits a seemingly unrelated regression, possibly with unbalanced data and/or covariates. It returns a pleio_class object to perform the sequential test with pleio_test() or to obtain by-trait estimates with mt_gwas().
pleioR(pheno, geno, i = NULL, j = NULL, covariates = NULL, drop_subsets = 10)
pheno |
dataframe with phenotypic data. Must have columns 'id', 'trait', and 'y'. Column 'y' must contain the observations for the corresponding 'trait' and 'id'. See function melt() in the 'reshape2' package for a simple formatting of your data. |
geno |
matrix with SNPs in columns and IDs in rownames. This can also be a memory-mapped matrix returned by BEDMatrix() in the 'BEDMatrix' package. |
i |
integers indexing rows from geno to use in the model. |
j |
integers indexing columns from geno to use in the model. Useful when working with multiple jobs in parallel. |
covariates |
(optional) dataframe or matrix containing covariates in columns and IDs as rownames. These IDs must match those in geno. |
drop_subsets |
minimum sample size of sub-data sets to consider for analysis, 10 by default. When working with unbalanced data (a.k.a. fragmented data), save computation time by dropping small fragments of data. |
pleio_class list of left and right hand side solutions of the model.
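The long format required for `pheno` (columns 'id', 'trait', and 'y') can be built from a wide table in base R, as an alternative to melt(). The sketch below uses a made-up wide table with one column per trait. Dropping rows with a missing 'y' is an assumption made here for illustration, not a documented requirement.

```r
# Hypothetical wide table: one row per individual, one column per trait.
wide <- data.frame(
  id     = c("id1", "id2", "id3"),
  height = c(1.2, NA, 0.7),
  weight = c(3.1, 2.8, NA)
)

trait_cols <- setdiff(names(wide), "id")

# Stack the trait columns: ids repeat once per trait, trait names once per id.
pheno <- data.frame(
  id    = rep(wide$id, times = length(trait_cols)),
  trait = rep(trait_cols, each = nrow(wide)),
  y     = unlist(wide[trait_cols], use.names = FALSE)
)

# Assumption: unobserved records are left out of the long table.
pheno <- pheno[!is.na(pheno$y), ]
rownames(pheno) <- NULL
```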
# Randomly generated example with 3 traits, 1e4 individuals, 1000 SNPs and 10% missing values.
sim1 <- pleio_simulate(n_traits = 3, n_individuals = 1e4, n_snp = 1e3, percentage_mv = 0.1)
pleio_model <- pleioR(pheno = sim1$pheno, geno = sim1$geno)
pleio_model_test <- pleio_test(pleio_model)
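When the analysis is split across parallel jobs, the argument `j` can receive a different block of SNP column indices in each job. The sketch below only shows one way to build such blocks in base R. The number of jobs, the object names, and the commented-out call pattern (including the file name passed to `save_at`) are hypothetical.

```r
n_snp  <- 1000   # number of SNP columns in geno (assumed)
n_jobs <- 4      # number of parallel jobs (assumed)

# Split the column indices into n_jobs contiguous blocks of near-equal size.
idx    <- seq_len(n_snp)
chunks <- split(idx, ceiling(idx * n_jobs / n_snp))

# Each job would then use its own block, for example (not run here):
# job   <- 2
# model <- pleioR(pheno = pheno, geno = geno, j = chunks[[job]])
# res   <- pleio_test(model, save_at = paste0("results_job_", job, ".rdata"))
```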
Internal function to calculate crossproducts within pleioR.
xrsx_xrsy(id_matrix, sets_rs, xx, xy)
id_matrix |
matrix of IDs |
sets_rs |
list of inverses of matrix R |
xx |
numeric vector with crossproducts of the X matrix |
xy |
matrix with X transpose Y products. |