Analysis of relative quantifications, including:

• Annotations

• Summary files in different formats (xls, txt) and shapes (long, wide)

• Numerous summary plots

• Enrichment analysis using Gprofiler

• PCA of quantifications

• Clustering analysis

• Basic imputation of missing values

artmsAnalysisQuantifications(
log2fc_file,
modelqc_file,
species,
output_dir = "analysis_quant",
outliers = c("keep", "iqr", "std"),
enrich = TRUE,
l2fc_thres = 1,
isBackground = "nobackground",
isPtm = "global",
mnbr = 2,
pathogen = "nopathogen",
plotPvaluesLog2fcDist = TRUE,
plotAbundanceStats = TRUE,
plotReproAbundance = TRUE,
plotCorrConditions = TRUE,
plotCorrQuant = TRUE,
plotPCAabundance = TRUE,
plotFinalDistributions = TRUE,
plotPropImputation = TRUE,
plotHeatmapsChanges = TRUE,
plotTotalQuant = TRUE,
plotClusteringAnalysis = TRUE,
data_object = FALSE,
printPDF = TRUE,
verbose = TRUE
)

## Arguments

log2fc_file
(char) MSstats results file location

modelqc_file
(char) MSstats modelqc file location

species
(char) Select one species. Species currently supported for a full analysis (including enrichment analysis):
• HUMAN
• MOUSE
To find out which species are supported for annotation only, check ?artmsIsSpeciesSupported()

output_dir
(char) Name of the folder where the function writes its results. Default: "analysis_quant" (providing a new folder name is recommended).

outliers
(char) Keeps or removes outliers. Options:
• keep (default): keeps outliers
• iqr (recommended): removes outliers beyond +/- 6 x Interquartile Range (IQR)
• std: removes outliers beyond 6 x standard deviation

enrich
(logical) Perform enrichment analysis using GprofileR? Only available for species HUMAN and MOUSE. TRUE (default if "human" or "mouse" is the species) or FALSE

l2fc_thres
(int) log2fc cutoff for enrichment analysis (default: l2fc_thres = 1)

choosePvalue
(char) Specifies whether pvalue or adjpvalue should be used for the analysis. The default option is adjpvalue (multiple testing correction). If the number of biological replicates for a given experiment is too low (for example n = 2), then choosePvalue = pvalue is recommended.

isBackground
(char) Background of gene names for enrichment analysis. nobackground (default) uses the total number of genes detected. Alternatively, provide the file path to the background gene list.

isPtm
(char) Is this a ptm-site quantification? Options:
• global (default)
• ptmsites (for site-specific analysis)
• ptmph (Jeff Johnson script output evidence file)

mnbr
(int) PARAMETER FOR NAIVE IMPUTATION: "minimal number of biological replicates" for "naive imputation" and filtering. Default: mnbr = 2.
Details: intensity values for proteins/PTMs that are completely missing in one of the two conditions compared ("condition A"), but are found in at least 2 biological replicates (mnbr = 2) of the other ("condition B"), are imputed (values artificially assigned) and the log2FC values are calculated. The goal is to keep those proteins/PTMs that are consistently found in one of the two conditions (in this case "condition B") and to facilitate their inclusion in downstream analysis (if desired). The imputed intensity values are sampled from the lowest intensity values detected in the experiment.
WARNING: the p-values of imputed entries are randomly assigned between 0.01 and 0.05. They serve illustration purposes only (when generating a volcano plot with the output of artmsAnalysisQuantifications) or allow these entries to pass a cutoff of p-value < 0.05 for enrichment analysis.
CAUTION: mnbr also adds the constraint that any protein must be identified in at least mnbr biological replicates of the same condition, or it will be filtered out. That is, if mnbr = 2, a protein found in two conditions but in only one biological replicate of each is removed.

pathogen
(char) Is there a pathogen in the dataset as well? If not, use pathogen = nopathogen (default). Pathogens available: tb (Tuberculosis), lpn (Legionella)

plotPvaluesLog2fcDist
(logical) If TRUE (default) plots pvalues and log2fc distributions

plotAbundanceStats
(logical) If TRUE (default) plots stats graphs about abundance values

plotReproAbundance
(logical) If TRUE plots reproducibility based on normalized abundance values

plotCorrConditions
(logical) If TRUE plots correlation between the different conditions

plotCorrQuant
(logical) If TRUE plots correlation between the available quantifications (comparisons)

plotPCAabundance
(logical) If TRUE performs PCA analysis of conditions using normalized abundance values

plotFinalDistributions
(logical) If TRUE plots distribution of both log2fc and pvalues

plotPropImputation
(logical) If TRUE plots proportion of overall imputation

plotHeatmapsChanges
(logical) If TRUE plots heatmaps of quantified changes (both all and significant only). Only if printPDF is also TRUE

plotTotalQuant
(logical) If TRUE plots barplot of total number of quantifications per comparison

plotClusteringAnalysis
(logical) If TRUE performs clustering analysis between quantified comparisons (more than 1 comparison required)

data_object
(logical) Flag to indicate whether the required files are data objects. Default is FALSE

printPDF
(logical) If TRUE (default) prints out the pdfs. Warning: plot objects are not returned due to the large number of them.

verbose
(logical) TRUE (default) shows function messages
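
The filtering and naive imputation rule described for mnbr can be illustrated with a small base R sketch. This is not the artMS implementation: the toy data frame, its column names, the 5% cutoff used to define the "lowest intensity values", and the direction of the fold change (B minus A) are all assumptions made for illustration.

```r
set.seed(1)  # reproducible sampling for this sketch

# Toy data (hypothetical): log2 intensities, NA = not detected
quant <- data.frame(
  protein   = rep(c("P1", "P2", "P3"), each = 6),
  condition = rep(rep(c("A", "B"), each = 3), times = 3),
  biorep    = rep(1:3, times = 6),
  intensity = c(20, 21, 20.5, 22, 22.5, 21.8,  # P1: found in both conditions
                NA, NA, NA,   25, 24.5, NA,    # P2: absent in A, 2 reps in B
                NA, 18, NA,   NA, NA, 19)      # P3: 1 rep in each condition
)
mnbr <- 2

# Replicates with a measured value, per protein (rows) and condition (columns)
n_obs <- tapply(!is.na(quant$intensity),
                list(quant$protein, quant$condition), sum)
max_obs <- apply(n_obs, 1, max)
min_obs <- apply(n_obs, 1, min)

# Filter: keep proteins seen in >= mnbr replicates of at least one condition
keep <- rownames(n_obs)[max_obs >= mnbr]
# Impute: absent in one condition, present in >= mnbr replicates of the other
impute <- rownames(n_obs)[min_obs == 0 & max_obs >= mnbr]

# Pool of low intensities (assumption: lowest 5% of all observed values)
observed <- quant$intensity[!is.na(quant$intensity)]
low_pool <- observed[observed <= quantile(observed, 0.05)]
draw_low <- function(n) low_pool[sample.int(length(low_pool), n, replace = TRUE)]

results <- do.call(rbind, lapply(keep, function(p) {
  a <- quant$intensity[quant$protein == p & quant$condition == "A"]
  b <- quant$intensity[quant$protein == p & quant$condition == "B"]
  imputed <- p %in% impute
  if (imputed && all(is.na(a))) a <- draw_low(length(a))
  if (imputed && all(is.na(b))) b <- draw_low(length(b))
  data.frame(
    protein = p,
    log2FC  = mean(b, na.rm = TRUE) - mean(a, na.rm = TRUE),
    # Imputed entries get an arbitrary p-value in [0.01, 0.05];
    # real p-values would come from the MSstats results instead
    pvalue  = if (imputed) runif(1, 0.01, 0.05) else NA_real_,
    imputed = imputed
  )
}))
print(results)
```

In this toy example P3 is dropped because it never reaches mnbr replicates in any single condition, while P2 is kept with imputed values for condition A. The p-value attached to P2 carries no statistical meaning.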

## Value

(data.frame) summary of quantifications, including annotations, enrichments, etc

## Examples

# Testing that the files cannot be empty
artmsAnalysisQuantifications(log2fc_file = NULL,
modelqc_file = NULL,
species = NULL,
output_dir = NULL)
#> ---------------------------------------------
#> artMS: ANALYSIS OF QUANTIFICATIONS
#> ---------------------------------------------
#> [1] "The evidence_file, modelqc_file, species and output_dir arguments cannot be NULL"
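
A more typical call might look like the following sketch. The two file paths are hypothetical placeholders for MSstats output files, and the argument values are only one possible choice. The call is guarded so that nothing runs unless artMS is installed and both files exist.

```r
# Hypothetical input paths: replace with your own MSstats output files
log2fc_path  <- "results/ab-results.txt"
modelqc_path <- "results/ab-results_ModelQC.txt"

if (requireNamespace("artMS", quietly = TRUE) &&
    file.exists(log2fc_path) && file.exists(modelqc_path)) {
  summary_df <- artMS::artmsAnalysisQuantifications(
    log2fc_file  = log2fc_path,
    modelqc_file = modelqc_path,
    species      = "human",
    output_dir   = "analysis_quant",
    outliers     = "iqr",   # remove outliers beyond +/- 6 x IQR
    enrich       = FALSE,   # skip the enrichment analysis
    mnbr         = 2,       # replicate threshold for filtering and imputation
    printPDF     = FALSE,   # do not write the pdf plots
    verbose      = TRUE
  )
} else {
  message("artMS or the input files are not available; skipping the example")
}
```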