This function simplifies the enrichment analysis performed by the excellent tool gProfileR.

artmsEnrichProfiler(
  x,
  categorySource = c("GO"),
  species,
  background = NA,
  verbose = TRUE
)

Arguments

x

(list, data.frame) List of protein IDs. It can be either a list of IDs or a data.frame, in which case the function finds the columns containing the IDs. Multiple lists can also be sent simultaneously, for example by running: tmp <- split(enrichment$Gene, enrichment$cl_number, drop = TRUE)
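As a minimal sketch of the multiple-list case, the snippet below builds a toy data.frame and splits it into one gene vector per cluster. The `enrichment` object and its values are made up for illustration; only the column names `Gene` and `cl_number` come from the text above.

```r
# Toy data: gene names and cluster assignments are illustrative only
enrichment <- data.frame(
  Gene = c("TP53", "BRCA1", "EGFR", "MYC", "AKT1"),
  cl_number = c(1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# One character vector of genes per cluster, named by cluster number
tmp <- split(enrichment$Gene, enrichment$cl_number, drop = TRUE)

# Inspect the resulting named list
str(tmp)
```

The resulting named list is the kind of object that could then be passed as x.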

categorySource

(vector) Resources providing the terms on which the enrichment will be performed. The resources supported by gProfileR are:

  • GO (GO:BP, GO:MF, GO:CC): Gene Ontology (see more below)

  • KEGG: Biological pathways

  • REAC: Biological pathways (Reactome)

  • TF: Regulatory motifs in DNA (TRANSFAC TFBS)

  • MI: Regulatory motifs in DNA (miRBase microRNAs)

  • CORUM: protein complexes database

  • HP: Human Phenotype Ontology

  • HPA: Protein databases (Human Protein Atlas)

  • OMIM: Online Mendelian Inheritance in Man annotations

  • BIOGRID: BioGRID protein-protein interactions

The type of annotations for Gene Ontology:

  • Inferred from experiment (IDA, IPI, IMP, IGI, IEP)

  • Direct assay (IDA) / Mutant phenotype (IMP)

  • Genetic interaction (IGI) / Physical interaction (IPI)

  • Traceable author (TAS) / Non-traceable author (NAS) / Inferred by curator (IC)

  • Expression pattern (IEP) / Sequence or structural similarity (ISS) / Genomic context (IGC)

  • Biological aspect of ancestor (IBA) / Rapid divergence (IRD)

  • Reviewed computational analysis (RCA) / Electronic annotation (IEA)

  • No biological data (ND) / Not annotated or not in background (NA)

species

(char) Species code: organism names are constructed by concatenating the first letter of the genus name and the species name. Example: human - 'hsapiens', mouse - 'mmusculus'. Check gProfileR to find out more about supported species.
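The naming rule can be sketched with a small helper that takes a two-word scientific name and joins the first letter of the first word with the second word, in lower case. `species_code` is a hypothetical function written for this illustration; it is not part of artMS or gProfileR, and it does not check whether the organism is actually supported.

```r
# Hypothetical helper (not part of artMS or gProfileR): builds an organism
# code from a two-word scientific name such as "Homo sapiens"
species_code <- function(binomial) {
  parts <- strsplit(trimws(binomial), "\\s+")[[1]]
  stopifnot(length(parts) >= 2)
  # First letter of the first word + the full second word, lower-cased
  tolower(paste0(substr(parts[1], 1, 1), parts[2]))
}

species_code("Homo sapiens")
species_code("Mus musculus")
```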

background

(vector) Gene list to use as background for the enrichment analysis. Default: NA
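When supplying a background, it may be worth checking beforehand that the query genes are contained in it. The sketch below is one way to do that in base R; the gene vectors are made up, and whether artmsEnrichProfiler performs any such check itself is not stated here, so treat this as an optional precaution.

```r
# Illustrative gene vectors (made up for this sketch)
background <- c("TP53", "BRCA1", "EGFR", "MYC", "AKT1", "GAPDH")
query <- c("TP53", "EGFR", "KRAS")

# Genes in the query that are missing from the background
missing_from_bg <- setdiff(query, background)

# Restrict the query to genes that are present in the background
query_in_bg <- intersect(query, background)
```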

verbose

(logical) TRUE (default) shows function messages

Value

The enrichment results as provided by gProfileR

Details

This function uses the following gProfileR arguments as defaults:

  • ordered_query = FALSE

  • significant = TRUE

  • exclude_iea = TRUE

  • underrep = FALSE

  • evcodes = FALSE

  • region_query = FALSE

  • max_p_value = 0.05

  • min_set_size = 0

  • max_set_size = 0

  • min_isect_size = 0

  • correction_method = "analytical" # options: "gSCS", "fdr", "bonferroni"

  • hier_filtering = "none"

  • domain_size = "known" # annotated or known

  • numeric_ns = ""

  • png_fn = NULL

  • include_graph = TRUE
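For reference, the defaults listed above can be collected into a named list, which makes it easy to see what a direct call to gProfileR would need in order to use the same settings. This is only a sketch: `gprofiler_defaults` is a name chosen for this illustration, the query genes in the commented call are made up, and how artmsEnrichProfiler passes these values internally is an assumption.

```r
# Defaults listed above, collected as a named list
gprofiler_defaults <- list(
  ordered_query = FALSE,
  significant = TRUE,
  exclude_iea = TRUE,
  underrep = FALSE,
  evcodes = FALSE,
  region_query = FALSE,
  max_p_value = 0.05,
  min_set_size = 0,
  max_set_size = 0,
  min_isect_size = 0,
  correction_method = "analytical",
  hier_filtering = "none",
  domain_size = "known",
  numeric_ns = "",
  png_fn = NULL,          # list() keeps the entry even though it is NULL
  include_graph = TRUE
)

# Sketch of a direct call with the same settings (requires the gProfileR
# package and network access, so it is left commented out):
# res <- do.call(
#   gProfileR::gprofiler,
#   c(list(query = c("TP53", "EGFR"), organism = "hsapiens"),
#     gprofiler_defaults)
# )
```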

Examples

if (FALSE) {
  # Annotate the MSstats results to get the Gene name
  data_annotated <- artmsAnnotationUniprot(
    x = artms_data_ph_msstats_results,
    columnid = "Protein",
    species = "human")

  # Filter the list of genes with a log2fc > 2
  filtered_data <-
    unique(data_annotated$Gene[which(data_annotated$log2FC > 2)])

  # And perform enrichment analysis
  data_annotated_enrich <- artmsEnrichProfiler(
    x = filtered_data,
    categorySource = c('KEGG'),
    species = "hsapiens",
    background = unique(data_annotated$Gene))
}