FANTASIA (Functional ANnoTAtion based on embedding space SImilArity)

Pipeline for functional annotation of proteins using language models

Info

Members (researchers): Ana Rojas

Research Groups: Computational Biology and Bioinformatics (CBBIO.CABD)

Contact Email: frapercan1@alum.us.es, gemma.martinez@ibe.upf-csic.es, a.rojas.m@csic.es, rosa.fernandez@ibe.upf-csic.es

Tool Repository: https://github.com/CBBIO/FANTASIA

Documentation: https://fantasia.readthedocs.io/en/latest/

Publications DOI: https://doi.org/10.1101/2024.02.28.582465, https://doi.org/10.1101/2024.02.14.580341

Institution: Centro Andaluz de Biología del Desarrollo (CABD)

Application domain:

Applications of Computational Biology, Artificial Intelligence, Data Analysis, Function prediction, Functional annotation, Functional genomics, Large language models, Model organisms

Technical details

Type of application

Command line pipeline
Galaxy tool
Singularity container

Software compatibility

Linux

Hardware requirements

For an average peptide FASTA file containing 20,000 sequences, around 50 GB of RAM is needed. Memory requirements grow with the number of sequences.
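
Because memory use depends on the number of sequences, it can help to count the records in a FASTA file before launching a run. The following is a minimal sketch and not part of FANTASIA itself: the function name `count_fasta_records` and the in-memory example data are hypothetical, and the only assumption is the standard FASTA convention that each record starts with a `>` header line.

```python
import io
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


# Tiny in-memory example (made-up sequences); for a real file,
# pass an open file handle instead, e.g. open("proteins.fasta").
example = io.StringIO(">prot1\nMKTAYIAK\n>prot2\nMSLLTEVE\nTPIRNEWG\n")
n_records = count_fasta_records(example)
print(n_records)
```

The count can then be compared against the 20,000-sequence reference point above to judge whether the available RAM is likely to be sufficient.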

Programming language

Bash
Python

Type of containerization

Singularity/Apptainer

Wrapper type

None

Input file formats

FASTA

Output file formats

Other

Compatibility with other tools

Output files are already formatted for use as input to the topGO R package (GO enrichment).
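
As an illustration of what a topGO-ready file can look like, the sketch below parses the gene-to-GO mapping layout that topGO's `readMappings` reads by default: one line per gene, with the gene identifier, a tab, and a comma-separated list of GO identifiers. It is an assumption that FANTASIA's output follows exactly this layout; the function name `parse_gene2go`, the protein identifiers, and the annotations are hypothetical and serve only as an example.

```python
from typing import Dict, Iterable, List


def parse_gene2go(
    lines: Iterable[str], sep: str = "\t", id_sep: str = ","
) -> Dict[str, List[str]]:
    """Parse 'gene<sep>GO:1, GO:2' lines into a gene -> list of GO IDs mapping."""
    mapping: Dict[str, List[str]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        gene, _, terms = line.partition(sep)
        # Split the GO list and drop surrounding whitespace and empty entries.
        mapping[gene.strip()] = [t.strip() for t in terms.split(id_sep) if t.strip()]
    return mapping


# Made-up example lines in the assumed layout.
example_lines = [
    "prot1\tGO:0008150, GO:0003674\n",
    "prot2\tGO:0005575\n",
]
gene2go = parse_gene2go(example_lines)
print(gene2go["prot1"])
```

A quick check like this can confirm that every protein has at least one GO term before the file is passed to topGO.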

Tool