Tools and Workflows | IMPaCT-Data. Infrastructure for Precision Medicine associated with Science and Technology

The software tools for IMPaCT-Data data analysis can be found in the publicly accessible IMPaCT-Data domain on bio.tools. bio.tools is a registry of software components and databases aimed at researchers in the biological and biomedical sciences, making it easier for them to find, understand, use and cite the resources they rely on every day.

The workflows are part of the current project. They are under development at the moment and will be available on this page once they are finished.

FJD-pipeline

Pipeline for calling Single Nucleotide Variants (SNVs) and Copy Number Variants (CNVs). This tool is no longer maintained. See the new version: PARROT-FJD.

PTMCode

PTMCode is a resource of known and predicted functional associations between protein post-translational modifications (PTMs) within and between interacting proteins. It currently contains 316,546 modified sites from 69 different PTM types, which are also propagated through orthologs between 19 different eukaryotic species. A total of 1.6 million sites and 17 million functional associations across more than 100,000 proteins can currently be explored.

XICRA

Small RNAseq pipeline for paired-end reads

IonGAP

Publicly available integrated pipeline designed for the assembly and subsequent analysis of Ion Torrent bacterial sequence data. Both its components and their configuration are based on a research process aimed at discovering the optimal combination of tools for obtaining good results from single-end reads generated by the Ion Torrent PGM sequencer.

vulcanSpot

Tool to prioritize therapeutic vulnerabilities in cancer.

MetaGenyo

MetaGenyo is a simple, ready-to-use software which has been designed to perform meta-analysis of genetic association studies.

DiSMed

DiSMed is a de-identification methodology for Spanish medical texts based on Named Entity Recognition (NER). It is built on spaCy and partially based on the networks designed by Guillaume Genthial, implemented in TensorFlow 1. DiSMed includes both the Python code and the curated dataset, available on request under a research use agreement.

hipathia

Hipathia is a method for the computation of signal transduction along signaling pathways from transcriptomic data. The method is based on an iterative algorithm that computes the signal intensity passing through the nodes of a network by taking into account the level of expression of each gene and the intensity of the signal arriving at it. It also provides a new approach to functional analysis that computes the signal arriving at the functions annotated to each pathway.
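As a rough illustration of the propagation idea described above, the following Python sketch computes a signal value for each node of a small acyclic pathway. It is not the hipathia implementation: the function name, the input dictionaries, the node names, and the specific combination rule (expression multiplied by the combined activating signal and damped by inhibiting signals, all scaled to [0, 1]) are assumptions made for illustration, and a single pass in topological order stands in for the iterative algorithm.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate_signal(expression, activators, inhibitors):
    """Return a dict node -> signal for an acyclic pathway.

    expression: node -> expression level scaled to [0, 1]
    activators: node -> list of upstream nodes that activate it
    inhibitors: node -> list of upstream nodes that inhibit it
    All names and the combination rule are illustrative assumptions.
    """
    upstream = {
        node: set(activators.get(node, [])) | set(inhibitors.get(node, []))
        for node in expression
    }
    signal = {}
    # Visit each node only after all of its upstream nodes have a signal.
    for node in TopologicalSorter(upstream).static_order():
        acts = activators.get(node, [])
        if acts:
            none_active = 1.0
            for a in acts:
                none_active *= 1.0 - signal[a]
            activation = 1.0 - none_active
        else:
            # Assumption: nodes without activators receive full input signal.
            activation = 1.0
        damping = 1.0
        for i in inhibitors.get(node, []):
            damping *= 1.0 - signal[i]
        signal[node] = expression[node] * activation * damping
    return signal


# Hypothetical four-node pathway used only to exercise the function.
pathway_expression = {"receptor": 0.8, "kinase": 0.5, "repressor": 0.5, "effector": 1.0}
pathway_activators = {"kinase": ["receptor"], "effector": ["kinase"]}
pathway_inhibitors = {"effector": ["repressor"]}
signals = propagate_signal(pathway_expression, pathway_activators, pathway_inhibitors)
```

In this toy pathway the effector ends up with a lower signal than the kinase that activates it, because the repressor damps whatever signal arrives.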

APPRIS

Annotates variants with biological data such as protein structural information, functionally important residues, conservation of functional domains and evidence of cross-species conservation.

RD-Connect Genome-Phenome Analysis Platform (GPAP)

An online tool for diagnosis and gene discovery in rare disease research. The platform features allow identifying disease-causing mutations in rare disease patients and linking them with detailed clinical information.

Hipathia-genomics

Using mechanistic models for the clinical interpretation of complex genomic variation. The sustained generation of genomic data in the last decade has increased the knowledge of the causal mutations of a large number of diseases, especially for highly penetrant Mendelian diseases, typically caused by a single gene or a few genes. However, the discovery of causal genes in complex diseases has been far less successful. Many complex diseases are actually a consequence of the failure of complex biological modules, composed of interrelated proteins, which can happen in many different ways, conferring a multigenic nature on the condition that can hardly be attributed to one or a few genes.

EvolClust

EvolClust predicts groups of genes that are conserved in terms of gene order across different species, distinguishing them from the background gene order conservation found between species. We define a cluster as a group of homologous proteins that are found grouped together in at least two different genomes and which are more conserved than what is expected for the pair of genomes. The order of the genes inside the cluster is not necessarily conserved. Pairwise clusters are grouped into multi-species families.

Beyondcell

Beyondcell is a computational methodology for identifying tumour cell subpopulations with distinct drug responses in single-cell RNA-seq data and proposing cancer-specific treatments.

DISGENET

DISGENET is a comprehensive knowledge database integrating and standardizing information on disease-associated genes and variants. It covers the full spectrum of human diseases as well as normal and abnormal traits, including adverse events of drugs. Due to its adherence to FAIR data principles, DISGENET is an interoperable resource supporting a variety of applications in genomic medicine and drug R&D. DISGENET simplifies the process of accessing genetic evidence for diseases and can therefore streamline the incorporation of this type of evidence in research, drug R&D and precision medicine applications. DISGENET is available for free for academic users. License fees apply for commercial use of DISGENET. Learn more at https://www.disgenet.com/plans. DISGENET is the new evolution of the community-recognized DisGeNET platform, cited by over 6000 publications and one of the ELIXIR Recommended Interoperability Resources.

FAIR4Health Data Privacy Tool

This is a standalone desktop application developed by the FAIR4Health project (https://www.fair4health.eu/). The tool aims to handle the privacy challenges posed by sensitive health data. It is designed to work on an HL7 FHIR API so that it can be used on top of any standard FHIR Repository as a toolset for data de-identification, anonymization, and related actions. The tool accesses FHIR resources, presents metadata to the user, guides the user through the configuration to be applied and then outputs the processed FHIR resources.
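To make the idea of de-identifying a FHIR resource concrete, here is a minimal Python sketch that operates on a FHIR Patient resource represented as a plain dictionary. It is not code from the FAIR4Health tool: the function name, the choice of fields to drop, and the rule of truncating the birth date to a year are assumptions for illustration, whereas the real tool lets the user configure what is applied. The sample patient is invented.

```python
import copy


def deidentify_patient(patient):
    """Return a de-identified copy of a FHIR Patient resource (as a dict).

    Illustrative assumptions: direct identifiers are removed outright and
    the birth date is generalized to the year only.
    """
    result = copy.deepcopy(patient)  # never modify the caller's resource
    for field in ("name", "telecom", "address", "identifier"):
        result.pop(field, None)
    birth_date = result.get("birthDate")
    if birth_date:
        result["birthDate"] = birth_date[:4]  # "1984-07-15" -> "1984"
    return result


# Invented example resource, not real patient data.
sample_patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Garcia", "given": ["Ana"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
    "gender": "female",
    "birthDate": "1984-07-15",
}
deidentified = deidentify_patient(sample_patient)
```

Working on a deep copy keeps the original resource intact, which matters when the same record has to be processed under several configurations.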

OpenEBench

OpenEBench (https://openebench.bsc.es) is the ELIXIR benchmarking and technical monitoring platform for bioinformatics tools, web servers and workflows. OpenEBench is part of the ELIXIR Tools platform and its development is led by the Barcelona Supercomputing Center (BSC) in collaboration with partners within ELIXIR and beyond. OpenEBench holds a specific infrastructure to monitor software quality. In an initial analysis phase, BSC has put together a series of quality metrics taken from a number of sources. The sources of such metrics include documents by the Software Sustainability Institute and recommendations for open source software development and for software quality metrics. For each metric, a specific source of information has been chosen and the necessary interface implemented.

FAIR4Health Data Curation Tool

This is a standalone desktop application developed by the FAIR4Health project (https://www.fair4health.eu/). The tool is used to connect health data sources, which can be in various formats (Excel files, CSV files, SQL databases), and migrate the data into an HL7 FHIR Repository. The tool shows the available FHIR profiles to the user so that they can perform mappings appropriately. The tool can also contact a Terminology Server (which is actually another HL7 FHIR Repository) so that data fields can be annotated if coding schemes such as ICD10 or SNOMED-CT are in use.
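The kind of mapping described above, from tabular source data to FHIR resources annotated with a coding scheme, can be sketched in a few lines of Python. This is not how the FAIR4Health tool is implemented: the CSV column names, the function name, and the decision to produce Condition resources are hypothetical, and the rows are invented. Only the FHIR element names and the ICD-10 system URI follow the FHIR specification.

```python
import csv
import io

# System URI that FHIR uses to identify ICD-10 codes.
ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10"


def row_to_condition(row):
    """Map one CSV row (hypothetical columns) to a FHIR Condition dict."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{row['patient_id']}"},
        "code": {"coding": [{"system": ICD10_SYSTEM, "code": row["icd10_code"]}]},
        "onsetDateTime": row["onset_date"],
    }


# Invented input standing in for a CSV data source.
sample_csv = (
    "patient_id,icd10_code,onset_date\n"
    "p001,E11,2019-03-02\n"
    "p002,I10,2020-11-17\n"
)
conditions = [row_to_condition(r) for r in csv.DictReader(io.StringIO(sample_csv))]
```

Each resulting dictionary could then be serialized to JSON and sent to a FHIR Repository; that step is omitted here because it depends on the server in use.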

PhylomeDB

PhylomeDB is a public database for complete catalogs of gene phylogenies. It allows users to interactively explore the evolutionary history of genes through the visualization of phylogenetic trees and multiple sequence alignments. Moreover, PhylomeDB provides genome-wide orthology and paralogy predictions which are based on the analysis of the phylogenetic trees. The automated pipeline used to reconstruct trees aims at providing a high-quality phylogenetic analysis of different genomes, including Maximum Likelihood tree inference, alignment trimming and evolutionary model testing.

GeneCodis

GeneCodis is a web-based tool for the ontological analysis of lists of genes, proteins, and regulatory elements like miRNAs, transcription factors, and CpGs.

mCSEA

Identification of differentially methylated regions (DMRs) in predefined regions (promoters, CpG islands...) from the human genome using Illumina's 450K or EPIC microarray data. Provides methods to rank CpG probes based on linear models and includes plotting functions.

ImaGEO

ImaGEO is a web tool for gene expression meta-analysis that implements a complete meta-analysis workflow starting from Gene Expression Omnibus (GEO) dataset identifiers. The application integrates GEO datasets, applies different meta-analysis techniques and provides functional analysis results in an easy-to-use environment. ImaGEO allows researchers to integrate GEO datasets and perform meta-analysis on them to obtain robust findings for biomarker discovery studies.

MetaPhOrs

MetaPhOrs is a public repository of phylogeny-based orthologs and paralogs that were computed using phylogenetic trees available in twelve public repositories. Currently, over 117,131,162 unique homologs are deposited in the MetaPhOrs database. These predictions were retrieved from 8,246,911 Maximum Likelihood trees for 4,094 species. For each prediction, MetaPhOrs provides a Consistency Score and Evidence Level describing its reliability, together with the number of trees and links to their source databases.

DREIMT

DREIMT is a bioinformatics tool for hypothesis generation and prioritization of drugs capable of modulating immune cell activity from transcriptomics data.

PanDrugs

PanDrugs is a method to prioritize anticancer drug treatments according to individual genomic data. The current version of PanDrugs integrates data from 24 primary sources and supports 56,297 drug-target associations obtained from 4,804 genes and 9,092 unique compounds.

FireDB

A curated inventory of catalytic and biologically relevant small ligand-binding sites.

TRIFID

TRIFID is an ML-based tool trained on the evidence of large-scale proteomics analysis and evolutionary, structural, annotation, splicing, and RNA-seq based features to classify the biologically important splice isoforms.

APID

APID Interactomes provides a comprehensive collection of protein interactomes for more than 500 organisms based on the integration of known experimentally validated protein-protein physical interactions (PPIs). Construction of the interactomes is done with a methodological approach to report quality levels and coverage over the proteomes for each organism included. APID unifies PPIs from primary databases of molecular interactions (BIND, BioGRID, DIP, HPRD, IntAct, MINT) and from experimentally resolved 3D structures (PDB) where more than two distinct proteins have been identified. APID also includes a data visualization web tool that allows the construction of sub-interactomes using query lists of proteins of interest and the visual exploration of the corresponding networks, including an interactive selection of the properties of the interactions (reliability of the "edges") and a mapping of the functional environment of the proteins (functional annotations of the "nodes").

NanoCLUST

A species-level analysis of 16S rRNA nanopore sequencing data based on de novo clustering and consensus building.

NanoRTax

NanoRTax is a taxonomic and diversity analysis pipeline built originally for Nanopore 16S rRNA data with real-time analysis support in mind. It combines state-of-the-art classifiers such as Kraken2, Centrifuge and BLAST with downstream analysis steps to provide a framework for the analysis of in-progress sequencing runs. NanoRTax retrieves the final output files in the same structure/format for every classifier which enables more comprehensive tool/database comparison and better benchmarking capabilities. Additionally, NanoRTax includes a web application (./viz_webapp/) for visualizing complete or partial pipeline outputs. The NanoRTax pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with conda environments and docker containers making installation trivial and results highly reproducible.

PlasmidID

PlasmidID is a mapping-based, assembly-assisted plasmid identification tool that analyzes the data and provides a graphical output for plasmid identification.

iSkyLIMS

Open-source LIMS (Laboratory Information Management System) for Next Generation Sequencing sample management, statistics and reports, and bioinformatics analysis service management.

Taranis

cg/wgMLST allele calling software, schema evaluation and allele distance estimation for outbreak research.

TFTA

Transcription Factor Target Enrichment Analysis

PriorR

PriorR is a prioritization program for disease-linked genetic variants developed within the Genetics & Genomics Department of the Fundación Jiménez Díaz University Hospital. PriorR is designed to analyse the SNV or CNV output of the FJD-pipeline. The program offers a number of useful functionalities for variant analysis, such as filtering by a virtual panel of genes, manual control of different population frequencies or pathogenicity predictors, and filtering out variants that have already been found by another protocol.
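The filtering steps listed above can be pictured with a short Python sketch. It is not taken from PriorR: the function name, the variant field names (`id`, `gene`, `pop_freq`, `pathogenicity`), the default thresholds, and the sample variants are all hypothetical and serve only to show how a virtual gene panel, a population frequency cutoff, a pathogenicity cutoff, and an exclusion list might be combined.

```python
def prioritize_variants(variants, gene_panel, max_pop_freq=0.01,
                        min_pathogenicity=20.0, already_found=frozenset()):
    """Keep variants that pass every filter (all names are hypothetical).

    gene_panel: set of gene symbols forming the virtual panel
    max_pop_freq: highest population allele frequency that is accepted
    min_pathogenicity: lowest predictor score that is accepted
    already_found: ids of variants reported by another protocol
    """
    kept = []
    for variant in variants:
        if variant["gene"] not in gene_panel:
            continue  # outside the virtual panel
        if variant["pop_freq"] > max_pop_freq:
            continue  # too common in the population
        if variant["pathogenicity"] < min_pathogenicity:
            continue  # predicted to be benign
        if variant["id"] in already_found:
            continue  # already reported elsewhere
        kept.append(variant)
    return kept


# Invented variants used only to exercise the filters.
sample_variants = [
    {"id": "v1", "gene": "GENE_A", "pop_freq": 0.0001, "pathogenicity": 28.0},
    {"id": "v2", "gene": "GENE_A", "pop_freq": 0.2, "pathogenicity": 30.0},
    {"id": "v3", "gene": "GENE_B", "pop_freq": 0.0001, "pathogenicity": 25.0},
    {"id": "v4", "gene": "GENE_A", "pop_freq": 0.001, "pathogenicity": 5.0},
    {"id": "v5", "gene": "GENE_A", "pop_freq": 0.001, "pathogenicity": 22.0},
]
prioritized = prioritize_variants(
    sample_variants, gene_panel={"GENE_A"}, already_found={"v5"}
)
```

Exposing the thresholds as parameters mirrors the manual control over frequencies and predictors that the description mentions.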

Mini-IsoQLR

This pipeline was developed to detect and quantify isoforms from the expression of minigenes, whose cDNA was sequenced using Oxford Nanopore Technologies (ONT).

GLOWgenes

Prioritization of disease gene candidates by disease-aware evaluation of heterogeneous evidence networks.

Automatic segmentation tool

Tool that automatically detects the histological type of a tumor region of interest in Whole Slide Imaging (WSI) data.

Slides Viewer

Tool to visualize WSI and navigate through their zones at different zoom levels.

LinkEHR

LinkEHR is a set of tools that enables the semantic interoperability of your data by creating clinical information models (archetypes) and transforming clinical data into standards such as openEHR, HL7 CDA, or ISO 13606.

Liferay

Liferay, Inc., is an open-source company that provides free documentation and paid professional service to users of its software. Mainly focused on enterprise portal technology, the company has its headquarters in Diamond Bar, California, United States.

Metabolizer

Metabolizer is a web tool for the analysis of the modular architecture of metabolic pathways using transcriptomic data. Metabolizer calculates the impact of modules on the production of metabolites. These modules are conserved parts of metabolism that start with one or more substrates and end with a product.

CyPathia

CyPathia is a Cytoscape app that provides a user-friendly and straightforward interface. The CyPathia app is based on the Hipathia Bioconductor package, allowing the Cytoscape community to use mechanistic models.

CoV-Hipathia

A web tool that implements a mechanistic model of human signaling for the interpretation of the consequences of the combined changes of gene expression levels and/or genomic mutations in the context of signalling pathways known to be involved in the infection by SARS-CoV-2, which are updated with the curated versions released by the COVID-19 Disease Map curation project.

impuSARS

impuSARS allows the imputation of viral whole genome sequences from partially sequenced samples. Additionally, impuSARS provides the lineage associated with the imputed sequence. impuSARS has been validated with a reference of SARS-CoV-2 sequences.

SPACNACS

SPACNACS is a crowdsourcing initiative to provide information about Copy Number Variations of the Spanish population to the scientific/medical community.

nf-core-viralrecon

nf-core/viralrecon is a bioinformatics analysis pipeline used to perform assembly and intra-host/low-frequency variant calling for viral samples. The pipeline supports short-read Illumina sequencing data from both shotgun (e.g. sequencing directly from clinical samples) and enrichment-based library preparation methods (e.g. amplicon-based: ARTIC SARS-CoV-2 enrichment protocol; or probe-capture-based).

MIGNON

MIGNON (Mechanistic InteGrative aNalysis Of rNa-seq data) is a versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways.

ExpHunterSuite

ExpHunterSuite is an R package for the comprehensive analysis of transcriptomic data.

MetaFun

MetaFun is a web tool for integrating and functionally characterizing multiple omics studies, unveiling sex differences through comprehensive functional meta-analysis.

DomFun

DomFun is a system to assign functions to unknown proteins using a systemic approach that considers not their sequence but their domains associated with functional systems. It uses associations calculated between protein domains and functional annotations as a training dataset and performs predictions over proteins (using UniProt identifiers) by finding their domains and checking whether these have been associated with functional annotations (GO molecular functions, GO biological processes, and KEGG and Reactome pathway terms).
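The general idea of transferring functional annotations from domains to proteins can be sketched as follows in Python. This is only an illustration and not DomFun's algorithm: the function name, the input structures, the rule of keeping the best score per function, and the sample identifiers and scores are all assumptions, and the way DomFun actually combines domain-level associations is not reproduced here.

```python
def predict_functions(protein_domains, domain_function_scores, min_score=0.0):
    """Rank candidate functions for each protein from its domains.

    protein_domains: protein id -> list of domain ids
    domain_function_scores: domain id -> {function label: association score}
    Assumption: a function's score for a protein is the best score among
    the protein's domains; ties are ordered alphabetically.
    """
    predictions = {}
    for protein, domains in protein_domains.items():
        best = {}
        for domain in domains:
            for function, score in domain_function_scores.get(domain, {}).items():
                if score >= min_score and score > best.get(function, float("-inf")):
                    best[function] = score
        predictions[protein] = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return predictions


# Invented identifiers and scores, for illustration only.
example_domains = {"PROT_A": ["DOM_1", "DOM_2"], "PROT_B": ["DOM_3"]}
example_scores = {
    "DOM_1": {"kinase activity": 0.9, "ATP binding": 0.4},
    "DOM_2": {"ATP binding": 0.7},
}
function_predictions = predict_functions(example_domains, example_scores)
```

A protein whose domains have no recorded associations, like `PROT_B` here, simply receives an empty prediction list.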