From Bioinformatics Core Wiki
Revision as of 13:53, 5 September 2016 by Jponomarenko


The CRG Bioinformatics Core Facility provides researchers at CRG-CNAG/PRBB and at external organizations, both academic and commercial, with consultation and data analysis services, with a focus on Next Generation Sequencing (NGS) and other high-throughput experiments. To support researchers at every step, from experiment planning to delivery of results, we work in synergy with the other CRG core facilities.

We also provide training in basic and advanced bioinformatics techniques, and we support CRG core facilities and CRG groups with the development and maintenance of data management resources, including databases, LIMS, web tools, and software.

In addition to services provided for a fee, we support fully collaborative grant-funded investigations. This includes preliminary data analysis, planning the grant budget and the experiments to be carried out by the CRG core facilities, writing relevant sections of the proposal, data analysis and biological inference, custom software development, and co-authored dissemination of the grant results.


To request a service or a free consultation, or to propose a collaborative project or a grant proposal, please contact us via email or phone. We encourage researchers to discuss both experimental and bioinformatics procedures with us before submitting materials for sequencing at the CRG Genomics Unit. We can help select cost-effective approaches.

Once the services and deliverables are agreed upon, we issue an official quotation, which the user must approve in writing (via e-mail). By accepting the quotation, the user enters into a contract with CRG and agrees to the Terms and Conditions of the service.

Please note that payment of fees for services and authorship are not mutually exclusive. Any member of the Core staff who has participated in the work sufficiently to take public responsibility for appropriate portions of the content should be recognized as a co-author; co-authorship should follow commonly accepted scientific practice. The recovery of Core expenses through the recharge system does not exclude the possibility of authorship for Core personnel. Similarly, authorship does not substitute for payment of Core expenses for services rendered.

All work performed by the CRG Bioinformatics Core should be acknowledged in scholarly publications, posters, and presentations by a direct statement in the acknowledgement section: “The authors would like to thank <Name(s) of Consultant(s)> of the <Name(s) of the facility(s)> of the Centre for Genomic Regulation for assistance with <services performed>.”

Our turnaround time for completing a request is 3-10 business days, depending on the complexity of the request and our current capacity.

Throughout the project, we document our work, track personnel and computational hours, regularly update the researchers on the project's progress, and revise the initial goals if needed. To avoid unnecessary expenses, if we spot a problem with data quality, we report it right away.

When the request is completed, we issue an invoice based on the hours actually recorded (please refer to our FEES). We can also provide a final written report, help prepare relevant sections of publications, and handle submission of data to public repositories.

The original and derived data are guaranteed to be stored at CRG for 6 months after the request is completed.


All our communication with users, including consultations, meetings, quotations, and e-mail and Skype communication, is free of charge. Please refer to the cost estimates for standard bioinformatics services and to the CRG webpage for our current fees for manual hours (Data Analysis, Database Maintenance Support, Programming / Database Development) and computing hours (Automatic Data Analysis) for CRG/PRBB and other public organizations. Prices for commercial users are subject to negotiation.


The services provided are subject to the CRG Core Facilities Terms and Conditions of the service.


  • Reference-based and de novo assembly of eukaryotic and prokaryotic genomes.
  • Genome re-sequencing and quality assessment of genome assemblies.
  • ChIP-seq (TFs, histone modifications): peak calling, differential binding analysis among sample groups, peak annotation.
  • Whole exome and whole genome analysis: variant calling, CNVs.
  • Identification and annotation of DNA structural variants for common and rare human diseases: individual and family analysis, cancer driver gene mutations.
  • Genome comparison.
  • Genome functional annotation: ab initio gene prediction, annotation of genes, transcripts, DNA motifs, promoters, and other DNA regulatory elements.
  • Analysis of 5C, Hi-C, ATAC-seq, and other high-throughput data.


  • Reference-based and de novo assembly of eukaryotic and prokaryotic transcriptomes.
  • Transcriptome functional annotation: ab initio gene prediction, annotation of genes, transcripts, DNA motifs, promoters, and other regulatory elements.
  • Variant calling from transcriptome sequencing data.
  • Analysis of commercial and custom microarrays: differentially expressed genes, group comparison.
  • RNA-seq for mRNA: discovery of new transcripts, differentially expressed genes/transcripts.
  • Functional analysis of differentially expressed genes/transcripts: Gene Ontology terms, DNA motifs, and pathways enrichment analysis.
  • RNA-seq for small and non-coding RNA: differential expression, discovery of new microRNAs, microRNA target prediction.
  • Analysis of OpenArray real-time PCR, and other high-throughput experimental data.
  • Identification of batch effects and visualization of data and results: hierarchical clustering, heatmaps, dendrograms, volcano plots, principal components analysis for the overall (dis)similarity among experiments.
  • RNA-target-based sequencing: RIP-seq, iCLIP, CLIP-seq, and others.
  • Analysis of B and T cell repertoires (adaptive immune receptor repertoires, or AIRR) from high-throughput sequencing data: germline allele assignment, identification of clones, visualization of clonal frequencies.
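
As an illustration of the enrichment analysis mentioned in the list above, here is a minimal sketch of a one-sided hypergeometric test, a test commonly used to ask whether a Gene Ontology term is over-represented among differentially expressed genes. It is not necessarily the method used by the Core; the function name `enrichment_pvalue` and the gene counts in the usage example are hypothetical.

```python
from math import comb


def enrichment_pvalue(population: int, annotated: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes carrying the term of interest
    selected:   differentially expressed genes
    overlap:    differentially expressed genes carrying the term
    """
    if not (0 <= annotated <= population and 0 <= selected <= population):
        raise ValueError("annotated and selected must lie between 0 and population")
    max_overlap = min(annotated, selected)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must lie between 0 and min(annotated, selected)")
    # Number of ways to draw `selected` genes from the background.
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20000 background genes, 300 with the term,
    # 500 differentially expressed genes, 20 of which carry the term.
    print(enrichment_pvalue(20000, 300, 500, 20))
```

When many terms are tested at once, the resulting p-values would normally be adjusted for multiple testing before interpretation.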


  • Analysis of amplicon (16S rRNA genes), whole genome and transcriptome shotgun sequencing data.
  • Identification of microbial communities, taxonomic diversity and abundances at the levels of genus, family, order, class, phylum.
  • Conservation and abundance of bacterial gene functional modules and biochemical pathways.
  • Estimation of microbial diversity and sequence coverage.
  • ORF prediction and functional annotation.
  • Phylogenetic analysis.
  • Comparative analysis of samples: microbial profiles, Gene Ontology terms, metabolic and pathway analyses.
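
To make the abundance and diversity items in the list above concrete, the following minimal sketch converts per-taxon read counts into relative abundances and computes the Shannon diversity index, one common diversity measure. It is an illustration only, not the Core's pipeline; the function names and the genus counts in the usage example are hypothetical.

```python
from math import log
from typing import Dict, Mapping


def relative_abundances(counts: Mapping[str, int]) -> Dict[str, float]:
    """Convert raw read counts per taxon into proportions that sum to 1."""
    if any(n < 0 for n in counts.values()):
        raise ValueError("read counts must be non-negative")
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one taxon must have a positive read count")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts: Mapping[str, int]) -> float:
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with p_i > 0."""
    proportions = relative_abundances(counts).values()
    return -sum(p * log(p) for p in proportions if p > 0)


if __name__ == "__main__":
    # Hypothetical genus-level read counts for a single sample.
    sample = {"Bacteroides": 500, "Prevotella": 300, "Lactobacillus": 200}
    print(relative_abundances(sample))
    print(shannon_index(sample))
```

Because the index depends on sequencing depth, samples are usually rarefied or otherwise normalized before their diversity values are compared.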


  • Protein functional annotation and prediction.
  • Analysis of the effects of SNPs and other variations on protein structure and function.
  • Multiple sequence alignment.
  • Ortholog and paralog assignment.
  • Phylogenetic analysis and tree construction.
  • Protein structure comparison and 3D homology modeling.
  • Protein-protein and protein-ligand 3D docking.
  • B- and T-cell epitope prediction.


We have extensive experience in the design, development, and support of the following bioinformatics resources (browse our related projects here):

  • Databases: Relational and NoSQL.
  • Websites for data submission, search, and analysis.
  • Web-tools.
  • LIMS (Laboratory Information Management Systems) for management of the laboratory's operations, data flow, and communication with users and external collaborators.
  • Software evaluation and benchmarking.
  • Software development: bioinformatics scripts; data processing and analysis pipelines; integrative bioinformatics web applications; customized genome browsers.
  • Nextflow pipelines.
  • External and internal data integration solutions.


Please contact us with any bioinformatics problem you may have, even if it does not fit the services listed here. As members of several bioinformatics communities and alliances, we may be able to find a solution or an expert in the field. Our custom and additional services include the following:

  • Support in the design and analysis of customized NGS experiments.
  • Statistical analysis and plots: R scripts, descriptive and inferential statistics, hypothesis testing, sample size estimation, PCA, clustering, linear regression, correlation, ANOVA.
  • One-to-one training.
  • Data submission to GEO, ArrayExpress, SRA, and other public data repositories.
  • Mining public databases and publications and re-analyzing published data.
  • Manuscript preparation: writing methods, results, results interpretation and visualization.
  • Grant support: writing methods, methodology for data management and sharing, experimental and bioinformatics analysis design, obtaining preliminary results, result interpretation.
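
As an example of the sample size estimation mentioned under statistical analysis above, here is a minimal sketch for a two-sided comparison of two group means using the normal approximation. It assumes equal group sizes and equal variances, with the effect size expressed as Cohen's d; the function name and default values are illustrative assumptions, not the Core's procedure.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate samples needed per group to compare two means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical scenario: medium effect (d = 0.5), 5% significance, 80% power.
    print(sample_size_per_group(0.5))
```

An exact calculation based on the t distribution gives slightly larger numbers for small groups, so this approximation is best treated as a lower bound when planning an experiment.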
Bioinformatics Core Facility @ CRG — 2011-2023