Summer School 2016 in Metagenomics


The French Institute of Bioinformatics, France Génomique and Institut Pasteur are organizing the first Summer School in Metagenomics (2016).

Participants are invited to arrive from 1:15 pm at 28 rue du Docteur Roux and to bring a piece of identification.


Date: Monday 12 – Friday 16 September 2016

Venue: Institut Pasteur, 28 rue du Docteur Roux, 75015 PARIS (France) – Auditorium François Jacob

Application opens: Thursday May 13 2016

Application deadline: Monday July 11 2016

Participation: Open application (120 participants), with selection for the hands-on tutorials (30 places)

Contact: Scientific contact : 
Organization contact :

Registration is free of charge. Registration for the tutorials is closed.


Metagenomics, the sequencing of DNA directly from a sample without first culturing and isolating the organisms, has become the principal tool of “meta-omic” analysis. It can be used to explore the diversity, function, and ecology of microbial communities.

The aim of this 4-day workshop is to give researchers and students an overview of the tools and bioinformatics techniques available for the analysis of next-generation sequence data from microbial communities. Its content will focus on the taxonomic assignment and the functional analysis of metatranscriptomic and metagenomic data. The format will comprise a mixture of lectures and hands-on practical tutorials in which students will process example data sets in real time.

Summer School on the website of Institut Pasteur

The workshop comprises two parts


Part I: lectures
(Monday 12/09, 2:00 pm to Wednesday 14/09, 12:30 pm)

The lectures will focus on state-of-the-art de novo, similarity-based and composition-based tools for the taxonomic assignment (binning) of unassembled and assembled samples, draft genome recovery for abundant community members, metatranscriptomic analysis, and an overview of functional annotation methods. These methodological presentations will be illustrated by several applications in marine, soil, cloud, and microbiota metagenomics.


Invited Speakers:

» Matthieu Almeida (Center for Bioinformatics and Computational Biology, College Park, MD, USA)
» Karine Clément (Institut hospitalo-universitaire de Cardiologie, Paris, FR)
» A. Murat Eren (University of Chicago, USA)
» Sebastian Luecker (Department of Microbiology, Radboud University Nijmegen, NL)
» Folker Meyer (Institute for Genomics and Systems Biology, Argonne National Laboratory, Chicago, USA)
» Alex Mitchell (The European Bioinformatics Institute, InterPro and EBI Metagenomics database, UK)
» Pascal Simonet (Ecole Centrale de Lyon, FR)
» Gabriel Valiente (Technical University of Catalonia, Department of Computer Science, ES)
» Patrick Wincker (CEA/Genoscope, Evry, FR)



Part II: hands-on tutorials
(Wednesday 14/09, 2:00 pm to Friday 16/09, 12:30 pm)

These tutorials are designed as self-contained units that include example data and pre-installed bioinformatics tools. They will demonstrate the use of the following metagenome analysis tools:

FROGS: Find Rapidly OTU with Galaxy Solution [Géraldine Pascal et al. INRA Toulouse & Olivier Rué, Jouy, FR]
cDPCoA: Constrained DPCOA for community comparison [Stéphane Dray, Lyon, FR]
SHAMAN: SHiny Application for Metagenomic ANalysis [Amine Ghozlane, C3BI, Institut Pasteur, FR]
EggNOG, fetchMG, iVireon and PICRUSt for the functional annotation of metagenomic data [Mathieu Almeida, Center for Bioinformatics and Computational Biology, College Park, MD, USA]
Anvi’o: an advanced analysis and visualization platform for ‘omics data [A. Murat Eren, Univ. Chicago, USA]


Scientific committee:

Erwan Corre, ABiMS, Roscoff
Jean-Michel Claverie, PACA-Bioinfo, Marseille
Jean-François Gibrat, IFB, Gif/Yvette
Sean Kennedy, Institut Pasteur, Paris
Claudine Médigue, MicroScope, Evry
Eric Pelletier, GENOSCOPE, Evry
Guy Perriere, PRABI, Lyon
Pierre Peyret, AUDI, Clermont


Involved IFB/FG platforms:

IFB core (Gif-sur-Yvette), PRABI (CNRS, Lyon), C3BI (Institut Pasteur, Paris), ABiMS (CNRS/UPMC, Roscoff), Genotoul (INRA, Toulouse), MIGALE (INRA, Jouy), MicroScope (CEA/CNRS, Evry)

Part I: lectures


Monday 12/09/2016

1:45 pm Presentation of the workshop (Chairman: Claudine Médigue)
2:00 pm – 2:30 pm Pierre Peyret, EA 4678 CIDAM, Université d’Auvergne, Clermont-Fd, France
« Metagenomics to illuminate ecosystem functioning »
2:30 pm – 3:00 pm Sean Kennedy, Bio-Omics Pole, Pasteur Institute, Paris, FR
« From Samples to Data : Assuring Downstream Analysis with Upstream Planning »
3:00 pm – 3:45 pm Matthieu Almeida, Center for Bioinformatics and Computational Biology, College Park, MD, USA
« Deciphering the human intestinal tract microbiome using metagenomic computational methods »
3:45 pm – 4:15 pm Break
4:15 pm Session 2 (Chairman: Pierre Peyret)
4:15 pm – 5:00 pm Karine Clément, Institut hospitalo-universitaire de Cardiologie, Métabolisme, Nutrition (ICAN), Paris, FR
"Gut metagenomics in cardiometabolic diseases"
5:00 pm – 5:30 pm Romain Koszul, Spatial Regulation of Genomes, Pasteur Institute, Paris, FR
« Exploiting collisions between DNA molecules to characterize the genomic structures of complex communities »
5:30 pm – 6:00 pm Eric Dugat-Bony, Génie et Microbiologie des Procédés Alimentaires, INRA Grignon, FR
« Who is doing what on the cheese surface? Overview of the cheese microbial ecosystem functioning by metatranscriptomic analyses »


Tuesday 13/09/2016

9:00 am Session 3 (Chairman: Sean Kennedy)
9:00 am – 9:45 am Gabriel Valiente, Technical University of Catalonia, Department of Computer Science, Barcelona, ES
« Taxonomic Assignment: From Amplicon to Shotgun Sequencing »
9:45 am – 10:15 am Jean Michel Claverie, Structural & Genomic Information Lab. (IGS), Marseille, FR
« Rationale and Tools to look for the unknown in (metagenomic) sequence data »
10:15 am – 10:45 am Eric Coissac, Laboratoire d’Ecologie Alpine (LECA), Grenoble, FR
"Sequencing 6000 chloroplast genomes : the PhyloAlps project"
10:45 am – 11:15 am Break
11:15 am Session 4 (Chairman: Eric Pelletier)
11:15 am – 11:45 am Pascal Simonet, Environmental Microbial Genomics Group Laboratoire Ampère, Ecole Centrale de Lyon, FR
"Potentials and limitations of soil metagenomics for fundamental studies and industrial applications."
11:45 am – 12:15 pm Pierre Peterlongo, GenScale team, INRIA/IRISA, Rennes, FR
« Multiple Comparative Metagenomics using Multiset k-mer Counting »
12:15 pm – 12:45 pm Sébastien Terrat, Interaction Plantes Micro-organismes, Agroécologie, INRA, Dijon, FR
“Assessing microbial biogeography by using a metagenomic approach”
12:45 pm – 2:00 pm Lunch
2:00 pm Session 5 (Chairman: Damien Eveillard)
2:00 pm – 2:45 pm A. Murat Eren, University of Chicago, Department of Medicine, USA
« Reconstructing genomes from metagenomes: The holy grail of microbiology »
2:45 pm – 3:15 pm Violette Da Cunha, Institut Pasteur, Unité de Biologie Moléculaire du Gène chez les Extrêmophiles (BMGE) & Institute for Integrative Biology of the Cell (I2BC), Paris, FR
"Dr Jekyll and Mr Hyde: The dual face of metagenomics in phylogenetic analysis"
3:15 pm – 3:45 pm Guy Perrière, LBBE "Biométrie et Biologie Évolutive", Lyon, FR
« Prokaryotic Phylogeny on the Fly: databases and tools for online taxonomic identification »
3:45 pm – 4:15 pm Break
4:15 pm Session 6 (Chairman: J.M. Claverie)
4:15 pm – 5:00 pm Alex Mitchell, European Bioinformatics Institute (EBI), Cambridge, UK
« 200 billion sequences and counting: analysis, discovery and exploration of datasets with EBI Metagenomics »
5:00 pm – 5:30 pm Pierre Amato, Institut de Chimie de Clermont-Ferrand (ICCF/BIOMETA), Clermont-Ferrand, FR
"Structure and functioning of cloud microbiota"
5:30 pm – 6:00 pm Hélène Touzet, BONSAI group at CRIStAL and INRIA, Bioinformatics and Sequence Analysis, Lille, FR
« Fast filtering, mapping and assembly of 16S ribosomal RNA »



Wednesday 14/09/2016

8:45 am Session 7 (Chairman: J.F. Gibrat)
8:45 am – 9:30 am Patrick Wincker, CEA/GENOSCOPE, Laboratoire d’Analyses Génomiques des Eucaryotes (LAGE), Evry, FR
« Holistic metagenomics in marine communities »
9:30 am – 10:00 am Chantal Abergel, Structural & Genomic Information Lab. (IGS), Marseille, FR
« Hidden in the permafrost »
10:00 am – 10:45 am Folker Meyer, Institute for Genomics and Systems Biology, Argonne National Laboratory, Argonne, USA
« MG-RAST — experiences from processing a quarter million metagenomic data sets »
10:45 am – 11:15 am Break
11:15 am Session 8 (Chairman: Guy Perrière)
11:15 am – 11:45 am Sebastian Luecker, Department of Microbiology, IWWR, Radboud University, NL
« New perspectives on nitrite-oxidizing bacteria - linking genomes to physiology »
11:45 am – 12:15 pm Eric Pelletier, CEA/GENOSCOPE, Laboratoire d’Analyses Génomiques des Eucaryotes (LAGE), Evry, FR
"Marine planktonic eukaryotic metatranscriptomics : the Tara Oceans project"
12:15 pm – 12:45 pm Damien Eveillard, Laboratoire d'Informatique de Nantes Atlantique, irisa, Nantes, FR
« Revealing and analyzing microbial networks: from topology to functional behaviors »
12:45 pm end of the first part of the workshop

Prokaryotic Phylogeny on the Fly: databases and tools for online taxonomic identification
Guy Perrière - CNRS, Lab. BBE, Lyon

PPF (Prokaryotic Phylogeny on the Fly) is an automated pipeline for computing molecular phylogenies of prokaryotic organisms. It is based on a set of specialized databases devoted to SSU rRNA, the most commonly used marker for bacterial taxonomic identification. These databases are split into subsets using phylogenetic information.

The procedure for computing a phylogeny is completely automated. Homologous sequences are first recruited through a BLAST search performed with a query sequence (or a set of sequences). The homologous sequences detected are then aligned using one of the multiple sequence alignment programs provided in the pipeline (MAFFT, MUSCLE or CLUSTALO). The alignment is filtered using BMGE, and a Maximum Likelihood (ML) tree is computed using the program FastTree. The tree can be rooted with an outgroup provided by the user, and its leaves are coloured according to the taxonomy of the sequences.

The main advantage of PPF is that its databases are generated using a phylogeny-oriented procedure and are therefore much more efficient for phylogenetic analyses than "generic" collections such as SILVA (in the case of SSU rRNA) or GenBank. It is therefore much better suited to computing prokaryotic molecular phylogenies than related systems such as the online system.

PPF can be accessed online at


Exploiting collisions between DNA molecules to characterize the genomic structures of complex communities
Romain Koszul, Spatial Regulation of Genomes, Institut Pasteur, Paris

Meta3C is an experimental and computational approach that exploits the physical contacts experienced by DNA molecules sharing the same cellular compartments. These collisions provide quantitative information that makes it possible to interpret and phase the genomes present within complex mixes of species without prior knowledge. Not only does the exploitation of chromosome 3D physical signatures hold promise for resolving the genome sequences of discrete species, it also allows mobile elements such as plasmids or phages to be assigned to their hosts.


Multiple Comparative Metagenomics using Multiset k-mer Counting 
Pierre Peterlongo, Scalable Optimized and Parallel Algorithms for Genomics, INRIA, Rennes

Large-scale metagenomic projects aim to extract biodiversity knowledge by comparing different environmental conditions. Current methods for comparing microbial communities face important limitations. Those based on taxonomic or functional assignment rely on the small subset of sequences that can be associated with known organisms. On the other hand, de novo methods, which compare the whole set of sequences, do not scale to ambitious metagenomic projects.
These limitations motivated the development of a new de novo comparative metagenomics method called Simka. This method computes a large collection of standard ecological distances by replacing species counts with k-mer counts. Simka scales to today's metagenomic projects thanks to a new parallel strategy for counting k-mers across multiple datasets.
Experiments on public Human Microbiome Project datasets demonstrate that Simka captures the essential underlying biological structure. Simka was able to compute both qualitative and quantitative ecological distances on hundreds of metagenomic samples (690 samples, 32 billion reads) in a few hours. We also demonstrate that results obtained by analyzing metagenomes at the k-mer level are highly correlated with those of extremely precise de novo comparison techniques that rely on an all-versus-all sequence alignment strategy.
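
The idea of replacing species counts by k-mer counts can be illustrated with a minimal Python sketch. It is not Simka's implementation: the function names are hypothetical, reads are assumed to be plain strings held in memory, reverse complements are ignored, and nothing is parallelized. It only shows how one quantitative distance (Bray-Curtis) and one qualitative distance (Jaccard) can be computed from k-mer counts of two samples.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a collection of reads (toy, in-memory)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis(a, b):
    """Quantitative (abundance-based) dissimilarity between two k-mer count tables."""
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total if total else 0.0


def jaccard_distance(a, b):
    """Qualitative (presence/absence) distance between two k-mer sets."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / len(union)


if __name__ == "__main__":
    # Two toy "samples", each made of a single short read.
    sample_a = kmer_counts(["ACGTAC"], k=3)
    sample_b = kmer_counts(["ACGTTT"], k=3)
    print(bray_curtis(sample_a, sample_b))       # 0.5
    print(jaccard_distance(sample_a, sample_b))  # about 0.667
```

In this toy case the two samples share two of their four 3-mers each, so half of the total k-mer abundance is shared and the Bray-Curtis dissimilarity is 0.5.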


Structure and functioning of cloud microbiota
Pierre Amato - CNRS, UMR 6296, ICCF, BP 80026, F-63178 Aubière, France.
Clermont Université, Université Blaise Pascal, Institut de Chimie de Clermont-Ferrand (ICCF), BP 10448, F-63000 Clermont-Ferrand, France.

The atmosphere carries microorganisms and connects distant ecosystems. In addition to the underlying epidemiological issues associated with the presence of living microorganisms in the air, it was shown recently that they can contribute to atmospheric physico-chemical processes. Clouds are thus now considered in some respects as habitats for microorganisms, albeit temporary by nature. Our first culture-based studies of cloud microflora recovered from the atmospheric observatory at the puy de Dôme mountain summit (1465 m asl.), in the early 2000s, revealed a high diversity in the microbial community, dominated by a few genera of bacteria and fungi (Pseudomonas, Sphingomonas, Dioszegia…). The advent of new DNA amplification methods (MDA), associated with next-generation sequencing tools, makes it possible to clarify our vision of cloud biodiversity and its functioning, while overcoming the difficulties raised by the low biomass in these environments (~10^4 cells m^-3). Cloud water metagenomes, metatranscriptomes and amplicon libraries (16S and 18S rRNA genes) were therefore investigated. The results clearly showed a high taxonomic diversity in both prokaryotes and eukaryotes, but a very uneven distribution with a few abundant and numerous rare OTUs. The large domination of Proteobacteria was confirmed, and the presence of noticeable groups such as viruses, Cyanobacteria and Archaea was revealed. The active biodiversity was largely related to some groups of bacteria, notably Alpha-Proteobacteria. Analyses of metatranscriptomes and mRNA-enriched metatranscriptomes showed a large overrepresentation of functions related to metabolic regulation, genome reorganization, access to substrates and defense against oxidants. These new pictures of cloud microbial communities indicate that these environments are open to numerous taxa, but that only a few can actually maintain themselves. They must rapidly adjust their functioning to survive in these inhospitable environments, suggesting that atmospheric transport probably exerts strong selection on the microorganisms of outdoor surfaces, and thus drives microbial evolution to some extent.

Reconstructing genomes from metagenomes: The holy grail of microbiology 
A. Murat Eren, University of Chicago, Department of Medicine, USA

Shotgun metagenomics provides insights into a larger context of naturally occurring microbial genomes when short reads are assembled into contiguous DNA segments (contigs). Contigs are often orders of magnitude longer than individual sequences, offering improved annotations and key information about the organization of genes in cognate genomes. Several factors affect assembly performance, and the feasibility of assembly-based approaches varies across environments. However, increasing read lengths, novel experimental approaches, advances in computational tools and resources, and improvements in assembly algorithms and pipelines render the assembly-based metagenomic workflow more and more accessible. The utility of metagenomic assembly increases remarkably when contigs are organized into metagenome-assembled genomes (MAGs). MAGs, which are often novel, frequently provide deeper insights into bacterial lifestyles that would otherwise remain unknown, as evidenced by recent discoveries. The increasing rate of MAG recovery presents new opportunities to link the environmental distribution patterns of microbial populations with their functional potential, and transforms the field of microbiology by providing a more complete understanding of microbial life, ecology, and evolution.


Gut metagenomics in cardiometabolic diseases
Karine Clément, MD, PhD, Institute of Cardiometabolism and Nutrition (ICAN), Pitié-Salpêtrière hospital,
INSERM / Sorbonne University / Université Pierre et Marie Curie, Paris

Cardio-metabolic and nutrition-related diseases (CMDs) represent an enormous burden for health care. They are characterized by very heterogeneous phenotypes that progress with time. It is virtually impossible to predict who will or will not develop cardiovascular comorbidities. There is a clear need to intervene earlier in the natural cycle of the disease, before irreversible tissue damage develops. Predictive tools still remain elusive, and environmental factors (food, nutrition, physical activity and psychosocial factors) play major roles in the development of these interrelated pathologies. A poor nutritional environment and lifestyle also promote health deterioration, resulting in CMD progression. In the last few years, the characterization of the gut microbiome (i.e. the collective bacterial genome) and gut-derived molecules (i.e. metabolites, lipids, inflammatory molecules) has opened up new avenues for the generation of fundamental knowledge regarding putative shared pathways in CMD. The gut microbiome is likely to have an even greater impact than genetic factors, given its close relationship with environmental factors. In metabolic disorders, the discovery that low bacterial gene richness is associated with cardiovascular risks encourages these developments. Due to the complexity of the gut microbiome and its interactions with human (host) metabolism as well as with the immune system, it is only through integrative analyses, in which metabolic network models are used as a scaffold for analysis, that it will be possible to identify markers and shared pathways, which will help to improve patient stratification and develop new modes of patient care.

References
- Aron-Wisnewsky J, Clément K. The gut microbiome, diet, and links to cardiometabolic and chronic disorders. Nature Reviews Nephrology. 2016.
- Dao MC, Everard A, Aron-Wisnewsky J, Sokolovska N, Prifti E, Verger EO, Kayser BD, Levenez F, Chilloux J, Hoyles L; MICRO-Obes Consortium, Dumas ME, Rizkalla SW, Doré J, Cani PD, Clément K. Akkermansia muciniphila and improved metabolic health during a dietary intervention in obesity: relationship with gut microbiome richness and ecology. Gut. 2015 Jun 22.
- Cotillard A, Kennedy SP, Kong LC, Prifti E, Pons N, Le Chatelier E, Almeida M, Quinquis B, Levenez F, Galleron N, Gougis S, Rizkalla S, Batto JM, Renault P; ANR MicroObes consortium, Doré J, Zucker JD, Clément K, Ehrlich SD. Dietary intervention impact on gut microbial gene richness. Nature. 2013 Aug 29;500(7464):585-8.


Soil metagenomics, potential and pitfalls
Pascal SIMONET, 
Environmental Microbial Genomics group, Ampère-UMR CNRS 5005, ECL and University of Lyon, 69134 Ecully cedex, France.

Soil microorganisms are responsible for a range of critical functions, including those that directly affect our quality of life (e.g., antibiotic production and resistance for human and animal health, nitrogen fixation for agriculture, pollutant degradation for environmental bioremediation). Nevertheless, genome structure information has been restricted to a large extent to a small fraction of cultivated species. This limitation can now be circumvented by modern alternative approaches, including metagenomics and single-cell genomics. Metagenomics involves processing DNA sequences from many members of the microbial community, in order either to extract a specific microorganism’s genome sequence or to evaluate the community function based on the relative quantities of different gene families. In my talk I will show how these metagenomic datasets can be used to estimate and compare the functional potential of microbial communities from various environments, with a special focus on antibiotic resistance genes. However, metagenomic datasets can also in some cases be partially assembled into longer sequences representing microbial genetic structures, in order to correlate different functions with their co-location on the same genetic structure. I will show how the microbial community composition of a natural grassland soil, characterized by extremely high microbial diversity, could be managed in order to sequentially attempt the reconstruction of some bacterial genomes.

Metagenomics can also be used to exploit the genetic potential of environmental microorganisms. I will present an integrative approach coupling an rrs phylochip and high-throughput shotgun sequencing to investigate the shift in bacterial community structure and functions after incubation with chitin. In a second step, functions of potential industrial interest can be discovered by hybridizing soil metagenomic DNA clones spotted on high-density membranes with a mix of oligonucleotide probes designed to target genes encoding these enzymes. After the positive hybridizing spots have been matched to the corresponding clones in the metagenomic library, the inserts are sequenced, assembled and annotated. This leads to the identification of new coding DNA sequences related to genes of interest, with good coverage but low similarity to the closest hits in the databases, confirming the novelty of the detected and cloned genes.


Taxonomic Assignment: From Amplicon to Shotgun Sequencing
Gabriel Valiente, Technical University of Catalonia, Department of Computer Science, Barcelona, ES

TANGO and BioMaS are a tool and a pipeline for microbiome classification from amplicon metagenomic data, and MetaShot is a pipeline for host-associated microbiome classification from shotgun metagenomic data. They combine approaches based on coarse-grained sequence similarity (fast sequence read screening) and fine-grained sequence similarity (sequence read mapping and optimal taxonomic classification) to attain the best compromise between computational efficiency and assignment accuracy, and allow for the classification of ambiguous sequence reads to archaeal, bacterial, fungal, protozoan, and viral species at the best possible taxonomic rank.
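
To make the notion of placing an ambiguous read at a taxonomic rank concrete, here is a minimal Python sketch of the simplest rule of this kind: assigning the read to the lowest common ancestor (LCA) of all the reference lineages it matches. This is a stand-in for illustration only and not the optimization used by TANGO, BioMaS or MetaShot. The function name and the toy lineages are assumptions, and each lineage is assumed to be a list of taxon names ordered from the root down to the species.

```python
def lowest_common_ancestor(lineages):
    """Return the longest root-to-rank prefix shared by all lineages.

    Each lineage is a list of taxon names ordered from root to species.
    An empty input yields an empty assignment.
    """
    if not lineages:
        return []
    common = []
    # zip(*lineages) walks all lineages rank by rank, stopping at the shortest one.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            common.append(taxa_at_rank[0])
        else:
            break
    return common


if __name__ == "__main__":
    # Toy lineages for a read that matches two species equally well.
    e_coli = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
              "Enterobacterales", "Enterobacteriaceae",
              "Escherichia", "Escherichia coli"]
    s_enterica = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
                  "Enterobacterales", "Enterobacteriaceae",
                  "Salmonella", "Salmonella enterica"]
    assignment = lowest_common_ancestor([e_coli, s_enterica])
    print(assignment[-1])  # Enterobacteriaceae
```

The plain LCA rule is conservative: a single distant hit pushes the assignment towards the root, which is one reason tools in this area look for a better trade-off between precision and rank.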

MG-RAST — experiences from processing a quarter million metagenomic data sets
Folker Meyer, Institute for Genomics and Systems Biology, Argonne National Laboratory, Argonne, USA

MG-RAST has been offering metagenomic analyses since 2007. Over 20,000 researchers have submitted data. I will describe the current MG-RAST implementation and demonstrate some of its capabilities. In the course of the presentation I will highlight several metagenomic pitfalls. MG-RAST: MG-RAST-APP:

New perspectives on nitrite-oxidizing bacteria - linking genomes to physiology
Sebastian Lücker, Department of Microbiology, IWWR, Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, the Netherlands.

It is a generally accepted characteristic of the biogeochemical nitrogen cycle that nitrification is catalyzed by two distinct clades of microorganisms. First, ammonia-oxidizing bacteria and archaea convert ammonia to nitrite, which is subsequently oxidized to nitrate by nitrite-oxidizing bacteria (NOB). The latter were traditionally perceived as physiologically restricted organisms and were less intensively studied than other nitrogen-cycling microorganisms. This picture is contradicted by new discoveries of an unexpectedly high diversity of mostly uncultured NOB and of a great physiological versatility, which includes complex microbe-microbe interactions and lifestyles outside the nitrogen cycle. Most surprisingly, close relatives of NOB perform complete nitrification (ammonia oxidation to nitrate), a process that had been postulated to occur under conditions selecting for low growth rates but high growth yields.

The existence of Nitrospira species that encode all genes required for ammonia and nitrite oxidation was first detected by metagenomic analyses of an enrichment culture for nitrogen-transforming microorganisms sampled from the anoxic compartment of a recirculating aquaculture system biofilter. Batch incubations and FISH-MAR experiments showed that these Nitrospira indeed formed nitrate from the aerobic oxidation of ammonia, and used the energy derived from complete nitrification for carbon fixation, thus proving that they indeed represented the long-sought-after comammox organisms. Their ammonia monooxygenase (AMO) enzymes were distinct from canonical AMOs, therefore rendering recent horizontal gene transfer from known ammonia-oxidizing microorganisms unlikely. Instead, their AMO displayed highest similarities to the “unusual” particulate methane monooxygenase from Crenothrix polyspora, thus shedding new light onto the function of this sequence group. This recognition of a novel AMO type indicates that a whole group of ammonia-oxidizing microorganisms has been overlooked, and will improve our understanding of the environmental abundance and distribution of this functional group. Data mining of publicly available metagenomes already indicated a widespread occurrence in natural and engineered environments like aquifers and paddy soils, and drinking and wastewater treatment systems.

Fast filtering, mapping and assembly of 16S ribosomal RNA
Hélène Touzet,
BONSAI group at CRIStAL and INRIA, Bioinformatics and Sequence Analysis, Lille, FR

The application of next-generation sequencing technologies to RNA or DNA directly extracted from a community of organisms yields a mixture of nucleotide fragments. Distinguishing among these fragments and further categorizing the families of ribosomal RNAs (or any other given marker) is an important step in examining the phylogenetic classification of the constituent species. With this in mind, we have developed a complete bioinformatics suite, called MATAM, capable of handling large sets of reads in a fast and accurate way. MATAM covers all steps of the analysis, from the identification of reads of interest in the raw sequencing data to the reconstruction of the full-length sequences of the marker and their alignment to a reference database for taxonomic assignment. Part of MATAM is based on the SortMeRNA software, also developed by the team.

Deciphering the human intestinal tract microbiome using metagenomic computational methods
Matthieu Almeida
Center for Bioinformatics and Computational Biology, College Park, MD, USA

In 2010, the MetaHIT consortium published a 3.3M microbiota gene catalog generated by whole genome shotgun metagenomic sequencing, representing a mixture of bacteria, archaea, parasites and viruses coming from 124 human stool metagenomic samples [Qin et al, Nature 2010].
However, most of the genes were fragmented and taxonomically and functionally unknown, making it difficult to define and select biomarkers of interest for genome-wide association studies.
Since then, this gene catalog has been improved multiple times, with the last update by [Li et al, Nature Biotechnology, 2014], which generated a 10M gene catalog using more than 1000 metagenomic samples and including some of the prevalent human microbe genomes available at that time. Along with the catalog updates, the scientific community developed new tools to tackle the complexity of this dataset and provided new ways to assemble, annotate, quantify and classify the genes coming from these catalogs.
In this talk we will discuss the main approaches to the computational treatment of the different gene catalogs over time, illustrated by the papers that deciphered, step by step, the hidden information of our microbiota and its link with our health.


Hidden in the permafrost
Chantal Abergel & Jean-Michel Claverie,
Structural and Genomic Information laboratory, CNRS-AMU UMR7256, IMM, Parc Scientifique de Luminy, Marseille, France

The last decade witnessed the discovery of four families of giant viruses infecting Acanthamoeba. Their genomes encode from 500 to 2000 genes, a large fraction of which encode proteins of unknown origin. These unique proteins, which recognize and manipulate the same building blocks as cells, raise the question of their origin as well as of the role viruses played in the evolution of the cellular world. The Mimiviridae and the Pandoraviridae are increasingly populated by members from very diverse habitats and are ubiquitous on the planet. After prospecting across space, we went back in time and isolated two other giant virus families from a 30,000-year-old permafrost sample, Pithovirus and Mollivirus sibericum. A metagenomic study of the sample was performed to inventory its biodiversity and assess to what extent the host and the viruses were dominant. I will describe the two sequencing approaches that were used and compare the results.

1: Raoult D, Audic S, Robert C, Abergel C, Renesto P, Ogata H, La Scola B, Suzan M, Claverie JM. The 1.2-megabase genome sequence of Mimivirus. Science. 2004 Nov 19;306(5700):1344-50.
2: Philippe N, Legendre M, Doutre G, Couté Y, Poirot O, Lescot M, Arslan D, Seltzer V, Bertaux L, Bruley C, Garin J, Claverie JM, Abergel C. Pandoraviruses: amoeba viruses with genomes up to 2.5 Mb reaching that of parasitic eukaryotes. Science. 2013 Jul 19;341(6143):281-6. 
3: Legendre M, Bartoli J, Shmakova L, Jeudy S, Labadie K, Adrait A, Lescot M, Poirot O, Bertaux L, Bruley C, Couté Y, Rivkina E, Abergel C, Claverie JM. Thirty-thousand-year-old distant relative of giant icosahedral DNA viruses with a pandoravirus morphology. Proc Natl Acad Sci U S A. 2014 Mar 18;111(11):4274-9.
4: Legendre M, Lartigue A, Bertaux L, Jeudy S, Bartoli J, Lescot M, Alempic JM, Ramus C, Bruley C, Labadie K, Shmakova L, Rivkina E, Couté Y, Abergel C, Claverie JM. In-depth study of Mollivirus sibericum, a new 30,000-y-old giant virus infecting Acanthamoeba. Proc Natl Acad Sci U S A. 2015 Sep 22;112(38):E5327-35.

From Samples to Data : Assuring Downstream Analysis with Upstream Planning 
Sean Kennedy, 
Bio-Omics Pole, Pasteur Institute, Paris, FR

Metagenomic studies have gained increasing popularity in the years since the introduction of next-generation sequencing (NGS). NGS allows for the production of millions of reads for each sample without the intermediate step of cloning. However, just as in the past, the quality of the data generated by this powerful technology depends on sample preparation, library construction and the selection of an appropriate sequencing technology and sequencing depth. Here we explore the different variables involved in the process of preparing samples for sequencing analysis, including sample collection, DNA extraction and library construction. We also examine the various sequencing technologies deployed for routine metagenomic analysis and considerations for their use in different model systems, including human, mouse and the environment. Future developments such as long reads will also be discussed, to provide a complete picture of the aspects prior to data analysis that play a critical role in the success of metagenomic studies.

200 billion sequences and counting: analysis, discovery and exploration of datasets with EBI Metagenomics
Alex Mitchell, European Bioinformatics Institute (EBI), Cambridge, UK

EBI Metagenomics (EMG) is a freely available hub for the analysis and exploration of metagenomic, metatranscriptomic, amplicon and assembly data. The resource provides rich functional and taxonomic analyses of user-submitted sequences, as well as analysis of publicly available metagenomic datasets held within the European Nucleotide Archive (ENA). EMG has recently undergone rapid expansion, with an over 10-fold increase in data volumes in the first 5 months of 2016. It now houses ~50k publicly available data sets, and represents one of the largest collections of analysed metagenomic data. As its data content has grown, EMG has increasingly become a platform for data discovery. To support this process, we have made a series of user-interface improvements, including the classification of projects by biome, presentation of results data for better visualisation and more convenient download, and provision of project-level summary files. More recently, we have indexed project metadata for use with the EBI search engine, enabling exploration across different datasets. For example, users are able to search with a particular taxonomic lineage or protein function and discover the projects, samples and sequencing runs in which that lineage or function is found. This functionality allows users to explore associations between biomes, environmental conditions, organisms and functions (e.g., discovering protein-coding sequences that correspond to certain enzyme families found in aquatic environments at a given temperature range). Here, we give an overview of the EMG data analysis pipeline and web site, and illustrate the use of the new search facility for data discovery.

Metagenomics to illuminate ecosystem functioning
EA 4678 CIDAM, Université d’Auvergne, Clermont-Fd, France

Microorganisms comprise the majority of living organisms on our planet. Most of them, identified by indirect molecular approaches, belong to microbial dark matter. For many years, exploration of the composition of microbial communities has been performed through the PCR-based study of the small subunit rRNA gene due to its high conservation across the domains of life. The application of this method has resulted in the discovery of many unexpected evolutionary lineages. The first application of high-throughput 16S rRNA amplicon sequencing has also revealed the rare biosphere. The advent of metagenomic approaches has highlighted the metabolic capabilities of numerous members of this dark matter through genome reconstruction. Successive sequencing improvements combined with dedicated bioinformatics tools have contributed to the exponential acquisition of complete genomes. Thus, linking functions back to the species has revolutionized our understanding of how ecosystem function is sustained by the microbial world. However, the sequencing depth required to provide a comprehensive view of microbial communities and the sequence data treatment remain particularly challenging for numerous ecosystems. New strategies of complexity reduction have been developed, such as single-cell genomics (SCG) or sequence capture by hybridization to reveal the missed microbial diversity.

Who is doing what on the cheese surface? Overview of the cheese microbial ecosystem functioning by metatranscriptomic analyses
Eric Dugat-Bony,
Génie et Microbiologie des Procédés Alimentaires, INRA Grignon, FR

Cheese ripening is a complex biochemical process driven by microbial communities composed of both eukaryotes and prokaryotes. Surface-ripened cheeses are widely consumed all over the world and are appreciated for their characteristic flavor. Microbial community composition has been studied for a long time on surface-ripened cheeses, but only limited knowledge has been acquired about its in situ metabolic activities. We used an iterative sensory procedure to select a simplified microbial consortium, composed of only nine species (three yeasts and six bacteria), producing the odor of Livarot-type cheese when inoculated in a sterile cheese curd. All the genomes were sequenced in order to determine the functional capacities of the different species and facilitate RNA-Seq data analyses. We followed the ripening process of experimental cheeses made using this consortium during four weeks, by metatranscriptomic and biochemical analyses. By combining all of the data, we were able to obtain an overview of the cheese maturation process and to better understand the metabolic activities of the different community members and their possible interactions. We next applied the same approach to investigate the activity of the microorganisms in real cheeses, namely Reblochon-style cheeses. This provided useful insights into the physiological changes that occur during cheese ripening, such as changes in energy substrates, anabolic reactions, or stresses.

Dr Jekyll and Mr Hyde: The dual face of metagenomics in phylogenetic analysis
Violette Da Cunha,
 Institut Pasteur, Unité de Biologie Moléculaire du Gène chez les Extrêmophiles (BMGE) & Institute for Integrative Biology of the Cell (I2BC), Paris, FR

The aim of this lecture is to present the impact of metagenomics and single-cell genomics on public databases. These powerful new approaches give us access to the diversity of life on our planet. However, care has to be taken when using these data for subsequent analyses, such as phylogenetic studies, as critical errors can still be present in the databases. This course will incorporate examples taken from real studies, and we will investigate methods used for error detection.

Revealing and analyzing microbial networks: from topology to functional behaviors
Damien Eveillard, 
Laboratoire d'Informatique de Nantes Atlantique, irisa, Nantes, FR

Understanding the interactions between microbial communities and their environment well enough to be able to predict diversity on the basis of physicochemical parameters is a fundamental pursuit of microbial ecology that still eludes us. However, modeling microbial communities is a complicated task, because (i) communities are complex, (ii) most are described qualitatively, and (iii) quantitative understanding of the way communities interact with their surroundings remains incomplete. In this seminar, we will illustrate two complementary approaches that aim to overcome these points in different ways.

Assessing microbial biogeography by using a metagenomic approach
Sébastien Terrat,
 Interaction Plantes Micro-organismes, Agroécologie, INRA, Dijon, FR

Soils are highly complex ecosystems and are considered as one of the Earth’s main reservoirs of biological diversity. Bacteria account for a major part of this biodiversity, and it is now clear that such microorganisms have a key role in soil functioning processes. However, environmental factors regulating the diversity of below-ground bacteria still need to be investigated, which limits our understanding of the distribution of such bacteria at various spatial scales. The overall objectives of this study were: (i) to determine the spatial patterning of bacterial community diversity in soils at a broad scale, and (ii) to rank the environmental filters most influencing this distribution.
This study was performed at the scale of France by using the French Soil Quality Monitoring Network. This network includes more than 2,200 soil samples collected along a systematic sampling grid. For each soil, bacterial diversity was characterized using a pyrosequencing approach targeting the 16S rRNA genes directly amplified from soil DNA, yielding more than 18 million high-quality sequences.
This study provides the first estimates of microbial diversity at the scale of France, with, for example, bacterial richness ranging from 555 to 2,007 OTUs (on average: 1,289 OTUs). It also provides the first extensive map of bacterial diversity, as well as of major bacterial taxa, revealing a heterogeneous and spatially structured bacterial distribution at the scale of France. The main factors driving bacterial community distribution are the soil physico-chemical properties (pH, texture...) and land use (forest, grassland, crop system...), showing that bacterial spatial distribution at a broad scale depends on local filters such as soil characteristics and land use when considering the community (quality, composition) as a whole. Moreover, this study also offers a better evaluation of the impact of land use on soil microbial diversity and taxa, with consequences in terms of sustainability for agricultural systems.


Rationale and Tools to look for the unknown in (metagenomic) sequence data
Jean-Michel Claverie,
Sébastien Santini, Olivier Poirot
Information Génomique & Structurale, UMR7256 Aix-Marseille University & CNRS

The interpretation of metagenomic data (environmental, microbiome, etc.) usually involves the recognition of sequence similarity with previously identified (micro-)organisms. This is for instance the main approach to taxonomic assignment and a starting point for most diversity analyses. When exploring beyond the frontier of known biology, one should expect a large proportion of environmental sequences not to exhibit any significant similarity with known organisms. Notably, this is the case for eukaryotic viruses belonging to new families, for which the proportion of "no match" could reach 90%. Most metagenomics studies tend to ignore this large fraction of sequences that might be the equivalent of "dark matter" in biology. We will present some of the ideas and tools we are using to extract that information from large metagenomics data sets in search of truly unknown microorganisms.

One of the tools, "Seqtinizer", an interactive contig selection/inspection interface will also be presented in the context of "pseudo-metagenomic" projects, where the main organism under genomic study (such as sponges or corals) turns out to be (highly) mixed with an unexpected population of food, passing-by, or symbiotic microorganisms. 


Holistic metagenomics in marine communities
Patrick Wincker,
CEA/GENOSCOPE, Laboratoire d’Analyses Génomiques des Eucaryotes (LAGE), Evry, FR

Complex microscopic communities are composed of species belonging to all realms of life, from single-cell prokaryotes to small multicellular eukaryotes. Each component of a community needs to be studied for a full understanding of the functions performed by the whole assemblage. However, methods to investigate microbiomes are generally restricted to a single kingdom. Using examples from the Tara Oceans project, we will show how size fractionation and the use of varied metabarcoding, metagenomics and metatranscriptomics approaches can help to study the marine plankton community as a whole, across a wide geographic space.


Sequencing 6000 chloroplast genomes : the PhyloAlps project
Eric Coissac, Laboratoire d’Ecologie Alpine (LECA), Grenoble, FR

Biodiversity is now commonly described using DNA-based approaches. Several communities currently use DNA to describe biodiversity, but most of the time they use different genetic markers, which hampers easy sharing of the accumulated knowledge. Taxonomists rely heavily on the DNA Barcoding initiative, phylogeneticists often prefer markers with better phylogenetic properties, and ecologists, with the advent of DNA metabarcoding, look for a third class of markers that are easier to amplify from environmental DNA. Nevertheless, they all need the knowledge accumulated by the others. But having different markers means that the sequences have been obtained from different individuals in different labs, following various protocols. On that basis, building a clean reference database that merges all the available markers for each species becomes a challenge. With the PhyloAlps project we implement genome skimming at a large scale and propose it as a new way to set up such a universal reference database usable by taxonomists, phylogeneticists, and ecologists. The PhyloAlps project is producing, for each species of the Alpine flora, at least one genome skim composed of six million 100 bp sequence reads. From such data it is simple to extract all the commonly used chloroplast, mitochondrial and nuclear rDNA markers. Moreover, most of the time we can get access to the complete chloroplast genome sequence and to shallow sequencing of many nuclear genes. These methods have already been successfully applied to algae, insects and other animals. With the new single-cell sequencing methods they will be applicable to most unicellular organisms. The question now is: can we consider genome skimming as the next-generation DNA barcode?

Part II: hands-on tutorials
(Wednesday 14/09 2 p.m. to Friday 16/09 12.30 p.m.)

These tutorials are designed as self-contained units that include example data and pre-installed bioinformatics tools.  These hands-on practical tutorials will demonstrate the use of the following metagenome analysis tools:


Wednesday September the 14th from 2:00 PM to 6:00 PM

FROGS: Find Rapidly OTU with Galaxy Solution [Géraldine Pascal et al. INRA Toulouse & Olivier Rué, Jouy, FR] 
Frédéric ESCUDIE (1)*, Lucas AUER (2)*, Maria BERNARD (3), Laurent CAUQUIL (4), Katia VIDAL (4), Sarah MAMAN (4), Mahendra MARIADASSOU (5), Guillermina HERNANDEZ-RAQUET (2), Géraldine PASCAL (4). 
1 Bioinformatics platform Toulouse Midi-Pyrenees, MIAT, INRA Auzeville CS 52627 31326 Castanet Tolosan cedex, France 
2 Université de Toulouse, INSA, UPS, LISBP, F-31077 Toulouse Cedex 4, France ; INRA, UMR792 ISBP, F-31400 Toulouse, France 
3 INRA, UMR1313, SIGENAE, F-78352 Jouy-en-Josas, France 
4 INRA, UMR1388, F-31326 Castanet-Tolosan, France, Université de Toulouse INPT ENSAT, UMR1388, F-31326 Castanet-Tolosan, France, Université de Toulouse INPT ENVT, UMR1388, F-31076 Toulouse, France 
5 INRA, Unité MaIAGE, F-78352 Jouy-en-Josas, France 
* ‘These authors contributed equally to this work’ 

High-throughput sequencing of 16S/18S/23S rRNA amplicons has opened new horizons in the study of microbial communities. With sequencing at great depth, current processing pipelines struggle to run rapidly, and the most effective solutions are often designed for specialists. These tools are designed to give both the abundance table of operational taxonomic units (OTUs) and their taxonomic affiliation. In this context we developed the pipeline FROGS: « Find Rapidly OTU with Galaxy Solution », designed for biologists and running on the Galaxy platform.

A preprocessing tool merges paired sequences into contigs with FLASH, cleans the data with cutadapt, removes chimeras with VSEARCH combined with a cross-validation method, and dereplicates sequences with a home-made Python script. The clustering tool runs SWARM, which uses a local clustering threshold rather than the global clustering threshold used by other software. The affiliation tool returns a taxonomic affiliation for each OTU using both RDPClassifier and NCBI BLAST+ on different databases (Silva, Greengenes). Finally, the post-processing tool allows users to process the resulting table with user-specified filters and provides statistical results and numerous graphical illustrations of these data.
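As an illustration of the dereplication step mentioned above, here is a minimal Python sketch. It is not the FROGS script itself: the function name `dereplicate` and the toy reads are assumptions made for this example, and the real tool works on FASTA/FASTQ files rather than on in-memory lists.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical sequences and count how often each one occurs.

    Hypothetical helper, not part of FROGS. Returns (sequence, count)
    pairs sorted by decreasing count, ties broken alphabetically so the
    output is deterministic.
    """
    # Upper-case first so that "acgt" and "ACGT" count as the same read.
    counts = Counter(seq.upper() for seq in sequences)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy reads, invented for the example.
    reads = ["ACGT", "acgt", "TTGA", "ACGT", "GGCC", "TTGA"]
    for seq, count in dereplicate(reads):
        print(f"{seq}\t{count}")
```

Keeping the abundance of each unique sequence matters here because the later clustering step can then work on far fewer sequences without losing the count information.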

FROGS has been developed to be very fast even on large amounts of 454/HiSeq/MiSeq data by using cutting-edge tools and an optimized design; it is also portable to all Galaxy platforms. FROGS was tested on numerous simulated datasets. The tool proved to be extremely rapid, robust and highly sensitive for OTU detection, with very few false positives compared to other pipelines widely used by the community.

Keywords: local clustering; efficient chimera removal; multi-hit affiliation; statistics graphics; fast; accurate; user-friendly; Galaxy

Access the tutorial presentation


Thursday September the 15th from 9:00 AM to 11:00 AM


SHAMAN:  A SHiny Application for Metagenomic ANalysis [Amine Ghozlane, C3BI, Institut Pasteur, FR]
Amine Ghozlane*, Stevenn Volant*, Hugo Varet, Christophe Malabat, Sean Kennedy, Marie-Agnès Dillies.

Quantitative metagenomics is broadly employed to identify genera or species associated with several diseases. These studies are all based on the targeted sequencing of 16S/18S/ITS rDNA or on random sequencing of whole-community DNA. Quantitative data are obtained by mapping the reads against operational taxonomic units (OTUs) or a gene catalog. The data generated can then be analysed quantitatively using R packages (metagenomeSeq, edgeR) or with web interfaces (Shiny-phyloseq, Phinch) that do not integrate the statistical modelling. The lack of easily accessible methods providing both the statistical modelling and the visualisation is a critical obstacle for this type of analysis. Here we present SHAMAN, a Shiny-based application integrating the metagenomic data (a count matrix for each sample and a table assigning a taxonomic annotation to each feature), the experimental design, the statistical model and a dynamic interface dedicated to differential analysis.

The SHAMAN process is divided into three steps: normalisation, modelling and visualisation. The count matrix is normalised at the OTU/gene level using the DESeq2 normalisation method and then, based on the experimental design, a generalised linear model is applied to detect differences in abundance at the considered taxonomic level.
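The normalisation step can be illustrated with a short Python sketch of the median-of-ratios idea behind DESeq2 size factors. This is a simplified re-implementation for illustration only, not SHAMAN or DESeq2 code: the function names `size_factors` and `normalise` are assumptions, and features containing a zero count are simply skipped when estimating the factors.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by the median-of-ratios method.

    counts: list of rows (features such as OTUs or genes), each row being
    a list of raw counts, one per sample. Hypothetical helper.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue  # the log geometric mean is undefined with a zero count
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no feature has non-zero counts in every sample")
    # Median of the log ratios, taken back to the count scale.
    return [math.exp(median(ratios)) for ratios in log_ratios]


def normalise(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


if __name__ == "__main__":
    # Toy matrix: the second sample was sequenced twice as deeply.
    toy = [[10, 20], [100, 200], [5, 10], [0, 7]]
    print(size_factors(toy))
    print(normalise(toy))
```

In the toy matrix the second sample has exactly twice the counts of the first for every fully observed feature, so after normalisation those features have equal values in both samples.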

SHAMAN provides diagnostic plots to check the quality of the modelling, and visualisations that highlight the differences in abundance identified by the statistical analysis. Diversity plots are also proposed to illustrate the results from a more global view.

Tools : SHAMAN is freely accessible through a web interface at


Thursday September the 15th from 2:00 PM to 4:00 PM

cDPCoA: Constrained DPCOA for community comparison [Stéphane Dray, Lyon, FR]
Multivariate analysis and graphics using ade4/adegraphics packages for R: an application to genomic data

This practical focuses on the use of multivariate methods to analyze metagenomic data using the ade4 and adegraphics packages for R. We will illustrate the use of simple methods (e.g. PCA), the integration of phylogenetic information (DPCoA) and external information (Constrained DPCoA) using real data sets. The new package adegraphics will be used to produce high-quality graphics summarizing the main outputs of statistical methods.
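To give an idea of what the simplest of these methods computes, the sketch below finds the first principal axis of a small table in plain Python. It is only an illustration under stated assumptions: it is not the ade4 implementation, the function name `first_principal_component` is hypothetical, the variables are centred but not scaled, and power iteration is used in place of a full eigendecomposition.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (axis, explained_fraction) for the first principal component.

    rows: list of observations (e.g. samples), each a list of numeric
    variables (e.g. taxon abundances). Hypothetical helper.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    total_variance = sum(cov[a][a] for a in range(p))
    if total_variance == 0:
        raise ValueError("all variables are constant")
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the starting vector is not orthogonal to it.
    axis = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        new = [sum(cov[a][b] * axis[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        axis = [v / norm for v in new]
    # Rayleigh quotient: variance carried by the axis found above.
    eigenvalue = sum(axis[a] * sum(cov[a][b] * axis[b] for b in range(p))
                     for a in range(p))
    return axis, eigenvalue / total_variance


if __name__ == "__main__":
    # Toy table in which the second variable is twice the first.
    table = [[1, 2], [2, 4], [3, 6], [4, 8]]
    axis, fraction = first_principal_component(table)
    print(axis, fraction)
```

In the toy table all points lie on one line, so the first axis points along that line and carries all of the variance.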

Tools : RStudio ; R Packages : ade4 and adegraphics


Access the tutorial presentation

Download the R scripts


Thursday September the 15th from 11:00 AM to 12:00 PM and 4:00 PM to 6:00 PM

EggNOG, fetchMG, iVireon and PICRUSt for the functional annotation of metagenomic data [Mathieu Almeida, Center for Bioinformatics and Computational Biology, College Park,  MD, USA]

In this workshop, we will explore the bacterial and viral functional composition of a human intestinal tract whole metagenomic sample, right after the metagenomic assembly and protein prediction step.

First, we will focus on particular key bacterial functions by producing some protein databases on the fly using the NCBI website. These protein databases will then be indexed and used for the functional annotation step, first with the classical blast+ tool, which we will then compare with a more recent method, diamond, followed by a Hidden Markov Model (HMM) search-based approach using hmmer and fetchMG with models designed from the EggNOG database.
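A common way to turn such similarity searches into an annotation table is to keep, for each predicted protein, the hit with the highest bit score. The Python sketch below assumes the search results were written in the 12-column tabular layout of BLAST+ `-outfmt 6` (diamond can write the same layout). The function name `best_hits` and the identifiers in the toy input are invented for the example and are not part of the tutorial material.

```python
def best_hits(tabular_lines):
    """Keep the highest-bitscore hit for each query sequence.

    Each line is expected to hold the 12 default tab-separated columns of
    BLAST+ `-outfmt 6`: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore. Hypothetical helper.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        # On equal bit scores the first hit encountered is kept.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Toy hits with made-up identifiers.
    hits = [
        "prot_1\tref_A\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-100\t380.0",
        "prot_1\tref_B\t60.0\t180\t72\t2\t5\t184\t3\t182\t1e-40\t150.0",
        "prot_2\tref_C\t75.5\t120\t29\t1\t1\t120\t10\t129\t2e-50\t190.0",
    ]
    for query, (subject, evalue, bitscore) in best_hits(hits).items():
        print(query, subject, evalue, bitscore)
```

In practice an e-value or coverage cut-off would usually be applied as well; which thresholds are appropriate depends on the database and is left open here.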

Secondly, we will look at particular key viral functions by performing the same steps as for the bacterial annotation, to appreciate the limits of the different tools in the viral world and to demonstrate the potential of alternative methods for key viral function annotation, such as iVireon.

Finally, we will analyse a 16S metagenomic sample coming from the same individual and use a function prediction method like PICRUSt, that we will compare with the functional annotation from the whole genome shotgun sample.

Tools : 

  • blast+ :
  • diamond :
  • EggNOG :
  • hmmer :
  • fetchMG :
  • iVireon :
  • PICRUSt :

Download the tutorial files


Friday September the 16th from 9:00 AM to 12:30 PM

Anvi’o: an advanced analysis and visualization platform for ‘omics data [A. Murat Eren, Univ. Chicago, USA]

Advances in high-throughput sequencing and ‘omics technologies are revolutionizing studies of naturally occurring microbial communities. Comprehensive investigations of microbial lifestyles require the ability to interactively organize and visualize genetic information and to incorporate subtle differences that enable greater resolution of complex data. Anvi’o is an advanced analysis and visualization platform that offers automated and human-guided characterization of microbial genomes in metagenomic assemblies, with interactive interfaces that can link ‘omics data from multiple sources into a single, intuitive display. Its extensible visualization approach distills multiple dimensions of information about each contig, offering a dynamic and unified work environment for data exploration, manipulation, and reporting. Anvi’o is an open-source platform that empowers researchers without extensive bioinformatics skills to perform and communicate in-depth analyses on large ‘omics datasets.

Tools : Anvi’o


Access the tutorial

Registration is closed

Registered list for Lectures :

Acuña Amador Luis Alberto, Université de Rennes 1
Adam Panagiotis, Institut Pasteur Paris
Amato Pierre, CNRS
Amouri Adel Amar, Université d'Oran 1 Ahmed Ben Bella.
Barray Anaïs, Institut Pasteur de Paris
Barre Aurélien, Université Bordeaux
Belliardo Carole, Unice
Ben Abdelkrim Ahmed, Jacques Monod institute
Benoiston Anne-Sophie, Institut de Biologie Paris Seine (UPMC)
Bermudez Luis, INRA
Bing Ma Bing, Institute for Genome Sciences, University of Maryland
Bonnarme Pascal, INRA
Boyer Mickaël, Danone Nutricia Research
Chervaux Christian, Danone Research
Chiapello Hélène, Inra
Codoni Veronica, INSERM
Cosson Jean-François, INRA
Da Cunha Violette, Institut Pasteur
Danjou Fabrice, Institut du cerveau et de la Moelle-epiniere
De Sordi Luisa, Institut Pasteur
Debnath Olivia, Indian Institute of Science Education and Research Kolkata
Dominguez del Angel Victoria, IFB
Douard Véronique, INRA
Drain Alice, INRA
Dridi Bedis, INRA
Dugat-Bony Eric, INRA
Dussart Caroline, IFREMER
Echenique Isidora, MNHN
El Mhijar Sanae, Faculté de sciences Nantes
Eveillard Damien, Université de Nantes
Fei Na, INRA
Fourrage Cécile, Institut Imagine
Fu Yu, Gustave Roussy
Gabriel Valiente, Technical University of Catalonia
Gautreau Guillaume, CEA
Gibrat Jean-François, INRA
Grange Thierry, Institut Jacques Monod, CNRS, Université Paris Diderot
Guirimand Thibaut, INRA
Guyomar Cervin, INRIA
Habib Mahjoubi, Institut de biologie moléculaire des plantes (IBMP)
Harvengt Luc, FCBA
Hubler Frédérique, CNRS
Jabri Hiba, Pasteur Institut of Tunisia
Josso Adrien, CNRS
Kennedy Sean, Pasteur
Kutub Uddin Muhammad Ashraf, University of Joseph Fourier
Lavenier Dominique, CNRS
Le Cavorzin Arnaud, Biocodex
Le Gall-David Sandrine, Université de Rennes 1
Le Gouil Meriadeg, Institut Pasteur
Lemonnier Clarisse, IUEM
Lepage Patricia, INRA
Leroi Laura, Ifremer
Leuillet Sébastien, Biofortis
Lücker Sebastian, Radboud University
Maignien Loïs, Institut Europeen des Sciences de la Mer - UBO
Mansos  Lourenço Marta, Institut Pasteur
Marbouty Martial, CNRS - Institut Pasteur
Mariadassou Mahendra, INRA
Martin-Gallausiaux Camille, INRA
Mathieu Alban, Enterome
Meng Arnaud, Université Pierre et Marie Curie
Michel Elisa, ONIRIS / INRA
Mitchell Alex, EMBL-EBI
Moya Alvarez Violeta, Institut Pasteur
Narwani Tarun, INSERM
Noël Cyril, Université de Pau et des Pays de l'Adour
Oger Christine, Université Lyon1
Pauvert Charlie, INRA
Agostinho Escudeiro Pedro, cE3c, University of Lisbon
Perrière Guy, CNRS
Peterlongo Pierre, Inria
Pible Olivier, CEA/DRF/IBITECs
Pierre Mathieu, INRA
Plaza Onate Florian, INRA
Poirot Olivier, CNRS
Pollet Nicolas, CNRS
Prevost Hervé, Oniris
Puga Freitas Ruben, Université Paris Est Créteil
Quintric Laure, IFREMER
Reboul Guillaume, Genoscope
Renault Pierre, INRA
Reveillaud Julie, INRA/CIRAD
Rinaldino Julia, Université Nice
Roach Dwayne, Institut Pasteur
Robert Guillaume, INRA Versailles
Rué Olivier, INRA
Saffarian Azadeh, Institut Pasteur
Schbath Sophie, INRA
Segurel Laure, CNRS - Musée de l'Homme
Sempéré Guilhem, CIRAD
Shah Shivani, CEA Saclay
Simonet Pascal, CNRS/ECL
Sousa Jorge, Institut Pasteur
Touzet Hélène, CNRS
Trigodet Florian, IUEM
Utge Jose, Muséum National d'Histoire Naturelle
Vacher Corinne, INRA
Veiga Patrick, Danone Nutricia Research
Weill François-Xavier, Institut Pasteur
Wiernasz Norman, Ifremer/Oniris
Wu Jiang-bo, INRA
Zancarini Anouk, UPMC
Zhang Xufei, INRA

Registered list for Tutorials

ADAM Panagiotis, Institut Pasteur
ALBAN Mathieu, ENTEROME biosciences
De SORDI Luisa, Institut Pasteur
DRAIN Alice, INRA, UMR agroécologie
ECHENIQUE Isidora, Univ Pierre et Marie Curie / Institut Pasteur
ESCUDEIRO Pedro, Université de Lisbonne
JABRI Hiba, Institut Pasteur, Tunisie
LEMONNIER Clarisse, Ecologie microbienne, LEMER
LOURENCO Marta, Institut Pasteur, Interaction Bacteriophages Bacteria Animals Labs
MARBOUTY Martial, Institut Pasteur, G5- Régulation Spatiale des Génomes
MOKRANI Mohamaed, Institut de Chimie et de Biologie des Membranes et des Nano-objets (CBMN)
MOURA de SOUSA Jorge, Eduardo's Lab, Institut Pasteur
MOYA-ALVAREZ Violeta, Pathogénie Microbienne moléculaire, Institut Pasteur
POIROT Olivier, IGS, UMR7256
POLLET Nicolas, CNRS UMR9191, Lab Evolution, Genomes, Behaviour and Ecology
POLSTON Pastsy, Institut Pasteur, Department of Enteric Viruses
REVEILLAUD Julie, INRA, CMAEE (‘Emerging and Exotic Animal Disease Control’)
SAFFARIAN Azadeh, Molecular Microbial Pathogenesis, Institut Pasteur
SEGUREL Laure, Museum histoire naturelle, MNHN, UMR 7206
SEMPERE Guilhem, UMR Intertryp, CIRAD
SHAH Shivani, CEA
TRIGODET Florian, Institut Corrosion et LMEE
UTGE José, Museum histoire naturelle, MNHN, UMR 7206
WEILL François-Xavier, Institut Pasteur, Unité EBPRE
ZANCARINI Anouk, Univ Pierre et Marie Curie

Accommodation (hotels close to Institut Pasteur)

- Cactus hotel, 47 rue des Volontaires, 75015 Paris – Tel 01 47 34 70 47 -
- Lecourbe hotel, 28 rue Lecourbe, 75015 Paris – Tel 01 47 34 49 06 -
- For students, Kellerman Center :


Travel Information

Become a Sponsor


We have a few sponsorship opportunities for the Summer School 2016 in Metagenomics. Attendees will come mainly from research organizations carrying out data-intensive research in genomics and metagenomics.

We expect around 120 participants.

The package includes:

  • Organization logo displayed as sponsor on conference homepage with link to sponsor URL
  • Organization logo displayed as sponsor on printed conference material 
  • Full registration (including conference dinner) for one attendee 
  • 1/4-page, full color ad in program 
  • Cost 800 €


If your organization is interested in participating in the Summer School 2016 in Metagenomics as a sponsor,

please contact :




Evaluation of the lectures

Evaluation of the practical sessions