Article

Finding approximate gene clusters with Gecko 3

Details

Citation

Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J & Bocker S (2016) Finding approximate gene clusters with Gecko 3. Nucleic Acids Research, 44 (20), pp. 9600-9610. https://doi.org/10.1093/nar/gkw843

Abstract
Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min.

Journal
Nucleic Acids Research: Volume 44, Issue 20

Status: Published
Publication date: 30/11/2016
Publication date online: 26/09/2016
Date accepted by journal: 12/09/2016
URL: http://hdl.handle.net/1893/24323
Publisher: Oxford University Press
ISSN: 0305-1048