GigaScience Prize Track

Since 2017, authors of cutting-edge, unpublished work in large-scale biological and biomedical research have been invited to submit to a special session at BGI's International Conference on Genomics (ICG) in Shenzhen. Using an innovative review process, manuscripts were submitted as preprints to bioRxiv and peer-reviewed openly and in real time via a special channel of the Preprint.Space peer review platform. Peer review and prize judging were handled by the GigaScience editors and a panel that included external experts. Below are the accepted papers, which had a chance to win travel expenses, cash prizes, and article processing costs, and which are presented in a special track at the conference.

Technical Note

Rice Galaxy: an open resource for plant science

Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available.

Venice Juanillas; Alexis Dereeper; Nicolas Beaume; et al. 

GigaScience
Published on: 18 May 2019

Full Text | PDF

Research

Comparative analyses identify genomic features potentially involved in the evolution of birds-of-paradise

The diverse array of phenotypes and courtship displays exhibited by birds-of-paradise have long fascinated scientists and nonscientists alike. Remarkably, almost nothing is known about the genomics of this iconic radiation. There are 41 species in 16 genera currently recognized within the birds-of-paradise family (Paradisaeidae), most of which are endemic to the island of New Guinea.

Stefan Prost; Ellie E Armstrong; Johan Nylander; et al.

GigaScience
Published on: 24 January 2019

Full Text | PDF

Research

Re-assembly, quality evaluation, and annotation of 678 microbial eukaryotic reference transcriptomes

De novo transcriptome assemblies are required prior to analyzing RNAseq data from a species without an existing reference genome or transcriptome. Despite the prevalence of transcriptomic studies, the effects of using different workflows, or “pipelines”, on the resulting assemblies are poorly understood. Here, a pipeline was programmatically automated and used to assemble and annotate raw transcriptomic short read data collected by the Marine Microbial Eukaryotic Transcriptome Sequencing Project (MMETSP). The resulting transcriptome assemblies were evaluated and compared against assemblies that were previously generated with a different pipeline developed by the National Center for Genome Research (NCGR).

Lisa K Johnson; Harriet Alexander; C Titus Brown

GigaScience
Published on: 13 December 2018

Full Text | PDF

Technical Note

Aequatus: an open-source homology browser

Phylogenetic information inferred from the study of homologous genes helps us to understand the evolution of genes and gene families, including the identification of ancestral gene duplication events as well as regions under positive or purifying selection within lineages. Gene family and orthogroup characterization enables the identification of syntenic blocks, which can then be visualized with various tools. Unfortunately, currently available tools display only an overview of syntenic regions as a whole, limited to the gene level, and none provide further details about structural changes within genes, such as the conservation of ancestral exon boundaries amongst multiple genomes.

Anil S Thanki; Nicola Soranzo; Javier Herrero; et al.

GigaScience
Published on: 5 November 2018

Full Text | PDF

Technical Note

PiGx: Reproducible genomics analysis pipelines with GNU Guix

In bioinformatics, as well as other computationally intensive research fields, there is a need for workflows that can reliably produce consistent output, from known sources, independent of the software environment or configuration settings of the machine on which they are executed. Indeed, this is essential for controlled comparison between different observations or for the wider dissemination of workflows. Providing this type of reproducibility and traceability, however, is often complicated by the need to accommodate the myriad dependencies included in a larger body of software, each of which generally comes in various versions. Moreover, in many fields (bioinformatics being a prime example), these versions are subject to continual change due to rapidly evolving technologies, further complicating problems related to reproducibility. Here, we propose a principled approach for building analysis pipelines and managing their dependencies with GNU Guix.

Ricardo Wurmus; Bora Uyar; Brendan Osberg; et al.

GigaScience
Published on: 2 October 2018

Full Text | PDF

Technical Note

Monitoring changes in the Gene Ontology and their impact on genomic data analysis

The Gene Ontology (GO) is one of the most widely used resources in molecular and cellular biology, largely through the use of “enrichment analysis.” To facilitate informed use of GO, we present GOtrack, which provides access to historical records and trends in the GO and GO annotations.

Matthew Jacobson; Adriana Estela Sedeño-Cortés; et al.

GigaScience
Published on: 1 August 2018

Full Text | PDF

Technical Note

PhenoSpD: an integrated toolkit for phenotypic correlation estimation and multiple testing correction using GWAS summary statistics

Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results. 

Jie Zheng; Tom G Richardson; Louise A C Millard; et al.

GigaScience
Published on: 24 August 2018

Full Text | PDF

Research

Innovative assembly strategy contributes to understanding the evolution and conservation genetics of the endangered Solenodon paradoxus from the island of Hispaniola

Solenodons are insectivores that live in Hispaniola and Cuba. They form an isolated branch in the tree of placental mammals that is highly divergent from other eulipotyphlan insectivores. The history, unique biology, and adaptations of these enigmatic venomous species could be illuminated by the availability of genome data. However, a whole genome assembly for solenodons has not been previously performed, partially due to the difficulty in obtaining samples from the field. Island isolation and reduced numbers have likely resulted in high homozygosity within the Hispaniolan solenodon (Solenodon paradoxus). Thus, we tested the performance of several assembly strategies on the genome of this genetically impoverished species.

Kirill Grigorev; Sergey Kliver; Pavel Dobrynin; et al.

GigaScience
Published on: 16 March 2018

Full Text | PDF

Research

Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data

Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computation method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals.

Quan H Nguyen; Ross L Tellam; Marina Naval-Sanchez; et al.

GigaScience
Published on: 16 February 2018

Full Text | PDF

Data Note

Draft genome of the reindeer (Rangifer tarandus)

The reindeer (Rangifer tarandus) is the only fully domesticated species in the Cervidae family, and it is the only cervid with a circumpolar distribution. Unlike all other cervids, female reindeer, as well as males, regularly grow cranial appendages (antlers, the defining characteristics of cervids). Moreover, reindeer milk contains more protein and less lactose than bovids' milk. A high-quality reference genome of this species will assist the efforts to elucidate these and other important features in the reindeer. 

Zhipeng Li; Zeshan Lin; Hengxing Ba; et al.

GigaScience
Published on: 1 November 2017

Full Text | PDF

Research

MEBS, a software platform to evaluate large (meta)genomic collections according to their metabolic machinery: unraveling the sulfur cycle

The increasing number of metagenomic and genomic sequences has dramatically improved our understanding of microbial diversity, yet our ability to infer metabolic capabilities in such datasets remains challenging. We describe the Multigenomic Entropy Based Score pipeline (MEBS), a software platform designed to evaluate, compare, and infer complex metabolic pathways in large "omic" datasets, including entire biogeochemical cycles. MEBS is open source and available through GitHub. To demonstrate its use, we modeled the sulfur cycle by exhaustively curating the molecular and ecological elements involved (compounds, genes, metabolic pathways, and microbial taxa). 

Valerie De Anda; Icoquih Zapata-Peñasco; Augusto Cesar Poot-Hernandez; et al.

GigaScience
Published on: 23 October 2017

Full Text | PDF

Data Note

The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. 

Aleksey V Zimin; Daniela Puiu; Richard Hall; et al.

GigaScience
Published on: 23 October 2017

Full Text | PDF
