Info - Scenedesmus obliquus UTEX B 3031

Status

The genome sequence and gene models of Scenedesmus obliquus strain UTEX B 3031 (=DOE0152Z) were not determined by the Joint Genome Institute (JGI). The annotation was performed by Dr. Zaid McKie-Krisberg at Brooklyn College of the City University of New York. In order to allow comparative analyses with other genomes sequenced by the JGI, a copy of this genome is incorporated into the JGI Genome Portal. JGI tools were used to automatically annotate predicted proteins. Please note that the release presented here includes the gene annotation v1.0, and all gene models are included in the ExternalModels track. This annotation has not been published and permission should be requested for use.

We applied filters to remove the following, where present: 1) transposable elements, 2) pseudogenes, 3) alternative transcripts and overlapping models, 4) alleles on secondary scaffolds, and 5) unsupported short models. This removed 14,261 models from S. obliquus UTEX B 3031 and produced the FilteredModels2 gene track. Please note that this copy of the genome is not maintained by the JGI and is therefore not automatically updated.

S. obliquus UTEX B 3031 is likely diploid, and this is reflected in an assembly and annotation with substantial separation of alleles. Of the 2,812 scaffolds, 1,705 are very similar to larger scaffolds and are predicted to constitute an alternate, or secondary, haplotype. To represent the primary and secondary haplotypes in the Portal, we have created 'primary alleles' and 'secondary alleles' gene model tracks, comprising the models found on each haplotype. The goal of the GeneCatalog (GC) is to produce a non-redundant set of models that captures the full functional repertoire of the genome, so the few secondary alleles that are unique were included in the GC and all others were excluded.

Summary statistics for the Scenedesmus obliquus UTEX B 3031 v1.0 release are below.

Genome Assembly
Genome Assembly size (Mbp)	210.26
Sequencing read coverage depth	86x
# of contigs	2812
# of scaffolds	2812
# of scaffolds >= 2Kbp	2812
Scaffold N50	348
Scaffold L50 (Mbp)	0.15
# of gaps	0
% of scaffold length in gaps	0.0%
Three largest Scaffolds (Mbp)	2.33, 1.49, 1.27
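
Note that the table above reports Scaffold N50 as a count of scaffolds (348) and Scaffold L50 as a length (0.15 Mbp). The following is a minimal sketch, in Python, of how the two values could be derived from a list of scaffold lengths under that labelling. The function name and the toy lengths are hypothetical and are not taken from the actual assembly or from any JGI tool.

```python
def scaffold_n50_l50(lengths):
    """Return (count, length) at the 50% point of an assembly.

    Follows the labelling used in the table above: "N50" is the number of
    largest scaffolds needed to cover at least half of the total assembly
    length, and "L50" is the length of the smallest scaffold in that set.
    """
    ordered = sorted(lengths, reverse=True)   # largest scaffold first
    half = sum(ordered) / 2                   # half of the assembly size
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("no scaffold lengths given")


# Hypothetical toy scaffold lengths in bp (not the real assembly).
toy_lengths = [500, 400, 300, 200, 100]
n50_count, l50_length = scaffold_n50_l50(toy_lengths)
print(n50_count, l50_length)
```

For the toy input, the two largest scaffolds (500 + 400 bp) are the first to reach half of the 1,500 bp total, so the count is 2 and the corresponding length is 400 bp.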

Gene Models	FilteredModels2
length (bp) of:	average	median
gene	4373	3494
transcript	1878	1566
exon	259	156
intron	401	312
description:
protein length (aa)	456	363
exons per gene	7.25	6
# of gene models	22378

Collaborators

Dr. Juergen Polle at Brooklyn College of CUNY
Dr. Zaid McKie-Krisberg at Brooklyn College of CUNY
Dr. Shawn Starkenburg at LANL

Genome Reference(s)

Please cite the following publication(s) if you use the data from this genome in your research:

Starkenburg SR, Polle JEW, Hovde B, Daligault HE, Davenport KW, Huang A, Neofotis P, McKie-Krisberg Z
Draft Nuclear Genome, Complete Chloroplast Genome, and Complete Mitochondrial Genome for the Biofuel/Bioproduct Feedstock Species Scenedesmus obliquus Strain DOE0152z.
Genome Announc. 2017 Aug 10;5(32). doi: 10.1128/genomeA.00617-17

Funding

This project was not sequenced at the JGI.