Status
[November 2019] The Scenedesmus obliquus EN0004 genome was sequenced with PacBio, assembled using MECAT with polishing by ARROW, and annotated with the JGI Annotation Pipeline. The transcriptome was sequenced with Illumina and assembled with Trinity. IsoSeq reads sequenced with PacBio were also used in the annotation. The RNASeq and IsoSeq reads were provided by our collaborators and were not sequenced by JGI. The mitochondrial and chloroplast genomes were assembled separately and are available in the downloads section.
Summary statistics for the Scenedesmus obliquus EN0004 v1.0 release are below.
Genome Assembly | |
Genome Assembly size (Mbp) | 100.08 |
Sequencing read coverage depth | 236.48x |
# of contigs | 99 |
# of scaffolds | 99 |
# of scaffolds >= 2Kbp | 99 |
Scaffold N50 | 13 |
Scaffold L50 (Mbp) | 2.49 |
# of gaps | 0 |
% of scaffold length in gaps | 0.0% |
Three largest Scaffolds (Mbp) | 10.97, 5.63, 4.30 |
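In the table above, Scaffold N50 is reported as a count (13 scaffolds) and Scaffold L50 as a length (2.49 Mbp); some other tools use the two names the other way round. The following is a minimal sketch of how such a pair is commonly computed from a list of scaffold lengths. The function name `n50_count_and_length` and the toy lengths are hypothetical and are not taken from this release.

```python
def n50_count_and_length(lengths):
    """Return (count, length) for the 50% point of an assembly.

    count:  smallest number of longest scaffolds whose summed length
            reaches at least half of the total assembly size
            (reported as "Scaffold N50" in the table above).
    length: length of the last scaffold added to reach that point
            (reported as "Scaffold L50" in the table above).
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")

    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return count, length


# Toy example with made-up scaffold lengths (not the EN0004 assembly).
toy_lengths = [10, 8, 6, 4, 2]
print(n50_count_and_length(toy_lengths))
```

To apply this to the real assembly, the scaffold lengths would have to be read from the assembly FASTA file in the downloads section.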
ESTs | Data set | # sequences total | # mapped to genome | % mapped to genome |
ESTs | est.fasta | 159169885 | 95185614 | 59.8% |
Other | IsoSeq_reads | 182475 | 163006 | 89.3% |
Other | JGI_RNA_contigs | 56668 | 20666 | 36.5% |
- Please note: Using BLAST, unmapped RNA reads and contigs were found to hit Homo sapiens, as well as Proteobacteria (Burkholderia contaminans) and Actinobacteria (Cutibacterium acnes), suggesting that the RNA library was contaminated.
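The "% mapped to genome" column can be recomputed from the two count columns. The short sketch below does this for the three data sets listed above, assuming the percentage is simply mapped / total rounded to one decimal place; the helper name `percent_mapped` is hypothetical.

```python
def percent_mapped(mapped, total):
    """Percentage of sequences mapped to the genome, to one decimal place."""
    return round(100.0 * mapped / total, 1)


# (total sequences, sequences mapped to genome), copied from the table above
datasets = {
    "est.fasta": (159169885, 95185614),
    "IsoSeq_reads": (182475, 163006),
    "JGI_RNA_contigs": (56668, 20666),
}

for name, (total, mapped) in datasets.items():
    print(f"{name}: {percent_mapped(mapped, total)}%")
```

Under that assumption the computed values agree with the table (59.8%, 89.3% and 36.5%).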
Gene Models | FilteredModels1 | |
length (bp) of: | average | median |
gene | 4071 | 2946 |
transcript | 1855 | 1470 |
exon | 275 | 148 |
intron | 388 | 299 |
description: | average | median |
protein length (aa) | 446 | 308 |
exons per gene | 6.74 | 5 |
# of gene models | 15053 |
Collaborators
- James Umen at Danforth Center
- Shawn Starkenburg at LANL
- Juergen Polle at Brooklyn College of CUNY
Funding
The work conducted by the U.S. Department of Energy Joint Genome
Institute, a DOE Office of Science User Facility, is supported by
the Office of Science of the U.S. Department of Energy under
Contract No. DE-AC02-05CH11231.