hESC transcriptome - H1 cell line
This website hosts the data and results for each section of our PNAS paper. Click a section title to find the updated data, figures, tables, and other results.
- Gene isoform detection and prediction
- 8,084 detections and 10,811 predictions of gene isoforms. The overall identification rate is 74%.
- Novel genes
- More than 100 novel isoforms are identified in novel gene loci that are not reported by existing annotations (RefSeq, Ensembl, KnownGene and Gencode).
- hESC-specific novel genes
- 23 novel genes (Human Pluripotent Associated Transcript, HPAT) are specifically highly expressed in hESC.
- Novel gene isoforms
- 2,103 novel isoforms are identified that are not reported by existing annotations (RefSeq, Ensembl, KnownGene and Gencode).
- Quantification of isoform abundance
- Gene isoform abundance is estimated using the identified transcriptome.
- Isoforms of pluripotent stem cell markers
- The expressed isoforms (including novel ones) of the pluripotent stem cell markers, such as Oct4, Sall4, Sox2 ...
- ncRNA identification
- 116 novel lncRNAs are identified, and 68 have relatively high expression in hESC.
- Data
- Millions of PacBio long reads and 100 million Illumina short reads.
- How does it work?
- The SpliceMap-LSC-IDP pipeline is applied to identify gene isoforms.
We keep optimizing the results by rerunning updated versions of LSC and IDP, and will post more reliable results as they become available. To subscribe to update news, please contact Kin Fai Au.
Latest publications
Durruthy-Durruthy J., Sebastiano V., Wossidlo M., Cepeda D., Cui J., Grow E.J., Davila J., Mall M., Wong W.H., Wysocka J., Au K.F., Reijo Pera R.A.
A novel primate-specific noncoding RNA modulates human embryo- and pluripotent stem cell fate.
Nature Genetics. 2015. In press.
Kin Fai Au, Vittorio Sebastiano, Pegah Tootoonchi Afshar, Jens Durruthy Durruthy, Lawrence Lee, Brian A. Williams, Honoratus Van Bakel, Eric Schadt, Renee A. Reijo Pera, Jason Underwood, Wing Hung Wong
Characterization of the human ESC transcriptome by hybrid sequencing [preprint]
Proc. Natl. Acad. Sci. USA 2013 110 (50) E4821-E4830 [preprint]
SpliceMap-LSC-IDP pipeline
This hESC transcriptome was identified by the SpliceMap-LSC-IDP pipeline.
SpliceMap takes short reads from second-generation sequencing platforms, such as Illumina, to detect exon junctions.
LSC uses the high-quality short reads to correct the long reads from the PacBio platform. The output is a set of error-corrected long reads.
IDP uses the junction detections and the alignments of the error-corrected long reads to detect relatively short isoforms at full length and to predict very long isoforms by statistical modeling.
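The idea of detecting an isoform at full length can be illustrated with a toy sketch. This is not IDP's implementation: the function name, the `(donor, acceptor)` junction representation, and the `min_support` threshold are all hypothetical, and the sketch assumes each long read has already been reduced to its ordered chain of splice junctions. It only shows how short-read junction detections and long-read alignments might be combined: a chain counts as a detection when every junction in it is confirmed by short reads and enough long reads carry the same chain.

```python
from collections import Counter


def detect_full_length_isoforms(long_read_chains, short_read_junctions, min_support=2):
    """Toy illustration only (hypothetical, not IDP's actual algorithm).

    long_read_chains: iterable of junction chains, one per long read;
        each chain is an ordered list of (donor, acceptor) positions.
    short_read_junctions: iterable of (donor, acceptor) junctions
        detected from short reads.
    min_support: assumed minimum number of long reads per chain.
    """
    confirmed = set(short_read_junctions)
    support = Counter()
    for chain in long_read_chains:
        chain = tuple(chain)
        # Keep a read only if it is spliced and every junction is confirmed.
        if chain and all(junction in confirmed for junction in chain):
            support[chain] += 1
    # Report chains carried by at least min_support long reads.
    return {chain: n for chain, n in support.items() if n >= min_support}


# Made-up example data: three junctions confirmed by short reads.
junctions = [(100, 200), (300, 400), (500, 600)]
reads = [
    [(100, 200), (300, 400)],  # confirmed chain, first read
    [(100, 200), (300, 400)],  # confirmed chain, second read
    [(100, 200), (500, 600)],  # confirmed, but only one supporting read
    [(100, 200), (350, 400)],  # contains an unconfirmed junction
    [(100, 200), (350, 400)],
]
detected = detect_full_length_isoforms(reads, junctions)
print(detected)
```

Long isoforms that no single read spans cannot be handled this way, which is why the pipeline predicts them by statistical modeling instead; that step is not sketched here.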
Latest News
11-26-2013: hESC transcriptome and IDP paper is released
Kin Fai Au, Vittorio Sebastiano, Pegah Tootoonchi Afshar, Jens Durruthy Durruthy, Lawrence Lee, Brian A. Williams, Honoratus Van Bakel, Eric Schadt, Renee A. Reijo Pera, Jason Underwood, Wing Hung Wong
Characterization of the human ESC transcriptome by hybrid sequencing
To subscribe to update news, please contact Kin Fai Au.