Repository: Freie Universität Berlin, Math Department

Tailored Analysis in Studying Transcriptome Landscape

You, Xintian Arthur (2015) Tailored Analysis in Studying Transcriptome Landscape. PhD thesis, Freie Universität Berlin.

Full text not available from this repository.

Knowledge of the transcriptome landscape is crucial in molecular biology, and increasingly important for disease diagnosis and treatment. Broadly speaking, three layers contribute to the importance of the transcriptome landscape. First, the profile of all isoforms of protein-coding genes determines the developmental path of cells and organisms. Second, the profile of regulatory elements modulates the activity of protein-coding genes. Third, the interplay of protein-coding genes and regulatory elements shapes the dynamic properties of the transcriptome landscape. Identifying the players in the regulatory network is the first step in reverse-engineering molecular biology. In this thesis, I present four tailored analyses on projects belonging to the first two layers. First, a hybrid assembly pipeline is developed for identifying a transcriptome independently of genomic sequences. By combining two complementary sequencing technologies with efficient cDNA normalization, a high-quality transcriptome can be characterized. The pipeline outperforms other assembly tools that focus on one type of input data, and the results are experimentally validated. Second, an analysis framework is developed to characterize full-length transcripts. By tailoring tools for long-read sequencing technology, the transcriptome landscape can be examined in greater detail. Moreover, the association of different RNA processing events can be experimentally measured. The application to fly Dscam gene transcripts resolved the independent splicing hypothesis and calls for re-examination of previous experiments. The application to rat brain greatly enhanced the transcriptome annotation, which is crucial for the neuroscience community that uses the rat as a model organism. Third, a de novo microRNA prediction tool is presented. By designing sequencing experiments that capture snapshots of the miRNA biogenesis process, not only can mature and precursor miRNAs be identified, but information on miRNA processing and modification can also be obtained. Proof-of-principle experiments on well-studied organisms such as mouse and C. elegans demonstrate the efficacy and application potential of this method. Finally, a customized pipeline is developed for profiling and characterizing circRNAs. By examining potential splicing junctions based on local alignments, circRNAs can be identified from otherwise neglected RNA-Seq data. Tens of thousands of circRNAs are identified and quantified in mouse, rat and fly. Further experiments demonstrate that circRNAs are enriched in brain synapses and participate in brain development and neuronal homeostatic plasticity. In summary, this thesis presents four tailored analyses on different aspects of the transcriptome landscape. The methods can be used in conjunction towards an integrated understanding of molecular biology and medicine.

Item Type: Thesis (PhD)
Subjects: Mathematical and Computer Sciences > Computer Science
Divisions: Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code: 2532
Deposited By: Anja Kasseckert
Deposited On: 24 Mar 2021 12:18
Last Modified: 24 Mar 2021 12:18
