Published Pages | chanaka | RNASeq Analysis Lab

Galaxy provides multiple tools for performing RNA-seq analysis. This exercise introduces these Galaxy tools (TopHat -> Cufflinks -> Cuffcompare -> Cuffdiff) and shows how to use them to carry out an exercise similar to the RNASeq Analysis Lab. The tutorial helps you become familiar with Galaxy tools and Galaxy workflows, and gives a basic understanding of RNA-seq analysis.

Here are the sample datasets that we are going to use:

RNA samples data from leaves:

Galaxy Dataset | asp5_leaf_read1.P001.fq

Aspen leaf paired-end reads (2 × 50 bp); the target insert size was 200 bp.

Galaxy Dataset | asp5_leaf_read2.P001.fq

Aspen leaf paired-end reads (2 × 50 bp); the target insert size was 200 bp.

RNA samples data from xylem (woody tissue): 

Galaxy Dataset | asp5_xylem_read1.P001.fq

Aspen xylem paired-end reads (2 × 50 bp); the target insert size was 200 bp.

Galaxy Dataset | asp5_xylem_read2.P001.fq

Aspen xylem paired-end reads (2 × 50 bp); the target insert size was 200 bp.
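TopHat asks for the expected inner distance between the two mates rather than for the insert size itself. The sketch below derives that value from the read layout described above. It assumes that the 200 bp "insert size" means the full fragment length including both reads; some protocols use the term for the unsequenced gap only, in which case no subtraction is needed. The function name is illustrative and is not part of Galaxy or TopHat.

```python
def mate_inner_distance(fragment_length: int, read_length: int) -> int:
    """Return the expected gap between the two mates of a paired-end fragment.

    Assumes fragment_length covers both reads plus the unsequenced middle.
    """
    return fragment_length - 2 * read_length


# Values from the dataset descriptions: 2 x 50 bp reads, 200 bp target size.
inner = mate_inner_distance(fragment_length=200, read_length=50)
print(inner)  # 100
```

Under that assumption, 100 would be the value to consider for TopHat's mean inner distance between mate pairs (the `-r` / `--mate-inner-dist` option on the command line).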

Additional input file

We can copy the following input file into the current history from the shared data libraries (Shared Data > Data Libraries > Ptrichocarpa_129_gene > select Ptrichocarpa_129_gene.gtf > Import to current history > Go). We will use this Ptrichocarpa_129_gene.gtf dataset as the reference annotation for the Cuffcompare tool under NGS: RNA Analysis (Use Reference Annotation: Yes; Reference Annotation: select Ptrichocarpa_129_gene.gtf) to identify predicted transcripts in common between the two samples.

Galaxy Dataset | Ptrichocarpa_129_gene.gtf

Ptrichocarpa_129_gene.gtf

Understand the workflow in RNAseq Gene Expression

  • Get data into Galaxy (Get Data)
  • Read random lines from the input FASTQ pairs
  • Convert to a standard FASTQ format (FASTQ Groomer)
  • Quality check and trimming (FastQC, FASTQ Trimmer)
  • Map sequence reads to the genome (TopHat)
  • Assemble the transcripts (Cufflinks)
  • Compare splice variants and isoforms (Cuffcompare)
  • Counts and FPKM (Cufflinks)
  • Differential expression (Cuffdiff)
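To make the "Counts and FPKM" step in the list above more concrete, here is a minimal sketch of the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments). Cufflinks estimates abundances with a statistical model and its own length corrections, so this is only the basic formula, not the Cufflinks implementation. The function name and the numbers are hypothetical.

```python
def fpkm(fragments: float, transcript_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers, for illustration only.
value = fpkm(fragments=500, transcript_length_bp=2000,
             total_mapped_fragments=10_000_000)
print(value)  # 25.0
```

Normalising by both transcript length and library size is what allows expression values to be compared between the leaf and xylem samples, which may differ in sequencing depth.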

Data processing flow

RNASeq workflow

Example use

Step 1: Import the datasets (asp5_leaf_read1.P001.fq, asp5_leaf_read2.P001.fq, asp5_xylem_read1.P001.fq, asp5_xylem_read2.P001.fq, Ptrichocarpa_129_gene.gtf) into the current history. Datasets can be imported using the import icon.

Step 2: Run the RNASeq Analysis workflow (Workflow > RNASeq Analysis > Run).

Step 3: Manually select the matching input dataset for each workflow input. Select "Send results to a new history", name the new history (here we call it "RNASeq Analysis history"), and click Run Workflow.
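The same three steps can also be scripted against the Galaxy API. The following is a rough sketch using the third-party BioBlend library, not part of the original exercise. It assumes that the workflow is available in your own workflow list under the name "RNASeq Analysis" and that each workflow input is labelled with the name of the dataset it expects; both are assumptions, and the helper names, URL, API key and history ID are placeholders.

```python
from typing import Dict, List


def build_input_map(workflow_inputs: Dict[str, dict],
                    history_datasets: List[dict]) -> Dict[str, dict]:
    """Match each workflow input step to a history dataset with the same name.

    workflow_inputs: the 'inputs' dict of a workflow description,
        e.g. {"0": {"label": "asp5_leaf_read1.P001.fq"}, ...}
    history_datasets: dataset dicts carrying at least 'name' and 'id'.
    """
    by_name = {d["name"]: d["id"] for d in history_datasets}
    input_map = {}
    for step, info in workflow_inputs.items():
        label = info["label"]
        if label not in by_name:
            raise KeyError(f"no dataset named {label!r} in the history")
        # 'hda' marks the source as a dataset that lives in a history.
        input_map[step] = {"src": "hda", "id": by_name[label]}
    return input_map


def run_rnaseq_workflow(url: str, api_key: str, history_id: str) -> dict:
    """Invoke the workflow on a Galaxy server (requires 'bioblend')."""
    from bioblend.galaxy import GalaxyInstance  # imported lazily

    gi = GalaxyInstance(url=url, key=api_key)
    # Assumed workflow name; adjust to match your own workflow list.
    workflow = gi.workflows.get_workflows(name="RNASeq Analysis")[0]
    details = gi.workflows.show_workflow(workflow["id"])
    datasets = gi.histories.show_history(history_id, contents=True)
    inputs = build_input_map(details["inputs"], datasets)
    # history_name sends the results to a new history, as in Step 3.
    return gi.workflows.invoke_workflow(
        workflow["id"], inputs=inputs, history_name="RNASeq Analysis history"
    )
```

Keeping the name matching in a separate function makes it possible to check the input mapping without contacting a server, which mirrors the manual check of inputs in Step 3.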

Galaxy Workflow

Future

These are the typical steps of an RNA-seq analysis. We can change the workflow parameters according to our requirements, or we can run the tools separately.

The final results are as follows:

For in-depth analysis we can send data to Popgenie.org or GBrowse (Send data to Popgenie).