Harwood Lab • School of Medicine, University of Washington • Box 357735 • 1705 NE Pacific St • Seattle WA 98195
Example dataset walkthrough — Xpression 1.0rc1 documentation

Example dataset walkthrough

This written guide follows the walkthrough video below, which shows the example dataset being run in Xpression. The steps here appear in the same order as in the video.

Needed items

To proceed with the walkthrough, make sure you have a copy of the Xpression GUI (see Download) and that all required software is installed, or use the Xpression Virtual Environment.

Please see Installation for further details.

Next, you will need the Example Dataset. You can also use your own data, as long as it includes a reference fasta file, a reference Genbank file and a sequencing fastq file.

Walkthrough video

Start Xpression

You should be able to start Xpression by double-clicking the Xpression.jar file. If this does not work, try right-clicking the Xpression.jar file and choosing Java as the program to open it with. If Java is not offered as an application, look for a “custom command” field, enter java -jar and press Enter. Alternatively, you can type java -jar ./path/to/Xpression.jar into a terminal emulator.

Now the Xpression window is open.

We need to specify some options to let Xpression know what and where our sample is.

Entering Sample information

First, let's change the Sample ID to Example1. Next, the barcode needs to be defined so that reads carrying that barcode can be extracted. For our multiplexed example, the four-nucleotide barcode is ACCC, so we type ACCC into the Barcode field. Now all reads starting with this barcode will be included in the sample output.
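To illustrate what "reads starting with this barcode" means, here is a minimal Python sketch of prefix-based barcode selection. It is not Xpression's own code: the function names `parse_fastq` and `reads_with_barcode` are hypothetical, the sketch assumes plain four-line fastq records, and it does not trim the barcode from the read (whether Xpression does so is not covered here).

```python
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from four-line fastq records.

    Assumes each record is exactly four lines; a trailing incomplete
    record is silently dropped.
    """
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping the same iterator four times groups lines into records.
    for header, seq, _plus, qual in zip(*[stripped] * 4):
        yield header, seq, qual


def reads_with_barcode(
    records: Iterable[FastqRecord], barcode: str
) -> Iterator[FastqRecord]:
    """Keep only the reads whose sequence starts with the barcode."""
    for header, seq, qual in records:
        if seq.startswith(barcode):
            yield header, seq, qual


# Two made-up reads: only the first begins with the example barcode ACCC.
example_lines = [
    "@read1\n", "ACCCGGTTAA\n", "+\n", "IIIIIIIIII\n",
    "@read2\n", "TTGGACCCAA\n", "+\n", "IIIIIIIIII\n",
]
kept = list(reads_with_barcode(parse_fastq(example_lines), "ACCC"))
```

Note that the second read contains ACCC in the middle of its sequence but is not kept, because only a match at the start of the read counts.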

Xpression will complete all the steps of analysis by default, but you can click on the drop-down box Final Step to explore which steps will occur.

Locating reference files

Now the sequencing file itself must be located on the filesystem. This file is in fastq format and uses the default fastq quality encoding. The file can remain compressed to save time. Click Sequencing fastq Get File and browse to where this file is located.
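If you want to peek at a sequencing file yourself without decompressing it first, something along these lines may help. This is only a sketch: it assumes the compression is gzip (the compression format is not stated above), and `open_fastq` is a hypothetical helper name, not part of Xpression.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a fastq file as text, whether or not it is gzip-compressed.

    Assumption: a compressed file is gzip; it is detected by its
    two-byte magic number rather than by its file extension.
    """
    path = Path(path)
    with path.open("rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rt")
    return path.open("rt")
```

A typical use would be `with open_fastq("example.fastq.gz") as handle:` followed by reading lines as from any text file.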

Next, the location of the sample organism’s genome reference in fasta format is entered. This file is used as a reference to map the reads. Click Genome reference Get File to locate this file.

Finally, since we want to generate expression data for our sample, a genbank reference file corresponding to the organism of the sample needs to be located. If the pipeline is only run to step 2, this is not needed. Click Genbank reference Get File to specify where this file is.
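The relationship between the Final Step setting and the required files can be summarized in a short Python sketch. The function name `missing_inputs` and the dictionary keys are hypothetical, and the sketch assumes that steps are numbered with integers and that every step after step 2 needs the Genbank reference; only the step 2 cutoff comes from the text above.

```python
from typing import Dict, List, Optional


def missing_inputs(final_step: int, files: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of required inputs that have not been provided."""
    # The sequencing fastq and the genome fasta are needed in this walkthrough.
    required = ["sequencing_fastq", "genome_fasta"]
    # Assumption: any final step beyond step 2 needs the Genbank reference.
    if final_step > 2:
        required.append("genbank")
    return [name for name in required if not files.get(name)]


# Hypothetical file names, for illustration only.
chosen = {"sequencing_fastq": "example.fastq", "genome_fasta": "reference.fasta"}
needed_for_step_2 = missing_inputs(2, chosen)
needed_for_step_3 = missing_inputs(3, chosen)
```

With only the fastq and fasta files chosen, nothing is missing for a run that stops at step 2, while a run that continues past it still lacks the Genbank reference.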

Now that all reference files have been located and the sample information has been entered, Xpression is ready to run.

Tip

Other options can be changed, but for the parameters commonly used with this file they can be left at their defaults. These options are located in the Options window, accessed by clicking Edit options.. from the Options menu.

Running Xpression

Now click the Start Run button located in the upper right-hand corner of the main window. This will add the sample to the list and start the pipeline for this sample.

While the pipeline is running, output showing each step and its status is displayed.

Once Xpression has completed, browse to the Xpression_results folder in your home directory. There will be a folder labeled example, named after the example fastq file. Within this folder, there will be a folder named Example1, since this was the Sample ID.
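The folder layout described above can be sketched in Python as follows. The helper name `expected_output_dir` is hypothetical, and the rule for deriving the folder name from the fastq file (dropping everything from the first dot onward) is an assumption; the text only states that the folder is named after the fastq file.

```python
from pathlib import Path


def expected_output_dir(fastq_path, sample_id, home=None):
    """Build the expected results folder for one sample.

    Layout: <home>/Xpression_results/<fastq name>/<Sample ID>
    """
    home = Path(home) if home is not None else Path.home()
    file_name = Path(fastq_path).name
    # Assumption: extensions such as .fastq or .fastq.gz are dropped.
    fastq_name = file_name.split(".", 1)[0]
    return home / "Xpression_results" / fastq_name / sample_id


# "Example1" is the Sample ID entered at the start of the walkthrough;
# the fastq path and home directory here are placeholders.
sample_id = "Example1"
out_dir = expected_output_dir("/data/example.fastq", sample_id, home="/home/user")
```

Leaving out the `home` argument makes the sketch use the current user's home directory instead.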