ERASE-Seq Cloud-Based Analysis

Summary:

This document describes the usage of the ERASE-Seq v2.4b front-end client for cloud-based analysis.

Obtaining the client

  1. Download the Client
  2. The client requires Windows 10 (Windows 7 with .NET 4.x is also acceptable).
  3. To install the client, drag the application folder to the desired location and unzip it.
  4. Make sure all files are extracted into a single folder.

Obtaining a license:

  1. Launch the application
  2. When prompted to choose between ‘Local execution’ and ‘Cloud execution,’ select ‘Cloud execution.’
  3. A prompt will state ‘Your copy of the application is not activated.’ Selecting ‘OK’ will display a unique, machine-specific user ID (UID).
  4. Copy the UID (or select ‘Copy to Clipboard’) and send it to support@fluxionbio.com.
  5. A unique license will be sent to you for cloud execution.
  6. Once a license file has been received, open it with a text editor and paste the content into the ‘License’ field. Select ‘OK.’ The client is now activated for cloud runs.
  7. Depending on your system configuration, the client may throw an exception about read/write access to the log file. This can be safely ignored at this time.

Run Execution

Data preparation

  1. Place all raw reads, in gzipped FASTQ (fastq.gz) format, into a folder.
    1. Reads must be named according to the following convention:
      [sample_identifier]_Tube_[A/B]_R[1/2]_001.fastq.gz
      1. sample_identifier: a unique sample name without spaces
      2. [A/B]: reaction identifier (A or B)
      3. R[1/2]: forward (R1) or reverse (R2) reads
        1. Example of a one-reaction effort:
          1. Sample-1_Tube_A_R1_001.fastq.gz
          2. Sample-1_Tube_A_R2_001.fastq.gz
        2. Two-reaction-per-sample sequencing will have four FASTQ files:
          1. Sample-1_Tube_A_R1_001.fastq.gz
          2. Sample-1_Tube_A_R2_001.fastq.gz
          3. Sample-1_Tube_B_R1_001.fastq.gz
          4. Sample-1_Tube_B_R2_001.fastq.gz
    2. Four-reaction sequencing assays will have eight files corresponding to Tubes A, B, C, and D.
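Before uploading, it can help to check file names against the convention above. The following is a minimal sketch, not part of the ERASE-Seq client: the function names `parse_fastq_name` and `find_unpaired` are hypothetical, and the pattern assumes tube letters A through D (to cover four-reaction assays) and a sample identifier with no whitespace.

```python
import re
from collections import defaultdict

# Assumed pattern for [sample_identifier]_Tube_[A-D]_R[1/2]_001.fastq.gz
FASTQ_NAME = re.compile(
    r"(?P<sample>\S+)_Tube_(?P<tube>[A-D])_R(?P<read>[12])_001\.fastq\.gz"
)


def parse_fastq_name(filename):
    """Return (sample, tube, read) for a conforming name, else None."""
    match = FASTQ_NAME.fullmatch(filename)
    if match is None:
        return None
    return match.group("sample"), match.group("tube"), int(match.group("read"))


def find_unpaired(filenames):
    """Return sorted (sample, tube) pairs that lack either the R1 or the R2 file."""
    reads = defaultdict(set)
    for name in filenames:
        parsed = parse_fastq_name(name)
        if parsed is not None:
            sample, tube, read = parsed
            reads[(sample, tube)].add(read)
    return sorted(key for key, seen in reads.items() if seen != {1, 2})
```

For example, `parse_fastq_name("Sample-1_Tube_A_R1_001.fastq.gz")` yields the sample name, tube, and read number, while a name containing a space is rejected. Passing the contents of the input folder to `find_unpaired` flags any reaction that is missing its forward or reverse read file.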

Folder and pipeline selection

  1. Select this directory by clicking ‘choose folder’ under ‘Input Files.’
  2. By default, run results will be returned to the same folder. You may select a different location, if desired.
  3. The first time a run is executed, only the option ‘Primer trimming, alignment, and caller’ is available. Subsequent runs with the same sample, if not reset, are able to utilize reads that have already been processed by selecting the ‘Caller only (0.5 hour per sample)’ option.
  4. Sample names and replicate IDs are auto-populated by parsing the file names, which must follow the naming convention described under ‘Data preparation.’
  5. A pipeline version, panel, and number of replicates may be selected from drop-down menus under ‘Pipeline Settings.’ If unsure of the appropriate pipeline to run, submit a support ticket.
  6. We strongly suggest consulting with Fluxion when running a new set of data. While a general description of each pipeline can be accessed by clicking the information button (i) next to the Pipeline Version drop-down menu, Fluxion personnel are best equipped to recommend a pipeline that suits your analysis goals.

Starting a run

  1. Selecting ‘Start’ will begin the run with the desired settings.
  2. Raw data files will be uploaded to the cloud. Do not exit the client during this time.
  3. Once the file transfer is complete, compute resources will be provisioned, and remote data analysis will commence.
  4. The client monitors log files as part of this process and displays the current state of the run. The client may be quit, computers turned off, etc., without affecting the run or results once the pipeline has started. When restarting the client, a connection will be made with the log file on the remote server and run status will be reported.

Returning results

  1. Upon completion, results will be downloaded into the folder selected before starting the run or the same folder as the input files (default).
  2. The user has the option to change the pipeline settings and restart a run. Additionally, subsequent runs may be executed with the caller only.
  3. A ‘Reset’ button will also be present. Selecting this option will prompt the user to save any information, such as logs, that may be desired, and will ready the client for the next dataset to be analyzed. While results are downloaded to the folder automatically, fully resetting a run will remove data on the cloud and will require a full re-run, including trimming, for any additional analysis.

Result files

  1. Three files are returned:
    1. A VCF with annotated variants.
    2. A filtered results table, returned as a tab-delimited file (TSV) with gene and variant information as well as COSMIC ID. 
    3. A filtered results table, returned in Microsoft Excel Open XML format (XLSX). This is identical to the TSV except for its format.
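The tab-delimited results table can be read with standard tools for downstream processing. The following is a minimal sketch using Python's built-in `csv` module; the function name `load_results_table` is hypothetical, and the sketch assumes the first line of the TSV is a header row, which this document does not confirm.

```python
import csv


def load_results_table(path):
    """Read a tab-delimited results table into a list of row dictionaries.

    Assumes the first line of the file holds the column headers.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Each returned row is a dictionary keyed by column header, so the available columns can be inspected with `rows[0].keys()` rather than relying on assumed names.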