RNASSA: Ribonucleic Acid Secondary Structure Analysis Software
Version: RNASSA 2.0 January 2015
Document Author: RNAVLab, Bioinformatics Program, UTEP
Email: rnavlab@utep.edu

Table of Contents for Release Notes: Updated 1/30/2015
*****************************************************************************************
1. Root Directory (\RNASSA20) and Output Subdirectory (\RNASSA20\inverslist)
2. Allowable Characters in Sequence Name
3. Restart a New RNASSA Session After Having Problems Using Sequence Name with " or '
4. What's new for Version 2.0?
5. Use of FASTA Files and Knowledge of Sequence Inversions
6. R Environment (GUI) Required for Running Segmentation
7. Find Inversions, Prepare Inversion List, and Run Segmentation
8. Files Generated in the RNASSA Root Directory and \inverslist Subdirectory
9. Update in Version 2.0.150128 for Including Overlapping Inversions as Default
*****************************************************************************************
*****************************************************************************************
1. Root Directory (\RNASSA20) and Output Subdirectory (\RNASSA20\inverslist)
*****************************************************************************************
After unzipping the RNASSA20.zip file, the root directory (\RNASSA20) for the Segmenta20.jar executable file and the output subdirectory (\RNASSA20\inverslist) for chunk files and PDF plots are created automatically.
*****************************************************************************************
2. Allowable Characters in Sequence Name: no quotation and apostrophe
*****************************************************************************************
Spaces and all characters on a common keyboard are allowed in the sequence name, with two exceptions: the quotation mark (") and the apostrophe ('). These two are not allowed because they cause problems during the prediction process on the Linux-based RNAVLab server.
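The naming rule can be sketched as a small check in Python. The function names are hypothetical and are not part of RNASSA; the replacement with a grave accent (`) mirrors the substitution that these notes say RNASSA applies in the Insert Sequence method, but this is only an illustration, not the program's own code.

```python
# Hypothetical helpers, not part of RNASSA: they only illustrate the naming rule.
DISALLOWED = ('"', "'")  # quotation mark and apostrophe


def has_disallowed_chars(name: str) -> bool:
    """Return True if the sequence name contains a quotation mark or apostrophe."""
    return any(ch in name for ch in DISALLOWED)


def sanitize_sequence_name(name: str) -> str:
    """Replace each quotation mark or apostrophe with a grave accent (`)."""
    for ch in DISALLOWED:
        name = name.replace(ch, "`")
    return name


print(sanitize_sequence_name("HIV-1 5' UTR"))  # prints: HIV-1 5` UTR
```

Checking a name before submitting it for prediction avoids having to restart the session afterwards.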
Version 1.0.120218 or later fixes this problem by automatically converting " and ' to a grave accent (`) when the Insert Sequence method on Tab 1 is used.
*****************************************************************************************
3. Restart a New RNASSA Session After Having Problems Using Sequence Name with " or '
*****************************************************************************************
After using a sequence name with " or ' for prediction in Tab 3, you must close the RNASSA session and start a new one by logging in again.
*****************************************************************************************
4. What's new for RNASSA Version 2.0?
*****************************************************************************************
This is a major upgrade from Version 1.0, with the integration of Segmenta 2.0.121208 under a new section inside Tab 1 and a button in Tab 2 to launch the standalone Segmenta 2.0 while running RNASSA.

In addition, a new editing tool for cutting chunks has been added under "Manual Cut/Edit (Ctrl-M)" in the RNA menu; it can also be launched with the "Manual Cut/Edit" button on the right side of Tab 2.

Another important improvement is the change to Tab 5 (now named "Comparison" instead of "Alignments"), where the listing of the two sequences is better formatted for analysis and the comparison results are generated for structures up to a million bases long.

Finally, Segmenta21.R, which runs the Regular and Centered Methods only, is included for faster execution without running the Optimized Method in R.
*****************************************************************************************
5. Use of FASTA Files and Knowledge of Sequence Inversions
*****************************************************************************************
The sequence file must be prepared in FASTA format and placed in the root directory for loading.
Type the filename in the first input field or click "Browse" to select the .fasta file, and click the radio buttons for optional choices in finding the inversions. In addition, you may select your own starting and ending values of minimum stem lengths, maximum gaps, and number of mismatches allowed, and Segmenta will run iteratively within those ranges. Default entries will show up after clicking the "Find Inversions" button, and you may edit those values.
*****************************************************************************************
6. R Environment (GUI) Required for Running Segmentation
*****************************************************************************************
In addition to the Java environment, RNASSA 2.0 requires R to be installed on your computer running either 32-bit or 64-bit Windows 7; R Version 2.14.2 is recommended. For other versions, you have to use the standalone version, Segmenta 2.0, by clicking "View/Cut" under the pull-down RNA menu. Press the left button "Segmenta 2.0" to activate its interface dialog box. You may then click the "Browse" button to find the "Path for R" in the lower left-hand corner of Segmenta 2.0.
*****************************************************************************************
7. Find Inversions, Prepare Inversion List, and Run Segmentation
*****************************************************************************************
After finding the inversions, you may type in your own file name for the generated output file or just use the default one. Before running RNA segmentation, you have to click the "Prepare Inversion List" button to generate the files in the inverslist directory that are used by Segmenta 2.0. Then, you may enter the value of Maximum Chunk Size (C) or use the default of 100, and click "Run Segmentation" for the GUI version of R to load, where you select "Source R Code..." under the File menu.
Highlight "Segmenta20.R" (with all methods) or "Segmenta21.R" (without the Optimized Method, for shorter running time), and click the file to run the program. Output text, FASTA, and PDF Rplot files will be placed in the inverslist subdirectory. The FASTA files (in subdirectory \inverslist) for chunks after segmentation can be loaded directly into RNASSA 2.0 for prediction.
*****************************************************************************************
8. Files Generated in the RNASSA Root Directory and \inverslist Subdirectory
*****************************************************************************************
In the RNASSA Root Directory:
InversList_param.txt: a parameter file for R, overwritten when another sequence is run
sequenceName_filelist.txt: list of FASTA files and PDF plots generated or excluded
sequenceName_inverslist.txt: list of inversions indicated by the start and end positions
sequenceName_noinversfiles.txt: list of files with no inversion
sequenceName_output.txt: output file of the sequence and its inversions

In the \inverslist Subdirectory:
sequenceName.txt: text file of the sequence in letters without sequence name (not FASTA)
sequenceName_C_Regular.fasta: sequence plus chunks cut by Regular Method (only one file)
sequenceName_L...Centered.fasta: chunks cut by Centered Method
sequenceName_L...Optimized.fasta: chunks cut by Optimized Method
sequenceName_L......_Plot.pdf: excursion plots in PDF files for all three methods
sequenceName_L...txt: selected files listing base positions of inversions, generated for Segmenta20.R (or Segmenta21.R), without files containing no inversion (except the first file in each set of consecutive files), and excluding those with mismatches more than 25% of the minimum stem length (e.g., for L=3, no mismatch is allowed; for L=11, only two are allowed).
*****************************************************************************************
9. Update in Version 2.0.150128 for Including Overlapping Inversions as Default
*****************************************************************************************
In previous versions, overlaps were excluded by default and were included only by clicking the radio button "Report overlaps." In this release, the results include overlaps by default, and the user has to click the radio button "Exclude inversions inside longer ones" to exclude this type of inversion.
*****************************************************************************************
Please email us at rnavlab@utep.edu immediately if you encounter problems or have comments. The latest version is RNASSA 2.0.150128.
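
As an illustration of the mismatch filter described in Section 8, the 25% limit can be worked out with a short Python sketch. It assumes the limit is rounded down to a whole number of mismatches, which agrees with the two examples given there (L=3 and L=11). The function names are hypothetical and this is not the code used by Segmenta.

```python
# Hypothetical illustration of the 25% mismatch rule from Section 8.
# Assumption: the limit is floor(0.25 * L), which matches the examples
# in the notes (L=3 allows 0 mismatches, L=11 allows 2).


def max_mismatches(min_stem_length: int) -> int:
    """Largest mismatch count that does not exceed 25% of the minimum stem length."""
    if min_stem_length < 0:
        raise ValueError("minimum stem length must be non-negative")
    return min_stem_length // 4  # integer arithmetic avoids float rounding


def is_excluded(min_stem_length: int, mismatches: int) -> bool:
    """True if a file with this many mismatches would be left out."""
    return mismatches > max_mismatches(min_stem_length)


for stem_length in (3, 11):
    print(stem_length, max_mismatches(stem_length))
```

For L=11, a file with two mismatches is kept and one with three is excluded under this reading of the rule.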