TransDecoder

Introduction

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly with Trinity, or constructed from RNA-Seq alignments to the genome with TopHat and Cufflinks.

Versions

5.5.0
5.7.1

Commands

cdna_alignment_orf_to_genome_orf.pl
compute_base_probs.pl
exclude_similar_proteins.pl
fasta_prot_checker.pl
ffindex_resume.pl
gene_list_to_gff.pl
get_FL_accs.pl
get_longest_ORF_per_transcript.pl
get_top_longest_fasta_entries.pl
gff3_file_to_bed.pl
gff3_file_to_proteins.pl
gff3_gene_to_gtf_format.pl
gtf_genome_to_cdna_fasta.pl
gtf_to_alignment_gff3.pl
gtf_to_bed.pl
nr_ORFs_gff3.pl
pfam_runner.pl
refine_gff3_group_iso_strip_utrs.pl
refine_hexamer_scores.pl
remove_eclipsed_ORFs.pl
score_CDS_likelihood_all_6_frames.pl
select_best_ORFs_per_transcript.pl
seq_n_baseprobs_to_loglikelihood_vals.pl
start_codon_refinement.pl
train_start_PWM.pl
TransDecoder.LongOrfs
TransDecoder.Predict
uri_unescape.pl
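
Several of the utilities listed above support the genome-based route mentioned in the introduction, where transcripts come from a GTF file (for example, Cufflinks output) rather than a de novo assembly. The following is a minimal sketch of that route as described in the upstream TransDecoder documentation. The file names transcripts.gtf and genome.fasta are placeholders, and calling the utilities without a path assumes the loaded module puts them on PATH, as the list above suggests. The surrounding check is only an illustrative safeguard.

```shell
# Placeholder inputs (assumptions): a Cufflinks-style GTF and its genome FASTA.
GTF="transcripts.gtf"
GENOME="genome.fasta"

if command -v gtf_genome_to_cdna_fasta.pl >/dev/null 2>&1 \
    && [ -f "$GTF" ] && [ -f "$GENOME" ]; then
    # Build transcript sequences from the GTF annotation and the genome.
    gtf_genome_to_cdna_fasta.pl "$GTF" "$GENOME" > transcripts.fasta
    # Convert the GTF to alignment-style GFF3, needed for the last step.
    gtf_to_alignment_gff3.pl "$GTF" > transcripts.gff3
    # Extract long ORFs, then predict the likely coding regions.
    TransDecoder.LongOrfs -t transcripts.fasta
    TransDecoder.Predict -t transcripts.fasta
    # Project the transcript-level ORFs back onto genome coordinates.
    cdna_alignment_orf_to_genome_orf.pl \
        transcripts.fasta.transdecoder.gff3 \
        transcripts.gff3 \
        transcripts.fasta > transcripts.fasta.transdecoder.genome.gff3
else
    # Nothing is run when the utilities or the input files are missing.
    echo "Utilities or input files not found; nothing was run." >&2
fi
```

The final GFF3 file should then describe the predicted coding regions in genome coordinates, which is the form most genome browsers expect.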

Example job

Adjust the Slurm options to match your job's requirements (see the Slurm cheat sheet):

#!/bin/bash
#SBATCH -p partitionName  # batch, gpu, preempt, mpi or your group's own partition
#SBATCH -t 1:00:00  # Runtime limit (D-HH:MM:SS)
#SBATCH -N 1   # Number of nodes
#SBATCH -n 1   # Total number of tasks
#SBATCH -c 4   # Number of CPU cores per task
#SBATCH --mem=8G       # Memory required per node
#SBATCH --job-name=transdecoder        # Job name
#SBATCH --mail-type=FAIL,BEGIN,END     # Send an email when job fails, begins, and finishes
#SBATCH --mail-user=your.email@tufts.edu       # Email address for notifications
#SBATCH --error=%x-%J-%u.err   # Standard error file: <job_name>-<job_id>-<username>.err
#SBATCH --output=%x-%J-%u.out  # Standard output file: <job_name>-<job_id>-<username>.out

module purge   ### Optional, but highly recommended.
module load transdecoder/XXXX  ### Latest version is recommended.
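
The script above ends once the module is loaded. As a minimal sketch of what might follow, the two TransDecoder programs listed under Commands are typically run in sequence on a transcript FASTA file: TransDecoder.LongOrfs extracts long open reading frames, then TransDecoder.Predict selects the likely coding regions among them. The file name transcripts.fasta is a placeholder, not something taken from this page, and the check for the command is only an illustrative safeguard.

```shell
# Placeholder input (an assumption): replace with your own transcript FASTA,
# for example a Trinity assembly.
TRANSCRIPTS="transcripts.fasta"

if command -v TransDecoder.LongOrfs >/dev/null 2>&1; then
    # Step 1: extract long open reading frames from each transcript.
    TransDecoder.LongOrfs -t "$TRANSCRIPTS"
    # Step 2: predict the likely coding regions among those ORFs.
    TransDecoder.Predict -t "$TRANSCRIPTS"
else
    # TransDecoder is not on PATH; the message lands in the job's .err file.
    echo "TransDecoder.LongOrfs not found; check the module load line." >&2
fi
```

With the upstream defaults, the predictions should appear in the working directory as transcripts.fasta.transdecoder.pep, .cds, .gff3 and .bed. In a production job you may prefer to exit with a non-zero status in the else branch so that Slurm marks the job as failed.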