---
tags: bioinformatics
---

# Bioinformatics applications

Below are bioinformatics applications deployed by Tufts Research Technology.

```{gallery-grid}
:grid-columns: 1
:grid-rows: 16

- header: "{fas}`dna;pst-color-primary` abcreg"
  content: "ABCreg is a tool used for automating approximate Bayesian computation by local linear regression."
  link: "doc/abcreg/abcreg.html"
- header: "{fas}`dna;pst-color-primary` abyss"
  content: "ABySS is a de novo sequence assembler intended for short paired-end reads and genomes of all sizes."
  link: "doc/abyss/abyss.html"
- header: "{fas}`dna;pst-color-primary` alphafold"
  content: "Alphafold is an artificial intelligence program developed by Alphabet's/Google's DeepMind which performs predictions of protein structure."
  link: "doc/alphafold/alphafold.html"
- header: "{fas}`dna;pst-color-primary` amplify"
  content: "AMPlify is an attentive deep learning model for antimicrobial peptide prediction."
  link: "doc/amplify/amplify.html"
- header: "{fas}`dna;pst-color-primary` angsd"
  content: "Angsd is software for analyzing next generation sequencing data."
  link: "doc/angsd/angsd.html"
- header: "{fas}`dna;pst-color-primary` bakta"
  content: "Bakta is a tool for the rapid & standardized annotation of bacterial genomes and plasmids from both isolates and MAGs. It provides dbxref-rich, sORF-including and taxon-independent annotations in machine-readable JSON & bioinformatics standard file formats for automated downstream analysis."
  link: "doc/bakta/bakta.html"
- header: "{fas}`dna;pst-color-primary` bbmap"
  content: "Bbmap is a short read aligner, bundled with various other bioinformatic tools."
  link: "doc/bbmap/bbmap.html"
- header: "{fas}`dna;pst-color-primary` bbtools"
  content: "BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data."
  link: "doc/bbtools/bbtools.html"
- header: "{fas}`dna;pst-color-primary` bcftools"
  content: "Bcftools is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF."
  link: "doc/bcftools/bcftools.html"
- header: "{fas}`dna;pst-color-primary` beast2"
  content: "BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences."
  link: "doc/beast2/beast2.html"
- header: "{fas}`dna;pst-color-primary` bedops"
  content: "Bedops is a software package for manipulating and analyzing genomic interval data."
  link: "doc/bedops/bedops.html"
- header: "{fas}`dna;pst-color-primary` bedtools"
  content: "Bedtools is an extensive suite of utilities for genome arithmetic and comparing genomic features in BED format."
  link: "doc/bedtools/bedtools.html"
- header: "{fas}`dna;pst-color-primary` biobakery_workflows"
  content: "BioBakery workflows is a collection of workflows and tasks for executing common microbial community analyses using standardized, validated tools and parameters."
  link: "doc/biobakery_workflows/biobakery_workflows.html"
- header: "{fas}`dna;pst-color-primary` biopython"
  content: "Biopython is a set of freely available tools for biological computation written in Python."
  link: "doc/biopython/biopython.html"
- header: "{fas}`dna;pst-color-primary` blast"
  content: "BLAST (Basic Local Alignment Search Tool) finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance."
  link: "doc/blast/blast.html"
- header: "{fas}`dna;pst-color-primary` bowtie2"
  content: "Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.
    Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes."
  link: "doc/bowtie2/bowtie2.html"
- header: "{fas}`dna;pst-color-primary` breseq"
  content: "Breseq is a computational pipeline for the analysis of short-read re-sequencing data."
  link: "doc/breseq/breseq.html"
- header: "{fas}`dna;pst-color-primary` busco"
  content: "BUSCO (Benchmarking sets of Universal Single-Copy Orthologs) provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs."
  link: "doc/busco/busco.html"
- header: "{fas}`dna;pst-color-primary` cactus"
  content: "Cactus is a reference-free whole-genome multiple alignment program."
  link: "doc/cactus/cactus.html"
- header: "{fas}`dna;pst-color-primary` canu"
  content: "Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION)."
  link: "doc/canu/canu.html"
- header: "{fas}`dna;pst-color-primary` cellprofiler"
  content: "CellProfiler is free, open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically."
  link: "doc/cellprofiler/cellprofiler.html"
- header: "{fas}`dna;pst-color-primary` cellprofiler-analyst"
  content: "CellProfiler Analyst allows interactive exploration and analysis of data, particularly from high-throughput, image-based experiments."
  link: "doc/cellprofiler-analyst/cellprofiler-analyst.html"
- header: "{fas}`dna;pst-color-primary` cellranger"
  content: "Cellranger is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more."
  link: "doc/cellranger/cellranger.html"
- header: "{fas}`dna;pst-color-primary` cellranger-atac"
  content: "Cellranger-atac is a set of analysis pipelines that process Chromium Single Cell ATAC data."
  link: "doc/cellranger-atac/cellranger-atac.html"
- header: "{fas}`dna;pst-color-primary` cellrank"
  content: "Cellrank is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data."
  link: "doc/cellrank/cellrank.html"
- header: "{fas}`dna;pst-color-primary` cufflinks"
  content: "Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols."
  link: "doc/cufflinks/cufflinks.html"
- header: "{fas}`dna;pst-color-primary` cutadapt"
  content: "Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads."
  link: "doc/cutadapt/cutadapt.html"
- header: "{fas}`dna;pst-color-primary` diamond"
  content: "Diamond is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data."
  link: "doc/diamond/diamond.html"
- header: "{fas}`dna;pst-color-primary` dorado"
  content: "Dorado is a high-performance, easy-to-use, open source basecaller for Oxford Nanopore reads."
  link: "doc/dorado/dorado.html"
- header: "{fas}`dna;pst-color-primary` dragon_ora"
  content: "The DRAGEN ORA Helper Suite Software is a suite of software for Linux distributions, designed to integrate compressed FASTQ files in a transparent manner."
  link: "doc/dragon_ora/dragon_ora.html"
- header: "{fas}`dna;pst-color-primary` exonerate"
  content: "Exonerate is a generic tool for pairwise sequence comparison/alignment."
  link: "doc/exonerate/exonerate.html"
- header: "{fas}`dna;pst-color-primary` fasta3"
  content: "Fasta3 is a suite of programs for searching nucleotide or protein databases with a query sequence."
  link: "doc/fasta3/fasta3.html"
- header: "{fas}`dna;pst-color-primary` fastp"
  content: "Fastp is an ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging, etc.)."
  link: "doc/fastp/fastp.html"
- header: "{fas}`dna;pst-color-primary` fastqc"
  content: "FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis."
  link: "doc/fastqc/fastqc.html"
- header: "{fas}`dna;pst-color-primary` fastspar"
  content: "FastSpar is a C++ implementation of the SparCC algorithm which is up to several thousand times faster than the original Python2 release and uses much less memory. The FastSpar implementation provides threading support and a p-value estimator which accounts for the possibility of repetitious data permutations."
  link: "doc/fastspar/fastspar.html"
- header: "{fas}`dna;pst-color-primary` fasttree"
  content: "FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
    For large alignments, FastTree is 100-1,000 times faster than PhyML 3.0 or RAxML 7."
  link: "doc/fasttree/fasttree.html"
- header: "{fas}`dna;pst-color-primary` filtlong"
  content: "Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter."
  link: "doc/filtlong/filtlong.html"
- header: "{fas}`dna;pst-color-primary` flye"
  content: "Flye is a fast and accurate de novo assembler for single molecule sequencing reads."
  link: "doc/flye/flye.html"
- header: "{fas}`dna;pst-color-primary` fqtk"
  content: "fqtk is a toolkit for working with FASTQ files, written in Rust."
  link: "doc/fqtk/fqtk.html"
- header: "{fas}`dna;pst-color-primary` gatk4"
  content: "GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery."
  link: "doc/gatk4/gatk4.html"
- header: "{fas}`dna;pst-color-primary` genomad"
  content: "geNomad is a tool for the identification of mobile genetic elements."
  link: "doc/genomad/genomad.html"
- header: "{fas}`dna;pst-color-primary` geomx_ngs_pipeline"
  content: "The GeoMx NGS Pipeline, developed by NanoString, is an essential part of the GeoMx NGS workflow."
  link: "doc/geomx_ngs_pipeline/geomx_ngs_pipeline.html"
- header: "{fas}`dna;pst-color-primary` guppy"
  content: "Guppy is a data processing toolkit that contains the Oxford Nanopore Technologies’ basecalling algorithms, and several bioinformatic post-processing features."
  link: "doc/guppy/guppy.html"
- header: "{fas}`dna;pst-color-primary` hap.py"
  content: "Hap.py is a tool to compare diploid genotypes at haplotype level."
  link: "doc/hap.py/hap.py.html"
- header: "{fas}`dna;pst-color-primary` hisat2"
  content: "HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome."
  link: "doc/hisat2/hisat2.html"
- header: "{fas}`dna;pst-color-primary` hmmer"
  content: "Hmmer is used for searching sequence databases for sequence homologs, and for making sequence alignments."
  link: "doc/hmmer/hmmer.html"
- header: "{fas}`dna;pst-color-primary` homer"
  content: "HOMER is a suite of tools for Motif Discovery and next-gen sequencing analysis."
  link: "doc/homer/homer.html"
- header: "{fas}`dna;pst-color-primary` htseq"
  content: "HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments."
  link: "doc/htseq/htseq.html"
- header: "{fas}`dna;pst-color-primary` humann"
  content: "Humann is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads)."
  link: "doc/humann/humann.html"
- header: "{fas}`dna;pst-color-primary` impute2"
  content: "Impute2 is a genotype imputation and haplotype phasing program."
  link: "doc/impute2/impute2.html"
- header: "{fas}`dna;pst-color-primary` iqtree2"
  content: "IQ-TREE is efficient software for phylogenomic inference by maximum likelihood."
  link: "doc/iqtree2/iqtree2.html"
- header: "{fas}`dna;pst-color-primary` kallisto"
  content: "Kallisto is a program for quantifying abundances of transcripts from bulk and single-cell RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads."
  link: "doc/kallisto/kallisto.html"
- header: "{fas}`dna;pst-color-primary` kneaddata"
  content: "Kneaddata is a tool designed to perform quality control on metagenomic sequencing data."
  link: "doc/kneaddata/kneaddata.html"
- header: "{fas}`dna;pst-color-primary` kraken2"
  content: "Kraken2 is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences."
  link: "doc/kraken2/kraken2.html"
- header: "{fas}`dna;pst-color-primary` krakentools"
  content: "Krakentools is a suite of scripts to be used for post-analysis of Kraken/KrakenUniq/Kraken2/Bracken results."
  link: "doc/krakentools/krakentools.html"
- header: "{fas}`dna;pst-color-primary` macs2"
  content: "MACS2 is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites."
  link: "doc/macs2/macs2.html"
- header: "{fas}`dna;pst-color-primary` macs3"
  content: "Macs3 is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites."
  link: "doc/macs3/macs3.html"
- header: "{fas}`dna;pst-color-primary` masurca"
  content: "The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis toolkit consists of the MaSuRCA genome assembler, QuORUM error corrector for Illumina data, POLCA genome polishing software, Chromosome scaffolder, jellyfish mer counter, and MUMmer aligner."
  link: "doc/masurca/masurca.html"
- header: "{fas}`dna;pst-color-primary` medaka"
  content: "Medaka is a tool to create consensus sequences and variant calls from nanopore sequencing data."
  link: "doc/medaka/medaka.html"
- header: "{fas}`dna;pst-color-primary` megahit"
  content: "Megahit is an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph."
  link: "doc/megahit/megahit.html"
- header: "{fas}`dna;pst-color-primary` meme"
  content: "Meme is a collection of tools for the discovery and analysis of sequence motifs."
  link: "doc/meme/meme.html"
- header: "{fas}`dna;pst-color-primary` metaphlan"
  content: "Metaphlan is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution."
  link: "doc/metaphlan/metaphlan.html"
- header: "{fas}`dna;pst-color-primary` miniasm"
  content: "Miniasm is a very fast OLC-based de novo assembler for noisy long reads."
  link: "doc/miniasm/miniasm.html"
- header: "{fas}`dna;pst-color-primary` minimap2"
  content: "Minimap2 is a versatile pairwise aligner for genomic and spliced nucleotide sequences."
  link: "doc/minimap2/minimap2.html"
- header: "{fas}`dna;pst-color-primary` minipolish"
  content: "Minipolish is a tool for Racon polishing of miniasm assemblies."
  link: "doc/minipolish/minipolish.html"
- header: "{fas}`dna;pst-color-primary` mirdeep2"
  content: "miRDeep2 discovers active known or novel miRNAs from deep sequencing data (Solexa/Illumina, 454, ...)."
  link: "doc/mirdeep2/mirdeep2.html"
- header: "{fas}`dna;pst-color-primary` mirge3"
  content: "Mirge3 is an updated Python package to perform comprehensive analysis of small RNA sequencing data, including miRNA annotation, A-to-I editing, novel miRNA detection, isomiR analysis, visualization through IGV, processing Unique Molecular Identifiers (UMI), tRF detection and producing interactive graphical output."
  link: "doc/mirge3/mirge3.html"
- header: "{fas}`dna;pst-color-primary` mothur"
  content: "Mothur is an open source software package for bioinformatics data processing."
  link: "doc/mothur/mothur.html"
- header: "{fas}`dna;pst-color-primary` multiqc"
  content: "Multiqc is a reporting tool that parses summary statistics from results and log files generated by other bioinformatics tools."
  link: "doc/multiqc/multiqc.html"
- header: "{fas}`dna;pst-color-primary` nf-core-ampliseq"
  content: "nf-core/ampliseq is a bioinformatics analysis pipeline used for amplicon sequencing, supporting denoising of any amplicon and a variety of taxonomic databases for taxonomic assignment, including 16S, ITS, CO1 and 18S. Phylogenetic placement is also possible. Multiple region analysis such as 5R is implemented.
    Paired-end Illumina or single-end Illumina, PacBio and IonTorrent data are supported. The default is the analysis of 16S rRNA gene amplicons sequenced paired-end with Illumina."
  link: "doc/nf-core-ampliseq/nf-core-ampliseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-atacseq"
  content: "nf-core/atacseq is a bioinformatics analysis pipeline used for ATAC-seq data."
  link: "doc/nf-core-atacseq/nf-core-atacseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-bacass"
  content: "nf-core/bacass is a bioinformatics best-practice analysis pipeline for simple bacterial assembly and annotation. The pipeline is able to assemble short reads, long reads, or a mixture of short and long reads (hybrid assembly)."
  link: "doc/nf-core-bacass/nf-core-bacass.html"
- header: "{fas}`dna;pst-color-primary` nf-core-bamtofastq"
  content: "nf-core/bamtofastq is a bioinformatics best-practice analysis pipeline that converts (un)mapped .bam or .cram files into fq.gz files."
  link: "doc/nf-core-bamtofastq/nf-core-bamtofastq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-chipseq"
  content: "nf-core/chipseq is a bioinformatics analysis pipeline used for Chromatin ImmunoPrecipitation sequencing (ChIP-seq) data."
  link: "doc/nf-core-chipseq/nf-core-chipseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-denovotranscript"
  content: "nf-core/denovotranscript is a bioinformatics pipeline for de novo transcriptome assembly of paired-end short reads from bulk RNA-seq. It takes a samplesheet and FASTQ files as input, performs quality control (QC), trimming, assembly, redundancy reduction, pseudoalignment, and quantification. It outputs a transcriptome assembly FASTA file, a transcript abundance TSV file, and a MultiQC report with assembly quality and read QC metrics."
  link: "doc/nf-core-denovotranscript/nf-core-denovotranscript.html"
- header: "{fas}`dna;pst-color-primary` nf-core-detaxizer"
  content: "nf-core/detaxizer is a pipeline to assess raw (meta)genomic data for contamination and optionally filter reads which were classified as contamination. Default taxa classified as contamination are Homo and Homo sapiens."
  link: "doc/nf-core-detaxizer/nf-core-detaxizer.html"
- header: "{fas}`dna;pst-color-primary` nf-core-differentialabundance"
  content: "nf-core/differentialabundance is a bioinformatics pipeline that can be used to analyse data represented as matrices, comparing groups of observations to generate differential statistics and downstream analyses. The pipeline supports RNA-seq data such as that generated by the nf-core rnaseq workflow, and Affymetrix arrays via .CEL files."
  link: "doc/nf-core-differentialabundance/nf-core-differentialabundance.html"
- header: "{fas}`dna;pst-color-primary` nf-core-eager"
  content: "nf-core/eager is a scalable and reproducible bioinformatics best-practise processing pipeline for genomic NGS sequencing data, with a focus on ancient DNA (aDNA) data. It is ideal for the (palaeo)genomic analysis of humans, animals, plants, microbes and even microbiomes."
  link: "doc/nf-core-eager/nf-core-eager.html"
- header: "{fas}`dna;pst-color-primary` nf-core-fetchngs"
  content: "nf-core/fetchngs is a bioinformatics pipeline to fetch metadata and raw FastQ files from public databases. At present, the pipeline supports SRA / ENA / DDBJ / GEO ids."
  link: "doc/nf-core-fetchngs/nf-core-fetchngs.html"
- header: "{fas}`dna;pst-color-primary` nf-core-funcscan"
  content: "nf-core/funcscan is a bioinformatics best-practice analysis pipeline for the screening of nucleotide sequences such as assembled contigs for functional genes. It currently features mining for antimicrobial peptides, antibiotic resistance genes and biosynthetic gene clusters."
  link: "doc/nf-core-funcscan/nf-core-funcscan.html"
- header: "{fas}`dna;pst-color-primary` nf-core-hic"
  content: "nf-core/hic is a bioinformatics best-practice analysis pipeline for analysis of Chromosome Conformation Capture data (Hi-C)."
  link: "doc/nf-core-hic/nf-core-hic.html"
- header: "{fas}`dna;pst-color-primary` nf-core-mag"
  content: "nf-core/mag is a bioinformatics best-practise analysis pipeline for assembly, binning and annotation of metagenomes."
  link: "doc/nf-core-mag/nf-core-mag.html"
- header: "{fas}`dna;pst-color-primary` nf-core-metatdenovo"
  content: "nf-core/metatdenovo is a bioinformatics best-practice analysis pipeline for assembly and annotation of metatranscriptomic data, both prokaryotic and eukaryotic."
  link: "doc/nf-core-metatdenovo/nf-core-metatdenovo.html"
- header: "{fas}`dna;pst-color-primary` nf-core-methylseq"
  content: "nf-core/methylseq is a bioinformatics analysis pipeline used for Methylation (Bisulfite) sequencing data. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality-control on the results."
  link: "doc/nf-core-methylseq/nf-core-methylseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-nanoseq"
  content: "nf-core/nanoseq is a bioinformatics analysis pipeline for Nanopore DNA/RNA sequencing data that can be used to perform basecalling, demultiplexing, QC, alignment, and downstream analysis."
  link: "doc/nf-core-nanoseq/nf-core-nanoseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-nanostring"
  content: "nf-core/nanostring is a bioinformatics pipeline that can be used to analyze NanoString data. The performed analysis steps include quality control and data normalization."
  link: "doc/nf-core-nanostring/nf-core-nanostring.html"
- header: "{fas}`dna;pst-color-primary` nf-core-pangenome"
  content: "nf-core/pangenome is a bioinformatics best-practice analysis pipeline for pangenome graph construction. The pipeline renders a collection of sequences into a pangenome graph.
    Its goal is to build a graph that is locally directed and acyclic while preserving large-scale variation. Maintaining local linearity is important for interpretation, visualization, mapping, comparative genomics, and reuse of pangenome graphs."
  link: "doc/nf-core-pangenome/nf-core-pangenome.html"
- header: "{fas}`dna;pst-color-primary` nf-core-proteinfold"
  content: "nf-core/proteinfold is a bioinformatics best-practice analysis pipeline for Protein 3D structure prediction."
  link: "doc/nf-core-proteinfold/nf-core-proteinfold.html"
- header: "{fas}`dna;pst-color-primary` nf-core-raredisease"
  content: "nf-core/raredisease is a best-practice bioinformatic pipeline for calling and scoring variants from WGS/WES data from rare disease patients."
  link: "doc/nf-core-raredisease/nf-core-raredisease.html"
- header: "{fas}`dna;pst-color-primary` nf-core-rnafusion"
  content: "nf-core/rnafusion is a bioinformatics best-practice analysis pipeline for RNA sequencing consisting of several tools designed for detecting and visualizing fusion genes. Results from up to 5 fusion caller tools are created, and are also aggregated, most notably in a pdf visualisation document, a vcf data collection file, and html and tsv reports."
  link: "doc/nf-core-rnafusion/nf-core-rnafusion.html"
- header: "{fas}`dna;pst-color-primary` nf-core-rnaseq"
  content: "nf-core/rnaseq is a bioinformatics pipeline that can be used to analyse RNA sequencing data obtained from organisms with a reference genome and annotation. It takes a samplesheet and FASTQ files as input, performs quality control (QC), trimming and (pseudo-)alignment, and produces a gene expression matrix and extensive QC report."
  link: "doc/nf-core-rnaseq/nf-core-rnaseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-rnasplice"
  content: "nf-core/rnasplice is a bioinformatics pipeline for alternative splicing analysis of RNA sequencing data obtained from organisms with a reference genome and annotation."
  link: "doc/nf-core-rnasplice/nf-core-rnasplice.html"
- header: "{fas}`dna;pst-color-primary` nf-core-sarek"
  content: "nf-core/sarek is a workflow designed to detect variants on whole genome or targeted sequencing data. Initially designed for human and mouse, it can work on any species with a reference genome. Sarek can also handle tumour / normal pairs and could include additional relapses."
  link: "doc/nf-core-sarek/nf-core-sarek.html"
- header: "{fas}`dna;pst-color-primary` nf-core-scrnaseq"
  content: "nf-core/scrnaseq is a bioinformatics best-practice analysis pipeline for processing 10x Genomics single-cell RNA-seq data."
  link: "doc/nf-core-scrnaseq/nf-core-scrnaseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-smrnaseq"
  content: "nf-core/smrnaseq is a bioinformatics best-practice analysis pipeline for Small RNA-Seq."
  link: "doc/nf-core-smrnaseq/nf-core-smrnaseq.html"
- header: "{fas}`dna;pst-color-primary` nf-core-taxprofiler"
  content: "nf-core/taxprofiler is a bioinformatics best-practice analysis pipeline for taxonomic classification and profiling of shotgun short- and long-read metagenomic data. It allows for in-parallel taxonomic identification of reads or taxonomic abundance estimation with multiple classification and profiling tools against multiple databases, and produces standardised output tables to facilitate comparison of results between different tools and databases."
  link: "doc/nf-core-taxprofiler/nf-core-taxprofiler.html"
- header: "{fas}`dna;pst-color-primary` nf-core-viralrecon"
  content: "nf-core/viralrecon is a bioinformatics analysis pipeline used to perform assembly and intra-host/low-frequency variant calling for viral samples. The pipeline supports both Illumina and Nanopore sequencing data."
  link: "doc/nf-core-viralrecon/nf-core-viralrecon.html"
- header: "{fas}`dna;pst-color-primary` orthofinder"
  content: "Orthofinder is a fast, accurate and comprehensive platform for comparative genomics.
    It finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all of the gene duplication events in those gene trees. It also infers a rooted species tree for the species being analysed and maps the gene duplication events from the gene trees to branches in the species tree. OrthoFinder also provides comprehensive statistics for comparative genomic analyses."
  link: "doc/orthofinder/orthofinder.html"
- header: "{fas}`dna;pst-color-primary` pandaseq"
  content: "Pandaseq is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence."
  link: "doc/pandaseq/pandaseq.html"
- header: "{fas}`dna;pst-color-primary` parabricks"
  content: "NVIDIA's Clara Parabricks brings next generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google's DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic and RNA workflows."
  link: "doc/parabricks/parabricks.html"
- header: "{fas}`dna;pst-color-primary` pepper_deepvariant"
  content: "PEPPER is a genome inference module based on recurrent neural networks that enables long-read variant calling and nanopore assembly polishing in the PEPPER-Margin-DeepVariant pipeline."
  link: "doc/pepper_deepvariant/pepper_deepvariant.html"
- header: "{fas}`dna;pst-color-primary` petitefinder"
  content: "petiteFinder is an automated computer vision tool to compute Petite colony frequencies in baker's yeast."
  link: "doc/petitefinder/petitefinder.html"
- header: "{fas}`dna;pst-color-primary` picard"
  content: "Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF."
  link: "doc/picard/picard.html"
- header: "{fas}`dna;pst-color-primary` plink"
  content: "Plink is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner."
  link: "doc/plink/plink.html"
- header: "{fas}`dna;pst-color-primary` plink2"
  content: "Plink2 is a whole genome association analysis toolset."
  link: "doc/plink2/plink2.html"
- header: "{fas}`dna;pst-color-primary` polypolish"
  content: "Polypolish is a tool for polishing genome assemblies with short reads."
  link: "doc/polypolish/polypolish.html"
- header: "{fas}`dna;pst-color-primary` prokka"
  content: "Prokka is a software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files."
  link: "doc/prokka/prokka.html"
- header: "{fas}`dna;pst-color-primary` qiime2"
  content: "QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results."
  link: "doc/qiime2/qiime2.html"
- header: "{fas}`dna;pst-color-primary` qualimap"
  content: "Qualimap is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts."
  link: "doc/qualimap/qualimap.html"
- header: "{fas}`dna;pst-color-primary` r-bioinformatics"
  content: "RStudio is an integrated development environment (IDE) for the R statistical computation and graphics system."
  link: "doc/r-bioinformatics/r-bioinformatics.html"
- header: "{fas}`dna;pst-color-primary` r-scrnaseq"
  content: "RStudio is an integrated development environment (IDE) for the R statistical computation and graphics system."
  link: "doc/r-scrnaseq/r-scrnaseq.html"
- header: "{fas}`dna;pst-color-primary` r-shinyngs"
  content: "Shinyngs is an R package designed to facilitate downstream analysis of RNA-seq and similar expression data with various exploratory plots and data mining tools."
  link: "doc/r-shinyngs/r-shinyngs.html"
- header: "{fas}`dna;pst-color-primary` raven-assembler"
  content: "Raven-assembler is a de novo genome assembler for long uncorrected reads."
  link: "doc/raven-assembler/raven-assembler.html"
- header: "{fas}`dna;pst-color-primary` raxml-ng-mpi"
  content: "Raxml-ng is a phylogenetic tree inference tool which uses the maximum-likelihood (ML) optimality criterion."
  link: "doc/raxml-ng-mpi/raxml-ng-mpi.html"
- header: "{fas}`dna;pst-color-primary` relion"
  content: "RELION (for REgularised LIkelihood OptimisatioN) is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2D class averages in cryo-electron microscopy. It is developed in the research group of Sjors Scheres at the MRC Laboratory of Molecular Biology."
  link: "doc/relion/relion.html"
- header: "{fas}`dna;pst-color-primary` rmats2sashimiplot"
  content: "Rmats2sashimiplot produces a sashimiplot visualization of rMATS output."
  link: "doc/rmats2sashimiplot/rmats2sashimiplot.html"
- header: "{fas}`dna;pst-color-primary` rnaquast"
  content: "Rnaquast is a quality assessment tool for de novo transcriptome assemblies."
  link: "doc/rnaquast/rnaquast.html"
- header: "{fas}`dna;pst-color-primary` rosettafold2"
  content: "RoseTTAFold2 extends the original three-track architecture of RoseTTAFold over the full network, incorporating the concepts of Frame-aligned point error, recycling during training, and the use of a distillation set from AlphaFold2."
link: "doc/rosettafold2/rosettafold2.html" - header: "{fas}`dna;pst-color-primary` rosettafold2na" content: "RoseTTAFoldNA rapidly produces three-dimensional structure models with confidence estimates for protein–DNA and protein–RNA complexes." link: "doc/rosettafold2na/rosettafold2na.html" - header: "{fas}`dna;pst-color-primary` salmon" content: "Salmon is a tool for quantifying the expression of transcripts using RNA-seq data." link: "doc/salmon/salmon.html" - header: "{fas}`dna;pst-color-primary` samtools" content: "Samtools is a set of utilities for the Sequence Alignment/Map (SAM) format." link: "doc/samtools/samtools.html" - header: "{fas}`dna;pst-color-primary` scanpy" content: "Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata." link: "doc/scanpy/scanpy.html" - header: "{fas}`dna;pst-color-primary` scvelo" content: "Scvelo is a scalable toolkit for RNA velocity analysis in single cells." link: "doc/scvelo/scvelo.html" - header: "{fas}`dna;pst-color-primary` signalp6" content: "SignalP predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes." link: "doc/signalp6/signalp6.html" - header: "{fas}`dna;pst-color-primary` spaceranger" content: "Spaceranger is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images." link: "doc/spaceranger/spaceranger.html" - header: "{fas}`dna;pst-color-primary` spades" content: "Spades is an assembly toolkit containing various assembly pipelines." link: "doc/spades/spades.html" - header: "{fas}`dna;pst-color-primary` squid" content: "SQUID is designed to detect both fusion-gene and non-fusion-gene transcriptomic structural variations from RNA-seq alignment."
link: "doc/squid/squid.html" - header: "{fas}`dna;pst-color-primary` star" content: "STAR (Spliced Transcripts Alignment to a Reference) is an ultrafast universal RNA-seq aligner." link: "doc/star/star.html" - header: "{fas}`dna;pst-color-primary` subread" content: "Subread carries out high-performance read alignment, quantification and mutation discovery. It is a general-purpose read aligner which can be used to map both genomic DNA-seq reads and RNA-seq reads. It uses a new mapping paradigm called seed-and-vote to achieve fast, accurate and scalable read mapping. Subread automatically determines if a read should be globally or locally aligned, which makes it particularly powerful in mapping RNA-seq reads. It supports INDEL detection and can map reads with both fixed and variable lengths." link: "doc/subread/subread.html" - header: "{fas}`dna;pst-color-primary` tmhmm" content: "Tmhmm is used for prediction of transmembrane helices in proteins." link: "doc/tmhmm/tmhmm.html" - header: "{fas}`dna;pst-color-primary` transdecoder" content: "Transdecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks." link: "doc/transdecoder/transdecoder.html" - header: "{fas}`dna;pst-color-primary` trgt" content: "TRGT is a tool for targeted genotyping of tandem repeats from PacBio HiFi data. In addition to the basic size genotyping, TRGT profiles sequence composition, mosaicism, and CpG methylation of each analyzed repeat and provides visualization of reads overlapping the repeats." link: "doc/trgt/trgt.html" - header: "{fas}`dna;pst-color-primary` trim-galore" content: "Trim-galore is a wrapper tool that automates quality and adapter trimming of FastQ files."
link: "doc/trim-galore/trim-galore.html" - header: "{fas}`dna;pst-color-primary` trimmomatic" content: "Trimmomatic is a flexible read trimming tool for Illumina NGS data." link: "doc/trimmomatic/trimmomatic.html" - header: "{fas}`dna;pst-color-primary` trinity" content: "Trinity assembles transcript sequences from Illumina RNA-Seq data." link: "doc/trinity/trinity.html" - header: "{fas}`dna;pst-color-primary` trinotate" content: "Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms." link: "doc/trinotate/trinotate.html" - header: "{fas}`dna;pst-color-primary` trycycler" content: "Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes." link: "doc/trycycler/trycycler.html" - header: "{fas}`dna;pst-color-primary` vcftools" content: "VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files." link: "doc/vcftools/vcftools.html" - header: "{fas}`dna;pst-color-primary` viennarna" content: "Viennarna is a set of standalone programs and libraries used for prediction and analysis of RNA secondary structures." link: "doc/viennarna/viennarna.html"