JVARKIT

Author : Pierre Lindenbaum PhD. Institut du Thorax. Nantes. France.
Version : 8ebd2be2
Compilation : 20240424122145
Github : https://github.com/lindenb/jvarkit
Issues : https://github.com/lindenb/jvarkit/issues

Usage

  java -jar jvarkit.jar [options]

or

  java -jar jvarkit.jar <command name> (other arguments)

Options

  • --help show this screen
  • --help-all show all commands, including the private ones.
  • --version print version
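
The two invocation forms above can also be driven from a script. The following is a minimal Python sketch, not part of jvarkit: the helper names `jvarkit_argv` and `run_jvarkit` are hypothetical, and it assumes that `java` is on the PATH and that `jvarkit.jar` sits in the working directory. Only the `--version` option and the `<command name>` form shown above are used; tool-specific arguments are left to the caller.

```python
import subprocess
from typing import List, Sequence


def jvarkit_argv(command: str = "", args: Sequence[str] = (),
                 jar: str = "jvarkit.jar") -> List[str]:
    """Build the argument vector for one jvarkit invocation.

    `command` is an optional tool name (for example "vcfhead");
    `args` are passed through unchanged. The jar path is an assumption.
    """
    argv = ["java", "-jar", jar]
    if command:
        argv.append(command)
    argv.extend(args)
    return argv


def run_jvarkit(command: str = "", args: Sequence[str] = (),
                jar: str = "jvarkit.jar") -> str:
    """Run jvarkit and return whatever it wrote to standard output."""
    done = subprocess.run(jvarkit_argv(command, args, jar),
                          capture_output=True, text=True)
    return done.stdout


# First form: java -jar jvarkit.jar [options]
print(jvarkit_argv(args=["--version"]))
# Second form: java -jar jvarkit.jar <command name> (other arguments)
print(jvarkit_argv("vcfhead"))
```

Keeping the argument-vector builder separate from the call to `subprocess.run` makes it possible to log or inspect the exact command before running it.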

Compilation Installation

Please, read how to run and install jvarkit

Tools

BAM Visualization

Tool Description Creation Update
bam2raster BAM to raster graphics
bam2svg BAM to Scalable Vector Graphics (SVG) 20141013 20210728
biostar139647 Convert alignment in Fasta/Clustal format to SAM/BAM file
biostar145820 subsample/shuffle BAM to fixed number of alignments. 20150615 20211005
lowresbam2raster Low Resolution BAM to raster graphics 20170523 20211126
mkminibam Creates an archive of small bams with only a few regions. 20190410 20221019
plotsashimi Print Sashimi plots from Bam 20191117 20191104
prettysam Pretty SAM alignments 20171215 20211105
sv2svg BAM to SVG. Used to display structural variations. 20181115 20230505
wgscoverageplotter Whole genome coverage plotter 20201125 20230505

CNV/SV

Tool Description Creation Update
bammatrix Bam matrix, inspired from 10x/loupe 20190620 20211206
cnvtview Text visualization of bam DEPTH for multiple regions in a terminal 20181018 20210412
coverageplotter Display an image of depth to reveal any anomaly in a set of intervals + BAMs 20200605 20221125
indexcov2vcf convert indexcov data to vcf 20200528 20400313
samfindclippedregions Find clipped positions in one or more bam. 20140228 20220329
swingindexcov indexcov visualization 2020511 2020512
vcfstrech2svg another VCF to SVG 20210304 20210309
wescnvsvg SVG visualization of bam DEPTH for multiple regions 20180726 20210726

Functional prediction

Tool Description Creation Update
backlocate Mapping a mutation on a protein back to the genome. 20140619 20190820
groupbygene Group VCF data by gene/transcript. By default it uses data from VEP, SnpEff 20131209 20220529

BED Manipulation

Tool Description Creation Update
bedcluster Clusters a BED file into a set of BED files. 20200130 20220914
bedmergecnv Merge continuous sorted bed records if they overlap a fraction of their lengths. 20200330 20200603
bednonoverlappingset Split a Bed file into non-overlapping data set. 20180607 20200408
bedrenamechr Convert the names of the chromosomes in a Bed file 20190503
setfiletools Utilities for the setfile format 20210125 20220426

Biostars

Tool Description Creation Update
biostar103303 Calculate Percent Spliced In (PSI).
biostar105754 bigwig : peak distance from specific genomic region 20140708 20220110
biostar165777 Split a XML file 20151114 20151114
biostar170742 convert sam format to axt Format 20151228 20210412
biostar172515 Convert BAI to XML
biostar173114 make a bam file smaller by removing unwanted information see also https://www.biostars.org/p/173114/
biostar175929 Construct a combination set of fasta sequences from a vcf 20160208 20211012
biostar178713 split bed file into several bed files where each region is separated from any other by N bases 20160226 20200818
biostar214299 Extract allele specific reads from bamfiles 20160930 20220420
biostar234081 convert extended CIGAR to regular CIGAR ('X','=' -> 'M') 20170130 20200409
biostar234230 Sliding Window : discriminate partial and fully contained fragments (from a bam file) 20190417
biostar251649 Annotating the flanking bases of SNPs in a VCF file 20170508 20200213
biostar322664 Extract PE Reads (with their mates) supporting variants in vcf file
biostar332826 Fast Extraction of Variants from a list of IDs 20180817 20210412
biostar336589 displays circular map as SVG from BED and REF file 20180907 20210818
biostar352930 Fills the empty SEQ() and QUAL() in a bam file using the reads with the same name carrying this information.
biostar398854 Extract every CDS sequences from a VCF file 20190916 20240418
biostar404363 introduce artificial mutation SNV in bam 20191023 20191024
biostar480685 paired-end bam clip bases outside insert range 20201223 20200220
biostar489074 call variants for every paired overlapping read 20200205 20210412
biostar497922 Split VCF into separate VCFs by SNP count 20210319 20210319
biostar59647 SAM/BAM to XML 20131112 20190926
biostar76892 fix strand of two paired reads close but on the same strand.
biostar77288 Low resolution sequence alignment visualization
biostar77828 Divide the human genome among X cores, taking into account gaps
biostar78285 Extract BAMs coverage as a VCF file.
biostar81455 Defining precisely the exonic genomic context based on a position. 20130918 20200603
biostar84452 remove clipped bases from a BAM file
biostar84786 Matrix transposition
biostar86363 Set genotype of specific sample/genotype comb to unknown in multisample vcf file. See http://www.biostars.org/p/86363/
biostar86480 Genomic restriction finder 20131114 20220426
biostar90204 Bam version of linux split.
biostar9462889 Extracting reads from a regular expression in a bam file 20210402 20210402
biostar9469733 Extract reads mapped within chosen intronic region from BAM file 20210511 20210511
biostar9501110 Keep reads including/excluding variants from VCF 20211210 20211213
biostar9556602 Filtering of tricky overlapping sites in VCF
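
As an illustration of one entry in the table above, the operation described for biostar234081 (extended CIGAR to regular CIGAR, 'X','=' -> 'M') can be sketched in a few lines of Python. This is not the tool's implementation: the function name `to_regular_cigar` is hypothetical, and merging adjacent runs that end up with the same operator is an assumption of this sketch rather than documented behaviour.

```python
import re
from typing import List, Tuple

# One CIGAR element: a length followed by a SAM operator.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def to_regular_cigar(cigar: str) -> str:
    """Replace '=' and 'X' by 'M', merging adjacent runs of the same operator."""
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return cigar
    merged: List[Tuple[int, str]] = []
    pos = 0
    for match in _CIGAR_OP.finditer(cigar):
        if match.start() != pos:
            raise ValueError(f"malformed CIGAR: {cigar!r}")
        pos = match.end()
        length, op = int(match.group(1)), match.group(2)
        if op in "=X":
            op = "M"
        if merged and merged[-1][1] == op:
            # Same operator as the previous element: extend it.
            merged[-1] = (merged[-1][0] + length, op)
        else:
            merged.append((length, op))
    if pos != len(cigar) or not merged:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return "".join(f"{length}{op}" for length, op in merged)


print(to_regular_cigar("5=1X4="))      # 10M
print(to_regular_cigar("3S5=1X2I4="))  # 3S6M2I4M
```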

Deprecated/barely used

Tool Description Creation Update
addlinearindextobed Use a Sequence dictionary to create a linear index for a BED file. Can be used as an X-axis for a chart. 20140201 20230126
bam2sql Convert a SAM/BAM to sqlite statements 20160414 20160414
bam2xml converts a BAM to XML 20130506 20210315

Pubmed

Tool Description Creation Update
pubmed404 Test if URLs in the pubmed abstracts are reachable. 20181210 20200204
pubmedcodinglang Programming language use distribution from recent programs / articles 20170404 20200223
pubmeddump Dump XML results from pubmed/Eutils 20140805 20200204
pubmedgender Add gender-related attributes in the Author tag of pubmed xml.
pubmedgraph Creates a Gephi-gexf graph of references-cites for a given PMID 20150605 20200220

GTF/GFF Manipulation

Tool Description Creation Update
gtf2bed Convert GTF/GFF3 to BED. 20220629 20220630
gtf2xml Convert GTF/GFF to XML 20150811 20230512

Utilities

Tool Description Creation Update
goutils Gene Ontology Utils. Retrieves terms from Gene Ontology 20180130 20211020
ncbitaxonomy2xml Dump NCBI taxonomy tree as a hierarchical XML document or as a table 20120320 20240320
oboutils OBO Ontology Utils. 20230105 20230105
ukbiobanksamples Select samples from ukbiobank 20210705 20220322
uniprot2svg plot uniprot to SVG 20220608 20220922
xsltstream XSLT transformation for large XML files. xslt is only applied on a given subset of nodes. 20190222

Unclassified

Tool Description Creation Update
bamliftover Lift-over a BAM file.
barcodegenerator Barcode generator for EricCharp 20230629 20230629
bedremovebed Remove bed file from each record of input bed file. Output is a SETFILE 20221210 20221210
bigwigmerge merge several Bigwig files using different descriptive statistics (mean, median, etc..) 20240417 20240417
cnvvalidatorserver Review files generated by coverageplotter 20220818 20220826
convertliftoverchain Convert the contigs in a liftover chain to match another REFerence. (eg. to remove chr prefix, unknown chromosomes etc...) 20190409 20190409
coverageserver Jetty Based http server serving Bam coverage. 20200212 20200330
evadumpfiles Dump files locations from European Variation Archive 20230314 20230314
fastqshuffle Shuffle Fastq files 20140901 20240129
gff3upstreamorf Takes a ucsc genpred file, scans the 5' UTRs and generates a GFF3 containing upstream-ORF. Inspired from https://github.com/ImperialCardioGenetics/uORFs 20220724 20230820
gtexrs2qtl extract gtex eqtl data from a list of RS 20230215 20240225
gtfliftover LiftOver GTF file. 20190823 20190823
gtfretrocopy Scan retrocopies by comparing the gtf/intron and the deletions in a VCF 20190813 20191104
htsfreemarker Apply Freemarker to VCF/BAM/JSON files. 20230616 20230616
illuminadir Create a structured (JSON or XML) representation of a directory containing some Illumina FASTQs. 20131021 20180717
kg2bed converts UCSC knownGenes file to BED. 20140311 20230815
kg2fa convert ucsc genpred to fasta 20190213 20230815
kg2gff Convert UCSC genpred file to gff3 20210106 20230817
knownretrocopy Annotate VCF structural variants that could be intron from retrocopies. 20190815 20230817
ngsfilessummary Scan folders and generate a summary of the files (SAMPLE/BAM SAMPLE/VCF etc..). Useful to get a summary of your samples. 20140430 20240324
optimizefisher Optimize fisher test on VCF using genetic algo 20221013 20240207
pubmedmap Use Pubmed Author's Affiliation to map the authors in the world. 20160426
repairfastq Join single end reads to paired end 20240128 20240128
sam2json Convert a SAM input to JSON 20210402 20210315
sam4weblogo Sequence logo for different alleles or generated from SAM/BAM 20130524 20191014
samjdk Filters a BAM using a java expression compiled in memory. 20170807 20191119
scanlabguru scan the files stored in labguru 20240325 20240325
sortvcfoninfo Sort a VCF on a field in the INFO column 20140218 20201204
sv2fasta convert VCF of structural variant(s) to fasta for pggb 20230403 20230403
tssenrich Transcription Start Site (TSS) Enrichment Score calculation 20240130 20240206
vcf2bam vcf to bam 20150612 20211022
vcf2xml Convert VCF to XML 20230822
vcfburdenfisherh Fisher Case /Controls per Variant 20160418 20200713
vcfburdenslidingwindow apply fisher test on VCF using a sliding window 20190920 20231213
vcffilterbyliftover Add FILTER(s) to a variant when it is known to map elsewhere after liftover. 20190418 20210603
vcfgatkeval Eval/Plot gatk INFO tags for filtering 20230424 20240321
vcfgroupbypop create INFO data by population 20190319 20230712
vcfpeekvcf Get the INFO from a VCF and use it for another VCF 20150521 20240405
vcfscanupstreamorf Scan BAM for upstream-ORF. Inspired from https://github.com/ImperialCardioGenetics/uORFs 20190218 20200804
vcfserver Web Server displaying VCF file. A web interface for vcf2table 20171027 20220517
vcfspliceai Annotate VCF with SpliceAI web service 20201107 20201107
vcftbi2bed extracts BED for each contig in a tabix-indexed VCF peeking first and last variant for each chromosome. 20230214 20230214
wib2bedgraph Extract Wib files to bedgraph or wig 20230819 20230819

VCF Manipulation

Tool Description Creation Update
bioalcidaejdk java-based version of awk for bioinformatics 20170712 20210412
biostar130456 Split individual VCF files from multisamples VCF file 20150210 20200603
builddbsnp Build a DBSNP file from different sources for GATK 20200904 2021070726
findavariation Finds a specific mutation in a list of VCF files 20140623 20200217
findgvcfsblocks Find common blocks of callable regions from a set of gvcfs 20210806 20220401
mantamerger Merge Vcf from Manta VCF. 20190916 20230320
minicaller Simple and Stupid Variant Caller designed for @AdrienLeger2 201500306 20220705
swingvcfjexl Filter VCF using Java Swing UI and JEXL/Javascript expression 20220413 20220414
swingvcfview VCFviewer using Java Swing UI 20210503 20210503
vcf2intervals split a vcf to interval or bed for parallelization 20211112 20221128
vcf2table convert a vcf to a table, to ease display in the terminal 20170511 20220507
vcfallelebalance Insert missing allele balance annotation using FORMAT:AD 20180829 20200805
vcfancestralalleles Annotate a VCF with its ancestral allele. Data from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/supporting/ancestral_alignments/human_ancestor_GRCh37_e59.README 20180418 20220126
vcfbigbed Annotate a VCF with values from a bigbed file 20220107 20220107
vcfbigwig Annotate a VCF with values from a bigwig file 20200506 20230819
vcfburdenmaf MAF for Cases / Controls 20160418 202000713
vcfcadd Annotate VCF with Combined Annotation Dependent Depletion (CADD) (Kircher & al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014 Feb 2. doi: 10.1038/ng.2892. PubMed PMID: 24487276). 20140218 20220119
vcfcombinetwosnvs Detect Mutations that are the consequences of two distinct variants. This kind of variant might be ignored/skipped from classical variant consequence predictor. Idea from @SolenaLS and then @AntoineRimbert 20160215 20200425
vcfcomposite (in development) Finds Variants involved in a Het Compound Disease 20170331 20200210
vcfconcat Concatenate VCFs with same sample. See also bcftools concat 20131230 20240426
vcfdistancevariants Annotate variants with the distance between previous and next variant. 20190410 20230510
vcffiltergenes Filter VEP/SnpEff Output from a list of genes. 20160322 20230505
vcffiltergtf Filter VCF on GTF 20230703 20230704
vcffilterjdk Filtering VCF with dynamically-compiled java expressions 20170705 20220830
vcffilterso Filter a VCF file annotated with SNPEff or VEP with terms from Sequence-Ontology. Reasoning : Children of user's SO-terms will be also used. 20170331 20200924
vcfflatten Flatten variants to one variant 20230222 20230222
vcfgenesplitter Split VCF+VEP by gene/transcript. 20160310 202220531
vcfgnomad Peek annotations from gnomad 20170407 20231103
vcfgnomadsv Peek annotations from gnomad structural variants 20190814 20211109
vcfgrantham add grantham score from annotated VCF variant 20230503 20230503
vcfhead print the first variants of a vcf 20131210 20200518
vcfmulti2oneinfo 'one variant with INFO with N values' to 'N variants with one INFO' 20260106 20230524
vcfpar Flag human sexual regions excluding PAR. 20200908 20200908
vcfpeekaf Peek the AF from another VCF 20200624 20200904
vcfphased01 X10 Phased SVG to Scalable Vector Graphics (SVG) 20190710 20230818
vcfpolyx Number of repeated REF bases around POS. 20200930 20230526
vcfrebase Restriction sites overlapping variations in a vcf 20131115 20200624
vcfregulomedb Annotate a VCF with the Regulome2 data (https://regulomedb.org/) 20140709 20230512
vcfsetdict Set the ##contig lines in a VCF header on the fly 20140105 20210201
vcfshuffle Shuffle a VCF 20131210 20200818
vcfsplitnvariants Split VCF to 'N' VCF files 202221122 202221201
vcfspringfilter Uses the java spring Framework to build complex vcf filters 20230526 20230526
vcfstats Produce VCF statistics 20131212 20230707
vcfsvannotator SV Variant Effect prediction using gtf, gnomad, etc 20190815 20230509
vcftail print the last variants of a vcf 20131210 20200518
vcftrio Find mendelian incompatibilities / denovo variants in a VCF 20130705 20200624

Retrocopy

Tool Description Creation Update
scanretrocopy Scan BAM for retrocopies 20190125 20230818
starretrocopy Scan retrocopies from the star-aligner/bwa output 20190710 20191008

BAM Manipulation

Tool Description Creation Update
bam2haplotypes Reconstruct SNP haplotypes from reads 20211015 20211020
bamphased01 Extract Reads from a SAM/BAM file supporting at least two variants in a VCF file. 20210218 20210218
bamrenamechr Convert the names of the chromosomes in a BAM file 20131217 20191210
bamstats04 Coverage statistics for a BED file. 20130513 20191003
bamstats05 Coverage statistics for a BED file, group by gene 20151012 20210317
bamwithoutbai Query a Remote BAM without bai 20191213 20191217
basecoverage 'Depth of Coverage' per base. 20220420 20220420
bioalcidaejdk java-based version of awk for bioinformatics 20170712 20210412
biostar154220 Cap BAM to a given coverage 20150812 20210312
biostar9566948 Trim Reads So Only First Base Remains 20230621 20230621
findallcoverageatposition Find depth at specific position in a list of BAM files. My colleague Estelle asked: in all the BAM we sequenced, can you give me the depth at a given position ? 20141128 20210818
sam2tsv Prints the SAM alignments as a TAB delimited file. 20170712 20210304
samgrep grep read-names in a bam file 20130506 20210726
samrmdupnames remove duplicated names in sorted BAM 20221207 20221207
samviewwithmate Extract reads within given region(s), and their mates 20190207 20191004
sortsamrefname Sort a BAM on chromosome/contig and then on read/query name 20150812 20210312
swingbamcov Bam coverage viewer using Java Swing UI 20210420 20220513
swingbamview Read viewer using Java Swing UI 20220503 20230427
texbam Write text in a bam. Mostly for fun... 20220708 20220708