Biostar9501110

Keep reads including/excluding variants from VCF

Usage

This program is now part of the main jvarkit tool. See jvarkit for compiling.

Usage: java -jar dist/jvarkit.jar biostar9501110  [options] Files

Usage: biostar9501110 [options] Files
  Options:
    --bamcompression
      Compression Level. 0: no compression. 9: max compression;
      Default: 5
    --buffer-size
      When we're looking for variants in a large VCF file, load the variants in 
      an interval of 'N' bases instead of doing a random access for each 
      variant. 
      Default: 1000
    -clip, --clip
      Search for variants in the clipped sections of reads
      Default: false
    -h, --help
      print help and exit
    --helpFormat
      What kind of help. One of [usage,markdown,xml].
    --inverse
      Inverse selection. Keep reads that do NOT contain any variant
      Default: false
    -m, --min-variants
      Find at least 'x' variants in each read
      Default: 1
    -o, --out
      Output file. Optional. Default: stdout
    -R, --reference
      Indexed fasta Reference file. This file must be indexed with samtools 
      faidx and with picard/gatk CreateSequenceDictionary or samtools dict
    --regions
      Limit analysis to this interval. A source of intervals. The following 
      suffixes are recognized: vcf, vcf.gz, bed, bed.gz, gtf, gff, gff.gz, 
      gtf.gz. Otherwise it could be an empty string (no interval) or a list of 
      plain intervals separated by '[ \t\n;,]'
    --samoutputformat
      Sam output format.
      Default: SAM
      Possible Values: [BAM, SAM, CRAM]
    --tag
      Add an attribute with this tag containing information about the 
      variants. Empty: ignore
      Default: XV
    --validation-stringency
      SAM Reader Validation Stringency
      Default: LENIENT
      Possible Values: [STRICT, LENIENT, SILENT]
  * -V, --variants, --vcf
      indexed vcf file
    --version
      print version and exit
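
The idea behind --buffer-size can be illustrated with a small sketch. This is not the tool's implementation (the tool is written in Java, see the source link below). The class name BufferedVariantSource, the single-contig model and the sorted list standing in for an indexed VCF are all assumptions made for illustration. The sketch only shows why loading a window of N bases saves index lookups when reads arrive in coordinate order.

```python
import bisect


class BufferedVariantSource:
    """Hypothetical helper: serve lookups from a cached window of `buffer_size` bases."""

    def __init__(self, positions, buffer_size=1000):
        self.positions = sorted(positions)  # stand-in for an indexed VCF (one contig)
        self.buffer_size = buffer_size
        self.window_start = None  # 1-based, inclusive
        self.window_end = None    # 1-based, inclusive
        self.window = []          # variant positions currently cached
        self.queries = 0          # number of simulated index queries

    def _query(self, start, end):
        """Simulate one random access returning the positions in [start, end]."""
        self.queries += 1
        lo = bisect.bisect_left(self.positions, start)
        hi = bisect.bisect_right(self.positions, end)
        return self.positions[lo:hi]

    def variants_in(self, start, end):
        """Return the variant positions overlapping the interval [start, end]."""
        outside = (
            self.window_start is None
            or start < self.window_start
            or end > self.window_end
        )
        if outside:
            # Reload a window of at least `buffer_size` bases starting at `start`.
            self.window_start = start
            self.window_end = max(end, start + self.buffer_size - 1)
            self.window = self._query(self.window_start, self.window_end)
        return [p for p in self.window if start <= p <= end]


source = BufferedVariantSource([120, 450, 990, 1500], buffer_size=1000)
# Four 70-base reads, sorted by start position.
hits = [source.variants_in(s, s + 69) for s in (100, 400, 950, 1450)]
```

In this toy run the four lookups are served by two simulated index queries: the first three reads fall inside the window loaded for the first one, and only the fourth triggers a reload.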

Keywords

  • sam
  • bam
  • vcf

See also in Biostars

Creation Date

20211210

Source code

https://github.com/lindenb/jvarkit/tree/master/src/main/java/com/github/lindenb/jvarkit/tools/biostar/Biostar9501110.java

Contribute

License

The project is licensed under the MIT license.

Citing

Should you cite biostar9501110? See https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md

The current reference is:

http://dx.doi.org/10.6084/m9.figshare.1425030

Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030

Example

java -jar dist/jvarkit.jar biostar9501110 -V src/test/resources/rotavirus_rf.freebayes.vcf.gz src/test/resources/S*.bam


@HD VN:1.6  GO:none SO:coordinate
@SQ SN:RF01 LN:3302
(...)
@CO biostar9501110. compilation:20211210154516 githash:efc3d5165 htsjdk:2.24.1 date:20211210155226. cmd:-V src/test/resources/rotavirus_rf.freebayes.vcf.gz src/test/resources/S1.bam src/test/resources/S2.bam src/test/resources/S3.bam src/test/resources/S4.bam src/test/resources/S5.bam
RF01_188_587_2:0:0_2:0:0_af 99  RF01    188 60  70M =   518 400 AGCTCTTAGTTGAATATAGCGATGTTATGGAGAATGCCACACTGTTGTCAATATTCTCGTACTCTTATGA  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S4 NM:i:2  AS:i:60 XS:i:0
RF01_198_667_4:0:0_1:0:0_49 99  RF01    198 60  70M =   598 470 AGAATATAGCGATGTTATGGAGAATGCCACACTGTTGTCAATATTCTCGAACTCTTATGATAAATATAAG  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S2 NM:i:4  AS:i:58 XS:i:0
RF01_198_667_4:0:0_1:0:0_49 99  RF01    198 60  70M =   598 470 AGAATATAGCGATGTTATGGAGAATGCCACACTGTTGTCAATATTCTCGAACTCTTATGATAAATATAAG  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S3 NM:i:4  AS:i:58 XS:i:0
RF01_201_685_1:0:0_0:0:0_a8 99  RF01    201 60  70M =   616 485 ATATAGCGATGTTATGGAGAATGCCACACTGTTGTCAATATTCTCGTACTCTTATGATAAATATAACGCT  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S4 NM:i:1  AS:i:65 XS:i:0
RF01_244_811_1:0:0_2:0:0_4c 163 RF01    244 60  70M =   742 568 TCGTACTCTTATGATAAATATAACGCTGTTGAAAGGCAATTAGTAAAATATGCAAAAGGTAAGCCGCTAG  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S5b    NM:i:1  AS:i:65 XS:i:0
RF01_257_807_2:0:0_0:0:0_2b 99  RF01    257 60  70M =   738 551 ATAAATATAACGCTGTTGAAAGGCAATTAGTAAAATATGCAAAAGGTAAGCCGGTAGAAGCAGATTTGAC  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S5 NM:i:2  AS:i:60 XS:i:0
RF01_314_833_2:0:0_0:0:0_84 99  RF01    314 60  70M =   764 520 AAGCAGATTTGAGAGTGAATGAGTTGGATTATGAAAATAACAAGATAACATCTGAACATTTCCCAACAGC  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S2 NM:i:2  AS:i:60 XS:i:0
RF01_314_833_2:0:0_0:0:0_84 99  RF01    314 60  70M =   764 520 AAGCAGATTTGAGAGTGAATGAGTTGGATTATGAAAATAACAAGATAACATCTGAACATTTCCCAACAGC  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S3 NM:i:2  AS:i:60 XS:i:0
RF01_329_808_2:0:0_0:0:0_b0 99  RF01    329 60  70M =   739 480 TGAATGAGTTGGATTATGAAAAAAACAAGATAACATCTGAACTTTTCCCAACAGCAGAGGAATAAACTGA  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S3 NM:i:2  AS:i:60 XS:i:0
RF01_329_808_2:0:0_0:0:0_b0 99  RF01    329 60  70M =   739 480 TGAATGAGTTGGATTATGAAAAAAACAAGATAACATCTGAACTTTTCCCAACAGCAGAGGAATAAACTGA  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S2 NM:i:2  AS:i:60 XS:i:0
RF01_350_917_6:0:0_1:0:0_a9 99  RF01    350 60  70M =   848 568 AAAACAAGAAAACATGTGAACTTTTCCGAACAGCAGAGGAATATACTGAATCATTTATGGATCCAGCAAT  2222222222222222222222222222222222222222222222222222222222222222222222  RG:Z:S2 NM:i:6  AS:i:43 XS:i:0
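
How --min-variants and --inverse decide which reads are written can be sketched as follows. This is a simplified model, not the code of Biostar9501110.java. It assumes that a read "contains" a variant when it carries the ALT base of a single-nucleotide variant at that position, and that reads are aligned without gaps or clipping (as with the 70M reads above). The names Snv, Read, count_variants and keep_read are hypothetical. Applying --inverse as the negation of the --min-variants test is also an assumption; with the default of 1 it matches the documented behaviour of keeping reads that do not contain any variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snv:
    contig: str
    pos: int  # 1-based position on the reference
    alt: str  # single-base ALT allele


@dataclass(frozen=True)
class Read:
    name: str
    contig: str
    start: int  # 1-based leftmost aligned position
    seq: str    # aligned bases, assumed ungapped (e.g. CIGAR 70M)


def count_variants(read, variants):
    """Count SNVs whose ALT base is present in the read at the variant position."""
    n = 0
    for v in variants:
        if v.contig != read.contig:
            continue
        offset = v.pos - read.start
        if 0 <= offset < len(read.seq) and read.seq[offset] == v.alt:
            n += 1
    return n


def keep_read(read, variants, min_variants=1, inverse=False):
    """Model of --min-variants and --inverse under the assumptions stated above."""
    carries = count_variants(read, variants) >= min_variants
    return carries != inverse  # --inverse flips the decision


variants = [Snv("RF01", 5, "T"), Snv("RF01", 8, "G")]
reads = [
    Read("r1", "RF01", 1, "AAAATAAGAA"),  # ALT base at positions 5 and 8
    Read("r2", "RF01", 1, "AAAATAAAAA"),  # ALT base at position 5 only
    Read("r3", "RF01", 1, "AAAAAAAAAA"),  # no ALT base
]

kept_default = [r.name for r in reads if keep_read(r, variants)]
kept_two = [r.name for r in reads if keep_read(r, variants, min_variants=2)]
kept_inverse = [r.name for r in reads if keep_read(r, variants, inverse=True)]
```

With these toy reads, the default settings keep r1 and r2, a minimum of two variants keeps only r1, and the inverse selection keeps only r3.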

See also

  • BamPhased01