VirSorter

VirSorter2

Updated 2/25/2021
To use:
module use /mod/scgc/
module load anaconda3/2019.07
source activate vs2
For instructions on how to run:
virsorter run --help
All underlying databases are already installed, so no further setup should be needed.
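A typical invocation might look like the sketch below. The `-i` (input FASTA), `-w` (working/output directory), and `-j` (threads) flags and the trailing `all` target come from the upstream VirSorter2 documentation, not from this page, so confirm them against `virsorter run --help`. The function name `run_vs2` and the file names are hypothetical placeholders.

```shell
# Sketch only: run after `source activate vs2`.
# Flags are taken from upstream VirSorter2 docs (an assumption here);
# check `virsorter run --help` before relying on them.
run_vs2() {
    # $1 = input FASTA, $2 = output directory, $3 = number of threads
    # If DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} virsorter run -i "$1" -w "$2" -j "$3" all
}

# Example (hypothetical file names):
#   DRY_RUN=1 run_vs2 contigs.fa vs2_out 4   # print the command only
#   run_vs2 contigs.fa vs2_out 4             # actually run it
```

Printing the command first with `DRY_RUN=1` is a cheap way to check paths before committing cluster time.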

Original VirSorter

module use /mod/scgc/
module unload anaconda
module load anaconda3/2019.07
module load virsorter
module load mga

source activate virsorter
For instructions:
julia at c1 in /mnt/scgc_nfs/lab/julia/virsorter_tests
$ wrapper_phage_contigs_sorter_iPlant.pl --help
Usage:
    wrapper_phage_contigs_sorter_iPlant.pl -f sequences.fa

Required Arguments:

    -f|--fna       Fasta file of contigs

Options:

    -d|--dataset   Code dataset (DEFAULT "VIRSorter")
    --cp           Custom phage sequence
    --db           Either "1" (DEFAULT Refseqdb) or "2" (Viromedb)
    --wdir         Working directory (DEFAULT cwd)
    --ncpu         Number of CPUs (default: 4)
    --virome       Add this flag to enable virome decontamination mode, for datasets
                   mostly viral to force the use of generic metrics instead of
                   calculated from the whole dataset. (default: off)
    --data-dir     Path to "virsorter-data" directory (e.g. /path/to/virsorter-data)
    --diamond      Use diamond (in "--more-sensitive" mode) instead of blastp.
                   Diamond is much faster than blastp and may be useful for adding
                   many custom phages, or for processing extremely large Fasta files.
                   Unless you specify --diamond, VirSorter will use blastp.
    --keep-db      Specifying this flag keeps the new HMM and BLAST databases created
                   after adding custom phages. This is useful if you have custom phages
                   that you want to be included in several different analyses and want
                   to save the database and point VirSorter to it in subsequent runs.
                   By default, this is off, and you should only specify this flag if
                   you're SURE you need it.
    --help         Show help and exit
If VirSorter cannot find its databases, pass the data directory explicitly with --data-dir. Use this one, which was the latest version of the database as of May 2018:
/mnt/scgc_nfs/ref/virsorter/virsorter-data/
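Putting the flags from the help text together with that data directory, a run could be wrapped as in the following sketch. The function name `run_virsorter`, the file names, and the choice of `--db 1` (the RefSeq default) are illustrative assumptions; only the flags and the database path come from this page.

```shell
# Database path as given on this page.
VIRSORTER_DATA=/mnt/scgc_nfs/ref/virsorter/virsorter-data/

run_virsorter() {
    # $1 = contigs FASTA, $2 = working directory, $3 = number of CPUs
    # If DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} wrapper_phage_contigs_sorter_iPlant.pl \
        -f "$1" --db 1 --wdir "$2" --ncpu "$3" \
        --data-dir "$VIRSORTER_DATA"
}

# Example (hypothetical file names):
#   run_virsorter contigs.fa virsorter_out 4
```

For datasets that are mostly viral, the help text suggests adding `--virome`, and `--db 2` selects the Viromedb reference instead.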

Vibrant

module use /mod/scgc/
module load anaconda3/2019.07
source activate vibrant

VIBRANT_run.py --help

usage: VIBRANT_run.py [-h] [--version] -i I [-f {prot,nucl}] [-folder FOLDER]
                      [-t T] [-l L] [-o O] [-virome] [-no_plot] [-d D] [-m M]

Usage: VIBRANT_run.py -i <input_file> [options]. VIBRANT identifies bacterial
and archaeal viruses (phages) from assembled metagenomic scaffolds or whole
genomes, including the excision of integrated proviruses. VIBRANT also
performs curation of identified viral scaffolds, estimation of viral genome
completeness and analysis of viral metabolic capabilities.

optional arguments:
  -h, --help      show this help message and exit
  --version       show program's version number and exit
  -i I            input fasta file
  -f {prot,nucl}  format of input [default="nucl"]
  -folder FOLDER  path to deposit output folder and temporary files, will
                  create if doesn't exist [default= working directory]
  -t T            number of parallel VIBRANT runs, each occupies 1 CPU
                  [default=1, max of 1 CPU per scaffold]
  -l L            length in basepairs to limit input sequences [default=1000,
                  can increase but not decrease]
  -o O            number of ORFs per scaffold to limit input sequences
                  [default=4, can increase but not decrease]
  -virome         use this setting if dataset is known to be comprised mainly
                  of viruses. More sensitive to viruses, less sensitive to
                  false identifications [default=off]
  -no_plot        suppress the generation of summary plots [default=off]
  -d D            path to original "databases" directory that contains .HMM
                  files (if moved from default location)
  -m M            path to original "files" directory that contains .tsv and
                  model files (if moved from default location)
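Based on the help output above, a basic nucleotide run only needs `-i`; the sketch below also sets the output folder and the number of parallel runs. The function name `run_vibrant` and the file names are hypothetical, and the sketch assumes the environment's default database locations are valid, so `-d` and `-m` are left out.

```shell
# Sketch only: run after `source activate vibrant`.
run_vibrant() {
    # $1 = input FASTA (nucleotide), $2 = output folder, $3 = parallel runs
    # If DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} VIBRANT_run.py -i "$1" -folder "$2" -t "$3"
}

# Example (hypothetical file names):
#   run_vibrant contigs.fa vibrant_out 4
```

Per the help text, each parallel run uses one CPU, so `-t` should not exceed the cores requested for the job.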

DeepVirFinder

module use /mod/scgc/
module load deepvirfinder/1.0
module load anaconda3/2019.07
source activate dvf

dvf.py --help
Usage: dvf.py [options]

Options:
  -h, --help            show this help message and exit
  -i INPUT_FA, --in=INPUT_FA
                        input fasta file
  -m MODDIR, --mod=MODDIR
                        model directory (default ./models)
  -o OUTPUT_DIR, --out=OUTPUT_DIR
                        output directory
  -l CUTOFF_LEN, --len=CUTOFF_LEN
                        predict only for sequence >= L bp (default 1)
  -c CORE_NUM, --core=CORE_NUM
                        number of parallel cores (default 1)
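A minimal sketch of a DeepVirFinder run using only the options listed in the help text follows. The function name `run_dvf`, the file names, and the 1000 bp cutoff are illustrative assumptions. The default model directory is the relative path `./models`, so `-m` presumably needs to be passed when running from a different directory; the model location on the cluster is not given on this page, so it is left out here.

```shell
# Sketch only: run after `source activate dvf`.
run_dvf() {
    # $1 = input FASTA, $2 = output directory,
    # $3 = minimum sequence length in bp, $4 = number of cores
    # If DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} dvf.py -i "$1" -o "$2" -l "$3" -c "$4"
}

# Example (hypothetical file names and cutoff):
#   run_dvf contigs.fa dvf_out 1000 4
```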

CheckV

module use /mod/scgc/
module load anaconda3/2019.07
source activate checkv

checkv end_to_end --help
Run full pipeline to estimate completeness, contamination, and identify closed genomes

usage: checkv end_to_end <input> <output> [options]

positional arguments:
  input       Input nucleotide sequences in FASTA format
  output      Output directory

optional arguments:
  -h, --help  show this help message and exit
  -d PATH     Reference database path. By default the CHECKVDB environment variable is used
  -t INT      Number of threads to use for Prodigal and DIAMOND
  --restart   Overwrite existing intermediate files. By default CheckV continues where program left off
  --quiet     Suppress logging messages
You will need to specify the database location with -d. On Charlie, it is located at:
/mnt/scgc/scgc_nfs/ref/checkv/checkv-db-v1.0/
An example command would be:
checkv end_to_end <in_fasta> <out_directory> -t <number_of_cores> -d /mnt/scgc/scgc_nfs/ref/checkv/checkv-db-v1.0/
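Since the help text says CheckV falls back to the `CHECKVDB` environment variable when `-d` is omitted, an alternative is to export the path once per session. This is a sketch under that assumption; the function name `run_checkv` and the file names are hypothetical.

```shell
# Database path as given on this page; exported so -d can be omitted.
export CHECKVDB=/mnt/scgc/scgc_nfs/ref/checkv/checkv-db-v1.0/

run_checkv() {
    # $1 = input FASTA, $2 = output directory, $3 = number of threads
    # If DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} checkv end_to_end "$1" "$2" -t "$3"
}

# Example (hypothetical file names):
#   run_checkv viral_contigs.fa checkv_out 8
```

Passing `-d` explicitly, as in the example command, still works and takes the guesswork out of which database a given run used.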

VContact

module use /mod/scgc/
module load anaconda3/2019.07
source activate vContact2

vcontact2 --help
Suggested parameters:
--db 'ProkaryoticViralRefSeq94-Merged' --pcs-mode MCL --vcs-mode ClusterONE --c1-bin /mnt/scgc/scgc_nfs/opt/common/clusterone/1.0/cluster_one-1.0.jar
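The suggested parameters do not form a complete command on their own. The sketch below combines them with the input and output flags (`--raw-proteins`, `--proteins-fp`, `--output-dir`) described in the upstream vConTACT2 documentation; those three flags are not shown on this page, so treat them as assumptions and confirm them with `vcontact2 --help`. The function name `run_vcontact2` and the file names are hypothetical.

```shell
# ClusterONE jar path as given on this page.
C1_JAR=/mnt/scgc/scgc_nfs/opt/common/clusterone/1.0/cluster_one-1.0.jar

run_vcontact2() {
    # $1 = protein FASTA, $2 = gene-to-genome mapping file, $3 = output dir
    # Input/output flags are assumed from upstream docs; verify with --help.
    # If DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} vcontact2 \
        --raw-proteins "$1" --proteins-fp "$2" \
        --db 'ProkaryoticViralRefSeq94-Merged' \
        --pcs-mode MCL --vcs-mode ClusterONE \
        --c1-bin "$C1_JAR" --output-dir "$3"
}

# Example (hypothetical file names):
#   run_vcontact2 proteins.faa gene_to_genome.csv vcontact2_out
```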