GenBlast Module Documentation

GenBlast identifies homologous gene sequences in genomic databases. It is built for comparative genomics tasks and can identify homologs even when the sequences have diverged substantially, which makes it useful for studying gene evolution, gene families, and gene function across species. GenBlast is available as a standalone command-line tool and is also used as a step in bioinformatics pipelines.

She, R., Chu, J.S., Uyar, B., Wang, J., Wang, K., and Chen, N. (2011). GenBlastA: enabling BLAST to identify homologous gene sequences. Genome Res., 21(5): 936-949.

ensembl.tools.anno.protein_annotation.genblast.run_genblast(masked_genome: Path, output_dir: Path, protein_dataset: Path, max_intron_length: int, genblast_timeout_secs: int = 10800, genblast_bin: Path = PosixPath('genblast'), convert2blastmask_bin: Path = PosixPath('convert2blastmask'), makeblastdb_bin: Path = PosixPath('makeblastdb'), num_threads: int = 1, protein_set: str = ['uniprot', 'orthodb']) -> None

Executes GenBlast on genomic slices.

param masked_genome:

Masked genome file path.

type masked_genome:

Path

param output_dir:

Working directory path.

type output_dir:

Path

param protein_dataset:

Path to the protein dataset (UniProt/OrthoDB).

type protein_dataset:

Path

param genblast_timeout_secs:

Timeout for GenBlast, in seconds.

type genblast_timeout_secs:

int, default 10800

param max_intron_length:

Maximum intron length.

type max_intron_length:

int

param genblast_bin:

Path to the GenBlast executable.

type genblast_bin:

Path, default genblast

param convert2blastmask_bin:

Path to the convert2blastmask executable.

type convert2blastmask_bin:

Path, default convert2blastmask

param makeblastdb_bin:

Path to the makeblastdb executable.

type makeblastdb_bin:

Path, default makeblastdb

param num_threads:

Number of threads.

type num_threads:

int, default 1

param protein_set:

Source of the protein set.

type protein_set:

str, [“uniprot”, “orthodb”]

return:

None

rtype:

None
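
A minimal usage sketch for the signature above, assuming the package is installed and the three external executables are on PATH. The file names, the max_intron_length value, and the helper names genblast_kwargs, missing_binaries, and run_example are hypothetical and not part of the module. Passing a single string for protein_set is also an assumption, based on the str annotation.

```python
from pathlib import Path
import shutil
from typing import Any, Dict, List


def genblast_kwargs(work_dir: Path, protein_set: str = "uniprot") -> Dict[str, Any]:
    """Collect keyword arguments for run_genblast (all paths are hypothetical)."""
    return {
        "masked_genome": work_dir / "genome.softmasked.fa",   # hypothetical file name
        "output_dir": work_dir / "genblast_output",           # hypothetical directory
        "protein_dataset": work_dir / "uniprot_proteins.fa",  # hypothetical file name
        "max_intron_length": 100000,                          # illustrative value only
        "genblast_timeout_secs": 10800,                       # documented default
        "num_threads": 4,
        "protein_set": protein_set,                           # assumed: one source name
    }


def missing_binaries(names: List[str]) -> List[str]:
    """Return the executables that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def run_example(work_dir: Path) -> None:
    """Check the default executables, then call run_genblast."""
    # Imported lazily so the helpers above work without the package installed.
    from ensembl.tools.anno.protein_annotation.genblast import run_genblast

    missing = missing_binaries(["genblast", "convert2blastmask", "makeblastdb"])
    if missing:
        raise FileNotFoundError(f"Executables not found on PATH: {missing}")
    run_genblast(**genblast_kwargs(work_dir))
```

Checking the executables first surfaces a missing tool before any work starts. If the tools live outside PATH, pass their locations through genblast_bin, convert2blastmask_bin, and makeblastdb_bin instead.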