prepare
ensembl.io.genomio.seq_region.prepare
Construct a seq_region metadata file from INSDC files.
main()
Module's entry-point.
Source code in src/python/ensembl/io/genomio/seq_region/prepare.py
prepare_seq_region_metadata(genome_file, report_file, dst_file, *, gbff_file=None, to_exclude=None, mock_run=False)
Prepares the sequence region metadata found in the INSDC/RefSeq report and GBFF files.
The sequence region information is loaded from both sources and combined. Elements are added/excluded as requested, and the final sequence region metadata is dumped in a JSON file that follows the schema defined in "src/python/ensembl/io/genomio/data/schemas/seq_region.json".
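The combine, exclude, and dump steps described above can be pictured with a minimal, self-contained sketch. It is not the module's implementation: the helper `combine_seq_regions`, the in-memory records, and the field names `length` and `circular` are all hypothetical, chosen only to illustrate the flow.

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory


def combine_seq_regions(report_regions, gbff_regions, to_exclude=None):
    """Merge per-region records from two sources, then drop excluded names.

    Hypothetical helper for illustration; not part of ensembl.io.genomio.
    """
    excluded = set(to_exclude or [])
    combined = {}
    for source in (report_regions, gbff_regions):
        for name, record in source.items():
            # Create the entry on first sight, then layer on the new fields.
            combined.setdefault(name, {"name": name}).update(record)
    return [rec for name, rec in sorted(combined.items()) if name not in excluded]


# Hypothetical records standing in for parsed report and GBFF content.
report = {"chr1": {"length": 1000}, "MT": {"length": 160}}
gbff = {"MT": {"circular": True}}

with TemporaryDirectory() as tmp_dir:
    dst_file = Path(tmp_dir) / "seq_region.json"
    regions = combine_seq_regions(report, gbff, to_exclude=["chr1"])
    # Dump the final list as JSON, then read it back to check the round trip.
    dst_file.write_text(json.dumps(regions, indent=2))
    reloaded = json.loads(dst_file.read_text())
```

Here the `MT` record ends up with fields from both sources, while `chr1` is dropped because it is listed in `to_exclude`.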
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`genome_file` | `StrPath` | Genome metadata JSON file path. | required |
`report_file` | `StrPath` | INSDC/RefSeq sequences report file path to parse. | required |
`gbff_file` | `StrPath \| None` | INSDC/RefSeq GBFF file path to parse. | `None` |
`dst_file` | `StrPath` | Output JSON file path for the processed sequence regions. | required |
`to_exclude` | `list[str] \| None` | Sequence region names to exclude. | `None` |
`mock_run` | `bool` | Do not call external taxonomy service. | `False` |
Source code in src/python/ensembl/io/genomio/seq_region/prepare.py