checksums

ensembl.utils.checksums

Utilities for common hash operations (often referred to as checksums) over files, e.g. MD5 or SHA-256.
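The source shown on this page passes the algorithm name straight to Python's `hashlib.new`, so the accepted names are whatever `hashlib` supports on the running interpreter. A minimal sketch for checking a name up front; the helper `is_supported_algorithm` is hypothetical and not part of ensembl.utils:

```python
import hashlib


def is_supported_algorithm(name: str) -> bool:
    """Hypothetical helper: True if hashlib.new accepts this algorithm name."""
    try:
        hashlib.new(name)
    except ValueError:
        # hashlib.new raises ValueError for names it does not recognise.
        return False
    return True


# Names in this set are documented as available on every platform.
print(sorted(hashlib.algorithms_guaranteed))
print(is_supported_algorithm("sha256"))
print(is_supported_algorithm("sha128"))
```

A check like this gives an early, explicit answer instead of an exception raised from inside the hashing call.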

get_file_hash(file_path, algorithm='md5')

Returns the hash value for a given file and hash algorithm.

Parameters:

Name       Type     Description                                    Default
file_path  StrPath  File path to get the hash for.                 required
algorithm  str      Secure hash or message digest algorithm name.  'md5'

Source code in src/ensembl/utils/checksums.py
# Imports this excerpt relies on (they sit earlier in checksums.py).
import hashlib
from pathlib import Path

from ensembl.utils import StrPath


def get_file_hash(file_path: StrPath, algorithm: str = "md5") -> str:
    """Returns the hash value for a given file and hash algorithm.

    Args:
        file_path: File path to get the hash for.
        algorithm: Secure hash or message digest algorithm name.
    """
    hash_func = hashlib.new(algorithm)
    # The whole file is read into memory before hashing.
    with Path(file_path).open("rb") as f:
        data_bytes = f.read()
    hash_func.update(data_bytes)
    return hash_func.hexdigest()
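Because `get_file_hash` reads the entire file with a single `f.read()`, memory use grows with file size. For very large files, one possible alternative is to feed the hash object in fixed-size chunks. The sketch below is not the library's implementation: the function name `get_file_hash_chunked` and the `chunk_size` parameter are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def get_file_hash_chunked(file_path, algorithm="md5", chunk_size=1024 * 1024):
    """Hypothetical variant of get_file_hash that reads the file in chunks."""
    hash_func = hashlib.new(algorithm)
    with Path(file_path).open("rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hash_func.update(chunk)
    return hash_func.hexdigest()


# Demo on a throwaway file; a small chunk size forces several reads.
payload = b"ACGT" * 1000
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo.txt"
    demo_path.write_bytes(payload)
    chunked_digest = get_file_hash_chunked(demo_path, "sha256", chunk_size=256)

# Feeding the data in pieces gives the same digest as hashing it in one call.
print(chunked_digest == hashlib.sha256(payload).hexdigest())
```

Repeated `update` calls are equivalent to a single call on the concatenated data, so the digest is unchanged while peak memory stays near `chunk_size`.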

validate_file_hash(file_path, hash_value, algorithm='md5')

Returns true if the file's hash value is the same as the one provided for that hash algorithm, false otherwise.

Parameters:

Name        Type     Description                                    Default
file_path   StrPath  Path to the file to validate.                  required
hash_value  str      Expected hash value.                           required
algorithm   str      Secure hash or message digest algorithm name.  'md5'

Source code in src/ensembl/utils/checksums.py
def validate_file_hash(file_path: StrPath, hash_value: str, algorithm: str = "md5") -> bool:
    """Returns true if the file's hash value is the same as the one provided for that hash
    algorithm, false otherwise.

    Args:
        file_path: Path to the file to validate.
        hash_value: Expected hash value.
        algorithm: Secure hash or message digest algorithm name.
    """
    file_hash = get_file_hash(file_path, algorithm)
    return file_hash == hash_value
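The comparison in `validate_file_hash` is an exact string match, and `hexdigest()` returns lowercase hexadecimal. An expected value copied from a checksum file in uppercase, or with stray whitespace, would therefore not match. A minimal sketch of normalising the expected value before validation; `normalize_hash` is a hypothetical helper, not part of ensembl.utils:

```python
import hashlib


def normalize_hash(hash_value: str) -> str:
    """Hypothetical helper: trim whitespace and lowercase a hex digest string."""
    return hash_value.strip().lower()


# Stand-in for a value computed from a file: hexdigest() yields lowercase hex.
computed = hashlib.sha256(b"hello world").hexdigest()

# An expected value as it might appear in a checksum file: uppercase, padded.
expected = "  " + computed.upper() + "\n"

print(computed == expected)                  # exact match fails
print(computed == normalize_hash(expected))  # match after normalisation
```

With the library installed, the normalised string could then be passed as `hash_value` to `validate_file_hash`.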