Statistics Pipeline Troubleshooting Guide
This guide covers common issues, error messages, and solutions for the Ensembl genes statistics pipeline.
Table of Contents
- Database Connection Issues
- Module-Specific Errors
- Resource and Performance Issues
- File System and I/O Problems
- Version and Dependency Issues
- Data Quality and Validation
- Debugging Strategies
Database Connection Issues
Error: "Access denied for user"
Symptoms:
Causes:
- Incorrect database credentials
- Missing database privileges
- Network access restrictions
Solutions:
1. Verify credentials in your configuration:
   params.user = 'correct_username'
   params.password = 'correct_password'
   params.host = 'correct_host'
   params.port = 3306
2. Test connection manually:
3. Check database grants:
Affected Modules: All database-dependent modules (FETCH_GENOME, FETCH_PROTEINS, RUN_STATISTICS, RUN_ENSEMBL_META, POPULATE_DB, BUSCO_CORE_METAKEYS)
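The manual connection test and the grant check listed above can be combined into one call to the standard `mysql` client. The following is a minimal sketch, not the pipeline's own tooling: the function name `check_db_login` and the `MYSQL_BIN` override are hypothetical, and the host, port, and user are whatever you pass in. With `--password` given without a value, the client prompts for the password instead of exposing it on the command line.

```shell
# Log in, run a trivial query, and list the grants of the connecting user.
# MYSQL_BIN (hypothetical override) selects the client binary; default is "mysql".
check_db_login() {
    host="$1"
    port="$2"
    user="$3"
    "${MYSQL_BIN:-mysql}" --host="$host" --port="$port" --user="$user" --password \
        -e 'SELECT 1; SHOW GRANTS FOR CURRENT_USER();'
}
```

Calling `check_db_login <host> 3306 <user>` with the same values as `params.host`, `params.port`, and `params.user` shows whether the login itself fails or whether the account simply lacks the privileges the pipeline needs.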
Error: "Can't connect to MySQL server"
Symptoms:
Causes:
- Database server is down
- Network connectivity issues
- Firewall blocking connection
- Wrong host/port
Solutions:
1. Verify server is running:
2. Check that firewall rules allow outbound connections to the database port.
3. Verify that host and port are correct in the parameters.
4. For compute environments, ensure security groups allow database access.
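To separate network problems from credential problems, it can help to test whether the database port is reachable at all, without a MySQL client. The sketch below assumes `bash` and the coreutils `timeout` command are available on the machine running the pipeline; the function name `port_open` and the 5-second timeout are arbitrary choices for illustration.

```shell
# Return 0 if a TCP connection to host:port can be opened within 5 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra client is required.
port_open() {
    host="$1"
    port="$2"
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print a one-line verdict for a host and port.
report_port() {
    if port_open "$1" "$2"; then
        echo "reachable: $1:$2"
    else
        echo "NOT reachable: $1:$2 (server down, firewall, or wrong host/port)"
    fi
}
```

Run `report_port <host> <port>` from the same node that executes the failing process, since firewall rules often differ between login and compute nodes. Where the MySQL client tools are installed, `mysqladmin ping` gives a server-level check in addition to this port-level one.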
Error: "Unknown database"
Symptoms:
Causes:
- Database name doesn't exist
- Typo in database name
- Metadata contains incorrect dbname
Solutions:
1. List available databases:
2. Verify that the metadata dbname field matches the actual database name.
3. Check for case sensitivity (database names are case-sensitive on some systems).
4. Ensure the database was created before running the pipeline.
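The case-sensitivity check above is easy to get wrong by eye. As a rough sketch, the helper below reads database names from standard input, one per line, and reports whether the wanted name matches exactly, differs only in letter case, or is missing. The function names `to_lower` and `match_db_name` are hypothetical and not part of the pipeline.

```shell
# Lowercase a string in a portable way.
to_lower() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Read database names (one per line) from stdin and compare with the wanted name.
# Prints "exact", "case-mismatch: <actual name>", or "missing".
match_db_name() {
    wanted="$1"
    wanted_lc=$(to_lower "$wanted")
    near=""
    while IFS= read -r name || [ -n "$name" ]; do
        if [ "$name" = "$wanted" ]; then
            echo "exact"
            return 0
        fi
        if [ "$(to_lower "$name")" = "$wanted_lc" ]; then
            near="$name"
        fi
    done
    if [ -n "$near" ]; then
        echo "case-mismatch: $near"
        return 1
    fi
    echo "missing"
    return 2
}
```

One way to feed it is to pipe the output of `mysql ... -N -e 'SHOW DATABASES'` into `match_db_name "<dbname from metadata>"`, where `-N` suppresses the column header.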
Module-Specific Errors
BUSCO Modules
Error: "BUSCO dataset not found"
Symptoms:
Causes:
- Network connectivity problems reaching the BUSCO database
- Invalid lineage name
- Corrupted cache directory
Solutions:
1. Verify lineage name is valid:
2. Clear cache and retry:
3. Check internet connectivity:
4. Manually download dataset if needed:
Affected Modules: BUSCO_DATASET, BUSCO_GENOME_LINEAGE, BUSCO_PROTEIN_LINEAGE
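For the lineage check above, one option is to search the dataset list printed by BUSCO itself. This sketch assumes a BUSCO v5 command-line installation, where `busco --list-datasets` prints the available lineages; the function name `lineage_is_listed` and the `BUSCO_BIN` override are hypothetical.

```shell
# Return 0 if the given lineage name appears in BUSCO's dataset listing.
# BUSCO_BIN (hypothetical override) selects the executable; default is "busco".
lineage_is_listed() {
    "${BUSCO_BIN:-busco}" --list-datasets 2>/dev/null | grep -qw -- "$1"
}

# Print a short verdict for a lineage name.
report_lineage() {
    if lineage_is_listed "$1"; then
        echo "lineage found: $1"
    else
        echo "lineage NOT found: $1 (typo, or dataset list could not be fetched)"
    fi
}
```

A negative result is ambiguous, because the listing also fails when the BUSCO data server cannot be reached, so it is worth running `busco --list-datasets` directly once to see the raw output. In BUSCO v5, `busco --download <lineage>` can then fetch a dataset ahead of the run.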
Error: "BUSCO failed to run"
Symptoms:
Causes:
- Input file format issues
- Insufficient memory
- Corrupted input data
Solutions:
1. Validate input file format:
2. Check file is not empty:
3. Increase memory allocation in config:
4. Check BUSCO logs in the work directory for the detailed error.
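The format and emptiness checks above can be done with standard tools before BUSCO is rerun. The following is a minimal sketch for uncompressed FASTA input; the function names are hypothetical, and it only checks the overall shape of the file (non-empty, starts with a header, has at least one record), not the sequence alphabet.

```shell
# Print the number of FASTA records (lines starting with '>') in a file.
fasta_record_count() {
    grep -c '^>' "$1"
}

# Return 0 if the file is non-empty, starts with a header line,
# and contains at least one record; explain the problem on stderr otherwise.
fasta_looks_valid() {
    if [ ! -s "$1" ]; then
        echo "empty or missing: $1" >&2
        return 1
    fi
    first=$(head -n 1 "$1")
    case "$first" in
        '>'*) ;;
        *)
            echo "first line is not a FASTA header: $1" >&2
            return 1
            ;;
    esac
    [ "$(fasta_record_count "$1")" -gt 0 ]
}
```

The same check applies to the protein FASTA passed to OMAmer. Gzipped inputs would need to be decompressed first, for example with `gzip -dc`.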
OMAmer/OMark Modules
Error: "OMAmer database not found"
Symptoms:
Causes:
- params.omamer_database not set correctly
- Database file doesn't exist
- Insufficient permissions
Solutions:
1. Verify database path exists:
2. Set correct path in parameters:
3. Download database if missing:
Affected Modules: OMAMER_HOG, OMARK
Error: "OMAmer search failed"
Symptoms:
Causes:
- Empty or invalid protein file
- Corrupted database
- Memory issues
Solutions:
1. Validate protein FASTA:
2. Check database integrity:
3. Reduce maxForks if memory constrained:
Fetch Modules
Error: "No translations found"
Symptoms:
Causes:
- Database has no protein_coding genes
- Wrong database selected
- Database missing translation data
Solutions:
1. Check gene count:
2. Check translation table:
3. Verify correct database in metadata:
Affected Modules: FETCH_PROTEINS
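The gene count and translation table checks above can be run in one client call. This is a sketch that assumes an Ensembl core schema with `gene.biotype` and a `translation` table; the function name `count_coding_and_translations` and the `MYSQL_BIN` override are hypothetical, and `--password` without a value makes the client prompt for the password.

```shell
# Count protein_coding genes and translation rows in one core database.
# MYSQL_BIN (hypothetical override) selects the client binary; default is "mysql".
count_coding_and_translations() {
    host="$1"
    port="$2"
    user="$3"
    db="$4"
    "${MYSQL_BIN:-mysql}" --host="$host" --port="$port" --user="$user" --password \
        --database="$db" \
        -e "SELECT COUNT(*) AS protein_coding_genes FROM gene WHERE biotype = 'protein_coding'; SELECT COUNT(*) AS translations FROM translation;"
}
```

If both counts are zero, the database is probably the wrong one or has not been fully loaded. If genes exist but translations do not, FETCH_PROTEINS has nothing to export.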
Error: "Genome fetch failed"
Symptoms:
Causes:
- Missing DNA sequences in database
- Insufficient memory for large genomes
- Database connection timeout
Solutions:
1. Verify DNA data exists:
2. Increase timeout for large genomes:
Affected Modules: FETCH_GENOME
Statistics Modules
Error: "Statistics generation failed"
Symptoms:
Causes:
- Database schema issues
- Missing required tables
- Perl API version mismatch
Solutions:
1. Verify database schema version:
2. Check required tables exist:
3. Ensure Ensembl API matches schema:
Affected Modules: RUN_STATISTICS, RUN_ENSEMBL_META
Database Population
Error: "SQL execution failed"
Symptoms:
Causes:
- SQL syntax errors
- Missing table/column
- Insufficient privileges
- Duplicate key violations
Solutions:
1. Test SQL file manually:
2. Check for errors in SQL:
3. Verify INSERT/UPDATE privileges:
4. Check for duplicate entries:
Affected Modules: POPULATE_DB, BUSCO_CORE_METAKEYS
Resource and Performance Issues
Error: "OutOfMemoryError" or process killed
Symptoms:
Causes:
- Insufficient memory allocation
- Memory leak
- Processing very large files
Solutions:
1. Increase memory in configuration:
   process {
       withLabel: busco {
           memory = { 8.GB * task.attempt }
       }
       withLabel: omamer {
           memory = { 16.GB * task.attempt }
       }
   }
2. Enable automatic retry with more memory:
3. Monitor memory usage:
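When a process is killed rather than failing with its own error message, the exit status recorded by Nextflow is often the quickest clue. The sketch below lists task directories whose `.exitcode` file contains 137, which corresponds to 128 + 9 (SIGKILL) and is commonly, though not always, the result of an out-of-memory kill. The function name `list_killed_tasks` is hypothetical, and some schedulers report other codes for memory limits.

```shell
# List Nextflow task directories whose recorded exit status is 137 (128 + SIGKILL).
# The first argument is the work directory; it defaults to "work".
list_killed_tasks() {
    work_dir="${1:-work}"
    find "$work_dir" -name .exitcode -type f | while IFS= read -r f; do
        code=$(tr -d '[:space:]' < "$f")
        if [ "$code" = "137" ]; then
            dirname "$f"
        fi
    done
}
```

Each directory printed by `list_killed_tasks work` contains the `.command.sh` and `.command.err` of a killed task, which shows which module needs a larger `memory` setting.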
Issue: Pipeline very slow / processes waiting
Symptoms:
- Many processes in "PENDING" state
- Low CPU usage
- Processes waiting hours to start
Causes:
- Too many parallel forks
- Database connection bottleneck
- I/O bottleneck
Solutions:
1. Reduce maxForks for database-heavy processes:
   process {
       withLabel: fetch_file {
           maxForks = 10 // Reduce from 20
       }
       withName: OMAMER_HOG {
           maxForks = 5 // Reduce from 15
       }
   }
2. Stagger job submission:
3. Check database connection pool limits.
4. Monitor I/O with:
Issue: "Too many open files"
Symptoms:
Causes:
- System file descriptor limit reached
- Too many parallel processes
- File handles not being released
Solutions:
1. Increase file descriptor limit:
2. Set in your shell profile:
3. Reduce maxForks globally:
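The file descriptor limit mentioned above can be inspected and raised from the shell that launches Nextflow. This is a small sketch using the standard `ulimit` builtin; the function names are hypothetical. It raises only the soft limit, up to the hard limit, because going beyond the hard limit normally requires administrator changes.

```shell
# Print the current soft and hard limits on open file descriptors.
show_fd_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# Raise the soft limit to the hard limit for the current shell session.
raise_fd_limit() {
    hard=$(ulimit -Hn)
    ulimit -Sn "$hard"
}
```

Calling `raise_fd_limit` before `nextflow run` affects that session only. To make the change persistent, the same `ulimit -Sn` line can be placed in a shell profile such as `~/.bashrc`.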
File System and I/O Problems
Error: "No space left on device"
Symptoms:
Causes:
- Work directory full
- Cache directory full
- Output directory full
Solutions:
1. Check disk usage:
2. Enable cleaning:
3. Use different storage for cache:
4. Clean up old work directories:
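For the disk usage check above, a quick way to see which of the work, cache, and output locations is full is to query the file system behind each path. The sketch below relies on POSIX `df -P`; the function names and the default 90% threshold are arbitrary choices for illustration.

```shell
# Print the percentage of used space on the file system that holds a path.
disk_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage is at or above a threshold (default 90, an arbitrary value).
warn_if_full() {
    pct=$(disk_use_pct "$1")
    limit="${2:-90}"
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $1 is ${pct}% full"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}
```

Running `warn_if_full` on the work, cache, and output directories in turn shows which one needs attention. `du -sh work/` then gives the size of the work directory itself, and `nextflow clean` can remove the work files of earlier runs.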
Issue: File system latency causing failures
Symptoms:
- Files not found immediately after creation
- Intermittent "No such file or directory" errors
- Process succeeds on retry
Causes:
- Distributed/NFS file system sync delay
- High I/O load
Solutions:
1. Increase latency delay:
2. Use faster storage for work directory:
3. Enable error retry:
Error: "Permission denied"
Symptoms:
Causes:
- Insufficient file permissions
- Wrong user/group ownership
- Read-only file system
Solutions:
1. Check permissions:
2. Fix ownership:
3. Set correct permissions:
4. Verify write access to output/cache directories.
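Write access is easiest to verify by actually creating a file, since mode bits alone do not reveal read-only mounts or ACL restrictions. A minimal sketch follows; the function name `check_writable` and the probe file name are hypothetical.

```shell
# Try to create and remove a small probe file in a directory.
# Returns 0 if writable, 1 if not writable, 2 if the directory does not exist.
check_writable() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    probe="$dir/.write_test.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        ls -ld "$dir"   # show owner, group, and mode to guide the fix
        return 1
    fi
}
```

Run it as the same user, and on the same node, that executes the pipeline tasks. The `ls -ld` output indicates whether `chown` or `chmod` is the appropriate fix.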
Version and Dependency Issues
Error: "Command not found"
Symptoms:
Causes:
- Tool not installed in container
- Wrong container image
- PATH not set correctly
Solutions:
1. Verify container has the tool:
2. Check container definition in config:
3. For conda environments:
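Whether a tool is visible on `PATH` can be checked in the same way inside a container, a conda environment, or a plain shell. The helper below is a sketch; the function name `require_tools` is hypothetical, and the tool names to pass are whichever commands the failing module calls.

```shell
# Report, for each named tool, whether it is found on PATH.
# Returns 0 only if every tool is present.
require_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool -> $(command -v "$tool")"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

The check is only meaningful when run in the environment the task actually uses, for example through `docker run --rm <image> sh -c 'command -v <tool>'` or the equivalent `singularity exec` call, or after activating the relevant conda environment.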
Error: Version incompatibility
Symptoms:
Causes:
- Outdated tool version
- Wrong Ensembl release
- Script-API version mismatch
Solutions:
1. Check tool versions in versions.yml outputs.
2. Update container to correct version:
3. Match Ensembl API to database schema:
Issue: Python/Perl module not found
Symptoms:
Causes:
- Missing dependencies in environment
- Wrong Python/Perl environment active
Solutions:
1. For Perl modules:
2. For Python modules:
3. Ensure correct environment in container:
4. Check PERL5LIB and PYTHONPATH:
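Module availability can be tested directly by asking each interpreter to load the module. The following sketch uses `perl -M<Module> -e 1` and `python3 -c "import <module>"`; the function names are hypothetical, and a failure can mean either that the module is missing or that the interpreter itself is not on `PATH`.

```shell
# Return 0 if Perl can load the named module (for example Bio::EnsEMBL::Registry).
perl_module_available() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Return 0 if Python 3 can import the named module.
python_module_available() {
    python3 -c "import $1" >/dev/null 2>&1
}

# Print the search path variables that influence module lookup.
show_module_paths() {
    echo "PERL5LIB=${PERL5LIB:-<unset>}"
    echo "PYTHONPATH=${PYTHONPATH:-<unset>}"
}
```

If a module loads in your login shell but not inside the task, comparing the output of `show_module_paths` in both places usually reveals which environment the task is really using.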
Data Quality and Validation
Issue: Empty BUSCO results
Symptoms:
- BUSCO completes but reports 0% completeness
- No genes found
Causes:
- Wrong lineage selected
- Input data format issues
- Corrupted input file
Solutions:
1. Verify lineage is appropriate:
2. Check input file format:
3. Try broader lineage:
4. Validate file integrity:
Issue: OMark reports high contamination
Symptoms:
- OMark warns of possible contamination
- Many "unexpected" proteins
Causes:
- Actual contamination in assembly
- Wrong taxonomic placement
- Horizontal gene transfer
Solutions:
1. Review OMark detailed output in omark_output/ directory.
2. Check for known contaminants:
3. Run contamination screening on assembly:
4. If legitimate, document in metadata.
Issue: Statistics don't match expectations
Symptoms:
- Gene counts seem wrong
- Missing expected data in statistics
Causes:
- Database not fully populated
- Wrong database selected
- Filtering parameters
Solutions:
1. Manually check database:
   SELECT biotype, COUNT(*) FROM gene GROUP BY biotype;
   SELECT biotype, COUNT(*) FROM transcript GROUP BY biotype;
2. Verify using correct database:
3. Review statistics SQL files:
4. Compare with previous runs if available.
Debugging Strategies
Strategy 1: Enable trace and debugging
// nextflow.config
trace {
    enabled = true
    file = 'trace.txt'
}
dag {
    enabled = true
    file = 'dag.html'
}
report {
    enabled = true
    file = 'report.html'
}
Strategy 2: Examine work directories
# Find the first work directory whose task exited with a non-zero status
# (grep -L lists .exitcode files that do not consist of exactly "0")
find work/ -name .exitcode -exec grep -L -x '0' {} + | head -1 | xargs dirname
# View logs
cd <work_dir>
cat .command.sh # Command that was run
cat .command.log # Combined stdout and stderr
cat .command.err # stderr
cat .command.trace # Resource usage
Strategy 3: Run process manually
# Navigate to work directory
cd <work_dir>
# Run command directly
bash .command.sh
# Or step through script line by line
Strategy 4: Reduce scope for testing
// Test with single sample
meta_ch = channel.of([
    [gca: 'GCA_000001405.15', dbname: 'test_db', production_name: 'test']
])
Strategy 5: Check version compatibility
# Collect all versions (find is used because ** only recurses in bash with globstar enabled)
find work/ -name versions.yml -exec cat {} + > all_versions.yml
# Compare against known working versions
Strategy 6: Increase verbosity
# Run with debug logging
nextflow run main.nf -profile test --debug
# Nextflow trace
nextflow run main.nf -with-trace -with-report -with-timeline
Common Error Patterns
Pattern: Intermittent failures
Indicators:
- Process succeeds on retry
- Random timing of failures
- Different processes failing
Likely Causes:
- File system latency
- Network issues
- Resource contention
Solution:
Pattern: All processes fail at same stage
Indicators:
- All samples fail at same module
- Consistent error message
- Happens immediately on start
Likely Causes:
- Configuration error
- Missing parameter
- Wrong path/credential
Solution:
- Review parameters for that specific module
- Check parameter documentation
- Validate paths and credentials
Pattern: Failures after long runtime
Indicators:
- Process runs for hours then fails
- Memory-related errors
- Disk space errors
Likely Causes:
- Insufficient resources
- Memory leak
- Large file handling
Solution:
process {
    memory = { 8.GB * task.attempt }
    time = { 4.h * task.attempt }
    errorStrategy = 'retry'
    maxRetries = 3
}
Getting Help
Before asking for help:
- ✅ Check this troubleshooting guide
- ✅ Review module documentation
- ✅ Examine work directory logs
- ✅ Check parameter configuration
- ✅ Try with single sample/minimal data
When reporting issues, include:
- Error message (full text from .command.err)
- Command executed (.command.sh contents)
- Configuration (relevant params)
- Environment (Nextflow version, executor)
- Steps to reproduce
- Versions file (if generated)
Useful diagnostic commands:
# Nextflow version
nextflow -version
# Process logs
find work/ -name .command.err -exec grep -l ERROR {} \;
# Resource usage (find is used because ** only recurses in bash with globstar enabled)
find work/ -name .command.trace -exec cat {} +
# Configuration used
nextflow config -profile <your_profile>
# List failed processes
nextflow log <run_name> -f status,name,exit | grep FAILED
Quick Reference
| Issue | First Check | Module |
|---|---|---|
| Can't connect to database | Credentials, network | All DB modules |
| BUSCO fails | Lineage, input format | BUSCO_* |
| OMAmer database not found | Path, file exists | OMAMER_HOG, OMARK |
| Empty output | Input data, database content | FETCH_*, BUSCO_* |
| SQL errors | Privileges, syntax | POPULATE_DB |
| Out of memory | Process memory config | BUSCO_*, OMAMER_* |
| File not found | File system latency | All |
| Slow pipeline | maxForks, database load | All |
Last Updated: 2026-02-07
For: Ensembl Genes Statistics Pipeline v1.0