add cancer splicing lib to starfusion ref #610

Open
wants to merge 5 commits into
base: dev
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add nf-test to local subworkflow: `FUSIONCATCHER_WORKFLOW` [#591](https://github.com/nf-core/rnafusion/pull/591)
- Add nf-test to local subworkflow: `STARFUSION_WORKFLOW`. [#597](https://github.com/nf-core/rnafusion/pull/597)
- Add nf-test to local module: `FUSIONINSPECTOR`. [#601](https://github.com/nf-core/rnafusion/pull/601)
- Added `CTATSPLICING_PREPGENOMELIB` to update the starfusion genome library directory with a cancer splicing index. [#610](https://github.com/nf-core/rnafusion/pull/610)

### Changed

14 changes: 13 additions & 1 deletion conf/modules.config
@@ -360,10 +360,22 @@ process {
]
}

- withName: 'NFCORE_RNAFUSION:BUILD_REFERENCES:STARFUSION_BUILD' {
withName: 'STARFUSION_BUILD' {
cpus = { 24 * task.attempt }
memory = { 100.GB * task.attempt }
time = { 2.d * task.attempt }
publishDir = [
enabled: !params.ctatsplicing && !params.all,
path: { "${params.genomes_base}/starfusion" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
ext.args = "--max_readlength ${params.read_length} --human_gencode_filter"
}

withName: 'CTATSPLICING_PREPGENOMELIB' {
cpus = { 1 * task.attempt }
memory = { 20.GB * task.attempt }
publishDir = [
path: { "${params.genomes_base}/starfusion" },
mode: params.publish_dir_mode,
44 changes: 44 additions & 0 deletions modules/local/ctatsplicing/prepgenomelib/main.nf

Could you add the module's test? :)

Author:

Right! I forgot, thanks for the reminder :)

@@ -0,0 +1,44 @@
process CTATSPLICING_PREPGENOMELIB {
tag "$meta.id"
label 'process_single'

container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://data.broadinstitute.org/Trinity/CTAT_SINGULARITY/CTAT-SPLICING/ctat_splicing.v0.0.2.simg' :
'docker.io/trinityctat/ctat_splicing:0.0.2' }"

input:
tuple val(meta), path(genome_lib)
path(cancer_intron_tsv)

output:
tuple val(meta), path(genome_lib, includeInputs:true), emit: reference
path "versions.yml" , emit: versions

script:
def VERSION = '0.0.2' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
"""
/usr/local/src/CTAT-SPLICING/prep_genome_lib/ctat-splicing-lib-integration.py \\
--cancer_introns_tsv $cancer_intron_tsv \\
--genome_lib_dir $genome_lib

cat <<-END_VERSIONS > versions.yml
"${task.process}":
ctat-splicing: $VERSION
END_VERSIONS
"""

stub:
def VERSION = '0.0.2' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
"""
touch $genome_lib/refGene.bed
touch $genome_lib/refGene.sort.bed.gz
touch $genome_lib/refGene.sort.bed.gz.tbi
mkdir $genome_lib/cancer_splicing_lib
touch $genome_lib/cancer_splicing_lib/cancer_splicing.idx

cat <<-END_VERSIONS > versions.yml
"${task.process}":
ctat-splicing: $VERSION
END_VERSIONS
"""
}
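
The stub block above doubles as a description of what the module is expected to leave in the genome library directory. As a rough sketch in plain Groovy, a check for those files could look like the following. The helper name `missingSplicingFiles` is hypothetical, and it is an assumption that a real `ctat-splicing-lib-integration.py` run writes exactly the files listed in the stub.

```groovy
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper, not part of the pipeline: returns the expected files
// that are missing from a genome library directory.
List<String> missingSplicingFiles(Path genomeLib) {
    // File names copied from the module's stub block (assumed to match the real output).
    def expected = [
        'refGene.bed',
        'refGene.sort.bed.gz',
        'refGene.sort.bed.gz.tbi',
        'cancer_splicing_lib/cancer_splicing.idx',
    ]
    return expected.findAll { !Files.exists(genomeLib.resolve(it)) }
}

// Demo on a throwaway directory that imitates what the stub block does.
Path lib = Files.createTempDirectory('ctat_genome_lib_build_dir')
assert missingSplicingFiles(lib).size() == 4   // nothing has been created yet

Files.createDirectories(lib.resolve('cancer_splicing_lib'))
def stubFiles = [
    'refGene.bed',
    'refGene.sort.bed.gz',
    'refGene.sort.bed.gz.tbi',
    'cancer_splicing_lib/cancer_splicing.idx',
]
stubFiles.each { Files.createFile(lib.resolve(it)) }

assert missingSplicingFiles(lib).isEmpty()     // all expected files are now present
println "Expanded reference looks complete: ${lib}"
```

A check along these lines could also back an assertion in the module test requested above, though the exact nf-test wiring is left open here.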
2 changes: 1 addition & 1 deletion nextflow.config
@@ -90,7 +90,7 @@ params {
starfusion_ref = "${params.genomes_base}/starfusion/ctat_genome_lib_build_dir"
starindex_ref = "${params.genomes_base}/star"
fusionreport_ref = "${params.genomes_base}/fusion_report_db"

ctatsplicing_cancer_introns = "https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/CANCER_SPLICING_LIB_SUPPLEMENT/cancer_introns.GRCh38.Jun232020.tsv.gz"

// Internal file presence checks
salmon_index_stub_check = "${params.genomes_base}/salmon/salmon/complete_ref_lens.bin"
7 changes: 7 additions & 0 deletions nextflow_schema.json
@@ -247,6 +247,13 @@
"fa_icon": "far fa-file-code",
"description": "Path to file in starfusion references"
},
"ctatsplicing_cancer_introns": {
"type": "string",
"format": "file-path",
"exists": true,
"description": "Path to the cancer introns TSV file to create the CTAT-SPLICING reference with",
"default": "https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/CANCER_SPLICING_LIB_SUPPLEMENT/cancer_introns.GRCh38.Jun232020.tsv.gz"
},
"starindex": {
"type": "boolean",
"fa_icon": "far fa-file-code",
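
Because `ctatsplicing_cancer_introns` is declared as a file path with a default URL, it should be possible to point it at a local copy instead, for example on a system without internet access. Below is a minimal sketch of a custom config passed to Nextflow with `-c`. The file name `custom.config` and the local path are placeholders, not values from this PR.

```groovy
// custom.config (hypothetical file name)
params {
    // Placeholder path: a previously downloaded copy of the default file.
    ctatsplicing_cancer_introns = "/path/to/cancer_introns.GRCh38.Jun232020.tsv.gz"
}
```

Given `"exists": true` in the schema entry, the path presumably has to exist at launch for parameter validation to pass.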
12 changes: 11 additions & 1 deletion subworkflows/local/build_references.nf
@@ -11,6 +11,7 @@ include { HGNC_DOWNLOAD } from '../../modules/local/hgnc/main'
include { STARFUSION_BUILD } from '../../modules/local/starfusion/build/main'
include { GTF_TO_REFFLAT } from '../../modules/local/uscs/custom_gtftogenepred/main'
include { GET_RRNA_TRANSCRIPTS } from '../../modules/local/get_rrna_transcript/main'
include { CTATSPLICING_PREPGENOMELIB } from '../../modules/local/ctatsplicing/prepgenomelib/main.nf'

/*
========================================================================================
@@ -142,7 +143,16 @@ workflow BUILD_REFERENCES {
!file(params.starfusion_ref_stub_check).exists() || file(params.starfusion_ref_stub_check).isEmpty() )) {
STARFUSION_BUILD(ch_fasta, ch_gtf, params.fusion_annot_lib, params.species)
ch_versions = ch_versions.mix(STARFUSION_BUILD.out.versions)
- ch_starfusion_ref = STARFUSION_BUILD.out.reference
if (params.ctatsplicing || params.all) {

Suggested change
- if (params.ctatsplicing || params.all) {
+ if (params.ctatsplicing) {

The params.all is in the outer if clause:

if ((params.starfusion || params.all) &&
    (!file(params.starfusion_ref).exists() || file(params.starfusion_ref).isEmpty() ||
    !file(params.starfusion_ref_stub_check).exists() || file(params.starfusion_ref_stub_check).isEmpty() )) {
    STARFUSION_BUILD(ch_fasta, ch_gtf, params.fusion_annot_lib, params.species)

I think that is why test_build is failing: ch_starfusion_ref is not being populated with either CTATSPLICING_PREPGENOMELIB.out.reference or STARFUSION_BUILD.out.reference.

Author:

Hmm, but then the ctatsplicing reference creation will not run if --all is supplied?

I think it is difficult to differentiate which is the default here. If --all is supplied, should ch_starfusion_ref produce STARFUSION_BUILD.out.reference or CTATSPLICING_PREPGENOMELIB.out.reference?

Before adding your new module CTATSPLICING_PREPGENOMELIB, the default for --all was to use the STARFUSION_BUILD.out.reference but now it will always default to CTATSPLICING_PREPGENOMELIB.out.reference.

Author:

Hi, yeah, sorry for the confusion: CTATSPLICING_PREPGENOMELIB.out.reference is basically the STARFUSION reference with some extra files added to the ctatsplicing folder inside it. So it does not emit a different reference, just an expanded one.

CTATSPLICING_PREPGENOMELIB(
STARFUSION_BUILD.out.reference,
params.ctatsplicing_cancer_introns
)
ch_versions = ch_versions.mix(CTATSPLICING_PREPGENOMELIB.out.versions)
ch_starfusion_ref = CTATSPLICING_PREPGENOMELIB.out.reference
} else {
ch_starfusion_ref = STARFUSION_BUILD.out.reference
}
}
else {
ch_starfusion_ref = Channel.fromPath(params.starfusion_ref)
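
To summarise the review thread on this hunk: inside the build branch, `STARFUSION_BUILD` always runs, and the two flags only decide whether its output is additionally passed through `CTATSPLICING_PREPGENOMELIB`. The plain Groovy sketch below mirrors that `if`/`else` together with the `publishDir` `enabled` expression from `conf/modules.config`. Both function names are hypothetical and exist only for illustration.

```groovy
// Hypothetical helpers for illustration; neither exists in the pipeline.

// Which process output ends up in ch_starfusion_ref inside the build branch.
String starfusionRefSource(Map flags) {
    if (flags.ctatsplicing || flags.all) {
        // STARFUSION_BUILD has already run; its directory is expanded and re-emitted.
        return 'CTATSPLICING_PREPGENOMELIB.out.reference'
    }
    return 'STARFUSION_BUILD.out.reference'
}

// Mirrors `enabled: !params.ctatsplicing && !params.all` for STARFUSION_BUILD.
boolean starfusionBuildPublishes(Map flags) {
    return !flags.ctatsplicing && !flags.all
}

def combos = [
    [ctatsplicing: false, all: false],
    [ctatsplicing: true,  all: false],
    [ctatsplicing: false, all: true],
    [ctatsplicing: true,  all: true],
]

// Print one line per flag combination.
combos.each { flags ->
    println "${flags} -> ${starfusionRefSource(flags)}, " +
            "STARFUSION_BUILD publishes: ${starfusionBuildPublishes(flags)}"
}
```

Whenever the reference comes from `CTATSPLICING_PREPGENOMELIB`, `STARFUSION_BUILD` stops publishing. That fits with both processes targeting the same `publishDir` path, so only the expanded directory would be written there.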