Using PacBio IsoSeq Pipelines
Warning
In this mode, truncate_ratio_5p and truncate_ratio_3p have no effect.
CCS reads generated by pbsim3 can be used in the officially supported PacBio IsoSeq pipelines. To complete this tutorial, you need to install PacBio SMRTLink or its community version (recommended). The dependency versions used here are:
| Software | Version |
|---|---|
| pbmerge | 3.0.0 (commit v3.0.0) |
| pbindex | 3.0.0 (commit v3.0.0) |
| | 1.16.1 |
| | 6.0.0 (commit v6.0.0-2-gf165cc26) |
| | 1.10.0 (commit v1.10.0) |
| | 3.8.2 (commit v3.8.2) |
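Before starting, it can help to confirm that the command-line tools used in this tutorial are on your PATH. The following is a minimal sketch, not part of YASIM: `missing_tools` is a hypothetical helper, and the tool list is simply the set of executables invoked in the commands on this page.

```shell
# Print every tool from the argument list that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of YASIM or SMRTLink.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools invoked later in this tutorial; no output means all were found.
missing_tools pbmerge pbindex pbmm2 isoseq3
```

If any name is printed, install or activate the corresponding package before continuing.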
Generate CCS reads. This example uses the PacBio Sequel model.
python -m yasim pbsim3 \
-m SEQUEL \
-M errhmm \
-F chrm_trans.fa.d \
-d isoform_low_depth.tsv \
-o chrm_ccs_isoseq \
-j 40 \
--ccs_pass 10 \
--preserve_intermediate_files
Warning
By default, the YASIM PBSIM3 workflow removes intermediate files such as CCS BAMs to save space. The IsoSeq3 pipeline needs those files, so pass --preserve_intermediate_files to disable this behaviour.
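A quick way to confirm that the intermediate CCS BAMs were kept is to count the files matching the glob used in the merge step. This is a sketch only: `count_ccs_bams` is a hypothetical helper, and it assumes the intermediate files live under the output prefix plus `.d`, as the glob in this tutorial suggests.

```shell
# Count files matching <dir>/*/tmp*.ccs.bam, the glob used for merging.
# count_ccs_bams is a hypothetical helper, not part of YASIM.
count_ccs_bams() {
    n=0
    for f in "$1"/*/tmp*.ccs.bam; do
        # An unmatched glob stays literal, so check that the file exists.
        [ -e "$f" ] && n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# Directory name follows from "-o chrm_ccs_isoseq" in the command above.
count_ccs_bams chrm_ccs_isoseq.d
```

A count of 0 suggests the simulation was run without --preserve_intermediate_files, or that the directory name differs from the one assumed here.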
Merge all the small CCS BAMs into a single CCS BAM.
python -m yasim_scripts merge_pbccs \
--out chrm_ccs_isoseq.ccs.bam \
--input_bam_glob 'chrm_ccs_isoseq.d/*/tmp*.ccs.bam'
pbindex chrm_ccs_isoseq.ccs.bam
The file chrm_ccs_isoseq.ccs.bam can now be treated as a HiFi BAM without adapters, primers, etc., so Lima and IsoSeq polish are not needed. You can then run the standard PacBio IsoSeq pipeline. For example:
# Cluster reads that come from the same isoform
isoseq3 cluster \
chrm_ccs_isoseq.ccs.bam \
chrm_ccs_isoseq.transcripts.xml \
--log-level INFO \
--num-threads 40
# Align clustered reads to the reference genome
pbmm2 align \
--preset ISOSEQ \
--sort \
--log-level INFO \
chrm_ccs_isoseq.transcripts.xml.hq.bam \
chrM.fa \
chrm_ccs_isoseq.aln.bam
# Collapse aligned reads to GFF
isoseq3 collapse \
--do-not-collapse-extra-5exons \
--log-level INFO \
chrm_ccs_isoseq.aln.bam \
chrm_ccs_isoseq.ccs.bam \
chrm_ccs_isoseq.collapse.gff
The generated annotation file is chrm_ccs_isoseq.collapse.gff. You are free to use GffCompare, SQANTI3, or Pigeon for further analysis.
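As a quick sanity check before further analysis, you can count how many isoforms were collapsed. The sketch below assumes the GFF is tab-separated and marks each isoform with the feature type `transcript` in column 3; inspect a few lines of your own file first, since this layout is an assumption here. `count_transcripts` is a hypothetical helper name.

```shell
# Count "transcript" features (column 3) in a tab-separated GFF/GTF file,
# skipping comment lines that start with "#".
# count_transcripts is a hypothetical helper, not part of any PacBio tool.
count_transcripts() {
    awk -F '\t' '!/^#/ && $3 == "transcript" { n++ } END { print n + 0 }' "$1"
}

# Only run on the tutorial output if it exists.
if [ -f chrm_ccs_isoseq.collapse.gff ]; then
    count_transcripts chrm_ccs_isoseq.collapse.gff
fi
```

Comparing this number with the number of transcripts in the simulated depth file gives a rough sense of how many isoforms were recovered.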