Using PacBio IsoSeq Pipelines

Warning

In this mode, truncate_ratio_5p and truncate_ratio_3p have no effect.

CCS reads generated by pbsim3 can be used in the officially supported PacBio IsoSeq pipelines. To complete this tutorial, you need to install PacBio SMRTLink or its community version (recommended). The versions of the dependencies are as follows:

Software    Version
pbmerge     3.0.0 (commit v3.0.0)
pbindex     3.0.0 (commit v3.0.0)
samtools    1.16.1
ccs         6.0.0 (commit v6.0.0-2-gf165cc26)
pbmm2       1.10.0 (commit v1.10.0)
isoseq3     3.8.2 (commit v3.8.2)
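
Before starting, it may be worth confirming that the tools listed above are reachable on PATH. The following is a minimal sketch, not part of YASIM: the helper name missing_tools is hypothetical, and the check only tests whether each executable can be found, not which version is installed.

```shell
# Hypothetical helper (not part of YASIM): print every tool name
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tool names are taken from the dependency table above.
missing="$(missing_tools pbmerge pbindex samtools ccs pbmm2 isoseq3)"
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
else
    echo "All listed tools were found on PATH."
fi
```

Installed versions still need to be compared with the table by hand, since this sketch does not query them.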

Generate CCS reads. We use PacBio Sequel as an example.

python -m yasim pbsim3 \
    -m SEQUEL \
    -M errhmm \
    -F chrm_trans.fa.d \
    -d isoform_low_depth.tsv \
    -o chrm_ccs_isoseq \
    -j 40 \
    --ccs_pass 10 \
    --preserve_intermediate_files

Warning

By default, the YASIM PBSIM3 workflow removes intermediate files such as CCS BAMs to save space. The IsoSeq3 pipeline needs these files, so pass --preserve_intermediate_files to disable this behaviour.
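
One way to confirm that the intermediate CCS BAMs were actually kept is to count the files matching the glob pattern used in the merge step. This is a sketch under assumptions: the helper name count_matches is hypothetical, and it relies on unquoted expansion, so it is only suitable for patterns without whitespace.

```shell
# Hypothetical helper: count existing files matching a glob pattern
# that is passed in as a single quoted string.
count_matches() {
    n=0
    # $1 is left unquoted on purpose so that the shell expands the glob.
    # An unmatched pattern stays literal and fails the -e test below.
    for f in $1; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# The pattern is the one used by the merge command in this tutorial.
n_bams="$(count_matches 'chrm_ccs_isoseq.d/*/tmp*.ccs.bam')"
echo "Found ${n_bams} intermediate CCS BAM file(s)."
```

A count of zero suggests that the simulation was run without --preserve_intermediate_files or that the output directory differs from the one assumed here.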

Merge all the small CCS BAMs into a single CCS BAM.

python -m yasim_scripts merge_pbccs \
    --out chrm_ccs_isoseq.ccs.bam \
    --input_bam_glob 'chrm_ccs_isoseq.d/*/tmp*.ccs.bam'
pbindex chrm_ccs_isoseq.ccs.bam

The file chrm_ccs_isoseq.ccs.bam should now be treated as a HiFi BAM without adapters, primers, etc., so Lima and IsoSeq polish are not needed. You can then run the standard PacBio IsoSeq pipeline. For example:

# Cluster reads that originate from the same transcript isoform
isoseq3 cluster \
    chrm_ccs_isoseq.ccs.bam \
    chrm_ccs_isoseq.transcripts.xml \
    --log-level INFO \
    --num-threads 40
# Align clustered reads to the reference genome
pbmm2 align \
    --preset ISOSEQ \
    --sort \
    --log-level INFO \
    chrm_ccs_isoseq.transcripts.xml.hq.bam \
    chrM.fa \
    chrm_ccs_isoseq.aln.bam
# Collapse aligned reads to GFF
isoseq3 collapse \
    --do-not-collapse-extra-5exons \
    --log-level INFO \
    chrm_ccs_isoseq.aln.bam \
    chrm_ccs_isoseq.ccs.bam \
    chrm_ccs_isoseq.collapse.gff

The generated annotation file is written to chrm_ccs_isoseq.collapse.gff. You can use GffCompare, SQANTI3, or Pigeon for further analysis.
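
As a quick sanity check before further analysis, the number of collapsed isoforms can be estimated by counting records in the GFF. The sketch below assumes that the file is tab-separated with the feature type in the third column and that isoforms are recorded as "transcript" features; both are assumptions about the output format rather than something this tutorial states, and count_gff_features is a hypothetical helper name.

```shell
# Hypothetical helper: count non-comment records whose third column
# (feature type) equals the requested type.
# $1: path to a GFF file; $2: feature type, e.g. "transcript"
count_gff_features() {
    awk -F '\t' -v feat="$2" \
        '!/^#/ && $3 == feat { n++ } END { print n + 0 }' "$1"
}

# Only run the check if the collapse step has produced its output.
if [ -f chrm_ccs_isoseq.collapse.gff ]; then
    count_gff_features chrm_ccs_isoseq.collapse.gff transcript
fi
```

If the count is zero while the file is not empty, the feature type is probably named differently, and inspecting the first few lines of the file would show which value to pass instead.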