CLI Internals

This page describes the internal API used by the command line pipeline. These functions and objects are not meant for interactive use.

velocyto.counter module

class velocyto.counter.ExInCounter(valid_bcs2idx: typing.Dict[str, int] = None) → None

Bases: object

Main class to do the counting of introns and exons

bclen
process(ivlfile: str, samfile: str) → None

Performs all steps in the right order

static parse_cigar_tuple(pos: int) → typing.Tuple[[typing.List[typing.Tuple[int, int]], bool, int], int]
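
The return annotation of parse_cigar_tuple suggests that it produces the aligned segments of a read, a spliced flag, and two integers. As a rough illustration of that kind of parsing, and not velocyto's actual implementation, the sketch below converts pysam-style (operation, length) tuples into reference segments. The function name cigar_to_segments, the inclusive end coordinates, and the reading of the two integers as left and right soft-clip lengths are all assumptions made for this example.

```python
from typing import List, Tuple

# CIGAR operation codes as defined by the SAM/BAM specification
CIGAR_M, CIGAR_D, CIGAR_N, CIGAR_S = 0, 2, 3, 4
CIGAR_EQ, CIGAR_X = 7, 8


def cigar_to_segments(
    cigartuples: List[Tuple[int, int]], pos: int
) -> Tuple[List[Tuple[int, int]], bool, int, int]:
    """Hypothetical helper: return (segments, spliced, clip_left, clip_right)."""
    segments: List[Tuple[int, int]] = []
    spliced = False
    clip_left = clip_right = 0
    seg_start = pos   # reference start of the segment being built
    cursor = pos      # next reference position to be consumed
    seen_aligned = False
    for op, length in cigartuples:
        if op in (CIGAR_M, CIGAR_EQ, CIGAR_X):
            cursor += length
            seen_aligned = True
        elif op == CIGAR_D:
            cursor += length  # a deletion consumes reference but stays in the segment
        elif op == CIGAR_N:
            # a skipped region (intron) closes the current segment
            if cursor > seg_start:
                segments.append((seg_start, cursor - 1))
            cursor += length
            seg_start = cursor
            spliced = True
        elif op == CIGAR_S:
            if seen_aligned:
                clip_right = length
            else:
                clip_left = length
        # insertions, hard clips and padding do not consume the reference
    if cursor > seg_start:
        segments.append((seg_start, cursor - 1))
    return segments, spliced, clip_left, clip_right


# Example: 2S10M100N5M3S starting at reference position 1000
result = cigar_to_segments([(4, 2), (0, 10), (3, 100), (0, 5), (4, 3)], 1000)
```

In this example the read yields two segments separated by a 100-base skipped region, so the spliced flag is set.
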
peek(samfile: str, lines: int = 30) → None

Peeks into the samfile to determine if it is a cellranger or dropseq file

iter_unique_alignments(samfile: str, yield_line: bool = False) → typing.Iterable

Iterates over the bam/sam file

read_repeats(gtf_file: str) → int

Read masked repeats

read_genes(ivlfile: str) → int

Read genes and intervals

mark_up_introns(samfile: str, intron_validation: str) → None

Mark up introns that have reads spanning junctions to exons (and possibly count last-exon reads)

count(samfile: str, sam_output: typing.Tuple[[typing.Any, typing.Any, typing.Any, typing.Any], typing.Any] = None) → None

Do the counting of molecules

velocyto.molitem module

class velocyto.molitem.Molitem(gene: velocyto.genes.Gene) → None

Bases: object

Keeps track of the alignments (to intervals of one gene) of all reads corresponding to the same molecule

gene
has_some_spliced_read
ivlhits
mark_hit_ivls(matchivls: typing.List[velocyto.intervals.Interval], read_is_spliced: bool = False) → None

Add info for the alignment(s) of one read

count(my_bcidx: int) → None

Call after all reads have been processed to annotate this molecule on the gene

velocyto.transcript module

class velocyto.transcript.Transcript(trname: str, trid: str, genename: str, geneid: str, chrom: str, strand: str) → None

Bases: object

Object representing a transcript

trname
trid
genename
geneid
chrom
strand
get_chromstrand() → str
add_exon(start: int, end: int, fuse: bool = False) → None
fuse() → None
sorted_exons() → typing.List[typing.Any]
sorted_ivls() → typing.Iterable

Returns sorted intervals of both exons and introns using only the exons
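
One way to picture how introns can be derived from the exons alone is to treat every gap between consecutive sorted exons as an intron. The sketch below does this for plain (start, end) tuples. It is an illustration only: the function name exons_and_introns, the inclusive coordinates, and the assumption that exons do not overlap are not taken from the velocyto source.

```python
from typing import Iterator, List, Tuple


def exons_and_introns(
    exons: List[Tuple[int, int]]
) -> Iterator[Tuple[int, int, str]]:
    """Hypothetical helper: yield (start, end, kind) sorted by position.

    Assumes inclusive coordinates and non-overlapping exons.
    """
    prev_end = None
    for start, end in sorted(exons):
        # the gap between the previous exon and this one is an intron
        if prev_end is not None and start > prev_end + 1:
            yield (prev_end + 1, start - 1, "intron")
        yield (start, end, "exon")
        prev_end = end


ivls = list(exons_and_introns([(300, 400), (100, 200)]))
```
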

spans_over(interval: typing.Tuple[int, int]) → bool
get_start() → int
get_end() → int
get_chrom() → str
get_strand() → str
is_fw() → bool
get_trname() → str
get_genename() → str
get_trid() → str
get_geneid() → str
exons_sorted

velocyto.genes module

class velocyto.genes.Gene(genename: str, geneid: str, chrom: str, strand: str, nbcs: int, genestart: int = 0, geneend: int = 0) → None

Bases: object

genename
geneid
chrom
strand
lastexon_end_pos
spliced_mol_counts
ambiguous_mol_counts
unspliced_mol_counts
deduced_tr_end
start
end
read_start_counts_from_locus_end
get_start() → int
get_end() → int
get_chrom() → str
get_strand() → str
get_spliced_mol_counts(bcidx: int) → int

Molecule counts that are exonic in all transcript models [ordered by barcode index]

get_ambiguous_mol_counts(bcidx: int) → int

Molecule counts that are exonic in some but not all transcript models [ordered by barcode index]

get_unspliced_mol_counts(bcidx: int) → int

Molecule counts that are intronic by all transcript models [ordered by barcode index]
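
The three count categories above can be read as a simple rule over the transcript models of a gene. The sketch below is one possible reading, not the library's code: it assumes each molecule has already been reduced to one boolean per transcript model (True when the molecule is exonic in that model) and that "not exonic" means intronic. The names classify_molecule and tally are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def classify_molecule(exonic_per_model: Sequence[bool]) -> str:
    """Hypothetical rule: label a molecule from its per-model exonic flags."""
    if not exonic_per_model:
        raise ValueError("at least one transcript model is required")
    if all(exonic_per_model):
        return "spliced"      # exonic in all transcript models
    if any(exonic_per_model):
        return "ambiguous"    # exonic in some but not all models
    return "unspliced"        # intronic in all models (assumption: not exonic == intronic)


def tally(
    molecules: Iterable[Tuple[int, Sequence[bool]]], nbcs: int
) -> Dict[str, List[int]]:
    """Accumulate counts per category, ordered by barcode index."""
    counts = {k: [0] * nbcs for k in ("spliced", "ambiguous", "unspliced")}
    for bcidx, exonic_per_model in molecules:
        counts[classify_molecule(exonic_per_model)][bcidx] += 1
    return counts


counts = tally(
    [(0, [True, True]), (0, [True, False]), (1, [False, False]), (1, [True, True])],
    nbcs=2,
)
```
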

set_range(start: int, end: int) → None
set_ivls(ivls: typing.List) → None
num_ivls() → int
get_lastexon_counts() → typing.Tuple[int, int]

Returns the tuple (read count spanning the most 3’ intron-exon junction, read count on the most 3’ exon)

get_lastexon_length() → int
get_last_3prime_exon_interval() → int
validate_intron_ivls(rule: str = 'strict') → None

Annotates the introns that can safely be used for intron molecule counting: either flanking exon is a sure exon and has >= 1 read spanning the junction.

Two different heuristics are supported: “strict” and “permissive”.

add_read_stats(read: velocyto.read.Read) → None
get_deduced_tr_end() → int
get_tr_end() → int
ivlinside_read_counts
ivljunction3_read_counts
ivljunction5_read_counts
ivls

velocyto.intervals module

class velocyto.intervals.Interval(start: str, end: str, gene: velocyto.genes.Gene, ivlidx: int, ivltype: str, is_last3prime: bool) → None

Bases: object

Holds an exon/intron interval read from the interval file

start
end
gene
ivlidx
ivltype
is_last3prime
is_maybe_exon
is_sure_exon
is_sure_intron
is_sure_valid_intron
add_read_inside() → None
add_read_spanning5end() → None
add_read_spanning3end() → None
ends_upstream_of(read: velocyto.read.Read) → bool

The following situation happens

Read

*|||segment|||-?-||segment|||????????

???????|||||Ivl|||||||||*

starts_upstream_of(segment: typing.Tuple[int, int]) → bool

The following situation happens

*||||||segment|||||????????

*|||||||||||||Ivl||||||||||????????????

contains(segment: typing.Tuple[int]) → bool

The following situation happens

||||||segment|||||

|||||||||||||Ivl||||||||||||||||

start_overlaps_with_part_of(segment: typing.Tuple[int, int], minimum_flanking: int = 5) → bool

The following situation happens

—|||segment||—
|||||||||||||Ivl||||||||||||||||

where the dashes on either side of the segment indicate the minimum flanking

end_overlaps_with_part_of(segment: typing.Tuple[int, int], minimum_flanking: int = 5) → bool

The following situation happens

—|||segment||—

|||||||||||||Ivl||||||||||||||||

where the dashes on either side of the segment indicate the minimum flanking
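
The diagrams above can be turned into coordinate comparisons. The sketch below is one plausible interpretation and is not copied from velocyto: it assumes inclusive (start, end) coordinates and reads the minimum flanking as "at least minimum_flanking bases of the segment on each side of the interval boundary". The Ivl type is a hypothetical stand-in for the Interval class.

```python
from typing import NamedTuple, Tuple

Segment = Tuple[int, int]


class Ivl(NamedTuple):
    """Hypothetical stand-in for an interval with inclusive coordinates."""
    start: int
    end: int


def contains(ivl: Ivl, segment: Segment) -> bool:
    # the whole segment lies within the interval
    return ivl.start <= segment[0] and segment[1] <= ivl.end


def start_overlaps_with_part_of(
    ivl: Ivl, segment: Segment, minimum_flanking: int = 5
) -> bool:
    outside = ivl.start - segment[0]      # segment bases before the interval start
    inside = segment[1] - ivl.start + 1   # segment bases at or after the interval start
    return outside >= minimum_flanking and inside >= minimum_flanking


def end_overlaps_with_part_of(
    ivl: Ivl, segment: Segment, minimum_flanking: int = 5
) -> bool:
    inside = ivl.end - segment[0] + 1     # segment bases up to and including the interval end
    outside = segment[1] - ivl.end        # segment bases beyond the interval end
    return inside >= minimum_flanking and outside >= minimum_flanking


exon = Ivl(100, 200)
```

With this reading, a segment that overhangs a boundary by fewer than minimum_flanking bases is not counted as spanning it, which guards against calling a junction from a few mismatched bases.
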

class velocyto.intervals.IntervalsIndex(ivls: typing.List[velocyto.intervals.Interval]) → None

Bases: object

Search helper class used to find the intervals that a read spans

last_interval_not_reached
has_ivls_enclosing(read: velocyto.read.Read) → bool

Determines whether there are intervals that fully contain all the read segments

Parameters: read (vcy.Read) – the read object to be analyzed
Returns: response – whether such an interval has been found
Return type: bool
find_overlapping_ivls(read: velocyto.read.Read) → typing.Tuple[set, typing.Dict[typing.Any, int]]

Finds the overlap between Read and intervals

Parameters: read (vcy.Read) – the read object to be analyzed
Returns:
  • matchgenes (set) – the genes that the read overlaps
  • matchivls (dict: {vcy.Interval: int}) – a dictionary whose keys are the intervals that the read overlaps and whose values give the kind of overlap, one of vcy.MATCH_INSIDE (1), vcy.MATCH_OVER5END (2), vcy.MATCH_OVER3END (4)
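
The three match values listed above are distinct powers of two, so they can be modelled as bit flags. The sketch below mirrors the documented numbers in a local IntFlag and decodes a value into names. The Match class and the kinds helper are hypothetical, and whether velocyto ever combines these values with a bitwise OR is an assumption, since the text above only says each value is one of the three.

```python
from enum import IntFlag
from typing import List


class Match(IntFlag):
    """Local mirror of the documented match constants (hypothetical class)."""
    INSIDE = 1     # documented as vcy.MATCH_INSIDE
    OVER5END = 2   # documented as vcy.MATCH_OVER5END
    OVER3END = 4   # documented as vcy.MATCH_OVER3END


def kinds(value: int) -> List[str]:
    """Return the names of all flags set in value, in definition order."""
    return [flag.name for flag in Match if value & flag]


# Keys would be interval objects in practice; strings are used here for brevity
matchivls = {"ivl_a": 1, "ivl_b": 4}
decoded = {ivl: kinds(value) for ivl, value in matchivls.items()}
```
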

velocyto.read module

class velocyto.read.Read(bc: str, umi: str, chrom: str, strand: str, pos: int, cigar: str, segments: typing.List, clip5: typing.Any, clip3: typing.Any, spliced: bool) → None

Bases: object

Container for reads from sam alignment file

bc
umi
chrom
strand
pos
cigar
segments
clip5
clip3
spliced
is_spliced() → bool
start() → int
end() → int
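
To make the shape of this container concrete, the sketch below rebuilds it as a dataclass with the documented fields. It is a stand-in, not the real class: the name ReadSketch is hypothetical, and deriving start() and end() from the first and last entries of segments is an assumption about how those methods behave.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class ReadSketch:
    """Hypothetical stand-in with the same fields as the documented constructor."""
    bc: str
    umi: str
    chrom: str
    strand: str
    pos: int
    cigar: str
    segments: List[Tuple[int, int]]
    clip5: Any
    clip3: Any
    spliced: bool

    def is_spliced(self) -> bool:
        return self.spliced

    def start(self) -> int:
        # assumption: segments are sorted, so the first one holds the leftmost position
        return self.segments[0][0]

    def end(self) -> int:
        # assumption: the last segment holds the rightmost position
        return self.segments[-1][1]


read = ReadSketch(
    bc="ACGT", umi="TTTT", chrom="chr1", strand="+", pos=1000,
    cigar="10M100N5M", segments=[(1000, 1009), (1110, 1114)],
    clip5=0, clip3=0, spliced=True,
)
```
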