- [Feature]: Add an option to specify the data type of the loom file layers
- [Bug]: Avoid truncation or overflow when the count per cell per gene exceeds 65535 (max of uint16); this is very rare, but it can happen in deeply sampled SmartSeq2 data
- [Bug]: Avoid an error with older versions of cellranger that do not output the _log file
- [Feature]: Be more forgiving about the requirements of the gtf file: gene_name, transcript_name and exon_number are no longer required fields
- [Support]: Add a section to the docs that better explains what the bam and gtf files are expected to contain
- [Feature]: Add filter_genes_by_phase_portrait as the main API to filter genes, supporting a versatile set of conditions
- [Bug]: Fix an error when aligners report consecutive insertion/deletion instead of mismatch
- [Feature]: Improved compatibility with InDrops and STRT through the --umi-extentions option. It allows the same pipeline to be applied to methods with a short molecular barcode that cannot be used to call a unique molecule without the gene mapping information.
- [Feature]: Support barcodes of different length
- [Feature]: Support different verbosity levels
- The --without-umi option allows analyzing UMI-less data such as SmartSeq2
- [Feature]: Generalized logic to include more layers than just Spliced, Unspliced, Ambiguous
- [Feature]: Support multiple bam files as input; the file(s) can be interpreted either as one-file-one-cell or as batches to be analyzed together. IMPORTANT: a cell cannot be distributed over different bam files!
- [Feature]: It supports SmartSeq2 and has a new run_smartseq2 command
- [Feature]: Improve the debug molecular report option to support hdf5
- [Feature]: Add the possibility to constrain knn averaging: when turned on, it avoids edges between cells of specified groups
- [Feature]: Add size factor normalization option
- [Feature]: Add optional feature selection for the unspliced
- [Feature]: Automatically run a randomized (negative) control for the velocity. Added plotting options for visualizing the randomized control
- [Feature]: Change behavior in no-barcode-list mode: use a very permissive heuristic of < 80 molecules per cell as the threshold to discard empty droplets / no-cell events
- [Feature]: Add cosine projection penalty
- [Feature]: Add the set of subcommands velocyto tools to bridge velocyto with other software (for now DropEst)
- [Feature]: Add subcommand run_dropest as a shortcut to run on dropEst preprocessed data (including InDrops and DropSeq)
- [Bug]: Fix an error in filter_cells: colors array is now filtered as well
- [Bug]: Fix colormap bug with matplotlib 2.2.0
- [Bug]: Fix a skip repeat error with SmartSeq2 pipeline
- The --multimap option was removed because it could have yielded incorrect results depending on the output format chosen for the aligner
- [Support]: Deprecation warning on default functions: they were being misused by the users.
- [Bug]: Fix a bug that caused extremely slow runtimes when the input bam was not position sorted. Now velocyto will raise an error and ask the user to sort the file using samtools.
- [Support]: Improve the changelog structure
- [Bug]: A change in slicing related to an API change of __getattr__ in loompy2
- [Bug]: Catch another error due to the API change of .create in loompy2
- [Bug]: Fix an incompatibility with loompy2 related to column and row attributes changing from dict to an object
- [Bug]: Catch error due to the API change of .create in loompy2
- [Feature]: Sample metadata file can be specified with different csv formats (the format will be determined automatically)
- [Feature]: Improve documentation: remove information about sorting .gtf files. This procedure is not needed anymore.
- [Feature]: The CLI no longer requires presorting the gtf files. To reduce the possibility of incorrect usage, .gtf file sorting is now performed in memory (and not saved).
- [Bug]: Sometimes velocyto failed to detect and warn the user that the .gtf genome annotation file was not sorted; this could have caused undetected errors in the analysis. If you ran velocyto without sorting the .gtf, we suggest rerunning.
- [Bug] #40: Error in hdf5 serialization when using cluster label as object array is now fixed
- [Feature]: The pipeline now considers individually all the possible transcript models that could be supported by a set of reads and then decides on the spliced/unspliced/ambiguous count.
- [Feature]: Support different Logic levels
- [Feature]: Changelog added to the doc
- [Feature]: Make the CLI simpler by removing the extract interval step. The source .gtf files can now be provided directly; they should be provided sorted using sort -k1,1 -k7,7 -k4,4n -o [OUTFILE] [INFILE]
- [Feature]: Large parts of the documentation rewritten to match the changes in API
- [Feature]: Remove the subcommand
- [Feature]: Add possibility to export pickle containing information of every molecule
- [Bug] #31: Memory usage bug should be solved.
- [Bug]: Many small bug fixes
- [Bug]: Incorrect 0-based indexing for splicing junction corrected (was not causing problems because buffered by MIN_FLANK)
- [Support]: Update the documentation for the new CLI
- [Bug]: fix a bug with ambiguous molecules counting and version bump
- [Bug]: The debug and sampleid option had the same short flag -d
- [Feature]: further ~5x speedup of cython functions making them 100% C and using malloc instead of memory views
- [Feature]: Add support for DropSeq pipelines where the barcode flags in the bam file are XC and XM instead of CB and UB
- [Bug]: Using sphinx 1.7 sorts the autodoc API correctly
- [Feature]: Improve the docs
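
The ordering requested by the sort -k1,1 -k7,7 -k4,4n command quoted in the entries above (chromosome, then strand, then numeric start position) can be sketched in Python as follows. This is only an illustration of that ordering on tab-separated GTF lines: the function name sort_gtf_lines and the example records are hypothetical, and this is not claimed to be how velocyto performs its in-memory sorting.

```python
from typing import Iterable, List, Tuple


def sort_gtf_lines(lines: Iterable[str]) -> List[str]:
    """Order GTF lines by chromosome (col 1), strand (col 7), start (col 4).

    Hypothetical helper mirroring `sort -k1,1 -k7,7 -k4,4n`; comment lines
    starting with '#' are kept at the top in their original order.
    """
    lines = list(lines)

    def key(line: str) -> Tuple[str, str, int]:
        fields = line.rstrip("\n").split("\t")
        # Column 4 is compared as an integer, like the `n` modifier in sort.
        return (fields[0], fields[6], int(fields[3]))

    headers = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    return headers + sorted(records, key=key)


# Made-up records, used only to show the resulting order.
example = [
    'chr2\tsrc\texon\t500\t600\t.\t+\t.\tgene_id "g3";',
    'chr1\tsrc\texon\t300\t400\t.\t-\t.\tgene_id "g2";',
    'chr1\tsrc\texon\t1000\t1100\t.\t+\t.\tgene_id "g1";',
    'chr1\tsrc\texon\t200\t250\t.\t+\t.\tgene_id "g1";',
]
sorted_lines = sort_gtf_lines(example)
for line in sorted_lines:
    print(line)
```

Comparing the start column numerically matters: a plain string comparison would place 1000 before 200. Tie-breaking and locale handling may differ from the sort utility, so the sketch should not be treated as a drop-in replacement.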