BioDemultiplexer
Documentation for BioDemultiplexer.
BioDemultiplexer.classify_sequences
BioDemultiplexer.divide_fastq
BioDemultiplexer.divide_fastq
BioDemultiplexer.execute_demultiplexing
BioDemultiplexer.find_best_matching_bc
BioDemultiplexer.preprocess_bc_file
BioDemultiplexer.semiglobal_alignment
BioDemultiplexer.semiglobal_alignment
BioDemultiplexer.classify_sequences
— Function
Compare each sequence in the fastqR1 file with the sequences in bcdf, and classify the sequences of the specified file based on that comparison.
BioDemultiplexer.divide_fastq
— Method
Divides a single FASTQ file for parallel processing.
BioDemultiplexer.divide_fastq
— Method
Divides a pair of FASTQ files into smaller parts for parallel processing. It calculates the number of reads per worker and uses the split command to divide the files.
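The reads-per-worker calculation described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper names lines_per_chunk and split_command are hypothetical, the file names are placeholders, and the sketch assumes unwrapped FASTQ records of exactly 4 lines and ceiling division when the reads do not divide evenly among workers.

```julia
# Hypothetical helpers; not part of BioDemultiplexer's documented API.

# Assumes each FASTQ record spans exactly 4 lines, so every chunk
# must contain a multiple of 4 lines to keep records intact.
function lines_per_chunk(total_lines::Integer, workers::Integer)
    workers > 0 || throw(ArgumentError("workers must be positive"))
    total_lines % 4 == 0 ||
        throw(ArgumentError("FASTQ line count must be a multiple of 4"))
    total_reads = total_lines ÷ 4
    reads_per_worker = cld(total_reads, workers)  # ceiling division (assumption)
    return 4 * reads_per_worker
end

# Build (but do not run) a `split` command; `-l` sets the lines per output file.
function split_command(path::AbstractString, prefix::AbstractString, nlines::Integer)
    return `split -l $nlines $path $prefix`
end

n = lines_per_chunk(40, 3)                       # 10 reads over 3 workers
cmd = split_command("R1.fastq", "R1_part_", n)   # placeholder file names
```

For paired files, the same line count would need to be applied to both R1 and R2 so that matching reads land in chunks with the same suffix. Passing cmd to run(cmd) would execute the split.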
BioDemultiplexer.execute_demultiplexing
— Method
Orchestrates the entire demultiplexing process for FASTQ files. Handles the preprocessing, dividing, demultiplexing, and merging of files.
BioDemultiplexer.find_best_matching_bc
— Method
Calculate and compare the similarity of a given sequence seq with the sequences in the given DataFrame bc_df.
Returns
A tuple (max_score_bc, delta), where max_score_bc is the index of the best matching sequence in bc_df, and delta is the difference between the highest and second-highest scores.
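The return value described above can be illustrated with a small sketch. The helper best_and_delta is hypothetical and operates on a plain vector of precomputed scores rather than on a DataFrame; treating the second-highest score as zero when only one barcode exists is also an assumption.

```julia
# Hypothetical helper; not part of BioDemultiplexer's documented API.
# `scores` holds one similarity score per barcode.
function best_and_delta(scores::AbstractVector{<:Real})
    isempty(scores) && throw(ArgumentError("scores must not be empty"))
    best_idx = argmax(scores)
    best = scores[best_idx]
    # Highest score among the remaining barcodes; falls back to zero
    # when there is only one barcode (an assumption of this sketch).
    second = maximum((s for (i, s) in pairs(scores) if i != best_idx);
                     init = zero(best))
    return best_idx, best - second
end

idx, delta = best_and_delta([0.2, 0.9, 0.75])   # idx == 2, delta is about 0.15
```

A small delta means the two best barcodes scored almost equally, so a caller could plausibly use it to flag ambiguous assignments.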
BioDemultiplexer.preprocess_bc_file
— Function
Preprocesses the barcode file by modifying sequences based on specific criteria.
BioDemultiplexer.semiglobal_alignment
— Method
Fast version of the semiglobal_alignment function.
BioDemultiplexer.semiglobal_alignment
— Method
Aligns the query and ref strings using a semiglobal alignment algorithm.
Returns
A similarity score as a float in the range 0 <= similarity_score <= 1, where higher values indicate better alignment.
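To show how a semiglobal alignment can yield a score between 0 and 1, here is one possible formulation. It is a sketch under stated assumptions, not the package's algorithm: the function name semiglobal_similarity is hypothetical, mismatches and gaps each cost 1, unaligned ref characters before and after the query are free, and the score is normalized as 1 minus the edit count divided by the query length.

```julia
# Illustrative only; the unit costs and the normalization are assumptions,
# not BioDemultiplexer's documented scoring scheme.
function semiglobal_similarity(query::AbstractString, ref::AbstractString)
    q = collect(query)
    r = collect(ref)
    m, n = length(q), length(r)
    m == 0 && return 1.0
    # prev[j + 1] = minimum edits needed to align the first i query
    # characters with some substring of ref ending at position j.
    prev = zeros(Int, n + 1)      # row 0: skipping a ref prefix is free
    curr = similar(prev)
    for i in 1:m
        curr[1] = i               # i query characters, no ref consumed
        for j in 1:n
            cost = q[i] == r[j] ? 0 : 1
            curr[j + 1] = min(prev[j] + cost,    # match or mismatch
                              prev[j + 1] + 1,   # query character unmatched
                              curr[j] + 1)       # ref character unmatched
        end
        prev, curr = curr, prev
    end
    best = minimum(prev)          # skipping a ref suffix is free
    return 1 - best / m
end

exact = semiglobal_similarity("ACGT", "TTACGTTT")     # query occurs in ref
one_off = semiglobal_similarity("ACGT", "TTACCTTT")   # one mismatch
```

Because the whole query can always be aligned at a cost of at most its own length, the result stays within the range 0 to 1.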