This data preprocessing pipeline prepares raw sequencing data for analysis through optional read trimming and high-performance alignment. It accepts single-end and paired-end reads in BAM or FASTQ format (compressed or uncompressed) and merges all reads from the same sample into a single file.

Trimming is optional and fully customizable. It is recommended because it removes low-quality or unwanted regions before alignment. Reads are then aligned to the reference genome you specify.
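To illustrate what removing low-quality regions can look like, here is a minimal Python sketch of 3' quality trimming. It is not the pipeline's actual trimming implementation. The function name `trim_3prime`, the default threshold of 20, and the Phred+33 quality encoding are all assumptions made for this example.

```python
from __future__ import annotations


def trim_3prime(
    seq: str,
    qual: str,
    min_quality: int = 20,   # hypothetical default threshold
    phred_offset: int = 33,  # assumes Phred+33 encoded qualities
) -> tuple[str, str]:
    """Remove bases from the 3' end while their quality is below min_quality."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")

    end = len(seq)
    # Walk backwards from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - phred_offset < min_quality:
        end -= 1

    return seq[:end], qual[:end]
```

For example, `trim_3prime("ACGTACGT", "IIIII###")` returns `("ACGTA", "IIIII")`, because `#` encodes a Phred score of 2 and `I` encodes 40. Dedicated trimming tools typically offer more options, such as adapter removal and sliding-window quality checks.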

The pipeline produces sorted and indexed BAM or FASTQ files, along with summary statistics for read length, read quality, and alignment. Output files are fully compatible with downstream workflows such as variant calling, methylation analysis, or genome annotation pipelines.
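As a rough illustration of read length and read quality statistics, the following Python sketch summarizes an uncompressed FASTQ stream. It is not the pipeline's own reporting code. The function name `fastq_summary`, the returned keys, the four-line record layout, and the Phred+33 encoding are assumptions for this example.

```python
import io
from statistics import mean


def fastq_summary(handle, phred_offset=33):
    """Summarize read count, mean length, and mean per-read quality.

    Assumes four-line FASTQ records and Phred+33 quality encoding.
    """
    lengths = []
    read_qualities = []

    while True:
        header = handle.readline()
        if not header:          # end of file
            break
        if not header.strip():  # skip stray blank lines
            continue
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().strip()

        lengths.append(len(seq))
        if qual:
            # Arithmetic mean of Phred scores: a simplification.
            read_qualities.append(mean(ord(c) - phred_offset for c in qual))

    return {
        "reads": len(lengths),
        "mean_length": mean(lengths) if lengths else 0,
        "mean_read_quality": mean(read_qualities) if read_qualities else 0,
    }


# Usage with an in-memory FASTQ containing two reads.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nAC\n+\n##\n")
summary = fastq_summary(example)
```

For compressed input, the same function could be given a handle from `gzip.open(path, "rt")`. Averaging Phred scores directly is a common shortcut, but because the scale is logarithmic it overstates quality compared with averaging error probabilities.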