CyExpDB is a centralized expression resource that integrates transcriptome-level gene expression data from five major Cyprinidae species: Cyprinus carpio, Carassius gibelio, Carassius auratus, Labeo rohita, and Ctenopharyngodon idella. It supports comparative transcriptomic analyses by providing curated RNA-seq datasets, normalized expression matrices, and tissue-specific gene profiles across diverse biological contexts.
The database encompasses 1,582 RNA-seq samples collected from 190 publicly available BioProjects, representing 107 distinct tissues and developmental stages. To ensure high data quality, raw reads were processed through a standardized pipeline: quality assessment with FastQC, adapter and low-quality read trimming using Trimmomatic, alignment to species-specific reference genomes via HISAT2, and transcript assembly and quantification using StringTie.
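As an illustration of how such a pipeline fits together, the sketch below assembles the command lines for one paired-end sample. It is a minimal example under stated assumptions, not the CyExpDB implementation: the file names, trimming thresholds, thread count, the `trimmomatic` wrapper executable, and the intermediate `samtools sort` step are not described in the text above and are hypothetical choices made only for illustration.

```python
import shlex
from pathlib import Path


def build_pipeline_commands(sample, r1, r2, hisat2_index, annotation_gtf,
                            outdir="results", threads=4, adapters="adapters.fa"):
    """Return the per-sample command lines as lists of arguments.

    Nothing is executed here; the caller decides how to run the commands.
    All paths, thresholds, and the samtools step are illustrative assumptions.
    """
    out = Path(outdir) / sample
    p1 = str(out / f"{sample}_1.paired.fq.gz")
    u1 = str(out / f"{sample}_1.unpaired.fq.gz")
    p2 = str(out / f"{sample}_2.paired.fq.gz")
    u2 = str(out / f"{sample}_2.unpaired.fq.gz")
    sam = str(out / f"{sample}.sam")
    bam = str(out / f"{sample}.sorted.bam")
    gtf = str(out / f"{sample}.gtf")
    abund = str(out / f"{sample}.gene_abund.tab")
    t = str(threads)

    return [
        # 1. Quality assessment of the raw reads (output directory must exist).
        ["fastqc", "-o", str(out), r1, r2],
        # 2. Adapter and low-quality trimming (thresholds are assumed values).
        ["trimmomatic", "PE", "-threads", t, r1, r2, p1, u1, p2, u2,
         f"ILLUMINACLIP:{adapters}:2:30:10", "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # 3. Spliced alignment; --dta tailors output for transcript assemblers.
        ["hisat2", "-p", t, "--dta", "-x", hisat2_index,
         "-1", p1, "-2", p2, "-S", sam],
        # 4. Coordinate sorting (assumed step; StringTie needs sorted input).
        ["samtools", "sort", "-@", t, "-o", bam, sam],
        # 5. Transcript assembly and quantification guided by the annotation.
        ["stringtie", bam, "-p", t, "-G", annotation_gtf,
         "-o", gtf, "-A", abund],
    ]


if __name__ == "__main__":
    # Hypothetical sample and reference names, used only to print the commands.
    for cmd in build_pipeline_commands("sample01", "sample01_1.fq.gz",
                                       "sample01_2.fq.gz", "index/ccarpio",
                                       "ccarpio.gtf"):
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps the sketch testable without the tools installed; each list could be passed to `subprocess.run(cmd, check=True)` once the output directory has been created.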
CPC2 was used to estimate coding potential and thereby distinguish coding from non-coding transcripts. Tissue specificity was quantified using the τ (tau) index, which ranges from 0 (uniform expression across tissues) to 1 (expression confined to a single tissue), allowing classification of genes into absolute, highly specific, intermediate, and broadly expressed categories.
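For reference, the τ index for a gene with expression values x_1 ... x_n across n tissues is commonly defined as τ = Σ (1 - x_i / x_max) / (n - 1), where x_max is the highest value among the tissues. The sketch below computes it and assigns a category. The cutoffs (0.85 and 0.2) are hypothetical values chosen for illustration, since the thresholds used by CyExpDB are not stated here, and the input is assumed to be already normalized expression values (for example TPM or log-transformed TPM).

```python
def tau_index(expression):
    """Compute the tau tissue-specificity index for one gene.

    `expression` holds one non-negative normalized value per tissue.
    Returns None when the gene is not expressed in any tissue.
    """
    values = [max(float(v), 0.0) for v in expression]
    n = len(values)
    if n < 2:
        raise ValueError("tau requires expression values from at least two tissues")
    x_max = max(values)
    if x_max == 0.0:
        return None  # tau is undefined for a gene with no expression
    return sum(1.0 - v / x_max for v in values) / (n - 1)


def classify_tau(tau, high=0.85, low=0.2):
    """Map a tau value to a category; `high` and `low` are assumed cutoffs."""
    if tau is None:
        return "not expressed"
    if tau >= 1.0 - 1e-9:
        return "absolute"
    if tau >= high:
        return "highly specific"
    if tau > low:
        return "intermediate"
    return "broadly expressed"


if __name__ == "__main__":
    # Hypothetical expression profiles across four tissues.
    profiles = {
        "gene_a": [0.0, 0.0, 10.0, 0.0],
        "gene_b": [5.0, 5.0, 5.0, 5.0],
        "gene_c": [10.0, 5.0, 0.0, 5.0],
    }
    for gene, values in profiles.items():
        tau = tau_index(values)
        print(gene, tau, classify_tau(tau))
```

Returning `None` for unexpressed genes, rather than 0, keeps them from being counted as broadly expressed.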
Functional annotation was performed using Blast2GO and KEGG pathway mapping, facilitating exploration of the biological relevance of tissue-enriched genes. Together, these analyses provide an integrated framework for understanding gene expression diversity, tissue specialization, and evolutionary conservation within the Cyprinidae family.
CyExpDB is designed to support functional genomics, molecular breeding, and evolutionary studies within Cyprinidae, and to serve as a starting point for downstream research in fish biology. Its web interface lets users browse, search, and download data for further analysis.