Cyprinidae Expression Database

CyExpDB Help & Documentation

A comprehensive guide to exploring and interpreting gene expression data across Cyprinidae species.

CyExpDB is an expression atlas covering five major Cyprinidae species: Cyprinus carpio, Labeo rohita, Carassius gibelio, Carassius auratus, and Ctenopharyngodon idella. It integrates 1,582 RNA-seq samples from 190 BioProjects spanning 107 tissues and cell lines, providing harmonized, tissue-resolved expression profiles for comparative functional genomics and cross-species expression analysis.

All RNA-seq datasets integrated into CyExpDB were processed using a standardized workflow to ensure comparability and reproducibility across species. The pipeline includes quality assessment, read alignment, transcript quantification, normalization, and ortholog identification.

🧪 RNA-seq Data Processing
  1. Quality Assessment: Raw reads were evaluated using FastQC to assess sequence quality and GC distribution.
  2. Alignment: Reads were aligned to reference genomes (Cyprinus carpio: GCF_018340385.1, Labeo rohita: GCF_022985175.1, Carassius gibelio: GCF_023724105.1, Carassius auratus: GCF_003368295.1, and Ctenopharyngodon idella: GCF_019924925.1) using HISAT2 (v2.2.1).
  3. Quantification: Transcript abundance was calculated using StringTie (v2.1.4) to obtain TPM and FPKM values for annotated genes.
  4. Normalization: Expression matrices were log2-transformed and quantile-normalized across samples to reduce inter-project variation.
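
The quantification and normalization steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the pipeline's actual code: the function names, the pseudocount of 1, and the choice to normalize TPM rather than FPKM are all assumptions, since the documentation does not state them. The toy values are made up.

```python
import numpy as np


def fpkm_to_tpm(fpkm):
    """Rescale each sample (column) of a genes x samples FPKM matrix to sum to 1e6."""
    fpkm = np.asarray(fpkm, dtype=float)
    # Assumes every sample has a non-zero FPKM total.
    return fpkm / fpkm.sum(axis=0, keepdims=True) * 1e6


def log2_quantile_normalize(expr, pseudocount=1.0):
    """Apply log2(x + pseudocount), then map every sample onto one shared distribution."""
    logged = np.log2(np.asarray(expr, dtype=float) + pseudocount)
    # Rank of each gene within its own sample; ties are broken by position
    # (a fuller implementation would average tied ranks).
    ranks = np.argsort(np.argsort(logged, axis=0), axis=0)
    # Reference distribution: mean across samples of the sorted values at each rank.
    reference = np.sort(logged, axis=0).mean(axis=1)
    return reference[ranks]


# Toy matrix: 3 genes (rows) x 2 samples (columns); values are made up.
toy_fpkm = np.array([[10.0, 20.0], [30.0, 20.0], [60.0, 160.0]])
toy_tpm = fpkm_to_tpm(toy_fpkm)
toy_norm = log2_quantile_normalize(toy_tpm)
```

After quantile normalization every sample has the same set of values, differing only in which gene holds which value. This is what removes sample-wide shifts between projects.
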
📊 Quality Control (QC)

Samples with extremely low mapping rates or read depth were excluded. The retained samples were evaluated for mapping efficiency, read-depth distribution, and sample-to-sample correlation. Principal component analysis (PCA) showed that tissue identity explained most of the variance, indicating that batch effects were minor.
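
As a rough illustration of the correlation and PCA checks described above, the sketch below computes sample-to-sample Pearson correlations and projects samples onto principal components using plain NumPy. The function names and the toy matrix are hypothetical; the documentation does not say which tools or settings were used for these checks.

```python
import numpy as np


def sample_correlations(expr):
    """Pearson correlation between samples (columns) of a genes x samples matrix."""
    return np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)


def pca_scores(expr, n_components=2):
    """Return per-sample PC scores and the fraction of variance each PC explains."""
    samples = np.asarray(expr, dtype=float).T        # samples x genes
    centered = samples - samples.mean(axis=0)        # center each gene
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u[:, :n_components] * s[:n_components], explained[:n_components]


# Toy log-expression matrix (made-up values): 5 genes x 4 samples,
# where samples 0-1 mimic one tissue and samples 2-3 another.
toy_expr = np.array([
    [9.0, 8.8, 1.0, 1.2],
    [8.5, 8.7, 0.9, 1.1],
    [1.0, 1.3, 7.9, 8.1],
    [0.8, 1.0, 8.2, 8.0],
    [5.0, 5.1, 5.0, 4.9],
])
toy_corr = sample_correlations(toy_expr)
toy_scores, toy_explained = pca_scores(toy_expr)
```

If tissue drives the variance, samples from the same tissue should correlate more strongly with each other and fall on the same side of the first principal component, regardless of BioProject.
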

🧬 Ortholog Identification

Orthologous genes among the five Cyprinidae species were identified using OrthoFinder with default parameters. Orthogroups were defined based on all-against-all sequence similarity comparisons using RefSeq annotations. Each orthogroup is a set of genes inferred to descend from a single gene in the last common ancestor of the species compared, and may contain more than one gene per species.
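
For readers working with OrthoFinder output directly, the sketch below reads a table laid out like OrthoFinder's Orthogroups.tsv (tab-separated, one column per species, genes within a cell separated by commas) and looks up the orthogroup containing a given gene. The helper names and species column labels are hypothetical, and the example row simply mirrors the lbx1a example on this page; real OrthoFinder files list sequence identifiers from the input FASTA files.

```python
import csv
import io


def load_orthogroups(handle):
    """Read an Orthogroups.tsv-style table into {orthogroup: {species: [genes]}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    groups = {}
    for row in reader:
        og = row.pop("Orthogroup")
        groups[og] = {
            species: [g.strip() for g in (cell or "").split(",") if g.strip()]
            for species, cell in row.items()
        }
    return groups


def find_orthogroup(groups, species, gene):
    """Return (orthogroup ID, members) for a gene, or None if it is not found."""
    for og, members in groups.items():
        if gene in members.get(species, []):
            return og, members
    return None


# Illustrative one-row table; column labels are assumptions.
header = ["Orthogroup", "Cyprinus_carpio", "Carassius_auratus", "Labeo_rohita"]
row = ["OG0018250", "lbx1a", "LOC113056838, LOC113112639", "lbx1b"]
example_tsv = "\t".join(header) + "\n" + "\t".join(row) + "\n"

groups = load_orthogroups(io.StringIO(example_tsv))
hit = find_orthogroup(groups, "Cyprinus_carpio", "lbx1a")
```
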

🧫 Tissue-Specific Gene Identification

The τ index (tau) was used to quantify the tissue specificity of each gene, computed with the tispec R package. τ ranges from 0 (ubiquitous expression) to 1 (highly specific); genes with τ ≥ 0.85 were classified as tissue-specific.
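
For reference, τ is commonly defined for a gene with expression values x_1 ... x_n across n tissues as τ = Σ (1 − x_i / x_max) / (n − 1). The sketch below implements that standard definition in Python as an illustration only; it is not the tispec code, and tispec may apply additional preprocessing before computing τ. The function names are hypothetical.

```python
import numpy as np

TAU_CUTOFF = 0.85  # threshold stated in this documentation


def tau_index(expression):
    """Tau for one gene, given its non-negative expression across tissues."""
    x = np.asarray(expression, dtype=float)
    if x.ndim != 1 or x.size < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = x.max()
    if peak <= 0:
        return float("nan")  # gene not expressed anywhere; tau is undefined
    return float(np.sum(1.0 - x / peak) / (x.size - 1))


def is_tissue_specific(expression, cutoff=TAU_CUTOFF):
    """True when tau meets or exceeds the cutoff."""
    return tau_index(expression) >= cutoff
```

A gene expressed equally everywhere, such as `tau_index([5, 5, 5, 5])`, scores 0; a gene detected in a single tissue, such as `tau_index([0, 0, 0, 8])`, scores 1.
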

🧭 Step-by-Step Navigation
  1. From the top navigation bar, open the Search dropdown menu and select your species of interest (e.g., Cyprinus carpio or Labeo rohita).
  2. The Species Overview page will open, displaying genome statistics and data download options. Scroll down to the Select Expression Data section.
  3. Choose the desired tissue or cell line (e.g., Brain, Liver, or Gill), select Coding or Non-coding gene type, and click Submit.
  4. After submission, the Select Criteria for Result Display page will appear. This page provides multiple options to explore gene expression data:
    • 🧬 Gene Information: view general details of genes within the selected tissue.
    • 🧪 Sample Details: view BioProject information, experimental conditions, and sample metadata.
    • 📊 Expression (FPKM): visualize normalized gene expression levels in FPKM units.
    • 📈 Expression (TPM): visualize expression levels in TPM units, which are more directly comparable across samples.
    • 🧩 Tissue-Specific Genes: list genes classified by tissue specificity, computed using the τ index (tau).
  5. On the Tissue-Specific Gene page, browse or search for genes within the selected tissue. The table displays each Gene ID, its τ score, and corresponding Specificity Class (High, Intermediate, or Low).
  6. To examine gene conservation across species, open the Ortholog Browser from the Search menu. Enter the gene name (e.g., abcg8) and select the species. The Ortholog Browser will display the orthogroup and orthologous genes across all five Cyprinidae species.
🧪 Example Workflow: Identify Liver-Specific Genes in Cyprinus carpio
  1. Go to Search → Cyprinus carpio.
  2. In the Select Expression Data panel, choose Liver and select Coding genes.
  3. Click Submit and then select View Tissue-Specific Genes.
  4. Genes with τ ≥ 0.85 are identified as liver-specific.
  5. Enter a gene name (e.g., madd or abcg8) in the Ortholog Browser to find its orthologs in L. rohita, C. idella, and other species.
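
The filtering in step 4 can also be reproduced offline. The sketch below assumes the tissue-specific gene table has been saved as a CSV file with columns named gene_id, tau, and tissue; those column names, the function name, and the placeholder gene IDs and scores are hypothetical and are not taken from CyExpDB.

```python
import csv
import io

TAU_CUTOFF = 0.85  # threshold stated in this documentation


def tissue_specific_genes(handle, tissue, cutoff=TAU_CUTOFF):
    """Return (gene_id, tau) pairs for one tissue with tau >= cutoff, highest first."""
    reader = csv.DictReader(handle)
    hits = [
        (row["gene_id"], float(row["tau"]))
        for row in reader
        if row["tissue"].lower() == tissue.lower() and float(row["tau"]) >= cutoff
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Placeholder table; gene IDs and scores are invented for illustration.
example_csv = (
    "gene_id,tau,tissue\n"
    "geneA,0.91,Liver\n"
    "geneB,0.40,Liver\n"
    "geneC,0.88,Brain\n"
    "geneD,0.85,Liver\n"
)
liver_hits = tissue_specific_genes(io.StringIO(example_csv), "liver")
```

Here `liver_hits` contains geneA and geneD: geneB falls below the cutoff and geneC is specific to a different tissue.
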
🧬 Example Workflow: Cross-Species Ortholog Analysis
  1. Open the Ortholog Browser from the Search menu.
  2. Select Cyprinus carpio as the source species.
  3. Enter a gene name (e.g., abcg8).
  4. Click Search to view orthologs across Carassius auratus, Ctenopharyngodon idella, Carassius gibelio, and Labeo rohita.
  5. The result table displays the orthogroup ID and corresponding gene names across all species.
🧠 Example: Conserved Brain-Specific Expression of lbx1a Orthologs

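The lbx1a gene (Orthogroup OG0018250) exhibited highly specific expression in the brain of Cyprinus carpio (τ = 0.875). Its orthologs in Carassius auratus and Labeo rohita were likewise classified as highly brain-specific, whereas those in Carassius gibelio (τ = 0.501) and Ctenopharyngodon idella (τ = 0.259) showed intermediate specificity with brain as the predominant tissue, a pattern consistent with a conserved neural function.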

Species                 | Gene ID      | τ Score | Specificity Class | Predominant Tissue
Cyprinus carpio         | lbx1a        | 0.875   | Highly Specific   | Brain
Carassius auratus       | LOC113056838 | 0.926   | Highly Specific   | Brain
Carassius auratus       | LOC113112639 | 0.814   | Highly Specific   | Brain
Labeo rohita            | lbx1b        | 0.875   | Highly Specific   | Brain
Carassius gibelio       | lbx1a        | 0.501   | Intermediate      | Brain
Ctenopharyngodon idella | lbx1a        | 0.259   | Intermediate      | Brain

These findings suggest that brain-predominant expression of lbx1a and its orthologs is conserved across the five species, consistent with their known roles in vertebrate neural development and differentiation.

For queries or collaboration, contact:

Please cite CyExpDB when using data, figures, or analyses from this resource.