Novel SNPs in Diabetes: Genetic Diversity and Haplotype Analysis

Last Updated: January 8, 2026

The discovery of novel SNPs in diabetes-related genes within specific ethnic populations challenges the “one-size-fits-all” approach to genetic medicine. While global databases like dbSNP catalog millions of variants, this thesis reports that a study cohort from Pune, India, carries variants in the Nrf2 and FoxO1 genes that are not yet recorded in international repositories.

  • Genetic Uniqueness: The study identified 8 novel variants in Nrf2 and 8 in FoxO1 specific to the study group.
  • Population Structure: Admixture analysis confirmed the mixed ancestry of the population, crucial for validating genetic data.
  • Haplotype Complexity: Network analysis visualized distinct evolutionary branches between diabetic and control groups.
  • In-Silico Prediction: Computational modeling suggests some of these novel mutations may destabilize essential transcription factor proteins.

ASSOCIATION OF SINGLE NUCLEOTIDE POLYMORPHISM IN TRANSCRIPTION FACTORS MODULATING ANTIOXIDANT DEFENSE WITH OXIDATIVE STRESS PROFILE IN DIABETIC PATIENTS

Admixture and Population Structure Analysis

Before novel SNPs can be linked to diabetes, the genetic background of the study participants must be understood. Genetic association studies are easily skewed when the population has hidden substructure (stratification). To guard against this, the thesis used the STRUCTURE software to analyze the ancestry and admixture of the 188 subjects (98 diabetic, 90 control) recruited in Pune, India.

“The estimated mean alpha – value was computed to be +0.2765, suggesting that study subjects belonged to an admixed population, i.e., mixed ancestry and did not share a common ancestor” (Kadam, 2022, p. 61).

The analysis indicated that the study subjects belonged to an admixed population: the gene pool is diverse, resulting from interbreeding between previously isolated populations over generations. By establishing this baseline with the Admixture Model (K=2), the researcher could argue that differences in allele frequencies between groups were more likely due to the disease phenotype (diabetes) than to ancestry differences between cases and controls. This step is critical in population genetics for ruling out “false positives”, where a genetic marker appears associated with a disease but is actually just a marker of a specific ancestry.
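To make the “false positive” risk concrete, here is a minimal sketch in Python. The counts are hypothetical and are not taken from the thesis; the group names and the `odds_ratio` helper are illustrative assumptions. Within each ancestry group the allele is equally common in cases and controls, yet pooling the groups produces an apparent association.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    case_odds = case_carriers / (case_total - case_carriers)
    control_odds = control_carriers / (control_total - control_carriers)
    return case_odds / control_odds

# Hypothetical (carriers, total) counts per ancestry group; NOT thesis data.
strata = {
    "ancestry_A": {"cases": (64, 80), "controls": (16, 20)},  # 80% carriers in both
    "ancestry_B": {"cases": (4, 20), "controls": (16, 80)},   # 20% carriers in both
}

# Within each stratum the allele is unrelated to disease (odds ratio = 1.0).
within = {
    name: odds_ratio(*group["cases"], *group["controls"])
    for name, group in strata.items()
}

# Pooling ignores ancestry: cases come mostly from A, controls mostly from B.
pooled_cases = tuple(sum(g["cases"][i] for g in strata.values()) for i in range(2))
pooled_controls = tuple(sum(g["controls"][i] for g in strata.values()) for i in range(2))
pooled = odds_ratio(*pooled_cases, *pooled_controls)

print(within)   # both strata: 1.0
print(pooled)   # about 4.52, a spurious association created by stratification
```

The allele here only marks ancestry, which is exactly the situation an admixture analysis is meant to detect before allele frequencies are compared.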

Student Note: Admixture refers to the breeding between two or more previously isolated populations, resulting in a new genetic lineup that influences haplotype frequency.

Analysis Type     | Software Used  | Parameter      | Result/Observation
Admixture Model   | STRUCTURE v2.0 | Alpha Value    | +0.2765 (Mixed Ancestry)
Genetic Diversity | Arlequin v3.5  | Tajima’s D     | Negative (Population Expansion)
Haplotype Net     | Network 10.2   | Median Vectors | High complexity (mv=70 for HO-1)

Fig: Summary of Population Genetics and Diversity Analysis tools and key results. Adapted from Kadam (2022).

Professor’s Insight: Always check for population stratification in genetic studies; ignoring admixture is a common reason why replication studies fail to confirm reported genetic risk factors.

Discovery of Novel Variants in Nrf2 and FoxO1

One of the most significant contributions of this thesis is the identification of genetic variations not previously recorded in public databases. By sequencing the Nrf2 and FoxO1 genes and comparing the results against the NCBI dbSNP database, the researcher identified candidate novel variants. In the Nrf2 gene, 8 of the 32 variations detected (25%) were novel, meaning absent from the database. Similarly, for FoxO1, 8 of the 34 detected variations (about 24%) were novel.

“Out of eight novel variations, three variations were found to be novel SNPs, and five were novel SNVs… Out of five SNVs, three SNVs were found to be located in the exonic and two in the intronic region” (Kadam, 2022, p. 71).

These findings are potentially significant because some of the variants fall in functional regions. For Nrf2, novel variants were found in the exonic (coding) regions. For example, a novel missense SNP was identified at position chr2:177230840, which results in an amino acid change from Aspartate (D) to Glycine (G) at position 588 (D588G). Unlike synonymous mutations, which leave the amino acid sequence unchanged, this change alters the protein’s chemical properties. The discovery of these variants suggests that the Indian population may have distinct genetic risk factors or protective mechanisms for diabetes that are not captured in studies conducted on European or American populations.
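The difference between a missense and a synonymous change can be sketched in a few lines of Python. The thesis summary does not state the actual codon at position 588, so the codon `GAT` below is an assumption chosen only because it encodes Aspartate; the codon table is deliberately partial and the function name is illustrative.

```python
# Partial standard codon table: only the codons needed for this illustration.
CODON_TABLE = {
    "GAT": "D", "GAC": "D",                          # Aspartate
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # Glycine
}

def classify_substitution(codon, position, new_base):
    """Apply a single-base change to a codon and report its effect.

    position is 0-based within the codon (0, 1, or 2).
    """
    mutant = codon[:position] + new_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutant]
    kind = "synonymous" if ref_aa == alt_aa else "missense"
    return mutant, ref_aa, alt_aa, kind

# Assumed codon GAT (Asp); an A>G change at the middle base gives GGT (Gly).
print(classify_substitution("GAT", 1, "G"))  # ('GGT', 'D', 'G', 'missense')

# A T>C change at the third base gives GAC, which still encodes Asp.
print(classify_substitution("GAT", 2, "C"))  # ('GAC', 'D', 'D', 'synonymous')
```

A change at the second codon position usually alters the amino acid, while many third-position changes are silent, which is why the same gene can carry both kinds of variant.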

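Student Note: As used here, an SNV (Single Nucleotide Variant) is similar to an SNP but occurs at a lower frequency (<1%) in the population, often representing a newer or rare mutation. More generally, SNV can refer to any single-nucleotide change regardless of frequency.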
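The two-step logic described in this section (check a variant against a database, then label it by frequency) can be sketched as follows. The site names, frequencies, and the `known_positions` set are hypothetical stand-ins for real coordinates and a real dbSNP lookup; only the 1% threshold comes from the text.

```python
def classify_variants(detected, known_positions, threshold=0.01):
    """Label each detected variant as known/novel and as SNP/SNV.

    detected: dict mapping a variant identifier to its allele frequency
              in the study cohort.
    known_positions: set of identifiers already present in the reference
              database (a stand-in for a dbSNP query).
    """
    results = []
    for position, frequency in detected.items():
        status = "known" if position in known_positions else "novel"
        kind = "SNP" if frequency > threshold else "SNV"
        results.append((position, status, kind))
    return results

# Hypothetical data, NOT thesis results.
detected = {"site_1": 0.12, "site_2": 0.004, "site_3": 0.03, "site_4": 0.008}
known_positions = {"site_1", "site_4"}

results = classify_variants(detected, known_positions)
novel = [r for r in results if r[1] == "novel"]

for row in results:
    print(row)
```

In a real pipeline the database comparison would be done against genomic coordinates and alleles rather than simple labels, so treat this as a sketch of the bookkeeping only.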

Gene  | Novel Variant Location | Type       | Mutation Effect    | Prediction
Nrf2  | chr2:177230840         | Exonic SNP | D588G (Asp -> Gly) | Probably Damaging
Nrf2  | chr2:177234100         | Exonic SNV | Q73K (Gln -> Lys)  | Probably Damaging
Nrf2  | chr2:177234148         | Exonic SNV | E57K (Glu -> Lys)  | Benign
FoxO1 | chr13:40666055         | Exonic SNV | D53A (Asp -> Ala)  | Benign

Fig: Characteristics of selected Novel SNPs and SNVs identified in the study. Adapted from Kadam (2022).

Professor’s Insight: The distinction between “Benign” and “Damaging” in novel variants is hypothetical until validated; however, a “Damaging” prediction in a transcription factor domain is a high-priority target for future research.

Haplotype Diversity and Network Analysis

Beyond single point mutations, the thesis also examined haplotypes: clusters of genetic variants that are inherited together. Using the Median-Joining (MJ) network algorithm, the researcher constructed haplotype networks to visualize how these genetic patterns diverged between diabetic patients and healthy controls. This approach provides a “bird’s eye view” of genetic evolution and selection pressure within the population.

“Of the, 55 haplotypes observed in the disease group, two haplotypes have the highest frequencies (15 and 11) showing large nodes… the control group has more number of unique haplotypes (45) than diabetes group” (Kadam, 2022, p. 63).

The network plots revealed that the control group possessed a higher number of unique haplotypes than the diabetic group. In evolutionary biology, a reduction in haplotype diversity (as seen in the diabetic group) can sometimes indicate a “bottleneck” or selective pressure where only certain genetic combinations survive or proliferate. Additionally, the study calculated Tajima’s D, a statistic used to test for departures from neutral evolution. The negative values obtained for Nrf2 and FoxO1 reflect an excess of rare variants, a pattern consistent with recent population expansion, a recent selective sweep (positive selection), or purifying selection that keeps deleterious alleles at low frequency. Tajima’s D alone cannot distinguish between these explanations.
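For readers who want to see what the statistic measures, below is a minimal sketch of Tajima’s D using the standard published formula. The thesis used Arlequin for this calculation; the function here is an independent illustration, and the toy alignments are hypothetical, not thesis data.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (no gaps or missing data)."""
    n = len(sequences)
    # Segregating sites: alignment columns showing more than one allele.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimators, scaled by its std. dev.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Toy alignment where every variant is seen in only one sequence.
rare_excess = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
# Toy alignment where every variant is at intermediate frequency.
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]

print(tajimas_d(rare_excess))  # negative: excess of rare variants
print(tajimas_d(balanced))     # positive: excess of common variants
```

The sign of D comes from comparing two estimates of diversity: mean pairwise differences, which is dominated by common variants, and the count of segregating sites, which weights rare variants equally.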

Student Note: A Haplotype Network is a graph where nodes represent genetic sequences and edges represent mutations; it visualizes the evolutionary distance between individuals.
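The quantities discussed in this section (number of haplotypes, how many occur only once, and the mutational distance between them) can be sketched with the Python standard library. The sequences are hypothetical, and this does not reproduce the Median-Joining algorithm used by the Network software; it only computes the summaries that such a network displays. The diversity value uses Nei’s haplotype diversity formula.

```python
from collections import Counter

def haplotype_summary(sequences):
    """Count distinct haplotypes, singletons, and Nei's haplotype diversity."""
    counts = Counter(sequences)
    n = len(sequences)
    singletons = sum(1 for c in counts.values() if c == 1)
    # Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies).
    diversity = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    return {
        "haplotypes": len(counts),
        "singletons": singletons,
        "diversity": diversity,
    }

def hamming(a, b):
    """Number of positions at which two aligned haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical groups, NOT thesis data.
low_diversity = ["ACGT"] * 3 + ["ACGA"] * 2 + ["TCGT"]
high_diversity = ["ACGT", "ACGA", "TCGT", "TCGA", "AGGT", "AGGA"]

print(haplotype_summary(low_diversity))
print(haplotype_summary(high_diversity))
print(hamming("ACGT", "TCGA"))  # 2 mutational steps apart
```

In a haplotype network, node size corresponds to the counts computed here and edge length to the Hamming distance between connected haplotypes.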

Professor’s Insight: High haplotype diversity in controls suggests a robust, varied genetic defense system, whereas the lower diversity in diabetics might imply a restricted genetic toolkit to handle stress.

In-Silico Mutation Modeling

Since wet-lab experimentation on every new mutation is costly and time-consuming, the thesis subjected the novel variants to extensive in-silico (computer-based) analysis. Tools like PolyPhen-2, I-Mutant 2.0, and Project HOPE were used to predict the structural consequences of the discovered amino acid changes. This predictive step is essential for prioritizing which of the novel variants might be pathogenic.

“Mutant residue G at position 588 of novel SNP at chr2:177230840 (D588G) was predicted to be probably damaging to the structure and function of the protein… The mutant residue is smaller, neutrally charged, and more hydrophobic than the wild-type residue” (Kadam, 2022, p. 120).

The analysis provided molecular-level reasoning for potential dysfunction. For instance, the novel D588G mutation in Nrf2 replaces a negatively charged Aspartate with a neutral, hydrophobic Glycine. Glycine is the smallest amino acid and is highly flexible; introducing it into a structured region of a protein can disrupt the rigid architecture required for the transcription factor to bind DNA or partner proteins. Similarly, the Q73K mutation introduces a positive charge (Lysine) where there was previously a neutral one, potentially repelling binding partners. These structural predictions help explain why a patient carrying such a “novel” variant might have a suboptimal antioxidant response.
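The reasoning above (compare charge, size, and hydrophobicity of the wild-type and mutant residues) can be sketched as a small lookup. This is not how PolyPhen-2, I-Mutant, or HOPE work internally; it is a coarse illustration. The property table is an assumption of this sketch: charges are approximate values at physiological pH, hydropathy values are from the Kyte-Doolittle scale, and masses are free amino acid molecular masses in daltons.

```python
import re

# Only the residues mentioned in this article are included.
PROPERTIES = {
    "D": {"name": "Aspartate", "charge": -1, "hydropathy": -3.5, "mass": 133.10},
    "E": {"name": "Glutamate", "charge": -1, "hydropathy": -3.5, "mass": 147.13},
    "G": {"name": "Glycine",   "charge": 0,  "hydropathy": -0.4, "mass": 75.07},
    "A": {"name": "Alanine",   "charge": 0,  "hydropathy": 1.8,  "mass": 89.09},
    "Q": {"name": "Glutamine", "charge": 0,  "hydropathy": -3.5, "mass": 146.15},
    "K": {"name": "Lysine",    "charge": 1,  "hydropathy": -3.9, "mass": 146.19},
}

def compare_residues(mutation):
    """Summarize property changes for a mutation written like 'D588G'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation}")
    wild, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    wt, mt = PROPERTIES[wild], PROPERTIES[mutant]
    return {
        "position": position,
        "change": f"{wt['name']} -> {mt['name']}",
        "charge_change": mt["charge"] - wt["charge"],
        "hydropathy_change": round(mt["hydropathy"] - wt["hydropathy"], 1),
        "mutant_is_smaller": mt["mass"] < wt["mass"],
    }

for mutation in ("D588G", "Q73K"):
    print(mutation, compare_residues(mutation))
```

For D588G the sketch reports a lost negative charge, a smaller residue, and a higher hydropathy score, matching the qualitative description quoted from the thesis. For Q73K it reports a gained positive charge.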

Student Note: Hydrophobicity refers to the tendency of non-polar substances to aggregate in aqueous solution and exclude water molecules; changing this property in a protein core can cause unfolding.

Professor’s Insight: In modern zoology and genetics, being able to model protein structures in-silico is just as important as pipetting at the bench.

Reviewed and edited by the Professor of Zoology editorial team. Aside from direct thesis quotations, the content is educational and original.

Real-Life Applications

  1. Ancestry-Based Diagnostics: The finding of population-specific novel SNPs implies that diagnostic panels for diabetes risk in India need to include these specific markers, rather than relying solely on European-derived panels.
  2. Evolutionary Biology Mapping: Haplotype networks help anthropologists and biologists trace the migration and adaptation of human populations in the Indian subcontinent, using diabetes susceptibility as a marker of metabolic adaptation.
  3. Protein Engineering: Understanding how specific mutations like D588G may destabilize Nrf2 could help bioengineers design more stable synthetic proteins or drugs that stabilize the mutant protein structure (chaperone therapy).
  4. Forensic Zoology: The methods used here (admixture analysis, haplotype networks) are the same kinds of tools used in wildlife conservation to track poaching or population isolation in endangered species.

Why this matters: For students, this illustrates the universal applicability of population genetics tools—from human disease to wildlife conservation.

Key Takeaways

  • Undocumented Diversity: The study cohort from Pune carries 16 novel genetic variants (8 in Nrf2 and 8 in FoxO1) in key antioxidant genes that are not found in dbSNP.
  • Admixed Ancestry: The population studied has a mixed genetic heritage, which must be statistically accounted for to avoid false associations in disease studies.
  • Selection Pressure: Negative Tajima’s D values are consistent with population expansion or selection acting on these antioxidant genes, although the statistic alone cannot separate the two.
  • Structural Destabilization: Novel mutations often alter the charge, size, or hydrophobicity of amino acids, theoretically disrupting the protein’s ability to fight oxidative stress.
  • Methodological Rigor: Combining wet-lab sequencing with dry-lab (in-silico) modeling provides a fuller picture of how a single base pair change might contribute to disease susceptibility.

MCQs

  1. What does a negative Tajima’s D value generally indicate in the context of this study?
    A. The population is shrinking.
    B. Balancing selection is maintaining multiple alleles.
    C. Population expansion, or selection that leaves an excess of rare variants (a selective sweep or purifying selection).
    D. Complete lack of mutations.
    Correct: C
    Difficulty: Challenging
    Explanation: A negative Tajima’s D signifies an excess of low-frequency polymorphisms, which is consistent with population expansion, a recent selective sweep, or purifying selection.
  2. Which software was used to construct the phylogenetic relationship (haplotype networks) between the sequences?
    A. BLASTn
    B. Network 10.2.0.0
    C. PolyPhen-2
    D. SPSS
    Correct: B
    Difficulty: Moderate
    Explanation: Network 10.2.0.0 was the specific software used to generate Median-Joining (MJ) haplotype networks.
  3. Why is the novel mutation D588G in Nrf2 predicted to be damaging?
    A. It introduces a larger amino acid.
    B. It changes a hydrophobic residue to a hydrophilic one.
    C. It introduces a Glycine, which is too flexible and disrupts protein rigidity.
    D. It occurs in the intron and affects splicing.
    Correct: C
    Difficulty: Moderate
    Explanation: The thesis notes that Glycine is very flexible and can disturb the required inflexibility/rigidity of the protein at that position.
  4. What was the result of the Admixture analysis (Structure software) for the study population?
    A. The population is genetically pure.
    B. The population is distinct from all other humans.
    C. The population is admixed with mixed ancestry.
    D. The population has no genetic diversity.
    Correct: C
    Difficulty: Easy
    Explanation: The mean alpha value of +0.2765 indicated the subjects belonged to an admixed population.

FAQs

What is the difference between an SNP and an SNV?
By the convention used here, an SNP (Single Nucleotide Polymorphism) is a variation present in >1% of the population, while an SNV (Single Nucleotide Variant) is a rarer variation present in <1%.

Why do we use “In-Silico” tools?
In-silico (computer) tools allow researchers to predict the effect of a mutation on a protein’s 3D structure and stability without needing expensive and complex physical experiments for every single variant.

What is an Admixture Model?
It is a statistical model used to estimate the ancestry of individuals, determining what proportion of their genome comes from different ancestral populations.

Lab / Practical Note

PCR Primer Design: When designing primers for large genes like FoxO1 (110kb), it is standard practice to split the gene into smaller overlapping fragments (e.g., FoxO1-a and FoxO1-b) to ensure successful amplification and sequencing, as done in this thesis.
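The idea of covering a target region with overlapping fragments can be sketched as a simple tiling function. The coordinates, fragment length, and overlap below are hypothetical and are not the values used for FoxO1-a and FoxO1-b in the thesis; real primer design also has to account for melting temperature, GC content, and specificity, which this sketch ignores.

```python
def tile_region(start, end, fragment_length, overlap):
    """Split [start, end) into fragments that overlap their neighbours."""
    if overlap >= fragment_length:
        raise ValueError("overlap must be smaller than fragment_length")
    fragments = []
    position = start
    while True:
        fragment_end = min(position + fragment_length, end)
        fragments.append((position, fragment_end))
        if fragment_end == end:
            break
        # Step back by the overlap so adjacent fragments share sequence.
        position = fragment_end - overlap
    return fragments

# Hypothetical 1,000 bp target covered by 600 bp amplicons sharing 100 bp.
print(tile_region(0, 1000, 600, 100))  # [(0, 600), (500, 1000)]
```

The shared 100 bp ensures that the ends of each amplicon, where sequencing quality tends to drop, are also read from the neighbouring fragment.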

Sources & Citations

Thesis:
ASSOCIATION OF SINGLE NUCLEOTIDE POLYMORPHISM IN TRANSCRIPTION FACTORS MODULATING ANTIOXIDANT DEFENSE WITH OXIDATIVE STRESS PROFILE IN DIABETIC PATIENTS, Dipak Ashok Kadam, Guide: Prof. Saroj S. Ghaskadbi, Savitribai Phule Pune University, Pune, India, 2022, pages 61-125.

Correction/Feedback:
If you are the author and wish to submit corrections, please contact us at contact@professorofzoology.com.

Institutional Invitation:
We welcome universities to submit official thesis abstracts for archival and educational dissemination.

Author Box
Author: Dipak Ashok Kadam, PhD Scholar, Savitribai Phule Pune University.
Reviewer: Abubakar Siddiq

Note: This summary was assisted by AI and verified by a human editor. The content is for educational purposes only.

