Results from the extensive computational analysis of the 1029 sequenced genomes from India, carried out by CSIR-IGIB and CSIR-CCMB, were published in the scientific journal Nucleic Acids Research.
- The analysis identified 55,898,122 single nucleotide variants in the India genome dataset. Comparison with global genome datasets revealed that 18,016,257 (32.23%) of these variants were unique, found only in the samples sequenced from India. This emphasizes the need for an India-centric population genomics initiative.
- India is the second largest country in terms of population, with more than 1.3 billion individuals making up 17% of the world population.
- Despite its rich genetic diversity, India has been under-represented in global genome studies. Further, the population architecture of India has resulted in a high prevalence of recessive alleles.
- In the absence of large-scale whole genome studies from India, these population-specific genetic variants are not adequately captured and catalogued in global medical literature.
- In order to fill the gap of whole genome sequences from different populations in India, CSIR initiated the IndiGen Program in April 2019.
- Under this program, the whole genome sequencing of 1029 self-declared healthy Indians drawn from across the country has been completed.
- This has enabled benchmarking of the scalability of genome sequencing at population scale within a defined timeline.