Identification of Species Specific DNA Marker as Barcode Sequence from Greater Indian One-Horned Rhinoceros (Rhinoceros unicornis) from North East India
Master's Thesis 2014 45 Pages
List of abbreviations
List of tables
List of figures
CHAPTER - 1 Introduction
CHAPTER - 2 Review of literature
CHAPTER - 3 Materials and methods
CHAPTER – 4 Results
CHAPTER – 5 Discussion
LIST OF ABBREVIATIONS
[Not included in this excerpt]
LIST OF TABLES
1. List of primers available in the laboratory.
2. Alignment of different primers within the template
3. Relative melting temperature (Tm) for different forward primers and mismatches in respect to Rhinoceros mtDNA
4. Relative melting temperature (Tm) for different reverse primers and mismatches in respect to Rhinoceros mtDNA
5. Proposed annealing temperature (Ta) and the predicted PCR product size of each primer pair.
6. Number of amino acids & number of nucleotides for the coding region of COI
7. Accession numbers of 12 COI sequences from different species of the family Rhinocerotidae.
8. Nucleotide composition of the barcoding gene (COI) in the species of the family Rhinocerotidae under study.
9. Nucleotide composition for the family Rhinocerotidae.
10. Sequences producing significant alignments after BLAST with COI (632 bp)
11. Mean sequence divergence (K2P) within and among the 6 species of Rhino
LIST OF FIGURES
1. Range map, distribution of Indian Rhino
2. The structure of mitochondrial genome.
3. Map of study area.
4. Band of Genomic DNA.
5. In vitro PCR-amplified product of mitochondrial COI.
6. Pairwise comparisons between COI sequences among different species of the family Rhinocerotidae.
7. Phylogenetic analysis of the Kimura 2-parameter (K2P) distances of COI sequences.
8. Identification tree generated through BOLD
North East India is a reservoir of rich biodiversity for much of India’s flora and fauna, and as a consequence the region is one of the richest in biological value. DNA-based species identification, or DNA barcoding, is an exciting tool for documenting biodiversity with a gene sequence. Here we focused on developing a DNA barcode sequence for the Greater Indian one-horned rhinoceros [Rhinoceros unicornis] by determining the DNA sequence of the 5′ region of its mitochondrial COI gene, using a cross-species primer design for PCR amplification. A cross-species primer pair, combining the fish barcoding forward primer [FISH COX F] with the invertebrate barcoding reverse primer [INV R], successfully amplified the barcode sequence of R. unicornis. The sequence was confirmed through similarity searches in the National Center for Biotechnology Information [NCBI, www.ncbi.nlm.nih.gov] and Barcode of Life Data Systems [BOLD] databases and was submitted to NCBI GenBank under accession no. JN417004; it is the first COI barcode sequence of R. unicornis. The query sequence showed its highest identity with R. unicornis (99%) and R. sondaicus (93%). Mean pairwise distances were computed with the Kimura 2-parameter (K2P) model to assess divergence within and between the species of the family Rhinocerotidae over 632 bp of the COI region. Within species, the distance ranged from 0.000 to 0.002; between species, it ranged from 0.000 to 0.064. The study presents a species-specific COI DNA barcode for R. unicornis with relevance to conservation and trade monitoring. Using a cross-species primer pair to amplify the barcode gene of a species for which no specific primers are available represents a novel approach to barcoding in areas such as North East India, where biological diversity is high.
CHAPTER – 1 INTRODUCTION
The biological diversity of each country is a valuable and vulnerable natural resource. North East India, comprising the eight sister states of Assam, Arunachal Pradesh, Manipur, Meghalaya, Mizoram, Nagaland, Tripura and Sikkim, is a reservoir of rich biodiversity for much of India’s flora and fauna, and as a consequence the region is one of the richest in biological value. North East India has so far been able to retain a significant proportion of its biodiversity, possibly owing to long years of isolation and difficult terrain, but its resources are now under increasing pressure.
DNA-based species identification, or DNA barcoding, is an exciting tool for documenting biodiversity with a gene sequence. Here we focus on the development of a DNA barcode sequence for the Greater Indian rhinoceros (Rhinoceros unicornis), as this species is now restricted to Kaziranga, Pabitora and Orang in Assam, Northeast India. The rhinoceros is a large, primitive-looking mammal that in fact dates from the Miocene epoch, millions of years ago. It is characterized by a snout with one or two horns and belongs to the order Perissodactyla and the family Rhinocerotidae. There are five extant species of rhinoceros, viz. Sumatran (Dicerorhinus sumatrensis), Javan (Rhinoceros sondaicus), black (Diceros bicornis), white (Ceratotherium simum) and greater one-horned or Indian (Rhinoceros unicornis). The genera Rhinoceros and Dicerorhinus are found in Asia, whereas Diceros and Ceratotherium inhabit Africa.
The Greater one-horned rhinoceros or Indian rhinoceros, Rhinoceros unicornis (Figure 1), once ranged throughout the entire stretch of the Indo-Gangetic Plain, but excessive hunting and illegal trade in rhinoceros horns drastically reduced its natural range. In some Asian traditions the horns are believed to have aphrodisiac or healing properties. They have commanded a very high price for centuries, sometimes surpassing that of gold. Some Middle Eastern societies prize rhinoceros horn for dagger handles. Despite a ban on trade in rhinoceros parts under the Convention on International Trade in Endangered Species (CITES) in 1976, populations declined by 90% over the next 20 years. Unless poaching is stopped, rhinoceros extinction in the wild is virtually inevitable. The IUCN (International Union for Conservation of Nature and Natural Resources) lists the Indian rhinoceros as Vulnerable (www.iucnredlist.org). There are now more than 2,800 Indian rhinos in 13 groups distributed between Assam (in northeastern India) and Nepal (www.rhinos-irf.org/asia).
[Figure not included in this excerpt]
Figure 1. Range map (redrawn from Foose and van Strien, 1997)
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Apart from species identification, DNA barcoding can be advantageous for monitoring illegal trade in animal byproducts. When such products are sold, identification through morphological characteristics might no longer be possible. Sometimes only hairs are available for species identification, and it is very difficult or even impossible to visually determine whether a hair came from an endangered or a legally sold species. Thus, DNA barcodes and DNA registries are useful for two main purposes in conservation biology: for the identification of illegally imported animal or plant products (wildlife forensics) and for the rapid assessment of biodiversity studies (Cipriano F. & Palumbi S. R.). DNA barcodes could speed up and make more precise the identification of specimens in biodiversity studies that are currently tedious and time-consuming.
Hebert et al. (2003a, b) proposed that a DNA barcoding system for animal life could best be based upon sequence diversity in a ~650 bp region near the 5′ end of the mitochondrial gene cytochrome c oxidase subunit 1 (cox1; also referred to as COI). This region provides strong species-level resolution for varied animal groups including birds (Hebert et al. 2004b), fishes (Ward et al. 2005), springtails (Hogg & Hebert 2005), spiders (Barrett & Hebert 2005) and moths (Hebert et al. 2003, Janzen et al. 2005).
The Consortium for the Barcode of Life (CBOL) supports the development of DNA barcoding as an international standard for species identification, in part through the development of the Barcode of Life Data Systems (www.barcodinglife.org), a global online data management system for DNA barcodes.
DNA sequence analysis of a uniform target gene to enable species identification has been referred to as DNA barcoding, by analogy with the UPC barcodes used to identify manufactured goods. The Universal Product Code system, developed by the industrial sector to brand retail items, employs 10 alternate numerals at each of 11 positions to create 100 billion unique identifiers. Just like UPC barcodes, the DNA sequences within each species are unique. A run of 15 nucleotides, with 4 options at each position, creates the possibility of about 1 billion (4^15) codes, a hundred-fold excess over the estimated number of animal species. Of course, specific nucleotides are fixed at some positions by selection. However, this constraint can be overcome by focusing on protein-coding genes, where every third position is generally free to vary because of the degeneracy of the genetic code. As a result, by examining a stretch of 45 nucleotides in these genes, one has the prospect of nearly 1 billion alternates.
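The counting argument in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are ours, and the 45-nucleotide case rests on the simplifying assumption that exactly one position in three (15 of 45) varies freely while the other 30 are fixed.

```python
# Counting arguments from the paragraph above, using exact integer arithmetic.

upc_codes = 10 ** 11    # 10 numerals at each of 11 positions
dna_15mers = 4 ** 15    # 4 nucleotides at each of 15 positions

# Simplifying assumption (ours): in a 45-nucleotide coding stretch only the
# third codon positions (15 of 45) vary; the other 30 are treated as fixed.
free_sites = 45 // 3
dna_45_coding = 4 ** free_sites

print(f"UPC identifiers:          {upc_codes:,}")      # 100,000,000,000
print(f"15-nucleotide sequences:  {dna_15mers:,}")     # 1,073,741,824
print(f"45-nt coding stretch:     {dna_45_coding:,}")  # 1,073,741,824
```

Under this assumption the 45-nucleotide coding stretch carries the same number of alternatives as 15 freely varying nucleotides, which is why both figures come out at roughly 1 billion.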
Since Linnaeus, biologists have used distinguishing features in taxonomic keys to apply binomial species names, such as Homo sapiens. Then, as a master key opens all the rooms in a building, the binomial species name accesses all knowledge about a species. From insects to birds, evidence now shows that short DNA sequences from a uniform locality on genomes can also be a distinguishing feature. As a Linnaean binomial is an abbreviated label for the morphology of a species, the short sequence is an abbreviated label for the genome of the species. The barcode of life thus provides an additional master key to knowledge about a species. Compiling a public library of sequences linked to named specimens, plus faster and cheaper sequencing, will make this new barcode key increasingly practical and useful.
Although there has never been a broadly based effort to implement a microgenomic identification system for animals, enough work has been done to indicate key design elements. It is clear that the mitochondrial (mt) genome of animals represents a better target for analysis than the nuclear genome because of its lack of introns, its limited exposure to recombination and its haploid mode of inheritance. As well, there are robust primers that enable the recovery of specific segments of the mt genome from a broad range of animals.
The mitochondrial genome includes just 13 protein-coding genes that might serve as the core of a DNA-based identification system, among them COI. In common with other protein-coding genes, the third-position nucleotides of COI show a high incidence of base substitution, leading to a rate of molecular evolution that is about three times greater than that of 12S or 16S rDNA. In fact, the evolution of this gene is rapid enough to allow the discrimination of not only closely allied species, but also phylogeographic groups within a single species. If COI-5′ is not sufficient for species discrimination, other rapidly evolving genes may need to be analyzed as potential barcoding targets. Possible supplementary sequences include the complete COI gene, other mitochondrial genes (e.g. 16S rDNA, cytochrome b) and/or the internal transcribed spacer (ITS), a nuclear region located between the rRNA genes.
[Figure not included in this excerpt]
Figure 2. The structure of the mitochondrial genome.
The COI gene was chosen as a barcode for animals because of the following reasons: i) the DNA sequence is easily amplified with the same set of primers across different groups (Folmer et al., 1994); ii) the third position of the codon shows a high incidence of nucleotide substitutions, as compared to other protein coding genes (McClellan 2000, Perna & Kocher 1995); iii) the overall mutation rate of the COI gene is lower than that of other mitochondrial genes (Yi et al., 2002).
The most critical step in PCR amplification of a partial region of the mitochondrial COI gene is the design of oligonucleotide primers when no species-specific primer is available. Cross-species polymerase chain reaction (PCR) primers are often used to amplify part of a gene or genome for which no direct species-specific sequence information is available. The simplest method of producing cross-species primers is to design primers to a region of a genome for a given species and then to test these empirically for amplification in other species (Housley et al.). For example, primer pairs designed to flank avian microsatellite repeats in one species have been used to amplify homologous sequences in other closely related avian species (Primmer et al.). In this study, cross primers were designed by checking the specificity and feasibility of the different DNA barcoding primers available in our laboratory for amplifying the targeted barcode sequence. For this, different combinations of primer pairs were formed, and in-silico PCR amplification was performed with each primer pair to predict the amplified region of the target gene and the product size.
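The in-silico PCR step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software or settings used in this study: the function names, the toy template, the toy primers and the limit of 3 mismatches per primer are all hypothetical, and only ungapped binding is considered (no weighting of 3′-end mismatches, no melting temperatures).

```python
# Watson-Crick complements; 'N' is kept so ambiguous bases survive.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def best_site(template: str, primer: str):
    """Return (start, mismatches) of the ungapped placement with the fewest mismatches."""
    template, primer = template.upper(), primer.upper()
    best = None
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(a != b for a, b in zip(window, primer))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best  # None if the primer is longer than the template


def in_silico_pcr(template: str, forward: str, reverse: str, max_mismatches: int = 3):
    """Predict the amplicon of a primer pair, or None if no product is expected."""
    fwd = best_site(template, forward)
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement.
    rev = best_site(template, reverse_complement(reverse))
    if fwd is None or rev is None:
        return None
    if fwd[1] > max_mismatches or rev[1] > max_mismatches:
        return None
    if rev[0] < fwd[0] + len(forward):
        return None  # sites overlap or are in the wrong order
    end = rev[0] + len(reverse)
    return {
        "start": fwd[0],
        "end": end,
        "size": end - fwd[0],
        "fwd_mismatches": fwd[1],
        "rev_mismatches": rev[1],
    }


# Toy data (hypothetical, not real COI or primer sequences).
template = "TTTT" + "ACGTACGTAC" + "GGGGGCCCCC" * 2 + "TTGACCTAGA" + "AAAA"
forward_primer = "ACGTACGTAC"
reverse_primer = "TCTAGGTCAA"  # reverse complement of TTGACCTAGA

product = in_silico_pcr(template, forward_primer, reverse_primer)
print(product["size"])  # 40
```

Running the same function over every forward and reverse primer combination gives a table of predicted binding positions, mismatch counts and product sizes from which candidate pairs can be shortlisted before bench testing.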
The objectives of the present study were:
1. Development of cross-species primers for PCR amplification of mitochondrial COI.
2. Development of a DNA barcode sequence of the Greater Indian Rhinoceros (Rhinoceros unicornis).
3. Sequencing of cytochrome c oxidase subunit I (COI) of the Indian Rhinoceros and phylogenetic analysis within the family Rhinocerotidae using bioinformatics tools.
CHAPTER – 2 REVIEW OF LITERATURE
DNA barcoding as a scientific idea, initiated by Paul Hebert in 2003, employs sequence diversity in short segments of standardized regions of the genome as a digital system for species recognition (www.ontariogenomics.ca/research/project/8). The classification and identification of living organisms by Linnaean taxonomy, which began 200 years ago, is based on the phenotypic separation of species by morphological dichotomies. A Linnaean binomial is an abbreviated label for the morphology of a species, and the short sequence is an abbreviated label for the genome of the species (Stoeckle et al.).
Hebert et al (2003a) proposed that the mitochondrial gene cytochrome c oxidase I (COI) can serve as the core of a global bioidentification system for animals. A model COI profile, based upon the analysis of a single individual from each of 200 closely allied species of lepidopterans, was 100% successful in correctly identifying subsequent specimens. When fully developed, a COI identification system will provide a reliable, cost-effective and accessible solution to the current problem of species identification. Its assembly will also generate important new insights into the diversification of life and the rules of molecular evolution.
Hebert et al. (2004) sequenced DNA barcodes of 260 bird species of North America and tested the effectiveness of the COI barcode in discriminating bird species. Every species had a distinct COI barcode or set of barcodes, and none was shared between species. In the 130 species represented by two or more individuals, COI sequences were either identical or most similar to other sequences of the same species. COI sequence differences between closely related species were, on average, 18 times higher than the differences within species. The study identified four probable new species of North American birds, suggesting that a global survey will lead to the recognition of many additional species. The finding of large COI sequence differences between species, compared with small differences within species, confirms the effectiveness of COI barcodes for the identification of bird species. Identification of birds through DNA barcodes will help, for example, when morphological diagnoses are difficult, as when identifying remnants (including eggs, nestlings and adults) in the stomachs of predators. A DNA barcode could similarly identify fragments of birds that strike aircraft (Dove 2000) and recognize carcasses of protected or regulated species.
In 2004, the now well-established Consortium for the Barcode of Life, an international initiative supporting the development of DNA barcoding, was started with the aim of promoting global standards and coordinating research in DNA barcoding. It aims to establish a public library of sequences and promotes the development of portable devices for barcoding. Also in 2004, The Rockefeller University, in collaboration with two other organisations, set out various reasons for a ‘Barcode of Life’. They described the advantages of DNA barcoding, from identifying species from bits and pieces of specimens to the prospect of an electronic hand-held field guide, the ‘Life Barcoder’.
Hajibabaei et al. (2005) reported that barcoding is no replacement for comprehensive taxonomic analysis. For example, when an unknown specimen does not return a close match to existing records in the barcode library, the barcode sequence does not in itself qualify the unknown specimen for designation as a new species. Instead, such specimens are flagged for thorough taxonomic analysis. Although the task of identifying and describing new species is ultimately achieved through comprehensive taxonomic work, DNA barcoding can significantly facilitate this process. The conventional taxonomic workflow, which usually requires the collection of morphological and ecological data, can vary between taxonomic assemblages (e.g. taxonomic identification of birds and fish requires different methods and skills), whereas barcode analysis can be applied in a more or less standardized way across large domains of life. In taxonomy, DNA barcoding can be used for routine identification of specimens.
While barcode libraries have similarities to molecular phylogenetic data (both are sequence information from assemblages of species), DNA barcodes do not usually have sufficient phylogenetic signal to resolve evolutionary relationships, especially at deeper levels. Although barcode sequences have been analyzed mainly by using phylogenetic tree reconstruction methods such as neighbour joining (NJ), these barcode-based trees should not be interpreted as phylogenetic trees. However, because DNA barcodes are used both to identify species and to draw attention to overlooked and new species, they can help identify candidate exemplar taxa for comprehensive phylogenetic study. In phylogenetic investigations, DNA barcoding can be a starting point for optimal selection of taxa, and barcode sequences can be added to the sequence data set for phylogenetic analysis.
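Distance-based trees such as NJ are built from a matrix of pairwise distances, and this thesis reports Kimura 2-parameter (K2P) distances. The sketch below shows how a single K2P distance can be computed from two aligned sequences, using the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The function name and the toy sequences are ours; the choice to skip sites with gaps or ambiguity codes is an assumption and may differ from what dedicated packages do.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0

    p = transitions / compared
    q = transversions / compared
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1 * math.sqrt(w2))


# Toy example: one transition (C -> T) in 10 aligned sites.
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(round(d, 4))  # 0.1116
```

Applying the function to every pair of sequences in an alignment yields the distance matrix from which within-species and between-species means can be summarised.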
Because mitochondrial DNA markers are haploid and uniparentally inherited, they are frequent targets for analysis and have made a particularly strong contribution to population-level studies. Although the typical sequence information gathered for DNA barcoding is not sufficient to rigorously address population-level questions, it can provide an early insight into the patterning of genomic diversity within a species. Because barcoding typically targets a large number of species or ecological settings, in population genetics investigations DNA barcodes can provide a first signal of the extent and nature of population divergences and will facilitate comparative studies of population diversity in many species.
Armstrong & Ball (2005) reported that DNA barcoding information could help provide a tool for correct species identification, especially for species in which biologically important properties or molecules with intellectual property rights (IPR) potential have been identified. This information would be useful not only in providing diagnostics for rapid and easy identification of species in mixtures in the raw drug trade, but also in drawing up specific regulations to protect national markets. Barcoding could also be useful in identifying species in other groups with high potential for staking IPR claims, such as medicinal leeches (for their anticoagulant properties), parasitoid wasps such as Trichogrammatidae (for their biocontrol uses) or orchids that might have immense commercial value. DNA barcoding can thus be useful in securing IPRs for important taxa, for conservation and commerce.
As sequencing facilities improve, more and more sequence data for the accepted barcoding markers are becoming available in public databases (GenBank, www.ncbi.nlm.nih.gov; EMBL, www.ebi.ac.uk/embl; DDBJ, www.ddbj.nig.ac.jp). However, the quality of the sequence data in GenBank, EMBL or DDBJ is not always perfect (Harris D.J. 2003), as a result of sequencing errors, contamination, sample misidentification or taxonomic problems. The now well-established Consortium for the Barcode of Life (CBOL, barcoding.si.edu) has built a new database specially dedicated to DNA barcoding (Barcode of Life Data Systems, BOLD, www.barcodinglife.org), which will change this situation and provide an efficient and accurate tool for species identification. BOLD has been designed to record not only DNA sequences from several individuals per species (including primer sets, electropherogram trace files and translations) but also complete taxonomic information, place and date of collection, and specimen images (Ratnasingham S. & Hebert P.D.N). All these improvements will probably boost the use of DNA barcoding by ecologists.
Barcoding is becoming an important and commonly-used research tool in taxonomy and systematic biology. These disciplines are essential foundations of species conservation. Several other research fields are also relevant to the protection of endangered species. Ecological studies and conservation genetics are fundamental to species conservation, and barcoding has the potential to be a standard tool in these fields.
Whenever possible, animal specimens should be killed and preserved in a DNA-friendly fashion (freezing, cyanide or ethanol). Even brief exposure to agents that damage DNA, such as ethyl acetate or formaldehyde, should be avoided (Prendini et al., 2002). Methods for DNA isolation fall into two broad categories: DNA release and DNA extraction. DNA release protocols aim to rapidly release DNA into solution, making it accessible for downstream applications such as PCR. Release-based methods also enable DNA isolation from samples without their physical disruption. In this case, the entire specimen can be removed after DNA isolation, allowing the retention of a voucher in cases where this would not otherwise be possible. Release methods are, however, not very sensitive and do not produce high-purity DNA suitable for long-term storage (e.g. more than 1 year). By contrast, DNA extraction methods aim to purify DNA, often by binding it to a membrane (e.g. silica) or by chemical fractionation. Some classical methods, such as phenol/chloroform extractions (Sambrook et al., 2001), are not attractive because they are time consuming and involve toxic materials. The type and condition of specimens is a key factor in selecting a DNA isolation method. For fresh or recently collected tissue, a release-based DNA isolation usually provides sufficient DNA for barcoding.
Before starting a barcode project on any new taxonomic group, it is essential to test the performance of existing primers on fresh specimens from a range of species in the target group. If one or two current primer sets do not deliver more than 95% amplification success for the test assemblage, primer redesign should be undertaken. Past studies on varied taxonomic assemblages have shown that minor adjustments in primer sequences can have a large impact on barcode recovery. Primer reconfiguration begins by aligning all available sequences for the target taxonomic group. Subsequent adjustments in sequence to maximize matches have enabled the development of effective primer sets (more than 95% amplification across species) for large taxonomic assemblages, such as Lepidoptera (Janzen et al., 2005), birds (Hebert et al., 2004b) and fish (Ward et al., 2005). In most cases, effectively complete barcode recovery for all species in a group can be achieved with two sets of non-degenerate primers. Using primers with degenerate positions may also reduce the chance of preferential amplification of nuclear pseudogenes (Sorenson et al., 1999). Many software packages are available to aid primer design, but PRIMER3 (Rozen & Skaletsky, 2000) is commonly used for designing non-degenerate primers.
An optimized PCR for the barcode region of cox1 should yield a single sharp amplicon, with no more than minor sub-banding when examined on an agarose gel. This can often be achieved by optimizing cycling conditions, especially the annealing temperature, and by altering the concentration of PCR reagents such as magnesium, dNTPs and primers through pilot studies on a few taxonomically divergent members of the target assemblage. Optimization often also dramatically increases amplification success and can eliminate the need for PCR cleanup prior to the sequencing reaction.
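Since the annealing temperature is usually the first parameter to be tuned, a rough starting value can be estimated from the primer sequences before any pilot run. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) °C, which is a common approximation for short oligonucleotides, together with the rule of thumb of starting about 5 °C below the lower primer Tm. Both choices are our assumptions for illustration and are not necessarily the method behind the Tm and Ta values tabulated in this thesis; the primer sequences are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (degrees C) by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("only unambiguous A/C/G/T bases are supported")
    return 2 * at + 4 * gc


def starting_ta(forward: str, reverse: str, offset: int = 5) -> int:
    """Suggested starting annealing temperature: lower primer Tm minus an offset."""
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset


# Hypothetical 18-mer primers, not the primers used in this study.
fwd = "ACGTACGTACGTACGTAC"
rev = "TTGACCTAGATTGACCTA"

print(wallace_tm(fwd))        # 54
print(wallace_tm(rev))        # 50
print(starting_ta(fwd, rev))  # 45
```

The estimate only gives a point from which to begin; the optimum is still found empirically, for example with a gradient PCR across a few degrees on either side.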
Ratnasingham & Hebert describe the Barcode of Life Data System (BOLD) as an informatics workbench aiding the acquisition, storage, analysis and publication of DNA barcode records. By assembling molecular, morphological and distributional data, it bridges a traditional bioinformatics chasm. BOLD is freely available to any researcher with interests in DNA barcoding. By providing specialized services, it aids the assembly of records that meet the standards needed to gain BARCODE designation in the global sequence databases. Because of its web-based delivery and flexible data security model, it is also well positioned to support projects that involve broad research alliances. Their paper provides a brief introduction to the key elements of BOLD, discusses their functional capabilities, and concludes by examining computational resources and future prospects.
The role of DNA barcoding as a tool to accelerate the inventory and analysis of diversity for hyperdiverse arthropods has been tested using ants in Madagascar. That study showed how DNA barcoding helps address the failure of current inventory methods to respond rapidly to pressing biodiversity needs, specifically in the assessment of richness and turnover across landscapes with hyperdiverse taxa. In a comparison of inventories at four localities in northern Madagascar, patterns of richness were not significantly different whether richness was determined using morphological taxonomy (morphospecies) or sequence divergence thresholds (Molecular Operational Taxonomic Units; MOTU). However, sequence-based methods tended to yield greater richness and significantly lower indices of similarity than morphological taxonomy. MOTU determined using the molecular technique were a remarkably local phenomenon, indicative of highly restricted dispersal and/or long-term isolation. In cases where molecular and morphological methods differed in their assignment of individuals to categories, the morphological estimate was always more conservative than the molecular estimate. In those cases where morphospecies descriptions collapsed distinct molecular groups, sequence divergences of 16% (on average) were contained within the same morphospecies. Such high divergences highlight taxa for further detailed genetic, morphological, life history and behavioral studies.