MAJOR PLATFORMS OF NGS
1) ILLUMINA SEQUENCING:
The major platform used in NGS is Illumina sequencing, which works with reads of 100-150 base pairs. Somewhat longer fragments from the template library are ligated to generic adapters and attached to a glass slide by means of those adapters. PCR (polymerase chain reaction) is then carried out so that each fragment is amplified into many copies of the same read. The amplified copies are separated into single strands, which are the templates to be sequenced. The slide is flooded with DNA polymerase and fluorescently labelled nucleotides (A, G, C, T); each nucleotide carries its own colour to identify the base, and a terminator ensures that only one base is added at a time. The slide is imaged, and the fluorescent signal at each read location records which base was added. The slide is then prepared for the next cycle: the terminators are removed so that the next base can be added, and the fluorescent signal is removed so that it does not contaminate the next image. The process is repeated, adding one nucleotide at a time and imaging after each addition. Computers detect the base at each site in each image, and these calls are used to construct the sequence.
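The image-to-base step described above can be sketched in a few lines. This is only an illustration, not the vendor's base-calling algorithm: the intensity values, the channel order `ACGT` and the `min_purity` threshold are all assumptions made for the example.

```python
# Minimal sketch of per-cycle base calling from fluorescence intensities.
# Assumption: each cycle yields one intensity per colour channel, in A, C, G, T order.
CHANNELS = "ACGT"

def call_bases(intensities, min_purity=0.6):
    """Call one base per cycle by picking the brightest channel.

    A cycle is called 'N' when the brightest channel does not account for
    at least `min_purity` of the total signal (hypothetical threshold).
    """
    read = []
    for cycle in intensities:
        total = sum(cycle)
        best = max(range(4), key=lambda i: cycle[i])
        if total <= 0 or cycle[best] / total < min_purity:
            read.append("N")
        else:
            read.append(CHANNELS[best])
    return "".join(read)

# Made-up intensities for one spot over five cycles.
example = [
    [0.91, 0.03, 0.04, 0.02],  # A channel dominates
    [0.05, 0.88, 0.04, 0.03],  # C channel dominates
    [0.02, 0.03, 0.05, 0.90],  # T channel dominates
    [0.30, 0.28, 0.22, 0.20],  # no clear winner
    [0.04, 0.02, 0.92, 0.02],  # G channel dominates
]
print(call_bases(example))  # ACTNG
```

Removing the fluorescent signal between cycles is what lets each row be treated independently here; leftover signal from the previous cycle would lower the purity of the next one.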
2) 454 SEQUENCING TECHNOLOGY:
Compared to Illumina, 454 sequencing technology can sequence longer reads. Its basic principle is that optical signals are read as bases are added. As with Illumina, the DNA or RNA is fragmented, in this case giving reads of up to 1 kb. During library amplification the fragments are attached to microbeads and amplified further by PCR. At a later stage the slide is flooded with one nucleotide at a time (A, G, C or T). Each nucleotide that is added releases an optical signal, and the locations of the signals are recorded to identify the beads on which nucleotides were added. The nucleotide mix is then washed away and the process is repeated with the next nucleotide. 454 sequencing produces a graph for each sequence read showing the signal density for every nucleotide wash, and the sequence is determined computationally from the signal density in each wash.
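To illustrate how a sequence can be read off the signal density per wash, here is a minimal sketch. It assumes the signals have already been normalised so that a value near 1 means one base was added in that wash, and it assumes a repeating wash order of T, A, C, G; both are assumptions for the example rather than details given in the text.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Convert per-wash signal densities into a sequence.

    signals[i] is the normalised signal from wash i. The signal is rounded
    to the nearest whole number, which is taken as the number of identical
    bases added during that wash.
    """
    seq = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        n = max(int(signal + 0.5), 0)  # round to nearest, never negative
        seq.append(base * n)
    return "".join(seq)

# Made-up signal densities for eight washes (two rounds of T, A, C, G).
flowgram = [1.02, 0.05, 2.10, 0.97, 0.03, 1.05, 0.00, 2.95]
print(decode_flowgram(flowgram))  # TCCGAGGG
```

Because a run of identical bases is read from the height of a single signal, noise matters more for long runs: a signal of 2.5 could equally well be two or three bases.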
3) ION TORRENT PGM:
Ion Torrent PGM sequencing is a rather different kind of platform from Illumina and 454. It does not use optical signals; instead it relies on the fact that the addition of a dNTP to a growing DNA strand by DNA polymerase releases an H+ ion. The template DNA or RNA is fragmented into pieces of approximately 200 bp, and amplification is done by emulsion PCR. The one step it has in common with 454 is that the slide is flooded with a single type of dNTP at a time, along with buffers and polymerase. Each H+ ion released when a dNTP is incorporated lowers the pH. The pH changes are detected and recorded for each well, and from them it is determined whether that base was added in the well and how many copies of it were added.
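The relationship between dNTP flows and released H+ ions can be sketched as a small simulation. It is a simplification under stated assumptions: the input string is the strand being synthesised (complementarity to the template is ignored), the flow order `TACG` is hypothetical, and the count of incorporated bases stands in for the size of the measured pH drop.

```python
def simulate_flows(read, flow_order="TACG", n_flows=8):
    """Return (dNTP, bases incorporated) for each flow over one well.

    The number of bases incorporated equals the number of H+ ions released
    per strand, so it stands in for the pH change recorded in that flow.
    """
    pos = 0
    results = []
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        count = 0
        # Polymerase keeps adding the offered dNTP while it matches the next base.
        while pos < len(read) and read[pos] == base:
            count += 1
            pos += 1
        results.append((base, count))
    return results

def read_from_flows(flows):
    """Rebuild the sequence from the per-flow incorporation counts."""
    return "".join(base * count for base, count in flows)

flows = simulate_flows("TTAGGC")
print(flows)
# [('T', 2), ('A', 1), ('C', 0), ('G', 2), ('T', 0), ('A', 0), ('C', 1), ('G', 0)]
print(read_from_flows(flows))  # TTAGGC
```

A flow with a count of 0 corresponds to a well whose pH did not change, which is as informative for the analysis as a flow in which bases were added.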
4) HELISCOPE GENETIC ANALYSIS SYSTEM / HELICOS:
The HeliScope Genetic Analysis System (Helicos) is an NGS platform based on single-molecule fluorescent sequencing. Sequencing single molecules simplifies DNA sample preparation and avoids the errors caused by PCR. The DNA is cut into smaller fragments, tailed with poly-A and hybridised to a flow cell surface carrying oligo-dT, which allows sequencing-by-synthesis of billions of molecules simultaneously (Thompson et al. 2010). This technique requires less starting material than other methods.
CHALLENGES IN DATA INTERPRETATION-
Bioinformatics research, the life sciences and social networking share the problem of 'big data'. Because accurate computational tools and suitable infrastructure are lacking, biomedical research has real difficulty analysing and interpreting newly generated data sets. Nekrutenko and Taylor (2012) discussed some major issues with the analysis, interpretation, reproducibility and accessibility of NGS data. Their main points are as follows:
• ADOPTION OF EXISTING ANALYSIS PRACTICES-
According to Nekrutenko and Taylor, there should be well-known and well-accepted data analysis practices that can be adopted as common practice. For example, the 1000 Genomes Project (GPC 2010) and HapMap are highly coordinated projects that define a series of accepted practices for particular types of discovery. However, the authors found that despite the well-documented analytical procedures developed by the 1000 Genomes Project, the research community still uses a mix of heterogeneous approaches for different types of studies.
• DIFFICULTY OF REPRODUCIBILITY-
To repeat or build on a study, a researcher needs a complete account of the analysis in the research publications or literature, so that the work stays focused and intact. However, Nekrutenko and Taylor found that recent publications reporting typical computational analyses often fail to provide the necessary information, i.e. input data sets, program source code or scripts, and all parameter settings. As a result, NGS data analysis results cannot be exactly verified, reproduced, adopted or reused, which ultimately leads to a crisis of reproducibility.
• POTENTIAL OF INTEGRATIVE RESOURCES-
Because analyses are not documented precisely, biological researchers face problems when carrying out complex computational analyses on large data sets. With this in mind, a number of integrative frameworks have been developed that bring large numbers of tools together under one dedicated, unified interface, such as BioExtract, Galaxy, GenePattern, GeneProf, Mobyle, etc. These integrative resources are keystones for bioinformaticians, empowering them to make use of the advanced computing infrastructure that NGS data analysis requires.
• MAKING HPC INFRASTRUCTURE USEFUL TO BIOLOGISTS-
Computation is at the heart of NGS data analysis and is used intensively throughout the process. Many high performance computing (HPC) resources are available for NGS. HPC clusters exist all over the globe, and on-demand access to such resources is popularly known as 'cloud computing'. The number of vendors providing software and services for sequence data processing, including DNAnexus and GenomeQuest, has grown, and free and open source software (FOSS) is also available on the internet. These FOSS tools each serve a single purpose. Well-known examples are Crossbow for variant discovery, Myrna for RNA-seq analysis and CloVR for metagenomics annotation.
• IMPROVING LONG-TERM ARCHIVING-
One of the main problems researchers face is that an online analysis tool is not guaranteed to stay online forever; archiving snapshots of a particular analysis is therefore a promising solution.
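
The reproducibility and archiving points above come down to keeping a record of the inputs, parameters and software used in an analysis. The following is a minimal sketch of such a record, not a method proposed by Nekrutenko and Taylor. The function names, the manifest fields, the parameter `min_quality` and the tool name and version are all hypothetical.

```python
import hashlib
import json
import platform
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def sha256_of(path):
    """Checksum a file in chunks so large data sets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(input_files, parameters, tool_versions):
    """Collect what a reader would need to repeat the analysis."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "inputs": [{"path": str(p), "sha256": sha256_of(p)} for p in input_files],
        "parameters": parameters,
        "tool_versions": tool_versions,
    }

# Example with a throwaway input file; real analyses would list their own data.
with tempfile.TemporaryDirectory() as tmp:
    reads = Path(tmp) / "reads.fastq"
    reads.write_text("@read1\nACGT\n+\nIIII\n")
    manifest = build_manifest(
        [reads],
        parameters={"min_quality": 20},           # hypothetical setting
        tool_versions={"example_aligner": "1.0"},  # hypothetical tool
    )
    print(json.dumps(manifest, indent=2))
```

Saving this JSON file next to the results gives a snapshot that remains readable even if the online tool that produced the results is no longer available.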