### Some Data Analysis Schemes for Accurate Quantitative Results from Real Time PCR

On the main Real Time-PCR page I explained briefly that a PCR reaction can be interrogated during each cycle using a fluorescence measurement, and that the so-called log-linear phase can be identified by plotting these measurements. Theoretically, the slope of a plot of the log of fluorescence intensity vs. cycle number should equal the log of the per-cycle amplification factor (1 + efficiency). If the efficiency were 100%, the amount of product would exactly double with each cycle, the slope on the log graph would be 0.301 (10^0.301 ≈ 2, an exact doubling per cycle), and it would be a simple matter to calculate back to the number of template molecules present at the start of the reaction. In reality the efficiency isn't necessarily 100%. Still, if you had a good way to calculate the actual efficiency for the reaction, it should be possible to calculate the number of template molecules you started with. That holds only if the efficiency is constant over the course of the reaction up through the log-linear phase. A few authors have published data intended to address that question and found that the assumption is not necessarily true, but that for many purposes it is reasonable.
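
As a rough illustration of the calculation described above, here is a minimal Python sketch. It fits a straight line to log10(fluorescence) vs. cycle number, takes 10^slope as the per-cycle amplification factor, and extrapolates the line back to cycle 0. The function names and the synthetic data are made up for illustration. Note that the extrapolated value is a starting fluorescence, not a molecule count; converting it to molecules would need a calibration that is not shown here.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amplification_factor_and_start(cycles, fluorescence):
    """Return (fold increase per cycle, fluorescence extrapolated to cycle 0)."""
    log_f = [math.log10(f) for f in fluorescence]
    slope, intercept = fit_line(cycles, log_f)
    return 10 ** slope, 10 ** intercept

# Synthetic log-linear data (assumed): 1.9-fold per cycle, i.e. 90% efficiency.
cycles = list(range(15, 21))
fluorescence = [1e-6 * 1.9 ** c for c in cycles]

fold, start_signal = amplification_factor_and_start(cycles, fluorescence)
print(f"fold per cycle: {fold:.3f}, efficiency: {100 * (fold - 1):.0f}%")
print(f"extrapolated starting fluorescence: {start_signal:.3e}")
```

The sketch assumes every point passed in really lies in the log-linear phase, which is exactly the assumption the rest of this page is about.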

Actually making an accurate calculation of the reaction efficiency and the number of starting template molecules depends on being able to ascertain which part of the raw fluorescence intensity data really falls in the log-linear phase.  A small variation in the slope will cause a significant change in the calculated value, so picking which points to use is critical.   Several methods to do this have been published, including picking points that appear to fall on the best straight line by eye, calculating the second derivative of the data, and fairly sophisticated statistical analyses of the regression line through the linear phase of the reaction.  Of course, it would be nice if an analytical method you liked could be handled by an Excel application.   Several researchers have posted Excel worksheets on the internet that you can download which implement one of these data analysis methodologies.
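
To make the point-picking problem concrete, here is a small Python sketch of one of the approaches mentioned above, the second derivative. It finds the cycle where the discrete second difference of the fluorescence curve is largest and proposes the few cycles ending there as a candidate window. Treating the second derivative maximum as the end of the window, the window width of 4, and the logistic test curve are all assumptions of this sketch, not a published procedure.

```python
import math

def second_derivative_max(fluorescence):
    """Index of the largest discrete second difference F[i+1] - 2*F[i] + F[i-1]."""
    best_i, best_d2 = None, float("-inf")
    for i in range(1, len(fluorescence) - 1):
        d2 = fluorescence[i + 1] - 2 * fluorescence[i] + fluorescence[i - 1]
        if d2 > best_d2:
            best_i, best_d2 = i, d2
    return best_i

def candidate_window(cycles, fluorescence, width=4):
    """Cycles in a window of `width` points ending at the second derivative maximum."""
    end = second_derivative_max(fluorescence)
    start = max(0, end - width + 1)
    return cycles[start:end + 1]

# Synthetic sigmoidal amplification curve (assumed shape, no noise).
cycles = list(range(1, 41))
fluorescence = [1.0 / (1.0 + math.exp(-(c - 25) / 1.5)) for c in cycles]

sdm_cycle = cycles[second_derivative_max(fluorescence)]
window = candidate_window(cycles, fluorescence)
print("second derivative maximum at cycle", sdm_cycle)
print("candidate log-linear window:", window)
```

Real data are noisy, so in practice the curve would usually need smoothing and baseline subtraction before a second difference means much.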

If you are satisfied that the method you are currently using gives good results, that your standards are running with the same efficiency as your samples, and that moving the threshold line around hardly makes any difference to your results, you don't really need to read the rest of this.

However, if you are curious about getting around issues of how comparable your standards really are to your samples, would like a more automated data analysis procedure, or wonder why all this averaging and comparison should really be necessary, you will probably find the papers below interesting.

Nice methodology review
BioTechniques, Vol. 39, No. 1, July 2005, pp. 75–85
Real-time PCR for mRNA Quantitation
Marisa L. Wong and Juan F. Medrano

Abstract
Real-time PCR has become one of the most widely used methods of gene quantitation because it has a large dynamic range, boasts  tremendous sensitivity, can be highly sequence-specific, has little to no post-amplification processing, and is amenable to increasing  sample throughput.  However, optimal benefit from these advantages requires a clear understanding of the many options available  for running a real-time PCR experiment. Starting with the theory behind real-time PCR, this review discusses the key components  of a real-time PCR experiment, including one-step or two-step PCR, absolute versus relative quantitation, mathematical models available for relative quantitation and amplification efficiency calculations, types of normalization or data correction, and  detection chemistries. In addition, the many causes of variation as well as methods to calculate intra- and inter-assay variation are  addressed.

Methodology article
Nucl. Acids Res. 2003
Mathematics of Quantitative Kinetic PCR and the Application of Standard Curves
R.G. Rutledge and C. Cote

Nucleic Acids Research, 2003.  Vol. 31 no. 16  e93

Abstract
Fluorescent monitoring of DNA amplification is the basis of real-time PCR, from which target DNA concentration can be determined from the fractional cycle at which a threshold amount of amplicon DNA is produced.  Absolute quantification can be achieved using a standard curve constructed by amplifying known amounts of target DNA.  In this study, the mathematics of quantitative PCR are examined in detail, from which several fundamental aspects of the threshold method and the application of standard curves are illustrated.  The construction of five replicate standard curves for two pairs of nested primers was used to examine the reproducibility and degree of quantitative variation using SYBR Green I fluorescence.  Based upon this analysis the application of a single, well-constructed standard curve could provide an estimated precision of +/- 6-21%, depending on the number of cycles required to reach threshold.  A simplified method for absolute quantification is also proposed, in which quantitative scale is determined by DNA mass at threshold.
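
For readers who have not worked with a standard curve by hand, the Python sketch below shows the usual arithmetic: regress threshold cycle (Ct) on log10 of the known starting amount, derive the efficiency from the slope as 10^(-1/slope) - 1, and read an unknown off the line. This is the generic textbook calculation, not the specific method of the paper, and the dilution series and Ct values are synthetic numbers assumed for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def standard_curve(log10_amounts, cts):
    """Return (slope, intercept, efficiency) for Ct regressed on log10(amount)."""
    slope, intercept = fit_line(log10_amounts, cts)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency

def quantify(ct, slope, intercept):
    """Starting amount of an unknown from its Ct, using the fitted line."""
    return 10 ** ((ct - intercept) / slope)

# Synthetic 10-fold dilution series (assumed): perfect doubling per cycle.
log10_amounts = [2.0, 3.0, 4.0, 5.0, 6.0]
ideal_slope = -1.0 / math.log10(2.0)          # about -3.32 cycles per decade
cts = [40.0 + ideal_slope * x for x in log10_amounts]

slope, intercept, efficiency = standard_curve(log10_amounts, cts)
unknown_amount = quantify(cts[2], slope, intercept)
print(f"slope {slope:.3f}, efficiency {100 * efficiency:.0f}%")
print(f"unknown with Ct {cts[2]:.2f} -> {unknown_amount:.0f} copies")
```

The calculation only gives the right answer for an unknown if it amplifies with the same efficiency as the standards, which is the comparability question raised at the top of this page.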

Methodology article
BMC Bioinformatics, March 2005
A Standard Curve Based Method for Relative Real Time PCR Data Processing
Alexey Larionov, Andreas Krause, William Miller

BMC Bioinformatics 2005, 6:62
Abstract
Background: Currently real time PCR is the most precise method by which to measure gene expression. The method generates a large amount of raw numerical data and processing may notably influence final results. The data processing is based either on standard curves or on PCR efficiency assessment. At the moment, the PCR efficiency approach is preferred in relative PCR whilst the standard curve is often used for absolute PCR. However, there are no barriers to employ standard curves for relative PCR. This article provides an implementation of the standard curve method and discusses its advantages and limitations in relative real time PCR.

Results: We designed a procedure for data processing in relative real time PCR. The procedure completely avoids PCR efficiency assessment, minimizes operator involvement and provides a statistical assessment of intra-assay variation. The procedure includes the following steps. (I) Noise is filtered from raw fluorescence readings by smoothing, baseline subtraction and amplitude normalization. (II) The optimal threshold is selected automatically from regression parameters of the standard curve. (III) Crossing points (CPs) are derived directly from coordinates of points where the threshold line crosses fluorescence plots obtained after the noise filtering. (IV) The means and their variances are calculated for CPs in PCR replicas. (V) The final results are derived from the CPs' means. The CPs' variances are traced to results by the law of error propagation. A detailed description and analysis of this data processing is provided. The limitations associated with the use of parametric statistical methods and amplitude normalization are specifically analyzed and found fit to the routine laboratory practice. Different options are discussed for aggregation of data obtained from multiple reference genes.

Conclusion: A standard curve based procedure for PCR data processing has been compiled and validated. It illustrates that standard curve design remains a reliable and simple alternative to the PCR-efficiency based calculations in relative real time PCR.

Methodology article
Clinical Chemistry, 2004
Properties of the Reverse Transcription Reaction in mRNA Quantification
Anders Stahlberg, Joakim Hakansson, Xiaojie Xian, Henrik Semb, Mikael Kubista

Clinical Chemistry 50:3, 509-515  (2004)
Abstract
Background: In most measurements of gene expression, mRNA is first reverse-transcribed into cDNA. We studied the reverse transcription reaction and its consequences for quantitative measurements of gene expression.

Methods: We used SYBR green I-based quantitative real-time PCR (QPCR) to measure the properties of reverse transcription reaction for the β-tubulin, glyceraldehyde-3-phosphate dehydrogenase, Glut2, CaV1D, and insulin II genes, using random hexamers, oligo(dT), and gene-specific reverse transcription primers.

Results: Experimental variation in reverse transcription-QPCR (RT-QPCR) was mainly attributable to the reverse transcription step. Reverse transcription efficiency depended on priming strategy, and the dependence was different for the five genes studied. Reverse transcription yields also depended on total RNA concentration.

Conclusions: RT-QPCR gene expression measurements are comparable only when the same priming strategy and reaction conditions are used in all experiments and the samples contain the same total amount of RNA. Experimental accuracy is improved by running samples in (at least) duplicate starting with the reverse transcription reaction.

Methodology article
BBRC 2002
Validation of a Quantitative Method for Real Time PCR Kinetics
Weihong Liu and David A. Saint

Biochemical and  Biophysical Research Communications 294 (2002) 347-353
Abstract
Real time RT-PCR is the most sensitive method for quantitation of gene expression levels. The accuracy can be dependent on the mathematical model on which the quantitative methods are based. The generally accepted mathematical model assumes that amplification efficiencies are equal at the exponential phase of the reactions for the same amplicon. However, no methods are available to test the assumptions regarding amplification efficiency before one starts the real time PCR quantitation. Here we further develop and test the validity of a new mathematical model which dynamically fits real time PCR data with good correlation (R² = 0.9995 ± 0.002, n = 50). The method is capable of measuring cycle-by-cycle PCR amplification efficiencies and demonstrates that these change dynamically. Validation of the method revealed the intrinsic relationship between the initial amount of gene transcript and kinetic parameters. A new quantitative method is proposed which represents a simple but accurate quantitative method.
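
To see what "cycle-by-cycle efficiency" means in the simplest possible terms, the Python sketch below divides each fluorescence reading by the one before it. This is not the model developed in the paper, only a naive ratio calculation on a synthetic sigmoidal curve (the curve shape and its parameters are assumptions), but it shows how the per-cycle amplification factor drifts downward as the reaction approaches its plateau.

```python
import math

def per_cycle_factors(fluorescence):
    """Fold increase from each cycle to the next: F[n+1] / F[n]."""
    return [later / earlier
            for earlier, later in zip(fluorescence, fluorescence[1:])]

# Synthetic baseline-free sigmoidal curve (assumed shape, no noise).
cycles = list(range(1, 41))
fluorescence = [1.0 / (1.0 + math.exp(-(c - 25) / 1.5)) for c in cycles]

factors = per_cycle_factors(fluorescence)
for cycle, factor in zip(cycles, factors):
    if cycle % 5 == 0:
        print(f"cycle {cycle:2d} -> {cycle + 1:2d}: {factor:.3f}-fold")
```

On real, noisy data these raw ratios are very unstable in the early cycles, which is one reason a fitted model is used instead.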

Methodology article
Neuroscience Ltrs. 2003
Assumption-free Analysis of Quantitative Real-Time Polymerase Chain Reaction (PCR) Data
Christian Ramakers, Jan M. Ruiter, Ronald H. Lekanne Deprez, Antoon F.M. Moorman

Neuroscience Letters 339 (2003) 62-66
Abstract
Quantification of mRNAs using real-time polymerase chain reaction (PCR) by monitoring the product formation with the fluorescent dye SYBR Green I is being extensively used in neurosciences, developmental biology, and medical diagnostics. Most PCR data analysis procedures assume that the PCR efficiency for the amplicon of interest is constant or even, in the case of the comparative Ct method, equal to 2. The latter method already leads to a 4-fold error when the PCR efficiencies vary over just a 0.04 range. PCR efficiencies of amplicons are usually calculated from standard curves based on either known RNA inputs or on dilution series of a reference cDNA sample. In this paper we show that the first approach can lead to PCR efficiencies that vary over a 0.2 range, whereas the second approach may be off by 0.26. Therefore, we propose linear regression on the Log(fluorescence) per cycle number data as an assumption-free method to calculate starting concentrations of mRNAs and PCR efficiencies for each sample. A computer program to perform this calculation is available on request (email: bioinfo@amc.uva.nl; subject: LinRegPCR).
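
The sensitivity to efficiency is easy to check with a few lines of Python. The sketch below computes how far off a back-calculated starting amount would be if you assumed perfect doubling when the real amplification factor was lower. It is a generic compounding calculation with numbers I picked for illustration, and it does not attempt to reproduce the specific 4-fold figure quoted in the abstract.

```python
def fold_error(assumed_factor, actual_factor, n_cycles):
    """Ratio of true to estimated starting amount after n_cycles of amplification.

    Back-calculating with too high a factor underestimates the starting
    amount by this ratio.
    """
    return (assumed_factor / actual_factor) ** n_cycles

# Assumed example: analysis assumes 2.0-fold per cycle, reaction ran at 1.9-fold.
no_error = fold_error(2.0, 2.0, 25)
error_25_cycles = fold_error(2.0, 1.9, 25)
print(f"error with correct assumption: {no_error:.2f}-fold")
print(f"error after 25 cycles at 1.9 instead of 2.0: {error_25_cycles:.2f}-fold")
```

Because the error grows as a power of the cycle number, a small misjudgment of the efficiency matters more for samples that need more cycles to reach threshold.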

Methodology article
Nucleic Acids Res. 2003
Standardized Determination of real-Time PCR Efficiency from a Single Reaction Set-Up
Ales Tichopad, Michael Dilger, Gerhard Schwartz, Michael Pfaffl

Nucleic Acids Research, 2003.  Vol. 31 no. 20  e122
Abstract
We propose a computing method for the estimation of real-time PCR amplification efficiency.  It is based on a statistic delimitation of the beginning of exponentially behaving observations in real-time PCR kinetics.  PCR ground fluorescence phase, non-exponential and plateau phase were excluded from the calculation process by separate mathematical algorithms.  We validated the method on experimental data on multiple targets obtained on the LightCycler platform.  The developed method yields results of higher accuracy than currently used method of serial dilutions for amplification efficiency estimation.  The single reaction set-up estimation is sensitive to differences in starting concentrations of target sequence in samples.  Furthermore, it resists the subjective influence of researchers, and the estimation can therefore be fully instrumentalized.

Methodology article
Nucleic Acids Res. 2004

(Concerns TaqMan probes)
Instant Evaluation of the Absolute Initial Number of cDNA Copies from a Single Real-Time PCR Curve
Stephanie Swillens, Jean-Christophe Goffard, Yoann Marechal, Alban de Kerchove d'Exaerde, Hakim El Housni

Nucleic Acids Research, 2004.  Vol. 32 no. 6  e53
Abstract
Amplification of a cDNA product by quantitative PCR (qPCR) is monitored by a fluorescent signal proportional to the amount of produced amplicon.  The qPCR amplification curve usually displays an exponential phase followed by a non-exponential phase, ending with a plateau.  Contrary to prevalent interpretation, we demonstrate that under standard qPCR conditions, the plateau can be explained by depletion of the probe through Taq polymerase catalysed hydrolysis.  Knowing the probe concentration and the fluorescence measured at the plateau, a specific fluorescence can thus be calculated.  As far as probe hydrolysis quantitatively reflects amplicon synthesis, this, in turn, makes it possible to convert measured fluorescence levels in the exponential phase into concentrations of produced amplicon.  It follows that the absolute target cDNA concentration initially engaged in the qPCR can be directly estimated from the fluorescence data, with no need to refer to any calibration with known concentrations of target DNA.

Methodology article
BioTechniques 2005
Improved Real-Time RT-PCR method for High-Throughput Measurements Using Second Derivative Calculation and Double Correction
Van Luu-The, Nathalie Paquet, Ezequiel Calvo, Jean Cumps

BioTechniques 38:287-293
(Feb. 2005)
Abstract
Quantification of mRNA expression levels using real-time reverse transcription PCR (RT-PCR) is increasingly used to validate results of DNA microarrays or GeneChips®. It requires an improved method that is more robust and more suitable for high-throughput  measurements. In this report, we compare a user non-influent, second derivative method with that of a user influent, fit point method  that is widely used in the literature. We also describe the advantage of using a double correction: one correction using the expression levels of a housekeeping gene of an experiment as an internal standard and a second using reference expression levels of the  same housekeeping gene in the tissue or cells. The first correction permits one to decrease errors due to sample preparation and  handling, while the second correction permits one to avoid the variation of the results with the variability of housekeeping in each  tissue, especially in experiments using various treatments. The data indicate that the real-time PCR method is highly efficient with  an efficiency coefficient close to the theoretical value of two. The results also show that the second derivative method is more accurate than the fit point method in quantifying low gene expression levels. Using triplicate experiments, we show that measurement  variations using our method are low with a mean of variation coefficients of <1%.

Methodology article
BioTechniques 2001
High-Resolution Semi-Quantitative Real-Time PCR Without the Use of a Standard Curve
Alex Gentle, Frank Anastasopoulos, Neville A. McBrien

BioTechniques 31:502-508
(Sept. 2001)
Abstract
The repeatability and sensitivity of a simple, adaptable, semi-quantitative, real-time RT-PCR assay was investigated. The assay can be easily and rapidly applied to quantitate relative levels of any gene product without using standards, provided that amplification conditions are specific for the PCR product of interest. Using the LightCycler™ real-time PCR machine, a serial 10-fold dilution series (spanning four orders of magnitude) of a 379-bp cDNA template was amplified, and the PCR product was detected using SYBR™ Green I chemistry. The experiment was repeated on a subsequent day. The experimental design was such that the data lent itself to analysis using an appropriate method for testing repeatability. It was found that, within a single assay, for samples assayed in triplicate, a difference of 23% may be reliably detected. Furthermore, when all of the factors that contribute to variability in the assay are taken into account, such as day-to-day variation in pipetting and amplification efficiency, a 52% difference in target template can be detected using a sample size of 4. The assay was found to be linear over at least four orders of magnitude.