PCR Simulator

1. Summary

PCR is a central technique in molecular biology, with an extremely wide range of applications in genetic analysis and engineering. The PCR Simulator is a tool for teaching various aspects of this technology. It estimates the products of multiple primers acting on genome-scale templates under user-specified reaction conditions. Results are presented as images or animations of electrophoresis gels. A unique feature of this simulator is its support for individuals with different genotypes.

The PCR Simulator takes user inputs for primer sequences and reaction conditions. It uses NCBI BLAST to search the primers against genomic sequences to find the potential primer binding sites, then estimates the yield of each potential product using a quantitative model.

Example output from PCR simulator using CSF1PO primers on various individuals

2. Software Architecture

Individuals are built using the organism's reference genome as a framework, with particular alleles added in at specified loci. The list of options for each variable locus is an individual's genotype. If you amplify a DNA segment including part or all of a variable locus, the product sequence will reflect the alleles of that individual's genome. Since a 'genotype' just stores the options for the variable parts, and the majority of the genome is not variable, we are able to store genomic DNA representing many different individuals.
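The idea of a genotype as a small overlay on the reference genome can be sketched as follows. This is only an illustration, not the simulator's actual data model: the `Locus` class, the `build_individual` function, the locus name `STR1`, and the toy sequences are all hypothetical, and the sketch treats each individual as haploid for simplicity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """A variable region of the reference (hypothetical structure)."""
    name: str
    start: int  # 0-based start of the variable region in the reference
    end: int    # exclusive end of the variable region


def build_individual(reference: str, loci: list[Locus], genotype: dict[str, str]) -> str:
    """Return the reference with each variable locus replaced by the chosen allele."""
    pieces = []
    cursor = 0
    for locus in sorted(loci, key=lambda item: item.start):
        # Copy the invariant stretch of reference before this locus.
        pieces.append(reference[cursor:locus.start])
        # Use the individual's allele, or the reference allele if none is given.
        pieces.append(genotype.get(locus.name, reference[locus.start:locus.end]))
        cursor = locus.end
    pieces.append(reference[cursor:])
    return "".join(pieces)


# Toy reference: flanking sequence around a short tandem repeat of 3 units.
reference = "GGCCTT" + "AGAT" * 3 + "CCGGAA"
loci = [Locus("STR1", 6, 18)]

# Each genotype stores only the variable part, not a whole genome.
person_a = {"STR1": "AGAT" * 5}
person_b = {"STR1": "AGAT" * 2}

seq_a = build_individual(reference, loci, person_a)
seq_b = build_individual(reference, loci, person_b)
print(len(reference), len(seq_a), len(seq_b))
```

Because each genotype is just a small dictionary, many individuals can share one copy of the reference sequence.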

Data Flow in PCR Simulator

The user may employ various interfaces; the standard interface on cybertory.org is an HTML form, whereas the cybertory.com interface is the Flash Virtual Laboratory. Each interface constructs a PCR request in XML format, which is submitted to the PCR Simulation Server. Note that the user interfaces are not hosted on the simulation server, but are on other web sites.

The user specifies reaction conditions, including primer sequences, templates, DNA concentrations, denaturation temperature, annealing temperature, and number of cycles.
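To make the idea of an XML PCR request concrete, here is a minimal sketch of how a client might assemble one. The actual request schema is not described here, so every element and attribute name below (`pcrRequest`, `reaction`, `primer`, `conditions`, and so on) is an assumption made for illustration, as are the placeholder primer sequences and template name.

```python
import xml.etree.ElementTree as ET


def build_pcr_request(primers, template, denature_c, anneal_c, cycles):
    """Build a PCR request document (all tag names are hypothetical)."""
    root = ET.Element("pcrRequest")
    reaction = ET.SubElement(root, "reaction")
    ET.SubElement(reaction, "template").text = template
    for sequence in primers:
        ET.SubElement(reaction, "primer").text = sequence
    # Reaction conditions are stored as attributes of one element.
    ET.SubElement(reaction, "conditions", {
        "denaturationC": str(denature_c),
        "annealingC": str(anneal_c),
        "cycles": str(cycles),
    })
    return ET.tostring(root, encoding="unicode")


request_xml = build_pcr_request(
    primers=["ACGTACGTACGTACGTACGT", "TGCATGCATGCATGCATGCA"],  # placeholders
    template="individual_1",  # hypothetical template identifier
    denature_c=94,
    anneal_c=55,
    cycles=30,
)
print(request_xml)
```

A request with several lanes would presumably repeat the `reaction` element, one per lane, but that too is an assumption.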

A Java Servlet fields the requests and spawns various processes to coordinate PCR simulations. Long-running processes are managed by a central Java 'appserver'.

Primer binding sites are located on template genomes using NCBI BLAST, and recorded in a PostgreSQL database. Potential PCR products are identified by searching for pairs of primer binding sites with appropriate orientation and spacing. The sequence of each potential product is retrieved from the BLAST database using a modified version of the NCBI "fastacmd" tool.
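The search for pairs of binding sites with appropriate orientation and spacing can be sketched as below. This is a simplified illustration rather than the simulator's database query: the `Site` record, the coordinate convention, and the `min_len` and `max_len` limits are assumptions chosen for the example.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """A primer binding site (hypothetical record layout)."""
    primer: str
    position: int  # top-strand coordinate of the primer's 5' end
    strand: str    # "+" extends toward higher coordinates, "-" toward lower


def find_products(sites, min_len=40, max_len=3000):
    """Pair each "+" site with every downstream "-" site within the size limits."""
    forward = sorted((s for s in sites if s.strand == "+"), key=lambda s: s.position)
    reverse = sorted((s for s in sites if s.strand == "-"), key=lambda s: s.position)
    products = []
    for f in forward:
        for r in reverse:
            # The primers must face each other: the "-" site lies downstream.
            length = r.position - f.position + 1
            if min_len <= length <= max_len:
                products.append((f.primer, r.primer, f.position, r.position, length))
    return products


sites = [
    Site("P1", 100, "+"),
    Site("P2", 420, "-"),
    Site("P1", 5000, "+"),  # no "-" site downstream, so no product
    Site("P2", 50, "-"),    # upstream of every "+" site, so no product
]
products = find_products(sites)
print(products)
```

The nested loop is quadratic in the number of sites; a real implementation working at genome scale would more likely push this join into the database.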

Since BLAST searches against genomic templates are computationally intensive, we store the results in a data cache, in case the same search is requested again. This greatly improves response times for classes using standard sets of primers, such as those for the FBI CODIS Core loci.
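The caching idea can be illustrated with a small in-memory sketch. The simulator's own cache is not described in detail here, so the `SearchCache` class, its key scheme, and the stand-in `slow_search` function are all hypothetical; a dictionary replaces whatever persistent store the server uses.

```python
import hashlib


class SearchCache:
    """Memoize search results per (template, primer) pair (illustrative only)."""

    def __init__(self, search_fn):
        self._search_fn = search_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(primer, template_id):
        # Normalize case and whitespace so equivalent requests share one entry.
        normalized = primer.strip().upper()
        return hashlib.sha256(f"{template_id}:{normalized}".encode()).hexdigest()

    def lookup(self, primer, template_id):
        key = self._key(primer, template_id)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._search_fn(primer, template_id)
        return self._store[key]


calls = []


def slow_search(primer, template_id):
    """Stand-in for an expensive BLAST run; returns a dummy result."""
    calls.append((primer, template_id))
    return [f"site for {primer.strip().upper()} on {template_id}"]


cache = SearchCache(slow_search)
first = cache.lookup("acgtacgtacgt", "reference_genome")
second = cache.lookup("ACGTACGTACGT ", "reference_genome")
print(cache.hits, cache.misses, len(calls))
```

When a whole class submits the same standard primers, only the first request pays for the search.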

A quantitative model (implemented in Perl) estimates the yield of each potential product. We use a reversible kinetic model and nearest-neighbor thermodynamic calculations to estimate the extent of primer binding under the simulation conditions.
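As a rough illustration of how thermodynamic values translate into an extent of primer binding, the sketch below uses a simple two-state equilibrium with primer in large excess over template. This is a deliberate simplification and not the reversible kinetic model used by the simulator. The function name is hypothetical, and the enthalpy and entropy values are illustrative placeholders; in practice they would come from summing nearest-neighbor parameters over the primer-template duplex.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def fraction_bound(delta_h_kcal, delta_s_cal, temp_c, primer_molar):
    """Fraction of template sites occupied by primer at equilibrium.

    delta_h_kcal: duplex formation enthalpy in kcal/mol
    delta_s_cal:  duplex formation entropy in cal/(mol*K)
    Assumes primer concentration is far above template concentration.
    """
    temp_k = temp_c + 273.15
    delta_g = delta_h_kcal * 1000.0 - temp_k * delta_s_cal  # cal/mol
    k_assoc = math.exp(-delta_g / (R_CAL * temp_k))         # association constant
    occupancy = k_assoc * primer_molar
    return occupancy / (1.0 + occupancy)


# Placeholder thermodynamic values, not measured data.
DELTA_H = -150.0  # kcal/mol
DELTA_S = -400.0  # cal/(mol*K)
PRIMER = 0.5e-6   # 0.5 micromolar primer

for temp_c in (55, 72, 85):
    print(temp_c, fraction_bound(DELTA_H, DELTA_S, temp_c, PRIMER))
```

With these placeholder values, binding falls as the temperature rises, which is the qualitative behavior an annealing-temperature experiment is meant to show.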

This model estimates priming efficiencies as a function of both primer binding and the sequence of the 3' end of the primer relative to the template, where mismatches nearer the end are more disruptive to priming. The results of this quantitative model are a set of products and intensities for each reaction, with a set of reactions representing the various lanes of a gel. An XSLT (eXtensible Stylesheet Language for Transformations) stylesheet converts the product predictions to an image of an electrophoresis gel in Scalable Vector Graphic (SVG) format.
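The final step, turning products and intensities into a gel image, can be sketched in a few lines. The simulator does this with an XSLT stylesheet; the Python below is a substitute chosen only for illustration. The function names, the image dimensions, and the assumption that migration distance varies linearly with the logarithm of fragment length are all choices made for this example.

```python
import math


def band_y(length_bp, top=20.0, bottom=280.0, min_bp=50, max_bp=5000):
    """Vertical band position: shorter fragments migrate farther down the gel."""
    clamped = min(max(length_bp, min_bp), max_bp)
    span = math.log10(max_bp) - math.log10(min_bp)
    fraction = (math.log10(max_bp) - math.log10(clamped)) / span
    return top + fraction * (bottom - top)


def render_gel(lanes, lane_width=40, gap=10, height=300):
    """Render lanes of (length_bp, intensity) bands as an SVG string."""
    width = gap + len(lanes) * (lane_width + gap)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<rect width="{width}" height="{height}" fill="black"/>',
    ]
    for index, bands in enumerate(lanes):
        x = gap + index * (lane_width + gap)
        for length_bp, intensity in bands:
            # Intensity (0 to 1) is drawn as the opacity of a white band.
            opacity = min(max(intensity, 0.0), 1.0)
            parts.append(
                f'<rect x="{x}" y="{band_y(length_bp):.1f}" width="{lane_width}" '
                f'height="4" fill="white" fill-opacity="{opacity:.2f}"/>'
            )
    parts.append("</svg>")
    return "\n".join(parts)


# Three lanes: one band, two close bands, and an empty lane.
lanes = [[(300, 1.0)], [(300, 0.8), (320, 0.8)], []]
svg = render_gel(lanes)
print(svg)
```

Saving the printed text to a file with an `.svg` extension should display it in a browser.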

Depending on the XSLT stylesheet used, the system produces either a static or an animated image. Using open source tools (Apache Batik and Java Advanced Imaging), we can convert the static graphics into alternative formats. For simple representation on a web page, static SVG works in most modern browsers, and JPG format provides an alternative if SVG is not supported. If you need to perform densitometry or other types of scientific image analysis, you can request the results in 16 bit grayscale TIFF format. This is compatible with NIH ImageJ, for example.

Output from the PCR simulator can also be sent on to other modules, such as DNA sequencing. This is done slightly differently on cybertory.org and cybertory.com. On the cybertory.org server, PCR product sequences are saved in a location where they can be used as templates by a specialized version of the sequencing simulator. Users must be sure to log in with unique names, since only the last set of PCR products is saved for any given user name. We recommend using your email address as a user name, since that should be unique. In the commercial interface, the PCR products are added into the solutions on the PCR machine, and they can be used like any other DNA molecule, for sequencing, restriction digestion, etc.

3. Research Applications

Because this model of the process may be useful in scientific research, we have made it available under an open source license.

4. HTML Interface

Configuration options are sent to the CGI program as “GET method” parameters, so they are part of the URL itself. This means you can save bookmarks to particular configurations of the interface.
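The following sketch shows why GET parameters make a configuration bookmarkable: the whole configuration round-trips through the URL. The base URL and the parameter names (`lanes`, `cycles`, `annealing`) are hypothetical and are not the actual options of the HTML interface.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical CGI location and option names, for illustration only.
base = "http://example.org/cgi-bin/pcr_form"
options = {"lanes": 4, "cycles": 30, "annealing": 55}

# Encode the configuration into the query string of the URL.
url = f"{base}?{urlencode(options)}"
print(url)

# Anyone opening the bookmarked URL can recover the same configuration.
restored = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
print(restored)
```

Note that values come back as strings, so the receiving program has to convert them to numbers itself.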

5. Flash Interface

The cartoon interface uses a virtual PCR Machine in the cartoon virtual lab.

The Cybertory Virtual Molecular Biology Lab includes a PCR Machine

Reactions are set up by mixing the necessary ingredients (template, primers, dNTPs, thermostable DNA polymerase, and buffer), placing the reaction tubes into the PCR machine, and setting the desired cycling parameters. When the machine is started, it sends a PCR request to the simulation server. The PCR products are translated back into components of the virtual solutions in the PCR machine. They can then be used like any other DNA molecule in the system, by loading them on gels, performing sequencing reactions, measuring them using spectrophotometry, etc.

Users must register in the course management system at cybertory.com to use the Virtual Molecular Biology Lab and other features of this site.

6. Acknowledgements

This work was funded by NIH grant #R44 RR13645 02A2 to Attotron Biosensor Corporation.

©2009, Robert M. Horton, PhD