We will do simulated experiments to test the DNA of several individuals. These DNA samples simulate materials recovered from crime scenes (blood, semen, skin, etc.); your job is to try to identify the individual matching the sample, based on PCR analysis of STR loci.
Once you have determined the number of repeats for each allele in your sample, you can search a database to see whether your sample matches any known offender or any sample from an unsolved crime.
DNA results are matched against an online database
Technologies: This exercise uses the Cybertory PCR simulator with individual human genomic templates, and named STR primers.
Time required: approximately 2 hours.
You should have already completed the paternity exercise, as it covers many of the genetic concepts we need for DNA-based identification.
You should have already done the STR-based paternity testing exercise and be familiar with Short Tandem Repeat (STR) loci. Whereas paternity testing only required us to match the baby's bands against those of the mother and the possible fathers, in this exercise we must determine the actual number of repeats in each allele. This will let us search a database of other DNA samples. The FBI's CODIS database (Combined DNA Index System) contains data from known criminals (mostly sex offenders), as well as from previously unsolved crimes. If your sample matches a known offender, it may provide evidence that the offender was involved in this crime. If it matches only a previously unsolved crime, linking the two may give investigators valuable leads. We use a toy database of offenders and unsolved crime samples called the Cybertory™ On-line DNA Indexing System.
As in the earlier exercise, you should use the amelogenin primers to determine the sex of the individual from which your sample came.
Determine the sex using the amelogenin primers, as before. You do not need to quantitate the band sizes exactly, but you should check that they are about the expected sizes given in the primer table.
We will make custom DNA size markers for each STR locus. This will make it easy to determine the number of repeats in each allele, by counting the marker bands. Start with the TPOX locus.
In the primer table from the Criminal Database documentation page, look up the "base length" for the TPOX locus. This is the expected length of the PCR product if the allele had no repeats at all.
Note the "repeat length" for the TPOX locus. This is the number of bases in each repeat for this STR.
Note the "repeats range" for the locus. These are the smallest and largest numbers of repeats that have been seen in alleles at this locus.
Open the PCR simulator page in a new window.
Make a custom marker by entering the information from the primer table into the Marker section of the PCR setup page. The smallest band should match the "base length" for the locus (82 bp for TPOX). The step size should match the repeat length (4 bp for TPOX, as for most of the FBI CODIS STRs). The largest band of your marker should be at least as large as the largest known allele. For example, at the TPOX locus the largest known allele has 14 repeats, so its expected PCR product size is 82 + 4 * 14 = 138 bp. It is OK to make your largest band somewhat bigger than necessary; in most cases, making it 100 bp larger than the smallest band will work fine.
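The marker arithmetic in this step can be sketched in a few lines of Python. This is only an illustration: the function name `marker_bands` is made up here, and the simulator itself takes these numbers through its Marker form, not through code. The TPOX values are the ones quoted in the step above.

```python
def marker_bands(base_length, repeat_length, max_repeats):
    """Return the marker band sizes in bp, one band per repeat count.

    Hypothetical helper for illustration; not part of the Cybertory simulator.
    Index n in the returned list is the band for an allele with n repeats.
    """
    return [base_length + repeat_length * n for n in range(max_repeats + 1)]


# TPOX values from the text: base length 82 bp, 4 bp repeats, up to 14 repeats.
tpox_marker = marker_bands(82, 4, 14)
print("smallest band:", tpox_marker[0])    # the zero-repeat band
print("largest band:", tpox_marker[-1])    # the 14-repeat band
print("number of bands:", len(tpox_marker))
```

The smallest and largest values correspond to the first and last band sizes you would enter in the Marker section, and the repeat length is the step size.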
Do a PCR on your sample(s) with the TPOX primers and your custom marker.
Determine the number of repeats in each allele by counting the marker bands up to the size of your product(s). Remember the bottom marker band has zero repeats, the next has one, etc. Write down your results.
Repeat the process for several other loci. Make a custom marker for each locus. Write down the number of repeats for each allele. Note that some alleles have partial repeats. These are written with what looks like a decimal point, but the point really just separates the number of full repeats from the number of remaining bases. For example, an allele with 13 full 4-base repeats plus two extra bases would be called "13.2". You can find some examples in the Short Tandem Repeat DNA Internet DataBase.
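As a cross-check on counting marker bands by eye, the same naming rule can be written as a small calculation. The sketch below assumes you already know a product's size in bp; the function name `allele_name` and the second example locus (base length 100 bp) are invented for illustration and do not come from the primer table.

```python
def allele_name(product_size, base_length, repeat_length):
    """Name an allele from its PCR product size, e.g. "8" or "13.2".

    Hypothetical helper for illustration. The part after the point is the
    number of leftover bases, not a decimal fraction of a repeat.
    """
    extra = product_size - base_length
    if extra < 0:
        raise ValueError("product is shorter than the base length")
    full_repeats, leftover = divmod(extra, repeat_length)
    if leftover == 0:
        return str(full_repeats)
    return f"{full_repeats}.{leftover}"


# TPOX (base length 82 bp, 4 bp repeats): a 114 bp product has 8 full repeats.
print(allele_name(114, 82, 4))

# Made-up locus with base length 100 bp and 4 bp repeats:
# 13 full repeats plus 2 extra bases gives a 154 bp product, named "13.2".
print(allele_name(154, 100, 4))
```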
Go to the Cybertory™ On-line DNA Indexing System.
Enter the number of repeats you determined at each locus into the appropriate boxes. Most people will have two different alleles at most loci. The convention is to list the smallest allele size first, but the search should work in either order.
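To see why the order of the two alleles should not matter, here is a minimal sketch of an order-independent comparison. How the real search page compares entries is not documented here, so treat this as an assumption about one reasonable approach; the function names and the two example genotypes are invented for illustration.

```python
def allele_key(name):
    """Sort key for an allele name: "13.2" -> (13, 2), "9" -> (9, 0)."""
    full, _, extra = name.partition(".")
    return (int(full), int(extra or 0))


def normalize(profile):
    """List each locus's alleles smallest first, following the convention."""
    return {
        locus: tuple(sorted(alleles, key=allele_key))
        for locus, alleles in profile.items()
    }


def matching_loci(profile_a, profile_b):
    """Return the loci typed in both profiles whose allele pairs agree."""
    a, b = normalize(profile_a), normalize(profile_b)
    return sorted(locus for locus in a.keys() & b.keys() if a[locus] == b[locus])


# Invented genotypes, entered in different orders.
sample = {"TPOX": ("11", "8"), "TH01": ("9.3", "6")}
record = {"TPOX": ("8", "11"), "TH01": ("6", "9.3")}
print(matching_loci(sample, record))
```

Parsing "9.3" into a pair of integers, not a decimal number, keeps the partial-repeat notation intact while still sorting alleles by size.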
Does your sample match any entries in the database? If not, what might be some reasons?
If your sample matches someone in the database, does that mean for sure that person is guilty (or that your sample certainly comes from the perpetrator of another unsolved crime)?
If a small number of loci match between two samples, is that as convincing as a large number of matches?
If two samples share a rare allele, is that more significant than if they share a common allele?
If two samples have six alleles that match, and one that clearly does not, are they from the same person?
What kinds of mistakes could you make in trying to identify people this way?
What is the largest number of 4-base repeats you could measure with a marker where the largest band is 100 bases bigger than the base length?
What would happen if the largest band of your marker were much bigger than the expected sizes of your PCR products? Remember that the size resolution range of our simulated gels magically adjusts to match your marker.
See the expected product sizes in the STR primers table below. This information will let you design custom size markers so you can quantitate the number of repeats in each allele.
Example results from the primers in the table are shown below the table.
STR information is taken from the Short Tandem Repeat DNA Internet DataBase.
This link to the Cybertory PCR simulator is set up to use the primers from the table on simulated individual humans.
Search for matching suspects and samples from unsolved crimes using the Cybertory On-line DNA Indexing System.
Butler, John M. "Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers", 2nd edition (2005), Elsevier Academic Press.
See this map of where the FBI CODIS core STR loci are located in the genome.
Paper on calculating probability of sibships.
This work was funded by NIH SBIR grant 2R44RR013645-02A2 to Attotron Corporation.