In this exercise, we use simulated experiments to test the DNA of six individuals: a mother, her three children, and two different men who may be the fathers of the children. DNA samples are used as templates in Polymerase Chain Reaction (PCR) experiments to determine which variants each child has at certain genetic positions. A child inherits one variant from each parent, and we can usually match one of the child's variants to a variant in the mother. This means the other one had to come from the father. If neither of a possible father's genetic variants matches it, then he is not the father. If we examine enough genetic positions, it is very likely that we will be able to rule out any man who is not actually the father of a particular child.
When you have finished the experiment, you can look at a family tree of the individuals to confirm their relationships, and check your answers.
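The exclusion reasoning described above can be sketched in a few lines of Python. This is only an illustration: the genotypes are hypothetical repeat counts at a single locus, the function names are invented for this sketch, and it ignores complications such as mutation.

```python
def possible_paternal_alleles(child, mother):
    """Return the child's alleles that could have come from the father."""
    c1, c2 = child
    candidates = set()
    # If the mother could have supplied one allele, the father must have
    # supplied the other one.
    if c1 in mother:
        candidates.add(c2)
    if c2 in mother:
        candidates.add(c1)
    return candidates


def is_excluded(child, mother, man):
    """True if the man carries none of the possible paternal alleles."""
    return not (possible_paternal_alleles(child, mother) & set(man))


# Hypothetical genotypes (repeat counts) at one STR locus.
mother = (10, 12)
child = (12, 14)
man_a = (9, 11)
man_b = (11, 14)

print(is_excluded(child, mother, man_a))  # True: man_a lacks allele 14
print(is_excluded(child, mother, man_b))  # False: man_b cannot be ruled out
```

Note that a result of False only means the man has not been ruled out at this locus; it does not show that he is the father.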
Determine relationships among family members using DNA analysis
Technologies: This exercise uses the Cybertory PCR simulator with genomes of simulated individuals. Results are presented as gel images.
Time required: approximately 2 hours.
After studying the background materials and completing the exercises, you should be able to:
Define the following terms:
Expand and define the following acronyms:
List the most common reasons DNA evidence may be rejected in court.
Explain how PCR-amplified markers are used in sex typing.
Explain how STRs are used in paternity testing.
Calculate random match probabilities for a set of observed STR alleles using tables of observed population allele frequencies.
The place on a chromosome where a gene or other sequence of interest is located is called a locus (plural loci - from the same root as the word "location"). A particular sequence variant ("flavor") at a particular locus is called an allele. Loci that have many alleles are said to be "polymorphic", which means these loci have "many forms".
PCR (the "Polymerase Chain Reaction") is a technique to replicate or amplify targeted segments of DNA. We will use it to amplify specific loci, then examine the amplified product to determine which alleles are present.
Electrophoresis is a method of separating molecules (in this case, DNA) by size in a gel matrix in an electric field. Smaller fragments migrate fastest through the gel, while larger fragments migrate more slowly.
A marker lane containing bands of known sizes is used to measure the sizes of sample bands. Gels have a limited range of sizes they can resolve. Molecules that are too small for the gel to resolve run as a "small" group, and those that are too big run in the "large" group. In real gels, the resolution range can be controlled by the concentration of gel material (usually either agarose or polyacrylamide) used to make the gel. In our virtual experiments, the gel resolution range will be automatically adjusted to match the range of the marker bands (actually 10% smaller than the smallest marker band to 10% larger than the largest marker band).
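The 10% rule for the virtual gel can be written out as a short Python sketch. The function names and the marker ladder below are hypothetical, and treating fragments exactly at the limits as resolved is an assumption of this sketch, not a documented behavior of the simulator.

```python
def resolution_range(marker_bands):
    """Return (low, high): 10% below the smallest and 10% above the largest marker."""
    return 0.9 * min(marker_bands), 1.1 * max(marker_bands)


def classify(fragment_size, marker_bands):
    """Say whether a fragment is resolved or falls into the small/large group."""
    low, high = resolution_range(marker_bands)
    if fragment_size < low:
        return "small"
    if fragment_size > high:
        return "large"
    return "resolved"


markers = [100, 200, 300, 400]  # hypothetical marker ladder, sizes in bases
for size in (60, 250, 500):
    print(size, classify(size, markers))
```

With this ladder the resolved range is roughly 90 to 440 bases, so the 60-base fragment runs with the "small" group and the 500-base fragment with the "large" group.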
STRs (Short Tandem Repeats) are sequences of DNA composed of multiple copies of a particular base sequence. "Short" repeats are one to six bases in length. "Tandem" means they are arranged head-to-tail (as opposed to "inverted repeats", or palindromes). For example, two tandem repeats of the short tetranucleotide sequence "CTAG" would be "CTAGCTAG". An STR locus is a place in the DNA where an STR occurs. For our purposes, we will focus on STR loci in which alleles vary in the number of repeats.
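As a small illustration of how repeat number determines allele length, the following Python sketch builds two alleles from the "CTAG" unit and counts the repeats. The flanking sequences and function names are made up for this example and do not correspond to any real locus.

```python
def build_allele(left_flank, repeat_unit, n_repeats, right_flank):
    """Assemble an allele: flank + n head-to-tail copies of the unit + flank."""
    return left_flank + repeat_unit * n_repeats + right_flank


def count_tandem_repeats(sequence, repeat_unit):
    """Return the longest run of consecutive copies of repeat_unit."""
    best = 0
    unit_len = len(repeat_unit)
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Step through the sequence one repeat unit at a time.
        while sequence.startswith(repeat_unit, pos):
            run += 1
            pos += unit_len
        best = max(best, run)
    return best


print("CTAG" * 2)  # CTAGCTAG

# Hypothetical flanking sequences around the repeat region.
allele_7 = build_allele("AAGT", "CTAG", 7, "TTCA")
allele_9 = build_allele("AAGT", "CTAG", 9, "TTCA")

print(count_tandem_repeats(allele_7, "CTAG"))  # 7
print(len(allele_9) - len(allele_7))           # 8: two extra 4-base repeats
```

The length difference between the two alleles is simply the number of extra repeats times the length of the repeat unit.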
A big advantage of STR loci is that we can tell alleles apart simply by comparing their sizes on an electrophoresis gel. (In contrast, for alleles that are the same size but have different sequences, you might need to do a more complicated experiment, like hybridization or DNA sequencing, to tell them apart.)
There are thousands of STR loci in the human genome. The FBI has chosen a standard set of 13 STR loci to use for identifying DNA from crime scenes and from known criminals. Because they use the same set of loci for each DNA sample, they can store the results in a database called "CODIS" (Combined DNA Index System). Each locus has one or more standard sets of PCR primers used to amplify it. The alleles for each CODIS locus are well studied, and are documented at http://www.cstl.nist.gov/biotech/strbase/fbicore.htm
None of the CODIS loci are genetically linked to one another. They sort independently, which means that inheriting a particular allele at one locus does not correlate with inheriting any particular allele at a different locus.
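Because unlinked loci are inherited independently, genotype frequencies can be multiplied across loci to estimate a random match probability. The Python sketch below shows the simplest form of that calculation, assuming Hardy-Weinberg proportions (p² for a homozygote, 2pq for a heterozygote). The allele frequencies are invented for illustration and are not real population data, and the function names are hypothetical.

```python
from math import prod


def genotype_frequency(allele_a, allele_b, freqs):
    """Expected genotype frequency under Hardy-Weinberg proportions."""
    p = freqs[allele_a]
    if allele_a == allele_b:
        return p * p                 # homozygote: p squared
    return 2 * p * freqs[allele_b]   # heterozygote: 2pq


def random_match_probability(profile, freq_tables):
    """Multiply genotype frequencies across independently inherited loci."""
    return prod(
        genotype_frequency(a, b, freq_tables[locus])
        for locus, (a, b) in profile.items()
    )


# Hypothetical allele frequencies (NOT real population data).
freq_tables = {
    "CSF1PO": {10: 0.25, 12: 0.30},
    "TPOX": {8: 0.50, 11: 0.25},
}
profile = {"CSF1PO": (10, 12), "TPOX": (8, 8)}

print(random_match_probability(profile, freq_tables))
```

Here the CSF1PO genotype contributes 2 × 0.25 × 0.30 = 0.15 and the TPOX genotype contributes 0.50² = 0.25, for a combined probability of about 0.0375. Each additional locus multiplies in another factor, which is why a profile over many loci becomes very rare.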
D1S80 is another repeat locus, but since its repeats are so long (16 bases), it is technically not an "STR", but a "VNTR" (Variable Number of Tandem Repeats) locus. This locus is sometimes used in teaching labs, because the alleles are spaced widely enough to be resolved on a relatively low-resolution gel. It is not commonly used in actual forensic work, though, because the bands are too big to be resolved conveniently on automated equipment set up for using high resolution gel materials to determine STR alleles.
Amelogenin is a locus on the X and Y chromosomes. The version on the Y chromosome is shorter than the one on the X chromosome. Unlike the polymorphic STR loci, we only look at two versions of amelogenin (the X and the Y alleles). Because the PCR product from the X chromosome is larger than the product from the Y chromosome, males (XY) will have two bands and females (XX) will have only one (the larger band).
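The band-counting rule for amelogenin can be expressed as a tiny Python sketch. The band sizes used here are placeholders, not the actual product sizes in the simulator, and the function name is hypothetical.

```python
def sex_from_amelogenin(band_sizes):
    """Infer sex from the number of distinct amelogenin bands in a lane."""
    distinct = set(band_sizes)
    if len(distinct) == 1:
        return "female (XX)"   # only the X product is present
    if len(distinct) == 2:
        return "male (XY)"     # both the X and the Y products are present
    return "inconclusive"      # no bands, or more than two (e.g. a mixed sample)


# Placeholder band sizes in bases (hypothetical values).
print(sex_from_amelogenin([300]))       # female (XX)
print(sex_from_amelogenin([250, 300]))  # male (XY)
print(sex_from_amelogenin([]))          # inconclusive
```

Treating anything other than one or two bands as inconclusive is a design choice in this sketch; a failed reaction or a contaminated sample would need to be repeated rather than interpreted.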
Use the Cybertory PCR simulator on simulated individual humans. The page has three sections:
The simulated family (Pt04-Pt09) is described in this family tree. You can check your results against this tree to see if they agree.
Detailed information on the FBI CODIS STR primers is available at http://www.cstl.nist.gov/biotech/strbase/fbicore.htm
This textbook is devoted to the science of DNA typing using STRs: Butler, John M. "Forensic DNA Typing: Biology, Technology and Genetics of STR Markers", 2nd Edition (2005) Elsevier Academic Press
The six simulated people we will test in this exercise are listed in Table 1. Samples in the PCR simulator are labeled by "patient number", so you can use this table to tell who the people are.
Possible Father A
Possible Father B
We will first determine the sex of each of the samples, to double check that they actually match the table above. Then we will analyze several STR loci to find which possible father matches each of the children. Finally, we will compare our results against the family tree.
Our first PCR experiment will determine the sex of each individual using the Amelogenin primers. We need to test the six people from Table 1, but we have eight reactions available, so we might as well test some extra, unrelated people as controls. We will use the same primers and conditions in each reaction, but different templates.
Right-click to bring up the PCR Simulator webpage in a separate window.
In the "Default Reaction Parameters" section, be sure that "Amelogenin_F" is selected for primer A, and "Amelogenin_R" for primer B.
Set the concentration of each primer to 0.01. Leave the default values for annealing temperature, denaturation temperature, and number of cycles.
Go to the template list, and click on patient #04 to select this sample.
Scroll down so you can see patient #11.
Hold down the Shift key and click patient #11 (Shift-Click). This should select eight patients, from patient #04 through patient #11.
Scroll down the page to the individual reaction setup section. You should see that all of the reactions are set to use the amelogenin primers you selected above, as well as the reaction conditions you specified above. The eight templates you selected should be put into the eight separate reactions.
Reactions 7 and 8 are 'extra' people not in Table 1: we do not expect them to be related to the people in the table. For now, you shouldn't need to change any conditions for individual reactions.
Scroll down to the bottom of the page. For the output format select "JPEG", then click "run reactions and show gel". If you want to see an animation of the gel running, you must have a browser that can handle SVG data, and then you can use the "Animated SVG" option. Firefox and Safari have SVG support built in; other browsers may need an SVG plug-in, which is available from http://www.adobe.com .
The results should appear as a virtual PCR gel. You should see one big band in reactions 1, 4, and 5, and two bands in reactions 2, 3, and 6.
What are the sexes of the individuals based on your results? Are these results consistent with Table 1? How might you explain the results if they were not consistent with the table?
To get a definitive paternity identification, PCR results for the FBI CODIS STR loci need to be determined. The FBI CODIS PCR primers are all available on the PCR webpage as options in the Primer A and Primer B pulldown menus.
Scroll back up the PCR simulator webpage to the Default Reaction Parameters (yellow) section.
In the Marker section, set the "smallest band" to 100, the "largest band" to 400, and the "step size" to 50. This range should cover the PCR products from all of the additional FBI CODIS alleles we will test.
Select the forward PCR primer CSF1P0F as primer A, and the reverse PCR primer CSF1P0R as primer B.
Scroll down to the PCR setup section and make sure that all reactions now have the CSF1P0 primers selected.
Scroll down to the bottom of the page and hit "run reactions and show gel".
When the simulation finishes, examine the gel image. Can either of the possible fathers be eliminated as the father of Child A, based on the data from the CSF1PO locus? Child B? Child C? Can you tell who the father is? How sure are you?
Do another PCR, using the primers for the TPOX locus. Are these results consistent with your results for the CSF1PO locus?
Do PCRs with the remaining STR locus primers. Are the results from all the loci consistent?
Can you find any loci consistent with one of the unrelated people (patient 10 or 11) being the father of any of the children? (Hint: Try TPOX)
Run a PCR using the primers for the D1S80 VNTR locus. Note that you will have to change the marker sizes, since the D1S80 products are larger than those from the STR loci. Can you think of any problems with using the same size resolution for all the STR loci and the D1S80 VNTR locus? (Hint: try using a marker from 100 to 900 bases for the CSF1PO locus. Compare the results to those obtained with a largest marker band size of 400.)
To confirm the identity of the father of the children, examine the family tree. The mother's name is "Sharron Goddard". You can click the links to her children, who will have links to their fathers. The "patient IDs" in the family tree correspond to the patient IDs for the PCR templates.
Did you pick the right father for each child?
How might you explain your results if they do not agree with the family tree?
Can results from a single locus ever prove someone IS NOT the father of a certain child?
Can results from a single locus ever prove someone IS the father of a certain child?
How many loci would you have to examine to prove absolutely beyond any doubt that someone IS the father?
Why might a very common allele shared between a man and child be less convincing evidence of paternity than a shared rare allele? (Hint: There is a common allele for TPOX among our simulated people.)