1. Corrections and comments (last updated 15/7/07)
I will continue to put here any errors or clarifications that have been
pointed out or suggested by readers. Thanks
to John Buckleton and Jenny Conran for contributing some of these: further
comments from other readers are welcome.
See below for glossary of acronyms.
Preface: Here and
throughout the book references to Buckleton et
al. (2004) should read Buckleton et
al. (2005).
Section 4.1.1, p 47, Null
alleles and allelic dropout. It would have been
clearer to use the term “silent” rather than “null” alleles, reserving the
latter term for the scenario in which the entire STR locus has been
deleted. The statement: “Null alleles
cause no problem for DNA profile interpretation” applies only to the
identification setting. Null/silent
alleles can be highly problematic for relatedness since e.g. parent and child
can appear to be homozygous for different alleles; see T. M. Clayton, S. M. Hill, L.
A. Denton, S. K. Watson and A. J. Urquhart, Primer binding site
mutations affecting the typing of STR loci contained within the AMPFlSTR®
SGM Plus™ kit, Forensic Science
International 139 (2-3): 255-259, 2004.
Section 5.1.1, p 72,
Pearson’s test. The final
statement about implementation in R needs clarification. Use of the chisq.test command to test for HWE
requires dividing the heterozygote count by two and thereby completing a 2 by 2
contingency table. For example if the
genotype counts are 10, 5, and 5, we divide the heterozygote count of 5 into
two lots of 2.5 and use:
> chisq.test(matrix(c(10,2.5,2.5,5),2,2),corr=F)

        Pearson's Chi-squared test

data:  matrix(c(10, 2.5, 2.5, 5), 2, 2)
X-squared = 4.3556, df = 1, p-value = 0.03689

Warning message:
Chi-squared approximation may be incorrect in: chisq.test(matrix(c(10, 2.5, 2.5, 5), 2, 2), corr = F)
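Halving the heterozygote count is not arbitrary: the resulting 2 by 2 table gives the same statistic as comparing the three genotype counts directly with their Hardy-Weinberg expectations. The sketch below checks this for the counts 10, 5 and 5. The function names hwe_pearson and hwe_direct are hypothetical and do not come from the book.

```r
# Hypothetical helpers, not from the book.

# Pearson test via chisq.test on the 2 x 2 table, with the heterozygote
# count n_AB split equally between the two off-diagonal cells.
hwe_pearson <- function(n_AA, n_AB, n_BB) {
  tab <- matrix(c(n_AA, n_AB / 2, n_AB / 2, n_BB), nrow = 2, ncol = 2)
  # correct = FALSE turns off the continuity correction; the warning about
  # small expected counts is suppressed here for tidier output.
  suppressWarnings(chisq.test(tab, correct = FALSE))
}

# The same statistic computed directly from the HWE expected genotype counts.
hwe_direct <- function(n_AA, n_AB, n_BB) {
  n <- n_AA + n_AB + n_BB
  p <- (2 * n_AA + n_AB) / (2 * n)   # sample proportion of allele A
  observed <- c(n_AA, n_AB, n_BB)
  expected <- n * c(p^2, 2 * p * (1 - p), (1 - p)^2)
  sum((observed - expected)^2 / expected)
}

res <- hwe_pearson(10, 5, 5)
unname(res$statistic)   # 4.3556 to four decimal places
hwe_direct(10, 5, 5)    # the same value
res$p.value             # 0.03689, as in the output above
```

Writing correct = FALSE in full, rather than corr=F, avoids relying on partial argument matching and on F not having been reassigned.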
p 73, Fisher’s test. The final statement is literally true but not
helpful: the fisher.test function of R can readily be used to test for linkage
equilibrium from diallelic haplotype data, but not for HWE because of the
single heterozygote count (i.e. BG is not distinguished from GB). An approximate test can be constructed for
example via
hwe <- function(a,bc,d) fisher.test(matrix(c(a,round((bc+0.5)/2),round((bc-0.5)/2),d),2,2))
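To make the rounding step explicit, here is the same function laid out with comments and applied to the genotype counts 10, 5 and 5 used in the Pearson example. The intermediate names hi and lo are introduced only for illustration.

```r
# The function from the text, expanded with comments.
hwe <- function(a, bc, d) {
  # Split the heterozygote count bc into two integers that sum to bc and
  # differ by at most one (for example, 5 becomes 3 and 2).
  hi <- round((bc + 0.5) / 2)
  lo <- round((bc - 0.5) / 2)
  fisher.test(matrix(c(a, hi, lo, d), 2, 2))
}

# Genotype counts 10, 5, 5 give the table cells 10, 3, 2, 5.
fit <- hwe(10, 5, 5)
fit$p.value
```

Because the split is forced to whole numbers, the test is only approximate when the heterozygote count is odd.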
Section 5.8 Population
genetics exercises. In Q3 p=0.75
should read p=0.25 (the latter is consistent with the hint and the solutions on
p 165).
Section 6.2.1, NRC, p
86. The acronym NRC, used here
and again on p 155, has not been defined.
It stands for National Research Council (of the US).
Section 6.2.5, Confidence
Limits, p 93. The term
“spurious” near the end of the central paragraph is intended in the sense of
“appears to be useful but is not”.
Spurious can also mean “false” or “illegitimate”, but I hope it is clear
from context that this sense is not intended here. It would have been better to say
“unnecessary”.
Section 6.5.2, p 93, L
13. Typo: “i ≡ AC” should
read “i ≡ BC”.
Section 6.6, p 110. In Q4, the
reference to page 105 should read page 107.
Section 7.1.3, p 116. The reference to
Balding and Nichols (1995) should have mentioned that there is an error in
Table 2 of the cited article: in the case Mother = AA, Alleged Father = AB, and
Child = AB the 2 in the numerator should be deleted, so that the correct LR is
2(F+(1-F)pB)/(1+3F).
Section 7.1.8, Mutation, p
122. I should have included a
separate discussion of PCR primer mutations, see comment on Section 4.1.1 above.
Section 7.2.3, LR, p 128. The acronym LR, used here and again on p 170,
has not been defined. It stands for
Likelihood Ratio.
Chapter 8, Other approaches
to weight of evidence, p 135. The introductory
discussion assumes the identification setting, which is perhaps a little
confusing as it immediately follows the chapter on relatedness. In my original book plan Chapter 8 was going
to come after Chapter 3.
Chapter 10 Solutions to
Exercises:
p 165 For Q2(b) the answer should be 0.0077 not 0.0083. Also for Q3(b), in the table, 146 should be
143 and 165 should be 150.
p 167 Q1(a) The final 0.05 in each of the computations of R1h and
R2h should be 0.5 (but the final answers are
correct). However, the product Rh
= R1h x R2h should be 0.0040, not
0.0036. This error propagates to the
final answer, which should be 0.976, not 0.977, and to the final answer for (b),
which should be 0.973, not 0.975.
p 171 Q4(a) The formula for R should have 0.5 / Rpi
instead of 0.5 x Rpi. Also, the formulas taken from Table 7.2 should
each have (1+2θ) in the denominator, not (1+θ). Accordingly, the two instances of 1.02 in the
calculation should both be 1.04, and the final answer should be 6.16, not 5.86.
p 173 In Q3(a), the reference
to page 105 should read page 107.
Suggested addition: allelic drop-out
The book is intended to be
introductory in nature, and it avoids complex topics such as the statistical
analysis of low copy number (LCN) profiles, which are difficult because of
problems such as allelic dropout.
However, this topic is important and possibly should have been given an
introductory treatment, as was done for mixtures. Here is a very brief introduction to the
issues, via a simple, single-locus example.
Suppose that the crime scene
profile is A and the suspect s is AB. This would constitute an
exclusion under normal circumstances, but if the crime stain had extremely
little DNA, or was degraded, then it is possible that s is the donor of the crime scene sample and that the B allele was not observed because of
“drop out”, or strong preferential amplification of one allele. Similarly, the true source of the crime stain
could have any genotype that includes an A
allele. Let $D_x$ denote the
probability that allele x would be
subject to drop-out under the conditions to which the crime stain was
exposed, and let $p_x$ denote the population proportion of allele x. Ignoring coancestry, the
likelihood ratio (LR) comparing the
hypothesis that s is the source of
the crime scene DNA with the hypothesis that an unrelated individual i is the source takes the form:
$$\mathrm{LR} = \frac{p_A^2 (1 - D_A^2) + 2 p_A (1 - D_A) \sum_{x \neq A} p_x D_x}{(1 - D_A)\, D_B} = \frac{p_A^2 (1 - D_A) + 2 p_A \sum_x p_x D_x}{D_B},$$
in which I have assumed that drop-out occurs independently for each
allele, both for heterozygotes and homozygotes.
In the final expression, the summation is over all alleles, and a compensating
factor is subtracted from the first term to cancel the effect of including
allele A in the sum. The dropout probability may increase with
allele length, in which case the LR
decreases (stronger evidence against s)
as the length of allele B increases.
If the dropout probability is
assumed to be equal to D for all
alleles at the locus, then the LR
simplifies to
$$\mathrm{LR} = p_A^2 (1 - D)/D + 2 p_A.$$
As expected, the LR approaches
∞ in the limit as D → 0,
corresponding to stronger evidence in favour of the innocence of s as dropout becomes more unlikely. The approximation
$$\mathrm{LR} \approx 2 p_A$$
is valid only if $p_A$ is small and D is not small. Moreover, still assuming that D is constant across alleles, the omitted term
$p_A^2 (1 - D)/D$ is positive, so this approximation understates the LR and is
always non-conservative (unless D=1, which is implausible if
some alleles are observed). Avoiding a
non-conservative approximation therefore requires an assessment of the dropout
probability, D. This is inevitable, since the less likely
dropout is, the weaker the case against s.
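As a rough numerical illustration of how much the constant-D formula can differ from the approximation, the sketch below evaluates both for one allele proportion and several dropout probabilities. The function name lr_dropout and the values chosen for the allele proportion and for D are assumptions made purely for illustration, not values from the book.

```r
# Hypothetical helper: the LR from the text (unrelated individual i versus
# suspect s) for crime scene profile A and suspect AB, assuming a common
# dropout probability D for all alleles.
lr_dropout <- function(pA, D) pA^2 * (1 - D) / D + 2 * pA

pA <- 0.1                      # illustrative allele proportion (assumed)
D  <- c(0.01, 0.1, 0.5, 0.9)   # illustrative dropout probabilities (assumed)

# Exact constant-D LR alongside the approximation 2 * pA.
comparison <- data.frame(D = D, LR = lr_dropout(pA, D), approx = 2 * pA)
comparison
```

With these inputs the approximation is 0.2 throughout, whereas the LR is 0.21 at D = 0.5 and 1.19 at D = 0.01, so the gap widens as dropout becomes less likely.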
The above analysis
assumes a simple yes/no result as to whether the allele is observed. In practice there may be an observed signal
in the EPG at the B allele, but
which is weak and does not reach the usual criteria for an allele to be
confidently “called”. Ideally we would
wish to undertake an analysis that took the full EPG into account: we would
need to compare how likely is the observed EPG if (i) s is the source of the crime stain, and (ii) i is the source of the crime stain.
Thus, the LR would be much smaller if an observed signal just failed to
meet the established criteria, than if no signal was observed at all. However, the details of such an analysis are
beyond the scope of the book, and indeed I don’t think that anyone has yet
satisfactorily implemented this approach.
2. Glossary of acronyms
BN      Bayesian Network (page 129)
CODIS   Combined DNA Index System (page 4)
DNA     DeoxyriboNucleic Acid
EPG     ElectroPheroGram (page 44)
FBI     Federal Bureau of Investigation (of the US)
HWE     Hardy-Weinberg Equilibrium (page 69)
ibd     identical by descent (page 91)
LCN     Low Copy Number (page 50)
LD      Linkage Disequilibrium (page 75)
LR      Likelihood Ratio (page 24)
MSE     Mean Square Error (page 62)
mtDNA   mitochondrial DNA (page 50)
NRC     National Research Council (of the US) (page 154)
PCR     Polymerase Chain Reaction (page 44)
PPV     Positive Predictive Value (page 9)
R       statistical software package, see www.r-project.org
SMM     Stepwise Mutation Model (page 59)
SNP     Single-Nucleotide Polymorphism (page 53)
STR     Short Tandem Repeat (page 3)
UK      United Kingdom of Great Britain and Northern Ireland
US      United States of America
3. Notes to self: minor typos
p 27. L-6 delete “,” after “Apparently”
p 28. “essentially just replicates” -> “one is a replicate of the other”
p 43. “In this chapter” starts sentences 2 and 3.
p 46. “partial repeats are often rare”
p 50. ((LCN)
p 61. “for example” twice under frequency dependent
p 83. reference to R & V 95 is duplicated
p 85. define C=s=D notation
p 98. footnote: evidence of evidence
p 133. There should be a line: “Solutions start on page 170”.
p 136. Typo: the “?” on line 10 should be a “.”.