
British DNA Database Mismatch 194

nahal writes "DNA evidence is extremely compelling to a jury at trial when trying to convict a suspect. According to this article at USA Today, the world's largest DNA crime-solving machine, located in Great Britain, mistakenly matched a suspect to a crime, a coincidence with an estimated 1-in-37-million chance. American experts have called it 'mind blowing'."
This discussion has been archived. No new comments can be posted.

  • by Sanat ( 702 )
    Scotland Yard could round up any 56 individuals off the streets who have not had a DNA sample taken, and the odds are that one of those individuals would match one of the 660,000 DNA samples on file.

    If a person FIRST was a suspect and then his DNA was tested and found to match the crime scene evidence... then that would be 1 in 37 million odds.

    But comparing an individual's DNA with the whole databank would, as others here have said, give a 1 in 56 chance. NOT good odds.

    It reminds me of the old parlor trick of matching birthdays. I forget the exact number (and am too lazy to calculate it), but if about 23 persons are in a group and their birthdays are compared, it is more likely than not that two within the group will share a birthday.

    The same principle applies to the DNA, except that the police are matching against a database.

    You would think that the police would be smarter than that. Well, on second thought, no....
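
The birthday figure is easy to check rather than recall. A minimal Python sketch, assuming 365 equally likely birthdays and ignoring leap years (the function names are made up for illustration):

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_group_for_even_odds(days=365):
    """Smallest group size where a shared birthday is more likely than not."""
    n = 1
    while p_shared_birthday(n, days) < 0.5:
        n += 1
    return n


print(smallest_group_for_even_odds())
print(round(p_shared_birthday(15), 3))
```

Under those assumptions the break-even group size comes out at 23, and a group of 15 has roughly a one in four chance of a shared birthday.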

  • What we're talking about here is pattern matching on something that looks like blobs of crap anyway. 1 in 37 million chance pretty much means it can't happen, assuming that figure is accurate ;^) . But it did. 10 loci gives another, larger but equally astronomical figure. But I've got a hunch this will happen again because this has the look and feel of a fundamental problem. I'm not saying that in theory DNA testing won't work; I'm saying the way we're doing it is fundamentally flawed. The price of a mistake is too high. I'm even more adamant about drug testing. The tolerable accuracy and efficacy of a test depends on the eventual use of the results. This is what happens when greedy, unscrupulous companies slap together some half-assed but profitable methodology and put it forth as "science". True science gets a black eye.
  • By that count, if you have 37 million samples, the probability equals 1 eventually, and with 74 million, it approaches 2!! The moral of this being that you can't just multiply.

    The rule you are mistakenly using is
    P(A or B) = P(A) + P(B)
    This only holds if A and B are mutually exclusive, and assuming that is what leads to the conclusion that 37m samples guarantee a match. In general, the formula is
    P(A or B) = P(A) + P(B) - P(A and B)


    BTW, the 1/56 figure is the probability that you have the right person given only the match in the DNA.
    John
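
To see the difference between the two rules, here is a rough Python sketch. It assumes each comparison is an independent event with the article's 1-in-37-million chance; the function names are hypothetical.

```python
import math

P_MATCH = 1 / 37e6  # per-comparison chance of a coincidental match (article's figure)


def naive_sum(p, n):
    """P(A) + P(B) + ...: only valid if the events are mutually exclusive."""
    return n * p


def at_least_one(p, n):
    """1 - (1 - p)^n for n independent comparisons, computed stably."""
    return -math.expm1(n * math.log1p(-p))


def union_of_two(p_a, p_b):
    """P(A or B) = P(A) + P(B) - P(A and B), where P(A and B) = P(A)P(B) if independent."""
    return p_a + p_b - p_a * p_b


for n in (37_000_000, 74_000_000):
    print(n, naive_sum(P_MATCH, n), round(at_least_one(P_MATCH, n), 4))
```

With 37 million comparisons the naive sum reaches 1 while the independent-events figure is about 0.63; with 74 million the sum is 2 and the real figure is about 0.86.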
  • For a good, fast, easy-to-read book on probability, chance and randomness, I recommend "Randomness" by Deborah J. Bennett. I'm reading it now.

    At Amazon [amazon.com]

  • There must be a helluva queue for posting stories here at SlashDot - this story was out on CNN or the NY Times or something days ago.
  • I seem to recall that 90% or more of human DNA is identical to that of everything that came before us, say, dinosaurs.

    So, presumably any moment now someone will get prosecuted for some of Godzy's mayhem.
  • by rve ( 4436 )
    What a ridiculous, and particularly nasty bit of flamebait...
    ...or is this a clever troll intended to demonstrate how OJ got off? "The policemen who arrested him were racists, so if you find him guilty, that makes you a racist too. It doesn't matter if he's guilty or not, we have to punish the racist police by finding OJ not guilty". All these arguments weighed heavily enough in the mind of the jury to find the small, but non-zero chance that the overwhelming forensic evidence was the result of an elaborate conspiracy to rob the black community of one of its icons, grounds for reasonable doubt.
  • by rve ( 4436 )
    Really? If the lawyer is convincing enough, the jury will find a 1 in 37 million chance 'reasonable doubt', especially after this incident. Forensic evidence doesn't mean a whole lot to a jury that consists of people who only get to be involved in a trial once in their life (if they're lucky). Remember how easily the overwhelming forensic evidence was brushed aside in the O.J. Simpson case?
  • It's difficult to believe that bona fide scientists
    could be so stupid. If they have 660,000 records
    on file, and the chance of a random match is
    1 in 37e6, then the chance of matching someone at
    random in the entire database is
    1 in 37e6/660,000 = 1 in 56. The bigger the database gets the worse the problem.
    If they have been using such "evidence" to put
    people away, they must have hundreds of innocents
    in the slammer by now.
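
How fast the problem grows with the database is simple arithmetic. A sketch, assuming the 1-in-37e6 figure applies independently to every record and using the same division as above. 470,000 is the earlier database size quoted elsewhere in this discussion; the 1M and 5M sizes are made-up future values, not from the article.

```python
RANDOM_MATCH_ODDS = 37e6  # the "1 in 37e6" figure quoted from the article


def one_in(db_size, odds=RANDOM_MATCH_ODDS):
    """Approximate '1 in X' chance that an innocent sample hits someone in the database."""
    return odds / db_size


for size in (470_000, 660_000, 1_000_000, 5_000_000):
    print(f"{size:>9,} records -> about 1 in {one_in(size):.1f}")
```

The division is only a good approximation while the result stays well above 1; for very large databases the exact 1 - (1 - p)^n form is the safer one.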
  • Hmm I think your reasoning is a little sloppy.
    Firstly you assume that each of the loci is
    statistically independent, I'd guess not.
    Secondly you assume that all gel locations
    are equally probable, this has to be wrong, I'd
    guess the distribution is highly skewed towards a
    small subset of locations.
    The article itself states that adding four
    additional loci takes you from 1 in 37e6 to 1 in
    1e9, an improvement of a factor of 27 or 2.3 per
    locus, a long way from 600!
    There must be a strong case of diminishing returns
    here, so even the 13 loci tests used in the US are
    probably not much better than the 10 loci test
    mentioned in the article.
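
The per-locus figure can be reproduced in a couple of lines. This sketch assumes only the two numbers quoted from the article (1 in 37e6 for six loci, 1 in 1e9 for ten) and spreads the improvement evenly over the four added loci, which is itself a simplification:

```python
odds_six_loci = 37e6   # "1 in 37e6"
odds_ten_loci = 1e9    # "1 in 1e9"
added_loci = 4

improvement = odds_ten_loci / odds_six_loci    # overall factor gained
per_locus = improvement ** (1 / added_loci)    # geometric mean per added locus

print(f"overall factor: {improvement:.1f}")
print(f"per added locus: {per_locus:.2f}")
```

A geometric mean is used because the odds multiply from locus to locus rather than add.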
  • The way I see that trial is as follows:
    The LAPD got caught framing a guilty man. Since the police in this country have to play by the rules, there was no reasonable way he could have been convicted. The un-reasonable way would have been to ignore police tampering with evidence and violating the chain of custody, but that would have been worse than letting a guilty man walk free.

    I did run that LA Times search URL [latimes.com] posted elsewhere, and it's pretty clear that the LAPD will do whatever they think they can get away with. For example,

    If Not Guilty verdicts and overturned convictions are the price we pay to send a message that police misconduct Will Not Be Tolerated, then we grit our teeth and bear it. Remember, America is supposed to be a free society, and police misconduct cannot be tolerated.

  • All your points are true, and that means that people who use DNA testing need to be concentrating on *clearing* people rather than implicating them. A positive DNA match should not be considered to be the final word in guilt or innocence.
  • Hehe stupid Aussie comedians hehe

    --

  • >>I'm amazed human rights organisations have not
    >>stood up for the right of the defendant to be
    >>tried by proper licensed professionals.

    >So why not do away with juries altogether? The
    >lawyers and judges are all certified, surely that
    >makes them elite enough to decide the fate of
    >the accused.

    Heck, why not just let the police do it. Just add
    a certification test to their training...

    That would save some of the money spent on the
    justice department. Just let the police drop their
    suspects off at the local maximum security prison.
    No judges, no lawyers. Just good old fashioned
    martial law.
  • >The environmental factors are acting all
    >throughout the identical twins' life time to make
    >their DNA different. They're known as viruses. Other
    >mutagens will also cause even more differences
    >over time.

    I'm afraid not. Generally mutagenic changes to each cell will be idiosyncratic -- that is they don't change every cell and the cells that are changed are altered differently, so they get washed out as 'noise' in a PCR RFLP.

    You are very correct in citing viruses as one of the few agents (mutagens are not at all uncommon) that could cause a highly consistent change in the genome, but they are rarely (never, so far as I know) so consistent that they approach total alteration.

    On a PCR-PAGE gel (with an adequate sample), this could result in aberrant or extra bands, but it should not erase bands that already exist (because some unaltered cells remain). Therefore twins would definitely register as positives for each other -- though perhaps with some aberrant bands.

    >Then there is the testing method. The
    >electrophoresis gel tests used have rather poor
    >repeatability.

    Maybe they aren't perfect, but they are pretty darn good. I certainly would never call their reproducibility "poor".

    The things that produce bad results are well known, and avoidable. But I don't doubt that technician error or carelessness could create real problems

    >Sure some things can be done to help make them
    >better. I wouldn't accept a match when the
    >samples are done on two different machines in
    >different labs. Having two different gel
    >suppliers also makes a huge difference. The test
    >is really only telling you the length of strands
    >between markers where the chemicals split the
    >strands into segments.

    I, too, would want both samples run on adjoining lanes on the same gel (though irreproducibility would be *far, far* more likely to produce false negatives than false positives)

    I don't think they still do RFLP on DNA IDs -- but I could be wrong. PCR RFLP would seem the way to go -- a stronger signal with a tiny sample, and many other advantages.

    MY NIGHTMARE:

    The tired/careless/whatever technician who double-dips, placing my DNA in (contaminating) both the "evidence" and "suspect" lanes and creating a match (especially with PCR RFLP).
  • First off... IANAMB (I am not a molecular Biologist) -- well, I suppose that technically I am... at least judging by degrees and coursework. However, I never took a job in that field, and I haven't run a gel since the early 90's so a lot of what I think I know may be outdated (this field moves even faster than computing -- largely because we knew so little to begin with)

    I love PCR and the million ways it can be used, and I am very happy that it's being increasingly used in criminal investigations. The former 'gold' standard (eye witnesses) has been demonstrated in study after study to be frequently unreliable.

    However, when I see a number like 1:3.7x10^7, I really fume. It's based on far too many assumptions that we simply do not have the knowledge to verify. The specifics vary with the loci and methods used, but I think I can illustrate a few major points with general principles.

    1) DNA matching is *NOT* done by sequencing the entire sample of DNA available. Instead, a few quick measurements are performed. The principle is that no one individual is likely to match all of them ["Gee, how many green convertibles with a Z on the license plate could have been driving in this part of town at three a.m. last night? One, buster -- you!"] DNA evidence assumes a reasonable degree of randomness and statistical independence, but those qualities are poorly characterized in the real world.

    2) DNA is far from random. In fact (despite the inevitable mutations we all have) it's just a mix-n-match of the DNA of existing humans (who are similarly non-random, breed non-randomly, etc.).

    3) Even after we sequence the Human genome, we won't have the information about genomic variance to make such estimates accurate -- until we characterize hundreds of thousands of people in a deliberately random fashion to even come close.
    [It *must* be random -- not based on criminals or even volunteers. Many 'classic' post WWII medical studies were heavily biased towards the "70-kg white male medical student" (who will volunteer for almost any test).]

    Think about it: how can any statistical analysis claim an accurate probability of 1 in 37 million, from a database of 660,000 individuals? or even 6.6 million? The number was created by assuming the individual measurements were independent -- even 'partial independence' would require a quantification of the degree of dependence for any real calculation. That data does not exist -- and would require millions of test subjects.

    3) Variance information would be tricky to interpret, even if we had the data.
    "A rare Mediterranean genetic trait" isn't quite as significant if the crime took place in Italy -- or 'Little Italy' of your favorite town.
    If a witness sends the police on a round up of "short blond female caucasians with freckles", then the probative value of the DNA analysis depends on the likelihood of a match for a random
    "short blond female caucasians with freckles", not "tall, dark hispanics" or "short-haired male tabbys with spots"

    [Want to start a fight? Ask ten forensic geneticists how the overall odds change if the suspect turns out to have a known identical twin. Even this seemingly simple question has never been completely resolved mathematically. Many investigators will mumble 'No change', but in fact, there clearly is a difference. we just can't quantify it. The same applies, crudely, to an only child vs a child in a large family]

    Sadly, characteristics cluster in precisely the way we wish they wouldn't. Relatives share genetic similarities, have a tendency to be in the same general area, and often enough situational factors to predispose to similar motives. The same applies (much less strongly) for ethnicity.

    It's important to note that deviations from perfectly independent assortment will ALWAYS reduce the 'odds' against an incorrect match, making any DNA match less conclusive.

    4) Generally, these corrections tend to have larger effects when the base (uncorrected) likelihood is small (i.e. it's easy for a correction to reduce one in 37 million to one in 10 million, but very hard to reduce 1:4 to 1:1.1)

    The article says that the error reflects the rapid increase in the database size (470K to 660K in the past year). However, I think that it is more likely that the error reflects the flaws in the assumptions behind the estimates. As the database size grows (and DNA is more widely used), we will see more errors -- not because of "all that nasty data" but because "all that data" will highlight (as data is supposed to do) the error of our assumptions.
  • The "1 in 37,000,000" figure is presented as a final probability of a match. Where did you see *anything* about there only being 37,000,000 possible permutations (1 person in 37 million)?

    If there were only 37M permutations of 6 loci, that would imply roughly 20 discrete possible values at each locus. Is that how you envisioned the underlying data?

    I don't know what test they use in the UK, but I'm assuming that it's the RFLP [Restriction Fragment Length Polymorphism]-- basically they use a highly specific enzyme to chop up the DNA, and place it on a polyacrylamide gel under an electric field to measure the size of the fragments. (Actually, nowadays, they probably use pre-synthesized n-nucleoside primers and PCR [polymerase chain reaction] to chop and selectively amplify the fragments, but the principle is the same)

    A single gel can easily measure fragments ranging from a few hundred base pairs to 10-400+ kbp with good resolution. The exact range varies according to current/field, gel composition, and other factors, but the bottom line is: it's easy to see bands that are a millimeter apart, so if you use a foot long gel, the range of possible values is close to 300. That creates:

    300^6= 7.29 x 10^14 possible permutations

    Actually, 0.5mm is a more realistic resolution limit, so the actual number of resolvable values is at least 600.

    (600 values) ^ (6 loci) =4.6x10^16 permutations

    These are just crude estimates, for the benefit of those who've never read an electrophoresis gel. In actuality, the range of allowable values might be limited by other factors (values that are too extreme may be eliminated as artifacts). But it does give a sense of the TRUE numbers involved.

    (with modern gels and automated readers, the resolution may be even higher, but my experience was with UV lamps, eyeballs and Polaroid prints way back in the 1900's... 1991 or so)

    Please run your analysis again using this range of possible permutations, and you'll see that 1:37M could well be a FINAL probability of a false match.
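
For anyone who wants to rerun those numbers, a small Python sketch follows. It takes the crude estimate at face value: six loci, every band position equally likely, and loci independent of each other. The function names are invented for illustration.

```python
LOCI = 6


def permutations(values_per_locus, loci=LOCI):
    """Number of distinct band patterns if each locus can take any value independently."""
    return values_per_locus ** loci


def implied_values_per_locus(total_permutations, loci=LOCI):
    """How many values per locus a given total number of patterns would imply."""
    return total_permutations ** (1 / loci)


print(f"{permutations(300):.3g}")                 # 1 mm resolution on a foot-long gel
print(f"{permutations(600):.3g}")                 # 0.5 mm resolution
print(f"{implied_values_per_locus(37e6):.1f}")    # what "only 37M patterns" would imply
```

The last line is the "roughly 20 values per locus" check: the sixth root of 37 million is a little over 18.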
  • >1: The chance of a DNA match (in this 6-loci
    >case) is 1 to 37000000.

    Where does it say that? If you'd ever run a gel, you'd know that this is ridiculously low. It implies that there are only about 20 possible 'positions' for a band. The actual number of values on a full-size gel is in the 100's.

    >2: That means that ONE DNA-sample compared to
    >ONE other DNA-sample has the chance to in 1 of
    >37000000.

    No, it means (as stated in the article) that the chances of a mismatch OVERALL (under the condition listed for the database) were estimated at 1:37M. I have serious issues with the underlying assumptions of the model used in DNA ID calculations, but they are based on the fact that we lack critical data for the assumption of "independent assortment" (a basic concept in first year genetics); I DON'T doubt the ability of the statisticians to do basic math. I just think that they made assumptions (required by the limit of current data) that are not justified.

    That may be okay for a research paper, but not for a person's life -- no matter how much law enforcement may want answers! [note: the polygraph, inadmissible in most courts and widely discredited as a 'truth tool', is often used in investigations because police want answers, and are willing to accept the "risk" (minimal, to them) of a wrong answer]

    >4: Any other circumstances have no impact on this
    >if THEY HAVE NOTHING TO DO WITH THE DNA-CODE !

    Jesu christu, tu mater est stertocarari! I can name a few dozen things unrelated to the genetic code, from start (AUG) to finish (3 codons) that impact the calculation -- lab criteria for artifacts, choice of primer, ethnic dependencies, underlying population composition, inbreeding and genetic relationships with the suspect pool... and if I hadn't been up for days, I'd have a much longer and more varied list.

    >5: In this case we have 660000 OTHER DNA-samples
    >to match against ! The rest is obvious ...

    Yes, obvious. So obvious that (as I have shown in another post) the number of independent permutations may well be over 10^17 -- and the 3.7x10^7 figure cited in the article obviously is reduced to take account of this crude pairwise database comparison (and other factors)
  • >The most important thing to understand is that
    >this anomalous case does not invalidate DNA
    >evidence... (assuming the methodology of the
    >tests is good) is exactly as useful now as it
    >was before

    I DON'T hate statistics. I hate what is often done with it.

    As a result I can see that the assumption of independent assortment is severely flawed. This makes DNA IDs very useful for their negative predictive value (if it says you're not guilty, you almost certainly are not) but their positive predictive value is much weaker.

    That's why blood tests are far more accurate at disproving paternity than proving it: it may be impossible to prove a negative ["I have never fathered a child"] but it may be easy to disprove a positive assertion ["You fathered this child"]

    Orpheus, father of the finest children, bar none
  • >This is ignoring the probability of a false
    >negative; this is very low since only one person
    >can commit a crime!

    Your logic is extremely weak here.

    Are you saying that if I have lots of co-conspirators, I decrease my chances of getting caught?

    Gee-- no wonder white collar crime is so rarely prosecuted
  • It depends exactly what probability the 1 in 37 million refers to.

    I understand it to mean that a given DNA sample, using their testing techniques, will only match 1 in 37 million of the general population.

    However, given 660,000 of the general population, the probability of you finding that one has just increased.

    The probability you refer to "1 in 37 million that you will get a false match if it is in the database." is what the juries and others are led into believing, but it is not explicitly put that way, because (I believe) that is false.

    The article actually says, "British authorities estimated that the likelihood of that match occurring at random was one in 37 million.", which is a totally different thing.
  • Ooohhh, looks like someone doesn't like the prospect of losing market share to a free product, hmmm?

    OK, to get back to the subject....
    There isn't that much chance of two people coming up with the same DNA. That's the whole point of the DNA fingerprint.

    On the other hand, it is possible to pollute DNA samples. If you have sample a and sample b in the same lab at the same time, it is possible to mistakenly mix a and b together.

    DNA evidence shouldn't be total proof, simply backup evidence.
  • To drive a car you have to get a licence, but to be responsible for someone's life in a trial you only need to be selected........
    I'm amazed human rights organisations have not stood up for the right of the defendant to be tried by proper licensed professionals.
  • This fiasco illustrates how DNA testing can be abused. It is NOT ACCEPTABLE to keep a massive file of DNA codes of citizens, then when evidence comes in to match it with the entire database. As others have pointed out, there may be a one-in-37 million chance the DNA matches any given person, but the chance it matches SOMEONE who did not commit the crime can be unacceptably high (1/56 in this case).

    The correct use of DNA testing is for verifying suspects. Ideally, evidence is collected at the scene of the crime and a list of suspects is generated. If DNA evidence is found, it is checked against the list of suspects only. Only then is a DNA match meaningful. I think that using the entire database violates the principle of probable cause. Citizens should be innocent until proven guilty. If your DNA happens to match the DNA found at a random crime scene, you should not have to prove your innocence.

    Explanation of the math:

    Chance of a DNA sample matching another random sample: 1/37e6

    Chance of 2 DNA samples both matching a random sample: (1/37e6)^2

    Chance that neither of the 2 DNA samples matches a random sample: (1 - 1/37e6)^2

    Chance that none of the 660,000 samples matches a random sample: (1 - 1/37e6)^660000

    Chance that at least one of the 660,000 samples matches any given random sample:
    1 - (1 - 1/37e6)^660000 = 1/56.6

    -Nathan Whitehead
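
The chain of steps above translates directly into Python. A sketch, assuming (as the post does) that every database record is an independent 1-in-37e6 trial; `log1p` and `expm1` are used only to avoid rounding error with such a small probability.

```python
import math

p = 1 / 37e6   # chance that one record matches a random sample
n = 660_000    # records in the database

p_both_of_two = p ** 2                      # both of 2 samples match
p_neither_of_two = (1 - p) ** 2             # neither of 2 samples matches
log_p_none = n * math.log1p(-p)             # log of (1 - p)^n
p_none = math.exp(log_p_none)               # none of the n samples matches
p_at_least_one = -math.expm1(log_p_none)    # 1 - (1 - p)^n

print(f"1 in {1 / p_at_least_one:.1f}")
```

The exact figure lands very close to the simpler 37e6/660,000 division, because the chance of two or more coincidental hits at once is tiny at this database size.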

  • by floopy ( 28552 )
    Well there is a 1/37 million chance they picked *this* guy, but that would make the probability of a random DNA sequence matching *anyone* in this DNA database roughly 600,000 times that, or just one in 60. (Really it would be 1 - (1-1/37million)^600,000.)
  • Too many RFCs . . .

    Thanks for the laugh!

    himi
    --
  • Chances of the technician contaminating the samples || mistake in labelling results || other incompetence || malicious substitution of results || a fit up ? 1 in 1000? 1 in 10,000? 1 in 1,000,000? Still smaller than 1 in 37,000,000.
  • To drive a car you have to get a licence, but to be responsible for someone's life in a trial you only need to be selected........
    I'm amazed human rights organisations have not stood up for the right of the defendant to be tried by proper licensed professionals.


    So why not do away with juries altogether? The lawyers and judges are all certified, surely that makes them elite enough to decide the fate of the accused.

  • Doesn't this show that we never can be sure quite how sure we are? However convinced we are that somebody is guilty, they may in fact be innocent. This is an irrefutable argument against the death penalty IMHO.
  • I am going to have to say.. I don't get this fascination about first posts, but all you AC's are going to have to go. This is ridiculous and not what slashdot is about or for. Why don't you get out of your bedroom and get a life? You have submitted the "Yet another absolutely stupid Anonymous Coward Post". Beware my wrath, because it is very very undesirable....
  • All of this statistical brouhaha assumes that the samples are properly collected and isolated in the first place.

    With a test as sensitive as DNA analysis, it doesn't take a lot of contamination to blow the test.

    Especially if, as here in the USA, the detectives like to take the samples back to the scene for no good reason (as in the O.J.Simpson case).

    As a juror, I would not be comfortable convicting on DNA evidence alone. I could not examine the evidence itself, only what prosecutors tell me about the evidence. In the USA, it is the jurors' specific role to examine the evidence and weigh its relevance to the case. Evidence that cannot be examined by the jurors is thin ice, AFAIC. (Of course, my attitude would get me kicked out in the voir dire.)
  • DNA testing results do not prove guilt or innocence. The tests *may* be accurate, however, "matching" DNA patterns only show that the DNA specimen is consistent with the DNA of a suspect. Even if current DNA tests were 100% accurate they still could only place a suspect at a crime scene. This, by itself, only suggests the possibility of guilt. I *hope* that nobody has ever been convicted of a crime solely on DNA evidence.


    --

  • >2: That means that ONE DNA-sample compared to ONE other DNA-sample has the chance to in 1 of 37000000.
    >
    >3: If You have TWO other DNA-samples to match against you have a chance of match in 2 (TWO!) of 37000000 !

    For a bunch of supposed geeks, you lot are fucking useless at mathematics.
    Here we have a classic example of no logic skills at all.
    I hardly need to point out that taking "i"'s logic a bit further,
    If You have 37000000 other DNA-samples to match against you have a chance of match in 37000000 of 37000000 !
    ie. a definite match. However this is blatantly false.

    The guy who replied to this message has a bit of sense, although his calculation is irrelevant.

    Now, time for another piece of using your brain instead of being a dick. It is blatantly false that the DNA database has a failure rate of about 1/56. I think it is safe to conclude that the 1/37,000,000 chance has *already taken into account* the fact that there are 660,000 people in the database.

    I don't know how the locus check matches (perhaps we need to find a genetics expert here), but it is reasonable to suggest that if there were a 1/(37*10^6) chance of a match on six loci, then the chance of a match on one locus is the sixth root of this, ie. about one in eighteen. It seems highly likely to me that each locus check would be more definite than this low chance.

    Here's another piece of reasoning: If the chance of two DNA sets matching is 1 in 37,000,000, then the chance is *almost certain* that two in a pool of 660,000 will match. (Recall the birthday example; if there are 23 people in a room, then it is more than even chances that two will have the same birthday).
    The formula here is:
    chance = 1 - (37000000! / 36340000! / (37000000^660000))
    which I can't really be bothered calculating, but would wager highly (based on my arithmetic intuition) that it's rather close to 1.
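
The factorial expression is too large to evaluate directly, but its logarithm is manageable. A sketch using `math.lgamma` (log-gamma, so `lgamma(N + 1)` is `log(N!)`), under the same assumption as the formula: 37,000,000 equally likely profiles. The function names are made up for illustration.

```python
import math


def log_p_all_distinct(n, N):
    """log of N! / ((N - n)! * N**n): the chance that n samples are all different."""
    return math.lgamma(N + 1) - math.lgamma(N - n + 1) - n * math.log(N)


def p_some_pair_matches(n, N):
    """1 - P(all distinct), i.e. at least one coinciding pair among n samples."""
    return -math.expm1(log_p_all_distinct(n, N))


print(p_some_pair_matches(23, 365))                # sanity check: the birthday case
print(log_p_all_distinct(660_000, 37_000_000))     # log of the "no pair matches" chance
print(p_some_pair_matches(660_000, 37_000_000))
```

The logarithm comes out in the region of -5,900, so the chance that no two of the 660,000 profiles coincide is effectively zero and the wager is safe, provided you accept the equal-likelihood assumption.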

    Anyhow, my purpose here was to show that one should reach sensible conclusions by using your brain and looking at different angles on a problem; and CHECK THAT YOUR ANSWER IS SENSIBLE before blindly trusting it. Perhaps the court judges could follow that principle. I know that if school students did, then the average marks on exams would be a lot higher.

    </RANT>
  • More interesting is the case of Siamese twins, joined at the hip or something. One of them commits a serious crime.. but according to US law he cannot be imprisoned because it would not be fair on the other twin!

    Although if one got the death penalty, perhaps they could cut one head off or something
  • What the fuck is IMNAL.
  • I wonder something else. The 1 in 37 million odds are based on a random distribution, right? Someone else already commented that since it's possible that people in the same country are distantly related, it increases the chances of a false match. Similarly, if there are any genetic predispositions towards committing crimes, (or getting caught) it's conceivable that the DNA samples in the database are more similar than they would be if they were randomly distributed.
  • If you have two crimes with DNA evidence that is only this reliable, then more than likely some innocent person in the UK would test guilty.

    First, each 'test' is a unique event, no two tests have any impact on each other. That said, for any single given comparison between two sets of human DNA the entire genome is certainly not compared directly for two reasons:

    A) That is technically approaching 'ridiculous' since the entire genome run out on an agarose gel looks like a big streak instead of the banding pattern we've all seen on Court TV.

    B) It's an exercise in futility since a good percentage of all human DNA is repetitious. Some of these repetitious areas define what we look like (since we all basically look the same) and some are just 'junk' DNA that doesn't do anything. Comparing either of these sub groups is fairly futile.

    To this end, DNA science instead turns to hypervariable regions of the human genome known as 'marker' regions. It's these marker regions that are actually compared in court cases. And the variability of these marker regions is what leads to the 1:37 million statistical figure.

    Now, since each trial is independent in a test like this, EACH test for DNA similarity has a 1:37 million chance of matching to the level of exactness set by law. It doesn't really matter how many of these trials are carried out, whether it's 80 a year or 80 million, each one has the same (tiny) chance of mis-conviction. This would be why the experts are suffering from blown minds about now, considering my chances of winning the lottery every weekend are 1:~14million and I consider _that_ ridiculous.

    As a side note, as our ability to sequence genomic data increases in speed our ability to compare larger and larger regions of human DNA will improve. At the moment it's fairly archaic, we chop each DNA sample with enzymes that cut at particular loci and then see if the pieces come out the same size. It's a dirty method to be sure. Perhaps someday soon we'll be able to just sequence each person's genome and compare them directly ... though I don't see that being a possibility for at least ten years hence as it would require amazing computing power to compare 3x10^9 bases directly as well as some serious sequencing technology we just don't have yet. A typical state-of-the-art capillary DNA sequencer costs $250,000.00 (US) and can sequence approx. 100,000 bases a week. Do the math and you'd need a lot of machines to get 3x10^9 in any sort of court-friendly time frame ... for now!


    ------------------------------------------------ ---------------
    James C. Diggans
    jdiggans@excelsior-web.com
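
"Do the math" can be taken literally. A back-of-the-envelope Python sketch using only the figures in the post (3x10^9 bases, 100,000 bases per machine per week, $250,000 per machine); the four-week deadline is a made-up stand-in for a "court-friendly time frame", and a single pass over the genome is assumed.

```python
import math

GENOME_BASES = 3 * 10**9            # bases to sequence, from the post
BASES_PER_MACHINE_WEEK = 100_000    # throughput of one sequencer, from the post
MACHINE_COST_USD = 250_000          # price of one sequencer, from the post


def machines_needed(weeks):
    """Sequencers required to read one genome once within the given number of weeks."""
    machine_weeks = GENOME_BASES // BASES_PER_MACHINE_WEEK
    return math.ceil(machine_weeks / weeks)


deadline_weeks = 4                  # hypothetical deadline, not from the post
count = machines_needed(deadline_weeks)
print(count, "machines, about", count * MACHINE_COST_USD, "USD in hardware")
```

A single machine would need 30,000 weeks for one genome, which is the point of the aside.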

  • I have seen a TV special on DNA testing and in this program they brought up this same question. It's rather late and my brain is not in top form (but then again when is it?) so I may not be remembering this right. Although twins are in almost all aspects the same, right down to their fingerprints, their DNA does not match exactly. Although they develop from the same egg, which has divided into two separate fertilised eggs with identical genetic material, they still develop separately from each other. This results in slightly different DNA patterns.

    If I didn't get that right, somebody please tell me (as I'm sure you will anyway) I may have misunderstood. Biology was NEVER my strong point.

  • Anyone who prefaces their comment with "IANAMB" and then proceeds to make a highly intelligent argument based on actual knowledge deserves to be moderated WAY up.

    (Don't even get me started on the people who post random made-up facts like "We have 90% of our DNA in common with dinosaurs" and expect that to be relevant to the discussion)

    Thanks for posting something I actually enjoyed reading!

    smallstar
  • Um, if a match was guaranteed to be in there, then the probability would be one.
  • I think he's actually British. I went to a talk by him once in
    Canberra, Australia, and he didn't sound Australian to me.

    Alex.
  • This demonstrated fallibility won't make any difference to a jury. If you wave "science" at them and say "1 in 37,000,000" they will convict. Still, at least OUR forensic people are honest and don't alter the evidence to get a conviction, unlike the FBI.
  • Yes, really. Like most lawyers I've been accused of thinking I'm God (except that I exist), but even I'm not omnipotent: I can't overturn the prejudices of juries. OJ Simpson of course was a good example: he was acquitted because he was black and was accused by racist officers. The DNA evidence provided no more than the hook the jury needed to justify their decision to themselves, IMNSHO. There is rarely absolute evidence, and you are right to say that persuasive arguments can sway many a jury, but it needs a solid evidential basis; mere rhetoric, however impassioned, will rarely sway a jury, *unless* they want to be so persuaded. Nonetheless, originally DNA was said to be right on a 1 in *BILLIONS* basis, and now, even though it has had to be downgraded, I wouldn't fancy my chances arguing it couldn't possibly be my client - unless he was a twin or OJ. Mind you, I don't do criminal trials (no money), but I think that's the view many of my colleagues would take.
  • I'm no molecular biologist (or lawyer or mathematician, for that matter) but neither is the typical juror, whose sole qualification (in the U.S., anyway) is being in the pool of those who have registered to vote and do not have some obvious prejudice against the matter at hand.

    The assumptions you detail above and the intricacy of statistics are probably beyond analysis by the typical juror. This means that the jury is weighing not the technical or factual merit of the "experts" but some unknown subjective reason: the persuasiveness of the lawyers, how scary the defendant looks, how many days the case has gone on, or something else. In short, it was only a matter of time before the collection of assumption, error (possibly in procedure or in recording), and juror ignorance produced the result in this British case.

    I hope that this acts as a wake-up call here in the U.S. but somehow I doubt it.
  • He matched 5 pairs. There's a chance of 1 in a hundred thousand or so, but it DOES happen (people win the lottery, eh?). After they matched that, they went and ran the check that takes a lot longer on 10 pairs. He didn't match that, so they let him go.

    This is like saying that we accidentally killed the wrong guy after releasing an innocent man from questioning. It didn't happen. It didn't even come close to happening; people are just ignoring that the guy is sitting at home in his house and that it was the exact same DNA matching lab (the ones that CORRECTLY matched the first 5 pairs) that matched the 10 and said it wasn't the right guy.

    Esperandi
  • I realise that this could have been a dreadful miscarriage of justice, but let's get things in perspective here: The British Criminal Justice System is, in many ways, held to be one of the fairest in the world. In most respects this is probably correct due to the copious layers of bureaucracy making most (I said *most*, not all!) forms of corruption difficult to get away with. However there are some great, gaping holes through which either criminals get off their charges or innocent people are found guilty. I guess it's even worse in the US, where the 'celebrity lawyer' enjoys a much higher cult status than over here. A fashionable lawyer can work miracles, whether you're as guilty as sin or innocent as the day you're born. I guess what I'm rambling at is that there might be a few more false positives (gasp, shock, horror) found after this case, but it's a drop in the ocean compared to the number of people who are wrongly convicted or cleared *every day* due to the power of persuasion which some legal people have. Let's keep it all in perspective, folks.
  • I was taught:
    X is the unknown quantity,
    a spurt is a drip under pressure.

    So a 1 in 56 chance actually happening blows the minds of some 'American experts'. Scary is the word that comes to mind. So it's easy to mock journalists. Mock mock mock.

  • Right, but if there are 6 billion people on this planet, even given that one of them has committed a crime, what are the odds that you will be accused of a crime they committed? 37 million to one. That's assuming that they even committed one. And that they don't also catch and test him.

  • The guy was a suspect, and was released after a more accurate test showed him not to match the DNA from the criminal. These kinds of problems actually happen with fingerprints too, as for purposes of searching the databases they only use something like 29 features of the fingerprints for matching by computer, then use humans to make exact matches.

    That said, even if he had been convicted, this one case in 37 million doesn't even begin to compare in magnitude to the number of people who have been wrongly convicted by eyewitnesses and the like.
  • I have been reading these posts for a while, and it makes you wonder if there are more innocents out there, and if those numbers are right...
  • The actual odds against a false positive are much, much lower than 37M:1. The DNA comparison is much like a fingerprint comparison. Only a limited number of points are compared; if they match then it's counted as a DNA match. What counts as a match is often down to human opinion and thus is quite fallible.
  • That's the most frightening part of the article, to my mind. Not 660,000 people, or samples to compare, but "possible suspects". The implication, of course, is that every person who's ever given a DNA sample to the police is a suspect in every crime the police investigate through DNA evidence.

    Great. Guilty by association, and there's two-thirds of a million associated people. I wonder if the cops take a sample, and you prove not to be guilty, if you can insist that your DNA be removed from the database. I doubt it, though.

    --

  • We have no idea what the odds are because all of the calculations you guys are posting assume independence. The probability distributions for different people's DNA are *not* independent because DNA is inherited along family lines. Shared inheritance means a much greater chance of shared DNA. Forensic scientists probably have no clue how to work with non-independent probabilities - they probably haven't even considered the possibility.
  • If the 1 in 37M figure was the final probability, they must have taken the size of the data set into account. But if that was taken into account, why did the article say:

    • "British authorities say the mismatch probably was caused by the rapidly increasing size of their database, which has grown from 470,000 potential suspects to 660,000 in the past year."

    Also, your numbers are too high to come out with a final probability of 1 in 37M.

  • IMNAL, although the likelihood is he cannot sue, and rightly so. You can sue for wrongful arrest, but from the article it seems as though he was only called in for questioning, plus the Police had a good reason to bring him in, so that probably also means he cannot sue.

    All you Americans must realise that you sue far too much, and unfortunately the rest of the world is going that way as well :(
  • Where does it say he was prosecuted? From the article it looks as though he was just brought in for questioning, provided an alibi and they re-checked the DNA evidence. There is a big difference between being Questioned and being Prosecuted.
  • I don't know how you get to 1 in 56, but from my Maths, and the knowledge that the match *might* not be in the database, the chance is 1 in 37 million that you will get a false match if it is in the database.
    The point is that the DNA might not have been in the database. If it was guaranteed to be in there then you would be right, perhaps.
  • Even so I don't think you can even sue if you are found innocent, although as I previously said IMNAL. If you could sue, the government would have a lot less money, or maybe you can and that is where our taxes go :)
  • ok, I was wrong :) indeed there is a possibility of 1 in 58 of a random DNA segment having a match in the database.

    Although if you are in the database you have previously committed a crime, so there is a good reason for suspicion if you do match.
  • Yes, and they were saying that this could lead to a bunch of re-trials in the United States. From my knowledge most of the DNA related stuff in the US is used to relate DNA from a scratch, hair, or other item left at the crime scene to a particular individual. In this case, the probability is far less... although, this could be used to *find* potential criminals, which might be scary. Might lead to a national DNA database for finding potential murderers... I guess you would be guilty until proven innocent in this case. Thoughts?
  • Here in New Zealand we had some DNA problems last year. In one case a man was wrongly convicted of a crime (rape of a little girl) despite being in a different city when it happened.

    Essentially the jury believed that the DNA test was 100% accurate and that therefore the man, and all the witnesses (all of them), were lying.

    I can understand that: when you have a dodgy looking normal Joe charged by the Police force, respectable scientist types stating how accurate the tests are etc etc, then poor old Joe is screwed.

    The sad thing was the people doing the DNA tests F%&#D up. Further tests (after a couple of years in jail) proved that this man wasn't the Rapist!


  • Q: you know what the funniest thing about that is?

    A: ...me neither.

    hehehe. i think i'm soooo funny.

    but really, i did a 'furst' post, not a first post. and besides, I apologized.
    what about an amendment to andy warhol's "15 minutes" saying.. something like 'Every geek will, once in their life, be entitled to a first post and offtopic reply string,' or something along those lines? huh? what about it?

    hehehe. i think i'm soooo funny.
  • sorry couldn't resist.
  • this story sounds like the introduction of a new (actually wouldn't be surprised if this is the basis of an actual) john grisham novel.
    can you imagine the horror that the accused must have gone through? poor 'bloke' will probably be able to sue pretty nicely, though. what's the basis of law in britain, more like canada than california i hope..?
    for the sake of Charles P. Taxpayer Jr. that is.

  • In the article they specifically state that the probabilities of a wrong match are as follows:

    6 loci: 1 in 37 million
    10 loci: 1 in one billion

    The FBI tests 13 loci. There is the potential for one in billions chance. It just depends on the testing methodology.
  • That and the whole concept of men not being justified in taking the life of other men. But then, that's just my opinion.
  • presumably because they have committed some crime

    Strangely, not so. In Britain at least, the Police may ask you to provide a DNA sample, generally to exclude people from a crime (This is generally used in cases such as a sex attack, killing etc.) You are not obliged to give a sample, but many people do.

    There is nothing to stop the DNA sample being added to the National database even if you have not committed a crime. This was the situation in this case: the gentleman in question had no previous convictions, but had provided a DNA sample in the past.
  • The most amazing thing about this is that the police now seem to rely on Forensic evidence so much that they prosecuted the man in question.

    The most astounding thing about this is that the suspect in question was disabled, epileptic (IIRC), had never even been to the place where the crime in question was committed, and had a rock solid alibi for the time & date the crime was committed. Scotland Yard ignored all of this, and prosecuted solely on the basis of the forensic "evidence".

    How many more people who are protesting their innocence, have been convicted on the basis of forensic evidence alone? How many of these convictions could now be wrong?
  • Not just brought in for questioning, but the Crown Prosecution Service did start proceedings, which were duly dropped. Although it did not go to court, the CPS were prosecuting.
  • I heard about this. I understood at the time that it was illegal to explain the mistake in UK courts, because it would just confuse the jurors. The only article I can find about it now, here [asu.edu], says it's only discouraged.
  • The real burglar was this guy's long lost twin brother who was switched at birth. :)


    Regards,
  • This system can only be relied upon to "prove" guilt where every locus is tested.

    Even if all match, there is a very tiny but non-zero probability that the match is a false positive. The question then is how much doubt constitutes reasonable doubt? (Or the equivalent phrase in non-U.S. courts).

  • ... When prosecutors abuse scientific evidence with pseudoscience. DNA evidence is exclusionary in nature, not inclusionary. In other words (assuming no procedural errors etc) no match = didn't do it, match = COULD have done it. Of course, prosecutors would have the jury believe the opposite. If science is to be used to convict, then scientific thinking MUST be involved if there is to be fairness. No proper scientist would consider a DNA match on 6, 10, or 16 loci as conclusive (but would consider it a VERY strong reason to investigate further).

    Consider the 1 in 37 million. If the database were complete for the world population (about 6 billion), that means that on average, any given DNA sample would appear to match 162 people. The 16 locus test that the FBI uses is better, but still is not damning in and of itself.
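
    The 162 figure is just population divided by match odds. A minimal sketch of that arithmetic, assuming (as the paragraph above does) a database covering all 6 billion people and treating every profile as equally likely; the function name is made up for illustration, and the 1-in-a-billion figure is the 10-locus number quoted elsewhere in this discussion.

```python
def expected_coincidental_matches(population: int, match_odds: int) -> float:
    """Average number of people whose profile matches a given sample by chance."""
    return population / match_odds


WORLD_POPULATION = 6_000_000_000  # "about 6 billion", as stated above

# 6-locus test: 1 in 37 million.
print(round(expected_coincidental_matches(WORLD_POPULATION, 37_000_000)))      # 162
# 10-locus test: 1 in one billion.
print(round(expected_coincidental_matches(WORLD_POPULATION, 1_000_000_000)))   # 6
```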

    Now, add in procedural error and other bad thinking and you have (to me) reasonable doubt unless there is some other evidence.

    I am certainly not against convicting criminals, but I AM against deceiving juries into believing that a DNA match is damning evidence. Matching DNA evidence should be regarded as the beginning of an investigation, not the end.

  • I seem to remember watching a programme on the TV about this (can't remember which one, it was along the lines of an investigative news magazine programme, probably after 9.30, probably on the beeb, I think), and it was at least 4 - 5 years ago. I know it was in that time frame because I emigrated 3.5 years ago.

    Anyway, they made a claim that the current DNA testing at that time was flawed and often made matches that were incorrect, flying in the face of the astronomical odds. I think that there were two stages to the problem: one was cross-contamination, and two, the cloning process that makes the sample big enough for testing cloned the contaminating DNA too.

    Perhaps the labs were using the same containers for both the evidence DNA and the sample DNA without proper cleaning between tests? It only takes one fragment of DNA to screw the whole thing up. I think that there was serious concern about the use of cost-cutting independent labs who were bidding to do this work for the police at the lowest possible rate.
  • They used their tests improperly, and they call it mind-blowing?

    Look. It's a 1:37-million chance if you're comparing one person's DNA to one sample (probably found at the crime scene). That's why you only use DNA testing to weed people who couldn't possibly have been involved from a very narrow range of subjects. You can't pick out one suspect from a huge list.

    This is the problem with archiving everyone's DNA. You know it'll be used for stuff like this, because law enforcement will get lazy.

    DNA testing is a Good Thing. It's a very safe, reliable way to identify suspects. But only if you use it properly. This is hardly a "proper" use of the tests, and I'm not at all surprised that this happened. It's a case of lazy law enforcement more than faulty testing.
  • This is a very important point and should be moderated up.

    It makes the utmost difference whether the police have a suspect and then use DNA matching to see if he did the crime, or if they use DNA matching to find a suspect. As this poster mentions, in the second case there is a much lower probability that you did in fact commit the crime.

    It is exactly the same as disease testing. If you have a large population which is uninfected (not guilty) a positive match even from a very reliable test is highly likely to in fact be an error.
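
    The disease-testing comparison is Bayes' rule. A small sketch, with made-up numbers: the 1-in-a-million prior and the 1-in-10,000 false positive rate below are hypothetical and do not come from the article, and the function name is invented for illustration.

```python
def posterior_true_positive(prior: float,
                            sensitivity: float,
                            false_positive_rate: float) -> float:
    """P(actually positive | test says positive), by Bayes' rule."""
    true_hits = prior * sensitivity
    false_hits = (1.0 - prior) * false_positive_rate
    return true_hits / (true_hits + false_hits)


# Hypothetical screening case: 1 person in 1,000,000 is truly positive,
# the test never misses, and it wrongly flags 1 person in 10,000.
print(f"{posterior_true_positive(1e-6, 1.0, 1e-4):.4f}")

# Same test applied to someone who is already a 50/50 suspect.
print(f"{posterior_true_positive(0.5, 1.0, 1e-4):.4f}")
```

    With the screening numbers a positive result is right only about one time in a hundred; with a strong prior the same test is nearly conclusive. That is the suspect-first versus database-first distinction.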

    Of course if you up the test to some obscene number of points you can probably make the probability of error very small again. Of course this leaves the scary possibility that people are falsely convicted because they left a hair lying around...but there are always false convictions.
  • Maybe I didn't express myself properly ... I'm talking about a random DNA sample matching a sample in the database (assuming that those are unique). In that case the likelihood of a false positive reaches 1 when the database has 37 million entries.
  • Everyone and their dog has shown that /.ers actually understand basic statistics. With a 1/37 million chance of a match between two people, and 660,000 people in the database, the odds of eventually coming up with a false positive become quite high.

    What not so many have pointed out is that the true odds are probably lower than 1/37 million. That figure is based on the contents of each locus being independently distributed. (With about a 1/18 chance of a match at each locus.) Well we know that is strictly not true - after all a sibling of yours will have 1/64 of getting the same loci from the same source that you did. But are there any larger effects?

    The answer is that there are. Suppose that some of the loci have different frequency distributions between Anglo-Saxons, Celts, and East Indians. Then the chance of finding a match between 2 East Indians could be far higher than they estimate. For instance if that 1/18 figure was changed to around 1/9, the chance of matching 2 East Indians now becomes about 1/530,000. Even if your database has only 50,000 East Indians in it, if an East Indian committed the crime, the chance of a false positive is around 10%. Much higher than you would expect. (I am using East Indians because I understand that they are a disliked racial minority in England. Substitute your favorite group if you wish.)
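
    Those numbers can be reproduced with a few lines, as a sketch only. The 1/18 and 1/9 per-locus chances, the 6 loci, and the 50,000-entry subgroup are taken from the paragraphs above; independence between loci is assumed, and the function names are made up for illustration.

```python
def match_odds(per_locus_odds: int, loci: int = 6) -> int:
    """Return N such that a chance full match is 1 in N, assuming independent loci."""
    return per_locus_odds ** loci


def p_any_false_match(odds: int, database_size: int) -> float:
    """Chance that at least one of database_size unrelated entries matches."""
    return 1.0 - (1.0 - 1.0 / odds) ** database_size


print(match_odds(18))   # 34012224, close to the quoted 1 in 37 million
print(match_odds(9))    # 531441, the "about 1/530,000" above
print(f"{p_any_false_match(match_odds(9), 50_000):.3f}")
```

    Halving the per-locus odds cuts the overall odds by a factor of 2^6 = 64, which is why a modest shift in allele frequencies matters so much.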

    So the moral of the story? Not only is the technique going to inevitably produce false positives, but it is likely to do so in a racially biased manner!

    Regards,
    Ben
  • I don't know where you get YOUR math from but it's not relevant.

    1: The chance of a DNA match (in this 6-loci case) is 1 to 37000000.

    2: That means that ONE DNA-sample compared to ONE other DNA-sample has a 1 in 37000000 chance of matching.

    3: If You have TWO other DNA-samples to match against you have a chance of match in 2 (TWO!) of 37000000 !

    4: Any other circumstances have no impact on this if THEY HAVE NOTHING TO DO WITH THE DNA-CODE !

    5: In this case we have 660000 OTHER DNA-samples to match against ! The rest is obvious ...

    Thomas Berg

    Mundus Vult Decipi
  • Is this sort of false match inevitable when you are comparing large numbers of DNA fingerprints from unsolved crimes to a large database of DNA samples?
  • Your points are all valid. However, none of this means that DNA testing is not extremely valuable. In fact, no matter how you figure the odds (within reason), the odds of a person who is WRONGLY suspected being cleared quickly by DNA are likely higher than they are of a person being wrongly convicted based on DNA. Knowing what we know through years of experience with DNA, we know that the odds of a false positive are still very slim, even if you factor in human error. If the odds of being wrongly convicted of a crime are a mere one in 36m (or whatever figure you might happen to quote), and DNA proves to be useful in solving a great many crimes, the question you should also ask is can we afford not to use it? Think of how many people have been cleared by DNA. Think of how many murderers have been convicted and/or arrested before they could kill again. Do you honestly believe that the number of people wrongly convicted (based on DNA) exceeds (or even remotely approaches) the number of people who've been saved? How many people have been wrongly convicted based on DNA? The closest thing, to my knowledge, is this ONE (out of how many million?) guy in the article here, and he was not even convicted. I suspect any lawyer worth his weight could have refuted that, especially if the odds (based on the agreed premises) are as high as most Slashdotters have just purported (e.g., 1:56).

    Sure, all things being equal, I would prefer there be no chance of anyone being wrongly convicted; however, the fact of the matter is that we don't live in a perfect world. We were no better off before DNA testing. All we've ever been able to guarantee in the courts is due process. There has always been (and likely always will be, to some degree) human error and prejudice involved in any trial. DNA, despite its flaws, brings us that much further away from those kinds of errors...
  • I'm sorry you felt a need to take such a strong tone in your title.

    The "1 in 37,000,000" figure is presented as a final probability of a match. Where did you see *anything* about there only being 37,000,000 possible permutations?

    If there were only 37M permutations of 6 loci, that would imply roughly 20 discrete possible values at each locus. Is that how you envisioned the underlying data?

    I don't know what test they use in the UK, but I'm assuming that it's the RFLP -- basically they use a highly specific enzyme to chop up the DNA, and place it on a polyacrylamide gel under an electric current/field to measure the size of the fragments. (Actually, nowadays, they probably use pre-synthesized n-nucleoside primers and PCR [polymerase chain reaction] to chop and selectively amplify the fragments, but the principle is the same)

    A single gel can easily measure fragments ranging from a few hundred base pairs to 10-400+ kbp with good resolution. The exact range varies according to current/field, gel composition, and other factors, but the bottom line is: it's easy to see bands that are a millimeter apart, so if you use a foot long gel, the range of possible values is close to 300. that creates:

    300^6= 7.29 x 10^14 possible permutations

    Actually, 0.5mm is a more realistic resolution limit, so the actual number of resolvable values is at least 600.

    (600 values) ^ (6 loci) = 4.7x10^16 permutations

    These are just crude estimates, for the benefit of those who've never read an electrophoresis gel. In actuality, the range of allowable values might be limited by other factors (values that are too extreme may be eliminated as artifacts). But it does give a sense of the TRUE numbers involved.
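
    The same crude estimate as a small sketch. The gel length is rounded to 300 mm (a foot is closer to 305 mm), the 1 mm and 0.5 mm resolutions and the 6 loci are the figures used above, and the function name is made up for illustration.

```python
def band_patterns(gel_length_mm: float, resolution_mm: float, loci: int = 6) -> int:
    """Distinguishable band positions per locus, raised to the number of loci."""
    values_per_locus = round(gel_length_mm / resolution_mm)
    return values_per_locus ** loci


print(f"{band_patterns(300, 1.0):.3g}")   # 300^6
print(f"{band_patterns(300, 0.5):.3g}")   # 600^6
```

    Either result is many orders of magnitude above 37 million, so the number of resolvable patterns is not what limits the quoted odds.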

    (with modern gels and automated readers, the resolution may be even higher, but my experience was with UV lamps, eyeballs and Polaroid prints way back in the 1900's... 1991 or so)

    Please run your analysis again using this range of possible permutations, and you'll see that 1:37M could well be a FINAL probability.

    Actual experience counts for something. (And as someone who still likes to consider himself a Young Turk, I hate myself for saying that!)
  • 1 in 37 million ?

    I don't think so. Maybe only one person in 37 million would match that DNA, but they were searching from 660,000 people. That makes the probability 660,000 : 37,000,000 or more plainly,
    1:56.

    I bet that figure never came up at trial. This is blatantly a case of a mis-understanding of probability, from what I have read about the case. They have to use DNA to narrow the search from a few suspects, instead of using it to pick out a person from 660,000 previous convicts.

  • This problem is similar to the so-called birthday problem, i.e. given a number of people n, what is the probability of two of them sharing the same birthday? If I remember my stats correctly, the result is surprisingly large...

    For n=2
    364 ways second person could have birthday without matching first
    For n=3
    363 ways third person can have birthday not matching other two
    p(match) = 1-365x364x363/(365^3)
    ....

    when this gets to about 20, p(match) is about 40%!!

    The chance of a DNA match amounts to a similar problem, so the stats rapidly build up to a high likelihood of a match after about 20-30 samples.
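
    The birthday calculation above, written out as a sketch. The `days` parameter and both function names are additions for illustration; setting `days` to 37 million treats each DNA profile as one of 37 million equally likely "birthdays", which is a simplification and not something the article states.

```python
def p_shared(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a 'birthday'."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_group(target: float, days: int = 365) -> int:
    """Smallest group size whose shared-'birthday' probability reaches target."""
    n = 1
    p_all_distinct = 1.0  # probability that the first n people are all distinct
    while 1.0 - p_all_distinct < target:
        p_all_distinct *= (days - n) / days
        n += 1
    return n


print(f"{p_shared(20):.3f}")                 # about 0.41 for 20 people
print(smallest_group(0.5))                   # 23
print(smallest_group(0.5, days=37_000_000))
```

    With 365 days the break-even group size is 23. With 37 million equally likely profiles the same loop lands a little above 7,000, so under this simplification the DNA version of the effect shows up after thousands of samples rather than 20-30.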
  • The environmental factors are acting throughout the identical twins' lifetime to make their DNA different. They're known as viruses. Other mutagens will also cause even more differences over time.

    Then there is the testing method. The electrophoresis gel tests used have rather poor repeatability. Sure some things can be done to help make them better. I wouldn't accept a match when the samples are done on two different machines in different labs. Having two different gel suppliers also makes a huge difference. The test is really only telling you the length of strands between markers where the chemicals split the strands into segments.

  • He's at it again, that damn trickster Loci. He makes trouble wherever he goes. And now the FBI has 13 of him... boy are they in trouble.
  • IANAS (in fact, I hate that branch of mathematics with a passion), but I do know enough to be able to say that this is inevitable.

    They say there was a one in 37 million chance of this false match occurring - so? There's a one in multi-millions chance of someone winning the lottery, and yet it generally happens (I realise they're not equivalent cases, but it does show my point) - whenever you talk about probabilities, you have to realise that they are only relevant over a statistically significant sample size. They say nothing about individual cases - anomalies happen, the one-in-a-million chance does happen, and almost certainly will happen if you take a large enough sample.

    The most important thing to understand is that this anomalous case does not invalidate DNA evidence - all it does is highlight the statistical nature of such evidence. DNA evidence (assuming the methodology of the tests is good) is exactly as useful now as it was before - that is, very useful - as long as it isn't abused. And generally speaking, the various police forces that use it are honest enough that they don't abuse it (witness the fact that they got a second opinion in this case).
    This is an interesting and eye-opening occurrence, but it isn't the end of DNA evidence in forensics.

    himi

    --
  • Illinois suspended executions after they realized that while they had executed 12 people since the reinstatement in 1977, 13 people had been freed from Death Row. Those are not good odds.
  • So why not just make all the tests 10 loci?

    This point is the only valid take-away from the whole article. The British database only captures a DNA fingerprint based on 6 loci and we've all seen the math on that. The vast majority of US states and all federal cases require DNA tests with more than 10 loci. The odds of this error cropping up in the states is significantly less.

    p.s. This is an *old* story. It was reported at the beginning of the month in several British papers and ran on CNN on Tuesday. Granted, Saturday night is a slow time for Slashdot, but it'd be nice to hear stuff we didn't already know. :)

  • > if you are in the database you have previously committed a crime

    Not quite. If you are in the database you have been *convicted* of a crime. You may not have actually been guilty. For this reason, your previous track record cannot legally be taken into account when deciding whether you are guilty.
  • Truly.

    To be specific, if their database has 700000 entries in it, it has 700000*699999/2 pairs in it. That's 245 billion pairs. If the odds against any pair matching at random are 37 million to one, that means there are a *LOT* of matches in that database, probably about 7000 of them.
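
    A quick check of that pair count, as a sketch. It assumes every pair of entries is an independent 1-in-37-million comparison, which is the same simplification the paragraph above makes; the function name is made up for illustration.

```python
from math import comb


def expected_matching_pairs(entries: int, odds: int) -> float:
    """Expected number of coincidentally matching pairs within a database."""
    return comb(entries, 2) / odds


print(comb(700_000, 2))                                       # 244999650000
print(round(expected_matching_pairs(700_000, 37_000_000)))    # 6622
```

    So "about 7000" is the right order of magnitude for a 700,000-entry database under that assumption.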

    This simply seems to be a case of scaling the database without scaling the identification key --with predictable results, non-unique keys.

    Anyone know how the probability of a bad match decreases with number of loci tested?
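
    One naive way to approach that question, offered as a sketch under a strong assumption: if every locus is independent and equally discriminating, the odds grow geometrically with the number of loci. The 6-locus figure of 1 in 37 million is from the article; the function name and the equal-locus assumption are made up for illustration.

```python
def extrapolated_odds(loci: int,
                      known_loci: int = 6,
                      known_odds: float = 37_000_000.0) -> float:
    """Odds against a chance match, scaling an observed figure to more loci."""
    per_locus = known_odds ** (1.0 / known_loci)   # roughly 18 per locus
    return per_locus ** loci


for k in (6, 10, 13):
    print(k, f"{extrapolated_odds(k):.2e}")
```

    This naive scaling gives figures well beyond the 1 in one billion quoted for 10 loci elsewhere in this discussion, which suggests the published numbers are more conservative than a pure independence assumption.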
  • This issue was mentioned in some circles (not the mainstream press, unfortunately) in the wake of the OJ Simpson trials. The problem is that the public, and most of the pundits, it appears, were not educated about the fact that the odds of a false negative and the odds of a false positive are related but not the same.

    This brings up two issues. The first is the tendency to believe that technology can put complex techniques within the capabilities of people without training in the field. The second, closely related, is the belief that the reliability of the technology is not affected by the possibility of human error. On anything where the odds are stated as being that long, the two things I always ask are:

    1. Do we understand the odds? Are we aware of all of the factors that might be significant when we are trying to get results to that many decimal places? There are any number of factors that can be ignored when you are looking for imprecise answers.
    2. What were the opportunities for human error or corruption? I would expect them to be fairly high relative to the long odds stated here.

  • Well, if they were conjoined twins then the left one is always the evil one. ...according to The Simpsons anyway
  • This really shouldn't come as that big a surprise to people - no more so than someone winning a lottery.
    As the article mentions, there is a 1 in 37 million chance of this happening. Statistically this means that while it will not happen often, it will happen at some point.
    I think the problem arises from the widespread belief that DNA testing is infallible and provides concrete proof of a person's guilt/innocence - it does not.
    DNA evidence is just that, evidence, and should be regarded as such in court. If DNA testing along with corroborating evidence indicates the person is guilty, then they probably are - or vice versa. If there is evidence that points against the DNA results, one should not automatically assume that the DNA results are correct.
  • As Terry Pratchett says:
    "Million to one chances happen nine times out of ten."
  • Did you read the article? They re-tested with 10 points of reference, which supposedly has a 1 in 1,000,000,000 chance of a mismatch, so it was more a case of not using the most reliable test they could. Also, apparently in the US they use 13 points of reference, which presumably has a stupidly large number for its mismatch chance. I guess it'll just change the procedure so they use the 1 in 37,000,000 and re-test with a higher level if it matches to confirm it.

    Are there any figures for fingerprint testing? How truly unique is a single fingerprint, and what's the chance of a mismatch with 2 fingerprints? DNA testing is still pretty accurate!

  • by divec ( 48748 ) on Sunday February 13, 2000 @03:13AM (#1280118) Homepage
    Right, 37 million to one is not very big odds when you're doing 245 billion independent tests.
    If the probability of a false positive in any individual test is p, then the probability of conducting n tests without getting any false positives is (1-p)^n. As pointed out, this means that if enough tests are done you'll almost certainly convict an innocent person. If you have two crimes with DNA evidence that is only this reliable, then more than likely some innocent person in the UK would test guilty.
    Actually, it's worse than this because people don't have independent DNA - they're likely to be distantly related. This makes false positives even more likely.
    If there are n people and you want the probability that any of them tests positive to be less than x then you need
    1 - (1-p)^n < x, which is nearly the same as p*n < x. So to be fairly sure that nobody in the world falsely tests positive you need p to be less than about 1 in 80 billion.
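As a rough check of that bound, here is a minimal Python sketch. The function names are made up for illustration, and the world population of 6 billion is an assumption (a round figure for 2000), not something stated in the comment. It evaluates 1 - (1-p)^n through `log1p`/`expm1` so that the tiny value of p is not lost to rounding.

```python
import math

def prob_any_false_positive(p: float, n: int) -> float:
    """1 - (1-p)^n: chance of at least one false positive in n independent tests."""
    return -math.expm1(n * math.log1p(-p))

def max_p_for_target(x: float, n: int) -> float:
    """Largest per-test false-positive rate p that keeps 1 - (1-p)^n at or below x."""
    return -math.expm1(math.log1p(-x) / n)

# Assumption: roughly 6 billion people in the world (not given in the comment).
world_population = 6_000_000_000
p = 1 / 80_000_000_000  # the "1 in 80 billion" figure from the comment

risk = prob_any_false_positive(p, world_population)
print(f"chance that someone, somewhere, falsely matches: {risk:.4f}")
print(f"linear approximation p*n:                        {p * world_population:.4f}")
```

Under those assumptions the risk comes out a little above 7%, which is one reading of "fairly sure"; a stricter target x would push the required p lower still.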
  • by divec ( 48748 ) on Sunday February 13, 2000 @03:18AM (#1280119) Homepage
    The probability of a false positive match approaches 1 as the number of samples approaches oo.

    P(false positive) = 1 - P(no false positives)
    = 1 - (P(correct answer))^n
    = 1 - (1-p)^n
    -> 1 as n -> oo.

    This is ignoring the probability of a false negative; this is very low since only one person can commit a crime!
  • by divec ( 48748 ) on Sunday February 13, 2000 @03:40AM (#1280120) Homepage
    No, the original poster was right. The chance of a false match on the file is

    1 - (1 - 1/37million) ^ 660,000

    which is nearly the same as

    660,000 / 37million = 1/56.
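For anyone who wants to check the arithmetic, a small Python sketch comparing the exact expression with the linear approximation. It assumes, as the comment does, that every comparison against the file is independent and has the same 1-in-37-million chance of a false match.

```python
import math

p = 1 / 37_000_000  # per-comparison chance of a false match (figure quoted in the article)
n = 660_000         # number of profiles on file

# Exact: 1 - (1 - p)^n, computed via log1p/expm1 to avoid rounding trouble with tiny p.
exact = -math.expm1(n * math.log1p(-p))
# Approximation: n * p, valid while n * p is much smaller than 1.
approx = n * p

print(f"exact  = {exact:.5f}  (about 1 in {1 / exact:.1f})")
print(f"approx = {approx:.5f}  (about 1 in {1 / approx:.1f})")
```

The two values differ only in the fourth decimal place, so "1 in 56" is a fair summary either way.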
  • This is so basic, I can't even believe it! I can't believe people's lives are decided on such a weak mathematical basis!

    If the chance of a match between two random DNA samples is 1/(37*10^6), and they have 660000 samples in their database, then the likelihood -- assuming their system doesn't give false positives, which I doubt -- of a database match is ... 1.78% !!! We don't know how many DNA tests they make each year, but it's probably well over a thousand, which leads to over 10 false positives a year!

    Americans find that "mind blowing"? Mind-boggling stupidity, if you ask me.
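The yearly estimate can be sketched in a few lines of Python. The 1,000 searches per year is the commenter's own guess rather than a reported figure, and the sketch assumes every search is an independent trawl of the full file by someone who is not actually in it.

```python
p = 1 / 37_000_000         # chance that two random profiles match
n = 660_000                # profiles in the database
searches_per_year = 1_000  # assumption: the comment's guess, not a reported figure

# Chance that one innocent sample matches at least one profile on file.
per_search = 1 - (1 - p) ** n
# Expected number of such false matches across a year of searches.
expected_per_year = searches_per_year * per_search

print(f"false-match chance per search: {per_search:.2%}")
print(f"expected false matches per year: {expected_per_year:.1f}")
```

With these inputs the expected count lands in the high teens, consistent with the "over 10" in the comment; the real number scales directly with however many searches are actually run.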
