Is It Ever Reasonable to Admit ‘Bad Character Evidence’ in a Criminal Trial?

In what follows I will identify the chief concerns with admitting bad character evidence.  I will argue that this evidence, when used as direct evidence that a defendant is guilty of the crime for which he is standing trial, should never be admissible in a criminal trial.

In the UK the law of criminal evidence operates a presumption against admitting evidence that shows a defendant to be of bad character.  The arguments over whether to admit such evidence largely centre on the relevance debate.  Evidence is logically relevant when the probability of finding that evidence given the truth of some hypothesis at issue in the case differs from the probability of finding the same evidence given the falsity of that hypothesis.  So, in a criminal trial, evidence is logically relevant only if it is not equally likely to be found given the defendant’s guilt as given his innocence.  Statistics strongly suggest that bad character evidence, such as a record of prior convictions, is logically relevant.  Suppose the initial odds of guilt are 1:1 and the jury is told that the defendant has a criminal record.  It is reasonable to assume that the jury will judge a criminal record to be more probable if the defendant is guilty of the offence for which he is now on trial than if he is innocent of it.  That is, Pr(Record|Guilt) > Pr(Record|Innocence).  Hence, bad character evidence does raise the probability of guilt and so must be logically relevant.  Indeed, bad character evidence is frequently admitted for other purposes: to undermine the defendant’s credibility, to inform the jury of the essential background to the allegations against the defendant, and to rebut the defendant’s depiction of himself as a person of good character.  The issue is whether to allow bad character evidence as direct evidence that the defendant committed the crime for which he is standing trial.  The Courts term this ‘similar fact evidence’.  Some contend that similar fact evidence should be admitted because it is logically relevant.  They assert that excluding evidence whose likelihood ratio deviates substantially from 1:1 deprives the jury of information that might aid them considerably in their rational resolution of disputed factual claims, and may prevent a party from making what is, on a fair reading of the evidence, a powerful case.  The obvious solution, they argue, is to provide the jury with all the information they need to assess accurately the probative value of the offered evidence.
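The relevance test described above can be written out as a likelihood ratio.  The following is only a sketch: the symbol LR and the two probabilities in the worked line are assumed values chosen for illustration, not figures drawn from any study.

```latex
\[
\mathrm{LR} = \frac{\Pr(\text{Record} \mid \text{Guilt})}{\Pr(\text{Record} \mid \text{Innocence})},
\qquad
\text{evidence is logically relevant} \iff \mathrm{LR} \neq 1 .
\]
% Illustrative values only (assumed, not empirical):
\[
\Pr(\text{Record} \mid \text{Guilt}) = 0.6, \quad
\Pr(\text{Record} \mid \text{Innocence}) = 0.2
\;\Longrightarrow\;
\mathrm{LR} = \frac{0.6}{0.2} = 3 .
\]
```

On these assumed figures a criminal record is three times as likely to be found if the defendant is guilty as if he is innocent, so starting from odds of 1:1 it would move the odds of guilt to 3:1.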

Still, the Courts exclude similar fact evidence, maintaining that it does not meet the appropriate standard of proof.  I will present three arguments against admitting similar fact evidence: that the data used by the statisticians may be biased, that reference classes raise problems of their own, and that psychological theory is incomplete.  Firstly, the statistics that indicate the logical relevance of similar fact evidence (e.g. prior convictions) turn into an almost self-fulfilling prophecy.  Based on statistical evidence, police carefully monitor recently released convicts, but this takes resources away from ‘traditional’ policing activity.  Thus, when a recently released convict commits an offence he is less likely to get away with it than a first offender, or one without prior convictions.  Similarly, when a crime is committed police may be more likely to ‘round up the usual suspects’, the regular offenders, than to follow a trail of evidence.  In this way the statistics can become more and more skewed.  The Courts could argue that this throws doubt on their reliability.  Secondly, statistical groups are not finely enough individualised and so cannot be used to predict the behaviour of a particular individual.  For instance, the statistics are often used without differentiating between serious offences and the more common minor ones.  Serious offences (e.g. murder, rape) are rare, and they say more about a person’s character than minor offences (e.g. car theft) do.  Serious offences are also harder to predict.  Minor offences should not be used to predict a serious one, and vice versa.  It is a long way from stealing a car to murder.  A fifty per cent false positive rate has been identified when statistics are used to predict whether a serious offender will re-offend.  As minor offences are more common, predictions about them incur fewer false positives, and so evidence of prior convictions for minor offences is more probative.  Likewise, the significance of time-scale is habitually ignored or under-emphasised.  The probability of minor offenders re-offending diminishes with time; most re-offending occurs within two years of the previous offence.  Outside the two-year window those convicted of minor offences are unlikely to re-offend.  Further, reference classes introduce the problem of classification.  Doubt about how to classify events leads to a subjectivity that will surely reduce the probative value of the evidence.  Thirdly, the Courts argue that there are still sizable doubts as to whether character evidence can be used at all to predict behaviour, because psychological theories do not capture the complex nature of a person.  Currently, psychologists seem to favour ‘interactionism’, which says that our behaviour results from a combination of our character or personality and the situation.  That is, they claim that a given character reacts similarly to similar situations.  This does not seem very precise, and it is certainly grounds for reasonable doubt.  What proportion of behaviour is down to character?  Is it the same for all people?  Can there never be a spontaneous offence?  These questions all point to similar fact evidence being insufficiently informative.  In view of the above, we have reached a stage where the statistics, and consequently the logical relevance of similar fact evidence, are less than secure.  This would indicate that the Courts are right to be wary of bad character evidence.

Another concern with admitting bad character evidence is its prejudicial effect.  This is closely related to the problem of relevance.  Courts fear that the small probative value of similar fact evidence will be outweighed by the prejudice, perhaps unconscious, that it will spark in the jury.  Hearing this evidence could unfairly change the standard of proof applied to a defendant’s guilt.  The evidence may influence the jury’s verdict without relating logically to the issue of guilt or innocence.  In particular, once jurors learn that the defendant has past convictions they may not care so much if they wrongly convict him of the offence for which he is currently on trial.  This is based on a model known as the regret matrix, which suggests that jurors are likely to regret the mistake of convicting a basically evil person less than that of convicting a basically good person.  If most jurors cannot avoid being influenced by such preferences in reaching their verdicts, the standard of proof is effectively changed by any information (such as a record of prior convictions) that affects these preferences.  The law, however, regards a wrongful conviction as much more regrettable than a wrongful acquittal (innocent until proven guilty).  The law does not contemplate that the standard of proof will vary with the defendant’s personal character or criminal activity, and as such this evidence is simply inadmissible.  Thus even if bad character evidence is logically relevant, it might still be deemed legally irrelevant.
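The way a regret matrix can shift the effective standard of proof may be sketched as follows.  This is an illustration under stated assumptions, not a model taken from the case law: it assumes a juror convicts only when doing so minimises expected regret, that correct verdicts carry no regret, and the regret weights (10:1 and 4:1) are hypothetical numbers chosen purely to show the direction of the effect.  The function name is likewise invented for the example.

```python
from fractions import Fraction


def conviction_threshold(regret_false_conviction, regret_false_acquittal):
    """Probability of guilt above which convicting minimises expected regret.

    With probability of guilt p, convicting risks (1 - p) * regret_false_conviction
    and acquitting risks p * regret_false_acquittal.  Convicting is preferred when
    p > regret_false_conviction / (regret_false_conviction + regret_false_acquittal).
    Assumes correct verdicts carry zero regret (an illustrative simplification).
    """
    rc = Fraction(regret_false_conviction)
    ra = Fraction(regret_false_acquittal)
    return rc / (rc + ra)


# Hypothetical regret weights, for illustration only.
# Juror who knows nothing of a record: a wrongful conviction is 10 times as regrettable.
baseline = conviction_threshold(10, 1)

# Same juror after hearing of prior convictions: only 4 times as regrettable.
after_record = conviction_threshold(4, 1)

print(f"Threshold without record: {float(baseline):.3f}")
print(f"Threshold with record:    {float(after_record):.3f}")
```

On these assumed weights the juror would demand less certainty before convicting once the record is known, even though nothing about the strength of the evidence on the present charge has changed.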

A further argument against admitting similar fact evidence is that the jury may well have already assumed the defendant has a criminal record.  Whatever probative weight this evidence may carry has then already been counted in their verdict, so admitting the evidence of previous convictions could raise the probability of guilt further than it should, double counting the probative value of the evidence.  Moreover, the courts are concerned that jurors will misestimate the worth of similar fact evidence, assigning to it greater probative weight than it is due.  This tendency to overestimate could be used to the advantage of the prosecution, enabling them to confuse or mislead the jury.  Much of this problem relates to the imprecision of probabilistic estimates.  Evidence E should impact jurors’ beliefs according to Bayes’ Rule:  posterior odds = likelihood ratio multiplied by prior odds.  If the likelihood ratio is greater than one we have evidence for our hypothesis, e.g. that x is guilty.  If the likelihood ratio is less than one it is evidence against the hypothesis.  If the likelihood ratio is equal to one then the evidence is logically irrelevant.  Supposedly this is the model jurors follow.  However, it relies on the juror’s default assumption being that the defendant has no past convictions.  As we have already discussed, this may not be the case.  One may well then question the entire process by which we establish guilt or innocence, that is, having a non-specialist jury working within a model of probabilities which are either raised or not by evidence presented to them at the hearing.  But this process does seem to have worked more or less well for centuries; at least it appears to be the least bad process thus far developed.  I think that even with a jury whose default assumption is that the defendant has a criminal record, and whose preferences are such that they are less concerned about convicting a basically bad person (i.e. one with a criminal record), the system does not actually suffer, because the court does not explicitly say that defendant x has a criminal record.  As soon as the jury is made aware of a record, however, they greatly overestimate the value of this evidence, as considered above.  Every juror enters the court with background assumptions and beliefs, and these may well conflict with those of the other jurors.  The point of having a jury of your peers is to take these background assumptions into account.  Most judges (at least in the UK) are middle-aged, white, upper-middle-class men, perhaps a little out of touch with the plight of the average defendant over whose trial they preside.  The jury acts as a safeguard, and its default assumptions are a crucial part of that safeguard.
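The double-counting worry can be made concrete with the odds form of Bayes’ Rule given above.  The sketch below is illustrative only: the likelihood ratio of 3 for a criminal record and the neutral 1:1 prior are assumed values, and the function names are invented for the example.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' Rule: posterior odds = likelihood ratio * prior odds."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of guilt into a probability of guilt."""
    return odds / (1 + odds)


# Hypothetical values, for illustration only.
LR_RECORD = Fraction(3)       # assumed likelihood ratio for 'has a criminal record'
neutral_prior = Fraction(1)   # assumed prior odds of guilt, 1:1

# A juror who silently assumes a record has, in effect, already applied the ratio once.
odds_with_assumed_record = update_odds(neutral_prior, LR_RECORD)

# If the record is then disclosed and weighed again, the same evidence is counted twice.
odds_double_counted = update_odds(odds_with_assumed_record, LR_RECORD)

print("Counted once: ", float(odds_to_probability(odds_with_assumed_record)))
print("Counted twice:", float(odds_to_probability(odds_double_counted)))
```

Under these assumptions the record properly supports odds of 3:1, but a juror who had already assumed it and then hears it formally would arrive at 9:1, which is the overstatement the argument describes.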

The final argument I will present concerns the moral principles involved in disclosing evidence of prior convictions.  It could be said that by excluding evidence of bad character, courts express the moral principle that a defendant ought to be tried for the particular offence he is alleged to have committed and not for his past actions.  Indeed this seems to be a fundamental principle of criminal justice.  It has been argued that it is a moral requirement not to hold people to their past, or to bind them to statistical averages.  But I would go further than this.  In many ways the legal system of a society says what that society values.  Punishing people for murder, literally removing them from society, says that as a society we believe that killing another person is wrong.  By admitting bad character evidence we are in effect saying, as a society, that people cannot change and cannot redeem themselves.  The very fact that we spend millions of pounds on rehabilitation facilities speaks to the contrary.  We have discussed the reference class problem.  This problem can be extended into an argument against the use of such statistics altogether.  No matter how strong the statistical evidence, it can never say that a person will do x; there will always be uncertainty.  For most forms of evidence there will be uncertainty, but the uncertainty with reference class statistics is grounded in the fact that they represent the behaviour of other people.  We cannot use the behaviour of others as a reason to predict the behaviour of one particular person.  Even if the reference class consists of just that person and their past actions, still the future cannot be predicted.  One cannot convict someone of a crime they have not yet committed.  In the same way, a person cannot be convicted because their past actions strongly suggest they have committed the offence for which they are presently on trial.  Finally, it is worth noting that statistics are not causally relevant.  ‘Past convictions’ is not a cause of any present offence (assuming the offences are committed independently).  It cannot be argued that the defendant stole this car because he has stolen three cars previously, or that the stealing of three cars caused him to steal this one.  While this is not in itself an argument against admitting bad character evidence, it is an important point because, perhaps due to a fundamental lack of understanding of probability theory, juries can misinterpret bad character and statistical evidence as causally relevant.  Maybe this goes some way to explaining why this type of evidence is so often overweighted.  Thus, admitting bad character evidence says that our society holds that ‘once a criminal always a criminal’ and that the behaviour of others (no matter how similar to me) will accurately predict my behaviour.

In sum, it is argued that bad character evidence should be admissible in criminal trials because it is logically relevant, and that depriving the jury of this information may unfairly prevent a party from making a powerful case or hinder the jury in resolving the disputed factual claims.  The chief concerns with admitting bad character evidence, in particular similar fact evidence in the form of a criminal record, are that it is not actually relevant, that jurors may well subconsciously already assume the defendant has a bad character so that admitting such evidence will lead to double counting, that jurors misestimate the probative value of this evidence, and that it is prejudicial.  Further, there are moral principles at stake in admitting bad character evidence.  It is a fundamental principle of criminal justice that a person be tried for the particular offence he is alleged to have committed and not for his past actions.  As a society do we want to say that people cannot change, and offer no chance of redemption?  Statistics used to support the admission of bad character evidence are in fact based on the behaviour of others, and we cannot use the behaviour of others to predict the behaviour of the defendant.  Nor can we use the past behaviour of the defendant to predict his behaviour now; the defendant’s values, circumstances and situations are not constant.  Statistically relevant does not mean causally relevant.  Bad character evidence for the purpose of directly indicating a defendant’s guilt has no place in a court of law.




