Up Down Method

The up down method is probably the most common technique used to derive mechanical nociceptive thresholds from von Frey filament data (the other being the % response method). As part of our own validation of MouseMet, we compared it with von Frey filaments on mice and, when we came to analyse the data, studied the method in detail. Gradually we realised, from talking to other workers and from the literature, that the up down method is often misunderstood and misused, and that we should be very careful when comparing it with a system such as MouseMet, which gives a true analogue result. It was to address this problem that we developed MouseCal, an in-house validation system for filament and EvF measurements alike. This page therefore presents, for completeness and in the interests of better science, a summary of the problems we found, together with a link to a recent Topcat poster presentation. It contains more maths (and fewer pictures) than the rest of the website. Sorry!

I (Michael Dixon) should start by saying that I am an engineer and not a mathematician; I will welcome input from anyone who can help shed light on the issues we raise. And, if you are unfamiliar with the up down method, may I suggest you start by reading the first part of this recent Topcat poster, which outlines the procedure.

Weber’s Law

The method was pioneered by WJ Dixon (no relation) in the 1960s, and developed for this application by SR Chaplan in the 90s (refs at the end). The starting point is Weber’s Law. This states that, for a given stimulus, the minimum increment that a person (or animal) can detect is a fixed proportion of that stimulus. This fixed proportion, not surprisingly, is called the Weber fraction. One can therefore produce a progression of stimuli of increasing intensity, each one just distinguishably bigger than the previous, by making each stimulus a constant multiple of the one before.

Expressed mathematically, if one took logarithms of the stimulus values, the differences between successive stimuli would then be a constant amount. (Please don’t give up at the word “logarithm”, I’ll be very gentle.)

An ideal set of von Frey filaments would then be designed according to this rule, each one producing more force than the previous one by a constant factor, so that a subject can just distinguish between them.
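As a sketch, such an ideal set can be generated as a geometric progression; the starting force and Weber fraction below are illustrative values, not those of any real product:

```python
import math

# Sketch of an "ideal" filament set built as a geometric progression.
# The starting force and Weber fraction are illustrative values only.
start_g = 0.25        # smallest filament, in grams force (hypothetical)
weber_fraction = 1.4  # each filament is ~1.4x the previous
n_filaments = 10

forces = [start_g * weber_fraction ** i for i in range(n_filaments)]
print([round(f, 2) for f in forces])

# On a log scale the spacing between adjacent filaments is constant:
log_steps = [math.log10(forces[i + 1]) - math.log10(forces[i])
             for i in range(n_filaments - 1)]
print([round(s, 3) for s in log_steps])  # every step is log10(1.4), about 0.146
```

The constant log spacing is exactly the property the up down method relies on.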

The ideal spacing

Dixon also stated in the original article that the spacing between adjacent stimuli should be about equal to the standard deviation of the (logged) data (but that this could vary up or down by up to 50%). It’s a slightly circular argument because it suggests that you need to know your population before you can measure it although the window is quite wide. Another way of putting this is that you will generally use 2 or 3 stimuli up and down from the mean to cover most of your population (because a normal distribution is 2-3 standard deviations wide on each side).

Note also that the method requires a degree of random variation in order to work. If a mouse were to respond at exactly the same threshold force every time, the up down sequence would merely seesaw between two adjacent levels. If, for instance, these two von Frey filaments were the 2 and 4 g ones, it would be impossible to distinguish between a mean of 2.5 g and 3.5 g, or even 2.1 g and 3.9 g. In practice this hardly ever happens, because the sources of random error (biological variation in the subject, inconsistency of the operator and random variation in the forces produced by the von Frey filaments) combine to produce a (normal) distribution of responses, which is exactly what the method is designed to measure.
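The seesaw behaviour is easy to demonstrate with a toy deterministic "subject" (this is an illustration of the argument above, not our simulator):

```python
# Toy model: a subject with zero response variability always responds
# when, and only when, the applied force reaches its fixed threshold.
filaments = [1.0, 1.4, 2.0, 4.0, 6.0]  # grams; a slice of a filament set

def updown_sequence(true_threshold_g, start_index=2, n_trials=8):
    """Return the forces tested under the up down rule: step down after a
    response, step up after a non-response."""
    i, seq = start_index, []
    for _ in range(n_trials):
        force = filaments[i]
        seq.append(force)
        responded = force >= true_threshold_g
        i = max(0, i - 1) if responded else min(len(filaments) - 1, i + 1)
    return seq

# Any threshold between 2 and 4 g produces the identical seesaw:
print(updown_sequence(2.5))  # [2.0, 4.0, 2.0, 4.0, 2.0, 4.0, 2.0, 4.0]
print(updown_sequence(3.5))  # same sequence: 2.5 g and 3.5 g cannot be told apart
```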

Available von Frey Filaments

Here are the forces produced by many of the commercial von Frey filament sets, in grams force:

0.008, 0.02, 0.04, 0.07, 0.16, 0.4, 0.6, 1.0, 1.4, 2.0, 4.0, 6.0, 8.0, 10.0 (the full set goes up to 300 g).

These values are curious. Taking the 1.0 g filament as a starting point, the next provides a force of 1.4 g, a step ratio (Weber fraction) of 1.4. But further along the series, the step from 2.0 to 4.0 g has a ratio of 2. So the spacings are inconsistent, which may degrade the method. This was an immediate worry to us, because we were trying to compare these data with MouseMet, which gives an analogue reading, and we needed to quantify any sources of error.
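A quick calculation of the successive ratios of the set quoted above makes the inconsistency explicit:

```python
# Successive force ratios of the commercial filament set listed above:
forces = [0.008, 0.02, 0.04, 0.07, 0.16, 0.4, 0.6,
          1.0, 1.4, 2.0, 4.0, 6.0, 8.0, 10.0]
ratios = [forces[i + 1] / forces[i] for i in range(len(forces) - 1)]
print([round(r, 2) for r in ratios])
# The ratios run from 1.25 up to 2.5, so the log spacing is far from constant.
```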

A Simulator

We therefore wrote a mathematical simulator to test the up down method, and to see how much it is degraded if, for instance, the von Frey filaments are not uniformly spaced. For this we needed knowledge of the typical standard deviation of baseline threshold data from mice and we took this from three sites currently using MouseMet regularly (acknowledgements on the poster).

For the range commonly used to test mice, the von Frey filament spacing quoted above works quite well if you add in a 3.0g filament, so we tested this case first and the results were presented as a poster abstract at the spring meeting of the AVA, Nottingham 2014.
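To give a flavour of how such a simulation works, here is a stripped-down sketch. This is a toy version, not our actual simulator: the mean, SD, stopping rule and the crude averaging estimator are all illustrative, and a real analysis would use Chaplan's look-up table rather than simple averaging.

```python
import math
import random

# Toy up-down simulator: mouse filament range with a 3.0 g filament added.
filaments = [0.6, 1.0, 1.4, 2.0, 3.0, 4.0, 6.0]  # grams force
log_f = [math.log10(f) for f in filaments]

def updown_run(mean_log, sd_log, rng, start=3, extra_trials=4):
    """One up-down sequence: step down after a response, up after none.
    Stops extra_trials trials after the first change of response."""
    i, responses, forces = start, [], []
    remaining = None
    while remaining is None or remaining > 0:
        # Each trial's threshold is drawn afresh: normal in log units.
        resp = rng.gauss(mean_log, sd_log) <= log_f[i]
        forces.append(filaments[i])
        responses.append(resp)
        if remaining is None and len(responses) > 1 and responses[-1] != responses[-2]:
            remaining = extra_trials          # first change of direction seen
        elif remaining is not None:
            remaining -= 1
        i = max(0, i - 1) if resp else min(len(filaments) - 1, i + 1)
    return forces

rng = random.Random(1)
# Illustrative population: mean threshold 2.0 g, SD of logged data ~ one spacing.
runs = [updown_run(math.log10(2.0), 0.15, rng) for _ in range(2000)]
# Crude per-run estimate: mean of the logged forces tested. Averaged over many
# runs, starting at the true mean, this should sit near 2 g.
est = [sum(math.log10(f) for f in fs) / len(fs) for fs in runs]
print(round(10 ** (sum(est) / len(est)), 2))
```

Varying the filament spacing, the starting filament or the SD in a sketch like this is how one can probe how much each factor degrades the estimate.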

We also modelled other possible sources of error from the von Frey filaments and the up down method. These are detailed on the poster and summarised here:

Incorrect Log Values

The commercial von Frey filaments have a number printed on the handle, which many people put straight into the standard equation for the up down method. The numbers appear to be the base 10 logarithms of the gram force values for each filament, with 4 added to each value. I assume this is to make all the numbers positive and greater than 1 (the base 10 logarithm of 1 is 0, and the smallest filament is 0.008 g, nearly three orders of magnitude lower). Strangely, most of the values are slightly incorrect: the 1.0 g filament has 4.08 printed on the handle and, even allowing for the added 4, there is no system of logarithms to my knowledge in which the log of 1 is 0.08! We therefore ran the simulation with both correct and incorrect log values to see the difference.
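Working backwards from the printed value shows the size of the discrepancy; a minimal check, assuming the handle number is meant to be log10(grams) + 4:

```python
import math

def expected_handle_value(grams):
    """Handle marking, if it really were log10(force in grams) + 4."""
    return math.log10(grams) + 4

def implied_force(handle_value):
    """Force, in grams, that a printed handle value actually implies."""
    return 10 ** (handle_value - 4)

print(expected_handle_value(1.0))     # 4.0, yet the handle says 4.08
print(round(implied_force(4.08), 2))  # 1.2 g is what a marking of 4.08 implies
```

So a filament nominally delivering 1.0 g is labelled as though it delivered about 1.2 g, and it is the labelled value that goes into most people's calculations.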

Starting away from the mean

Another important point is that the method stipulates that the first von Frey filament used should be as close as possible to the 50% threshold (the mean of the distribution of the data). Put another way, you need to know the answer before you start to measure it; while this is obviously not possible, you should start with a von Frey filament close to the expected mean. This creates problems in a blinded trial, where you are looking for a treatment effect that may raise or lower the mechanical threshold relative to the baseline values. Also, some workers deliberately start with a very small von Frey filament, expecting to record three or four non-responses at the start, and then use the values in the Chaplan look-up table beginning with OOO to generate their coefficient k. This, we believe, is an incorrect use of the method: these coefficients are intended to cope with the outliers in normally distributed data, not with a starting point deliberately chosen a long way from the mean.
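For context, the equation that the look-up coefficient k feeds into is Chaplan's 50% threshold formula; a minimal sketch, in which the example Xf and k values are hypothetical and the 0.224 spacing is the delta quoted for Chaplan's rat filament set:

```python
# Chaplan's 50% threshold formula:
#   threshold (g) = 10 ** (Xf + k * delta) / 10_000
# Xf    : log-unit handle value of the final filament tested
# k     : coefficient from the Dixon/Chaplan look-up table for the
#         observed response pattern
# delta : mean spacing of the stimuli in log units (0.224 for
#         Chaplan's rat filament set)
def fifty_percent_threshold(Xf, k, delta=0.224):
    return 10 ** (Xf + k * delta) / 10_000

# Hypothetical example: final filament marked 4.31, pattern coefficient 0.5.
print(round(fifty_percent_threshold(4.31, 0.5), 2))  # about 2.64 g
```

Because k is read from a table built for normally distributed responses, the formula inherits every problem described above: incorrect handle values shift Xf, and a deliberately distant starting filament produces response patterns the table was never meant for.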


Finally, any test for normality should presumably be performed on the logged data, not on the linear values in grams; if Weber’s law applies, then testing the linear data is not meaningful. Note also that the up down method yields a discrete set of values, not a continuum, so the data should be regarded as non-parametric for statistical purposes.

There. That didn’t hurt, did it? Click here to return to the MouseMet system (with pictures).


Dixon WJ (1965) The Up-and-Down Method for Small Samples. J Am Stat Assoc 60: 967-978.

Chaplan SR (1994) Quantitative assessment of tactile allodynia in the rat paw. J Neurosci Methods 53: 55-63.