The Problem with Accuracy: why uncertainty is more useful

Apr 6, 2021 00:00 · 1223 words · 6 minute read

In the normal run of agronomic activity, we take soil samples to tell us about the condition of the soil, and in particular about any deficiencies in either macro-nutrients or micro-nutrients. Especially for the latter, we are looking at small numbers in parts per million. Why would accuracy be a problem? Wouldn’t accuracy be exactly what we want?

This post is about why we recommend understanding clearly the limitations of accuracy, and why uncertainty is better. We unwrap a paradox.

I am not suggesting that your soil sample results are consistently wrong, but there are significant limitations to take into account.

The first limitation is that even when lab procedures are carried out very professionally, the results are accurate only for certain soil attributes in the small sample sent. Those values are not necessarily the same across the body of soil that you were sampling. The body of soil is the farm paddock (length and width) down to the depth of the root zone. It’s a 3D entity, and we know there is variation within it.

If your company has already adopted sampling at known GPS points, then the results are probably accurate for that point. Results from a sample taken as little as one metre away could be quite different. We see that when we sample for soil electrical conductivity in horticulture.

There is considerable evidence in the scientific literature, and in advice from the various departments of agriculture, that the results returned will vary between laboratories and over time within a single laboratory. Among the many papers demonstrating this variability is “Variation in soil fertility test results from selected Northern Great Plains laboratories”. The authors conclude:

“Variability within labs and between labs was observed, and is at least partially linked to different laboratory methods or changes in laboratory methods during the study period. Several labs showed a trend of improving lab repeatability. The results suggest that agricultural professionals should choose their labs wisely, specify analytical methods, and review results critically. In addition, the analytical data reported, and ultimately, the fertilizer recommendations provided, should be considered guidelines to nutrient management to be tailored to site-specific conditions.”

If you take several samples in a paddock and combine them, you introduce a third type of limitation, and potentially a very significant one. There is always some variation across a paddock, even at the one-hectare scale. It can result from how the soil was formed or from human management. Combining samples can obscure big variations in soil conditions, so it is no longer a wise practice, and averaging the results makes no sense. If you have an area where electrical conductivity is creeping up for some reason, or where sand-dominated soils don’t retain P very well, you will average those problems out against the rest of the body of soil. The plants in the affected or deficient area know nothing of the better conditions 10 metres away, but your process has hidden the range of values.
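To make the averaging problem concrete, here is a minimal sketch in Python. The readings, the units and the critical level are all invented for illustration; they are not from any real paddock or lab report.

```python
from statistics import mean

# Hypothetical soil P readings (mg/kg) from six cores in one paddock.
# The last two cores stand in for a sandy patch that retains P poorly.
cores_p = [42.0, 38.0, 45.0, 40.0, 12.0, 9.0]

# Assumed critical level for this illustration only.
CRITICAL_P = 25.0

# What a combined (composite) sample would effectively report.
composite = mean(cores_p)

# What the individual cores show.
value_range = (min(cores_p), max(cores_p))
deficient = [v for v in cores_p if v < CRITICAL_P]

print(f"Composite value: {composite:.1f} mg/kg")
print(f"Range across cores: {value_range[0]:.1f} to {value_range[1]:.1f} mg/kg")
print(f"Cores below the assumed critical level: {len(deficient)} of {len(cores_p)}")
```

In this made-up case the composite sits above the assumed critical level, so the paddock looks adequate, yet a third of the cores are well below it. The single averaged number has hidden exactly the area that needs different management.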

A fourth form of limitation is the nature of some soil attributes. The value of some attributes will vary across the growing season. Soil organic matter is one of those attributes, and it varies with changes in soil moisture and temperature. When soil samples are taken at different times of the year, there may be noticeable changes in organic matter levels. It’s not the only soil attribute that changes with the seasons.

The fundamental problem with soil sample accuracy, however, is that the user is tempted to treat the latest result as an accurate snapshot of soil attributes that are critical to production. Calculations may be made with the latest data on the assumption that it reflects the effect of the most recent fertiliser applications, and new fertiliser plans are implemented on that basis.

It’s not that the soil sample results are inaccurate; it’s that we place too much reliance on a single value or a single comparison. Neither a single point sample nor an aggregated sample is sufficient to capture soil variation and provide the insight needed to manage different sites or zones differently.

It can have additional impacts. Farm managers will often conduct small trials. It’s a very good practice, but it can lead to disappointment or hasty conclusions when the results at the next sample don’t seem to validate the changed practice.

A single sample may be accurate for the location, time and seasonal conditions, but it’s not nearly so suitable for extrapolating to a body of soil. On the other hand, it is impractical to sample on a 50 cm grid.

At the larger farm scale there are statistical tools to identify soil variation. They take existing data, such as some satellite images provide, that is correlated with the soil attributes we are interested in, and use it to create soil zones of similar values. We prefer to base soil zones on soil physical attributes like soil texture, because these are highly influential in the behaviour of fertilisers and soil biology. Soil texture is also slower to change under human management than soil chemistry such as P levels or pH.

Our first soil sampling plan serves to check the validity of the zones and to detect more variation. The sampling regime is designed by statistical processes to achieve that efficiently. Since soil does not vary in a regular pattern, grid sampling is almost always inefficient. Subsequent sampling programs build data history within the zones. The trend becomes clearer over time. It becomes normal to see a range of values across a zone and over time. It becomes easier to project the trend for that zone and plan soil management for it.

We could well say that because we have a range of values within the zone, we have less accuracy, or rather, less certainty about any specific spot or time. Yet that same uncertainty gives us greater confidence in our management.

With more data we can show the trend for a location or a zone. Instead of attempting to extract more information from a single data point, trying to understand why one point is lower or higher than the previous one, we can monitor the central tendency.
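As a rough illustration of monitoring the central tendency rather than chasing individual points, the sketch below summarises several sampling rounds for one zone and fits a straight line through the yearly medians. The pH readings, the years and the helper names (`summarise`, `slope`) are all hypothetical, and a plain least-squares line is only one simple way to express a trend.

```python
from statistics import median

# Hypothetical pH readings for one soil zone: several points per sampling year.
zone_ph = {
    2018: [5.9, 6.1, 5.7, 6.0],
    2019: [5.8, 6.0, 5.6, 5.9],
    2020: [5.6, 5.9, 5.5, 5.8],
    2021: [5.5, 5.8, 5.4, 5.7],
}

def summarise(readings):
    """Return the median and the range of one sampling round."""
    return {"median": median(readings), "low": min(readings), "high": max(readings)}

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

years = sorted(zone_ph)
summaries = {year: summarise(zone_ph[year]) for year in years}
medians = [summaries[year]["median"] for year in years]
trend_per_year = slope(years, medians)

for year in years:
    s = summaries[year]
    print(f"{year}: median {s['median']:.2f} (range {s['low']:.1f} to {s['high']:.1f})")
print(f"Trend in the zone median: {trend_per_year:+.2f} pH units per year")
```

Within any one year the invented readings span about 0.4 pH units, so comparing two individual points could suggest almost anything. The yearly medians, taken together, show a steady decline that is much easier to plan around.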

Better yet, with enough points we can derive maps of the distribution of all the soil attributes that you have data for. There is some geostatistical magic that runs behind the scenes to do it, but like the laboratory processes for soil sampling, what you work with is the result.

The final step is that the user can have a map showing the confidence levels/uncertainty about each part of the soil attribute map. It enables us to adjust or add to the sample location set so that we can get more data in areas of greater uncertainty.
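The idea of letting the uncertainty map guide the next round of sampling can be sketched very simply. The grid, the uncertainty values and the function name `next_sample_cells` below are assumptions made for illustration; this is not a description of any particular geostatistical package or of the actual sampling design process.

```python
# Hypothetical uncertainty map: one prediction standard error per grid cell,
# keyed by (row, column). Higher numbers mean less confidence in the mapped value.
uncertainty = {
    (0, 0): 0.10, (0, 1): 0.12, (0, 2): 0.35,
    (1, 0): 0.08, (1, 1): 0.15, (1, 2): 0.40,
    (2, 0): 0.22, (2, 1): 0.18, (2, 2): 0.30,
}

def next_sample_cells(uncertainty_map, how_many):
    """Return the grid cells with the highest uncertainty, most uncertain first."""
    ranked = sorted(uncertainty_map, key=uncertainty_map.get, reverse=True)
    return ranked[:how_many]

extra_cells = next_sample_cells(uncertainty, 3)
print("Candidate cells for additional samples:", extra_cells)
```

With these invented numbers the three candidates all fall in the right-hand column of the grid, which is where the map is least certain. A real design would also consider spacing between the new points and access on the ground, which this sketch ignores.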

That brings us back to the single sample point on which a soil and productivity program is based. The data seems accurate, but we know there are many limitations on that accuracy. Both how accurate it is and how good a guide it is to the body of soil that it represents are unknown. On the other hand, we now have the opportunity to work with more data, easily. We can have maps of the soil based on our data, and maps showing us how confident we can be in the values in different areas.

And that is why precision confuses us, but uncertainty guides us to a clearer understanding and better management.
