# Blog

### A Visualization of Soil Taxonomy Down to the Subgroup Level

#### Posted on September 29, 2010

It turns out that you can generate a quasi-numerical distance between soil profiles classified according to Soil Taxonomy (or any other hierarchical system) using Gower's generalized dissimilarity metric. For example, taxonomic distances computed from subgroup membership are based on the number of matches at the order, suborder, great group, and subgroup levels. This approach allows for the derivation of a quasi-numerical classification system from Soil Taxonomy, but it is severely limited by the fact that each split in the hierarchy is given equal weight. In other words, the quasi-numerical dissimilarity associated with divergence at the soil order level is identical to that associated with divergence at the subgroup level. Clearly this is not ideal.
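To make the equal-weight limitation concrete, here is a minimal Python sketch (not the R code from the original post). It assumes that, for purely nominal variables, Gower's dissimilarity reduces to the fraction of levels that do not match, and that each level is stored under its full taxon name. The `LEVELS` tuple, the dictionary layout, and the `taxonomic_distance` helper are illustrative names only.

```python
# Taxonomic levels compared, from coarsest to finest.
LEVELS = ("order", "suborder", "greatgroup", "subgroup")

def taxonomic_distance(a, b):
    """Fraction of taxonomic levels at which two classifications differ.

    Assumes all four levels are nominal, so every mismatch carries
    the same weight (1/4), whichever level it occurs at.
    """
    mismatches = sum(a[level] != b[level] for level in LEVELS)
    return mismatches / len(LEVELS)

typic_haploxeralfs = {
    "order": "Alfisols", "suborder": "Xeralfs",
    "greatgroup": "Haploxeralfs", "subgroup": "Typic Haploxeralfs",
}
mollic_haploxeralfs = {
    "order": "Alfisols", "suborder": "Xeralfs",
    "greatgroup": "Haploxeralfs", "subgroup": "Mollic Haploxeralfs",
}
typic_palexeralfs = {
    "order": "Alfisols", "suborder": "Xeralfs",
    "greatgroup": "Palexeralfs", "subgroup": "Typic Palexeralfs",
}
typic_haploxerolls = {
    "order": "Mollisols", "suborder": "Xerolls",
    "greatgroup": "Haploxerolls", "subgroup": "Typic Haploxerolls",
}

print(taxonomic_distance(typic_haploxeralfs, mollic_haploxeralfs))  # 0.25
print(taxonomic_distance(typic_haploxeralfs, typic_palexeralfs))    # 0.5
print(taxonomic_distance(typic_haploxeralfs, typic_haploxerolls))   # 1.0
```

Each mismatched level adds the same 0.25 to the distance, which is exactly the equal weighting described above: nothing in the metric knows that a split at the order level is more fundamental than one at the subgroup level.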

Gower's generalized dissimilarity metric is conveniently implemented in the cluster package for **R**. I have posted some related material in the past, but left out some of the details regarding which clustering algorithms produce the most useful dendrograms. Divisive clustering best represents the step-wise splits within the hierarchy of Soil Taxonomy, as expressed in terms of pair-wise dissimilarities. Code examples are below, along with the data used to generate the figure of California subgroups. Discontinuities in the figure below are caused by errors in the underlying data, e.g. mismatches between soil order and suborder membership.

Figure:

### Soil Properties Visualized on a 1km Grid

#### Posted on August 31, 2010

Fresno Area Urban Areas vs Irrigated LCC: grey regions are current urban areas

A couple of maps generated from a 1km gridded soil property database, derived from SSURGO data where available, with gaps filled using STATSGO data. Soil properties visualized at this scale illustrate several important soil-forming factors operating within California: sediment source in the Great Valley, the interplay between precipitation and evapotranspiration (ET), and removal of salts. This database and the details on its creation should be available within a couple of months. This builds on a related post highlighting some of these maps packaged in KMZ format. Check back in a couple of weeks for updates.

### GRASS Can Make Pretty Maps

#### Posted on August 23, 2010

I have posted a couple of examples in the past on the topic of high quality map production from GRASS GIS, usually via the Generic Mapping Tools. I am not sure why, but I have previously avoided using the traditional cartographic output module that is bundled with GRASS (`ps.map`). This is despite the fact that there is now an excellent collection of examples and a very detailed manual page... I now realize that I have been missing out.

I needed a map with several "zoomed" insets, on rather short notice, with all of the data derived from work that had previously been done in GRASS. Not looking forward to exporting all of the data into a GMT-compatible format, I gave `ps.map` a try. The following image is a JPG (i.e. degraded) version of the PostScript file generated by `ps.map`, with no manual intervention (other than page layout). Obviously the labels aren't readable, but that can be fixed within a drawing program like Inkscape. Overall, the results were much better than I was expecting. I'll post some notes and the script used to generate the map next time.

### What would a 25th, 50th, and 75th percentile soil profile look like?

#### Posted on August 11, 2010

I have mentioned the AQP package in previous entries. One of the functions in this package generates aggregate soil profile data from a collection of soil profiles that are related by some factor: common lithology, common landscape position, and so on. Typically the mean or median (50th percentile) is used to generate a new aggregate profile that is *representative* of the original collection. Extending this idea, I thought that it would be interesting to generate aggregate profiles that are representative of the 25th and 75th percentiles as well. For the sake of clarity, let's call these three new profiles (25th, 50th, and 75th percentiles) Q25, Q50, and Q75. A 10 cm slicing interval was used as the basis upon which soil properties were aggregated.
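The slicing idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the AQP implementation: the `(top, bottom, value)` horizon layout, the `value_at` and `aggregate_quartiles` helpers, the choice to sample each slice at its midpoint, and the clay values are all assumptions made up for the example.

```python
from statistics import quantiles

def value_at(profile, depth):
    """Return the property value of the horizon containing `depth` (cm)."""
    for top, bottom, value in profile:
        if top <= depth < bottom:
            return value
    return None  # profile does not extend to this depth

def aggregate_quartiles(profiles, max_depth, step=10):
    """Compute Q25, Q50, and Q75 of a soil property for each depth slice.

    Each slice is represented by the value found at its midpoint in
    every profile; slices with fewer than two values are skipped.
    """
    result = []
    for top in range(0, max_depth, step):
        mid = top + step / 2
        values = [v for v in (value_at(p, mid) for p in profiles) if v is not None]
        if len(values) >= 2:
            q25, q50, q75 = quantiles(values, n=4, method="inclusive")
            result.append((top, top + step, q25, q50, q75))
    return result

# Made-up clay contents (%) for three hypothetical profiles.
profiles = [
    [(0, 20, 10.0), (20, 50, 20.0)],
    [(0, 10, 12.0), (10, 50, 24.0)],
    [(0, 30, 14.0), (30, 40, 28.0)],
]

for row in aggregate_quartiles(profiles, max_depth=50):
    print(row)
```

Reading the Q25, Q50, and Q75 columns down the slices gives three aggregate profiles. Note that the deepest slice here is based on only two profiles, so it is worth tracking how many profiles contribute to each slice.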

### Just for Fun: Using R to Create Targets

#### Posted on August 10, 2010

OK, not really science or soil-related, but a fun 5 minute use of **R** to make something you can use to improve your hand-eye coordination. Demonstrates several ways to use base graphics, user-defined functions, and calling functions from within other functions.

### Using R and r.mapcalc (GRASS) to Estimate Mean Topographic Curvature

#### Posted on August 3, 2010

Recently I was re-reading a paper on predictive soil mapping (Park et al., 2001), and considered testing one of their proposed terrain attributes in GRASS. The attribute, originally described by Blaszczynski (1997), is the distance-weighted mean difference in elevation applied to an n-by-n window of cells:

Equation 4 from (Park et al., 2001)

where n is the number of cells within an (odd-number dimension) square window excluding the central cell, z is the elevation at the central cell, z_{i} is the elevation at one of the surrounding cells i, and d_{i} is the horizontal distance between the central cell and surrounding cell i. I wasn't able to get a quick answer using r.neighbors or r.mfilter, so I cooked up a simple R function to produce a solution using r.mapcalc. The results are compared with the source DEM below; concave regions are blue-ish, convex regions are red-ish. The magnitude and range are almost identical to *mean curvature* derived from v.surf.rst, with a Pearson correlation coefficient of 0.99. I think that it would be of general interest to add functionality to r.neighbors so that it could perform distance-weighted versions of commonly used focal functions.
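For readers who want to experiment outside of GRASS, the focal calculation can be sketched in plain Python. This is not the R and r.mapcalc code from the original post. It assumes the attribute has the form (1/n) Σ (z - z_{i}) / d_{i}, which is one reading of the definitions above, and it assumes the sign convention that positive values mean the central cell sits above its neighbors. The function name, the list-of-lists DEM, and the `cell` size parameter are illustrative.

```python
from math import hypot

def mean_weighted_difference(dem, row, col, size=3, cell=1.0):
    """Distance-weighted mean elevation difference in a size-by-size window.

    Assumed form: (1 / n) * sum((z - z_i) / d_i) over the n cells
    surrounding the central cell at (row, col).
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    if not (half <= row < len(dem) - half and half <= col < len(dem[0]) - half):
        raise ValueError("window extends past the edge of the DEM")

    z = dem[row][col]
    total = 0.0
    n = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            if dr == 0 and dc == 0:
                continue  # skip the central cell
            d = hypot(dr, dc) * cell  # horizontal distance to neighbor
            total += (z - dem[row + dr][col + dc]) / d
            n += 1
    return total / n

# A single raised cell on a flat surface: positive under the assumed sign.
peak = [[0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0]]
print(mean_weighted_difference(peak, 1, 1))

# A uniform slope: opposite neighbors cancel, so the result is zero.
slope = [[0.0, 1.0, 2.0],
         [0.0, 1.0, 2.0],
         [0.0, 1.0, 2.0]]
print(mean_weighted_difference(slope, 1, 1))
```

Because opposite neighbors cancel on a planar surface, the index responds to curvature rather than slope, which is consistent with the strong correlation with mean curvature noted above.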

Figure:

### R's Normal Distribution Functions: rnorm and pals

#### Posted on July 14, 2010

The `rnorm()` function in R is a convenient way to simulate values from the normal distribution, characterized by a given mean and standard deviation. I hadn't previously used the associated commands `dnorm()` (normal density function), `pnorm()` (cumulative distribution function), and `qnorm()` (quantile function), so I made a simple demo. The `*norm` functions generate results based on a well-behaved normal distribution, while the corresponding functions `density()`, `ecdf()`, and `quantile()` compute empirical values. The following example could be extended to graphically describe departures from normality (or some other distribution; see `rt()`, `runif()`, `rcauchy()`, etc.) in a data set.
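The same theoretical-versus-empirical comparison can be sketched with Python's standard library, as a rough analogue rather than a reproduction of the R demo. `NormalDist.samples()`, `pdf()`, `cdf()`, and `inv_cdf()` play roles similar to `rnorm()`, `dnorm()`, `pnorm()`, and `qnorm()`. The mean, standard deviation, sample size, seed, and the small `ecdf` helper are arbitrary choices for illustration.

```python
from statistics import NormalDist, quantiles

# Theoretical distribution: mean 10, standard deviation 2.
dist = NormalDist(mu=10, sigma=2)

# Simulate values (similar in spirit to rnorm).
sample = dist.samples(1000, seed=1)

def ecdf(data, x):
    """Empirical CDF: the fraction of observations less than or equal to x."""
    return sum(value <= x for value in data) / len(data)

# Density, cumulative probability, and quantile from the ideal curve.
print("density at the mean:   ", dist.pdf(10))
print("P(X <= 12), theory:    ", dist.cdf(12))
print("P(X <= 12), empirical: ", ecdf(sample, 12))

# Theoretical vs. empirical quartiles.
theoretical = [dist.inv_cdf(p) for p in (0.25, 0.5, 0.75)]
empirical = quantiles(sample, n=4, method="inclusive")
for p, t, e in zip((25, 50, 75), theoretical, empirical):
    print(f"Q{p}: theory={t:.3f} empirical={e:.3f}")
```

Large gaps between the theoretical and empirical columns would be one simple, non-graphical hint of a departure from normality.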

### PostGIS in Action Book Review

#### Posted on June 8, 2010

I was recently asked to review a soon-to-be-published book on PostGIS, a spatial extension to the very popular PostgreSQL relational database. I was very excited about receiving an early copy of this book, as the authors have provided countless tips, fixes, and clever query examples on the PostGIS mailing list over the years. After spending a couple of weeks looking through the book, I have to say that I am very impressed with the quality and completeness. Indeed, this is the book that I wish had been available when I was starting out with PostGIS. The authors do an excellent job of promoting the idea that a relational database and SQL are well suited for spatial data modeling and analysis.

### An XML Representation of the Keys to Soil Taxonomy?

#### Posted on May 29, 2010

Maybe this is just craziness, but wouldn't it be neat to have an XML-formatted version of the Keys to Soil Taxonomy? The format might look something like the following code snippet, although there may be more efficient uses of XML... The only problem I can see is that it would take a hell of a long time to type in the entire 300+ page document. A complete document of this nature would support all kinds of new and creative uses for the Keys: electronic look-up, automated generation of a PDA-ready version, an awesome teaching tool, or just something that could be used to generate cool figures. Anyone know of a quick way to get this put together, or of any similar document that has already been published? Anyone want to help type in the data?