dylan's blog

soilDB Demo: Processing SSURGO Attribute Data with SDA_query()

Submitted by dylan on Thu, 2012-04-26 23:18.

Mapping near Paloma, CA. This image has nothing to do with the following content.

A quick example of how to use the USDA-NRCS Soil Data Access (SDA) query facility via the soilDB package for R. The following code describes how to get component-level soils data for Yolo County (survey area CA113) from SDA and compute a representative suborder-level classification for each map unit. This example requires an understanding of SQL, US Soil Taxonomy, and the SSURGO database.

R Quickie: Custom Panel Functions and Default Arguments

Submitted by dylan on Mon, 2012-04-16 16:39.

Sometimes the basic functionality in lattice graphics isn't enough. Custom "panel functions" are one approach to fully customizing the lattice graphics system. Two examples are given below illustrating how to define an (inline) custom panel function for adding a regression line to an entire data set in the presence of a grouping variable. The "..." argument instructs our custom panel function to accept all arguments typically passed to a panel function, and those arguments can be re-used in clever ways within our panel function.

Dissimilarity Between Soil Profiles: A Closer Look

Submitted by dylan on Fri, 2012-03-23 19:35.

Continuing the previous discussion of pair-wise dissimilarity between soil profiles, the following demonstration (code, comments, and figures) further elaborates on the method. A more in-depth discussion of this example will be included as a vignette within the 1.0 release of AQP.

Profile Dissimilarity Demo: MVO Soils

A Graphical Explanation of how to Interpret a Dendrogram

Submitted by dylan on Thu, 2012-03-15 18:16.

Dendrograms are a convenient way of depicting pair-wise dissimilarity between objects, commonly associated with the topic of cluster analysis. This is a complex subject that is best left to experts and textbooks, so I won't even attempt to cover it here. I have been frequently using dendrograms as part of my investigations into dissimilarity computed between soil profiles. Unfortunately the interpretation of dendrograms is not very intuitive, especially when the source data are complex. In addition, pair-wise dissimilarity computed between soil profiles and visualized via dendrogram should not be confused with the use of dendrograms in the field of cladistics, where relation to a common ancestor is depicted.

An example is presented below that illustrates the relationship between dendrogram and dissimilarity as evaluated between objects with 2 variables. Essentially, the level at which branches merge (relative to the "root" of the tree) is related to their similarity. In the example below it is clear that (in terms of clay and rock fragment content) soils 4 and 5 are more similar to each other than to soil 2. In addition, soils 1 and 3 are more similar to each other than soils 4 and 5 are to soil 2. Recall that in this case pair-wise dissimilarity is based on the Euclidean distance between soils in terms of their clay content and rock fragment content. Therefore proximity in the scatter plot of rock fragments vs. clay is directly related to our simple evaluation of "dissimilarity". Inline comments in the code below elaborate further.
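As a rough, dependency-free illustration of the same idea (written in Python rather than R, and not the code from the post), the sketch below computes Euclidean distances between five soils and reports the height at which each pair of branches would merge. The clay and rock fragment values are made up for illustration, and complete linkage is an assumption; the post does not say which linkage was used.

```python
import math
from itertools import combinations

# Hypothetical (clay %, rock fragment %) values; NOT the data from the post
soils = {1: (10, 10), 2: (45, 60), 3: (12, 13), 4: (30, 40), 5: (32, 41)}

def euclidean(a, b):
    """Straight-line distance between two points in (clay, fragments) space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Pair-wise dissimilarity, keyed by the unordered pair of soil IDs
dist = {frozenset((i, j)): euclidean(soils[i], soils[j])
        for i, j in combinations(soils, 2)}

def complete_linkage(c1, c2):
    """Distance between clusters = largest distance between their members."""
    return max(dist[frozenset((i, j))] for i in c1 for j in c2)

def agglomerate(ids):
    """Repeatedly merge the two closest clusters, recording merge heights."""
    clusters = [frozenset([i]) for i in ids]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: complete_linkage(*pair))
        merges.append((sorted(a), sorted(b), complete_linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

merges = agglomerate(soils)
for left, right, height in merges:
    print(f"merge {left} + {right} at height {height:.2f}")
```

With these made-up values, soils 4 and 5 merge first and at the lowest height, soils 1 and 3 merge next, and soil 2 joins the 4/5 branch at a much greater height. Lower merge heights correspond to closer points in the scatter plot.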

Data to Dendrogram

AQP / soilDB Demo: Dueling Dendrograms

Submitted by dylan on Wed, 2012-03-14 21:55.

UPDATE 2013-04-08: This functionality is now available in the sharpshootR package.

Previously, soil profile comparison methods from the aqp package only took into account horizon-level attributes. As of last week, the profile_compare() function can accommodate both horizon-level and site-level attributes. In other words, it is now possible to compute pair-wise dissimilarity between soil profiles using a combination of horizon-level properties (soil texture, pH, color, etc.) and site-level properties (surface slope, vegetation, soil taxonomy, etc.), whether continuous, categorical, or boolean.
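To give a feel for how continuous, categorical, and boolean attributes can be combined into a single dissimilarity value, here is a minimal sketch of a Gower-style distance in Python. This is an assumption made for illustration, not necessarily the computation performed by profile_compare(). It also flattens each profile into a single record, ignoring depth entirely, and all attribute names and values are hypothetical.

```python
# Hypothetical profiles: each flattened to one record of mixed-type attributes
profiles = {
    "p1": {"clay": 18, "ph": 6.0, "slope": 5, "suborder": "Xeralfs", "paralithic": True},
    "p2": {"clay": 22, "ph": 6.5, "slope": 15, "suborder": "Xeralfs", "paralithic": False},
    "p3": {"clay": 38, "ph": 7.0, "slope": 25, "suborder": "Xerepts", "paralithic": False},
}

def numeric_ranges(records):
    """Observed range of every numeric (non-boolean) attribute."""
    ranges = {}
    first = next(iter(records.values()))
    for key, value in first.items():
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            values = [rec[key] for rec in records.values()]
            ranges[key] = max(values) - min(values)
    return ranges

def gower(a, b, ranges):
    """Gower-style dissimilarity: mean of per-attribute differences in [0, 1]."""
    total = 0.0
    for key, value in a.items():
        other = b[key]
        if isinstance(value, (bool, str)):
            # categorical / boolean: simple mismatch (0 = same, 1 = different)
            total += 0.0 if value == other else 1.0
        else:
            # continuous: absolute difference scaled by the observed range
            total += abs(value - other) / ranges[key]
    return total / len(a)

ranges = numeric_ranges(profiles)
d12 = gower(profiles["p1"], profiles["p2"], ranges)
d13 = gower(profiles["p1"], profiles["p3"], ranges)
d23 = gower(profiles["p2"], profiles["p3"], ranges)
print(round(d12, 2), round(d13, 2), round(d23, 2))
```

Because every attribute contributes a value between 0 and 1, no single property dominates just because of its units, which is what makes mixing slope, pH, and taxonomy in one matrix workable.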

An example is presented below, based on the loafercreek sample data set included with the soilDB package. Be sure to use the latest version of soilDB, 0.5-5 or later. Dissimilarity matrices created from horizon and site+horizon data are compared by placing their respective dendrograms back-to-back. Code from the ape package is used to facilitate dendrogram plotting, manipulation, and indexing. Blue line segments connect matching nodes from each dendrogram. Soil profiles with a paralithic contact are marked with orange squares for clarity.

Dueling Dendrograms

A Quick Demo of SoilProfileCollection Methods and Plotting Functions

Submitted by dylan on Wed, 2012-01-04 20:02.

Here is a quick demo of some of the new functionality in AQP as of version 0.99-9.2. The demos below are based on soil profiles from an archive described in (Carre and Girard, 2002) available on the OSACA page. A condensed version of the collection is available as a SoilProfileCollection object in the AQP sample dataset "sp5".

UPDATE 2010-01-12: Syntax has changed slightly, as profileApply() now iterates over a list of SoilProfileCollection objects, one for each profile from the original object.

AQP Sample Dataset 5: Profile Sketches

Logistic Power Peak (LPP) Simulated Soil Profiles

Submitted by dylan on Sat, 2011-11-12 21:01.

A friend of mine recently published a very interesting article on the pedologic interpretation of asymmetric peak functions fit to soil profile data (Myers et al., 2011). I won't bother summarizing or paraphrasing the article here, as the original is very accessible. Instead, I thought I would share some new functionality in AQP that was inspired by it. While reading the article I thought that it would be interesting to use one of these peak functions, the logistic power peak (LPP) function, to simulate soil property depth-functions. Simulated values could be used to evaluate new algorithms with a set of tightly controlled properties that vary with depth. One of the nice aspects of these peak functions is that they can create a wide range of shapes that mimic common anisotropic depth-functions associated with pedogenic processes such as illuviation, ferrolysis, or seasonal fluctuation of groundwater levels. An example R session demonstrating the use of LPP-simulated soil property depth-functions is presented below.

LPP-simulated Profiles

Combining Base+Grid Graphics

Submitted by dylan on Tue, 2011-10-04 17:48.

R provides several frameworks for composing figures. Base graphics is the simplest, grid is more advanced, and the lattice/ggplot packages provide convenient abstractions of the grid graphics system. Multi-element figures can be readily created in base graphics using either par() or layout(), with analogous functions available in grid. Mixing the two systems is a little more complicated, somewhat fragile, but entirely possible.

I recently needed to combine base+grid graphics on a single page: two sets of base graphics on the left, and the output from xyplot() (grid) on the right. Following some tips from Paul Murrell posted on r-help, the solution was fairly simple. A truncated example of the process is listed below, corresponding to the attached figure. While the result was "good enough" for a quick summary, there are clearly some improvements that would make the figure more useful.

Mehrten Soil Summary

Soil-Landscape Block Diagrams in SoilWeb

Submitted by dylan on Fri, 2011-09-16 16:43.

Users of our Google Earth interface to USDA-NCSS soils information will now see links to soil-landscape block diagrams listed within map unit descriptions.

Automated Linking to NCSS Block Diagrams

Soil Series Query for SoilWeb

Submitted by dylan on Fri, 2011-09-16 16:12.

A map depicting the spatial distribution of a given soil series can be very useful when working on a new soil survey, updating an old one, or searching for specific soil characteristics. We have recently added a soil series query facility to SoilWeb, where results are returned in the form of a KML file. Two modes are currently supported:

  1. map unit based querying: only map units named for the given soil series are returned
  2. component based querying: map units containing components named for the given series are returned
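The difference between the two modes can be sketched with a toy, in-memory example in Python. This is only an illustration of the logic. The real queries run against PostgreSQL/PostGIS, and the map unit keys, names, and component lists below are hypothetical.

```python
import re

# Hypothetical map units; real data live in the SoilWeb database
map_units = [
    {"mukey": "a1", "muname": "Amador loam, 2 to 15 percent slopes",
     "components": ["Amador", "Gillender"]},
    {"mukey": "b2", "muname": "Pentz-Peters association",
     "components": ["Pentz", "Peters", "Amador"]},
    {"mukey": "c3", "muname": "Auburn silt loam",
     "components": ["Auburn"]},
]

def query_by_map_unit(series):
    """Mode 1: keep map units whose *name* contains the series as a whole word."""
    pattern = re.compile(rf"\b{re.escape(series)}\b", re.IGNORECASE)
    return [mu["mukey"] for mu in map_units if pattern.search(mu["muname"])]

def query_by_component(series):
    """Mode 2: keep map units with at least one *component* named for the series."""
    target = series.lower()
    return [mu["mukey"] for mu in map_units
            if any(comp.lower() == target for comp in mu["components"])]

named_for = query_by_map_unit("Amador")      # map units named for the series
containing = query_by_component("Amador")    # map units containing the series
print(named_for, containing)
```

Component based querying returns a superset here: map unit "b2" is not named for Amador, but contains it as a component.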

For example, if someone was interested in the spatial distribution of the Amador soil series, they could use the Series Extent Mapping tool to get a quick description of which survey areas contain (and how many corresponding acres of) this series. For an even more detailed description of where the Amador series is mapped, one could use our new soil series query like this:


This is a preliminary version, and a subsequent post will contain links to a Google Earth file that can be used to simplify the query process. In most cases queries take about 1-5 seconds, which is quite fast considering that, all on-the-fly: 1) either 730k component names or 275k map unit names are searched, 2) 35 million map unit polygons are filtered for the series in question, and 3) bounding boxes for matching polygons are merged together. Full text searches for map unit/component names are very fast thanks to advanced text indexing and searching algorithms implemented in PostgreSQL and spatial processing functions implemented in PostGIS. In the final version, the location of the official series description (OSD) will be included in query results.

Attached at the bottom of the page is a KMZ demo showing sample output from the two query modes. Screen shots from the demo are posted below.

Soil Series Query Results: Amador Series. Blue regions: map units dominated by the Amador series; red regions: map units that contain at least one component of the Amador series.