dylan's blog

AQP 0.9-9 is ready: 1.0 should be out by 2012-01-01 !

Submitted by dylan on Thu, 2011-12-22 19:33.

Version 1.0 of AQP is nearly ready-- after nearly 3 years of development, 2 years on R-Forge, and 1+ year on CRAN. Just pushed version 0.9-9 to R-Forge, and it should be on CRAN within a couple days. Recent changes are listed below, clipped from the NEWS file.

AQP Demo: Soils Summarized by Bedrock KindAQP Demo: Soils Summarized by Bedrock Kind

Logistic Power Peak (LPP) Simulated Soil Profiles

Submitted by dylan on Sat, 2011-11-12 21:01.

A friend of mine recently published a very interesting article on the pedologic interpretation of asymetric peak functions fit to soil profile data (Myers et al., 2011). I won't bother summarizing or paraphrasing the article here, as the original article is very accessible, rather I thought I would share some new functionality in AQP that was inspired by the article. While reading the article I thought that it would be interesting to use one of these peak functions, the logistic power peak (LPP) function, to simulate soil property depth-functions. Simulated values could be used to evaluate new algorithms with a set of tightly controlled properties that vary with depth. One of the nice aspects of these peak functions is that they can create a wide range of shapes that mimic common anisotropic depth-functions associated with pedogenic processes such as illuviation, ferrolysis, or seasonal fluctuation of groundwater levels. An example R session demonstrating the use of LPP-simulated soil property depth-functions is presented below.

LPP-simulated ProfilesLPP-simulated Profiles

PostGIS Quickie

Submitted by dylan on Fri, 2011-11-11 00:30.

Today I needed to cut out a rectangle of geologic data from a state-wide map in an AEA coordinate system, using a bounding box from a UTM zone 10 region, with the output saved in UTM zone 10 coordinates. PostGIS makes this type of operation very simple via spatial SQL. Note the special syntax that PostgreSQL uses for referencing variables.

 
Spatial SQL Example

-- set our query geometry as a variable for use later
\SET query_geom ST_Transform(SetSRID(ST_MakeBox2d(ST_Point(675461, 4168366), ST_Point(749990, 4260786)), 26910), 9001)

-- save the result of our query to a temp table for export
CREATE TEMP TABLE tt AS
SELECT ST_Transform(ST_Intersection(wkb_geometry, :query_geom), 26910) AS wkb_geometry, ptype, ptype_full, gen_type, note
FROM cageo
JOIN cageo_legend USING (ptype)
WHERE ST_Intersects(wkb_geometry, :query_geom);

New S4 Classes and Methods added to AQP (Algorithms for Quantitative Pedology) Package

Submitted by dylan on Thu, 2011-10-20 21:52.

Thanks to some major help from Pierre Roudier, the aqp package now has some new S4-style classes and methods-- custom tailored to the complexities of soil profile data. These new classes/methods are available in the aqp development package, found on our R-Forge repository. An update to CRAN is pending. A couple of examples are presented below. Note that methods may change at any time, so consider these examples preliminary.

Soil Property Data ModelSoil Property Data Model

Combining Base+Grid Graphics

Submitted by dylan on Tue, 2011-10-04 17:48.

R provides several frameworks for composing figures. Base graphics is the simplest, grid is more advanced, and the lattice/ggplot packages provide convenient abstractions of the grid graphics system. Multi-element figures can be readily created in base graphics using either par() or layout(), with analogous functions available in grid. Mixing the two systems is a little more complicated, somewhat fragile, but entirely possible.

I recently needed to combine base+grid graphics on a single page; two sets of base graphics on the left, and the output from xyplot() (grid) on the right. Following some tips from Paul Murrell posted on r-help, the solution was fairly simple. A truncated example of the processes is listed below, corresponding to the attached figure. While the result was "good enough" for a quick summary, there are clearly some improvements that would make figure more useful.

Mehrten Soil SummaryMehrten Soil Summary

Soil-Landscape Block Diagrams in SoilWeb

Submitted by dylan on Fri, 2011-09-16 16:43.

Users of our Google Earth interface to USDA-NCSS soils information will now see links to soil-landscape block diagrams listed within map unit descriptions.

Automated Linking to NCSS Block DiagramsAutomated Linking to NCSS Block Diagrams

Soil Series Query for SoilWeb

Submitted by dylan on Fri, 2011-09-16 16:12.

A map depicting the spatial distribution of a given soil series can be very useful when working on a new soil survey, updating an old one, or searching for specific soil characteristics. We have recently added a soil series query facility to SoilWeb, where results are returned in the form of a KML file. Two modes are currently supported:

  1. map unit based querying- only map units named for the given soil series are returned
  2. component based querying- map units containing components named for the given series are returned

For example, if someone was interested in the spatial distribution of the Amador soil series, they could use the Series Extent Mapping tool to get a quick description of which survey areas contain (and how many corresponding acres of) this series. For an even more detailed description of where the Amador series is mapped, one could use our new soil series query like this:

http://casoilresource.lawr.ucdavis.edu/soil_web/reflector_api/soils.php?what=soil_series_extent&q_string=amador

 
This is a preliminary version, and a subsequent post will contain links to a Google Earth file that can be used to simplify the query process. In most cases queries take about 1-5 seconds, which is quite fast considering: 1) either 730k component names or 275k map unit names are searched, 2) 35 million map unit polygons are filtered for the series in question, and, 3) bounding boxes for matching polygons are merged together-- all on-the-fly. Full text searches for map unit/component names are very fast thanks to advanced text indexing and searching algorithms implemented in PostgreSQL and spatial processing functions implemented in PostGIS. In the final version, the location of the official series description (OSD) will be included in query results.

Attached at the bottom of the page is a KMZ demo showing sample output from the two query modes. Screen shots from the demo are posted below.

Soil Series Query Results: Amador Series: blue regions: map units dominated by the Amador series; red regions: map units that contain at least one component of the Amador series.Soil Series Query Results: Amador Series: blue regions: map units dominated by the Amador series; red regions: map units that contain at least one component of the Amador series.

Recent Updates in the aqp (Algorithms for Quantitative Pedology) Package for R

Submitted by dylan on Thu, 2011-09-15 04:44.

New version of our 'aqp' package for quantitative soils investigations, available on CRAN (version 0.99-5) and R-Forge (0.99-8). Some of the major changes are listed below:

-------------------------- aqp 0.99-8 (2011-09-14) --------------------------
* soil.slot() will now accept boundaries defining a 'slab' over which aggregates are computed
* soil.slot.multiple() now cleanly wraps soil.slot(), accepting all arguments
  - these two changes make it possible to ask: 
    "what is the wt. mean value of some property within this slab, and among these groups?"
  - soil.slot.multiple() now uses a formula interface: NOTE that this will break existing scripts (sorry)
    
-------------------------- aqp 0.99-7 (2011-09-01) --------------------------
* new functions for getting data out of PedonPC (MS Access) databases [windows only for now]
  - get_site_data_from_pedon_db() : site and pedon aggregate data
  - get_hz_data_from_pedon_db() : horizon level data
  - get_colors_from_pedon_db() : formats and mixes multiple colors / horizon
  + implemented in mix_and_clean_colors()

* test_hz_logic() : basic function for testing horizon logic, returns TRUE/FALSE by ID
* parallel operations now NON-functional, while we wait for plyr to support doSMP...
* new ID plotting style for profile_plot() : handy when plotting large collections and/or long IDs

-------------------------- aqp 0.99-4 (2011-08-15) --------------------------
* code and documentation clean-up
* Soil Sata Access (SDA) query functions have been added
  - mapunit_geom_by_ll_bbox() : get map unit geometry by bounding box
  - MUKEYS_by_ll_bbox() : get map unit keys by bounding box
  - SDA_query() : retrieve soil tabular data via query written in SQL
* additional customizations added to profile_plot
* two new sample data sets + examples
( categories: )

Saving Chunks of SSURGO Data in SoilWeb for Google Earth

Submitted by dylan on Wed, 2011-06-29 00:07.

GE KML Thumbnail

SoilWeb is an interactive, multifaceted interface to USDA-NCSS soil survey information. Our SoilWeb application for Google Earth streams soil map units and point data as you navigate across the lower '48 states. Currently, our system imposes a 30,000 ac. limit (defined by the Google Earth viewport) for streaming detailed soil survey (SSURGO) map unit boundaries. This limit, combined with a 2 second delay before streaming is initiated, helps to reduce CPU load on our server. When viewing landscapes from directly overhead the 30,000 ac. limit is usually sufficient for most soils investigations. However, tilting the camera for oblique views of the landscape (an excellent way of visualizing soil-landscape relationships) causes the viewport to encompass much larger areas, usually exceeding SoilWeb's limit for detailed linework. The solution to this problem is summarized below, and can also be used to save "chunks" of detailed soil survey information for later use or offline browsing.

Terrain Classification Experiment 2: GRASS, R, and the raster package

Submitted by dylan on Tue, 2011-05-24 16:37.

Quick post on terrain classification, based on some trouble folks were having with a previous example on Windows. With the spgrass6 package, raster stacks are created by loading several GRASS files at once: x <- readRAST6(vname=c('beam_sum_mj','ned10m_ccurv','ned10m_pcurv','ned10m_slope')). This works well on UNIX-like operating systems and in cases where the entire collection of raster maps can fit within the system memory. A different strategy is needed when working with massive raster stacks or on other (less useful) operating systems. This post outlines one possible strategy that should work on massive data sets, and across operating systems.

 
Outline

  1. export terrain surfaces from GRASS to intermediate files
  2. import into R with raster package
  3. perform unsupervised classification on a sample of the cells using PAM
  4. apply clustering to unsampled cells with randomForest
  5. save results to intermediate file
  6. import results into GRASS for post-processing

Terrain Classification ExampleTerrain Classification Example