Dylan

Soil Properties Visualized on a 1km Grid

Submitted by dylan on Tue, 2010-08-31 18:29.

Fresno Area Urban Areas vs Irrigated LCC: grey regions are current urban areasFresno Area Urban Areas vs Irrigated LCC: grey regions are current urban areas

A couple of maps generated from a 1km gridded soil property database, derived from SSURGO data where available with holes filled with STATSGO data. Soil properties visualized at this scale illustrate several important soil-forming factors operating within California: sediment source in the Great Valley, the interplay between precipitation and ET, and removal of salts. This database and the details on its creation should be available within a couple of months. This builds on a related post highlighting some of these maps packaged in KMZ format. Check back in a couple of weeks of updates.

New R Package 'aqp': Algorithms for Quantitative Pedology [updates]

Submitted by dylan on Mon, 2010-08-09 16:37.

 
Soils are routinely sampled and characterized according to genetic horizons (layers), resulting in data that are associated with principal dimensions: location (x,y), depth (z), and property space (p). The high dimensionality and grouped nature of this type of data can complicate standard analysis, summarization, and visualization. The aqp package was developed to address some of these issues, as well as provide a useful framework for the advancement of quantitative studies in soil genesis, geography, and classification.

Using R and r.mapcalc (GRASS) to Estimate Mean Topographic Curvature

Submitted by dylan on Tue, 2010-08-03 20:51.

Recently I was re-reading a paper on predictive soil mapping (Park et al, 2001), and considered testing one of their proposed terrain attributes in GRASS. The attribute, originally described by Blaszczynski (1997), is the distance-weighted mean difference in elevation applied to an n-by-n window of cells:



Equation 4 from (Park et al, 2001)

 
where n is the number of cells within an (odd-number dimension) square window excluding the central cell, z is the elevation at the central cell, z_{i} is the elevation at one of the surrounding cells i, d_{i} is the horizontal distance between the central cell and surrounding cell i. I wasn't able to get a quick answer using r.neighbors or r.mfilter, so I cooked up a simple R function to produce a solution using r.mapcalc. The results are compared with the source DEM below; concave regions are blue-ish, convex regions are red-ish. The magnitude and range are almost identical to mean curvature derived from v.surf.rst, with a Pearson's correlation coefficient of 0.99. I think that it would be of general interest to add functionality to r.neighbors so that it could perform distance-weighted versions of commonly used focal functions.

Elevation surface (left) and resulting mean curvature estimate (right)Elevation surface (left) and resulting mean curvature estimate (right)

R's Normal Distribution Functions: rnorm and pals

Submitted by dylan on Wed, 2010-07-14 17:10.

The rnorm() function in R is a convenient way to simulate values from the normal distribution, characterized by a given mean and standard deviation. I hadn't previously used the associated commands dnorm() (normal density function), pnorm() (cumulative distribution function), and qnorm() (quantile function) before-- so I made a simple demo. The *norm functions generate results based on a well-behaved normal distribution, while the corresponding functions density(), ecdf(), and quantile() compute empirical values. The following example could be extended to graphically describe departures from normality (or some other distribution-- see rt(), runif(), rcauchy() etc.) in a data set.

( categories: )

PostGIS in Action Book Review

Submitted by dylan on Tue, 2010-06-08 17:32.

I was recently asked to review a soon to be published book on PostGIS, a spatial extension to the very popular Postgresql relational database. I was very excited about receiving an early copy of this book, as the authors have provided countless tips, fixes, and clever query examples on the PostGIS mailing list over the years. After spending a couple weeks looking through the book, I have to say that I am very impressed with the quality and completeness. Indeed, this is the book that I wish would have been available when I was starting out with PostGIS. The authors do an excellent job of promoting the idea that a relational database and SQL are well suited for spatial data modeling and analysis.

( categories: )

An XML Representation of the Keys to Soil Taxonomy?

Submitted by dylan on Sat, 2010-05-29 04:45.

Western Fresno Soil Hierarchy: partial view of the hierarchy within the US Soil Taxonomic systemWestern Fresno Soil Hierarchy: partial view of the hierarchy within the US Soil Taxonomic system

Maybe this is just craziness, but wouldn't be neat to have an XML formatted version of the Keys to Soil Taxonomy? The format might look something like the following code snippet, although there may be more efficient uses of XML... The only problem I can see is that it would take a hell of a long time to type in the entire 300+ page document. A complete document of this nature would support all kinds of new and creative uses for the 'keys-- electronic look-up, automated generation of a PDA-ready version, an awesome teaching tool, or just something that could be used to generate cool figures. Anyone know of a quick way to get this put together, or of any similar document that has already been published? Anyone want to help type-in the data?









( categories: )

Getting Parent Material Data out of SSURGO

Submitted by dylan on Fri, 2010-05-28 01:21.

 
Parent material data is stored within the copm and copmgrp tables. The copm table can be linked to the copmgrp table via the 'copmgrpkey' field, and the copmgrp table can be linked to the component table via the 'cokey' field. The following queries illustrate these table relationships, and show one possible strategy for extracting the parent material information associated with the largest component of each map unit.

 
Several of the example queries are based on this map unit:

Estimating Missing Data with aregImpute() {R}

Submitted by dylan on Mon, 2010-04-19 18:02.

 
Missing Data
Soil scientists routinely sample, characterize, and summarize patterns in soil properties in space, with depth, and through time. Invariably, some samples will be lost or sufficient funds required for complete characterization can run out. In these cases the scientist is left with a data table that contains holes (so to speak) in the rows/columns that are missing data. If the data are used within a regression, missing values in any of the predictor or the response variable result in row-wise deletion-- even if 9/10 variables are present. Furthermore, common multivariate methods (PCA, RDA, dissimilarity metrics, etc.) cannot effectively deal with missing data. The scientist is left with a couple options: 1) row-wise deletion of cases missing any variable, 2) re-sampling or re-characterizing the missing samples, or 3) estimating the missing values from other variables in the dataset. This last option is called missing data imputation. This is a broad topic with countless books and scientific papers written about it. Here is a fairly simple introduction to the topic of imputation. Fortunately for us non-experts, there is an excellent function (aregImpute()) in the Hmisc package for R.

( categories: )

Accessing Climate Change Data and a Custom Panel Function for Filled Polygons

Submitted by dylan on Fri, 2010-03-05 02:21.

GCS Model Grids

Recently finished some collaborative work with Vishal, related to visualizing climate change data for the SEI. This project was funded in part by the California Energy Commission, with additional technical support from the Google Earth Team. One of the final products was an interactive, multi-scale Google Earth application, based on PostGIS, PHP, and R. Interaction with the KMZ application results in several presentations of climate projections, fire risk projections, urban population growth projections, and other related information. Charts are dynamically generated from the PostGIS database, and returned to the web browser. In addition, an HTTP-based interface makes it simple to download CSV-formatted data directly from the CEC server. Some of our R code seemed like a good candidate for sharing, so I have posted a complete example below-- illustrating how to access climate projection data from the CEC server, a couple custom functions for fancy lattice graphics, and more.