dylan's blog

Customizing Maps in R: spplot() and latticeExtra functions

Submitted by dylan on Wed, 2010-12-15 17:00.

I recently noticed the new latticeExtra page on R-forge, which contains many very interesting demos of new lattice-related functionality. There are strong opinions about the "best" graphics system in R (base graphics, grid graphics, lattice, ggplot, etc.)-- I tend to use base graphics for simple figures and lattice for depicting multivariate or structured data. The sp package defines classes for storing spatial data in R, and contains several useful plotting methods such as the lattice-based spplot(). This function, and its back-end helper functions, provide a generalized framework for plotting many kinds of spatial data. However, sometimes with great abstraction comes great ambiguity-- many of the arguments that would otherwise allow fine tuning of the figure are buried in the documentation for lattice functions. Examples are more fun than links to documentation, so I put together a couple of them below. They describe several strategies for placing and adjusting map legends-- either automatically, or manually with the update() function. The last example demonstrates an approach for over-plotting two rasters. All of the examples are based on the meuse data set, from the gstat package.

Extended spplot() examples

Three New Soils-Related KMZ Demos

Submitted by dylan on Tue, 2010-12-07 18:25.

LCC KMZ
Soil Texture KMZ

 
Updated versions of three soils-related KMZ files: 1-km scale, aggregate LCC, CA Storie Index, and soil texture data, derived from SSURGO. These are part of a series of KMZ / raster datasets that will be published soon. See attached files at the bottom of the page. Enjoy!

A Visualization of Soil Taxonomy Down to the Subgroup Level

Submitted by dylan on Wed, 2010-09-29 18:44.

It turns out that you can generate a quasi-numerical distance between soil profiles classified according to Soil Taxonomy (or any other hierarchical system) using Gower's generalized dissimilarity metric. For example, taxonomic distances computed from subgroup membership are based on the number of matches at the order, suborder, greatgroup, and subgroup level. This approach allows for the derivation of a quasi-numerical classification system from Soil Taxonomy, but it is severely limited by the fact that each split in the hierarchy is given equal weight. In other words, the quasi-numerical dissimilarity associated with divergence at the soil order level is identical to that associated with divergence at the subgroup level. Clearly this is not ideal.

Gower's generalized dissimilarity metric is conveniently implemented in the cluster package for R. I have posted some related material in the past, but left out some of the details regarding which clustering algorithms produce the most useful dendrograms. Divisive clustering best represents the step-wise splits within the hierarchy of Soil Taxonomy, as expressed in terms of pair-wise dissimilarities. Code examples are below, along with the data used to generate the figure of California subgroups. Discontinuities in the figure below are caused by errors in the underlying data, e.g. mismatches in soil order vs. suborder membership.

Subgroups from California
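
To make the "number of matches" idea concrete, here is a minimal stand-alone sketch in Python. It is not the code from the original post, which relies on the cluster package for R. It assumes that each of the four taxonomic levels is treated as an equally weighted nominal variable, in which case Gower's dissimilarity reduces to the fraction of levels that differ. The five subgroups and the helper name gower_nominal() are chosen only for illustration.

```python
from itertools import combinations

# Illustrative taxa only: (order, suborder, greatgroup, subgroup)
TAXA = {
    "typic haploxeralfs": ("alfisols", "xeralfs", "haploxeralfs", "typic haploxeralfs"),
    "mollic haploxeralfs": ("alfisols", "xeralfs", "haploxeralfs", "mollic haploxeralfs"),
    "typic palexeralfs": ("alfisols", "xeralfs", "palexeralfs", "typic palexeralfs"),
    "typic hapludalfs": ("alfisols", "udalfs", "hapludalfs", "typic hapludalfs"),
    "typic haploxerolls": ("mollisols", "xerolls", "haploxerolls", "typic haploxerolls"),
}


def gower_nominal(a, b):
    """Gower dissimilarity for equally weighted nominal variables:
    the fraction of variables (taxonomic levels) that do not match."""
    if len(a) != len(b):
        raise ValueError("both records must have the same number of levels")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise(taxa):
    """Return {(name_1, name_2): dissimilarity} for every unordered pair."""
    return {
        (p, q): gower_nominal(taxa[p], taxa[q])
        for p, q in combinations(sorted(taxa), 2)
    }


if __name__ == "__main__":
    for (p, q), d in pairwise(TAXA).items():
        print(f"{p:22s} {q:22s} {d:.2f}")
```

Each additional level of divergence adds the same 0.25 to the distance, which is exactly the equal-weighting limitation described above: a split at the order level costs no more than a split at the subgroup level.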

Soil Properties Visualized on a 1km Grid

Submitted by dylan on Tue, 2010-08-31 18:29.

Fresno Area Urban Areas vs Irrigated LCC: grey regions are current urban areas

A couple of maps generated from a 1km gridded soil property database, derived from SSURGO data where available, with gaps filled using STATSGO data. Soil properties visualized at this scale illustrate several important soil-forming factors operating within California: sediment source in the Great Valley, the interplay between precipitation and ET, and removal of salts. This database and the details on its creation should be available within a couple of months. This builds on a related post highlighting some of these maps packaged in KMZ format. Check back in a couple of weeks for updates.

GRASS Can Make Pretty Maps

Submitted by dylan on Mon, 2010-08-23 23:22.

I have posted a couple examples in the past on the topic of high quality map production from GRASS GIS-- usually via the Generic Mapping Tools. I am not sure why, but I have previously avoided using the traditional cartographic output module that is bundled with GRASS (ps.map). This is despite the fact that there is now an excellent collection of examples and a very detailed manual page... I now realize that I have been missing out.

I needed a map with several "zoomed" insets, on rather short notice, with all of the data derived from work that had previously been done in GRASS. Not looking forward to exporting all of the data into a GMT-compatible format, I gave ps.map a try. The following image is a JPG (i.e. degraded) version of the Postscript file generated by ps.map, with no manual intervention (other than page layout). Obviously the labels aren't readable, but that can be fixed within a drawing program like Inkscape. Overall, the results were much better than I was expecting. I'll post some notes and the script used to generate the map next time.

Example map produced with ps.map

What would a 25th, 50th, and 75th percentile soil profile look like?

Submitted by dylan on Wed, 2010-08-11 20:38.

I have mentioned the AQP package in previous entries. One of the functions in this package generates aggregate soil profile data from a collection of soil profiles that are related by some factor: common lithology, common landscape position, and so on. Typically the mean or median (50th percentile) is used to generate a new aggregate profile that is representative of the original collection. Extending this idea, I thought that it would be interesting to generate aggregate profiles that are representative of the 25th and 75th percentiles as well. For the sake of clarity, let's call these three new profiles (25th, 50th, and 75th percentiles) Q25, Q50, and Q75. A 10 cm slicing interval was used as the basis upon which soil properties were aggregated.

Aggregate Profiles
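
The slice-then-summarize idea can be sketched in a few lines of Python. This is a stand-alone illustration and not the AQP implementation: the profile data are made up, the property (clay percent) is arbitrary, and each 10 cm slice simply takes the value of whichever horizon contains the slice midpoint. The quartile interpolation rule (the "inclusive" method of Python's statistics.quantiles()) is also just one reasonable choice.

```python
from statistics import quantiles

# Made-up profiles: each is a list of horizons (top_cm, bottom_cm, clay_percent)
PROFILES = [
    [(0, 20, 10.0), (20, 50, 20.0)],
    [(0, 10, 14.0), (10, 50, 24.0)],
    [(0, 30, 12.0), (30, 50, 30.0)],
    [(0, 20, 18.0), (20, 40, 28.0)],  # this profile stops at 40 cm
    [(0, 25, 16.0), (25, 50, 26.0)],
]


def value_at(profile, depth):
    """Return the property value of the horizon containing `depth`, or None."""
    for top, bottom, value in profile:
        if top <= depth < bottom:
            return value
    return None


def aggregate(profiles, max_depth=50, step=10):
    """Compute (top, bottom, Q25, Q50, Q75) for each depth slice.

    Slices with fewer than two observations are skipped, since
    quartiles cannot be estimated from a single value.
    """
    result = []
    for top in range(0, max_depth, step):
        midpoint = top + step / 2
        values = [v for v in (value_at(p, midpoint) for p in profiles)
                  if v is not None]
        if len(values) < 2:
            continue
        q25, q50, q75 = quantiles(values, n=4, method="inclusive")
        result.append((top, top + step, q25, q50, q75))
    return result


if __name__ == "__main__":
    for top, bottom, q25, q50, q75 in aggregate(PROFILES):
        print(f"{top:3d}-{bottom:3d} cm  Q25={q25:5.1f}  Q50={q50:5.1f}  Q75={q75:5.1f}")
```

Reading down the Q25, Q50, and Q75 columns gives the three aggregate profiles; the spread between Q25 and Q75 at a given depth is a rough indication of how variable the collection is in that slice.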


Just for Fun: Using R to Create Targets

Submitted by dylan on Tue, 2010-08-10 17:54.

OK, not really science or soil-related, but a fun 5-minute use of R to make something you can use to improve your hand-eye coordination.

Using R and r.mapcalc (GRASS) to Estimate Mean Topographic Curvature

Submitted by dylan on Tue, 2010-08-03 20:51.

Recently I was re-reading a paper on predictive soil mapping (Park et al, 2001), and considered testing one of their proposed terrain attributes in GRASS. The attribute, originally described by Blaszczynski (1997), is the distance-weighted mean difference in elevation applied to an n-by-n window of cells:



Equation 4 from (Park et al, 2001)

 
where n is the number of cells within an (odd-number dimension) square window excluding the central cell, z is the elevation at the central cell, z_{i} is the elevation at one of the surrounding cells i, d_{i} is the horizontal distance between the central cell and surrounding cell i. I wasn't able to get a quick answer using r.neighbors or r.mfilter, so I cooked up a simple R function to produce a solution using r.mapcalc. The results are compared with the source DEM below; concave regions are blue-ish, convex regions are red-ish. The magnitude and range are almost identical to mean curvature derived from v.surf.rst, with a Pearson's correlation coefficient of 0.99. I think that it would be of general interest to add functionality to r.neighbors so that it could perform distance-weighted versions of commonly used focal functions.

Elevation surface (left) and resulting mean curvature estimate (right)
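
For readers who want to experiment outside of GRASS, here is a small pure-Python sketch of a distance-weighted mean elevation difference over a square window. It is not the r.mapcalc-based R function from the original post. It assumes that the index takes the form (1/n) * sum((z - z_i) / d_i) over the n non-central cells of the window, based on the variable definitions above; both the sign convention (positive where the central cell sits above its neighbors) and the function name are assumptions that should be checked against the paper.

```python
from math import hypot


def mean_weighted_difference(dem, row, col, size=3, cell=1.0):
    """Distance-weighted mean elevation difference at dem[row][col].

    Assumed form: (1 / n) * sum((z - z_i) / d_i) over the n cells of a
    size-by-size window, excluding the central cell. `cell` is the grid
    resolution, in the same units as the elevations.
    """
    if size < 3 or size % 2 == 0:
        raise ValueError("size must be an odd number >= 3")
    half = size // 2
    if not (half <= row < len(dem) - half and half <= col < len(dem[0]) - half):
        raise ValueError("window extends beyond the grid")

    z = dem[row][col]
    total = 0.0
    n = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            if dr == 0 and dc == 0:
                continue  # skip the central cell
            d = hypot(dr, dc) * cell  # horizontal distance to neighbor i
            total += (z - dem[row + dr][col + dc]) / d
            n += 1
    return total / n


if __name__ == "__main__":
    peak = [[0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0]]
    print(mean_weighted_difference(peak, 1, 1))  # positive under the assumed sign
```

Under this sign convention a local peak gives a positive value, a pit gives a negative value, and a planar slope gives zero because differences on opposite sides of the window cancel.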

R's Normal Distribution Functions: rnorm and pals

Submitted by dylan on Wed, 2010-07-14 17:10.

The rnorm() function in R is a convenient way to simulate values from the normal distribution, characterized by a given mean and standard deviation. I hadn't previously used the associated commands dnorm() (normal density function), pnorm() (cumulative distribution function), and qnorm() (quantile function), so I made a simple demo. The *norm functions generate results based on a well-behaved normal distribution, while the corresponding functions density(), ecdf(), and quantile() compute empirical values. The following example could be extended to graphically describe departures from normality (or some other distribution-- see rt(), runif(), rcauchy() etc.) in a data set.
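
The same theoretical-versus-empirical comparison can be sketched with Python's standard library, for anyone who wants to try the idea outside of R. This is not the demo from the original post; the mean, standard deviation, sample size, and seed are arbitrary, and the ecdf() helper is a hand-rolled stand-in for R's function of the same name.

```python
from statistics import NormalDist, quantiles

dist = NormalDist(mu=10.0, sigma=2.0)

# Simulated values, comparable to rnorm(1000, mean=10, sd=2)
sample = dist.samples(1000, seed=42)

# Theoretical values from the idealized distribution
density_at_mean = dist.pdf(10.0)    # comparable to dnorm(10, 10, 2)
p_below_12 = dist.cdf(12.0)         # comparable to pnorm(12, 10, 2)
q_975 = dist.inv_cdf(0.975)         # comparable to qnorm(0.975, 10, 2)


def ecdf(data, x):
    """Empirical CDF: the proportion of observations less than or equal to x."""
    return sum(v <= x for v in data) / len(data)


# Empirical counterparts computed from the simulated sample
emp_p_below_12 = ecdf(sample, 12.0)
emp_q25, emp_median, emp_q75 = quantiles(sample, n=4)

if __name__ == "__main__":
    print(f"P(X <= 12): theoretical {p_below_12:.4f}, empirical {emp_p_below_12:.4f}")
    print(f"median:     theoretical {dist.inv_cdf(0.5):.4f}, empirical {emp_median:.4f}")
```

With a large enough sample the empirical values should land close to the theoretical ones; systematic gaps between the two are a sign of departures from normality.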


PostGIS in Action Book Review

Submitted by dylan on Tue, 2010-06-08 17:32.

I was recently asked to review a soon-to-be-published book on PostGIS, a spatial extension to the very popular PostgreSQL relational database. I was very excited about receiving an early copy of this book, as the authors have provided countless tips, fixes, and clever query examples on the PostGIS mailing list over the years. After spending a couple of weeks looking through the book, I have to say that I am very impressed with its quality and completeness. Indeed, this is the book that I wish had been available when I was starting out with PostGIS. The authors do an excellent job of promoting the idea that a relational database and SQL are well suited for spatial data modeling and analysis.
