A Lattice Panel Function for Filled Polygons that Accounts for Missing Data

Submitted by dylan on Mon, 2011-03-28 15:23.

This is a quick update to some code posted a while back, related to plotting filled polygons within a lattice panel function. After attempting to use the originally described function to plot data containing NA values, I quickly realized that a more robust approach was required: panel.polygon() does not deal well with missing data. A new function of the same name is attached at the bottom of the page, and some code demonstrating its use is presented below. The rle() function (run-length encoding) does most of the difficult work of identifying contiguous chunks of non-missing data. Each chunk is plotted as a separate polygon, resulting in a (mostly) generalized panel function for plotting filled polygons within the lattice framework. Comments welcome.
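To make the rle() idea concrete, here is a minimal sketch of the chunking step on a toy vector. It is not the code from the attached panel.tsplot.R; the helper name find_chunks(), the toy data, and the lower/upper band are assumptions made up for illustration.

```r
# Hypothetical helper (an assumption, not taken from panel.tsplot.R):
# locate contiguous runs of non-missing values using run-length encoding.
find_chunks <- function(y) {
  r <- rle(!is.na(y))               # runs of TRUE (present) / FALSE (missing)
  ends <- cumsum(r$lengths)         # last index of each run
  starts <- ends - r$lengths + 1    # first index of each run
  keep <- r$values                  # keep only the runs of non-missing data
  data.frame(start = starts[keep], end = ends[keep])
}

# Toy series with two gaps (made-up values)
y <- c(1.0, 1.5, NA, NA, 2.0, 2.5, 3.0, NA, 3.5)
pos <- seq_along(y)
chunks <- find_chunks(y)
print(chunks)   # non-missing runs cover indices 1-2, 5-7, and 9

# Build one closed polygon outline per chunk: walk forward along the lower
# edge, then backward along the upper edge.
lower <- y - 0.5
upper <- y + 0.5
polys <- lapply(seq_len(nrow(chunks)), function(i) {
  idx <- chunks$start[i]:chunks$end[i]
  data.frame(x = c(pos[idx], rev(pos[idx])),
             y = c(lower[idx], rev(upper[idx])))
})
```

Inside a lattice panel function, each element of polys could then be handed to panel.polygon() in turn, so that no single polygon ever spans a gap.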

Figure: panel.tsplot() example

 
Demo

# Code originally by D.E. Beaudette and V. Mehta
library(lattice)
library(RColorBrewer)

# see attached file at bottom of page
source('panel.tsplot.R')

# setup query parameters
row <- 8 ; col <- 121 ; scale <- 1

# build URL: this is the new approach to loading data
u <- url(sprintf("http://www.climatechange.ca.gov/visualization/sei_temp/weadapt_www_site/data_export.php?format=csv&row=%i&col=%i&scale=%i&query_type=climatets", row, col, scale))

# read file from URL
x <- read.csv(u)

# fix the date
x$date <- as.Date(x$date)

##
## let's make things interesting and remove a couple of slices of data
##
x$mean_temp[x$date > as.Date('2020-01-01') & x$date < as.Date('2025-01-01')] <- NA
x$mean_temp[x$date > as.Date('2075-01-01') & x$date < as.Date('2090-01-01')] <- NA

# get date range
d.range <- range(x$date)
d.list <- seq(d.range[1], d.range[2], by='5 years')

# setup colors and line types
area.cols <- brewer.pal('Set1', n=9)
line.cols <- rep(area.cols[1:2], each=3)
line.lty <- rep(c(1,2,3), times=2)

# customize plotting device
trellis.par.set('superpose.polygon'=list(col=area.cols))
trellis.par.set('superpose.line'=list(col=line.cols, lty=line.lty))
trellis.par.set('layout.widths'=list(ylab.axis.padding=3))

# plot
xyplot(mean_temp ~ date, groups=emission, data=x,
  scales=list(y=list(alternating=3), x=list(format="%Y", cex=1, at=d.list)),
  auto.key=list(columns=2, rectangles=TRUE, lines=FALSE, points=FALSE, cex=0.75),
  xlab='Date', ylab="Avg Annual Temperature",
  panel=panel.tsplot)

Attachment: panel.tsplot.R (1.72 KB)

A quick n' dirty ggplot2 version

Hi Dylan,

Here's a quick n' dirty ggplot2 implementation. Some polishing is needed on the legend/color scale/x-axis:


library(ggplot2)
library(plyr)  # provides ddply() and summarise(), used below

df <- ddply(x, .(date, emission), summarise, min_temp = min(mean_temp), max_temp = max(mean_temp))

my.plot <- ggplot(df) + geom_ribbon(aes(x=date, ymin=min_temp, ymax=max_temp, group=emission, fill=emission), alpha=I(1/3))

print(my.plot)

Neat

I figured that there must be implementations that know how to deal with this kind of data properly. Thanks for the example, Pierre. I'll have to spend more time looking at ggplot, although I have to admit that I like the look of lattice a little more :).