First-Cut Approach to Synchronizing Field Notes with GPS Data

Submitted by dylan on Mon, 2011-05-09 22:41.

After a week's worth of work in the field, I typically have several pages of hand-written field notes that are associated with GPS waypoints, and they are badly in need of some kind of transcription and organization. I have yet to find a simple approach for bringing together these non-spatial (field note transcriptions) and spatial (waypoints) data, linked only by a simple ID system. I am also somewhat limited by the software available on USDA computer systems. Here is a first-cut approach, implemented in R, that relies on CSV files. In a perfect world, I would implement something using Python and PostGIS...
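The linking idea can be sketched with two tiny tables. This is only a minimal illustration in base R, assuming both tables share an `ident` column; the rows and the first waypoint's coordinates are made up for the example, and `merge()` with `all.x=TRUE` stands in for the `join()` call used in the full listing.

```r
# hypothetical waypoint table (spatial side); values are illustrative only
waypoints <- data.frame(
  ident  = c('001', '002N'),
  x_proj = c(692100, 692068),
  y_proj = c(4227390, 4227402),
  stringsAsFactors = FALSE
)

# hypothetical transcribed notes (non-spatial side); only one waypoint has a note
notes <- data.frame(
  ident = '002N',
  note  = 'getting close to geologic boundary',
  stringsAsFactors = FALSE
)

# left join on the shared ID: every waypoint is kept,
# and waypoints without a transcribed note get NA
combined <- merge(waypoints, notes, by = 'ident', all.x = TRUE)
print(combined)
```

Keeping every waypoint (a left join) matters here because a full pit description is recorded on a separate form and has no entry in the notes table.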

Code Listing

# Date: 20 June, 2011
# Author: D.E. Beaudette
# Purpose: Automate the process of concatenating GPS data + transcribed field notes into a single CSV + shapefile + KML for later use.
# Notes:
# folder structure must exist:
# data/
#       gps_data/               [put TXT export from DNR Garmin here, should be no duplicate records]
#       shp_output/             [output shapefile ends up here]
#       notes.csv               [manually edited table with GPS ident + transcribed field notes]
#       combined_gps_data.csv   [created by script, same format as shapefile]
#       [xxxx].kml              [created by script, user-selectable filename]

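# need these
library(plyr)    # ldply(), join()
library(sp)      # coordinates(), proj4string(), CRS()
library(rgdal)   # writeOGR(), project()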

# local functions

# this is where I keep my files

# set file paths, relative to working dir
gps_file_path <- 'data/gps_data/'
shp_file_path <- 'data/shp_output'
notes_file_path <- 'data/notes.csv'
combined_file_path <- 'data/combined_gps_data.csv'
pedonPC_import_file_path <- 'data/pedonPC_import.txt'
kml_file_path <- 'data/dylan_gps.kml'

# set user's initials
user_initials <- 'DEB'

# list of GPS files in our dir
files <- list.files(path=gps_file_path, pattern='\\.txt$')

# empty list to store intermediate results
l <- list()

# columns of interest
vars <- c('long', 'lat', 'x_proj','y_proj','altitude','ident','comment','filename')

# read each file into a sequential list element
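for(i in seq_along(files)) {
  f <- paste(gps_file_path, files[i], sep='')
  # the remaining read.csv() arguments were lost from the original listing;
  # stringsAsFactors=FALSE is an assumption
  l[[i]] <- read.csv(f, stringsAsFactors=FALSE)
}
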
# re-combined into single DF
# note that ldply() will pad missing columns
d <- ldply(l)

# keep only named columns
d <- d[, vars]

# fix date/time
# keep the 'comment' column for pedonPC
d$datetime <- as.POSIXlt(strptime(d$comment, format='%d-%b-%y %H:%M'))
d$date <- format(d$datetime, '%Y-%m-%d')
d$time <- format(d$datetime, '%H:%M')
d$datetime <- NULL

# add year + initials to ident, making the ID
d$full_id <- paste(format(as.Date(d$date), '%y'), user_initials, d$ident, sep='')

# make ident for PedonPC
d$pedonPC_id <- paste(user_initials, d$ident, sep='')

# round coordinates
d$x_proj <- round(d$x_proj)
d$y_proj <- round(d$y_proj)

# add flag for full descriptions
d$fulldesc <- 1
d$fulldesc[grep('N', d$ident)] <- 0

# join with extended notes, padding missing note text
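# the remaining read.csv() arguments were lost from the original listing;
# stringsAsFactors=FALSE is an assumption
notes <- read.csv(notes_file_path, stringsAsFactors=FALSE)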
g <- join(d, notes, by='ident')

# add 'pedon' flag to full notes,
# this will be used to filter out records to send to pedonPC
g$pedon[is.na(g$pedon)] <- 1

# save to CSV, overwriting existing files
write.csv(g, file=combined_file_path, row.names=FALSE)

# re-format for PedonPC import
# keeping only full / partial soil descriptions
g.pedonPC <- data.frame(type='WAYPOINT', g[g$pedon == 1, c('pedonPC_id', 'lat', 'long', 'y_proj', 'x_proj', 'comment', 'altitude')])
names(g.pedonPC) <- c('type', 'ident', 'lat', 'long', 'y_proj', 'x_proj', 'comment', 'altitude')

# save file for PedonPC import
write.csv(g.pedonPC, file=pedonPC_import_file_path, row.names=FALSE, quote=FALSE)

# upgrade to SPDF
coordinates(g) <- ~ x_proj + y_proj
proj4string(g) <- CRS('+proj=utm +zone=10 +datum=NAD83')

# save to shapefile, overwriting existing files
writeOGR(g, dsn=shp_file_path, layer=paste(user_initials, 'combined', sep='_'), driver='ESRI Shapefile', overwrite_layer=TRUE)

# Make a KML version, with bubbles containing the notes
start_kml(kml_file_path)
for(i in seq_len(nrow(g))) {add_placemark(kml_file_path, g[i, ])}
stop_kml(kml_file_path)

# done
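
To make the date and ID steps concrete, here is a small worked example on a single record. It is only a sketch: the `ident`, date, time, and initials match the first row of the sample output in the comments, but the raw `comment` string is an assumed example of the '%d-%b-%y %H:%M' layout, `tz='UTC'` is added only to keep the result independent of the local time zone, and month-name parsing with `%b` assumes an English locale.

```r
# one waypoint; the raw 'comment' string is an assumed example
d <- data.frame(ident = '002N', comment = '29-MAR-11 10:23',
                stringsAsFactors = FALSE)
user_initials <- 'DEB'

# parse the GPS time stamp, then split it into date and time strings
dt <- strptime(d$comment, format = '%d-%b-%y %H:%M', tz = 'UTC')
d$date <- format(dt, '%Y-%m-%d')
d$time <- format(dt, '%H:%M')

# two-digit year + initials + GPS ident
d$full_id <- paste(format(as.Date(d$date), '%y'), user_initials, d$ident, sep = '')

# idents containing 'N' are quick notes, everything else is a full description
d$fulldesc <- ifelse(grepl('N', d$ident), 0, 1)

print(d[, c('date', 'time', 'full_id', 'fulldesc')])
```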

Internally-Used Functions (will be greatly improved by the upcoming release of the plotKML package)

add_placemark <- function(filename, p) {
  # inv. project to LL
  p.coords <- project(coordinates(p), proj=proj4string(p), inv=TRUE)
  # get the date/time
  p.date <- p$date
  p.time <- p$time
  # get the full Id and description
  # if the description is missing, then it means we collected a full pit
  p.id <- p$ident
  p.desc <- p$note
  # set style
  if(p$fulldesc == 1) {
    style <- 'waypoint'
    p.desc <- 'Full description, see 232 form.'
  } else {
    style <- 'waypoint_note'
  }
  # compose placemark packet
  # NOTE: most of the KML tags in this string were lost from the original listing;
  # the elements below are a reconstruction using standard KML elements
  pm <- paste('<Placemark>
  <name>', p.id, '</name>
  <description><![CDATA[
  <div style="width: 500px;">', p.desc, '</div>
  <center>', paste(p.date, p.time), '</center>
  ]]></description>
  <styleUrl>#', style, '</styleUrl>
  <Point><coordinates>', paste(p.coords, collapse=','), '</coordinates></Point>
</Placemark>', sep='')
  write(pm, file=filename, append=TRUE)
}

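start_kml <- function(filename) {
  # NOTE: the namespace URL, the <Document>/<Folder> tags, and the style contents
  # were lost from the original listing; the OGC KML 2.2 namespace and the empty
  # styles below are assumptions
  kml_head <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
    <name>Mapping Notes</name>
    <Style id="waypoint"></Style>
    <Style id="waypoint_note"></Style>
<Folder>'

  write(kml_head, file=filename, append=FALSE)
}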

stop_kml <- function(filename) {
  kml_tail <- '</Folder></Document></kml>'
  write(kml_tail, file=filename, append=TRUE)
}

final csv output

Thanks for the post - I like where you're headed with this. What does your final csv output look like from this code? Please post. Thanks!

Sample Output

Hi Jay. Thanks for the feedback. It would be good to chat about what PedonPC expects, so that this script can re-format records from any GPS such that they are in a new, standardized format. The final CSV looks something like this:

692068,4227402,218,"002N",NA,"2011-03-29","10:23","11DEB002N",0,"UM","","getting close to geologic boundary"
691643,4227488,216,"003N",NA,"2011-03-29","11:05","11DEB003N",0,"slate","","looks like mariposa slate, blue oaks + chamise"
691574,4227521,227,"004N",NA,"2011-03-29","11:21","11DEB004N",0,"slate","whiterock","quick pedon desc."
691715,4227547,229,"005N",NA,"2011-03-29","11:57","11DEB005N",0,"","","geologic boundary; blue oaks + taller grasses, red hues; rounded, indurated gravels"
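
For anyone wanting to work with this output, here is a rough sketch of reading such rows back into R and pulling out the quick field notes. The two rows are copied from the sample above; the column names are hypothetical, since the sample has no header row, and the last three in particular ('geology', 'series', 'note') are guesses.

```r
# two rows copied from the sample output
txt <- '692068,4227402,218,"002N",NA,"2011-03-29","10:23","11DEB002N",0,"UM","","getting close to geologic boundary"
691643,4227488,216,"003N",NA,"2011-03-29","11:05","11DEB003N",0,"slate","","looks like mariposa slate, blue oaks + chamise"'

# hypothetical column names (the sample has no header row)
cols <- c('x_proj', 'y_proj', 'altitude', 'ident', 'comment', 'date', 'time',
          'full_id', 'fulldesc', 'geology', 'series', 'note')

x <- read.csv(text = txt, header = FALSE, col.names = cols,
              stringsAsFactors = FALSE)

# keep only the quick field notes (fulldesc == 0)
field_notes <- x[x$fulldesc == 0, c('full_id', 'date', 'note')]
print(field_notes)
```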

A note on dates

Hey Dylan,

Great post, as per usual ;)

Just a slight suggestion: it appears that a significant amount of code here is dedicated to handling time/date values, which is normal, as handling these is always a bit of a pain!

But the freshly released package lubridate should be a help here and lighten your code.

For example:
x <- "12-02-2011 12:03"
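
The rest of the example did not survive, so the following is only a guess at what such a call might look like, not the commenter's original code. It assumes the string is in day-month-year order and uses lubridate's `dmy_hm()` parser; if the month comes first, `mdy_hm()` would be the one to use.

```r
library(lubridate)

x <- "12-02-2011 12:03"

# assumes day-month-year order; dmy_hm() returns a POSIXct value in UTC
parsed <- dmy_hm(x)
print(parsed)

# date and time strings, as in the main script
format(parsed, '%Y-%m-%d')
format(parsed, '%H:%M')
```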


Thanks Pierre, I really need to check out that new package. I spent about 30 minutes fighting with this code, as I had accidentally used the related POSIXlt class instead of POSIXct class for the date-time formatting.
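
The difference between the two classes can be seen in a short base-R sketch. The time stamp string is an assumed example in the same layout as the GPS comment field, `tz='UTC'` is there only to avoid depending on the local time zone, and `%b` parsing assumes an English locale.

```r
# strptime() always returns POSIXlt: a list of components (sec, min, hour, ...)
lt <- strptime('29-MAR-11 10:23', format = '%d-%b-%y %H:%M', tz = 'UTC')
class(lt)
is.list(unclass(lt))

# POSIXct is a single number (seconds since 1970-01-01) per time stamp,
# which is usually the safer choice for a data frame column
ct <- as.POSIXct(lt)
class(ct)
is.numeric(unclass(ct))

# components are directly accessible on POSIXlt, formatting works on both
lt$hour
format(ct, '%H:%M')
```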