# Spatial Clustering of Point Data: Spearfish Example

This example uses the 'Partitioning Around Medoids' (PAM) algorithm (Kaufman and Rousseeuw, 2005) to divide a set of point observations into k clusters, based on their spatial attributes only. The same approach can be extended to any type of geographic data, such as terrain attributes.
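The defining feature of PAM is that each cluster is represented by a medoid, which is one of the original observations rather than a computed average. The following is a minimal sketch on made-up coordinates (not the bugsites data); the object names `pts` and `fit` are only illustrative.

```r
# Minimal sketch on synthetic data, assuming the 'cluster' package is installed
library(cluster)

set.seed(1)
# two well-separated groups of hypothetical coordinates
pts <- data.frame(
  easting  = c(rnorm(20, mean = 0, sd = 1), rnorm(20, mean = 100, sd = 1)),
  northing = c(rnorm(20, mean = 0, sd = 1), rnorm(20, mean = 100, sd = 1))
)

# partition into k = 2 clusters
fit <- pam(pts, k = 2)

# each medoid is an actual row of the input, not a mean
fit$id.med              # row indices of the medoids
fit$medoids             # coordinates of those rows
table(fit$clustering)   # number of points assigned to each cluster
```

Because medoids are real observations, they can be mapped back to the original point IDs, which is convenient when the results are later re-imported into a GIS.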

For a simple comparison of some of the partitioning-style clustering algorithms in R, see this page of demos. For a more complete listing of clustering approaches in R, see the Cluster Task View.

References

• Kaufman, L. & Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley-Interscience, 2005.

Export xy coordinates for the bugsites from GRASS. See the attached file at the bottom of the page.

```
# export bugsites
v.out.ascii in=bugsites out=bugsites.xy
```

Load this text file into an R session. A simple map can be made by plotting the xy coordinates.

```
# read in ascii file, and assign column names
x <- read.table('bugsites.xy', sep="|")
names(x) <- c('easting', 'northing', 'cat')

# subset original object, return only x,y cols
y <- data.frame(x[,1:2])
row.names(y) <- x$cat

# simple plot of x,y data
plot(y, pch=3)
```

Use the stepFlexclust function to determine a suitable number of hard classes. Five clusters looks like a good start.

```
# load the cluster and flexclust packages
library(cluster)
library(flexclust)

# figure out a good number of clusters:
# use a range of 2 to 10 clusters, with 20 reps each
s <- stepFlexclust(y, k=2:10, nrep=20)
plot(s)
```
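As a possible cross-check on the number of clusters, the average silhouette width reported by `pam()` can be compared across candidate values of k. The sketch below is not part of the original workflow: `avg_sil_width` is a hypothetical helper, and the `demo` data frame is a synthetic stand-in for the point coordinates.

```r
# Sketch only: compare average silhouette widths of PAM solutions
library(cluster)

# hypothetical helper: average silhouette width for each candidate k
avg_sil_width <- function(xy, ks = 2:10) {
  sapply(ks, function(k) pam(xy, k, stand = TRUE)$silinfo$avg.width)
}

set.seed(42)
# synthetic stand-in for the point data: three compact groups
centers <- rbind(c(0, 0), c(50, 0), c(0, 50))
demo <- data.frame(
  easting  = rnorm(90, mean = rep(centers[, 1], each = 30), sd = 2),
  northing = rnorm(90, mean = rep(centers[, 2], each = 30), sd = 2)
)

ks <- 2:6
w <- avg_sil_width(demo, ks)
names(w) <- ks
print(round(w, 3))

# candidate k with the largest average silhouette width
best_k <- ks[which.max(w)]
print(best_k)
```

With real data the helper could be called on the coordinate data frame instead, for example `avg_sil_width(y, 2:10)`. Larger values suggest more compact, better separated clusters, but the result is a guide rather than a definitive answer.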

Perform hard classification (clustering) with the PAM algorithm, and plot the results.

```
# 5 clusters is a good number
y.pam <- pam(y, 5, stand=TRUE)

# add the clustering vector back to the original dataframe
y$cluster <- y.pam$clustering

# plot the clusters by color
plot(y$easting, y$northing, col=y$cluster, main="Bugsites Spatial Clustering, 5 classes", cex=0.5, pch=16, xlab="Easting", ylab="Northing")

# add the medoids: row i of the medoid matrix belongs to cluster i
points(y.pam$medoids, pch=15, col=1:5, cex=1.25)

# connect the original points to the medoids with line segments:
# column 1 of the medoid matrix is easting, column 2 is northing
for(i in 1:5)
{
  segments(x0=y.pam$medoids[i,1], y0=y.pam$medoids[i,2], x1=y$easting[y$cluster == i], y1=y$northing[y$cluster == i], col=i, lty=3)
}
```

Prepare the data for export to text, and save the clustered data.

```
# add the cluster number and the original category ID to the dataframe
y$cluster <- y.pam$clustering
y$orig_cat <- as.numeric(row.names(y))

# save as a space-delimited text file, with a header row
write.table(y, file='bugsites.clust', row.names=FALSE)
```

Load clustered data into GRASS as a new set of points called 'bclust'. For each cluster, extract those points, and compute a convex hull.

```
# load clustered points into GRASS (skip=1 skips the header row)
v.in.ascii in=bugsites.clust out=bclust fs=" " columns='x double, y double, cluster integer, orig_cat integer' skip=1

# zoom to the full extent of the Spearfish dataset
g.region rast=elevation.dem

# there are 5 clusters: show them all, and compute convex hulls
for x in `seq 1 5`
do
  v.extract --o in=bclust where="cluster=$x" out=bclust_$x
  v.hull --o in=bclust_$x out=bclust_hull_$x
  d.vect bclust_hull_$x type=boundary fcol=none width=2 col=white
  d.vect bclust icon=basic/box fcol=black col=black size=6
done
```
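For a quick look at the hulls without leaving R, a roughly equivalent per-cluster convex hull can be sketched with the base function `chull()`. This is only an illustration on made-up data, not a replacement for the `v.hull` step: `cluster_hulls` is a hypothetical helper that assumes a data frame with `easting`, `northing` and `cluster` columns.

```r
# Sketch only: per-cluster convex hulls in base R

# hypothetical helper: hull vertices for each cluster
cluster_hulls <- function(d) {
  lapply(split(d, d$cluster), function(g) {
    idx <- chull(g$easting, g$northing)   # row positions of the hull vertices
    g[idx, c("easting", "northing")]
  })
}

# small made-up data set: each cluster is a square plus one interior point
d <- data.frame(
  easting  = c(0, 10, 10, 0, 5, 20, 30, 30, 20, 25),
  northing = c(0, 0, 10, 10, 5, 0, 0, 10, 10, 5),
  cluster  = rep(1:2, each = 5)
)

hulls <- cluster_hulls(d)

# number of hull vertices per cluster
print(sapply(hulls, nrow))

# draw the points and outline each hull in the cluster color
plot(d$easting, d$northing, col = d$cluster, pch = 16,
     xlab = "Easting", ylab = "Northing")
for (i in seq_along(hulls)) {
  polygon(hulls[[i]]$easting, hulls[[i]]$northing, border = i)
}
```

The interior point of each square is dropped from its hull, which is the same behavior expected from the hulls computed in GRASS.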

Attached file: bugsites.xy