Spatial Data Management and Mapping with R

Spatial Data Management and Mapping with R

The R programming language has several tools for spatial interpolation and managing spatial data. While there are plenty of tools out there for working with spatial data, one the advantages of R is that scripts can be written to automate frequently repeated tasks, or to automatically generate templated figures.

The other advantage is that spatial data can be combined with R’s extensive statistical and machine learning libraries for advanced data analysis.

In this post, I will be providing an example of how to map soil data with kriging in R using a free soil metal concentration dataset from the Netherlands. I will also show how to create a figure using the kriging results and satellite imagery. If you wish to have the entire code please contact Maapera and we'll send it at no charge.

1) Load Libraries                                                                     

The first step is to load all of the R libraries we need. We will be using seven different libraries throughout this example:

library(gstat)

library(geoR)

library(sp)

library(rgeos)

library(maptools)

library(raster)

library(SpatialTools)

Kriging can be done with both the gstat and geoR packages. Both are excellent packages. In this example we will be using the gstat package only to fetch our practice data, and we will be using geoR for the kriging.

2) Create a geodata file and fit a variogram

The next step is to organize our data into a geoR geodata file, and then fit a variogram to the geodata file. 

#Kriging of the Soil Cadmium Concentrations with the GeoR Package

#create geodata file for variogram fitting

#Coordinates are columns 2 and 3 in the meuse.all data frame, and the cadmium concentrations are in column 4

g=as.geodata(meuse.all, coords.col=2:3, data.col=4)

#create a variogram from the geodata file

#we will use the modulus estimator and specify constant trend

#other trends include 1st or 2nd order polynomials

v=variog(g, estimator.type='modulus', trend='cte')

 

#plot and inspect the variogram

plot(v)


Once we have created our variogram, we will then fit four different variogram models, and then plot the results.

#Fit a variogram model to the variogram

#Instead of just one lets plot a few and pick the one with the lowest SSE value

#create variogram models

variog.wls.exp=variofit(v,cov.model="exponential",weights="cressie")

variog.wls.spherical=variofit(v,cov.model="spherical",weights="cressie")

variog.wls.cubic=variofit(v,cov.model="cubic",weights="cressie")

variog.wls.circular=variofit(v,cov.model="circular",weights="cressie")

 

#plot the variogram models on the variogram

plot(v)

lines(variog.wls.exp)

In this example we are plotting the exponential model. 

While there are different rationales for selecting a variogram model, in this case we are going to get R to select the variogram with the lowest sum of squares error (SSE) value.

#Automatically determine which model has lowest SSE

model_list=list(variog.wls.spherical$cov.model, variog.wls.exp$cov.model,variog.wls.cubic$cov.model,variog.wls.circular$cov.model)

sse_list=list(variog.wls.exp$value,variog.wls.spherical$value, variog.wls.cubic$value,variog.wls.circular$value)

model_errors=do.call(rbind,Map(data.frame,Mod=model_list,SSE=sse_list))

#select row with min sse

min_sse=min(model_errors$SSE)

model=model_errors[which(model_errors$SSE==min_sse),1]

#fit variogram using the best variogram model from earlier

v.fit=variofit(v,cov.model=toString(model),weights="cressie")

 

#plot results

plot(v)

lines(v.fit)

In this case, the circular model has the best fit of the models we ran. We have fit the circular model to a new object called v.fit. This way we can call v.fit later for the kriging code. Our code will automatically run the four models, select the one with the lowest SSE and then call it later in the kriging process.

3) Kriging

Our next step to krig our data to create a grid of interpolated points. To start we will create a grid of points to predict to. In this case we will create a grid with 10 m spacing.

#create points for kriging results

#create a grid of points with 10 m spacing from the min and max x and y values

x=seq(min(meuse.all$x), max(meuse.all$x), by=10)

y=seq(min(meuse.all$y), max(meuse.all$y), by=10)

prd.loc=expand.grid(x=x,y=y)

prd.loc=as.matrix(prd.loc)

Next, we will want to subset our prediction grid based on the distance to the nearest measured data point. This way we can restrict R from predicting to points too far away from a measured data point.

#next we want to subset the prediction grid so there are no points more than 500 m from a measured point

#create a file with measured coordinates

#calculate distance from prediction point grid to the measured points

coords=cbind(meuse.all$x, meuse.all$y)

grid.temp=dist2(coords, prd.loc)

grid.temp=t(grid.temp)

 

#subset data based on distances

min.dist=apply(grid.temp, 1, function(x) min(x))

prd.loc=cbind(min.dist, data.frame(prd.loc), grid.temp)

prd.loc=subset(prd.loc, prd.loc[,1]<500)

grid.temp=prd.loc[,-c(1:3)]

prd.loc=prd.loc[,2:3]

Now that we have our prediction grid, we can go ahead and do the kriging. This code is set up so the parameters from the variogram model with the lowest SSE will automatically be used for the kriging.

#kriging

#specify variogram specifics for kriging

kcontrol=krige.control(cov.model=toString(model),cov.pars=c(v.fit$cov.pars[1], v.fit$cov.pars[2]))

 

#krig the results

cadmium_plot=krige.conv(g, loc=prd.loc, krige=kcontrol)

Now we want to save the kriging results into a data frame, which we can then convert to a spatial data frame and then a raster.

#save kriging results in a data frame

#specify column names

cadmium_plot=data.frame(cbind(cadmium_plot$predict, prd.loc[,1], prd.loc[,2]))

colnames(cadmium_plot)=c("cadmium", "x", "y")

 

#convert to spatial data frame

coordinates(cadmium_plot)=~x+y

 

#convert to a raster

cadmium_plot=rasterFromXYZ(cadmium_plot)

Finally, we can set the projection from the raster. In this example the coordinates were collected in the National Projection for the Netherlands (epsg:28992).

#set projection

crs(cadmium_plot)= "+init=epsg:28992"

Now that we have finished kriging the data we can plot the results.

#plot raster

image(cadmium_plot, asp=1, xlab="Northing", ylab="Easting", main="Meuse Data Set Cadmium Concentrations")


We can also add contours on the heat map if we want.

#add contours

contour(cadmium_plot, add=T)

4) Create Figure with Satellite Data

Now that we have our soil cadmium concentration map, we can overlay the results on a satellite image. We will be loading in 10 m spatial resolution Sentinel-2 Data from the European Space Agency.

Our first step will be to set our working drive to where the data is stored. As R was written for Linux systems your drive location needs double back slashes or a forward slash if you are using windows.

Then load in the red, green and blue bands and store then in a raster stack. Create the stack in the order of red, green and blue as the R Raster Package RGB plotting command expects the bands in this order.

setwd("C:\\Data\\Sentinel”)

blue=raster("L2A_T31UFS_20180222T104029_B02_10m.jp2")

green=raster("L2A_T31UFS_20180222T104029_B03_10m.jp2")

red=raster("L2A_T31UFS_20180222T104029_B04_10m.jp2")

rgb_image=stack(red, green, blue)
 
  

Our next step is to crop the sentinel data to the extent of the soil cadmium plot. Before we do that we need to make sure the rasters have the same projection. We will reproject the cadmium raster to the sentinel data as since it is smaller that will be computationally more efficient.

#crop to extent of site

#reproject cadmium plot and shapefile to same CRS as sentinel image

cadmium_plot=projectRaster(cadmium_plot, crs="+init=epsg:32631")
 
  

We can now crop the sentinel data to the extent of the cadmium concentration map. 

#crop sentinel image to Cadmium Plot Extent

rgb_image=crop(rgb_image, cadmium_plot)

Next, we can also create a shapefile from the cadmium concentration raster. In this case we are going to create a copy of the raster, set every value below 10 ppm to 1 and every value above 10 ppm to 2.

We can then subset our shapefile to only include the layers with values equal to 2, which corresponds to areas with soil cadmium concentrations greater than 10 ppm. 

#create a shapefile

#raster to vector

cadmium_shape=cadmium_plot

#set values in raster to classes

#in this case we are splitting the raster into values greater or less than 10

cadmium_shape@data@values[cadmium_shape@data@values <10]=1

cadmium_shape@data@values[cadmium_shape@data@values >=10]=2

#convert the raster to polygons

cadmium_shape=rasterToPolygons(cadmium_shape, dissolve=TRUE)

#remove polygons with values less than our specified value of 10 ppm

cadmium_shape=subset(cadmium_shape, cadmium_shape@data$layer==2)

 
  

We can plot the results to view our shapefile. 

#plot the results

spplot(cadmium_shape)
 
  


Now that we have all the pieces we will create our figure. R will put any new layers on top, so we need to add our data with that sequence in mind. We call plot the cadmium plot twice, so we can have that data on top while keeping the axes and legend created by the plot function.

By default, plotRGB will not create axes and legend plot elements. You can manually add them, but it is easier to plot the cadmium plot before and after calling plotRGB.

We have the transparency of the cadmium plot set to 50%, which can be increased or decreased with the alpha command. 

#Create Plot

par(xpd=FALSE, bty="l")

plot(cadmium_plot, main="Meuse Cadmium Map",xlab="Easting (m)", ylab="Northing (m)", cex=1.5)

plotRGB(rgb_image, add=T, stretch="lin", scale=65535, main="Meuse Cadmium Map",xlab="Easting (m)", ylab="Northing (m)")

plot(cadmium_plot, add=T, main="Meuse Cadmium Map",xlab="Easting (m)", ylab="Northing (m)", legend=FALSE, cex=1.5, alpha=0.5)

grid(lty=1)

par(xpd=TRUE)

If we want, we can also overlay the shapefile we created over this figure with the following command. 

Or we can add the contours to the plot with the following:

#add contours to the map
contour(cadmium_plot, add=T, col="white")

We can close the plot with dev.off()

#close the plot

dev.off()

Finally, we can export the cadmium concentration raster and shapefile with the following commands: 

#export

writeRaster(cadmium_plot, "Cadmium_Plot.tiff", format="GTiff")

writeSpatialShape(cadmium_shape, "cadmium")

5) Summary

That concludes this example showing how to krig data in R and plot the results along with satellite imagery. R has plenty of graphing customization functions, so this plot can be changed substantially as needed. The ggplot2 library has great plot customization functions.

If you have any questions feel free to contact Maapera. If you wish to have the entire code to contact Maapera and we'll send it at no charge







What can be the reason for this error                                                                                                              Error in match.arg(cov.model, choices = .geoR.cov.models) : 'arg' should be one of “matern”, “exponential”, “gaussian”, “spherical”, “circular”, “cubic”, “wave”, “linear”, “power”, “powered.exponential”, “stable”, “cauchy”, “gencauchy”, “gneiting”, “gneiting.matern”, “pure.nugget

Like
Reply

I'm going to try this when I have a chance! Thanks 

Like
Reply

Really great innovative stuff!

Like
Reply

To view or add a comment, sign in

More articles by Preston Sorenson

Others also viewed

Explore content categories