An afternoon with Geostatistics

Problem: Given a spatial field in 2 dimensions and time in one dimension, measurements are made at a few points in the three dimensions, and values at certain other points in the spatio-temporal field are required. 

Spatial Kriging. The 2-dimensional spatial version of the problem arises in several contexts. An oil company drills a few holes in the Earth’s crust for oil exploration and acquires a random variable that indicates the drilling potential at that spot. Each hole drilled costs a large sum of money. How best can the company estimate the drilling potential at other nearby spots without any further drilling? One way is to employ Gaussian Process Regression, where one places as Gaussian Process with a selected covariance kernel as a prior over the 2D space, adds the information gained by drilling at the few spots, and then computes a posterior distribution over the entire 2D field. This idea is termed kriging in geostatistics, after Daniel G. Krige, a South African mining engineer, who used similar techniques (weighted averaging, not GP regression) for measuring Gold grades in mines in South Africa, and whose work was taken up and formalized later by the French Mathematician Georges Matheron. 

The squared exponential Kernel is a popular choice for modeling covariance that depends only upon distance between two points and decays quickly. $$k(x_1, x_2) = \exp{\frac{(x_1-x_2)^2}{h^2}}$$

Spatio-temporal Kriging. The original problem above applies to a time varying spatial field. The idea of spatial kriging is naturally extensible to include the time dimension, if care is taken to ensure that the covariance kernel of the Gaussian Process degrades at an appropriate rate in the time dimension, which is chosen carefully, and separately from the the spatial dimension. One convenient alteration is to simply alter the bandwidth parameter of the temporal dimension. 

$$\exp{\frac{(x_1 - x_2)^2 + \alpha(t_1 - t_2)^2}{h^2}}$$ which of course turns out to be equivalent to simply using a product of two independent covariance kernels, one for space and one for time: $$\exp{\frac{(x_1 - x_2)^2}{h_x^2}} * \exp{\frac{(t_1 - t_2)^2}{h_t^2}}$$There is the question of whether such a kernel that is a product of two kernels, is guaranteed to product a positive-semidefinite matrix as the covariance metrix when arbitray values of \(x_1, x_2, t_1, t_2\) are plugged in. It is easy to show that the product of two valid kernel functions \(k = k_xk_t\) is also a valid positive-semidefinite kernel. This follows from observing that the covariance matrix for the kernel \(k\) is the Hadamard product (element-by-element product) of the covariance matrices resulting from the two component kernels. If \(K_x\) is the covariance matrix of a random vector \((x_1, x_2, \ldots, x_n)\) and \(K_t\) is the covariance matrix of a random vector \((t_1, t_2, \ldots, t_n)\), then the Hadamard product of \(K_x\) and \(K_t\) must be the covariance matrix of the random vector \((x_1t_1, \ldots, x_nt_n)\), and must therefore be positive semidefinite and hence \(k = k_x k_t\) must be a valid kernel.

In fact the product form of the kernel need not be limited to both kernels being of the squared exponential type. The temporal kernel can for instance be taken to be one where longer term correlations at the annual, monthly or weekly level can be exploited by the GP regression. This can be achieved via a sinusoidal temporal kernel or a product of a sinusoid with a slowly decaying exponential. Here is a cookbook of kernel functions from David Duvenaud, and there are several others in chapter 4 of the GPML book of Rasmussen & Williams.