NOTE: I’ve built an R package based on the code in this post. If you’re interested in solving systematic reserve design problems with Gurobi in R, take a look: https://github.com/mstrimas/protectr

Systematic conservation planning takes a rigorous, repeatable, and systematic approach to designing new protected areas that efficiently meet conservation objectives, while minimizing socioeconomic cost. This approach can be used to determine the optimal locations to invest limited conservation funds in the creation of new reserves. In a previous post, I provided an introduction to systematic conservation planning and discussed how Marxan solves the reserve design problem using a stochastic optimization heuristic known as simulated annealing.

Simulated annealing is widely used for solving optimization problems in conservation planning; however, a recent paper (Beyer et al. 2016) by the team behind Marxan describes an alternative approach that draws on the techniques of integer programming to solve reserve design problems. In particular, their method uses Gurobi, a powerful commercial optimization package, to determine the optimal locations for new reserves. The authors kindly included their code in the supplementary materials of their paper, and I’ve worked through this code as well as the Gurobi documentation. In this post I summarize what I’ve learned and describe how to solve Marxan-like reserve design problems with Gurobi in R.

Motivation

Simulated annealing, as implemented in Marxan, is currently the method of choice for solving systematic reserve design problems. So, why bother with integer programming at all? Simply put, integer programming has the potential to produce higher-quality solutions to reserve design problems more efficiently. I’ll provide a more in-depth comparison of the benefits and drawbacks of the two methods at the end of this post; feel free to skip ahead.

Required packages

library(magrittr) # provides the %$% operator used below
library(dplyr)
library(tidyr)
library(assertthat)
library(sp)
library(raster)
library(rgeos)
library(rasterVis)
library(viridis)
library(gstat)
library(marxan) # devtools::install_github('paleo13/marxan')
library(slam)
library(knitr)
set.seed(1)

Background

If you’re unfamiliar with systematic conservation planning in general, or Marxan in particular, it would be worth consulting my previous post on these topics before proceeding. I’ll briefly review the pertinent pieces of that post here.

Reserve design

A Marxan reserve design exercise starts by dividing the study region into planning units (typically square or hexagonal cells) and, for each planning unit, assigning values that quantify socioeconomic cost and conservation benefit for a set of conservation features. The cost can be the acquisition cost of the land, the cost of management, the opportunity cost of foregone commercial activities (e.g. from logging or agriculture), or simply the area. The conservation features are typically species (e.g. Clouded Leopard) or habitats (e.g. mangroves or cloud forest). The benefit that each feature derives from a planning unit can take a variety of forms, but is typically either occupancy (i.e. presence or absence) or area of occurrence within each planning unit. Finally, for each conservation feature, representation targets must be set, such as 20% of the current extent of cloud forest or 10,000 km\(^2\) of Clouded Leopard habitat.
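
To make this concrete, here’s a toy sketch of the kind of input table such an exercise assembles. The values and feature names are hypothetical, purely for illustration:

pu <- data.frame(
  id           = 1:3,
  cost         = c(12, 8, 15),      # e.g. acquisition cost
  cloud_forest = c(0.5, 0.0, 1.2),  # area of habitat (km2) in each unit
  leopard      = c(1, 1, 0)         # occupancy (presence/absence)
)
# convert percent-based targets to absolute representation targets
targets <- c(cloud_forest = 0.2 * sum(pu$cloud_forest),
             leopard      = 0.5 * sum(pu$leopard))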

The goal of the reserve design exercise is then to find the set of planning units that meets the representation targets while minimizing cost. In addition, reserve planning tools may attempt to minimize fragmentation, or maximize connectivity, to yield spatial configurations of reserves that are conducive to the long-term persistence of conservation features. Marxan accomplishes this by favouring reserves with shorter perimeter length.

Mathematical formulation

Formulating the conservation objective mathematically as an optimization problem allows us to tap into the rich body of knowledge that exists in the field of mathematical optimization. In general, the goal of an optimization problem is to minimize an objective function over a set of decision variables, subject to a series of constraints. The decision variables are what we control, while the constraints can be thought of as rules that need to be followed. In the particular case of Marxan, the reserve design problem is formulated, for \( n \) planning units and \( m \) conservation features, as:

$$
\text{Minimize} \quad \sum_{i=1}^{n} x_i c_i + b \sum_{i=1}^{n} \sum_{j=1}^{n} x_i \nu_{ij} - b \sum_{i=1}^{n} \sum_{j \neq i} x_i x_j \nu_{ij}
$$

$$
\text{subject to} \quad \sum_{i=1}^{n} x_i r_{ij} \geq T_j \quad \forall \; j \in \{1, \dots, m\}, \qquad x_i \in \{0, 1\}
$$

where \( x_i \) is a binary decision variable specifying whether planning unit \( i \) has been selected (1) or not (0), \( c_i \) is the cost of planning unit \( i \), \( r_{ij} \) is the representation level of feature \( j \) in planning unit \( i \), and \( T_j \) is the target for feature \( j \). \( \nu_{ij} \) is a matrix whose off-diagonal components are the lengths of the shared boundaries between planning units and whose diagonal components are the external boundaries of planning units (i.e. those that are not shared with another planning unit). Finally, \( b \) is known as the Boundary Length Modifier (BLM) and determines how much emphasis is placed on producing compact solutions relative to meeting targets and minimizing cost.

In the above equation, the first three terms comprise the objective function. The first term is just the total cost of a given candidate reserve. The second and third terms together give the total perimeter length of the reserve network: the second sums the full boundaries of all selected planning units, while the third subtracts boundaries shared between pairs of selected units. Note that in all the Marxan documentation I’ve seen the third term is absent; however, it is included in the objective function that Marxan actually uses. The goal is to minimize the objective function, and the final piece of the equation states that we want to do this subject to the constraint that targets are met for all conservation features.

Mathematical optimization

Mathematical optimization is the field that deals with solving optimization problems, including problems of the type posed above. Integer programming (IP) problems comprise the subset of optimization problems in which the decision variables are restricted to be integers. Reserve design problems fall into this category because the decision variables are binary, corresponding to whether or not each planning unit is selected. Two common sub-classes of IP problems are particularly relevant to reserve design:

  • Integer linear programming (ILP): problems in which the objective function and constraints are linear functions of the decision variables. The Marxan reserve design problem fits into this framework if the boundary length term is dropped.
  • Integer quadratic programming (IQP): problems in which the objective function is quadratic in the decision variables. These problems can be linearized through a change of variables (see the sketch below) and are therefore reducible to ILP problems. The full Marxan reserve design problem specified above fits into this category.
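
To see why the reduction works, here is the standard linearization trick (a general result, not specific to reserve design): each quadratic term \( x_i x_j \) is replaced by an auxiliary binary variable \( z_{ij} \), with linear constraints that force \( z_{ij} = x_i x_j \) whenever the \( x \) variables are binary:

$$
z_{ij} \leq x_i, \qquad z_{ij} \leq x_j, \qquad z_{ij} \geq x_i + x_j - 1, \qquad z_{ij} \in \{0, 1\}
$$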

In general, the phase space (the set of all possible decision variable combinations) of these problems is so large that an exhaustive search of all possible sets of decision variables is impossible. For example, in a simple reserve design problem with 100 planning units, there are \( 2^{100} \sim 10^{30} \) possible configurations in the phase space. Even if evaluating the objective function for each set of decision variables only took a nanosecond, it would still take nearly 3,000 times the age of the universe to do so for all possibilities!
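
A quick back-of-the-envelope check in R, assuming one evaluation per nanosecond and an age of the universe of about 13.8 billion years:

seconds <- 2^100 * 1e-9                    # one configuration per nanosecond
years <- seconds / (60 * 60 * 24 * 365.25)
years / 13.8e9                             # ~2,900 ages of the universe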

Fortunately a wide variety of approaches have been developed for solving optimization problems. Heuristic techniques can be used when finding the exact optimal solution is unnecessary or impossible. This is often the case for reserve design problems, which are frequently solved using simulated annealing, a stochastic heuristic for approximating global optima of complex functions. This method is conceptually simple and can be applied to a wide variety of optimization problems; however, it won’t, in general, find the true optimal solution. More importantly, it is impossible to quantify how close the resulting solution is to the optimal solution.

ILP and IQP problems can also be solved using algorithms that are guaranteed to find optimal solutions. If finding the optimal solution is too computationally costly, these algorithms can also find solutions that are within a specified distance from the optimum (e.g. a 0.1% gap to the optimum). This ability to quantify the quality of solutions (i.e. gap to optimality) is a major advantage compared to heuristic techniques. Despite this, simulated annealing, particularly as implemented by Marxan, continues to dominate systematic conservation planning exercises. However, a recent paper by Beyer et al. (2016) suggests that a change may be coming. When applied to reserve design problems, they found that ILP algorithms were vastly superior to simulated annealing in terms of both processing time and solution quality.

In the remainder of this post, I will set up an example reserve design problem and demonstrate how to solve it using integer programming techniques. This method is inspired by the R code in the supplementary material to Beyer et al. (2016); however, an important difference is that in that paper they linearized the Marxan objective function and used ILP, while I show how to solve the quadratic objective function directly with IQP.

Example data

First I create some simple example data: 9 species distributions and a cost layer, defined on a 44 x 44 grid of 1 km\(^2\) planning units. I’ve chosen a 44 x 44 grid because it’s the largest problem size that can be solved using the trial version of Gurobi; more on this later.

Conservation features

I create 9 raster layers that I’ll use to represent species distributions. To work with data that are at least somewhat realistic, I introduce spatial structure by generating spatially autocorrelated Gaussian random fields. I also assign three different levels of rarity and three different characteristic scales to the distributions.

# raster template
utm10 <- crs('+proj=utm +zone=10 +ellps=GRS80 +datum=NAD83 +units=km +no_defs')
r <- extent(c(0, 44, 0, 44)) %>% 
    raster(nrows = 44, ncols = 44, crs = utm10, vals = 1)

gaussian_field <- function(r, range, n = 1,
                           mean = 0, variance = 1, nugget = 0, 
                           coef = c(0, 0), pct = NULL) {
  # assertions
  assert_that(inherits(r, "RasterLayer"),
              is.count(n),
              is.number(mean),
              is.number(variance), variance > 0, variance > nugget,
              is.number(nugget), nugget >= 0,
              is.numeric(coef), length(coef) == 2,
              is.null(pct) || (pct >= 0 & pct <= 1))
  
  beta <- c(mean, coef)
  psill <- variance - nugget
  # define spatial variogram model 
  gsim <- gstat(formula = (z ~ x + y), dummy = TRUE, beta = beta, nmax = 20,
                model = vgm(psill = psill, range = range, nugget = nugget,
                            model = 'Exp'))
  vals <- rasterToPoints(r, spatial = TRUE) %>% 
    geometry %>% 
    predict(gsim, newdata = ., nsim = n) %>%
    {.@data}
  # reclassify to binary, pct defines proportion of 1s
  if (!is.null(pct)) {
    vals <- mutate_each(vals, funs(as.integer(. < quantile(., pct))))
  }
  if (n == 1) {
    r[] <- vals[, 1]
    return(r)
  } else {
    s <- list()
    for (i in 1:n) {
      r_tmp <- r
      r_tmp[] <- vals[, i]
      s[[i]] <- r_tmp
    }
    return(stack(s))
  }
}
# conservation features
species <- mapply(function(x, y, r) gaussian_field(r = r, range = x, pct = y),
                  rep(c(5, 15, 25), each = 3),
                  rep(c(0.1, 0.25, 0.5), times = 3),
                  MoreArgs = list(r = r)) %>% 
  stack %>% 
  setNames(letters[1:nlayers(.)])
levelplot(species, main = 'Feature Distribution', layout = c(3, 3),
          scales = list(draw = FALSE),
          col.regions = c("grey20", "#fd9900"), colorkey = FALSE)

*Figure: feature distribution maps for the nine simulated species.*

Cost

I use a similar approach to generate a spatially auto-correlated cost layer.

cost <- gaussian_field(r, 20, mean = 1000, variance = 500) %>% 
  setNames("cost")
levelplot(cost, main = "Cost", margin = FALSE, col.regions = viridis)

*Figure: the spatially auto-correlated cost layer.*

Boundaries

Next, I define a function to calculate the matrix of boundary lengths, \( \nu_{ij} \). This will be a large matrix with many zeros since most planning units do not share boundaries. So, to save memory, I store the boundary matrix as a sparse matrix using the slam package. Normal matrix objects store all components explicitly, even zeros, while slam sparse matrices save memory by only storing non-zero components explicitly.
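
As a minimal illustration of the triplet representation slam uses (toy values, already loaded above): only the non-zero entries are stored, as (row, column, value) triplets.

m <- simple_triplet_matrix(i = c(1, 2), j = c(2, 1), v = c(1, 1),
                           nrow = 3, ncol = 3)
as.matrix(m)  # dense version: a 3 x 3 matrix with only two non-zero entries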

calculate_boundary <- function(r, sep_diagonal = FALSE) {
  # shared boundaries
  shared <- adjacent(r, 1:ncell(r)) %>% 
    data.frame(boundary = mean(res(r))) %>% 
    mutate(id1 = as.integer(from), id2 = as.integer(to)) %>% 
    dplyr::select(id1, id2, boundary)
  # external boundary for each cell is total boundary length for cell
  # minus sum of shared boundaries
  external <- data.frame(id1 = 1:ncell(r), id2 = 1:ncell(r), 
                         external_boundary = 2 * sum(res(r)))
  external <- group_by(shared, id1) %>% 
    summarize(shared = sum(boundary)) %>% 
    right_join(external, by = "id1") %>% 
    mutate(boundary = (external_boundary - shared)) %>% 
    dplyr::select(id1, id2, boundary)
  # return diagonal and off-diagonal components separately
  if (sep_diagonal) {
    off_diagonal <- shared %$% 
      simple_triplet_matrix(id1, id2, boundary)
    diagonal <- arrange(external, id1) %>% 
      {.$boundary}
    return(list(off_diagonal = off_diagonal, diagonal = diagonal))
  }
  # combine shared and external boundaries into a sparse matrix
  boundaries <- rbind(shared, external) %$% 
    simple_triplet_matrix(id1, id2, boundary)
  return(boundaries)
}
boundary_sparse <- calculate_boundary(r)
object.size(boundary_sparse) %>% print(units = "Mb")
#> 0.1 Mb
object.size(as.matrix(boundary_sparse)) %>% print(units = "Mb")
#> 28.6 Mb

Target setting

Prior to a Marxan reserve design exercise, representation targets for each conservation feature must be set. This may occur through expert elicitation, a modeling exercise to determine the amount of habitat necessary to ensure persistence, or more informal methods. Regardless, it’s very common to set percent-based targets, such as protecting 20% of the range of a given species or habitat type. I define a function that converts percent targets to absolute representation targets.

set_targets <- function(x, features, type = c("percent", "absolute")) {
  type <- match.arg(type)
  assert_that(inherits(features, "RasterStack"),
              is.numeric(x),
              length(x) == 1 || length(x) == nlayers(features))
  
  # total representation level
  total <- unname(cellStats(features, "sum"))
  if (type == "percent") {
    assert_that(all(x >= 0), all(x <= 1))
    return(x * total)
  } else {
    assert_that(length(x) == nlayers(features), all(x >= 0), all(x <= total))
    return(x)
  }
}
set_targets(0.3, species)
#> [1]  58.2 145.2 290.4  58.2 145.2 290.4  58.2 145.2 290.4

Gurobi

Gurobi is a powerful commercial optimization package that implements a variety of state-of-the-art algorithms for solving optimization problems. Although the software requires a license, the folks at Gurobi provide free licenses to academic users. Gurobi has also created interfaces to its optimization engine for a variety of programming languages, including R. Beyer et al. (2016) used Gurobi in their paper and found it to be superior to open source optimization software in terms of both speed and user-friendliness.

If you’re not eligible for an academic license, don’t worry: Gurobi also offers a trial version that can solve problems of up to 2,000 decision variables. In this post I’ve intentionally set up a problem with 44 x 44 = 1,936 decision variables (i.e. planning units) so that the code can be run with the trial version.

Installation

Before we get started you’ll need to install Gurobi. First, download the Gurobi Optimizer. Then request an academic license or a trial license. At this point you’ll get an email with instructions for downloading and activating the license.

Next, install the gurobi R package. The package isn’t available on CRAN; rather, it’s included in the program files you downloaded for the Gurobi Optimizer. So, you’ll need to follow the instructions on the Gurobi website for installing it.
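
On most systems this amounts to installing the package tarball shipped inside the Gurobi installation directory; the path below is purely illustrative and depends on your OS and Gurobi version:

# substitute the path to your own Gurobi installation
install.packages("/opt/gurobi650/linux64/R/gurobi_6.5-0.tar.gz", repos = NULL)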

Now you can load the package with

library(gurobi)

Gurobi interface

The Gurobi Optimizer can solve optimization problems of the following general form

$$
\begin{aligned}
&\text{Minimize} \quad && \boldsymbol{c}' \boldsymbol{x} + \boldsymbol{x}' Q \boldsymbol{x} \\
&\text{subject to} \quad && A \boldsymbol{x} \;\Box\; \boldsymbol{b}
\end{aligned}
$$

where \( \boldsymbol{x} \) is a vector of decision variables, \( \boldsymbol{c} \) and \( \boldsymbol{b} \) are vectors of known coefficients, and \( A \) and \( Q \) are matrices of known coefficients. The final line specifies a series of structural constraints, and the \( \Box \) symbol indicates that the relational operator for each constraint can be either \( \ge \), \( = \), or \( \le \). In addition to these structural constraints, Gurobi accepts constraints on the type of decision variable. In particular, decision variables can be constrained to be continuous, integer, or binary.

If all decision variables are set to integer or binary, we have an IQP problem. And, if in addition, the term with the \( Q \) matrix is dropped, this reduces to an ILP problem.

The gurobi(model, params) function in the gurobi R package interfaces with the Gurobi Optimizer. The first argument to this function is an optimization model object, which is a list with named components that specify the optimization problem to be solved. The model components correspond to different elements of the problem statement above, and the ones that are of relevance to the reserve design problem at hand are:

  • model$obj: the objective function vector \( \boldsymbol{c} \)
  • model$Q: the quadratic objective function matrix \( Q \)
  • model$rhs: the vector on the right hand side of the constraint statement \( \boldsymbol{b} \)
  • model$A: the constraint matrix \( A \)
  • model$sense: the type of constraint, i.e. a vector of the same length as model$rhs where each element is ">=", "=", or "<=".
  • model$vtype: a vector of variable types for decision variables, where "B", "I", and "C" correspond to binary, integer, and continuous respectively

The optional second argument, params, is a named list of components specifying Gurobi parameters. The full list of possible parameters is available in the Gurobi documentation; however, a few of potential relevance are listed here, followed by a toy example of the interface:

  • params$MIPGap: the relative optimality gap at which to terminate Gurobi. For example, if a value of 0.01 is set then Gurobi will terminate when it finds a solution that is within 1% of optimality. This parameter can be used when finding the exact optimal solution would require a prohibitive amount of time.
  • params$TimeLimit: the maximum time in seconds to run Gurobi. Again, this parameter can be used if finding the exact optimal solution would require a prohibitive amount of time.
  • params$Presolve: an integer from -1 to 2 that controls the “presolve level”, i.e. how aggressively Gurobi pre-processes the model to simplify it before the main solve begins. Beyer et al. (2016) use a presolve level of 2, corresponding to “aggressive”, and I’ll follow their lead here.
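
To make the interface concrete, here’s a minimal toy problem with made-up numbers, unrelated to reserve design: choose items to minimize total cost, subject to a single constraint that total benefit is at least 5.

model <- list(
  modelsense = "min",
  obj        = c(3, 5, 4),                   # cost of each item
  A          = matrix(c(2, 4, 3), nrow = 1), # benefit of each item
  sense      = ">=",
  rhs        = 5,                            # total benefit must be >= 5
  vtype      = "B"                           # binary decision variables
)
result <- gurobi(model, params = list(MIPGap = 0.01, Presolve = 2))
result$x
#> [1] 1 0 1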

Reformulating the reserve design problem

The final step before we solve the reserve design problem using Gurobi is to manipulate the problem to conform to the standard format that Gurobi accepts. Recall that the reserve design problem is often formulated mathematically as

$$
\text{Minimize} \quad \sum_{i=1}^{n} x_i c_i + b \sum_{i=1}^{n} \sum_{j=1}^{n} x_i \nu_{ij} - b \sum_{i=1}^{n} \sum_{j \neq i} x_i x_j \nu_{ij} \qquad \text{subject to} \quad \sum_{i=1}^{n} x_i r_{ij} \geq T_j \quad \forall \; j
$$

Beyer et al. (2016) perform a change of variables to linearize the objective function, then specify the problem as an ILP problem. However, since Gurobi can also solve quadratic optimization problems directly, I skip the linearization step and specify the problem as an IQP problem. To accomplish this, the boundary length term with the double summation can be expanded and all the linear terms in the objective function can be grouped to get

$$
\text{Minimize} \quad \sum_{i=1}^{n} x_i \left( c_i + b \sum_{j=1}^{n} \nu_{ij} \right) - b \sum_{i=1}^{n} \sum_{j \neq i} x_i x_j \nu_{ij} \qquad \text{subject to} \quad \sum_{i=1}^{n} x_i r_{ij} \geq T_j \quad \forall \; j
$$

In this form, the various components of model are now clear:

  • model$obj: the vector of linear coefficients, \( c_i + b \sum_{j=1}^{n} \nu_{ij} \)
  • model$Q: the quadratic matrix \( -b \nu_{ij} \) (off-diagonal components only)
  • model$A: the transpose of the representation matrix \( r_{ij} \), with one row per conservation feature
  • model$rhs: the vector of targets \( T_j \)
  • model$sense: ">=" for every constraint

Implementation

With all the background and theory out of the way, it’s time to create a function that solves reserve design problems with Gurobi. This function takes as input a RasterStack of feature distributions, a cost RasterLayer, relative or absolute representation targets, and the BLM. It constructs the model object for an IQP (or ILP if BLM = 0) problem and solves it using gurobi().

As touched on earlier, the boundary length matrix \( \nu_{ij} \) contains two types of boundaries. The off-diagonal components correspond to shared boundaries between planning units and are consistently included in the Marxan objective function. The diagonal components correspond to boundaries on the edge of the study area (i.e. not shared with another planning unit), and they are almost never accounted for in the objective function that appears in Marxan documentation. However, in real Marxan exercises they may be included, excluded, or included with a reduced magnitude. This last option, scaling edge boundaries by a factor, is sometimes used because complicated coastlines can create extremely long boundaries, which heavily bias against selecting planning units on the edge of the study area. To account for all these possibilities, I include an edge_factor argument that can scale or remove these edge boundaries.

For this example, I arbitrarily use 30% representation targets for all conservation features and a BLM of 150.

gurobi_solve <- function(features, cost, targets, blm, edge_factor = 1,
                         target_type = c("percent", "absolute"),
                         gap = 0.005, time_limit = Inf) {
  # check inputs
  assert_that(inherits(features, "RasterStack"),
              inherits(cost, "RasterLayer"),
              compareRaster(cost, features),
              is.number(blm), blm >= 0,
              is.number(edge_factor),
              is.number(gap), gap >= 0)
  
  # set proportional targets or check absolute targets
  target_type <- match.arg(target_type)
  targets <- set_targets(targets, features, type = target_type)
  # linear component of objective function
  obj <- cost[]
  if (blm > 0) {
    # calculate boundaries
    bound <- calculate_boundary(cost, sep_diagonal = TRUE)
    obj <- obj + blm * row_sums(bound$off_diagonal)
    # scale external (edge) boundaries; dropped entirely if edge_factor = 0
    if (edge_factor != 0) {
      obj <- obj + blm * edge_factor * bound$diagonal
    } 
  }
  
  # construct model
  model <- list()
  # goal is to minimize objective function
  model$modelsense <- "min"
  # binary decision variables
  model$vtype <- "B"
  # objective function
  model$obj <- obj
  if (blm > 0) {
    model$Q <- -blm * bound$off_diagonal
    rm(bound)
  }
  # structural constraints
  model$A <- as.simple_triplet_matrix(t(features[]))
  model$rhs <- targets
  model$sense <- rep('>=', length(targets))

  # set the parameters that control the algorithm
  # MIPGap controls the how close the returned solution is to optimality
  params <- list(Presolve = 2)
  if (is.finite(time_limit)) {
    params$TimeLimit = time_limit
  } else {
    params$MIPGap = gap
  }
  
  # gurobi is very RAM hungry, remove any un-needed objects
  rm(features, cost, targets, obj)
  
  # solve
  gurobi(model, params)
}
blm <- 150
target <- 0.3
results <- gurobi_solve(species, cost, targets = target, blm = blm)

Results object

The gurobi() function returns its results as a named list with several components, the most important of which are:

  • results$x: a vector of decision variables for the best solution found. This will either be the true optimal solution or a solution within a given gap of the true optimum.
  • results$objval: the value of the objective function for the returned solution.
  • results$objbound: a lower bound for the true optimum of the objective function.

In this case, the objective function value for the solution is:

results$objval
#> [1] 418593.8

I used a gap to optimality of 0.5%, which can be confirmed with:

100 * (results$objval / results$objbound - 1)
#> [1] 0.1492837

So this solution is actually even closer to optimality than specified. We can also visualize the resulting reserve network:

r_solution <- r
r_solution[] <- results$x
# make this a categorical raster
r_solution <- ratify(r_solution)
rat <- levels(r_solution)[[1]]
rat$status <- c("Not Selected", "Selected")
levels(r_solution) <- rat
levelplot(r_solution, main = "Gurobi Solution: 0.5% gap to optimality",
          scales = list(draw = FALSE),
          col.regions = c("grey40", "#4DAF4A"),
          colorkey = list(space = "bottom", height = 1))

*Figure: the reserve network selected by Gurobi at a 0.5% gap to optimality.*

Checking the results

One quick test to make sure everything is working is to manually evaluate the objective function for the returned solution and check that it agrees with the value stored in results$objval. The objective function is just the total cost plus the BLM times the total perimeter length of the reserve network.

The perimeter can be found by converting the selected planning units from raster to polygon format, dissolving the internal boundaries between adjacent planning units, then calculating the boundary length with gLength from the rgeos package.

solution_cost <- cellStats(cost * r_solution, "sum")
perimeter <- rasterToPolygons(r_solution, function(x) {x == 1},
                              dissolve = TRUE) %>% 
  gLength
solution_cost + blm * perimeter
#> [1] 418593.8
results$objval
#> [1] 418593.8

Spot on! Next, I’ll also check that all the representation targets were met.

representation <- cellStats(species * r_solution, "sum") %>% 
  setNames(names(species))
targets <- target * cellStats(species, "sum")
(representation - targets)
#>   a   b   c   d   e   f   g   h   i 
#> 0.8 0.8 0.6 7.8 0.8 0.6 0.8 0.8 0.6

Again, everything looks good. For all 9 species, the level of representation is just above the target level.

Performance: Marxan vs. Gurobi

How does this compare to solving the same problem using simulated annealing with Marxan? To investigate the relative performance of the two methods, I’ll set up a slightly larger reserve design problem with 9 conservation features on a 100 x 100 grid (10,000 planning units). I solve the problem with Marxan via the marxan R package, using 10 simulated annealing runs of 10,000,000 iterations each. Then, I solve it with Gurobi using a 0.5% gap to optimality. Due to the larger number of planning units, the following code can’t be run using the trial version of Gurobi.

# features
r <- extent(c(0, 100, 0, 100)) %>% 
  raster(nrows = 100, ncols = 100, crs = utm10, vals = 1)
species <- mapply(function(x, y, r) gaussian_field(r = r, range = x, pct = y),
                  rep(c(5, 15, 25), each = 3),
                  rep(c(0.1, 0.25, 0.5), times = 3),
                  MoreArgs = list(r = r)) %>% 
  stack %>% 
  setNames(letters[1:nlayers(.)])
# cost
cost <- gaussian_field(r, 20, mean = 1000, variance = 500) %>% 
  setNames("cost")

# marxan
pu_spdf <- rasterToPolygons(cost)
pu_spdf$id <- seq_len(length(pu_spdf))
pu_spdf$status <- 0L
md <- format.MarxanData(pu_spdf, species, targets = "30%", spf = 1)
mo <- MarxanOpts(BLM = 150, NUMREPS = 10L, NUMITNS = 10000000L)
mu <- MarxanUnsolved(mo, md)
rm(md, mo, pu_spdf)
marxan_time <- system.time({resm <- solve(mu)})

# gurobi
gurobi_time <- system.time({
  resg <- gurobi_solve(species, cost, targets = target, 
                       blm = blm)
})

Now I summarize the results:

gaps <- c(min(resm@results@summary$Score) / resg$objbound - 1,
          resg$objval / resg$objbound - 1)
data.frame(method = c("Marxan", "Gurobi"),
           gap = 100 * gaps,
           time = c(marxan_time["elapsed"], gurobi_time["elapsed"])) %>% 
  kable(digits = 1,
        col.names = c("Method", "Gap to Optimality (%)", "Time (s)"))
| Method | Gap to Optimality (%) | Time (s) |
|--------|----------------------:|---------:|
| Marxan | 12.0                  | 42.4     |
| Gurobi | 0.1                   | 25.5     |

So, Gurobi produces a higher quality solution (i.e. closer to the optimum) and does so more efficiently. Finally, I’ll map the reserve networks resulting from these two methods. Since the Marxan run involved 10 simulated annealing replicates, each giving a different solution, I map the selection frequency, i.e. the proportion of times each planning unit was selected.

r_solutions <- stack(r, r) %>% 
  setNames(c("Marxan", "Gurobi"))
r_solutions[["Marxan"]][] <- colMeans(resm@results@selections)
r_solutions[["Gurobi"]][] <- resg$x
levelplot(r_solutions, main = "Marxan vs. Gurobi: Selection Frequency",
          scales = list(draw = FALSE), 
          col.regions = viridis(100),
          at = seq(0, 1, length.out = 100))

*Figure: selection frequency for the Marxan and Gurobi solutions.*

So, the two methods produce qualitatively similar solutions. The comparison I’ve given here is quite simplistic; however, Beyer and colleagues did a much more thorough comparison between the two methods and found similar results: over a wide range of problem sizes, IP produced higher quality solutions with less processing time.

Conclusions

After experimenting with Gurobi and integer programming recently, I see several reasons why it’s an exciting tool for conservation planners:

  • Exact optimization: First and foremost, IP algorithms are exact, i.e. they will find the true optimum given enough time. Furthermore, if finding the optimal solution is too computationally costly, these algorithms can also find solutions that are within a specified distance from the optimum (e.g. a 0.1% gap to the optimum). In contrast, simulated annealing is a heuristic method that in general won’t find the exact solution to an optimization problem and provides no measure of how close the returned solution is to the optimum.
  • Efficiency: As demonstrated by my simple performance comparison and the more thorough comparison in Beyer et al. (2016), Gurobi is able to find solutions closer to optimality more efficiently.
  • Fewer parameters: Marxan requires additional parameters that aren’t needed by Gurobi, some to define the optimization problem and others to specify the simulated annealing run. More free parameters mean more moving parts that need to be understood and calibrated.
    • Species Penalty Factor: For a variety of reasons, simulated annealing works best when constraints aren’t explicitly enforced, but instead built directly into the objective function. So, Marxan minimizes a modified objective function that includes an additional term that applies a shortfall penalty for not meeting targets. This requires that the user set a Species Penalty Factor (SPF) for each feature, which defines the relative importance of meeting the target for each feature.
    • Base Penalty: For a conservation feature that doesn’t have its target met, the shortfall penalty measures the cost associated with raising the representation up to the target level. To convert the proportional target shortfall into a penalty in units of cost, Marxan calculates a sensible base penalty corresponding to the cost of taking a feature from zero representation to the target level. I go into detail about how this is calculated in my previous post on Marxan. Fortunately, Marxan calculates the base penalties internally, so the user never has to worry about them, but they are another set of required parameters.
    • Number of Iterations: Simulated annealing is an iterative process, and the larger the number of iterations the closer the resulting solutions will be to the true optimum. The number of annealing iterations is set by the user based on the desired trade-off between solution quality and computation time. In this sense it has a similar function to the optimality gap in Gurobi, however, the optimality gap is a much more direct measure: it specifies exactly how close to the optimum the resulting solution will be, while number of annealing iterations provide no absolute bound on solution quality.
    • Annealing Schedule: Finally, simulated annealing is based on a temperature parameter, which decreases as the heuristic progresses. The rate at which the temperature decreases, called the annealing schedule, has a significant impact on the likelihood of getting trapped in a local optimum. In the Marxan implementation of simulated annealing, the annealing schedule is defined by two parameters: the initial temperature and a cooling factor. These parameters can be set by the user or automatically by Marxan, using an adaptive method that chooses suitable values based on the specific objective function. A rough sketch of how such a schedule behaves follows this list.
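
This sketch uses illustrative values, not Marxan’s actual defaults: a move that worsens the objective by \( \Delta \) is accepted with probability \( e^{-\Delta/T} \), so as the temperature \( T \) shrinks, worse moves become increasingly unlikely.

temp <- 100     # initial temperature
cooling <- 0.5  # cooling factor (illustrative)
delta <- 10     # hypothetical increase in the objective function
for (i in 1:5) {
  p_accept <- exp(-delta / temp)  # probability of accepting the worse move
  cat(sprintf("step %d: T = %6.2f, P(accept) = %.3f\n", i, temp, p_accept))
  temp <- temp * cooling
}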

Of course, the catch is that Gurobi is commercial software and, if you’re not eligible for an academic license, the cost of a full license is several thousand dollars, likely prohibitive for many conservation projects. Gurobi does offer a cloud-based solution at more reasonable rates. Alternatively, open source IP solvers exist, though my understanding is that most are significantly slower and less user friendly than Gurobi.

I’m far from an expert on either of these approaches, so there may be further benefits and drawbacks. However, based on these points, it seems clear to me that integer programming, and Gurobi in particular, have the potential to be extremely valuable tools for systematic conservation planning.

Footnotes

1 Beyer HL, Dujardin Y, Watts ME, Possingham HP. 2016. Solving conservation planning problems with integer linear programming. Ecological Modelling 328: 14–22.