A method for finding a maximum value region with a minimum width in raster space

Abstract Given a grid of cells, each of which is assigned a numerical value quantifying its suitability for a certain use, one problem in geographic information science concerns the selection of a region, i.e. a connected set of cells, with a specified size that maximizes the sum of all their values. This task can be cast as a combinatorial optimization problem called the maximum value region problem, and exact and heuristic methods exist for its solution. While those solutions are guaranteed to be feasible (if not optimal), they may not be desirable for practical use if they contain too narrow segments (down to the width of a single cell). In this paper, we present a new variation of the maximum value region problem—the maximum value wide region problem—that requires a region to be at least as wide as a specified width. We offer a heuristic method for its solution which models a region as a set of neighborhoods and test its performance through computational experiments. Results demonstrate that the method generates good feasible solutions in terms of connectedness, size, width, and value, but requires more computing time than methods for maximum value regions without minimum width requirements.


Introduction
One of the tasks frequently done in geographic information science is to evaluate land and select a site for a certain use. It relates to a variety of areas of planning including land use planning (Liu et al. 2012, Gilbert et al. 1985, Yao et al. 2018, Tomlin and Johnston 1990, Aerts et al. 2003, Ligmann-Zielinska et al. 2008), conservation planning (Billionnet 2013, Fuller et al. 2006, Önal et al. 2016, Wang and Önal 2016), and regional planning (Minor and Jacobs 1994, Baerwald 1981).
Land evaluation often involves more than one criterion (e.g. slope, land cover type, and proximity to targets) and can be considered as a multi-criteria decision making (MCDM) problem (Carver 1991, Chuvieco 1993, Jankowski 1995). Its solution may require aggregation of multiple criteria through numerical scoring of each piece of land in terms of each criterion, weighting of each criterion according to its relative importance, and summation of weighted scores (Pereira and Duckstein 1993, Eastman et al. 1998, Jankowski 1995). A typical outcome of this process is a 'suitability' map, which estimates how suitable each piece of land is for the intended use. In a raster space, the map takes the form of a grid of cells, each having a single value representing a suitability score.
A suitability map is useful when one attempts to identify one or more candidate sites and compare them (see, e.g. Jankowski and Richard (1994), Malczewski (2004), and Church and Murray (2009) for reviews). In raster space, a site takes the form of a connected set of cells, commonly referred to as a region (Cova and Church 2000, Brookes 1997, Shirabe 2011). If a region is sought that maximizes the sum of the values of its cells without exceeding a certain number of cells, we have the 'maximum value region problem' (Shirabe 2011). If a region is sought that minimizes the sum of the values of its cells while being at least as large as a certain number of cells, it is in fact the inverse of the maximum value region problem and can be solved in the same manner. Williams (2002) and Shirabe (2005) formulated these problems as integer programming models. However, these models did not offer the possibility of finding solutions in reasonable time.
A number of heuristic methods have instead been proposed to find a good feasible solution to the maximum value region problem with a reasonable computation time. They include 'region growing' (Brookes 1997), 'patch growing' (Church et al. 2003), 'simulated annealing' (Aerts and Heuvelink 2002), and 'dynamic programming' (Shirabe 2011) methods. The output of these methods might look like the one in Figure 1(a), which is indeed a connected set of cells, but barely so, through strings of cells. Imagine that the resolution of the raster suitability map is high, say, less than 1 meter. Then, the output region may not be useful in some contexts (e.g. wildlife conservation) because it is too narrow to accommodate actual users of it (e.g. migrating animals). For example, Beier (2019) suggests that corridors connecting conservation patches be no smaller than 2 km wide to provide proper protection for animals crossing between patches. It is reasonable to believe, then, that maintaining this minimum width within regions is also necessary. In other cases, a protective or transitional buffer is considered in site selection for wildlife and wilderness sanctuaries (Eigenbrod et al. 2009, Hull et al. 2011), implying that any location in the region should not measure less than two times the width of the buffer imposed from edge to edge.
Alternatively, one might consider resampling a suitability map down to a resolution equal to a width required by users of the region and applying to it any of the existing region selection techniques mentioned above. This approach is straightforward, as illustrated in Figure 1(b), but the resulting region not only fails to satisfy the minimum width requirement where two resampled cells meet at a cell vertex but is also forced into a very specific shape, in the case of Figure 1(b) a distinctively angular form, which may overlook potentially good solutions of other forms. While the former issue can be avoided by employing the 4-adjacency assumption (under which two cells are considered adjacent if they share a cell edge, not just a cell vertex), the latter always remains.
A naïve yet effective approach to enforcing the minimum width requirement (and avoiding an angular form) is to limit feasible regions to ones having a preset form. For example, Figure 1(c) shows a maximum value region of a circular form. It certainly meets the minimum width requirement, but a question may arise why it needs to be circular (or any particular form).
If the form of a region cannot be fixed that rigidly, another approach is to limit the search scope to those that take the form of a 'wide path' (Gonçalves 2010, Shirabe 2016), that is, a swath of cells with a fixed width extending between two locations. There exist methods for finding a least-cost wide path on a raster cost map. So, if the same suitability map is inverted to a cost map by regarding lower suitability as higher cost, a least-cost wide path can be found on the cost map and seen as a maximum value region that satisfies the minimum width requirement on the suitability map. As seen in Figure 1(d), however, the region generated in this way inevitably has a linear form oriented along a one-dimensional path between two locations which are designated as its source and destination.
Figure 1. Alternative heuristic solutions to the maximum value region problem: (a) one found on the original suitability map, (b) one found on a resampled suitability map, (c) one of a preset (circular) form found on the original suitability map, and (d) one that takes the form of a least-cost wide path found on an inverted suitability map. Note that darker shades represent higher suitability (or lower cost) scores, and that the darkest shades represent regions.
The objective of this paper is to introduce a new variant of the maximum value region problem that explicitly includes a minimum width requirement and to offer a solution method for it. The solution should avoid narrow stretches like those in Figure 1(a) or (b) and should not have a preset form or orientation like those in Figure 1(c) and (d). Such a region might rather look like the one presented in Figure 2.
The rest of the paper is structured as follows. Section 2 reviews two concepts from the literature that contribute to the design of our model and solution method, as presented in Section 3. Section 4 describes computational experiments and reports their results. Section 5 discusses the experimental findings and their implications, and Section 6 concludes the paper.

Preliminaries
This section briefly reviews a model (of a wide path) and a solution method (for the maximum value region problem) on which our approach to the maximum value region problem with a minimum width requirement is based.

Model of a wide path
In his attempt to solve the least-cost wide path problem, Shirabe (2016) introduced the concept of an 'octagonal neighborhood,' which is a non-empty set of cells arranged in a form called '(w,d)-form.' It is a w-by-w block of cells, where w is a desired width measured in number of cells, from which d rows are diagonally removed from each corner (see Figure 3 for an example). Given the nature of rectangular cells, the lateral widths of a (w,d)-form are consistently equal to w, whereas the diagonal widths alternate between two values around w. To minimize the differences in the lateral and diagonal widths of the form and make the form as regular as possible, d is set to the largest integer that is not greater than $\frac{2-\sqrt{2}}{2}w$. A wide path of a width of w is then modeled as a sequence of adjacent neighborhoods of a (w,d)-form, and two neighborhoods are said to be adjacent if one can be translated by the distance of one cell edge or one cell diagonal to coincide with the other (see Figure 4 for examples). The wide path shown in Figure 1(d) was actually such a sequence of neighborhoods.

Heuristics for maximum value regions
Shirabe (2011) proposed a dynamic programming-based method for the maximum value region problem. It is designed to solve a subproblem, which is to find a 'focal maximum value region,' s(l,i), which consists of a specified number, l, of cells, contains a specified cell, i, and has the greatest total value. The method relies on a recursive relationship such that s(l,i) is the union of cell i and s(l-1,j), where cell j is adjacent to cell i. This is a heuristic because although many s(l,i)'s satisfy this relationship, not all do (see Figure 5 for an example of each case). Starting with s(1,i) set to cell i for each cell i in N, where N is the set of all cells, this heuristic enables an algorithm to recursively create or update s(l,i) for each l (≥ 2) and each cell i in N by combining cell i and one of its adjacent s(l-1,j)'s that has the highest value.
Because the algorithm as described above assumes that a new region consists of a cell and only one of its adjacent (l-1)-cell regions, some more desirable regions may be missed that would be found by combining two or more of a cell's adjacent regions. To 'scoop' some solutions overlooked by the algorithm, Shirabe (2011) designed a complementary heuristic such that s(l,i) replaces s(l,j) if the former has a greater value and contains cell j. With the addition of this heuristic, the algorithm is expected to find a good feasible solution to the maximum value region problem.
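The recursion and the complementary 'scooping' step reviewed above can be illustrated with a minimal Python sketch. It is not Shirabe's (2011) implementation: the function name `best_region`, the use of 4-adjacency, and the choice to let a region keep fewer than l cells when no larger candidate improves its value are all assumptions made for illustration.

```python
def best_region(values, k):
    """Heuristically find a connected region of at most k cells with a high total value.

    values: 2D list of non-negative cell scores (hypothetical input format).
    Returns (region, value), where region is a frozenset of (row, col) cells.
    """
    rows, cols = len(values), len(values[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]

    def value(region):
        return sum(values[r][c] for r, c in region)

    def neighbors(cell):
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-adjacency (assumption)
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield (r + dr, c + dc)

    # s(1, i) is cell i itself.
    prev = {i: frozenset([i]) for i in cells}
    for _ in range(2, k + 1):
        # Assumption: a smaller region stays admissible, so start from s(l-1, i).
        cur = dict(prev)
        for i in cells:
            for j in neighbors(i):
                cand = prev[j] | {i}  # union of cell i and s(l-1, j)
                if value(cand) > value(cur[i]):
                    cur[i] = cand
        # Complementary heuristic: s(l, i) replaces s(l, j) if it is
        # more valuable and contains cell j.
        for i in cells:
            for j in cur[i]:
                if value(cur[i]) > value(cur[j]):
                    cur[j] = cur[i]
        prev = cur
    region = max(prev.values(), key=value)
    return region, value(region)
```

Because every s(l-1,j) contains cell j and cell i is adjacent to j, each candidate stays connected, which is why the sketch never needs an explicit connectivity check.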

Method
We first introduce a model of the maximum value region with a minimum width requirement problem and its sub-problem, and then we present a solution method for each.

Model
We interpret a region with a minimum width of w as one in which every segment is as wide as or wider than w, so that an object with that width can reach anywhere inside it. Since this condition is satisfied by a wide path, i.e. a connected sequence of adjacent neighborhoods of a (w,d)-form (see Section 2.1 and Figure 1(d) for a graphic example), we take a similar approach to its representation. That is, a region with a minimum width of w is a connected set of adjacent neighborhoods of a (w,d)-form. We call such a region a 'wide region,' and in particular, one that contains k or fewer cells a 'k-cell wide region.' Figure 6 illustrates an example of such a region.
In addition to width, three more properties are defined for wide regions: size, value, and adjacency. The size of a wide region, R, is the number of cells contained in R and thus is denoted by |R|, and its value is the sum of the values of all cells in R and is denoted by f(R). Note that depending on the form of the neighborhood used, the present model may not allow a wide region of some sizes. As the most obvious example, no k-cell wide region (unless k = 0) can be smaller than a single neighborhood. This is a reason that a k-cell wide region is defined to contain not exactly k cells but k or fewer cells. The definition of adjacency between two neighborhoods reviewed in Section 2 is extended such that a neighborhood or a wide region is said to be adjacent to another neighborhood or another wide region if a neighborhood in the former is adjacent to a neighborhood in the latter.
With the wide region defined above, the maximum value wide region problem can be stated as follows:
Problem 1: The maximum value wide region problem
Given a grid of cells in which each cell is assigned a numerical value, select a connected set of adjacent neighborhoods of a (w,d)-form containing no more than a specified number, k, of cells (i.e. a k-cell wide region with a minimum width of w) such that no other such set has a greater total value.

Solution
We present a solution method for the maximum value wide region problem in terms of its general approach, specific algorithm, and computational complexity.

Approach
The method is generally designed to first find, for each neighborhood in the grid, a maximum value wide region with a condition that it must contain that neighborhood, and then select from among all such regions (referred to as 'focal' maximum value wide regions according to the terminology of Tomlin (1990)) one that has the greatest value.
To find focal maximum value wide regions, we take a dynamic-programming approach similar to the one employed by Shirabe (2011). It is presented below as Algorithm 1 with the following notation.
• N: set of all neighborhoods in a given grid
• s(l,i): focal maximum value l-cell wide region associated with a neighborhood, i

[Algorithm 1: pseudocode listing, Lines 1-8]
After initializing the focal maximum value 0-cell wide region for every neighborhood to the empty set of cells (Lines 1-3), the algorithm recursively attempts to derive a focal maximum value wide region of each size for each neighborhood from combinations of smaller focal maximum value wide regions created earlier (Lines 4-8).
The accuracy of the algorithm relies on the presence of recursive relationships between larger focal maximum value wide regions and smaller focal maximum value wide regions. Such relationships may exist, but exploiting them is unlikely to be computationally efficient. This is because there are an exceedingly large number of combinations of regions in general, which potentially causes a combinatorial explosion (Shirabe 2011), and this is also true for wide regions. Hence, we will pursue a heuristic method.

Heuristics
We adapt the heuristics employed by Shirabe (2011) to solve the focal maximum value region problem (see Section 2.2) to the focal maximum value wide region problem as follows.
Heuristic 1
Assume that a nonempty focal maximum value wide region associated with a neighborhood, i, is the union of neighborhood i and a smaller focal maximum wide region adjacent to neighborhood i. That is, $s(l,i) = i \cup s(h,j)$ for some neighborhood j adjacent to i and some size h < l.
This recursive relationship simplifies the task at Line 6 to an iteration over a manageable number of smaller wide regions (see Figure 7(a) for an example). Note that h (< l) can be further limited to be larger than or equal to $l - |i \setminus j|$ because j is, by definition, contained by s(h,j), and the difference between s(l,i) and s(h,j) is thus at most $i \setminus j$. A drawback of this heuristic is that some feasible wide regions (which may include optimal regions) will be overlooked (see Figure 7(b) for an example). This makes the algorithm a heuristic.

Heuristic 2
Replace s(l,j) with s(l,i) if the latter has a greater value and contains neighborhood j.
This heuristic takes advantage of the fact that a feasible l-cell wide region associated with one neighborhood is also feasible for every other neighborhood in the region. The heuristic, therefore, captures regions that exist for a given neighborhood but were overlooked by the process of combining a neighborhood with a smaller focal maximum wide region adjacent to that neighborhood. The heuristic does not guarantee that the algorithm reaches global optima but helps it escape from some local optima and thus find a good feasible solution to the maximum value wide region problem.

Algorithm
We embody Algorithm 1 with the two heuristics described above. To facilitate its presentation, we add the following notation.
• f(l,i): value of s(l,i)

[Algorithm 1 (revised): pseudocode listing, Lines 1-23]

Lines 1-5 initialize focal maximum value wide regions for all sizes and all neighborhoods to the empty set of cells (which is the 0-cell wide region and is considered a valid solution). The remainder of the algorithm updates each wide region that can be greater than or equal to the size, m, of one neighborhood until a focal maximum value k-cell wide region is generated for every neighborhood. Lines 8-15 construct a focal wide region for each neighborhood according to Heuristic 1 and substitute it for the current focal maximum value wide region associated with that neighborhood if it has a greater value and does not contain more than l cells. Lines 16-21 visit every neighborhood in the updated focal maximum value wide region and perform Heuristic 2.
Once a focal maximum value k-cell wide region is found for every neighborhood, a solution to the maximum value wide region problem can be found by selecting one with the greatest value. Algorithm 2 summarizes this procedure, where s(k) and f(k) represent a solution to the problem, that is, a maximum value k-cell wide region and its value, respectively.
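The overall procedure (focal wide regions built with Heuristics 1 and 2, followed by selection of the most valuable one) might be sketched in Python as follows. This is an illustrative reading of the text, not the authors' Java code: the names `form_offsets` and `max_value_wide_region`, the indexing of neighborhoods by the top-left corner of their w-by-w block, and the carry-over of s(l-1,i) as an admissible starting candidate for s(l,i) are assumptions.

```python
import math


def form_offsets(w):
    """Cell offsets of a (w,d)-form: a w-by-w block with d diagonal rows cut from each corner."""
    d = math.floor((2 - math.sqrt(2)) / 2 * w)
    return frozenset(
        (a, b) for a in range(w) for b in range(w)
        if min(a + b, a + (w - 1 - b), (w - 1 - a) + b, (w - 1 - a) + (w - 1 - b)) >= d
    )


def max_value_wide_region(values, k, w):
    """Return (neighborhoods, cells, value) of a heuristic k-cell wide region of width w."""
    rows, cols = len(values), len(values[0])
    offsets = form_offsets(w)
    m = len(offsets)  # size of one neighborhood
    # Neighborhoods indexed by the top-left corner of their block (assumption).
    cells_of = {
        (r, c): frozenset((r + a, c + b) for a, b in offsets)
        for r in range(rows - w + 1) for c in range(cols - w + 1)
    }

    def value(cells):
        return sum(values[r][c] for r, c in cells)

    empty = (frozenset(), frozenset(), 0)  # the 0-cell wide region
    s = [{i: empty for i in cells_of} for _ in range(k + 1)]

    for l in range(m, k + 1):
        for i, i_cells in cells_of.items():
            # Assumption: an (l-1)-cell wide region is also a valid l-cell wide region.
            best = max(s[l - 1][i], s[l][i], key=lambda reg: reg[2])
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    j = (i[0] + dr, i[1] + dc)
                    if j == i or j not in cells_of:
                        continue
                    gap = len(i_cells - cells_of[j])  # |i \ j|
                    for h in range(max(0, l - gap), l):
                        nbhds, cells, _ = s[h][j]
                        cand = cells | i_cells  # Heuristic 1: s(h,j) united with i
                        if len(cand) <= l:
                            v = value(cand)
                            if v > best[2]:
                                best = (nbhds | {i}, cand, v)
            s[l][i] = best
            # Heuristic 2: share the region with every neighborhood it contains.
            for j in best[0]:
                if best[2] > s[l][j][2]:
                    s[l][j] = best
    # Selection step: the most valuable focal region over all neighborhoods.
    return max(s[k].values(), key=lambda reg: reg[2], default=empty)
```

Recomputing the value of each candidate from scratch keeps the sketch short; a practical implementation would update values incrementally from the cells in i that are not already in s(h,j).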

Complexity
The most time-consuming task of Algorithm 1 is the calculation of $s(h,j) \cup i$, the union of a neighborhood, i, and each, s(h,j), of its adjacent wide regions at Line 10. This is equivalent to the union of s(h,j) (whose size is not larger than k cells) and $i \setminus j$ (whose size is proportional to neighborhood width, w) because j is, by definition, contained by s(h,j), and thus requires $O(kw)$ time. This task is repeated for each region size not larger than k (Line 6), each of all (up to n) neighborhoods in the grid (Line 7), each of its (up to 8) adjacent neighborhoods (Line 8), and each region size from $l - |i \setminus j|$ to $l - 1$ (where $|i \setminus j|$ is proportional to neighborhood width, w) (Line 9). Thus, the complexity of Algorithm 1 (and Algorithm 2) is $O(nk^2w^2)$.
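As a worked check, multiplying the cost of one union by the iteration counts listed above gives the overall bound stated later in the efficiency discussion. The grouping below is only a restatement of the loop structure described in the text:

```latex
\[
\underbrace{k}_{\text{sizes (Line 6)}}
\times \underbrace{n}_{\text{neighborhoods (Line 7)}}
\times \underbrace{8}_{\text{adjacent neighborhoods (Line 8)}}
\times \underbrace{O(w)}_{\text{sizes } h \text{ (Line 9)}}
\times \underbrace{O(kw)}_{\text{union (Line 10)}}
= O(nk^{2}w^{2})
\]
```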

Experiments
We conducted two computational experiments to evaluate the proposed method. The first experiment examined the qualities of its solutions by comparing them with solutions obtained without the minimum width requirement. The second experiment focused on its efficiency in terms of running time.

Data
Both experiments used hypothetical raster suitability maps (referred to here as suitability grids) converted from 'neutral landscape models (NLMs)' (Gardner et al. 1987) that were randomly generated with the 'mid-point displacement method' (Fournier et al. 1982) implemented by the 'NLMpy' Python software package (Etherington et al. 2015).

Experiment 1
For the first experiment, we first generated 10 NLMs consisting of 180 rows and 180 columns, whose values ranged from 0 to 1 with varying degrees of spatial autocorrelation. We then converted each NLM to a suitability grid by classifying its values into 100 intervals of equal length and assigning those intervals integers ranging from 1 to 100 (see Figure 8(a) for an example). In addition, we resampled each suitability grid to seven coarser resolutions, namely 2, 3, 4, 5, 6, 9, and 10 (relative to the resolution of the original suitability grid), so that a region consisting of resulting cells would be at least as wide as their cell size. We excluded the resolutions 7 and 8 because a 180-by-180 grid cannot be divided evenly into cells of those sizes. As a result, we had 10 sets of eight suitability grids (80 suitability grids in total), each set including one original 180-by-180 grid and seven grids resampled from it (see Figure 8 for an example of such a set).
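The data preparation just described can be sketched in plain Python as follows. The function names `classify`, `resample`, and `valid_resolutions` are hypothetical, and aggregating each block of cells by summation is an assumption: the text does not state how resampled cell values were computed.

```python
def classify(nlm, classes=100):
    """Map values in [0, 1] to integer scores 1..classes using equal-length intervals."""
    return [[min(int(v * classes) + 1, classes) for v in row] for row in nlm]


def resample(grid, factor, aggregate=sum):
    """Aggregate factor-by-factor blocks of cells into single coarser cells.

    aggregate=sum is an assumption made for this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid cannot be divided evenly at this resolution")
    return [
        [
            aggregate(grid[r + a][c + b] for a in range(factor) for b in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def valid_resolutions(side, candidates=range(2, 11)):
    """Resolutions that divide a side-by-side grid into whole cells."""
    return [f for f in candidates if side % f == 0]
```

For a 180-by-180 grid, `valid_resolutions(180)` yields 2, 3, 4, 5, 6, 9, and 10, which matches the exclusion of resolutions 7 and 8.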

Experiment 2
In the second experiment, we used 10 60-by-60, 10 120-by-120, and 10 180-by-180 suitability grids generated in the same manner as in the first experiment.

Procedure
In both experiments, we created a large number of problem instances by systematically varying the maximum region size and the minimum region width. For their solutions, we coded Algorithms 1 and 2 in Java and executed them on a 3.40 GHz Intel Core i7-6700 CPU with 32.8 GB of RAM. Details of this procedure are described below.

Experiment 1

Comparison with conventional regions
We first solved the conventional maximum value region problems using Algorithm 1 for each of 10 region sizes, from approximately 1% (324 cells) to 10% (3240 cells) of the size of the original 180-by-180 suitability grid. We then combined each of the 10 maximum region sizes and each of 10 minimum region widths (1 to 10 cells) into 100 problem types. For each problem type, we created 10 problem instances associated with the 10 original suitability grids. We solved each of the resulting 1000 problem instances using Algorithm 1 (revised) and Algorithm 2 and recorded the size and value of the resulting region.
For each of the 1000 wide regions generated, we calculated the relative difference between its value and that of its corresponding conventional region (which is expected to be the upper bound of the former), i.e. the former minus the latter, divided by the latter.
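Written out, the relative difference takes the following form, where the symbols $R_{\text{wide}}$ and $R_{\text{conv}}$ are labels introduced here for the wide region and its corresponding conventional region, and f is the region value defined in the model:

```latex
\[
\delta = \frac{f(R_{\text{wide}}) - f(R_{\text{conv}})}{f(R_{\text{conv}})}
\]
```

Under this definition, a wide region whose value falls short of the conventional one gives a negative $\delta$, and identical regions give $\delta = 0$.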

Comparison with conventional regions generated on resampled suitability grids.
We first reused the regions found using Algorithm 1 (revised) and Algorithm 2 on the original suitability grids for 80 of the 100 problem types defined above, excepting those associated with minimum region widths of seven and eight cells. Also, we used each of the same 10 maximum region sizes to define 10 new problem types and applied them to each of the 80 suitability grids (the 10 original grids and the 70 grids resampled to resolutions of 2, 3, 4, 5, 6, 9, and 10 cells), resulting in 800 problem instances. We then solved each problem instance as a conventional region problem on the resampled suitability grid with a resolution equal to the required minimum width using Algorithm 1. For each pair of solutions corresponding to the same maximum region size and the same width, we calculated their relative difference, i.e. the value of the wide region minus that of the region generated on the resampled grid, divided by the latter.

Experiment 2
We again defined 100 problem types by combining each of 10 region sizes (1% to 10% of the size of the suitability grid) and each of 10 region widths (1 to 10 cells). For each of the 100 problem types and each of the three grid sizes (60-by-60, 120-by-120, and 180-by-180), we created 10 problem instances associated with the 10 suitability grids of that grid size. This resulted in 3000 problem instances. We solved each problem instance in the same manner as in Experiment 1 and recorded the time required for its solution.

Experiment 1

Comparison with conventional regions
1000 maximum value wide regions were generated on the original suitability grids. For example, Figure 9(a) shows maximum value wide regions with varying minimum widths having the same maximum size generated on the same suitability grid. Two more examples are shown in Figures 9(b) and (c). Note that a maximum value wide region with a width of 1 is identical to a conventional maximum value region. Each maximum value wide region was first visually compared with its corresponding conventional maximum value region.
Table 1 presents, for each problem type (specified by the region width row and the region size column), the average of the relative differences of the values of the 10 maximum value wide regions and those of their corresponding conventional maximum value regions. They were expectedly all negative, as the value of a wide region is, in theory, bounded from above by that of the corresponding conventional region (which is by definition without a minimum width requirement). Note that a wide region with a minimum width equal to 1 is identical to a conventional region and thus that all the relative differences in the first row are zeros.

Comparison with conventional regions generated on resampled suitability grids
800 conventional maximum value regions were additionally generated on the resampled suitability grids. Figure 10(a) gives an example, which shows conventional maximum value regions with the same maximum size generated on suitability grids with different resolutions resampled from the same original suitability grid. Two more examples are given in Figures 10(b) and (c). A visual comparison was made between wide regions with a minimum width of w and conventional regions generated on resampled suitability grids with a resolution of w (e.g. Figure 9(a)-(c) vs. Figure 10(a)-(c), respectively).
Table 2 presents, for each problem type, the average of the relative differences of the values of the 10 maximum value wide regions and those of their corresponding conventional maximum value regions generated on resampled suitability grids. Note that a wide region with a minimum width equal to 1 is identical to a conventional region generated on the corresponding resampled suitability grid with a resolution of 1 and thus that all the relative differences in the first row are zeros.

Experiment 2
Table 3 reports the average running time of the proposed method to solve the 10 instances of each problem type on 60-by-60, 120-by-120, and 180-by-180 suitability grids. Note that a region with a maximum size of 1% (36 cells) of a 60-by-60 grid and a minimum width greater than or equal to 7 must be the 0-cell region, because neighborhoods of the (7,2)-form, (8,2)-form, (9,2)-form, and (10,2)-form contain 37, 52, 69, and 88 cells, respectively. For the same reason, a region with a maximum size of 2% (72 cells) of a 60-by-60 grid and a minimum width of 10, too, must be the 0-cell region. The proposed method takes negligible amounts of running time for such solutions.
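The neighborhood sizes quoted above follow directly from the (w,d)-form. A minimal sketch, assuming d is the largest integer not greater than (2 − √2)w/2 and that cutting d diagonal rows removes 1 + 2 + ... + d cells from each of the four corners (the function names are hypothetical):

```python
import math


def removed_rows(w):
    """Number of diagonal rows, d, cut from each corner of a w-by-w block."""
    return math.floor((2 - math.sqrt(2)) / 2 * w)


def neighborhood_size(w):
    """Number of cells in a (w,d)-form neighborhood."""
    d = removed_rows(w)
    corner = d * (d + 1) // 2  # 1 + 2 + ... + d cells removed per corner
    return w * w - 4 * corner
```

For widths 7 to 10 this gives d = 2 and sizes of 37, 52, 69, and 88 cells, so none of these neighborhoods fits within a 36-cell limit, and a width-10 neighborhood does not fit within 72 cells.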
Table 4 reports the average running time of the proposed method to solve the 10 instances of each problem type on 60-by-60 suitability grids starting at 10% (360 cells) up to 100% (3600 cells) of the grid.

Discussion
We analyze the results of the two experiments to evaluate the qualities of solutions generated by the proposed method and its running time efficiency.

Qualities
The qualities of the solutions from Experiment 1 were evaluated in terms of feasibility and optimality.
Another finding is that maximum value wide regions may not be as large as the required maximum sizes. Certainly, they are still feasible in terms of size, but they could be improved even manually by adding some more cells. This feature is a direct consequence of the design of our model such that a region is a set of neighborhoods rather than of individual cells. The same feature is possible in conventional regions generated on resampled suitability grids, but they suffer from it to a greater extent. This is because they are sets of resampled cells, and thus their sizes are limited to be multiples of the number of cells (in terms of the original grid) aggregated by a resampled cell.
It was also found that like conventional solutions, some wide solutions might contain holes. While perforated wide regions are still feasible in our model, they might not be acceptable in some practical applications (e.g. acquisition of private land). As Figure 9 suggests, small holes tend to disappear as minimum region width increases. Depending on the underlying distribution of suitability scores, however, large wide regions may be punctuated by unsuitable cells like the one illustrated in Figure 2.

Optimality
The values of solutions to the maximum value wide region problem should be, in theory, bounded from above by the values of solutions to the maximum value region problem, because every feasible wide region is also a feasible conventional region. Table 1 reflects this, with the largest average relative difference being about 2.3% in magnitude. Because a conventional maximum value region does not satisfy a minimum width requirement, this difference indicates the cost of enforcing the minimum width requirement. While a value loss of a couple of percent may seem reasonable, whether this cost is acceptable should, of course, depend on the specific application. We must acknowledge here that both the proposed and conventional solutions were heuristic solutions and thus could be far from optimal. Still, as the latter were experimentally found good in an earlier study (Shirabe 2011), we expect that their comparison is meaningful.
An interesting tendency was found in the relative difference such that it tends to increase with minimum region width but decrease with maximum region size (scan each column and row, respectively, of Table 1). We speculate that the two effects may neutralize each other as the underlying grid becomes finer, and therefore the relative difference may not increase arbitrarily, considering that both width and size increase with decreasing cell size. The small values presented in Table 2 may suggest that the resampling of a suitability grid down to a resolution equal to the required minimum width is an effective approach to the maximum value wide region problem. This seems true and even more so as maximum region size grows. Recall, however, that regions generated by this approach are not guaranteed to satisfy minimum width requirements and also their sizes are limited to multiples of the number of cells (of the original grid) aggregated by a single resampled cell.
In terms of region shape, compactness is another common requirement in region selection in a number of applications, such as political districting (Fryer and Holden 2011, Shirabe 2005), biodiversity conservation (Fischer and Church 2003), and landfill site selection (El Baba et al. 2015, Nazari et al. 2012). Our proposed method is designed to control only a region's minimum width and maximum size. For applications in which compactness is a necessary constraint, methods exist to control for compactness (Li et al. 2013, Hess and Samuels 1971, Wright et al. 1983), but they do not consider minimum width a requirement. Quantitative tests for compactness measurements in relation to required minimum width could be considered in further research.

Efficiency
Results of Experiment 2 show that as its complexity, $O(nk^2w^2)$, suggests, the proposed method takes more computing time as the grid size, n, the maximum region size, k, or the minimum region width, w, increases. In fact, the problem instances on 120-by-120 suitability grids required just about four times longer for their solution than those on 60-by-60 suitability grids. This can be seen by comparing the first column of Table 3(b) with the fourth column of Table 3(a). Similarly, a comparison of the first column of Table 3(c) with the ninth column of Table 3(a) confirms that the problem instances on 180-by-180 suitability grids required approximately nine times longer than those on 60-by-60 suitability grids. Note here that 1% of a 120-by-120 grid is equal to 4% of a 60-by-60 grid in number of cells (144 cells) and that 1% of a 180-by-180 grid is equal to 9% of a 60-by-60 grid in number of cells (324 cells).
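The arithmetic behind these column comparisons can be checked with a short sketch. It assumes that the number of neighborhoods, n, grows in proportion to the number of cells in the grid, so that with k and w held fixed the complexity predicts a running time ratio equal to the ratio of grid areas; the function names are hypothetical.

```python
def region_cells(side, percent):
    """Number of cells in a region covering the given percentage of a side-by-side grid."""
    return side * side * percent // 100


def predicted_time_ratio(side_a, side_b):
    """Running time ratio predicted by O(n k^2 w^2) when k and w are held fixed."""
    return (side_a * side_a) / (side_b * side_b)
```

With these definitions, 1% of a 120-by-120 grid and 4% of a 60-by-60 grid both come to 144 cells, and the predicted ratios for 120-by-120 and 180-by-180 grids relative to 60-by-60 grids are 4 and 9, in line with the observations above.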
We see the effect of maximum region size on computing time by scanning each row of Tables 3(a) to (c) and Table 4. That is, computing time generally increases with maximum region size, but it increases a little more slowly than predicted by the complexity, which grows with $k^2$, where k is the region size in number of cells. This may be because we have implemented a set union (at Line 10 of Algorithm 1) in a way that is expected to run faster than $O(kw)$ (see Section 3.2.4) in practice. One might even consider employing a data structure that indexes all the elements of a set and performs set union in $O(w)$ instead (but at the cost of more memory usage). As for the effect of minimum region width on computing time, scanning each column of Tables 3(a) to (c) finds that computing time generally increases with minimum region width as predicted. An exception was found, however, when the minimum region width was relatively large compared to the maximum region size (see the lower-left triangular section of Table 3(a)). There may be two reasons for this. First, as minimum region width increases, the number of neighborhoods of the corresponding (w,d)-form (see Section 2.1) in a given suitability grid decreases and eventually becomes significantly less than the number of cells in that grid. Second, a region is represented as a set of neighborhoods rather than a set of cells, so with a large (w,d)-form a region contains a small number of neighborhoods.
Another exception was found when region size was 1% and minimum region width was 10 cells on a 120-by-120 grid. Furthermore, when region size was 3% or greater on a 180-by-180 grid, the average running time was smaller with a required minimum width of four cells than with a required minimum width of three cells. The same tendency was seen on 60-by-60 grids when region size was 40% or greater and required minimum width was four cells. These exceptions suggest that the running time of the proposed method does not always increase with width, contrary to the prediction and to what was otherwise observed.
Overall, while the proposed algorithm runs in polynomial time, its practical use is still limited. The largest problem instance in Experiment 2 was defined by a maximum region size of 3240 cells, a minimum region width of 10 cells, and a 180-by-180 suitability grid, and its solution required approximately 48 hours. This is a relatively small problem for present-day geographic information systems (GIS) users, who routinely process high-resolution raster data.
The limitations due to running time are even more evident when comparing the proposed method with the resampled method: the former always had a longer running time than the latter. The average running times were 140.91 seconds for the proposed method and 0.25 seconds for the traditional method on resampled grids when maximum region size equaled 1% of a 180-by-180 grid and the required minimum width equaled two cells. The difference tended to be greater as both maximum region size and minimum required width increased. The average running times were 155722.23 seconds for the proposed method and 0.05 seconds for the traditional method on resampled grids when maximum region size equaled 10% of a 180-by-180 grid and the required minimum width equaled 10 cells. The large difference in running time between the two methods likely reflects the extensive computational resources that the proposed method requires, and it should be weighed together with the feasibility of the two methods for the defined problem.
It is important to note that the size of a grid is not the spatial extent of the study area it represents but the number of cells it contains. Therefore, if the study area remains the same but a finer grid is used, $n$ (number of cells) increases, which in turn increases $k$ (maximum region size) proportionally to $n$ and $w$ (minimum region width) proportionally to $\sqrt{n}$. This implies that the complexity of Algorithm 1 in terms of $n$ is $O(n^4)$.
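The step from $O(nk^2w^2)$ to $O(n^4)$ can be written out explicitly. The constants $\alpha$ and $\beta$ below are introduced here only for illustration: $\alpha$ is taken to be the fixed fraction of the grid that the region may occupy, and $\beta$ relates the minimum width in cells to the side length $\sqrt{n}$ of a square grid.

```latex
\begin{align*}
k &= \alpha n, \qquad w = \beta \sqrt{n}, \\
n\,k^{2}\,w^{2} &= n\,(\alpha n)^{2}\,(\beta \sqrt{n})^{2}
                 = \alpha^{2}\beta^{2}\,n^{4}, \\
O(nk^{2}w^{2}) &= O(n^{4}).
\end{align*}
```

For example, under these assumptions, halving the cell size of a square grid quadruples $n$ and would multiply the predicted running time by $4^4 = 256$.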

Conclusions
We have revisited the conventional maximum value region problem and its solutions, which do not require a minimum width and may contain narrow patches. With the increasing availability of high-resolution data on the Earth's surface, explicit consideration of width is critical in the region selection task. We have introduced a requirement that a region be no narrower than a specified width and presented a heuristic method for its solution. A key aspect of the method is the use of the 'neighborhood' concept, which allows raster-based spatial algorithms to process specific forms of cells together. The presented method represents a region as a set of adjacent neighborhoods, each being a set of cells arranged in a form whose width is equal to the specified width. This representation is simple but effectively guarantees that any segment of a region is at least as wide as a single neighborhood. Such regions are expected to have more practical uses than conventional regions (sets of adjacent cells), as they are found most suitable for a specific use based on the maximization of the suitability score and satisfaction of the width requirement. However, the regions may still contain holes, which may not be ideal in some situations and should be addressed in future research. The method is immediately usable in any raster-based GIS, as it takes a raster suitability map as input without any preprocessing. However, the one additional input parameter it takes to indicate a minimum region width, $w$, causes the method to take more (in theory, $w^2$ times more) computing time than its predecessor for the conventional region problem.
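To make the representation concrete, the sketch below models a region as a union of neighborhoods and computes its value as the sum of the values of its distinct cells. It is a minimal illustration under stated assumptions rather than the implementation used in the experiments: neighborhoods are modeled as w-by-w squares in place of the $(w,d)$-forms, the function names and the small grid are hypothetical, and adjacency between neighborhoods is not checked.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Cell = Tuple[int, int]


def square_neighborhood(row: int, col: int, w: int) -> FrozenSet[Cell]:
    """Cells of a w-by-w square whose upper-left corner is (row, col).

    Assumption: a square stands in for a (w,d)-form of width w.
    """
    return frozenset((row + dr, col + dc) for dr in range(w) for dc in range(w))


def region_cells(neighborhoods: Iterable[FrozenSet[Cell]]) -> Set[Cell]:
    """Union of the cells of all neighborhoods; overlaps are counted once."""
    cells: Set[Cell] = set()
    for neighborhood in neighborhoods:
        cells |= neighborhood
    return cells


def region_value(grid: List[List[float]],
                 neighborhoods: Iterable[FrozenSet[Cell]]) -> float:
    """Sum of the suitability values of all distinct cells in the region."""
    return sum(grid[r][c] for r, c in region_cells(neighborhoods))


# Hypothetical 6-by-6 suitability grid in which cell (r, c) has value r + c.
grid = [[float(r + c) for c in range(6)] for r in range(6)]

# Two overlapping neighborhoods of width 3 form one region.
region = [square_neighborhood(0, 0, 3), square_neighborhood(1, 2, 3)]
size = len(region_cells(region))
value = region_value(grid, region)
```

Because every cell of the region belongs to at least one complete neighborhood, no part of the region can be narrower than the neighborhood itself, which is the property the width guarantee relies on.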

Figure 2. Another heuristic solution that satisfies the minimum width requirement.

Figure 3. The (8,2)-form. Measurements of its lateral and diagonal widths are displayed at the double arrows.

Figure 5. Two 21-cell regions: (a) one consisting of a single cell (shaded) and a single region (outlined by solid lines) and (b) one consisting of a single cell (shaded) and two regions (outlined by solid and dashed lines).

Figure 6. A 71-cell wide region consists of three adjacent neighborhoods (outlined by solid lines, outlined by dashed lines, and shaded) of the (8,2)-form.

Figure 7. Two 94-cell wide regions: (a) one consisting of a neighborhood (shaded) and a 91-cell wide region (outlined by solid lines) that is adjacent to it, (b) one consisting of a neighborhood (shaded) and a 66-cell wide region (outlined by solid lines) and a 60-cell wide region (outlined by dashed lines) that are adjacent to it.

Figure 9. Sets of wide regions of the same size with different minimum widths. Each set (a)-(c) was generated on a unique suitability grid.

Figure 10. Sets of conventional regions of the same size generated on resampled grids of different resolutions. The suitability grid for each set (a)-(c) was resampled from the suitability grid on which the corresponding set (a)-(c) in Figure 9 was generated.
• Value of a focal maximum value $l$-cell wide region associated with a neighborhood, $i$
• $f(S)$: Sum of the values of all cells contained in a set, $S$, of neighborhoods
• $m$: Number of cells contained in one neighborhood

9 for each $h = l - |i|$ to $l - 1$
10 if $f(s(h,j) \cup i) > f(l,i)$ and $|s(h,j) \cup i| \le l$
11 $s(l,i) := s(h,j) \cup i$
12 $f(l,i) := f(s(h,j) \cup i)$

Table 1. Average of the relative differences of the values of the 10 wide regions from those of their corresponding conventional regions for each problem type.

Table 4. Average running time (in seconds) on a 60-by-60 suitability grid for regions from 10% to 100% of the total grid size.