Maximum-subarray and related problems

In this article we will discuss another classic algorithmic problem: we are given an array of numbers and we are to find a contiguous part of this array in which the sum of the numbers is as large as possible. The problem is most commonly called the “maximum-subarray problem”, but in this article we will use the term “slice” to denote a subarray.

The problem is really nice from a didactic point of view, as it admits a variety of solutions (including optimal linear-time ones) which utilise methods like dynamic programming, “divide and conquer”, and the segment tree data structure. Some of them can be extended to solve the dynamic version (in which elements of the array can be updated) or the subarray version (in which we can ask about any subarray of the array). An arguably less known variant asks for a set of $k$ pairwise disjoint slices of maximal total sum.

This article is an expanded version of my article printed in the popular science magazine Delta.

Standard problem

It is known that this problem has a very short linear-time solution, which is also quite simple once you have seen it. But with this kind of solution it is usually not clear how to derive it from scratch, so we will attack the problem from two different directions and see where they lead us. First we start with some trivial solutions, and then we try to come up with faster algorithms using two popular approaches: “divide and conquer” and dynamic programming.

Naive solutions

The array will have $n$ numbers $a_0, a_1, \ldots, a_{n-1}$. An obvious solution runs in $O(n^3)$ time: it just iterates over all $O(n^2)$ slices and for each of them calculates its sum. Of course it can be sped up to $O(n^2)$ by reusing the calculations, as in the following C++ code:

const int N = 1000000;  // maximal size of the array

#define REP(i, n) for (int i = 0; i < (n); ++i)

int a[N], n;

int max_slice_slow() {
  int slice = 0;
  REP(i, n) {                      // i -- the first element of the slice
    int sum = 0;
    for (int j = i; j < n; ++j) {  // j -- the last element of the slice
      sum += a[j];                 // the sum of the slice [i, j]
      maximize(slice, sum);
    }
  }
  return slice;
}
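For comparison, the cubic solution we started from could be spelled out as follows (a sketch, using the same helpers as the rest of the article):

int max_slice_cubic() {
  int slice = 0;
  REP(i, n) {                        // i -- the first element of the slice
    for (int j = i; j < n; ++j) {    // j -- the last element of the slice
      int sum = 0;
      for (int k = i; k <= j; ++k) { // the sum is recomputed from scratch
        sum += a[k];
      }
      maximize(slice, sum);
    }
  }
  return slice;
}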

The code is rather self-explanatory: we iterate over all possible slices $[i, j]$ and in the variable slice we keep the sum of the best slice found so far (i.e. the slice of the largest sum, or “max-slice” for short). Note that slice starts at 0, which corresponds to the empty slice; this is what the function returns if all the numbers are negative. Since we will be dealing with maximizing things, it's quite handy to have a helper function which changes a variable only if the new value is bigger:

template <class T>
inline void maximize(T& a, const T& b) {
  if (b > a) a = b;
}

Applying the “divide and conquer” approach

One way to attack such problems is the “divide and conquer” approach. We partition the array into two parts (left and right) of roughly the same size, recursively solve the problem for both parts, and finally merge these solutions into a solution for the whole array.

How can the max-slice be positioned with respect to these two parts? It can either be completely inside the left part, completely inside the right part, or it can intersect both. So we have to consider all three cases and take the largest of the resulting sums.

The first two cases are easy: since we recursively solve the problem for both parts, these recursive solutions are precisely what we need for the two cases in which the max-slice is completely contained in one part.

The last case is more interesting: what do we do if the max-slice intersects both parts? Then it consists of a suffix of the left part and a prefix of the right part. The sums of this suffix and prefix don't depend on each other (i.e. changing the suffix does not change the sum of the prefix and vice versa), so they can be chosen independently. Thus we want to independently maximize the sum over the suffixes of the left part and over the prefixes of the right part. We can do this by iterating over all of them and choosing the largest sums:

int max_slice_recursive(int l, int r) {
  if (l+1 == r) {
    return max(0, a[l]);
  } else {
    const int s = (l + r) / 2,
              max_l = max_slice_recursive(l, s),
              max_r = max_slice_recursive(s, r);
    int max_suf = 0, max_pref = 0, sum = 0;
    for (int i = s-1; i >= l; --i) {  // max-suffix of the left part [l, s)
      sum += a[i];
      maximize(max_suf, sum);
    }
    sum = 0;
    for (int i = s; i < r; ++i) {     // max-prefix of the right part [s, r)
      sum += a[i];
      maximize(max_pref, sum);
    }
    return max(max(max_l, max_r), max_suf + max_pref);
  }
}

int max_slice() {
  return max_slice_recursive(0, n);
}

The recursive function first checks for the boundary case (a one-element array). Otherwise it partitions the slice $[l, r)$ into two parts $[l, s)$ and $[s, r)$, recurses into both of them, and then calculates the max-suffix of $[l, s)$ and the max-prefix of $[s, r)$.

The time complexity of the merge phase on an array of size $m$ is dominated by calculating the suffix and prefix sums in $O(m)$ time, thus the total time complexity satisfies the recurrence $T(m) = 2T(m/2) + O(m)$, whose solution is $T(n) = O(n \log n)$.
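To see why, unroll the recurrence: at depth $d$ of the recursion there are $2^d$ subproblems of size $n/2^d$, each merged in $O(n/2^d)$ time, so every one of the $O(\log n)$ levels contributes $O(n)$ work in total:

$T(n) = cn + 2 \cdot c\frac{n}{2} + 4 \cdot c\frac{n}{4} + \cdots = O(n \log n).$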

Faster approach

We see that the bottleneck of the above algorithm is calculating the suffix and prefix sums from scratch. But it turns out that it is not hard to extend the “divide and conquer” approach to calculate them as well. In other words: we not only want to calculate the max-slice, but also the max-prefix and max-suffix, and we will do it recursively for each part.

We saw before that the sum of the max-slice is the maximum of three quantities, depending on where the max-slice is located: the sum of the max-slice of the left part, the sum of the max-slice of the right part, and the sum of the max-suffix of the left part plus the sum of the max-prefix of the right part.

So we need to ask where the max-prefix can be located. There are two cases: either it is completely inside the left part (as a prefix of that part), or it intersects both parts, in which case it completely covers the left part and the rest is a prefix of the right part. Of course, the prefixes taken in the left and right part should have the largest possible sums.

Thus we also need to know the total sum of each part; we could get it from precomputed prefix sums, but it is just as easy to maintain it as we go. We handle suffixes symmetrically.

So for each part we will carry four values: the max-slice, the max-prefix, the max-suffix, and the total sum. It will be handy to keep them in a structure, and to create a function which, given such structures for the left and the right part, calculates the structure for the whole array, based on the above observations:

struct info_t {
  int slice;   // the sum of the max-slice
  int prefix;  // the sum of the max-prefix
  int suffix;  // the sum of the max-suffix
  int sum;     // the total sum of the part
};

info_t join(const info_t& L, const info_t& R) {
  info_t I;
  I.slice = max(max(L.slice, R.slice), L.suffix + R.prefix);  // three cases for the max-slice
  I.prefix = max(L.prefix, L.sum + R.prefix);                 // two cases for the max-prefix
  I.suffix = max(L.suffix + R.sum, R.suffix);                 // two cases for the max-suffix
  I.sum = L.sum + R.sum;
  return I;
}

The rest of the code is then easy. First we need a function which calculates the four values for a one-element array. The recursive function just takes care of partitioning and delegates the rest of the work to the join function:

info_t single(int val) {
  info_t I;
  I.slice = I.prefix = I.suffix = max(0, val);  // the empty slice is allowed
  I.sum = val;
  return I;
}

info_t max_slice_recursive(int l, int r) {
  if (l+1 == r) {
    return single(a[l]);
  } else {
    const int s = (l + r) / 2;
    const info_t info_l = max_slice_recursive(l, s),
                 info_r = max_slice_recursive(s, r);
    return join(info_l, info_r);
  }
}

int max_slice() {
  return max_slice_recursive(0, n).slice;
}

Since the merge procedure now runs in constant time, the recurrence becomes $T(m) = 2T(m/2) + O(1)$, whose solution is $O(n)$. Thus we have obtained an optimal linear-time solution. But let's also try another approach.

Applying the dynamic programming approach

In order to apply dynamic programming we need to specify the subproblems. Usually good subproblems for an array are its prefixes. So suppose we have solved the problem for the prefix $a[0..i-1]$ and we extend it to the prefix $a[0..i]$ by adding the element $a[i]$; what is the max-slice then? We consider two cases: either the max-slice contains the element $a[i]$ or not. If not, it is completely contained in $a[0..i-1]$, so it is already found by that subproblem. Otherwise, it contains $a[i]$ and possibly some part of $a[0..i-1]$, but since a slice must be contiguous, that part is a suffix of $a[0..i-1]$.

Thus for each subproblem we will calculate two things: the max-slice and the max-suffix. The code is as follows:

int slice[N], suffix[N];

int max_slice() {
  slice[0] = suffix[0] = 0;
  REP(i, n) {
    suffix[i+1] = max(0, suffix[i]) + a[i];   // max-suffix of the prefix a[0..i]
    slice[i+1] = max(slice[i], suffix[i+1]);  // max-slice of the prefix a[0..i]
  }
  return slice[n];
}

The time complexity is obviously $O(n)$. But we can simplify the code even more by observing that it can be implemented in constant memory:

int max_slice() {
  int slice = 0, suffix = 0;
  REP(i, n) {
    maximize(suffix, 0);      // discard the suffix if its sum has gone negative
    suffix += a[i];
    maximize(slice, suffix);
  }
  return slice;
}

This nice algorithm is attributed to Joseph B. Kadane. Observe that in suffix we maintain the largest sum of a suffix of the processed prefix, and every time it drops below zero, we reset it to zero. That is because a negative value means that every suffix of the processed prefix has a negative sum, so we can discard the whole prefix: extending any future slice to the left into it would only decrease the sum.
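To see the algorithm in action, here is a small self-contained trace (the example array is mine, chosen only for illustration):

#include <algorithm>
#include <cstdio>
using namespace std;

int main() {
  int a[] = {2, -3, 4, -1, 2};
  int slice = 0, suffix = 0;
  for (int x : a) {
    suffix = max(suffix, 0) + x;  // reset the suffix when its sum went negative
    slice = max(slice, suffix);
    printf("x = %2d: suffix = %2d, slice = %d\n", x, suffix, slice);
  }
  // The answer is slice = 5, attained by the subarray {4, -1, 2}:
  // the prefix {2, -3} was discarded when suffix dropped to -1.
  return 0;
}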

Using “divide and conquer” on a tree

Both approaches give us linear-time algorithms, and both work by calculating (among other things) max-slices of certain subproblems. For dynamic programming these subproblems are the prefixes of the whole array (thus we also obtain the max-slice of every prefix). For the “divide and conquer” algorithm these subproblems are the parts into which we partition the array.

But if you are familiar with the segment tree data structure, you know that these parts correspond to the base ranges of this structure, and that any subarray can be partitioned into $O(\log n)$ such base ranges. For example, for $n = 8$ the subarray $[1, 7)$ decomposes into the base ranges $[1, 2)$, $[2, 4)$, $[4, 6)$ and $[6, 7)$. Thus if we store the info_t value of every base range, we can use the join function to calculate the max-slice of any subarray in $O(\log n)$ time.

The code below is a quite standard implementation of the segment tree. Function tree_init initializes the leaves of the tree (their number, kept in the variable base, is the smallest power of two not smaller than $n$) and the base - 1 inner nodes. The initialization of the segment tree is done in $O(n)$ time.

Function max_slice_for_subarray calculates the partition of the subarray $[l,r)$ into base ranges and joins their info_t values on the fly. Each such query is answered in $O(\log n)$ time:

info_t tree[4*N];  // 4*N to be safe, since the number of leaves is rounded up to a power of two
int base;

void tree_init() {
  base = 1;
  while (base < n) { base *= 2; }
  REP(i, n) { tree[base + i] = single(a[i]); }
  // the remaining leaves stay zero-initialized, which is the neutral element of join
  for (int i = base-1; i > 0; --i) {
    tree[i] = join(tree[2*i], tree[2*i+1]);
  }
}

info_t max_slice_for_subarray(int l, int r) {
  l += base; r += base;  // move to the leaves representing a[l] and a[r-1]
  info_t info_l = tree[l];
  if (l+1 == r) {
    return info_l;
  } else {
    info_t info_r = tree[--r];
    // climb up, joining whole subtrees hanging off the two paths
    while (l/2 != r/2) {
      if (~l & 1) { info_l = join(info_l, tree[l+1]); }  // l is a left child
      if (r & 1) { info_r = join(tree[r-1], info_r); }   // r is a right child
      l /= 2; r /= 2;
    }
    return join(info_l, info_r);
  }
}

This segment tree also allows us to change the contents of a cell in array a in $O(\log n)$ time:

void update(int i, int val) {
  a[i] = val;
  i += base;
  tree[i] = single(val);
  while (i > 1) {  // recompute the info_t values on the path to the root
    i /= 2;
    tree[i] = join(tree[2*i], tree[2*i+1]);
  }
}
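As a usage sketch (the concrete array and queries below are made up for illustration, and the declarations above are assumed to be in scope), we can freely mix subarray queries with point updates:

void demo() {
  n = 5;
  const int init[] = {2, -3, 4, -1, 2};
  REP(i, n) { a[i] = init[i]; }
  tree_init();
  printf("%d\n", max_slice_for_subarray(0, 5).slice);  // 5: the slice {4, -1, 2}
  printf("%d\n", max_slice_for_subarray(0, 2).slice);  // 2: the slice {2}
  update(2, -4);                                       // a becomes {2, -3, -4, -1, 2}
  printf("%d\n", max_slice_for_subarray(0, 5).slice);  // 2: a single-element slice {2}
}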

Retrieving the best slice

In the above algorithms we showed how to calculate the total sum of the max-slice, but not how to retrieve its location. In order to do that we would also need to track this location, for instance by keeping the left and right indices along with every sum. This would result in at least four additional values in the info_t structure (the end of the max-prefix, the beginning of the max-suffix, and both endpoints of the max-slice) and a pretty messy join function. But we can do better by hiding these details in a new structure representing a candidate slice:

struct slice_t {
  int sum, l, r;  // the slice [l, r) and its sum
  bool operator<(const slice_t& other) const {
    return sum < other.sum;
  }
};

slice_t add(const slice_t& L, const slice_t& R) {
  slice_t I;  // the concatenation of two adjacent slices
  I.sum = L.sum + R.sum;
  I.l = L.l;
  I.r = R.r;
  return I;
}

We can use this slice_t structure for all four members of info_t. This is a little wasteful memory-wise (we use it even for obvious values, like the beginning of the max-prefix, which results in 12 integers in total), but this is the price we pay for a simple implementation: the only thing which needs to change in the join function is replacing the plus signs with calls to the add function above, while the max calls work unchanged thanks to the overloaded operator<.
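For illustration, here is a sketch (not spelled out above) of how info_t and its join could look after this change. Note that the leaf constructor single would additionally need the index of the element, so that it can fill in the endpoints:

struct info_t {
  slice_t slice, prefix, suffix, sum;
};

info_t join(const info_t& L, const info_t& R) {
  info_t I;
  // same recurrences as before: + becomes add, max works via operator<
  I.slice = max(max(L.slice, R.slice), add(L.suffix, R.prefix));
  I.prefix = max(L.prefix, add(L.sum, R.prefix));
  I.suffix = max(add(L.suffix, R.sum), R.suffix);
  I.sum = add(L.sum, R.sum);
  return I;
}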

To be continued…