Survey of knapsack problems

The knapsack problem is one of the classic problems in combinatorial optimization. Since it is discussed in many textbooks on algorithms, one can say that every competitive programmer should know it pretty well. However, most of these books deal with only one or two of the most common variants of the knapsack problem, and it turns out that there are plenty of such variants. In this article we will discuss a number of them. Most of the variants have dynamic programming solutions, plus one or two clever observations; certain variants, however, call for other algorithms.

This survey collects various tricks and ideas which I encountered while reading solutions to various problems. My original contribution here is to present them all in a hopefully logical way, and sometimes to propose my own solutions to some variants, so that this survey is as complete as possible. This article is a significantly extended version of a lecture about the knapsack problem I gave at the Brazilian ICPC Summer School in 2018.

Standard knapsacks

We begin with probably the simplest formulation. We have a set of \(n\) items, each of them described by its weight (sometimes called size or something similar); the \(i\)-th item has weight \(w_i\). We also have a knapsack with weight limit \(M\) and we are asked to find a subset of items of total weight as large as possible, but less than or equal to the limit (so that the selected items fit into the knapsack). This is called the 0-1 knapsack problem.

The dynamic programming is rather straightforward here. We denote by \(d[i,j]\) a Boolean value which says whether there exists a subset of total weight exactly \(j\) when we are allowed to use items number \(\{ 1, 2, \ldots, i\}\). Thus the variables satisfy \(0 \leq i \leq n\), \(0 \leq j \leq M\). Of course \(d[0,j] = [j = 0]\), since without any items we can only obtain an empty subset.

For \(i\geq 1\) we consider two cases: either we do not take the \(i\)-th item to the subset (so the weight \(j\) can be realized only if \(d[i-1,j]\) is true), or we take it, but only if its weight is no bigger than \(j\) and \(d[i-1,j-w[i]]\) is true:

\(d[i,j] = d[i-1,j] \text{ or } d[i-1,j-w[i]] \cdot [j \geq w[i]]\)

The answer is the biggest value \(j \leq M\) for which \(d[n,j]\) is true. The code is quite simple:

#include <cstdio>
using namespace std;
#define REP(i,n) for(int i=0;i<(n);++i)

const int N = 1000, MAXM = 1000;
int n, M, w[N];
bool d[N+1][MAXM+1];

// helper: returns iterator to the last maximal element in [b, e)
template<class It> It last_max_element(It b, It e) {
  It r = b;
  for (It it = b; it != e; ++it)
    if (*r <= *it) r = it;
  return r;
}

int knapsack() {
  REP(j, M+1) d[0][j] = j==0;
  REP(i, n) {
    REP(j, M+1) {
      d[i+1][j] = d[i][j]; /* rewrite */
      if (j >= w[i]) {
        d[i+1][j] |= d[i][j - w[i]];
      }
    }
  }
  return last_max_element(d[n], d[n]+M+1) - d[n];
}

The time complexity is \(O(nM)\). The space complexity is the same, but it can easily be reduced to \(O(M)\) using the standard DP trick of storing only the last two rows of array \(d\) (since the calculations in a row refer only to values in the previous row). But here we can do even better. Observe that the line with the comment just rewrites row \(i\) to row \(i+1\), and the next two lines improve some cells. Moreover, during the improvement we only use cells with lower \(j\), so we can store everything in one row, as long as we iterate over it backwards:

bool d[MAXM+1];

int knapsack() {
  REP(j, M+1) d[j] = j==0;
  REP(i, n) {
    for (int j = M; j >= w[i]; --j) {
      d[j] |= d[j - w[i]];
    }
  }
  return last_max_element(d, d+M+1) - d;
}

If we want, we can utilize the above optimization to write the recurrence formula in a slightly more readable way, by removing row indices:

$d_{\mathrm{new}}[j] = d[j] \text{ or } d[j-w[i]] \cdot [j \geq w[i]]$

Note also that we solve a more general problem: our array \(d[j]\) is true if we can choose a subset of items whose total weight is exactly \(j\). We will call it a knapsack structure. It allows us to answer in constant time queries of the form: is there a subset of total weight exactly \(j\), where \(0 \leq j \leq M\)? The space complexity of the structure is \(O(M)\) and we can build it in \(O(nM)\).

Unlimited amount of items of each kind

We can also think about items not as single entities, but as various kinds, from which we can take multiple items. The simplest variant is when we can take an unlimited number of items of each kind. This is called the unbounded knapsack problem. It is easy to adapt the previous algorithm: just observe that if we consider taking an item of the \(i\)-th kind, the remaining space \(j-w[i]\) can still contain items of this kind, therefore the main recurrence is replaced by

\(d[i,j] = d[i-1,j] \text{ or } d[i,j-w[i]] \cdot [j \geq w[i]]\)

It is also easy to change the code which uses one row. Now we can immediately use the values of improved cells, so we only need to change the direction of the inner loop:

REP(i, n) {
  for (int j = w[i]; j <= M; ++j) {
    d[j] |= d[j - w[i]];
  }
}

This gives us $O(nM)$ algorithm for the unbounded knapsack problem.

Limited amount of items of each kind

A little bit harder variant is when we limit the number of items we can take of each kind. Denote by \(b_i\) the number of items of the \(i\)-th kind. This is called the bounded knapsack problem. Of course the naive approach would be to copy each item \(b_i\) times and use the first algorithm for the 0-1 knapsack. If $b_i \leq B$, this would take time $O(nMB)$, which is not very fast.

But copying each item $b_i$ times is quite wasteful. We do not need to examine each item separately, since they are indistinguishable anyway, but we need some method to allow us to take any number of such items between $0$ and $b_i$, or in other words a bunch of items of total weight from the set $\{0, w_i, 2w_i, \ldots, b_i w_i\}$.

In order to do it more efficiently, we will group the items into “packets”. It is easier to see how to do it if we assume that $b_i = 2^k-1$ for some $k$. Then we can form $k$ packets of sizes $w_i, 2w_i, 4w_i, \ldots, 2^{k-1} w_i$. Now if we want to obtain a total weight of $jw_i$ for some $0 \leq j \leq b_i$, we just take packets based on the set bits in the binary representation of $j$. In the general case when $2^k \leq b_i < 2^{k+1}$ we add an additional packet of size $r = (b_i - 2^k + 1)w_i$. If we take this packet, the remaining weight $b_i w_i - r = (2^k - 1)w_i$ is smaller than $2^k w_i$, so it can be covered by choosing from the first $k$ packets.

Therefore for the elements of the $i$-th type we will have $O(\log b_i)$ packets, so the algorithm (0-1 knapsack on these packets) will run in time $O(n M \log B)$.
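This packeting idea can be sketched as follows (a sketch, not code from the lecture: `w` and `b` hold the weights and multiplicities, `M` is the capacity, and the function returns the largest obtainable weight not exceeding `M`):

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Bounded knapsack via binary grouping of items into packets.
// Each kind i is split into packets of 1, 2, 4, ... copies plus a
// remainder, and each packet is added with a 0-1 knapsack step.
int bounded_knapsack_packets(const vector<int>& w, const vector<int>& b,
                             int M) {
  vector<char> d(M + 1, 0);
  d[0] = 1;
  for (size_t i = 0; i < w.size(); ++i) {
    int left = b[i];
    for (int take = 1; left > 0; take *= 2) {
      int cnt = min(take, left);   // last packet may be the remainder
      left -= cnt;
      int pw = cnt * w[i];         // weight of the whole packet
      for (int j = M; j >= pw; --j)
        d[j] |= d[j - pw];         // usual 0-1 knapsack update
    }
  }
  for (int j = M; j >= 0; --j)
    if (d[j]) return j;
  return 0;
}
```

The packet sizes generated here are $1, 2, 4, \ldots$ copies with a remainder at the end, which matches the decomposition described above.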

But we can do faster. Let's consider adding the $i$-th type. Observe that the recurrence formula is as follows:

$d_{\mathrm{new}}[j] = \mathrm{or}\big(d[j], d[j-w_i], d[j-2w_i], \ldots, d[j-b_i w_i]\big),$

where for notational simplicity we assume that $d[\cdot]$ is false for negative indices. That means that the new value is true if there is at least one true value among the $b_i+1$ values which we get starting from index $j$ and jumping backwards every $w_i$-th index. Note that for $w_i=1$ we could just say that “there is at least one true value among the last $b_i+1$ values”. Such a condition would be very easy to check, by maintaining the index $jr$ of the most recent true value; then the condition is true if $j - jr \leq b_i$.

But we can extend this idea to bigger values of $w_i$, by maintaining an array $jr$ of size $w_i$, where different indices keep track of the most recent true values for cells whose indices have the same remainder when divided by $w_i$. Maybe it is easiest to explain by providing the code:

const int infty = 1000000000;
int b[N];        // b[i] = number of items of the i-th kind
int jr[MAXM+1];

REP(i, n) {
  const int wi = w[i], bi = b[i];
  REP(j, wi) jr[j] = -infty;
  REP(j, M+1) {
    d[j] |= (j - jr[j % wi] <= bi*wi);
    if (d[j]) { jr[j % wi] = j; }
  }
}

The time complexity of this algorithm is $O(nM)$. The algorithm highlights a quite important idea we will use when creating algorithms for more complicated variants of bounded knapsacks. The idea helps us work with recursive formulas which, when calculating $d[j]$, depend on values $d[j-kw]$ for $0\leq k\leq b$. Another way of thinking about this is as follows: we partition array $d$ into $w$ arrays $d^0, d^1, \ldots, d^{w-1}$ by putting into array $d^J$ every $w$-th element of array $d$ starting from index $J$, i.e. $d^J[j'] = d[J + j'w]$. Then the recursive formula for $j = J + j'w$ can be written as follows:

$d^J_{\mathrm{new}}[j'] = \mathrm{or}\big(d^J[j'], d^J[j'-1], d^J[j'-2], \ldots, d^J[j'-b]\big),$

thus it depends on a consecutive range of indices $d^J[j'-k]$ for $0 \leq k\leq b$. And often we can propose a faster algorithm which uses the fact that the range of indices is consecutive.

But for the basic bounded knapsack problem we can show a different algorithm of $O(nM)$ complexity, which does not need additional memory. Once again let's consider adding the $i$-th kind. If \(d[j]\) is true, we can generate subsets of weights \(j + w_i, j + 2w_i, \ldots, j + b_i \cdot w_i\), thus we should set all these indices of array $d$ to true. Doing it naively would take time $O(nMB)$.

But let's iterate over $j$ decreasingly and take a closer look at what happens. When examining value $j$, we can assume that for all values $j' > j$, if $d[j']$ is true, then $d[j' + k'\cdot w_i]$ is also true for all $k' \leq b_i$. So if $d[j]$ is true, we iterate over $k$ and set $d[j + k\cdot w_i]$ to true, until we hit a cell which was already set. Of course it couldn't have been set in this iteration (since then $d[j + (k-1)\cdot w_i]$ would also have been set), thus the weight $j+k\cdot w_i$ can be obtained using only items of kinds smaller than $i$. But this means that when the algorithm previously examined $j+k\cdot w_i$, the value $d[j+k\cdot w_i]$ was true, so it already set to true all values $d[j+(k+k')\cdot w_i]$ for all $k' \leq b_i$. Therefore we can stop our iteration over $k$. The code is as follows:

REP(i, n) {
  for (int j = M - w[i]; j >= 0; --j) {
    if (d[j]) {
      int k = b[i], x = j + w[i];
      while (k > 0 && x <= M && !d[x]) {
        d[x] = true;
        --k;
        x += w[i];
      }
    }
  }
}

Observe that the inner while loop executes its body only when some cell \(d[x]\) changes its value to true (the loop fills cells and stops when it finds a cell filled before). For a fixed item kind this can happen only \(M\) times, so the total time complexity is \(O(nM)\).

Sum of weights of items is limited

This algorithm can also help us in a slight variant of the 0-1 knapsack problem in which we have a limit on the sum of weights of all items, i.e. \(\sum w_i \leq S\). The idea is that if \(S\) is small, then there cannot be many items of different weights (weights of items have to repeat). More precisely, if we have \(k\) items of pairwise different weights, their total weight must be at least \(1 + 2 + \ldots + k = \Theta(k^2)\), so the number of different weights is \(O(\sqrt S)\). Thus we can group items of the same weight and use the algorithm for the bounded knapsack problem. Thanks to that we add a bunch of items of the same weight in linear time, which results in time complexity \(O(\sqrt{S}\,M)\).

Of course this is after a preprocessing phase in which we form the groups. It runs in time \(O(n \log n)\) or \(O(n + W)\), where $W$ is an upper bound on the item weights, depending on how we sort the weights. An important distinction between this and all the previous algorithms is that in the previous ones we added items one by one, each in time $O(M)$, whereas now we must enforce some order by cleverly grouping them.
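The whole procedure might be sketched as follows (a hypothetical helper, not from the lecture): group equal weights with a map, then add each group using the in-place filling algorithm from the previous section:

```cpp
#include <map>
#include <vector>
using namespace std;

// 0-1 knapsack when the total weight of all items is at most S:
// group equal weights (O(sqrt(S)) distinct values), then add each
// group with the O(M) bounded-knapsack filling step.
int knapsack_small_sum(const vector<int>& items, int M) {
  map<int, int> cnt;                 // weight -> multiplicity
  for (int x : items) cnt[x]++;      // O(n log n) grouping
  vector<char> d(M + 1, 0);
  d[0] = 1;
  for (const auto& g : cnt) {
    int wi = g.first, bi = g.second;
    for (int j = M - wi; j >= 0; --j)
      if (d[j]) {
        int k = bi, x = j + wi;
        while (k > 0 && x <= M && !d[x]) { d[x] = 1; --k; x += wi; }
      }
  }
  for (int j = M; j >= 0; --j)
    if (d[j]) return j;
  return 0;
}
```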

Note that exactly the same algorithm works for the bounded knapsack problem with a limit on the sum of weights.

Big knapsack

Most of the variants we discuss here have a linear dependency on the knapsack size \(M\). That is no surprise, since the knapsack problem is NP-complete and for big limits we only know exponential algorithms. For instance the meet-in-the-middle algorithm, which after a building phase of time \(O(n2^{n/2})\) and memory $O(2^{n/2})$ can answer queries in time \(O(2^{n/2})\).
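For completeness, here is a sketch of the meet-in-the-middle structure (the names are mine): precompute and sort all subset sums of each half of the items, then answer a query by scanning one half and binary-searching the other.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Meet in the middle: precompute all subset sums of each half.
// A query "is there a subset of total weight exactly j?" scans the
// left half and binary-searches the right one: O(2^{n/2} log) per
// query (O(2^{n/2}) with two pointers over sorted lists).
struct MitmKnapsack {
  vector<long long> left, right;   // sorted subset sums of the halves
  MitmKnapsack(const vector<long long>& w) {
    int h = w.size() / 2;
    build(w, 0, h, left);
    build(w, h, w.size(), right);
    sort(left.begin(), left.end());
    sort(right.begin(), right.end());
  }
  static void build(const vector<long long>& w, int lo, int hi,
                    vector<long long>& out) {
    out.assign(1, 0);
    for (int i = lo; i < hi; ++i) {  // double the list with each item
      int sz = out.size();
      for (int k = 0; k < sz; ++k) out.push_back(out[k] + w[i]);
    }
  }
  bool query(long long j) const {
    for (long long s : left)
      if (binary_search(right.begin(), right.end(), j - s)) return true;
    return false;
  }
};
```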

But there is one more interesting variant for big values of \(M\) in which the weights \(w_i \leq W\) are small. Let's take the weight \(w\) of any item and build an array \(d[j]\) for \(0\leq j < w\), in which \(d[j]\) is the smallest total weight among all obtainable subsets whose weights give remainder \(j\) when divided by \(w\). Having this array, answering queries in constant time is easy, since from weight $d[j]$ we can obtain weights $d[j] + kw$ for any $k\geq 0$:

bool query(int j) { return d[j % w] <= j; }

The array can be calculated using shortest paths algorithm for directed graphs. We consider a graph with $w$ vertices $v_0, \ldots, v_{w-1}$. For every vertex $v_j$ and every item of weight $w_i$ we add a directed edge from $v_j$ to $v_{(j + w_i) \bmod w}$ with length $w_i$. Then $d[j]$ is the length of the shortest path from $v_0$ to $v_j$.
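A sketch of this construction with Dijkstra's algorithm on a binary heap (assuming weights fit in int and path lengths in long long):

```cpp
#include <algorithm>
#include <queue>
#include <vector>
using namespace std;

// Big knapsack with small weights: d[j] is the smallest obtainable
// total weight congruent to j modulo w, where w is the smallest item
// weight, computed as shortest paths on w vertices.
vector<long long> build_big_knapsack(const vector<int>& items) {
  int w = *min_element(items.begin(), items.end());
  const long long INF = (long long)4e18;
  vector<long long> d(w, INF);
  priority_queue<pair<long long, int>,
                 vector<pair<long long, int>>,
                 greater<pair<long long, int>>> pq;
  d[0] = 0;
  pq.push(make_pair(0LL, 0));
  while (!pq.empty()) {
    long long dist = pq.top().first;
    int v = pq.top().second;
    pq.pop();
    if (dist != d[v]) continue;           // stale queue entry
    for (int wi : items) {                // edge v -> (v + wi) mod w
      int u = (v + wi) % w;
      if (dist + wi < d[u]) {
        d[u] = dist + wi;
        pq.push(make_pair(d[u], u));
      }
    }
  }
  return d;                // weight j obtainable iff d[j % w] <= j
}
```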

The overall time complexity depends on the size of this graph. Of course, it is best to use the smallest weight as the value $w$. Moreover, the number of edges from each vertex is bounded by $\min(n,w)$, since we need to store only the cheapest edge between each pair of vertices. So depending on whether the graph is dense or sparse, it makes sense to use different implementations of Dijkstra's algorithm: a bucket queue in $O(n + w^{3/2} + w\min(n,w))$ for dense graphs or a binary heap in $O(wn\log w)$ for sparse graphs.

To be continued…