Increasing subsequences and related problems

One of the classic algorithmic problems is a problem of finding the longest increasing subsequence in a sequence of integers. It has nice solutions (either using binary search or segment trees) and connections with order theory (Dilworth theorem). Moreover, it can be generalized in a various ways, some of them resulting in quite non-trivial algorithms.

We have a sequence $a[0..n-1]$ of integers (or more general: any objects with a linear order defined on them). We will be considering increasing subsequences, that is subsequences in which each element is strictly greater than the previous one. Similarly we can talk about non-decreasing subsequences when each element in the subsequence is greater or equal that the previous one.

The length of the longest increasing subsequence

Usually we are interested in a longest increasing subsequence, that is an increasing subsequence that has the most number of elements. We will call it LIS for short. For finding the LIS we can do simple dynamic programming. By $lev[i]$ we denote the length of LIS which uses only elements from the prefix $a[0..i]$ and ends in element $a[i]$ minus 1 (we call it a level of element $i$). Then we have simple recurrence:

$lev[i] = \max\{ lev[j] + 1 \mid j < i \text{ and } a[j] < a[i] \}.$

So it is easy to present a $O(n^2)$ code:

int lev[N]; int lis_slow() { REP(i, n) { lev[i] = 0; REP(j, i) { if (a[j] < a[i]) { maximize(lev[i], lev[j] + 1); } } } return *max_element(lev, lev+n) + 1; }

We will get a faster algorithm using different approach. Once again we investigate prefix $a[0..i]$, and suppose that we would like to create a subsequence ending in $a[i]$ of length $k$. Then this subsequence should contain a subsequence of length $k-1$ which ends in some $a[j]$ for $j<i$ (thus is in prefix $a[0..i-1]$). But we do not have to consider all such shorter subsequences – the smaller value of $a[j]$ the better, so we only need to know what is the smallest last element of some subsequence of length $k-1$.

Therefore we will maintain an array where $d[j]$ is the value of the smallest element which is the last element of some subsequence of length $j+1$ (or $\infty$ if such subsequence does not exist). Observe that from the definition we have that this array is increasing.

Suppose that we have this array for the prefix $a[0..i-1]$ and we want to update it for the prefix $a[0..i]$. Then we can find such element $j$ that $d[j-1] < a[i] \leq d[j]$. We can update $d[j] := a[i]$, since there is subsequence of length $j-1$ ending in $d[j-1] < a[i]$, and $a[i]$ as the last element is at least as good as $d[j]$. But this is all: for all values $d[j..]$ we cannot extend them by $a[i]$ and for all values $d[0..j-1]$ they already have smaller last elements.

int d[N],k; int lis() { REP(i, n) { const int j = lower_bound(d, d+k, a[i]) - d; lev[i] = j; d[j] = a[i]; if (j == k) { ++k; } } return k; }

The above code keeps track of number $k$ of non-trivial (smaller than $\infty$) elements in the array $d$. It also uses lower_bound algorithm to find index $j$ using binary search in $O(\log n)$. Thus the time complexity of this solution is $O(n\log n)$. As we can see it also calculates level of each element.

One observation: if we consider only the elements of a fixed level, they form a non-increasing subsequence. We will use that observation later.

Two small tricks here: to change the first algorithm in order to find the length of the longest non-decreasing subsequence we simply change “$a[j] < a[i]$” to “$a[j] \leq a[i]$”. In the second algorithm this leads to finding $j$ such that $d[j-1] \leq a[i] < d[j]$, thus we can just use upper_bound algorithm.

Second trick: if we want to calculate values for reversed order, just change sign of each value of $a[i]$.

For now we only calculated the length $k$ of LIS, but having the levels of elements it is quite easy to retrieve some LIS. We just find any element $i$ with level $k-1$ and put it as last element of LIS and recursively find rest of the LIS in $a[0..i-1]$. Sample code:

int ans[N]; void get_lis() { const int k = lis(); REPDN(i, n) { if (lev[i] == k-1) { ans[k--] = i; } } }

Note that in the above code we always select the furthest possible element of the given level, therefore it is the smallest possible element and we do not have to check whether it is less that $ans[k+1]$.

Number of solutions

In order to get the number of all LIS we will calculate array in which $cnt[i]$ is the number of increasing subsequences of maximal length which ends in $i$ (thus of length $lev[i]+1$). The recurrence is very simple:

$cnt[i] = \sum_{j < i} cnt[j] \cdot [lev[j]+1 = lev[i] \text{ and } a[j] < a[i]]$

with initial value of $cnt[0] = 1$.

int cnt[N]; int cnt_list_slow() { const int k = lis_slow(); int C = 0; REP(i, n) { cnt[i] = i==0; REP(j, i) { if (lev[j]+1 == lev[i] && a[j] < a[i]) { cnt[i] += cnt[j]; } } if (lev[i] == k-1) { C += cnt[i]; } } return C; }

Since the number of increasing subsequences can be exponential (e.g. for sequence $a[i]= i \text{ xor } 1$ we have $2^{n/2}$ LIS of length $n/2$), then we probably are interested in numbers modulo $M$, but we won't add it to the code for readability.

But we would like something better than the above $O(n^2)$ algorithm. Observe that when calculating $cnt[i]$ we only consider elements from one level $lev[i]-1$. Moreover from these elements we take only these which are before $i$ (which is a contiguous segment) and only these which have less that $a[i]$ (but again from monotonicity of elements in one level, it is a contiguous segment). Therefore, for each level we will store a vector of elements in this level, together with partial sums of counts:

struct elem_t { int val; int cnt; int pref_cnt; }; vector<elem_t> levs[N]; int cnt_list() { const int k = lis(); int C = 0; REP(i, n) { const int j = lev[i]; int pos = lower_bound( levs[j-1].begin(), levs[j-1].end(), a[i], greater) - levs[j-1].begin(); int cnt = levs[j-1].back().pref_cnt - levs[j-1][pos-1].pref_cnt; levs[j].push_back(elem_t(a[i], cnt, levs[j].back().pref_cnt + cnt)); if (j == k-1) { C += cnt; } } return C; }

That gives $O(n\log n)$ algorithm for calculating the number of LIS.

To be continued…