Solutions to problems from IOI 2018

Here I present a detailed description of how to solve all six problems from the International Olympiad in Informatics 2018, which was held in Tsukuba, Japan. This work was partially funded by the International Committee of IOI.

So far four problems are described (Combo, Werewolf and Seats from the first day of the competition, as well as Mechanical Doll from the second day). The final two will be added mid-February.

Combo

In this problem there is a secret string $S$ of length $N$ over an alphabet of four characters (namely $A$, $B$, $X$ and $Y$). We also know that the first character of the string never reappears in it. Our task is to determine the string using at most $N+2$ guesses. In each guess we specify a string $G$ of length at most $4N$ consisting of the above four characters, and we get back the maximum length of a prefix of $S$ which is a substring of $G$.

Let's try some sample. Suppose that $N=3$, and we make our first guess of $G = ABX$. Let's see what answers we could get. If the answer is $3$, then it means we are lucky – there is only one substring in $G$ of length $3$ (the whole string) and it is a prefix of $S$, so $S=ABX$ and we are done. But probably we won't be so lucky in our first guess. If the answer is $2$, then it means that either $AB$ or $BX$ is a prefix of $S$, thus $S = AB?$ or $S = BX?$. If the answer is $1$, then it means that either $A$, $B$ or $X$ is a prefix of $S$, so we know that the first letter of $S$ is either $A$, $B$ or $X$. Finally, if the answer is $0$, that means that the first letter of $S$ must be $Y$.

First attempt – guessing exactly one prefix

It's hard to guess the whole string, but we can start small, and try to guess some shorter prefix of $S$ – maybe just one letter? In order to do that, we can ask $4$ guesses: $G=A$, $G=B$, $G=X$ and $G=Y$. Exactly one of them will give us answer $1$, and this will be the desired letter; suppose that it was $B$. Then we can guess the second letter, by asking similar $4$ guesses: $G=BA$, $G=BB$, $G=BX$ and $G=BY$.

This idea leads to a solution in which we guess letters one by one from the beginning. Suppose that we have already guessed the first $i$ letters, which form a prefix $P$ of length $i$. To guess the $(i+1)$-th letter we use $4$ guesses; in each guess we append a different letter to this prefix, thus we have guesses $G=PA$, $G=PB$, $G=PX$ and $G=PY$. Exactly one such guess will return $i+1$ (the others will return $i$). Therefore we will use exactly $4N$ guesses, which guarantees us a positive score for the problem. The code is quite simple:

#include <algorithm>
#include <string>
#include "combo.h"
using namespace std;

#define REP(i, n) for (int i = 0; i < (n); i++)

string guess_sequence(int N) {
  const int M = 4;
  const char LET[] = "ABXY";
  int len[M];
  string s;
  REP(i, N) {
    REP(j, M) { len[j] = press(s + LET[j]); }
    REP(j, M) {
      if (len[j] == i+1) {
        s += LET[j];
        break;
      }
    }
  }
  return s;
}

This code can be easily optimized. If any call to the function press returns $i+1$, we can immediately break, since we know that we have found the next letter. This helps in some cases, but we gain nothing in the worst case, e.g. when $S=YYY\ldots Y$. But if the first $3$ guesses $G=PA$, $G=PB$ and $G=PX$ all returned $i$, we are sure that the next letter is neither $A$, nor $B$, nor $X$, so it must be $Y$ and we don't need a fourth guess to confirm it. So it is enough to make $3N$ guesses:

string s;
REP(i, N) {
  REP(j, M-1) {
    int len = press(s + LET[j]);
    if (len == i+1) {
      s += LET[j];
      break;
    }
  }
  if (s.size() != i+1) { s += LET[M-1]; }
}

Interlude – worst-case and average-case

This reasoning is about the worst case – we will never make more than $3N$ guesses. But the exact number of guesses depends on the test case – for some test cases we make fewer guesses. If we test letters in the order $A$, $B$, $X$, $Y$, then for the test case $S=AAA\ldots A$ we would use $N$ guesses, for $S=BBB\ldots B$ we would use $2N$ guesses, but for $S=XXX\ldots X$ and $S=YYY\ldots Y$ we would use $3N$ guesses. And no matter what order of letters we use, the judges will test our program on all such cases, and for at least one of them we will use $3N$ guesses.

But suppose that the test case was random, i.e. each letter would be randomly generated with probability $1/4$. What would be the average number of guesses then?

There is a $1/4$ chance that the first letter is $A$, so we would guess it on the first try. There is also a $1/4$ chance that it is the letter $B$, so we would guess it on the second try. Otherwise we need a third guess. So the average number of guesses per letter is $1/4 \cdot 1 + 1/4 \cdot 2 + 1/2 \cdot 3 = 9/4 = 2.25$.

But test cases don't have to be random, and for sure there will be some non-random test cases prepared by the judges. However, there are techniques, gathered under the general name of randomization, which, when correctly applied, allow us to treat the input data as if it were random.

And we can apply such a technique in our case. Rather than choosing a fixed order of testing letters, we choose this order independently for each letter, and every time we do it randomly. In this case the reasoning is as follows: the letter to guess is fixed, but there is a $1/4$ chance that it is the same as the first letter in our random order, a $1/4$ chance that it is the same as the second letter in our random order, and so on. With this approach we make $2.25 N$ guesses on average (but still $3N$ guesses at most if we are extremely unlucky):

int idx[M];
REP(i, M) { idx[i] = i; }
string s;
REP(i, N) {
  random_shuffle(idx, idx+M);
  REP(j, M-1) {
    int len = press(s + LET[idx[j]]);
    if (len == i+1) {
      s += LET[idx[j]];
      break;
    }
  }
  if (s.size() != i+1) { s += LET[idx[M-1]]; }
}

In the above code we don't test letters in fixed order, but we use the permutation stored in array idx. For every letter this permutation is randomly shuffled using function random_shuffle from C++ standard library.

Dealing with the first letter separately

So far we have not used the fact which was written in bold in task statement, namely that the first character never reappears in the string. That means that only for the first letter we have $4$ possibilities, and if we determine it, for each remaining letter we have only $3$ possibilities.

That means that we can determine the first character using only $3$ guesses, and for each of the remaining characters we need only $2$ guesses. Therefore we need in total $3 + 2(N-1) = 2N+1$ guesses. This will give us 30 points.

Improving our randomization approach we would have on average $1/3 \cdot 1 + 2/3 \cdot 2 = 5/3$ guesses, so in total $3 + 5/3(N-1) = 5/3N + 4/3$ guesses.
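For reference, here is a minimal sketch of the deterministic $2N+1$-guess approach described above (this is an illustration, not the original code; it assumes the function press from combo.h and the REP macro from the first listing):

string guess_sequence(int N) {
  const char LET[] = "ABXY";
  string s;
  // Position 0: at most 3 guesses among the 4 letters.
  int first = 3;
  REP(j, 3) {
    if (press(string(1, LET[j])) == 1) { first = j; break; }
  }
  s += LET[first];
  // Remaining positions: the first letter never reappears, so only
  // 3 candidates remain; 2 guesses suffice (if both fail, take the third).
  for (int i = 1; i < N; i++) {
    string cand;
    REP(j, 4) { if (j != first) cand += LET[j]; }
    if (press(s + cand[0]) == i+1)      { s += cand[0]; }
    else if (press(s + cand[1]) == i+1) { s += cand[1]; }
    else                                { s += cand[2]; }
  }
  return s;
}

In the worst case this makes $3 + 2(N-1) = 2N+1$ calls to press; the randomized variant differs only in shuffling the candidate order before each position.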

Guessing more than one prefix at once

So far we have only asked about one prefix at a time, so the strings we used as guesses were limited to length $N$. But we are allowed to ask for strings of length up to $4N$. Can we use this to reduce the number of guesses?

We somehow need to ask about more than one prefix in a single guess. Let's try to do this for the first letter. Suppose that we guess $G = AB$. If we get the answer $1$ or $2$, we know that either $A$ or $B$ is the first letter of the string. Otherwise we get $0$, and we know that either $X$ or $Y$ is the first letter of the string. In both cases, after one guess we have limited the number of possibilities for the first letter to two characters. It is easy to see that asking $G=A$ or $G=X$ in the second guess then allows us to determine the first letter in exactly two guesses. (That allows us to improve the previous algorithm to $2N$ guesses.)

We can try to utilize this trick for subsequent letters. So suppose we have already guessed the prefix $P$ of length $i$ and we are trying to determine letter on position $i+1$. We could try a guess $G = PAPB$ of length $2(i+1)$ in hope that getting answer of $i+1$ means that either $A$ or $B$ is at position $i+1$.

But we must be careful here. Suppose that $P = XAX$ and we ask for $G = XAXAXAXB$. Then the prefix $P$ appears in $G$ not two times (as we intended), but three. Luckily, this additional occurrence is also followed by the letter $A$, so no harm is done here, but had it been followed by the letter $X$, we could get the answer $i+1$ even if $S$ starts with $PX$.

Fortunately, we can prove that this cannot happen: every additional occurrence of $P$ must be followed by $A$ (roughly speaking, such an extra occurrence forces $P$ to overlap itself and hence be periodic, and this periodicity forces the character following the occurrence to coincide with the inserted $A$). Thanks to this, after the question $G=PAPB$ we can establish whether the next letter is in the set $\{A,B\}$ or in the set $\{X,Y\}$, and then we ask another question, $G=PA$ or $G=PX$. That gives us an alternative algorithm using $2N$ guesses, which can be written quite compactly:

string s;
REP(i, N) {
  int len = press(s + LET[0] + s + LET[1]);
  int id = 2*(len != i+1);
  len = press(s + LET[id]);
  id += (len != i+1);
  s += LET[id];
}

What if we use once more the fact that the first letter cannot reappear in the string $S$? Then for every position except the first we have only three possibilities. Suppose that $Y$ is the first letter. Then the answer to guess $G = PAPB$ gives us exactly the same information as answer to guess $G = PX$. Thus we are in no better shape than the algorithm we have used previously that asks two questions per position.

Now for every question we get a binary answer: either $i$ or $i+1$. If we want to guess a letter using only one question, and we have three possibilities for a letter, we need another possibility for an answer. Let's try to make a guess in which we could get $i+2$ for the answer.

That means that $G$ must contain prefix $P$ followed by letters in positions $i+1$ and $i+2$ in string $S$. It looks like in order to guess letter in position $i+1$ we must know letter in position $i+2$. But not necessarily. Look at the following guess: $G = PAAPABPAX$. If the letter in position $i+1$ is $A$, the answer would be $i+2$, since all possible extensions of $PA$ (namely $PAA$, $PAB$ and $PAX$) are present in $G$. Note that since the letter $Y$ appears at the beginning of $P$ and only there, the prefix $P$ will appear exactly three times in $G$, as intended by us.

Therefore we will ask guess $G = PAAPABPAXPB$. If the answer is $i+2$, we know that the next letter is $A$. If the answer is $i+1$, we know that the next letter is $B$. Finally, for answer $i$, we know that it is $X$. Such question will give us the next letter in one guess.

We must be a little bit careful though. First, let's check whether the length of the guess stays within $4N$ limit. If $P$ is of length $i$, the guess is of length $4i + 7$, which is fine up to the penultimate position, for which we have $4(N-2) + 7 = 4N-1$. For the last position the length would be too big, but we could not use this method for the last position anyhow, since there is no additional letter after it. Thus we will use standard method of $2$ guesses for the last position.

Therefore we use $2 + (N-2) + 2 = N+2$ guesses, which gives us a full score:

int idx[M];
REP(i, M) { idx[i] = i; }
string s;
// First letter (the confirmed prefix has length 0, so a hit returns 1)
{
  int len = press(s + LET[0] + s + LET[1]);
  int id = 2*(len != 1);
  len = press(s + LET[id]);
  id += (len != 1);
  s += LET[id];
  swap(idx[id], idx[M-1]);  // move the first letter out of idx[0..2]
}
// Almost rest
REP(i, N-2) {
  string z;
  REP(j, M-1) { z += s + LET[idx[0]] + LET[idx[j]]; }
  z += s + LET[idx[1]];
  int len = press(z);
  if (len == i+3) { s += LET[idx[0]]; }
  else if (len == i+2) { s += LET[idx[1]]; }
  else { s += LET[idx[2]]; }
}
// Last letter (for N == 1 the first block already found the whole string)
if (N >= 2) {
  int len = press(s + LET[0] + s + LET[1]);
  int id = 2*(len != N);
  len = press(s + LET[id]);
  id += (len != N);
  s += LET[id];
}

To finish, here is a question for interested readers. Modify the model solution a little bit to get an algorithm which uses on average strictly less than one guess per position.

Werewolf

In this problem we are given a connected undirected graph with $N$ vertices and $M$ edges. The vertices are numbered from $0$ to $N-1$. We must answer $Q$ queries. Each query is represented by four integers $(S,E,L,R)$ and asks whether there exists a path $v_0, v_1, \ldots, v_k$ in the graph starting in vertex $S$ (thus $v_0 = S$), ending in vertex $E$ (thus $v_k = E$), with some switch vertex $v_s$ in the middle (for $0 \leq s \leq k$). All vertices up to and including the switch vertex must have big numbers ($L \leq v_0, v_1, \ldots, v_s$) and all vertices from the switch vertex on must have small numbers ($v_s, v_{s+1}, \ldots, v_k \leq R$).

Solving queries independently

In problems that involve queries, it is usually a good idea to first develop a solution for a single query. This is often much easier, gives us some intuition about the problem, and also allows us to write a slow solution for partial points that handles the queries independently, one by one.

The first idea is to simulate exactly what is given in the problem statement. In order to find a path from $S$ to $E$ with some switch vertex, we can iterate over all possible choices for this switch vertex $v$, and then find a path from $S$ to $v$ using only vertices in range $[L,N-1]$ and find a path from $v$ to $E$ using only vertices in range $[0,R]$. The C++ code for this idea can look like this:

vector<int> check_validity(int N, vector<int> X, vector<int> Y,
                           vector<int> S, vector<int> E,
                           vector<int> L, vector<int> R) {
  int Q = S.size();
  vector<int> A(Q);
  // Construct the graph
  vector<vector<int> > adj(N);
  int M = X.size();
  REP(i, M) {
    adj[X[i]].push_back(Y[i]);
    adj[Y[i]].push_back(X[i]);
  }
  REP(i, Q) { // Compute the i-th query
    REP(v, N) { // Try switch vertex
      bool reach_S = reachable(adj, N, S[i], v, L[i], N-1);
      bool reach_E = reachable(adj, N, v, E[i], 0, R[i]);
      A[i] |= reach_S && reach_E;
    }
  }
  return A;
}

To implement function reachable we can use any graph-traversal algorithm, but we must only visit vertices which are in a certain range. Here we use BFS:

bool reachable(vector<vector<int> >& adj, int N, int start, int end, int lo, int hi) {
  // Can we go from start to end using vertices in range [lo, hi]?
  queue<int> que;
  vector<bool> vis(N);
  if (lo <= start && start <= hi) {
    que.push(start);
    vis[start] = true;
  }
  while (!que.empty()) {
    int v = que.front();
    que.pop();
    for (int u : adj[v]) {
      if (lo <= u && u <= hi && !vis[u]) {
        que.push(u);
        vis[u] = true;
      }
    }
  }
  return vis[end];
}

The time complexity of the function reachable is linear in the size of the graph, thus it is $O(N+M)$. For each of the $Q$ queries and each of the $N$ choices of the switch vertex we run it twice, so the total time complexity of this solution is $O(QN(N+M))$, which is enough to solve subtask 1.

Let's look at the calls to reachable for a fixed query. Note that we make multiple calls with the same starting vertex $S_i$ and various end vertices. But in order to check reachability from $S_i$ to a given vertex, the function reachable calculates the array vis, which stores reachability from $S_i$ to every other vertex, and then examines just one cell of this array.

It is wasteful to recompute the array vis over and over. It's better to calculate it once and then reuse it for all choices of the switch vertex. Note that we can do the same with the calls sharing the ending vertex $E_i$, since in an undirected graph we can reach $E_i$ from $v$ if and only if we can reach $v$ from $E_i$. The new code looks like this:

vector<bool> reachable(vector<vector<int> >& adj, int N, int start, int lo, int hi) {
  // Where can we go from start using vertices in range [lo, hi]?
  // ...
  return vis;
}

vector<int> check_validity(int N, vector<int> X, vector<int> Y,
                           vector<int> S, vector<int> E,
                           vector<int> L, vector<int> R) {
  // ...
  REP(i, Q) { // Compute the i-th query
    vector<bool> reach_S = reachable(adj, N, S[i], L[i], N-1);
    vector<bool> reach_E = reachable(adj, N, E[i], 0, R[i]);
    REP(v, N) { // Try switch vertex
      A[i] |= reach_S[v] && reach_E[v];
    }
  }
  return A;
}

We can think of reach_S as a set of vertices that are reachable from $S$ using vertices with big numbers and reach_E as a set of vertices reachable from $E$ using vertices with small numbers. Finding existence of a switch vertex is just checking whether these two sets intersect. For each query we run reachable twice and iterate over switch vertices in time $O(N+M)$, so the total complexity is $O(Q(N+M))$ and solves subtasks 1 and 2.

There is an alternative solution that also handles each query independently. We can traverse the graph looking for a path from $S$ to $E$, but instead of explicitly finding the switch vertex, we track in which part of the path we are. To do so, we create a new graph with $2N$ vertices. For every vertex $v$ of the original graph we have two vertices $(v,0)$ and $(v,1)$ in the new graph; the second coordinate specifies whether we are before or after the switch vertex. Thus for every edge $vu$ of the original graph we have edges $(v,0)(u,0)$ and $(v,1)(u,1)$ in the new graph, which represent normal moves along the path. But we also have edges $(v,0)(v,1)$, which represent selecting $v$ as the switch vertex. Now all we need to do is find a path in this new graph from $(S,0)$ to $(E,1)$. This path will have the form

$$(v_0,0), (v_1,0), \ldots, (v_s,0), (v_s,1), (v_{s+1},1), \ldots, (v_k,1),$$

with $v_0=S$ and $v_k=E$ and vertex $v_s$ will be the switch vertex. The code is as follows:

vector<int> check_validity(int N, vector<int> X, vector<int> Y,
                           vector<int> S, vector<int> E,
                           vector<int> L, vector<int> R) {
  // ...
  REP(i, Q) { // Compute the i-th query
    int start = S[i]*2;
    int end = E[i]*2 + 1;
    queue<int> que;
    vector<bool> vis(2*N);
    if (S[i] >= L[i]) {  // otherwise no valid path can start at S[i]
      que.push(start);
      vis[start] = true;
    }
    while (!que.empty()) {
      int v = que.front();
      que.pop();
      int vert = v/2, form = v%2;
      // Move
      for (int u : adj[vert]) {
        if ((form == 0 && u >= L[i]) || (form == 1 && u <= R[i])) {
          int uu = u*2 + form;
          if (!vis[uu]) { que.push(uu); vis[uu] = true; }
        }
      }
      // Change form
      if (form == 0 && vert <= R[i]) {
        int uu = vert*2 + 1;
        if (!vis[uu]) { que.push(uu); vis[uu] = true; }
      }
    }
    A[i] = vis[end];
  }
  return A;
}

We have used an implementation trick: we represent node $(v,i)$ as an integer $2v+i$. Now for every query we do a single BFS in a graph with $O(N)$ vertices and $O(M+N)$ edges, thus the total time complexity is $O(Q(N+M))$.

Solving problem on a line graph – the first approach

Another simplification we can do when solving a problem on a graph, is to make this graph simpler. One idea would be to try to solve the problem on a tree, or even on a line graph (in which all vertices are connected in a line). Since this is what subtask 3 is about, it definitely is worth trying.

Why is the problem on a line simpler? Recall that we can solve the problem by calculating the set $V_L$ of vertices reachable from $S$ using vertices with numbers at least $L$, and the set $V_R$ of vertices reachable from $E$ using vertices with numbers at most $R$. The answer to the query is positive if the sets $V_L$ and $V_R$ intersect. These sets look rather arbitrary, and to calculate them and their intersection we spend $O(N+M)$ time.

However, in a line graph, every such set forms a continuous range, thus it can be described by two numbers: the position of the first and the last vertex in the range. And we can calculate intersection of such ranges in constant time. Indeed, if we store a range as a pair of numbers, the intersection code is as follows:

bool ranges_intersect(const pair<int,int>& a, const pair<int,int>& b) {
  return ! (a.second < b.first || b.second < a.first);
}

The first thing we need to do is to calculate for each vertex its position on the line, i.e. its distance from the first vertex on the line. We can do it by selecting one of the two vertices with degree $1$, and traversing the line, every time going to a neighbour not yet visited:

vector<int> position(N);
int v = 0, prev = -1;
REP(i, N) { if (adj[i].size() == 1) { v = i; } }  // pick one endpoint of the line
REP(i, N) {
  position[v] = i;
  if (i == N-1) break;            // the other endpoint has no further neighbour
  int idx = (adj[v][0] == prev);  // choose the neighbour we did not come from
  prev = v;
  v = adj[v][idx];
}

Now we need to think how to calculate the sets $V_L$ and $V_R$. The set $V_L$ is the maximal range on the line that contains the vertex $S$ and consists only of vertices with numbers in the range $[L,N-1]$. That boils down to a slightly more general problem: given a line of numbers and a position $x$, calculate the maximal range that contains $x$ and consists only of positions whose numbers are in a given range $[lo,hi]$.

We can independently find the left end and the right end of this range. So let's focus on the left end. We start from a range containing only position $x$, and increase it to the left one element at a time, until we hit an element whose value is outside the range $[lo,hi]$. That suggests that we can use binary search to find the left endpoint, if we can quickly calculate minimum and maximum values in ranges on a line:

int lb = 0, ub = x;
while (lb != ub) {
  int s = (lb+ub) / 2;
  int mi = minimum_on_range(s, x);
  int ma = maximum_on_range(s, x);
  if (lo <= mi && ma <= hi) { ub = s; } else { lb = s+1; }
}
return lb;

Calculating minimums (or maximums) in ranges is a classic problem called RMQ (Range Minimum Query). One solution to this problem is to use segment tree. Then each query can be calculated in $O(\log N)$ time. For completeness we include here one implementation of such a tree:

struct tree_t {
  vector<int> mins, maxs;
  int base;
  int n;
  tree_t(const vector<int>& vals) {
    n = vals.size();
    base = 1;
    while (base < n) { base *= 2; }
    mins.resize(2*base);
    maxs.resize(2*base);
    REP(i, n) { mins[base + i] = maxs[base + i] = vals[i]; }
    for (int i = base-1; i >= 1; i--) {
      mins[i] = min(mins[2*i], mins[2*i+1]);
      maxs[i] = max(maxs[2*i], maxs[2*i+1]);
    }
  }
  pair<int,int> query(int xl, int xr) {
    // Returns (min, max) on range [xl, xr]
    xl += base; xr += base;
    int mi = min(mins[xl], mins[xr]);
    int ma = max(maxs[xl], maxs[xr]);
    while (xl/2 != xr/2) {
      if (~xl&1) { mi = min(mi, mins[xl+1]); ma = max(ma, maxs[xl+1]); }
      if (xr&1)  { mi = min(mi, mins[xr-1]); ma = max(ma, maxs[xr-1]); }
      xl /= 2; xr /= 2;
    }
    return make_pair(mi, ma);
  }
};

Since range query works in $O(\log N)$ time, the binary search function for finding the left end of the set range works in $O(\log^2 N)$ time. Here we present the code to the function; it is a good exercise to complete it with code calculating the right end of the set range:

pair<int,int> calculate_range(tree_t& tree, int N, int x, int lo, int hi) {
  // Calculate the maximal range of positions that contains position x
  // and whose elements are in the range [lo, hi].
  pair<int,int> ans;
  int lb = 0, ub = x;
  while (lb != ub) {
    int s = (lb+ub) / 2;
    pair<int,int> mima = tree.query(s, x);
    if (lo <= mima.first && mima.second <= hi) { ub = s; } else { lb = s+1; }
  }
  ans.first = lb;
  // ... (calculate ans.second)
  return ans;
}
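For reference, one possible way to complete the exercise (a sketch, not necessarily the intended code): the right end is the largest position $ub \geq x$ such that all values in $[x, ub]$ stay inside $[lo, hi]$, found by a symmetric binary search whose midpoint is rounded up to guarantee progress.

int lb2 = x, ub2 = N-1;
while (lb2 != ub2) {
  int s = (lb2 + ub2 + 1) / 2;  // round up so that the range always shrinks
  pair<int,int> mima = tree.query(x, s);
  if (lo <= mima.first && mima.second <= hi) { lb2 = s; } else { ub2 = s-1; }
}
ans.second = lb2;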

Finally, we only need to construct the tree, based on the numbers of the vertices along the line, and use the calculate_range function to process the queries:

// Construct segment tree
vector<int> numbers(N);
REP(i, N) { numbers[position[i]] = i; }
tree_t tree(numbers);

REP(i, Q) { // Compute the i-th query
  pair<int,int> Lrange = calculate_range(tree, N, position[S[i]], L[i], N-1);
  pair<int,int> Rrange = calculate_range(tree, N, position[E[i]], 0, R[i]);
  A[i] = ranges_intersect(Lrange, Rrange);
}

The time complexity of determining positions of vertices along the line and constructing the segment tree is $O(N)$, and time complexity of each query is $O(\log^2 N)$, so the total time complexity is $O(N + Q\log^2 N)$.

It turns out that we can improve the running time of the calculate_range function. Instead of using binary search, we can find the left end directly on the segment tree. We start in the leaf corresponding to the position $x$ and go up the tree. The first time we find a node which is a right child and whose left sibling $y$ has a range of numbers not contained in $[lo,hi]$, we know that the left end must lie inside the segment of $y$. So we now go down the tree, finding the last position in this segment that violates the $[lo,hi]$ constraint. The code is as follows:

int range_left(int x, int lo, int hi) {
  x += base;
  // Go up; whenever x is a right child, x-- moves to its left sibling and tests it.
  while (x > 1 && (~x&1 || x-- && lo <= mins[x] && maxs[x] <= hi)) { x /= 2; }
  if (x == 1) { return 0; }  // no violating element to the left
  // Here x is a left sibling whose segment violates [lo,hi];
  // descend to the last (rightmost) violating position inside it.
  while (x < base) {
    if (lo <= mins[2*x+1] && maxs[2*x+1] <= hi) { x = 2*x; } else { x = 2*x+1; }
  }
  return x+1 - base;
}

Thus the complexity of each query reduces to $O(\log N)$ and the complexity of the whole program to $O(N + Q\log N)$.

Solving problem on a line graph – the second approach

There is another approach for the line graph. But in this approach, before answering any queries, we will do some preprocessing.

Suppose we process the query $(S,E,L,R)$ and we want to find the set $V_L$, that is, the set of all vertices reachable from $S$ when we can only go through vertices with numbers in the range $[L,N-1]$. We can construct a new graph that contains the same set of vertices, but only the edges whose both endpoints are in the range $[L,N-1]$. In this new graph, $V_L$ is the connected component containing the vertex $S$.

To calculate connected components we can simply use any graph-traversal algorithm. But there is another solution, using the disjoint-set union (DSU) data structure, which allows us to solve the dynamic version of this problem, in which edges are added to the graph.

Why is this interesting? Well, we can sort all our queries in non-increasing order of the values $L_i$. We start from a graph without any edges and iterate over the queries in this order. For a given query we add to the graph all the edges that connect vertices with numbers in the range $[L_i,N-1]$ and have not been added yet. We maintain a DSU structure with the connected components, and for each component we remember the positions of its left-most and right-most vertices on the line. Therefore we can quickly obtain the range for the vertex $S$ and store it.

The DSU structure is stored in the array fu, which holds the parents in the DSU trees. For roots we have fu[i] = -1, and in this case pos_sml[i] and pos_big[i] store the left-most and right-most positions of the component. Here are the two functions of the DSU structure, finding the root of a component and joining two components:

int fufind(vector<int>& fu, int x) {
  return fu[x] < 0 ? x : fu[x] = fufind(fu, fu[x]);
}

void fujoin(vector<int>& fu, vector<int>& pos_sml, vector<int>& pos_big, int x, int y) {
  x = fufind(fu, x);
  y = fufind(fu, y);
  if (x != y) {
    fu[y] = x;
    pos_sml[x] = min(pos_sml[x], pos_sml[y]);
    pos_big[x] = max(pos_big[x], pos_big[y]);
  }
}

The following code processes the queries:

vector<int> idx(Q);
vector<int> fu(N), pos_sml(N), pos_big(N);
vector<pair<int,int> > Lrange(Q);
// Sort queries
REP(i, Q) { idx[i] = i; }
sort(idx.begin(), idx.end(), [&L](int i, int j) { return L[i] > L[j]; });
REP(i, N) { fu[i] = -1; pos_sml[i] = pos_big[i] = position[i]; }
int next = N-1;
REP(i, Q) {
  int limit = L[idx[i]];
  for ( ; next >= limit; next--) {
    // Try to join vertex next
    for (int u : adj[next]) {
      if (u >= limit) { fujoin(fu, pos_sml, pos_big, next, u); }
    }
  }
  int root = fufind(fu, S[idx[i]]);
  Lrange[idx[i]] = make_pair(pos_sml[root], pos_big[root]);
}

We leave calculating the sets $V_R$ as an exercise – the code is symmetrical to the above. The sorting of the queries works in $O(Q\log Q)$ time (but since we are sorting numbers smaller than $N$, we can replace it with counting sort working in $O(Q+N)$ time). Invoking the fujoin function $O(N)$ times runs in $O(N \log^* N)$ time. Therefore the whole solution runs in $O(N\log^* N + Q)$ time.
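For reference, the symmetric pass could look roughly as follows (a sketch, not necessarily the intended code; it reuses fu, pos_sml, pos_big and idx from above and assumes a vector Rrange analogous to Lrange). Queries are processed in non-decreasing order of $R$, and an edge is joined once both of its endpoints are at most the current limit; at the end the two ranges are intersected.

vector<pair<int,int> > Rrange(Q);
REP(i, Q) { idx[i] = i; }
sort(idx.begin(), idx.end(), [&R](int i, int j) { return R[i] < R[j]; });
REP(i, N) { fu[i] = -1; pos_sml[i] = pos_big[i] = position[i]; }  // reset the DSU
int nxt = 0;
REP(i, Q) {
  int limit = R[idx[i]];
  for ( ; nxt <= limit; nxt++) {
    for (int u : adj[nxt]) {
      if (u <= limit) { fujoin(fu, pos_sml, pos_big, nxt, u); }
    }
  }
  int root = fufind(fu, E[idx[i]]);
  Rrange[idx[i]] = make_pair(pos_sml[root], pos_big[root]);
}
REP(i, Q) { A[i] = ranges_intersect(Lrange[i], Rrange[i]); }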

Seats

We are given a rectangular grid of size $H \times W$ in which all numbers from $0$ to $HW-1$ are written. We say that a subrectangle of area $k$ is beautiful, if it contains all numbers from $0$ to $k-1$ inside. We must calculate the number of beautiful subrectangles after each of $Q$ queries. Every query swaps two numbers in the grid.

The function give_initial_chart will be called exactly once at the beginning of our program, to give us a chance to inspect the initial grid. We just store all its arguments in global variables. We can reuse the same names thanks to the C++ global namespace prefix ::.

int H, W;
vector<int> R, C;

void give_initial_chart(int H, int W, vector<int> R, vector<int> C) {
  ::H = H; ::W = W;
  ::R = R; ::C = C;
}

Number $i$ in the grid is placed in row $R[i]$ and column $C[i]$.

The first approach – maintaining bounding boxes

As usual, a good idea is to simplify the problem and see how to solve it independently for each query. A query swaps numbers $a$ and $b$, so we can just swap their rows $R[a]$ and $R[b]$, as well as their columns $C[a]$ and $C[b]$.

We can try two different approaches. Since we are looking for rectangles containing consecutive numbers starting from 0, we can either iterate through all rectangles and check whether they contain the correct numbers, or iterate through all prefixes of numbers and check whether they form a rectangle.

For the latter approach, let's say that we have selected the numbers from $0$ to $k$. How to test whether they form a rectangle? A rectangle can be described by four values $R_{min}$, $R_{max}$, $C_{min}$ and $C_{max}$ as the set of cells $(r,c)$ that satisfy $R_{min} \leq r \leq R_{max}$ and $C_{min} \leq c \leq C_{max}$. It is easy to calculate candidates for these four values: we iterate over the numbers, and for each number $i$ we ensure that the cell $(R[i], C[i])$ is contained within the bounds:

R_min = min(R_min, R[i]); R_max = max(R_max, R[i]); C_min = min(C_min, C[i]); C_max = max(C_max, C[i]);

That's how we find the tightest bounds that contain all the cells. Now we must check that no number bigger than $k$ lies inside these bounds. If there is none, we have found a beautiful rectangle and we can increase the counter. We are ready for our first code:

int swap_seats(int a, int b) {
  swap(R[a], R[b]);
  swap(C[a], C[b]);
  int ans = 0;
  REP(k, H*W) {
    int R_min = R[0], R_max = R[0];
    int C_min = C[0], C_max = C[0];
    REP(i, k+1) {
      R_min = min(R_min, R[i]); R_max = max(R_max, R[i]);
      C_min = min(C_min, C[i]); C_max = max(C_max, C[i]);
    }
    bool ok = true;
    for (int i = k+1; i < H*W; i++) {
      if (R_min <= R[i] && R[i] <= R_max && C_min <= C[i] && C[i] <= C_max) {
        ok = false;
        break;
      }
    }
    if (ok) { ans++; }
  }
  return ans;
}

For each of $Q$ queries we test each prefix (we have $HW$ of them) and for this prefix we go through all $HW$ numbers. Thus the time complexity is $O(QH^2W^2)$, which is enough to pass subtask 1.

Note, however, that we do a lot of unnecessary work. For each prefix we calculate the bounds from scratch; since we add the numbers one by one, it's better to update the previous bounds by comparing them with the current $R[i], C[i]$. The second observation is that we don't need to check the numbers bigger than $k$. We want to know whether the numbers from $0$ to $k$ form a rectangle with the given bounds. From the bounds we can calculate the size of this rectangle, which is

$S = (R_{max}+1-R_{min})(C_{max}+1-C_{min}).$

Thus the rectangle contains exactly $S$ cells, so if they are filled only with numbers from $0$ to $k$ we must have $k+1 = S$.

int swap_seats(int a, int b) {
  swap(R[a], R[b]);
  swap(C[a], C[b]);
  int ans = 0;
  int R_min = R[0], R_max = R[0];
  int C_min = C[0], C_max = C[0];
  REP(i, H*W) {
    R_min = min(R_min, R[i]); R_max = max(R_max, R[i]);
    C_min = min(C_min, C[i]); C_max = max(C_max, C[i]);
    int Size = (R_max + 1 - R_min) * (C_max + 1 - C_min);
    if (i+1 == Size) { ans++; }
  }
  return ans;
}

The time complexity is reduced to $O(HW)$ work per query, so $O(QHW)$ work in total. This passes subtasks 1 and 2.

This approach can also be extended to solve subtask 4, in which the distance between $a$ and $b$ in each query is smaller than some $D$. Assume that $a < b$. Note that if we swap the positions of numbers $a$ and $b$, all numbers in the prefix of length $a$ stay the same, so the beautiful rectangles for prefixes of this length stay the same. Beginning with number $i=a$ the rectangles may differ, but from number $i=b$ on they will again stay the same (since the set of numbers in the range $[a,b]$ did not change, only their order). Thus we only need to recalculate beautiful rectangles in this range $[a,b]$ of length $O(D)$.

The easiest way to do this is to store the values of $R_{min}$, $R_{max}$, $C_{min}$ and $C_{max}$ for all indices, together with the information whether the rectangle for that index is beautiful. We also keep track of the total number of beautiful rectangles, and every time we recalculate a value, we update this total:

vector<int> R_mins, R_maxs;
vector<int> C_mins, C_maxs;
vector<bool> beaut;
int total;

void update_maxmin(int i) {
  R_mins[i+1] = min(R_mins[i], R[i]);
  R_maxs[i+1] = max(R_maxs[i], R[i]);
  C_mins[i+1] = min(C_mins[i], C[i]);
  C_maxs[i+1] = max(C_maxs[i], C[i]);
  int Size = (R_maxs[i+1] + 1 - R_mins[i+1]) * (C_maxs[i+1] + 1 - C_mins[i+1]);
  if (i+1 == Size && !beaut[i+1]) { beaut[i+1] = true; total++; }
  else if (i+1 != Size && beaut[i+1]) { beaut[i+1] = false; total--; }
}

In function give_initial_chart we initialize all values:

R_mins.assign(H*W+1, H*W);
R_maxs.assign(H*W+1, -1);
C_mins.assign(H*W+1, H*W);
C_maxs.assign(H*W+1, -1);
beaut.assign(H*W+1, false);
REP(i, H*W) { update_maxmin(i); }

Finally, the code for the query is very simple. We just need to update the values in range $[a,b]$:

int swap_seats(int a, int b) {
  swap(R[a], R[b]);
  swap(C[a], C[b]);
  for (int i = min(a,b); i <= max(a,b); i++) { update_maxmin(i); }
  return total;
}

This code has time complexity of $O(HW + QD)$ and it passes subtask 4 (and also subtasks 1 and 2, since obviously $D \leq HW$).

The second approach – limited sizes of a grid

How to speed up the above solution? For every query we iterate over all $HW$ numbers to find which prefixes form beautiful rectangles. But how many beautiful rectangles can we have? Well, every rectangle must contain the previous one, therefore it must be bigger. That means that the sum of its dimensions (width plus height) must be larger. And since the sum of dimensions of a subrectangle is limited by $H+W$, there are at most $H+W$ beautiful rectangles.

Let's take a look at constraints of subtask 3. Although the grid in this subtask can be quite big, both dimensions $H$ and $W$ are limited, thus $H+W$ also. But how to iterate over only beautiful rectangles?

Suppose that after investigating number $i$ we found that our candidate is given by the boundaries $R_{min}$, $R_{max}$, $C_{min}$ and $C_{max}$. If it satisfies our condition for a beautiful rectangle (i.e. its size calculated from the boundaries equals $i+1$), then the next number $i+1$ will be outside it, thus it will generate a new candidate. But if the condition is not satisfied, then it is possible that some subsequent numbers will be inside the rectangle while still not satisfying the condition. Actually, if the size of the candidate is $S$, then no number smaller than $S-1$ can satisfy the condition. Therefore we can immediately jump to number $S-1$.

Either the condition for this number is satisfied (that means we have a beautiful rectangle, and number $S$ will generate a bigger candidate), or the condition is not satisfied, which means that the candidate generated by number $S-1$ is already bigger. Either way, after a jump we generate a bigger candidate. Thus maybe not all candidates we iterate over are beautiful, but still we have only $O(H+W)$ candidates.

The question is how to quickly test whether the boundaries for the numbers between $0$ and $k$ satisfy the condition. It's enough to calculate the minima and maxima of the rows and columns over a prefix of length $k+1$. Once again we can use an RMQ data structure here, e.g. a segment tree (see problem Werewolf). Now we need to extend the tree_t structure with an operation updating the value in the cell at position x. This operation runs in $O(\log n)$ time, where $n$ is the number of cells in the structure:

void update(int x, int val) {
  x += base;
  mins[x] = maxs[x] = val;
  while (x > 1) {
    x /= 2;
    mins[x] = min(mins[2*x], mins[2*x+1]);
    maxs[x] = max(maxs[2*x], maxs[2*x+1]);
  }
}

We create two trees: one tree for querying minimums and maximums of values $R[i]$, and the other one for minimums and maximums of values $C[i]$. We declare two global variables:

tree_t* rows_tree; tree_t* cols_tree;

and initialize them in give_initial_chart function:

rows_tree = new tree_t(R); cols_tree = new tree_t(C);

Now at the beginning of each query we must update the cells $a$ and $b$ of both trees. Then we iterate over the candidates just the way we described before:

int swap_seats(int a, int b) {
  swap(R[a], R[b]);
  swap(C[a], C[b]);
  rows_tree->update(a, R[a]); cols_tree->update(a, C[a]);
  rows_tree->update(b, R[b]); cols_tree->update(b, C[b]);
  int R_min, R_max;
  int C_min, C_max;
  int ans = 0, i = 0;
  while (i < H*W) {
    tie(R_min, R_max) = rows_tree->query(0, i);
    tie(C_min, C_max) = cols_tree->query(0, i);
    int Size = (R_max + 1 - R_min) * (C_max + 1 - C_min);
    if (i+1 == Size) { ans++; i++; }
    else { i = Size-1; }
  }
  return ans;
}

The initialization of the trees runs in $O(HW)$ time. Then for each query we do a constant number of tree updates in $O(\log HW)$ time and we iterate over at most $O(H+W)$ candidates, for each of them making two tree queries in $O(\log HW)$ time. Thus the total time complexity is $O(HW + Q(H+W)\log HW)$. This algorithm solves subtask 3.

The third approach – one-dimensional grid

Now, let's move to subtask 5. The grid can be big, but we have $H=1$, so it can only have one row. Therefore we can simplify our problem and make a one-dimensional version of it.

The idea is: how to characterize that a set of cells in a one-dimensional array is a rectangle? In an array, a rectangle is just a contiguous segment of cells. Therefore the condition that characterizes rectangles in arrays is simply: the set must be connected.

So far we have maintained connectivity information by storing the two values $C_{min}$ and $C_{max}$: whenever the quantity $C_{max} + 1 - C_{min}$ was equal to the number of cells in the set, the set was connected.

But we can use a different condition that describes connectivity. Let's color the cells from the set black, and the rest of them white (also treat the cells outside the array as white). Now, the set of black cells is connected if and only if the number of pairs of adjacent cells with different colors is exactly two.

That's true, since every connected component generates exactly two such pairs. Thus we can start from a set containing only one black cell (the one with the number $0$ in it). This cell already generates two pairs. Now let's add new black cells one by one, and every time let's see how this changes the number of pairs. Let $line[c]$ denote the number in cell $c$ (thus $line[C[i]] = i$).

If we add number $i$ at position $c=C[i]$, then it can affect only the two pairs $(c-1,c)$ and $(c,c+1)$. For the first pair, if the number $line[c-1]$ is greater than $i$, then the cell $c-1$ is still white, and painting cell $c$ black creates a new pair. On the other hand, if $c-1$ is black, painting cell $c$ black removes one pair. A similar case analysis applies to the pair $(c,c+1)$. The following code calculates how the number of pairs changes after painting number $i$:

int calc_delta(int i) {
  int c = C[i];
  int num = 0;
  num += (c == 0   || line[c-1] > i) ? 1 : -1;
  num += (c == W-1 || line[c+1] > i) ? 1 : -1;
  return num;
}

If we make prefix sums of these deltas, then every time we hit value $2$, we have a beautiful rectangle:

int swap_seats(int a, int b) {
  swap(C[a], C[b]);
  swap(line[C[a]], line[C[b]]);
  int ans = 0, pref = 0;
  REP(i, W) {
    pref += calc_delta(i);
    if (pref == 2) { ans++; }
  }
  return ans;
}

But this solution still requires $O(W)$ work per query, so it is not very efficient. How to speed it up? We would need a data structure maintaining the sequence of “deltas”, i.e. the changes caused by painting subsequent numbers. The first operation needed from this data structure is updating a single entry in the sequence of deltas. Indeed, the value of calc_delta(i) depends only on the elements at positions $C[i]$, $C[i]-1$ and $C[i]+1$, therefore changing the element at position $C[a]$ only requires recomputing the deltas of the numbers located at positions $C[a]$, $C[a]-1$ and $C[a]+1$. That is a constant number of updates.

The second operation is to count how many values equal to $2$ appear among the prefix sums of the delta sequence. In general this is not easy to maintain, but we have a nice property: the prefix sums never go below $2$ (we always have at least one connected component). Moreover, since calc_delta(0) is always equal to $2$, we can simplify things a little bit by replacing this value with $0$ (we add a special case to the calc_delta function); then we count zeros instead of twos.
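Concretely, the special case could look like this (a sketch; only the first line differs from the calc_delta shown earlier):

int calc_delta(int i) {
  if (i == 0) return 0;  // would always be 2; shifting by 2 lets us count zeros
  int c = C[i];
  int num = 0;
  num += (c == 0   || line[c-1] > i) ? 1 : -1;
  num += (c == W-1 || line[c+1] > i) ? 1 : -1;
  return num;
}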

Now our structure needs to support the following operations on a sequence delta: (1) update a single element of the sequence, (2) get the number of zeros in prefix sum of the sequence, provided that all numbers in prefix sum are non-negative.

It turns out that we can implement such a data structure using a segment tree, so that the former operation runs in $O(\log n)$ time (where $n$ is the length of the sequence), and the latter in constant time. We get to this in a minute, but for now we assume that we have such structure called prefixsum_zeros_tree_t and we implement the rest of the solution. We have some additional variables:

vector<int> line; vector<int> delta; prefixsum_zeros_tree_t* delta_tree;

In give_initial_chart function we do natural initialization of these variables:

line.assign(W, 0);
delta.assign(W, 0);
REP(i, W) { line[C[i]] = i; }
REP(i, W) { delta[i] = calc_delta(i); }
delta_tree = new prefixsum_zeros_tree_t(delta);

Now the code of the update function and swap_seats function is quite short:

void update_delta(int c) {
  if (c < 0 || c >= W) return;
  int i = line[c];
  delta[i] = calc_delta(i);
  delta_tree->update(i, delta[i]);
}

int swap_seats(int a, int b) {
  if (H != 1) return -1;  // this version handles only the one-row case (subtask 5)
  swap(C[a], C[b]);
  swap(line[C[a]], line[C[b]]);
  for (int i = -1; i <= 1; i++) {
    update_delta(C[a]+i);
    update_delta(C[b]+i);
  }
  int ans = delta_tree->count_prefixsum_zeros();
  return ans;
}

The time complexity of initializing the tree is $O(W)$ and the time of a single query is $O(\log W)$. Therefore the total complexity is $O(W + Q\log W)$.

Interlude – data structure for counting zeros in prefix sum

All we are left with is the implementation of the prefixsum_zeros_tree_t data structure. We have said that we can use a segment tree here, so we need to decide what kind of information to maintain in each node of the segment tree. Each node corresponds to some fragment of the sequence. Unfortunately, we cannot directly count zeros among the prefix sums of this fragment, since we do not know the cumulative sum of the numbers before the fragment. But since we know that the prefix sums are always non-negative, only the minimal values attained by the prefix sums of the fragment can become zeros in the prefix sums of the whole sequence.

Therefore for each node we maintain three values: cnt (number of minimums), hl (difference between $0$ and minimum), and hr (difference between cumulative sum on this fragment and minimum). We store them in info_t structure. It is easy to create such a structure for a fragment of length 1:

struct info_t {
  int hl;
  int hr;
  int cnt;
  info_t() : hl(0), hr(0), cnt(1) { }
};

info_t single(int val) {
  info_t I;
  I.hl = max(-val, 0);
  I.hr = max(val, 0);
  I.cnt = 1 + (val == 0);
  return I;
}

The more involved operation is joining the information from two nodes which are the left and right children of a common parent. We have to consider three cases, based on the difference between the minimum levels of the two fragments. If they are equal, the minima of both fragments become minima of the joint fragment. If one of them is smaller, the minima of the other one are no longer minima:

info_t join(const info_t& L, const info_t& R) {
  info_t I;
  int diff = L.hr - R.hl;
  if (diff == 0) {
    I.hl = L.hl;
    I.hr = R.hr;
    I.cnt = L.cnt + R.cnt - (L.hr == 0);
  } else if (diff < 0) {
    I.hl = L.hl - diff;
    I.hr = R.hr;
    I.cnt = R.cnt;
  } else {
    I.hl = L.hl;
    I.hr = R.hr + diff;
    I.cnt = L.cnt;
  }
  return I;
}

The rest of the code is standard for segment trees:

struct prefixsum_zeros_tree_t {
  vector<info_t> tree;
  int base;
  int n;
  prefixsum_zeros_tree_t(const vector<int>& vals) {
    n = vals.size();
    base = 1;
    while (base < n) { base *= 2; }
    tree.resize(2*base);
    REP(i, n) { tree[base + i] = single(vals[i]); }
    for (int i = base-1; i >= 1; i--) { tree[i] = join(tree[2*i], tree[2*i+1]); }
  }
  int count_prefixsum_zeros() { return tree[1].cnt - 1; }
  void update(int x, int val) {
    x += base;
    tree[x] = single(val);
    while (x > 1) {
      x /= 2;
      tree[x] = join(tree[2*x], tree[2*x+1]);
    }
  }
};

Final touches

It turns out that having solved the version with $H=1$, we have almost everything we need to solve the full problem. Remember that a set of cells in a one-dimensional array is a rectangle if and only if this set is connected. Can we give a similarly simple characterization of a rectangle in a two-dimensional grid? Yes: the set must be connected and convex.

It means that it cannot have any “holes” or angles pointing “inwards”. In fact, it must have exactly four angles, each pointing “outwards”. But what is more interesting, this can also be characterized by observing small parts of the grid.

Now we are interested in the $2\times 2$ squares of the grid (including those sticking out of the grid, whose outside cells we treat as white). Note that in the case of a rectangle, there will be exactly four such squares containing exactly one black cell (these correspond to the four “outwards” corners). The rest of the squares will contain zero, two, or four black cells; in particular there will be no square containing exactly three black cells.

It turns out that this is not only a necessary, but also a sufficient condition for a set of cells to be a rectangle: the number of squares with one black cell is $4$, and the number of squares with three black cells is $0$. Since for any non-empty set of black cells the number of squares with one black cell is at least $4$, this condition can be simplified even further: the number of squares with one black cell plus the number of squares with three black cells is equal to $4$.

Thus we can use the very same data structure that counts zeros among prefix sums. Now we assume that grid[r][c] stores the number in row $r$ and column $c$. The code is more involved, since after a swap we need to recompute the deltas of all the numbers whose $2\times 2$ squares touch an updated cell, i.e. the $3\times 3$ neighbourhood of each swapped cell:

vector<vector<int> > grid;
vector<int> delta_2;
prefixsum_zeros_tree_t* delta_2_tree;

int calc_delta_2(int i) {
  if (i == 0) return 0;
  int num_1 = 0, num_3 = 0;
  REP(ir, 2) REP(ic, 2) {
    // The 2x2 square whose top-left corner is (R[i]-1+ir, C[i]-1+ic)
    int cnt = 0;
    REP(iir, 2) REP(iic, 2) {
      int nr = R[i] - 1 + ir + iir;
      int nc = C[i] - 1 + ic + iic;
      if (nr < 0 || nr >= H || nc < 0 || nc >= W || grid[nr][nc] > i) ++cnt;  // white cells
    }
    if (cnt == 1) num_1++;
    else if (cnt == 2) num_1--;
    else if (cnt == 3) num_3++;
    else num_3--;
  }
  return num_1 + num_3;
}

void update_delta_2(int r, int c) {
  if (r < 0 || r >= H || c < 0 || c >= W) return;
  int i = grid[r][c];
  delta_2[i] = calc_delta_2(i);
  delta_2_tree->update(i, delta_2[i]);
}

int swap_seats(int a, int b) {
  swap(R[a], R[b]);
  swap(C[a], C[b]);
  swap(grid[R[a]][C[a]], grid[R[b]][C[b]]);
  for (int ir = -1; ir <= 1; ir++) {
    for (int ic = -1; ic <= 1; ic++) {
      update_delta_2(R[a]+ir, C[a]+ic);
      update_delta_2(R[b]+ir, C[b]+ic);
    }
  }
  int ans = delta_2_tree->count_prefixsum_zeros();
  return ans;
}

The total time complexity of the final solution is $O(HW + Q\log HW)$.

Mechanical Doll

In this problem we need to create a circuit that controls motions of a mechanical doll. The circuit consists of exactly one origin (a device number $0$), exactly $M$ triggers that cause motions (devices numbered from $1$ to $M$), and some number of switches (devices numbered from $-1$ to $-S$).

Devices have exits, each of them must be connected to another (possibly the same) device. Origin and triggers have one exit. Switches have two exits X and Y; every time we go through a switch, we use the exit that was not used previously on this switch (the first time we use exit X).

Our aim is to create connections in such a way, that starting from the origin and following the connections, we will visit triggers $N$ times in order $A_0, A_1, \ldots, A_{N-1}$, we will visit each switch even number of times, and we will come back to the origin. For full points we cannot use more than $N + \log_2 N$ switches.

On the following picture there is a circuit with two triggers and one switch. Following connections of the circuit, we will visit triggers in order $1, 2, 1$.

Warmup

We start by solving subtask 1 in which every trigger appears in sequence $A$ at most once. Therefore we do not have to use any switches, we just connect triggers in the required order.

void create_circuit(int M, vector<int> A) {
  A.push_back(0);
  int N = A.size();
  vector<int> C(M+1);
  vector<int> X, Y;
  REP(i, N) { C[A[i]] = A[(i+1) % N]; }
  answer(C, X, Y);
}

First of all, we can treat the origin as a trigger and put its device number at the end (or at the beginning, it does not matter) of the sequence $A$. Then for every trigger $i$ that appears in the sequence, we connect its exit $C[i]$ to the next device in the sequence. We don't use any switches, so we leave the vectors $X$ and $Y$ empty.

In subtask 2 we can have triggers that appear twice in the sequence $A$. Suppose that trigger $i$ appears at positions $j_0$ and $j_1$ in the sequence. Then we can connect the exit of this trigger to a switch, the exit X of this switch to trigger $A[j_0+1]$, and the exit Y of this switch to trigger $A[j_1+1]$.

vector<vector<int> > after(M+1);
REP(i, N) { after[A[i]].push_back(A[(i+1) % N]); }
int S = 0;
REP(i, M+1) {
  const int size = after[i].size();
  if (size == 0) {
    ;
  } else if (size == 1) {
    C[i] = after[i][0];
  } else if (size == 2) {
    C[i] = --S;
    X.push_back(after[i][0]);
    Y.push_back(after[i][1]);
  }
}

First of all, for each trigger $i$ we create the list $after[i]$ containing the numbers of the devices that appear right after trigger $i$ in the sequence $A$. Then for every trigger, if it appears exactly once in the sequence (thus the list $after[i]$ has length $1$), we just connect it to $after[i][0]$. Otherwise we create a new switch (the variable S stores switch numbers as negative values, so creating a switch means decrementing it) and connect its exits to $after[i][0]$ and $after[i][1]$.

More than two appearances

In subtask 3 each trigger can appear up to four times. Let's think about what to do with a trigger that appears exactly four times. We can create the following gadget using three switches positioned in two layers:

Such a gadget has four exits numbered from $0$ to $3$, exit $j$ will be connected to trigger $after[i][j]$. Note the order of exits on the picture.

What about a trigger that appears exactly three times? One could think that we can create a gadget using two switches, like the one on the left of the following picture:

Unfortunately, such a gadget does not work, since it creates the sequence of triggers $0, 1, 2, 1$. A good idea is to use the previous gadget with four exits, but to “kill” one of its exits by connecting it back to the bottom layer (see the middle picture). That would almost work, but there is a subtle catch: if we kill the last exit, we visit all triggers in the correct order, but we do not visit every switch an even number of times (and this is also a requirement in this problem). Killing any other exit is fine, though (for instance the first one, as in the right picture). In the code we need to add another branch to handle sizes up to four:

} else if (size <= 4) {
  C[i] = --S;
  X.push_back(--S);
  Y.push_back(--S);
  if (size == 4) {
    X.push_back(after[i][0]); Y.push_back(after[i][2]);
    X.push_back(after[i][1]); Y.push_back(after[i][3]);
  } else {
    X.push_back(C[i]);        Y.push_back(after[i][1]);
    X.push_back(after[i][0]); Y.push_back(after[i][2]);
  }
}

The above idea can be generalized to solve the whole problem. If trigger $i$ appears $s$ times, we need to create a gadget $G_s$ that has exactly $s$ exits, connect trigger $i$ to this gadget, and connect exit number $j$ of the gadget to $after[i][j]$. The function generate(after, S, X, Y) creates the desired gadget, updating the values S, X and Y, and returns the number of the device at the bottom of the gadget:

vector<vector<int> > after(M+1);
REP(i, N) { after[A[i]].push_back(A[(i+1) % N]); }
int S = 0;
REP(i, M+1) { C[i] = generate(after[i], S, X, Y); }

Suppose we need to create a gadget $G_s$ for $s=2^k$. We can construct it recursively from one switch and two gadgets $G_{2^{k-1}}$. The exits of gadget $G_{2^k}$ will be just exits of these two gadgets shuffled:

For value $s$ that is not a power of two, we find number $k$ such that $s \leq 2^k < 2s$. Then we create gadget $G_{2^k}$ and kill $2^k-s$ of its exits. The code is as follows:

int generate(const vector<int>& after, int& S, vector<int>& X, vector<int>& Y) {
  const int size = after.size();
  if (size == 0) {
    return 0;
  } else if (size == 1) {
    return after[0];
  } else {
    int k = 1, K = 0;
    while (k < size) { k *= 2; K++; }
    vector<int> revbits(k);
    REP(j, k) { revbits[j] = revbits[j/2]/2 | ((j&1) << (K-1)); }
    const int id = --S;
    REP(j, (k-1)/2) { X.push_back(--S); Y.push_back(--S); }
    REP(j, k) {
      vector<int>& where = j%2 ? Y : X;
      if (revbits[j] < (k-size)) {
        where.push_back(id); // kill exit
      } else {
        where.push_back(after[ revbits[j] - (k-size) ]);
      }
    }
    return id;
  }
}

Observe that we use array $revbits[j]$ to store the number $j$ after reversing the order of its bits, when expressed in binary form. It is useful, since the $j$-th exit from the left in the gadget $G_{2^k}$ is exit number $revbits[j]$. When choosing exits to kill, we take these whose numbers $revbits[j]$ are smaller than $2^k-s$. Thanks to that the remaining exits have consecutive numbers, and we can connect exit of number $revbits[j]$ to trigger number $after[revbits[j] - (2^k - s)]$.

This solution uses at most $2s-1$ switches for a trigger that appears $s$ times in sequence $A$. Thus it uses at most $2N$ switches in total, which is enough for getting half score for subtasks 5 and 6.
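As an aside, the recursive description of $G_{2^K}$ given earlier can also be coded directly; below is a minimal sketch (the helper build and its offset/step parameters are illustrative assumptions, not code used in this solution). Here the interleaving of the two sub-gadgets is captured by the offset/step recursion, which plays the role of the explicit revbits bookkeeping: calling build(K, exits, 0, 1, S, X, Y) with exits holding the $2^K$ targets in visiting order sends the $t$-th visit to exits[t].

// Builds a gadget with 2^K exits and returns the id of its bottom switch.
// The t-th visit of the gadget leaves through the device exits[offset + t*step].
int build(int K, const vector<int>& exits, int offset, int step,
          int& S, vector<int>& X, vector<int>& Y) {
  if (K == 0) return exits[offset];
  int id = --S;
  int pos = -id - 1;               // switch -m has exits X[m-1] and Y[m-1]
  X.push_back(0); Y.push_back(0);  // reserve the slots, fill them after recursing
  X[pos] = build(K-1, exits, offset,        2*step, S, X, Y);  // even-numbered visits
  Y[pos] = build(K-1, exits, offset + step, 2*step, S, X, Y);  // odd-numbered visits
  return id;
}

Handling sizes that are not powers of two (killing exits and removing switches) would still have to be added on top, as in the iterative code below.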

The problem is that we use too many switches when $s$ is not a power of two. One idea how to handle this is to use a better way of killing the unused exits. So far we killed exits with consecutive numbers, but it's better to kill exits that are located close to each other in the picture. Let's see an example with $s=5$ and $2^k=8$. If we kill the first three exits on the left, then one switch becomes unnecessary, since both of its exits go back to the bottom layer. Therefore, we can remove this switch:

This is easier said than done, since we need to know which switches to remove, and also we need to maintain correct order of connections on remaining exits. Perhaps recursive implementation would be more straightforward here, but nevertheless it can be implemented iteratively:

int generate(const vector<int>& after, int& S, vector<int>& X, vector<int>& Y) {
  const int size = after.size();
  if (size == 0) {
    return 0;
  } else if (size == 1) {
    return after[0];
  } else {
    int k = 1, K = 0;
    while (k < size) { k *= 2; K++; }
    vector<int> revbits(k), go(k);
    REP(j, k) { revbits[j] = revbits[j/2]/2 | ((j&1) << (K-1)); }
    // Define switches for the first K-1 layers
    const int id = --S;
    REP(lev, K-1) {
      REP(j, 1<<lev) {
        if ((j << (K-lev)) + (1 << (K-lev)) <= (k-size)) {
          ; // removed switch
        } else if ((j << (K-lev)) + (1 << (K-lev-1)) <= (k-size)) {
          X.push_back(id); // kill exit
          Y.push_back(--S);
        } else {
          X.push_back(--S);
          Y.push_back(--S);
        }
      }
    }
    // Calculate the order of connections for the remaining exits
    int ptr = 0;
    REP(j, k) {
      if (revbits[j] < (k-size)) { ; } else { go[ revbits[j] ] = after[ptr++]; }
    }
    // Define switches on the topmost layer
    REP(j, k/2) {
      if (2*j + 2 <= (k-size)) {
        ; // removed switch
      } else if (2*j + 1 <= (k-size)) {
        X.push_back(id); // kill exit
        Y.push_back(go[2*j+1]);
      } else {
        X.push_back(go[2*j]);
        Y.push_back(go[2*j+1]);
      }
    }
    return id;
  }
}

This way we use only $s + k-1$ switches. Therefore for subtask 5 in which we have only one trigger, we use less than $N + \log_2 N$ switches, as required for full points.

Unfortunately, this still does not give us full points for subtask 6, since with many triggers we pay an additional logarithmic term for each of them. But it turns out that we do not need a separate gadget for every trigger. Indeed, we need only one gadget of size $N+1$. All triggers are connected to this gadget, and the $N+1$ exits of the gadget are connected to the devices in the order they appear in the sequence $A$:

int S = 0; int id = generate(A, S, X, Y); REP(i, M+1) { C[i] = id; }

This solution uses at most $N + \log_2 N$ switches and solves the whole problem.

To be continued…