Skip to content
Go back

Editorial: "Catch the Mole"

Edit page

Here’s a link to the problem.

With problems like these, it’s often useful to consider extreme tree configurations, such as:

1. Long Chain

This case is easily solved via binary search.

2. Star Graph

Without the condition that the mole moves upward when our query misses, star graphs (i.e. trees of depth 1) would be impossible to solve in faster than O(N)\mathcal{O}(N) queries. Luckily, the fact that the mole moves up means that we can actually solve any star graph in at most 2 queries!


So, it looks like if we want to figure out a worst-case scenario, we’ll need to balance out depth (long chains) and breadth (star graphs). One way we can achieve this is to construct a tree with O(N)\mathcal{O}(\sqrt{N}) chains of O(N)\mathcal{O}(\sqrt{N}) nodes connected to the root, like so: md Such a tree requires us to make O(N)\mathcal{O}(\sqrt{N}) queries in the worst case since we have to check all the branches, and in that time the mole may not reach the root.

So now the question becomes: can we always achieve this lower bound of O(N)\mathcal{O}(\sqrt{N}) queries?

As it turns out, yes! And an even more beautiful fact is that if, at every time-step, we simply make the query that eliminates as many nodes as possible, we can achieve this bound.

However, I’d first like to offer my slightly less elegant approach:

Let SS denote the set of leaves of the subgraph where the mole could currently be located---that is, the mole could be in any of these nodes, or their ancestors. The key idea is that we can always query a node xx such that we either

  1. Reduce our search space to a single chain, to which we can apply a simple binary search.
  2. Remove a single leaf from S|S|, then replace all elements of SS with their parents.

To select xx, we simply choose a leaf yy in SS, then set xx to be the highest ancestor of yy that contains only one leaf in SS. md The red nodes denote the elements of SS. One possible choice of xx and yy is shown.

If we query xx and get response 1, we know the mole must be somewhere along the chain between xx and yy.

Otherwise, we eliminate yy from SS, and since the mole must move up, all elements of SS are replaced with their parents. An illustration of the process described above. Red nodes denote elements of SS, and the blue node denotes xx, the node we query in each step.

Let SiS_i denote SS after the iith query. Assume all queries have returned 0, since as soon as a query returns 1, we can find the mole in O(logN)\mathcal{O}(\log N) queries. Then, we have that

  1. Si+1<Si|S_{i + 1}| < |S_i|
  2. All SiS_i are disjoint, and thus SiN\sum |S_i| \leq N

Combining these two facts, we can show that using our strategy, no more than O(N)\mathcal{O}(\sqrt{N}) queries can return 0. Thus, our total runtime is O(N+logN)=O(N)\mathcal{O}(\sqrt{N} + \log N) = \mathcal{O}(\sqrt{N}).


The one thing I dislike about this approach is that a separate strategy (binary search) is needed to handle the chain case. However, this is not an issue for the strategy I mentioned above, where we always greedily eliminate as many nodes as possible. Here’s a nice proof for why this works, which I got from here:

Let’s consider a slight rephrasing of the problem. In each step, we can either 1) query a subtree and know whether it contains the mole or not, or 2) delete all leaves from the tree.

If we do operation 1 on a subtree of size szsz, we will eliminate, in the worst case, min(sz,Nsz)\min(sz, N - sz) nodes. This means that, as long as there exists some subtree that is not “too small” nor “too large”, we can always eliminate a decent amount of nodes. However, let’s say this is not the case, and all subtrees are either too large or too small (for instance, a star graph). Then, there must exist a large subtree whose children are all small, and the root of this subtree must then have significant degree. This means the tree must have a significant number of leaves, which we can then eliminate via operation 2!

Let’s formalize this, and define a “small” subtree as one with size N\leq \sqrt{N}, and a “large” subtree as one with size NN\geq N - \sqrt{N}. Then, if a large subtree has only small subtrees as children, its root must have a degree of at least NNN=N1\frac{N - \sqrt{N}}{\sqrt{N}} = \sqrt{N} - 1, which means there must be at least N1\sqrt{N} - 1 leaves! Therefore, we can always remove at least O(N)\mathcal{O}(\sqrt{N}) nodes in each step, and so greedily maximizing the number of nodes deleted in each round suffices.

To summarize concisely why this approach works: unless some node has very high degree, a tree’s subtree sizes will be relatively continuous. IMO, a pretty elegant fact!


Edit page
Share this post on:

Next Post
The Sprague-Grundy theorem