Editorial for IOI '17 P6 - Ancient Books
Submitting an official solution before solving the problem yourself is a bannable offence.
Subtask 1
The first subtask, where there are at most books in the library, can easily be solved by hand. There are only
possible inputs and so they can all be precomputed by hand simulating the process with pen and paper.
Alternatively, we do a brute force state exploration. We can take the triple (state of the shelf, Aryan's position, Aryan's current book) as the state in a breadth-first search. This state space is roughly of size and so for
, we can still find the shortest path to the sorted state fast enough.
Medium subtask (start/end on the very left)
To compute the number of steps (= seconds) needed without actually simulating the entire sorting process, we need a couple of observations:
Balancing property Across every edge of the underlying path graph, Aryan will walk equally often from the left to the right as she walks from the right to the left. A similar balance also applies to the books. For every edge of the path, there are equally many books that need to be moved from the left to the right as there are from the right to the left.
Cycles of the permutation The array order
specifies a permutation. We will denote that permutation by from here on. We will see that the answer only depends on how
partitions the set of slots
into disjoint cycles. A book that is placed correctly from the beginning, we call trivial. Their corresponding trivial cycles (cycles of length one) can almost be ignored, as we will always just move past them.
Single cycle If consists of a single cycle then the answer is easy to find: Aryan can grab the book at
, bring it to
, take the book from there to
and so on until she returns to
. [Here is a video illustrating this.] (DMOJ curator's note: The link doesn't seem to be in the pdf) As she brings one book on slot closer to its target position in every step, her walk surely is optimal. We can compute this number of steps by:
which is exactly the sum of the distances between the initial and target position of every book.
Lower bound The sum of distances
is a lower bound on the answer even if
consists of multiple cycles. We distinguish two kinds of steps for Aryan: A step is called essential if Aryan brings one book one step closer to its target position than this book ever was before. Otherwise, the step is called non-essential. Every way of sorting the books consists of exactly
many essential steps. The number of non-essential steps needed depends on how the cycles of
overlap.
Two cycles Every cycle of covers some interval
of the bookshelf which extends from the leftmost book to the rightmost book that is part of this cycle. We have
, where we use
as a shorthand for
. Let
consist of exactly two non-trivial cycles
and
with their respective intervals
,
and let
with
. Then the answer only depends on whether the cycles overlap (
) or not. If they overlap, the answer is
, otherwise it is
.
Why? If they overlap, Aryan can sort along until she encounters the first book belonging to
. She then leaves the book she was carrying at that slot to fully sort
and return to the same slot. She can then pick up that book again and finish sorting
without ever spending a non-essential step.
If the two cycles do not overlap (so is entirely to the left of
), Aryan can do something similar. She starts sorting
until she encounters the rightmost slot belonging to
. She then takes the book from there and non-essentially walks with one step to the right to the leftmost slot of
. There, she sorts
and picks up the same book again to non-essentially walk back to
. Finally, she finishes sorting
and returns to
. This is optimal since the only two non-essential steps of Aryan's walk are spent across an edge that no book needs to cross but has to be crossed by Aryan eventually as there are non-trivial books on both sides. [Here is a video illustrating this.]
Multiple cycles The two cases from above generalize to the case where consists of many cycles. Any two overlapping cycles can be interleaved without non-essential steps, and Aryan has to spend two non-essential steps across every edge that no book needs to cross, but where there are some non-trivial books or
on both sides of the edge. [Here is a video illustrating this.] More formally, let
be the subset of the edges of the path with the following property:
If , these are all the non-essential steps needed, so the answer is
.
Implementation With these observations, it is easy to compute both and
. If it is done in quadratic time (e.g., by just checking the above conditions for
by looping over all indices for every edge), this will solve subtask 2. However, it is not hard to compute
in linear time (e.g., in a scanline fashion from left to right), which will then score for the first three subtasks.
Harder subtasks (with
)
If Aryan does start somewhere in the middle of the shelf, we might need some additional non-essential steps across edges not in . It is not obvious whether Aryan should first go to the left or to the right, as there might be non-trivial books on both sides.
We could try out both options and define the following subproblem:
How many non-essential steps do we need, if we already know how to connect all the cycles in the interval
with
?
This gives rise to a dynamic programming formulation with a state of quadratic size (all intervals containing ).
For any given interval, we define the function with
being the largest part of the shelf that we can sort without spending any additional non-essential steps. So
has to repeatedly add all cycles
whose interval
partially or fully overlap with
and then continue with
until no more cycles can extend the interval.
Once there is no other overlapping cycle (so ), we are either done or we know that we have to spend some non-essential steps. Let
be the function that computes the cost of connecting all cycles to the interval
. We can recursively compute it using
. We need to take care of the border cases (when
or
are outside the shelf) and initialize it with
for
being the smallest interval that contains all non-trivial books and
.
By implementing carefully, we can achieve an amortized constant time complexity across all calls, so that the dynamic program runs in quadratic time overall. Note that for
, the set of states is only of linear size, so this solution also passes subtask 3.
To solve the problem in linear time, we note that we can decide whether to go left or right somewhat locally without exploring quadratically many states. If is some extended interval (
), we look for two special cycles
and
.
is the first
-containing2 cycle that we encounter when walking from
to the left. Similarly,
is the first
-containing cycle when walking from
to the right.
2 A cycle with interval
is
-containing if and only if
.
Let be the cost of reaching
from
(and define
). Note that
is not just twice the distance between
and the closest book of
as there might be some small cycles along the way that help us save some non-essential steps. But we can compute
quickly by solving the (
)-problem between
and the first box of
.
Observe that if does not exist,
does also not exist (as
is maximally extended, the book of
to right of
also has to be to the right of
and vice versa). Also note that once we reach either
or
, we also reach
and therefore get
without any further cost. This means that we can greedily decide for the cheaper of the two sides (of cost
) and then continue with the interval
regardless of whether we decided to go left or right.
Finally, one has to take care of the border region, everything outside of the outermost -containing cycles (so once
and/or
no longer exist). But this is easy, as this is just another (
)-case on each side.
We can answer all the calls and compute all the
and
costs using only one overall sweep over the shelf (if we precompute the cycle of each shelf and the interval of each cycle, which we can also do in
). Therefore, we can determine the answer
in linear time.
Almost correct solutions
One thing a contestant might overlook is that the answer can be of the order and therefore does not necessarily fit into a 32-bit integer. This causes an overflow in subtasks 3 and 5.
Other submissions might consider only near-optimal sorting walks. Some might also visit all the trivial books at the ends of the paths, even though there is nothing to sort there. Others might just traverse the path once without carrying any book and then sort every cycle individually along the way without interleaving anything and hence spend many non-essential steps. Both of these would result in non-optimal answers.
Comments