Editorial for COCI '13 Contest 1 #6 Slastičar
Submitting an official solution before solving the problem yourself is a bannable offence.
Let us first solve a simpler version of the problem, where the robot doesn't stop upon finding a matching serial number, but tries out all segments instead.
Let be the number of suffixes of string
that have
as a prefix. Let
denote a substring of
from position
to position
.
The number of comparisons that the robot will make for a word with length
is then:
where is the total number of segments that the robot has started comparison with. With our simplification assumption,
always equals
, beaus the robot doesn't stop comparison upon finding a matching word.
A more efficient way to compute is needed. One possible method is building a suffix array of
. A suffix array is a set of all suffixes of a string
sorted lexicographically. Binary search can be applied to the suffix array in order to find the interval of suffixes beginning with the character
. Let us denote this interval by
. The value of
then equals
. Next, within this interval, we need to find the interval of suffixes with
as the second letter,
.
is then
. We repeat the procedure for all remaining characters in
.
The complexity of this search is , which is fast enough since the sum of all query lengths will not exceed
.
What about the full problem, where the robot stops upon finding a match?
For each word , let us denote the first position where it appears with
. Let
be the number of suffixes of string
with the prefix
starting at a position less than or equal to
. We can now derive a new formula for the total number of comparisons:
where now equals
since we stop comparing after that position.
In order to make computing the function easier, we will not respond to queries in the order that they appear in the input, but sort them in a suitable order instead and store the solutions in an array. After computing all the solutions, we can output them in the original order.
We will sort the queries by in ascending order. We will also utilize a structure supporting two operations: add
to some position, find the sum of an interval. This structure will enable us to find the number of important prefixes in an interval. It can be implemented using a Fenwick tree or an interval (tournament) tree.
We will process queries as follows:
- suppose we are currently responding to a query first matching at
- in the structure, we set
to corresponding positions for all suffixes up to
; notice that it is not necessary to do so for all suffixes for each query, but only for suffixes after the one we stopped at in the previous query, since the queries are sorted by
- as in the simpler version of the problem, we find the intervals
; however, we do not add
to the solution, but instead the sum of the interval
in the structure, which corresponds to the number of suffixes starting at position
or less in the interval
in the suffix array
- we store the result into the solution array at the appropriate position
Finally, we just need to output the solution array.
There is one more detail left unresolved: how do we compute the numbers ? Upon finding the last interval
for some word
, we know that all suffixes from that interval have
as a prefix. The first position where
appears is thus the starting position of the leftmost of those suffixes, that is, the suffix with the smallest index. We therefore need a structure for finding the minimum number in an interval. This can be done using an interval tree or a similar structure.
There are multiple algorithms for building a suffix array with varying complexities that can be found on the Internet. The official solution uses an algorithm with complexity . Also, both an interval tree and a Fenwick tree are implemented in the official solution. The complexity of a query in both structures is
.
The total complexity of the algorithm is .
Comments