Procedures
There are several data-sets included in the challenge. Some of them
are real transcribed medieval manuscripts, some artificial (but
hand-copied) ones. The data-sets are made available in (at least) two
phases. Correct answers to some of the first phase data-sets will be
announced before the final submission deadline in order to enable
self-assessment by the participants. These data-sets will not affect
the final results.
Submissions are made in groups of one or more persons. Each
group may submit at most two solutions. If more than two solutions
are submitted, the last two before the deadline are accepted. A person
can belong to at most two groups. (So a person can contribute to a
maximum of four solutions.)
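The selection rule for submissions can be sketched in code. This is only an illustration, assuming that each submission carries a timestamp; the function name `accepted_solutions`, the tuple layout, and the dates are hypothetical and not part of the challenge infrastructure.

```python
from datetime import datetime

def accepted_solutions(submissions, deadline, limit=2):
    """Return the last `limit` solutions a group submitted before the deadline.

    `submissions` is a list of (timestamp, solution_name) tuples; this
    layout is an assumption made for illustration only.
    """
    # Keep only solutions submitted strictly before the deadline.
    on_time = [s for s in submissions if s[0] < deadline]
    on_time.sort(key=lambda s: s[0])  # oldest first
    # The last `limit` entries are the ones that count.
    return [name for _, name in on_time[-limit:]]

deadline = datetime(2000, 1, 31)  # hypothetical deadline
submissions = [
    (datetime(2000, 1, 10), "first"),
    (datetime(2000, 1, 20), "second"),
    (datetime(2000, 1, 25), "third"),
    (datetime(2000, 2, 1), "late"),
]
print(accepted_solutions(submissions, deadline))
```

Here the group submitted three solutions in time, so only the two most recent ones would be accepted, and the late one is ignored.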
Evaluation
For the artificial data-sets, for which there is a known
'ground-truth' solution, the difference between the proposed and
correct solutions is evaluated. The exact criterion is open for
discussion among participants (and other interested parties). The
current scoring function is based on distance orders among triplets
(for details, see Example). See also Discussion.
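To illustrate what a score based on distance orders among triplets could look like, here is a minimal sketch. It is one plausible reading, not necessarily the official scoring function: for every triplet (A, B, C) it checks whether the proposed solution agrees with the correct one on which of B and C is closer to A, giving full credit for agreement, half credit when only one of the two reports a tie, and none for a reversed order. The names `triplet_score` and `path_distances`, the dictionary layout, and the toy data are assumptions made for this example.

```python
from itertools import combinations

def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def triplet_score(true_dist, proposed_dist, nodes):
    """Average agreement on distance orders over all triplets (A, B, C).

    Distances are dicts keyed by frozenset({u, v}); `nodes` must contain
    at least three items. Returns a value between 0 and 1.
    """
    total, count = 0.0, 0
    for a in nodes:
        others = [n for n in nodes if n != a]
        for b, c in combinations(others, 2):
            ab, ac = frozenset((a, b)), frozenset((a, c))
            s_true = sign(true_dist[ab] - true_dist[ac])
            s_prop = sign(proposed_dist[ab] - proposed_dist[ac])
            # 1 if the orders agree, 0.5 if one is a tie, 0 if reversed.
            total += 1.0 - 0.5 * abs(s_true - s_prop)
            count += 1
    return total / count

def path_distances(order):
    """Toy helper: pairwise distances between nodes arranged in a chain."""
    return {frozenset((u, v)): abs(i - j)
            for (i, u), (j, v) in combinations(enumerate(order), 2)}

nodes = list("ABCD")
correct = path_distances("ABCD")   # hypothetical ground truth A-B-C-D
proposed = path_distances("ACBD")  # hypothetical proposal A-C-B-D
print(triplet_score(correct, correct, nodes))
print(triplet_score(correct, proposed, nodes))
```

A score of this kind depends only on the ordering of distances, not on their absolute values, so solutions need not share a common distance scale to be compared.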
For the real data-sets, there is no known correct solution. Therefore,
EBS¹ will be used: the 'owner' of the data-set will be asked to
estimate whether the proposed solution is plausible and useful from
his/her point of view.
¹ EBS: endorphin-based scoring
Ranking Schemes
There are two ranking schemes.
- Primary ranking is based on performance on the primary data-set (see
  Data Sets).
- Secondary ranking is based on all the other data-sets, both
  artificial and real, except those for which the correct solution is
  announced during the challenge. For these other data-sets,
  'thumbs-up' marks will be awarded to the best submissions, and the
  total number of thumbs-up marks determines the secondary ranking.
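The secondary ranking amounts to counting thumbs-up marks per group. A minimal sketch follows, assuming the marks are recorded as a mapping from data-set to the groups that received one; the function name `secondary_ranking`, the data layout, and the alphabetical tie-break are hypothetical, since the rules do not specify them.

```python
from collections import Counter

def secondary_ranking(thumbs_up_by_dataset):
    """Rank groups by their total number of thumbs-up marks.

    `thumbs_up_by_dataset` maps a data-set name to the list of groups
    awarded a thumbs-up on it (an assumed layout). Ties are broken
    alphabetically by group name, which is also an assumption.
    """
    counts = Counter()
    for groups in thumbs_up_by_dataset.values():
        counts.update(groups)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical awards on three data-sets.
awards = {
    "dataset-1": ["group-a", "group-b"],
    "dataset-2": ["group-b"],
    "dataset-3": ["group-c", "group-b", "group-a"],
}
for group, marks in secondary_ranking(awards):
    print(group, marks)
```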
The organizers reserve the right to alter the rules of the challenge
at any point.
Restrictions
Anyone who has been in contact with some of the data-sets, or has
otherwise obtained inside information about the correct solution,
should obviously not enter the challenge for the data-set(s) in
question. However, participation is allowed for the other
data-sets.
The data-sets are provided only for use in the challenge.
Further use of the data requires explicit permission from the
organizers and the original providers of the data.