Overview
The CVC-MUSCIMA database will be used for the
competition. This database consists of 1,000 handwritten music
score images, written by 50 different musicians. All 50
writers are adult musicians, which ensures that each of them has
a characteristic music handwriting style. Each writer
has transcribed exactly the same 20 music pages, using the same
pen and the same kind of music paper. The 20
selected music sheets contain monophonic and polyphonic music,
and include scores for solo instruments as well as
scores for choir and orchestra.
To test the robustness of the staff removal algorithms, we have generated a degraded version of the database. To each of the 1,000 original images we apply a 3D distortion and local noise (see section 3). Each degradation model can be configured to generate different degradation levels. For each original gray-level image and a given degradation level, we generate three degraded images: one with only the 3D distortion, one with only the local noise, and one combining both degradations. An example of a distorted image (in gray level and binary) that includes both 3D distortion and local noise can be seen below.

In total, there will be 6,000 degraded images.
For each one, we provide the gray-level image and the
corresponding binary image as input images to the participants.
We also generate the binary staff-less image (only music
symbols, no staff lines) that we use for performance evaluation.
The staff-less images of the test set will be made public after
the competition. One example of the desired output image file
can be seen next.

For the staff removal competition the entire dataset is
divided into two parts: the first two thirds of the images
(4,000 images) will be used for training the algorithms
(setting their parameters), and the remaining third (2,000 images) will be
used for testing them.
Input files
The following training data (all the images are
in PNG format) is available for download:
Test Files
The following test data (all the images are in
PNG format) is available for download:
Participants are free to
decide if they would like to use the gray-level or the
binary images.
Participants will also receive an email with instructions for uploading these two files:
The staff removal problem is considered as a
two-class classification problem at the pixel level. For each of
the images we compute the number of true positive pixels (pixels
correctly classified as staff lines), false positive pixels
(pixels wrongly classified as staff lines) and false negative
pixels (pixels wrongly classified as non-staff lines) by
overlapping with the corresponding ground truth images. Then,
from these measures, the precision, recall and error rate
measures are computed.
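As a rough illustration of the pixel-level counts described above, the following Python sketch compares a predicted staff-line mask with a ground-truth mask and derives the three measures. It is not the organizers' evaluation code. The names `count_pixels` and `score` are hypothetical, the masks are assumed to be already available as rows of 0/1 values (1 marking a staff-line pixel), and the error rate is assumed to be the fraction of misclassified pixels over all pixels, since the text above does not give its formula.

```python
from typing import NamedTuple, Sequence, Tuple


class StaffRemovalScores(NamedTuple):
    precision: float
    recall: float
    error_rate: float


def count_pixels(
    predicted: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for two equally sized binary masks.

    A truthy value marks a pixel labelled as staff line.
    """
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of rows")
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(predicted, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # correctly classified as staff line
            elif p:
                fp += 1  # wrongly classified as staff line
            elif t:
                fn += 1  # wrongly classified as non-staff
            else:
                tn += 1  # correctly classified as non-staff
    return tp, fp, fn, tn


def score(
    predicted: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
) -> StaffRemovalScores:
    tp, fp, fn, tn = count_pixels(predicted, truth)
    total = tp + fp + fn + tn
    # Guard against empty denominators (e.g. no staff pixels predicted).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # ASSUMPTION: error rate = misclassified pixels / all pixels.
    error_rate = (fp + fn) / total if total else 0.0
    return StaffRemovalScores(precision, recall, error_rate)


if __name__ == "__main__":
    truth = [[1, 1, 0, 0], [0, 0, 0, 0]]
    predicted = [[1, 0, 1, 0], [0, 0, 0, 0]]
    print(score(predicted, truth))
```

Real score images would normally be loaded into an array type rather than nested lists; plain sequences are used here only to keep the sketch self-contained.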
Since there are different distortion levels, we will provide a
separate evaluation for each kind of degradation (3D, increasing
level of local noise), so that the robustness of each method
can be compared across the different kinds of degradation.
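One way to picture a separate evaluation per kind of degradation is to group per-image scores by a degradation label and summarize each group. The sketch below is only an assumption about how such a summary could be organized: the function name `mean_by_degradation`, the labels `"3d"` and `"noise"`, and the use of a plain mean are all hypothetical, since the text does not state how per-image results are aggregated.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_by_degradation(
    results: Iterable[Tuple[str, float]],
) -> Dict[str, float]:
    """Average per-image scores separately for each degradation label.

    `results` holds (degradation_label, score) pairs, one per test image.
    """
    grouped = defaultdict(list)
    for label, value in results:
        grouped[label].append(value)
    # ASSUMPTION: each group is summarized by its arithmetic mean.
    return {label: mean(values) for label, values in grouped.items()}


if __name__ == "__main__":
    # Hypothetical labels and error rates, for illustration only.
    per_image = [("3d", 0.5), ("3d", 1.0), ("noise", 0.25)]
    print(mean_by_degradation(per_image))
```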