
Overview

The CVC-MUSCIMA database will be used for the competition. This database consists of 1,000 handwritten music score images, written by 50 different musicians. All 50 writers are adult musicians, which ensures that each of them has a characteristic music handwriting style. Each writer has transcribed exactly the same 20 music pages, using the same pen and the same kind of music paper. The 20 selected music sheets contain both monophonic and polyphonic music, and include scores for solo instruments as well as scores for choir and orchestra.

To test the robustness of the staff removal algorithms, we have generated a degraded version of the database. To each of the 1,000 original images we apply a 3D distortion and local noise (see section 3). Each degradation model can be configured to generate different degradation levels. For each original gray-level image and a given degradation level, we generate three degraded images: one with only the 3D distortion, one with only the local noise, and one combining both sources of degradation. An example of a distorted image (in gray level and binary) including 3D distortion and local noise can be seen below.

In total, there will be 6000 degraded images. For each one, we provide the gray level image and the corresponding binary image as input images to the participants. We also generate the binary staff-less image (only music symbols, no staff lines) that we use for performance evaluation. The staff-less images of the test set will be made public after the competition. One example of the desired output image file can be seen next.

[Image: example of a staff-less output image]

For the staff removal competition the entire dataset is divided into two parts: the first 66% of the images (4000 images) will be used for training the algorithms (setting their parameters), and the remaining 33% (2000 images) will be used for testing them.

Input files

The following training data (all the images are in PNG format) is available for download:

Gray images (part 1, part 2, part 3, part 4: each part contains 1000 images, maximum size 1.3 GB): Set of 4000 original gray-level images with staff lines in the folder "GRAY/". The naming convention is GR_XXXX.png, where XXXX ranges from 0001 to 4000.
Binary images (345 MB): Set of 4000 original binary images with staff lines in the folder "BW/". The naming convention is BW_XXXX.png, where XXXX ranges from 0001 to 4000.
Ground-truth images (340 MB): Set of 4000 images without the staff lines, in the folder "GT/". These are the ground-truth images for the training set. The naming convention is GT_XXXX.png, where XXXX ranges from 0001 to 4000.
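
The naming convention above can be turned into file paths programmatically. The following is a minimal sketch, assuming the archives are extracted into the folders "GRAY/", "BW/" and "GT/" relative to the working directory; the helper name training_paths and the dictionary keys are hypothetical and not part of the competition material.

```python
# Folder and filename prefix for each kind of training image, following the
# naming convention described above. The dictionary keys are illustrative.
TRAINING_SETS = {
    "gray": ("GRAY", "GR"),
    "binary": ("BW", "BW"),
    "ground_truth": ("GT", "GT"),
}

NUM_TRAINING_IMAGES = 4000


def training_paths(index):
    """Return the relative paths of the three images for one training index."""
    if not 1 <= index <= NUM_TRAINING_IMAGES:
        raise ValueError(f"index must be in 1..{NUM_TRAINING_IMAGES}, got {index}")
    # XXXX is the index zero-padded to four digits, e.g. 0001.
    return {
        kind: f"{folder}/{prefix}_{index:04d}.png"
        for kind, (folder, prefix) in TRAINING_SETS.items()
    }
```

For example, training_paths(1) maps "gray" to "GRAY/GR_0001.png" and "ground_truth" to "GT/GT_0001.png", so an input image can be paired with its ground truth by index.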

Test Files

The following test data (all the images are in PNG format) is available for download:

Gray-level images (2.48 GB): Set of 2000 original gray-level images with staff lines. The naming convention is gray-test-XXXX.png, where XXXX ranges from 0001 to 2000.
Binary images (183 MB): Set of 2000 original binary images with staff lines. The naming convention is test-XXXX.png, where XXXX ranges from 0001 to 2000.
Ground-truth (179 MB): Since the competition is over, the ground truth for the test set is now available.

Participants are free to decide if they would like to use the gray-level or the binary images.

Participants will also receive an email with instructions for uploading these two files:

The names/surnames of the participants, and a short description of their methods in order to be included in the competition report.

A ZIP file with the following format:

Filename: The filename will be ParticipantCode-MethodCode.zip, where ParticipantCode is the code of the participant (assigned after registration) and MethodCode is the code of the method (in this way, participants are free to submit more than one method to the contest). For example, if the participant code is CVC01 and the code of the method is MyWIdent1, then the filename will be CVC01-MyWIdent1.zip.
Content: The ZIP file must contain one output image for each of the test images (in this case, 2000 images). Each output image must be a binary image in PNG format containing only the musical symbols. For each input image test-XXXX.PNG, the output filename must have the form out-test-XXXX.PNG.
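
The packaging rules above can be automated. The following is a minimal sketch using only the Python standard library. It assumes that the output images already sit in a single folder and that they go at the top level of the archive (the instructions do not say whether subfolders are allowed); the function names are hypothetical.

```python
import zipfile
from pathlib import Path


def submission_filename(participant_code, method_code):
    """Build the archive name ParticipantCode-MethodCode.zip."""
    return f"{participant_code}-{method_code}.zip"


def output_name(input_name):
    """Map an input name such as test-0001.png to out-test-0001.png."""
    return f"out-{input_name}"


def pack_submission(output_dir, dest_dir, participant_code, method_code):
    """Zip every out-test-*.png file found in output_dir and return the archive path."""
    zip_path = Path(dest_dir) / submission_filename(participant_code, method_code)
    images = sorted(
        p for p in Path(output_dir).iterdir()
        if p.name.startswith("out-test-") and p.suffix.lower() == ".png"
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for image in images:
            # Assumption: images are stored flat, without any folder prefix.
            archive.write(image, arcname=image.name)
    return zip_path
```

Calling pack_submission("results", ".", "CVC01", "MyWIdent1") would write CVC01-MyWIdent1.zip in the current directory. It is worth checking that the archive holds 2000 entries before uploading.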

Evaluation Metrics

The staff removal problem is considered as a two-class classification problem at the pixel level. For each image we compute the number of true positive pixels (pixels correctly classified as staff lines), false positive pixels (pixels wrongly classified as staff lines) and false negative pixels (pixels wrongly classified as non-staff lines) by overlaying the result on the corresponding ground-truth image. The precision, recall and error rate are then computed from these counts.
Since there are different distortion levels, we will provide a separate evaluation for each kind of degradation (3D, increasing level of local noise) in order to compare the robustness of each method to different kinds of degradation.
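
The pixel-level counts described above can be sketched as follows. This is an illustrative example, not the official evaluation tool. It assumes that images are given as nested lists with 1 for ink and 0 for background, that a staff-line pixel is an ink pixel of the input that is absent from the staff-less ground truth, and that only ink pixels of the input are classified. The error rate formula (misclassified pixels divided by all classified pixels) is also an assumption, since the text does not define it.

```python
def staff_removal_counts(input_bw, ground_truth, output):
    """Count TP, FP, FN and TN over the ink pixels of the input image."""
    tp = fp = fn = tn = 0
    for in_row, gt_row, out_row in zip(input_bw, ground_truth, output):
        for in_px, gt_px, out_px in zip(in_row, gt_row, out_row):
            if not in_px:
                continue  # background pixels are not classified
            is_staff = not gt_px        # removed in the ground truth
            predicted_staff = not out_px  # removed by the method
            if predicted_staff and is_staff:
                tp += 1
            elif predicted_staff:
                fp += 1
            elif is_staff:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def staff_removal_metrics(tp, fp, fn, tn):
    """Return (precision, recall, error_rate); empty denominators give 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = tp + fp + fn + tn
    # Assumed definition: share of classified pixels that were misclassified.
    error_rate = (fp + fn) / total if total else 0.0
    return precision, recall, error_rate
```

To obtain the separate evaluation per degradation type, the same counts could be accumulated over the images of each degradation group before computing the three measures.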