PIAST

A Piano Dataset with Audio, Symbolic, and Text Data


PIAST (PIano dataset with Audio, Symbolic, and Text) is a multimodal dataset that pairs solo piano music with text.
In addition to the audio and text data, PIAST includes MIDI data produced by our transcription pipeline. The dataset is
divided into two subsets, PIAST-YT and PIAST-AT, each with its own text collection method and research purpose.
The dataset and the trained model weights are available at this link.

PIAST-YT

PIAST-YT consists of 9,673 tracks (1,006 hours) of solo piano audio, MIDI, and text (title, description, and tags) collected from YouTube. The collected text was processed with ChatGPT-4 Turbo to extract text that represents the musical content of each track.
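To make the pairing of the three modalities concrete, here is a minimal sketch of what one PIAST-YT entry could look like in code. The class name, field names, and file paths are hypothetical and chosen for illustration only; the released files may be organized differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PiastYtTrack:
    """Hypothetical record pairing audio, MIDI, and text for one track."""
    audio_path: str                 # solo piano audio
    midi_path: str                  # MIDI transcribed from the audio
    title: str                      # YouTube title
    description: str                # YouTube description
    tags: List[str] = field(default_factory=list)            # YouTube tags
    extracted_text: List[str] = field(default_factory=list)  # text extracted by the language model step

    def raw_text(self) -> str:
        """Join the collected YouTube metadata into one string."""
        return " | ".join([self.title, self.description, ", ".join(self.tags)])


# Placeholder values, not taken from the dataset.
track = PiastYtTrack(
    audio_path="audio/example.wav",
    midi_path="midi/example.mid",
    title="Example title",
    description="Example description",
    tags=["piano", "cover"],
)
metadata = track.raw_text()
```

Keeping the raw metadata and the extracted text in separate fields makes it possible to compare the two text sources for the same track.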

PIAST-AT

PIAST-AT consists of 2,023 expert-annotated tracks, providing more accurate and detailed text information. Each 30-second segment from PIAST-YT was annotated using our piano-specific taxonomy, which includes emotion, mood, genre, and style. The text data also includes agreement information from three annotators.
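One simple way to work with the agreement information is to count, for each tag, how many of the three annotators selected it. The sketch below assumes a hypothetical input format (one list of tags per annotator) and made-up example annotations; it is not the format of the released files.

```python
from collections import Counter


def group_tags_by_agreement(annotations):
    """Group tags by the number of annotators who selected them.

    annotations: one collection of tags per annotator (assumed format).
    Returns a dict mapping vote count to a sorted list of tags.
    """
    # set() guards against an annotator listing the same tag twice.
    votes = Counter(tag for tags in annotations for tag in set(tags))
    grouped = {}
    for tag, count in votes.items():
        grouped.setdefault(count, []).append(tag)
    return {count: sorted(tags) for count, tags in grouped.items()}


# Made-up annotations from three annotators for one segment.
example = [
    ["Happy", "Playful", "Jazz"],
    ["Happy", "Playful", "Ragtime"],
    ["Happy", "Jazz", "Cute"],
]
agreement = group_tags_by_agreement(example)
```

A downstream task could then keep only tags with at least two votes, or weight each tag by its vote count.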

Taxonomy & Samples

We constructed a piano-specific taxonomy to encompass and define the range of expression possible in solo piano music. Seven music experts participated in its construction, rating each tag for its suitability for solo piano music. The taxonomy includes three main categories: Emotion/Mood, Genre, and Style. The audio samples listed under each tag were labeled with that tag during the annotation process.

Emotion/Mood

Happy

Annotated Tags:

  • Happy, Playful, Bright, Upbeat/Energetic
  • Jazz
  • Ragtime, Cute, Intense/Grand, Pop-Piano Cover, Emotional
Bright

Annotated Tags:

  • Bright, Pop-Piano Cover
  • Easy
  • Upbeat/Energetic, Happy, Relaxing/Calm, Cute, Emotional
Playful

Annotated Tags:

  • Playful, Happy, Bright
  • Cute, Upbeat/Energetic
  • Easy, Laid-back, Pop-Piano Cover, Emotional, Jazz
Cute

Annotated Tags:

  • Cute, Bright
  • Playful, Happy, Emotional, Easy, New-age
  • Classical, Relaxing/Calm, Pop-Piano Cover
Relaxing/Calm

Annotated Tags:

  • Relaxing/Calm
  • Emotional, Easy, New-age, Classical, Laid-back
  • Pop-Piano Cover, Happy
Emotional

Annotated Tags:

  • Emotional
  • Dreamy, Ballad, Laid-back, Relaxing/Calm
  • Epic, Mysterious, Sad, New-age, Dark, Bright, Jazz
Dreamy

Annotated Tags:

  • Dreamy, Emotional
  • Laid-back, Bright, Relaxing/Calm, Ballad, Jazz
  • Intense/Grand
Mysterious

Annotated Tags:

  • Mysterious, Classical
  • Dreamy, Easy, Sad
  • Dark, Emotional, Relaxing/Calm, Playful
Sad

Annotated Tags:

  • Sad, Dark
  • Emotional
  • Tense, Intense/Grand, Classical, New-age
Dark

Annotated Tags:

  • Dark, Mysterious
  • Dreamy, Tense, New-age
  • Classical, Relaxing/Calm, Laid-back
Tense

Annotated Tags:

  • Tense, Speedy
  • Intense/Grand, Mysterious, Dark, Playful
  • Difficult/Advanced, Passionate, New-age, Powerful, Pop-Piano Cover
Epic

Annotated Tags:

  • Epic
  • New-age, Powerful
  • Passionate, Emotional, Upbeat/Energetic, Speedy
Intense/Grand

Annotated Tags:

  • Intense/Grand
  • Pop-Piano Cover, Upbeat/Energetic, Powerful, Difficult/Advanced, Speedy
  • Ballad, Sad, New-age, Emotional, Passionate, Epic
Passionate

Annotated Tags:

  • Passionate, Difficult/Advanced, Speedy
  • Powerful, Intense/Grand, Epic, Jazz
  • Dark, Tense, Funk, Upbeat/Energetic
Powerful

Annotated Tags:

  • Powerful, Swing, Difficult/Advanced, Upbeat/Energetic, Passionate, Intense/Grand, Pop-Piano Cover, Epic, Jazz
  • Speedy
  • Tense

Genre

Jazz

Annotated Tags:

  • Jazz, Swing
  • Laid-back, Upbeat/Energetic, Difficult/Advanced
  • Happy, Mysterious, Playful
Blues

Annotated Tags:

  • Blues, Jazz
  • Playful, Bright
  • Laid-back, Upbeat/Energetic, Tense
Funk

Annotated Tags:

  • Funk, Swing, Difficult/Advanced, Happy, Bright, Pop-Piano Cover, Upbeat/Energetic, Jazz
  • Powerful, Passionate, Epic
Swing

Annotated Tags:

  • Swing, Difficult/Advanced, Speedy, Jazz
  • Playful, Bright, Passionate
  • Happy, Upbeat/Energetic, Intense/Grand, Blues
Latin

Annotated Tags:

  • Latin, Jazz, Playful
  • Speedy, Bright
  • Upbeat/Energetic, Happy, Dreamy
Bossa Nova

Annotated Tags:

  • Bossa Nova, Latin, Jazz
  • Emotional
  • Laid-back, Relaxing/Calm, Mysterious, Dreamy, Bright
Ragtime

Annotated Tags:

  • Ragtime, Cute, Playful, Jazz
  • Bright, Emotional
  • Passionate, Speedy, Upbeat/Energetic, Relaxing/Calm, Happy
Ballad

Annotated Tags:

  • Ballad, Jazz
  • Relaxing/Calm, Dreamy
  • Dark, Sad, Tense, Mysterious, Emotional, Difficult/Advanced, Passionate, Intense/Grand, Playful, Epic
New-age

Annotated Tags:

  • New-age, Emotional
  • Relaxing/Calm, Happy, Bright
  • Passionate, Dreamy, Easy, Laid-back
Pop-Piano Cover

Annotated Tags:

  • Pop-Piano Cover, Emotional
  • Happy, Bright, Ballad
  • Intense/Grand, Relaxing/Calm, Easy, Mysterious
Classical

Annotated Tags:

  • Classical, Passionate, Difficult/Advanced
  • Intense/Grand, Tense, Speedy
  • Powerful, Upbeat/Energetic, Dreamy, Mysterious, Dark, Epic

Style

Easy

Annotated Tags:

  • Easy, Cute, Bright
  • Relaxing/Calm, New-age
  • Emotional, Happy
Difficult/Advanced

Annotated Tags:

  • Difficult/Advanced, Jazz
  • Passionate, Swing, Bright, Upbeat/Energetic
  • Intense/Grand, Playful, Dreamy, Pop-Piano Cover, Mysterious, Speedy
Laid-back

Annotated Tags:

  • Laid-back, Jazz
  • Mysterious
  • Emotional, Blues, Happy, Relaxing/Calm, Swing, Dreamy
Speedy

Annotated Tags:

  • Speedy, Passionate, Difficult/Advanced
  • Powerful, Intense/Grand, Epic, Jazz
  • Dark, Tense, Funk, Upbeat/Energetic

Experiments & Results

We conducted piano music classification tasks in both the audio and MIDI domains.
The experiments were conducted in two stages: 1) pre-training with PIAST-YT and 2) probing with PIAST-AT. The table below shows the results of the annotation and retrieval tasks, compared with supervised models that were not pre-trained on PIAST-YT.

  • Pre-training with PIAST-YT improves classification accuracy in both audio and MIDI domains, compared to the supervised models.
  • Probing with PIAST-AT shows promising results in all tasks, outperforming the supervised models.
  • MIDI outperforms audio in most tasks.
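For readers unfamiliar with probing, the general idea is to keep a pre-trained encoder fixed and train only a small classifier on its output embeddings. The sketch below illustrates that idea with a linear multi-label probe in plain Python. It is not the model or training setup used in these experiments: the embeddings are random stand-ins for encoder outputs, the two tags are synthetic, and all function names and hyperparameters are assumptions.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit one logistic output per tag on fixed embeddings (multi-label)."""
    n, dim, num_tags = len(embeddings), len(embeddings[0]), len(labels[0])
    weights = [[0.0] * dim for _ in range(num_tags)]
    biases = [0.0] * num_tags
    for _ in range(epochs):
        for t in range(num_tags):
            grad_w, grad_b = [0.0] * dim, 0.0
            for x, y in zip(embeddings, labels):
                logit = sum(w * v for w, v in zip(weights[t], x)) + biases[t]
                err = sigmoid(logit) - y[t]  # cross-entropy gradient w.r.t. the logit
                grad_b += err
                for j in range(dim):
                    grad_w[j] += err * x[j]
            # Full-batch gradient descent step; the embeddings are never updated.
            weights[t] = [w - lr * g / n for w, g in zip(weights[t], grad_w)]
            biases[t] -= lr * grad_b / n
    return weights, biases


def predict_tags(x, weights, biases, threshold=0.5):
    """Return a 0/1 decision per tag for one embedding."""
    return [int(sigmoid(sum(w * v for w, v in zip(ws, x)) + b) >= threshold)
            for ws, b in zip(weights, biases)]


# Synthetic stand-ins: random vectors play the role of frozen encoder outputs,
# and two made-up tags depend linearly on them.
rng = random.Random(0)
embeddings = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(60)]
labels = [[int(x[0] > 0), int(x[1] + x[2] > 0)] for x in embeddings]

weights, biases = train_linear_probe(embeddings, labels)
correct = sum(p == y for x, ys in zip(embeddings, labels)
              for p, y in zip(predict_tags(x, weights, biases), ys))
accuracy = correct / (len(embeddings) * 2)
```

Because only the probe's weights are trained, its accuracy reflects how much tag-relevant information the fixed embeddings already contain.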