Music Information Retrieval – CISUC

 
Research Opportunities

If you are interested in performing research (MSc or PhD level) in any of the topics below, send an e-mail to Prof. Rui Pedro Paiva.

 

  • Music Information Retrieval in general

     

  • Music Emotion Recognition: involving Deep Learning, Machine Learning + Audio Signal Processing (for feature extraction from audio) and/or MIDI Processing (for feature extraction from MIDI) and/or Natural Language Processing (for lyrics processing), among other sources

     

  • Automatic Music Transcription and Melody Transcription in particular

     

 
MERGE: Music Emotion Recognition - Next Generation


 
Description:

  • Context
  • Current digital music repositories are enormous and continue to grow. More advanced, flexible and user-friendly search mechanisms, adapted to individual user requirements, are urgently needed. This need has driven growing interest in the Music Information Retrieval (MIR) area. Within MIR, Music Emotion Recognition (MER) has emerged as a significant sub-field. In fact, “music’s preeminent functions are social and psychological”, and so “the most useful retrieval indexes are those that facilitate searching in conformity with such social and psychological functions. Typically, such indexes will focus on stylistic, mood, and similarity information” (David Huron).

    Google, Pandora, Spotify and Apple all have MIR/MER research agendas, with existing commercial applications. However, services such as the Spotify API that purport to categorise emotion have been shown not to be state of the art. To the best of our knowledge, there is no Portuguese industrial entity working in the field; however, international players may benefit from the outcomes of this research, e.g., Pandora, which endorses this proposal (attached letter). MER research promises significant social, economic, and cultural repercussions, as well as substantial scientific impact. Several open and complex research problems exist in this multidisciplinary field, touching upon audio signal processing, natural language processing, feature engineering and machine learning.

  • Problem
  • Current MER solutions, both in the audio and lyrics domains, are still unable to solve fundamental problems, such as the classification of static samples (i.e., with uniform emotion) with a single label into a few emotion classes (e.g., 4-5). This is supported by existing studies and the stagnation in the Music Information Retrieval Evaluation eXchange (MIREX) Audio Mood Classification (AMC) task (music-ir.org/mirex/), where accuracy stabilized at ~69%. Nevertheless, more complex problems have been addressed, e.g., music emotion variation detection (MEVD), with even lower reported results (~20%).

    We believe that, at the current stage of MER, rather than tackling complex problems, effort should be refocused on simpler static single-label classification, with a low number of classes, and exploiting the combination of audio and lyrics via a bi-modal approach.

  • Current Research Gaps
  • There exists a significant corpus of research on MER, ranging from fundamental problems of static single-label classification to higher-level work on multi-label classification or MEVD. However, current results are low and limited by a “glass ceiling”. In previous work, we demonstrated this to be partly due to two core problems: i) a lack of emotionally-relevant features; and ii) the absence of public, sizeable, quality datasets. We created public, robust (although somewhat small) MER datasets and emotionally-relevant audio and lyrics features (e.g., music expressivity and text stylistic features, such as vibrato or slang). This strategy was successful, leading to a ~10% increase in classification performance, and should thus be further pursued. The lack of robust and sizeable datasets also hampers deep learning (DL) research on MER: while DL is widespread in MIR, MER datasets are not sufficiently large or well-annotated to train fully end-to-end models.

  • Objectives and Approaches
  • We seek to advance MER research by following an explicitly bi-modal approach which models lyrics and audio simultaneously, and define the following scientific objectives:

    1. Feature engineering/learning. To devise meaningful MER features (both handcrafted and via feature/deep learning) simultaneously in the audio and lyrics domains, following a bi-modal approach (a minimal fusion sketch is given below, after the objectives).
    2. Robust public datasets. To collect and robustly annotate data for MER based on audio + lyrics and to release it to the MIR community.
    3. Static MER and MEVD. To combine these contributions to advance both static MER and MEVD, addressing bi-modal approaches and dimensional MER.

    As a technological objective, we will develop two MER software applications (a standalone and a web app) to demonstrate our scientific innovations.
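
To make the bi-modal approach concrete, the following is a minimal sketch of feature-level (early) fusion of audio and lyrics features for static single-label MER. It is purely illustrative: the file names, lyrics, labels and the tiny feature set are placeholders, and the use of librosa and scikit-learn is an assumption, not the project's actual pipeline.

```python
# Minimal sketch of feature-level (early) fusion for static, single-label MER.
# All data below is hypothetical placeholder material, not project assets.
import numpy as np
import librosa
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def audio_features(path: str) -> np.ndarray:
    """Handcrafted audio descriptors (a tiny stand-in for a full MER feature set)."""
    y, sr = librosa.load(path, sr=22050, mono=True, duration=30.0)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    cent = librosa.feature.spectral_centroid(y=y, sr=sr)
    zcr = librosa.feature.zero_crossing_rate(y)
    # Summarise frame-wise descriptors with mean/std ("bag of frames")
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1),
                      cent.mean(), cent.std(), zcr.mean()])

# Hypothetical training data: paired audio clips and lyrics, labelled with
# Russell-quadrant-style classes (e.g., Q1 happy ... Q3 sad)
audio_paths = ["song1.mp3", "song2.mp3"]
lyrics = ["some happy upbeat lyrics", "dark and sad lyrics"]
labels = ["Q1", "Q3"]

# Lyrics modality: TF-IDF over unigrams/bigrams (stylistic features such as
# slang counts would be concatenated here as well)
vec = TfidfVectorizer(ngram_range=(1, 2), max_features=2000)
X_lyrics = vec.fit_transform(lyrics).toarray()
X_audio = np.vstack([audio_features(p) for p in audio_paths])

# Early fusion: concatenate both modalities, then train a single classifier
X = np.hstack([X_audio, X_lyrics])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, labels)
print(clf.predict(X))
```

Late fusion (one classifier per modality with combined outputs) is the obvious alternative design; which performs better is exactly the kind of question the bi-modal objective targets.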

Keywords:

Music Emotion Recognition, Music Information Retrieval, Feature Engineering, Audio Signal Processing, Natural Language Processing, Machine Learning, Deep Learning

Dates:

Start date: January 1, 2022

End Date: December 31, 2024

Team (alphabetical order):

Gabriel Saraiva (MSc Student)

Guilherme Branco (BSc Student)

Hugo Redinho (MSc Student)

Luís Carneiro (MSc Student)

Matthew Davies (Senior Researcher)

Pedro Louro (PhD Student)

Renato Panda (Senior Researcher)

Ricardo Correia (PhD Student)

Ricardo Malheiro (Professor)

Rui Pedro Paiva (Professor, Project Coordinator)

 

Past collaborators:

Diogo Rente (BSc student)

Pedro Sá (MSc Student)

Rafael Matos (MSc Student)

Funded by:

FCT - Fundação para a Ciência e Tecnologia, Portugal (PTDC/CCI-COM/3171/2021)

Budget:

€ 195 892

Resources:

 
Results and conclusions:

Main contributions:
  • Work in progress...

     
MOODetector: A System for Mood-based Classification and Retrieval of Audio Music

     
Description:

“Music’s preeminent functions are social and psychological”, and so “the most useful retrieval indexes are those that facilitate searching in conformity with such social and psychological functions. Typically, such indexes will focus on stylistic, mood, and similarity information” (David Huron, 2000). This is supported by studies on music information behaviour, which have identified music mood as an important criterion for music retrieval and organization.

Beyond the music industry, the applications of emotion detection in music are wide and varied, e.g., game development, cinema, advertising and clinical applications (such as motivating compliance with physician-prescribed sport activities, or stress management).

Compared to music emotion synthesis, few works have been devoted to emotion analysis, and most of those deal with MIDI or symbolic representations. Only a few tackle emotion detection in audio music signals, the first we are aware of having been published in 2003. As a very recent research topic, it still shows many limitations and several open problems. In fact, the present accuracy of such systems leaves plenty of room for improvement: in a recent comparison, the best algorithm achieved 65% classification accuracy in a task comprising 5 categories (MIREX 2010). Improving the effectiveness of these systems demands research on feature extraction, selection and evaluation, on the extraction of knowledge from computational models, and on the tracking of emotion variations throughout a song. These are the main goals of this project.
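
As an illustration of the last of these goals, tracking emotion variations throughout a song can be framed as sliding-window classification: extract features from short overlapping segments and classify each segment independently. The sketch below assumes librosa and scikit-learn; the file name, window sizes, feature set and the dummy segment classifier are placeholders rather than the project's method.

```python
# Minimal sketch of mood tracking along a song via sliding-window classification.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def segment_features(y, sr, win_s=5.0, hop_s=2.5):
    """Yield (time, feature vector) for overlapping windows over the signal."""
    win, hop = int(win_s * sr), int(hop_s * sr)
    for start in range(0, max(1, len(y) - win + 1), hop):
        seg = y[start:start + win]
        mfcc = librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=13)
        rms = librosa.feature.rms(y=seg)
        yield start / sr, np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1), rms.mean()])

y, sr = librosa.load("song.mp3", sr=22050, mono=True)  # placeholder file
times, X = zip(*segment_features(y, sr))
X = np.vstack(X)

# Stand-in for a real segment-level mood model: here fitted on two dummy
# segments only so that the sketch runs end-to-end.
mood_model = KNeighborsClassifier(n_neighbors=1).fit(X[:2], ["calm", "tense"])

# The per-segment predictions form a mood curve over time
for t, mood in zip(times, mood_model.predict(X)):
    print(f"{t:6.1f}s  {mood}")
```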

Keywords:

Music Emotion Recognition, Music Information Retrieval, Feature Engineering, Audio Signal Processing, Natural Language Processing, Machine Learning

Dates:

Start date: May 16, 2010

End Date: November 14, 2013 (formal date, although research work has continued)

Team (alphabetical order):

Amílcar Cardoso (Professor)

António Pedro Oliveira (PhD student)

Renato Panda (PhD student)

Ricardo Malheiro (PhD student)

Rui Pedro Paiva (Professor, Project Coordinator)

     

Past collaborators:

Álvaro Mateus (BSc student)

João Francisco Almeida (BSc student)

João Miguel Paúl (MSc student)

João Fernandes (MSc student)

Luís Cardoso (MSc student)

Funded by:

FCT (Fundação para a Ciência e Tecnologia, Portugal)

Budget:

€ 77 304

Resources:

Results and conclusions:

     
Mellodee – Melody Detection in Polyphonic Audio

Description:

Keywords:

Dates:

Team (alphabetical order):

Funded by:

Budget:

Resources:

Results and conclusions:

Details:

Pitch Detection:

Determination of musical notes:

Identification of melodic notes:
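
The three stage names above are preserved from the original page, but their detailed descriptions are not. Purely as an illustration of such a pipeline, and not of the project's actual algorithms, the sketch below approximates the stages with off-the-shelf tools: librosa's pYIN tracker as a stand-in pitch detector, F0-to-MIDI quantisation with run merging for note determination, and a crude duration heuristic for melodic note identification. All thresholds and names are assumptions.

```python
# Illustrative melody-extraction skeleton; not the project's method.
import numpy as np
import librosa

y, sr = librosa.load("polyphonic.wav", sr=22050, mono=True)  # placeholder file

# 1) Pitch detection: frame-wise F0 estimates (pYIN used as a stand-in tracker)
f0, voiced, _ = librosa.pyin(y, sr=sr,
                             fmin=librosa.note_to_hz("C2"),
                             fmax=librosa.note_to_hz("C6"))
times = librosa.times_like(f0, sr=sr)

# 2) Determination of musical notes: quantise F0 to MIDI numbers and merge
#    consecutive frames with the same pitch into note events
midi = np.round(librosa.hz_to_midi(f0))
notes, current = [], None
for t, m, v in zip(times, midi, voiced):
    if v and not np.isnan(m):
        if current is not None and current["pitch"] == m:
            current["end"] = t           # extend the ongoing note
        else:
            current = {"pitch": int(m), "start": t, "end": t}
            notes.append(current)
    else:
        current = None                   # unvoiced frame ends the note

# 3) Identification of melodic notes: crude heuristic keeping notes longer
#    than 100 ms (real systems use salience, smoothness and voice constraints)
melody = [n for n in notes if n["end"] - n["start"] > 0.1]
for n in melody:
    print(librosa.midi_to_note(n["pitch"]), f'{n["start"]:.2f}-{n["end"]:.2f} s')
```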
