ISMIR 2006 Tutorial Information - SUNDAY October 8

The ISMIR 2006 tutorials will take place on the University of Victoria campus in the new Engineering and Computer Science building. A map of the UVic campus showing the building (lower left corner) can be found at:
UVic 3D map
There is frequent public transportation from downtown and the Empress to UVic:
Route 4
Route 14
Important: Sunday, Oct. 8th is also the day of the Royal Victoria Marathon, so make sure you check the bus detours posted on the webpage above. These are the two easiest routes. The staff at the Empress will also be able to help you arrange a taxi, which is quite affordable when shared among 2-4 people.

09:00-12:00 Computational Rhythm Description
09:00-12:00 User Interfaces for Music Information Retrieval
14:00-17:00 MIR for audio signals using Marsyas-0.2
14:00-17:00 Bayesian Methods for Music Signal Analysis

Computational Rhythm Description
Fabien Gouyon, Simon Dixon

This tutorial will provide an overview of past and current approaches
to the automatic description of musical rhythm in audio and MIDI
files. After defining basic notions of interest, we will propose a
general functional framework for the analysis and qualitative
comparison of existing rhythm description systems and review the
architecture of many existing systems with respect to individual
blocks of this framework. We will then address the issue of system
evaluation and report on results from the last two MIREX efforts in
tempo estimation. We will finally illustrate the use of rhythmic
descriptors in music content processing applications (from retrieval
to transformations) and highlight current promising research trends.

In this tutorial, special focus will be placed on building bridges
between high-level music-theoretic rhythm concepts and acoustic
features at a lower level of abstraction.

Rhythm is ubiquitous in the literature on automatic music description,
notably in the proceedings of past ISMIR conferences. However, when
including rhythmic features in MIR systems, researchers usually use
rather low-level representations of rhythm. This tutorial will review
previous and current attempts at automatic rhythm description of audio
and MIDI files, illustrate current use of rhythmic features in MIR
systems and foster the use of higher level rhythmic features for tasks
such as genre classification and music similarity.
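The kind of low-level rhythm representation the abstract refers to can be illustrated with a short sketch (not taken from the tutorial materials): computing a rhythm periodicity function as the autocorrelation of an onset-strength envelope, then reading a tempo estimate from its strongest peak in a plausible BPM range. All names, data and parameter values here are illustrative assumptions.

```python
import numpy as np

def periodicity_function(onset_strength):
    """Autocorrelation of an onset-strength envelope: one simple
    rhythm periodicity function."""
    x = onset_strength - onset_strength.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    return ac / ac[0]  # normalize so lag 0 equals 1

def estimate_tempo(onset_strength, sr_env, bpm_range=(60, 200)):
    """Pick the lag with the strongest periodicity inside the BPM
    range and convert it back to beats per minute."""
    ac = periodicity_function(onset_strength)
    min_lag = int(sr_env * 60.0 / bpm_range[1])
    max_lag = int(sr_env * 60.0 / bpm_range[0])
    lag = min_lag + np.argmax(ac[min_lag:max_lag])
    return 60.0 * sr_env / lag

# Synthetic envelope: impulses every 0.5 s at a 100 Hz envelope rate,
# i.e. a 120 BPM pulse.
sr_env = 100
env = np.zeros(1000)
env[::50] = 1.0
print(round(estimate_tempo(env, sr_env)))  # 120
```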


Fabien Gouyon works at the Austrian Research Institute for Artificial
Intelligence in Vienna. He received his PhD in Computer Science and
Digital Communication from University Pompeu Fabra in Barcelona. His
thesis titled "A Computational Approach to Rhythm Description - Audio
Features for the Computation of Rhythm Periodicity Functions and their
use in Tempo Induction and Music Content Processing" was supervised by
Xavier Serra (Pompeu Fabra University, Barcelona) and Gerhard Widmer
(Johannes Kepler University, Linz). He is a member of the Audio
Engineering Society committee on Semantic Audio Analysis and has
published more than 30 scientific papers on the computational analysis
of musical rhythm, onset detection and percussion sound
classification. He serves as a reviewer for several international
journals and organized the first tempo induction contest at the ISMIR
2004 conference.

Simon Dixon is a research scientist at the Austrian Research Institute
for Artificial Intelligence (OFAI) in Vienna. He studied computer
science at the University of Sydney, obtaining the BSc (1989) and PhD
(1994) degrees, with a dissertation on computational belief revision.
During his undergraduate studies, he also obtained the AMusA and LMusA
diplomas in classical guitar. After lecturing in computer science at
Flinders University of South Australia for 5 years, he moved to Vienna
in 1999 to join the Intelligent Music Processing and Machine Learning
Group at OFAI. His research interests focus on the extraction and
processing of musical content (particularly rhythmic content) in audio
signals, and he has published over 40 papers covering areas such as
tempo induction, beat tracking, onset detection, automated
transcription, genre classification and the measurement and
visualization of expression in music performance.

User Interfaces for Music Information Retrieval
David Gerhard

As in many other disciplines, there is something of a gap in the Music
Information Retrieval community between those who develop tools and
those who use the tools.  Often, a developer will build a tool to a
level of detail sufficient to solve a particular problem and then make
the tool available to other users.  Unless these users are familiar
with the development process for the tool, or the underlying parameter
design, they may have difficulty using the tool.  Researchers
unfamiliar with programming languages may have difficulty using many
of the available MIR tools, and may become frustrated knowing that
extracting a particular piece of information is possible if only they
could figure out how to do it.

This tutorial will examine some of the principles of user interface
design, and how they apply to MIR situations.  Presentation of MIR
data in visualizations, augmented audio, and score notation will be
discussed, as well as the design of efficient, effective and
satisfying interactions. Physical interfaces and interface protocols
will also be discussed, considering their limitations, advantages, and
the possibility of using traditional interface modes for novel
interaction paradigms.


David Gerhard is an Assistant Professor of Computer Science at the
University of Regina in Saskatchewan, Canada.  He received his B.Sc.
in Computer Engineering from the University of Manitoba in 1996, and
in 2003 received his Ph.D. in Computer Science from Simon Fraser
University for his work detailing the computational differences
between talking and singing. Since then, he has been studying
interactive media, specifically computational and human aspects of
audio interaction, applying techniques from HCI, multimedia, and
pattern recognition.  He has published work related to multimedia
composition, sound spatialization, and usability of new music devices,
as well as continuing his work on the analysis of the sung voice. Dr.
Gerhard is a founding member and the current director of the aRMADILo
(Rough Music and Audio Digital Interaction Lab) and is an associate
member of the department of music.

MIR for audio signals using Marsyas-0.2
George Tzanetakis and Luis Gustavo Martins

Marsyas-0.2 is an open source audio processing framework with specific
emphasis on MIR applications. It is written in C++ and provides a large
number of building blocks for constructing MIR systems. It attempts to
provide high-level expressive abstractions without sacrificing efficiency.
The tutorial will cover the basics of using the latest rewrite of Marsyas
(the 0.2.x releases), with specific examples from audio MIR research such
as feature extraction, similarity retrieval, classification, clustering and
segmentation. In addition, it will cover more advanced topics such as the
creation of user interfaces and the interaction of analysis and synthesis.
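Marsyas itself is written in C++; as a rough, hypothetical illustration of the "building blocks" idea the abstract describes (not the actual Marsyas API), the following Python sketch composes simple processing blocks into a series network with named controls. The block names echo Marsyas conventions, but everything here is an invented stand-in.

```python
class Gain:
    """A trivial block: scale each sample by a named control value."""
    def __init__(self):
        self.controls = {"gain": 1.0}
    def process(self, block):
        return [s * self.controls["gain"] for s in block]

class Square:
    """Another trivial block: square each sample."""
    def __init__(self):
        self.controls = {}
    def process(self, block):
        return [s * s for s in block]

class Series:
    """Compose blocks so each one's output feeds the next,
    loosely analogous to a series composite in a dataflow framework."""
    def __init__(self, *blocks):
        self.blocks = blocks
    def process(self, block):
        for b in self.blocks:
            block = b.process(block)
        return block

net = Series(Gain(), Square())
net.blocks[0].controls["gain"] = 2.0   # set a named control
print(net.process([1.0, 2.0, 3.0]))    # [4.0, 16.0, 36.0]
```

The design point this tries to convey is that a network is just data: blocks can be created, wired and reconfigured through controls at run time, which is what makes such frameworks convenient for experimenting with MIR pipelines.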


George Tzanetakis is an Assistant Professor of Computer Science
(cross-listed in Music) at the University of Victoria. He received his
PhD degree in Computer Science from Princeton University in May 2002
and was a Postdoctoral Fellow at Carnegie Mellon University working on
query-by-humming systems with Prof. Dannenberg and on video retrieval
with the Informedia group. In addition he has worked as a summer
intern at SRI on multimedia browsing user interfaces, was chief
designer of the audio fingerprinting technology of Moodlogic Inc., and
developed a real-time music speech classification system for All Music
Publishing, The Netherlands. His research deals with all stages of audio
content analysis, such as feature extraction, segmentation and
classification, with specific focus on Music Information Retrieval
(MIR). His pioneering
work on musical genre classification is frequently cited and received an
IEEE Signal Processing Society Young Author Award in 2004. He has
presented tutorials on MIR and audio feature extraction at several
international conferences. He is also an active musician and has studied
saxophone performance, music theory and composition. More
information can be found at:

Luis Gustavo Martins is a PhD student and researcher at the Audio Group
of the Telecommunications and Multimedia Unit of INESC Porto, Porto, Portugal.
His main work interests are in the areas of Digital Audio Processing
and Semantic Audio Analysis. He has been involved in the development
of Marsyas for several years and is currently implementing his
Master's thesis work on polyphonic music transcription in Marsyas-0.2.

Bayesian Methods for Music Signal Analysis
A. Taylan Cemgil

In recent years, there has been significant growth in music
information processing applications that employ ideas from statistical
machine learning and probabilistic modeling.  In this paradigm, music
data is viewed as realizations of highly structured stochastic
processes.  Once a model is constructed, several interesting problems
such as pitch detection, transcription, tempo tracking and source
separation can be formulated as Bayesian inference problems. In
this context, graphical models provide a "language" for constructing
models of music, quantifying prior knowledge about the physical
properties of sound, musical structure and its relation to
performance. Unknown parameters in this specification are then
estimated by probabilistic inference. Often, however, the problem size
poses an important challenge, and to render the approach
feasible, specialized inference methods must be tailored to improve
computational speed and efficiency.

The scope of the tutorial is as follows: first, we will review the
fundamentals of probabilistic models of music, both low-level signal
models and high-level structural models such as tempo, rhythm and
harmony. Then, we will discuss numerical techniques for
inference in these models. In particular, we will review exact
inference, approximate stochastic inference techniques such as Markov
Chain Monte Carlo and Sequential Monte Carlo, and deterministic
(variational) inference techniques. Our ultimate aim is to provide a basic
understanding of probabilistic modeling for music processing, and a
roadmap so that music information retrieval researchers new to the
Bayesian approach can orient themselves in the relevant literature,
understand the current state of the art and eventually incorporate
these powerful techniques into their own research.
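As a toy illustration of the workflow the abstract describes (posit a generative model, then estimate its unknown parameters by approximate inference), the following sketch estimates a beat period from noisy inter-onset intervals using a basic Metropolis sampler, one instance of the Markov Chain Monte Carlo family. The generative model, data and all parameter values are invented for this example.

```python
import math, random

random.seed(0)

# Assumed generative model: observed inter-onset intervals are the
# true beat period plus Gaussian timing noise.
true_period, noise_sd = 0.5, 0.02
data = [random.gauss(true_period, noise_sd) for _ in range(50)]

def log_posterior(period):
    """Uniform prior on a plausible range times a Gaussian likelihood
    (up to an additive constant in log space)."""
    if not 0.2 < period < 2.0:
        return -math.inf
    return -sum((x - period) ** 2 for x in data) / (2 * noise_sd ** 2)

# Metropolis sampler: propose a random-walk step, accept with
# probability min(1, posterior ratio).
samples, current = [], 1.0
for _ in range(5000):
    proposal = current + random.gauss(0.0, 0.05)
    if math.log(random.random()) < log_posterior(proposal) - log_posterior(current):
        current = proposal
    samples.append(current)

# Discard burn-in, then summarize the posterior by its mean.
posterior_mean = sum(samples[1000:]) / len(samples[1000:])
print(round(posterior_mean, 2))  # close to the true period of 0.5
```

The same recipe (model, prior, sampler, posterior summary) underlies the far more elaborate music models the tutorial covers; the specialized inference methods it mentions exist precisely because naive samplers like this one do not scale to realistic problem sizes.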


A. Taylan Cemgil received his B.Sc. and M.Sc. in Computer
Engineering from Bogazici University, Turkey, and his Ph.D. (2004) from
Radboud University Nijmegen, the Netherlands, with a thesis entitled
"Bayesian Music Transcription". Between 2003 and 2005, he worked as a
postdoctoral researcher at the University of Amsterdam on vision-based
multi-object tracking.  He is currently a research associate at the
Signal Processing and Communications Lab, University of Cambridge,
UK, where he cultivates his interests in machine learning methods,
stochastic processes and statistical signal processing.  His research
is focused on developing theoretically sound computational
techniques to equip computers with musical listening and interaction
capabilities, which he believes are essential in the construction of
intelligent music systems and virtual music instruments that can
listen, imitate and autonomously interact with humans.