
Virtual Instrument Research

Abstract

  • Human-Computer Interaction is becoming a major component of computer-science-related fields, allowing humans to communicate with machines in much simpler ways and opening new dimensions of research.
  • Kinect, the 3D sensing device introduced by Microsoft, originally aimed at the computer-games domain, is now used in many other areas.
  • We use Kinect to control sound signals, producing aesthetic music.

Introduction

  • Kinect is a sensor capable of capturing depth and color information about the user in front of it using an array of RGB and infrared cameras. It can also capture sound input through an array of microphones.


  • Musical Instrument Digital Interface (MIDI)
  • A technical standard that describes a protocol, a digital interface, and connectors, and allows a wide variety of electronic musical instruments, computers, and other related devices to connect and communicate with one another.
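As a concrete illustration of the protocol side of the standard: a MIDI channel voice message is only a few bytes, a status byte (message type plus channel) followed by two data bytes. A minimal sketch in C++ (the helper names are ours, not part of the standard):

```cpp
#include <array>
#include <cstdint>

// A MIDI Note On message is three bytes:
//   status = 0x90 | channel (channels are 0-15 on the wire),
//   note number 0-127, velocity 0-127.
std::array<uint8_t, 3> noteOn(uint8_t channel, uint8_t note, uint8_t velocity) {
    return {static_cast<uint8_t>(0x90 | (channel & 0x0F)),
            static_cast<uint8_t>(note & 0x7F),
            static_cast<uint8_t>(velocity & 0x7F)};
}

// Note Off uses status 0x80; its velocity byte is commonly 0.
std::array<uint8_t, 3> noteOff(uint8_t channel, uint8_t note) {
    return {static_cast<uint8_t>(0x80 | (channel & 0x0F)),
            static_cast<uint8_t>(note & 0x7F),
            static_cast<uint8_t>(0)};
}
```

For example, `noteOn(0, 60, 100)` yields the bytes {0x90, 60, 100}: middle C at moderate velocity on channel 1.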

Midi Architecture

  • A MIDI controller is hardware or software that generates and transmits MIDI data to MIDI-enabled devices.
  • Our Virtual Instrument is a novel MIDI controller design based on Kinect!

Design

  • Visual C++: Programming Language
  • OpenNI: Kinect Driver and API functions
  • OpenCV: Visual Interaction display
  • MIDI: Music signal to control MIDI instrument
  • Cubase and VST instrument: MIDI instrument (Audio Library)

  • Once the depth information is captured using Kinect, the user's skeleton, with 24 joints, can be obtained through the available OpenNI functions.

Skeleton tracking

Midi Programming and Audio

  • For the MIDI signal, we use the RtMidi API by Gary P. Scavone of McGill University, sending MIDI signals to the audio library to control the note key, duration, and velocity.
  • The VST instrument from Steinberg is the audio library, which is capable of simulating the sound of real instruments.
  • Using MIDI as the input mechanism, the VST instruments vividly output the sounds as instructed.
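The sending path can be sketched as follows. To keep the sketch runnable without a Kinect or a MIDI device, the output is abstracted behind a callback; with RtMidi, that callback would wrap the message in a `std::vector<unsigned char>` and pass it to `RtMidiOut::sendMessage`. The function names and the callback design are our assumptions, not the system's actual code:

```cpp
#include <functional>
#include <vector>

// Abstraction over the MIDI output. With RtMidi this would be backed by:
//   RtMidiOut out;  out.openPort(0);
//   std::vector<unsigned char> msg = {...};  out.sendMessage(&msg);
using MidiSender = std::function<void(const std::vector<unsigned char>&)>;

// Start a note with the given key and velocity on channel 1 (status 0x90).
void startNote(const MidiSender& send, unsigned char key, unsigned char velocity) {
    send({0x90, key, velocity});
}

// Stop the note (status 0x80). The note's duration is simply the time
// elapsed between the startNote and stopNote calls.
void stopNote(const MidiSender& send, unsigned char key) {
    send({0x80, key, 0});
}
```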

VST example

Virtual Drum

  • Regions in front of the user are identified as the Kick, Snare, Hi-Hat, and Cymbal.
  • The left hand, right hand, and right knee are used as triggers against the regions specified above.
  • When the coordinate of a triggering point exceeds a specified threshold with respect to the defined regions of the virtual drum set, the program triggers a MIDI signal, which in turn triggers the sound in the audio library.
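The trigger test in the last bullet can be sketched as pure logic. The region bounds, the coordinate convention, and the General MIDI note numbers below are our illustrative assumptions, not the measured values used in the system:

```cpp
// One drum region (Kick, Snare, Hi-Hat, or Cymbal) with its trigger rule.
struct DrumRegion {
    float xMin, xMax, yMin, yMax;  // region bounds in skeleton coordinates
    float threshold;               // value the joint coordinate must exceed
    unsigned char midiNote;        // e.g. 36 = Kick, 38 = Snare in General MIDI
};

// Returns the MIDI note to fire, or -1 when the joint triggers nothing:
// the joint must lie inside the region's bounds and its coordinate must
// exceed the region's threshold.
int checkTrigger(const DrumRegion& r, float x, float y, float coord) {
    bool inside = x >= r.xMin && x <= r.xMax && y >= r.yMin && y <= r.yMax;
    return (inside && coord > r.threshold) ? r.midiNote : -1;
}
```

A hit then maps directly to a Note On message for `r.midiNote`.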

Trigger points and controlling points

Virtual Guitar

  • As with the drum, the user's skeleton is first captured using Kinect and OpenNI.
  • Then, according to the user's skeleton coordinates, we place a chord-selection area in front of the user's left hand. Within it we define six different areas, each representing a chord.
  • To play the virtual guitar, we also place a virtual guitar string in front of the user's right hand. When the right-hand coordinate falls within a specified interval, the program sends a MIDI signal to the audio library, which triggers the relevant sound.
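The two mappings above can be sketched as follows. The six-way split and the string interval come from the text; the coordinate ranges and function names are our assumptions:

```cpp
// Map the left hand's x coordinate inside the chord-selection box
// [boxLeft, boxRight) to one of six chord indices (0-5); -1 outside.
int selectChord(float x, float boxLeft, float boxRight) {
    if (x < boxLeft || x >= boxRight) return -1;
    float slice = (boxRight - boxLeft) / 6.0f;
    int idx = static_cast<int>((x - boxLeft) / slice);
    return idx > 5 ? 5 : idx;  // guard against float rounding at the edge
}

// The virtual string: a strum fires when the right hand's coordinate
// enters the interval [lo, hi].
bool strummed(float rightHand, float lo, float hi) {
    return rightHand >= lo && rightHand <= hi;
}
```

A strum then sends Note On messages for the notes of the currently selected chord.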

Chord Selection Box

Virtual Guitar Illustration

Spider King

  • The program virtually draws a circle around the user and divides its circumference into several intervals. The user's hands are then used to control the key and volume.
  • The closer a hand is to the circumference, the louder the volume. As with the drum and guitar, the sound comes from MIDI signals, which we connect to different audio libraries.
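Both mappings can be sketched as follows, assuming the circle is divided into equal arcs and volume grows linearly with distance from the centre; the linear law and all names here are our assumptions:

```cpp
#include <cmath>

constexpr float kPi = 3.14159265f;

// Pick the interval (and hence the key) from the hand's angle around
// the circle's centre (cx, cz); intervals are numbered 0..nIntervals-1.
int keyForAngle(float handX, float handZ, float cx, float cz, int nIntervals) {
    float angle = std::atan2(handZ - cz, handX - cx);  // -pi..pi
    float t = (angle + kPi) / (2.0f * kPi);            // normalized to 0..1
    int idx = static_cast<int>(t * nIntervals);
    return idx >= nIntervals ? nIntervals - 1 : idx;   // clamp at angle = +pi
}

// MIDI velocity 0-127: silent at the centre, loudest at the circumference.
int volumeForDistance(float dist, float radius) {
    if (dist <= 0.0f) return 0;
    if (dist >= radius) return 127;
    return static_cast<int>(127.0f * dist / radius);
}
```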


Advanced Spider King


Performance

  • All the virtual instruments presented here (Guitar, Drum, and Spider King), along with several others, were presented in a live concert at NCU on May 9th, 2013.

---103 CSIE Concert opening

---103 CSIE Concert "無底洞" ("Bottomless Pit")

Conclusion

  • Human-gesture-based music composition is an emerging field of research. This paper presents a research direction, based on HCI, for composing music from human gestures using the Kinect sensor.
  • Work is ongoing to capture more detailed human gestures, increase the frame rate for better performance, and integrate more robust methods to improve quality; in particular, detailed subjective Quality of Experience measurements are needed to assess the experience of users and listeners.