Final Submission

Here is the final submission. From the learning agreement….

I propose to produce a sound track for a short film (2-3 mins duration). The film will be an observational study of an old wooden jetty near West Mersea, Mersea Island, Essex, which protrudes from the island bank over salt marsh into mud/water (depending on the state of tide). The location is well away from built up areas and rich in sounds which are produced by the large number of birds in the area, marine activity, local firing ranges, walkers, water and wind. The visual style of the film will be deliberately simple in order to create space for the sound track which will be produced by creating a prototype ‘positional audio diffusion system’ enabling the placement of sound sources in notional 3d space around the film POV. All sound sources will be captured in situ via field recording techniques and processed in real-time by the diffusion system in order to create the illusion of distance and position in relation to the viewer. The final submission will consist of the film complete with audio track achieved via the positional audio diffusion system.

Critical evaluation/reflection, system overview and more in the posts below…


Critical Evaluation of Project

In general, I personally feel the project has been a success, despite not being 100% happy with the final output (more on this later!). I have learned much along the way, devised and tested a range of techniques, come up against problems and found ways round them, and found a number of new contexts for my artistic endeavours. I feel that I have really identified with sonic art in the context of audio-visual (av) composition and have started to move away from the idea of simply producing av output towards ideas of installation/performance within which the av content is an important but not solitary element.

The Good Stuff

  • The positional techniques and trig maths worked, phew!
  • All software elements generally stood up to the task; in particular the Processing implementation of OSC (see below for more detail), although looking ugly, did the trick
  • I learned some Java syntax and Processing methods
  • I developed some useful audio editing techniques eg removing detail noise from an atmosphere track and using it as an audio event
  • I tested out use of volume, frequency filter and reverb to create the sensation of distance from the point of audition (POA)
  • Similarly, I tested out the use of pan and additional frequency cut (for sounds to the rear) to create the sensation of direction from the POA

Bad Stuff

  • I underestimated the time it would take to capture some decent field recordings, a month rather than a week may have been better
  • I was not very happy with the final field recordings, especially car, aircraft and high gain noise
  • Car and aircraft noise I begrudgingly got used to – after all, these sounds are part of the environment
  • The gain noise I found more annoying – quite broad in frequency, it runs right across the sound of water trickling through the mud, which was a key sound I wanted to preserve. Inevitably much of the noise remains despite my efforts to suppress it – at least the water trickling noise is audible.
  • The main audio sources that were suitable for use as sound events were seagull cries, as these were generally nearer to the mic and cut through background noise – fine, except slightly limiting in terms of an audio event palette; I was tempted to rename the project ‘generative seagull’ ;-0
  • Although Resolume generally behaved as a master av performance platform, I had some issues exporting a decent render to be used as the actual submission

If I did it all again

  • I would have to consider which is more important, fidelity to the environment I am seeking to re-construct or fidelity of audio, as there is an inevitable compromise to be struck
  • If I went down the field recording as only source route again, I would need to up-spec recording equipment and spend longer capturing sources
  • I would instead be tempted to re-create/re-interpret the sounds using synthesis which would be great fun and would also mean clean sound sources 😉
  • The visuals played second fiddle in this project, naturally as it is a sound module, I would want to think about these in more detail if I did it again
  • I would try to improve the use of digital sound processing to achieve positional audio; in particular the convincing placement of sound behind the POA is difficult to achieve with a standard 2 speaker/headphone set up
  • Ideally, I would mitigate the above by piping the sound to multiple speakers in an installation environment 😉

A note on the images

In the end I decided to use a small number of detail shots of the jetty rather than the advancing point of view shot I’d previously used. The detail shots are in sequence running from shore to mud ends of the jetty and roughly match up in position to the placement of the triangular point of audition (POA) marker shown on the schematic.

I like the close-up textures and their sense of ‘attrition in the face of the elements’. Hopefully these oblique views of location help to draw focus and give more purpose to the audio.

System version 0.0.1 – Overview

The ‘Positional Audio System’ created for the Sound Creation and Perception module comprises a number of elements, at the core of which is the Resolume Avenue VJ Platform.

The system is designed with future performance in mind and although clips are randomly triggered in the current version, a future version could easily incorporate a control interface better suited to performance, eg an iPad running a specifically designed app.

A Flash movie holds a series of still images taken from individual points along the jetty. These images were taken using a 4:3 aspect ratio. The Flash movie itself is 16:9 and in the remaining space is a simple schematic of the jetty from a birds-eye view and a marker (a triangle) denoting the position of the Point of Audition (POA). Another marker (a circle) is used when a sound is triggered to allow a visual check that each triggered sound is being treated in a manner corresponding to its nominal position.

Using Resolume Avenue’s Flash SDK, a number of the Flash movie’s parameters are ‘exposed’ to Resolume Avenue so that these can be controlled by any other internal/external source capable of communicating with Avenue.

Processing is used to execute a mini application (in Processing speak, a ‘sketch’) capable of using Open Sound Control (OSC) to control the audio and Flash clips held in Resolume Avenue. The Processing sketch contains the system’s control logic – all other system elements are essentially ‘dumb’ and are only used to facilitate connection, playback and final output of media and effects.

The Processing sketch initiates and controls the ‘shore’ and ‘mud’ atmosphere tracks (ie the processed field recordings made at either end of the jetty). It does this by modulating amplitude and a low pass filter, suppressing the volume and frequency range of an audio track the further the POA is from the nominal source of that track – ie for the atmosphere tracks, the distance from the current POA to either end of the jetty. The duration of POA movement from the shore to the mud end of the jetty is set in Processing, and because the atmosphere tracks loop and the detail sounds are continuously triggered, this can be set to an arbitrary time.
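
A rough sketch of this distance-based suppression – note the jetty length and filter cutoff ranges here are placeholder values of my own, not those used in the actual sketch:

```java
// Sketch of distance-based suppression of an atmosphere track.
// JETTY_LENGTH and the cutoff range are illustrative placeholder values.
public class AtmosphereFade {
    static final double JETTY_LENGTH = 100.0;    // nominal units, shore to mud
    static final double MIN_CUTOFF_HZ = 500.0;   // heavily muffled when far away
    static final double MAX_CUTOFF_HZ = 17000.0; // fully open at the source

    // Amplitude falls off linearly with distance from the track's nominal source.
    static double gain(double distance) {
        double d = Math.min(Math.max(distance, 0.0), JETTY_LENGTH);
        return 1.0 - d / JETTY_LENGTH;
    }

    // Low pass cutoff also falls with distance, dulling the far track.
    static double cutoffHz(double distance) {
        double d = Math.min(Math.max(distance, 0.0), JETTY_LENGTH);
        return MAX_CUTOFF_HZ - (MAX_CUTOFF_HZ - MIN_CUTOFF_HZ) * (d / JETTY_LENGTH);
    }

    public static void main(String[] args) {
        // POA halfway along the jetty: track at half gain.
        System.out.println(gain(50.0));      // 0.5
        System.out.println(cutoffHz(0.0));   // 17000.0
        System.out.println(cutoffHz(100.0)); // 500.0
    }
}
```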

Individual detail sounds are stored in Resolume Avenue and, for this initial version of the Processing sketch, are triggered using a combination of simple techniques common to motion graphics programming. For each frame of the journey from the shore to the mud end of the jetty, an increasing probability threshold is set for a trigger to occur, ie the nearer the end of the journey, the more likely a sound is to be triggered. Whether or not a positional audio event occurs is determined by firstly randomly generating x and y co-ordinates for the possible location of the event in notional space and secondly using the ‘noise()’ function of Processing, which returns a value for a given co-ordinate pair based on an adaptation of Perlin noise. Perlin noise is commonly used to create semi-random patterns for use as textures, form or movement paths in motion graphics. It has the feature of looking organic when visualised as a bitmap (as below).

Perlin noise seems like an appropriate input pattern for the triggering of sounds as part of a soundscape composition. In fact there may be some mileage in exploring the natural patterns present in an environment and trying to devise mathematical articulations of them for use in composition. For example the almost-fractal type patterns created by the interaction of sea water and land as salt marsh is created, as seen in the aerial view of the Mersea Island jetty.

Back to the audio trigger logic: if the Perlin noise value for the x,y co-ordinate pair being tested passes the current trigger threshold, a sound is triggered. Additionally a trigger count limit is set on a semi-random basis which is used to prevent successive sounds being triggered too quickly, ie many events triggered a few frames apart would sound cluttered.
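
The trigger logic above can be sketched roughly as follows – note that valueNoise() here is a crude hash-based stand-in for Processing's Perlin-based noise() function, and the thresholds and cooldown values are placeholders, not those used in the actual sketch:

```java
import java.util.Random;

// Sketch of the trigger logic: a rising trigger probability, a noise-gated
// x,y test, and a semi-random cooldown to stop events clustering.
public class TriggerLogic {
    static final int TOTAL_FRAMES = 3600;
    static final Random rng = new Random(42);
    static int cooldown = 0; // frames to wait before the next trigger

    // Deterministic pseudo-noise in [0,1) for a co-ordinate pair.
    // A stand-in for Processing's noise(x, y); not real Perlin noise.
    static double valueNoise(double x, double y) {
        long h = Double.doubleToLongBits(x) * 0x9E3779B97F4A7C15L
               ^ Double.doubleToLongBits(y) * 0xC2B2AE3D27D4EB4FL;
        h ^= h >>> 31;
        return (h & 0xFFFFFFL) / (double) 0x1000000;
    }

    // Returns true if a positional audio event fires on this frame.
    static boolean maybeTrigger(int frame) {
        if (cooldown > 0) { cooldown--; return false; }
        // The threshold falls as the journey progresses, so later frames
        // are more likely to pass the noise test.
        double threshold = 0.9 - 0.6 * ((double) frame / TOTAL_FRAMES);
        double x = rng.nextDouble() * 100.0; // candidate event position
        double y = rng.nextDouble() * 100.0;
        if (valueNoise(x, y) > threshold) {
            cooldown = 30 + rng.nextInt(60); // semi-random limit between events
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int count = 0;
        for (int f = 0; f < TOTAL_FRAMES; f++) if (maybeTrigger(f)) count++;
        System.out.println("events triggered: " + count);
    }
}
```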

The choice of sound is arbitrary for the initial version. Each sound triggered is assigned the x and y co-ordinate pair to hand. The Processing sketch then uses trigonometric functions (detailed previously) to ascertain distance and angle of the sound from the POA. An OSC message is then constructed that triggers the relevant sound clip in Resolume (shown below).

The OSC message instructs Resolume to modulate volume, pan, low pass filter and reverb dry/wet balance based on the positional attributes of the constructed sound event, as evaluated by the Processing sketch. Additionally, pitch is micro-modulated arbitrarily for each sound event in order to introduce variation and nuance.
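
Putting the positional attributes together, the construction of a parameter set for a single event might look something like the sketch below – the OSC-style addresses and the attenuation law are illustrative placeholders, not Resolume's actual address scheme:

```java
// Sketch of turning a sound event's position into the parameter values
// that would be sent to Resolume via OSC. The addresses are placeholders.
public class EventParams {
    static final double MAX_DISTANCE = 100.0; // nominal environment radius

    // Build address/value pairs for one event at (x,y) relative to the POA.
    static String[] buildMessages(double poaX, double poaY, double x, double y) {
        double dx = x - poaX, dy = y - poaY;
        double distance = Math.sqrt(dx * dx + dy * dy);
        double d = Math.min(distance / MAX_DISTANCE, 1.0);
        double volume = 1.0 - d;                   // quieter when further away
        double pan = Math.atan2(dy, dx) / Math.PI; // -1..1 around the POA
        double reverbWet = d;                      // wetter when further away
        return new String[] {
            "/clip/volume " + volume,
            "/clip/pan " + pan,
            "/clip/reverb " + reverbWet,
        };
    }

    public static void main(String[] args) {
        // Event 50 units away, up and to the right of the POA.
        for (String m : buildMessages(0, 0, 30, 40)) System.out.println(m);
    }
}
```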

Contextualisation: Soundscape Composition

The term ‘Soundscape’ was coined by Canadian composer R. Murray Schafer (b1933), who used it to describe a genre of audio composition that creates the sensation of a specific acoustic environment through the use of sounds to be found there, such as animal, environmental, people and mechanical sounds. Schafer originally suggested that there are three main elements to soundscape, which he called keynote sounds, sound signals and soundmarks. Keynote sound is a musical concept referring to the innate key of an environment. Sound signals are near field sounds listened to consciously. Soundmarks are analogous to landmarks and are completely unique to an aural environment.

Before and at the same time as Schafer was developing these ideas, what we now consider to be ‘sampled’ real world sounds had been used as part of electro-acoustic compositions by the likes of John Cage, Pauline Oliveros and John Oswald. In this sense the use of soundscape-type sound sources was also a facet of wider electro-acoustic composition, although usually these sources were used out of their original aural context.

Cue Barry Truax and Hildegard Westerkamp, two Canadian-based composers at the forefront of branching the form of soundscape into ‘soundscape composition’. In soundscape composition, sounds from a given environment can be manipulated substantially or mimicked, and sounds from elsewhere can also be used, as long as the listener feels a connection to the subject environment and the compositional devices are used to enhance this connection.

Barry Truax outlined 4 principles of soundscape composition in his 2001 publication ‘Acoustic Communication’:

1. listener recognisability of the source material is maintained, even if it subsequently undergoes transformation;

2. the listener’s knowledge of the environmental and psychological context of the soundscape material is invoked and encouraged to complete the network of meanings ascribed to the music;

3. the composer’s knowledge of the environment and psychological context of the soundscape material is allowed to influence the shape of the composition at every level, and ultimately the composition is inseparable from some or all aspects of that reality;

4. the work enhances our understanding of the world, and its influence carries over into everyday perceptual habits.

Initial audio tests with atmospheric recordings

I initially ran a few tests to figure out whether the recordings would actually be usable. I tried cross fading between the A and B (ie ‘land’ and ‘mud’) atmosphere tracks over a period of 2 minutes. This didn’t work, as almost right away I could hear atmosphere B coming in over A, and similarly could hear A until very nearly the end. I then tried off-setting the cross fade so that B starts fading in after approximately 30 seconds and A completes fading out 30 seconds before the end. This created a slightly more convincing transition.
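
The offset cross fade can be sketched as a pair of simple linear envelopes – the duration and offsets below mirror the 2 minute test described, though the actual fade curves used in the editor may well have differed:

```java
// Sketch of the offset cross fade: track A fades out early, track B
// fades in late, with a 30 second offset at each end of a 2 minute span.
// Linear ramps are an assumption; the real fades may have been curved.
public class OffsetCrossfade {
    static final double DURATION = 120.0; // seconds
    static final double OFFSET = 30.0;    // seconds

    // A fades from full gain to silence between t=0 and t=DURATION-OFFSET.
    static double gainA(double t) {
        double end = DURATION - OFFSET;
        if (t >= end) return 0.0;
        return 1.0 - t / end;
    }

    // B fades from silence to full gain between t=OFFSET and t=DURATION.
    static double gainB(double t) {
        if (t <= OFFSET) return 0.0;
        return Math.min(1.0, (t - OFFSET) / (DURATION - OFFSET));
    }

    public static void main(String[] args) {
        // One minute in: A is already quiet, B is still building.
        System.out.println(gainA(60.0) + " " + gainB(60.0));
    }
}
```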

Next I tried placing a detail sound over the atmosphere layer. The detail sound itself was very noisy but placed over the noisy atmosphere, the noise was not so apparent. Hurray – it appeared that the field recordings I had created in the limited time available, further reduced by a run of bad weather and mistakes, would actually be usable.

After some trial and error I came up with the approach of running some basic audio pre-processing on all sound sources to be used in the piece, and then to apply additional post-processing on the master output of Resolume Avenue (the master composition platform) as required, ie further equalization or volume adjustment. The pre-composition technique was to cut all audio from 17kHz and above using a high shelf in an attempt to reduce the high gain noise without compromising the higher frequency environment sounds such as water trickling through mud. Similarly, a low shelf was employed from 100Hz and below to remove much of the low end noise generated by wind and general sonic reverberation. (see below diagram for pre-processing audio profile)

It seems that the 80Hz low cut I employed on the recorder at the point of capture was not particularly effective.

I had already realised that it would be necessary to run low-detail atmosphere track(s) for the duration of the piece. Low detail because the programmatically placed individual sounds would need to provide the ‘interest sounds’. In actual fact some of the atmosphere recordings had far too much foreground detail and I had to go through them cutting out these fragments to create a more ambient atmosphere track suitable as ‘base layer’ to place detailed sounds on top of.

Diary of a Field Recordist

Thursday 19/04/12
Picked up directional mic, wind shield and recorder from NUCA
Friday 20/04/12
Went to site mid afternoon, a couple of hours after high tide so plenty of water still about, mainly gulls and possibly Avocet in the near field.
Weather state – high wind and imminent heavy showers.
Ran into a few immediate problems
People – few kids around making a fair bit of noise
Aircraft – the dreaded distant jet engine noise prevalent on most recordings
Wind noise
In general, having to crank the gain right up and record from a substantial distance made for pretty noisy recordings. Not able to record any single sound in isolation.
Saturday 21/04/12
Early morning, weather state fair but cold. Tide out but perhaps too far as waders not in near field. More problems –
Car noise – even at 6:30am a fair few cars crossing from the island to the mainland via the causeway with resulting noise
Aircraft noise as with previous day
‘Channel hum’ – a reverberant humming sound possibly caused by wind acting on the masts of boats moored in the channels of Mersea Quarters; I’ve heard this before in the area when all else is quiet and it can be quite intrusive
In general still very noisy recordings with little prospect to separate the various sources.

So far the source types I’m after and their relative proximities are –

Land birds eg song thrush to the landward of the jetty (mid)
Gulls (near)
Waders (mid)
Geese (distant)
Water (near)
Mud/trickling sound (near/mid)

Monday 23/04/12
Realised I have been making the schoolboy error of combining the Left and Right tracks at the point of recording. Although I only have 1 microphone and intend to create mono recordings, the unused XLR socket creates a certain amount of noise which is audible at the high gain setting I have been using. I had initially thought that this noise was caused either by my mobile phone or the proximity of mains cabling. Doh! This renders the recordings made to date unusable, as the tracks cannot be separated once unified at the point of capture.

Tuesday 24/04/12
Went back to the jetty at Mersea Island at low tide, just before sunset, hoping to make some decent mono recordings onto a single channel, ie rectifying the mistake identified the previous day. Conditions were very still after a day of rain. With the water very low and only forming a small slow-moving stream in the centre of a large mud channel, a natural amphitheatre was formed, with gull cries in particular echoing noticeably.

I made a number of 4 minute recordings at either end of the jetty in an attempt to capture a polarity of ambience. I also brought along a mic stand and set this up rather than holding the mic by hand. The XLR lead was long enough to trail back a few metres, which reduced the likelihood of any sounds from clothing etc being captured. Not holding the mic was definitely a good move as it allowed me to focus more on the audio being captured via headphones.

Again, the session was dogged by aircraft noise, which was a real shame as conditions were sonically perfect, with the still evening air really carrying sounds far – sadly including unwanted jet engine noise. I now believe that the River Blackwater, at the mouth of which Mersea Island stands, is almost certainly a navigational corridor for aircraft departing and approaching Stansted Airport from the East.

Recording quality was still quite noisy due to the high gain necessary to capture atmospheric, middle and distant sound sources. The noise was more of a broad hiss than the digital artifacts the recorder was previously picking up.

Despite the pervasive aircraft noise and the recorder gain noise, I got some reasonable atmospheric sounds, 2 tracks in particular being of merit, each captured at either end of the jetty.

Simple Positional Audio Theory

Simple game audio spatialisation techniques use Euclidean geometry to modulate audio pan and volume based on the distance and angle between the listener and sound source. For most applications 2D spatialisation suffices as it can be used in conjunction with standard stereo diffusion systems eg headphones or speakers.

Using Euclidean geometry, the distance between 2 objects with known x and y co-ordinates is calculated using the formula

distance = sqrt((x_2 - x_1)^2 + (y_2 - y_1)^2)

which is basically the same as Pythagoras’ Theorem – imagine a right angled triangle with the hypotenuse extending between 2 points representing the listener and sound source. If we know the lengths of the other 2 sides of the triangle we can solve for the hypotenuse.

The resulting distance between the 2 points is used to modulate volume, usually by scaling with an attenuator factor so that gain falls as distance increases, eg

attenuator = 0.002;
gain = 1 - (distance * attenuator)

The difference in position on the x plane between listener and sound source is used to modulate pan, calculated by the formula

x_difference = x_2-x_1

The result is divided by half the width of the nominal environment (ie its radius) to create a meaningful value between -1 and 1, eg

pan = x_difference/(environment_width/2)

Alternatively, the angle between the listener and sound source can be calculated using the mathematical function atan2 which is possibly preferable as it removes the need for a nominal environment size.

By using atan2 it is possible to convert the angle of the sound source into a value in the range -PI to PI, which can then be divided by PI to give a pan level of -1 to 1, eg

pan = atan2((y_2-y_1),(x_2-x_1))/PI

This technique supposes no difference between an object in front or behind of the listener.
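
A minimal sketch combining the distance and atan2 pan calculations above, using a linear attenuation law in which gain falls with distance (the attenuator value is a placeholder):

```java
// The distance and atan2 pan formulas combined into one routine.
// attenuator = 0.002 is an illustrative placeholder value.
public class Spatialise {
    static double distance(double x1, double y1, double x2, double y2) {
        return Math.sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1));
    }

    // Gain falls linearly with distance, floored at silence.
    static double gain(double dist) {
        double attenuator = 0.002;
        return Math.max(0.0, 1.0 - dist * attenuator);
    }

    // Pan in -1..1 via atan2; sounds in front and behind are
    // indistinguishable, as the technique supposes.
    static double pan(double x1, double y1, double x2, double y2) {
        return Math.atan2(y2 - y1, x2 - x1) / Math.PI;
    }

    public static void main(String[] args) {
        // Listener at the origin, source 100 units due right.
        System.out.println(distance(0, 0, 100, 0)); // 100.0
        System.out.println(gain(100.0));            // ~0.8
        System.out.println(pan(0, 0, 100, 0));      // 0.0
    }
}
```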

Books, books, books

I thought it time to talk about some of the books I have been reading recently, firstly ‘The Fundamentals of Sonic Art & Sound Design’ by Tony Gibbs (published 2007), leader of BA Sonic Arts at Middlesex University.

This is a really concise and easy-to-read primer that’s actually quite inspirational in its gentle collation of sonic art history and curation of contemporary practitioners and students. I have been exposed to much of the contents through previous studies and even to some of the individuals mentioned through direct contact.
One inspirational point for me is thinking about the importance of performance in sonic art and how this relates to many of the projects I have been involved in to date. I can also relate it directly to the project I am currently working on, in terms of ‘playing’ a field of sound through a specifically devised system. Gibbs writes about the role of the ‘diffusor’ as an artist in his/her own right who usually, but not always, is the composer too.

Another book that has been keeping me turning the pages of late is ‘ocean of sound, aether talk, ambient sounds and imaginary worlds’ by David Toop (originally published 1995).

Among other fascinating fly-on-the-wall observations of modern music culture, Toop writes about the birth of an electronic ambient music scene in UK and Dutch clubs in the early 90’s and the symbiosis that existed between this form and the high-energy electronic dance music that was often played under the same roof at the same time. He mentions a number of notable ambient-only events that were held at the Brixton Cooltan Arts Centre in 1993 under the name of ‘Telepathic Fish’. I was lucky enough to live locally at the time and when I dropped by to one of these nights, was mightily pleased to see and hear the huge bass bins normally reserved for big dance parties rattling away with the hum of electronic ambience. This experience actually had a profound effect upon me as it made me think about ambient music in a much more visceral and physical way. Listening to long, slowly modulating bass sounds played through large speakers with particular harmonic overtones causing rattle and vibration is a very inspirational experience if you like that sort of thing.

Dynamic Sound Processing – early beginnings

Following on from my last post, I have been beavering away at some early stage proof of concept (POC) code to ensure that the positional audio system has a hope of actually coming into being in the limited timescale available. The main criteria for the system are as follows:

  • Must be able to trigger and manipulate audio samples dynamically
  • Must be able to run custom positional logic
  • Must be able to trigger video and/or still image
  • Must be able to render to an audio/video file format for the purposes of module delivery

The solution I have come up with is to use Processing (normally used to program graphics) running the oscP5 library to generate Open Sound Control (OSC) commands that address audio/video/still media and effects plug-ins assembled in Resolume Avenue. The latter is a pretty robust piece of commercial VJ software that excels as a real-time a/v composition platform, with the very handy ability to render out video files without too much trouble.

Here is a sample of the POC code which just tests the water in terms of triggering and manipulating an audio source programmatically.

Even if the code does not make sense, you might glean from the comments that the routine ‘connects’ a sound clip (ie triggers it), then sets the volume, stereo placement and reverb. The values set are arbitrary at this stage, just to show that the technique can be used; the real work will come in developing the positional logic, relating this to the audio parameters and tweaking to hopefully make it all sound ok in relation to the accompanying video image 😉
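
For illustration, a minimal sketch of this kind of routine, written against the oscP5 library, might look like the following – note the Resolume OSC addresses shown are placeholders rather than the exact scheme used in the POC:

```processing
// Minimal sketch of the kind of routine described above, using oscP5.
// The OSC addresses are illustrative placeholders.
import oscP5.*;
import netP5.*;

OscP5 osc;
NetAddress resolume;

void setup() {
  osc = new OscP5(this, 12000);                 // local listen port
  resolume = new NetAddress("127.0.0.1", 7000); // Resolume's OSC input port

  send("/layer1/clip1/connect", 1.0);  // 'connect' (trigger) the clip
  send("/layer1/audio/volume", 0.6);   // set volume
  send("/layer1/audio/pan", -0.3);     // stereo placement
  send("/layer1/audio/reverb", 0.4);   // reverb dry/wet
}

void send(String address, float value) {
  OscMessage msg = new OscMessage(address);
  msg.add(value);
  osc.send(msg, resolume);
}
```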

Here is an example of an electronic chirp created for the test, played ‘as is’ without treatment.

Here is a clearer picture of the (mono) waveform (generated by a slightly modulated sine wave).

Here is the dynamically treated version.

And here is a picture of part of the waveform, which now shows the sound in stereo with reduced volume, a bias to one side of the stereo axis and the beginning of a long reverb tail.

So what? Well, it’s all controlled by code, so as long as the host machine processor or any of the constituent platforms don’t max out, the road is clear to create the aforementioned sound diffusion system.

More Development….

Having spoken to my tutor, I realise that the project as I had planned it did not go far enough – amounting to little more than Foley in the outdoors.

I have been scratching around for means to develop the original idea into something more challenging and I realise that I need to call in the programming expertise that I have accrued over the years to add some weight. My idea now is to try to create a ‘sound diffusion system’ that I can use to place the individual field sounds I intend to record in an environment, drawing on (simplified versions of) game programming techniques such as 3d positional audio to locate sounds within the stereo field depending on the position and attitude of the listener. In the case of the jetty film, the position and attitude is known and follows the POV progression along a fixed axis between 2 points – the beginning and end of the jetty – with a persistent attitude (ie no change in angle of view). Simulation of stereo position could be achieved by modulation of left and right volume, distance by increase of reverberation and possibly dulling of higher frequencies.
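
The left/right volume modulation mentioned above is often done with an equal-power pan law, so that overall loudness stays roughly constant as a sound moves across the stereo field – a minimal sketch:

```java
// Sketch of left/right volume modulation using a standard equal-power
// pan law: channel gains follow cos/sin so combined power is constant.
public class PanLaw {
    // pan ranges from -1 (full left) to 1 (full right)
    static double[] leftRightGains(double pan) {
        double angle = (pan + 1.0) * Math.PI / 4.0; // maps pan to 0..PI/2
        return new double[] { Math.cos(angle), Math.sin(angle) };
    }

    public static void main(String[] args) {
        double[] centre = leftRightGains(0.0);
        // Centre: both channels sit at ~0.707 (-3 dB), not 0.5.
        System.out.println(centre[0] + " " + centre[1]);
    }
}
```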

The system I intend to create would locate a number of audio ‘events’ around the jetty axis and modulate the audio as described above to simulate movement in relation to the audio source. There may be scope for audio events that change position (as they would in nature – many of the sounds after all will be bird cries), but this may be too complicated for an initial proof of concept, which is the level I would like to aim at – I already have some thoughts about the development and application of such a system for use in possible future audio-visual installation projects.

This image of binaural synthesis gives a good visualisation of the technique.

This video of an existing system controlled by an Android device also looks promising.

The authoring group, Spatial Audio Research at Technische Universitat Berlin, certainly seems relevant to this line of research!