Procedural Sound Generation in Unity

Sound design is a vital part of a game. Other than using audio files such as .mp3 and .wav files as audio sources, we can also generate audio at runtime and play it. Generating audio in runtime adds more dynamic sound to the game by playing different sounds based on the state of the game.

To keep this tutorial simple, we will generate white noise and pure sine tone at runtime using code.

Audio Fundamentals

To generate audio, first, we need to know how sound is produced physically. Sound is the result of the wave produced by the vibration of air. A speaker consists of a diaphragm that vibrates to produce sound.

Different types of sounds are produced by a vibrating diaphragm with varying frequencies with different strengths. So, if the vibration of the audio source can be controlled, the generation of sound can be controlled.

Sample Rates

In digital systems, everything is discrete. Although the vibration of a speaker is perceived as continuous movement, it is actually divided into much smaller discrete subdivisions.

These smaller subdivisions are known as a sample, and the total divisions in a second are known as the sample rate. Humans cannot hear sound frequencies greater than 20,000 Hz, so to sample audio, we need a minimum of 40,000 samples a second to play all possible ranges of sound.

Typical audio systems have a sample rate of 44.1 kHz or 48 kHz. The sample rate determines the data that should be passed per second. The strength of vibration is passed for each sample.

Channels

Also, we need to consider channels while generating audio. When a sound is played, more than one sample of data may be played simultaneously for each audio source.

For example, a stereo system has a dual-channel for left and right speakers. This means there are separate audio data for the left and right speakers. For a 5.1 surround sound system, there are 6 channels. So we must be careful about the number of channels we are dealing with before generating audio.

Controlling Audio in Unity

To control audio in Unity, there is an event function called OnAudioFilterRead. This is available for MonoBehaviour classes. It consists of a float array, and an integer passed to it as a parameter. The parameter float array consists of data for each sample for each channel.

Let's call this array data. The int parameter consists of the number of channels. Let's call this channels.

The data order in data is as follows,

Let,
C is Channel and is indexed as C₀,C₁,C₂,... for different channels,
S is Sample and is indexed as S₀,S₁,S₂,... for sample at different sample index. Then,
data is in the order of [S₀C₀,S₀C₁,S₀C₂,...S₀C_n, S₁C₀,S₁C₁,....S₁C_n, S_nC₀,....,S_nC_n, ...]

Generally, the size of data is 1024 audio samples. It means that for 2 channel audio data, it will contain 2048 floats inside it. To manipulate audio, we need to change the values of those samples.

One thing to note is that this method runs on a separate thread from the main thread, so calling Unity functions from inside of the function is not allowed.

Generating White Noise

White noise is static noise screen noise. This is similar to the static noise we hear on the radio or television. To generate white noise, we just need to set a random value to each sample in data array.

using Random = System.Random;

public class AudioGenerator: MonoBehaviour {
  private Random rand = new Random(); // using random from system namespace
  // because unity random is not allowed due to this being in separate thread

  // loop through sample data of each channel and fill random value
  private void OnAudioFilterRead(float[] data, int channels) {
    // looping through each sample group
    for (int i = 0; i < data.Length; i += channels) {
      // looping through sample of each channel in sample group
      for (int j = 0; j < channels; j++) {
        // keeping float value in range of (-1,1)
        data[i + j] = (float)(rand.NextDouble() * 2 - 1);
      }
    }
  }
}

Generating White Noise

Attaching this script to any GameObject in a scene produces white noise. Just be sure that scene contains an audioListener.

Generating Sine Tone

Sine tone is the purest form of sound consisting of only one frequency. It is generated by vibrating in simple harmonic motion with a certain frequency. Let's create a float variable frequency to control the frequency of sound, time to keep track of played samples, and sampleRate to store the sample rate of the playback system. We will use Mathf.Sin to generate a sin wave. Let's create a method for it.

// generating sin wave
// index is the index of sample for which value of wave is to be determined
float GetSinWave(int index, float frequency, int sampleRate) {
  return Mathf.Sin(2 * Mathf.PI * (index / (float) sampleRate) * frequency);
}

Generating sin Wave

The expression 2*Mathf.PI*(index /(float) sampleRate) * frequency can be broken down as

(index/(float) sampleRate) will give the total seconds passed for given index. index is the total number of samples. It means the value of this expression will increase by 1 every 1 second.

Now, multiplying it with frequency (index /(float) sampleRate) * frequency scales up time so that value of this expression will be increased by frequency every second.

Finally, multiplying by 2*Mathf.PI will convert the expression to radian so that the correct value can be obtained when passed to the Sin method.

This will return a value in the range of (-1,1).

Now, use OnAudioFilterRead method to fill data from the sine wave for each sample. We are not provided with an index of sample in time-space, so we need to manually track it using the time variable. I am setting sampleRate as 48000 in my project. It can differ from your project.

Also, since time is incremented around 40k times per second, its value can get too large and can result in float overflow. So, to control it, we can reset it every 5 seconds.

private int sampleRate = 48000; // this can be different. setting wrong value will play wrong pitch
private int time = 0;
public float frequency;

private void OnAudioFilterRead(float[] data, int channels) {
  // looping through each sample group
  for (int i = 0; i < data.Length; i += channels) {
    // looping through sample of each channel in sample group
    for (int j = 0; j < channels; j++) {
      data[i + j] = GetSinWave(time, frequency, sampleRate);
    }

    time++; // increase sample count in each call

    // resetting wave every 5 sec to avoid overflow in float
    if (time >= sampleRate * 5) {
      time = 0;
    }
  }
}

Generating Sine Tone

Attaching it to a GameObject will play the sine tone of a given frequency. Since audio is being generated in runtime, we can change the frequency from the editor and hear the change in pitch.

Conclusion

Here we have it, a white noise and sine tone. This concept can be further expanded to make a more dynamic audio system in your game.

Some examples are playing engine sound based on the vehicle's speed, adding dynamic ear candy in the progression game according to the player's stage, and adding subtle variation in sound to avoid it being monotonous and boring. So, explore more and create interesting sounds.

Thanks for reading, more blogs comming soon.

Procedural Sound Generation in Unity

Anuj Shrestha

Audio Fundamentals

Sample Rates

Channels

Controlling Audio in Unity

Generating White Noise

Generating Sine Tone

Conclusion

Device Farming in the QA Process

Art Fundamentals: Shape Language in Character Design

Listeners in JMeter