How to Design Sound for XR Applications
In augmented reality (AR) and virtual reality (VR), audio is 51% of the experience. When you get the sound right, your AR/VR applications will deliver a more realistic, immersive experience.
Immersive audio is 360-degree audio that mirrors real-world soundscapes within XR environments. Also known as spatial audio, this form of sound keeps users in an XR experience for 40% longer.
How Does Spatial Audio Work?
In the real world, sounds emanate from all directions and don't follow a single, linear path to your ears. For example, if you see a dog barking to your right in an XR simulation, spatial audio will match the barking sound's location with the dog's visual location as you move around the virtual space.
For users, the ability to pinpoint where sounds are coming from helps people feel like they have control over their environment. From a design standpoint, you can make it possible for users to localize directional audio by adding object-based auditory cues, which enhance the immersive experience.
Psychoacoustics describes how humans perceive sound, and audio acts as the doorway into interactivity. Regardless of how good the motion controls or graphics in AR/VR technology get, true immersion cannot be achieved without paying attention to psychoacoustics.
In this post, we’re going to learn:
- The six key principles of XR sound design
- Why HRTF is so important when you work with spatial audio
- Unity's native sound capabilities
- How user interfaces in audio help XR users
- Best practices for implementing spatial sound design in your XR environment
Let’s dive in.
1. The 6 XR Sound Design Principles
Here are six fundamental principles you should think about when designing sounds for XR experiences:
Envelop
You must relay information about the scene to the user. When they put a headset on, you place them in the world that you created. To paint the picture, ask yourself the following:
→ What year is it?
→ What’s the location? Are they inside a building or outside in nature?
→ What time is it? Is it night or day?
You must immediately surround them with whatever you can to make them feel a part of the story.
Ground
Showing users your world is a start, but you must quickly ground people to strengthen the relationship they have with the content. If things don't feel realistic, it can detract from the experience. During testing, ask:
→ Am I standing next to the hologram?
→ Have we achieved actual presence, or is there a screen-door effect?
→ What can we do to make the sounds more realistic, as if we’re really there?
This is a crucial aspect of spatial audio that amplifies the experience for users.
Emote
When you have the first two principles right, your XR experiences can impact users on an emotional level. Here are some things to think about:
→ Is the sound important to mimic or change what we are seeing? Can we change it? Should we?
→ How can we use sounds to convey rewards for proper use?
→ How can you use sounds to instill joy, fear, or sadness?
→ How can you use sounds to push or slow the pace of the experience?
Think about what you're doing to the player's emotions. With experimentation, you can use spatial audio in XR to exaggerate or reduce sounds, heightening or altering player feelings and emotions.
Navigate
Sometimes, even in simple games, spatial audio acts as a guide to redirect users whenever they need assistance. You can relay information to people through sounds:
→ Getting their attention because you need them to look "over here."
→ Telling them that their battery is low.
→ Manipulating user focus, or lack thereof.
When you want to tell the player something through sound, ask yourself if you are pointing them in the right direction. Think about when and why you are using sound to ensure you don’t misguide users.
Include
Your XR feature should offer a universal experience that everybody can enjoy. When designing it, consider the following:
→ Can everyone share this experience?
→ How can we help people who can't share it, or those who interpret it differently than we do?
→ What adjustments can we make so we’re inclusive of anyone with limitations?
Ideally, your experience should have something for people from all walks of life.
Bridge
While mixed reality is gradually becoming more mainstream, it's still a leap from real life. First-time users can struggle to fully engage with XR experiences.
It’s your job to help them get there.
→ How does the user feel right now?
→ How can you bridge the information gap?
→ Can we make this feel less weird?
During design, remember that sound plays a crucial role in normalizing the experience to eliminate mental hurdles.
2. The Crucial Role of HRTF in Spatial Audio
A head-related transfer function (HRTF) is a response that distinguishes how each ear receives a sound from any point in space. HRTF is possible because our brain, inner ear, and external ears all work together to determine the location of a sound’s source.
Our brains are constantly processing this information. We can break these localization cues down into two categories:
- Interaural time difference is when there is a time delay between ears. The brain determines which ear received the audio signal first and identifies the source location.
- Interaural level difference is when there is an audio volume difference between your ears. This difference is caused by an acoustic shadow cast by our heads, which blocks high frequencies.
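The interaural time difference can be approximated mathematically. Here is a minimal Python sketch using Woodworth's classic spherical-head formula; the head radius and speed of sound are typical assumed values, not measurements from any specific listener.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, at room temperature
HEAD_RADIUS = 0.0875     # m, an assumed average head radius

def interaural_time_difference(azimuth_deg):
    """Approximate ITD in seconds via Woodworth's spherical-head formula.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))
```

A source straight ahead produces zero delay between the ears; a source hard to one side arrives roughly 0.6-0.7 ms earlier at the nearer ear, which is the cue the brain uses to locate it.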
How to Calculate HRTF for XR Sound Design Projects
So, now that you know the science, how do you apply it in XR sound design?
Here's how you can calculate HRTF:
- Set up a sound source—a speaker—in a room.
- Play sound through the source, and make a recording.
- Move the sound source 30 degrees.
- Play the sound, and record again.
- Repeat this process until you have recordings from twelve different locations around the room (one every 30 degrees).
Once you’re done, you’ll have a good representation of what the sound will be like when emitted from any given location around the user.
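At runtime, you can then snap any source direction to the nearest measured position. A minimal sketch of that lookup, assuming the twelve-position, 30-degree grid described above (production systems usually interpolate between neighboring measurements rather than picking just one):

```python
def nearest_hrtf_index(azimuth_deg, step_deg=30):
    # Snap an arbitrary source azimuth to the closest of the twelve
    # measured positions (0, 30, 60, ..., 330 degrees), indexed 0-11.
    positions = 360 // step_deg
    return round((azimuth_deg % 360) / step_deg) % positions
```

For example, a sound at 44 degrees would use the measurement taken at 30 degrees, while a sound at 46 degrees would use the one at 60 degrees.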
In a traditional example, let's say we want to demonstrate a bird sound in our XR environment. To do this, we take the player orientation into consideration and pan accordingly. Then, we need to determine the distance that the bird is from the player and calculate a volume based on that distance.
However, with spatial audio, we have an advantage. We can say it's 30 degrees to the right, 50 degrees above, and 10 meters away. Then, we go into our HRTF files and play that sound.
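The traditional pan-and-volume approach from the bird example can be sketched in a few lines. This is an engine-agnostic Python illustration (the function names and reference distance are illustrative, not any engine's API): gain falls off with distance following the inverse distance law, and azimuth is mapped to left/right gains with a constant-power pan so loudness stays steady as the source moves.

```python
import math

def distance_gain(distance_m, ref_distance_m=1.0):
    # Inverse distance law: volume roughly halves per doubling of distance.
    return ref_distance_m / max(distance_m, ref_distance_m)

def constant_power_pan(azimuth_deg):
    # Map azimuth (-90 = hard left, +90 = hard right) to left/right gains
    # whose powers always sum to 1, keeping loudness constant while panning.
    theta = math.radians(azimuth_deg + 90) / 2
    return math.cos(theta), math.sin(theta)

# The bird from the example: 30 degrees to the right, 10 meters away.
gain = distance_gain(10.0)
left, right = constant_power_pan(30)
```

Note what this simple approach cannot do: it has no notion of elevation ("50 degrees above") and no per-ear filtering, which is exactly what the HRTF-based playback adds.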
Benefits of HRTF in Spatial Audio
Thought you'd never ask.
- It helps user focus - HRTF helps our brain parse information to filter out what is unimportant or determine that something is just background noise. As such, these sounds will not pull focus from the main features of the mixed reality environment.
- It makes the experience more accessible - With better, more realistic audio, you can offer a high-quality user experience (UX) that is more engaging for anybody to use.
- It’s good for navigation - HRTF provides clues for the user to clearly tell them where things are happening.
- It helps with grounding - You can guide the user and bring them into the story. In effect, HRTF is a powerful tool in spatial audio because it is an immersion multiplier.
3. How to Use Spatial Audio in Unity
Unity3D is a free 3D development engine for building games, simulations, and experiences. It is the easiest way to make apps for XR. However, despite its prowess for XR design, Unity doesn’t have a lot of spatial audio features.
You can use third-party tools to add to the experience, like:
- Microsoft Project Acoustics
- Valve Steam Audio
- Resonance Audio
In each of these tools, key features include:
- Varying acoustic modeling tech
- Custom HRTFs
- Middleware support
Important Spatial Audio Considerations during XR Sound Design
Spatial audio is affordable, in performance terms, on every mixed reality platform. Regardless of which device you're creating for, it's worth having spatial audio to build the best possible experience.
Here are a few things to keep in mind:
- CPU budget - Be sure you choose the right game engine and output platform to maximize your resources.
- Features - Think about the specific features you need before you commit to a specific device or engine.
- Spatialize Checkbox - Make sure this is checked, so your sounds spatialize.
- Attenuation - This is how sounds roll off and change over distance.
- Spread - How much of the signal is spread into two ears. Smart use of spread is a good way to adjust for perceptual size.
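Attenuation and spread are easy to reason about once you see the math. Here is a hedged Python sketch of both ideas; the function names, distance limits, and curve shape are illustrative assumptions, not the exact curves any particular engine uses.

```python
def attenuate(distance, min_distance=1.0, max_distance=50.0):
    # Logarithmic-style rolloff: full volume inside min_distance,
    # then falling with distance, clamped beyond max_distance.
    d = min(max(distance, min_distance), max_distance)
    return min_distance / d

def apply_spread(left, right, spread):
    # spread = 0.0: fully point-like (hard spatialization)
    # spread = 1.0: equal energy in both ears (perceptually "large" source)
    mid = (left + right) / 2.0
    return (left + spread * (mid - left),
            right + spread * (mid - right))
```

For example, a fully spread source plays equally in both ears no matter where it sits, which is why increasing spread is a good way to make a sound feel physically bigger.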
Once everything is set up, you can use an audio prefab to quickly pull in different spatializer or attenuation settings when you need them.
4. How User Interfaces in Audio Help XR Users
In a world of constantly changing hardware and foreign input systems, UI audio plays a crucial role in helping users navigate and enjoy their experience.
Sound design improves the user interface—and by extension, the user experience—in several ways:
- Relays information to the user
  - Input (confirm, cancel, back, next, previous, etc.)
  - Gaze, touch, voice, buttons, sliders, etc.
- Provides sonic unification
- Outlines experiences for the user
- Provides accessibility to otherwise unreachable interactions
- Brings familiarity to elements
- Provides reassurance and confidence
For example, when navigating the UI of a Nintendo Switch, you'll hear simple, repetitive sounds. These innocuous beeps and chirps have a musical feel, and the home sound evokes a familiar sense of arrival. Together, these sounds are designed with the intent to create a bigger, more unified UX.
But what about mixed reality? How does audio help the user interface?
Here’s where it gets tricky.
AR/VR has unconventional input models
Everyone is familiar with the UI of a smartphone. Even if it isn't your phone, you'd have some sense of how to operate it. With an AR/VR device, logging in isn't as straightforward. If I told you to put on a headset and look at a particular object until it stopped spinning, you might think it's a little weird. For some people, this concept doesn't make any sense.
Combating clutter
In XR environments, there can be multiple animations firing, voiceovers, and graphics, all competing for the user's attention. Designers must cut through the noise to create the most cohesive experience and relay only what needs to be heard at that given moment.
More axes of control
UI info and control panels may be off the screen. Designers need to prioritize what info they are trying to communicate and present panels when they are needed.
5. How to Apply the XR Sound Design Principles to Asset Creation and Integration
Okay, so let's take a brief run through the steps involved in creating sounds for an XR experience.
First, we must define the goal behind a new sound. Why are you creating this particular sound?
→ The background or goal defines your why
→ The design is your what
→ The integration is your how
For example, we’ll imagine that we’re visiting the Seattle Aquarium. It's Shark Month, and we need to design an app to teach people about sharks.
Here are a few considerations for our Sharkland XR app:
→ This is a fun app suitable for all ages.
→ It should have a stylized design, not realistic sharks.
→ We will have a HoloLens 2 at the real shark tank. People can put it on and look around to see our cool XR sharks and engage with information about sharks.
→ We’ll assume these people have never used MR before
→ It's going to be a loud environment, like a busy Saturday at the Aquarium.
So, how do we do this?
Use the XR sound design principles to guide your project
That’s right! Let’s walk through each one.
- Envelop - It’s a loud environment, so we won’t be able to fully envelop them because of the background noise.
- Ground - As with the first principle, we’ll struggle here. So, we don’t need to worry about this aspect.
- Emote - This app should be fun. We can use fun sounds and aim to get smiles from first-time users.
- Navigate - Make it obvious when the user is hovering over or activating a button panel.
- Include - Make it very easy to use and differentiate between buttons or menus.
- Bridge - Assume it’s a weird, wild experience for first-time users. Don’t overwhelm people.
You can use Reaper to edit recorded sounds. If we consider our Shark app example, you can record a short audio clip of somebody blowing bubbles with a straw in a glass of water.
With Reaper, you can then manipulate this audio by dropping pitches, adding a delay, or reversing a snippet of the sound. In XR sound design, playing a sound one way, then reversing it and manipulating the pitch is a classic mechanic for the open and close menu functions in an application.
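The open/close pairing described above is simple enough to sketch in code. This is an illustrative Python example of the idea, assuming the clip is a plain list of samples; real editing would happen in a tool like Reaper, and the crude resampler here stands in for a proper pitch-shift algorithm.

```python
def pitch_shift(samples, factor):
    # Crude resample-based pitch shift: factor > 1 raises pitch (and
    # shortens the clip); factor < 1 lowers pitch (and lengthens it).
    length = int(len(samples) / factor)
    return [samples[min(int(i * factor), len(samples) - 1)]
            for i in range(length)]

def make_close_sound(open_samples):
    # The classic open/close pairing: play the "open" sound reversed
    # and pitched down to produce the "close" sound.
    return pitch_shift(list(reversed(open_samples)), 0.5)
```

Because the close sound is literally the open sound run backwards and slowed down, the two feel like natural opposites to the user without requiring a second recording.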
Certain sounds may fit better when you perform specific menu actions with the controller, such as:
- On touch
- On release
- On scroll
As you experiment with the audio clip, think about where the sounds play from in relation to your app’s buttons or menus.
Quick Key Concepts in XR Sound Design
As you gain experience in the field, you’ll soon realize which aspects are integral to high-quality sound production. To get you off on the right foot, here are some important tenets of sound design in XR environments:
Your environment's acoustics is crucial for immersing players in your virtual world and dressing up the spatial audio. When you have the acoustic settings correct, you can ground the player in your location and genuinely make them feel like they are in a bathroom, concert hall, or ancient pyramid.
You need to design sound events that will play alongside the animations in your XR experience. For example, you may have animated events like footsteps, a car crash, or somebody firing a gun. To do this, you must create a script that will call specific functions on keyframes.
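In Unity this is typically wired up through animation events on keyframes; the engine-agnostic logic can be sketched in a few lines of Python. The keyframe times and function names below are hypothetical, purely for illustration.

```python
# Hypothetical keyframe times (seconds) where footsteps land in a walk cycle.
FOOTSTEP_KEYFRAMES = [0.25, 0.75, 1.25]

def events_due(prev_time, now, keyframes=FOOTSTEP_KEYFRAMES):
    # Return every keyframe crossed since the previous animation update,
    # so each footstep sound fires exactly once even at uneven frame rates.
    return [t for t in keyframes if prev_time < t <= now]
```

Each frame, the animation system passes the previous and current playback times, and the sound system plays one footstep per keyframe crossed in that interval.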
You can use voices to relay info to users. For instance, you might include a narrator, tutorial, or even the voice of God. Alternatively, you could attach voices directly to game objects to bring them to life.
Consider the impression that voice positioning will have on the user:
- Is the voice relaying crucial info? If so, how can we ensure the user hears everything?
- Can we see anything visually when the voice is speaking? Think carefully about attaching to an empty 3D object, and how that may negatively impact user engagement.
- Randomly spatializing a sound could lead users away from the vital info.
- It’s ok to use stereo for voices if it fits the experience.
Using music in a mixed reality environment can easily go wrong. Be careful not to distract the user from vital information—such as an important message being relayed by a speaking character.
If you use music, it should never be arbitrary or only for filler. Instead, think about the following:
→ What are we trying to say?
→ Are we driving the story?
→ For novice users, does our music ease their confusion or discomfort about using XR?
Here is a shortlist of some essential tools for XR sound design, as recommended by Microsoft Technical Sound Designer, Travis Fodor. In addition to some high-quality headphones, Travis recommends:
- Reaper for sound editing.
- Sony PCM-M10 as a portable recorder; alternatives include the LOM Mikro Usi or Rode NTG3.
- iZotope RX to clean up recordings.
- FabFilter suite and a reverb plugin like Altiverb or Blackhole.
- DearVR as an all-inclusive spatial audio suite.

To experiment more with XR sound design, he suggests you check out Google's Resonance Audio SDK.
When you're designing sounds for a mixed reality experience, the user interface can make or break your application's success. A good UI makes the app a more accessible, enjoyable experience that users will actively engage with and get excited about. For first-time users, that is one of the most important things, as you want to break down the walls to make your XR experience feel normal for them and put a smile on their faces.
Great sound design plays a vital role in immersing users, relaying information, and directing the player to where they need to be when they need to be there.
If you’re ready to go a little deeper, you can learn more in our workshop with Travis Fodor to see how he puts the principles of XR sound design into practice. Check out the on-demand XR Sound Design workshop now.
Download the XR Development with Unity Syllabus