Spatial Mapping Explained: Your Guide to How Augmented Reality Works
Last Updated: December 19 2018
Before you can use an augmented reality device, the device must build an understanding of your physical space. But how does this understanding of your space translate into being able to interact with virtual objects? What is Spatial Mapping?
Because of the very technical procedures involved in spatial mapping, it’s difficult to get a straight answer when searching online. But we have you covered. This post explains what spatial mapping is, how integral it is to virtual and augmented reality worlds, and how Unity and different AR devices map your space. We focus on AR as these devices rely more heavily on meshing so that holograms can integrate with the real world. Let’s get started!
What is Spatial Mapping?
Spatial mapping describes the process of an AR device literally mapping your space. This is done through computational geometry and computer-aided engineering, which create a mesh that lies over your environment. Every AR device generates this mesh, which looks like a series of triangles joined together like a fishing net.
As you maneuver through the space, or objects and people move around you, the mesh is updated to reflect the boundaries of your environment. The mesh is updated so frequently that any changes are noted and accounted for. This creates what is called spatial understanding: your devices are smart enough to recognize what is the floor, the ceiling, a table, the optimal places for a holographic image, etc.
Even AR mobile apps have processes for spatial understanding. In AR mode, Pokémon Go recognizes flat planes in front of your camera so it knows where to place the Pokémon rather than have it floating in empty space.
Spatial Mapping in Unity
AR requires more complex understandings of space in order to blend real and virtual objects, and we’ll explain AR headsets in depth at the end of this post. Because each device uses spatial mapping, you’ll need to understand how your 3D engine generates and configures its spatial understanding. If your space is mapped through a geometric mesh, how is that mesh generated?
Anatomy of a Mesh
A mesh looks like a wireframe of triangles whose different shapes and sizes reflect the shapes and sizes of physical objects. There is also a colour-coding system that signifies how far the user is from those objects.
When it comes to mesh on virtual objects, Unity requires information to determine how light reacts to the object (is it dull or shiny? Where are its shadows?) and the object’s texturing (is it rough or smooth? What colour is it?).
When it comes to lighting, Unity requires what’s called a “normal” value. When you look at your mesh, you’ll notice that each triangle connects to another at its corner points. These connected points are called vertices. A “normal” is a vector that is perpendicular to the mesh surface, and Unity stores one for each vertex.
The idea of a “normal” is a little confusing, but think of it this way: When you make a pillow fort and drape a sheet over top of it, the edges of the pillows create the fort’s shape even underneath the sheet. At each of those points, imagine an arrow sticking straight out of the sheet. Those arrows are the normals, and their directions help Unity determine what light effect to put in place; will there be a lot of shadow, or will the surface catch the light?
Don’t worry, you don’t have to manually set the normal values. You can call the function Mesh.RecalculateNormals on an object’s mesh, and Unity will calculate the normals for you from the mesh’s triangles and vertices.
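To make the idea of a normal concrete, here is a minimal Python sketch of one common way to derive per-vertex normals from a triangle mesh: compute each triangle’s face normal with a cross product, add it to the triangle’s three vertices, and normalize the sums. This is an illustration only, not Unity’s actual implementation of Mesh.RecalculateNormals, and the helper names (`vertex_normals`, `cross`, and so on) are hypothetical.

```python
import math

def sub(a, b):
    """Component-wise a - b for 3D tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if length == 0:
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)

def vertex_normals(vertices, triangles):
    """Average the face normals of every triangle that touches each vertex."""
    sums = [(0.0, 0.0, 0.0) for _ in vertices]
    for i, j, k in triangles:
        # The face normal is perpendicular to two edges of the triangle.
        # Its direction depends on the order in which the vertices are listed.
        face = cross(sub(vertices[j], vertices[i]), sub(vertices[k], vertices[i]))
        for index in (i, j, k):
            s = sums[index]
            sums[index] = (s[0] + face[0], s[1] + face[1], s[2] + face[2])
    return [normalize(s) for s in sums]

# A flat square patch of floor in the XZ plane, built from two triangles.
vertices = [(0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 0, 0)]
triangles = [(0, 1, 2), (0, 2, 3)]
normals = vertex_normals(vertices, triangles)
```

Because the patch is flat, every vertex ends up with the same normal pointing straight up along the Y axis. On a bumpy mesh the averaged normals would tilt, which is what changes how light falls on the surface.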
Vertices of your mesh also determine the texturing of your object. In this case, we’re using texture to refer to the image on a 2D surface (rather than the object’s smoothness). The image of your object is the sheet on top of your pillow fort. Specific locations of the sheet are pinned to specific locations of your mesh so your image stays in place. These locations are called UV coordinates.
UV coordinates are understood in 2D and are scaled to the 0..1 range on each axis ((0, 0) being the bottom left of the image and (1, 1) being the top right). So, if you think of your image as a 2D sheet, where on your sheet do you want to pin to your mesh? UV is used instead of XY to avoid confusion between the object’s texturing and the object’s placement in the 3D environment.
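As a rough illustration of how a UV coordinate picks a spot on the image, the Python sketch below converts a UV pair into a pixel position. It assumes the image is stored with row 0 at the top, as many image formats do, so the V axis is flipped; the function name `uv_to_pixel` is hypothetical and not part of any engine.

```python
def uv_to_pixel(u, v, width, height):
    """Map a UV coordinate (origin at the bottom left) to a (column, row) pixel.

    Assumes pixel row 0 is the TOP of the image, so V is flipped.
    """
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("UV coordinates must be in the 0..1 range")
    # Clamp so that u == 1.0 or v == 0.0 still lands on the last column or row.
    column = min(int(u * width), width - 1)
    row = min(int((1.0 - v) * height), height - 1)
    return column, row

# Pin three points of a 256 x 128 image.
bottom_left = uv_to_pixel(0.0, 0.0, 256, 128)
top_right = uv_to_pixel(1.0, 1.0, 256, 128)
centre = uv_to_pixel(0.5, 0.5, 256, 128)
```

Each vertex of the mesh carries one UV pair like this, and the renderer stretches the image between the pinned points.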
Through normal values and UV coordinates, Unity understands your virtual object’s texturing and reactions with light. But this is specific to the virtual models you are bringing into your environment; what about how Unity understands your physical space?
Unity's Spatial Mapping Components
The way Unity maps real-world surfaces and understands them is through three components: the Surface Observer, the Spatial Mapping Collider, and the Spatial Mapping Renderer.
The Surface Observer checks with Unity’s mapping system about changes in your environment, and coordinates any changes with the Collider and Renderer. In a way, the Surface Observer acts as Unity’s eyes to your physical space.
When a new surface or object is detected by the Surface Observer, Unity’s mesh is updated to incorporate it through a process called “baking.” When an object is baked, the mesh reconforms around it. In essence, a virtual object is made to take the place of the physical object. Unity can recognize the virtual object internally, while to our eyes it appears Unity is recognizing the physical object.
To simulate the physicality of this object, any freshly baked object is made with a mesh filter and a mesh collider. The mesh filter holds the geometry that a renderer draws, which determines the shape you see, and the mesh collider defines the object’s shape for physics so raycasts are able to collide with it.
Put another way, when an object is freshly baked, Unity alters its mesh to include it, and the object is given a mesh filter (so, for example, a renderer could draw it in blue) and a mesh collider (allowing other objects and raycasts to collide with it).
As an example, let’s presume Unity’s Surface Observer has just recognized your dining room table. Unity alters its mesh to go around the table, and through the collider you are able to place a virtual cup upon the table.
This process is handled by Unity’s Spatial Mapping Collider. This system is responsible for updating the mesh, and tracking where these baked objects are located in your space. It can adjust the mesh to a high resolution to acknowledge the very intricate shape of your table. Or, it can adjust the mesh to a low resolution so only the general rectangular shape of the table is acknowledged.
While the Spatial Mapping Collider assigns a mesh filter to these virtual objects so you are able to assign appearances to them, the actual process of making that appearance visible is through the Spatial Mapping Renderer. The Spatial Mapping Renderer reads the shape of the mesh and can apply lighting or texturing effects to it. The renderer ultimately determines what the mesh looks like and how detailed it is.
The Spatial Mapping Collider allows you to attach “anchors” to virtual objects so that they remain fixed in a physical location. You can anchor a virtual vase to the top of your baked table, and every time you view your virtual table, your vase will remain in place. These anchors can be locked to the world (the table, in this example), or locked to the user’s body (if you wanted the vase to stay consistently to your left, regardless of its placement on the table).
Unity's Spatial Mapping Systems
Unity uses different systems to map the space. The two most common are the stage frame of reference and the world-scale experience. The stage frame of reference involves your objects having rigid relationships to each other. For example, the vase will always be on the table. Think of it like you are seeing an actor on stage, and the props on the stage are always in the same location regardless of where the actor moves.
Conversely, the world-scale experience allows flexible relationships between objects. Let’s say you put the vase directly in the middle of the table. (Keep in mind the vase is a virtual object placed on a physical table.) As you move around the room, Unity gathers more information about the space; maybe the table is wider or longer than Unity initially understood it to be. So, the virtual vase’s location in the centre of the table updates to reflect this new understanding of the table’s dimensions. In other words, as you move through the space, the system updates and the object’s location adjusts to reflect its new position.
A world-scale experience creates a more immersive space as a result of updating the virtual environment to be as accurate a representation of the physical environment as possible.
To create a world-scale experience in Unity, add the WorldAnchor component to your GameObject, and the object will be anchored in its current position. To remove the anchor, simply destroy the WorldAnchor component. You could also create a GameObject with a WorldAnchor and have children tied to it with a local position offset. This way you don’t have to apply multiple WorldAnchor components, and you ease up on your CPU.
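The parent-and-children idea can be sketched in a few lines of Python. This is not Unity code: it only shows the arithmetic behind a local position offset, assuming for simplicity that the anchor rotates about the vertical (Y) axis only. The function name `child_world_position` and the sample values are hypothetical.

```python
import math

def child_world_position(anchor_position, anchor_yaw_degrees, local_offset):
    """World position of a child placed at local_offset relative to an anchor.

    Simplifying assumption: the anchor is only rotated about the Y axis.
    """
    yaw = math.radians(anchor_yaw_degrees)
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)
    x, y, z = local_offset
    # Rotate the offset about the Y axis, then translate by the anchor position.
    rotated_x = x * cos_yaw + z * sin_yaw
    rotated_z = -x * sin_yaw + z * cos_yaw
    ax, ay, az = anchor_position
    return (ax + rotated_x, ay + y, az + rotated_z)

# One anchor on the table, two children positioned relative to it.
offsets = {"vase": (0.5, 0.0, 0.0), "cup": (-0.3, 0.0, 0.2)}

before = {name: child_world_position((2.0, 1.0, 3.0), 0.0, offset)
          for name, offset in offsets.items()}

# The device refines its estimate of the anchor; every child follows along.
after = {name: child_world_position((2.1, 1.0, 3.0), 0.0, offset)
         for name, offset in offsets.items()}
```

Only the single anchor has to be tracked against the real world. The children are recomputed from their offsets, which is why one anchor with several children is cheaper than one anchor per object.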
You can program anchor persistence into your app as well. This means your anchored objects are stored internally and reloaded in the same positions each time the app is used. To do this, use Unity’s WorldAnchorStore class to save your WorldAnchors and load them again the next time the app starts.
Spatial mapping, regardless of the engine or device you are using, has its limits. If the environment is very busy with many people moving throughout it, your device will constantly have to refresh and rebake objects. This slows the device and causes imprecise spatial awareness. Similarly, if a new physical space is nearly identical to a previous physical space, your device’s spatial mapping may become confused.
Spatial Mapping in AR Devices
Each augmented reality device and engine uses spatial mapping and computational geometry to create meshes to understand your physical space, but the terms and processes used are device-specific. So, let’s break down how the HoloLens and Magic Leap One create, understand, and render spatial mapping.
HoloLens
The HoloLens understands the geometric mesh as a Spatial Coordinate System. Using a Spatial Surface Observer, the HoloLens creates “bounding volumes” to define regions of your physical space. These bounding volumes are then given a Spatial Surface so as to understand their shapes, directions, and more. The HoloLens can read and process an area of five metres around the device.
Bounding volumes help guide placement of holograms in your physical space. The way the HoloLens calculates placement is actually quite interesting and involves several processes.
Through visualization, the HoloLens begins to make assumptions about the Spatial Surfaces – a horizontal surface can be assumed as a table; a vertical surface is a wall; or, a bumpy surface is a cluttered desk and is not suitable for hologram placement. The device can work through different mesh processes to simplify the mesh, fill holes, and more to fully understand Spatial Surfaces.
Through raycasting, the HoloLens determines the distance from the device to the point in question. This helps determine where in the 3D space the hologram is being placed. By reading the Spatial Surface, the HoloLens determines whether the 2D plane is suitable for hologram placement.
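As a simplified picture of what a raycast computes, the Python sketch below measures the distance from the device to a flat surface along the direction the user is looking. A real device casts rays against the full triangle mesh; a single infinite plane is used here only to keep the arithmetic readable, and the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def ray_plane_distance(origin, direction, plane_point, plane_normal):
    """Distance along a ray to a plane, or None if the ray never hits it.

    Assumes `direction` has unit length, so the result is in world units.
    """
    denominator = dot(direction, plane_normal)
    if abs(denominator) < 1e-9:
        return None  # The ray runs parallel to the surface.
    to_plane = (plane_point[0] - origin[0],
                plane_point[1] - origin[1],
                plane_point[2] - origin[2])
    distance = dot(to_plane, plane_normal) / denominator
    return distance if distance >= 0 else None  # Ignore hits behind the device.

def hit_point(origin, direction, distance):
    """The 3D point reached after travelling `distance` along the ray."""
    return tuple(o + d * distance for o, d in zip(origin, direction))

# A headset 1.6 m above the floor, looking straight down.
head = (0.0, 1.6, 0.0)
looking_down = (0.0, -1.0, 0.0)
floor_point, floor_normal = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)

distance = ray_plane_distance(head, looking_down, floor_point, floor_normal)
placement = hit_point(head, looking_down, distance)
```

The returned distance tells the device how far away the surface is, and the hit point is where a hologram would be placed.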
Physics collisions help determine if there is room for the hologram (is the space too small for a holographic chair, for example) in the physical space. This prevents the holograms from bleeding into real world objects. This process is aided through occlusion so holograms can appear hidden behind real world objects (the real world table occludes the virtual chair in the corner).
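The “is there room?” question can be approximated with axis-aligned bounding boxes, as in the Python sketch below. Real physics engines test against the actual mesh colliders; boxes are used here only as a minimal stand-in, and the names `Box`, `boxes_overlap`, and `hologram_fits` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """An axis-aligned box described by its minimum and maximum corners (metres)."""
    min_corner: tuple
    max_corner: tuple

def boxes_overlap(a, b):
    """Two boxes overlap only if their extents overlap on all three axes."""
    return all(a.min_corner[axis] < b.max_corner[axis] and
               b.min_corner[axis] < a.max_corner[axis]
               for axis in range(3))

def hologram_fits(hologram, obstacles):
    """A hologram fits if it does not bleed into any real-world obstacle."""
    return not any(boxes_overlap(hologram, obstacle) for obstacle in obstacles)

# A real table, and two candidate spots for a holographic chair.
table = Box((0.0, 0.0, 0.0), (1.5, 0.75, 1.0))
chair_inside_table = Box((1.2, 0.0, 0.2), (1.8, 0.9, 0.8))
chair_beside_table = Box((2.0, 0.0, 0.2), (2.6, 0.9, 0.8))

blocked = hologram_fits(chair_inside_table, [table])
clear = hologram_fits(chair_beside_table, [table])
```

The first spot is rejected because the chair would intersect the table, while the second spot leaves a gap on the X axis and is accepted.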
Holograms can be programmed to be stationary (having a fixed world location) or be attached to the HoloLens location (moving along with the user’s device, but not rotating with it). So, when the volume is attached to the device, the hologram can be programmed to remain to the left of the device as the user moves through the space. Alternatively, when the volume has a fixed world location, the hologram remains in its original location as the user moves around it. You could move around it to see all sides of the hologram, and it will remain fixed in location. Persistence will need to be programmed into the specific app in order for holograms to remain in place consistently between uses.
Magic Leap One
The Magic Leap One’s (ML1) process of reading and understanding the computer-generated geometric mesh is called World Reconstruction. At any time, the ML1 reconstructs a physical space in a 10-metre radius around the device. This area is called the “Bounding Box,” literally a box around the user that sets boundaries for the area of play. There are three aspects to ML1’s understanding of the physical space: meshing, plane extraction, and raycasting.
Meshing describes a process that includes determining the mesh’s normals, confidence levels (how confident the device is about the mesh in the area), hole filling, and more. ML1’s mesh includes a “Block Skirt” which helps the mesh cover cracks between planes in the mesh. It’s through a similar process called “Mesh Planarization” that ML1 can read a bumpy plane (such as a cluttered desk) and smooth it to be a relatively flat plane.
Within the bounding box, ML1 uses plane extraction to understand the space. For example, if a plane is parallel to gravity, it’s understood as a wall; if a plane is perpendicular to gravity, it’s a ceiling or a floor. Through this process, ML1 also understands flat areas as having individual planes. For example, an ‘outer’ plane is understood as the table surface in the space, and the ‘inner’ plane includes objects on the table’s surface. This allows a more complex understanding of your physical space, and of how holograms can interact within.
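One way to picture gravity-based plane classification is to compare each plane’s normal with the direction of gravity. In the Python sketch below, a normal that opposes gravity belongs to an upward-facing surface such as a floor, a normal that follows gravity belongs to a ceiling, and a normal at right angles to gravity belongs to a wall. This is an illustration under assumed conventions (Y is up, a 10 degree tolerance), not Magic Leap’s actual algorithm, and `classify_plane` is a hypothetical name.

```python
import math

def classify_plane(normal, gravity=(0.0, -1.0, 0.0), tolerance_degrees=10.0):
    """Label a plane by the angle between its normal and the gravity direction."""
    normal_length = math.sqrt(sum(c * c for c in normal))
    gravity_length = math.sqrt(sum(c * c for c in gravity))
    if normal_length == 0 or gravity_length == 0:
        raise ValueError("normal and gravity must be non-zero vectors")
    cosine = sum(n * g for n, g in zip(normal, gravity)) / (normal_length * gravity_length)
    # Guard against rounding pushing the cosine slightly outside [-1, 1].
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
    if angle >= 180.0 - tolerance_degrees:
        return "floor"    # Surface faces up, against gravity (floor or table top).
    if angle <= tolerance_degrees:
        return "ceiling"  # Surface faces down, along gravity.
    if abs(angle - 90.0) <= tolerance_degrees:
        return "wall"     # Surface faces sideways.
    return "slanted"

labels = [classify_plane(n) for n in
          [(0, 1, 0), (0, -1, 0), (1, 0, 0), (1, 1, 0)]]
```

The tolerance exists because scanned planes are never perfectly level; a slightly tilted floor should still count as a floor.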
Through raycasting, ML1’s remote determines where there is a plane to collide with in order to place the hologram. The device is able to cast a ray before an updated mesh is available during times when the mesh is being refreshed, and raycasting can occur multiple times at once for accurate placement. Further, through raycasting, ML1 better understands the dimensions and objects in the physical space, allowing planes to be understood before even being observed (the other side of a table in your space, for example).
ML1 automatically sets spatial anchors to the world itself, providing a more immersive feeling in the play area. Further, persistence of these holograms is an automatic process, so objects are anchored and updated at the same time as the mesh itself. Between sessions, these objects are auto-saved and auto-restored, but can be deleted if the user desires.
How to Use Spatial Mapping in AR
When trying to figure out how to use spatial mapping in your AR app, you need to figure out how immersive you want to be. Is your app going to need a lot or a little amount of space to place holograms? Does placement on a mesh’s planes need to be specific, or will the hologram hover in place? How robust do you want your occlusion or physics colliders to be?
For example, this ML1 game of a toy car requires a fairly large space in order for the space to feel immersive. Seeing how the car interacts with objects in the space is part of the enjoyment of this app. Through ML1’s block skirt, the device removes the space between the floor and the couch so the car is able to drive up it. Its plane extraction recognizes the table and accounts for room to drive underneath it.
Alternatively, this HoloLens app gives real-time instructions to repair crews, and highlights the part of the machine to be adjusted. This requires a small area to be mapped, but with relatively high amounts of detail depending on the complexity of the pipes. There’s not much need for interaction between the user and the holograms. The instructions are anchored to remain to the left of the user for easy reference, while the arrows are anchored to the pipes so the user knows what part to manipulate.
Once you answer those questions, and more, then you’ll know where to set your app’s spatial boundaries and how you’d like users to interact with the holograms. It will also help you determine which device to program for. For example, the HoloLens’ visualization process smooths out bumpy planes for easy hologram placement. But, a virtual car driving over this plane would not register the plane’s bumpiness. ML1’s block skirt accounts for these bumps and overlays the mesh on top so the car will interact with the bumps.
An understanding of spatial mapping helps you determine the scope of your app. From there, we can help you build your app through our augmented reality course. You’ll learn to program in Unity and C#, and we’ll take your idea to a prototype in just 10 weeks. Our one-on-one mentorship ensures you get the guidance to fully understand building your augmented reality app. Download our syllabus to learn more about how Circuit Stream’s augmented reality course can help make your idea a reality!