Catching up with AR

Heavyweights Apple and Google beef up AR content development tools.

There is nothing but good news on the AR front as the dueling mobile platform giants Google and Apple improve their mobile AR SDKs. Apple went big with the introduction of ARKit 4 at its online WWDC 2020 conference this year. Of the two, Apple is also the more hardware dependent, which is to be expected. This year’s WWDC was all about new processors: Arm-based CPUs for its PCs and new generations of the Bionic processor. Everything else flows from there.

The updates to ARKit showcased at this year’s WWDC were added to make the most of Apple’s A12 and A13 Bionic processors. The A12 powers Apple’s iPhone XR, XS, and XS Max, and the A12X Bionic powers the 11-inch and 12.9-inch iPad Pro (the A13 powers the iPhone 11). The new iPad Pro tablets are the show ponies for Apple’s new AR capabilities, featuring an additional LiDAR sensor. They are also the advance guard of Apple’s new Arm-based PCs, which are believed to be getting a new A14 processor.

Apple also debuted new AR SDKs to support the hardware and enable more realistic AR, adding capabilities first in ARKit 3.5 and now in ARKit 4.

The new 12.9-inch iPad Pro is Apple’s first vehicle for its hardware-based AR, using a new LiDAR sensor in conjunction with its other cameras. (Source: Apple)

ARKit 4 includes a Depth API that makes use of the depth information from the LiDAR scanner, along with Location Anchoring, which uses Apple Maps data to place AR content more accurately in the scene relative to the real world. The release also expands face tracking.

One of the most obvious benefits of the Depth API is improved virtual object occlusion: the LiDAR scanner’s depth data is combined with the 3D mesh generated by Apple’s Scene Geometry capability, which was introduced earlier this year in ARKit 3.5 alongside the new iPad Pro tablets.

Scene Geometry builds a topological map of the area in front of the cameras and identifies floors, walls, ceilings, windows, doors, and seats. It tags those features, which enables realistic effects such as occlusion and provides data for applying effects to virtual objects. Developers can use those classification tags in their own applications.

Thanks to the increased depth data from the sensors, virtual objects can be placed more precisely against the 3D mesh and blended more realistically into the physical surroundings. Apple says users and developers will benefit from new capabilities such as taking measurements and applying realistic effects to virtual objects.

In addition to ARKit for AR development, Apple has also developed the RealityKit framework for high-performance 3D simulation and rendering. RealityKit uses information from ARKit to convincingly integrate virtual objects into the real world. It also supports audio sources, enables animation including physics simulation, responds to user input and changes in the environment, and can be synchronized across devices to enable group AR experiences.
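
For developers, these pieces come together in a few lines of session configuration. Here is a minimal Swift sketch, assuming a LiDAR-equipped device and a RealityKit ARView named arView, that turns on scene reconstruction and the new scene-depth frame semantics:

```swift
import ARKit
import RealityKit

// Minimal sketch: enable LiDAR-backed Scene Geometry and the Depth API
// on a RealityKit ARView. Assumes the app already owns `arView`.
func configureLiDARSession(for arView: ARView) {
    let configuration = ARWorldTrackingConfiguration()

    // Scene Geometry: build a classified 3D mesh of the surroundings
    // (only supported on LiDAR-equipped devices).
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
        configuration.sceneReconstruction = .meshWithClassification
    }

    // Depth API: per-frame depth maps from the LiDAR scanner.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
        configuration.frameSemantics.insert(.sceneDepth)
    }

    // Let RealityKit use the mesh for occlusion and physics.
    arView.environment.sceneUnderstanding.options.insert(.occlusion)
    arView.environment.sceneUnderstanding.options.insert(.physics)

    arView.session.run(configuration)
}
```

The classified mesh from Scene Geometry is what lets RealityKit hide virtual content behind real objects and give it real surfaces to interact with.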

Real-world location

As you may have heard, Apple says it’s finally making good on improving Apple Maps, which, let’s be honest, it owes us. Apple has acquired multiple map technology companies and has been working for eight years to fulfill the promise made in 2012, when Apple Maps was met with derision and threatened lawsuits after the app fell far short of Apple’s big promises. You’ve got to give it to the company, it does not surrender to humiliation … at all. Apple has rebuilt its mapping technology and is spending its enormous resources on gathering 3D data directly, driving its own data-gathering cars along the streets of the world and sending out walkers with sensor rigs on their backs. The new Apple Maps is much better: there is much more localized data, the maps have 3D representations, and, most importantly, they’re more accurate.

ARKit 4 takes advantage of the improved Apple Maps through Location Anchoring. AR experiences can be placed at specific locations according to longitude, latitude, and altitude coordinates. Obviously, this capability will enrich travel applications, but it also matters for the industrial and construction applications we’ve been talking about: the virtual representation of a machine can sit where the machine actually is on the shop floor, or construction work can be pinpointed on the site. Accuracy is still somewhat relative, but according to an article in AppleInsider, Apple has worked to improve the accuracy of its Location Anchors by using landmarks within the camera view to refine the position fix, applying machine learning to identify those landmarks as part of its data collection process.

Apple’s location anchors will let people walk around virtual objects and see them from different perspectives, as if viewing them through a camera lens.
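
In code, Location Anchoring boils down to checking whether geo tracking is available where the user is standing and then adding an anchor at a coordinate. A rough Swift sketch, assuming an existing ARSession and using an arbitrary example coordinate, might look like this:

```swift
import ARKit
import CoreLocation

// Minimal sketch: place AR content at a real-world coordinate with
// ARKit 4's Location Anchoring. The coordinate below is only an example.
func startGeoTracking(with session: ARSession) {
    // Geo tracking works only on supported devices in mapped regions.
    ARGeoTrackingConfiguration.checkAvailability { available, error in
        guard available else {
            print("Geo tracking unavailable: \(error?.localizedDescription ?? "unsupported area")")
            return
        }
        // In a real app, hop back to the main queue before running the session.
        session.run(ARGeoTrackingConfiguration())

        // Anchor content at latitude/longitude/altitude.
        let coordinate = CLLocationCoordinate2D(latitude: 37.3349, longitude: -122.0090)
        let geoAnchor = ARGeoAnchor(coordinate: coordinate, altitude: 10.0)
        session.add(anchor: geoAnchor)
    }
}
```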

Apple’s ARKit 4 expands Face Tracking to the front-facing camera on any device with an A12 Bionic chip or later (that means the iPhone XS, XS Max, and XR, the 2019 iPad Air and iPad mini, the iPhone SE, and everything since). Up to three faces can be tracked at once, and the TrueDepth camera can come into play for Memoji (personalized faces that can be used as stickers and applied in Messages, Mail, and elsewhere) and Snapchat.
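
A short Swift sketch of the expanded face tracking, again assuming an existing ARSession, simply asks ARKit how many faces the device can handle and runs a face-tracking configuration:

```swift
import ARKit

// Minimal sketch: track up to three faces at once with the front camera
// (supported on A12 Bionic devices and later).
func runFaceTracking(on session: ARSession) {
    guard ARFaceTrackingConfiguration.isSupported else { return }

    let configuration = ARFaceTrackingConfiguration()
    // ARKit reports how many faces this device can track simultaneously.
    configuration.maximumNumberOfTrackedFaces =
        ARFaceTrackingConfiguration.supportedNumberOfTrackedFaces
    session.run(configuration)
}
```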

Additional features

Apple says it has developed Motion Capture capabilities for its devices using a single camera, building on its work modeling the human body, including the movement of joints and bones. As a result, says Apple, people in AR are understood to be people.
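
Motion Capture is exposed through ARKit’s body-tracking configuration, which delivers a skeleton of joint transforms for each tracked person. A minimal Swift sketch (the delegate class name is ours, not Apple’s) could look like this:

```swift
import ARKit

// Minimal sketch: ARKit body tracking (Motion Capture) with a single camera.
// A real app would keep this delegate alive and attach it to its ARSession.
final class BodyTrackingDelegate: NSObject, ARSessionDelegate {
    func run(on session: ARSession) {
        guard ARBodyTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARBodyTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let bodyAnchor as ARBodyAnchor in anchors {
            // The skeleton exposes joint transforms (head, hands, feet, ...)
            // relative to the body anchor.
            let skeleton = bodyAnchor.skeleton
            if let leftHand = skeleton.modelTransform(for: .leftHand) {
                print("Left hand transform: \(leftHand)")
            }
        }
    }
}
```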

The company is enabling collaborative sessions, so multiple devices can build a shared world map and take part in shared AR experiences such as multiplayer games.
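
Collaborative sessions work by having each device periodically emit collaboration data that the app relays to its peers over whatever networking layer it chooses. The Swift sketch below leaves the transport abstract; the send closure and didReceive method are placeholders of ours, not ARKit API:

```swift
import ARKit

// Minimal sketch: enable ARKit collaborative sessions and shuttle
// collaboration data to and from peers via an app-provided transport.
final class CollaborationHandler: NSObject, ARSessionDelegate {
    var send: ((Data) -> Void)?   // hook this up to your networking layer

    func run(on session: ARSession) {
        let configuration = ARWorldTrackingConfiguration()
        configuration.isCollaborationEnabled = true
        session.delegate = self
        session.run(configuration)
    }

    // ARKit periodically emits collaboration data to share with peers.
    func session(_ session: ARSession, didOutputCollaborationData data: ARSession.CollaborationData) {
        if let encoded = try? NSKeyedArchiver.archivedData(withRootObject: data, requiringSecureCoding: true) {
            send?(encoded)
        }
    }

    // Feed data received from a peer back into the local session.
    func didReceive(_ encoded: Data, in session: ARSession) {
        if let data = try? NSKeyedUnarchiver.unarchivedObject(ofClass: ARSession.CollaborationData.self, from: encoded) {
            session.update(with: data)
        }
    }
}
```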

Apple says ARKit can detect up to 100 images at a time and automatically estimate the physical size of the object in an image.
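
Both of those capabilities are configuration flags. A short Swift sketch, assuming a reference-image group named "AR Resources" in the app’s asset catalog (the name is just an example), would be:

```swift
import ARKit

// Minimal sketch: detect reference images and let ARKit estimate their
// physical size in the real world.
func runImageDetection(on session: ARSession) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.detectionImages =
        ARReferenceImage.referenceImages(inGroupNamed: "AR Resources", bundle: nil) ?? []
    // Track a handful of images simultaneously while detecting many more.
    configuration.maximumNumberOfTrackedImages = 4
    // Ask ARKit to estimate each detected image's real-world size.
    configuration.automaticImageScaleEstimationEnabled = true
    session.run(configuration)
}
```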

It’s believed that Apple’s upcoming iPhone 12 will also have a model that includes the LiDAR sensor and the A14 processor.

Meanwhile, Google updates ARCore

The differences between Google and Apple are evident in the way the two companies have introduced their new AR features. While Apple demonstrates capabilities made possible by its new sensors and processors, Google shows off new APIs that enable AR applications on mobile devices with a single camera … imagine. The company is bringing AR to as many devices as possible by building it around a single RGB camera.

This represents Google’s revamped AR strategy. As an article in Ars Technica points out, Google went down the path of multi-camera platform support with Tango in 2014. Lenovo’s Phab 2 Pro was among the first prototype phones for Google’s early AR experiments, and it was a little tippy, hardware-wise: big, heavy, expensive, and it ran hot. In 2017, the company abandoned Tango in favor of its new ARCore APIs and turned its attention to the mainstream of mobile.

Google’s pitch is that AR is for everyone, but the capabilities only get better with the addition of depth sensors, such as the ToF modules on new Samsung phones and others.

Google says ARCore features three basic capabilities to integrate virtual content into the real world as seen in a camera: motion tracking, environmental understanding, and light estimation.

Google introduced a preview of the ARCore Depth API at the end of 2019 and invited developers to pitch in. Google’s approach relies on the company’s depth-from-motion algorithms, developed to enable depth-map creation from a single RGB camera. In June 2020, Google made the Depth API widely available through ARCore 1.18 for Android and Unity, including Unity’s cross-platform framework AR Foundation, a write-once, deploy-everywhere API supporting both Android devices and iOS.

Google’s work enables 3D depth mapping using its depth-from-motion algorithms, allowing any supported device to capture depth. In addition, Google has introduced Environmental HDR, which captures real-world lighting and applies it to AR objects and scenes. The company has brought machine learning into play so that depth sensing doesn’t depend on dedicated multi-camera triangulation hardware; instead, moving the phone’s camera around the scene collects enough information to build a depth map. Google’s algorithms were trained on many hours of YouTube video.

The ARCore Depth API creates a depth map by taking multiple images from different angles and comparing them as the phone moves to estimate the distance to every pixel. ARCore can also identify planes such as tables, floors, and ceilings, making it easier to place objects and define interactions with them.
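
The geometry behind depth-from-motion is ordinary parallax: as the camera moves, nearby features shift more between frames than distant ones. The toy Swift function below is not ARCore code, just a sketch of the underlying relationship between camera motion (baseline), focal length, and pixel disparity:

```swift
import Foundation

// Toy illustration of depth-from-motion (not ARCore code): given a known
// camera movement (baseline) and focal length, the pixel disparity of a
// feature between two frames gives its approximate depth.
func estimatedDepth(focalLengthPixels: Double,
                    baselineMeters: Double,
                    disparityPixels: Double) -> Double? {
    guard disparityPixels > 0 else { return nil }  // no parallax, no depth
    return focalLengthPixels * baselineMeters / disparityPixels
}

// Example: a 1,500 px focal length, 5 cm of camera travel, and a 10 px
// shift put the feature roughly 7.5 m away.
let depth = estimatedDepth(focalLengthPixels: 1_500, baselineMeters: 0.05, disparityPixels: 10)
```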

In a blog post, Rajat Pajaria, Google’s product lead for AR, highlights applications that demonstrate the ways developers are putting ARCore to work.

Google says ARCore is designed to work on phones running Android 7.0 (Nougat) and higher and offers a list of supported devices on its website.

Generate a depth map without specialized hardware to unlock capabilities like occlusion.

As we said, Google is going for less expensive hardware options. Samsung phones, for instance, feature ToF sensors rather than the LiDAR components Apple has decided to go with. The phones with ToF include the Galaxy S10 5G, Note10+, Galaxy S20+, and Galaxy S20 Ultra. Samsung’s “scannerless” ToF relies on a pulse of IR light to resolve the distance between the camera and points identified in the field. It enables Live Focus video, which blurs out the background while capturing video and can swap between foreground and background focus, on both the front and rear cameras. Samsung also uses ToF for its Quick Measure feature, which gathers enough data to determine width, height, area, volume, and more; Quick Measure comes preloaded on the Galaxy S10 5G and Note10+.

Huawei’s sensor-loaded P40 Pro. (Source: Huawei)

Samsung is not the only player on the field offering ToF sensors. A recent search also turned up the LG G8 ThinQ, the OPPO RX17 Pro, the Honor View 20, and Huawei’s P30 Pro and P40 Pro. Samsung’s phones are probably the most widely available worldwide.

What do we think?

LiDAR is here. Or almost. It’s not too surprising that Apple thinks you should buy something new and expensive. And, as many Apple users will tell you, you get what you pay for. That may be debatable. I am an Apple fan, but as much as I think I want some features, I’m often struck by how rarely I actually use the features I thought I wanted. And I never wanted a Memoji.

What’s intriguing along the way, though, is how Apple is blending PC and mobile features in its new iPad Pro. After WWDC 2020, with the promise of Arm-based PCs, it seems fairly obvious how that trend will take shape.

Google, on the other hand, has a strategy that’s going to appeal to professionals as well as artists and hobbyists. As we’ve already pointed out, developers like platforms with lots of potential customers, and that’s what Google promises.

Professional AR, which includes construction, maintenance, retail development, security, monitoring, and more, is going to happen across the board. Software developers including PTC’s Vuforia group, Vectorworks, Autodesk, and Dassault Systèmes’ SolidWorks are supporting both ARCore and ARKit.

AR has long seemed inevitable, but with the new features and capabilities of 2020, it’s going to become more realistic, accurate, and fun.