
Watch the footy play out on your dining table thanks to AI and augmented reality

Remember the holographic chess in Star Wars? Imagine that on your dining table, except you’re watching the World Cup final.

Deep learning is a form of machine learning in which a system learns representations directly from raw data such as images, video or text, without hand-coded rules or task-specific algorithms supplied by humans. Making a very broad generalisation, deep learning enables artificial intelligence to learn by itself.

Researchers from the University of Washington, working with developers from Facebook and Google, have come up with the first deep learning-based system that can transform a standard video stream of a football game into a moving 3D hologram.

The system analyses the video stream and reconstructs the body and movements of each player to project them in 3D through augmented reality.

Fancy having Manchester United play on your dinner table? It seems there could soon be an app for that.

“There are numerous challenges in monocular reconstruction of a soccer game,” the researchers explain.

“We must estimate the camera pose relative to the field, detect and track each of the players, reconstruct their body shapes and poses, and render the combined reconstruction.”
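The steps the researchers list can be sketched as a per-frame pipeline. Every function name below is a placeholder invented for illustration, not the researchers’ actual code, and the bodies are trivial stand-ins for the real components:

```python
# Hypothetical outline of the per-frame pipeline quoted above; each
# function is a placeholder standing in for a real component.

def estimate_camera_pose(frame):
    # Real system: fit the camera from the pitch's line markings.
    return {"position": (0, 10, -20), "look_at": (0, 0, 0)}

def detect_and_track_players(frame):
    # Real system: person detection + tracking; here, pretend two players.
    return [(5, 5, 20, 40), (60, 8, 20, 40)]  # (x, y, w, h) boxes

def estimate_body_mesh(frame, box):
    # Real system: a depth network trained on video-game captures.
    return {"box": box, "vertices": []}

def render(camera, meshes):
    # Real system: composite the reconstructed players onto a 3D pitch.
    return {"camera": camera, "players": len(meshes)}

def reconstruct_frame(frame):
    camera = estimate_camera_pose(frame)
    boxes = detect_and_track_players(frame)
    meshes = [estimate_body_mesh(frame, b) for b in boxes]
    return render(camera, meshes)

scene = reconstruct_frame(frame={"pixels": None})
print(scene["players"])  # 2
```

The point is the ordering: camera pose first, so that each tracked player’s reconstruction can be placed correctly on the virtual pitch before rendering.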

The team used powerful NVIDIA graphics cards to “train” the artificial intelligence to extract 3D player data from a FIFA video game from EA Sports. The AI then uses meshes extracted from the game to render the players for a 3D viewer or AR device.

“It turns out that while playing Electronic Arts FIFA games and intercepting the calls between the game engine and the GPU, it is possible to extract depth maps from video game frames.

“In particular, we use RenderDoc to intercept the calls between the game engine and the GPU. FIFA, similar to most games, uses deferred shading during gameplay. Having access to the GPU calls enables capture of the depth and colour buffers per frame. Once depth and colour are captured for a given frame, we process it to extract the players.”
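Once depth and colour buffers have been captured for a frame, segmenting the players comes down to finding pixels whose depth departs from the flat pitch. A minimal sketch of that last step, using NumPy and assuming the buffers are already available as arrays (the function name, threshold and toy data are all illustrative, not the researchers’ code):

```python
import numpy as np

def extract_players(depth, color, field_depth, tol=0.05):
    """Segment player pixels from one captured frame.

    depth:       (H, W) array from the intercepted depth buffer
    color:       (H, W, 3) array from the colour buffer
    field_depth: (H, W) depth of the empty pitch (no players)
    Pixels whose depth differs from the flat field by more than
    `tol` are treated as player geometry.
    """
    mask = np.abs(depth - field_depth) > tol
    players = np.zeros_like(color)
    players[mask] = color[mask]          # keep colour only where players are
    return mask, players

# Toy frame: a flat pitch at depth 1.0 with one "player" closer to the camera.
H, W = 4, 6
field = np.ones((H, W))
depth = field.copy()
depth[1:3, 2:4] = 0.7                    # player occupies a 2x2 block
color = np.full((H, W, 3), 50)           # pitch colour
color[1:3, 2:4] = 200                    # bright kit

mask, players = extract_players(depth, color, field)
print(int(mask.sum()))  # 4 player pixels
```

In the real system the per-player depth crops captured this way become training data for a network that later predicts depth from ordinary broadcast footage, where no depth buffer exists.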

The technology is still a bit rough around the edges, but that doesn’t make its potential any less mind-blowing.

The research was featured at this year’s Computer Vision and Pattern Recognition (CVPR) conference in Salt Lake City, Utah, where we also saw MIT researchers demonstrate how to see through walls.

About the author

Filmmaker. 3D artist. Procrastination guru. I spend most of my time doing VFX work for my upcoming film Servicios Públicos, a sci-fi dystopia about robots, overpopulated cities and tyrant states. @iampineros
