Nvidia creates even smoother slow-motion video than a 300K fps camera
Looking to the future: Nvidia has developed a technique that uses neural networks to create smooth slow-motion video from standard footage. Variable-length multi-frame interpolation uses machine learning to ‘hallucinate’ the transitions between frames in a video, then inserts these artificially created frames between the originals to seamlessly slow down the final footage.
I don’t know why, but people love to watch slow-motion videos. In fact, they’re so popular that Gavin Free and Dan Gruchy have a YouTube channel dedicated to them, The Slow Mo Guys, which has nearly 1.5 billion views and over 11 million subscribers. Free saw a niche to be filled, as creating slow-motion videos is inconvenient for most people. Besides the fact that the equipment is extremely expensive, storing images shot at over 300,000 fps quickly becomes a problem.
There are filters that convert ordinary video to slow motion, but the result is somewhat jerky as it only intersperses duplicate frames to lengthen the footage. However, researchers at Nvidia believe they’ve developed a way to create even smoother slow-motion videos than those taken with high-speed cameras like the ones Free and Gruchy use on their channel.
According to VentureBeat, scientists at Nvidia, the University of Massachusetts Amherst and the University of California, Merced have designed an unsupervised end-to-end neural network that can generate an arbitrary number of intermediate frames to create smooth slow-motion sequences.
The technique has been dubbed “variable-length multi-frame interpolation” and it uses machine learning to bridge the gaps between frames in a video to create smooth, slow-motion versions.
“You can slow it down by a factor of eight or 15 – there’s no upper limit,” said Jan Kautz, senior director of visual computing and machine learning research at Nvidia.
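To make the arithmetic behind a slowdown factor concrete, here is a minimal sketch. It assumes an integer factor and evenly spaced intermediate frames; the function names are hypothetical and do not come from Nvidia's code.

```python
from fractions import Fraction


def intermediate_times(slowdown: int) -> list[Fraction]:
    """Time positions in (0, 1) of the frames to synthesize between two
    consecutive input frames, assuming even spacing (an assumption)."""
    if slowdown < 1:
        raise ValueError("slowdown must be >= 1")
    return [Fraction(i, slowdown) for i in range(1, slowdown)]


def output_frame_count(input_frames: int, slowdown: int) -> int:
    """Total frames after interpolation: originals plus synthesized ones."""
    if input_frames < 2:
        return input_frames
    gaps = input_frames - 1  # one gap between each pair of neighbors
    return input_frames + gaps * (slowdown - 1)


if __name__ == "__main__":
    # An 8x slowdown needs 7 new frames in every gap.
    print([str(t) for t in intermediate_times(8)])
    print(output_frame_count(30, 8))
```

Under these assumptions, an 8x slowdown means seven synthesized frames for every original gap, so the amount of generated footage quickly outweighs what was actually shot.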
The technique uses two convolutional neural networks (CNNs) in tandem. The first estimates optical flow both forward and backward in time between two input frames. It then generates what is called a “flow field”, a 2D field of predicted motion vectors for the frame to be inserted between the images.
A second CNN then interpolates the optical flow, refining the approximate flow field and predicting visibility maps that exclude pixels occluded by objects in the frame, which reduces artifacts in and around moving objects. Finally, the visibility maps are applied to the two input images, and the intermediate optical flow field is used to warp (distort) them so that one image transitions smoothly into the next.
The results are remarkable, as you can see in the video above. Even footage shot at 300,000 fps by the Slow Mo Guys was slowed down further and looks smoother than the original.
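The final warp-and-blend step can be illustrated with a toy sketch. It works on a single row of grayscale pixels rather than full images, and it takes the intermediate flows and visibility maps as given inputs, whereas in the real system they come from the two CNNs. The function names and the exact weighting are illustrative assumptions, not Nvidia's implementation.

```python
def backward_warp(row, flow):
    """Sample `row` at position x + flow[x], using linear interpolation
    and clamping to the row's edges."""
    n = len(row)
    out = []
    for x in range(n):
        pos = min(max(x + flow[x], 0.0), n - 1.0)
        left = int(pos)
        right = min(left + 1, n - 1)
        frac = pos - left
        out.append((1 - frac) * row[left] + frac * row[right])
    return out


def blend(row0, row1, flow_t0, flow_t1, vis0, vis1, t):
    """Visibility-weighted blend of both warped inputs at time t in (0, 1).

    vis0/vis1 hold per-pixel weights in [0, 1]; 0 marks a pixel as occluded
    in that input, so the other input fills it in.
    """
    warped0 = backward_warp(row0, flow_t0)
    warped1 = backward_warp(row1, flow_t1)
    out = []
    for w0, w1, v0, v1 in zip(warped0, warped1, vis0, vis1):
        a = (1 - t) * v0  # frame 0 counts more when t is near 0
        b = t * v1        # frame 1 counts more when t is near 1
        z = a + b
        # Fall back to a plain average if neither input sees the pixel.
        out.append((a * w0 + b * w1) / z if z > 0 else 0.5 * (w0 + w1))
    return out


if __name__ == "__main__":
    n = 9
    row0 = [0.0] * n
    row1 = [0.0] * n
    row0[2] = 1.0  # bright pixel at x=2 in the first frame
    row1[6] = 1.0  # same pixel at x=6 in the second frame
    # Halfway frame: look 2 px back into frame 0 and 2 px ahead into frame 1.
    mid = blend(row0, row1, [-2.0] * n, [2.0] * n, [1.0] * n, [1.0] * n, 0.5)
    print(mid)
```

In this toy case the bright pixel lands at x=4 in the halfway frame. Setting a visibility weight to zero at a pixel makes the blend rely entirely on the other input there, which is the role the article attributes to the visibility maps.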
The technique uses Nvidia Tesla V100 GPUs and the cuDNN-accelerated PyTorch deep learning framework. As such, don’t expect to see a retail version coming out anytime soon.
According to Kautz, the system needs a lot of optimization before it can run in real time. He also says that even when it hits the market, most of the processing will have to be done in the cloud due to the hardware limitations of the devices the filter would likely be used on.
If you’re into technical details, the team has published a paper describing the method, available from the Cornell University Library.