NVIDIA Research unveils GauGAN2, a new AI demo that creates accurate images from text: Digital Imaging Review


Lead image: Image generated from the text prompt, “A calm lake surrounded by tall trees on a foggy day.”

NVIDIA has announced GauGAN2, the latest version of NVIDIA Research's AI demo. The model is powered by deep learning and now includes a text-to-image feature. Whereas the original version could only convert a rough sketch into a detailed image, GauGAN2 can create images from phrases like “sunset on the beach,” which can then be modified with descriptive phrases like “rocky shore,” or by changing “sunset” to a different time of day or a different weather condition. GauGAN is powered by generative adversarial networks (GANs), which you can learn more about in this NVIDIA article.

Going back to GauGAN2, NVIDIA explains that with the push of a button, users can generate a segmentation map, a high-level outline that shows the location of objects in the scene. From there, they can switch to drawing and modify the scene with rough sketches using labels like sky, tree, rock, and river, and the smart paintbrush merges these doodles into finished images.

You can try GauGAN2 for yourself on the NVIDIA AI Demos website. You can also see it in action in the video below.

By adding text-to-image capabilities, the new version of GauGAN is more customizable and faster to get started with. Even a quick sketch isn't as fast or simple as typing a phrase. The latest release is also one of the first AI models to combine multiple modalities (text, semantic segmentation, sketch, and style) within a single GAN.

A text-based starting point, such as “snow-capped mountain range,” can be further customized with sketches. You can add trees, change the height and size of objects, add clouds to the sky, and much more. GauGAN2 then generates a new, modified image.

Endless tall mountains on a sunny day

You don’t need to keep your ideas grounded in reality, either. GauGAN2 may be useful for concept artists, since you can create worlds with two suns, like Tatooine in Star Wars. NVIDIA writes, “It’s an iterative process, where each word the user types into the text box adds more to the AI-generated image.”


GauGAN's results continue to improve. When we looked at it in early 2019, the results were impressive, but there were obvious limitations. Earlier this year, NVIDIA released NVIDIA Canvas, a tool built on GauGAN that can be used on any NVIDIA RTX GPU. GauGAN2 has been trained on 10 million landscape images using the NVIDIA Selene supercomputer, which is among the 10 most powerful supercomputers in the world.

To learn more about NVIDIA’s research and projects, click here. There are many AI projects under way, and it’s exciting to see how far GauGAN has come in just a few years.


