AI can draw hands now

When Aidan Ragan creates AI-generated images, he expects the subjects to have clumsy, veiny hands with more or fewer than five fingers. Yet this month, sitting in his University of Florida seminar on AI in the arts, he was astounded to see a well-known image generator produce lifelike hands.

“It was great,” Ragan, 19, said in a Washington Post article. “It was the one thing keeping it from being flawless, and now it is. It was both exciting and a little frightening.”

The performance and popularity of artificial intelligence image generators, which produce images from written instructions, have exploded. Users give prompts ranging from the commonplace (draw Santa Claus) to the absurd (draw a dachshund in space with stained glass windows), and the software produces an image that looks like a fine painting or a realistic photograph.

A significant shortcoming of the technology has been its inability to create lifelike human hands. The data sets used to train AI often capture only parts of hands. As a result, images of bulging hands with excessively long fingers or stretched-out wrists frequently appear, a clear sign that an image is AI-generated and fake.

In mid-March, however, the popular image generator Midjourney released a software update that appeared to fix the issue, with artists reporting that the tool produced images with perfect hands. The company’s upgraded software was used this week to produce fake images of former president Donald Trump being arrested that looked real and went viral, demonstrating the disruptive power of this technology. This improvement comes with a big problem.

At first glance the update seems harmless, a boon for graphic designers who rely on AI image generators for realistic art. But it feeds a wider debate about the danger of generated content that cannot be distinguished from real photos. Some say this hyper-realistic AI will eliminate jobs for artists. Others say that, with no obvious signs that an image is fake, flawless images will make elaborate fake campaigns seem more credible.

Hany Farid, a professor of digital forensics at the University of California at Berkeley, said that before the software nailed all of these details, the average person would “be like: Okay, there are seven fingers here or three fingers there, that’s obviously false.” But once the software gets all of these features right, he said, those visual cues stop being reliable.

Text-to-image generators

Text-to-image generators have proliferated over the past year as part of the broader rise of generative artificial intelligence, which underpins software that produces text, images, or sounds based on the data it is fed.

The popular DALL-E 2, named for both Disney Pixar’s “WALL-E” and the painter Salvador Dalí, shook the internet when it was released in July of last year. In August, the start-up Stability AI released Stable Diffusion, essentially an anti-DALL-E with fewer restrictions on how it can be used. The research lab Midjourney introduced its own generator over the summer, and an image made with it sparked controversy in August when it won an art contest at the Colorado State Fair.

These image generators ingest billions of images scraped from the internet and look for patterns between the pictures and the text that accompanies them. For instance, the software learns that the words “bunny rabbit” are associated with pictures of the fluffy animal, and it produces such a picture when someone types them.

According to Amelia Winger-Bearskin, an associate professor of AI and the arts at the University of Florida, the software still had difficulty recreating hands.

Why AI image-generation algorithms struggle to create hands

She said that rendering hands has proven challenging because generative AI software has not been able to fully grasp what the word “hand” means. Hands come in many shapes and sizes, she said, but training data sets tend to emphasize faces over hands. When hands do appear, they are often folded or gesturing, which gives a distorted view of the body part.

“If every single image of a person was constantly like this,” she said during a Zoom video interview, holding her hands out wide, “we’d definitely be able to duplicate hands very effectively.”

Winger-Bearskin said that this month’s Midjourney software update seems to have made a dent in the problem, though it is not flawless. “We’ve still had some pretty bizarre ones,” she said.

Winger-Bearskin speculated that Midjourney may have refined its image data set, marking photos in which hands aren’t obscured as a higher priority for the algorithm to learn from and flagging photos in which hands are hidden as a lower priority.
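
The kind of prioritization Winger-Bearskin describes can be pictured as weighted sampling of training images. The sketch below is purely illustrative and does not reflect how Midjourney actually works: the records, the `hands_visible` flag, and the weight values are all hypothetical.

```python
import random

# Hypothetical training records; the "hands_visible" flag is an assumed label,
# not something the article says any real data set contains.
images = [
    {"id": "a", "hands_visible": True},
    {"id": "b", "hands_visible": False},
    {"id": "c", "hands_visible": True},
    {"id": "d", "hands_visible": False},
]

def sampling_weights(records, visible_weight=3.0, hidden_weight=1.0):
    """Give images with unobstructed hands a higher chance of being drawn."""
    return [visible_weight if r["hands_visible"] else hidden_weight
            for r in records]

def sample_batch(records, batch_size, seed=0):
    """Draw a training batch with replacement, biased by the weights above."""
    rng = random.Random(seed)
    return rng.choices(records, weights=sampling_weights(records), k=batch_size)

batch = sample_batch(images, batch_size=1000)
visible_share = sum(r["hands_visible"] for r in batch) / len(batch)
print(f"Share of batch with visible hands: {visible_share:.2f}")
```

With these assumed weights, images showing clear hands make up half the data set but are expected to make up about three-quarters of each batch, so the model would see them more often during training.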

Julie Wieland, a 31-year-old graphic designer in Germany, said she values Midjourney’s ability to produce more lifelike hands. Wieland uses the program to create mood boards and mock-ups for visual marketing campaigns. She said that fixing human hands in post-production is often the most time-consuming part of her job.

Yet she said the update is bittersweet. Wieland often enjoyed touching up the hands in AI-generated images, or altering an image to match her preferred creative style, which is heavily influenced by the lighting, glare, and through-the-window shots made famous in Wong Kar-wai’s film “My Blueberry Nights.”

“I do miss the imperfect looks,” she said. “As much as I enjoy getting stunning photographs right from Midjourney, editing them afterward is my favorite part of the process.”

Ragan, who wants to work in artificial intelligence, also said that these flawless images take away from the fun and originality of AI image creation. “I really enjoyed the interpretive art component,” he said. “It just feels stiffer now. It has a more mechanical, tool-like feel.”

Farid, of UC Berkeley, said Midjourney’s ability to produce better images creates political risk, because more believable images could stoke public anger. He cited recent images generated with Midjourney that appeared to credibly depict Trump being arrested, even though he had not been. Farid noted that small details, such as the length of Trump’s tie and his hands, were improving and making the images more credible.

“It’s simple to convince people to believe these things,” he said. “And now it’s even easier when there are no visual [errors].”

Until a few weeks ago, spotting poorly drawn hands was a reliable way to tell that an image was deep-faked, Farid said. As quality improves, he said, that is getting harder to do. Still, he added, there are clues, often in the background of a picture, such as a deformed tree branch.

Farid said AI companies should think more broadly about the harm their technology could cause as they improve it. He said they could build in guardrails, such as embedding watermarks in images, barring anonymous users from posting images, and blocking certain words from being used to generate images (as DALL-E 2 does, he said).

Farid said AI companies are unlikely to slow the development of their image generators.

“In the area of generative AI, there is an arms race,” he said. “Everyone is moving quickly to figure out how to commercialize, and safety slows you down.”
