Data poisoning tool eyed to prevent AI copyright infringement

Amid the heated row between artificial intelligence (AI) companies and creators over alleged copyright breaches, researchers are developing a new tool intended to protect digital artists’ intellectual property (IP) rights.

The tool, dubbed Nightshade, is designed to poison the data sets used in training generative AI models, causing them to malfunction. Per an MIT Technology Review report, Nightshade works by tweaking the pixels in digital art in a way that is invisible to the naked eye but changes how generative AI models trained on the image interpret it.
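To make the idea of an "invisible" pixel tweak concrete, here is a minimal sketch of a bounded perturbation. It is not Nightshade's method: the function name `perturb_pixels`, the `epsilon` limit, and the use of random shifts are all assumptions made for illustration, whereas a real poisoning tool would presumably choose its changes deliberately to mislead a model. The sketch only shows that every pixel can change while no single value moves far enough for a viewer to notice.

```python
import random


def perturb_pixels(pixels, epsilon=2, seed=0):
    """Return a copy of `pixels` (ints in 0-255) with each value shifted by at most `epsilon`.

    Hypothetical helper: random shifts stand in for the carefully chosen
    changes a real poisoning tool would compute.
    """
    rng = random.Random(seed)
    out = []
    for value in pixels:
        shifted = value + rng.randint(-epsilon, epsilon)
        out.append(min(255, max(0, shifted)))  # keep the result a valid pixel value
    return out


original = [0, 64, 128, 200, 255]          # a tiny stand-in for an image
poisoned = perturb_pixels(original)
max_change = max(abs(a - b) for a, b in zip(original, poisoned))
print("largest change to any pixel:", max_change)
```

On a 0 to 255 scale, a shift of one or two levels per pixel is generally hard to see, which is why a small `epsilon` serves here as a stand-in for "invisible to the naked eye."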

Early test results by the researchers at the University of Chicago have shown significant promise in contaminating machine learning data sets. For example, inputting the word dogs in “poisoned” systems generates images of cats, increasing the inaccuracies of affected models.
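The dogs-to-cats effect can be illustrated with a toy example. The sketch below is an assumption-heavy simplification, not how a generative model learns: the "model" simply remembers which image concept appears most often with each caption, and the sample counts are invented. It shows only the general mechanism, in which mislabeled pairs in the training data shift what a prompt produces.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Toy 'model': map each caption to the image concept seen with it most often."""
    counts = defaultdict(Counter)
    for caption, concept in pairs:
        counts[caption][concept] += 1
    return {caption: seen.most_common(1)[0][0] for caption, seen in counts.items()}


# Invented data: (caption, what the image actually depicts)
clean = [("dog", "dog")] * 100 + [("cat", "cat")] * 100
few_poisoned = [("dog", "cat")] * 10     # captioned "dog", depicts a cat
many_poisoned = [("dog", "cat")] * 150

clean_model = train(clean)
lightly_poisoned_model = train(clean + few_poisoned)
heavily_poisoned_model = train(clean + many_poisoned)

print(clean_model["dog"], lightly_poisoned_model["dog"], heavily_poisoned_model["dog"])
```

In this toy setup a handful of poisoned pairs changes nothing, while a large enough batch flips the output for "dog." The threshold here is an artifact of the counting rule and says nothing about how many samples a real model would need.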

“The poisoned data is very difficult to remove, as it requires tech companies to painstakingly find and delete each corrupted sample,” read the report.

Ben Zhao, a lead researcher on the project, hinted at plans to make Nightshade open-source to allow other teams to create their own versions. Zhao says the end goal is not to stunt the development of AI but to “tip the power balance” between AI developers and artists “by creating a powerful deterrent” against the violation of copyrights.

The researchers previously developed Glaze, a tool designed to trick AI models into labeling a piece of digital art as a different genre from the original. Zhao revealed that Nightshade will be integrated into Glaze to give artists stronger copyright control over their creations.

Despite the promise shown by the tools, attempts at poisoning large-scale generative AI models will require “thousands of poisoned samples,” with experts predicting AI developers will begin working on defenses.

“We don’t yet know of robust defenses against these attacks. We haven’t yet seen poisoning attacks on modern [machine learning] models in the wild, but it could be just a matter of time,” said Vitaly Shmatikov, professor at Cornell University. “The time to work on defenses is now.”

There are fears that bad actors could leverage the tools to carry out malicious attacks against machine-learning models. Still, the researchers say the bad actors will require a boatload of poisoned samples to inflict real damage.

Trailing copyright concerns

AI developers have received flak from creators over the arbitrary use of copyrighted materials to train their AI models. Several creators have since filed class-action lawsuits against OpenAI and Meta (NASDAQ: META) over alleged copyright violations, seeking compensation and declaratory relief against the AI firms.

In their defense, AI developers argue that their reliance on copyrighted material qualifies as fair use, with some pointing to existing opt-out clauses. Outside of copyright concerns, the AI industry has been criticized for its impact on finance, Web3, education, elections, mass media, and security.
