
Microsoft thinks it’s perfectly OK to steal content if it’s on the open web

July 2, 2024

Microsoft AI chief Mustafa Suleyman mistakenly believes that anything published on the open web instantly becomes “freeware” that anybody can copy and use without restriction.

Asked whether AI companies have effectively stolen the world’s intellectual property, he said:

In his view, the social contract for content already on the open web has, since the 1990s, been that it is fair use: anyone can copy it, recreate with it, and reproduce with it. The understanding has been that this is “freeware,” if you like.

It may not surprise you to hear a Microsoft executive defend the practice as entirely lawful, given that the company and OpenAI are the targets of numerous lawsuits alleging that they took copyrighted online material to train generative AI models.

In the US, a work is automatically protected by copyright the moment you create it. You don’t even need to apply, and publishing the work online doesn’t void your rights. It’s so hard to give your rights away, in fact, that attorneys had to create special web licenses to make it easier!

Fair use, in contrast, is not granted by a “social contract”; it is decided by a court. It is a legal defense that permits certain uses of copyrighted material once the court has weighed what you’re copying, why, how much, and whether the copyright owner will be harmed.

Still, most AI companies haven’t been quite as outspoken as Suleyman in claiming that training on copyrighted content qualifies as “fair use.”

Speaking of brazen, right after his “fair use” comment, he says this insightful thing about the nature of humanity:

As a human species, what else do we constitute except a machine for producing knowledge and intellectual output?

Suleyman appears to believe that the robots.txt file, which lists which bots are prohibited from scraping a specific website, has some value in preventing content piracy. He declares:

There is a separate category for a website, publisher, or news outlet that has explicitly said, “do not scrape or crawl me for any reason other than indexing me so that other people can find this content.” That is a gray area, and the courts will have to deal with it.

However, robots.txt is not a legal document. Unlike fair use, it really has been a social contract since the 1990s, yet some AI companies appear to be ignoring it as well. OpenAI, a Microsoft partner, is reportedly one of them.
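To see why robots.txt works only as a voluntary convention, here is a minimal sketch of the check a well-behaved crawler might run, using Python’s standard `urllib.robotparser` module. The bot names `ExampleAIBot` and `ExampleSearchBot`, the `example.com` URL, and the `is_allowed` helper are all hypothetical, chosen purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow every other bot.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Both bot names and the URL below are made up for this example.
ai_allowed = is_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/article")
search_allowed = is_allowed(ROBOTS_TXT, "ExampleSearchBot", "https://example.com/article")
print(ai_allowed, search_allowed)
```

Nothing in the file enforces the result: the check happens only if the crawler’s operator chooses to run it and to honor the answer.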
