Meta leverages public social media posts to train its new AI assistant

Meta (NASDAQ: META), the parent company of Facebook and Instagram, has unveiled a new artificial intelligence (AI) model, which it says was trained with data respecting the privacy rights of individuals.

CEO Mark Zuckerberg announced the product, dubbed Meta AI, at Connect, Meta’s annual product conference. During the reveal, Meta confirmed that the new AI tool was trained largely on data gleaned from public Facebook and Instagram posts.

The data used in training the AI model include both text and images, hinting that the company will be proceeding with a text-to-image generator. However, private posts, especially those shared with family and friends, were not used to train Meta AI.

Nick Clegg, Meta President of Global Affairs, noted that the company did not rely on private messages in developing the new model and steered clear of data from sources like LinkedIn because of the abundance of personal details they contain.

“We’ve tried to exclude datasets that have a heavy preponderance of personal information,” said Clegg.

Meta says its new AI will offer multiple functionalities, with users able to interact with it via text, voice, and gestures. On the technical side, Meta AI leverages the company’s Llama 2 large language model (LLM) and a new image generator model dubbed Emu.

While guardrails protecting user privacy have received significant attention from Meta, the new AI product does not live up to the same expectations in terms of copyright protection. Clegg revealed in an interview with Reuters that the company is bracing itself for a wave of litigation over copyright claims.

Clegg cited the lack of clarity over whether creative content is covered by the existing fair use doctrine, stating that the matter will most likely play out in court.

While rivals like OpenAI have invested in a multi-year deal with Shutterstock to use its collection of images in training their AI models, Meta simply points to its terms of service, which prevent users from generating content that violates intellectual property (IP) rights.

Copyright issues threaten AI’s future

AI developers have received flak over their data scraping and handling policies, with creators alleging a brazen infringement of copyright rules. A number of aggrieved creators have dragged leading firms to court seeking compensation for failing to seek consent before using copyrighted materials to train their AI models.

In response to the rising number of cases, the U.S. Copyright Office waded in with a public consultation on the best route to regulate AI and IP matters. Top of the list for the Copyright Office is the need to decide whether AI-generated materials can be copyrighted and whether to establish a new licensing regime.

Apart from the copyright rules, U.S. authorities are keen on introducing comprehensive AI legislation to safeguard national security and protect emerging sectors from bad actors.

For artificial intelligence (AI) to work within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership, allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek’s coverage on this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.
