BusinessInsider.com
April 2024

Big Tech needs to get creative as it runs out of data to train its AI models. Here are some of its wildest solutions.


OpenAI, Meta, Google and other tech companies are considering some creative ideas to find more data to train their AI models.

Big Tech is scouring the internet for new data sources to train its AI models.
  • OpenAI, Meta, Google, and other Big Tech firms train their AI models using online data.
  • But AI models learn so fast that all that data could run out by 2026.
  • So how will AI systems keep learning? Big Tech has some interesting ideas.

More is more when it comes to AI. The more data AI systems are trained on, the more powerful they will be.

But as the AI arms race heats up, tech giants like Meta, Google, and OpenAI face a problem: They're running out of data to train their models.

Many leading AI systems have been trained on the vast supply of online data. But by 2026, all the high-quality data could be exhausted, according to Epoch, an AI research institute.

So major tech companies are searching for new data sources to keep their systems learning. Here's a look at some of the most creative options that tech companies are considering.

Google considered tapping consumer data available in Google Docs, Sheets, and Slides.
Google considered using data from Google Docs, Sheets, and Slides for training its AI systems.

Last summer, Google's legal department began asking employees to broaden the language around the company's use of consumer data, The New York Times reported. Some employees were told that the company wanted to use data from the free consumer versions of Google Docs, Google Sheets, and Google Slides, and even restaurant reviews on Google Maps.

While Google updated its privacy policy in July 2023, the company says it didn't expand the types of data it uses to train AI models.

Splurging on the publishing house Simon & Schuster
Simon & Schuster's New York City headquarters in 2016.

At Meta, the waning supply of usable data concerned executives so much that they met almost daily in March and April last year to brainstorm alternatives, the Times reported.

One idea floated at these meetings was to buy Simon & Schuster. The famed publishing house has worked with authors like Stephen King and Jennifer Weiner and was purchased by private equity firm KKR for $1.62 billion last year.

Other attendees suggested a more budget-friendly option of paying $10 a book to obtain the full licensing rights to new titles.

Generating synthetic data
OpenAI is exploring synthetic data to train its systems.

Synthetic data is data generated by AI systems, and OpenAI has considered it an option for its models.

"As long as you can get over the synthetic data event horizon, where the model is smart enough to make good synthetic data, everything will be fine," OpenAI CEO Sam Altman said at a tech conference last May, according to the Times.

The issue with training AI systems on synthetic data is that it can reinforce some of the mistakes and limitations of AI, the Times reported. To address this, OpenAI is working on a process in which one AI system produces data and another AI system judges it.

Whisper, a speech recognition tool that transcribes YouTube videos

OpenAI has also built Whisper, a speech recognition tool that can transcribe YouTube videos and podcasts. OpenAI's latest large language model, GPT-4, has been trained on over one million hours of YouTube videos transcribed by Whisper.

OpenAI's president, Greg Brockman, was a key developer of Whisper and told the Times that OpenAI relies on "numerous sources" of data for its systems.

Photobucket: A treasure trove of photos from Myspace and Friendster
Photobucket, which hosted photos on Myspace, might be licensing its data to tech companies.

Photobucket was once "the world's top image-hosting site" and accounted for nearly half of the US online photo market, according to Reuters. Part of that was because it hosted photos for early social media sites like Myspace and Friendster.

Its database of pictures might soon be licensed to tech companies for training their AI systems, Reuters reported. Photobucket declined to identify prospective buyers to Reuters.

Read the original article on Business Insider



