Coding, April 9, 2024

Google’s Gemini 1.5 Pro enters public preview on Vertex AI

Gemini 1.5 Pro, Google’s most capable generative AI model, is now available in public preview on Vertex AI, Google’s enterprise-focused AI development platform. The company announced the news during its annual Cloud Next conference, which is taking place in Las Vegas this week.

Gemini 1.5 Pro launched in February, joining Google’s Gemini family of generative AI models. Its headline feature is undoubtedly the amount of context it can process: from 128,000 tokens up to 1 million tokens, where “tokens” refers to subdivided bits of raw data (like the syllables “fan,” “tas” and “tic” in the word “fantastic”).

One million tokens is equivalent to around 700,000 words or around 30,000 lines of code. It’s about five times the 200,000 tokens that Anthropic’s flagship model, Claude 3, can take as input and about eight times the 128,000-token maximum context of OpenAI’s GPT-4 Turbo.
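
To make those equivalences concrete, the conversion can be sketched in a few lines of Python. The ratios below are derived only from the figures quoted above (1 million tokens for roughly 700,000 words or roughly 30,000 lines of code); they are back-of-the-envelope assumptions rather than the behavior of Gemini's actual tokenizer, and the function names are hypothetical.

```python
# Rough sizing based solely on the article's quoted equivalences.
# These ratios are assumptions for illustration, not a real tokenizer.
CONTEXT_TOKENS = 1_000_000
WORDS_PER_CONTEXT = 700_000   # ~700,000 words per 1M tokens
LINES_PER_CONTEXT = 30_000    # ~30,000 lines of code per 1M tokens


def estimate_tokens_from_words(word_count: int) -> int:
    """Estimate the token count of prose from its word count."""
    return round(word_count * CONTEXT_TOKENS / WORDS_PER_CONTEXT)


def estimate_tokens_from_lines(line_count: int) -> int:
    """Estimate the token count of source code from its line count."""
    return round(line_count * CONTEXT_TOKENS / LINES_PER_CONTEXT)


def fits_in_context(estimated_tokens: int, limit: int = CONTEXT_TOKENS) -> bool:
    """Return True if the estimate is within the context window."""
    return estimated_tokens <= limit


if __name__ == "__main__":
    novel = estimate_tokens_from_words(90_000)   # a typical-length novel
    repo = estimate_tokens_from_lines(45_000)    # a mid-sized codebase
    print(novel, fits_in_context(novel))
    print(repo, fits_in_context(repo))
```

Under these assumptions a 90,000-word manuscript uses only a fraction of the window, while a 45,000-line codebase would exceed it and need to be split or filtered first.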

A model’s context, or context window, refers to the initial set of data (e.g. text) the model considers before generating output (e.g. additional text). A simple question — “Who won the 2020 U.S. presidential election?” — can serve as context, as can a movie script, email, essay or e-book.

Models with small context windows tend to “forget” the content of even very recent conversations, leading them to veer off topic. Models with large contexts are less prone to this. And, as an added upside, large-context models can better grasp the narrative flow of the data they take in, generate contextually richer responses and reduce the need for fine-tuning and factual grounding, hypothetically at least.
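
The “forgetting” described above can be illustrated with a toy sketch. It assumes a chat application that keeps only the most recent messages that fit a fixed token budget, and it counts whitespace-separated words as a crude stand-in for tokens. Both the truncation policy and the `trim_history` helper are hypothetical, not how any particular model or vendor handles context.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude proxy (assumption): one whitespace-separated word = one token."""
    return len(text.split())


def trim_history(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages whose combined size fits within the budget."""
    kept: List[str] = []
    used = 0
    # Walk backwards from the newest message and stop at the first
    # message that would overflow the budget.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    chat = [
        "My name is Ada and I am planning a trip to Lisbon",
        "What is the weather like there in May",
        "Also suggest three museums worth a visit",
    ]
    # Small window: the opening message (name and destination) is dropped.
    print(trim_history(chat, budget=16))
    # Large window: the whole conversation survives.
    print(trim_history(chat, budget=100))
```

With the small budget, the message naming the destination falls out of the window, so any later answer about “there” has nothing to refer to. With the larger budget, nothing is dropped.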

So what specifically can one do with a 1 million-token context window? Lots of things, Google promises, like analyzing a code library, “reasoning across” lengthy documents and holding long conversations with a chatbot.

Gemini 1.5 Pro is multilingual and also multimodal, in the sense that it’s able to understand images and videos and, as of Tuesday, audio streams in addition to text. As a result, the model can also analyze and compare content in media like TV shows, movies, radio broadcasts, conference call recordings and more across different languages. One million tokens translates to about an hour of video or around 11 hours of audio.
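
Those media figures imply rough per-second token rates. The sketch below derives them purely from the numbers above (1 million tokens for about one hour of video or about 11 hours of audio) and checks whether a mixed workload would fit in the window. The rates are approximations for illustration, not published Gemini tokenization details, and `media_tokens` is a hypothetical helper.

```python
# Per-second rates implied by the article's figures (assumptions, not
# official numbers): 1M tokens ~ 1 hour of video ~ 11 hours of audio.
CONTEXT_TOKENS = 1_000_000
VIDEO_TOKENS_PER_SECOND = CONTEXT_TOKENS / (1 * 3600)    # roughly 278
AUDIO_TOKENS_PER_SECOND = CONTEXT_TOKENS / (11 * 3600)   # roughly 25


def media_tokens(video_minutes: float = 0.0, audio_minutes: float = 0.0) -> int:
    """Estimate the tokens consumed by a mix of video and audio."""
    total = (video_minutes * 60 * VIDEO_TOKENS_PER_SECOND
             + audio_minutes * 60 * AUDIO_TOKENS_PER_SECOND)
    return round(total)


if __name__ == "__main__":
    # Example workload: a 20-minute clip plus a 2-hour call recording.
    needed = media_tokens(video_minutes=20, audio_minutes=120)
    print(needed, needed <= CONTEXT_TOKENS)
```

By this estimate, video is about 11 times more expensive per second than audio, so a workload dominated by recordings of calls or broadcasts leaves far more headroom than one dominated by footage.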

Thanks to its audio-processing capabilities, Gemini 1.5 Pro can generate transcriptions for video clips, as well, although the jury’s out on the quality of those transcriptions.

In a pre-recorded demo earlier this year, Google showed Gemini 1.5 Pro searching the transcript of the Apollo 11 moon landing telecast (which comes to about 400 pages) for quotes containing jokes, and then finding a scene in movie footage that looked similar to a pencil sketch.

Google says that early users of Gemini 1.5 Pro — including United Wholesale Mortgage, TBS and Replit — are leveraging the large context window for tasks spanning mortgage underwriting; automating metadata tagging on media archives; and generating, explaining and transforming code.

Gemini 1.5 Pro doesn’t process a million tokens at the snap of a finger. In the aforementioned demos, each search took between 20 seconds and a minute to complete — far longer than the average ChatGPT query.

Google previously said that latency is an area of focus, though, and that it’s working to “optimize” Gemini 1.5 Pro as time goes on.

Of note, Gemini 1.5 Pro is slowly making its way to other parts of Google’s corporate product ecosystem, with the company announcing Tuesday that the model (in private preview) will power new features in Code Assist, Google’s generative AI coding assistance tool. Developers can now perform “large-scale” changes across codebases, Google says, such as updating cross-file dependencies and reviewing large chunks of code.

