Kodlama 3 Nisan 2024

OctoAI wants to make private AI model deployments easier with OctoStack

OctoAI wants to make private AI model deployments easier with OctoStack

OctoAI (formerly known as OctoML), announced the launch of OctoStack, its new end-to-end solution for deploying generative AI models in a company’s private cloud, be that on-premises or in a virtual private cloud from one of the major vendors, including AWS, Google, Microsoft and Azure, as well as CoreWeave, Lambda Labs, Snowflake and others.

In its early days, OctoAI focused almost exclusively on optimizing models to run more effectively. Based on the Apache TVM machine learning compiler framework, the company then launched its TVM-as-a-Service platform and, over time, expanded that into a fully fledged model-serving offering that combined its optimization chops with a DevOps platform. With the rise of generative AI, the team then launched the fully managed OctoAI platform to help its users serve and fine-tune existing models. OctoStack, at its core, is that OctoAI platform, but for private deployments.

OctoAI wants to make private AI model deployments easier with OctoStack

Image Credits: OctoAI

OctoAI CEO and co-founder Luis Ceze told me the company has over 25,000 developers on the platform and hundreds of paying customers who use it in production. A lot of these companies, Ceze said, are GenAI-native companies. The market of traditional enterprises wanting to adopt generative AI is significantly larger, though, so it’s maybe no surprise that OctoAI is now going after them as well with OctoStack.

“One thing that became clear is that, as the enterprise market is going from experimentation last year to deployments, one, all of them are looking around because they’re nervous about sending data over an API,” Ceze said. “Two: a lot of them have also committed their own compute, so why am I going to buy an API when I already have my own compute? And three, no matter what certifications you get and how big of a name you have, they feel like their AI is precious like their data and they don’t want to send it over. So there’s this really clear need in the enterprise to have the deployment under your control.”

Ceze noted that the team had been building out the architecture to offer both its SaaS and hosted platform for a while now. And while the SaaS platform is optimized for Nvidia hardware, OctoStack can support a far wider range of hardware, including AMD GPUs and AWS’s Inferentia accelerator, which in turn makes the optimization challenge quite a bit harder (while also playing to OctoAI’s strengths).

Deploying OctoStack should be straightforward for most enterprises, as OctoAI delivers the platform with read-to-go containers and their associated Helm charts for deployments. For developers, the API remains the same, no matter whether they are targeting the SaaS product or OctoAI in their private cloud.

The canonical enterprise use case remains using text summarization and RAG to allow users to chat with their internal documents, but some companies are also fine-tuning these models on their internal code bases to run their own code generation models (similar to what GitHub now offers to Copilot Enterprise users).

For many enterprises, being able to do that in a secure environment that is strictly under their control is what now enables them to put these technologies into production for their employees and customers.

“For our performance- and security-sensitive use case, it is imperative that the models which process calls data run in an environment that offers flexibility, scale and security,” said Dali Kaafar, founder and CEO at Apate AI. “OctoStack lets us easily and efficiently run the customized models we need, within environments that we choose, and deliver the scale our customers require.”

source

Spread the love <3

You may also like...

May
11
2024
0
Ex-NHS doctor stuck in Gaza says family are &#039;terrified&#039; but he &#039;has to keep going&#039;

Ex-NHS doctor stuck in Gaza says family are 'terrified' but he 'has to keep going'

The European Hospital, in southern Gaza, is a dangerous place to be.  The facility is the only remaining hospital east...

Spread the love <3
Kas
10
2024
0

Snapdragon 8 Elite’li Honor Magic 7 serisi rekora doymuyor!

Honor‘un ekim ayında kullanıcıların beğenisine sunduğu Magic 7 serisi gerek özellikleriyle gerekse tasarımıyla dikkatleri üzerine çekiyor. 8 Kasım‘da satışa açılan...

Spread the love <3
Kas
01
2024
0

Steam Türkiye’de Call of Duty rüzgarı! En çok satan oyunlar açıklandı

Milyonlarca kullanıcısı ile en büyük dijital oyun mağazası olan Steam, her hafta olduğu gibi geçen hafta için de en çok satan...

Spread the love <3
May
01
2024
0
Amazon CodeWhisperer is now called Q Developer and is expanding its functions

Amazon CodeWhisperer is now called Q Developer and is expanding its functions

Pour one out for CodeWhisperer, Amazon’s AI-powered assistive coding tool. As of today, it’s kaput — sort of. CodeWhisperer is...

Spread the love <3
Whatsapp İletişim
Merhaba,
Size nasıl yardımcı olabilirim ?