Hugging Face launches an open source tool for affordable AI deployment

An AI face in profile against a digital background.
(Image credit: Shutterstock / Ryzhi)

Hugging Face has introduced its latest offering, Hugging Face Generative AI Services (HUGS), aimed at simplifying the deployment and scaling of generative AI applications using open-source models.

Built on Hugging Face technologies such as Transformers and Text Generation Inference (TGI), HUGS promises optimized performance across various hardware accelerators.

For developers using AWS or Google Cloud, the service is available at $1 per hour per container, with a five-day free trial on AWS to help users get started.

Streamlining AI with zero-configuration inference

HUGS offers developers a solution to run AI models on their own infrastructure without the need for manual configuration. One of the primary challenges when deploying large language models (LLMs) is optimizing them for specific hardware environments. Each accelerator, whether it is an NVIDIA GPU or an AMD GPU, requires fine-tuning to extract maximum performance.

With HUGS, these optimizations are managed automatically, delivering high throughput out of the box. In addition to NVIDIA and AMD GPUs, the company promises that its support will soon extend to AWS Inferentia and Google TPUs.

Hugging Face aims to ease the transition from black-box APIs to open, self-hosted solutions with support for a wide array of models, including well-known LLMs like Llama and Gemma, with plans to introduce multimodal models such as Idefics and Llava soon. In the future, the company says it will include embedding models like BGE and Jina, giving developers even more options to customize their AI applications.

This service uses standardized APIs compatible with OpenAI’s model interfaces, therefore, developers can migrate their own code.

For startups in particular, HUGS provides an opportunity to build AI applications without incurring the high costs associated with proprietary platforms. The availability of one-click deployments on DigitalOcean makes it even easier for small teams to experiment with generative AI technologies.

Meanwhile, larger enterprises can leverage HUGS to scale their applications without being locked into a single cloud provider or proprietary API. On DigitalOcean, HUGS is included at no extra charge beyond the standard cost of GPU Droplets. Hugging Face also offers custom deployment solutions for enterprises through its Enterprise Hub.

You might also like

Efosa Udinmwen
Freelance Journalist

Efosa has been writing about technology for over 7 years, initially driven by curiosity but now fueled by a strong passion for the field. He holds both a Master's and a PhD in sciences, which provided him with a solid foundation in analytical thinking. Efosa developed a keen interest in technology policy, specifically exploring the intersection of privacy, security, and politics. His research delves into how technological advancements influence regulatory frameworks and societal norms, particularly concerning data protection and cybersecurity. Upon joining TechRadar Pro, in addition to privacy and technology policy, he is also focused on B2B security products. Efosa can be contacted at this email: udinmwenefosa@gmail.com

Read more
Hugging Face
What is Hugging Face? Everything we know about the ML platform
DeepSeek
Nvidia out? DeepSeek pairs with banned Chinese tech giant to deliver unbelievably low pricing on AI inference which could cause Nvidia's house of cards to come crashing
A person using DeepSeek on their smartphone
DeepSeek R1 is now available on Nvidia, AWS, and Github as available models on Hugging Face shoot past 3,000
The Google Gemini logo against a black background.
What is Google AI Studio? Everything we know about Google's AI builder
HeyGen
What is HeyGen? Everything we know about the AI video platform
A profile of a human brain against a digital background.
Navigating the rising costs of AI inference in the era of large-scale applications
Latest in Pro
Microsoft
"Another pair of eyes" - Microsoft launches all-new Security Copilot Agents to give security teams the upper hand
Lock on Laptop Screen
Medusa ransomware is able to disable anti-malware tools, so be on your guard
AI quantization
What is AI quantization?
US flags
US government IT contracts set to be centralized in new Trump order
An abstract image of digital security.
Fake file converters are stealing info, pushing ransomware, FBI warns
Google Gemini AI
Gmail is adding a new Gemini AI tool to help smarten up your work emails
Latest in News
Girl wearing Meta Quest 3 headset interacting with a jungle playset
Latest Meta Quest 3 software beta teases a major design overhaul and VR screen sharing – and I need these updates now
Microsoft
"Another pair of eyes" - Microsoft launches all-new Security Copilot Agents to give security teams the upper hand
Hatch Restore 3 in Putty
You can finally start your day with The Office theme song, and I couldn't be more excited
Cassian Andor looking nervously over his shoulder in Andor season 2
New Andor season 2 trailer has got Star Wars fans asking the same question – and it includes an ominous call back to Rogue One's official teaser
Ncuti Gatwa as The Fifteenth Doctor in Doctor Who
Disney+ drops new trailer for Doctor Who season 2 that promises an epic adventure across time and space
23andMe
23andMe is bankrupt and about to sell your DNA, here's how to stop that from happening