By Jeff Fan and Anish Singh Walia
Hugging Face’s Generative AI Services (HUGS) makes deploying and managing LLMs easier and faster. Now, with DigitalOcean’s 1-Click deployment for HUGS on GPU Droplets, you can set up, scale, and optimize LLMs on a cloud infrastructure tailored for high performance. This guide walks you through deploying HUGS on a DigitalOcean GPU Droplet and integrating it with Open WebUI. It also explains why this setup is ideal for seamless, scalable LLM inference.
Set up the Droplet:
Go to DigitalOcean’s Droplets page and create a new GPU Droplet. Under the Choose an Image tab, please select 1-Click Models and use one of the available Hugging Face images.
Access the Console:
Once your Droplet is ready, click on its name in the Droplets section and select Launch Web Console.
Please note the Message of the Day (MOTD): This contains the bearer token and inference endpoint for API access, which you’ll need later.
Hugging Face HUGS will automatically start after the Droplet setup. To verify, check the status of the Caddy service managing the inference API:
sudo systemctl status caddy
[secondary_label Output
● caddy.service - Caddy
Loaded: loaded (/lib/systemd/system/caddy.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/caddy.service.d
└─override.conf
Active: active (running) since Wed 2024-10-30 10:27:10 UTC; 2min 58s ago
Docs: https://caddyserver.com/docs/
Main PID: 8239 (caddy)
Tasks: 17 (limit: 629145)
Memory: 48.8M
CPU: 73ms
CGroup: /system.slice/caddy.service
└─8239 /usr/bin/caddy run --config /etc/caddy/Caddyfile
Allow 5-10 minutes for the model to fully load.
Launch Open WebUI using Docker on another Droplet. Please use the below docker command to run the Open WebUI docker container.
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Once Open WebUI runs, access it at http://<your_droplet_ip>:3000
.
To connect Open WebUI with Hugging Face HUGS:
Open Settings:
Go to Admin:
Set the Inference Endpoint:
/v1
. If a specific port is required, include it, e.g., http://<your_droplet_ip>/v1
.Verify Connection:
hfhgus/Meta-Llama
.With HUGS integrated into Open WebUI, you’re ready to interact with your LLM:
Does DigitalOcean offer object storage?
: sudo docker ps
sudo docker logs <your-container-ID> -f
Ease of Deployment and Simplified Management
Deploying HUGS with DigitalOcean’s one-click setup is straightforward. No need for manual configurations—DigitalOcean and Hugging Face handle the backend, allowing you to focus on scaling.
Optimized Performance for Large-Scale Inference HUGS on DigitalOcean GPUs ensures optimal performance, running LLMs efficiently on GPU hardware without manual tuning.
Scalability and Flexibility DigitalOcean’s infrastructure supports scalable deployments with load balancers for high availability, letting you serve users globally with low latency.
By using Hugging Face HUGS on DigitalOcean GPU Droplets, you not only benefit from high-performance LLM inference but also gain the flexibility to scale and manage the deployment effortlessly. This combination of optimized hardware, scalability, and simplicity makes DigitalOcean an excellent choice for production-level AI workloads.
With HUGS deployed on DigitalOcean’s GPU Droplet and Open WebUI, you can efficiently manage, scale, and optimize LLM inference. This setup eliminates hardware optimization concerns and provides a ready-to-scale solution for delivering fast, reliable responses across multiple regions.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
I’m a Senior Solutions Architect in Munich with a background in DevOps, Cloud, Kubernetes and GenAI. I help bridge the gap for those new to the cloud and build lasting relationships. Curious about cloud or SaaS? Let’s connect over a virtual coffee! ☕
Helping Businesses stand out with AI, SEO, & Technical content that drives Impact & Growth | Senior Technical Writer @ DigitalOcean | 2x Medium Top Writers | 2 Million+ monthly views & 34K Subscribers | Ex Cloud Engineer @ AMEX | Ex SRE(DevOps) @ NUTANIX
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!
Jump in! Follow our easy steps to deploy Hugging Face HUGS on DigitalOcean GPU Droplets and bring your AI models to life.
Get paid to write technical tutorials and select a tech-focused charity to receive a matching donation.
Full documentation for every DigitalOcean product.
The Wave has everything you need to know about building a business, from raising funding to marketing your product.
Stay up to date by signing up for DigitalOcean’s Infrastructure as a Newsletter.
New accounts only. By submitting your email you agree to our Privacy Policy
Scale up as you grow — whether you're running one virtual machine or ten thousand.
Sign up and get $200 in credit for your first 60 days with DigitalOcean.*
*This promotional offer applies to new accounts only.