Sr Technical Writer

With the increasing demand for multilingual communication, real-time audio translation is rapidly gaining attention. In this tutorial, you will learn to deploy a real-time audio translation application using OpenAI APIs on Open WebUI, all hosted on a powerful GPU Droplet from DigitalOcean.
DigitalOcean’s GPU Droplets, powered by NVIDIA H100 GPUs, offer significant performance for AI workloads, making them ideal for fast and efficient real-time audio translation. Let’s get started.
1. Create a New Project - Create a new project from the cloud control panel and tie it to a GPU Droplet.
2. Create a GPU Droplet - Log in to your DigitalOcean account, create a new GPU Droplet, and choose AI/ML Ready as the OS. This OS image installs all the necessary NVIDIA GPU drivers. You can refer to our official documentation on how to create a GPU Droplet.

3. Add an SSH Key for authentication - An SSH key is required to authenticate with the GPU Droplet. Once the key is added, you can log in to the GPU Droplet from your terminal.

4. Finalize and Create the GPU Droplet - Once all of the above steps are completed, finalize and create a new GPU Droplet.

Open WebUI is a web interface that allows users to interact with large language models (LLMs). It’s designed to be user-friendly, extensible, and self-hosted, and can run offline. Its interface is similar to ChatGPT, and it can be used with a variety of LLM runners, including Ollama and OpenAI-compatible APIs.
There are three ways you can deploy Open WebUI; this tutorial uses Docker.
You will run Open WebUI as a Docker container on the GPU Droplet with NVIDIA GPU support. To learn how to deploy Open WebUI using other methods, see the Open WebUI quick start guide.
Once the GPU Droplet is ready and deployed, SSH into it from your terminal:
ssh root@<your-droplet-ip>
This Ubuntu AI/ML Ready H100x1 GPU Droplet comes with Docker pre-installed.
You can verify the Docker version with the following command:
docker --version
Output
Docker version 24.0.7, build 24.0.7-0ubuntu2~22.04.1
Next, run the following command to verify that Docker has access to your GPU:
docker run --rm --gpus all nvidia/cuda:12.2.0-runtime-ubuntu22.04 nvidia-smi
This command pulls the nvidia/cuda:12.2.0-runtime-ubuntu22.04 image if it is not already present locally, and then starts a container from it.
Inside the container, it runs nvidia-smi to confirm that the container has GPU access and can interact with the underlying GPU hardware. Once nvidia-smi has executed, the --rm flag ensures the container is automatically removed, as it’s no longer needed.
You should observe the following output:
Output
Unable to find image 'nvidia/cuda:12.2.0-runtime-ubuntu22.04' locally
12.2.0-runtime-ubuntu22.04: Pulling from nvidia/cuda
aece8493d397: Pull complete 
9fe5ccccae45: Pull complete 
8054e9d6e8d6: Pull complete 
bdddd5cb92f6: Pull complete 
5324914b4472: Pull complete 
9a9dd462fc4c: Pull complete 
95eef45e00fa: Pull complete 
e2554c2d377e: Pull complete 
4640d022dbb8: Pull complete 
Digest: sha256:739e0bde7bafdb2ed9057865f53085539f51cbf8bd6bf719f2e114bab321e70e
Status: Downloaded newer image for nvidia/cuda:12.2.0-runtime-ubuntu22.04
==========
== CUDA ==
==========
CUDA Version 12.2.0
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
Thu Nov  7 19:32:18 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06             Driver Version: 535.183.06   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA H100 80GB HBM3          On  | 00000000:00:09.0 Off |                    0 |
| N/A   28C    P0              70W / 700W |      0MiB / 81559MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
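The same check can be scripted. The sketch below is one possible approach and is not part of the original setup: it relies on nvidia-smi -L, which prints one line per GPU, and filters on the "GPU <index>:" prefix so the CUDA banner printed by the image is ignored. The function names count_gpu_lines and gpus_visible_in_container are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the tutorial).

# Count lines of the form "GPU <index>: ..." read from standard input.
# `nvidia-smi -L` prints one such line per GPU, so the count is the number of GPUs.
count_gpu_lines() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence `|| true`.
  grep -c '^GPU [0-9][0-9]*:' || true
}

# Run `nvidia-smi -L` in the same CUDA image used above and count the GPUs.
# This needs Docker with GPU support, so call it only on the GPU Droplet.
gpus_visible_in_container() {
  docker run --rm --gpus all nvidia/cuda:12.2.0-runtime-ubuntu22.04 nvidia-smi -L \
    | count_gpu_lines
}
```

On the single-GPU Droplet used in this tutorial, calling gpus_visible_in_container would be expected to print 1. A result of 0 suggests the container cannot see the GPU.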
Run the following command to start the Open WebUI Docker container:
docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui --gpus all ghcr.io/open-webui/open-webui:main 
The above command runs a Docker container using the open-webui image and sets up specific configurations for network ports, volumes, and GPU access.
docker run -d:
docker run starts a new Docker container. -d runs the container in detached mode, meaning it runs in the background.
-p 3000:8080:
Maps port 8080 inside the container to port 3000 on the host, so the application is reachable at http://localhost:3000 on the host.
-v open-webui:/app/backend/data:
Mounts a Docker volume named open-webui to the /app/backend/data directory inside the container.
--name open-webui:
Names the container open-webui, which makes it easier to reference (e.g., docker stop open-webui to stop the container).
ghcr.io/open-webui/open-webui:main:
ghcr.io/open-webui/open-webui is the name of the image, hosted on GitHub’s container registry (ghcr.io). main is the image tag, often representing the latest stable version or main branch.
--gpus all:
Gives the container access to all GPUs available on the host.
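The named volume and the container name from the flags above are what make the container easy to replace later. As a minimal sketch (the function name update_open_webui is hypothetical, and this is one possible update routine rather than a prescribed one), you could pull a newer image and recreate the container while keeping the data stored in the open-webui volume:

```shell
#!/bin/sh
# Hypothetical helper: recreate the Open WebUI container on a newer image.
# Application data survives because it lives in the named volume "open-webui",
# which `docker rm` does not delete.
update_open_webui() {
  docker pull ghcr.io/open-webui/open-webui:main &&
  docker stop open-webui &&
  docker rm open-webui &&
  docker run -d -p 3000:8080 -v open-webui:/app/backend/data \
    --name open-webui --gpus all ghcr.io/open-webui/open-webui:main
}
```

Each step runs only if the previous one succeeded, so a failed pull leaves the running container untouched.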
Verify that the Open WebUI Docker container is up and running:
docker ps 
Output
CONTAINER ID   IMAGE                                COMMAND           CREATED         STATUS                            PORTS                                       NAMES
4fbe72466797   ghcr.io/open-webui/open-webui:main   "bash start.sh"   5 seconds ago   Up 4 seconds (health: starting)   0.0.0.0:3000->8080/tcp, :::3000->8080/tcp   open-webui
Once the Open WebUI container is up and running, access it at http://<your_gpu_droplet_ip>:3000 in your browser.
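Because the container reports "health: starting" for a short while after launch, the web interface may not respond immediately. The following is a minimal sketch of a generic retry helper you could run on the Droplet; wait_until is a hypothetical name, and the attempt count and delay are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0            # the probe command succeeded
    fi
    i=$((i + 1))
    sleep "$delay"        # pause after each failed probe
  done
  return 1                # every probe failed
}
```

For example, wait_until 30 2 curl -fsS -o /dev/null http://localhost:3000 && echo "Open WebUI is up" probes the forwarded port every 2 seconds for up to 30 attempts.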

In this step, you will add your OpenAI API key to Open WebUI.
Once you are logged in to the Open WebUI dashboard, you should notice that no models are available yet.

To connect Open WebUI with OpenAI and use all the available OpenAI models, follow these steps:
1. Open Settings.
2. Go to Admin.
3. Add the OpenAI API Key.
4. Verify Connection.

Open WebUI then auto-detects all available OpenAI models. Select GPT-4o from the list.
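If no models appear, it can help to check the API key outside Open WebUI. The sketch below queries the OpenAI models endpoint with curl and extracts the model IDs with grep and sed. It assumes your key is exported in an environment variable named OPENAI_API_KEY; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: OPENAI_API_KEY is assumed to hold your key; function names are hypothetical.

# Pull every "id" value out of the JSON on standard input (no jq required).
extract_model_ids() {
  grep -o '"id": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Ask the OpenAI API which models this key can use, one model ID per line.
list_openai_models() {
  curl -fsS https://api.openai.com/v1/models \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" | extract_model_ids
}
```

Running list_openai_models | grep -x 'gpt-4o' is a quick way to confirm that the key can see the model selected above. A failed request points to a problem with the key or the network rather than with Open WebUI.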

Next, configure the audio settings so that speech-to-text (STT) and text-to-speech (TTS) use OpenAI, with the Whisper model handling speech-to-text:

Navigate to Settings -> Audio, configure the STT and TTS settings, and save them.
You can read more about OpenAI text-to-speech and speech-to-text in the OpenAI documentation.
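To try the Whisper model outside Open WebUI, you can call the OpenAI audio endpoints directly. This is a hedged sketch, not a description of how Open WebUI makes its requests: it assumes your key is in OPENAI_API_KEY, and the function names audio_endpoint and whisper_request are hypothetical. The transcriptions endpoint returns text in the spoken language, while the translations endpoint returns English text.

```shell
#!/bin/sh
# Sketch only: OPENAI_API_KEY and the function names are assumptions.

# Map a task name to the OpenAI audio endpoint path.
audio_endpoint() {
  case "$1" in
    transcribe) echo "/v1/audio/transcriptions" ;;  # speech to text, same language
    translate)  echo "/v1/audio/translations" ;;    # speech to English text
    *) echo "unknown task: $1" >&2; return 1 ;;
  esac
}

# Send an audio file to Whisper.
# Usage: whisper_request <transcribe|translate> <audio_file>
whisper_request() {
  endpoint="$(audio_endpoint "$1")" || return 1
  curl -fsS "https://api.openai.com${endpoint}" \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" \
    -F "model=whisper-1" \
    -F "file=@$2"
}
```

For example, whisper_request translate sample.mp3 would send a local file named sample.mp3 (a placeholder name) for translation into English.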
If you’re streaming audio from your local machine to the Droplet, route the audio input through an SSH tunnel.
Open WebUI listens on port 3000 on the GPU Droplet, so once the tunnel is set up you can reach it from your local machine at http://localhost:3000.
This is required to let Open WebUI access the microphone on your local machine for real-time audio translation and real-time language processing. Browsers only grant microphone access to pages served over HTTPS or from localhost, so without the tunnel, clicking the headphone or microphone icon to use GPT-4o results in an error.

Open a new terminal on your local machine and run the following command to create a local SSH tunnel to the GPU Droplet:
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=5 root@<gpu_droplet_ip> -L 3000:localhost:3000
This command opens an SSH connection to your GPU Droplet as the root user and sets up a local port forwarding tunnel. It also includes options to keep the SSH session alive. Here’s a detailed breakdown:
-o ServerAliveInterval=60:
Sets ServerAliveInterval to 60 seconds, meaning that an SSH keep-alive message is sent to the remote server every 60 seconds.
-o ServerAliveCountMax=5:
Sets ServerAliveCountMax to 5, which allows up to 5 missed keep-alive messages before the SSH connection is terminated. Combined with ServerAliveInterval=60, this means the SSH session stays open for 5 minutes (5 × 60 seconds) of no response from the server before closing.
-L 3000:localhost:3000:
Sets up local port forwarding. The first 3000 (before the first colon) is the local port on your machine, where you access the forwarded connection. localhost:3000 (after the first colon) is the destination as seen from the GPU Droplet.
With the tunnel open, you can access Open WebUI at http://localhost:3000 on your local machine and use the microphone for real-time audio translation.
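If you open this tunnel often, the same options can live in your SSH client configuration instead of on the command line. The sketch below prints an equivalent ~/.ssh/config stanza and shows the keep-alive arithmetic from the breakdown above. The host alias gpu-droplet and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the host alias "gpu-droplet" is an illustrative name.

# Seconds of server silence tolerated before ssh gives up:
# ServerAliveInterval * ServerAliveCountMax.
keepalive_timeout() {
  echo $(( $1 * $2 ))
}

# Print an ~/.ssh/config stanza equivalent to the command-line options above.
# Usage: print_tunnel_config <gpu_droplet_ip>
print_tunnel_config() {
  cat <<EOF
Host gpu-droplet
    HostName $1
    User root
    LocalForward 3000 localhost:3000
    ServerAliveInterval 60
    ServerAliveCountMax 5
EOF
}
```

Here keepalive_timeout 60 5 prints 300, matching the 5 minutes described above. After reviewing the output of print_tunnel_config with your Droplet's IP address as the argument, you could append it to ~/.ssh/config and open the tunnel with ssh gpu-droplet.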
Click the headphone or microphone icon to use the Whisper and GPT-4o models for natural language processing tasks.

Clicking the Headphone/Call button opens a voice assistant that uses the OpenAI GPT-4o and Whisper models for real-time audio processing and translation.
You can use it to translate and transcribe audio in real time by talking with the GPT-4o voice assistant.

Deploying real-time audio translation using OpenAI APIs on Open WebUI with DigitalOcean’s GPU Droplets allows developers to create high-performance translation systems. With easy setup and monitoring, DigitalOcean’s platform provides the resources for scalable, efficient AI applications.