The video discusses the benefits of using Open Web UI, an open-source, self-hosted interface for accessing multiple AI models like ChatGPT, Claude, and more from a single platform without subscription fees. The host explains how to set up the system on a virtual private server, highlighting its security and customization features, such as controlling access for family or employees. The video also introduces LiteLLM, a tool that acts as a proxy for connecting to various AI models, allowing users to manage access and budgets efficiently. The host emphasizes the importance of monitoring AI interactions, especially for children, and offers insights into potential cost savings through API usage.
Accessing AI Models Through a Self-Hosted Interface
- The speaker has found a method to access various AI models like ChatGPT, Claude, Gemini, and Grok from a single self-hosted interface without incurring subscription costs.
- This interface allows unlimited usage and immediate access to the latest models, providing a centralized platform for personal and shared use.
- Users have the ability to create accounts for family and employees, offering controlled access to different AI models.
"I have unlimited usage and I get access to the newest models as soon as they come out."
- The speaker emphasizes the benefit of having continuous and unrestricted access to the latest AI models.
"I can create accounts for my employees, for my wife, for my kids, and they can access all the new stuff."
- This quote highlights the platform's capability to extend access to multiple users, ensuring everyone can utilize the latest AI technology.
Parental Controls and Security Features
- The interface allows for restrictions on what children can access, ensuring they can't misuse AI for activities like cheating on homework.
- Parents can monitor their children's AI interactions, promoting responsible usage and learning.
"I can also restrict what they can ask, what they get help with so they're not cheating on their homework."
- The quote illustrates the parental control features that prevent misuse of AI by children.
"I can see all their checks, which really you should be looking at your kids' AI chats."
- This emphasizes the importance of monitoring children's interactions with AI to guide responsible usage.
Open Web UI: An Open Source Solution
- Open Web UI is an open-source, self-hosted web interface for using various large language models (LLMs).
- It supports both cloud-based AI models and self-hosted models like Llama 3, Mistral, and DeepSeek.
"Open Web UI. It's an open source, self-hosted web interface for AI and it allows you to use whatever LLM or large language model you want to use."
- The quote provides a clear description of Open Web UI's capabilities and flexibility in using different AI models.
"We can run self-hosted models of the alama talking like Llama three and Myre and Deep Seek."
- This highlights the platform's support for running multiple self-hosted AI models simultaneously.
Setup Options: Cloud vs. On-Prem
- Users have the option to set up the interface either in the cloud or on-premises, with each method being quick and straightforward.
- Cloud setup involves using a Virtual Private Server (VPS), while on-premises setup can be done on devices like laptops or Raspberry Pi.
"Either the cloud, this is the easiest and fastest method or you can go on prem, host it in your house."
- The quote explains the two main options for setting up the Open Web UI, catering to different preferences and needs.
"We'll start with the cloud. Don't blink. It's going to be fast."
- This emphasizes the simplicity and speed of setting up the interface using cloud services.
VPS Setup and Configuration
- The process for setting up a VPS includes selecting a plan, configuring the server, and installing necessary applications like Open Web UI and Llama.
- The speaker provides a step-by-step guide for choosing a VPS plan and configuring it for optimal performance.
"We'll be setting up what's called a VPS or a virtual private server and the cloud and we'll be setting it up on hosting."
- This introduces the VPS setup process as a primary method for implementing the Open Web UI.
"A MD, epic, CPU eight gigs of RAM and VME storage. Plenty of bandwidth and my favorite feature for all you home laborers, backups and snapshots."
- The quote describes the specifications and features of the recommended VPS setup for running AI applications.
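The video installs Open Web UI through the host's app catalog; doing the same by hand is typically a single Docker container. A minimal sketch of a compose file, following the image name and port conventions in the Open Web UI docs (adjust the published port to taste):

```yaml
# docker-compose.yml — minimal Open Web UI, chat data persisted in a named volume
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"        # UI reachable at http://<server-ip>:3000
    volumes:
      - open-webui:/app/backend/data
    restart: always
volumes:
  open-webui:
```

`docker compose up -d` then brings the interface online at the server's public IP, matching the "manage app" result described below.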
Conclusion and Final Steps
- Once the VPS is set up, users can manage and access the Open Web UI through a public IP address.
- The speaker directs viewers to additional resources for on-premises setup and encourages experimentation with the platform's features.
"Right now open Web UI is just waiting for us. Click on the manage app button right there."
- This quote indicates the final steps in accessing and managing the Open Web UI after setup.
"For on-prem, go watch this video right here. I'll walk you through it."
- The speaker provides guidance for users interested in setting up the interface on-premises, suggesting additional resources for assistance.
Setting Up Open Web UI and the Initial AI Model
- Begin by creating an admin account for Open Web UI, granting full control over the system.
- The default AI model provided is Llama 3.2 1B, a local model that runs on your server's resources instead of OpenAI's.
"This first account will be your admin account, so you have godlike powers over everything."
- The admin account provides comprehensive control over the Open Web UI environment.
"Llama 3.2 1B, as opposed to an OpenAI model like ChatGPT. Llama 3.2 is a local model. It'll use your server's resources instead of OpenAI's."
- Llama 3.2 1B is a local model that uses your own server's resources, offering a slower but cost-effective alternative to cloud-based models.
Accessing AI Models and API Options
- There are two main options for accessing AI models like ChatGPT or Claude: subscription plans (normie mode) or APIs.
- Subscription plans involve a fixed monthly fee, whereas APIs charge based on usage, offering potential cost savings.
"Option one, normie mode, you go out to chat GBT, you pay a monthly plan, pay a lot. If you want to access to all the new stuff and that's it, you're done."
- Subscription plans offer straightforward access but can be costly for premium features.
"APIs you pay as you go or you pay for what you use."
- APIs provide flexible access to AI models with a pay-as-you-go pricing model, potentially saving money for light users.
Setting Up API Access and Utilizing Open Web UI
- To use APIs, sign up at openai.com/api, add a credit card, and create an API key for integration with Open Web UI.
- This API key unlocks access to various AI models, including the latest versions like GPT-4.5.
"You'll go to the top right and click on start building. And here, yeah, it's going to ask you for a credit card, but you're not going to be charged per month."
- Setting up API access involves adding a credit card for usage-based billing, not a monthly subscription.
"We'll create a new secret key. Name it, put it in the default project, leave everything else as is and click on create secret key."
- Creating a secret API key is crucial for enabling AI model access within Open Web UI.
Understanding Token-Based Charging
- AI interactions are charged based on tokens, which are units representing words or parts of words.
- Different models have varying costs per million tokens, with more complex models being more expensive.
"The way they charge us is by tokens. It's like Chuck E Cheese just without crappy pizza and a scary mouse."
- Tokens are the currency for AI interactions, similar to tokens in an arcade setting.
"For the oh three mini model, which is a solid model, it's going to cost you a dollar and 10 cents per 1 million tokens."
- Token costs vary by model, with simpler models being cheaper and more advanced models costing more.
Cost Considerations and Usage Scenarios
- Usage scenarios range from casual to power users, with costs varying accordingly.
- Understanding typical usage patterns helps estimate potential expenses and savings.
"A casual user, let's say they have 50 conversations a month, about a thousand tokens each. It could be as low as 50 cents assuming they're using a model like the 4.0."
- Casual users with minimal interactions can incur very low costs using token-based pricing.
"Power users, and keep in mind, these are all very rough estimates. This can be sky's a limit, right? 20 bucks to infinite."
- Power users with extensive AI interactions may face higher costs, emphasizing the importance of understanding usage patterns.
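The arithmetic behind these estimates is simple enough to sketch. The function below is illustrative only; the per-million-token price is passed in because rates change often, and it ignores output-token pricing, which is usually higher than input:

```python
def monthly_cost(conversations: int, tokens_per_conversation: int,
                 price_per_million: float) -> float:
    """Rough monthly API bill: total tokens times the per-million-token rate."""
    total_tokens = conversations * tokens_per_conversation
    return total_tokens / 1_000_000 * price_per_million

# Casual user from the example: 50 conversations of ~1,000 tokens each,
# priced at the $1.10 per 1M tokens quoted for o3-mini.
print(f"${monthly_cost(50, 1_000, 1.10):.2f} per month")
```

Even at these rates a casual user stays far below a typical $20/month subscription, which is where the API savings come from.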
Pricing and Usage of AI Models
- The cost of using AI models varies based on the models chosen and the length of interactions.
- OpenAI's pricing is influenced by the number of tokens used in conversations, which increases with longer interactions.
- OpenAI offers cached input to help offset costs, typically retaining data for 24 hours.
"I can tell you right now, me as a power user, it would not be 20 bucks a month. It'd be a lot more. What impacts that? Well, what models you choose?"
- The speaker emphasizes the variability of AI usage costs, highlighting the impact of model choice and interaction length on pricing.
"The number of tokens I'm using exponentially grows with the length of my conversation and sometimes I sit there and talk for a while with an AI to figure stuff out."
- Token usage, and consequently costs, increase with longer AI interactions, affecting total expenses.
"OpenAI...do have cached input which will help offset a lot of those costs."
- Cached input can reduce costs by storing data temporarily, although it is not a primary cost-saving strategy.
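The growth the speaker describes comes from the API being stateless: each request resends the entire conversation so far, so billed input tokens grow much faster than the number of messages. A minimal sketch, simplified to ignore the model's replies (which also join the history):

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens billed across a conversation, assuming each
    request resends the full history (no prompt caching)."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # the new message joins the history
        total += history            # the entire history is sent as input
    return total

# 10 turns of 500 tokens each bills 27,500 input tokens, not 5,000.
print(cumulative_input_tokens(10, 500))
```

For n turns of k tokens each this sums to k·n(n+1)/2, which is why long back-and-forth sessions cost disproportionately more and why cached input helps offset the repeated prefix.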
Budget Management in AI Usage
- Users can manage AI usage costs by setting budgets per person to prevent exceeding set limits.
- The speaker suggests focusing on providing access to AI rather than solely aiming to save money.
"You can put a budget in per person so they don't go over like you're stuck at 20 bucks a month."
- Budgeting tools allow users to control individual spending on AI usage, preventing overspending.
"For me, it's more about I want to give my family myself and my employees access to all the ai and I don't want to pay for 15 million plans."
- The speaker prioritizes centralized access to AI over managing multiple plans, emphasizing convenience and control.
Limitations of Open Web UI
- Open Web UI offers limited connection options, primarily supporting the OpenAI API and the Ollama API.
- Users cannot directly connect to other AI models like Claude or Anthropic through Open Web UI.
"I really only have options for two types of connections. Open ai, API and oh llama. API. What about clo? What about Gemini?"
- The speaker highlights the limitation of Open Web UI in connecting to multiple AI models, expressing a desire for broader access.
Introduction to LiteLLM
- LiteLLM serves as a proxy or gateway for connecting to various AI models beyond those supported by Open Web UI.
- It facilitates connections to over a hundred AI models, including Claude, Gemini, Grok, and others.
"LiteLLM is a proxy for AI, or a gateway. If we go to the webpage real quick, they connect to so many AIs. I think they say a hundred plus."
- LiteLLM enhances AI model connectivity, acting as a bridge to a wide range of models.
"With LiteLLM, we connect everything else: OpenAI, Anthropic, which is Claude, Gemini, Grok, DeepSeek."
- The proxy server allows seamless integration with diverse AI models, expanding user options.
Installation and Configuration of LiteLLM
- The installation of LiteLLM involves cloning a repository and configuring environment variables.
- Users must generate and store secure keys for authentication and encryption purposes.
"git clone and then the address of the LiteLLM repo. Ready, set, clone. This will clone that repo from GitHub and create a folder for us."
- The installation process begins with cloning the LiteLLM repository from GitHub.
"We'll add the LiteLLM salt key just like this and have that equal the same kind of starting point: sk- and then a randomly generated string of characters."
- Secure keys are essential for encrypting and decrypting API credentials, ensuring secure connections.
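Concretely, both secrets end up in the `.env` file inside the cloned folder. A sketch — the variable names follow the LiteLLM docs, and the values are placeholders you generate yourself:

```
# .env — both keys start with "sk-" followed by a random string
LITELLM_MASTER_KEY="sk-<random-string>"   # admin credential for the proxy
LITELLM_SALT_KEY="sk-<random-string>"     # encrypts stored provider keys; don't change it after setup
```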
Setting Up API Keys for Various AI Models
- Users need to create and manage API keys for different AI models to enable connectivity through LiteLLM.
- The process involves generating keys for each desired AI service, such as OpenAI, Anthropic, and Grok.
"First we need our open AI API Key. Easy for me to say. I normally like to create a new key for every service."
- Generating unique API keys for each service ensures secure and organized management of AI connections.
"The same process you can repeat for anthropic, for the Claude models Gemini, for the Google based models."
- The speaker outlines a repeatable process for obtaining API keys across various AI platforms, facilitating integration.
Setting Up Models and API Keys
- Begin by navigating to the models section to add a new model.
- Select specific models, such as Claude's versions 3.7 and 2.1, for comparison.
- Add your API key to integrate the model into the system.
"Let's start with Claude. So I want to click on Anthropic and we could either choose all models, like just go crazy, select them all or be very specific."
- The process involves adding specific models and comparing their functionalities.
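Models added through the LiteLLM UI can equivalently be declared in its `config.yaml`. A sketch with assumed model identifiers (check each provider's current model list):

```yaml
model_list:
  - model_name: claude-3-7-sonnet        # the name Open Web UI will display
    litellm_params:
      model: anthropic/claude-3-7-sonnet-20250219   # assumed identifier
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
```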
Creating Virtual API Keys
- Virtual API keys allow for controlled access to different models.
- You can create a new key, assign it ownership, and set access permissions.
- Additional settings include budget limits and expiration options.
"We'll create a new key for now. We'll say it's owned by us, we don't need a team or anything."
- Virtual keys provide a way to manage and control model access and usage efficiently.
Integrating with Open Web UI
- The API key is added to the Open Web UI for model integration.
- Verify the connection to ensure the models are accessible via the UI.
- Multiple models, including Claude and others, can be added for simultaneous use.
"Now I'm going to add my light. L-L-M-A-P-I Key under the open AI API key."
- Proper integration into Open Web UI allows for streamlined interaction with various AI models.
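For reference, the same connection can be set through Open Web UI's environment variables rather than the admin screen; the names follow the Open Web UI docs, while the host and key are placeholders:

```
OPENAI_API_BASE_URL=http://<litellm-host>:4000/v1   # LiteLLM's default proxy port
OPENAI_API_KEY=sk-<litellm-virtual-key>             # the virtual key created above
```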
Managing Model Access and Usage
- Users can be grouped, and permissions can be assigned based on group access.
- Specific models can be restricted to certain user groups, such as children.
- System prompts can guide usage, especially for educational purposes.
"Let's create a group, call it kids. I'll go back to overview and create some users here, kid one and kid two."
- Group and permission management ensure that model access is controlled and aligned with user needs.
Educational Use and Guardrails
- Models can be configured to assist with educational tasks without providing direct answers.
- Guardrails ensure that models guide students rather than completing assignments for them.
"You are a school helper. Your job is to help my kids, help kids with their school, but you cannot do their work for them."
- Educational configurations promote learning by guiding students through problem-solving.
Monitoring and Privacy
- User interactions with models can be monitored, allowing for oversight.
- Monitoring can be disabled for specific users to maintain privacy.
"I can click on chats and there it is, and I can jump right in there and see everything that was said."
- Monitoring provides a way to ensure appropriate use while respecting privacy settings.
Future Enhancements and DNS Setup
- Plans for a DNS setup to replace IP addresses with friendly domain names.
- This setup will be covered in a separate video to enhance accessibility.
"I'm going to walk you through how to set up a DNS name. We'll purchase it on hosting here."
- DNS setup aims to simplify access to AI servers by using memorable domain names.