Looking for easy ways to install and run them, via the video Run Local LLMs on Hardware from $50 to $50,000 - We Test and Compare!
ollama list
ollama pull llama3.1:latest (~4 GB - the download went at nearly 4 MBps on the Airtel 40 Mbps connection, and around 6 MBps on Airtel 5G - 12 minutes)
ollama pull llama3.2:latest (~2 GB, and runs faster on low-end hardware)
(Run as ollama run llama3.1:latest --verbose to get tokens/sec.)
With llama3.1, it was going at ~4 tokens per second on the desktop - on the CPU, I think. With llama3.2, it went at 23 tokens/sec - quite an acceptable speed. On the Windows laptop, running on CPU, llama3.2 managed 9 tokens/sec.
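As a side note (my addition, not from the video): ollama also serves a local HTTP API on port 11434 by default, so the same models can be queried with curl or from a script. A minimal sketch, assuming the default port and that llama3.2 has already been pulled:
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
The JSON response includes eval_count and eval_duration fields, from which tokens/sec can also be worked out.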
For a web UI, via the video Set up a Local AI like ChatGPT on your own machine!
Wanted to try the Docker route - installed Docker via https://docs.docker.com/engine/install/ubuntu/ by setting up Docker's apt repository. All my docker commands needed to be prefixed with sudo, since I didn't go through the post-install steps.
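(For the record, the post-install steps from the Docker docs that make sudo unnecessary are roughly:
sudo groupadd docker
sudo usermod -aG docker $USER
and then log out and log back in, or run newgrp docker in the current shell.)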
Trying to run open-webui with the command line given in the YouTube video:
sudo docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
That didn't work for me. The Open WebUI documentation has the steps for the :main image (without bundled ollama):
sudo docker pull ghcr.io/open-webui/open-webui:main
For Nvidia GPU support, add --gpus all to the docker run command:
Single user mode -
sudo docker run -d -p 3000:8080 --gpus all -e WEBUI_AUTH=False -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
That also failed, with "Unable to load models". Then found, via How to Install Ollama, Docker, and Open WebUI on Windows - YouTube, that we need to start ollama separately in another terminal window, and that we need a different set of parameters in the command line if ollama is already installed on the system. So, got open-webui to work with these modifications:
ran from administrator cmd (on Windows)
without -d (so that I can see the errors if any - do not detach)
and must start ollama first in another window with
ollama run llama3.2
docker run -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
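A quick sanity check (my addition, not from the videos): ollama should answer on port 11434 on the host, and the web UI should come up on port 3000 -
curl http://localhost:11434/
should print "Ollama is running", and then http://localhost:3000 in the browser should show the pulled models in the model dropdown.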
But connecting Stable Diffusion to open-webui seems to be complicated, so will run Stable Diffusion (for image generation and upscaling) from its own web UI instead.
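A rough sketch of that route, assuming AUTOMATIC1111's stable-diffusion-webui (the --api flag is only needed if I later decide to hook it up to open-webui after all):
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh --api
(webui-user.bat on Windows), and then open http://127.0.0.1:7860 in the browser.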