Looking for easy ways to install and run them, via the video Run Local LLMs on Hardware from $50 to $50,000 - We Test and Compare!
ollama list
ollama pull llama3.1:latest (~4 GB - the download went at nearly 4 MBps on the Airtel 40 Mbps connection, and around 6 MBps on Airtel 5G - 12 minutes)
ollama pull llama3.2:latest (~2 GB, and runs faster on low-end hardware)
(Run as ollama run llama3.1:latest --verbose to get tokens/sec.)
With llama3.1, it was going at ~4 tokens per second on the desktop - on the CPU, I think. With llama3.2, it went at 23 tokens/sec - quite an acceptable speed. On the Windows laptop, running on CPU, llama3.2 managed 9 tokens/sec.
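As a side note (my addition, not from the video): ollama also serves a local HTTP API on port 11434 by default, so the same models can be queried with curl or from a script. A minimal sketch, assuming the default port and that llama3.2 has already been pulled:
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
The JSON response includes eval_count and eval_duration fields, from which tokens/sec can also be worked out.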
For a web UI, via the video Set up a Local AI like ChatGPT on your own machine!
Wanted to try the Docker route - installed Docker via https://docs.docker.com/engine/install/ubuntu/ by setting up Docker's apt repository. All my docker commands needed to be prefixed with sudo, since I didn't go through the post-install steps.
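(For the record, the post-install steps from the Docker docs that make sudo unnecessary are roughly:
sudo groupadd docker
sudo usermod -aG docker $USER
and then log out and log back in, or run newgrp docker in the current shell.)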
Trying to run open-webui with the command line given in the YouTube video:
sudo docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
That didn't work for me. The Open WebUI documentation has the steps for the :main image (without bundled ollama):
sudo docker pull ghcr.io/open-webui/open-webui:main
For Nvidia GPU support, add --gpus all to the docker run command:
Single user mode -
sudo docker run -d -p 3000:8080 --gpus all -e WEBUI_AUTH=False -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
That also failed, with "Unable to load models". Then found, via How to Install Ollama, Docker, and Open WebUI on Windows - YouTube, that we need to start ollama separately in another terminal window, and that we need a different set of parameters in the command line if ollama is already installed on the system. So, got open-webui to work with these modifications:
ran from administrator cmd (on Windows)
without -d (so that I can see the errors if any - do not detach)
and must start ollama first in another window with
ollama run llama3.2
docker run -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
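A quick sanity check (my addition, not from the videos): ollama should answer on port 11434 on the host, and the web UI should come up on port 3000 -
curl http://localhost:11434/
should print "Ollama is running", and then http://localhost:3000 in the browser should show the pulled models in the model dropdown.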
But connecting Stable Diffusion to open-webui seems to be complicated, so will run Stable Diffusion (for image generation and upscaling) from its own web UI instead.
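A rough sketch of that route, assuming AUTOMATIC1111's stable-diffusion-webui (the --api flag is only needed if I later decide to hook it up to open-webui after all):
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh --api
(webui-user.bat on Windows), and then open http://127.0.0.1:7860 in the browser.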