Personal LLM - pipe dream, really?

Elf_Boy

Preferably under Win 11, as I expect that running Linux in a VM would lose too much CPU/GPU power. Or has that changed?

Other than using Copilot, I have no experience or knowledge of the inner workings of an LLM.

My hope is to set up an LLM that has persistent memory, one where I can create characters (as in a book/fiction), describe settings and the like, and then ask for prose.

Is this doable?
 
You'll likely want to use Docker with WSL (Windows Subsystem for Linux) to run these; you shouldn't lose much in the way of performance.

Depending on how fancy you want to get, something like Langflow can help stitch together a few different sources, but it might be fancier than you're looking for. You'd use Ollama as the AI backend there.

The basic setup would be one Docker container running Ollama (where you pull down the model(s) you want to use), then another container running a chat agent (or Langflow) that hits the Ollama service.
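For a rough idea, a minimal sketch of that two-container backend might look like the commands below. The image name, volume, and port come from the public Ollama Docker image; the model name is just an example, and you'd still add a chat front-end container on top of this.

```shell
# Start the Ollama backend (add --gpus=all if the NVIDIA container toolkit is set up)
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull a model into that container (llama3.2 is an example; pick whatever fits your VRAM)
docker exec -it ollama ollama pull llama3.2

# Quick sanity check against the Ollama HTTP API
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Hello", "stream": false}'
```

A chat agent container would then point at `http://localhost:11434` (or the Ollama container's name on a shared Docker network) as its backend URL.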
 
....and Hyper-V is required for WSL, so if you don't want to install/enable it (I leave it off on most of my systems, as it's caused issues for me in the past), you can just use VirtualBox instead.

If you're more serious about performance, dual-boot into Linux; you can probably do this in a live environment (boot and run off USB).
 
Oh, and running just about ANY decently sized LLM will consume your video card. A dual-GPU box might be doable: one GPU for your LLM/AI, the other for gaming and such. Might be an idea to make my 7900 XTX an LLM-only card... but that would cut down the PCIe lanes for a 5090, provided I get one. More and more on the fence about that now.
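For a rough sense of why the card gets eaten: a common back-of-the-envelope estimate is parameter count times bytes per parameter at your quantization, plus some overhead for KV cache and activations. A quick sketch (the numbers are ballpark, not vendor specs):

```python
def approx_vram_gb(params_billion: float, bits_per_param: float, overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate: weight size at the given quantization plus a flat
    overhead for KV cache / activations. Ballpark only."""
    weight_gb = params_billion * 1e9 * (bits_per_param / 8) / 1e9
    return weight_gb + overhead_gb

# A 7B model at 4-bit quantization: ~3.5 GB of weights plus overhead
print(round(approx_vram_gb(7, 4), 1))   # ~5.0 GB
# The same model at fp16: ~14 GB of weights, most of a 24 GB card already
print(round(approx_vram_gb(7, 16), 1))  # ~15.5 GB
```

That's why a 24 GB card like a 7900 XTX handles quantized mid-size models comfortably but fills up fast at higher precision or larger parameter counts.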
 
Thank you all.

I do have Hyper-V enabled. I've played around a little with VMs: getting Win 98 going to play old games, that kind of thing, or playing with a Linux system. How could I verify I have the Linux subsystem installed/enabled? I think I do, but it was a while back.
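To check whether WSL is already there, you can ask it directly from PowerShell or cmd; these are standard `wsl.exe` flags on Win 10/11:

```shell
# Shows whether WSL is installed, the default WSL version, and the default distro
wsl --status

# Lists installed distros with their running state and WSL version (1 or 2)
wsl --list --verbose

# If nothing is installed yet, this installs WSL plus a default Ubuntu distro
wsl --install
```

If `wsl --status` errors out or the list is empty, the subsystem isn't set up yet.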

I get that the LLM will heavily use the GPU, but help me understand why I would need two GPUs to game rather than just shutting down/pausing the VM while gaming?

What are Docker and Langflow? (I'll start googling shortly :) )

I have a few USB sticks laying around (who doesn't), and booting off one is no issue. Last time I looked at dual booting (with a boot manager), there were some quite serious concerns about Linux corrupting Windows volumes, and it was recommended to mount them read-only. That was a very long time ago, so I'm thinking it has likely been fixed, but I don't know for sure that it has. I don't want to deal with a boot manager, so yeah, USB. Should I pick up a new one of a certain size or larger, and is the read/write speed of the USB relevant?

Looking at specs, my mobo has both USB-A and USB-C 3.2 Gen 1 and Gen 2 ports. (Looking that up.) OK, Gen 2 is 10 Gbps; can a USB stick even get that fast? I haz researching to do. Hmmm, I have a 1 TB stick here somewhere if I can find it. Time to start looking.
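On whether a stick can "get that fast": 10 Gbps is the bus ceiling, not the flash speed, and the conversion is just bits to bytes. Most budget sticks top out well below the bus either way. Quick arithmetic:

```python
def gbps_to_mb_per_sec(gbps: float) -> float:
    """Convert a link rate in gigabits/s to megabytes/s (8 bits per byte, decimal units)."""
    return gbps * 1000 / 8

print(gbps_to_mb_per_sec(5))   # USB 3.2 Gen 1: 625 MB/s theoretical ceiling
print(gbps_to_mb_per_sec(10))  # USB 3.2 Gen 2: 1250 MB/s theoretical ceiling
```

So the port won't be the bottleneck for a live-boot stick; the stick's own flash (especially sustained and random write speed) is what matters.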
 
Oh yeah, 100% you can shut down the VM/AI to game. I thought you were saying you wanted one running 24/7. You CAN do that with Backyard and even have it accessible over the network.

I found Backyard AI MUCH easier than running a little Linux VM on my box, though I have done both.
 
Training or Inference ?

Pure inference is possible on a dual-EPYC box with 768 GB of RAM running DeepSeek (without any GPU). It can generate at a rate of 7 to 8 tokens/sec.
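To put 7-8 tokens/sec in perspective for prose generation: a common rule of thumb is roughly 0.75 English words per token, so the conversion is simple arithmetic (the words-per-token ratio is an approximation, not a fixed constant):

```python
def words_per_minute(tokens_per_sec: float, words_per_token: float = 0.75) -> float:
    """Approximate prose output rate from a token generation rate."""
    return tokens_per_sec * words_per_token * 60

print(round(words_per_minute(7)))  # ~315 words/min at 7 tokens/sec
print(round(words_per_minute(8)))  # ~360 words/min at 8 tokens/sec
```

That's faster than reading speed for most people, so it's quite usable for interactive fiction writing, even if it feels slow next to a GPU-backed setup.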
 