Started playing around with my build and so far have mostly just validated that the hardware works and is stable (a 70B Llama 4-bit QLoRA fine-tune went … fine). Now I'm onto fine-tuning other models, trying different configs, and trying to learn things. So far, DS-Coder-V2-Lite-Instruct was an absolute PITA to get going on 8-bit LoRA.
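For anyone curious what an 8-bit LoRA setup even looks like, here's a minimal sketch using transformers + peft + bitsandbytes. To be clear, this is not my actual config: the LoRA hyperparameters are placeholders, and the `target_modules` names are my best guess for DeepSeek-V2's MLA attention, so verify them against `model.named_modules()` yourself.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

# Load the base model quantized to 8-bit with bitsandbytes.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",        # shard across both GPUs
    trust_remote_code=True,   # DeepSeek-V2 ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

model = prepare_model_for_kbit_training(model)

# Hyperparameters are illustrative. DeepSeek-V2 uses MLA attention, so
# projection names differ from Llama-style q/k/v_proj -- dump
# model.named_modules() and adjust target_modules accordingly.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "kv_a_proj_with_mqa", "kv_b_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```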
Would love to connect with any others doing local LLM research/development to exchange ideas and experiences.
I can definitely offer hardware setup help - that’s my strongest area of expertise, at least for now. I am learning about everything else quickly though!
Hey, I'm pretty unfamiliar with hardware and how it physically works. I'm curious: are those GPUs connected via a physical bridge, or is it built into the motherboard?
Not sure what happened to your question (hidden) but I saw it in notifications. The answer is it's BOTH. You need a physical NVLink bridge, and they're very expensive right now specifically because everyone wants them and supply is low / Nvidia is killing off the tech. With NVLink, the cards have 55ish GB/s of bandwidth (one way). My NVLink tests at 112 GB/s (both directions combined). That makes it feasible for model training to essentially treat the 48GB of VRAM as one unified pool. There is some communication bandwidth between the cards through the mobo (PCIe) as well, but not nearly enough.
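If anyone wants to reproduce that number, here's roughly how I'd sanity-check it in PyTorch. Just a quick sketch, not a rigorous benchmark; the CUDA samples' p2pBandwidthLatencyTest is the more standard tool.

```python
import time
import torch

assert torch.cuda.device_count() >= 2, "need both cards visible"

n_bytes = 1 << 30  # 1 GiB test buffer
src = torch.empty(n_bytes, dtype=torch.uint8, device="cuda:0")
dst = torch.empty(n_bytes, dtype=torch.uint8, device="cuda:1")

# Warm up so allocator / first-copy overhead doesn't skew the timing.
for _ in range(3):
    dst.copy_(src)
torch.cuda.synchronize("cuda:0")
torch.cuda.synchronize("cuda:1")

iters = 20
t0 = time.perf_counter()
for _ in range(iters):
    dst.copy_(src)  # should ride NVLink when peer access is enabled
torch.cuda.synchronize("cuda:0")
torch.cuda.synchronize("cuda:1")
t1 = time.perf_counter()

print(f"~{n_bytes * iters / (t1 - t0) / 1e9:.1f} GB/s one-way")
```

If peer access is disabled (no bridge, or P2P unsupported), the same copy bounces through host memory and you'll see a much lower number.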
4-slot NVLink bridges are $1000+.
3-slot bridges are much more widely available, but then the issue becomes cooling and/or card model (some card models are too thick to sit that close together). Not to mention, of course, you need to plan to have the cards in x16 slots, and only two of your mobo's slots will be x16.
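If you want to check what link your cards actually negotiated once they're installed, here's a small sketch using NVML via the pynvml bindings (`pip install nvidia-ml-py`). One caveat: cards often downshift to a lower PCIe gen at idle, so check under load.

```python
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    width = pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)    # e.g. 16 for x16
    gen = pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle) # e.g. 4 for gen4
    print(f"GPU {i} ({name}): PCIe gen{gen} x{width}")
pynvml.nvmlShutdown()
```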