Help with VRAM allocation on Lenovo ThinkPad C13 Yoga

Hello everyone,

First of all, I want to say a huge thank you to everyone involved in this project. Your work is incredible.

I’m new here, and I’ve successfully installed Fedora 42 on my 16GB RAM model. Everything is running great!

My only issue is with the VRAM allocation. It’s currently fixed at 512MB, and I’d love to increase it for better GPU performance. Unfortunately, there is no option in the BIOS/UEFI to change the UMA buffer size.

I’ve been reading through the forum and I found this specific post which discusses flashing a firmware to allocate more VRAM: https://forum.chrultrabook.com/t/thinkpad-c13-yoga-only-5-7gb-ram-available/259/34

I also tried using the Smokeless_UMAF tool (https://github.com/DavidS95/Smokeless_UMAF/tree/main), but unfortunately without success.

Has anyone managed to increase the VRAM on this specific model? Any advice on the correct procedure to flash the firmware or any other potential solutions would be greatly appreciated.

Thanks again for the amazing work on this project and for any help you can provide.

There is zero benefit to setting the UMA larger than 512MB on these boards, and setting it larger than 1GB doesn’t even work.

I understand that for general use, there’s little benefit to setting the UMA buffer above 512MB since the driver and GTT handle memory management efficiently. My situation, however, requires a specific workaround for a known limitation in Ollama when running LLMs on iGPUs.

The core issue is that Ollama’s hardware detection only queries the dedicated VRAM allocated by the UMA buffer, ignoring the much larger pool of system memory accessible via the GTT. As a result, if an LLM exceeds the UMA buffer size, Ollama incorrectly determines there is insufficient GPU memory and reverts to a CPU-only fallback, making hardware acceleration unusable.
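To make the gap concrete, here is a toy sketch of the two memory checks (this is not Ollama’s actual detection code, just an illustration; the sysfs paths in the comments are the standard amdgpu ones, though the card index may differ on a given machine):

```python
# Toy model of the detection gap -- NOT Ollama's real code.
# On Linux/amdgpu the two pools can be read from sysfs:
#   /sys/class/drm/card0/device/mem_info_vram_total  <- UMA "dedicated" VRAM
#   /sys/class/drm/card0/device/mem_info_gtt_total   <- GTT (system-RAM window)

GIB = 1024 ** 3

def ollama_style_check(model_bytes: int, vram_bytes: int) -> bool:
    """Flawed check: only the dedicated UMA carve-out is considered."""
    return model_bytes <= vram_bytes

def actual_igpu_capacity(vram_bytes: int, gtt_bytes: int) -> int:
    """What the iGPU can really address: UMA carve-out plus GTT."""
    return vram_bytes + gtt_bytes

model = 5 * GIB            # e.g. a ~5 GiB quantized LLM
uma   = 512 * 1024 ** 2    # the fixed 512 MiB UMA buffer
gtt   = 8 * GIB            # GTT defaults to roughly half of 16 GiB RAM

print(ollama_style_check(model, uma))           # False -> CPU-only fallback
print(model <= actual_igpu_capacity(uma, gtt))  # True  -> would fit via GTT
```

So the model easily fits in UMA + GTT, but the UMA-only check rejects it, which is exactly the fallback I’m hitting.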

My goal is to “trick” this flawed detection logic. If I set an unusually large UMA buffer (e.g., 6GB or more), Ollama should register enough “dedicated” VRAM and load the model onto the iGPU. To be clear, this is not for a general performance increase but purely to bypass this specific software bottleneck.

While I’m also exploring how to enable Vulkan support in Ollama as a more permanent solution (since the underlying llama.cpp handles GTT correctly via Vulkan, a capability not yet exposed in Ollama), I would still like to try the UMA buffer modification as a direct, hardware-level test to see if it’s a viable solution.

Regarding your comment about it not working over 1GB, would you mind describing the issues you ran into?

Thank you again for your quick response, I really appreciate it.

The device will simply not boot. The closed-source AMD blobs don’t support anything that large. 2GB and above requires using MMIO both above and below 4GB, which isn’t supported by the blobs or by coreboot. I don’t think you can place the GPU MMIO above 4GB at all, since the FSP is 32-bit, so the GPU would fail to init altogether.
