Running Ollama AI Models on a Midrange Laptop: My Experience
My experience installing and testing Ollama AI models locally on a midrange laptop with 16GB RAM and a 2GB GPU.
Feb 27, 2026
Introduction
If you’re curious about running AI language models locally, Ollama makes it possible, even on a midrange laptop. Here’s my experience installing and testing models on my ASUS VivoBook 15.
System Details
Before diving into AI, let’s look at my hardware and software setup:
- Laptop: ASUSTeK VivoBook 15 X542UF
- CPU: Intel® Core™ i7-8550U (8 threads)
- RAM: 16 GB
- GPU: NVIDIA GeForce MX130 (2 GB VRAM) + Intel UHD Graphics 620
- Disk: 1.8 TB SSD
- OS: Ubuntu 24.04.4 LTS, 64-bit, GNOME 46, Linux Kernel 6.17
This is a typical midrange laptop, good for everyday tasks, but limited for large AI models.
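If you want to compare your own machine, the details above come from standard Linux tools. A quick sketch (assuming a typical Linux install; `lspci` needs the pciutils package and is skipped if absent):

```shell
# Print the hardware details referenced above (standard Linux tools;
# the values will of course differ on your machine).
lscpu | grep 'Model name' || true            # CPU model
free -h | awk '/^Mem:/ { print "RAM:", $2 }' # total RAM
uname -r                                     # kernel version
# GPU listing needs pciutils; skip quietly if lspci is absent
if command -v lspci >/dev/null 2>&1; then
  lspci -nn | grep -iE 'vga|3d' || true
fi
```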
Installing Ollama
Installing Ollama is straightforward on Ubuntu:
```
curl -fsSL https://ollama.com/install.sh | sh
```

The installer will:
- Create an `ollama` user
- Add you to the appropriate GPU/video groups
- Set up a systemd service
- Detect NVIDIA GPU if present
After installation, verify the version:
```
ollama --version
# e.g., ollama version 0.17.4
```

Running AI Models
First Attempt: LLaMA 3.2
I tried to run the flagship model:
```
ollama run llama3.2
```

The download worked perfectly, but the process immediately crashed:
```
Error: 500 Internal Server Error: llama runner process has terminated: exit status 2
```

Why?
- The MX130 GPU has only 2 GB VRAM, far below the 24 GB+ needed for LLaMA 3.2.
- CPU mode is also impossible for this model on a 16 GB RAM laptop; it requires ≥32 GB RAM.
- Ollama tried to use the GPU automatically, and the model process failed instantly.
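These limits line up with a rough back-of-the-envelope formula: weight memory ≈ parameter count × bytes per weight, plus KV-cache and runtime overhead. This is a rule-of-thumb sketch for illustration, not Ollama's actual memory accounting:

```shell
# Rough memory estimate: parameters (billions) x bytes per weight.
# This rule of thumb is an assumption for illustration, not an Ollama formula.
estimate_gb() {
  # $1 = parameters in billions
  # $2 = bytes per weight (4 = fp32, 2 = fp16, ~0.5 = 4-bit quantized)
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b }'
}
estimate_gb 70 2    # a 70B model at fp16: prints 140.0 (GB) - far beyond any laptop
estimate_gb 7 0.5   # a 7B model 4-bit quantized: prints 3.5 (GB) - fits in 16 GB RAM
```

Even a generous quantization cannot squeeze a 70B-class model into 2 GB of VRAM or 16 GB of RAM, which is why the process terminated immediately.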
Working Models on My Laptop
Smaller models work perfectly, including:
| Model | RAM Requirement | Notes |
|---|---|---|
| gemma3:270m | ~1-2 GB | Works flawlessly, very responsive |
| gemma3:1b | ~4-6 GB | Works, slower on CPU |
| llama2:7b | ~12-14 GB | Fits, CPU-only, may swap |
| llama3:3b | ~8-10 GB | Works, slow on CPU |
Tip: Always start with the smallest models first to test performance.
Models that do not run:
- llama3.2
- LLaMA 2 13B+ or 70B
- Any model requiring ≥24 GB GPU memory or ≥32 GB RAM
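A tiny helper can encode the "fits in RAM" check behind the table above. The 4 GB headroom figure is my own assumption for OS and desktop overhead, not anything Ollama enforces:

```shell
# Hypothetical helper: does a model's estimated footprint fit in RAM,
# leaving ~4 GB headroom for the OS? (The headroom value is an assumption.)
fits() {
  # $1 = estimated model size in GB, $2 = total RAM in GB
  awk -v need="$1" -v ram="$2" 'BEGIN {
    if (need <= ram - 4) print "fits"; else print "too big"
  }'
}
fits 2 16    # gemma3:1b-class estimate on a 16 GB laptop: prints "fits"
fits 14 16   # llama2:7b at the top of its ~12-14 GB range: prints "too big"
```

The second result matches what the table shows in practice: llama2:7b loads, but only by leaning on swap.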
Removing Non-Working Models
Once I realized llama3.2 would not run, I removed it to free up disk space. Ollama provides a simple command:
```
ollama rm llama3.2
```

After running this, checking the list of models confirms it's gone:
```
ollama list
# Only gemma3:1b and gemma3:270m remain
```

This is a useful step to keep your system clean and avoid cluttering your disk with large, unusable models.
Checking Your System Before Running Models
- GPU visibility: `nvidia-smi`
- Available RAM: `free -h`
- List available Ollama models: `ollama list`

Only pick models that fit within your RAM or GPU limits.
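The checks above can be combined into one small pre-flight script. This is a sketch that assumes Linux (it reads /proc/meminfo); the `nvidia-smi` flags are its standard query options:

```shell
# Pre-flight check before pulling a model: available RAM and, if an NVIDIA
# GPU is present, total VRAM. Linux-only (reads /proc/meminfo).
avail_gb=$(awk '/MemAvailable/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
echo "Available RAM: ${avail_gb} GB"
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=memory.total --format=csv,noheader
else
  echo "No NVIDIA GPU visible; models will run on CPU"
fi
```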
Key Takeaways
- Hardware matters: Midrange laptops can run small models (hundreds of millions to a few billion parameters) but cannot handle flagship models locally.
- GPU vs CPU: Ollama auto-detects GPU. If your GPU is too small, large models will crash.
- RAM is critical: Large LLaMA models require ≥32 GB RAM to run on CPU.
- Start small: Models like `gemma3:270m` give you instant results and let you experiment safely.
- Remove non-working models: Use `ollama rm <model-name>` to free disk space.
- Cloud is an option: For LLaMA 3.2 or 70B models, consider Ollama Cloud, Google Colab Pro, AWS EC2 GPU instances, or Azure VMs with sufficient resources.
Conclusion
Running Ollama AI locally is possible, even on a modest laptop. The trick is to match model size to hardware limits. With 16 GB RAM and a small GPU, smaller models like gemma3:270m or llama2:7b are perfect for learning, testing, and experimentation.
Once you upgrade your hardware or move to a cloud instance, you can explore larger models like LLaMA 3.2 and beyond.