Ian O’Byrne
Overstory Writing

How to Run Your Own AI Models: Getting Started with LM Studio

How to get started building your own local AI sandbox with LM Studio.

Posted: Oct 1, 2025
Last revised: May 1, 2026
Author: Ian O’Byrne
Read: 4 min
Topics: ai · reading · writing

In my last post, we talked about why you might want to build your own AI sandbox. Now let’s actually build one.

This isn’t difficult, but it does require downloading a few gigabytes of data and waiting for the installation to complete. Grab a coffee, and let’s get started.

What You Need Before You Start

Hardware Check:

  • Minimum: 8GB RAM, any modern processor (last 5 years)
  • Comfortable: 16GB RAM, dedicated GPU (optional but helpful)
  • Ideal: 32GB+ RAM, NVIDIA GPU with 8GB+ VRAM
  • Storage: 10-50GB free space (models vary in size)

Simple test: Can your computer handle video editing or modern gaming? Then it can probably run AI models.

What this means in practice:

  • 8GB RAM = smaller models (3B parameters), slower responses
  • 16GB RAM = medium models (7-13B parameters), decent speed
  • 32GB+ RAM = larger models (30B+ parameters), faster responses
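
If you want a feel for where those tiers come from, here is a rough back-of-the-envelope sketch. It assumes about 4.5 bits per parameter (a common compression level for downloadable models) and 20% extra headroom for the running conversation. Both numbers are my assumptions for illustration, not values reported by LM Studio.

```python
def estimate_model_gb(params_billions: float,
                      bits_per_weight: float = 4.5,
                      overhead: float = 1.2) -> float:
    """Rough memory footprint in GB: parameters x bits per weight, plus headroom.

    bits_per_weight and overhead are illustrative assumptions, not measured values.
    """
    bytes_for_weights = params_billions * 1e9 * bits_per_weight / 8
    return bytes_for_weights * overhead / 1e9


# Print a rough estimate for the model sizes mentioned in the list above.
for size in (3, 7, 13, 30):
    print(f"{size}B parameters: about {estimate_model_gb(size):.1f} GB of memory")
```

Under these assumptions, a 3B model needs about 2 GB and a 30B model about 20 GB. The rest of your RAM still has to hold your operating system and other apps, which is why the comfortable tiers sit well above the raw estimate.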

Step 1: Installing LM Studio

Why LM Studio? Right now, it’s the easiest way to get started. Think of it as iTunes for AI models. It handles all the technical complexity behind a clean interface.

The Download:

  1. Go to lmstudio.ai
  2. Download for your operating system (Windows/Mac/Linux)
  3. Install like any other application
  4. Launch it, and you’ll see a clean, modern interface

First impressions tour:

  • Left sidebar: Your downloaded models
  • Center: The chat interface (looks familiar, right?)
  • Top: Search and download new models
  • Bottom: Settings and system resources

Step 2: Downloading Your First Model

The paradox of choice: There are thousands of models available. For your first time, I’ll keep it simple.

Recommended starter model: Llama 3.2 3B Instruct (or current equivalent)

Why this one?

  • Small enough to run on most computers (3-4GB download)
  • Fast responses
  • Good at following instructions
  • Won’t max out your system

How to download:

  1. Click the search/download icon (top of LM Studio)
  2. Type “Llama 3.2 3B”
  3. Look for “Instruct” in the name
  4. Find one that says “GGUF” format (this is optimized for local running)
  5. Click download
  6. Wait 5-15 minutes, depending on your connection
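
Curious how long your own download will take? The wait is simple arithmetic: file size divided by connection speed. Here is a minimal sketch, assuming you know your download speed in megabits per second (Mbps) and that you get that speed for the whole download. The function name is mine, purely for illustration.

```python
def download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Minutes to download size_gb gigabytes at speed_mbps megabits per second."""
    megabits = size_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / speed_mbps
    return seconds / 60


# A 3.5 GB model on a 50 Mbps connection (example numbers, not a measurement).
print(f"About {download_minutes(3.5, 50):.0f} minutes")
```

With these example numbers the estimate lands inside the 5-15 minute range above. Real downloads are usually a little slower than the advertised speed suggests.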

While you wait, here’s a brief explanation of what’s downloading: billions of mathematical parameters, compressed into a single file. This is the entire ‘brain’ of the AI, trained on massive amounts of text.

Step 3: Loading and Chatting

Loading your model

Downloading a model only saves the file to your disk. “Loading” it reads that file into memory so you can actually run it. Think of it like downloading a movie and then pressing play, or downloading a program and then launching it.

  1. Once downloaded, it appears in your left sidebar
  2. Click it
  3. Click “Load Model” at the top
  4. Watch the loading bar (your computer is reading the model into memory)
  5. When it says “Model loaded,” you’re ready

Your first conversation

Start simple:

  • “Hello, who are you?”
  • “What can you help me with?”
  • “Write a haiku about sandboxes.”

What to notice

  • Response speed (slower than ChatGPT at first)
  • Your computer’s fan noise (it’s working!)
  • Battery drain (plug in your laptop if you can)
  • No internet required (you can turn off WiFi and it still works)

Understanding What You’re Seeing

The response generation. Watch the text appear word by word. Each word is the result of billions of calculations happening on your machine, right now.

The resource monitor. Keep an eye on LM Studio’s built-in resource display. It shows:

  • RAM usage
  • GPU usage (if applicable)
  • Response speed (tokens per second)

What “tokens per second” means: Roughly 1 token = ¾ of a word. 20 tokens/second is pretty fast; 5 tokens/second is slow but functional.
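
To turn that rule of thumb into something you can feel, here is a small sketch that converts tokens per second into words per minute and into waiting time. It relies only on the ¾-of-a-word approximation above, which varies by model and by text, so treat the results as ballpark figures.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb: 1 token is roughly 3/4 of a word


def words_per_minute(tokens_per_second: float) -> float:
    """Approximate writing speed of the model in words per minute."""
    return tokens_per_second * WORDS_PER_TOKEN * 60


def seconds_for_words(word_count: int, tokens_per_second: float) -> float:
    """Approximate wait for a response of word_count words."""
    tokens_needed = word_count / WORDS_PER_TOKEN
    return tokens_needed / tokens_per_second


for speed in (5, 20):
    print(f"{speed} tokens/second is about {words_per_minute(speed):.0f} words per minute; "
          f"a 300-word answer takes about {seconds_for_words(300, speed):.0f} seconds")
```

At 5 tokens/second the model writes at roughly the pace many people read, which is why it feels slow but still usable.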

Common Issues and Quick Fixes

“It’s too slow.”

  • Try a smaller model (look for 1B or 2B parameter models)
  • Close other applications
  • Check if you accidentally downloaded a huge model
  • Download and test a few different models to see which one works best for your purposes

“It won’t load.”

  • Check available RAM
  • Try restarting LM Studio
  • Ensure you have enough disk space
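
If you are comfortable running a short script, the disk space check can be done in a few lines of Python. This is a minimal sketch that looks at the drive holding your home folder. It assumes that is where your models are stored, which may not match your LM Studio settings, and the 2 GB safety margin is an arbitrary choice of mine.

```python
import shutil
from pathlib import Path
from typing import Optional


def free_disk_gb(path: Optional[str] = None) -> float:
    """Free space, in GB, on the drive that holds `path` (home folder by default)."""
    target = path if path is not None else str(Path.home())
    return shutil.disk_usage(target).free / 1e9


def has_room_for(model_gb: float, path: Optional[str] = None,
                 margin_gb: float = 2.0) -> bool:
    """True if the drive can hold a model of model_gb plus a safety margin."""
    return free_disk_gb(path) >= model_gb + margin_gb


print(f"Free disk space: {free_disk_gb():.1f} GB")
print("Room for a 4 GB model:", has_room_for(4))
```

Checking free RAM is more dependent on your operating system, so the Task Manager (Windows) or Activity Monitor (Mac) is the simpler route there.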

“The answers are weird.”

  • This is normal! Smaller models are less capable than ChatGPT
  • Try being more specific in your prompts
  • Remember: this is a 3 billion parameter model, while the models behind ChatGPT are far larger (GPT-3 alone had 175 billion parameters)

“My computer sounds like a jet engine.”

  • This is expected during model loading and first responses
  • It should quiet down after the initial processing
  • You can adjust settings to reduce CPU/GPU usage

What You Just Did

Congratulations! You just downloaded an entire AI model, hosted it on your own computer, and used it to hold a conversation that never left your machine. Along the way, you learned what AI actually costs in terms of computing resources.

The difference: Every conversation with ChatGPT travels through the internet to OpenAI’s servers. Every conversation you just had stayed within your computer. Your sandbox. Your rules.

Next Steps

Try these experiments:

  1. Turn off your internet and keep chatting (it still works!)
  2. Ask it to write different things (code, poems, explanations)
  3. Notice which tasks it handles well vs. poorly
  4. Check your system resources during different tasks

In the next post: Now that you have a working local AI, let’s talk about choosing better models, measuring actual electricity costs, optimizing performance, and deciding when to use local AI vs. cloud services.

Your personal AI sandbox is now open for business. What will you build first?