
An Experiment in Voice: What Happens When AI Learns to Write Like You

I fine-tuned Qwen3 8B on three years of my own writing using Unsloth. The model learned how I write. Not how I think. Here's what that difference taught me about what voice actually is.

Andrea Griffiths · 8 min read


Fine-tuning Qwen3 8B with Unsloth, and what it taught me about what voice actually is.

I fine-tuned a language model on three years of my own writing. Not because I wanted a clone spitting out newsletters while I slept. Not because I thought the world needed more content with my name on it. I did it because I got curious about something specific: what actually happens when you teach an AI system how to sound like a real person?

Most fine-tuning looks like this. You grab a general model. Point it at a specific domain. Marketing copy, customer support, engineering docs. The model learns the patterns of that domain and gets better at sounding like it belongs there. But it doesn’t capture perspective. It doesn’t learn the actual choices a human makes when deciding how to explain something.

I wanted to try something different. So I took Qwen3 8B and trained it on three years of how I actually talk to people about technical stuff. Not the polished version that lands on The GitHub Blog. The real stuff. How I’d onboard someone new to the team. When I’d decide to start with “here’s why this exists” instead of jumping to “here’s how to use it.” The moment where I shift from talking about theory to talking about what actually works. The tone that makes someone feel like you understand what it’s like to be new to this and not stupid.

Newsletters. Conference talks. Conversations with developers learning GitHub for the first time. Issues where I had to explain something five different ways until it finally clicked. The unglamorous moments where communication actually happens.

How Unsloth made this possible

Here’s the thing that made this experiment actually feasible: Unsloth.

Fine-tuning a language model normally requires serious hardware. You’re talking GPU clusters, thousands of dollars, infrastructure that most people don’t have access to. Unsloth is a library that optimizes the fine-tuning process so aggressively that you can train on a free Colab T4 GPU. Sixteen gigs of VRAM. That’s it.

I started with Llama 3.1 8B but switched to Qwen3 8B. The base model matters more than anything you do after. Qwen3 is newer, trained on more data, and its instruction-following is significantly better out of the box. Same parameter count, better foundation. That switch alone improved the output more than any hyperparameter tweak I tried.

Unsloth does this through something called LoRA, which stands for Low-Rank Adaptation. Instead of retraining the entire model’s weights (which is expensive and slow), LoRA adds small adapter matrices to specific layers. The base model stays frozen. You’re only training the adapters. Think of it like adjusting the instruments on an orchestra instead of replacing the musicians. The orchestra still plays all the same notes. You’re just tuning how it sounds.
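The mechanics are simple enough to sketch in plain Python. This is a toy illustration with made-up dimensions, not real model internals, but it shows why the adapter approach is so cheap:

```python
import numpy as np

# Toy layer: one frozen weight matrix, standing in for a single attention projection.
d = 512
W = np.random.randn(d, d)           # base weights: frozen, never updated
r = 16                              # LoRA rank
alpha = 32                          # scaling factor (here, double the rank)

# The two small adapter matrices are all we train.
A = np.random.randn(r, d) * 0.01
B = np.zeros((d, r))                # B starts at zero, so training begins at the base model

# Effective weights during the forward pass: base plus scaled low-rank update.
W_eff = W + (alpha / r) * (B @ A)

full_params = W.size                # 262,144 if we trained the whole matrix
adapter_params = A.size + B.size    # 16,384 trainable parameters instead
print(adapter_params / full_params) # a small fraction of the layer, and far less model-wide
```

Because `B` starts at zero, the adapted model is identical to the base model before training; the adapters only nudge it away from that starting point as they learn.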

I set LoRA rank to 16. That number controls how much capacity those adapters have. Higher rank means more flexibility, more VRAM, more training time. Lower rank means tighter constraints but faster convergence. Rank 16 was the sweet spot for eighty-one examples of my voice. I also set LoRA alpha to 32 (double the rank) so the adapter updates scale correctly during training.

The target modules were the attention layers: q_proj, k_proj, v_proj, o_proj, plus the feed-forward layers (gate_proj, up_proj, down_proj). Those are the parts of the model that control how it processes and generates language. That’s where voice lives.
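Wired together, the setup described above looks roughly like this in Unsloth. This is a sketch rather than my exact notebook; the checkpoint name and sequence length are illustrative:

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit so it fits on a free Colab T4.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B",   # illustrative; any Qwen3 8B checkpoint works
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and feed-forward projections.
# The base weights stay frozen; only the adapters train.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                 # adapter rank
    lora_alpha=32,        # scaling factor, double the rank
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention layers
        "gate_proj", "up_proj", "down_proj",      # feed-forward layers
    ],
    lora_dropout=0,
    bias="none",
)
```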

Training was sixty steps with a batch size of 2 and gradient accumulation across 4 batches. Learning rate at 1e-4, which is conservative for a small dataset. A higher learning rate would overshoot on eighty-one examples and corrupt the model’s general knowledge. Lower and you’re training forever.
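Those numbers multiply out in a way worth seeing. Back-of-the-envelope:

```python
# The training schedule described above, multiplied out.
batch_size = 2
grad_accum = 4
steps = 60
dataset_size = 81

effective_batch = batch_size * grad_accum   # 8 examples per optimizer update
examples_seen = steps * effective_batch     # 480 examples processed in total
epochs = examples_seen / dataset_size       # ~5.9 passes over the 81 examples

print(effective_batch, examples_seen, round(epochs, 1))
```

Roughly six passes over a tiny dataset: enough repetition for the adapters to pick up the patterns, not so much that the model memorizes the examples verbatim.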

The data went through the ChatML template, the chat format Qwen models use: system prompt, user message, assistant response. Each example was formatted correctly so the model learns not just what to write but how to respond to a specific prompt structure.
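Formatting an example for ChatML is just string assembly around special tokens. A minimal sketch, with invented placeholder content for the three roles:

```python
def to_chatml(system: str, user: str, assistant: str) -> str:
    """Format one training example in ChatML, the chat template Qwen models expect."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

# Illustrative example, not one of the real training rows.
example = to_chatml(
    system="You write in a direct, problem-first voice with no marketing language.",
    user="Explain what a pull request is to someone on their first day.",
    assistant="Before the mechanics: a pull request is how you start a conversation...",
)
print(example)
```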

The export was quantized to q4_k_m, llama.cpp’s 4-bit “K-quant” format (medium variant). The weights compress from 16- or 32-bit floats down to roughly 4-bit representations without much quality loss. Final size: about 5GB. Portable. Runs on a MacBook Pro with 16GB RAM. Local. No API calls. No privacy concerns.
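The size math checks out. Assuming q4_k_m averages roughly 4.5 bits per weight (an approximation; the exact figure varies by layer):

```python
# Rough size estimates for an 8B-parameter model at different precisions.
params = 8e9

fp32_gb = params * 4 / 1e9        # 32-bit floats: 4 bytes per weight -> ~32 GB
fp16_gb = params * 2 / 1e9        # half precision: 2 bytes per weight -> ~16 GB
q4_gb   = params * 4.5 / 8 / 1e9  # ~4.5 bits per weight under q4_k_m -> ~4.5 GB

print(round(fp32_gb), round(fp16_gb), round(q4_gb, 1))
```

Add the tokenizer and file overhead and you land right around the ~5GB file that came out of the export.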

Without Unsloth, this experiment doesn’t happen. You’d need a lab budget or months waiting for cloud infrastructure. Instead, I ran it in a free Colab notebook and had the trained model downloaded by evening. Total cost: zero dollars and an afternoon.

What it captured and what it didn’t

The model learned how I write. Not how I think.

When you give it a technical problem, it mirrors back my sentence structure. It breaks thoughts into separate sentences instead of chaining them together with corporate connectives. It asks what you’re actually trying to do before jumping to answers. It flinches away from marketing language like an instinct. It opens with the problem instead of the feature.

Those are communication choices. Patterns. Style. The model got really good at capturing them.

What it didn’t capture is the thing underneath that actually matters: the ability to change your mind.

I used to be really convinced about certain stacks. Defended them hard. Built entire arguments around why they were the right choice. Then I changed my mind. I’ve done it with people too. Teams I thought were solid until they weren’t. Technologies I was certain about until I saw something better and had to actually admit I was wrong about the first thing.

That willingness to be wrong isn’t in the training data. It can’t be. The model learns from what you’ve already written. It doesn’t learn that you might write something completely different tomorrow because you figured something out today that contradicts what you said yesterday.

That’s the actual gap between a system that learned your voice and a voice that belongs to a real person. You get to change. You get to contradict yourself. You get to look back at what you believed last year and go “yeah, nah, I was wrong about that.”

The model will always sound like 2026 Andrea. You won’t be 2026 Andrea forever.

Why I’m telling you this

Most companies are fine-tuning AI to sound professional. Polished. On-brand. Safe. But there’s something more interesting you can actually do: teach an AI system to communicate the way a real technical authority communicates. Not to think like them. Just to sound like them.

If you’re building documentation, that means AI that matches how you actually talk about things, not some generic “professional” tone that makes everything sound corporate. If you’re a developer advocate trying to scale your explanations without losing your actual voice, it means you can get a tool that captures how you really talk about problems. If you’re trying to communicate technical ideas at scale without everything turning into corporate mush, teaching voice is a different kind of leverage entirely.

The model didn’t learn my perspective. It learned my patterns.

What you can actually do with this

You could use this to write more content faster. Delegate some of the explanation work. Get a communication partner that sounds like you, instead of making everything you publish sound like someone else.

But I didn’t build this to manufacture content. I built it to understand what voice actually is. To prove that you can teach an AI system to communicate like a real person. To create something that sounds like a technical authority instead of a corporate algorithm.

The result is useful if you’re trying to scale thoughtful explanation. If you need documentation that doesn’t sound generic. If you want a partner that talks through problems the way you talk through problems.

But here’s what it won’t do: it won’t change its mind. It won’t wake up tomorrow and realize it was wrong about a framework, a practice, a person. It won’t evolve because it learned something new. It won’t contradict itself because the evidence demanded it.

That’s the thing worth protecting in yourself. The freedom to be wrong. The courage to change your mind when the evidence shows up. The humility to know that what you’re certain about today might be the thing you completely rethink next year.

Teaching voice to a machine is useful. But your voice matters because it’s attached to a mind that keeps changing.

About the Author: Andrea Griffiths is a Senior Developer Advocate at GitHub, where she helps engineering teams adopt and scale developer technologies. She's passionate about making technical concepts accessible to both humans and AI agents. Connect with her on LinkedIn, GitHub, or Twitter/X.