Building with LLMs: Lessons Learned

Aug 2025

Practical takeaways from coding with Claude and GPT.

I've been building side projects with LLMs for about a year now. Some went well. Most taught me something the hard way. Here's what I've learned.

Lesson 1: Start with the prompt, not the code

The biggest mistake I made early on was jumping straight into API calls. The real work is figuring out what you actually want the model to do. Spend time crafting and testing prompts in the playground before writing a single line of application code.

Lesson 2: Claude and GPT have different strengths

After extensive use of both, I've found that each model excels at different kinds of tasks. Pick the right tool for the job instead of defaulting to one.

Lesson 3: Structured output saves hours

Getting LLMs to return JSON or other structured formats reliably was a game-changer. Use function calling or explicit schema instructions. Don't try to parse free-text responses with regex—you'll lose that fight.
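As a concrete sketch of the schema-instruction approach: ask for JSON only, then validate the reply strictly instead of regexing free text. Everything here is illustrative — the schema, field names, and the simulated reply are made up, and in a real app the `reply` string would come from an API call.

```python
import json

# Hypothetical prompt: spell out the exact schema you expect back.
SCHEMA_PROMPT = """Extract the event details from the text below.
Respond with ONLY a JSON object matching this schema:
{"name": string, "date": string (YYYY-MM-DD), "attendees": integer}"""

def parse_model_json(raw: str) -> dict:
    """Strip the markdown fences models sometimes add, then parse strictly."""
    cleaned = raw.strip()
    if cleaned.startswith("```"):
        # Drop the opening ```json line and the closing ``` line.
        cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
    data = json.loads(cleaned)  # raises on malformed output -> retry upstream
    for key, typ in (("name", str), ("date", str), ("attendees", int)):
        if not isinstance(data.get(key), typ):
            raise ValueError(f"schema violation on {key!r}")
    return data

# Simulated model reply, wrapped in a code fence as models often do:
reply = '```json\n{"name": "Launch", "date": "2025-08-01", "attendees": 40}\n```'
event = parse_model_json(reply)
print(event["attendees"])  # 40
```

The point is that a `json.loads` failure or a schema violation is a clean, retryable error, whereas a regex that half-matches free text fails silently.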

Lesson 4: Evals matter more than vibes

"It seems to work" is not a testing strategy. Build a set of test cases, run them systematically, and track scores over time. This is especially important when you change prompts—what improves one case often breaks another.

Lesson 5: Costs add up fast

Token usage is easy to ignore during development and painful to discover in production. Monitor your usage from day one, cache aggressively, and ask whether you really need the largest model or whether a smaller, cheaper one would suffice.
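Day-one monitoring can be as simple as accumulating a running total per call. The model names and per-million-token prices below are invented for illustration — real prices vary by provider and change often — but the arithmetic is the standard input/output split.

```python
# Hypothetical prices as (input $/1M tokens, output $/1M tokens).
PRICES = {
    "big-model": (5.00, 15.00),
    "small-model": (0.25, 1.25),
}

class CostTracker:
    """Accumulates estimated spend across API calls."""

    def __init__(self):
        self.total = 0.0

    def record(self, model, input_tokens, output_tokens):
        p_in, p_out = PRICES[model]
        cost = (input_tokens * p_in + output_tokens * p_out) / 1_000_000
        self.total += cost
        return cost

tracker = CostTracker()
big = tracker.record("big-model", 2_000, 500)    # one large-model call
small = tracker.record("small-model", 2_000, 500)  # same call, cheaper model
print(f"big: ${big:.4f}, small: ${small:.4f}")  # big: $0.0175, small: $0.0011
```

Even with made-up numbers, the ratio is the takeaway: the same call is roughly an order of magnitude cheaper on the small model, which compounds quickly at production volume.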

What's next

I'm currently experimenting with tool use and agentic patterns. The ability for models to call functions, browse the web, and chain actions together feels like the next big unlock. More on that soon.
