Writing Evals for LLM Agents

LLM-powered software experiences pose a unique challenge within software engineering. Because models are non-deterministic, a variety of factors complicate evaluation.

Test Driving Prompting Strategies

Natural Language to SQL is an area of continued research and one I’ve explored previously. Today, I will implement prompting techniques from two research papers. The first explores updating the structure and providing supplementary SQL knowledge. The second uses few-shot Chain of Thought reasoning.

Text to SQL with LLMs

One of the most interesting and perplexing aspects of new generative LLMs is the way we interact with them. When you go to use one of these tools, you’re greeted with a text input where you can type in anything you want. At first it feels magical - “I can ask it anything” - but the more you interact with the model, the more it becomes clear that it’s not so simple. Asking a question in a different way, or adding or removing context, can produce a better response. Attempting to get the results you want can drive you to extremes - we’ve seen people promise to tip the LLM or threaten to punish it in an attempt to improve the output.

Agent Faceoff - A brief interlude

While developing fantasy football insight agents, I was looking for a way to compare LLM-powered agents against one another and iterate easily. In short, I wanted an AI agent comparison tool. Unable to find one, I built a simple one that is open for anyone to use - check it out here. The Agent Faceoff makes it easy to compare how two agents or agent versions answer the same question, see which one performs better, and test hypotheses to understand why.

Fantasy Football Stats with an LLM

As I posted earlier, I play fantasy football and have a database of player stats. I have usually written my own queries to answer questions I have about players - but what if an LLM could take natural language and convert it to SQL? This is what is called an agent - it uses an LLM as the brain and connects it to tools, in our case an SQLite database with player stats. We’ll build a simple agent, running as a command line app using Bun and TypeScript since I’m familiar with both. We’ll use a local instance of Llama 3 running via Ollama as our LLM.

Data Driven Fantasy Football

I’ve played fantasy football since high school, but over the last four years it’s gotten more serious. My friend Nick invited me to join a dynasty league, meaning we build teams and keep players over time. Prior to that I was fairly casual in my draft research, cramming in a couple of articles to inform my picks on the day of the draft. After the first year, I realized I needed to up my game.

Observability 2.0

I’m not an observability guru, but everything inside of me screamed “yes yes yes” when I read Ivan’s article about logs, metrics and traces vs “wide events”.

Learning React all over again

I’ve been using React since 2017 and I thought I would make a short list of how I would approach learning React now if I had to do it over again. At the end of the article, I also point you to some helpful tools to get up and moving quickly.

No more cowboy coding

"I'll just push this fix to production really quick." We've all heard someone say that - or even said it ourselves. We've all been there: under the gun, a critical issue breathing down our necks. Of course it's after hours on a Friday and you had plans. Those plans are now ruined and worse, it's all riding on you. The pressure in that moment is real, the embarrassment of having a broken user experience, a site or service down, our phones blowing up with alerts and stakeholders all banging on the door.

Four Lessons I learned launching homeWODsquad

Launching homeWODsquad was one of the most fun projects I've ever done. It also taught me a lot about the product and marketing sides of the software business that, as a developer, I rarely see. My four biggest takeaways were the importance of having the right audience, knowing what that audience will see as essential, what creates user value in interactive apps, and the unexpected value of monetization.

Using Web Workers with React

Web Workers provide a powerful mechanism for developing complex frontend web applications. They allow a web application to offload long-running processes to another thread, so the user can continue interacting with the UI in a smooth, uninterrupted fashion. However, Web Workers have some quirks.

Service Workers vs Web Workers

Progressive web apps, or PWAs, are a hot new topic and an exciting frontier for frontend developers. They have incredible potential to change what users expect from web applications. When diving into PWAs, service workers are often front and center, and for good reason. They are at the heart of PWAs, and the offline capabilities they add are what make PWAs magical. However, when starting with PWAs, it's easy to get confused about what to use service workers for and what the less commonly discussed web workers are for.
