While developing fantasy football insight agents, I was looking for a way to compare LLM-powered agents against one another and iterate easily. In short, I wanted an AI agent comparison tool. I couldn't find one, so I built a simple one that is open for anyone to use - check it out here. Agent Faceoff makes it easy to compare how two agents or agent versions answer the same question, see which one performs better, and test hypotheses to understand why.

I decided to experiment with some different UI libraries and tools when building Agent Faceoff, and I was a huge fan of them all. I used Lucide for icons and found it frictionless and intuitive. I also gave in to the shadcn/ui hype and tried it for the first time. There's good reason for the hype: the components are beautiful and thoughtfully designed - a pleasure to use. I also leveraged Bun's built-in web server to provide the agent's WebSocket. A quick search led me to this documentation page and it worked, first try. I continue to be impressed with Bun's documentation and its batteries-included approach. Finally, I want to give a shout-out to Tania's Websockets in Redux post, a helpful walkthrough of connecting to a WebSocket using a custom middleware.
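
To give a rough idea of the custom middleware pattern, here is a minimal sketch. It is not the code from Tania's post or from Agent Faceoff: the action type strings, the `createSocketMiddleware` name, and the `openSocket` factory are all assumptions made for illustration, and the Redux types are declared locally so the snippet runs without installing Redux.

```typescript
// Minimal structural types so the sketch runs on its own; a real app
// would import Middleware from "redux" or "@reduxjs/toolkit" instead.
type Action = { type: string; payload?: unknown };
type Dispatch = (action: Action) => unknown;
type MiddlewareAPI = { dispatch: Dispatch; getState: () => unknown };

// The small slice of socket behavior this sketch relies on.
interface SocketHandle {
  send(data: string): void;
  close(): void;
}

// Opens a connection and reports each incoming message via onMessage.
type OpenSocket = (url: string, onMessage: (data: string) => void) => SocketHandle;

// Hypothetical action types, chosen for illustration only.
const SOCKET_CONNECT = "socket/connect";
const SOCKET_SEND = "socket/send";
const SOCKET_RECEIVED = "socket/received";
const SOCKET_DISCONNECT = "socket/disconnect";

function createSocketMiddleware(openSocket: OpenSocket) {
  return (store: MiddlewareAPI) => {
    // The connection lives in the middleware closure, not in the store.
    let socket: SocketHandle | null = null;

    return (next: Dispatch) => (action: Action) => {
      switch (action.type) {
        case SOCKET_CONNECT:
          socket?.close(); // drop any previous connection first
          socket = openSocket(String(action.payload), (data) =>
            // Turn every incoming message into a regular Redux action.
            store.dispatch({ type: SOCKET_RECEIVED, payload: data })
          );
          break;
        case SOCKET_SEND:
          socket?.send(JSON.stringify(action.payload));
          break;
        case SOCKET_DISCONNECT:
          socket?.close();
          socket = null;
          break;
      }
      // Always pass the action along so reducers can react to it too.
      return next(action);
    };
  };
}
```

In the browser, `openSocket` would presumably wrap `new WebSocket(url)` and forward the data of each `message` event to `onMessage`. Injecting it as a parameter keeps the middleware easy to exercise with a fake socket.
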

I'm a big Redux fan, but I found Redux Toolkit's documentation unclear and often chaotic. I have over five years of mostly vanilla Redux experience under my belt, and I still found myself fighting the tooling often, trying to get it to do what I wanted. The TypeScript documentation is mostly lacking, and I had to resort to trial and error quite a few times to determine the correct types. Things that are simple in vanilla Redux, such as cross-slice state subscription, I found undocumented in Redux Toolkit. The underwhelming experience may push me to explore other state libraries in future projects.

To wrap up, if you're interested in giving it a shot with your own agent, it's pretty simple. Agents need to provide a WebSocket interface capable of sending and receiving messages in this format. The agent can simply read the message field and respond with the provided task and agent IDs along with its own message. Reach out if you have any questions!
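
As a rough illustration of that request/response loop, here is a minimal sketch of the agent side. The field names (`taskId`, `agentId`, `message`) and the helper functions are assumptions for illustration only, so check the actual format before relying on them. `answerQuestion` is a stand-in for whatever your agent really does.

```typescript
// Hypothetical message shape: these field names are assumptions,
// not Agent Faceoff's documented format.
interface FaceoffMessage {
  taskId: string;
  agentId: string;
  message: string;
}

// Parse a raw WebSocket frame; returns null if it is not a well-formed message.
function parseMessage(raw: string): FaceoffMessage | null {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const { taskId, agentId, message } = data as Record<string, unknown>;
    if (
      typeof taskId !== "string" ||
      typeof agentId !== "string" ||
      typeof message !== "string"
    ) {
      return null;
    }
    return { taskId, agentId, message };
  } catch {
    return null; // not valid JSON
  }
}

// Stand-in for the real agent logic (an LLM call in practice).
function answerQuestion(question: string): string {
  return `You asked: ${question}`;
}

// Echo the incoming IDs and attach the agent's own answer.
function buildReply(incoming: FaceoffMessage, answer: string): string {
  const reply: FaceoffMessage = {
    taskId: incoming.taskId,
    agentId: incoming.agentId,
    message: answer,
  };
  return JSON.stringify(reply);
}

// Full round trip for one frame; null means "ignore this frame".
function handleFrame(raw: string): string | null {
  const incoming = parseMessage(raw);
  if (incoming === null) return null;
  return buildReply(incoming, answerQuestion(incoming.message));
}
```

With Bun, a function like `handleFrame` could be called from the `message` handler of the `websocket` option passed to `Bun.serve`, with any non-null result sent back through `ws.send`.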