January 8, 2026
In the old days, like five years ago, debugging was easy. Just kidding. But if you had the right tools in place, it could be easy-ish. We looked at the logs. We found a stack trace. It pointed us… Continue Reading…
January 3, 2026
Let’s talk about your bug reports. Specifically, let’s talk about the word “Hallucination.” If you are testing AI features today, I bet your Jira is full of tickets that say, “The model hallucinated.” How can I put it? Not really… Continue Reading…
December 16, 2025
Let’s talk about the dirty secret of our industry. We’ve all been there. You come in on a Monday morning, check the nightly run, and see red. Again. Not “we broke the build” red, but that annoying, flickering, “it worked… Continue Reading…
December 16, 2025
The recording of Webinar III in the AI Quality series is up! In our previous sessions, we focused on the basics of AI testing. We learned how to paddle the “AI Kayak” in the safe harbor of development. But eventually,… Continue Reading…
December 4, 2025
Who doesn’t like asserts? We have a habit of confusing “simple” with “easy.” In traditional automation, defining quality was simple. It was binary. Assert.AreEqual(expected, actual). It either matched, or it didn’t. Green or Red. But with AI, “Good” isn’t binary.… Continue Reading…
November 11, 2025
So, you did it. You built a fantastic “AI Testing Kayak.” You followed the AI Quality Funnel. You have developer tests, sanity checks, and a “Golden Dataset” that defines “good” responses. You even have an “Automated Scorecard” that runs in… Continue Reading…
November 9, 2025
Most people think of testing an AI feature like testing a chatbot – you assess the quality of a single response. But in our real-world systems, it’s almost never a single response. We build chains of calls. The response from… Continue Reading…
November 14, 2025
Last week, I published our new strategic model: Are You Building an “AI Testing Kayak” or an “AI Testing Ship”? That post explained the “why.” It defines the “Kayak” as the collection of fast, practical tactics for the practitioner in… Continue Reading…
November 7, 2025
If AI is so smart, why can’t it tell me how much I can trust its answers? Well, maybe it can… We talked about creating “Golden Datasets” – benchmarks that help us evaluate an answer from an AI model. (If… Continue Reading…
November 7, 2025
Ahoy, mateys! In our last webinar (recording is coming soon), “AI Testing: Beyond the Basics,” we did something amazing: we built a functional “AI Testing Kayak.” We created a framework for testing AI products that’s fast, practical, and, most importantly,… Continue Reading…