How to build a good AI product

So, you’ve got an idea that isn’t a “Chat with your PDF” app, and you want to launch something to production? Here are some best practices I’d recommend following (and will be doing so myself). This initially started as a section of a previous article The development of AI Development in 2024 but I thought it deserved its post. They are broken down by the high-level stages of a project’s lifecycle.

Prototyping

Spend the time doing in-depth user discovery and quality sample selection. This is especially important if you’re not an expert in the problem space. Jump on some video calls to watch users through their current process, and get a copy of the ideal outcome.
Research the latest technology and frameworks on offer at the moment. Even if it’s only been a few weeks since you last checked, it’s worth doing a quick deep dive for any major improvements or breakthroughs.
Take the goal and break it down as much as possible, as if you were solving a complex development problem. And just like all good software projects, it’s better to allow for more room than to not have enough.
Think through the steps a human goes through in detail. Be very analytical here and slow it down. You’ll be amazed at how many steps your thought process goes through to arrive at an answer. It may seem instant, but there are always steps, and we need to get the LLM to follow the same path.
Give the AI room to think before returning its response. Record this as well.

Building

Save every user input and capture both direct and indirect feedback. Indirect means actions they take that indicate a good or bad response, such as copying the response (good), editing the response before copying (okay, but could be better), or not using it at all (bad).
Spend the time to set up replicas of your prompts in a prompt playground for easy debugging and re-running. Tight feedback loops are important here because there’s no “hot reload” when testing prompts.
Use function calling to return the final response as structured data.

Deploying

High level of observability. There are a couple of promising platforms taking shape in this place, especially around cost monitoring and version control space. But I also think a lot of the time you can achieve the same result with your database and some SQL queries. Plus, things are changing rapidly, so extra layers can sometimes be a form of pre-mature optimisation - making it harder to adapt. Adaption is key.
Use frameworks like Langchain for prototyping, and see if you can achieve the same outcome by stripping things down for production. These frameworks can sometimes add a lot of unnecessary or redundant text to the final prompts.

Improvement

If suitable, share the raw response or chain of thought from the AI. If you’re dynamically inserting context, show this to the user as well. The more you’re able to include them in the process, the better feedback they’ll be able to provide. This also helps with buy-in.
Use AI to improve your prompts. It may be a little confronting, but if you’re running into issues, ChatGPT can help you fix ChatGPT. Be sure to provide the inputs, generation, and desired / correct output. Bonus points if you can create a feedback loop to edit its prompts based on user feedback 🤯

I’ll continue to update this list over time as I discover new tech and learn more from some current projects in play.