Introduction
BaristaAI is a personal project I worked on after buying an espresso machine and wanting to improve the coffee I was making.
Many variables affect the quality of an espresso, and its taste often hints at which of them need tweaking to get the best out of your coffee.
I created BaristaAI as a simple but all-encompassing assistant app to help with coffee brewing. It uses a large language model (LLM) that has been given some initial context to steer conversations towards useful coffee brewing tips.
Examples
I use this application myself quite frequently to help me improve the coffee I brew. For example, yesterday I made an espresso with some new beans but found it tasted overly bitter.
I typed my issues into BaristaAI and was given some useful tips to improve my next espresso, such as trying a coarser grind setting.
Today, after making the adjustments suggested by BaristaAI, my coffee tasted a lot better. I can now iterate, providing updated feedback on the taste of the coffee to BaristaAI to continue moving towards great coffee, making a note of the ideal settings for when I make coffee with these beans again in the future.
Some Technical Details
- BaristaAI is built with .NET MAUI (currently targeting .NET 8), and can be deployed for Android, iOS, or Windows Desktop
- Powered by Google Gemini (configured to use the `gemini-1.5-pro` model by default) via the `generative-ai` package for .NET
- Abstracted LLM service, making it easy to swap out the implementation that uses Google Gemini with other LLM APIs (e.g. OpenAI) in the future
- Uses dependency injection (DI)
- Markdown formatting support
  - Markdown LLM output is displayed using a custom `MarkdownView` control which converts markdown to HTML to be displayed using a `WebView`
- Follows MVVM design pattern
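To illustrate how an abstracted LLM service and dependency injection fit together, here is a minimal sketch. The names `ILlmService`, `EchoLlmService`, and `ChatViewModel` are hypothetical and are not taken from the BaristaAI source. The stub implementation stands in for a Gemini-backed class so the example runs without an API key.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical abstraction: the view model only ever sees this interface.
public interface ILlmService
{
    Task<string> GetResponseAsync(string prompt);
}

// Stand-in implementation so the sketch runs without an API key.
// A Gemini-backed (or OpenAI-backed) class would implement the same interface.
public sealed class EchoLlmService : ILlmService
{
    public Task<string> GetResponseAsync(string prompt) =>
        Task.FromResult($"(stub reply to: {prompt})");
}

// The view model receives the service through its constructor
// and never references a concrete provider.
public sealed class ChatViewModel
{
    private readonly ILlmService _llm;

    public ChatViewModel(ILlmService llm) => _llm = llm;

    public Task<string> AskAsync(string question) => _llm.GetResponseAsync(question);
}

public static class Program
{
    public static async Task Main()
    {
        // Wired by hand here; in a MAUI app the container would do this, e.g.
        // builder.Services.AddSingleton<ILlmService, EchoLlmService>();
        var viewModel = new ChatViewModel(new EchoLlmService());
        Console.WriteLine(await viewModel.AskAsync("My espresso tastes bitter."));
    }
}
```

With this shape, swapping providers would only mean registering a different `ILlmService` implementation. The view model and the UI would not need to change.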
Future Improvements
User Interface
BaristaAI’s user interface isn’t particularly flashy at the moment as I’ve primarily created this as a utility for myself, but this is the main area I’m looking to improve on in the future with the following planned features:
- Navigation side bar
  - Allows other pages, such as an “Options” page for customizing colors and defaults, to be added
- Pressable icons for brewing method
  - Rather than requiring the user to describe the brewing method in their text prompt, a few icons at the top of the UI will make it quicker to switch between methods
  - The user will also be able to leave the brewing method unspecified, in case they’re using a more niche method that wasn’t considered (e.g. AeroPress)
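One way the planned brewing method icons could feed into the prompt is sketched below. This feature is not implemented yet, so everything here is an assumption: the `BrewingMethod` values, the `PromptBuilder` helper, and the wording of the prefix are all hypothetical.

```csharp
using System;

// Hypothetical set of methods the icon row could offer.
public enum BrewingMethod
{
    Unspecified, // user opted out, e.g. for a method without its own icon
    Espresso,
    PourOver,
    FrenchPress
}

public static class PromptBuilder
{
    // Prefixes the user's message with the selected method,
    // or passes it through untouched when none is selected.
    public static string Build(BrewingMethod method, string userMessage)
    {
        if (method == BrewingMethod.Unspecified)
        {
            return userMessage;
        }

        string label = method switch
        {
            BrewingMethod.Espresso => "espresso",
            BrewingMethod.PourOver => "pour-over",
            BrewingMethod.FrenchPress => "French press",
            _ => throw new ArgumentOutOfRangeException(nameof(method))
        };

        return $"Brewing method: {label}. {userMessage}";
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(PromptBuilder.Build(BrewingMethod.Espresso, "It tastes overly bitter."));
        Console.WriteLine(PromptBuilder.Build(BrewingMethod.Unspecified, "My AeroPress brew tastes sour."));
    }
}
```

Treating "unspecified" as an ordinary enum value keeps the opt-out path explicit, so the user can still describe a niche method in their own words.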
Image Generation
At the moment, the application only generates text; however, Gemini and many other LLM APIs provide image generation capabilities too. One improvement I would like to make is to allow BaristaAI to generate images and show them to the user where appropriate.
For example, the app could show the user a diagram demonstrating suggested changes to their brewing process.
Migrating LLM API Requests from Client-Side to Cloud
At the moment, the app performs Gemini API requests directly from the BaristaAI client application. For this reason, the user must enter their own Gemini API key before using the generative AI features of the app. I have implemented it this way for now to avoid shipping an API key inside the application, which would be a security risk.
However, a potential future improvement would be to use a cloud-based server application or serverless functions service (e.g. Google’s Cloud Run Functions / AWS Lambda) for API requests, with the BaristaAI app instead reaching out to this cloud service to get generated responses. Users would need to authenticate with this server before they could use the application.
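As a rough sketch of what the client side of this could look like, the example below builds the request the app might send to such a backend instead of calling Gemini directly. The endpoint URL, the JSON shape, and the use of a bearer token for user authentication are all assumptions made for illustration. No such service exists yet, and the request is only constructed, not sent.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

public static class BackendRequestFactory
{
    // Placeholder URL; no such service exists yet.
    private static readonly Uri Endpoint = new("https://example.com/api/chat");

    // Builds the request the client would send to the backend.
    // The Gemini API key never appears here; it would live server-side.
    public static HttpRequestMessage Create(string userToken, string prompt)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, Endpoint)
        {
            Content = new StringContent(
                JsonSerializer.Serialize(new { prompt }),
                Encoding.UTF8,
                "application/json")
        };
        // Assumed auth scheme: a per-user bearer token issued by the backend.
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", userToken);
        return request;
    }
}

public static class Program
{
    public static void Main()
    {
        using var request = BackendRequestFactory.Create("user-session-token", "My espresso tastes bitter.");
        Console.WriteLine($"{request.Method} {request.RequestUri}");
        Console.WriteLine(request.Headers.Authorization);
    }
}
```

The point of this arrangement is that the client only ever holds a user-specific token, while the provider API key stays on the server.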