A fun web app to generate witty and funny captions for your photos.
Built with Google's Gemini API, using Gemini-1.5-flash-8b to generate witty captions for user images.
Ever wanted to add a witty caption to your photos but couldn't think of one? Whimsical Captions is here to help! It is a fun web app that uses Google's Gemini API, specifically Gemini-1.5-flash-8b, to generate witty and funny captions for your photos.
Here are a few examples of captions generated by Whimsical Captions:
The Landing Page!


You can visit Whimsical Captions to generate your own fun and witty captions! (Note: It might take up to a minute to generate the first caption if the app hasn't been used for a while 🙂)
I built Whimsical Captions to explore the capabilities of Google's Gemini API and to create a fun and engaging web app that users can enjoy.
With the advancements in generative AI, I wanted to build something to see its capabilities and how I might use it in a different project in the future.
I like building silly and simple projects, especially to "dip my toes into the water" to test out whatever I'm learning and exploring.
Your image is sent to the Whimsical Captions backend server, which forwards it to Google's Gemini API along with a custom prompt I wrote, asking it to generate a witty caption for your image.
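As a rough illustration of that backend step, here is a minimal sketch of how the image and prompt could be packaged for Gemini's `generateContent` REST endpoint. The prompt text and the function names `buildCaptionRequest`, `extractCaption`, and `generateCaption` are hypothetical; the app's actual prompt and code are not shown here, and it may well use Google's SDK rather than a raw `fetch` call.

```javascript
// Hypothetical prompt; the app's real prompt is not shown in this README.
const CAPTION_PROMPT = "Write one short, witty caption for this photo.";

// Model named in this README, addressed through the Gemini REST API.
const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-8b:generateContent";

// Build the JSON body: one text part (the prompt) and one inline image part (base64).
function buildCaptionRequest(imageBuffer, mimeType, prompt = CAPTION_PROMPT) {
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: mimeType, data: imageBuffer.toString("base64") } },
        ],
      },
    ],
  };
}

// Pull the caption text out of a generateContent response, or return null if there is none.
function extractCaption(responseJson) {
  const parts = responseJson?.candidates?.[0]?.content?.parts ?? [];
  const text = parts.map((part) => part.text ?? "").join("").trim();
  return text.length > 0 ? text : null;
}

// Send the image to Gemini and return the caption (needs Node 18+ for global fetch).
async function generateCaption(imageBuffer, mimeType, apiKey) {
  const response = await fetch(GEMINI_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(buildCaptionRequest(imageBuffer, mimeType)),
  });
  if (!response.ok) {
    throw new Error(`Gemini request failed with status ${response.status}`);
  }
  return extractCaption(await response.json());
}
```

In an Express route, `imageBuffer` and `mimeType` would come from the uploaded file, and the API key would stay on the server so it is never exposed to the browser.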
The image is then loaded on the screen with a cool polaroid border along with the generated caption. The app also reads your uploaded file's "last modified" date and displays it alongside the caption 😄
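The date lookup can be sketched with the browser's standard `File.lastModified` property, which holds a timestamp in milliseconds since the Unix epoch. The helper names `formatLastModified` and `buildPolaroidCard`, the `en-US` locale, and the date style are assumptions for illustration, not the app's actual code.

```javascript
// Format a File's lastModified timestamp (milliseconds since the Unix epoch) for display.
// timeZone is optional; when omitted, the viewer's local time zone is used.
function formatLastModified(file, timeZone) {
  const date = new Date(file.lastModified);
  return date.toLocaleDateString("en-US", {
    year: "numeric",
    month: "short",
    day: "numeric",
    timeZone,
  });
}

// Combine the caption and date into the data a polaroid-style card would render.
function buildPolaroidCard(file, caption, timeZone) {
  return { caption, dateLabel: formatLastModified(file, timeZone) };
}
```

In a React component, `file` would typically be `event.target.files[0]` from an `<input type="file">` change handler.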
Cool candy cane style background 🍭
Polaroid border for your images 📸
Witty captions generated by Google's Gemini API 🤖


Whimsical Captions was built using ReactJS (HTML, CSS, JavaScript) for the frontend. The backend was built using Node.js and Express.js. The app uses Google's Gemini API to generate captions for user images.
Currently, the app is deployed on the free tier of a deployment platform (Render). If the application is not used for more than 15 minutes, the backend server shuts down and can take over a minute to boot back up and serve a new request. This makes the first caption after an idle period noticeably slow.
To improve the app, I can deploy it with the paid tier to ensure that the backend server is always running and can serve requests quickly.
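Short of moving to a paid tier, one possible client-side mitigation (not something the app is stated to do) is to retry the caption request with increasing delays while the backend wakes up. The sketch below assumes a hypothetical `withRetry` helper; the attempt count and delays are illustrative defaults, not tuned values.

```javascript
// Resolve after ms milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call requestFn until it succeeds or attempts run out, waiting longer after each failure.
async function withRetry(requestFn, { attempts = 4, baseDelayMs = 5000, wait = sleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await requestFn();
    } catch (error) {
      lastError = error;
      // Skip the wait after the final failed attempt.
      if (attempt < attempts - 1) {
        await wait(baseDelayMs * 2 ** attempt); // 5s, 10s, 20s with the defaults
      }
    }
  }
  throw lastError;
}
```

The frontend could wrap its upload call in `withRetry` so a cold backend leads to a longer wait rather than an immediate error.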


Building Whimsical Captions was a fun and engaging experience. I enjoyed exploring the capabilities of Google's Gemini API and creating a fun web app that users can enjoy. I also learned a lot about generative AI and how it can be used to create engaging content.