Apart from researching and writing about food, I worked with ChatGPT-4 as a food photography coach to help me plan photos and get feedback during the photo session. With today’s post about food photography, we have reached the last post in the project about collaborating with ChatGPT-4 for a food blog post. We have already seen the result, a post about Sweden and Swedish vegan semlor written by ChatGPT-4. And last time, I shared my experiences working with ChatGPT-4 as a coauthor for the blog. Today, I want to share what I find to be the most exciting possibility with ChatGPT-4: working with them as a food photography coach. For this, we rely on their ability to use images as input and output to give feedback and show suggestions.
A series on working with ChatGPT-4 for a food blog post
Chances are high you have heard of both text-based AI models and AI-generated images in the past year, as both of these technologies have recently made a huge jump in quality and availability to everyone. The text-based models, so-called large language models (LLMs for short), in particular have received a lot of attention, both good and bad. If you are not too familiar with large language models, you can read a short introduction at the bottom of the first post.
In this series, I explore using ChatGPT-4, one of the latest and most developed LLMs, to write a post for the blog. Last time, we looked at how I used ChatGPT-4 to generate the text for the post about Swedish vegan semlor. One neat thing about ChatGPT-4 in particular is that it is multimodal, which means it does not only deal with text. It also accepts images as input and can use another AI tool (DALL-E) to create new images, such as the images in this post (based on photos of myself). In other words, in addition to “just” generating the text for a blog post, ChatGPT-4 has the potential to be a food photography coach, where I “show” them my photos and ask for feedback. I briefly tried using ChatGPT-4 as a food photography coach in the recent post about Kiribati, where I “showed” some pictures and ChatGPT-4 suggested I rearrange the gai lan and add a bit of red to complement the green. In this post, I will detail how I worked with ChatGPT-4 for the photo shoot of Swedish vegan semlor.
Posts in this series about using ChatGPT-4 for a food blog post:
- Sweden: vegan semlor – a food blog post by ChatGPT-4
- Collaborating with ChatGPT-4 for a food blog post
- ChatGPT-4 as a food photography coach (this post)
ChatGPT-4 as a food photography coach
ChatGPT-4 is a multimodal AI tool, which means it is not limited to working with one medium such as text. For ChatGPT-4, it means that they can work with images as both input and output, in addition to text. Multimodal AI is nothing new in itself, but it is a recent addition to ChatGPT’s toolbox. To generate images, you can just describe an image and politely ask ChatGPT to create it for you. For the blog post about semlor, ChatGPT created the majority of the images from simple prompts like:
Me: I'd like for you to generate the entire [set of images for the post]. We can do one image at a time. Let's start with the photos that go with the recipe you created. First image: a stainless steel bowl with the dry ingredients. The recipe step is "In a large bowl, combine the flour, sugar, yeast, cardamom, and salt."
The generated image is shown below. While the image looks quite good at first glance, you soon start to notice odd details like what appears to be giant yeast grains or multi-coloured peppercorns (10 o’clock in the bowl), whole cardamom pods, and multiple types of sugar, salt, ground spice, or flour.
Generating images
For the blog post, I tried to have ChatGPT-4 generate all food images, including the instructive images I like to have at the end of each post. For one step in particular, ChatGPT-4 really struggled. Once the sweet buns have cooled, you have to cut them horizontally to expose the interior (where the almond paste and whipped cream go). I could not get ChatGPT-4 to show us how this is done. Once I had compared the cut surface to a hamburger, I could not stop them from adding sesame seeds to the semla. This is where I decided to take photos of the entire baking process and include them in the post, so you could recreate the semlor more easily.
The prompts that created the images below were:
Image 1: Eighth image: showing a single bun with the lid cut off and part of the interior scooped out with a spoon to allow space for the almond paste. The instruction reads "Cut off the top third of each bun and set the tops aside. Hollow out a small portion of the middle of each bun."
Image 2: Hmm. Could you try again. The bun should have the top cut of horizontally, more like a hamburger (no sesame seeds). Just the top quarter to third of the bun. The buns should otherwise look similar to your previous rendering (like the ones coming out of the oven)
Image 3: Thank you. I would argue that the image has a vertically cut bun, not horizontal though. Could you please try again? Make it more like a semla without cream, just the lid lifted off. Thank you!
Using photos as input
By now, using an AI tool to generate images is old news. There is a wide range of free-to-use and paid tools with large numbers of users exploring and pushing the limits of AI-generated art, with increasingly photo-realistic results that are hard to tell from reality. What ChatGPT-4 can do that is pretty neat is to use images as input. That is, you can upload a photo and ask questions about it. For instance, the rendered images of myself and the ChatGPT robot avatar are based on a photo of me. And I must say, I think they are pretty good. ChatGPT represented themselves as a friendly robot or robot picture on a screen (their own choice).
ChatGPT-4 as a food photography coach – choosing props
When enlisting ChatGPT-4 as a food photography coach, I wanted their help to plan a photoshoot. The first step was for me to tell them about the plates, backdrops, and fabrics I have at hand, so they know what they can include in the photo. For this, I took a series of photos just using my cell phone camera. I then created small montages (using PowerPoint + print screen; very hi-tech) which I uploaded to ChatGPT-4 to analyze.
To make sure ChatGPT-4 knew what they were “looking” at, I asked them to describe the different backdrops, plates, and fabrics. I must say they did a decent job describing the items, but the descriptions were a bit tricky to decipher. For instance, the backdrops were described with locations like “top left”, “middle left” and “bottom left”, despite there only being two rows of objects. For the plate image, I was impressed that they caught that one object was not a plate but rather “a small stone or slate tile, possibly used as a coaster or for presenting smaller items.” An extract of our conversation is shown below. Based on my desire for a dark, rustic photo with dramatic side lighting, they suggested backdrops, plates, and fabrics to use in the shoot.
Me: For the photoshoot, I was thinking something rustic, a bit dark and with some nice side lighting. Here's a montage of 10 of my photography backgrounds. Can you help me choose one to be the table surface in the photo?
ChatGPT-4: Certainly, let's look at each of the backgrounds you've provided to select one that would be most suitable for a rustic and slightly dark aesthetic with side lighting:
1. Top-Left: A dark, mottled grey background with a weathered concrete look.
2. Top-Center: A white wooden surface with peeling paint, giving a vintage, distressed feel.
3. Top-Right: A light wooden surface with a natural grain, clean and minimalist.
4. Top-Far-Right: A dark slate with dramatic white veining, modern and striking.
5. Middle-Left: A dark brown wood with strong grain patterns, classic and warm.
6. Middle-Center: A grey, aged wooden surface, rustic and full of character.
7. Middle-Right: A faded, grunge-like surface with hints of brown and grey, atmospheric and moody.
8. Bottom-Left: A soft gradient from dark to light grey, subtle and versatile.
9. Bottom-Center: A grey, textured surface that looks like aged plaster or concrete.
10. Bottom-Right: A dark, mottled grey with high texture and a sense of antiquity.
Considering your desire for a rustic look with a bit of darkness to enhance the side lighting, I would suggest background 6 (Middle-Center) or background 9 (Bottom-Center). Both have a textured, weathered appearance that would lend a rustic charm to the photographs, and their neutral tones will contrast nicely with the golden brown of the semlor, making the buns stand out. The texture will also play well with side lighting, creating interesting shadows and highlights.
ChatGPT-4 as a food photography coach – planning the shot
After evaluating my backdrops, plates, and fabrics, ChatGPT-4 chose a few of them for me to use in the photoshoot. I described the desired final photo as “rustic, a bit dark and with some nice side lighting” and asked them to use the chosen props and render a few different views, ranging from closeups to a collection of semlor. I must say I am very pleased with the resulting images. Even though they are not photo-realistic and have some AI quirks (like a slate fading into the table, the camera being in the photo and the semlor lacking cream), the composition, the props, and the overall lighting are excellent suggestions. Trying to replicate these images in a photograph of my own really helped improve my work.
ChatGPT-4 as a food photography coach – giving feedback
Apart from planning the photoshoot based on my own props, I wanted to take the collaboration one step further. I wanted ChatGPT to give me feedback on my ongoing photoshoot. So for the photo showing a plate of semlor, I started really basic and just plopped down a plate with 4 semlor on a table, snapped a blurry pic, and asked what they thought of the picture and how to improve it. And I must say they gave really good and patient feedback, pointing out that the pic was blurry and giving suggestions on how to fix this (tripod, better lighting, change aperture settings, etc.). They also suggested rearranging the semlor to have some in focus in the foreground and a few out-of-focus ones further back. I incorporated the feedback, snapped another pic, and got new feedback. I did this for a few iterations, including post-processing, until we were both happy with the resulting image.
The gallery below shows the photos I took and the feedback ChatGPT-4 gave me. Some of ChatGPT’s suggestions:
- Image 1: Improve focus and showcase some semlor in the foreground and some in the background
- Image 2: Improve the lighting, in particular its angle (the first two images only used ceiling lights); add some props such as flour, almond paste, and a sieve with powdered sugar
- Image 3: Sprinkle some icing sugar around the buns and show some almonds; use the rule of thirds in the composition (place a key object on one of the 1/3 lines in the image)
- Image 4: Photo is good; use post-processing to adjust color balance and contrast, and add a vignette (darker edges with lower saturation)
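For the curious, two of the suggestions above (the rule of thirds and the vignette) are easy to put into numbers. The snippet below is only a rough sketch of my own, not something ChatGPT produced and not how any photo editor actually does it. The function names, the `strength` parameter, and the quadratic falloff are all assumptions made for illustration, and the vignette part only darkens the edges (it does not lower the saturation).

```python
import math


def thirds_lines(width, height):
    """Return the x and y positions (in pixels) of the rule-of-thirds lines."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    return xs, ys


def power_points(width, height):
    """Return the four intersections of the thirds lines.

    These are the spots where a key object (say, the front semla) can go.
    """
    xs, ys = thirds_lines(width, height)
    return [(x, y) for y in ys for x in xs]


def vignette_factor(x, y, width, height, strength=0.5):
    """Brightness multiplier for the pixel at (x, y).

    Assumed model: 1.0 in the centre of the image, falling off with the
    square of the distance, down to (1 - strength) in the corners.
    """
    cx, cy = width / 2, height / 2
    # Distance from the centre, scaled so that a corner has distance 1.0
    distance = math.hypot(x - cx, y - cy) / math.hypot(cx, cy)
    return 1.0 - strength * distance ** 2


if __name__ == "__main__":
    # Example image size in pixels (an arbitrary choice for illustration)
    print(thirds_lines(3000, 1500))
    print(power_points(3000, 1500))
    print(vignette_factor(0, 0, 3000, 1500))
```

Multiplying each pixel's brightness by `vignette_factor` would give a simple darkened-edge look; in practice I just used the vignette slider in my editing software.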
Conclusion
Working with ChatGPT-4 as a food photography coach seems to be a great way to improve my work. Planning a photoshoot and looking at inspirational images from other photographers is useful. But having ChatGPT render images with my own plates and props takes this to the next step. They can help me plan a shoot with what I already have at hand or suggest what props, plates, and bowls I should aim to buy. Trying to recreate their rendered image on my own with my lighting system, cardboard pieces to sculpt light, and moving my props around was really useful for this shoot. Having a vision to work toward and seeing a preview, if you will, is both motivational and a good guide.
I see this exercise of using ChatGPT for food photography as something I will continue for a while to grow as a photographer and help build my skills. But it takes more time for sure, especially when asking for feedback during a live shoot, since I have to upload my pics to the chat and discuss them. But the feedback can be really useful. For this post, I started with an extreme case of a very bare and blurry pic to see where ChatGPT would take it. For a real use case, I would try and make something really nice to start with, reducing the number of required iterations (hopefully). The feedback I received fit well with my own experience over the years and with two online food photography classes I took on Udemy. So ChatGPT does have at least some food photography info in their training data and knows how to apply it to uploaded photos.
End of series
With this post, we have concluded our short series about working with ChatGPT-4 for a food blog post and using ChatGPT-4 as a food photography coach. I hope you have enjoyed this little foray into using a multimodal LLM for food blogging and maybe learned a thing or two, either about the power of current LLMs or some tricks to use in your own work. If not, maybe you learned something about Swedish cuisine or semlor at least!
Next time, we’re back to the normal vegan world tour which will take us to Southeast Asia. If you don’t want to miss the next post, be sure to subscribe to the newsletter.
Bloopers – choosing plates
When I tried to get ChatGPT to render an image showing the two of us choosing plates, I had quite a struggle, as shown in the gallery below. Lots of confusion!