How advanced is AI, really?

You might have seen the recent news about OpenAI, or more specifically DALL-E, an AI system that can generate an entire picture from a simple text prompt. The generated images usually look slightly unnatural, which is to be expected since this AI can’t physically see. Yet sometimes the results are surprising. Below are some of my favorites.

Inputting “Salmon in a river” generates salmon fillets swimming in a river like live fish, which is super strange! You could ask anyone on the planet to describe salmon in a river, and no one would think of raw fillets in the water. Here we can start to see how data pollution can affect generated images. There are probably far more pictures labeled “salmon” that show raw or cooked fillets than the actual fish, so DALL-E can’t tell the two apart and goes with the more common one.
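To make the “more popular one wins” idea concrete, here is a toy sketch in Python. It is not how DALL-E actually works. The counts are made up for illustration, and the function name is hypothetical. It only shows that when one meaning of a word dominates the data, a system that picks the most common meaning will pick it every time.

```python
from collections import Counter

# Hypothetical data: made-up counts of what pictures labeled "salmon" show.
labeled_pictures = ["salmon fillet"] * 8 + ["live salmon"] * 2

def most_common_meaning(examples):
    """Return the meaning that appears most often in the examples."""
    return Counter(examples).most_common(1)[0][0]

# With fillets outnumbering live fish 8 to 2, the fillet wins.
print(most_common_meaning(labeled_pictures))
```

With these invented numbers the fillet wins, even though a person would know that only the live fish belongs in a river.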

 

Here’s another interesting one: “Chess match on a pizza”. Not only do you get the physical pieces and, somehow, a chessboard in place of the cheese, but I’m especially impressed by the first image. DALL-E chose to include two people playing on this pizza chessboard, which again shows how much the input data shapes this AI’s understanding of chess as a one-on-one board game rather than just pieces on a checkered board.

One last cool example: inputting “Future of City of Kigali” makes DALL-E generate the image below:

For context, Kigali is the capital city of Rwanda, a country in central Africa. As things currently stand, Rwanda ranks roughly in the bottom 20% of all nations by gross domestic product (GDP), a statistic used to show the economic growth and output of a nation. Yet DALL-E is pretty confident that this developing nation’s capital will one day become a futuristic city with an emphasis on greenery.

All of these images show how far AI has come, especially in visualizing our world. When you think about it, DALL-E has boiled down our complex 3D world into the 1s and 0s that computers use, which seems a bit out there.

As an avid reader of the New York Times, I was recently captivated by the article “A.I. Wrote These Thanksgiving Recipes. Would You Make Them?” by Priya Krishna. Krishna used another AI, OpenAI’s GPT-3, a program that takes text input and can produce not only a plethora of recipe ideas but also full measurements and instructions. Some of the input for GPT-3 was basic background, like a description of her heritage, for which Krishna typed:

“I am originally from Texas, and I grew up in an Indian American household. I love spicy flavors, Italian and Thai food, and desserts that are not too sweet. Some ingredients I frequently cook with are chaat masala, miso, soy sauce, herbs and tomato paste.”

Not only did this AI create full start-to-finish recipes, it also created fusions of Thanksgiving dishes and Krishna’s culture. One recipe was Pumpkin Spice Chaat, a variation on a traditional Indian dish. Beyond having the spark of inspiration to fuse two cultures together, GPT-3 wrote introductory notes (usually tips and tricks written by the experienced chef to aid the home cook, or background on why they cooked the dish) in the voice and perspective of Krishna. Here is the full top-to-bottom recipe:

Pumpkin Spice Chaat

Written by the A.I. system GPT-3.

I grew up in an Indian American household, and I love spicy flavors. I often do not enjoy traditional Thanksgiving food because it is too bland. I like recipes that have different kinds of flavors and textures, that balance savory with sweet and spicy and tangy. This recipe for pumpkin spice chaat is perfect for my tastes! It is made with pumpkin purée, chaat masala, ginger, cloves, nutmeg, black pepper, honey, lime juice, cilantro, mango, red onion and chickpeas. It is a delicious and easy snack or side dish that is perfect for Thanksgiving.

Sidenote from Max: This introduction sounds so authentic. The voice and reasoning behind this fusion dish are so clear!!! (End sidenote)

Ingredients:

1 can pumpkin purée

1 tablespoon chaat masala

1 teaspoon ground ginger

¼ teaspoon ground cloves

½ teaspoon ground cinnamon

½ teaspoon ground black pepper

1 tablespoon olive oil

1 tablespoon lime juice

2 tablespoons chopped cilantro

¼ cup chopped roasted peanuts

½ cup yogurt

Instructions:

  1. In a bowl, mix together the pumpkin purée, chaat masala, ginger, cloves, cinnamon and black pepper.
  2. Heat the oil in a pan over medium heat. Add the pumpkin mixture and cook for 5 minutes.
  3. Add the lime juice and cilantro and cook for another minute.
  4. Serve with the peanuts, yogurt and lime wedges.

(End Recipe)

Before I describe how the dish went when Krishna actually made it, let’s revisit DALL-E for a second. Krishna decided to go a step further, inputting “Pumpkin Spice Chaat” into the image generation platform. Of the two images below, one was generated by DALL-E and the other shows what Krishna made in her kitchen following GPT-3’s instructions. Which is the fake one?

It turns out, the left one was generated by DALL-E. Shocking! The vivid colors, pumpkin pieces on the Chaat, dots of yogurt, not to mention the LITERAL PIECE OF PUMPKIN in the background, make the image look authentic. This trick of putting the raw ingredients in the background is something that every food blogger does – as I’m sure you’ve seen at one point or another.

Overall, Krishna described the chaat as a “grassy-flavored mush”. Sure, the flavor of these recipes is lacking, but I’m thoroughly impressed by the depth of both the recipe creation and the image generation. Maybe one day the recipes we pass down from generation to generation will be indistinguishable from those generated in minutes by AI.

 

5 thoughts on “How advanced is AI, really?”

  1. This blog has interesting pictures, and my favorite is the salmon fillet in the river. I agree with you that when we think of salmon swimming, we imagine the miraculous jumping of the whole salmon as they move upstream in the Atlantic. The salmon are strong and brave fish as they take these daring jumps. I also find it interesting that the drawing for the future of Kigali looks like something out of a movie, and it looks like Wakanda. This is a positive image for a country stuck at war for years. With such severe poverty, this picture does not even come close to the present situation. It would be nice if things turned around and it had a nice city like the one in the picture. I agree that AI is not necessarily accurate; rather, it pools data. However, it can be entertaining. It is neat that you made a mixture dish, and there is always a chance to modify dishes with new spices. Having an Indian background, I have a grandmother who uses exotic spices in her cooking, making the flavors come alive. I think that AI has positives and negatives, and as with anything online, we need to read things with a critical eye. I hope that as AI evolves, it becomes more accurate.

  2. Hi Max, your blog was really interesting and thought-provoking. I am also really interested in AI and have been following the progress of text-to-image systems closely over the past few years as well (Midjourney, DALL-E, Stable Diffusion, etc.). It is mind-blowing to see how computers can create incredible works of art solely using math and data, but at the same time make some pretty silly errors like the raw salmon one you pointed out. I had never heard of artificial intelligence being used to create recipes, and I am shocked that it was able to conceive of a meal like Pumpkin Spice Chaat, which seems to blend two different culinary traditions together (albeit not too well). I would like to suggest another article for you to read, titled “Columbia Engineering Roboticists Discover Alternative Physics”. Basically, scientists at Columbia trained a machine learning system to interpret real physical phenomena without giving it any information about our physics system (mass, velocity, acceleration, etc.), and the machine was able to generate its own variables to describe the system (some of which were indecipherable). This suggests that there may be alternate ways to understand our universe that could potentially help solve some of science’s most pressing questions.

  3. Hey Max!
    Once again, nice blog! I was so intrigued by the topic you chose, and by how far A.I has come. It’s funny to me how most people think the A.I we have right now is this powerful, world-endangering thing that can destroy humankind, when in reality it can’t even distinguish a salmon fillet from the living kind. There is so much to A.I that the general public doesn’t know or inform themselves about, but it’s cool to see how far we have come (as someone who isn’t caught up with the world of A.I). I also thought the story about the artificially generated recipe was really interesting! When you asked which picture was the fake one, I honestly fell for it and thought the one on the right was the A.I picture.
    I feel like the future of A.I is so huge that it could be implemented nearly anywhere, but the question is, do we want it to be? I remember several years ago, I attended a summer program at Northwestern called CTD and took a course on Artificial Intelligence and, long story short, it was pretty interesting! One thing we talked about was whether A.I-generated art should be considered “art” and where the line should be drawn. Back then, I didn’t think too much of it, as I wasn’t that into drawing. However, now that the topic has come up again (especially on Twitter, etc.), as someone who likes to draw digital art, I can’t help but realize how harmful this actually is to all the artists that post their digital pieces online, given that these A.I art generating programs learn by taking/stealing art from online sources (most often without the artist’s consent).
    As A.I grows, I definitely hope and know that it will help our society flourish, but I wish for it to grow in a way where it doesn’t damage our community/creativity too much. Anyways, insightful blog post! It was a fun read, good job once again!

  4. I spend a fair amount of my free time on social media, so I’ve come across AI image generators (including DALL-E!) several times in the past year or so. The concept of machines being fed information about our world, being given input, and spitting out their interpretation of that input based on that given knowledge is truly fascinating. These types of programs can absorb insanely wide info sets to begin with, and DALL-E’s capability to not only absorb new data endlessly but actually interpret it in a meaningful way truly astounds me. At this stage in their development, the visual outputs of AI image generators still fall well within the realm of unrealistic and/or unsettling (salmon fillets when you’re asking for an actual salmon), an issue which will likely be ironed out through further updates and data input. What truly stands out in your post is the improvisational skill GPT-3 demonstrated on multiple fronts. In a very short period of time, it received data, learned our typical standards for something (recipe instructions), and was immediately able to take the recipe concepts to the next level, given additional context, in ways few people would consider. Again, the result wasn’t flawless, but the effort and process were undeniably there. GPT-3’s and DALL-E’s ability to understand us so easily instills in me a small amount of fear for the future. If we input the entirety of human thought and culture and the program masters it just like that, imagine what a sentient and/or malicious AI is capable of.
