The New Year is upon us, and so is the time to think about what’s coming next in the world. 2022 has been an incredible year for breakout technology (from protein folding to bringing the fail whale back to Twitter!).
For AI in particular, it marks a year when generative AI truly hit mainstream consciousness. Midjourney reached the Discord server limit and onboarded more than 1 million users. Lensa turned people into new versions of themselves (sometimes not entirely appropriately). Most dramatically of all, ChatGPT took the world by storm in December with its highly believable generated text responses to almost any question.
Using AI for creativity is the core of what we’re doing at Timewarp, and it’s been an interest of mine for many years, so seeing all the end-of-year AI predictions flying around, I figured I’d give my own take on what 2023 is going to bring us.
Let’s get started!
#1. Generative AI will continue to generate headlines
There is something particularly fascinating about a machine producing a fully formed text or creative work. The results pull mental strings because the pieces themselves are often powerful. Using Lensa to produce variants of our own faces is shocking because we see ourselves staring back. Perhaps this is what Kings and Queens from previous centuries felt when seeing the results of long sittings for a portrait.
The explosion in the usage of these tools will continue to make them better and most likely more surprising. As millions of transformations are generated by users and subsequently scored, the algorithms learn and improve. By their nature, these generative processes also explore the space of possibilities in such a way that they’ll often generate things a single user doesn’t expect (see a discussion of this here).
In addition to the shock headlines, we’re likely to see technical improvements. This means that some popular audio, video, and text content, at the very least, will start to contain AI-generated material, with each small piece raising an “XYZ uses AI-generated content” headline.
All this means there’s unlikely to be any let-up in the news articles. I’d expect generative-AI-related news coverage to be 3-5x greater in 2023 than it was in 2022.
#2. People will generally understand that the AI is “winging it”. Discernment of when that’s fine may get lost.
Generative AI works by learning relationships between huge numbers of images, texts, prompts, and other relevant media and then using input prompts to explore the space of possible analogous results. This is the equivalent of saying, “what might look like a competent answer to this question based on all the questions and answers I have seen?”. In other words, superficial similarities are being remixed and combined to produce something plausible, but there is no deep structural understanding of the question being asked.
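A toy way to see this “plausible remixing” in action is a simple bigram Markov chain. This is a far cruder model than the transformers behind ChatGPT, and the corpus here is invented for illustration, but the principle is the same: sample statistically likely continuations with no model of meaning underneath.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Learn which word tends to follow which -- surface statistics only."""
    words = text.split()
    model = defaultdict(list)
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model, start, length=8, seed=0):
    """Sample a 'plausible' continuation with no understanding of the content."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = ("the model answers the question the model saw "
          "the question looks plausible the answer looks plausible")
model = train_bigrams(corpus)
print(generate(model, "the"))
```

The output is grammatical-looking word salad: every pair of adjacent words has been seen before, so it feels familiar, but nothing guarantees the whole is true or coherent.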
The resulting effect is very similar to what one might find on a Sci-Fi movie set. From the right angle, the result looks very believable and even spectacular. Behind the scenes though, everything is held together with string, green screens, and duct tape.
The underlying errors this can create are absolutely fine in some domains (art, music, story-writing, etc.), especially if a human creator is able to generate many variants and fuse the best (using their creative judgement and knowledge of the world). Such errors would be disastrous in other fields (medicine, structural engineering, code, to name but a few). In these instances, the resulting artifacts are potentially so complex that they can’t easily be validated, yet they look “plausible”.
The quirky results of art generators will no doubt amuse us. The idea that we might blindly use ChatGPT instead of search is altogether more worrying (see below!).
#3. Text to Video, Text to 3D.
Throughout 2022 we’ve been wowed by prompt-to-image and prompt-to-text applications in particular. We’re now also seeing fledgling new applications in:
- Prompt-to-Video: Meta’s demo is a great example, and Google also has early versions.
- Prompt-to-3D model: OpenAI’s Point-E is one of the first examples of this. There’s also a great in-depth article by Dariusz Gross on this topic.
These are much harder problems, and there is also less data available. However, as people play with these tools, they will improve rapidly. This problem is significantly harder than image generation, so in 2023 it’s likely that breakthroughs will be limited to certain types of video and objects. For example, there is already a plethora of “talking head” type video generators (see Synthesia, for example).
#4. Prompt Engineering and Prompt Management
“What was your prompt for that?” is probably the most common comment on Reddit and Facebook forums for generative art. In other words, “How on earth did you get that?”. Unfortunately, it’s not always apparent to a creator using one of these systems exactly why they got a certain result. Running the query again may well generate something very different, particularly if the underlying system is actively learning.
Prompt engineering is thus already emerging as a clear requirement for using generative AI at scale. One way to think of art, programming, or almost anything else in the far future is “design the prompt that will get you what you need”.
A few places to get started with prompt engineering:
- “Prompt Engineering: The Career of the Future“
- “A Complete Introduction to Prompt Engineering For Large Language Models”
The challenge, however, is that there is such a huge gap between the abstract nature of the prompt (and the information it contains) and the specificity and detail of the result that a deeper and deeper understanding of the model may be required to produce useful results.
In addition, we’ll likely need to get much better at “Prompt Management”, which I would term as tracking the prompts used, their results, and using the resulting log to understand the prompts better. We’ll need software solutions to help us do this at scale, especially when teams of people are collaborating and using AI.
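As a minimal sketch of what such prompt management might look like, here is a hypothetical append-only prompt log in Python. The file name, fields, and helper are all invented for illustration, not taken from any existing tool; the point is simply that every prompt, its parameters, and a pointer to its result get recorded so a team can later reconstruct what produced what.

```python
import hashlib
import json
import time
from pathlib import Path

LOG_PATH = Path("prompt_log.jsonl")  # hypothetical local log file, one JSON record per line

def log_prompt(prompt, model_name, params, result_summary):
    """Append one prompt/result record so teams can audit what produced what."""
    record = {
        "timestamp": time.time(),
        "model": model_name,
        "params": params,                  # e.g. temperature, seed
        "prompt": prompt,
        "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest()[:12],
        "result_summary": result_summary,  # e.g. a text snippet or output file path
    }
    with LOG_PATH.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record

rec = log_prompt("a lighthouse in the style of Hopper",
                 "image-model-v1", {"seed": 42}, "out/lighthouse_42.png")
```

A real team-scale solution would add search, sharing, and versioning on top, but even a flat log like this makes “what was your prompt for that?” answerable.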
#5. Search space completion will be feasible in some domains.
AlphaFold’s exploration of the structures of almost all known proteins is an example of a type of breakthrough that could be tremendous in many fields. By generating vast numbers of combinations and “testing” them, one can eliminate non-viable candidates. In many fields, the number of combinations is so vast that there is no way to “map out” the viable ones. When it does happen, though, it’s extremely powerful: it essentially represents finding rules that collapse the search space from impossibly large to tractable.
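The generate-and-test idea can be illustrated with a deliberately tiny toy. Everything here is invented: a 4-symbol “design space” and a stand-in viability rule where a real system would use a learned model or a physical simulation. The mechanic is the point: enumerate candidates, apply a cheap filter, and the space collapses to the viable subset.

```python
from itertools import product

# Toy "design space": all length-4 sequences over 4 building blocks.
BLOCKS = "ABCD"

def viable(seq):
    """Stand-in viability rule; real systems would use a learned model or simulation."""
    return seq.count("A") <= 2 and "DD" not in seq

# Generate every combination, then test each one and keep only the viable subset.
candidates = ["".join(p) for p in product(BLOCKS, repeat=4)]
survivors = [s for s in candidates if viable(s)]
print(len(candidates), "->", len(survivors))
```

With four blocks and length four this is trivially enumerable; the scientific breakthroughs come when someone finds rules (or models) that make a space of 10^20+ combinations behave like this one.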
It remains to be seen how accurate AlphaFold’s predictions are, but if we can find more hidden structure in other places, we’ll see stunning scientific breakthroughs. Fields adjacent to AlphaFold’s current domain will be the first candidates.
#6. ChatGPT won’t replace search (yet)
One of the most intriguing ideas stemming from ChatGPT’s success was the notion that it represents an existential threat to Google’s search business. The argument (articulated here, here, and elsewhere) goes that if ChatGPT simply returns the answer to a query in one shot, there’s no need for a user to click on a link, and hence no opportunity for a search engine to monetize those clicks.
In reality, this is already happening to a degree (Google and other engines already provide short-form answers to many popular queries), and there’s no doubt something like ChatGPT could give a rival search engine a leg up.
However, as mentioned in #2, ChatGPT and its ilk essentially return what looks like “a plausible synthesis of relevant content”. This can often be way off the mark in terms of accuracy: the systems have no underlying model of reality that would enable them to tell an obviously false answer from a correct one.
A further issue is that search depends on knowing which sources are reliable, and that same reliability information would be needed to decide what to build the model on.
So it’s unlikely that 2023 will see ChatGPT replacing Google search for common queries.
In the mid-term, however, a bigger challenge for Google is likely to be that the interface users expect for search queries will change. Why not voice only? Why not a synthesis of the top results? Google will still have a data advantage, but its business model may need to change. That is a massive shift.
#7. The AI Act and Algorithmic Accountability Act should shape thinking.
The European AI Act and the American Algorithmic Accountability Act have been creeping along their respective legislative tracks. While it’s still unclear how they will impact the use of AI (and, in the case of the US Act, whether it will ever pass), both acts show a desire by governments to shape how automated decision-making systems are used. It is genuinely positive that legislation will start to put some ground rules around AI use.
Neither of these Acts will directly affect generative AI since generative AI is typically not being used to make automated decisions impacting individuals.
Having said this, the legislation does touch upon issues of user data, so any of the companies (here’s looking at you, Lensa) that actively take in user data for processing will certainly need to comply with data-management provisions.
Most players in the space will probably ignore these acts in 2023. However, those with long-term ambitions should be thinking about how their results affect users, how data is managed, and whether or not legislation like this could create situations where their models could be deemed to be discriminatory.
#8. Copyright and Plagiarism debates will continue to rage
One of the hottest topics in generative AI has been the strong sentiment from artists that many of these systems unfairly reuse existing art to create new work in a way that violates copyright.
Many of the image generation systems are trained using huge databases of online images, many of which have unclear copyright attribution. As such, one can argue that some “essence” of these images leaks out in the results the AI returns for any given prompt.
Even more upsetting can be prompts that ask for results “in the style of” another artist. Take, for example, a Mona Lisa rendered in the style of Edward Hopper: it’s unclear who should be more upset, Leonardo da Vinci or Edward Hopper.
The counterargument, however, goes that human artists have been thriving on these types of interpretations since the beginning of art. Many famous paintings are inspired by other paintings.
It’s unlikely these arguments will be resolved quickly, since it will be hard to draw a boundary around what exactly constitutes a derivative work, how similar things can be, and what controls are in place to prevent out-and-out plagiarism.
In 2023, and for quite a while beyond, it seems unlikely there will be new legislation to cover these types of images. Instead, legal disputes, should they arise, will likely be handled in the same way (and to the same standard) as they would be if a human had produced the image. This is, ultimately, the only true measure, since AI is a tool in the hands of a human operator.
Much of the angst around the copyright topic is likely tied up with the related feeling that praise (or monetary benefit) for such an image is “unearned”. This feeling of unfairness is a natural result when one considers that human artists might need months of work to produce something similar. The Hopperized Mona Lisa variants took just 15 seconds for the system to render.
#9. AI for code generation
Whilst image and text generation have been the most visible aspects of generative AI in 2022, the application of similar techniques to code generation has made surprising progress.
GitHub’s Copilot was the first to gain wide acclaim. The system relies on OpenAI’s Codex and helps a programmer formulate lines of code in real time from natural-language prompts. The results are impressive, though not perfect (and there are also some open-source violation claims in flight). Other players include T5, Cogram, and Code-LMS.
None of these systems can yet replace a programmer entirely, but their increasing quality means that 2023 may, at the very least, bring robust, usable products for some niche programming tasks. These could include building data integrations, specific types of websites, or data-mapping functions. The more specialized the domain, with a large but tractable set of combinations, the more likely these tools are to succeed.
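To make the “data mapping” niche concrete, here is a hand-written sketch of the kind of function a code-generation tool might produce from a comment-style prompt. The schemas, field names, and function are all hypothetical, and this is my illustration rather than actual Copilot output.

```python
# Prompt-style comment a Copilot-like tool might complete:
# "Map a list of user records from schema A ({'fname', 'lname', 'yob'})
#  to schema B ({'full_name', 'age'}), given the current year."

def map_users(records, current_year=2023):
    """Translate user records from schema A to schema B."""
    return [
        {
            "full_name": f"{r['fname']} {r['lname']}",
            "age": current_year - r["yob"],
        }
        for r in records
    ]

print(map_users([{"fname": "Ada", "lname": "Lovelace", "yob": 1815}]))
```

Tasks like this are well-bounded, pattern-heavy, and easy for a human to verify at a glance, which is exactly why they are plausible early wins for code generation.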
#10. AI in context will bring the biggest wins
Building general AI is very hard. Even in the context of a specific function like “image generation”, it is hard to serve a wide range of query types equally well. It’s likely that the most successful systems, in terms of generating valuable outputs, will be those which constrain the domain of operation tightly, be that code generation for specific application types or text generation in given domains.
In 2022 we were wowed by what’s possible, but we also saw the cracks. In 2023 we’ll likely see how high the bar can go when AI is constrained to be specific.
I expect to see at least a few “domain” or “style” specific AI generation services popping up and becoming favorites. This is something we’re also aiming to do at Timewarp for certain types of gaming content generation (so we’ll be making part of this prediction come true!).
Predictions are tough, but it’s an interesting game to play. Making them focuses the mind on what you think is going to be important. What are your AI and Creativity predictions for 2023?