Creativity has been defined as the use of the imagination or original ideas, especially in the production of an artistic work. While the source of the development of those ideas can be debated—does creativity spring from the heart, the brain, the soul, or one's experiences—it has been largely accepted that humans alone possess the capability to truly create.
The emergence of computers and artificial intelligence (AI) has led to systems that, fed a sufficient amount of training data, can mimic the output of a creative writer, artist, or musician, thereby encroaching on humans' monopoly on the creative process. Artificial intelligence techniques can be used to create new ideas in a few different ways: producing unique combinations of familiar ideas, creating new works based on the attributes of previous works, and offering new ideas based on combinations of attributes and ideas that humans may not have thought of during the creation of a new work.
A notable example of the power of AI to generate a so-called "creative" work came in 2016, when the IBM Watson AI platform was used to create a movie trailer for 20th Century Fox's horror film, Morgan. In what was billed as the first trailer created by AI, Watson analyzed the visuals, sound, and composition of hundreds of existing horror film trailers, and then selected scenes from the completed Morgan movie for editors to patch together into a trailer. Using AI to comb through scenes and assemble a trailer in the style of other horror movies reduced the time editors needed to spend on the project from a week to a single day.
The process for using AI to generate creative content is largely based on foundation models or generative adversarial networks. These approaches use deep neural networks designed to mimic the ways in which the human brain learns, creating associations between specific elements that can be combined into a finished work.
Figure. Image created by DALL•E 2 when prompted, "AI envisioned as an artist creating beautiful art."
Figure. Image created by DALL•E 2 when prompted, "the most wildly creative image imaginable."
These neural networks are fed millions or billions of examples of a particular output (which could include images, sound samples, or text passages), which they subject to a sophisticated type of pattern matching to "learn" specific attributes, patterns, or cues. For example, algorithms that are used to create artwork in the style of impressionist artists would be shown works from Monet, Renoir, Manet, Degas, Cezanne, and Matisse, generally considered to be masters in this style of artwork. The neural network examines the works as patterns of pixels, and can be trained to identify the specific patterns that define the impressionist style. This creates a framework of knowledge that can be used to create a new work based on the learned parameters and attributes. The more "layers" or "depth" the model has, the more complex the resulting patterns and correlations can be.
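The point about depth can be illustrated with a toy case far removed from image generation. The sketch below is a minimal illustration, not a description of how any system named in this article works: it uses hand-picked weights (an assumption made for clarity; real networks learn their weights from data) to show that a network with one hidden layer can represent the XOR pattern, while a search over a grid of weights finds no single-layer unit that can.

```python
from itertools import product

# The XOR pattern: output is 1 only when exactly one input is 1.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(x):
    """Threshold activation: fire (1) if the weighted sum is positive."""
    return 1 if x > 0 else 0


def single_layer(w1, w2, b, a, c):
    """One unit, no hidden layer: a single weighted sum and threshold."""
    return step(w1 * a + w2 * c + b)


def two_layer(a, c):
    """One hidden layer with two units, using hand-picked weights."""
    h_or = step(a + c - 0.5)    # fires if either input is on
    h_and = step(a + c - 1.5)   # fires only if both inputs are on
    return step(h_or - h_and - 0.5)  # "or, but not and" is XOR


def single_layer_can_fit(grid):
    """Try every weight combination on the grid against all four XOR cases."""
    for w1, w2, b in product(grid, repeat=3):
        if all(single_layer(w1, w2, b, a, c) == t for (a, c), t in XOR.items()):
            return True
    return False


# Candidate weights from -4.0 to 4.0 in steps of 0.5 (an arbitrary grid).
GRID = [x / 2 for x in range(-8, 9)]

two_layer_fits = all(two_layer(a, c) == t for (a, c), t in XOR.items())
one_layer_fits = single_layer_can_fit(GRID)
```

The grid search is only a demonstration; the underlying limitation holds for any choice of weights, because a single threshold unit can only separate its inputs with a straight line.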
Large AI companies such as OpenAI (which describes itself on its website as "a research and deployment company" whose mission is "to ensure that artificial general intelligence benefits all of humanity") have created applications such as DALL-E (a name referencing the artist Salvador Dali). Announced in January 2021, DALL-E demonstrated that this approach could reproduce and recombine features from existing images in new and aesthetically pleasing ways. The next version, DALL-E 2, released a year later, featured improvements to image quality and demonstrated that the system could reproduce different artistic styles.
A similar approach can be taken with other types of creative works, such as music, where a neural network would be fed examples of music in order to learn specific patterns (such as common chord progressions, melodies, rhythms, and structure) used within a given genre, artist's repertoire, or other style data (for example, data distinguishing a waltz from a 12-bar blues or a jam-band-style improvisation).
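As a rough illustration of learning patterns from examples, the sketch below swaps in a much simpler technique than a neural network: a first-order Markov chain that counts which chord tends to follow which, then samples new progressions from those counts. The training progressions and the function names are hypothetical, chosen only for this example.

```python
import random
from collections import Counter, defaultdict

# Hypothetical training data: chord progressions written as Roman numerals.
PROGRESSIONS = [
    ["I", "IV", "V", "I"],
    ["I", "vi", "IV", "V"],
    ["I", "IV", "I", "V"],
    ["vi", "IV", "I", "V"],
]


def learn_transitions(progressions):
    """Count how often each chord is followed by each other chord."""
    counts = defaultdict(Counter)
    for progression in progressions:
        for current, following in zip(progression, progression[1:]):
            counts[current][following] += 1
    return counts


def generate(counts, start, length, rng):
    """Build a progression by repeatedly sampling a likely next chord."""
    chords = [start]
    while len(chords) < length:
        options = counts.get(chords[-1])
        if not options:
            break  # no example shows what follows this chord
        next_chord = rng.choices(list(options), weights=list(options.values()))[0]
        chords.append(next_chord)
    return chords


counts = learn_transitions(PROGRESSIONS)
new_progression = generate(counts, "I", 8, random.Random(0))
```

Every chord change in the generated progression is one that appeared somewhere in the training examples, yet the progression as a whole need not match any of them. In miniature, that is what it means to recombine learned patterns into something new.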
Figure. Image created by DALL•E 2 when prompted, "AI creative self portrait."
This approach was used to create a song called "Break Free," composed by musician and YouTuber Taryn Southern. Southern used an open source platform called Amper Music (www.ampermusic.com) to create the basic stems, which are collections of audio sources grouped together as single units that are combined with other stems during the mixing process. Given parameters that governed the song's structure, including the tempo, rhythm, type of instrumentation, and style, the AI platform generated possibilities from which Southern could choose. Southern picked the possibilities she liked, and then arranged the pieces into a song structure to fit the lyrics she had written.
Despite the ability of AI to produce creative outputs based on the attributes of existing works, the process is not the same as a human's creativity, which comes from a combination of real-world experience, emotion, and inspiration. Indeed, these neural-network-driven systems lack any real understanding of the world and, without the proper guardrails or parameters, can produce works that are nonsensical or strange. Further, because these outputs are based on the images, sound clips, or texts on which the systems are trained, they can reflect certain societal biases, such as artwork that renders, say, pilots as male and nurses as female, or song lyrics overrun with racial slurs or obscenities.
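How skewed training data carries through to output can be sketched with a deliberately simplified stand-in for a generative model: one that does nothing more than reproduce the label frequencies it was shown. The data below is hypothetical and invented for this illustration; it is not drawn from any real training set.

```python
from collections import Counter

# Hypothetical, deliberately skewed training labels:
# (occupation, gender depicted in the example image).
TRAINING_LABELS = (
    [("pilot", "male")] * 9
    + [("pilot", "female")] * 1
    + [("nurse", "female")] * 8
    + [("nurse", "male")] * 2
)


def learned_distribution(labels, occupation):
    """Return the share of each depicted gender for one occupation."""
    counts = Counter(gender for occ, gender in labels if occ == occupation)
    total = sum(counts.values())
    return {gender: n / total for gender, n in counts.items()}


pilot_shares = learned_distribution(TRAINING_LABELS, "pilot")
nurse_shares = learned_distribution(TRAINING_LABELS, "nurse")
```

A system that samples from these shares would depict most pilots as male and most nurses as female, not because of any rule it was given, but because that is the pattern present in its examples.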
"Actual creativity is difficult to replicate. However, machine learning is an expert in finding patterns," says Wayne Butterfield, a partner of ISG Automation, a unit of global technology research and advisory firm ISG. "If these patterns are deemed as creativity, then, yes, computers can appear to be creative. The reality is for AI, each note, brush stroke, design, or other aspect of artistic expression is based on existing data, versus a true, original stroke of genius."
There is an inherent level of humanity within works created by AI, because models are trained on existing works that incorporate attributes humans find pleasing, such as certain combinations of colors, notes, or words. That said, "A computer is not a sentient being, and so it needs to have an input and then it will have an output," says Roger Firestien, a senior faculty member of the Center for Applied Imagination at the State University of New York College at Buffalo, a creativity consultant to Fortune 500 firms, and author of books detailing how creative ideas can be generated and applied to problems. "It seems like there's still always going to be some sort of human element there, at least at the genesis of using the technology."
Perhaps the greatest change AI will impose on the creative arts is the expansion of the number of people or organizations that can generate and experiment with art, writing, and music. This can affect not only individual creators, but also organizations that use creative works. For purely commercial applications, such as television scores or jingles, AI can be seen as a method for improving the overall quality of a production while reducing costs.
"Twenty years ago, television shows like A&E's Biography were still paying composers to score the episodes with original music," says Bob Higgins, a television producer and songwriter based in New York City. "That practice was replaced by blanket deals with music libraries offering cheaper, albeit generic, instrumental cuts for pennies. Now, it sounds like shows can have original music again, cuts created specifically for the show, but also done on the cheap by AI."
Even as AI improves, it is likely the technology will be used to augment, rather than replace, human creators. "Rather than put people out of work, using AI alongside their own skills enables even jingle writers, artists, etc., to do more, faster and potentially cheaper," Butterfield says. Further, he says, AI already has helped content creators scale their efforts in another domain: the development of realistic, feature-rich video games.
"We are already seeing AI-generated landscapes in video games doing the heavy lifting in our most immersive games," Butterfield says. "AI is enabling a bigger game to be built, which simply would be uneconomical if purely human computer programmers were used."
From a true artistic perspective, AI is likely to find use as a catalyst to spur more human creativity, much in the way recording studio technology has been used by musicians to expand how their final works were presented. Recording artist and guitarist Les Paul was a pioneer of using technologies such as multitrack recording and overdubbing to create now-classic recordings featuring his then-wife, Mary Ford, singing not only the melody of a song, but multipart harmonies.
Similarly, Higgins notes that the Beatles found new sounds on their ground-breaking Revolver album by playing the tape of a guitar solo in reverse. "It [was] music inspired by the technology used to create it," Higgins says. "Inspiration is a songwriter's bread and butter."
It is likely AI will become another tool for creators, rather than replacing the act of creation itself, which is usually considered a fulfilling and enjoyable process for artists, writers, and musicians. Firestien, a musician himself, says the process of using AI to come up with novel musical riffs or motifs could serve as a catalyst or starter for a composer suffering from writer's block. Indeed, Firestien notes, "Much of the joy of music is in the composing. Why would we want to teach a computer to compose when composing is so much fun?"
For the consumers of art, poetry, literature, music, or other creative content, enjoyment is often derived from the authentic, shared human experiences that are referenced to create that art. "It's in the person," Firestien says. "Can a computer fall in love? No. Can a computer be depressed? No. Can a computer go through the pandemic? No."
Technology is used when it is convenient and desirable for certain tasks. Despite inventions such as the telegraph, the telephone, email, and Web videoconferencing, people still desire to visit each other in person. Similarly, AI is likely to be used by artists, musicians, and other creators only to the extent it suits the needs of their creative process.
"Computers and AI can be good starters," Firestien says. "And maybe good finishers as well."
©2023 ACM 0001-0782/23/02