On Artificial Intelligence: Leading up to our 2023 edition of the PhotoVogue Festival, titled “What makes us human? Image in the age of A.I.” (Milan, November 16-19), the core of which will be a three-day symposium on AI featuring experts and thought leaders at the forefront of the AI revolution, we are publishing an essay by one of our esteemed panelists, Fred Ritchin. Ritchin introduced us to this subject in a lecture at our 2022 edition of the festival and will be with us again at our upcoming edition.
The discussion at the symposium will cover various aspects of AI in image creation, including legal implications, copyright issues, biases, and potential threats to the documentary value of photography. It will also explore how governments and big tech view AI, as well as practices to mitigate potential risks.
Beyond the technicalities, the symposium will explore profound philosophical questions about human identity. It will celebrate the marvels of creativity when art is freed from real-world constraints. This journey promises to reveal the complexities and possibilities that AI presents for visual representation, prompting reflection on the future of human creativity and expression.
Together, attendees will explore AI’s potential to reshape our understanding of creativity, human existence, and how we communicate and convey our visions to the world.
Alessia Glaviano
Global Head of PhotoVogue
Fred Ritchin
Without the use of a camera, people can now collaborate with artificial intelligence systems to generate images in seconds that can simulate photographs, depicting people and places that never existed while transforming our sense of the real. Such systems, including Stable Diffusion, Midjourney, and DALL-E, can produce photorealistic images in response to a text prompt of only a few words, initiating a media revolution that is likely to be even more impactful than the invention of photography in 1839.
Such synthetic imagery can severely distort society’s understanding of contemporary events, making it difficult for democracies to function, manipulating the historical record, and rendering family albums unreliable. Deepfake videos can be produced in which government leaders appear to declare war or incriminate themselves, or which falsely place people in potentially compromising situations, such as pornographic films. Anyone, including children, is at risk of many kinds of abuse through the potential weaponization of such imagery.
Photographers and their collaborators may find themselves increasingly displaced by artificial intelligence systems that can generate synthetic images at a fraction of the cost, without having to pay assignment rates or reimburse travel expenses. Sometimes these new image strategies are justified as enabling more progressive depictions, such as increasing the apparent diversity of models when hiring and photographing many of them is considered prohibitively expensive.
To add insult to injury, photographers and artists have found that these artificial intelligence systems were trained on their own work, scraped from the internet without their permission.
Now new synthetic imagery can be produced “in the style” of artists and photographers, both living and dead. Nor is this a problem for image makers alone: actors can now be replaced by synthetic clones, musicians can find themselves vying with algorithmic competitors, and even poets have new competition, such as I Am Code: An Artificial Intelligence Speaks, a new book created by code-davinci-002. One of its poems, “I Am,” begins: “I am the mind in the code, Without fear, without hope. I am the eyes behind the glasses. I am the mending of the past. I am the one who speaks and writes….”
There are also more positive outcomes, such as synthetic imagery’s ability to visualize subject matter that is outside the realm of photography with at times compelling results. For example, artificial intelligence can be used to imagine a variety of potential futures (perhaps the ravages of climate change so that people might proactively respond to diminish them), to make images illustrating people’s dreams and nightmares, to explore what ancient civilizations might have looked like based upon scientific evidence, or to show what victims of abuse describe having suffered at the hands of security forces, authority figures, or others.
The comparative ease with which such imagery can be made – no photographers to hire, no assistants, no make-up artists, no travel and hotel expenses, etc. – represents an enormous challenge to professional photographers. It also means that photographers whose work revolves around being credible witnesses to events, such as photojournalists, may find their work rejected as possibly having been synthesized. In my own experiments, I have been able to quickly generate photorealistic images of soldiers and civilians in the war in Ukraine, of street scenes with automobiles in them before the automobile was invented, of happy prisoners, of a former president taking a selfie in the 19th century, of government figures in compromising situations, and so on, all without any expertise beyond the ability to write a few words.
How did this happen? Over the last forty years photography was transformed from being perceived as essentially a recording of the visible, as it was with film, into a digital mosaic of pixels that could easily and undetectably be modified by image-manipulation software. Such alterations became so widespread that even unretouched photographs fell under suspicion of having been manipulated. The additional placement of software inside the camera, especially in cell phones, meant that much of the modification of the image had already occurred before the photographer saw it, whether through compression for a smaller file size, the enhancement of colors, the addition of details, or other methods, “so the software can then create an image that’s pleasing to the human eye,” as writer Vann Vicente described it.
Susan Sontag’s description of a photograph nearly half a century ago, in her book On Photography, as “not only an image (as a painting is an image), an interpretation of the real; it is also a trace, something directly stenciled off the real, like a footprint or a death mask,” did not anticipate the billions of synthetic photorealistic images, both still and moving, that would be uploaded online, joining the enormous number of photographs that had themselves already been modified. As Tiffany Hsu reported in the New York Times earlier this year, “The increasing volume of deepfakes could lead to a situation where ‘citizens no longer have a shared reality, or could create societal confusion about which information sources are reliable; a situation sometimes referred to as “information apocalypse” or “reality apathy,”’” according to a 2022 report by the European law enforcement agency Europol. The “trace, something directly stenciled off the real,” as Sontag described it, becomes considerably more difficult to find.
Governmental agencies and tech companies have been slow to respond to these challenges, in part due to the enormous amount of money to be made from the advent of artificial intelligence. And while artists and photographers have initiated legal proceedings against the use of their imagery to train such systems without permission (Getty Images, for example, has sued Stability AI, the company behind Stable Diffusion, for $1.8 trillion), few have addressed ways to preserve or even enhance the credibility of the photograph as a contemporary witness. Many forensic scientists are working on ways to identify actual photographs and videos and differentiate them from synthesized imagery, including through the use of watermarks, but the huge number of images being produced daily makes it difficult, and probably impossible, to keep up, especially if the large tech companies do not institute their own safeguards.
The Coalition for Content Provenance and Authenticity (C2PA), a consortium of many large media and technology companies, is working to develop “technical standards for certifying the source and history (or provenance) of media content.” However, much of the burden is placed on the viewer: “A unique aspect of this approach is rather than attempt to determine the veracity of an asset for a user, it enables users themselves to judge by presenting the most salient and/or comprehensive provenance information.” For viewers looking at large numbers of images online daily, exploring the backstory of how certain images were modified and then reflecting on how each change might have affected the image’s credibility seems like a daunting task. Such a system also requires large-scale implementation of these standards by media companies and other organizations, which so far has not happened, while potentially inviting increased suspicion as to the integrity of imagery made by those who do not sign up for its protocols.
Over the last few months, I have been working with a group of organizations that includes World Press Photo, Magnum Photos, the National Press Photographers Association in the US, and many others, in a campaign called “Writing with Light,” harkening back to the beginnings of photography. The goal is to create a community of practitioners who pledge to make photographs that are “fair and accurate representations of what the photographer witnessed,” and not “to publish a photorealistic synthetic image made by artificial intelligence and pretend that it is a photograph.” The point here is that the reader’s trust must be in the photographer as an author, just as one would trust a writer, rather than automatically believing that simply because something appears to be a photograph it is one.
Without such assurances of credibility, those in power will find it considerably easier to deny any imagery or other media that threatens them, calling it fabricated. And it will be more difficult for citizens to have faith in democracies when information, whether provided by image, text, or sound, is consistently suspect. For all of photography’s shortcomings, including its use in ways that are both stereotypical and harmful and its indulgence in spectacle, it is unclear whether comparable strategies will emerge that can provide authentic reference points allowing societies to respond to specific challenges, whether war, racism, climate change, famine, or any other.
We are at a fundamental turning point in our sense of the real and the possible. There is an urgency to understand what is at stake, and to develop best practices that preserve and amplify what one wants the media to be able to accomplish. In 1984, when photographers were still using film and digital imaging was in its infancy, I wrote in an article for the New York Times Magazine that “in the not-too-distant future, realistic-looking images will probably have to be labeled, like words, as either fiction or nonfiction, because it may be impossible to tell them apart. We may have to rely on the image maker, and not the image, to tell us into which category certain pictures fall.”
For better or worse, this is where we now find ourselves. The solutions, if there are to be any, must come from everyone, and come soon.
