3. A Different Kind of Knowledge — Neural Network as a Metaphor

Generative models driven by AI prove to be an apt metaphor for the state of our society. Let us look at machine learning as an instrument of knowledge. It is composed of an observed object (the training dataset), an instrument of observation (the learning algorithm), and a final representation (the statistical model). Vladan Joler and Matteo Pasquinelli use the analogy of optical media to explain how this ‘instrument of knowledge’ works: ‘the information flow of machine learning is like a light beam that is projected by the training data, compressed by the algorithm and diffracted towards the world by the lens of the statistical model’ (2020: 3). In our case of a neural network generating visuals, the observed object is a large dataset of visual material, the two adversarial neural networks of a GAN (a generator and a discriminator, trained against each other) serve as the instrument of observation, and the final representation is the trained StyleGAN model that can generate new visuals. Seeing the magic through these eyes makes it hard to believe that anything new could come from a statistical model. However, as an instrument of knowledge, a neural network can analyse large quantities of data not only faster than a human but also differently from human expectations. Such a StyleGAN model then becomes a tool for the visual representation of patterns inside the observed object, as recognised by the neural network. This is a beautiful paradox. Although StyleGAN merely magnifies the information inside the observed object and reproduces it again and again, it nevertheless brings a new kind of knowledge that we can detect through repetition within the statistical model. StyleGAN recognises patterns inside the dataset, while we recognise patterns in the outcomes that StyleGAN generates.
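
For readers who want to see the mechanics behind this metaphor, the adversarial loop can be sketched in a few lines of PyTorch. This is a minimal sketch only: the tiny networks and the random stand-in for training images are illustrative placeholders, not StyleGAN’s actual architecture.

```python
import torch
import torch.nn as nn

latent_dim = 64
image_dim = 32 * 32  # flattened grayscale stand-in for real training images

# Generator: maps random noise to a synthetic 'image'.
G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, image_dim), nn.Tanh(),
)
# Discriminator: scores how 'real' an image looks.
D = nn.Sequential(
    nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

real_images = torch.randn(512, image_dim)  # placeholder for a curated dataset

for step in range(200):
    real = real_images[torch.randint(0, len(real_images), (32,))]
    fake = G(torch.randn(32, latent_dim))

    # The discriminator learns to tell dataset images from generated ones...
    loss_d = (loss_fn(D(real), torch.ones(32, 1)) +
              loss_fn(D(fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # ...while the generator learns to fool the discriminator.
    loss_g = loss_fn(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```

The ‘light beam’ of the metaphor is exactly this loop: the dataset projects, the two networks compress, and the trained generator diffracts the result back towards the world.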

3.1. ‘Something-ness’

This new knowledge can, however, be very subtle. Perhaps it can be perceived only in the form of an emotion. Many visual artists who experiment with StyleGAN and similar generative neural networks speak of its capacity to distil the essence of the observed object. Entangled Others Studio, in its project Beneath the Neural Waves (2020), trained a neural network on images of coral reefs to create a new form of artificial life. The neural network recognised visual definitions of the coral-reef dataset and created layers of essences, or ‘something-ness’. The results are not new examples of jellyfish but ‘jellyfish-ness’, as the studio’s founder Feileacan McCormick calls it. By observing this process, the artists are ‘dreaming up new ecosystems’, believing that active dreaming is another form of storytelling (Uroboros Festival 2021). Because data collection is a long and tiresome process, these artists engage in deep observation of the object, almost a meditation. In retrospect, such an artistic process can be used to question one’s biases, because those will almost certainly be projected into the created dataset.

Similarly, in her artwork Myriad (Tulips) (2018), artist Anna Ridler engages in an even deeper relationship with the observed object by creating each photograph in the dataset herself. She photographed more than ten thousand different tulips and then sorted, annotated, and aligned every one of them by hand. That is in direct contrast to the way large datasets are usually built: by employing underpaid workers (e.g., on Amazon Mechanical Turk) or by purloining images from the Internet. Such datasets contain countless labelling errors, leading to biased results (the most famous example of a biased dataset is ImageNet). What kind of knowledge can neural networks create from such poor datasets? The ‘light beam’ will evidently magnify all the neglect, ignorance, prejudices, stereotypes, and other issues. Conversely, if we invest time and energy in building datasets with care, we can expect something far more valuable to be projected into our world.

The Nooscope analogy offers insight into the ‘black box’ problem of deep learning and biased datasets. We started this artistic research aware of these issues and soon realised that, except for a few artistic projects, most generative neural network applications rely on existing, biased datasets simply because they are convenient. Considering the current evolution of neural networks trained on non-curated datasets (for example, GPT-3 and all the models built on it), there is little chance that anyone will deliberately forgo such convenient access to vast amounts of data. There simply seems to be no time for creating proper datasets anymore. What kind of new knowledge can AI tools trained on biased datasets create when they keep projecting stereotypes from within?

We see a parallel with other planetary troubles. Humanity is facing many complex issues on a global scale right now, and although obvious solutions are at hand, it seems that humans will not pursue them because they involve slowing down, re-doing, re-assessing, re-building, re-thinking, and re-imagining.


Inspired by the latest edition of the Uroboros Festival and its theme, ‘Designing in Troubling Times’, we decided to shed some light on the vague term ‘troubling times’ using generative neural networks and to observe what kind of new knowledge they can produce.

Figure 2. Early outcomes from TroublingGAN were used in the visual identity of the festival UROBOROS: Designing in Troubling Times, 2021

3.2. Troubling times

The Uroboros Festival brings together people from different disciplines in workshops and other participatory formats to discuss our position as creators in the current ‘troubling times’. It asks what and how we can design to support positive change. It claims that transformative creative practices are needed to alter the course of events and to stop continually reproducing the causes of the ‘troubles’ that society is now experiencing.

Here, design is understood as a post-disciplinary practice, encompassing any human activity that shapes the world around us (a world that, in turn, shapes us). As a world-building tool, it can inspire the collective creation of narratives, prototypes, and situations, proposing alternatives to the status quo, in the context of everyday practice and individual lifestyles as well as in more complex systemic processes. This social turn proposes a ‘shift away from designing quick-fix solutions to engineer our troubles away, and towards efforts to use design to assist in the development of long-term conditions for social change’ (Dolejšová and Hámošová 2021: 38).

To illustrate the looping deadlock that Design (and the whole of society) appears to be experiencing, and to understand the notion of these ‘troubling times’ differently, we use generative neural networks both as a tool of knowledge production and as a metaphor. If we do not change the way we think about this world and the way we design for it, we will keep repeating the same mistakes and generating new versions of the same problems again and again. Human minds work like neural networks in this respect. We collect problems (the dataset), constantly engage with them, observe them, try to find patterns in them, and then, informed by this knowledge, create new outcomes based on that data. This training data, however, can sometimes be traumatic for the observer; it is often hard not to generate outcomes that are heavily influenced by the inputs. What results are new forms of the same old problems. This occurs especially if the model is overtrained, if there is not enough variability in the dataset, or if the neural network trains past the sweet spot at which it is still learning rather than memorising, as sketched below.
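
In machine-learning practice, this sweet spot is typically guarded by early stopping: training is halted once a quality metric measured on held-out data stops improving. A minimal sketch follows, in which train_one_epoch, compute_fid, and snapshot are hypothetical stand-ins for whatever training step, sample-quality metric (FID is a common choice for GANs), and checkpointing routine are in use.

```python
# Minimal early-stopping sketch. `train_one_epoch`, `compute_fid`, and
# `snapshot` are hypothetical callables supplied by the caller; lower
# FID means the generated samples are closer to the held-out data.
def train_with_early_stopping(model, train_one_epoch, compute_fid, snapshot,
                              max_epochs=100, patience=5):
    best_score = float("inf")
    best_state = None
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch(model)            # one pass over the dataset
        score = compute_fid(model)        # evaluate on held-out data
        if score < best_score:
            best_score = score
            best_state = snapshot(model)  # remember the sweet spot
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break                         # trained past the sweet spot
    return best_state
```

The metaphor holds for minds as well as models: without this kind of external check, the loop simply keeps reinforcing whatever it has already absorbed.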

Thus, we decided to train the StyleGAN neural network on a dataset that would visually represent ‘troubling times’ in order to demonstrate the logic of a self-reinforcing learning loop, be it the phenomenon of Internet filter bubbles and echo chambers or the isolation of a world-shaping discipline such as Design that keeps repeating the same patterns. The process of creating the dataset raised unexpected critical questions related to the visual representation of troubling events in the news, the recontextualisation of such emotionally charged images in the media, and the affective quality of photojournalism.
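
The mechanical side of building such a dataset can be sketched briefly: once photographs have been collected and curated by hand, they still have to be normalised into uniform squares before a GAN can train on them. In the sketch below, using the Pillow library, the folder names and the 512-pixel resolution are illustrative assumptions, not the exact parameters of our pipeline.

```python
from pathlib import Path
from PIL import Image, ImageOps

src = Path("collected_photos")  # raw, hand-curated news photographs
dst = Path("dataset_512")       # uniform training set for the GAN
dst.mkdir(exist_ok=True)

for i, path in enumerate(sorted(src.glob("*.jpg"))):
    img = Image.open(path).convert("RGB")
    img = ImageOps.fit(img, (512, 512))  # centre-crop and resize to a square
    img.save(dst / f"{i:06d}.png")
```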

3.3. Photographs in the news depicting ‘troubling times’

The elusive idea of ‘troubling times’ cannot be represented objectively. To create quality learning material for neural networks on this topic, it is first necessary to identify and narrow down the content of the dataset.

We had to make multiple reductions before settling on the category of photojournalism. First, we needed to drop the abstract connotations of ‘troubling times’ and think of very specific examples. Second, we needed to accept the biased nature of photojournalism and acknowledge that we were not attempting to create an objective visual representation of troubles on a global scale.

The year 2020 brought a myriad of worrying news, including daily reports on the COVID-19 pandemic, environmental catastrophes, unpredictable extremes of weather, social unrest, protests, and armed conflicts. The news was accompanied by photographs documenting specific events from around the world, but these often played a more illustrative role, with the sole purpose of underlining the nature of the message. For example, COVID-19 articles often reused older photographs from intensive care units or of people in masks and protective suits. In the Czech Republic, photos of thirty thousand white crosses painted onto the paving stones of the Old Town Square [Staroměstské náměstí] in Prague were used repeatedly, long after the act of remembrance took place. Their emotional effect intensified the urgency of the message in each article. Similarly, we see illustrative photographs of various explosions or fires treated like stock photos, although they originate from the documentation of specific events. It becomes hard to distinguish between photojournalistic images that relate to the subject being documented and images displayed only for their affective quality. It is especially concerning when a photograph chosen to illustrate an article for its affective quality still carries the identifiable details of its origin.

Photojournalism is supposed to be objective, factual, and complete while telling an attention-grabbing story, but as Howard S. Becker describes in his essay Visual Sociology, Documentary Photography, and Photojournalism: It's (Almost) All a Matter of Context, photojournalism and documentary photography are just ‘social constructions whose meaning arises in the contexts, organisational and historical, of different worlds of photographic work’ (1995: 5). These images never serve merely to inform about the events they depict. Although created as documentary photography, in the media space they often become weaponised for their emotional charge, stripped of their original meaning and recontextualised again and again. This can be used to provoke an emotional response in viewers in order to create engagement with an unrelated issue or, conversely, to achieve desensitisation and render violent and disturbing images banal.

Figure 3. Illustrative use of photojournalism on Instagram. Collages made by Lenka Hámošová, 2022

Nicholas Mirzoeff (2016) writes about the obscenity of the image war, in which events such as 9/11 or the videotaped executions performed by the Islamic State were designed to create spectacular media images affecting millions of viewers around the world. Mirzoeff notes that to claim a victory, it is important to have the violence shown. Today, since the beginning of the Russian invasion of Ukraine, we can observe the same obscenity of the image war with one small difference: these brutal images permeate social media and are seen mixed with banal everyday content. One can scroll through stories on Instagram and, in 15-second intervals, jump between tourist images, make-up ads, and highly disturbing war content, such as footage of the Bucha massacre. Interestingly, these images often appear behind a blurred warning about ‘sensitive content’, meant to protect viewers from the shock of an unexpected or unwanted sight of something disturbing. Whether the images were reported by users or detected as inappropriate by Instagram’s algorithms, this content ends up hidden behind a thick layer of blurring that turns the images into pure abstraction. However, such abstraction does not exploit the full potential of the abstract image. It does not engage the viewer on an emotional level, nor does it invite contemplation. It simply censors. This kind of abstraction communicates nothing. While it successfully obscures graphic violence and brutality, it also suppresses the disturbing emotions, which cannot penetrate the blur to reach the viewer. It is all or nothing.
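
The mechanics of this censoring abstraction are trivial: a Gaussian blur with a sufficiently large radius reduces any photograph to soft colour fields. The sketch below uses the Pillow library to approximate the effect; we do not know Instagram’s exact implementation, and the filename and radius are illustrative.

```python
from PIL import Image, ImageFilter

img = Image.open("news_photo.jpg")
# A large blur radius destroys all detail, leaving only colour fields.
censored = img.filter(ImageFilter.GaussianBlur(radius=40))
censored.save("news_photo_blurred.png")
```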

On the other hand, semi-abstract visual material generated by a neural network trained on a dataset of thematic documentary photographs could create a new genre of illustrative photojournalism that does not carry the ethical problems outlined above. Substituting recontextualised news photography with visually ambivalent synthetic footage could achieve better engagement with the attached message and create space for a different way of perceiving the image. Such footage does not burden the reader with additional meanings and does not create unnecessary, often misleading, informational noise. It is deprived of context yet still carries the desired ‘troubling’ atmosphere.

Even though state-of-the-art AI visual synthesis can now produce photorealistic content, we suggest avoiding photorealistic perfection in favour of semi-abstract visuals that allow multiple interpretations. Visually indeterminate generated images, despite the absence of any specific scenes or objects, still strangely resemble photography. This is very perplexing for the human eye. The mind still wants to impose meaning on these ambiguous images, however abstract they may be. Yet, because the imposed meanings are dynamic and constantly shifting, it is the atmosphere and emotional charge that affect the viewer.
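
One concrete knob for steering generated images between the typical and the strange is StyleGAN’s truncation trick, in which intermediate latent vectors are pulled towards, or pushed away from, the average latent before synthesis. A minimal sketch follows; the random vectors below stand in for the output of a real mapping network.

```python
import numpy as np

rng = np.random.default_rng(0)
w_avg = rng.normal(size=512)  # average intermediate latent (stand-in)
w = rng.normal(size=512)      # latent for one generated image (stand-in)

def truncate(w, w_avg, psi):
    """psi < 1 pulls towards the average (safer, more typical images);
    psi > 1 exaggerates deviation (stranger, more ambiguous images)."""
    return w_avg + psi * (w - w_avg)

w_typical = truncate(w, w_avg, psi=0.5)
w_strange = truncate(w, w_avg, psi=1.3)
```

Varying psi across this range is one way, among others, of holding generated images in the visually indeterminate zone where the viewer’s own meaning-making does the work.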