In January 2020 we finalized the development phase of Marrow. Shirin Anlen and I are sharing lessons learned during this process, and our post about optimizing and augmenting a small dataset was recently published on Towards Data Science. This post looks at how custom web-based tools can inspire a collaborative artistic workflow when working with machine learning models.
Myself and Marrow
Marrow is a hands-on research project and an interactive theater experience by Shirin Anlen that explores the possibilities of mental disorders in machine learning. I have previously worked with Shirin on a number of projects, most notably the VR documentary Tzina: Symphony of Longing. In 2018 I joined Shirin to preview Marrow as an installation at IDFA Doclab. The prototype was a success, and one year later we entered, as collaborators, an intensive development phase co-produced by the National Film Board of Canada and Atlas V.
About GAN and its latent space
A Generative Adversarial Network, or GAN, was the first machine learning model we decided to research. It focuses on generative visual imagery and exhibits a very clear dissonance if you attempt to train it on complex concepts using banal stock images. In a previous post we described how we created a dataset of ‘Perfect family dinner’ images and used it to train StyleGAN V1. This particular dataset was constructed to serve the story of the experience: that of a dysfunctional family that sees itself only through the distorted data it was trained on. Because of this, we aimed for results that are imperfect and that expose the glitches that emerge when the model tries to go deep into social narratives.
Our dataset was a bundle of around 6,500 images containing figures of four family members, stripped from their family dinner setting. Once StyleGAN finished the training process, we ended up with a vast space of possibilities for newly generated images containing four distorted familial figures. This infinite, continuous space of possible output images is called the Latent Space. It is “latent” because the output image generated by GAN is determined by a seemingly hidden process of mathematical transformations, starting from a series of numbers and ending with a bitmap image. When you change any of the initial numbers in the series, the resulting image is slightly different. The transformation network is so deep that it is hard to predict what will change in the image.
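To make the idea of "a series of numbers in, a bitmap out" concrete, here is a minimal sketch. It is not StyleGAN and not the model we trained: `make_toy_generator` is a hypothetical stand-in built from two layers of fixed random weights, and the sizes (8 latent numbers, a 4x4 grayscale image) are arbitrary choices for readability. It only illustrates that nudging one input number shifts the output image slightly.

```python
import math
import random

LATENT_SIZE = 8     # assumption: tiny on purpose; real models use far longer vectors
IMAGE_PIXELS = 16   # assumption: a flattened 4x4 grayscale "image"
HIDDEN_SIZE = 12    # assumption: width of the single hidden layer


def make_toy_generator(seed=0):
    """Return a function mapping a latent vector to pixel values in (0, 1)."""
    rng = random.Random(seed)
    # Fixed random weights stand in for what training would normally produce.
    w1 = [[rng.gauss(0, 1) for _ in range(LATENT_SIZE)] for _ in range(HIDDEN_SIZE)]
    w2 = [[rng.gauss(0, 1) for _ in range(HIDDEN_SIZE)] for _ in range(IMAGE_PIXELS)]

    def generate(z):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, z))) for row in w1]
        # A sigmoid squashes every output into the (0, 1) pixel range.
        return [
            1 / (1 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2
        ]

    return generate


generate = make_toy_generator()
rng = random.Random(42)
z = [rng.gauss(0, 1) for _ in range(LATENT_SIZE)]
image = generate(z)

# Nudge a single latent number and generate again.
z_nudged = list(z)
z_nudged[3] += 0.05
image_nudged = generate(z_nudged)

largest_change = max(abs(a - b) for a, b in zip(image, image_nudged))
print(f"largest pixel change after the nudge: {largest_change:.4f}")
```

Even in this two-layer toy it is not obvious in advance which pixels will move, which hints at why a much deeper network is so hard to predict.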
If you have a good enough dataset and algorithm, you might be able to reach disentanglement: that is when one of the input numbers controls one meaningful element in the resulting image. For example, one number would change the age of one generated person, while another changes their hair color. Needless to say, we were not able to achieve disentanglement with our small dataset. A change in a single number from the initial series could induce various changes in multiple family members. The same number could simultaneously control one family member’s pose, another member’s smile, and the appearance of a Christmas hat on a third figure (a repeating motif in stock images, it seems). The family members were in fact entangled.
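The difference between the two cases can be sketched with a deliberately oversimplified model. Assume, only for illustration, that the generator is a single weight matrix that turns six latent numbers into six "attributes" (a real GAN is deep and non-linear, and its outputs are pixels rather than named attributes). The helper names `project` and `attributes_affected` are hypothetical.

```python
import random

SIZE = 6  # assumption: six latent numbers and six output "attributes"


def project(weights, z):
    """Multiply a weight matrix by a latent vector (a one-layer stand-in for a generator)."""
    return [sum(w * x for w, x in zip(row, z)) for row in weights]


def attributes_affected(weights, index, step=1.0):
    """Count the outputs that move when latent number `index` changes by `step`."""
    z = [0.0] * SIZE
    before = project(weights, z)
    z[index] += step
    after = project(weights, z)
    return sum(1 for a, b in zip(before, after) if abs(a - b) > 1e-9)


rng = random.Random(7)

# Disentangled ideal: each latent number drives exactly one attribute.
disentangled = [[1.0 if r == c else 0.0 for c in range(SIZE)] for r in range(SIZE)]

# Entangled case: every latent number leaks into every attribute.
entangled = [[rng.uniform(0.5, 1.5) for _ in range(SIZE)] for _ in range(SIZE)]

print(attributes_affected(disentangled, 2))  # 1
print(attributes_affected(entangled, 2))     # 6
```

In the entangled matrix, one number touches every output at once, which is the toy equivalent of a single value changing a pose, a smile, and a Christmas hat together.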
The Shadow Allegory
Marrow tracks the ‘thinking’ process of each of its models and questions what could go wrong. In GAN, the latent space gives us information about how input data is broken down and then reconstructed into something new. But as intriguing as visualizing the latent space is, we were looking for ways to integrate storytelling into the experience. We wanted to materialize GAN’s distorted image of the world.
When watching the ongoing training process of GAN, we started noticing things that were other than human coming from the source dataset. It was like staring at Rorschach tests: flat images that appear different depending on who is watching. We realized that we were learning more about GAN not by seeing the result we expected, but by seeing its in-between spaces. Plato’s Allegory of the Cave speaks about finding meaning in the simple and flattened representation of things. The people in the allegory are stuck in a cave with a fire burning behind them. The fire projects the shadows of passing objects onto the cave’s wall, and that is all they can see of reality. They are so used to those shadows that once a prisoner breaks free, their eyes are blinded by the glaring sun. When the prisoner’s eyes are finally accustomed to reality, they come back to the cave to tell the others, but now they are unable to see anything in the darkness. The other prisoners assume that something evil lies outside.
Interestingly, Plato’s Allegory of the Cave corresponds quite well with the structure and training process of GAN. GAN is in constant conflict between reality, representations of reality, and fantasy. When the algorithm generates images that are too close to the original dataset, it finds itself stuck in a simple and flat representation of the world, unable to escape into pathways of creativity. When GAN’s generations are too fantastical, they are inevitably deemed fake and wrong. GAN is in a constant struggle to find the balance between the real and the imaginary. We therefore decided to visualize GAN’s struggle by using the shadow representation of the distorted family outputs.
Animating over the latent space
Marrow is an interactive theater piece where the participants play the role of machine learning models in a family dinner setting. In the experience, a participant who represents GAN tells their story about the difficulties they face in discerning memory from imagination. Both of those perceptions are in fact distorted in GAN, so at this phase we decided to explore an additional, fantastical animated layer over the world of shadows that would represent the character’s struggle between the real and the fake. We worked with the talented Paloma Dawkins, a master of hand-drawn animations and alternate dimensions. Now we had to ask ourselves: how do we orchestrate a workflow that starts in the mathematical depths of GAN, but ends with hand-drawn animations that perfectly match GAN’s latent movements across the image space? The answer came in the form of our custom-developed tool: Marrow GAN Explorer.
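As a rough illustration of what a "latent movement" can look like in code, the sketch below builds a sequence of latent vectors between two random points, one vector per animation frame. This is a generic technique (plain linear interpolation), not a description of how Marrow GAN Explorer works internally; `lerp` and `latent_walk` are hypothetical helper names, 512 is the latent length used by StyleGAN, and the frame count of 24 is an arbitrary choice. Each vector would then be passed to the trained generator to render one frame.

```python
import random

LATENT_SIZE = 512  # the latent vector length used by StyleGAN


def lerp(z_start, z_end, t):
    """Linearly blend two latent vectors; t=0 gives z_start, t=1 gives z_end."""
    return [(1 - t) * a + t * b for a, b in zip(z_start, z_end)]


def latent_walk(z_start, z_end, frame_count):
    """Return `frame_count` latent vectors evenly spaced from z_start to z_end."""
    if frame_count < 2:
        raise ValueError("a walk needs at least a start frame and an end frame")
    return [
        lerp(z_start, z_end, i / (frame_count - 1))
        for i in range(frame_count)
    ]


rng = random.Random(1)
z_a = [rng.gauss(0, 1) for _ in range(LATENT_SIZE)]
z_b = [rng.gauss(0, 1) for _ in range(LATENT_SIZE)]

# One latent vector per frame; feeding each to the generator would yield the animation.
frames = latent_walk(z_a, z_b, frame_count=24)
print(len(frames), len(frames[0]))  # 24 512
```

Because every frame maps to a known position along the path, an animator can be handed the rendered frames in order and draw over them knowing exactly which latent position each drawing has to match. Spherical interpolation is a common alternative to the straight line used here.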