Espaço Latente [latent space], 2021
project for a spatial installation
binaural electroacoustic study, 3'08''.

Available at:

The Latent Space (2021) project was born from the motivation to experiment with the concept of latent space as a structuring basis for a spatial installation piece. Although it is currently presented only as a short electroacoustic piece, the ultimate goal of the work is to integrate different technologies to create an immersive sound installation. In the context of this piece, the topology of a latent space (developed in the software AudioStellar) refers to a multidimensional space whose values vary in space and time, which makes it possible to encode an internal and meaningful representation of the sounds in a large set of playable samples (image on the left).

Taking the notion of topology as a metaphor for thinking about the structure of the piece, the composition project focuses on a listening experience that oscillates between continuity and fragmentation (cuts), presenting the listener with different perspectives on a sonic scene so that they can be immersed in the diversity of materials that make up the final piece. As part of the composition process, the Self-Organizing Maps (S.O.M.) produced by AudioStellar were used as the main tool to “navigate” this latent space and to score the gestures, textures and transitions manipulated during the composition.

Through editing and the juxtaposition of sound masses with dynamic moving gestures, the composition project also foregrounds the hybrid choral sound synthesized using the neural network SampleRNN. The constant fragmentation of the composition produces a structure that alternates between immersion in sound textures in a broader space (produced from the timbre of the human voice) and contact with disruptive, mostly electronic, moving gestures. In this way, the piece marks reference points in the spatial sonic environment and in the topology produced by the S.O.M. For the spatialization project (implemented in PureData), the fragmentation of the sound scene was achieved by coordinating the OSC messages exchanged between AudioStellar and the binaural engine, both synchronously and asynchronously (bottom left image).
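As a rough illustration of the kind of OSC traffic described above, the sketch below hand-encodes an OSC 1.0 message carrying two float values and sends it over UDP. It is not the patch used in the piece: the address `/source/position`, the port 9000 and the idea of sending an (x, y) pair are all assumptions made for the example.

```python
import socket
import struct


def _osc_string(text: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    raw = text.encode("ascii") + b"\x00"
    padding = (-len(raw)) % 4
    return raw + b"\x00" * padding


def build_osc_message(address: str, *values: float) -> bytes:
    """Build an OSC 1.0 message whose arguments are all 32-bit big-endian floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", value) for value in values)
    return _osc_string(address) + _osc_string(type_tags) + payload


def send_osc(message: bytes, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one encoded message as a single UDP datagram (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))


# Hypothetical address and coordinates, chosen only for illustration.
message = build_osc_message("/source/position", 0.5, -0.25)
```

Calling `send_osc(message)` would deliver the datagram to whatever is listening on that port, for instance a PureData patch that decodes incoming OSC and forwards the values to the binaural engine.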

The sound material used in the electroacoustic study can be divided into three categories:
1. Recordings of Gregorian chants and Renaissance polyphony. Part of this material is used in a fragmented way at structural points in the first minutes of the piece and is also central to the sonic textures in the middle and end of the piece.

2. Sounds of bells. A set of bell samples of 5 seconds each was created, and a latent space of this set was then produced using the AudioStellar program. This bell sound is evident in the last texture of the piece (among the vocal sounds of Gregorian chant).

3. Sounds synthesized using the PRiSM-SampleRNN neural network, trained on the dataset of polyphonic chants (item 1). After the model had been trained for 145 epochs, reaching an accuracy of around 65%, 20 minutes of sound were synthesized and then edited into 5-second samples. This material is central to some of the sections at the beginning and middle, as well as to the more electronic and noisy timbre of the disruptive gestures throughout the piece.
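The step of cutting longer material into 5-second samples, mentioned in items 2 and 3, could be sketched as follows. This is only a minimal illustration using Python's standard `wave` module, assuming uncompressed WAV input; the function name `split_wav`, the output file naming and the choice to discard a shorter trailing fragment are assumptions, not a description of the editing actually done for the piece.

```python
import wave
from pathlib import Path


def split_wav(source: str, out_dir: str, segment_seconds: float = 5.0) -> list:
    """Split a WAV file into consecutive fixed-length segments.

    Returns the paths of the files written. A trailing fragment shorter
    than one full segment is discarded (an assumption of this sketch).
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []

    with wave.open(source, "rb") as reader:
        channels = reader.getnchannels()
        sample_width = reader.getsampwidth()
        frame_rate = reader.getframerate()
        frames_per_segment = int(frame_rate * segment_seconds)
        bytes_per_segment = frames_per_segment * channels * sample_width

        index = 0
        while True:
            frames = reader.readframes(frames_per_segment)
            if len(frames) < bytes_per_segment:
                break  # end of file, or a fragment shorter than one segment

            target = out_path / f"segment_{index:04d}.wav"
            with wave.open(str(target), "wb") as writer:
                writer.setnchannels(channels)
                writer.setsampwidth(sample_width)
                writer.setframerate(frame_rate)
                writer.writeframes(frames)
            written.append(target)
            index += 1

    return written
```

A folder of segments produced this way is the kind of sample set that can then be loaded into a tool such as AudioStellar to build the map.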

As a further development of this project, I also collaborated with artist and programmer Andrey Koens to develop a Python program that manipulates sound spectrograms, turning them into digital images that deconstruct the linearity implicit in traditional spectrogram representations. All sound manipulation, training of the neural network, composition and programming (in AudioStellar and PureData) were done by me.

Here are some of the spectrogram visual deconstructions based on a sample taken from the audio: