Notes from a talk with Martinus about neural nets for AI music.

Question: How do I train an AI to play experimental music with bridges?

Answer: You can use my algorithm!

Drilling down into the details: what file formats, and where do I actually put them?

Does the AI need metadata? Apparently not: the system reads the sound files visually, as spectrograms. I have SO MUCH TO LEARN!!
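To make sense of "reading sound visually": a spectrogram is just a short-time Fourier transform of the waveform, turning audio into a time-frequency image. A minimal numpy sketch (frame size and hop length are arbitrary choices here, not anything Martinus specified):

```python
import numpy as np

def spectrogram(signal, frame_size=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    Returns an array of shape (n_frames, frame_size // 2 + 1) — the
    time-frequency "image" a model can read instead of raw samples.
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_size] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# A 440 Hz sine at a 16 kHz sample rate should concentrate its
# energy in one narrow frequency band of the spectrogram.
sr = 16000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
peak_bin = int(spec.mean(axis=0).argmax())  # bin width is sr/frame_size = 62.5 Hz
```

So no metadata is needed: the pitch content is right there in which bins light up.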

Starting with the data compilation for now, listing resources for later review.



Other links spiralling out from Martinus’ original conversation

OpenAI
Attention Is All You Need

Music Transformer:


OpenAI Jukebox neural net intro

Now You Can Generate Music From Scratch With OpenAI’s NN Model – Analytics India Magazine One of the popular AI research labs, OpenAI, has been working extensively in the domain of artificial intelligence, particularly on neural networks and reinforcement learning, among other areas. Just a few days back, the AI lab introduced Microscope for AI enthusiasts who are interested in exploring how neural networks work. And now the audio team of OpenAI has introduced a new machine …

Deep Music Visualiser
GAN deep music sound and latent space

Timbre Latent Space

[2008.01370] Timbre latent space: exploration and creative aspects
Recent studies show the ability of unsupervised models to learn invertible audio representations using Auto-Encoders. They enable high-quality sound synthesis but a limited control since the latent spaces do not disentangle timbre properties. The emergence of disentangled representations was studied in Variational Auto-Encoders (VAEs), and has been applied to audio.
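The "exploration" the paper describes boils down to moving through the auto-encoder's latent space and decoding as you go. The encoder/decoder themselves are the trained models from the paper; this sketch only shows the interpolation step between two latent codes (the vectors here are stand-ins, not real encoder outputs):

```python
import numpy as np

def interpolate_latents(z_a, z_b, steps=5):
    """Linear interpolation between two latent codes.

    z_a, z_b would come from a trained audio auto-encoder's encoder;
    feeding each intermediate vector to the decoder morphs one timbre
    into the other. Here they are placeholder vectors.
    """
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1 - a) * z_a + a * z_b for a in alphas])

z_a = np.zeros(16)   # stand-in for encoder(sound_a)
z_b = np.ones(16)    # stand-in for encoder(sound_b)
path = interpolate_latents(z_a, z_b, steps=5)
```

The paper's point about disentanglement is that with a plain auto-encoder, the midpoint of this path mixes *all* properties at once; a disentangled VAE would let you interpolate timbre while holding pitch fixed.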

GANSynth: Magenta making music with GANs

GANSynth: Making music with GANs How does it work? GANSynth uses a Progressive GAN architecture to incrementally upsample with convolution from a single vector to the full sound. Similar to previous work we found it difficult to directly generate coherent waveforms because upsampling convolution struggles with phase alignment for highly periodic signals. Consider the figure below: The red-yellow curve is a periodic signal …
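The phase-alignment problem the excerpt mentions can be seen directly: for a periodic signal, the STFT phase at the signal's frequency bin rotates by a fixed amount every hop, and a generator that upsamples frame by frame has to reproduce that exact per-frame rotation or the output smears. A small numpy demonstration (frame/hop sizes chosen for illustration, not GANSynth's actual settings):

```python
import numpy as np

# Measure the frame-to-frame phase advance of a sine in an STFT.
sr, f, frame, hop = 16000, 437.5, 256, 64   # 437.5 Hz falls exactly on bin 7
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * f * t)
window = np.hanning(frame)

phases = []
for i in range(0, 4 * hop, hop):
    spec = np.fft.rfft(x[i : i + frame] * window)
    k = int(np.abs(spec).argmax())           # peak frequency bin
    phases.append(np.angle(spec[k]))

# The phase at the peak bin advances by 2*pi*f*hop/sr each frame —
# a constant the generator must track across every upsampling layer.
measured = (phases[1] - phases[0]) % (2 * np.pi)
expected = (2 * np.pi * f * hop / sr) % (2 * np.pi)
```

GANSynth sidesteps this by generating log-magnitude and *instantaneous frequency* (the derivative of this phase) rather than the raw waveform.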

GANSynth demo

MAESTRO dataset

Music Transformer – generating music with long-term structure

Music Transformer article

Kaggle data bridges not walls

The MAESTRO Dataset and Wave2Midi2Wave
MAESTRO (MIDI and Audio Edited for Synchronous TRacks and Organization) is a dataset composed of over 172 hours of virtuosic piano performances captured with fine alignment (~3 ms) between note labels and audio waveforms. This new dataset enables us to train a suite of models capable of transcribing, composing, and synthesizing audio waveforms with coherent musical structure on timescales …
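
For the data-compilation task above, the practical detail is that MAESTRO ships with a CSV manifest pairing each audio file with its aligned MIDI file and a train/validation/test split. A sketch of reading it with the stdlib — the column names follow the published v2.0.0 manifest, but the two rows here are made-up stand-ins, so verify against the actual download:

```python
import csv
import io

# Stand-in for maestro-vX.X.X.csv from the dataset download.
manifest = io.StringIO(
    "canonical_composer,canonical_title,split,year,midi_filename,audio_filename,duration\n"
    "Example Composer A,Example Piece A,train,2018,2018/a.midi,2018/a.wav,540.2\n"
    "Example Composer B,Example Piece B,validation,2017,2017/b.midi,2017/b.wav,312.9\n"
)

rows = list(csv.DictReader(manifest))
train = [r for r in rows if r["split"] == "train"]          # files to train on
total_hours = sum(float(r["duration"]) for r in rows) / 3600  # sanity check
```

Respecting the provided split matters here: the split is by performance, so the same piece never leaks between train and test.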
The Deep Music Visualizer: Using sound to explore the latent space of BigGAN | by Matt Siegelman | Towards Data Science
A deep music video. Want to make a deep music video? Wrap your mind around BigGAN. Developed at Google by Brock et al. (2018)¹, BigGAN is a recent chapter in a brief history of generative adversarial networks (GANs). GANs are AI models trained by two competing neural networks: a generator creates new images based on statistical patterns learned from a set of example images, and a discriminator …
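The core trick of the deep music visualizer is pacing movement through the GAN's latent space with the audio: louder frames take bigger steps, so the imagery churns with the music. A sketch of just the latent-walk part (BigGAN itself is not loaded; `sensitivity` and the loudness values are illustrative stand-ins):

```python
import numpy as np

def latent_walk(loudness, dim=128, sensitivity=0.5, seed=0):
    """Random walk through a GAN latent space, paced by audio loudness.

    Each video frame's step size is proportional to that frame's RMS
    loudness, so loud passages move quickly through latent space and
    quiet ones barely drift. Decoding each vector with a generator
    such as BigGAN would yield the video frames.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(dim)
    path = [z.copy()]
    for level in loudness:
        z += sensitivity * level * rng.standard_normal(dim)
        path.append(z.copy())
    return np.stack(path)

loudness = [0.1, 0.9, 0.2]   # stand-in per-frame RMS values
path = latent_walk(loudness)
steps = np.linalg.norm(np.diff(path, axis=0), axis=1)  # distance moved per frame
```

The actual tool also maps pitch content onto BigGAN's class vector, which this sketch omits.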
