xy-pad/index.html

<!DOCTYPE html>
<html lang="en">
	<head>
		<meta charset="UTF-8" />
		<meta http-equiv="X-UA-Compatible" content="IE=edge" />
		<meta name="viewport" content="width=device-width, initial-scale=1.0" />
		<title>XY PAD Example</title>
		<link rel="stylesheet" href="style.css" />
		<script src="pad.js" defer></script>
		<script src="distance.js" defer></script>
	</head>
	<body>
		<h1>Trail~hunger~riverbed~dust</h1>
		<p>Using semantic distance as an instrument to modulate sound effects</p>

		<h2>XY PAD</h2>
		<p>A way to control some parameters using spatial information.</p>
		<ul>
			<li>Movements on the X axis control the color</li>
			<li>Movements on the Y axis control the size</li>
		</ul>
		<div class="pad" id="xy-pad"></div>
		<div class="target" id="xy-target"></div>
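		<p>
			A minimal sketch of how the mapping above could work. This is not the content of
			<code>pad.js</code>: the helper names, the 0-360 hue range and the 20-200 px size range
			are assumptions made for illustration. Only the element ids come from this page.
		</p>
<pre><code>
```javascript
// Hypothetical helpers, not the contents of pad.js.

// Clamp a value into the [min, max] range.
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Convert a pointer position into normalised pad coordinates in [0, 1].
// `rect` has the shape returned by element.getBoundingClientRect().
function normalise(clientX, clientY, rect) {
  return {
    x: clamp((clientX - rect.left) / rect.width, 0, 1),
    y: clamp((clientY - rect.top) / rect.height, 0, 1),
  };
}

// Map normalised coordinates to a colour (X axis) and a size (Y axis).
// The hue and size ranges are arbitrary choices.
function padToStyle(pos) {
  const hue = Math.round(pos.x * 360);
  const size = Math.round(20 + pos.y * 180);
  return { background: "hsl(" + hue + ", 80%, 50%)", size: size + "px" };
}

// Browser wiring, skipped when no DOM is available (for example in Node).
if (typeof document !== "undefined") {
  const pad = document.getElementById("xy-pad");
  const target = document.getElementById("xy-target");
  pad.addEventListener("pointermove", (event) => {
    const pos = normalise(event.clientX, event.clientY, pad.getBoundingClientRect());
    const style = padToStyle(pos);
    target.style.background = style.background;
    target.style.width = style.size;
    target.style.height = style.size;
  });
}
```
</code></pre>
		<p>
			Keeping the mapping in pure functions means the same 0-1 values could later drive a
			sound parameter instead of a CSS property.
		</p>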

		<h2>Distance</h2>
		<p>
			The same principle also works with the distance from an element. <br />
			Here we use the distance between the word <em>Corner</em> and the cursor.
		</p>
		<div id="distance-pad">
			<span id="distance-round">Corner</span>
		</div>
		<div class="target" id="distance-target"></div>
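		<p>
			A rough sketch of the distance idea, again not the content of
			<code>distance.js</code>. The helper names, the 300 px "too far" radius and the choice
			of driving the opacity of the target are assumptions; only the element ids come from
			this page.
		</p>
<pre><code>
```javascript
// Hypothetical helpers, not the contents of distance.js.

// Centre of a rectangle shaped like element.getBoundingClientRect().
function centreOf(rect) {
  return { x: rect.left + rect.width / 2, y: rect.top + rect.height / 2 };
}

// Euclidean distance between two points.
function distance(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// Turn a distance into a 0..1 control value:
// 1 at the centre, 0 at maxDistance or beyond.
function proximity(dist, maxDistance) {
  return Math.max(0, 1 - dist / maxDistance);
}

// Browser wiring, skipped when no DOM is available (for example in Node).
if (typeof document !== "undefined") {
  const pad = document.getElementById("distance-pad");
  const word = document.getElementById("distance-round");
  const target = document.getElementById("distance-target");
  pad.addEventListener("pointermove", (event) => {
    const cursor = { x: event.clientX, y: event.clientY };
    const dist = distance(centreOf(word.getBoundingClientRect()), cursor);
    // 300 px is an arbitrary radius; opacity is an arbitrary parameter to drive.
    target.style.opacity = String(proximity(dist, 300));
  });
}
```
</code></pre>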

		<h2>Semantic distance</h2>
		<p>
			The same idea could be applied with the semantic distance between two words. <br />
			In order to do that we can work with
			<a href="https://machinelearningmastery.com/what-are-word-embeddings/">
				Word Embedding
			</a>
			and
			<a href="https://towardsdatascience.com/word2vec-explained-49c52b4ccb71"> Word2Vec </a>.
			Word embeddings turn words into numbers (more precisely into
			<a href="https://en.wikipedia.org/wiki/Vector_(mathematics_and_physics)">vectors</a>,
			that is, quantities with a direction and a magnitude).
		</p>
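		<p>
			Once words are vectors, one common way to compare them is cosine similarity, which
			looks at the angle between two vectors. The sketch below assumes that measure and uses
			made-up three-dimensional vectors; real embeddings come from a trained model and have
			hundreds of dimensions.
		</p>
<pre><code>
```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Cosine similarity: 1 for the same direction, 0 for orthogonal vectors,
// -1 for opposite directions.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Made-up 3-dimensional "embeddings", for illustration only.
const toyVectors = {
  river: [0.9, 0.1, 0.2],
  stream: [0.8, 0.2, 0.3],
  hunger: [0.1, 0.9, 0.4],
};

// With these toy values "river" is closer to "stream" than to "hunger".
console.log(cosineSimilarity(toyVectors.river, toyVectors.stream));
console.log(cosineSimilarity(toyVectors.river, toyVectors.hunger));
```
</code></pre>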

		<p>
			<a href="https://semantle.novalis.org/">Semantic distance?</a> ← A nice game and
			explanation. <br />
			In it you have to guess a secret word, and the only hint is how distant each guess is.
			(I cannot find the source code anymore; it was on GitLab a month ago.)
		</p>

		<p>We can build an instrument that uses something like this.</p>

		<!-- <h2>Bonus</h2>
		<p>
			As far as I understood with the technique above we can also do something called
			dimensionality reduction. The relations between words are complex and high dimensional,
			meaning that they are organized in a high dimensional space. If we reduce this space to
			three, we can see a cloud of these words organized in 3D clusters. If we reduce this
			space to two we can print the words organized on a sheet of paper. Something like this.
		</p>
		<img src="https://miro.medium.com/max/1400/1*vvtIsW1AblmgLkq1peKfOg.png" /> -->

		<p>
			With this idea we can draw different maps and use them with the same instrument to get
			different results. Need to elaborate on this a bit!
		</p>
		<h2>Audio routing?</h2>
		<p>
			This instrument (or effect?) should ideally work in real time, affecting a signal and
			modulating the effect whenever a new word is spoken. Apparently Python is not great at
			manipulating digital sound in real time because it is not fast enough. Ok. So we can
			rely on something else, like SuperCollider or PureData, for the audio and use Python as
			a MIDI or OSC controller.
		</p>

		<p>The routing could be something like:</p>

		<ul>
			<li>Choose a reference word</li>
			<li>Input a new word through speech-to-text with Vosk</li>
			<li>
				How distant are the reference word and the new one? (word embedding &amp; semantic
				distance)
			</li>
			<li>
				The result is a percentage: 0% means out of range (too distant), 100% means the
				reference and the new word are the same word
			</li>
			<li>
				The result is sent to a platform for audio synthesis with a protocol like OSC or
				MIDI
			</li>
			<li>
				In the audio engine the routing is something like:
				<code>audio input --> effect --> audio output</code>
			</li>
			<li>
				The OSC message from the semantic distance script modulates some parameters of the
				effect, for example the intensity of a reverb.
			</li>
		</ul>
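		<p>
			The last steps of the list above could look roughly like this. It is only a sketch,
			written in JavaScript to match the other scripts on this page rather than in Python:
			the function names, the <code>floor</code> threshold and the
			<code>/effect/reverb</code> address are hypothetical. The byte layout follows the OSC
			1.0 message format (padded address string, padded type tag string, big-endian
			float32).
		</p>
<pre><code>
```javascript
// Map a similarity value to a percentage. Anything at or below `floor`
// counts as out of range (0%). The default floor of 0 is an assumption.
function similarityToPercent(similarity, floor = 0) {
  const scaled = (similarity - floor) / (1 - floor);
  return Math.round(100 * Math.min(1, Math.max(0, scaled)));
}

// Encode an OSC string: ASCII bytes, null-terminated,
// padded with zeros to a multiple of 4 bytes.
function oscString(text) {
  const length = Math.ceil((text.length + 1) / 4) * 4;
  const bytes = new Uint8Array(length); // zero-filled by default
  bytes.set(Array.from(text, (ch) => ch.charCodeAt(0)));
  return bytes;
}

// Build an OSC message carrying a single float32 argument.
function oscFloatMessage(address, value) {
  const head = oscString(address);
  const tags = oscString(",f"); // type tag string: one float argument
  const packet = new Uint8Array(head.length + tags.length + 4);
  packet.set(head, 0);
  packet.set(tags, head.length);
  // OSC numbers are big-endian, hence littleEndian = false.
  new DataView(packet.buffer).setFloat32(head.length + tags.length, value, false);
  return packet;
}

// "/effect/reverb" is a hypothetical address; 0.62 is a made-up similarity.
const percent = similarityToPercent(0.62);
const packet = oscFloatMessage("/effect/reverb", percent / 100);
console.log(percent, packet.length);
```
</code></pre>
		<p>
			The packet still has to be sent over UDP to the audio engine. A browser cannot do that
			directly, so this part would run in something like Node or be replaced by the Python
			controller mentioned above.
		</p>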
	</body>
</html>