Not my ADC 2024 talk

I went back to delivering two talks this year, but I’m not posting my open mic transcript because I made fun of a shady company with a thin-skinned founder. Not that one this time: a law of nature mandates that emotionally underdeveloped founders and shonky companies always come in pairs.

I’m not foolhardy enough to make my words that evening a matter of permanent record. Everything I riffed around is self-evidently true; anybody who’s been making audio toys for long enough, including me, has had at least one of their products cloned by them. But it’s a convention of our society that even a meretricious legal letter must be taken seriously. Being a solo engineer, I prefer, to use Rory Sutherland’s formulation, solving problems to winning arguments. Shallow pockets can be an advantage: it’s only worth suing the wealthy. Nevertheless, there are fights I don’t need to pick.

That said, I really did get some stickers made.

Stick them over somebody else’s swag to add a secondary level of sarcasm. Or clone them yourself for a tertiary level. Either way, second-class post is cheaper than defending a defamation lawsuit and pleasingly deniable. If you want some, let me know.

☙ ❧

My main talk this year was about taking over the Mantis synth from Chris Huggett. The product is out, people seem to like it, and it’s getting better bug fix by bug fix. (And what a lot of bugs creep into a project of that complexity: I’m grateful for testers even if the process can be maddening.) The video of that talk will officially be online at some point halfway through next year. The most immediately representative parts of my experience were, of course, the ones I chose to talk about. What I have here is some crumbs and gravy: diversions, corrections and so on that would still be jumping-off points for talks themselves.

Hey, if you don’t want a technical article, what are you even doing here?

Don’ts

Well, there aren’t any don’ts really: it’s your life. But at one point during this project I wasn’t much fun to be around. Being the only engineer on a polyphonic hybrid hardware synthesiser and up against a tight deadline, even if you’re pretty much born for the job, is just a bit too much.

Chris Huggett, industry luminary that he was, never seemed to have this problem when I knew him. He had at least one technical co-pilot within Novation, and usually more than one. For Mininova, it was me and a testing team.

Those were the days (back in 2011) when Chris was still using the processing platform he used throughout most of his Novation career: the Motorola 56000 family. At one point in the early 90s, everybody used this. I call it the Motorola, but that part of the company became Freescale, who were bought by NXP, who were very nearly bought by Qualcomm … Today, the DSP56F series (as it’s now called) is a forgotten corner of their Galactic Empire and ‘not recommended for new designs’. There are cheaper and easier ways of getting the same horsepower these days. But it was a cool and quirky chip, and he owned the software on it.

The hardware platform

I did the user-facing part of Mininova: MIDI, control surface, menu tree, and parameter management stuff. This was on a separate ARM chip, so the synth engine from my perspective was a big lump of 24-bit memory into which I occasionally poked a parameter or two at Chris’s direction when something salient changed. The other thing I did was make up for a blind spot of Chris’s, which was electromagnetic compatibility (EMC). Like anybody who has invented their own industry, Chris learned this trade by practice more than formal tuition. He developed his habits in a world where these regulations didn’t yet exist and, by the time they did, he was senior enough to let somebody else look after them.

I missed out the analogue revival part of his story at Novation. Chris moved from the Mininova (his last DSP56F project) to the Bass Station II, which is digitally-controlled analogue, and then the Peak, which apparently has an FPGA to run the oscillators. It seems strange to imagine Chris tackling a complex VHDL or Verilog project in the later part of his career, but that’s what engineers do.

I think I’m fairly safe in asserting that Mantis was the only project that Chris designed with the synth engine running on an ARM. But again, Circuit and Peak came after my time at Novation.

The code

As Chris was a tabs person and I’m a spaces person, I’d reformat his code as I assimilated it, but the changes run deeper than typography. To summarise, while all of Chris’s intentions endure, very little remains of his original code. There are a lot of reasons for this. As I mentioned in the talk, I had other ideas about how to use the chips he’d specified. The Cortex-M4 that runs the sound engine is the most expensive chip in the circuit. We bought it for sound, and it’s a rule of these products that the moment one frees up processor capacity, one finds a new use for it. So it seemed sensible to decouple it from the process of scanning a control surface, which requires mostly donkey-work that a far less capable chip can do instead.

It wasn’t that easy to wrench the synth engine away from the controls in practice. I was doing it against the deadline of Superbooth 2023. That felt like changing the tyres on a car while driving it. I just about made the deadline without breaking what was already there, but at the expense of not having the time to get the new audio converter running properly. The Superbooth prototype made a noise of sorts, but a very peculiar noise that was slightly out of tune owing to Chris’s eccentric choices of sampling frequencies and his habit of hand-baking these choices into derived constants. Paul had to style out my incomplete work during Superbooth and you can see this for yourself. After the new prototype made some really quite interesting sounds, the new chassis was used as a controller for Chris’s original proof-of-concept PCB.

At this point, let us celebrate having a boss who’s been around long enough and listens enough to understand how the development process works in practice. I’ve had managers who would hoik me into an office and call me names for what was effectively the fruit of multiple 80-hour weeks of progress. They were too self-absorbed to acknowledge somebody else’s sacrifice; too ignorant to regard an imperfectly-reached milestone as anything but a miss … Anyway, if I ever took a founder like Paul for granted, I certainly don’t now.

Some of my other changes were for reasons of efficiency: shaving off a few microseconds in places and opening the scope for extra versatility. Some were the result of migrating to a more modern development environment. When I started this project, we took the decision to spend about £1000 on a licence for the same environment that Chris used, which seemed like the cost of continuing his legacy, but it uses a C compiler that dates from about 2010.

New C compilers are magic and older ones are quite dumb. This one would optimise by just unrolling every loop it could until I ran out of memory. But the drawback of migrating to a more modern, less kludgy version of GCC was breaking changes. I couldn’t take the hacky USB drivers with me because they were sprawling and oddly compiler-specific, and I had to start again. It took a week, post-release, to write a new USB device driver. (You could do a lot worse than the excellent libusb_stm32, because ST’s own documentation for the device takes no prisoners. I rolled my own for the sake of compatibility with my existing protocol-layer stuff, but I found myself getting to know this library very well as a ‘what’ve I done wrong’ crib.)

The rest of the codebase ported in less than an hour, and I went from using all 128 kilobytes of program FLASH to using about 88 kilobytes with no loss of performance.

Here are the .c files I inherited:

I shared this source code with a friend who’d volunteered some evenings to help me with the first stages of taking it over: mostly finding which parts of the code were actually part of the project. He’s no small force in this industry, and I guard him jealously. But, after looking through the files for more than a quarter of an hour, he asked, ‘Where’s the synth?’

25% of Chris’s code, and most of the salient parts, resided in two files called panel.c and synth.c, the functions of which were nothing like as clearly delineated as their filenames promise.

As you can see, main.c was accompanied by mainUN.c when I took over. Ultranova … There was other third-party code that Chris would have been allowed to re-use, and occasional comments referring to FPGAs in code that didn’t get used betray its pedigree (fpgadelaytime.c, anybody?). It’s a hack of past work that, I assume, would have been cleaned up later had Chris enjoyed better health. (test_cph.c uses Chris’s initials; test_nb.c, I’d assume, is a test routine for an unnamed synth by the inestimable Nick Bookman of Novation. Heaven knows what it was doing here.)

But much of the code was Novation’s: Chris had a licence to use it as a consultant, but we don’t. One reason for a massive refactor was to rebuild our way comprehensively out of any IP issues, while creating our own value in reusable modules. Stubs of code I inherited like menu.c and alpha_cat_gen.c were designed for synthesisers with text handling, imported wholesale for the sake of a few dependencies, so that patch names and other strings sprouted off our screenless synth like phantom limbs. They eventually disappeared without trace.

Refactoring when you’re not precious

Best practices for inheriting code from an absent person or team are well established because it happens all the time. There is a legion of respected books available (search Amazon for ‘legacy code’). But they all assume that you’re inheriting a codebase that has test coverage and that works. It’s bad practice, we all agree, to throw out working, tested code. That’s your company’s competitive advantage. If your code hasn’t yet been deployed, very few rules apply aside from your own taste and judgement.

There were areas where Chesterton’s Fence was actually appropriate: the most careful strategy. Parsing by eye, line by line. Manually eliminating unused variables and branches and checking that the thing still compiles. Renaming and commenting in multiple passes as you go. Nothing must be removed until we understand what it does, what its side effects might be, and why it’s written that way. A large part of the process of recursive tinkering is training yourself to appreciate what’s there. The reverb was exactly like that, and I’ve written reverbs before. It was the only part of the code I printed out and annotated by hand. In hindsight it would have been better to do a first pass through it with my cursor.

Some code was compartmentalised carefully so it could be addressed in isolation at my leisure: the usual middle way, and often a precursor to bringing something under unit-test discipline. Chris’s envelope implementation started out as very clear code. Most of my work on it simply modernised the C, clarified the maths, and gave it its own file.

Aside from the control surface, perhaps a third of the code was chuck-away-and-start-again: much of this is because I knew a better way, or already had some hardened library code that would do the same job. The oscillators were rewritten for commercial reasons. I talked at ADC about why I made this decision: we wanted a better palette of timbres and more control than the prototype allowed us.

The 99,665 bytes of synth.c were re-expressed as a number of single-purpose, self-contained, stateless-as-possible source files in a folder called synth/, which is the closest the C language gets to modularity.

The device-specific stuff ended up in a hardware/ folder, and the things I reuse for every project were, as always, in their own included library.

Abstractions are never perfect. Things like the joystick, the FLASH wear-levelling for loading and saving synthesiser presets, and the obligatory multipurpose buffers all occupy a liminal space between the hardware and the synth engine. But they’re more properly abstracted than they were.

Incidentally, Chris had decided to generate the wave tables at run time. This was the only part of his code that used floating-point arithmetic (for the sine and cosine operations). I assume he did this either as a temporary measure so he could prototype different waveforms easily, or as a force of habit, because the other synths he’d written had very little ROM and ran startup routines that built exactly 100 kilobytes of look-up tables. That used to be a lot. Whatever the reason, I built them ahead of compile time with a Python script so they could be plucked straight from ROM.
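For flavour, here’s roughly what that kind of build-time generator looks like. This is a minimal sketch rather than the actual Mantis script: the table length, sample format and file name are assumptions, but the idea of baking the floating-point work into a C array at build time is the same.

    import math

    TABLE_LEN = 2048           # assumed table length, not the real figure
    AMPLITUDE = (1 << 15) - 1  # assumed 16-bit signed samples

    samples = [round(AMPLITUDE * math.sin(2.0 * math.pi * i / TABLE_LEN))
               for i in range(TABLE_LEN)]

    with open("sine_table.h", "w") as f:
        f.write("// Generated at build time; do not edit by hand.\n")
        f.write(f"static const int16_t sineTable[{TABLE_LEN}] =\n{{\n")
        for i in range(0, TABLE_LEN, 8):
            row = ", ".join(f"{s:6d}" for s in samples[i:i + 8])
            f.write(f"    {row},\n")
        f.write("};\n")

The linker then parks the array in FLASH with everything else, so the tables cost nothing at startup.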

Steal and borrow

The voice management code was the worst part I took over: a mess of side-effectful code that had been imported hurriedly from elsewhere, didn’t properly work, wasn’t the right shape for the synth, and would have taken ages to debug.

Voice assignment remains the biggest, messiest part of this firmware and the hardest to work on, because routing notes to voices, and stealing and returning them while complicated polyphony rules, two MIDI inputs, an arpeggiator, and local off mode all compete for your attention, is always going to pinch a little.

The voice allocator was 300 lines long with 16 state variables and GOTOs: not bad per se, especially given its clear origins in a multitimbral synth, but hard to maintain. Here’s a representative part of it from about 2400 lines into synth.c, complete with loads of state variables and some references to where it’s been used before:

Jules’s first ever keynote, Obsessive Coding Disorder, sprang to mind. My fingers itched to put this right but one approaches code like this with some trepidation: every time you try to tidy it, you’ll add bugs.

During my time at ROLI, I evolved my style about as close as I dared to a C version of the JUCE coding standard because that’s what we followed when we wrote C++. Changes are cosmetic to begin with — that’s the point, they’re for people — but I was able to stop swearing at it and start fixing it once I got the voice allocator to look like this:

I’m still not sure about it: it’d be more efficient written in descending order of priority, but there’s probably a reason why I decided to run every path. This routine gets called only once per note, so there are bigger gains to be had elsewhere.
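To make the ‘run every path’ idea concrete, here’s a toy illustration of the shape of the thing, sketched in Python rather than the C that’s actually in the firmware. None of the rules below are Mantis’s real stealing rules; the point is only the structure: score every voice and pick the best candidate, instead of returning early from a chain of priority checks.

    from dataclasses import dataclass

    @dataclass
    class Voice:
        active: bool
        note: int
        age: int           # time since note-on, in arbitrary units
        releasing: bool    # true if the voice is in its release stage

    def choose_voice(voices, new_note):
        """Evaluate every voice and return the best candidate to (re)use."""
        def score(v):
            if not v.active:
                return (0, 0)          # a free voice always wins
            if v.note == new_note:
                return (1, 0)          # re-striking the same note comes next
            if v.releasing:
                return (2, -v.age)     # then the longest-releasing voice
            return (3, -v.age)         # last resort: steal the oldest held voice
        return min(voices, key=score)

Written as an early-exit chain it would do less work on average, which is the ‘descending order of priority’ version; scoring everything is easier to reason about, and for a routine that runs once per note the difference is noise.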

On a project this size (about 700 kilobytes of source code, depending on how it’s counted), you can conduct your own code reviews. Several weeks after writing something, you have no recollection of having written it, no prior assumptions about how it works, and an outsider’s perspective on what you’re like to work with. Good code is written for humans first, because a compiler doesn’t spend long enough in the code to care, and these days it will blithely optimise away abstractions that aid readability, like dividing a long routine into standalone functions so that its structure and intent pop right out of the screen (as above). You get a very stark impression of the limits of your powers.

The voice management code was something I needed two or three good days to tackle. The original work was removed gradually from much of the codebase so it didn’t distract me, which meant that Mantis went from being a duophonic synth in Chris’s prototype to a monosynth until the last three months, when eventually I could face the work. It was not too hard once my fatigue had subsided enough to take it on: the writer’s block came in part from the quality of what I’d inherited. The very final features I made work were the ‘duo’ and ‘quad’ buttons.

Finally, polishing and maintenance aren’t sexy, but they make the difference between a mediocre business and a great one. A potentiometer usually needs a better law than the first guess so the synth responds more musically. Attacks and releases are particularly troublesome because they need meaningful control that spans three or more orders of magnitude of time: 10 milliseconds to 10 seconds, with the same subtlety of control at either end of the scale, is the goal. All this has to be ready before the sound designers can start writing presets, or their work is going to sound different every time they upgrade their firmware.
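As an illustration of the kind of law that ends up being needed, here’s a minimal sketch. It isn’t the actual Mantis mapping, and the endpoints are simply the figures quoted above, but an exponential curve like this gives the same musical resolution near 20 ms as it does near 5 s, where a linear one crams everything interesting into the first few degrees of travel.

    T_MIN = 0.010   # 10 milliseconds at the bottom of the pot's travel
    T_MAX = 10.0    # 10 seconds at the top

    def envelope_time(pot: float) -> float:
        """Map a normalised pot position (0..1) to a time in seconds, so that
        equal pot movements multiply the time by the same ratio."""
        return T_MIN * (T_MAX / T_MIN) ** pot

    # Halfway round the pot lands at about 0.32 s, not at 5 s as a linear law would.
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"pot {p:4.2f} -> {envelope_time(p) * 1000:8.1f} ms")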

Fixing bugs can be a two-days-per-week job in the year after shipping but it doesn’t generate money: the advantage it confers is the long-term social capital of reputation. If you’re lucky and careful, many bugs you catch will result in library improvements that fix other projects, and do your future self a favour.

Finesse and fixing is much of the second half of any project. It helps to be a musician yourself because it’s really hard to specify precisely when something feels right.

PS. There’s a glaring bug in the code I flashed up on page 31 of the talk, fortunately with rather minor consequences. No prizes offered for seeing it, but it’s now fixed.

The OSC Advanced Sound Generator

At the end of my talk, I called the ASG prototype that we rediscovered in Chris’s loft the British Fairlight. Although it never saw commercial release, it was demonstrated several times in the mid-80s and there are a couple of published mentions of it that survive online. Here’s a ‘family photo’ of Chris’s early works (with an early Chris). It’s right in the middle of the shot, bulky and glowering with an EDP Gnat balanced on top of it.

The reverse image search doesn’t tell me where this was scanned from so I cannot attribute it. But this picture from my talk, taken by Paul last month, deserves re-posting:

I said during the talk that this would not be worth resurrecting in anything like its final form. Of course it isn’t. There’d be little hunger in today’s music tech environment for a British answer to the Fairlight, even as an app. What’s attractive isn’t the technology or workflow but the principle that gave rise to it. It’s a radical contribution to the ongoing conversation about what a musician’s creative process should look like in five years’ time. You’d have to hit ‘undo’ a few hundred times to create a customer base for 1985’s version of that conversation.

When I propose that the ASG lives on in Chris’s later projects, it’s no mere metaphor. I’d bet that the floppy drive from the front panel ended up in an Akai S1000 prototype. All the more reason for acknowledging the family tree of Chris’s creations that are spiritual successors to the ASG: the Akai samplers; the Supernova digital polysynth and its offspring; the impOSCar software instrument that was very sensitively informed by both OSCar and the ASG.

At some point, it’ll be a leisure project to get that box into a working state, if we can, and to see what vestiges remain of the world’s rarest workstation synth/sampler. I suspect its source code is lost, but Chris did keep some floppy disks from that era, so there’s a fighting chance we’ll find something even if the tools to build it have to be recreated.

If I ever get round to it, of course I’ll be blogging about it.

Follin from Grace

Now, I work in the synthesiser business because I wanted to be a musician, but also because I wanted to mess with electronics. This thing that I call a career is largely a consequence of being too scared to commit fully to either track, and I’m compelled to try to make a living from what I’d happily do for free. Even so, occasionally I get bored and research things like this.

In the last few months I happened across mentions of a familiar but rather obscure type-in music demo by a composer called Tim Follin, originally published in a British magazine in 1987. I thought this was an ancient curio that only I’d remember vividly from the first time around, with the unprepossessing title Star Tip 2. But apparently it’s been discovered by a younger generation of YouTubers, such as Cadence Hira and Charles Cornell, both of whom went to the trouble of transcribing the thing by ear. (You should probably follow one of those links or you won’t have the faintest grasp of why I’ve bothered to write this post.)

My parents — lucky me! — subscribed me to Your Sinclair at some point in 1986. It was the Sinclair magazine you bought if you wanted to learn how the computer really worked. So I was nine years old when the August ’87 issue flopped onto the doormat. One of the first things I must have done was to type in this program and run it. After all, I had my entire adult trajectory mapped out at that age. Besides, it’s a paltry 1.2 kilobytes of machine code. Even small fingers can type that in within about an hour and a half, and you wouldn’t normally expect much from such a short program.

When I got to the stage where I dared type RANDOMIZE USR 40000, what came out of the little buzzer was a thirty-eight second demo with three channels of audio: a prog rock odyssey with detune effects, crazy virtuosic changes of metre and harmony, even dynamic changes in volume. All from a one-bit loudspeaker. It seemed miraculous then and, judging by the breathless reviews of latter-day YouTubers with no living experience of 8-bit computers, pretty miraculous now. And the person who made all this happen — Tim Follin, a combination of musician and magician, commissioned by an actual magazine to share the intimate secrets of his trade — was fifteen years old at the time. Nauseating.

After three and a half decades immersed in this subject, my mathematics, Z80 assembler, music theory, audio engineering, and synthesis skills are actually up to overthinking a critical teardown of this demo.

The code is, of course, compact. Elegant in its own way, and clearly written by a game programmer who knew the processor, worked to tight time and memory budgets, and prioritised these over any consideration for the poor composer. This immediately threw up a problem of authorship: if Tim wrote the routine, surely he would have invested effort to make his life simpler.

Update: I pontificated about authorship in the original post, but my scholarship lagged my reasoning. I now have an answer to this, along with some implicit explanation of how they worked and an archive of later code, courtesy of Dean Belfield here. Tim and Geoff Follin didn’t work alone, but their workflow was typically crazy for the era.

From a more critical perspective, the code I disassembled isn’t code that I’d have delivered for a project (because I was nine, but you know what I mean.) The pitch counters aren’t accurately maintained. Overtones and noise-like modulation from interactions between the three voices and the envelope generators are most of the haze we’re listening through, and the first reason why the output sounds so crunchy. And keeping it in tune … well, I’ll cover that later.

Computationally, it demands the computer’s full attention. The ZX Spectrum 48K had a one-bit buzzer with no hardware acceleration of any kind. There is far too much going on in this kind of music to underscore a game in play. Music engines like these played title music only, supporting less demanding tasks such as main menus where you’re just waiting for the user to press a key.

The data

In the hope that somebody else suffers from being interested in this stuff, here’s a folder containing all the resources that I put together to take apart the tune and create this post:

Folder of code (disassembly, Python script, output CSV file, MIDI files)

The disassembly

The playback routine is homophonic: it plays three-note block chords that must retrigger all at once. No counterpoint for you! Not that you’d notice this restriction from the first few listens.

Putting it in the 48K Spectrum’s upper memory means it’s contention free. The processor runs it at 3.5 MHz without being hamstrung by the part of the video electronics that continually reads screen memory, which made life a lot easier both for Tim and for me.

So there are the note pitches, which appear efficiently in three-byte clusters, and occasionally a special six-byte signal beginning 0xFF to change the timbre. This allows the engine to set:

  • A new note duration;
  • An attack speed, in which the initial pulse width of 5 microseconds grows to its full extent of 80 microseconds;
  • A decay speed, which is invoked immediately after the attack and shortens the pulse width again;
  • A final pulse width when the decay should stop.

This arrangement is called an ADS envelope, and it’s used sparingly but very effectively throughout. In practice, notes cannot attack and decay slowly at different speeds in this routine, because it alters the tuning too much.
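To make the format concrete, here’s a hedged sketch of how data laid out like that could be walked in Python. The ordering of the five bytes that follow the 0xFF marker is my guess from the list above rather than a claim about the disassembly, and the function name is mine.

    def parse_track(data: bytes):
        """Yield (chord, settings) pairs from note data in the shape described
        above: three-byte chords, interrupted by six-byte 0xFF commands."""
        settings = {"duration": 0, "attack": 0, "decay": 0, "final_width": 0}
        i = 0
        while i < len(data):
            if data[i] == 0xFF and i + 6 <= len(data):
                # Assumed field order: duration, attack speed, decay speed,
                # final pulse width, plus one byte I'm not claiming to understand.
                dur, atk, dec, width, _unknown = data[i + 1:i + 6]
                settings.update(duration=dur, attack=atk, decay=dec,
                                final_width=width)
                i += 6
            elif i + 3 <= len(data):
                chord = tuple(data[i:i + 3])    # one pitch byte per channel
                yield chord, dict(settings)
                i += 3
            else:
                break                           # trailing bytes: not a full cluster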

Multichannel music on a one-bit speaker

The measurement of symmetry of pulse waveforms of this kind is called the duty cycle. For example, a square wave has high and low states of equal length, so a 50% duty cycle.

80 microseconds, the longest pulse width used in Star Tip 2, is a very low duty cycle: it’s less than four audio samples at 44.1 kHz, and in the order of 1–2% for the frequencies played here. Low-duty pulse width modulation (PWM) of this kind is the most frequent hack used to give the effect of multiple channels on a ZX Spectrum.

There are many reasons why. Most importantly, it is simple to program, as shown here. You can add extra channels and just ignore existing ones, because the active part of each wave is tiny and seldom interacts with its companions. Better still, you can provide the illusion of the volume changing by exploiting the rise time of the loudspeaker and electronics. In theory, all a one-bit speaker can do is change between two positions, but making the pulse width narrower than about 60 microseconds starts to send the speaker back before it has had time to make a full excursion, so the output for that channel gets quieter.
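Here’s a quick sample-domain sketch of the trick, nothing to do with the original Z80 loop: each channel is just an 80-microsecond pulse at the start of every cycle, and the combined one-bit output amounts to the logical OR of the three, which almost never collides. The chord frequencies are arbitrary; the printout simply echoes the duty-cycle figures above.

    import numpy as np

    SAMPLE_RATE = 44_100
    PULSE_WIDTH = 80e-6                                    # longest pulse in Star Tip 2
    t = np.arange(int(SAMPLE_RATE * 0.2)) / SAMPLE_RATE    # 0.2 s of audio

    def low_duty_channel(freq_hz):
        """True wherever the speaker is high: the first 80 µs of each cycle."""
        return (t % (1.0 / freq_hz)) < PULSE_WIDTH

    chord = (130.8, 164.8, 196.0)     # an arbitrary test chord, not from the tune
    one_bit_out = np.logical_or.reduce([low_duty_channel(f) for f in chord])

    for f in chord:
        print(f"{f:6.1f} Hz: duty cycle {100 * PULSE_WIDTH * f:4.2f} %, "
              f"{PULSE_WIDTH * SAMPLE_RATE:.1f} samples per pulse at 44.1 kHz")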

The compromise is the second reason for the crunchy timbre of this demo: low-duty PWM makes a rather strangulated sound that always seems quite quiet. This is because the wave is mostly overtones, which makes it harmonically rich in an almost unpleasant way. Unpicking parallel octaves by ear is almost impossible.

The alternative to putting up with this timbre is to use voices with a wider pulse width, and just let them overload when they clash: logically ORing all the separate voices together. When you are playing back one channel, you have the whole freedom of the square wave and those higher-duty timbres, which are a lot more musical.

This is computationally more involved, though, because you have to change your accounting system to cater for all the voices at once, and it strengthens the usual modulation artefacts of distortion: sum and difference tones of the notes you are playing.
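Continuing the toy simulation above, here’s the same chord done the second way: full 50% squares ORed together. Each voice on its own is fuller and louder, but the output spends much more of its time pinned high, and the collisions are exactly where the sum-and-difference products come from.

    import numpy as np

    SAMPLE_RATE = 44_100
    t = np.arange(int(SAMPLE_RATE * 0.2)) / SAMPLE_RATE
    chord = (130.8, 164.8, 196.0)     # the same arbitrary chord as before

    def square_channel(freq_hz):
        """A 50% duty square: high for the first half of every cycle."""
        return (t % (1.0 / freq_hz)) < (0.5 / freq_hz)

    voices = [square_channel(f) for f in chord]
    or_mix = np.logical_or.reduce(voices)

    # A crude measure of the clash: how often more than one voice wants the
    # speaker high at the same moment. Those collisions are the intermodulation.
    overlap = np.mean(np.sum(voices, axis=0) > 1)
    print(f"OR mix is high {100 * np.mean(or_mix):.0f}% of the time; "
          f"voices collide {100 * overlap:.0f}% of the time")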

So, on the 48K Spectrum, you have to choose between performing multichannel music through the washy, nasal timbre of low-duty PWM and putting up with something that sounds like the world’s crummiest distortion pedal. (Unless, like Sony in the late Nineties, you crank up the output sample rate to a couple of megahertz or so, add a fierce amount of signal processing, really exploit the loudspeaker excursion hack so it can play anything you want it to, and call the result Direct Stream Digital. But that’s a different story. A couple of academics helped to kill that system as a high-fidelity medium quite soon afterwards by pointing out its various intractable problems. Still, when it works it works, and Sony gave it a very expensive try.)

There’s one nice little effect that’s a consequence of the way the engine is designed: a solo bass B flat that first appears in bar 22, about 27 seconds in. The three channels play this note in unison, with the outer voices detuned up and down by about a twelfth of a semitone. We’re used to this kind of chorus effect in synth music, but the result is especially gorgeous and unexpected on the Spectrum’s speaker.

You don’t get many cycles in 0.2 seconds for bass notes, but here are the three detuned voices in PWM, with the pulse troughs drifting apart over time.

A tiny bit of messing with the code

I didn’t do much with this code on an actual ZX Spectrum, but it’s possible to silence different voices on an emulator by hacking the OUT (254), A instructions to drive a different port instead. You can’t just replace them with no-operations, or the speed and pitch change. POKE 40132,255 mutes channel one; 40152,255 mutes channel two; 40172,255 mutes channel three.

Python data-to-MIDI conversion

Are you running notes with a slow attack or decay? If so, the chord loop runs somewhat faster than it does when your code bottoms out in the envelope sustain stage. Are you now playing a higher-pitched note? Your chord loop now runs a little slower on average because the speaker needs to be moved more often, so all your other notes go slightly flatter.

The pitch of every note in this piece of code depends on everything else. Change one detail and it all goes out of tune. I had visions of Tim Follin hand-tuning numbers in an enraging cycle of trial and error, which would somewhat have explained his devices of ostinato and repetition. But it turns out from his archive, now online thanks to Dean Belfield, that he possessed some compositional tools that were jointly maintained by his associates. Having written the calculator in the opposite direction, I can confidently say: rather them than me.

To get the note pitches out accurately, you need a Spectrum emulator, so I wrote the timing part of one in Python. It counts instruction cycles, determines the average frequency of chord loops given the configuration of envelopes and pitches for every set of notes, and uses these to extract pitches and timings directly. The Python data-to-MIDI script takes the data straight from the Spectrum source code, and uses MIDIUtil to convert this to MIDI files.
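The whole script isn’t worth reproducing here, but the conversion at its heart is small. The names below, and the framing of a voice toggling once every so many passes through the chord loop, are mine rather than anything lifted from the disassembly.

    import math

    CPU_HZ = 3_500_000    # 48K Spectrum Z80 clock, uncontended in upper RAM

    def voice_frequency(tstates_per_loop: float, loops_per_cycle: float) -> float:
        """Average output frequency of one voice, given the average cost of a
        chord-loop pass in T-states and the passes needed per speaker cycle."""
        return CPU_HZ / (tstates_per_loop * loops_per_cycle)

    def midi_note(freq_hz: float) -> float:
        """Fractional MIDI note number, so microtonal error is preserved."""
        return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

    def cents_error(freq_hz: float) -> float:
        """Distance from the nearest equal-tempered pitch, in cents."""
        n = midi_note(freq_hz)
        return 100.0 * (n - round(n))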

The code generates three solo tracks for the three voices, follin1.mid to follin3.mid, and all three voices at once, as follin123.mid. The solo voices include fractional pitch bend, to convey the microtonal changes in pitch that are present in the original tune. (By default, the pitch bend is scaled for a synthesiser with a full bend range of 2 semitones, but that can be changed in the source code. Pitch bend is per channel in MIDI, not per note, so that data is absent from the three-voice file.)

The MIDI files also export the ADS envelopes, as timed ramps up and down of MIDI CC7 (Volume) messages. Because the music is homophonic, they are identical for every voice, meaning that the composite track can contain them too.
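The export stage boils down to a handful of MIDIUtil calls. The sketch below is compressed, and the note, bend and CC values are placeholders rather than figures from the real CSV, but it shows how a fractional pitch becomes a per-channel pitch-wheel event scaled to the assumed 2-semitone bend range, and how the ADS envelope turns into a CC7 ramp.

    from midiutil import MIDIFile

    BEND_RANGE = 2.0    # the receiver's assumed full bend range, in semitones

    def bend_value(semitone_offset: float) -> int:
        """Scale a fractional-semitone offset into MIDI's ±8192 pitch-wheel range."""
        return max(-8192, min(8191, round(semitone_offset / BEND_RANGE * 8192)))

    mf = MIDIFile(1)    # one track: one solo voice
    mf.addTempo(track=0, time=0, tempo=120)

    # One placeholder note: 100 ms is 0.2 beats at 120 bpm.
    mf.addNote(track=0, channel=0, pitch=70, time=0.0, duration=0.2, volume=100)
    mf.addPitchWheelEvent(0, 0, 0.0, bend_value(0.08))    # about 8 cents sharp
    mf.addControllerEvent(0, 0, 0.00, 7, 30)              # CC7: start of the attack
    mf.addControllerEvent(0, 0, 0.05, 7, 127)             # attack peak
    mf.addControllerEvent(0, 0, 0.15, 7, 90)              # decay to the final level

    with open("follin1_sketch.mid", "wb") as f:
        mf.writeFile(f)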

Microtonal pitch

There are some interesting microtones in the original data. Some of the short notes in the middle voice are almost 30 cents out in places, but not for long, and it seems to work in context. As does the quarter-tone passing note (is it purposeful? Probably not, but it all adds feeling) at bang on 15 seconds.

Cadence Hira’s uncanny transcription is slightly awry in bar 10 here: the circled note is actually a B-half-sharp that bridges between a B and what’s actually a C in the following bar. Meanwhile the bass channel is playing B throughout the bar. But she did this by ear through a haze of nasty PWM and frankly it’s a miraculous piece of work. Two things count against you transcribing this stuff. First, the crunchy timbre issues we’ve already discussed that stem from the way this was programmed. Second, the sheer speed of this piece. Notes last 100 ms: your ear gets twenty or thirty cycles of tone and then next! If you’re transcribing by ear you have to resort, as Cadence has, to filling in the gaps with your music theory knowledge of voice leading.

Most of the other notes are within 5 cents of being in tune. Once you expect them, you can just about hear the microtones in any of the multiple recordings of the original (example), but only if you slow them down.

A big CSV file is also generated for the reader’s pleasure, with:

  1. The time and duration of every note in seconds;
  2. The envelope settings for each note;
  3. All the interstitial timing data for the notes in T-states (more for my use than anybody else’s);
  4. Exact MIDI note pitches as floats, so you can inspect the microtonal part along with the semitones.

Because the detail in the original tune was quite unclear, it wasn’t possible to be sure whether the quirks in tonality were intentional. Originally I did not bother to include microtones, but added them to the MIDI a week after posting the first draft.

I’ve now confirmed my initial hunch that the microtones aren’t deliberate, but a consequence of the difficulty of keeping this program in tune. They also happen to be quite pleasant. But going beyond conventional tonality was not essential to appreciating the music, or to recreating the intentions of the composer.

Getting the timing right

Including exact note timings has proved interesting for similar reasons to tonality: the whole notes (semibreves, if you must) are about a sixteenth-note (a semiquaver, if you insist) shorter than intended because of the disparity in loop speeds. That is definitely unintentional because the specified durations are round numbers, with no compensation for different loop speeds. Again, the feeling of a slightly jagged metre works well in context.

But the care I have taken in accountancy seems to have paid off: you can compare this MIDI against an emulated Spectrum and the notes match, end to end, with a total drift of less than ten milliseconds.

The exact reconstruction: mix 1

As a way of avoiding real work, here’s an ‘absolutely everything the data gives us’ reconstruction of Star Tip 2 — microtones, envelopes and all — played through a low-duty pulse wave on a Mantis synth.

In other words, this is simply the original musical demo on a more advanced instrument: one that can count cycles and add channels properly. The only thing that I added to the original is a little stereo width, to help separate the voices.

That B flat unison effect (above) is unfortunately a wholly different phenomenon when you’re generating nice band-limited pulses, mixing by addition, and not resetting phase at the beginning of every note. It’s gone from imitating something like a pulse-width modulation effect (cool) to a moving comb filter (less cool).

The quick and dirty reinterpretation: mix 2

This was actually my first attempt. My original reluctance to do anything much to the tune meant I didn’t labour the project, using plain MIDI files with no fancy pitch or articulation data.

But, because I’m sitting next to this Mantis, I put a likely-sounding voice together, bounced out the MIDI tracks, rode the envelope controls by hand, and ignored stuff I’d have redone if I were still a recording musician.

Adding a small amount of drive and chorus to the voice creates this pleasing little fizz of noise-like distortion. It turns out that some of that is desirable.

Epilogue

Now I’ve brought up the subject of effects and production, neither of these examples is finished (or anywhere near) by the production standards of today’s computer game soundtracks. But I’m provoked by questions, and I hope you are too. First, philosophy: could either mix presume to be closer to the composer’s intentions than the original? Then, aesthetics: does either of these examples sound better than the other?

This brings me to the main reservation I had about starting this project in the first place: that using a better instrument might diminish the impact of the work. Star Tip 2 continues to impress because of its poor sound quality, not in spite of it. Much of its power is in its capacity to surprise. It emerged from an unassuming listing, drove the impoverished audio system of the ZX Spectrum 48K to its limit, and was so much better than it needed to be. But the constraints that dictated its limits no longer exist in our world. An ambitious electronic composer/engineer would need to explore in a different direction.

Exploring and pushing technical boundaries, then, is not the only answer. An equally worthy response would be a musical one. Twenty-four years after Liszt wrote Les jeux d’eaux à la Villa d’Este for the piano, Ravel responded with what he’d heard inside it: Jeux d’eau. One great composer played off the other, but the latter piece turned out to be more concisely expressed, more daring, and quickly became a cornerstone of the piano repertoire. It’s hardly an exaggeration to say that it influenced everybody who subsequently wrote for the instrument. (Martha Argerich can pretty much play it in her sleep, because she’s terrifying.)

I’m not a composer, though, and definitely not one on this level. A magnificent swing band arrangement of a subsequent Tim Follin masterpiece? Somebody else’s job. Designing the keyboard used in that video? Creating the Mantis synth used above? Wasting a weekend on an overblown contemplation of a cool tune I typed in as a child? Definitely mine.

ADC Open Mic 2023: Where the ideas come from

When an artist produces a work about an artist producing a work, it’s hard not to detect a cry for help.

My dearest cousin Geoffrey,

I have run out of the food of inspiration, and am now digesting myself. Am going quite spare. Any crumb of an idea that you might spare me, might spare me.

It occurs to me that this postmodern fad for self-reference in literature might be getting rather stale.

Yours etc.,
my dearest cousin Geoffrey.

The question ‘Where do your ideas come from?’ is an inside joke among writers. An oyster can’t tell you how to make a pearl. The grit gets in, who knows how, and the rest is nature.

But a creative endeavour needs an irritant or stimulus of some kind: the thing that gets you to the point where gradually revealing the work is enough to propel you forward.

Chuck Close was (until recently) a painter, but he might be just as famous for giving the world an aphorism: ‘Inspiration is for amateurs: the rest of us just show up and get to work’. It suggests, if you’re suitably attuned, that you can just pluck a starting point out of background noise.

Here’s one starting point:

A couple of years ago, people couldn’t go on holiday, so they spent their holiday money on guitars and microphones and plug-ins. And all was well, as long as our loved ones stayed alive, and we didn’t need microchips to build hardware with.

This year, the music tech budget is right back on holidays. Or it’s blown on something frivolous and stupid, like not freezing to death in winter. Everybody here is working hard to get back to where we were. And we’ll probably end up there anyway, but not by being complacent or losing ground.

A big consequence of having a slow year is that it makes public companies cheaper to invest in. Year by year, it’s getting more probable that any given person in this room has spent part of their waking life in the service of a private equity company, who have decided, in their own language, to go long.

These companies are like buy-to-let landlords in London, who convert every cubic foot of enclosed air into a mezzanine with a mattress on it. As with landlords, there are many exceptions, but not enough to soften the stereotype.

Inside a PLC, everything long-term, everything commercially risky, every corridor and stationery cupboard, and all but the most predictable project with the nearest horizon is under pressure to deliver a rapid return, or else be dissolved and absorbed into a more immediate hit.

Quarterly growth targets. They’re what inspired so many of us to get into audio.

The pressure towards banal uniformity in everything, everywhere, is well documented. This is just one of many causes, and ought to make us a bit angry. But I’m done as an employee, and I’m done being angry about things I can’t change, and experimental data suggests that being able to sit on a stage and whine about banal uniformity is a hard-won and delicate privilege.

Besides, when short-sightedness and inertia overtake a competitor it’s a great day for me.

But, as deadly sins go, anger is a powerful creative force. There’s always been money in anger, as demagogues and journalists know, and now there’s a whole corner of the tech world that exploits it to the extent that it’s called the rage economy.

Especially there, though, we end up with banal uniformity. For all the buttons it pushes in our brain stems, social media is flatly unsatisfying and becoming more so. A restaurant where all the meals are free because someone’s chewed them already. And then taken a cocktail stick and written ‘Try Grammarly’ in the dribble.

Who else here has deactivated a social media account in the last few weeks?

[hard to tell with the lights shining in my face and the auditorium unlit, but perhaps 10-15% of hands go up]

I’m left with LinkedIn and my God.

‘My company has a new product out. In case you missed my previous fifteen posts over the last two weeks, here’s another silent video of me playing with it in my home studio.’

‘Do you want a solution that’s truly unique and crafted to your business? We’ve got a warehouse full of the bastards.’

Is there a word for grudging capitalism? An acceptance of our fate in a bigger machine, like a stoned Karl Marx? A bearded Victorian bellowing ‘Workers of the world! Could we just keep money but, like, stop being such dicks about it?’

The second big trend that’s changing our world is jumping out of every surface here so enthusiastically that I barely need to introduce it. In common with the other innovations that have turned our industry inside out since the last time I laced a tape, machine learning changes the nature of inspiration, expression, and the art itself.

The tools for wielding it are getting so accessible that we’re running out of excuses not to play with them.

Computers aren’t creative in the same way we are, and may never be. But it doesn’t matter. Crafters romanticise the process, but success is mostly about the product. Other animals display whimsical creativity; we’re just the only primate that can hold a paintbrush properly. There’s no reason why creativity has to be the sole preserve of organic chemicals either.

So I’m going to leave you with this. I asked a friend what I should say tonight, and she then asked ChatGPT for ‘a short closing speech on the theme “keeping your enemies closer”. Intended for an audience of audio engineers who are very intelligent. Make it funny.’

For some reason, ChatGPT responded in the voice of P T Barnum on crack, getting hung up on alliteration. I’m going to spare you most of the words. Rest assured that writers are safe for a couple more months at least. It seems weird to give the final say to a large language model, but the closing words deserve centre stage, and we should probably get used to this.

May our mixes be clear,
and our enemies near
— but not too near: we don’t want feedback.

Thank you.