The digital representation of modular synthesizer signals

This page discusses the internal digital representation of analog signals in digital (microcontroller based) Eurorack synthesizer modules. Most signals outside of the module are analog, so they need to be translated into the digital domain when entering the module, and back into the analog domain when exiting it. Triggers, gates and clocks are already effectively digital, and so are easier to convert (if conversion is needed at all).

Why digital?

There was a time when all the circuitry in a modular synthesizer was analog, and some users still prefer to keep their systems mostly in the analog domain in order to avoid the artifacts and compromises that digital processing can introduce. This is a reasonable point of view, and something to keep in mind when converting signals: quality has to come first. What you are doing is a compromise, and along the way you may introduce unanticipated problems. The only reason to use digital is to do things that analog circuitry cannot handle.

Digital issues

Current microcontrollers typically operate at one of two supply voltages:

  • 5 volts for older microcontrollers
  • 3.3 volts for newer, lower power microcontrollers

This means that the circuitry that translates signals from the analog to the digital domain must be different depending on the voltage of the microcontroller. For example, if you have an audio signal at -5 to +5 volts, and a microcontroller analog input at 0 to 3.3 volts, you will need specific circuitry to reduce and offset the signal so that the microcontroller can read it. This might make a good subject for a further page.
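As a rough sketch of the arithmetic such an input stage performs, the following Python maps a bipolar -5 to +5 volt signal onto a 0 to 3.3 volt converter input. The function name and default ranges are illustrative assumptions; a real module would do this in hardware (for example with an op-amp stage) rather than in software.

```python
def scale_to_adc_range(v_in, in_min=-5.0, in_max=5.0, out_min=0.0, out_max=3.3):
    """Linearly map a voltage from the input range to the converter's range.

    Hypothetical helper: it models the gain and offset an analog input
    stage would apply, it is not the circuit itself.
    """
    gain = (out_max - out_min) / (in_max - in_min)  # 0.33 for the defaults
    return out_min + (v_in - in_min) * gain


for volts in (-5.0, 0.0, 5.0):
    print(f"{volts:+.1f} V in -> {scale_to_adc_range(volts):.2f} V at the ADC pin")
```

With the default ranges, the middle of the input range (0 volts) lands at the middle of the converter range (1.65 volts).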

Fortunately, the internal representation of the signal remains the same, so in this sense it does not matter what voltage you are running at. A 44.1 kHz, 16 bit audio signal is always represented by the same numbers no matter what levels the analog to digital converter expects.
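A minimal sketch of this point, assuming an ideal unipolar converter whose input range runs from 0 volts to its reference voltage: the same fraction of full scale produces the same number whether the reference is 3.3 or 5 volts. The function `adc_code` is a hypothetical model, not any particular chip's behaviour.

```python
def adc_code(v_in, v_ref, bits=16):
    """Return the code an ideal unipolar ADC would produce for v_in.

    Assumption: an ideal converter with a 0..v_ref input range that
    truncates to the code below and clips at both ends.
    """
    levels = 1 << bits
    code = int(v_in / v_ref * levels)
    return max(0, min(code, levels - 1))


# Half of full scale on a 3.3 volt part and on a 5 volt part
half_scale_3v3 = adc_code(1.65, v_ref=3.3)
half_scale_5v = adc_code(2.5, v_ref=5.0)
print(half_scale_3v3, half_scale_5v)
```

Both calls describe a signal sitting at half of full scale, so the stored sample is identical.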

Also remember that all signals need to be converted on the way in as well as on the way out of the module. No matter how good the conversion, each pass will change the data in some way, and the more conversions are made, the more change enters into the signal, otherwise known as distortion. This is, of course, one of the reasons that all-analog synthesis is so popular.

There are limits to the ear's ability to hear sound. Much research has been done on the subject, but now that we all have synthesizers and computers it is fairly straightforward to test these (and our) assumptions about what sounds good. When converting analog audio to digital audio it is always best to actually implement a range of solutions and compare them by listening.

What technologies exist to translate from the analog to the digital world?

ADCs

An analog to digital converter (ADC) is used to bring analog signals into the digital domain.  Most microcontrollers include ADCs as part of their feature set.  ADCs can also be added in the form of external chips that communicate with the microcontroller over a serial line giving potentially higher resolution and signal quality.

DACs

A digital to analog converter (DAC) is used to take the digital signal and turn it back into an analog signal. Some microcontrollers have built in DACs, but many others use a technique called Pulse Width Modulation (PWM) to simulate analog output using only digital I/O. True DACs can also be added in the form of an external chip that communicates with the microcontroller over a serial line, providing much improved signal quality.
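To illustrate how PWM approximates an analog level, here is a small sketch of the average voltage a PWM pin delivers once it has been smoothed by a low-pass filter. It assumes an 8 bit duty value where the maximum count means the pin is always high (as with Arduino's analogWrite); the function name and defaults are illustrative.

```python
def pwm_average_voltage(duty, v_supply=3.3, bits=8):
    """Average output voltage of a PWM pin after ideal low-pass filtering.

    Assumptions: the pin switches between 0 V and v_supply, and the
    maximum duty count (255 for 8 bits) means the pin is always high.
    """
    max_duty = (1 << bits) - 1
    duty = max(0, min(duty, max_duty))  # clip out-of-range requests
    return v_supply * duty / max_duty


for duty in (0, 64, 128, 255):
    print(f"duty {duty:3d} -> {pwm_average_voltage(duty):.3f} V")
```

The filter needed to remove the switching frequency is one reason PWM output tends to be lower quality than a true DAC.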

Digital I/O

All microcontrollers have digital I/O; these are pins that can be configured to read or write a digital signal. These pins are used for the digital signals associated with Eurorack modules. They can also be configured to produce PWM, the pseudo-analog signal mentioned above. More digital I/O can also be added in the form of an external chip that communicates with the microcontroller over a serial line.

What parameters determine the quality of the signal created by ADCs and DACs?

Assuming that the external circuitry is well designed, and the ADCs and DACs are stable and of high quality, two parameters determine the quality of the converted signal:

Sample rate

This is the number of samples converted each second. In practice the upper limit is the rate the converter can sustain while the processor still has cycles left over for other processing.
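The processing budget can be estimated with simple division. The sketch below assumes a 16 MHz clock purely as an example; the function name is hypothetical, and real code also spends cycles on interrupts and on the conversion itself.

```python
def cycles_per_sample(cpu_hz, sample_rate_hz):
    """Whole CPU clock cycles available between consecutive samples."""
    return cpu_hz // sample_rate_hz


CPU_HZ = 16_000_000  # assumed clock speed, for illustration only

for rate in (44_100, 4_000, 1_000):
    print(f"{rate:6d} samples/s -> {cycles_per_sample(CPU_HZ, rate)} cycles per sample")
```

At audio rates the budget is only a few hundred cycles per sample on such a processor, which is why the sample rate has to be chosen with the rest of the module's work in mind.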

Bit depth

This is how much information is included in each converted sample. The converter produces a digital number, and the more bits in that number, the finer the resolution of the signal:

  • 8 bits allows for values of 0 to 255
  • 10 bits provides values from 0 to 1023
  • 16 bits provides 65,536 states, from 0 to 65,535
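The ranges listed above follow directly from the number of bits. A short sketch, which also shows the size of one step for an assumed 10 volt span (the -5 to +5 volt audio range used as an example earlier); the helper names are illustrative:

```python
def code_range(bits):
    """Smallest and largest value an unsigned converter of this width can report."""
    return 0, (1 << bits) - 1


def step_size_volts(bits, span_volts):
    """Voltage covered by one code, assuming the span is divided into 2**bits steps."""
    return span_volts / (1 << bits)


for bits in (8, 10, 12, 16):
    low, high = code_range(bits)
    millivolts = step_size_volts(bits, span_volts=10.0) * 1000
    print(f"{bits:2d} bits: {low} to {high}, about {millivolts:.3f} mV per step")
```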

Different types of signal require different sample rates and bit depths, determined by the amount of information in the analog signal. Typical audio (at CD rates) runs at a sample rate of 44,100 samples per second and a bit depth of 16. On the other hand, 1 volt per octave pitch control voltage signals require much less resolution; in almost all cases sampling at 1,000 times a second with a bit depth of 12 would be enough that no listener could hear the difference.
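One way to compare these requirements is the raw amount of data each signal produces per second. This is only a back-of-envelope sketch for a single channel; the function name is hypothetical.

```python
def bits_per_second(sample_rate_hz, bit_depth):
    """Raw data rate of one channel, ignoring any packing or padding."""
    return sample_rate_hz * bit_depth


audio = bits_per_second(44_100, 16)  # CD-rate audio
pitch_cv = bits_per_second(1_000, 12)  # pitch control voltage
print(f"audio: {audio} bits/s, pitch CV: {pitch_cv} bits/s")
print(f"audio carries about {audio / pitch_cv:.0f} times more data")
```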

In the topic What signals are used in modular synthesis? the following types of signals were identified, and their purpose was briefly described.  Each of the signals has different acceptable ranges depending on what their purpose is.

Signal types

Analog audio

  • Audio rate AC
  • LFO rate audio AC
  • Noise AC

Analog control voltage

  • AC LFOs
  • DC envelopes
  • DC 1 volt per octave pitch signals

Digital

  • Trigger signals
  • Gate signals
  • Clock signals

Analog Audio

Analog audio is perhaps the simplest format to discuss simply because we have so many examples in our day to day life – CDs, MP3s, digital TV all use high resolution digital audio.

As shown above, analog audio comes in three forms in the modular synthesizer environment:

Audio rate audio

16 bit, 44,100 samples per second recommended.

A CD contains 16 bit audio at a sample rate of 44,100 samples per second. 8 bit chiptune sounds show that lower resolution introduces artifacts. Sometimes those artifacts can be useful and provide interesting color, but sometimes they just add unwanted distortion. In general it is a good idea to target CD rates, since you can always downsample and convert to a lower resolution if you want to. Strange as it sounds, noise should also be sampled at this rate to preserve its character.

LFO rate audio

16 bit, 4,000 samples per second recommended.

These signals are much like audio rate signals, except that they are slowed down. LFOs typically don’t run faster than 20 hertz, and a single cycle probably needs at least 200 samples to avoid artifacts. That gives a maximum sample rate of 4,000 samples per second. The bit depth does not change from audio rate data, remaining at 16 bits.
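The arithmetic behind that figure is sketched below. The 200 samples per cycle is the rule of thumb from the text rather than a hard limit; for comparison, the theoretical minimum is only two samples per cycle. The function name is illustrative.

```python
def required_sample_rate(max_frequency_hz, samples_per_cycle):
    """Sample rate needed to get a given number of samples in every cycle."""
    return max_frequency_hz * samples_per_cycle


lfo_rate = required_sample_rate(20, 200)  # rule of thumb from the text
bare_minimum = required_sample_rate(20, 2)  # theoretical minimum of two samples per cycle
print(f"recommended: {lfo_rate} samples/s, bare minimum: {bare_minimum} samples/s")
```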

Noise

16 bit, 44,100 samples per second recommended.

Noise can be used as part of an audio signal or as a control voltage, and the resolution needed varies depending on the application.  Since audio has the greatest demands, using audio sample rates will do the best job of preserving the nature of the signal.

Analog control voltage

Bipolar LFOs

16 bit, 4,000 samples per second recommended.

This has already been discussed in the LFO rate audio section.

DC envelopes

10 bit, 1000 samples per second recommended.

An envelope can change pretty rapidly, or very slowly. Although it would take experimentation to confirm, a sample rate of 1,000 samples per second is probably sufficient. The perception of changes in loudness has lower resolution than that of pitch; again, experimentation will produce an acceptable value. 10 bits (1024 states) should be enough, and 8 bits (256 states) would be too little. An envelope needs to sound smooth, and 256 steps is just too few to achieve the effect.

DC 1 volt per octave pitch

12 bit, 4,000 samples per second (or less) recommended

Pitch is challenging because the ear can perceive very small pitch changes over a very large frequency range, around 30 Hz to 20 kHz.

Just-noticeable difference (jnd)

Perceived pitch is logarithmic in frequency, whereas a 1 volt per octave control voltage is by definition linear: each added volt doubles the frequency. Measured in hertz, a given interval is therefore wider at higher frequencies, so in terms of steps per interval the ear is effectively more sensitive there. More information is needed to encode pitch at high frequencies than at low frequencies.

The jnd for complex musical tones such as those found on a synthesizer is (according to Wikipedia) about 1 Hz. Since the number of Hz per interval grows with frequency, more resolution in control voltage is needed at higher frequencies. In addition, when two tones are played at the same time, sensitivity increases even more and the jnd becomes even smaller. You can experiment with this yourself using a guitar, a slide and a way of accurately measuring string length.

Since higher resolution is needed at higher frequencies, it makes sense to calculate bit depth at a high reference point.

Doepfer specifies an 8 octave range (0 to 8 volts), however practically speaking most modules settle for 5 volts. Let that range cover octaves 2 through 6 in the image below. Taking the semitone (one twelfth of an octave) between E and F in the top octave, the frequencies run from 1318.5 Hz to 1396.9 Hz, a difference of 78.4 Hz.

[Image: chart of note frequencies by octave, from Frequency, Amplitude & dB by Rod Elliott (ESP)]

If we take 1 Hz as the jnd (this should really be confirmed experimentally), that means we need at least about 80 steps to the semitone to give a smooth, accurate representation of pitch in this octave.
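The semitone width quoted above can be checked with the standard equal temperament formula. This sketch assumes A4 = 440 Hz and uses MIDI note numbers (E6 = 88, F6 = 89) simply as a convenient way to count semitones; the 1 Hz jnd is the assumption from the text.

```python
import math


def note_frequency(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (69 = A4)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


E6, F6 = 88, 89
JND_HZ = 1.0  # assumed just-noticeable difference, per the text

semitone_width_hz = note_frequency(F6) - note_frequency(E6)
steps_needed = math.ceil(semitone_width_hz / JND_HZ)

print(f"E6 = {note_frequency(E6):.1f} Hz, F6 = {note_frequency(F6):.1f} Hz")
print(f"width = {semitone_width_hz:.1f} Hz, steps needed = {steps_needed}")
```

Changing `E6` and `F6` to notes in a lower octave shows how quickly the required number of steps per semitone falls as the frequency drops.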

There are 60 semitones in 5 octaves, so this means that ideally we would have around 4,800 digital steps over the 5 volts that represent 5 octaves.

10 bits = 1024 steps (17.1 steps per semitone)

11 bits = 2048 steps (34.1 steps per semitone)

12 bits = 4096 steps (68.3 steps per semitone)

13 bits = 8192 steps (136.5 steps per semitone)

etc. to

16 bits = 65536 steps (1092.3 steps per semitone)
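The figures in this list can be reproduced with a few lines of Python. The sketch assumes 12 semitones per octave (so 60 semitones across 5 octaves) and a converter whose full range exactly spans the 5 volts; the names are illustrative.

```python
OCTAVES = 5
SEMITONES = OCTAVES * 12  # 12 semitones per octave
TARGET_STEPS = 80  # steps per semitone suggested by a 1 Hz jnd


def steps_per_semitone(bits, semitones=SEMITONES):
    """Converter steps available per semitone when the full range spans all octaves."""
    return (1 << bits) / semitones


for bits in (10, 11, 12, 13, 16):
    steps = steps_per_semitone(bits)
    verdict = "meets" if steps >= TARGET_STEPS else "below"
    print(f"{bits} bits = {1 << bits} steps ({steps:.1f} per semitone, {verdict} target)")
```

Keep in mind that the 80 step target applies only to the top semitone of the range; lower octaves need fewer steps per semitone for the same 1 Hz jnd.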

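This means that 12 bits is the practical minimum for accurate pitch control voltage: it falls slightly short of the 80 step target at the very top of the range, but exceeds it in the lower octaves, where a semitone spans fewer hertz. 12 bits is also a common resolution when shopping for ADC chips, and many microcontrollers offer 10 or 12 bit converters on chip. 16 bit external converters are also widely available, and would produce a very accurate representation of pitch.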

Sample rate is probably most critical when representing glissando between pitches. The 4,000 samples per second suggested above for LFOs is probably enough. Again, this should be confirmed experimentally.

Digital Signals

1 bit, over 200 samples per second recommended.

Bit depth is 1. In MIDI sequencers a temporal resolution of over 200 times a second is considered good. That criterion probably applies here as well. This should be adequate for:

  • Trigger signals
  • Gate signals
  • Clock signals
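As a sketch of what handling these 1 bit signals involves, the code below finds the rising edges in a stream of sampled pin readings and shows the timing resolution that a given sample rate implies. The function names and the sample data are illustrative assumptions.

```python
def rising_edges(samples):
    """Indices where a sampled 1 bit signal changes from low to high."""
    return [i for i in range(1, len(samples)) if samples[i] and not samples[i - 1]]


def timing_resolution_ms(sample_rate_hz):
    """Worst-case delay in noticing an edge: one sample period, in milliseconds."""
    return 1000.0 / sample_rate_hz


readings = [0, 0, 1, 1, 0, 1]  # made-up pin readings, one per sample
print("edges at samples:", rising_edges(readings))
print("resolution at 200 samples/s:", timing_resolution_ms(200), "ms")
```

A very short trigger pulse can fall entirely between two samples, so in practice a hardware interrupt or a higher sample rate may be needed to avoid missing it.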

Related topics:

Using the Arduino for electronic music #1

What signals are found in modular synthesizers?

 
