Programming the PC Speaker, part 2
Phil Inch, Game Developers Magazine
DOWNLOAD
... The sound player mentioned in this article is contained in the file
VOC-IT.ZIP (17,174 bytes) which can be downloaded by clicking
this disk icon.
DOWNLOAD
... The waveform viewer mentioned in this article is contained in the file
WAVEFORM.ZIP (8,526 bytes) which can be downloaded by clicking
this disk icon.
Introduction
I'm sure you're curious to know what we're going to learn this
issue, and I'm not going to keep you in suspense any longer ...
As promised, we're going to make some digital sound effects! We're going to do this using the program attached to this article, "VOC-IT", which is capable of playing Sound Blaster .VOC files. You can download this program and the source code by clicking on the above icon!
But before you run it, some warnings - read these or (maybe) suffer the consequences ...
* If you're running from within Windows, or any other "shell", it's probably best that you exit now and run VOC-IT from DOS. The way Windows shares processor time between applications is not compatible with VOC-IT.
* On some machines, particularly portables which always have crap little speakers, this sound may sound really awful, or may be completely inaudible. There's not a lot I or anyone else can do about this, I'm afraid.
* The real-time clock is STOPPED during playback. This is not normally a problem but if you play a lot of sound files your time-of-day will steadily become more and more wrong. I've yet to work out a truly satisfactory way around this.
* Finally, if you're at work, this sound effect might be loud so be sure no-one's around ... you can't stop the playback once it's started.
(Note that it can only play VOC files up to the limit of available memory, so some of the really large VOC files are beyond its capacity ... if I work out a way around this, I'll publish an update).
I've also just discovered it can play .WAV files, and in fact it may be able to play lots of other file types also - you'll just have to experiment. You can even play .EXE files and listen for satanic messages in your copy of Word for Windows (grin).
This issue, we're going to use timer 2 again, but in a way which allows us to play back "digital" sound effects.
To quickly refresh your memory, we set timer 2 up using a countdown value which dictated the frequency at which timer 2 oscillated. We then "connected" timer 2 to the speaker, meaning that whenever timer 2 oscillated, the speaker would "click", thus producing the tone.
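As a quick sketch of that refresher, the countdown value for a tone is just the timer's input clock divided by the frequency we want. The function name below is made up for illustration, and the 1,193,180Hz clock figure is the one quoted later in this article.

```c
#include <assert.h>

/* Input clock of the PC's timer chip, in Hz (figure quoted in this article). */
#define PIT_CLOCK_HZ 1193180UL

/* Hypothetical helper: countdown value that makes timer 2 oscillate
   at roughly freq_hz. Integer division, so the result is truncated. */
unsigned long tone_countdown(unsigned long freq_hz)
{
    assert(freq_hz > 0);          /* avoid dividing by zero */
    return PIT_CLOCK_HZ / freq_hz;
}
```

For example, tone_countdown(440) gives 2711, which is the sort of value we were loading into timer 2 last issue.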
A waveform diagram is like a graph. It consists of two axes. The Y axis (up and down) represents the amplitude of the wave, and for our purposes it's the current position of the speaker cone. That is, a point low down on the graph represents the speaker cone at or near the rest position, and a point high on the graph represents the speaker cone at or near its maximum possible extension.
The X axis (left to right) is "time", so if you trace the waveform with your finger from left to right, you're roughly tracing the path of the speaker cone as the voltage across the speaker is modified.
The waveform for tones like we generated last issue looks something like this:
[Waveform diagram: four narrow pulses, each rising steeply from REST POSITION to FULL EXTENSION and falling straight back, separated by flat stretches at rest.]
You should be able to clearly see the "pulses" produced by timer
2 oscillating. But what's physically happening to the speaker?
When a voltage is applied to the speaker, the cone starts to move outwards which is represented on this diagram by the wave (the line of *'s) moving upwards.
When the cone reaches full extension (or the amount of extension caused by the applied voltage, as you'll see), the wave stops at the top.
When the voltage is removed, the cone starts to move back to the rest position, which is shown as the wave moving downwards until it reaches the bottom.
The important point from this diagram is, the speaker cone can only exist for any length of time in two positions - rest, when there is no applied voltage, and full extension, when a certain voltage is being applied.
All positions between these are just "passed through" as the speaker moves from rest to full extension and back again. This is the kind of wave we created last issue using timer 2. The PC speaker can be connected to 0V (rest) or 5V (full extension) by timer 2. We can't apply 2.5V or any other voltage in between.
Now, let's look at a possible waveform for digital sound:
[Waveform diagram: an irregular wave that rises and falls to varying heights between REST POSITION and FULL EXTENSION.]
As you can see, the waveform is very rough and does not follow
any regular pattern. In addition, the speaker cone can be made
to rest in any position by applying a fraction of the voltage
required for full extension.
This relationship is not linear, that is, applying half the voltage does not necessarily mean that we end up with half extension, but for our purposes now we'll assume it does.
(If you have trouble understanding this, watch the bass speaker on your stereo and you'll see the cone moving in and out (from rest to fully extended). You'll notice that the distance the cone travels out from rest varies with the music, and this movement corresponds to the voltage being applied.)
When playing digital sound, we want to be able to move the speaker to positions between rest and full extension, which means we need to apply a fraction of 5V, which the PC does not allow us to do directly.
But how do we know when the cone has reached the position we want? We rely on the fact that the time taken for the cone to travel from rest to full extension is approximately 60 millionths of a second (!!!), so we just have to apply the voltage, wait for a fraction of 60 millionths of a second, then turn the voltage off.
For our purposes, we'll assume that the fraction of 60us (us=microseconds, or millionths of a second) required is the fraction of full extension we want. That is, we'll assume that the speaker moves at a constant rate from rest to full extension.
For example, after half the time required for full extension (30us), we'll assume that the speaker has travelled half the distance to full extension.
Easy, huh!
In order to make digital sound, we will need to reprogram timer 2 to keep the voltage applied to the speaker for a certain amount of time, which we will dictate by giving a countdown value just like we did last issue when we were playing notes, and to then turn it off.
The timer will do this once only as opposed to doing it repeatedly as it does when playing a continuous note (like last issue). In this way we will control exactly the amount of extension (movement) the speaker undergoes.
Repeatedly doing this process produces what's known as a "square
wave". Can you see where it got its name?
We know that it takes about 60 millionths of a second for the
speaker cone to move from rest to full extension, and we're
going to assume that we can vary the amount of extension by
varying the time the voltage is applied.
We also know that timer 2 is driven by an external oscillator
which runs at 1,193,180Hz (remember the theory from last
issue?). This means that it reduces the countdown value
1,193,180 times every second, or once every 0.83us.
Therefore, for the speaker cone to reach full extension (60us),
we would need to count down ( 60 / 0.83 ) times = 72 times. In
other words, if we program the countdown with a value of 72 or
higher the speaker will have time to reach full extension.
If we program the timer with countdown values of less than 72,
the speaker cone will NOT have time to reach full extension
before the voltage is turned off (countdown reaches 0), and we
will therefore effectively control how much extension we
actually desire!
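The arithmetic above can be sketched in C as follows. This is only an illustration under the same assumptions as the text (60us for full travel, constant cone speed); the function names are invented here and are not taken from VOC-IT.

```c
#include <assert.h>

#define PIT_CLOCK_HZ      1193180.0  /* timer input clock, from the text */
#define FULL_EXTENSION_US 60.0       /* rest-to-full travel time, from the text */

/* Number of timer ticks that fit into a given number of microseconds,
   rounded to the nearest whole tick. */
unsigned ticks_for_microseconds(double us)
{
    assert(us >= 0.0);
    return (unsigned)(us * PIT_CLOCK_HZ / 1000000.0 + 0.5);
}

/* Hypothetical helper: countdown value for a desired fraction (0.0 to 1.0)
   of full extension, assuming the cone moves at a constant rate. */
unsigned countdown_for_extension(double fraction)
{
    assert(fraction >= 0.0 && fraction <= 1.0);
    return ticks_for_microseconds(fraction * FULL_EXTENSION_US);
}
```

With these figures, full extension comes out at a countdown of 72 and half extension at 36.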
Just as an aside, you should now see why the piezo-electric
speakers you get in laptop computers won't replay digitised sound
very well. A PE speaker is just two flat metal plates (if
you've ever opened up a digital watch, a PE speaker is that flat
round gold plate thingy).
When a current is passed between them they "vibrate" at a
specific rate which creates the sound. It's not possible in any
way to control the extension because nothing actually extends,
and therefore we can't properly reproduce a digital waveform.
If you plotted these values on a graph, you would be able to
produce the waveform diagram for that sound file. In the
WAVEFORM zip file you'll find a very basic version of
this, WAVEFORM.EXE (C source included, as always).
Just run WAVEFORM {filename} (eg WAVEFORM TADA.WAV) and the
program will display one screenful of the waveform at a time,
and you can press space to move to the next screen or ESCAPE
to stop. This program only works with VGA mode 13h, and it's a
useful application of the graphics code we learnt last issue!
WAVEFORM.EXE is just a "quick and dirty" program which I
encourage you to experiment with. For now, just refer back
to the example digital waveform I drew a little earlier.
The other important piece of information about a digital sound
file is, what frequency was it recorded at? In other words, how
often was the position (extension) of the speaker (microphone)
recorded? Most digital sound files are recorded at around
16,000Hz which means that the position (extension) of the speaker
(microphone) is noted 16,000 times a second, and each time a
corresponding number is written to the file.
(Therefore, a file recorded at 16,000Hz theoretically takes
16,000 bytes for every one second of sound, but this can be
reduced as you'll see in later issues).
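Turning that around, here is a small sketch that works out how long a sound will play for. It assumes a raw file with one byte per sample and no header, and the function name is made up for this example.

```c
#include <assert.h>

/* Hypothetical helper: playing time in whole milliseconds of a raw
   8-bit sound, given its size in bytes and its recording rate in Hz. */
unsigned long sound_length_ms(unsigned long bytes, unsigned long rate_hz)
{
    assert(rate_hz > 0);
    /* Widen before multiplying so large files don't overflow. */
    return (unsigned long)((unsigned long long)bytes * 1000ULL / rate_hz);
}
```

So 16,000 bytes recorded at 16,000Hz is 1000ms of sound, and 40,000 bytes is 2500ms.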
How can we accurately know when it's time to update timer 2?
Well, this is where timer 0 comes in (the real time clock). As
you know from previous issues, timer 0 usually runs at 18.2Hz.
If we beef up timer 0 to run at 16,000Hz, and update the
countdown value for timer 2 each time timer 0 oscillates, then
we can accurately replay the digitised sound!
Recall that timer 0 also controls the generation of interrupt
8, which among other things, updates the real time clock. If
we run timer 0 at 16,000Hz (879 times faster than normal), then
the clock will update 879 times faster than normal, which is
clearly unacceptable.
There are two solutions to this problem. The first solution is,
stop timer 0 from generating interrupt 8 - that is, stop the
clock while the sound is playing. While this can't be done
directly, we can replace the computer's interrupt 8 routine with
our own (which does nothing), and this is the approach I have
taken with VOC-IT.
The other possibility is to write our own interrupt 8 routine
which calls the original interrupt 8 routine at the correct rate
of 18.2 times a second, but my experiments with doing this were
not very successful.
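For what it's worth, the usual way to attempt this second approach is to keep a running total of timer ticks and only pass control to the original handler each time 65536 of them have gone by. The sketch below only simulates that bookkeeping (no real interrupts are involved), it is not the code from VOC-IT, and the function names are invented for illustration.

```c
#include <assert.h>

/* Timer 0 normally counts 65536 ticks between interrupts (about 18.2Hz). */
#define NORMAL_TICKS 65536UL

/* Hypothetical helper, called once per fast interrupt. It adds our shorter
   countdown to a running total and returns 1 whenever a full 65536 ticks
   have gone by, which is when the original handler would be due. */
int original_handler_due(unsigned long *elapsed, unsigned long countdown)
{
    assert(countdown > 0 && countdown <= NORMAL_TICKS);
    *elapsed += countdown;
    if (*elapsed >= NORMAL_TICKS) {
        *elapsed -= NORMAL_TICKS;
        return 1;
    }
    return 0;
}

/* Simulation only: count how often the original handler would be called
   over a given number of fast interrupts. */
unsigned long count_chained_calls(unsigned long interrupts,
                                  unsigned long countdown)
{
    unsigned long elapsed = 0, calls = 0, i;
    for (i = 0; i < interrupts; i++)
        calls += (unsigned long)original_handler_due(&elapsed, countdown);
    return calls;
}
```

With a countdown of 75 (roughly 16,000Hz), 16,000 fast interrupts produce 18 calls to the original handler, which is about what one second of normal running would give.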
The disadvantage of totally disabling the computer's default
interrupt 8 handler is that the real-time clock appears to
"stop", that is, the time is not updated! This can be solved by
calculating how long we were running our own handler and then
manually updating the time of day when we've finished playing
the sound, but for simplicity I have left this out of VOC-IT.
In a later issue we'll add it in.
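As a rough sketch of the calculation involved, the number of clock ticks we "lost" can be worked out from how many interrupts our own handler received and the countdown value timer 0 was using. The function name is hypothetical, and how the result is then added back to the time of day is left out here.

```c
#include <assert.h>

/* Hypothetical helper: number of normal 18.2Hz clock ticks that went
   uncounted while our own handler ran. Each of our interrupts lasted
   `countdown` timer ticks; a normal clock tick lasts 65536 of them. */
unsigned long missed_clock_ticks(unsigned long interrupts,
                                 unsigned long countdown)
{
    assert(countdown > 0 && countdown <= 65536UL);
    return (unsigned long)((unsigned long long)interrupts * countdown
                           / 65536ULL);
}
```

Ten seconds of playback at a countdown of 75 (160,000 interrupts) works out to 183 missed ticks.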
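What is useful about interrupts is that it is possible to tell
the computer to "wait" (do nothing) until the next hardware
interrupt is generated. This is done using the assembler
command HLT (halt). Strictly speaking HLT wakes up on any
hardware interrupt, not just interrupt 8, but with timer 0
running this fast the next one to arrive will almost always be
interrupt 8.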
This is useful for us, because we use it to wait for the next
interrupt 8, which is our signal that it's time to modify the
countdown value for timer 2. In this way we can precisely
follow the specified playback frequency, regardless of the
speed of the PC we are using!
(This is a variation of the trick used to make games appear to
play at the same speed regardless of the machine they're running
on, and next issue I'll show you how this is done!)
1) Load the sound data into memory
Loading the sound data simply means opening the file and
reading each byte into memory. At this stage we apply the
muting factor which simply means we divide each byte by a
certain number which reduces the range of values.
Recall we found that only values in the range of 0 to 72
produce useful countdown values. (Anything above 72 always
produces full extension of the speaker). The maximum value
of a byte is 255, and 255/72 is about 3.5, so we need to
divide by about 4. The muting factor is a power of 2 - ie
mute 1 = /2 (2 to the power 1), mute 2 = /4 (2 squared),
mute 3 = /8 (2 cubed). Therefore the default muting factor
is 2.
You can experiment with changing the muting factor. In
VOC-IT, this is done by using the /M command line switch.
Larger mute values will make the sound quieter, and I'm
sure you will now understand why (let me know if you don't).
Smaller values may appear to amplify the sound if the values
in the file were small to begin with, but it may also stop
the sound completely because you're exceeding the maximum
useful value of 72 and therefore leaving the speaker at full
extension permanently.
Some sound files you will encounter will not need a muting
factor at all - they have been stored on disk already
scaled to the correct range. You will only discover which
files these are by experimentation. For these files, use
mute factor 0 which will not alter the values at all.
Finally, not all digital sound files are stored in "raw"
format. VOC files in particular have a quite complicated
structure which among other things allows us to compress and
repeat blocks of data. Next issue we'll examine the
structure of VOC files and write a "proper" loading
routine. For now, just playing it as a raw sound file is
acceptable.
2) Replace the computer's interrupt 8 handler with our
own (do-nothing) handler.
If you look at the file 'INT08.ASM' in the SOURCE\VOC-IT
directory, you'll see the new interrupt 8 handler. All it
does is tell the interrupt controller that the interrupt is
complete. It's beyond the scope of this article to explain
all the details of this process, but look at this issue's
interrupts article for an explanation.
In 'C', we use the setvect() and getvect() functions to set
and retrieve the addresses of the interrupt functions. I
have no idea how to do it in other languages, although I do
know there's a DOS interrupt for doing this. Please let me
know if you know how to do this in other languages.
3) Re-program timer 2 to generate a single pulse and to
"attach" to the speaker.
I will cover timer programming in much more detail in a
future issue, but for now, a brief explanation of what to
do. Port 43h is the "mode control register" for the timers,
and when we want to do something with a timer, we output a
control byte to this port. The control byte is composed of
8 bits, as you know, and each bit has some significance to
the timer controller.
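Firstly, we will output 90h which is 10010000 in binary. We
are sending four pieces of information here. Reading the
bits from left to right, this is what we are sending:
10 = select counter 2
01 = countdown values are 1 byte only (not 2)
000 = generate a single pulse for the whole countdown
0 = countdown in binary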
So we are telling the controller:
* we're programming timer 2.
* we will send single byte countdown values, which
limits our range to 0-255. Strictly speaking this
is not necessary, but it avoids us having to send
two bytes every time we want to update the countdown
value. (If we're only sending values between 0 and
255, the high byte would always be zero!) In this
mode we're telling the timer to assume the high byte
is zero.
* we want the pulse applied (voltage applied to the
speaker) for the whole countdown, and then removed
when the counter reaches 0. Mode 0 also only happens
once, and it's triggered by us loading the countdown
value.
4) Set the frequency of timer 0 to the desired playback
frequency.
To do this, we calculate the required high byte and low byte
values for the playback frequency we want, and then program
timer 0 with this countdown value in exactly the same way we
programmed timer 2 last issue.
5) Get the first byte of sound data, which dictates the
amplitude of the wave (ie: amount of extension of the
speaker cone).
In VOC-IT, we use the DS:SI registers to point to the
location in memory which holds the next byte of the sound
sample.
6) Re-program timer 2 with this byte as the countdown value.
This automatically starts timer 2.
This is simply done by OUTing the byte value to port 42h,
which is the port for timer 2.
7) Wait for the next interrupt 8 to occur, by issuing a HLT
instruction.
8) If we just used the last byte in the sound file, go to
step 10
In VOC-IT, when the program loads the sound file, it adjusts
the range of values depending on the muting factor
specified, and at the end of the sample data it puts the
value FFh. The playback routine checks to see if the byte
it just read was FFh, and if so, it stops playback.
An alternative method is to check the number of bytes played
against the number of bytes read when the sound was loaded,
but this is not as fast, and we need to write the tightest
possible code for this routine!
9) Return to step 6 with the next byte in the sound data.
If it's not FFh, assume it's another byte of sound data, and
continue. The "oversampling" factor allows us to play each
byte of sound data more than once if we want to, and the
reasons for doing so are explained below.
10) Turn the speaker off. (sound is finished)
Best done by disconnecting timer 2 from the speaker, as
explained in the last issue.
11) Set the timer 0 frequency back to 18.2Hz (countdown=0)
Exactly as for step 4.
12) Restore the computer's interrupt 8 handler
See step 2.
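The arithmetic behind steps 1, 3 and 4 can be sketched in plain C as below. This is not the code from VOC-IT.C: the function names are invented for illustration, nothing here touches the hardware ports, and rounding the timer 0 countdown to the nearest tick is my assumption rather than something the steps specify.

```c
#include <assert.h>
#include <stddef.h>

#define PIT_CLOCK_HZ 1193180UL   /* timer input clock, from the text */

/* Step 1: apply the muting factor. Mute n divides every byte by 2 to the
   power n, which is a right shift by n bits. Mute 0 leaves the data alone. */
void apply_mute(unsigned char *data, size_t len, unsigned mute)
{
    size_t i;
    assert(mute < 8);
    for (i = 0; i < len; i++)
        data[i] = (unsigned char)(data[i] >> mute);
}

/* Step 3: build a control byte for port 43h from its four fields:
   counter (2 bits), byte access (2 bits), mode (3 bits), BCD flag (1 bit). */
unsigned char timer_control_byte(unsigned counter, unsigned access,
                                 unsigned mode, unsigned bcd)
{
    assert(counter < 4 && access < 4 && mode < 8 && bcd < 2);
    return (unsigned char)((counter << 6) | (access << 4) | (mode << 1) | bcd);
}

/* Step 4: countdown value for timer 0 at a given playback frequency,
   rounded to the nearest whole tick. The caller splits it into bytes. */
unsigned timer0_countdown(unsigned long playback_hz)
{
    assert(playback_hz >= 19);   /* anything slower needs more than 16 bits */
    return (unsigned)((PIT_CLOCK_HZ + playback_hz / 2) / playback_hz);
}
```

Here timer_control_byte(2, 1, 0, 0) gives 90h, the value used in step 3, and timer0_countdown(16000) gives 75, so the low byte is 75 and the high byte is 0. With the default mute of 2, a byte of 255 becomes 63, which sits inside the useful 0 to 72 range.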
Here's an experiment. Try playing a digitised
sound at 8,000Hz. I warn you, this will sound bad so
spare a thought for where and when you do this.
You can do it with VOC-IT with the following command. Remember NOT to
do this under Windows. This example uses the PING.WAV sound that comes
with Windows:
VOC-IT /F:8000 PING.WAV
In other words, if we replay a file at 8,000Hz we also create a
constant 8,000Hz tone which is what you just heard, and as you
probably noticed it drowns out the sound.
The obvious solution to this is to replay the file at a
frequency where the carrier tone is either out of human hearing
range or so close to it that the resulting tone only has a
negligible effect on the sound quality.
So, for example, to replay the same sound at 16,000Hz:
VOC-IT /F:16000 PING.WAV
The solution to this is to play every byte twice. This will
make the sound appear half the speed, and half of 16,000Hz is
8,000Hz which is the frequency we're trying to replay at, but
since we are actually playing at 16,000Hz we don't get the
8,000Hz carrier whine.
The act of playing each byte of sound data more than once is
called oversampling.
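Choosing the oversample factor is simple arithmetic, sketched below. The function name is made up, and the "minimum carrier frequency" is a figure you pick yourself (something around 15,000Hz would be a reasonable guess); it is not a setting that VOC-IT has.

```c
#include <assert.h>

/* Hypothetical helper: smallest oversample factor that lifts the playback
   frequency (and so the carrier tone) to at least min_carrier_hz. */
unsigned choose_oversample(unsigned long recorded_hz,
                           unsigned long min_carrier_hz)
{
    unsigned long factor;
    assert(recorded_hz > 0);
    /* Round the division up so the result never falls short. */
    factor = (min_carrier_hz + recorded_hz - 1) / recorded_hz;
    return (unsigned)(factor == 0 ? 1 : factor);
}
```

A sound recorded at 8,000Hz with a 15,000Hz minimum gives a factor of 2, so we would play it at 16,000Hz with every byte played twice.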
There are two ways to play every byte twice. The first way is,
when we load the sound file into memory, to store every byte in
memory twice. This certainly works and it makes the sound
playing code a lot simpler, but it's wasteful in memory terms
and reduces the maximum size of the sound we can load into
memory.
In addition, what if we want to replay a sample recorded at
5,000Hz? We would probably oversample (play each byte) three
times to achieve 15,000Hz, which would triple our memory
requirement.
The best solution, (and for this I send my thanks to my colleague
Dave Harvey for helping me work out some very tight code), is
simply to keep a count of how many times we play each byte.
Therefore if we want an oversample factor of 2, we count each
byte twice, or in other words we only fetch a new byte of sound
data every other time we play a byte.
For an oversample of 3, we only get a new byte of sound data
every 3 interrupts, and we use each byte 3 times in a row to
reprogram timer 2.
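The counting trick can be sketched as a small simulation. This is not the tight assembler from VOC-IT: instead of writing each value to port 42h and waiting for the next interrupt, it just records what would have been sent, and the function name and buffer arguments are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define END_MARKER 0xFF   /* value placed after the last sample, as in step 8 */

/* Simulation only: instead of writing each value to port 42h and waiting
   for the next interrupt, we record the values in `out`. Returns how many
   values would have been sent to timer 2. */
size_t simulate_playback(const unsigned char *data, unsigned oversample,
                         unsigned char *out, size_t out_cap)
{
    size_t sent = 0;
    unsigned repeats_left = oversample;

    assert(oversample >= 1);
    while (*data != END_MARKER && sent < out_cap) {
        out[sent++] = *data;          /* "reprogram timer 2" with this byte */
        if (--repeats_left == 0) {    /* played it enough times? */
            data++;                   /* fetch a new byte next interrupt */
            repeats_left = oversample;
        }
    }
    return sent;
}
```

Feeding it the bytes 10, 20, FFh with an oversample of 2 sends 10, 10, 20, 20 to the timer, without ever storing the sound twice in memory.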
Just to show you it does work, here's how to play the same sound again,
played at 16,000Hz but this time with an oversample of 2,
meaning each byte is played twice.
VOC-IT /F:16000 /O:2 PING.WAV
This limitation means that you can't really use a lot of
digitised sound effects in your games, and bearing in mind that
on some machines the sound is terrible this may be a good thing.
However, a judiciously used sound effect on your title screen,
or between stages, can add a lot of atmosphere. Remember,
digitised sounds can range from a simple "congratulations" to
"game over" to even short bursts of digitised music. The choice
is yours.
You will find that VOC-IT will achieve results almost as good as
the digital voice on a sound card, depending on the quality of
your PC speaker. It is at least as good as the sound engine in
two excellent products for the PC speaker, LINKS 386 PRO and
PC-STUDIO.
Those of you who have seen the MOD players on the market, such
as MODPLAY, and more recently the excellent IPLAY, will know
that it's possible to play digitised sound and do a lot of other
things besides, such as displaying the waveform. Right now I
don't know exactly how this is achieved but I have an inkling.
I will endeavour to find out more if I can and I'll bring the
results in future issues.
It actually took me six weeks to write this article, of which
five were pure research and development of VOC-IT and another
week writing and revising this text.
Next issue I'm hoping to develop an article explaining the
basics of programming Adlib and Sound Blaster cards, but until I
sit down to work it out I can't be sure what form this will
take.
Anyway, for now, have fun making noise with the speaker and
remember that your local BBS offers hundreds or even thousands
of digital sound files for you to play using VOC-IT. Many
commercial games also leave their sound files in raw format, for
example Ultima Underworld I & II come with a wealth of VOC files
which VOC-IT happily plays.
[Waveform diagram: a square wave that jumps from REST POSITION to EXTENSION, holds there, then drops straight back to rest, repeated three times.]
As I said, we're going to reprogram timer 2 to keep the
speaker on until the countdown value we give it reaches 0.
"See, I told you
it was easy!"
Digital sound files
Your average digital sound file, VOC, WAV or whatever, usually
consists of numeric values which represent the amplitude
(extension) of the speaker at any given moment.
"If we have a file recorded
at 16,000Hz, we need to
adjust timer 2 16,000
times a second!"
Rolling our own interrupt 8 "handler"
One of the more versatile abilities of the IBM is that it allows
us to replace almost all the interrupt services with our own
routines, or "handlers". I won't go into the details here, but
I have explained how this is done in this issue's "interrupts"
section.
"The IBM ... allows us to
replace interrupt services
with our own"
Playing the digitised sound
In step form, then, here's how we play a digitised sound.
Further explanation is given where required, and I recommend you
also print out and look at the source code (VOC-IT.C and
INT08.ASM), even if you're not familiar with C and assembler.
01 = countdown values are 1 byte only (not 2)
000 = generate a single pulse for the whole countdown
0 = countdown in binary
Oversampling
You'll see that in step 9 I mentioned oversampling, which is a
term you may have come across on CD players. In CD players it
refers to running the digital-to-analogue converter at a
multiple of the rate the disc was recorded at. That is related
to, but not quite the same as, what it means when replaying
digitised sound here.
VOC-IT /F:8000 PING.WAV
Ugh! Totally awful, as you'll have noticed. Why? One of the
side effects of the way we are playing sounds is that we produce
a constant tone at the playback frequency. This is called the
"carrier" tone.
VOC-IT /F:16000 PING.WAV
Well, the awful "whine" has gone, but now our sound is playing
twice as fast, which should come as no surprise.
VOC-IT /F:16000 /O:2 PING.WAV
Congratulations! You've made it all the way through the theory,
and once you understand all this you'll be well on the way to
creating your own digitised sound routines!
Advantages and disadvantages of playing digitised sound this way
Clearly the biggest disadvantage of this method of playing
digital sound is that the processor is so busy there is no time
to do anything else, and you will notice that packages that use
digitised sound on the PC speaker (such as Links 386 Pro) "stop"
everything else while the sound is playing.
"It is possible to
play digitised sound
'in the background'"
In conclusion
This has been an extremely technical article and you shouldn't
feel bad if you don't understand it all. Several re-readings
may be necessary to really take it all in.
"I'll be interested to
hear your "feedback"
over the next few months"
Next time...
Now that you know how to play music and digitised sound effects,
you really know all there is to know about using the PC speaker,
so for the time being we'll end our discussion of it.