If you owned a C64, an Atari or an Apple computer, then you might be familiar with the synthetic voice of SAM, the software-based text-to-speech synthesizer package that required no additional hardware. But if you only had a VIC-20 back then, you missed out, because SAM won't run on such a system: the VIC-20 simply doesn't have enough RAM. The project on this page is about SAM, and this project does work perfectly fine on a VIC-20 (and also on various other Commodore computers that have a userport).
How it started...
A long time ago, I wondered how the speech synthesis effect of the Scott Adams adventure games would sound. So I wrote an app for my Android phone and connected it to my VIC-20 through an HC-06 Bluetooth serial connection. It worked, but it didn't give me the satisfaction I expected. I also didn't like the fact that I needed a phone plus a Bluetooth device instead of a single, dedicated device. On top of that, setting up the Bluetooth connection caused problems when the first beta-tester tried it out. From there I drew the conclusion that this approach wasn't the way to go and might even be too complicated for some situations. With so many types of phones and phone versions around, things would also be confusing or difficult to properly document, explain and support over email in case of user problems.
So I abandoned this project, which was still in its early stages, and came to the conclusion that a proper speech synth for the VIC-20 should be pure hardware and should work straight out of the box, completely independent of mobile phone technology, which changes far too quickly. And if something is developed purely for Android, there will always be a user who has an iPhone and therefore cannot use it. But perhaps the true nail in the coffin of that project was the fact that the sound coming from the phone was too perfect. It didn't sound retro, and at that time I didn't know how to change that. And to be honest, I still don't really know. So, a fun experiment, but a dead end. A true text-to-speech synthesizer, small, simple and in a box that plugs into your CBM, using modern components, seemed to be a little bit too complicated. Or at least, that is what it looked like at that time...
Then a few years went by, although the memories of the sound of SAM (Software Automatic Mouth, by "Don't Ask Software") always remained in the back of my head. It was my first speech synth experience, and it ran on my C64. It was purely software based. I played along with the demos supplied with it, but never really found much use for it. The sound was awesome, though. The beauty of SAM was that it didn't require any additional hardware; it ran on an unexpanded C64. Since it was completely software, I discovered it on the back of a games disk I copied from some friend. I'm pretty sure that if I had had to buy SAM, I would never have come into contact with it. I was a kid at that time, with very limited resources, so I would rather spend my money on saving for a printer or on consumables like empty disks than on some software gadget I was only going to play with a few times. So the only reason I got to know SAM was that I got SAM for the cost of an empty disk... without even knowing it at first.
And when I played with it, I learned to like and appreciate that voice coming from my computer. I was able to make my computer talk and let it say anything... this (in the eyes of a kid of the 80s) was magic! I'm pretty sure that I was not the only kid who got to know SAM this way. In my belief, the fact that SAM was purely software (and therefore could be copied) was perhaps the biggest reason for its success. Perhaps not commercial success, but surely sentimental. One thing people should realize is that SAM was a text-to-speech synthesizer, meaning that you could type a sentence and SAM would speak it. In those days, this wasn't always the case for a speech synthesizer. Many speech synthesizers required the user to glue sounds (phonemes) together in order to generate the sound of the desired word. But with SAM, you could directly type that word. If you wanted it to say "eight", you just typed "eight" (or "8" if you were lazy), and not "EY4T", which is the sequence of phonemes required to speak "eight". SAM could also operate in a special phoneme mode if you wanted that level of control. But the text-to-speech functions worked so well that in many cases there wasn't really a reason to go through all that trouble. In other words, SAM was very user friendly and very flexible.
Some years ago, when I was working on my linear clock project, I noticed the audio library https://github.com/earlephilhower/ESP8266Audio by Earle F. Philhower, which makes it possible to play back all kinds of audio from the I2S port of the ESP8266, with or without an I2S hardware DAC. He mentioned that he had also ported the SAM code (which was ported from assembly to C by Sebastian Macke) to the ESP8266 and made a library for it. So I gave it a quick try, and to my amazement it sounded just like the C64 in my memories. But since I was working on a clock project that didn't require any speech, I forgot about it for a while. Until I realized that my Android speech synthesis project could finally be made using real hardware, and relatively cheaply too. And with modern components, in a way that everyone could make one... well... sort of. I just mean that the parts can be easily obtained and aren't really special in this modern world at this moment in time. So, in other words: "speech synthesis for the masses, not the classes".
Now some may say that an ESP8266 is quite an overkill to create a SAM speech synthesizer, and they are probably right. Because, well seriously, the C64 is just a 1MHz computer and only has 64K of RAM. But the costs of an ESP8266 are so low these days and they are so easily available that it may perhaps be the best choice for this project. The fact that I have many of these little modules scattered around my desk may also have something to do with it.
Now the library written by Earle is great in itself, but in order to make it work as a practical serial port speech synthesizer, some additional work was required. So I decided to add a serial command interface, which allows all kinds of settings to be changed and text or phonemes to be spoken. Regarding the hardware, some tricks were required, because the ESP shares some of its serial port pins with the I2S interface. This means that serial and I2S cannot both be fully used at the same time. A port swap solves that problem partially (only for RX; TXD is still blocked by the I2S functions). But because I really wanted a TXD, I decided to write a small bit-banged serial port. It served me well during debugging and is used to let the device send data back to the CBM computer, although you can do without it.
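To give an idea of what a bit-banged serial port has to do, here is a minimal Python sketch. It is not the actual firmware (which runs on the ESP8266); it only illustrates the timing and bit order. It assumes the common 8N1 framing (one start bit, eight data bits sent least significant bit first, one stop bit) at the 2400 baud used in the BASIC examples on this page. The names frame_byte and bit_time_us are made up for this illustration.

```python
BAUD = 2400  # assumed rate, matching the BASIC OPEN examples on this page


def frame_byte(value):
    """Return the logical line levels for one 8N1 frame.

    The line idles at 1; a frame is a start bit (0), eight data bits
    sent least significant bit first, and a stop bit (1).
    """
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def bit_time_us(baud=BAUD):
    """Duration of a single bit in microseconds."""
    return 1_000_000 / baud


# A bit-banged transmitter would set the output pin to each level in turn
# and wait one bit time in between.
print(frame_byte(ord("A")))      # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(round(bit_time_us(), 1))   # 416.7
```

With ten line levels per character, 2400 baud works out to at most 240 characters per second, which is plenty for text that still has to be spoken.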
What is a Votrax Type 'N Talk?
The speech synthesis projects for the VIC-20 are all inspired by the fact that the Scott Adams games support it. These games require a Votrax Type 'N Talk (or similar) device to be connected to the userport. Now, as you might have guessed, I do not own a Votrax Type 'N Talk, and I'm sure that not many people do. This device was a text-to-speech device. Where normal speech synthesizer chips would require a set of phonemes sent over a parallel port, the Votrax Type 'N Talk could convert text directly into speech while using only a single serial line. This made the device very easy to connect to different types of computers... but also very expensive. The biggest advantage, however, was that adding speech to an application using the Votrax Type 'N Talk had a very low impact on the program using it, because the text printed to the screen could also be sent to the serial port (without any conversion) and directly spoken by the device. A great device for its time, but out of reach for most of us: owning one back then was expensive, and finding one today is difficult and most likely expensive too. Although these days we might have deeper pockets than when we were young.
VIC-20 games supporting it
Commodore released the following Scott Adams adventure games: Adventure Land, Mission Impossible, Pirate Cove, The Count and Voodoo Castle. All of the available Scott Adams games support a user-port speech synthesizer. Unfortunately I only own "The Count", "Mission Impossible" and "Voodoo Castle", and I don't even like adventure games, as I lack the skills and patience to solve them. But when you look at the boxes you instantly appreciate the detail of the drawings, though keep in mind that the game itself is pure text, no graphics. However, this pretty box art is not only designed to make you buy the game, it also pulls you into the story.
Using it in your own programs
Using a serial port based text-to-speech synthesizer is as easy as typing a PRINT statement. On a VIC-20, C64 or C128 you only need to remember the following two lines:
10 OPEN 1,2,3,CHR$(10):REM OPEN SERIAL PORT @ 2400Baud
20 PRINT#1,"HELLO WORLD":REM SPEAK TEXT (NO OUTPUT TO SCREEN)
However, if you have a Plus4 it is slightly different, but not much:
10 OPEN 1,2,3,CHR$(26)+CHR$(5):REM OPEN SERIAL PORT @ 2400Baud
20 PRINT#1,"HELLO WORLD":REM SPEAK TEXT (NO OUTPUT TO SCREEN)
As you can see, you don't have to be an advanced programmer in order to use it. Just place your text in a print statement and that's it.
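For the curious: the CHR$(10) and CHR$(26) in the OPEN statements are control bytes that select the serial settings. The Python sketch below decodes such a byte. It is based on the control register layout as commonly documented for the Commodore RS-232 routines (low four bits select the baud rate, bits 5 and 6 the word length, bit 7 the number of stop bits); treat the table as an assumption and verify it against the manual of your own machine. The helper name decode_control_byte is hypothetical.

```python
# Baud rate codes in the low four bits of the control byte (assumed table;
# only the rates up to 2400 baud are listed here).
BAUD_CODES = {
    1: 50, 2: 75, 3: 110, 4: 134.5, 5: 150,
    6: 300, 7: 600, 8: 1200, 9: 1800, 10: 2400,
}

# Word length codes in bits 5 and 6.
WORD_LENGTHS = {0: 8, 1: 7, 2: 6, 3: 5}


def decode_control_byte(value):
    """Return (baud, data_bits, stop_bits) for an RS-232 control byte."""
    baud = BAUD_CODES.get(value & 0x0F)          # None if not in the table
    data_bits = WORD_LENGTHS[(value >> 5) & 0x03]
    stop_bits = 2 if value & 0x80 else 1
    return baud, data_bits, stop_bits


print(decode_control_byte(10))   # VIC-20 / C64 / C128 example: (2400, 8, 1)
print(decode_control_byte(26))   # Plus4 example: (2400, 8, 1)
```

Both values decode to 2400 baud with 8 data bits and 1 stop bit. The only difference is bit 4, which is set in CHR$(26); as far as I can tell, the serial chip in the Plus4 uses it to select its internal baud rate clock.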
Serial Speech Synthesizer SAM
Below you see the device installed in a VIC-20 setup (the VIC-1916 cartridge is a Scott Adams adventure). In order to hear the sound of SAM through your monitor speaker, the sound is mixed using a special mixing cable. So you can hear both the sound of SAM and the sound of your VIC-20 through the same speaker. This results in a very natural listening experience.
If you decide to connect the speech synthesizer to a PET computer, then you also need a small battery-operated active speaker, since the PET doesn't have a built-in speaker we could tap into. Or perhaps, if you really like the sound of SAM singing "Daisy", plug it into your stereo amplifier and treat the neighbors to some high-tech 80s artificial singing sounds.
For this little project I thought that some small decoration in the form of a sticker would be a nice touch. Above you see the sticker I made for the device. As you can see, there is a new logo for SAM. It is slightly different, but considering it has exactly the same voice, I kept the same look as the old SAM logo. However, in the new logo SAM no longer consists of a floppy disk (which symbolized the pure software solution); he now consists of an ESP12 module (symbolizing the hardware module he now runs on), his hand is slightly raised and his feet step in the other direction (towards the megaphone, to symbolize that SAM is back, alive and kicking).
Serial command interface
Although you can use SAM with the two BASIC lines shown above, you can do much more when using some of the -CONFIG commands. For example, to take control of SAM's voice there are the commands SPEED, PITCH, THROAT, MOUTH, PHONETIC and SINGMODE.
You may personalize SAM by defining your own welcome message. Welcome messages are spoken on a cold start (power-on) and on a warm start (reset). So by changing this message, you can make your computer say "Greetings, Professor Falken. Shall we play a game?" every time you switch the machine on. Okay, your CBM computer isn't a WOPR, but who cares, it's fun! These new settings can be stored permanently in simulated EEPROM, so you don't have to enter your settings over and over again, which is very convenient.
Requesting SAM to sing the demo is easy, just type the following on your VIC-20 / C64 and enjoy the song:
Instruct SAM to say a personalized welcome message on your VIC-20 / C64 (press reset to hear the new setting):
PRINT#1,"-CONFIG MSG2 HEY! WHY DID YOU PRESS RESET?"
Now, this is just a tiny bit of information about the possibilities of this device. For more details you really should read the user manual, which is full of examples and information about speech synthesis. Lots of information comes directly from the original SAM user manual, but a lot of new information has been added as well. For instance, making SAM sing is now very well described with very clear examples, although that doesn't mean that making SAM sing is easy. The user manual can be found in the downloads section at the bottom of this page.
Use this on other CBM computers
Although I created this project mainly for use on a VIC-20, it certainly isn't the only computer that can use it. The C64, C128 and Plus4 can also use this device, because these computers have the same userport pinout and support the RS-232 protocol in the kernal. This allows you to send serial data over the userport with just two lines of BASIC code.
But, unfortunately, the PET computers have a slightly different pinout than the VIC-20 and therefore cannot use this circuit without some modification. For example, pin 2 of the userport does not carry +5V on a PET; it carries a video-related signal. So when you plug this device into the PET, the screen will go black and nothing happens. Now, don't worry, this doesn't damage your PET, but if nothing works and the screen goes blank, it isn't very useful either. This can be solved with a small modification to the printed circuit board of SSSSAM (a solder jumper needs to be changed on the PCB of the device, and a wire needs to be connected to the +5V of the cassette port connector). But the real problem is that the PET computer does not have RS-232 support in the kernal, meaning that it requires much more code than just two lines of BASIC. In fact it requires a lot of machine code (a.k.a. assembly) in order to handle the RS-232-like signals in the proper way. So, to make a long story short, the PET computer isn't really suited for using SSSSAM. Although it is technically not impossible at all, doing it properly would be a project of its own.
But let us not forget that this project is intended mainly for the VIC-20, simply because the VIC-20 has the smallest memory of all and there isn't a decent software speech synthesizer available for it. And if there were, it would eat precious memory. The audio from the speech synth is mixed with the VIC-20 audio using a 5-pin video cable. This cable also fits your C64 and C128, but it carries composite video, meaning that if you want or need to use S-video on your C64 or C128 (using the 8-pin video cable), you can't use this splitter cable. Now S-video users might wonder why I didn't make an 8-pin video cable for this device. The answer is simple: the female 8-pin connector is very difficult to find, and if you do find one you'll notice that it is relatively expensive. Logistically it would also be a nightmare for me to supply every possible video cable variation, because some users might also want both 5-pin and 8-pin functionality in one single cable for all sorts of fuzzy reasons. So in order to prevent this logistical nightmare, I only support the 5-pin composite video cable with audio mixing capabilities, so that SAM will sound from your monitor's speaker along with your normal VIC-20 sound. This is the only type of cable that works on all CBM systems and therefore makes the most sense to bundle with the SSSSAM system.
But if you want, you can always make your own cable. Instructions on how to do that are included in the technical chapters of the user manual, which is available in the download section at the bottom of this page.
Although SAM isn't difficult to use, if you want to use it to the max, or want to learn more about speech synthesis in general, then the user manual I wrote is a very good place to find all the required information. A lot of information comes directly from the original SAM user manual, which makes sense considering it is SAM. But lots of new information has been added too: how to make SAM sing, for instance, or how to enter commands and modify settings; how to make an audio cable to mix the sound into your computer's AV cable so that you don't need an additional speaker; and how to troubleshoot when things don't work, because you have to remember that this device was made for a computer that is 40 years old (in 2021). You can find the user manual along with all other files in the download section.
SSSSAM is based on the ESP8266 module, so you might expect it to be able to connect to the internet via a Wi-Fi connection. But in practice this isn't really the case. There are experimental functions that allow it to connect to a telnet-based BBS, but these are very buggy. The manual explains how this works with a functional example, and you can improve upon it, but functionality will always be very limited. It is fun to play with on a rainy Sunday afternoon, though.
Configuration and firmware upgrades using a web browser
The easiest way to upgrade the firmware is using a web browser, because it doesn't require any cables and is completely safe as long as you follow the instructions properly. Firmware upgrades aren't expected to be released often, though; they are mainly for fixing crucial problems, and these are expected to be solved by the time you get your SSSSAM system. Configuration is a different thing. SSSSAM has all sorts of options that you can configure using a few lines of BASIC code, but sometimes a more convenient interface is nice to have. So via a web browser you can open and edit the configuration file to suit your needs, or just to observe or play with it.
A dictionary can be a nice thing to have, and SSSSAM has the option of using one. To be precise, SSSSAM can handle TWO dictionaries at the same time. If a word isn't found in the first dictionary, SSSSAM looks it up in the second and, if found, uses that. If a word to be spoken isn't in either dictionary, SSSSAM uses the regular speech rules to pronounce it. A dictionary can be fun if you want to make SSSSAM speak a foreign language, or to be more precise, a language other than American English (because SAM is an American speech synthesizer). Let's say that you want to make SAM speak German. You then define a dictionary of common German words and make that the primary dictionary. When SSSSAM is required to speak, it searches the primary dictionary, finds the German word and speaks it. If the word cannot be found, it goes to the secondary dictionary, and if the word is there it speaks it. If the word isn't found there either, it falls back to the regular speech rules and pronounces it as an English word. Now there is one thing... SSSSAM will always have a very strong American accent, because it can only speak using the phonemes that are hard-coded into the speech synthesizer. With a lot of creativity you can come a long way, but some words, no matter how hard you try, will never sound the way you want them to, because different languages use different phonemes. Regarding the phonemes used by SAM and how you can use them in your own dictionary, please refer to the user manual for more detailed information and examples. Please keep in mind that the use of a dictionary is for advanced users, because the dictionary functionality currently doesn't have a fancy editor, and also because writing the contents of a dictionary requires a lot of work and patience.
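The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not the code that runs in the device. Apart from "EY4T" for "eight", the entries and phoneme strings are made up, the case-insensitive matching is an assumption, and rules_to_phonemes is a hypothetical stand-in for the regular English speech rules.

```python
def rules_to_phonemes(word):
    # Hypothetical stand-in for SAM's built-in English text-to-phoneme rules.
    return "<rules:" + word.upper() + ">"


def lookup(word, primary, secondary):
    """Return the pronunciation for a word: primary, then secondary, then rules."""
    key = word.upper()  # assumption: lookups are not case sensitive
    if key in primary:
        return primary[key]
    if key in secondary:
        return secondary[key]
    return rules_to_phonemes(word)


# Made-up example dictionaries; "TAA4K" is only a rough guess at German "Tag".
primary = {"TAG": "TAA4K"}
secondary = {"EIGHT": "EY4T"}

for word in ("Tag", "eight", "hello"):
    print(word, "->", lookup(word, primary, secondary))
```

Because the primary dictionary is always checked first, an entry there overrides both the secondary dictionary and the built-in rules for the same word.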
Another thing you can do with a dictionary is give words a different meaning: you can use a dictionary as a filter. You can prevent SAM from speaking nasty words by adding them to the dictionary and making them "silent", OR you can give them a different pronunciation. If, for example, the word "cock" is undesired, then simply add the pronunciation of the word "rooster" for it instead. So if you want to use this system with your child and you want to prevent them from making SAM speak swear words, just make a dictionary full of filth. This can be a nice exercise for the parent who makes this dictionary and for the child who tries to figure out which words are included in it. So parents beware, you might achieve the opposite effect when using a dictionary in this manner.
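Using a dictionary as a filter could look like this in a Python sketch. Again, the names and entries are hypothetical: an empty string stands for a "silent" word, and the replacement is shown as plain text rather than as phonemes to keep the example readable. Punctuation handling is left out on purpose.

```python
# Hypothetical filter: undesired word -> what should be spoken instead.
# An empty string makes the word silent.
WORD_FILTER = {"COCK": "ROOSTER", "DARN": ""}


def filter_sentence(sentence, word_filter=WORD_FILTER):
    """Replace or drop filtered words before the sentence is spoken."""
    spoken = []
    for word in sentence.split():
        replacement = word_filter.get(word.upper(), word)
        if replacement:  # skip words that were made silent
            spoken.append(replacement)
    return " ".join(spoken)


print(filter_sentence("the cock crows"))  # the ROOSTER crows
print(filter_sentence("darn it"))         # it
```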
Because of a YouTube video that shows an IBM 7094 singing the "Daisy Bell" song, many websites have started to repeat the confusing claim that this is the type of computer that started it all. Although very close, that doesn't seem to be the case, because the song was first sung by an IBM 704 at Bell Labs in 1961. At that point in time the IBM 7094 was not in production yet; it was only announced in January 1962, meaning that it could not have been available to Bell Labs at the time of the demonstration.
In 1962 Arthur C. Clarke (who wrote the novel and co-wrote the screenplay for the 1968 movie “2001: A Space Odyssey”) visited Bell Labs. There, he was treated to a performance of the song ‘Daisy Bell’ by the IBM 704 computer. This evidently inspired him to have the HAL-9000 computer sing the song during its final scene, as an homage to the Bell Labs demonstration of computer-controlled speech synthesis.
In the movie you'll see the astronaut disabling the HAL-9000 computer by removing all of its memory modules in order to save his own life. As the computer gradually starts losing its memory, it falls back on its earliest memories and tells the astronaut about a song it was taught a long time ago. The astronaut agrees to hear it, and HAL-9000 sings the song, which it refers to as “Daisy”. The computer sings a very slowed-down version of the song, suggesting the regression and failure of the HAL-9000 computer. This scene might be one of the main reasons why we all still know this song, and perhaps why we know it as “Daisy”. The song has been used or referred to (sometimes in very subtle ways) in many other films, television shows and even video games, most likely because this was such an iconic scene in a movie praised for impressive special effects that have withstood the test of time. Below is a small list of references to the "Daisy Bell" song or the HAL-9000 shut-down scene.
The Hitcher (with Rutger Hauer)
He hums it while being transported to jail (sort of meeting his final destiny, like HAL).
Futurama : Season Four, Episode 11 (Love and Rocket)
Bender sang the song over a montage of romance with the Planet Express Ship.
John Dimaggio uses his Bender voice to act the famous HAL-9000 shut-down scene.
Mass Effect 2
The character Joker comments on the AI system “EDI” singing “Daisy Bell”.
One of the robots’ death quotes is “Daisy, Daisy, give me your answer do.”
Alexa (the voice control system)
Voice command: "Daisy Daisy". Alexa's response: "I'm half crazy, all for the love of you".
Stranger Things 3 (the mall rats)
A Stream Wurlitzer carousel organ is heard playing "Daisy" when Steve Harrington feeds change into an Indiana Flyer coin-operated horse ride at the mall. This sound is also heard by Dustin when he intercepts a Russian transmission with his Cerebro transmitter/receiver; it is background noise while a Russian is speaking the coded lines. In real life, number stations also use (or used) these kinds of simple repetitive tunes, although not as background noise, but rather to fill the gap between the repeated messages. There is, however, no direct reference to the HAL-9000 scene or to speech synthesis, although number stations do use repetitive text that might come from a system that could be considered sample playback and could therefore be mistaken for speech synthesis.
unconfirmed reference mentioned on reddit
"More and more I have heard the name Daisy being sung by various Battleborn characters and even sentries on Incursion..."
Reasons for reference unknown
The Time Traveler's Wife (2009)
Alba and her father Henry sing the song "Daisy Bell" in an attempt to stop him from traveling through time while he is still using a wheelchair from a recent accident.
Sonic Boom episode “Dude, Where’s My Eggman?”
The song that Cubot sings to raise money to bail Dr. Eggman out of jail was Daisy Bell.
Agents of SHIELD (episode S.O.S. Part One)
Skye’s father hums it to her. We find out that her real name is Daisy.
The song is sung or mentioned, but most likely unrelated to the above (sometimes a song is just a song)
Dylan was riding a tandem bike with three other people at Disney World, singing the Daisy Bell song.
Nat King Cole
He sang a cover of it in 1963 on the album “Those Lazy-Crazy-Hazy Days of Summer” (years before the 1968 movie “2001: A Space Odyssey”).
Sang a cover of it as well, as a B-side on their single “Sunday”.
The Nursery Rhymes Collections Vol.2 (2011)
Covered a version of the song.
The Gay Nineties Old Tyme Music: Daisy Bell
An album filled entirely with covers of "Daisy Bell" by various artists: Katy Perry, Tyler, the Creator, "Weird Al" Yankovic, Nick Cave, Kirk Hammett of Metallica, Mark Mothersbaugh of Devo, Wall of Voodoo's Stan Ridgway, Danny Elfman, and others.
Speech synthesis for the blind
In the video about this project, I mentioned Mackan. He's blind and uses all kinds of devices to help him live his life. One of the things he finds interesting is retro computing. He has a setup where his C64 runs a custom kernal that redirects the text from the screen to the serial port of his C64. From there he connects the serial port to a PC that has a braille display, so he can read in braille the text that appears on the screen of his C64. This way he can type text and use his C64 without a conventional screen.
In many ways this is an interesting setup, but he was looking for something more compact and with speech synthesis. He stumbled upon the page on my website about the Bluetooth speech synthesis project. I told him that that project was not really useful and no longer actively supported, but that I was working on the Serial Speech Synthesizer SAM project, which he found very interesting. So after lots of emails I sent him a fully functional prototype of the SSSSAM device and he started to play with it. He was very pleased with the device. Unfortunately, for reasons unknown, we've lost contact, so I never got to ask him what's inside his C64 kernal. If someone knows what kernal this might be, feel free to tell me all about it, so that I can publish that info on this website. This way we might be able to help somebody else who desires a similar solution.
Anyway, without a modified kernal, the SSSSAM device isn't of much help to the visually impaired or blind.
If you want to make this project yourself, or if you just want to browse through the code then use the links below. But before doing anything I suggest that you browse through the very informative user manual first.