In this video, I reverse engineer the protocol used by the HP 27201A.
The “major computer manufacturer” claim comes from the HP museum page for the SOM. There certainly were other speech synthesizers that were contemporary with this one, or perhaps even predated it, such as the TI99/4A speech module. As such, I think it may come down to your definition of “major”.
Background
I bought this thing off eBay, not really knowing that much about it. The documentation on the web is really sparse; there’s basically one page-and-a-half article that gives a few details.
The article gave me enough information to establish a few things:
- The module is designed to sit inline, between an HP 1000 or HP 3000 series minicomputer and a peripheral device, such as a terminal or line printer.
- The interface is RS-232 serial.
- The communication protocol is ASCII, and has 8 commands.
Unfortunately, the details about the protocol were pretty vague. For example, although I knew there was a DOWnload command, I didn’t know the arguments or syntax.
Pictures
Below are some pictures of the speech synthesizer:
It’s nice that the DIP switch settings were documented right on the case. Life would have been a lot more difficult had they not been.
In the internal view, we can see a lot of things:
- A TMS5220 speech processor IC. Same chip as used by the popular TI99/4A speech synthesizer peripheral.
- Five 2K static RAM ICs. This is where the speech data is stored. There are jumpers next to each IC that allow an EPROM to be placed instead. Using an EPROM would allow you to use pre-stored speech data rather than having to download it on startup.
- A Z8 Microprocessor with a piggyback socket for a ROM. This is where the software for the speech module resides.
- Two 8-pin RS232 line drivers, U11 and U13, UA9636ACP
- A quad RS232 receiver, AM26LS32ACN
From the RS232 line drivers, it’s easy to learn which pins on the 15-pin connector are +12V and -12V. It’s also evident that four pins are RS232 output signals. From the AM26LS32, one can tell that there are four RS232 inputs.
Pinout
By tracing the pins, I managed to figure out the following:
Pin | Signal |
1 | Ground |
2 | Ground |
3 | Not Connected |
4 | +5V |
5 | SerIn 1B – Z8-P33, handshake line? |
6 | SerIn 3B – Likely serial RX from inline dev, passthrough to Pin15 |
7 | SerIn 2B – Z8-P30, Serial RX (accepts escape sequences on this pin) |
8 | SerIn 4B |
9 | Ground |
10 | +12V |
11 | -12V |
12 | Serout U11-2 |
13 | Serout U11-1 |
14 | Serout U13-2 – serial TX (probably to inline dev) |
15 | Serout U13-1 – serial TX (probably to host) |
The key bit of information is that pin 7 is the serial input from the host and pin 15 is the serial output to the host. The other pins either connect to the slave device, which we don’t really care about, or are handshake signals such as RTS/CTS, or power pins.
Reverse Engineering the Protocol
I pulled the ROM and downloaded it using my USB EPROM programmer and minipro software. Then using unidasm, a universal disassembler from the mame project, I disassembled the Z8 ROM.
Unfortunately, trying to understand an architecture you’re unfamiliar with, without any assembly comments or anything else to go on, can be a long process. I stayed up two long weekend nights figuring out what arguments each of the commands took. This is what I came up with:
Command | Syntax |
clear | <ESC>&ySCLE <bank>;<ESC>&yU |
download | <ESC>&ySDOW <bank>.<phrase> <length> [F|V] <hex digits>;<ESC>&yU |
pitch | <ESC>&ySPIT [H|M|L];<ESC>&yU |
reset | <ESC>&yRES;<ESC>&yU |
speak | <ESC>&ySSPE <bank>.<phrase>;<ESC>&yU |
status | <ESC>&ySSTA [S|A];<ESC>&yU |
transparent | <ESC>&ySTRA;<ESC>&yU |
upload | <ESC>&ySUPL <bank>;<ESC>&yU |
identify | <ESC>&yI |
The download and speak commands are the most interesting. The first argument to each is a pair of decimal numbers separated by a dot, which identify a bank and phrase. For example, 3.27 would be RAM chip #3, phrase #27. The bank numbers start at 1.
When downloading, you then specify the length in bytes. Then you put the letter F or V (no, I don’t yet know the difference) and finally you send a bunch of hex bytes, two characters per byte. Separators between the hex bytes are not necessary.
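As a rough illustration of the speak and download syntax, here is a minimal Python sketch that assembles the command strings. It is not the code from my repository. The function names are made up for this example, and the exact spacing between fields and the use of uppercase hex digits are assumptions read off the syntax table rather than verified requirements.

```python
ESC = "\x1b"
PREFIX = ESC + "&yS"        # opening sequence, as shown in the syntax table
SUFFIX = ";" + ESC + "&yU"  # semicolon terminator plus closing sequence


def speak_command(bank: int, phrase: int) -> str:
    """Build a speak command for <bank>.<phrase> (hypothetical helper)."""
    if bank < 1:
        raise ValueError("bank numbers start at 1")
    return f"{PREFIX}SPE {bank}.{phrase}{SUFFIX}"


def download_command(bank: int, phrase: int, data: bytes, mode: str = "F") -> str:
    """Build a download command (hypothetical helper).

    The length field is the number of data bytes, and each byte is sent
    as two hex characters with no separators. Uppercase hex and the
    single spaces between fields are assumptions.
    """
    if bank < 1:
        raise ValueError("bank numbers start at 1")
    if mode not in ("F", "V"):
        raise ValueError("mode must be 'F' or 'V'")
    hex_digits = data.hex().upper()
    return f"{PREFIX}DOW {bank}.{phrase} {len(data)} {mode} {hex_digits}{SUFFIX}"


if __name__ == "__main__":
    # repr() makes the escape characters visible
    print(repr(speak_command(3, 27)))
    print(repr(download_command(3, 27, bytes([0xAB, 0x01]))))
```

The resulting strings would then be written to the serial port connected to pin 7.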
Every command (except identify, which isn’t really a command) is terminated with a semicolon.
In my testing, the Status, Upload, and Identify commands did not produce any output unless I followed their escape sequence with an <XON> character.
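The XON behavior can be sketched the same way. The helpers below are hypothetical and simply append XON (ASCII DC1, 0x11) after the complete escape sequence, which is what worked for me. The spacing inside the commands is again an assumption taken from the syntax table.

```python
ESC = "\x1b"
XON = "\x11"  # ASCII DC1


def status_query(kind: str = "S") -> str:
    """Status command followed by XON (hypothetical helper)."""
    if kind not in ("S", "A"):
        raise ValueError("kind must be 'S' or 'A'")
    return f"{ESC}&ySSTA {kind};{ESC}&yU{XON}"


def upload_query(bank: int) -> str:
    """Upload command for one bank, followed by XON (hypothetical helper)."""
    if bank < 1:
        raise ValueError("bank numbers start at 1")
    return f"{ESC}&ySUPL {bank};{ESC}&yU{XON}"


def identify_query() -> str:
    """Identify sequence followed by XON; it takes no semicolon terminator."""
    return f"{ESC}&yI{XON}"


if __name__ == "__main__":
    for query in (status_query("A"), upload_query(1), identify_query()):
        print(repr(query))
```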
Obtaining speech data
Bitsavers has a collection of HP 1000 software, and the Voice Exerciser (VX) program that would have originally been used with the speech module is present there in the type-4 and type-5 archives. Specifically, this list of files. The 27203 files are the ones you’re looking for:
27203-16001 2320 01/01 A02797 00101 00005 %VX VOICE EXERCISER
27203-16001 2330 01/01 A02798 00101 00005 %VX VOICE EXERCISER
27203-16002 2320 01/01 A02799 00101 00005 %VMNGR VOICE MANAGER
27203-16002 2330 01/01 A02800 00101 00005 %VMNGR VOICE MANAGER
27203-16003 2320 01/01 A02801 00101 00004 VSCHMA IMAGE SCHEMA FILE
27203-16003 2330 01/01 A02802 00101 00004 VSCHMA IMAGE SCHEMA FILE
27203-16006 2320 01/01 A02803 00101 00004 VDBLD VMNGR COMMAND FILE
27203-16006 2330 01/01 A02804 00101 00004 VDBLD VMNGR COMMAND FILE
27203-16007 2320 01/01 A02805 00101 00004 SOMWRD DICTIONARY WORDS
27203-16007 2330 01/01 A02806 00101 00004 SOMWRD DICTIONARY WORDS
27203-16008 2320 01/01 A02807 00101 00004 SMWRD1 WORDS FOR FLOPPIES
27203-16009 2320 01/01 A02808 00101 00004 SMWRD2 WORDS FOR FLOPPIES (
27203-16010 2320 01/01 A02809 00101 00004 VXVERF VX COMMAND FILE
27203-16010 2330 01/01 A02810 00101 00004 VXVERF VX COMMAND FILE
27203-16011 2320 01/01 A02811 00101 00004 *VINST INSTALLATION FILE
27203-16011 2330 01/01 A02812 00101 00004 *VINST INSTALLATION FILE
27203-16012 2320 01/01 A02813 00101 00004 VDBLF VMNGR COMMANDS - FLO
27203-16013 2320 01/01 A02814 00101 00004 *VINS1 INSTALLATION - FLOPP
27203-16014 2320 01/01 A02815 00101 00004 *VINS2 INSTALLATION - FLOP
27203-17001 2320 01/01 A02816 00101 00004 "VXHLP VX HELP FILE
27203-17001 2330 01/01 A02817 00101 00004 "VXHLP VX HELP FILE
27203-17002 2320 01/01 A02818 00101 00004 A27203 SP LIB SOFT NUM CAT
27203-17002 2330 01/01 A02819 00101 00004 A27203 SP LIB SOFT NUM CAT
27203-17003 2320 01/01 A02820 00101 00004 "VMHLP VMNGR HELP FILE
27203-17003 2330 01/01 A02821 00101 00004 "VMHLP VMNGR HELP FILE
The two files with the speech data are 27203-16007_Rev-2330.src and 27203-16009_Rev-2320.src. These two files together contain approximately 1700 English words and their associated speech data.
Note that the VX software could be run on an actual or simulated HP 1000 series computer and that would have yielded the protocol by simple snooping of the serial interface. That was my backup plan had I been unable to disassemble and understand the Z8 ROM. I did have a lengthy chat with Dave Bryan who developed the Simh HP2100 simulator, and learned that loading complex software that requires a database, such as the voice exerciser, may be a daunting project. It’s still my hope, with some help from Dave, that I might be able to get this running on the simulator and try it out in the environment it was intended to work in.
My python program
I wrote a Python program to communicate with the speech synthesizer, and it’s present in my GitHub repository linked below. There are two programs:
- build_vocab.py. Builds a vocabulary file to download to the module, accepting words from stdin and writing the vocabulary to stdout.
- hpcli.py. A command-line tool to interact with the speech module, with commands for initializing, speaking, etc.
The video at the top of this blog post shows me running the Python programs to interact with the SOM.
Resources
- My Github Repository: https://github.com/sbelectronics/hp27201a
Thank you so much for this nice resurrection of ancient digital voices.
I am collecting everything about speech synth and of course about the TMS5220. Here is my collection (I just added the hp27201a files you dug) of LPC files
https://www.polaxis.be/lpc-files/
I use a slightly different format to stay compatible with the one used in Arduino and https://github.com/ptwz/python_wizard
Keep going!
Cool find on eBay!
Whether this unit was the first from a major computer manufacturer is (as always) a “dangerous” assumption. For example, D|I|G|I|T|A|L (DEC) had the DECtalk 1. It used a TI DSP processor for the sound generation and had a 68000 as interface. You send text over serial to the unit, and speech comes out. We all *know* the DECtalk 1, because this unit generated the voice for Stephen Hawking (RIP).