The Aggregate's `helpme` Utility

Audio Diagnostics for Clusters & Server Farms

Clusters and server farms often have many nodes... and very little hardware support for locating and diagnosing problems. Using the PC speaker, this program of less than 50K bytes allows a node to signal that it has a problem and describe the problem using arbitrary alphanumeric text.

Overview

Cluster supercomputers and server farms made using commodity PC hardware suffer from the combination of high component counts, high component failure rates, and relatively fragile interconnections between nodes: nodes will fail. Determining what has failed and why is complicated by the fact that most nodes have neither a console nor any other type of traditional status output device. Using a network connection to provide diagnostic information seems reasonable, but in practice it is common that the problem lies in the network card, cables, or switches.

Hardware manufacturers often provide diagnostic feedback using status lights, but that approach really isn't effective for nodes. Even if we had an alphanumeric display panel on each case, or a KVM switch connection and a monitor, identifying a "sick" node would require examining each node individually -- which is not practical for a large number of nodes. Manufacturers of PC motherboards instead use the PC speaker to output a "beep code" to help the user identify problems when no other I/O devices are functional. Because the audio signal from a single node can often be heard across a wide area of a room filled with nodes, audio signals are a better diagnostic output mechanism than per-node displays. Unfortunately, vendor-specific "beep codes" are not very intuitive ways to convey diagnostic information.

This is why we built helpme: to serve as a cluster node diagnostic output mechanism. Rather than having a complex set of beep codes covering all the relevant status conditions, helpme uses a very simple beep code only to attract attention to the node and to express the urgency of the information. The actual status information is conveyed in the form of alphanumeric text rendered as an audio signal.

Given that the PC speaker does quite well producing beeps, it is not surprising that it is very effective rendering alphanumeric text as Morse code. In fact, using a somewhat higher pitch than most beep codes, the PC speaker provides a stunningly crisp rendition of dits and dahs. The patterns are so clear that they easily can be identified over the din of several racks of nodes. The relevant Morse code patterns are:

0: -----  A: .-    K: -.-   U: ..-
1: .----  B: -...  L: .-..  V: ...-
2: ..---  C: -.-.  M: --    W: .--
3: ...--  D: -..   N: -.    X: -..-
4: ....-  E: .     O: ---   Y: -.--
5: .....  F: ..-.  P: .--.  Z: --..
6: -....  G: --.   Q: --.-
7: --...  H: ....  R: .-.
8: ---..  I: ..    S: ...
9: ----.  J: .---  T: -
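
As a rough illustration of how the table above could be held in a program, here is a minimal C sketch. It is not taken from helpme.c; the names morse_for and morse_encode are hypothetical, and the sketch only produces the dot/dash text for each character, leaving tone timing aside.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Patterns copied from the table above, indexed by digit value and by letter. */
static const char *const morse_digits[10] = {
    "-----", ".----", "..---", "...--", "....-",
    ".....", "-....", "--...", "---..", "----."
};
static const char *const morse_letters[26] = {
    ".-",   "-...", "-.-.", "-..",  ".",    "..-.", "--.",  "....", "..",
    ".---", "-.-",  ".-..", "--",   "-.",   "---",  ".--.", "--.-", ".-.",
    "...",  "-",    "..-",  "...-", ".--",  "-..-", "-.--", "--.."
};

/* Return the pattern for one ASCII alphanumeric (case-insensitive), or NULL. */
const char *morse_for(char c)
{
    unsigned char u = (unsigned char)c;
    if (isdigit(u))
        return morse_digits[u - '0'];
    if (isalpha(u))
        return morse_letters[toupper(u) - 'A'];
    return NULL;
}

/* Write the patterns for text into out, separated by single spaces.
   Non-alphanumeric characters are skipped. Returns the length written. */
size_t morse_encode(const char *text, char *out, size_t cap)
{
    size_t len = 0;
    assert(text != NULL && out != NULL && cap > 0);
    for (; *text != '\0'; text++) {
        const char *code = morse_for(*text);
        if (code == NULL)
            continue;
        size_t n = strlen(code);
        size_t sep = (len > 0);          /* one space between characters */
        if (len + sep + n + 1 > cap)     /* keep room for the terminator */
            break;
        if (sep)
            out[len++] = ' ';
        memcpy(out + len, code, n);
        len += n;
    }
    out[len] = '\0';
    return len;
}
```

With this sketch, morse_encode("NIC", buf, sizeof buf) would fill buf with "-. .. -.-.", the patterns for N, I, and C.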

Of course, the catch is that not everybody knows Morse code, so it would be nice to have another option....

The ideal would be to use the PC speaker for true text-to-speech voice synthesis, but the quality of PC speaker voice synthesis is mediocre at best, making it unreliable for conveying information as pure text-to-speech. Fortunately, we are far from the first people to be faced with this problem: the task of highly reliable communication of alphanumeric data over noisy voice transmissions is common in military operations and air traffic control. The international standard solution for air traffic control is the NATO "Phonetic Alphabet" -- you know, that Alpha, Bravo, Charlie stuff. Even a very crude, low data rate, PC speaker software driver can reliably convey status information by pronouncing each character using the NATO-standard words (and their slightly unusual pronunciations, e.g., "Three" is pronounced "Tree", "Nine" is pronounced "Niner"):

0: Zero   A: Alpha    K: Kilo      U: Uniform
1: One    B: Bravo    L: Lima      V: Victor
2: Two    C: Charlie  M: Mike      W: Whiskey
3: Three  D: Delta    N: November  X: X-ray
4: Four   E: Echo     O: Oscar     Y: Yankee
5: Fife   F: Foxtrot  P: Papa      Z: Zulu
6: Six    G: Golf     Q: Quebec
7: Seven  H: Hotel    R: Romeo
8: Eight  I: India    S: Sierra
9: Nine   J: Juliet   T: Tango

The PC speaker voice synthesis used in helpme is essentially 1-bit conversion using an 8 kHz sample rate, yielding barely decipherable speech -- but that's good enough to clearly distinguish the NATO words. OK, you'll do better if you listen to at least a few known phrases first just to get used to the audio defects, but it works. The very low data rate driver also means that the voice samples are very small. In fact, the entire helpme program, supporting both Morse code and NATO phonetic alphabet, is well under 50K bytes as a fully self-contained program (no libraries, helper programs, or files). Thus, helpme is suitable for ubiquitous use, even within single-floppy Linux distributions.

Use of `helpme`

As described above, helpme is not intended to be a general speech or sound facility. It is not compatible with any of the multitude of competing standards for such use. It also can be quite CPU intensive, making it highly undesirable if the CPU could be used for more productive work. Instead, helpme is intended to be a stand-alone audio diagnostic renderer for clusters and server farms, with a variety of features aimed at making it better suited to that task.

The input to helpme consists of ASCII text presented either as one or more command line arguments or as the standard input stream. If command line arguments are used, the arguments are treated as a single input separated by spaces. To allow for possible future command line options, any command line argument that begins with "-" will be treated as an option specifier and omitted from the input.
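
The argument handling described above can be sketched in a few lines of C. This is only an illustration of the stated rule (join arguments with spaces, drop anything starting with "-"), not the code in helpme.c; the function name join_args and its buffer interface are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Join argv[1..argc-1] into out, separated by single spaces.
   Arguments beginning with '-' are treated as option specifiers and
   left out of the input text. Returns the length written. */
size_t join_args(int argc, char *const argv[], char *out, size_t cap)
{
    size_t len = 0;
    assert(argv != NULL && out != NULL && cap > 0);
    for (int i = 1; i < argc; i++) {
        if (argv[i][0] == '-')
            continue;                    /* option: omitted from the input */
        size_t n = strlen(argv[i]);
        size_t sep = (len > 0);          /* one space between arguments */
        if (len + sep + n + 1 > cap)     /* keep room for the terminator */
            break;
        if (sep)
            out[len++] = ' ';
        memcpy(out + len, argv[i], n);
        len += n;
    }
    out[len] = '\0';
    return len;
}
```

For the arguments PE42 -v OK. this sketch yields the input text "PE42 OK.". One consequence of the rule is that an argument cannot start with the "-" Morse marker, since it would be taken as an option.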

The input text is not just text, but uses a simple diagnostic message language to specify things like Morse or NATO rendering and repeats of a message. Since helpme is most useful for a node that has no other way of communicating, input syntax errors are simply ignored. The syntax is roughly:

input: phrase*

phrase: "(" phrase ")" repeat
      | item*

repeat: number
      | "*"      -- repeat forever

item: alphanumeric
    | " "        -- wordbreak pause (multiples are collapsed into one)
    | ","        -- wordbreak pause (incompressible)
    | "_" number -- set ms duration for tones (default 2000ms = 2 seconds)
    | "~" number -- play tone with number Hz frequency, 0 Hz is silent
    | ";"        -- equivalent to ~0
    | "-"        -- render using Morse code
    | "@"        -- render using NATO phonetic alphabet (also default)
    | "%"        -- toggle Morse code vs. NATO phonetic alphabet
    | "!"        -- urgent problem beep code
    | "?"        -- possible problem beep code
    | "."        -- informational beep code

In the above grammar, number is any sequence of 0 or more digits and alphanumeric is any digit or letter. Uppercase and lowercase letters are treated identically and tab and newline are treated exactly like " ". Similarly, any of the commonly used grouping marks can be used instead of "(" and ")", so that any of "(<{[" can be paired with any of ")>}]". The "=", "$", and "#" symbols are reserved for possible macro and/or comment facilities; generally, other characters are ignored. Here are a few examples:
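
To make the grouping and repeat rules concrete, here is a hedged C sketch that expands groups into a flat item string. It is not the parser from helpme.c, and several choices are assumptions made only for illustration: the name expand_groups, a missing repeat count meaning "once", an explicit count of 0 dropping the group, and "*" (repeat forever) being expanded a single time because a flat buffer cannot hold an endless repeat. Whitespace directly after a repeat count is simply copied through; the grammar does not say how helpme treats it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Any opener may pair with any closer, as the text describes. */
static int is_open(char c)  { return c != '\0' && strchr("(<{[", c) != NULL; }
static int is_close(char c) { return c != '\0' && strchr(")>}]", c) != NULL; }

/* Append one character, silently dropping it if the buffer is full. */
static void emit(char *out, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap)
        out[(*len)++] = c;
}

/* Expand items starting at *p until a closer or the end of the input. */
static void expand(const char **p, char *out, size_t cap, size_t *len)
{
    while (**p != '\0' && !is_close(**p)) {
        if (!is_open(**p)) {
            emit(out, cap, len, *(*p)++);    /* ordinary item: copy it */
            continue;
        }
        ++*p;                                /* skip the opener */
        size_t start = *len;
        expand(p, out, cap, len);            /* group body, expanded once */
        if (is_close(**p))
            ++*p;                            /* a missing closer is ignored */
        size_t body = *len - start;
        unsigned long count = 1;             /* assumption: no count = once */
        if (**p == '*') {
            ++*p;                            /* "forever": one pass here */
        } else if (isdigit((unsigned char)**p)) {
            count = 0;
            while (isdigit((unsigned char)**p))
                count = count * 10 + (unsigned long)(*(*p)++ - '0');
        }
        if (count == 0)
            *len = start;                    /* assumption: 0 drops the group */
        for (unsigned long r = 1; r < count && *len + 1 < cap; r++)
            for (size_t i = 0; i < body; i++)
                emit(out, cap, len, out[start + i]);
    }
}

/* Expand every group in the input; stray closers are ignored. */
size_t expand_groups(const char *in, char *out, size_t cap)
{
    size_t len = 0;
    assert(in != NULL && out != NULL && cap > 0);
    while (*in != '\0') {
        expand(&in, out, cap, &len);
        if (is_close(*in))
            ++in;
    }
    out[len] = '\0';
    return len;
}
```

In this sketch, "(a{b}3)2" expands to "abbbabbb", and mismatched marks such as "x[y>2z" are accepted and give "xyyz". A real renderer would more likely interpret the input as it plays rather than build a flat copy, which is also what makes an endless "*" repeat possible.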

$ helpme PE42 -v OK.

Use the (default) NATO rendering of "Papa" "Echo" "Four" "Two", wordbreak pause, render "Oscar" "Kilo", and then make the informational alert sound. Notice that the "-v" is ignored because it begins with a "-"; it is interpreted as a command line option rather than as input text.

$ helpme
(!NIC%)*

Make the urgent problem alert sound, then use the (default) NATO rendering of "November" "India" "Charlie", switch the default rendering, and repeat forever. Thus, "NIC" is alternately rendered as NATO and Morse versions.

$ helpme '(1{2}34 5)6'

Use the (default) NATO rendering of "One", then render "Two" 34 times, then render "Five" (without a wordbreak pause), and repeat the entire sequence for a total of 6 times.

$ helpme
a
             b,,c_4000;d

Use the (default) NATO rendering of "Alpha", pause for a single wordbreak, render "Bravo", pause for two wordbreaks, render "Charlie", set duration to 4 seconds, pause for 4 seconds, and render "Delta".

$ helpme
_500
~440

Set duration to 1/2 second, then play a tone with a frequency of 440Hz.

If you want, you can even play music with helpme. Here are Taps and Hornpipe. Ok, I didn't say it would sound good. ;-)

One minor technical detail to keep in mind: in this version, audio rendering does not begin until the entire input has been read. This was done because intermixing reads of the input with audio rendering would occasionally cause timing glitches that would make the audio rendering less understandable.

The Code

The full source code of helpme, helpme.c, is freely available as a full public domain release. The code as distributed will only work on an IA32 PC whose processor supports the tick counter performance register (Pentiums or Athlons of any flavor should work).

There are only three compilation options that you may want to alter:

If USEIOCTL is defined, helpme uses the relatively new /dev/console ioctl() calls to play all the music; this is strongly preferred for recent Linux kernels, and is thus the default. Undefining USEIOCTL causes the program to use only direct I/O port accesses to produce the sounds, preferable for use with older Linux kernels or if you want to borrow portions of this code for use in a kernel module. For NATO rendering, CPU overhead is around 50% no matter how you set the options (it is that low only because usleep() calls are used where possible). Morse rendering also yields about the same overhead either way, but the overhead is less than 15%. Beep codes have less than 2% overhead using ioctl(), but approach 15% for direct I/O port accesses.
The second compilation option involves how helpme determines the frequency of the CPU clock (hence, of the tick counter performance register). You can specify a particular frequency by defining TICKSPERSEC; otherwise, the default is to autocalibrate the tick counter using repeated calls to time(0), which also adds as much as 2 seconds delay to the start of the message.
The third compilation option, FORCETOGGLE, forces helpme to play everything by directly toggling the PC speaker. For tones, the processor overhead becomes nearly 100%, so this option should be used only if nothing else works. In particular, it seems some versions of Linux (e.g., the kernel in RedHat 7.3) do not implement /dev/console if the system has no keyboard and video display, thus USEIOCTL causes helpme to immediately exit. However, the same kernel also seems to interfere with the direct I/O port programming of the tone timer. Hopefully, those kernel bugs will be fixed and the FORCETOGGLE option can go away....
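
For readers curious what a console ioctl() tone request looks like, here is a small sketch of packing the argument for the Linux KDMKTONE call as documented in the console_ioctl man page: the low 16 bits carry the tone period in units of 1/1193180 second and the high 16 bits carry the duration in milliseconds. Whether helpme.c uses KDMKTONE, KIOCSOUND, or something else is not stated here, so treat this as an assumption; the name kd_tone_arg is hypothetical.

```c
#include <assert.h>

/* Timer input frequency in Hz, as quoted by the console_ioctl man page. */
#define PIT_HZ 1193180UL

/* Pack a KDMKTONE-style argument: (duration in ms << 16) | period.
   A frequency of 0 Hz gives period 0, which requests silence. */
unsigned long kd_tone_arg(unsigned hz, unsigned ms)
{
    unsigned long period = (hz == 0) ? 0UL : PIT_HZ / hz;
    unsigned long dur = ms;
    if (period > 0xFFFFUL)
        period = 0xFFFFUL;   /* very low frequencies: clamp to 16 bits */
    if (dur > 0xFFFFUL)
        dur = 0xFFFFUL;      /* duration field is also 16 bits */
    assert(period <= 0xFFFFUL && dur <= 0xFFFFUL);
    return (dur << 16) | period;
}
```

On Linux the result would typically be passed as the third argument of ioctl(fd, KDMKTONE, ...) on a descriptor opened on /dev/console. Note that the 16-bit duration field caps a single request at about 65.5 seconds.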

To compile the program:

cc helpme.c -o helpme -O6

If you need helpme to be compiled to the absolute minimum image size, you probably will want to look at things like uClibc, a C library for embedded systems. Standard GCC compilation using -nostdlib gives undefined references to __udivdi3, usleep, read, signal, and exit; USEIOCTL will cause ioctl and open to also be referenced, and not specifying TICKSPERSEC causes a reference to time. Removing the reference to usleep only increases CPU overhead, and the reference to read also can be removed if only command line input will be used for invoking helpme. The signal and exit references also can be removed without serious effect. That leaves only __udivdi3 -- which is needed to do a bunch of 64-bit unsigned divides.

The use of ioctl() calls requires access permission on /dev/console, which usually requires root privileges for a process not using the console as its standard input. If you instead tell helpme to use direct accesses to I/O ports, it needs access rights to those ports and also permission to temporarily disable interrupts around some critical I/O code, again requiring root privileges. Thus, either way, helpme generally must run with root privileges. Once compiled, you can mark it as set uid root by executing the following two commands while logged in as root:

chown root helpme
chmod +s helpme

Because helpme has been placed in the public domain, it is entirely up to you to determine fitness for your application. Neither the author, Prof. Hank Dietz, nor the University of Kentucky is to be held responsible for any problems that may occur. That said, we will try to fix reported bugs (in our copious spare time ;-).

Version History

20020910: The initial release. There are no known bugs at the time of release.

References (actually more like a mini-FAQ)

Morse code was developed by Samuel Morse in the 1840s. As described here, the original Morse code was somewhat different from what we use now, which should more correctly be called something like "International Morse Code."

A "Phonetic Alphabet" as we describe it here is not a set of phonetic symbols to describe the sounds in a language, but rather a set of words that can be used to more clearly distinguish the individual letter each word represents. This type of phonetic alphabet seems to have originated in military applications, probably around 1900. The particular set of words we use is the one that, according to this US Navy History, was adopted in 1957. From sometime in the 1950s, it has been the standard for NATO (North Atlantic Treaty Organization), ICAO (International Civil Aviation Organization), ITU (International Telecommunication Union), and the FAA (Federal Aviation Administration). It also is commonly used for radio call signs, etc. From its military heritage, especially NATO naval signal codes, a few idiomatic sequences have commonly known special meanings. For example, "Bravo Zulu" means "well done." Another common idiomatic phrase is "Sierra Hotel" for "extremely capable" (according to this, it is really an abbreviation for "Sh*t Hot").

In case you are wondering which voice synthesis packages were considered before going with Morse and NATO rendering of text, there are really only two freely available alternatives that seemed viable. The most practical alternative seems to be rsynth 2.0, which is fairly compact and sounds a lot like the old Votrax synthesizer (but doesn't need any special hardware). Festival is a newer system that seems much more capable, but it also is significantly harder to strip down for standalone use in an audio diagnostic renderer. Of course, there are many other systems freely available, including: KPE80 - A Klatt Synthesiser and Parameter Editor, Emacspeak, FreeTTS, and Flite (festival-lite, a simplified Festival that, when stripped and compressed by bzip2 fits in 2.5MB). In fairness, the text-to-speech stuff really isn't all that bad in itself, but by the time the sound is passed through a random PC's speaker (even using 8-bit PWM), the combination of defects makes it very difficult to understand. Add to that the fact that you'll be listening in a rather noisy machine room and it just isn't good enough. In fact, even when I recorded myself saying various text phrases and played that back, it was very difficult to catch every word.

The code for using the PC speaker has a bit of a history. Back in 1994, we developed a multi-voice music compiler to generate code that would play the music using the PC speakers on the nodes of our Linux PC clusters. It was done as a demonstration of the cheap barrier synchronization and global functions supported by aggregate function communication: each new note was randomly assigned to a different node in the cluster, which all nodes agreed upon. In fact, early versions barrier synchronized with each toggle of a PC speaker. Many better PC speaker drivers have been built since 1994, including a variety of ways to make the PC speaker appear to be an 8-bit sound card (clever uses of 8-bit PWM), but these techniques require lots more data and really do not improve the sound quality all that much. Thus, the only newer stuff that we've really used is the Linux kernel support for ioctl().

Where did the NATO audio data come from? Me. I recorded myself using Microsoft .wav files, 8-bit, 8K samples/second, in my too-noisy office using a laptop PC. I wrote simple (and ugly) software to convert the .wav files into various compressed formats. After trying quite a few variations on both desktop and laptop PCs, I found that the more expensive methods actually sounded slightly better on some machines, but far worse on others. Run-length encoded 1-bit data, still at 8K samples/second, wasn't great on any machine... but it wasn't terrible on any machine either, and it is very compact. That's how the tables you see in helpme.c are encoded. Although I doubt I'll change the encoding, it is quite possible that a future version will replace those tables with an improved set of recordings....
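
The general idea of run-length encoded 1-bit audio can be sketched as follows. This is not the table format used in helpme.c, which is not described here; the midpoint threshold of 128, the one-byte-per-run layout, the convention that runs alternate starting with the low level, and the name rle_encode_1bit are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce unsigned 8-bit samples to 1 bit each (sample >= 128 means high)
   and store the lengths of alternating low/high runs, one byte per run,
   starting with a low run. A run longer than 255 samples is split by
   inserting a zero-length run of the opposite level. Returns the number
   of run bytes written; output is truncated if runs[] fills up. */
size_t rle_encode_1bit(const unsigned char *pcm, size_t n,
                       unsigned char *runs, size_t cap)
{
    size_t used = 0;
    int level = 0;          /* level of the run currently being counted */
    unsigned run = 0;       /* samples counted in the current run */
    assert(pcm != NULL && runs != NULL);
    for (size_t i = 0; i < n; i++) {
        int bit = (pcm[i] >= 128);
        if (bit != level || run == 255) {
            if (used == cap)
                return used;
            runs[used++] = (unsigned char)run;   /* close the current run */
            if (bit == level) {
                if (used == cap)
                    return used;
                runs[used++] = 0;                /* same level continues */
            } else {
                level = bit;
            }
            run = 0;
        }
        run++;
    }
    if (run > 0 && used < cap)
        runs[used++] = (unsigned char)run;       /* final partial run */
    return used;
}
```

For the six samples {0, 0, 200, 200, 200, 10} this sketch produces the three run lengths 2, 3, 1. At 8K samples/second, long stretches at one level collapse to a byte or two, which is consistent with the compactness described above.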

Why is the program called helpme? Remember the original Star Trek Episode 77: The Savage Curtain? In that episode, Surak is captured and apparently screams out in pain "Help Me, Spock!" Lincoln goes to rescue him and discovers that Surak was already dead, and the evil aliens demonstrate how they imitated Surak's voice, and then how they will imitate Lincoln's voice saying "Help Me, Kirk!" once he's dead. Since helpme essentially calls for help for a dead or injured node of a cluster or farm, this name seemed as appropriate as anything else we could think of. It also conveys the idea that this tool should only be used in such desperate situations; it is not intended as a general-purpose audio interface.

Author Contact Info

If you have any questions or comments, contact:

Professor Hank Dietz, James F. Hardymon Chair in Networking
College of Engineering
Electrical and Computer Engineering Department
453 Anderson Hall
(Office 307 EE Annex, Lab 672 Anderson Hall)
Lexington, KY 40506-0046

Office Phone: (859) 257 4701
Lab Phone:    (859) 257 9695
Fax :         (859) 257 3092
Email:        hankd@engr.uky.edu
Home URL:     http://aggregate.org/hankd/
