Speak Freely for Unix
by John Walker
SFMIKE(1) SFMIKE(1)
NAME
sfmike - Speak Freely sound transmission utility
SYNOPSIS
sfmike [ -abcdefglmnqrtu ] [ -lpc[10[rn]] ] [
-slevel[,timeout] ] [ -baAESKey ] [ -bfBlowfishKey ] [
-bxAESHexKey ] [ -celp ] [ -iIDEAkey ] [ -kDESkey ] [
-natcsock,dsock ] [ -okeyfile ] [ -robustn ] [ -rtp ]
[ -t ] [ -vat ] [ -wdumpfile ] [ -yindev[:ctldev] ] [
-zUser_list ] hostname[:port] [ -phostname[:port] ] [
soundfile ... ]
DESCRIPTION
Speak Freely allows users of a variety of Unix and Unix-
like systems equipped with audio hardware connected by a
network to converse, using the audio input and output fa-
cilities of the machine to digitise and later reconstruct
the sound and the network to relay sound packets. Audio
files in Sun .au format or .gsm files pre-compressed with
toast may be transmitted and played on remote machines as
well. Optional compression is provided, allowing conver-
sations over relatively low-bandwidth Internet links as
well as local area networks. Speak Freely consists of two
programs, sfmike and sfspeaker.
You can send audio to machine hostname running the sfs-
peaker program with:
sfmike hostname
which sends real time audio, or:
sfmike hostname soundfile
where soundfile is one or more files of prerecorded sound
in Sun (.au) format or GSM compressed (.gsm) sound files
created by toast. The hostname can be either a local or
Internet host name (like stinky.dwarves.org) or a numeric
IP address (for example 192.168.67.89). If your network
supports IP Multicasting, you can transmit to a multicast
group simply by giving its name or IP address. The scope
(time-to-live) of the multicast can be specified as a num-
ber between 0 (restricted to the same host) and 255 (unre-
stricted) at the end of the group name or IP address, sep-
arated by a slash, for example 231.111.75.122/128; the de-
fault multicast scope is 1: restricted to the same subnet.
If the host you're transmitting to uses a different port
number than the default configured in the Makefile, speci-
fy the port number after the host name or IP address, sep-
arated by a colon, for example bink.bilgepump.com:5050.
If both a port number and multicast scope are specified,
the port number should come first: 227.31.89.117:4851/64.
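The destination syntax described above (host, then an optional colon and port, then an optional slash and multicast scope) can be sketched as a small parser. This is an illustration only, not code from sfmike; the names Destination and parse_destination are hypothetical.

```python
from typing import NamedTuple, Optional


class Destination(NamedTuple):
    host: str
    port: Optional[int]   # None means "use the compiled-in default"
    ttl: Optional[int]    # None means "use the default multicast scope"


def parse_destination(spec: str) -> Destination:
    """Split host[:port][/ttl] into its parts (illustrative helper)."""
    ttl = None
    if "/" in spec:
        # The scope, when present, is the last field.
        spec, ttl_text = spec.rsplit("/", 1)
        ttl = int(ttl_text)
        if not 0 <= ttl <= 255:
            raise ValueError("multicast scope must be between 0 and 255")
    port = None
    if ":" in spec:
        # The port, when present, comes before the scope.
        spec, port_text = spec.rsplit(":", 1)
        port = int(port_text)
    return Destination(spec, port, ttl)
```

With this sketch, parse_destination("227.31.89.117:4851/64") separates the group address, port 4851, and scope 64, while a bare host name leaves both optional fields unset.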
Users with dial-up Internet connections which assign a
different host name and IP address for each session can
publish their current address on a Look Who's Listening
server. Others can then use the sflwl lookup program to
see, based on an individual's invariant E-mail address or
name, whether they're connected and if so with what ad-
dress. A Look Who's Listening server is currently avail-
able at the site lwl.fourmilab.ch.
To protect against eavesdropping, sfmike provides a vari-
ety of encryption algorithms, including AES, Blowfish,
IDEA, DES, and a key read from a file. Any number of en-
cryption algorithms may be used simultaneously (assuming
your machine is fast enough). If pgp or gpg is installed,
sfmike can invoke it automatically to securely transfer a
randomly-generated session key to the party you're commu-
nicating with.
If soundfile is a single period, real time audio from the
microphone jack is selected. This permits you to send one
or more sound files, then switch to live audio all in a
single command.
sfmike is normally used in conjunction with an audio con-
trol panel to set audio record and playback levels. An
excellent X Window audio control tool which runs on most
Unix platforms including Linux and FreeBSD is xmmix; for
more information, visit the Web site
http://metalab.unc.edu/tkan/xmmix/. Most Unix workstations with audio
hardware come with a proprietary audio control panel; con-
sult the manufacturer's documentation for details.
In interactive (push-to-talk) mode, you can send text chat
messages to users to whom you're transmitting by pressing
the period (``.'') key, then entering a line of text.
This can be useful when you're setting up a connection and
trying various compression modes to establish a reliable
audio link.
It's perfectly valid to send audio to a copy of sfspeaker
running on the same machine as sfmike. In fact, it's a
very handy way to experiment, as long as your audio hard-
ware permits full-duplex operation.
OPTIONS
Options are processed left to right and sound files are
sent with the modes specified by options to their left on
the command line.
-a Selects ``always transmit'' mode. Unless sup-
pressed by squelch (see the -s switch below)
sfmike transmits live audio continuously. It's
usually better to use the default push-to-talk
mode. This mode is completely non-interactive;
you'll need to use Control C or a kill command
to terminate sfmike, and you can't use text
chat. The -a option is primarily intended for
automated broadcast or audio-on-demand applica-
tions such as sfvod. To run full-duplex and re-
tain the ability to pause transmission, use text
chat, and exit the program normally, use the de-
fault push-to-talk mode, and simply leave the
program continuously in talk mode.
-b Selects push-to-talk (button) mode. This is the
default. Output is initially off and the legend
``Pause:'' appears. Pressing the space bar (or
any key other than those which exit the program
or enter text chat) toggles back and forth be-
tween ``Pause:'' and ``Talk:'' modes. In Talk
mode sound packets are sent to the destination,
while in Pause mode they are discarded. Push-
to-talk mode reduces load on the network since
no packets are sent unless you're talking.
Push-to-talk makes conference calls practical,
since only the person who ``has the floor'' is
transmitting to the group. To exit sfmike,
press Escape, ``q'', Control C, or Control D.
Pressing the period (``.'') key pauses audio (if
in Talk mode) and prompts you with ``Chat:''.
You can then enter a line of text which will be
sent to every destination you're transmitting to
and printed on standard output by sfspeaker
there, tagged with your identity. This can be
handy when you're trying to choose the best com-
pression mode and having trouble getting audio
through. After entering a line of chat text,
transmission remains paused; you can enter addi-
tional lines of chat text, if you wish, each
prefixed by a period, or resume audio transmis-
sion by pressing any other key.
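The key handling described for push-to-talk mode amounts to a small state machine. The following sketch models it under the assumption that key presses arrive one character at a time; handle_key and the action names are hypothetical and do not come from the sfmike source.

```python
ESCAPE, CTRL_C, CTRL_D = "\x1b", "\x03", "\x04"
EXIT_KEYS = {ESCAPE, "q", CTRL_C, CTRL_D}


def handle_key(talking: bool, key: str):
    """Return (talking, action) after one key press outside chat entry."""
    if key in EXIT_KEYS:
        return talking, "exit"
    if key == ".":
        # Audio pauses; the caller would now read one line of chat text.
        return False, "chat"
    # Any other key toggles between Pause and Talk.
    return (not talking), None
```

Because entering chat always leaves the program paused, a following "." starts another chat line and any other key resumes transmission, as the text describes.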
-bakey The specified key is used to encrypt sound
transmitted to subsequently named hosts using
the FIPS-197 Advanced Encryption Standard (AES)
algorithm. To decrypt sound encoded with the
-ba switch, sfspeaker on the receiving machine
must be invoked with an identical -bakey speci-
fication on the command line. The key can be as
long as you like; if it's a phrase of several
words, be sure to enclose it in quotes. The ac-
tual 128 bit AES key is created by applying the
MD5 algorithm to the given key. You can specify
a 256 bit key by supplying two key phrases after
the -ba option, separated by a plus sign
(``+''). sfspeaker will continue to correctly
receive unencrypted sound even if invoked with
the -ba switch. To disable AES encryption for
subsequent hosts, specify the -ba switch with no
key. AES encryption is fast and far more secure
than the DES encryption performed by the -k
switch. It has been adopted by the U.S. govern-
ment as the successor to DES and is free of
patent restrictions. The -bx option, described
below, permits you to specify AES keys of 128,
192, or 256 bits in hexadecimal form.
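As a rough illustration of deriving a key from a phrase with MD5, the sketch below hashes each phrase to 16 bytes. The text does not say how the two phrases of a 256 bit key are combined or how the phrase is encoded, so concatenating the two digests and using UTF-8 are assumptions made here, and aes_key_from_phrase is a hypothetical name.

```python
import hashlib


def aes_key_from_phrase(spec: str) -> bytes:
    """Derive AES key bytes from a -ba style argument (sketch only)."""
    # Assumption: each phrase is hashed as UTF-8 text and, for "a+b", the
    # two 16-byte MD5 digests are concatenated to form the 256 bit key.
    phrases = spec.split("+", 1)
    return b"".join(hashlib.md5(p.encode("utf-8")).digest() for p in phrases)
```

A single phrase yields 16 bytes (128 bits); two phrases joined by a plus sign yield 32 bytes (256 bits).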
-bfkey The specified key is used to encrypt sound
transmitted to subsequently named hosts using
the Blowfish algorithm. To decrypt sound encod-
ed with the -bf switch, sfspeaker on the receiv-
ing machine must be invoked with an identical
-bfkey specification on the command line. The
key can be as long as you like; if it's a phrase
of several words, be sure to enclose it in
quotes. The actual 128 bit Blowfish key is cre-
ated by applying the MD5 algorithm to the given
key. sfspeaker will continue to correctly re-
ceive unencrypted sound even if invoked with the
-bf switch. To disable Blowfish encryption for
subsequent hosts, specify the -bf switch with no
key. Blowfish encryption is extremely fast and
generally considered to be far more secure than
the DES encryption performed by the -k switch.
However, Blowfish is newer and has not been en-
dorsed by governmental bodies or standards or-
ganisations. It is free of patent restrictions
and may be used by anybody in any manner without
a license.
-bxhexkey The specified hexkey is used to encrypt sound
transmitted to subsequently named hosts using
the FIPS-197 Advanced Encryption Standard (AES)
algorithm. To decrypt sound encoded with the
-bx switch, sfspeaker on the receiving machine
must be invoked with an identical -bxhexkey
specification on the command line. The key is
specified in hexadecimal, and must consist of
the digits from 0 to 9 and letters from A
through F (upper or lower case). The length of
the key is determined by the number of key
digits given: 128 bits for 32 or fewer digits,
192 bits for 33 through 48 digits, and 256 bits
for 49 through 64 digits. The key is used as
given; no hashing or transformation is per-
formed. If fewer digits than the key length are
specified, they are left justified and unspeci-
fied digits are set to zero. sfspeaker will
continue to correctly receive unencrypted sound
even if invoked with the -bx switch. To disable
AES encryption for subsequent hosts, specify the
-bx switch with no key. AES encryption is fast
and far more secure than the DES encryption per-
formed by the -k switch. It has been adopted by
the U.S. government as the successor to DES and
is free of patent restrictions.
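The rules for choosing the key length and padding a short hexadecimal key can be sketched as follows. The function name aes_key_from_hex is hypothetical, and rejecting keys longer than 64 digits is an assumption, since the text does not say what happens in that case.

```python
import string


def aes_key_from_hex(hexkey: str) -> bytes:
    """Turn a -bx style hexadecimal key into 16, 24 or 32 key bytes."""
    # An empty key disables encryption and is not handled by this sketch.
    if not hexkey or any(c not in string.hexdigits for c in hexkey):
        raise ValueError("key must consist of hexadecimal digits")
    if len(hexkey) <= 32:
        digits = 32      # 128 bit key
    elif len(hexkey) <= 48:
        digits = 48      # 192 bit key
    elif len(hexkey) <= 64:
        digits = 64      # 256 bit key
    else:
        # Assumption: longer keys are treated as an error.
        raise ValueError("key may not exceed 64 hexadecimal digits")
    # Left justify the digits given and set the unspecified ones to zero.
    return bytes.fromhex(hexkey.ljust(digits, "0"))
```

For example, a two-digit key such as FF becomes one 0xFF byte followed by fifteen zero bytes, and a 33-digit key selects a 192 bit key.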
-c Simple sound compression. (Note: The -t switch,
documented below, uses a far more sophisticated
form of compression which reduces network band-
width by a factor of five while delivering sound
quality almost indistinguishable from the origi-
nal. The -c form of compression is retained
primarily for compatibility with earlier ver-
sions of sfspeaker which did not support the -t
switch, and for machines too slow to perform -t
switch compression in real time. The -f switch
enables ADPCM compression which achieves the
same reduction in data rate as the -c switch
with much less loss of fidelity and only modest
demands on the CPU. Try -t and -f first, and
only use the -c switch if you have trouble [such
as regular pauses in the audio which indicate
either the sending or receiving CPU is too
slow].) Simple compression is not supported by
RTP and VAT protocols, and hence can be selected
only in Speak Freely protocol.
-celp Compress sound with the United States Department
of Defense Federal Standard 1016 CELP (Code-Ex-
cited Linear Prediction) algorithm. This algo-
rithm is extremely computationally intense on
the compression side (but not to decompress, on
machines with fast floating point hardware), but
provides acceptable voice grade fidelity with a
4800 bit per second data rate. Only one of the
compression modes ADPCM (-f), CELP (-celp), LPC
(-lpc), LPC-10 (-lpc10), and GSM (-t) may be se-
lected at once.
-d Enables debug output from both the local copy of
sfmike and the receiving copy of sfspeaker (un-
less blocked by the -q option on sfspeaker).
-e Prints, on standard output, a ``session key''
based upon a collection of data from the machine
execution environment likely to be unique in the
history of the universe, used as the seed to
generate a 128 bit key. sfmike exits after
printing this value. Send it to the person
you're talking to with a public key package such
as pgp, then use it as the key for one of the
regular encryption modes. The session key is
printed in groups of four letters separated by
dashes so it's easier to read, if you wish, over
a regular telephone (but how do you know no-
body's listening?).
-f Compress sound using the ADPCM (Adaptive Differ-
ential Pulse Code Modulation) algorithm. This
reduces the volume of data transmitted by a fac-
tor of two with much less loss of fidelity than
the simple compression selected by the -c
switch. It may be used in conjunction with the
-c switch to achieve a fourfold compression, al-
beit with substantial degradation of the audio.
Only one of the compression modes ADPCM (-f),
CELP (-celp), LPC (-lpc), LPC-10 (-lpc10), and
GSM (-t) may be selected at once. ADPCM is pro-
vided as an alternative to GSM for users with
computers too slow to perform GSM compression in
real time; ADPCM requires much less computation
than GSM.
-g Automatic gain control is enabled for real time
audio. The recording gain is dynamically adjust-
ed to compensate for the amplitude of the sound
received, using the maximum dynamic range with-
out clipping. If this switch is specified, the
record gain cannot be manually set with the au-
dio control panel. Automatic gain control is
off by default, and may not be supported by some
audio drivers.
-ikey The specified key is used to encrypt sound
transmitted to subsequently named hosts using
the International Data Encryption Algorithm
(IDEA), the same algorithm used by pgp to en-
crypt message bodies with the random session
key. To decrypt sound encoded with the -i
switch, sfspeaker on the receiving machine must
be invoked with an identical -ikey specification
on the command line. The key can be as long as
you like; if it's a phrase of several words, be
sure to enclose it in quotes. The actual 128
bit IDEA key is created by applying the MD5 al-
gorithm to the given key. sfspeaker will con-
tinue to correctly receive unencrypted sound
even if invoked with the -i switch. To disable
IDEA encryption for subsequent hosts, specify
the -i switch with no key. IDEA encryption is
substantially faster and generally considered to
be much more secure than the DES encryption per-
formed by the -k switch. However, IDEA is new-
er, has not been formally adopted by govern-
ments, and is patented, restricting its commer-
cial use.
-kkey The specified key is used to encrypt sound
transmitted to subsequently named hosts using a
slightly modified version of the Data Encryption
Standard algorithm (the initial and final permu-
tations, which do not contribute to the security
of the algorithm and exist purely to deter software
implementations of DES, are not performed).
In order to decrypt sound encoded with the -k
switch, sfspeaker on the receiving machine must
be invoked with an identical -kkey specification
on the command line. The key can be as long as
you like; if it's a phrase of several words, be
sure to enclose it in quotes. The actual DES
key is created by applying the MD5 algorithm to
the given key, then folding the resulting 128
bit digest into 56 bits with XOR and AND. sfs-
peaker will continue to correctly receive unen-
crypted sound even if invoked with the -k
switch. To disable DES encryption for subse-
quent hosts, specify the -k switch with no key.
-l Remote loopback is enabled. Each packet re-
ceived by sfspeaker will be immediately trans-
mitted back to a copy of sfspeaker running on
the originating machine. You can use loopback
to evaluate the quality of transmission over
various kinds of communication links without the
need to have a person at the other end.
-lpc Compress sound with an experimental linear pre-
dictive coding algorithm developed by Ron Fred-
erick of Xerox PARC. This algorithm achieves a
tremendous degree of compression: more than 12
to 1, with relatively good sound quality. If
you select it, be extremely careful not to set
your microphone level too high. Driving the
sound input into clipping causes terrible crack-
ling break-ups in the audio. It's best to ex-
periment with a local machine or echo server to
make sure you have the input level set optimal-
ly. Like the GSM compression selected by the -t
option, this form of compression requires a
great deal of computation: in this case in
floating point. If your computer is too slow or
too busy running other tasks, you may get drop-
outs in the sound. LPC compression does not
provide as good sound quality as GSM, and is
somewhat finicky to set up; it is provided as an
alternative when network bandwidth must be re-
duced to a minimum. Only one of the compression
modes ADPCM (-f), CELP (-celp), LPC (-lpc),
LPC-10 (-lpc10), and GSM (-t) may be selected at
once.
-lpc10[rn]
Compress sound to a data rate of 2400 bits per
second using the United States Department of De-
fense Federal Standard 1015 / NATO-STANAG-4198
algorithm, republished as Federal Information
Processing Standards Publication 137 (FIPS Pub
137). LPC-10 compression (an algorithm com-
pletely different from that selected by the -lpc
option) compresses sound by a factor of more
than 26 to 1 with fidelity, albeit less than
that of GSM (-t) compression, perfectly adequate
for voice-grade communications. LPC-10 compres-
sion requires a great deal of floating point
computation. If your computer is too slow or
too busy running other tasks, you may get drop-
outs in the sound. Only one of the compression
modes ADPCM (-f), CELP (-celp), LPC (-lpc),
LPC-10, or GSM (-t) can be selected at once.
LPC-10 is not a standard compression mode of RTP
or VAT protocol, and hence can be selected only
in Speak Freely protocol.
The extreme compression achieved by the LPC-10
algorithm allows the option of ``robust trans-
mission,'' in which multiple copies of sound
packets are sent, each containing a sequence
number which allows the receiver to discard du-
plicate or out-of-sequence packets. Robust
transmission often allows intelligible conversa-
tion over heavily loaded network links which
would otherwise induce random pauses and gaps in
received sound. To enable robust transmission,
add the suffix rn to the -lpc10 option, where n
is the number of copies of each packet to be
sent, between 1 and 4. If no rn suffix is spec-
ified, no duplicate packets are sent (equivalent
to specifying r1). For example, to send three
copies of each LPC-10 sound packet, specify the
option -lpc10r3. Sending duplicate sound pack-
ets requires more network bandwidth. LPC-10
compression with no duplicate packets can func-
tion on a 4800 bit per second connection to the
Internet; a 9600 bit per second line can accom-
modate two copies of each packet (-lpc10r2),
while a 14,400 bit per second or faster link can
handle three (-lpc10r3) or four (-lpc10r4)
copies. (Four copies of each packet is just
within the capability of a 14,400 bit per second
line, so if the line is being used for other si-
multaneous traffic, you may have to reduce the
number of copies to three.) Sending more than
four copies of each packet does not improve per-
formance and simply wastes bandwidth; packet
replication is therefore limited to four copies.
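The guidance above on how many copies a line can carry can be turned into a small lookup. This is only a sketch built from the figures quoted in the text; lpc10_option and its busy parameter are hypothetical, and backing off to three copies on any shared line is a deliberately conservative reading.

```python
# Link speeds (bits per second) and the number of copies the text says
# each can carry, fastest first.
COPIES_BY_LINK_SPEED = [(14400, 4), (9600, 2), (4800, 1)]


def lpc10_option(link_bps: int, busy: bool = False) -> str:
    """Suggest an -lpc10 option for a link (hypothetical helper)."""
    for speed, copies in COPIES_BY_LINK_SPEED:
        if link_bps >= speed:
            # Four copies only just fit in 14,400 bits per second, so drop
            # to three when the line also carries other traffic.
            if copies == 4 and busy:
                copies = 3
            return "-lpc10" if copies == 1 else f"-lpc10r{copies}"
    raise ValueError("LPC-10 needs at least a 4800 bit per second link")
```

Calling lpc10_option(9600) suggests -lpc10r2, and a busy 14,400 bit per second line is given -lpc10r3 rather than -lpc10r4.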
-m Manual gain control. Allows you to manually set
the input level with your audio control panel.
This is the default mode.
-n Disables compression of sound. The switch per-
mits canceling the effect of a previous -c,
-celp, -f, -lpc, -lpc10, or -t switch when send-
ing multiple sound files with one sfmike com-
mand.
-natcsock,dsock
This option is reserved for sfspeaker when
launching sfmike to contact a site behind a
router or firewall which performs Network Ad-
dress Translation.
-ofilename
The contents of the specified filename are used
as a ``key file'' to encrypt sound data sent to
subsequently named hosts. The file should be at
least 8000 bytes long and contain data with as
little regularity as possible. The ``pgp
+makerandom=length filename'' facility is an ex-
cellent way to create a key file. To decode
sound encrypted with a key file, sfspeaker on
the receiving machine must be invoked with the
-o switch specifying a file identical to that on
the transmitting machine. You can disable key
file encryption by specifying the -o switch with
no filename. Unencrypted sound will still be
played correctly even if the -o switch is speci-
fied on the call to sfspeaker. You can use a
public-key cryptography package such as pgp or
gpg to exchange a key file with another person.
Key file encryption is much faster than any of
the other options but is far, far less secure;
use it only if all of the other forms of encryp-
tion run too slowly on your machine.
-phostname
Adds hostname to the list of hosts to which
sound is sent. The same sound will be sent to
each host you name. If you have a slow network
link, the number of hosts will be limited since,
even with compression, there may not be enough
outbound bandwidth to transmit packets to all
the hosts.
-q Quiet--disables debug output. This is the de-
fault; the switch can be used to cancel the ef-
fect of a prior -d switch. This switch has no
effect on a remote copy of sfspeaker invoked
with the -d switch.
-r Ring. This is used to get the attention of a
user when you're trying to establish a connec-
tion. The speaker output is unmuted and the
playback volume is set to mid-level to guarantee
audibility. Sun workstation users may subse-
quently switch the output back to the head-
phones, if desired, with audiotool. The -r
switch has no effect if remote ring has been
disabled with the -n switch on sfspeaker. If
your audio driver does not permit setting the
recording level, this option will have no ef-
fect.
-robustn Use ``robust transmission mode'' in which n
copies of each audio packet are sent to the des-
tination, each incorporating a serial number
which allows the receiver to discard duplicate
and out of order packets. Robust transmission
increases the number of packets sent and hence
the bandwidth required by a factor of n, but may
permit reliable transmission on connections
which frequently drop and shuffle packets. Ro-
bust transmission works best with protocols that
provide the greatest degree of compression such
as LPC (-lpc), LPC10 (-lpc10), CELP (-celp), and
GSM (-t). Robust transmission may not be used
with VAT or RTP protocols, and is incompatible
with releases of Speak Freely prior to 7.5 for
any compression mode other than -lpc10.
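To show how a serial number lets a receiver discard duplicate and out of order packets, here is a minimal sketch of the receiving side. It is not taken from sfspeaker; it assumes serial numbers only increase and ignores wrap-around, and accept_packets is a hypothetical name.

```python
def accept_packets(packets):
    """Yield payloads in arrival order, dropping duplicate and stale ones."""
    highest_seen = None
    for sequence, payload in packets:
        # Assumption: sequence numbers only increase; wrap-around is ignored.
        if highest_seen is not None and sequence <= highest_seen:
            continue            # duplicate copy or out-of-order packet
        highest_seen = sequence
        yield payload
```

With three copies of each packet in flight, the first copy to arrive is played and the later copies are dropped, so losing one or two copies of a packet causes no gap.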
-rtp Transmit using the Real-Time transport Protocol
(RTP), as defined in Internet RFCs 1889 and
1890. This allows sfmike to send audio to other
Internet voice applications which support a com-
mon subset of RTP. To comply with the RTP stan-
dard, when -rtp is selected only DES (-k) en-
cryption is available and simple (-c), CELP
(-celp), and LPC-10 (-lpc10) compression cannot
be selected. RTP compliant programs do not nec-
essarily implement all compression modes or en-
cryption; consult the documentation for the pro-
gram with which you wish to communicate to see
which options it supports.
-slevel[,timeout]
Squelch output whenever input volume is below
the specified level. The level specification is
an arbitrary number from 1 to 32767 with larger
numbers denoting louder sound. The default
squelch value, if none is given on the -s
switch, is 4096 which works reasonably well un-
less your computer room is very noisy (in which
case you might want to avail yourself of a head-
set with a directional boom microphone).
Squelch interacts poorly with automatic gain
control; if you enable squelch, don't use the -g
switch. Squelch is off by default, equivalent
to a specification of -s0. Enabling squelch al-
lows multiple people to send sound to the same
destination(s) and, as long as only one speaks
at a time, for the result to be intelligible.
In order for this to work the input and squelch
levels must be set so that sound is sent only
when you're talking. Enabling debugging output
with the -d switch can help to determine the
best settings. To avoid breakups due to momentary
pauses in speech, squelch continues to transmit
for a period after the last packet exceeding the
squelch threshold was seen. By default, this
interval is 1.5 seconds. You can
specify the squelch timeout by giving the value
in milliseconds (one second is 1000 millisec-
onds) after the squelch value, separated by a
comma.
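The interaction between the squelch level and the timeout can be sketched as a simple gate. How sfmike actually measures input volume is not stated here, so taking the peak absolute value of 16 bit samples is an assumption, and the Squelch class is hypothetical.

```python
class Squelch:
    """Decide, packet by packet, whether audio should be transmitted."""

    def __init__(self, level=4096, timeout_ms=1500):
        self.level = level
        self.timeout_ms = timeout_ms
        self.last_loud_ms = None   # time of the last packet at or above level

    def should_send(self, samples, now_ms):
        # Assumption: loudness is the peak absolute value of the samples.
        loud = max((abs(s) for s in samples), default=0) >= self.level
        if loud:
            self.last_loud_ms = now_ms
            return True
        if self.last_loud_ms is None:
            return False
        # Keep transmitting for a while after the last loud packet.
        return now_ms - self.last_loud_ms <= self.timeout_ms
```

A level of 0 lets every packet through, which matches the statement that -s0 is equivalent to squelch being off.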
-t Compress sound with the algorithm used by Global
System Mobile (GSM) digital cellular telephones.
This is the default mode. GSM compression re-
duces the network bandwidth requirement by a
factor of five: 1650 bytes per second compared
to the uncompressed rate of 8000 bytes per sec-
ond. This allows Speak Freely to be used on
network links as slow as 19,200 bits per second.
GSM compression is lossy, but given the limita-
tions of 8000 samples per second audio, there is
little perceived loss of fidelity. GSM compres-
sion and decompression are extremely computa-
tionally intense. If the CPU on either end is
not fast enough, regular pauses will be heard in
the audio stream. If you're running on a ma-
chine with other CPU-intensive tasks, you may
encounter random pauses when other tasks use
enough CPU resources so compression and/or de-
compression can't be done in real time. If this
occurs, you can try the ADPCM (-f) or Simple
(-c) compression options described above; they
provide less compression and poorer quality, but
consume much less CPU time.
If you need to reduce the bandwidth further, you
can specify both the -c and -t switches. This
simultaneously hogs the CPU and compromises
sound quality, but the data rate to transmit re-
al time audio is reduced to 955 bytes per sec-
ond. Only one of the compression modes ADPCM
(-f), CELP (-celp), LPC (-lpc), LPC-10 (-lpc10),
and GSM (-t) may be selected at once.
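The data rates quoted on this page can be compared directly. The short calculation below uses only figures given in the text (8000 bytes per second uncompressed, 1650 for GSM, 955 for GSM with simple compression, 4800 bits per second for CELP, and 2400 bits per second for LPC-10); the names are illustrative.

```python
UNCOMPRESSED = 8000  # bytes per second for 8000 samples per second audio

# Data rates in bytes per second quoted in this manual page.
RATES = {
    "GSM (-t)": 1650,
    "GSM + simple (-t -c)": 955,
    "CELP (-celp)": 4800 // 8,     # 4800 bits per second
    "LPC-10 (-lpc10)": 2400 // 8,  # 2400 bits per second
}


def compression_ratio(bytes_per_second: int) -> float:
    """Ratio of the uncompressed rate to a compressed rate."""
    return UNCOMPRESSED / bytes_per_second


for mode, rate in RATES.items():
    print(f"{mode:22} {rate:5} bytes/s  {compression_ratio(rate):5.1f} to 1")
```

GSM at 1650 bytes per second is 13,200 bits per second of audio data, which is why a 19,200 bit per second link is quoted as the practical minimum.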
-td Releases of Speak Freely for Unix prior to ver-
sion 6.1e (released in September 1998) contained
a bug which caused GSM compression to be sensi-
tive to the byte order (``endianism'') of the
machine running sfmike and sfspeaker. This er-
ror, which only affected ``little-endian'' ma-
chines such as Intel processors, is corrected in
current releases. If you absolutely must commu-
nicate with a Unix user on a little-endian plat-
form running a version prior to 6.1e, specifying
the -td option on sfmike will force it to send
the old, incorrect byte order. A much better
alternative is to encourage the user to install
a current release in which the problem has
been corrected.
-u Prints how-to-call information.
-vat Transmit using a protocol compatible with the
Lawrence Berkeley Laboratory's original Visual
Audio Tool (VAT). This allows sfmike to send
audio to other Internet voice applications com-
patible with most releases of VAT. (Starting
with version 4, VAT supports the Internet Real
Time transport Protocol (RTP) as well as the
original VAT protocol. Since RTP provides much
better session control and interoperability with
other applications, you should use the -rtp op-
tion instead of -vat unless you absolutely have
to communicate with programs which support only
the old VAT protocol.) To be compatible with
VAT, when -vat is selected only DES (-k) encryption
is available and simple (-c), CELP
(-celp), and LPC-10 (-lpc10) compression cannot
be selected. Some nominally ``VAT compatible''
applications get bedeviled by the details when
you select infrequently used compression modes
such as LPC and combine them with encryption.
If at all possible, use -rtp mode to communicate
with other Internet voice programs.
-wdumpfile
Real-time audio (but not sound files you send)
is dumped into the designated dumpfile. The
contents of the dumpfile are the raw bytes
sfmike read from the audio input device, without
any header, control information, or compression.
This option is handy when you're having trouble
getting an audio input device to provide data in
the format expected by Speak Freely. If audio
input is working normally, the dumpfile will
grow at the rate of 8000 bytes per second as you
transmit; be sure to place the dumpfile on a
file system with adequate space and/or limit the
amount of audio you dump to a short passage
suitable for debugging audio input settings.
-yindev[:ctldev]
This option allows you to override the defaults
for the name of the audio input device file (for
example /dev/audio) and, optionally, the audio
control device file, specified after the input
device, separated by a colon. If the first
character of either the input or control device
specification is a sharp sign, ``#'', the bal-
ance is taken as an integer giving the number of
an already-open file descriptor in a parent pro-
cess which is launching sfmike. This facility
(or, if you like, gimmick) allows programs such
as sflaunch to evade the restriction in some au-
dio drivers which support full-duplex but don't
permit two programs to simultaneously open the
audio device files. This option is not avail-
able on Silicon Graphics or other platforms
which do not use device files for audio I/O.
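The argument format of this option (an input device, optionally followed by a colon and a control device, either of which may be a sharp sign and a file descriptor number) can be sketched as a parser. The function names are hypothetical and the sketch performs no check that a descriptor is actually open.

```python
from typing import Optional, Tuple, Union

Device = Union[str, int]   # a device path, or an inherited file descriptor


def parse_device(text: str) -> Device:
    """Interpret one device field: '#n' is descriptor n, else a path."""
    if text.startswith("#"):
        return int(text[1:])
    return text


def parse_y_option(arg: str) -> Tuple[Device, Optional[Device]]:
    """Split indev[:ctldev] into (input device, control device or None)."""
    indev, sep, ctldev = arg.partition(":")
    return parse_device(indev), (parse_device(ctldev) if sep else None)
```

A launcher that has already opened both devices might pass an argument such as #5:#6, which this sketch reads as descriptors 5 and 6.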
-zuser_list
If pgp or gpg is installed on your machine, you
can specify one or more users in your public
keyring (if you name more than one user, be sure to
enclose the user list in quotes). A 128 bit
random session key is generated and pgp or gpg
is invoked to encrypt it with the public keys of
the named users. The encrypted session key is
transmitted to subsequently named hosts and then
used to IDEA encrypt sound sent to them. This
avoids the separate step of generating and ex-
changing a session key described above for the
-e option. Since the actual public key encryp-
tion is performed by pgp or gpg you can enjoy
the convenience of public key exchange of ses-
sion keys for audio as well.
FILES
On most Unix machines audio is read from the /dev/audio
device file. The device will be busy for input whenever
sfmike is running. On Silicon Graphics machines the digi-
tal media development toolkit is used to access the audio
hardware.
BUGS
No warning is given if the destination machine is not run-
ning sfspeaker; sound just disappears.
In order to deliver acceptable (or at least tolerable)
performance across international links, sfmike and sfs-
peaker use ``Internet datagram'' socket protocol which is
essentially a ``fire and forget'' mechanism; neither flow
control nor acknowledgement is provided. Since sound
must be delivered at the correct time in order to be in-
telligible, in real time transmission there's little one
can do anyway if data are lost. Consequently, bogged down
lines, transmission errors, etc., simply degrade or de-
stroy the quality of the audio without providing explicit
warnings at either end that anything's amiss. In addi-
tion, the lack of an end-to-end handshake deprives sfmike
of backpressure information to control the rate at which
it dispatches packets when transmitting a sound file. I
fake flow control by calculating the time it will take to
play each packet and then pause that number of microsec-
onds after sending it. This is, of course, utterly be-
neath contempt, but it actually works quite nicely (at
least as long as your machine isn't busy). If you're mo-
tivated to replace all this datagram stuff with nice,
clean RPC calls, don't bother. That's how I built the
initial version of Speak Freely, and although it ran OK on
an Ethernet, it was a disaster on long distance connec-
tions.
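The pacing trick described above, pausing after each packet for as long as it takes to play, can be sketched in a few lines. The sketch assumes each packet holds raw audio at 8000 bytes per second, so its length gives its play time; the function names and the injectable send and sleep callables are hypothetical.

```python
import time

BYTES_PER_SECOND = 8000   # uncompressed rate quoted elsewhere on this page


def play_time_us(sample_bytes: int) -> int:
    """Microseconds needed to play a packet holding this much raw audio."""
    return sample_bytes * 1_000_000 // BYTES_PER_SECOND


def send_paced(packets, send, sleep=time.sleep):
    """Send each packet, then pause for as long as it takes to play."""
    for packet in packets:
        send(packet)
        sleep(play_time_us(len(packet)) / 1_000_000)
```

Since nothing compensates for the time spent sending or for scheduling delays, the sender drifts slowly behind real time on a busy machine, which is consistent with the caveat in the text.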
AES, Blowfish, IDEA, DES, and key file options encrypt ev-
ery sound packet with the same key--no key chaining is
performed. (AES, Blowfish, DES and IDEA encryption do,
however, use cipher block chaining within each packet.)
Chaining from packet to packet would increase security but
then loss of any packet would make it impossible to de-
crypt all that followed.
Certain governments attempt to restrict the availability,
use, and exportation of software with cryptographic capa-
bilities. Speak Freely was developed in Switzerland,
which has no such restrictions. The AES, DES, MD5, Blow-
fish, and IDEA packages it uses were obtained from an In-
ternet site in another European country which has no re-
strictions on cryptographic software. If you import this
software into a country with restrictions on cryptographic
software, be sure to comply with whatever restrictions ap-
ply. The responsibility to obey the law in your jurisdic-
tion is entirely your own.
Intelligible speech requires both sufficient bandwidth to
deliver the audio data and a consistent delivery time for
packets. Even if your link is theoretically fast enough,
congestion on it or on other intermediate links may cause
drop-outs. Compressing the data with the -f, -t, -lpc,
-lpc10, -celp, and/or -c switches reduces the bandwidth
required by a factor of two to twenty-six and can of-
ten alleviate this problem, and the ``robust transmis-
sion'' option of LPC-10 compression may improve intelligi-
bility when communicating across heavily loaded lines.
Even so, if file transfers or other bulk traffic are un-
derway, you'll probably be disappointed.
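As a rough worked example of what those factors mean, the sketch below compares the compressed data rate with a link speed. It assumes an uncompressed rate of 8000 bytes per second (8000 one-byte samples per second) and ignores packet header overhead; both the figure and the function names are assumptions for illustration only.

```python
UNCOMPRESSED_BYTES_PER_SEC = 8000  # assumed: 8000 one-byte samples per second


def compressed_bits_per_sec(factor, base=UNCOMPRESSED_BYTES_PER_SEC):
    """Audio data rate in bits per second after compressing by `factor`."""
    return base * 8 / factor


def fits(link_bits_per_sec, factor):
    """True if the compressed audio alone fits within the link speed."""
    return compressed_bits_per_sec(factor) <= link_bits_per_sec
```

Under these assumptions a factor of two leaves 32000 bits per second, too much for a 28800 bit per second link, while a factor of twenty-six leaves about 2460 bits per second. Fitting on paper does not guarantee consistent delivery times.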
By default sfmike transmits on Internet port number 2074.
It is conceivable, albeit unlikely, that this might con-
flict with some other locally-developed network server.
You can specify a different port by appending it to the
destination host, separated by a colon, but of course you
need to ensure the remote copy of sfspeaker is listening
on that port. When communicating with other applications
using VAT or RTP protocols, you must specify the port on
which the other application is listening. RFC 1890 recom-
mends port 5004 as the default port for RTP applications.
Many VAT protocol applications default to port 3456.
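The host:port convention can be sketched as below. This is an illustrative Python fragment, not the parser sfmike uses; the name split_destination is hypothetical, and the sketch handles only host names and IPv4 addresses, not the multicast scope suffix.

```python
DEFAULT_PORT = 2074  # default Speak Freely port, as stated above


def split_destination(dest, default_port=DEFAULT_PORT):
    """Split 'hostname[:port]' into (hostname, port)."""
    host, sep, port_text = dest.rpartition(":")
    if not sep:
        # No colon present: the whole string is the host name.
        return dest, default_port
    port = int(port_text)  # raises ValueError if the port is not a number
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return host, port
```

For instance, split_destination("192.168.67.89:5004") selects the port RFC 1890 recommends for RTP, while a bare host name falls back to 2074.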
There are way too many command line options. Options
should be consolidated wherever possible and changed to
keywords which can be abbreviated to the shortest unique
prefix.
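Matching a keyword by its shortest unique prefix could look something like the sketch below. The keyword names used in the usage note are made up for illustration and are not options sfmike accepts; resolve_keyword is likewise a hypothetical name.

```python
def resolve_keyword(prefix, keywords):
    """Return the single keyword that `prefix` abbreviates.

    An exact match always wins; otherwise the prefix must match exactly
    one keyword, and unknown or ambiguous prefixes raise ValueError.
    """
    if prefix in keywords:
        return prefix
    matches = [k for k in keywords if k.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown keyword: " + prefix)
    raise ValueError("ambiguous keyword %s: could be %s"
                     % (prefix, ", ".join(sorted(matches))))
```

With the made-up keywords "robust", "rtp" and "ring", the prefix "ro" resolves to "robust", but "r" alone is rejected as ambiguous.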
ACKNOWLEDGEMENTS
The Silicon Graphics audio drivers are based on the stand-
alone SGI version developed by Paul Schurman of Espoo,
Finland. Without his generous contribution, Speak Freely
would have probably remained forever confined in an orbit
around the Sun.
Andrey A. Chernov contributed code that enables Speak
Freely to build and run on FreeBSD.
Hans Werner Strube contributed code to allow the program
to build under Solaris 2.4 without any source changes or
need for compatibility modes.
The GSM compression and decompression code was developed
by Jutta Degener and Carsten Bormann of the Communications
and Operating Systems Research Group, Technische Univer-
sitaet Berlin: Fax: +49.30.31425156, Phone:
+49.30.31424315. They note that THERE IS ABSOLUTELY NO
WARRANTY FOR THIS SOFTWARE. Please see the readme and
copyright files in the gsm directory for further details.
The ADPCM compression and decompression code was developed
by Jack Jansen of the Centre for Mathematics and Computer
Science, Amsterdam, The Netherlands. Please see the
readme and copyright files in the adpcm directory for fur-
ther details.
The Federal Standard 1016 -celp code-excited linear pre-
diction algorithm and software were developed by Joseph P.
Campbell Jr., Vanoy C. Welch and Thomas E. Tremain of the
U.S. Department of Defense. Craig F. Reese of the IDA/Su-
percomputing Research Center adapted the original imple-
mentation for use on general-purpose computers.
The -lpc linear predictive coding compression algorithm
was developed by Ron Frederick of Xerox PARC.
The public domain implementation of the U.S. Federal Stan-
dard 1015 -lpc10 compression algorithm was developed by the
United States Department of Defense, National Security
Agency (NSA). Please see the README and FAQ files in the
lpc10 directory for additional details.
The DES encryption code was developed by Phil Karn, KA9Q.
Please see the readme file in the des directory for fur-
ther details.
The public domain implementation of the Advanced Encryp-
tion Standard (AES) was developed by Brian Gladman. For
details, please visit his Web page:
http://fp.gladman.plus.com/cryptography_technology/rijndael/
and see the README file in the aes directory.
The Blowfish encryption module and the DES encryption li-
brary used for encrypting and decrypting VAT and RTP pro-
tocol packets were developed by Eric Young. Please see
the README and COPYRIGHT files in the blowfish and libdes
directories for further details. The Blowfish algorithm was
invented by Bruce Schneier and is in the public domain.
The IDEA algorithm was developed by Xuejia Lai and James
L. Massey, of ETH Zurich. The implementation used in
Speak Freely was modified and derived from original C code
developed by Xuejia Lai and optimised for speed by Colin
Plumb. The IDEA[tm] block cipher is patented by Ascom-Tech
AG. The Swiss patent number is PCT/CH91/00117, the Euro-
pean patent number is EP 0 482 154 B1, and the U.S. patent
number is US005214703. IDEA[tm] is a trademark of Ascom-
Tech AG. There is no license fee required for noncommer-
cial use. Commercial users may obtain licensing details
from MediaCrypt AG at IDEA@mediacrypt.com. You can use
IDEA encryption for noncommercial communications without a
license from MediaCrypt AG; commercial use is prohibited
without a license. If you don't want to obtain a license
from Ascom-Tech, use AES, Blowfish, DES, or key file en-
cryption instead.
The implementation of the MD5 message-digest algorithm is
based on a public domain version written by Colin Plumb in
1993. The algorithm is due to Ron Rivest and is described
in Internet RFC 1321.
SEE ALSO
audio(4), audiopanel(1), audiotool(1), gpg(1), kill(1),
pgp(1), sflaunch(1), sflwl(1), sfspeaker(1), sfvod(1),
soundeditor(1), soundfiler(1), talk(1), toast(1), xmmix(1)
by John Walker
March 18, 2003