Speak Freely for Unix

by John Walker

SFMIKE(1)                                               SFMIKE(1)


NAME
       sfmike - Speak Freely sound transmission utility

SYNOPSIS
       sfmike [ -abcdefglmnqrtu ] [ -lpc[10[rn]] ] [
       -slevel[,timeout] ] [ -baAESKey ] [ -bfBlowfishKey ] [
	    -bxAESHexKey ] [ -celp ] [ -iIDEAkey ] [ -kDESkey ] [
	    -natcsock,dsock ] [ -okeyfile ] [ -robustn ] [ -rtp ]
	    [ -t ] [ -vat ] [ -wdumpfile ] [ -yindev[:ctldev] ] [
	    -zUser_list ] hostname[:port] [ -phostname[:port] ] [
	    soundfile ...  ]

DESCRIPTION
       Speak  Freely  allows users of a variety of Unix and Unix-
       like systems equipped with audio hardware connected  by	a
       network	to converse, using the audio input and output fa-
       cilities of the machine to digitise and later  reconstruct
       the  sound  and the network to relay sound packets.  Audio
       files in Sun .au format or .gsm files pre-compressed  with
       toast  may be transmitted and played on remote machines as
       well.  Optional compression is provided, allowing  conver-
       sations	over  relatively  low-bandwidth Internet links as
       well as local area networks.  Speak Freely consists of two
       programs, sfmike and sfspeaker.

       You can send audio to machine hostname running the
       sfspeaker program with:

	   sfmike hostname

       which sends real time audio, or:

	   sfmike hostname soundfile

       where soundfile is one or more files of prerecorded  sound
       in  Sun	(.au) format or GSM compressed (.gsm) sound files
       created by toast.  The hostname can be either a	local  or
       Internet	 host name (like stinky.dwarves.org) or a numeric
       IP address (for example 192.168.67.89).	If  your  network
       supports	 IP Multicasting, you can transmit to a multicast
       group simply by giving its name or IP address.  The  scope
       (time-to-live) of the multicast can be specified as a num-
       ber between 0 (restricted to the same host) and 255 (unre-
       stricted) at the end of the group name or IP address, sep-
       arated by a slash, for example 231.111.75.122/128; the de-
       fault multicast scope is 1: restricted to the same subnet.
       If the host you're transmitting to uses a  different  port
       number than the default configured in the Makefile, speci-
       fy the port number after the host name or IP address, sep-
       arated  by  a  colon, for example bink.bilgepump.com:5050.
       If both a port number and multicast scope  are  specified,
       the  port number should come first: 227.31.89.117:4851/64.
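
       The destination syntax above can be sketched as a small
       parser.  This is an illustration only, not the parser sfmike
       uses: the names Destination and parse_destination are
       hypothetical, a port of None stands for the default configured
       in the Makefile, and only host names and dotted IPv4 addresses
       are handled.

```python
from typing import NamedTuple, Optional


class Destination(NamedTuple):
    host: str
    port: Optional[int]  # None means "use the compiled-in default port"
    ttl: int             # multicast scope; 1 is the documented default


def parse_destination(spec: str) -> Destination:
    """Split host[:port][/ttl] into its parts (hypothetical helper)."""
    ttl = 1
    if "/" in spec:
        # The scope, if present, always comes last.
        spec, ttl_text = spec.rsplit("/", 1)
        ttl = int(ttl_text)
        if not 0 <= ttl <= 255:
            raise ValueError("multicast scope must be between 0 and 255")
    port = None
    if ":" in spec:
        # The port, if present, follows the host and precedes the scope.
        spec, port_text = spec.rsplit(":", 1)
        port = int(port_text)
    return Destination(spec, port, ttl)


print(parse_destination("227.31.89.117:4851/64"))
print(parse_destination("bink.bilgepump.com:5050"))
```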

       Users with dial-up Internet  connections	 which	assign	a
       different  host	name  and IP address for each session can
       publish their current address on a  Look	 Who's	Listening
       server.	 Others	 can then use the sflwl lookup program to
       see, based on an individual's invariant E-mail address  or
       name,  whether  they're	connected and if so with what ad-
       dress.  A Look Who's Listening server is currently  avail-
       able at the site lwl.fourmilab.ch.

       To  protect against eavesdropping, sfmike provides a vari-
       ety of encryption  algorithms,  including  AES,	Blowfish,
       IDEA,  DES, and a key read from a file.	Any number of en-
       cryption algorithms may be used	simultaneously	(assuming
       your machine is fast enough).  If pgp or gpg is installed,
       sfmike can invoke it automatically to securely transfer	a
       randomly-generated  session key to the party you're commu-
       nicating with.

       If soundfile is a single period, real time audio from  the
       microphone jack is selected.  This permits you to send one
       or more sound files, then switch to live audio  all  in	a
       single command.

       sfmike  is normally used in conjunction with an audio con-
       trol panel to set audio record and  playback  levels.   An
       excellent  X  Window audio control tool which runs on most
       Unix platforms including Linux and FreeBSD is  xmmix;  for
       more information, visit the Web site
       http://metalab.unc.edu/tkan/xmmix/.  Most Unix workstations
       with audio hardware come with a proprietary audio control
       panel; consult the manufacturer's documentation for details.

       In interactive (push-to-talk) mode, you can send text chat
       messages	 to users to whom you're transmitting by pressing
       the period (``.'') key, then  entering  a  line	of  text.
       This can be useful when you're setting up a connection and
       trying various compression modes to establish  a	 reliable
       audio link.

       It's  perfectly valid to send audio to a copy of sfspeaker
       running on the same machine as sfmike.  In  fact,  it's	a
       very  handy way to experiment, as long as your audio hard-
       ware permits full-duplex operation.

OPTIONS
       Options are processed left to right and	sound  files  are
       sent  with the modes specified by options to their left on
       the command line.

       -a	 Selects ``always transmit'' mode.   Unless  sup-
		 pressed  by  squelch  (see  the -s switch below)
		 sfmike transmits live audio continuously.   It's
		 usually  better  to use the default push-to-talk
		 mode.	This mode is completely	 non-interactive;
		 you'll	 need  to use Control C or a kill command
		 to terminate sfmike,  and  you	 can't	use  text
		 chat.	 The  -a option is primarily intended for
		 automated broadcast or audio-on-demand	 applica-
		 tions such as sfvod.  To run full-duplex and re-
		 tain the ability to pause transmission, use text
		 chat, and exit the program normally, use the de-
		 fault push-to-talk mode, and  simply  leave  the
		 program continuously in talk mode.

       -b	 Selects push-to-talk (button) mode.  This is the
		 default.  Output is initially off and the legend
		 ``Pause:''  appears.  Pressing the space bar (or
		 any key other than those which exit the  program
		 or  enter  text chat) toggles back and forth be-
		 tween ``Pause:'' and ``Talk:'' modes.	 In  Talk
		 mode  sound packets are sent to the destination,
		 while in Pause mode they are  discarded.   Push-
		 to-talk  mode	reduces load on the network since
		 no  packets  are  sent	 unless	 you're	 talking.
		 Push-to-talk  makes  conference calls practical,
		 since only the person who ``has the  floor''  is
		 transmitting  to  the	group.	 To  exit sfmike,
		 press Escape, ``q'', Control C,  or  Control  D.
		 Pressing the period (``.'') key pauses audio (if
		 in Talk mode) and prompts  you	 with  ``Chat:''.
		 You  can then enter a line of text which will be
		 sent to every destination you're transmitting to
		 and  printed  on  standard  output  by sfspeaker
		 there, tagged with your identity.  This  can  be
		 handy when you're trying to choose the best com-
		 pression mode and having trouble  getting  audio
		 through.   After  entering  a line of chat text,
		 transmission remains paused; you can enter addi-
		 tional	 lines	of  chat  text, if you wish, each
		 prefixed by a period, or resume audio	transmis-
		 sion by pressing any other key.
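
		 The key handling described for push-to-talk mode
		 can be modelled as a small state machine.  This is
		 a sketch of the documented behaviour, not sfmike's
		 own code; the class name PushToTalk and the
		 ``Exit'' label it returns are hypothetical.

```python
ESCAPE, CTRL_C, CTRL_D = "\x1b", "\x03", "\x04"
EXIT_KEYS = {ESCAPE, "q", CTRL_C, CTRL_D}


class PushToTalk:
    """Tracks Pause/Talk state as keys arrive (illustrative model only)."""

    def __init__(self):
        self.talking = False   # output is initially off ("Pause:")
        self.running = True

    def key(self, ch: str) -> str:
        """Process one keypress and return the prompt now shown."""
        if ch in EXIT_KEYS:
            self.running = False
            return "Exit"            # hypothetical label, not an sfmike prompt
        if ch == ".":
            self.talking = False     # entering chat pauses audio
            return "Chat:"
        # Any other key toggles between Pause and Talk.
        self.talking = not self.talking
        return "Talk:" if self.talking else "Pause:"


ptt = PushToTalk()
for ch in [" ", ".", "x", " ", "q"]:
    print(repr(ch), ptt.key(ch))
```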

       -bakey	 The  specified	 key  is  used	to  encrypt sound
		 transmitted to subsequently  named  hosts  using
		 the  FIPS-197 Advanced Encryption Standard (AES)
		 algorithm.  To decrypt sound  encoded	with  the
		 -ba  switch,  sfspeaker on the receiving machine
		 must be invoked with an identical -bakey  speci-
		 fication on the command line.	The key can be as
		 long as you like; if it's a  phrase  of  several
		 words, be sure to enclose it in quotes.  The ac-
		 tual 128 bit AES key is created by applying  the
		 MD5 algorithm to the given key.  You can specify
		 a 256 bit key by supplying two key phrases after
		 the   -ba  option,  separated	by  a  plus  sign
		 (``+'').  sfspeaker will continue  to	correctly
		 receive  unencrypted  sound even if invoked with
		 the -ba switch.  To disable AES  encryption  for
		 subsequent hosts, specify the -ba switch with no
		 key.  AES encryption is fast and far more secure
		 than  the  DES	 encryption  performed	by the -k
		 switch.  It has been adopted by the U.S. govern-
		 ment  as  the	successor  to  DES and is free of
		 patent restrictions.  The -bx option,	described
		 below,	 permits  you to specify AES keys of 128,
		 192, or 256 bits in hexadecimal form.
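
		 The key derivation can be sketched as follows.
		 The function name aes_key_from_phrase is
		 hypothetical, and two details are assumptions
		 rather than documented behaviour: that the phrase
		 is hashed as UTF-8 bytes, and that a 256 bit key
		 is the MD5 digests of the two phrases concatenated
		 in order.

```python
import hashlib


def aes_key_from_phrase(spec: str) -> bytes:
    """Derive a 128 or 256 bit key from 'phrase' or 'phrase1+phrase2'."""
    phrases = spec.split("+", 1)
    # Assumption: each phrase is hashed separately and the 16-byte
    # digests are joined in the order given.
    return b"".join(hashlib.md5(p.encode("utf-8")).digest() for p in phrases)


print(len(aes_key_from_phrase("my secret phrase")) * 8)        # 128
print(len(aes_key_from_phrase("first phrase+second one")) * 8)  # 256
```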

       -bfkey	 The specified	key  is	 used  to  encrypt  sound
		 transmitted  to  subsequently	named hosts using
		 the Blowfish algorithm.  To decrypt sound encod-
		 ed with the -bf switch, sfspeaker on the receiv-
		 ing machine must be invoked  with  an	identical
		 -bfkey	 specification	on the command line.  The
		 key can be as long as you like; if it's a phrase
		 of  several  words,  be  sure	to  enclose it in
		 quotes.  The actual 128 bit Blowfish key is cre-
		 ated  by applying the MD5 algorithm to the given
		 key.  sfspeaker will continue to  correctly  re-
		 ceive unencrypted sound even if invoked with the
		 -bf switch.  To disable Blowfish encryption  for
		 subsequent hosts, specify the -bf switch with no
		 key.  Blowfish encryption is extremely fast  and
		 generally  considered to be far more secure than
		 the DES encryption performed by the  -k  switch.
		 However,  Blowfish is newer and has not been en-
		 dorsed by governmental bodies or  standards  or-
		 ganisations.	It is free of patent restrictions
		 and may be used by anybody in any manner without
		 a license.

       -bxhexkey The  specified	 hexkey	 is used to encrypt sound
		 transmitted to subsequently  named  hosts  using
		 the  FIPS-197 Advanced Encryption Standard (AES)
		 algorithm.  To decrypt sound  encoded	with  the
		 -bx  switch,  sfspeaker on the receiving machine
		 must be  invoked  with	 an  identical	-bxhexkey
		 specification	on  the command line.  The key is
		 specified in hexadecimal, and	must  consist  of
		 the  digits  from  0  to  9  and  letters from A
		 through F (upper or lower case).  The length of
		 the key is determined by the number of key
		 digits given: 128 bits for 32 or  fewer  digits,
		 192  bits for 33 through 48 digits, and 256 bits
		 for 49 through 64 digits.  The key  is	 used  as
		 given;	 no  hashing  or  transformation  is per-
		 formed.  If fewer digits than the key length are
		 specified,  they are left justified and unspeci-
		 fied digits are set  to  zero.	  sfspeaker  will
		 continue  to correctly receive unencrypted sound
		 even if invoked with the -bx switch.  To disable
		 AES encryption for subsequent hosts, specify the
		 -bx switch with no key.  AES encryption is  fast
		 and far more secure than the DES encryption per-
		 formed by the -k switch.  It has been adopted by
		 the  U.S. government as the successor to DES and
		 is free of patent restrictions.
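
		 The length and padding rules can be expressed as a
		 short sketch.  The function name aes_key_from_hex
		 is hypothetical; the empty key, which disables
		 encryption on the command line, is simply rejected
		 here.

```python
def aes_key_from_hex(hexkey: str) -> bytes:
    """Turn 1 to 64 hex digits into a 128, 192, or 256 bit AES key."""
    digits = len(hexkey)
    if digits == 0 or digits > 64:
        raise ValueError("expected 1 to 64 hexadecimal digits")
    if digits <= 32:
        key_bits = 128
    elif digits <= 48:
        key_bits = 192
    else:
        key_bits = 256
    # Left-justify the given digits and fill the remainder with zeros.
    padded = hexkey.ljust(key_bits // 4, "0")
    return bytes.fromhex(padded)  # raises ValueError on non-hex characters


print(aes_key_from_hex("AB").hex())
print(len(aes_key_from_hex("0" * 33)) * 8)
```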

       -c	 Simple sound compression.  (Note: The -t switch,
		 documented  below, uses a far more sophisticated
		 form of compression which reduces network  band-
		 width by a factor of five while delivering sound
		 quality almost indistinguishable from the origi-
		 nal.	The  -c	 form  of compression is retained
		 primarily for compatibility  with  earlier  ver-
		 sions	of sfspeaker which did not support the -t
		 switch, and for machines too slow to perform  -t
		 switch	 compression in real time.  The -f switch
		 enables ADPCM	compression  which  achieves  the
		 same  reduction  in  data  rate as the -c switch
		 with much less loss of fidelity and only  modest
		 demands  on  the  CPU.	 Try -t and -f first, and
		 only use the -c switch if you have trouble [such
		 as  regular  pauses  in the audio which indicate
		 either the  sending  or  receiving  CPU  is  too
		 slow].)   Simple compression is not supported by
		 RTP and VAT protocols, and hence can be selected
		 only in Speak Freely protocol.

       -celp	 Compress sound with the United States Department
		 of Defense Federal Standard 1016 CELP	(Code-Ex-
		 cited	Linear Prediction) algorithm.  This algo-
		 rithm is extremely  computationally  intense  on
		 the  compression side (but not to decompress, on
		 machines with fast floating point hardware), but
		 provides  acceptable voice grade fidelity with a
		 4800 bit per second data rate.	 Only one of  the
		 compression  modes ADPCM (-f), CELP (-celp), LPC
		 (-lpc), LPC-10 (-lpc10), and GSM (-t) may be se-
		 lected at once.

       -d	 Enables debug output from both the local copy of
		 sfmike and the receiving copy of sfspeaker  (un-
		 less blocked by the -q option on sfspeaker).

       -e	 Prints,  on  standard	output, a ``session key''
		 based upon a collection of data from the machine
		 execution environment likely to be unique in the
		 history of the universe, used	as  the	 seed  to
		 generate  a  128  bit	key.   sfmike exits after
		 printing this value.	Send  it  to  the  person
		 you're talking to with a public key package such
		 as pgp, then use it as the key for  one  of  the
		 regular  encryption  modes.   The session key is
		 printed in groups of four letters  separated  by
		 dashes so it's easier to read, if you wish, over
		 a regular telephone (but how  do  you	know  no-
		 body's listening?).

       -f	 Compress sound using the ADPCM (Adaptive Differ-
		 ential Pulse Code Modulation)	algorithm.   This
		 reduces the volume of data transmitted by a fac-
		 tor of two with much less loss of fidelity  than
		 the   simple  compression  selected  by  the  -c
		 switch.  It may be used in conjunction with  the
		 -c switch to achieve a fourfold compression, al-
		 beit with substantial degradation of the  audio.
		 Only  one  of	the compression modes ADPCM (-f),
		 CELP (-celp), LPC (-lpc), LPC-10  (-lpc10),  and
		 GSM (-t) may be selected at once.  ADPCM is pro-
		 vided as an alternative to GSM	 for  users  with
		 computers too slow to perform GSM compression in
		 real time; ADPCM requires much less  computation
		 than GSM.

       -g	 Automatic  gain control is enabled for real time
		 audio. The recording gain is dynamically adjust-
		 ed  to compensate for the amplitude of the sound
		 received, using the maximum dynamic range  with-
		 out  clipping.	 If this switch is specified, the
		 record gain cannot be manually set with the  au-
		 dio  control  panel.	Automatic gain control is
		 off by default, and may not be supported by some
		 audio drivers.

       -ikey	 The  specified	 key  is  used	to  encrypt sound
		 transmitted to subsequently  named  hosts  using
		 the   International  Data  Encryption	Algorithm
		 (IDEA), the same algorithm used by  pgp  to  en-
		 crypt	message	 bodies	 with  the random session
		 key.  To  decrypt  sound  encoded  with  the  -i
		 switch,  sfspeaker on the receiving machine must
		 be invoked with an identical -ikey specification
		 on  the command line.	The key can be as long as
		 you like; if it's a phrase of several words,  be
		 sure  to  enclose  it in quotes.  The actual 128
		 bit IDEA key is created by applying the MD5  al-
		 gorithm  to  the given key.  sfspeaker will con-
		 tinue to  correctly  receive  unencrypted  sound
		 even  if invoked with the -i switch.  To disable
		 IDEA encryption for  subsequent  hosts,  specify
		 the  -i  switch with no key.  IDEA encryption is
		 substantially faster and generally considered to
		 be much more secure than the DES encryption per-
		 formed by the -k switch.  However, IDEA is  new-
		 er,  has  not	been  formally adopted by govern-
		 ments, and is patented, restricting its  commer-
		 cial use.

       -kkey	 The  specified	 key  is  used	to  encrypt sound
		 transmitted to subsequently named hosts using	a
		 slightly modified version of the Data Encryption
		 Standard algorithm (the initial and final permu-
		 tations, which do not contribute to the security
		 of the algorithm and exist purely to deter software
		 implementations of DES, are not performed).
		 In order to decrypt sound encoded  with  the  -k
		 switch,  sfspeaker on the receiving machine must
		 be invoked with an identical -kkey specification
		 on  the command line.	The key can be as long as
		 you like; if it's a phrase of several words,  be
		 sure  to  enclose  it in quotes.  The actual DES
		 key is created by applying the MD5 algorithm  to
		 the  given  key,  then folding the resulting 128
		 bit digest into 56 bits with XOR and AND.   sfs-
		 peaker	 will continue to correctly receive unen-
		 crypted  sound	 even  if  invoked  with  the  -k
		 switch.   To  disable	DES encryption for subse-
		 quent hosts, specify the -k switch with no  key.

       -l	 Remote	 loopback  is  enabled.	  Each packet re-
		 ceived by sfspeaker will be  immediately  trans-
		 mitted	 back  to  a copy of sfspeaker running on
		 the originating machine.  You can  use	 loopback
		 to  evaluate  the  quality  of transmission over
		 various kinds of communication links without the
		 need to have a person at the other end.

       -lpc	 Compress  sound with an experimental linear pre-
		 dictive coding algorithm developed by Ron  Fred-
		 erick	of Xerox PARC.	This algorithm achieves a
		 tremendous degree of compression: more	 than  12
		 to  1,	 with  relatively good sound quality.  If
		 you select it, be extremely careful not  to  set
		 your  microphone  level  too  high.  Driving the
		 sound input into clipping causes terrible crack-
		 ling  break-ups  in the audio.	 It's best to ex-
		 periment with a local machine or echo server  to
		 make  sure you have the input level set optimal-
		 ly.  Like the GSM compression selected by the -t
		 option,  this	form  of  compression  requires a
		 great deal  of	 computation:  in  this	 case  in
		 floating point.  If your computer is too slow or
		 too busy running other tasks, you may get  drop-
		 outs  in  the	sound.	 LPC compression does not
		 provide as good sound quality	as  GSM,  and  is
		 somewhat finicky to set up; it is provided as an
		 alternative when network bandwidth must  be  re-
		 duced to a minimum.  Only one of the compression
		 modes ADPCM  (-f),  CELP  (-celp),  LPC  (-lpc),
		 LPC-10 (-lpc10), and GSM (-t) may be selected at
		 once.

       -lpc10[rn]
		 Compress sound to a data rate of 2400	bits  per
		 second using the United States Department of De-
		 fense Federal Standard 1015  /	 NATO-STANAG-4198
		 algorithm,  republished  as  Federal Information
		 Processing Standards Publication 137  (FIPS  Pub
		 137).	 LPC-10	 compression  (an  algorithm com-
		 pletely different from that selected by the -lpc
		 option)  compresses  sound  by	 a factor of more
		 than 26 to 1 with  fidelity,  albeit  less  than
		 that of GSM (-t) compression, perfectly adequate
		 for voice-grade communications.  LPC-10 compres-
		 sion  requires	 a  great  deal of floating point
		 computation.  If your computer is  too	 slow  or
		 too  busy running other tasks, you may get drop-
		 outs in the sound.  Only one of the  compression
		 modes	ADPCM  (-f),  CELP  (-celp),  LPC (-lpc),
		 LPC-10, or GSM (-t) can  be  selected	at  once.
		 LPC-10 is not a standard compression mode of RTP
		 or VAT protocol, and hence can be selected  only
		 in Speak Freely protocol.

		 The  extreme  compression achieved by the LPC-10
		 algorithm allows the option of	 ``robust  trans-
		 mission,''  in	 which	multiple  copies of sound
		 packets are sent,  each  containing  a	 sequence
		 number	 which allows the receiver to discard du-
		 plicate  or  out-of-sequence  packets.	   Robust
		 transmission often allows intelligible conversa-
		 tion over heavily  loaded  network  links  which
		 would otherwise induce random pauses and gaps in
		 received sound.  To enable robust transmission,
		 add the suffix rn to the -lpc10 option, where n
		 is the number of copies of each packet to be
		 sent, between 1 and 4.  If no rn suffix is
		 specified, no duplicate packets are sent (equivalent
		 to specifying r1).  For example, to send three
		 copies of each LPC-10 sound packet, specify  the
		 option	 -lpc10r3.  Sending duplicate sound pack-
		 ets requires  more  network  bandwidth.   LPC-10
		 compression  with no duplicate packets can func-
		 tion on a 4800 bit per second connection to  the
		 Internet;  a 9600 bit per second line can accom-
		 modate two copies  of	each  packet  (-lpc10r2),
		 while a 14,400 bit per second or faster link can
		 handle	 three	(-lpc10r3)  or	four   (-lpc10r4)
		 copies.   (Four  copies  of  each packet is just
		 within the capability of a 14,400 bit per second
		 line, so if the line is being used for other si-
		 multaneous traffic, you may have to  reduce  the
		 number	 of  copies to three.)	Sending more than
		 four copies of each packet does not improve per-
		 formance  and	simply	wastes	bandwidth; packet
		 replication is therefore limited to four copies.
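
		 The bandwidth figures above follow from
		 multiplying the 2400 bit per second LPC-10 rate by
		 the number of copies.  The sketch below computes
		 only the audio payload; packet header overhead is
		 not modelled, which is one reason the recommended
		 line speeds are higher than these figures.  The
		 function name robust_payload_rate is hypothetical.

```python
LPC10_BITS_PER_SECOND = 2400   # data rate quoted for -lpc10


def robust_payload_rate(copies: int) -> int:
    """Audio payload rate, in bits per second, for -lpc10r<copies>."""
    if not 1 <= copies <= 4:
        raise ValueError("copies must be between 1 and 4")
    return LPC10_BITS_PER_SECOND * copies


for n in range(1, 5):
    print(f"-lpc10r{n}: {robust_payload_rate(n)} bits per second of audio payload")
```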

       -m	 Manual gain control.  Allows you to manually set
		 the input level with your audio  control  panel.
		 This is the default mode.

       -n	 Disables  compression of sound.  The switch per-
		 mits canceling the  effect  of	 a  previous  -c,
		 -celp, -f, -lpc, -lpc10, or -t switch when send-
		 ing multiple sound files with	one  sfmike  com-
		 mand.

       -natcsock,dsock
		 This  option  is  reserved  for  sfspeaker  when
		 launching sfmike to  contact  a  site	behind	a
		 router	 or  firewall  which performs Network Ad-
		 dress Translation.

       -ofilename
		 The contents of the specified filename are  used
		 as  a ``key file'' to encrypt sound data sent to
		 subsequently named hosts.  The file should be at
		 least	8000  bytes long and contain data with as
		 little	 regularity  as	 possible.    The   ``pgp
		 +makerandom=length filename'' facility is an ex-
		 cellent way to create a  key  file.   To  decode
		 sound	encrypted  with	 a key file, sfspeaker on
		 the receiving machine must be invoked	with  the
		 -o switch specifying a file identical to that on
		 the transmitting machine.  You can  disable  key
		 file encryption by specifying the -o switch with
		 no filename.  Unencrypted sound  will	still  be
		 played correctly even if the -o switch is speci-
		 fied on the call to sfspeaker.	 You  can  use	a
		 public-key  cryptography  package such as pgp or
		 gpg to exchange a key file with another  person.
		 Key  file  encryption is much faster than any of
		 the other options but is far, far  less  secure;
		 use it only if all of the other forms of encryp-
		 tion run too slowly on your machine.

       -phostname
		 Adds hostname to the  list  of	 hosts	to  which
		 sound	is  sent.  The same sound will be sent to
		 each host you name.  If you have a slow  network
		 link, the number of hosts will be limited since,
		 even with compression, there may not  be  enough
		 outbound  bandwidth  to  transmit packets to all
		 the hosts.

       -q	 Quiet--disables debug output.	This is	 the  de-
		 fault;	 the switch can be used to cancel the ef-
		 fect of a prior -d switch.  This switch  has  no
		 effect	 on  a	remote	copy of sfspeaker invoked
		 with the -d switch.

       -r	 Ring.	This is used to get the	 attention  of	a
		 user  when  you're trying to establish a connec-
		 tion.	The speaker output  is	unmuted	 and  the
		 playback volume is set to mid-level to guarantee
		 audibility.  Sun workstation  users  may  subse-
		 quently  switch  the  output  back  to the head-
		 phones, if  desired,  with  audiotool.	  The  -r
		 switch	 has  no  effect  if remote ring has been
		 disabled with the -n switch  on  sfspeaker.   If
		 your  audio  driver  does not permit setting the
		 recording level, this option will  have  no  ef-
		 fect.

       -robustn	 Use  ``robust	transmission  mode''  in  which n
		 copies of each audio packet are sent to the des-
		 tination,  each  incorporating	 a  serial number
		 which allows the receiver to  discard	duplicate
		 and  out  of order packets.  Robust transmission
		 increases the number of packets sent  and  hence
		 the bandwidth required by a factor of n, but may
		 permit	 reliable  transmission	 on   connections
		 which	frequently drop and shuffle packets.  Ro-
		 bust transmission works best with protocols that
		 provide  the greatest degree of compression such
		 as LPC (-lpc), LPC-10 (-lpc10), CELP (-celp), and
		 GSM  (-t).   Robust transmission may not be used
		 with VAT or RTP protocols, and	 is  incompatible
		 with  releases	 of Speak Freely prior to 7.5 for
		 any compression mode other than -lpc10.
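
		 On the receiving side, discarding duplicate and
		 out of order packets by serial number might look
		 like the sketch below.  The packet format is not
		 described here, so the class name SequenceFilter
		 and the use of a plain increasing integer are
		 assumptions; wrap-around of the serial number is
		 not handled.

```python
class SequenceFilter:
    """Accept each serial number once, and only in increasing order."""

    def __init__(self):
        self.last_seq = None   # nothing received yet

    def accept(self, seq: int) -> bool:
        # A repeat (duplicate copy) or an older number (late packet)
        # is dropped; anything newer is played and remembered.
        if self.last_seq is not None and seq <= self.last_seq:
            return False
        self.last_seq = seq
        return True


flt = SequenceFilter()
arrivals = [1, 1, 1, 2, 2, 4, 3, 4, 5]   # three copies sent, some lost or late
print([seq for seq in arrivals if flt.accept(seq)])
```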

       -rtp	 Transmit using the Real-time Transport Protocol
		 (RTP),	 as  defined  in  Internet  RFCs 1889 and
		 1890.	This allows sfmike to send audio to other
		 Internet voice applications which support a com-
		 mon subset of RTP.  To comply with the RTP stan-
		 dard,	when  -rtp  is selected only DES (-k) en-
		 cryption is  available	 and  simple  (-c),  CELP
		 (-celp),  and LPC-10 (-lpc10) compression cannot
		 be selected.  RTP compliant programs do not nec-
		 essarily  implement all compression modes or en-
		 cryption; consult the documentation for the pro-
		 gram  with  which you wish to communicate to see
		 which options it supports.

       -slevel[,timeout]
		 Squelch output whenever input	volume	is  below
		 the specified level.  The level specification is
		 an arbitrary number from 1 to 32767 with  larger
		 numbers  denoting  louder  sound.   The  default
		 squelch value,	 if  none  is  given  on  the  -s
		 switch,  is 4096 which works reasonably well un-
		 less your computer room is very noisy (in  which
		 case you might want to avail yourself of a head-
		 set  with  a	directional   boom   microphone).
		 Squelch  interacts  poorly  with  automatic gain
		 control; if you enable squelch, don't use the -g
		 switch.   Squelch  is off by default, equivalent
		 to a specification of -s0.  Enabling squelch al-
		 lows  multiple	 people to send sound to the same
		 destination(s) and, as long as only  one  speaks
		 at  a	time,  for the result to be intelligible.
		 In order for this to work the input and  squelch
		 levels	 must  be  set so that sound is sent only
		 when you're talking.  Enabling debugging  output
		 with  the  -d	switch	can help to determine the
		 best settings.	 To avoid breakups due to  momen-
		 tary  pauses  in  speech,  squelch  continues to
		 transmit for a period after the last packet
		 exceeding the squelch threshold was seen.  By
		 default, this interval is 1.5 seconds.  You can
		 specify  the squelch timeout by giving the value
		 in milliseconds (one second  is  1000	millisec-
		 onds)	after  the  squelch value, separated by a
		 comma.
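
		 The interaction of squelch level and timeout can
		 be sketched as follows.  The class name Squelch is
		 hypothetical, and measuring ``input volume'' as
		 the peak absolute sample value of a packet is an
		 assumption; the defaults of 4096 and 1500
		 milliseconds are those given above.

```python
class Squelch:
    """Decide, packet by packet, whether audio should be transmitted."""

    def __init__(self, level: int = 4096, timeout_ms: int = 1500):
        self.level = level
        self.timeout_ms = timeout_ms
        self.last_loud_ms = None   # time of the last packet over threshold

    def should_send(self, samples, now_ms: int) -> bool:
        if self.level == 0:
            return True            # -s0: squelch disabled
        # Assumption: volume is the largest absolute sample in the packet.
        peak = max((abs(s) for s in samples), default=0)
        if peak >= self.level:
            self.last_loud_ms = now_ms
            return True
        # Keep transmitting through short pauses in speech.
        return (self.last_loud_ms is not None
                and now_ms - self.last_loud_ms <= self.timeout_ms)


sq = Squelch(level=4096, timeout_ms=1500)
print(sq.should_send([12, -40], 0))       # quiet, nothing heard yet
print(sq.should_send([5000, -300], 100))  # loud packet
print(sq.should_send([10, -8], 900))      # quiet, but within the timeout
print(sq.should_send([10, -8], 2000))     # quiet, timeout expired
```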

       -t	 Compress sound with the algorithm used by Global
		 System for Mobile Communications (GSM) digital
		 cellular telephones.
		 This is the default mode.  GSM	 compression  re-
		 duces	the  network  bandwidth	 requirement by a
		 factor of five: 1650 bytes per	 second	 compared
		 to  the uncompressed rate of 8000 bytes per sec-
		 ond.  This allows Speak Freely	 to  be	 used  on
		 network links as slow as 19,200 bits per second.
		 GSM compression is lossy, but given the  limita-
		 tions of 8000 samples per second audio, there is
		 little perceived loss of fidelity.  GSM compres-
		 sion  and  decompression  are extremely computa-
		 tionally intense.  If the CPU on either  end  is
		 not fast enough, regular pauses will be heard in
		 the audio stream.  If you're running  on  a  ma-
		 chine	with  other  CPU-intensive tasks, you may
		 encounter random pauses  when	other  tasks  use
		 enough	 CPU  resources so compression and/or de-
		 compression can't be done in real time.  If this
		 occurs,  you  can  try	 the ADPCM (-f) or Simple
		 (-c) compression options described  above;  they
		 provide less compression and poorer quality, but
		 consume much less CPU time.

		 If you need to reduce the bandwidth further, you
		 can  specify  both the -c and -t switches.  This
		 simultaneously	 hogs  the  CPU	 and  compromises
		 sound quality, but the data rate to transmit re-
		 al time audio is reduced to 955 bytes	per  sec-
		 ond.	Only  one  of the compression modes ADPCM
		 (-f), CELP (-celp), LPC (-lpc), LPC-10 (-lpc10),
		 and GSM (-t) may be selected at once.
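
		 The 1650 byte per second figure can be checked
		 with a little arithmetic.  The frame parameters
		 below (160 samples and 33 packed bytes per frame)
		 are those of GSM 06.10 full-rate coding as
		 produced by toast; they are not stated in this
		 manual page, and packet overhead is ignored.

```python
SAMPLE_RATE = 8000        # samples per second, one byte each uncompressed
GSM_FRAME_SAMPLES = 160   # one GSM 06.10 frame covers 20 ms of audio
GSM_FRAME_BYTES = 33      # packed frame size written by toast

frames_per_second = SAMPLE_RATE // GSM_FRAME_SAMPLES
gsm_bytes_per_second = frames_per_second * GSM_FRAME_BYTES
ratio = SAMPLE_RATE / gsm_bytes_per_second

print(gsm_bytes_per_second, round(ratio, 2))   # prints: 1650 4.85
```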

       -td	 Releases  of Speak Freely for Unix prior to ver-
		 sion 6.1e (released in September 1998) contained
		 a  bug which caused GSM compression to be sensi-
		 tive to the byte order	 (``endianism'')  of  the
		 machine  running sfmike and sfspeaker.	 This er-
		 ror, which only affected  ``little-endian''  ma-
		 chines such as Intel processors, is corrected in
		 current releases.  If you absolutely must commu-
		 nicate with a Unix user on a little-endian plat-
		 form running a version prior to 6.1e, specifying
		 the  -td  option on sfmike will force it to send
		 the old, incorrect byte order.	  A  much  better
		 alternative  is to encourage the user to install
		 a current release in which the  problem  has
		 been corrected.

       -u	 Prints how-to-call information.

       -vat	 Transmit  using  a  protocol compatible with the
		 Lawrence Berkeley Laboratory's	 original  Visual
		 Audio	Tool  (VAT).   This allows sfmike to send
		 audio to other Internet voice applications  com-
		 patible  with	most  releases of VAT.	(Starting
		 with version 4, VAT supports the Internet Real-time
		 Transport Protocol (RTP) as well as the
		 original VAT protocol.	 Since RTP provides  much
		 better session control and interoperability with
		 other applications, you should use the -rtp  op-
		 tion  instead of -vat unless you absolutely have
		 to communicate with programs which support  only
		 the  old  VAT	protocol.)  To be compatible with
		 VAT, when -vat is selected only DES (-k)
		 encryption is available and simple (-c), CELP
		 (-celp), and LPC-10 (-lpc10) compression  cannot
		 be  selected.	Some nominally ``VAT compatible''
		 applications get bedeviled by the  details  when
		 you  select  infrequently used compression modes
		 such as LPC and combine  them	with  encryption.
		 If at all possible, use -rtp mode to communicate
		 with other Internet voice programs.

       -wdumpfile
		 Real-time audio (but not sound files  you  send)
		 is  dumped  into  the	designated dumpfile.  The
		 contents of  the  dumpfile  are  the  raw  bytes
		 sfmike read from the audio input device, without
		 any header, control information, or compression.
		 This  option is handy when you're having trouble
		 getting an audio input device to provide data in
		 the  format  expected by Speak Freely.	 If audio
		 input is working  normally,  the  dumpfile  will
		 grow at the rate of 8000 bytes per second as you
		 transmit; be sure to place  the  dumpfile  on	a
		 file system with adequate space and/or limit the
		 amount of audio you  dump  to	a  short  passage
		 suitable for debugging audio input settings.
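
		 Since a working input produces 8000 bytes per
		 second, the size of a dumpfile gives a quick check
		 on how much audio it holds, and the reverse
		 calculation estimates the space needed.  The
		 function names below are hypothetical.

```python
import os

DUMP_BYTES_PER_SECOND = 8000   # growth rate quoted for a working input


def dump_duration_seconds(path: str) -> float:
    """Seconds of audio held in a raw dumpfile written with -w."""
    return os.path.getsize(path) / DUMP_BYTES_PER_SECOND


def dump_bytes_needed(seconds: float) -> int:
    """Disk space, in bytes, needed to dump the given length of audio."""
    return int(seconds * DUMP_BYTES_PER_SECOND)


print(dump_bytes_needed(60))   # one minute of audio
```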

       -yindev[:ctldev]
		 This  option allows you to override the defaults
		 for the name of the audio input device file (for
		 example  /dev/audio)  and, optionally, the audio
		 control device file, specified after  the  input
		 device,  separated  by	 a  colon.   If the first
		 character of either the input or control  device
		 specification	is  a sharp sign, ``#'', the bal-
		 ance is taken as an integer giving the number of
		 an already-open file descriptor in a parent pro-
		 cess which is launching sfmike.   This	 facility
		 (or,  if you like, gimmick) allows programs such
		 as sflaunch to evade the restriction in some au-
		 dio  drivers which support full-duplex but don't
		 permit two programs to simultaneously	open  the
		 audio	device	files.	This option is not avail-
		 able on  Silicon  Graphics  or	 other	platforms
		 which do not use device files for audio I/O.
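
		 Interpreting the argument to -y can be sketched as
		 follows.  The function names are hypothetical, and
		 /dev/audioctl appears only as an example of a
		 control device name.

```python
from typing import Optional, Tuple, Union

DeviceRef = Union[str, int]   # a device path, or an inherited file descriptor


def parse_device(text: str) -> DeviceRef:
    """A leading '#' marks the number of an already-open descriptor."""
    if text.startswith("#"):
        return int(text[1:])
    return text


def parse_y_option(arg: str) -> Tuple[DeviceRef, Optional[DeviceRef]]:
    """Split indev[:ctldev]; the control device is None when omitted."""
    indev, sep, ctldev = arg.partition(":")
    return parse_device(indev), (parse_device(ctldev) if sep else None)


print(parse_y_option("/dev/audio"))
print(parse_y_option("/dev/audio:/dev/audioctl"))
print(parse_y_option("#5:#6"))
```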

       -zuser_list
		 If  pgp or gpg is installed on your machine, you
		 can specify one or more  users	 in  your  public
		 keyring  (if you name more than one user, be sure to
		 enclose the user list in  quotes).   A	 128  bit
		 random	 session  key is generated and pgp or gpg
		 is invoked to encrypt it with the public keys of
		 the  named  users.  The encrypted session key is
		 transmitted to subsequently named hosts and then
		 used  to  IDEA encrypt sound sent to them.  This
		 avoids the separate step of generating	 and  ex-
		 changing  a  session key described above for the
		 -e option.  Since the actual public key  encryp-
		 tion  is  performed  by pgp or gpg you can enjoy
		 the convenience of public key exchange	 of  ses-
		 sion keys for audio as well.
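
		 The two steps above (generate a 128 bit key, then
		 hand it to gpg for each named user) might look
		 roughly like this in Python.  This is a sketch
		 under stated assumptions, not a description of
		 how sfmike does it: the user list is assumed to
		 be whitespace-separated, the random source is
		 Python's secrets module, and the function names
		 are hypothetical.  The gpg options --encrypt,
		 --armor, and --recipient are standard gpg
		 options.

```python
import secrets
import shlex
from typing import List

SESSION_KEY_BYTES = 16  # 128 bits, as stated above


def new_session_key() -> bytes:
    """Generate a random 128-bit session key from the OS random source."""
    return secrets.token_bytes(SESSION_KEY_BYTES)


def gpg_encrypt_command(user_list: str) -> List[str]:
    """Build a gpg command line that encrypts standard input to every named user.

    Assumption: users in the quoted list are separated by whitespace.
    """
    users = shlex.split(user_list)
    if not users:
        raise ValueError("at least one user must be named")
    argv = ["gpg", "--encrypt", "--armor"]
    for user in users:
        argv += ["--recipient", user]
    return argv
```

		 The session key would then be written to the
		 standard input of the resulting command, for
		 example with subprocess.run(argv, input=key).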

FILES
       On  most	 Unix  machines audio is read from the /dev/audio
       device file.  The device will be busy for  input	 whenever
       sfmike is running.  On Silicon Graphics machines the digi-
       tal media development toolkit is used to access the  audio
       hardware.

BUGS
       No warning is given if the destination machine is not run-
       ning sfspeaker; sound just disappears.

       In order to deliver acceptable  (or  at	least  tolerable)
       performance  across  international  links, sfmike and sfs-
       peaker use ``Internet datagram'' socket protocol which  is
       essentially  a ``fire and forget'' mechanism; neither flow
       control nor acknowledgement is provided.  Since sound
       must  be	 delivered at the correct time in order to be in-
       telligible, in real time transmission there's  little  one
       can do anyway if data are lost.	Consequently, bogged down
       lines, transmission errors, etc., simply	 degrade  or  de-
       stroy  the quality of the audio without providing explicit
       warnings at either end that anything's  amiss.	In  addi-
       tion,  the lack of an end-to-end handshake deprives sfmike
       of backpressure information to control the rate	at  which
       it  dispatches  packets when transmitting a sound file.	I
       fake flow control by calculating the time it will take to
       play each packet and then pausing for that number of
       microseconds after sending it.  This is, of course, utterly
       beneath contempt, but it actually works quite nicely (at
       least as long as your machine isn't busy).  If you're  mo-
       tivated	to  replace  all  this	datagram stuff with nice,
       clean RPC calls, don't bother.  That's  how  I  built  the
       initial version of Speak Freely, and although it ran OK on
       an Ethernet, it was a disaster on  long	distance  connec-
       tions.
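
       The pacing trick described above can be sketched in Python as
       follows.  This is a minimal illustration, not sfmike's code:
       it assumes one byte per sample at 8000 samples per second, and
       the names play_time_us and send_paced are hypothetical.

```python
import socket
import time
from typing import Iterable

SAMPLES_PER_SECOND = 8000  # assumed: one byte per sample at 8000 bytes per second


def play_time_us(sample_count: int, rate: int = SAMPLES_PER_SECOND) -> int:
    """Microseconds the receiver needs to play a packet holding sample_count samples."""
    return sample_count * 1_000_000 // rate


def send_paced(packets: Iterable[bytes], host: str, port: int,
               samples_per_packet: int) -> int:
    """Send each packet as a UDP datagram, then sleep for its playback time.

    Returns the number of packets sent.  No acknowledgement is awaited,
    so the sender never learns whether anything arrived.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for packet in packets:
            sock.sendto(packet, (host, port))  # fire and forget
            time.sleep(play_time_us(samples_per_packet) / 1_000_000)
            sent += 1
    return sent
```

       Because the sleep does not subtract the time spent in the send
       itself, a sender written this way runs slightly slower than
       real time, and more so on a busy machine.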

       AES, Blowfish, IDEA, DES, and key file options encrypt ev-
       ery sound packet with the same  key--no	key  chaining  is
       performed.   (AES,  Blowfish,  DES and IDEA encryption do,
       however, use cipher block chaining  within  each	 packet.)
       Chaining from packet to packet would increase security but
       then loss of any packet would make it  impossible  to  de-
       crypt all that followed.

       Certain	governments attempt to restrict the availability,
       use, and exportation of software with cryptographic  capa-
       bilities.   Speak  Freely  was  developed  in Switzerland,
       which has no such restrictions.	The AES, DES, MD5,  Blow-
       fish,  and IDEA packages it uses were obtained from an In-
       ternet site in another European country which has  no  re-
       strictions  on cryptographic software.  If you import this
       software into a country with restrictions on cryptographic
       software, be sure to comply with whatever restrictions ap-
       ply.  The responsibility to obey the law in your jurisdic-
       tion is entirely your own.

       Intelligible  speech requires both sufficient bandwidth to
       deliver the audio data and a consistent delivery time  for
       packets.	  Even if your link is theoretically fast enough,
       congestion on it or on other intermediate links may  cause
       drop-outs.   Compressing	 the  data with the -f, -t, -lpc,
       -lpc10, -celp, and/or -c switches  reduces  the	bandwidth
       required by a factor of two to twenty-six and can often
       alleviate this problem, and the ``robust transmission''
       option of LPC-10 compression may improve intelligibility
       when communicating across heavily-loaded lines.
       Even  so,  if file transfers or other bulk traffic are un-
       derway, you'll probably be disappointed.
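
       As a rough worked example of these factors, the Python sketch
       below starts from the uncompressed rate of 8000 bytes per
       second (64,000 bits per second) and divides by the compression
       factor.  It counts audio payload only and ignores packet
       header overhead; the function names are illustrative.

```python
RAW_BYTES_PER_SECOND = 8000  # uncompressed input rate quoted for the -w option


def payload_bits_per_second(compression_factor: float) -> float:
    """Audio payload rate after compressing by the given factor."""
    return RAW_BYTES_PER_SECOND * 8 / compression_factor


def fits_link(link_bits_per_second: float, compression_factor: float) -> bool:
    """True if the payload alone fits within the link's nominal capacity."""
    return payload_bits_per_second(compression_factor) <= link_bits_per_second
```

       A factor of two gives 32,000 bits per second and a factor of
       twenty-six gives roughly 2,460 bits per second.  Fitting the
       nominal link speed is necessary but not sufficient, since
       congestion can still delay packets.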

       By default sfmike transmits on Internet port number  2074.
       It  is  conceivable, albeit unlikely, that this might con-
       flict with some other  locally-developed	 network  server.
       You  can	 specify  a different port by appending it to the
       destination host, separated by a colon, but of course  you
       need  to	 ensure the remote copy of sfspeaker is listening
       on that port.  When communicating with other  applications
       using  VAT  or RTP protocols, you must specify the port on
       which the other application is listening.  RFC 1890 recom-
       mends  port 5004 as the default port for RTP applications.
       Many VAT protocol applications default to port 3456.
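
       The hostname[:port] convention and the port numbers above can
       be illustrated with a short Python sketch.  It handles only a
       host name or IPv4 address with an optional port, and the
       function and constant names are hypothetical rather than taken
       from the Speak Freely source.

```python
from typing import Tuple

DEFAULT_PORT = 2074      # Speak Freely default stated above
RTP_DEFAULT_PORT = 5004  # recommended for RTP by RFC 1890
VAT_COMMON_PORT = 3456   # default of many VAT protocol applications


def parse_destination(dest: str, default_port: int = DEFAULT_PORT) -> Tuple[str, int]:
    """Split 'hostname[:port]' into (hostname, port), applying the default port."""
    host, sep, port_text = dest.rpartition(":")
    if not sep:
        return dest, default_port
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port
```

       For example, parse_destination("192.168.67.89:5004") selects
       port 5004, while a bare host name falls back to 2074.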

       There are way too  many	command	 line  options.	  Options
       should  be  consolidated	 wherever possible and changed to
       keywords which can be abbreviated to the	 shortest  unique
       prefix.

ACKNOWLEDGEMENTS
       The Silicon Graphics audio drivers are based on the stand-
       alone SGI version developed by  Paul  Schurman  of  Espoo,
       Finland.	  Without his generous contribution, Speak Freely
       would have probably remained forever confined in an  orbit
       around the Sun.

       Andrey  A.  Chernov  contributed	 code  that enables Speak
       Freely to build and run on FreeBSD.

       Hans Werner Strube contributed code to allow  the  program
       to  build  under Solaris 2.4 without any source changes or
       need for compatibility modes.

       The GSM compression and decompression code  was	developed
       by Jutta Degener and Carsten Bormann of the Communications
       and Operating Systems Research Group, Technische	  Univer-
       sitaet	  Berlin:     Fax:     +49.30.31425156,	   Phone:
       +49.30.31424315.	 They note that THERE  IS  ABSOLUTELY  NO
       WARRANTY	 FOR  THIS  SOFTWARE.	Please see the readme and
       copyright files in the gsm directory for further	 details.

       The ADPCM compression and decompression code was developed
       by Jack Jansen of the Centre for Mathematics and	 Computer
       Science,	 Amsterdam,  The  Netherlands.	 Please	 see  the
       readme and copyright files in the adpcm directory for fur-
       ther details.

       The  Federal  Standard 1016 -celp code-excited linear pre-
       diction algorithm and software were developed by Joseph P.
       Campbell	 Jr., Vanoy C. Welch and Thomas E. Tremain of the
       U.S. Department of Defense.  Craig F. Reese of the IDA/Su-
       percomputing  Research  Center adapted the original imple-
       mentation for use on general-purpose computers.

       The -lpc linear predictive  coding  compression	algorithm
       was developed by Ron Frederick of Xerox PARC.

       The public domain implementation of the U.S. Federal Standard
       1015 -lpc10 compression algorithm  was  developed  by  the
       United  States  Department  of  Defense, National Security
       Agency (NSA).  Please see the README and FAQ files in  the
       lpc10 directory for additional details.

       The  DES encryption code was developed by Phil Karn, KA9Q.
       Please see the readme file in the des directory	for  fur-
       ther details.

       The public domain implementation of the Advanced Encryption
       Standard (AES) was developed by Brian Gladman.  For details,
       please visit his Web page:
       http://fp.gladman.plus.com/cryptography_technology/rijndael/
       and see the README file in the aes directory.

       The  Blowfish encryption module and the DES encryption li-
       brary used for encrypting and decrypting VAT and RTP  pro-
       tocol  packets  were  developed by Eric Young.  Please see
       the README and COPYRIGHT files in the blowfish and  libdes
       directories for further details.  The Blowfish algorithm was
       invented by Bruce Schneier and is in the public domain.

       The IDEA algorithm was developed by Xuejia Lai  and  James
       L.  Massey,  of	ETH  Zurich.   The implementation used in
       Speak Freely was modified and derived from original C code
       developed  by  Xuejia Lai and optimised for speed by Colin
       Plumb.  The IDEA[tm] block cipher is patented by Ascom-Tech
       AG.  The	 Swiss patent number is PCT/CH91/00117, the Euro-
       pean patent number is EP 0 482 154 B1, and the U.S. patent
       number  is  US005214703. IDEA[tm] is a trademark of Ascom-
       Tech AG. There is no license fee required  for  noncommer-
       cial  use.  Commercial  users may obtain licensing details
       from MediaCrypt AG at IDEA@mediacrypt.com.   You	 can  use
       IDEA encryption for noncommercial communications without a
       license from MediaCrypt AG; commercial use  is  prohibited
       without	a license.  If you don't want to obtain a license
       from MediaCrypt AG, use AES, Blowfish, DES, or key file
       encryption instead.

       The implementation of the MD5 message-digest algorithm is
       based on a public domain version written by Colin Plumb in
       1993.  The algorithm is due to Ron Rivest and is described
       in Internet RFC 1321.

SEE ALSO
       audio(4), audiopanel(1),	 audiotool(1),	gpg(1),	 kill(1),
       pgp(1),	sflaunch(1),  sflwl(1),	 sfspeaker(1),	sfvod(1),
       soundeditor(1), soundfiler(1), talk(1), toast(1), xmmix(1)

by John Walker
March 18, 2003