What you have here is a set of programs which collectively allow the identification of beats. Neither the programs nor their author can give the slightest guarantee that these "beats" correspond to anything that humans perceive. There is, however, a deal of evidence which suggests that if we supply these programs with reasonable parameters (such as the default ones), the resulting beats will correspond rather well with perceived beats. Furthermore, there is now also a deal of experimental evidence which suggests that the beats extracted by this program suite can tell us some interesting things about speech. References for both of these claims are given at the end. For now I will assume that if you have this code, you know yourself why you want it.
This code is being released in two versions: one for Sun Sparc workstations and one for Silicon Graphics (SGI) workstations. (An unofficial version for Linux may be available on request from me; it will inevitably be rougher.) If you have the Sun version, your audio files must be in .au format, with 16-bit linear sample encoding at any sampling rate. If this is gobbledygook to you, don't feel bad: no one should ever have to know this rubbish. However, you do now, so take a look at the Audio File Format FAQ before going much further (if the link is out of date, try Yahoo or some such). If you have an SGI machine, you will use AIFF files, again with 16-bit linear encoding. Both of these are fairly standard formats. If your data is not in the required form, you might consider using SOX (for Sun) or sfconvert (for SGI) to get what you need.
Machine | Format | Encoding | Sampling rate |
---|---|---|---|
Sun | AU | 16-bit linear | any |
SGI | AIFF | 16-bit linear | any |
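If you are unsure whether a Sun .au file meets the requirement in the table, its header can be inspected directly. The following is a minimal Python sketch and is not part of Bex. It relies on the standard AU header layout (six big-endian 32-bit words, with encoding code 3 denoting 16-bit linear PCM); the names `read_au_header` and `is_16bit_linear` are made up for illustration.

```python
import struct

AU_MAGIC = 0x2E736E64          # the four bytes ".snd"
AU_ENCODING_LINEAR_16 = 3      # AU encoding code for 16-bit linear PCM


def read_au_header(data):
    """Decode the fixed 24-byte header at the start of an AU file."""
    if len(data) < 24:
        raise ValueError("too short to be an AU file")
    # Six unsigned 32-bit integers, big-endian.
    magic, offset, size, encoding, rate, channels = struct.unpack(">6I", data[:24])
    if magic != AU_MAGIC:
        raise ValueError("missing .snd magic number; not an AU file")
    return {
        "data_offset": offset,
        "data_size": size,
        "encoding": encoding,
        "sample_rate": rate,
        "channels": channels,
    }


def is_16bit_linear(header):
    """True if the header describes 16-bit linear samples (any sampling rate)."""
    return header["encoding"] == AU_ENCODING_LINEAR_16
```

To check a file, read its first 24 bytes in binary mode and pass them to `read_au_header`. If `is_16bit_linear` returns False, the file needs converting before Bex can use it.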
The current SGI and Sun versions of Bex are here.
    tar xzvf bex0.9-sgi.tar.gz
    cd bex0.9
    Bex
If your machine barfs and says it doesn't understand the -z option to tar, then do:

    gunzip bex0.9.tar.gz
    tar xvf bex0.9.tar
    cd bex0.9
    Bex
The SGI code was compiled on an Indigo running IRIX 5.3, using CC and cc. The Sun code was compiled on a Sun running SunOS 5.5 (Solaris) (that's gummy, if you are from the IU Computer Science Dept), using gcc for the C++ code and cc for the plain C code. The compilation was not really straightforward, as the C++ code (sndtoenv) is based on a library of DSP functions called spkit. I have pre-compiled this and provide it as libsp.a. I do not provide the source for it, however, merely the header files required for compilation. If you need to recompile libsp.a, you can get the source here. Future releases of Bex should dispense entirely with spkit, with a view to keeping it all in C and all original. Meanwhile, a big thank you to Kai for his code.
    ### parameters for sndtoenv
    ### generated automatically by Bex
    1000                      Center frequency
    600                       Bandwidth
    20                        Cutoff for smoothing
    ./soundfiles/lim.au       Input filename
    ./soundfiles/lim.env.au   Output envelope file
Don't edit this by hand - it is generated on the fly by Bex.
    ### parameters for envtobeats
    ### generated automatically by Bex
    au                  Audio file type
    ./soundfiles/lim    Input filename stem
    0                   Do not generate envelope + beat file
    0.1                 Threshold, as % of signal max
    1                   Calculate beat strengths
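Because Bex writes these parameter files itself, you should only ever need to read them. As a rough illustration, the Python sketch below pulls the values out of such a file. It assumes, which this document does not state, that each parameter sits on its own line with the value first and a free-text description after it, and that lines starting with # are comments. `parse_bex_params` is a hypothetical helper, not part of Bex.

```python
def parse_bex_params(text):
    """Return (value, description) pairs from a Bex-style parameter file.

    Assumed layout: '#' lines are comments; every other non-blank line is
    '<value> <description>', split at the first run of whitespace.
    """
    params = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        value = parts[0]
        description = parts[1] if len(parts) > 1 else ""
        params.append((value, description))
    return params


# Sample contents copied from the envtobeats listing, one parameter per line.
EXAMPLE = """\
### parameters for envtobeats
### generated automatically by Bex
au Audio file type
./soundfiles/lim Input filename stem
0 Do not generate envelope + beat file
0.1 Threshold, as % of signal max
1 Calculate beat strengths
"""

for value, description in parse_bex_params(EXAMPLE):
    print(f"{value:20} {description}")
```

Values are kept as strings on purpose, since the files mix numbers, flags, and paths; convert them only once you know which parameter you are looking at.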
Two claims were made above. For the first, that Bex will produce something close to perceived beats, see especially Scott (1993). For the second, that something interesting can be learned about speech from the study of these beats, please see my 1997 Indiana University thesis in Linguistics and Cognitive Science, entitled "Rhythmic Coordination in English Speech: An Experimental Study", or my paper "Rhythmic constraints on stress timing in English", Journal of Phonetics, 26(2):145-171, which is available in PostScript form.