ToBI Intonation Transcription Summary

R. Port, April 9, 2007

This scheme for transcribing intonation and accent in English was developed by Janet Pierrehumbert and Mary Beckman in the early 1990s. It is fairly easy to learn and flexible enough to handle the significant intonational features of most utterances in English. There are now variants for describing Japanese, Korean, German, etc. Read a portion of the Lecture Notes on ToBI in the MIT OpenCourseWare. These have a brief Chapter 1 and a Chapter 2 divided into 11 short sections. Read Chapter 1 and Chapter 2, Sections 2.0-2.7. Download the pdf of each section (about 2 pages each) and read it on your screen, clicking on the icons to listen to the utterances under discussion.

Check the official ToBI website for additional information about ToBI if you are interested.

Metrical Autosegmental Phonology.

  The model assumes several simultaneous TIERS of phonological information and hierarchical nesting of shorter units within longer units: Syllable, Word, Intonation Phrase, etc. Assume one (or more) stressed syllables per major lexical word. The name "ToBI" stands for "Tones and Break Indices".

Speech is parsed into Intonation Phrases, each of which begins with an Initial Tone (nearly always Low) and ends with a Final Tone (either Low or High). Within each Intonation Phrase there is at least one Pitch Accented word. A Pitch Accent (PA) is expressed as a simple or complex local perturbation of the pitch around a stressable syllable in the Pitch Accented word. Finally, between the last PA and the Final Tone, there is another, extended tone, the Phrasal Tone. This too may be High or Low. (The terms High and Low really mean `raise' or `lower' relative to the current pitch value.)

So two typical simple utterances might each be a single Intonation Phrase with one Pitch Accent, like the following. The first is an answer to "What does Bill do all day?" and the second is a very brief utterance.

            Bill    drives his  pick-up   all over  the neighborhood.                                 How-dy.
       [% L                      H*     ( L-                            )  L%]                        [%L  H* ( L- ) L%]
       Initial Tone             PA         Phrasal Acc                Final Tone

  Both begin with a %L (which we will not generally write down because it is so predictable) and have a simple High pitch accent on `pick-up' or `Howdy' that raises the pitch to a peak during the stressed syllable. Then the L- pulls the pitch down and keeps it there until the end of the phrase, where the L% makes it fall even further. The long phrase shows what the L- effect is, but the short phrase presumably has the L- as well.
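As a rough illustration (not part of the ToBI standard), the anatomy of an intonation phrase described above can be sketched as a small Python record. The class name IntonationPhrase and its field names are hypothetical, chosen only for this sketch. It follows the convention of leaving out the predictable %L unless asked to show it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntonationPhrase:
    """Hypothetical container for the tone labels of one intonation phrase."""
    pitch_accents: List[str]   # at least one, e.g. ["H*"]
    phrasal_tone: str          # "L-" or "H-"
    final_tone: str            # "L%" or "H%"
    initial_tone: str = "%L"   # nearly always %L, so it is the default

    def __post_init__(self):
        if not self.pitch_accents:
            raise ValueError("an intonation phrase needs at least one pitch accent")

    def labels(self, show_default_initial: bool = False) -> str:
        """Return the tone string, omitting a default %L unless requested."""
        parts = []
        if show_default_initial or self.initial_tone != "%L":
            parts.append(self.initial_tone)
        parts.extend(self.pitch_accents)
        parts.append(self.phrasal_tone + self.final_tone)
        return " ".join(parts)


# The short utterance "How-dy." from the example above.
howdy = IntonationPhrase(pitch_accents=["H*"], phrasal_tone="L-", final_tone="L%")
print(howdy.labels())                           # H* L-L%
print(howdy.labels(show_default_initial=True))  # %L H* L-L%
```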

Phrase-Level Tones

  1. INITIAL BOUNDARY TONE. The most common by far is %L (and so it is not usually notated). The %H is rare and semantically insists on what the listener should already know. Eg, if I say `Never eat more than one banana a day', you might respond:
                      Bananas aren't dangerous
                    %H L*               L*  L- H%
          I can save a lot of space by writing this as: `BaNAnas %H L* aren't DANgerous L* L-H%'. So I will do that below.
  2. FINAL BOUNDARY TONE, L% vs. H%, at every phrase boundary. This pitch effect appears only on the last 1-2 syllables. The default is L%. H% is used for special contexts like yes-no questions and nonfinal list items.
  3. PHRASAL TONE, L- or H-, fills the interval between the last pitch accent and the final boundary tone. The H- adds some semantic content.

Thus, following the final PA, we find that intonation phrases come in all four logical types:

  1. L- L%    The default DECLARATIVE phrase.  Eg, `Yes H* L- L%' or `Come and see H* me again L-L%'.
  2. H- H%   YES-NO QUESTION. Eg, "Are you going L* today H-H%?" Or "So then are you going L* to the store this afternoon H-H%?" (where pitch rises right after the L* and stays high until the end).
  3. L- H%   The LIST ITEM intonation (nonfinal items only, of course). Eg, reciting the alphabet, or "I need food L-H%, shelter L-H%, and comfort L-L%." Or even "You said you would run home this afternoon L-H%, grab your golf clubs L-H%, jump in the car L-H% and race to the club L-L%."
  4. H- L%   The PLATEAU. A previous H* or complex accent `upsteps' the final L% to an intermediate level. "I just TOLD you why" L+H* !H-L%. This is a place where one might think there is a third level of tone, but ToBI says no, it is an H that is downstepped.
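The four endings listed above amount to a small lookup table keyed on the phrasal tone and the final boundary tone. The sketch below is only an illustration. The names CONTOURS and classify_ending are hypothetical, and treating a downstepped !H- as an H- for lookup purposes is an assumption based on the plateau example.

```python
# Phrase-final contours, keyed by (phrasal tone, final boundary tone).
CONTOURS = {
    ("L-", "L%"): "declarative",
    ("H-", "H%"): "yes-no question",
    ("L-", "H%"): "list item (nonfinal)",
    ("H-", "L%"): "plateau",
}


def classify_ending(phrasal_tone: str, final_tone: str) -> str:
    """Name the contour for one phrase ending.

    A leading "!" (downstep) on the phrasal tone is ignored for the lookup,
    which is an assumption of this sketch.
    """
    key = (phrasal_tone.lstrip("!"), final_tone)
    if key not in CONTOURS:
        raise ValueError(f"not a recognized phrase ending: {phrasal_tone}{final_tone}")
    return CONTOURS[key]


print(classify_ending("L-", "L%"))   # declarative
print(classify_ending("!H-", "L%"))  # plateau
```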

Pitch Accents

These mark the region near the stressable syllable of specific words for a certain semantic effect. An intonation phrase has one or several Pitch Accents. (Multiple pitch accents occur where the speaker wants to emphasize many things: "Eat H* your H* peas H* L-L%".) The star (*) marks the tone that will occur on the stressed syllable of this word. If there is a second tone connected by a + sign, it applies to (roughly) the preceding or following syllable.

  1. H* -- PEAK ACCENT. The default accent which implies a local pitch maximum plus some degree of subsequent fall.
  2. L* -- LOW ACCENT. Also common but hard to recognize at first. You usually accent a syllable this way when the phrase ends in H-H%. "You mean you don't L* like chocolate H-H%?" H* and L* are the two `simple pitch accents'. The remaining accents are used for special emphasis.
  3. L*+H -- SCOOP. Low tone at the beginning of the target syllable with a strong pitch rise on the next syllable.
  4. L+H* -- RISING PEAK. High pitch on the target syllable following a sharp rise from the previous syllable. "I can't L+H* believe he did it L- L%."
  5. !H -- DOWNSTEP HIGH. Only occurs following another H in the same phrase. This H is pitched somewhat lower than the earlier one. Implies pitch stays fairly high from earlier H to the downstepped one. Can be either !H* or !H-. The pattern [H* !H- L%] is known as the CALLING CONTOUR. Eg "Oh JIM-MY!" H*!H-L% (as opposed to "Oh JIMmy" H* L-L%).
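The restriction on downstep (a !H needs an earlier H in the same phrase) is easy to state as a check over the tone labels of one phrase. The following is a minimal sketch, not an official ToBI validator. The function name downstep_ok is hypothetical, and counting any earlier label that contains an H (including %H or the H of L*+H) as a licensing H is an assumption of the sketch.

```python
from typing import List


def downstep_ok(tones: List[str]) -> bool:
    """Return True if every downstepped tone follows some earlier H tone.

    `tones` is the left-to-right list of tone labels for ONE intonation
    phrase, e.g. ["H*", "!H-", "L%"].
    """
    seen_high = False
    for tone in tones:
        if tone.startswith("!") and not seen_high:
            return False          # downstep with nothing to step down from
        if "H" in tone:
            seen_high = True      # assumption: any H-bearing label counts
    return True


print(downstep_ok(["H*", "!H-", "L%"]))  # True  (the calling contour)
print(downstep_ok(["L*", "!H-", "L%"]))  # False (no earlier H)
```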

(Notice there are some logical possibilities that are apparently not observed: H+L* and H*+L.)

Definition: The NUCLEAR ACCENT is the last pitch accent in an intonation phrase. Eg, `cards' in: "Take H* a pack of cards H* L-L%"
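The inline notation used in these notes (tone labels written in the running text after the word they belong to) can be pulled apart mechanically. The sketch below is one possible way to do it with a regular expression. The names TONE, split_inline and nuclear_accent are hypothetical, and the pattern covers only the labels that appear in this summary.

```python
import re

# One tone label: %L/%H, bitonal accents, simple accents, phrasal tones, L%/H%.
TONE = r"%[LH]|!?[LH]\+!?[LH]\*|!?[LH]\*\+[LH]|!?[LH]\*|!?[LH]-|[LH]%"
TONE_RE = re.compile(TONE)
TONE_RUN_RE = re.compile(rf"(?:{TONE})+")   # labels written together, e.g. L-H%


def split_inline(text):
    """Split an inline transcription into (words, tone labels)."""
    words, tones = [], []
    for token in text.split():
        core = token.strip(",.?\"'`")        # drop surrounding punctuation
        if not core:
            continue
        if TONE_RUN_RE.fullmatch(core):
            tones.extend(TONE_RE.findall(core))
        else:
            words.append(core)
    return words, tones


def nuclear_accent(tones):
    """Return the last starred label (the nuclear accent), or None."""
    accents = [t for t in tones if "*" in t]
    return accents[-1] if accents else None


words, tones = split_inline("BaNAnas %H L* aren't DANgerous L* L-H%")
print(words)                  # ['BaNAnas', "aren't", 'DANgerous']
print(tones)                  # ['%H', 'L*', 'L*', 'L-', 'H%']
print(nuclear_accent(tones))  # L*
```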

Break Indices.

The strength of the boundary between each pair of adjacent words is rated with a break index, which comes in 5 levels:

0 clitic boundary. Eg "who's"
1 normal word-word boundary. Eg "see those"
2 either perceived disjuncture with no intonation effect, or apparent intonational boundary but no slowing or other break cues.
3 intermediate phrase. Gets phrase accent, but not terminal tone. Marked with phrase tone: L- or H-. Serves as a domain for downstep of a H- or H%.
4 full intonation phrase - phrase or sentence final L or H. Marked with L% or H%.
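The five levels and the tone marks that go with them can be summarized in a small table. This is only a sketch of the list above. The names BREAK_INDEX and tones_required are hypothetical, and requiring both a phrasal tone and a boundary tone at level 4 follows the description of the intonation phrase earlier in these notes.

```python
# Break index levels, paraphrased from the list above.
BREAK_INDEX = {
    0: "clitic boundary",
    1: "normal word-word boundary",
    2: "mismatch between perceived disjuncture and intonational cues",
    3: "intermediate phrase",
    4: "full intonation phrase",
}


def tones_required(index: int):
    """Return the kinds of tone label expected at a boundary of this level."""
    if index not in BREAK_INDEX:
        raise ValueError("break index must be 0-4")
    if index == 3:
        return ("phrasal",)              # L- or H-
    if index == 4:
        return ("phrasal", "boundary")   # L- or H-, then L% or H%
    return ()                            # levels 0-2 carry no tone label


print(BREAK_INDEX[3], tones_required(3))  # intermediate phrase ('phrasal',)
print(BREAK_INDEX[1], tones_required(1))  # normal word-word boundary ()
```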

The distinction 0 vs. 1 is usually easy, and 4 is easy (with a strong perceptual break). But 2 and 3 are less common and more problematic. Note that intonation is only ONE of the processes that depend on these boundaries (another is, eg, the choice of allophones of stops like /t/ and /d/, as in `See Pat. Over there.' vs. `See Pat over there').

For more information with recorded examples, see the Ohio State ToBI website pages.
