Speech differs from written language in that it is transient and continuous: there are no obvious gaps between words in the way that written text has spaces. This raises the question of how we learn and recognise spoken language.
The Metrical Segmentation Strategy (Cutler & Butterfield, 1992) suggests we exploit rhythmic cues in speech to place word boundaries. These cues are language-specific: in English, content words usually begin with strong (stressed) syllables, e.g. "The DOG CHASED the CAT". Treating each strong syllable as a likely word onset helps listeners segment the speech stream.
Since rhythmic cues differ between languages, this is not an innate method. Instead, native speakers learn to exploit the strategies appropriate to their own language.
Although this method of segmenting speech is common, it often fails. Misheard lyrics in English frequently result from incorrect segmentation around weak syllables, caused either by unclear articulation or by lyrical choices that place word boundaries where listeners do not expect them.
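The strategy described above can be sketched as a toy procedure: hypothesise a word boundary before every strong syllable and attach weak syllables to the preceding word. The function name, the input format, and the stress markings below are illustrative assumptions, not real phonetic transcriptions or an actual model from Cutler & Butterfield.

```python
def segment_by_stress(syllables):
    """Toy Metrical Segmentation Strategy sketch: group (syllable,
    is_strong) pairs into candidate words, starting a new word at
    each strong (stressed) syllable."""
    words = []
    for syllable, is_strong in syllables:
        if is_strong or not words:
            words.append([syllable])    # boundary before a strong syllable
        else:
            words[-1].append(syllable)  # weak syllables attach leftwards
    return ["".join(w) for w in words]

# "the DOG CHASED the CAT": function words weak, content words strong.
utterance = [("the", False), ("dog", True), ("chased", True),
             ("the", False), ("cat", True)]
print(segment_by_stress(utterance))
# → ['the', 'dog', 'chasedthe', 'cat']
```

Note how the sketch also reproduces the failure mode: the weak second "the" gets attached to the preceding strong syllable, yielding the missegmentation "chasedthe", much as weak syllables cause misheard lyrics.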
Phoneme: the smallest unit of sound that distinguishes one word from another, e.g. /b/ vs. /p/ in "bat" vs. "pat".
Syllable: a cluster of sounds organised around a vowel (often around three letters in writing).