In 1786 Sir William Jones announced to the Asiatick Society of Calcutta
that Sanskrit had to be related to Greek and Latin, touching
off what would come to be known as the Neogrammarian move from philology
(the comparison of texts) to what we now consider linguistics.
If you look at a whole raft of `cognates' like the following,
you might come to the same conclusion. (Avestan is an ancient Iranian
language, closely related to the ancestor of Persian and usually dated to
the first millennium BCE or earlier; it's the language of the Zoroastrian texts.
Gothic is an extinct language of the Germanic family (as English is), known
mainly from a Bible translation made in the 4th century CE.)
Sanskrit  Avestan  Greek     Latin    Gothic      English
pita      -        pater     pater    fadar       father
padam     -        poda      pedem    fotu        foot
bhratar   -        phrater   frater   brothar     brother
bharami   barami   phero     fero     baira       bear
jivah     jivo     -         wiwos    qius        quick ('living')
sanah     hano     henee     senex    sinista     senile
virah     viro     -         wir      wair        were(wolf) ('man')
-         -        tris      tres     thri        three
-         -        deka      decem    taihun      ten
-         satem    he-katon  centum   hund(rath)  hundred
(A lone hyphen means no form is listed for that language.)
Now, cognates are a "pair/set of words descended from a common
ancestor", not just words that happen to look like each other (e.g.,
"coffee" is not a cognate of kaffe, kahawa, cafe, etc.; that's an
instance of the same word being borrowed from one language to another).
What we're talking about here are historically related words. When we
know we've got cognates, we can talk about reconstruction.
Reconstruction revolves around the notion that sound change is
mechanical and exceptionless. If a proto-/p/ becomes /f/ in a
daughter language, it does so in regular fashion,
that is, every /p/ in the same environment should exhibit the change to /f/
(that's the heuristic you have to use). If there are exceptions, you try to find
some other conditioning factor. Using this assumption, and looking at the
`brother' and `bear' rows above, we can
conclude that some common ancestor produced Sanskrit /bh/, Avestan
/b/, Greek /ph/ (which is NOT /f/; it's aspirated /p/ at the stage
we're talking about), Latin /f/, and Germanic /b/. Now the question
is, what was the sound in that common ancestor?
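To make the bookkeeping concrete, here is a minimal Python sketch of the first step: lining up the initial consonants of a few cognate sets and checking whether a proposed correspondence holds without exceptions. The forms are the ones in the table above; the names COGNATES, DIGRAPHS, initial, correspondences and exceptions are made up for this illustration, and treating "bh" and "ph" as single segments is an assumption about the transliteration, not a general-purpose segmenter.

```python
from collections import defaultdict

# Languages in the order the forms are listed below.
LANGS = ("Sanskrit", "Greek", "Latin", "Gothic")

# Cognate sets copied from the table (rows with a form in all four languages).
COGNATES = {
    "father":  ("pita", "pater", "pater", "fadar"),
    "foot":    ("padam", "poda", "pedem", "fotu"),
    "brother": ("bhratar", "phrater", "frater", "brothar"),
    "bear":    ("bharami", "phero", "fero", "baira"),
}

# Assumption: in this transliteration "bh" and "ph" each spell ONE aspirated stop.
DIGRAPHS = ("bh", "ph")


def initial(form):
    """Return the first segment of a form, treating known digraphs as one segment."""
    head = form[:2]
    return head if head in DIGRAPHS else form[:1]


def correspondences(cognates):
    """Group glosses by the tuple of initial segments found across the languages."""
    sets = defaultdict(list)
    for gloss, forms in cognates.items():
        key = tuple(initial(f) for f in forms)
        sets[key].append(gloss)
    return dict(sets)


def exceptions(cognates, lang_a, seg_a, lang_b, seg_b):
    """List glosses where lang_a starts with seg_a but lang_b does not start with seg_b."""
    i, j = LANGS.index(lang_a), LANGS.index(lang_b)
    return [
        gloss
        for gloss, forms in cognates.items()
        if initial(forms[i]) == seg_a and initial(forms[j]) != seg_b
    ]


if __name__ == "__main__":
    for key, glosses in correspondences(COGNATES).items():
        print(" : ".join(key), "<-", ", ".join(glosses))
    # Does Latin initial p always line up with Gothic initial f in this sample?
    print(exceptions(COGNATES, "Latin", "p", "Gothic", "f"))
```

Run on these four rows, it prints `p : p : p : f <- father, foot` and `bh : ph : f : b <- brother, bear`, followed by an empty list of exceptions. That is only the tabulation; deciding which proto-sound stands behind each correspondence set is the linguist's job, as described next.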
The way we decide what segment must have been there in the proto-
language involves things we know independently about how sounds
behave, based partly on how sounds alternate synchronically in
languages (i.e. rules that operate to change one sound to another in
different contexts during a single stage of a language), partly on
what we know about acoustics and articulation of speech sounds (which
tells us what directionality is more or less likely), and partly on
experience. Pure gold for the historical linguist is ATTESTED
(written) ancient forms.
For instance, we know that the modern Romance languages (French,
Italian, Spanish, Portuguese, Romansch, Rumanian, etc.) are descended
from Latin. And we have lots of attested Latin to work with -- so we
have clear, unambiguous examples of how some sound changes have
worked. Likewise in other language families where ancient texts are
preserved (e.g. ancient religious texts in Semitic). So we have
some real-life models on which to build our guesses.
So anyway, you reconstruct Proto-Indo-Iranian, and Proto-Germanic,
and Proto-Balto-Slavic, and Proto-Celtic, and ultimately you can guess
approximately -- on the basis of your analysis -- what
must have been the forms of certain words/roots in
Proto-Indo-European, before it split up. Now, most scholars believe
this method does NOT yield reliable results further back than about 10,000 years,
because beyond that, too much change has occurred for there to be any
recognizable remnants (that we can be confident about anyway) in attested
languages.
One real triumph of this method of reconstruction was the Laryngeal
Hypothesis: it was known that there were some troublesome words in
Indo-European where the sound changes seemed not to be behaving in
their usual regular way; things were happening to vowels and sometimes
consonants in certain words that couldn't be explained based on what
is found in the attested languages. Ferdinand de Saussure in the late
19th century suggested that there had to be a set of segments (three, in
the version of the theory that became standard) used
in certain words in the proto-language that had not survived in any of
the daughter languages then known. He was fairly conservative about claiming what phonetic
content they must have had, but he pointed out
the precise locations where they must have occurred in particular words.
They later came to be called laryngeals.
Several decades later, after a bunch of texts found in Turkey had finally been deciphered and
shown to be a new I-E language of ancient Anatolia, Hittite, the oldest
attested Indo-European language -- voila: there were laryngeals, written as consonants
in just the kinds of words where Saussure had predicted they must be, purely on the basis
of careful reconstruction.
There are other wrinkles, like you can do internal reconstruction
under some circumstances, and there are things other than sounds that
point to common ancestry (morphology, syntax, etc.). And semantic
change is a really neat thing to trace, though much slipperier than
sound change. But the general answer to the question "how do we know
what Proto-Indo-European was like?" is the Comparative
Method, which arose in the 19th century and gives us a fairly rigorous way
to compare sounds in daughter languages and reconstruct what the
antecedent sounds must have been.
Oh, and the reconstructed PIE forms of the words above are shown below.
Each is preceded by a star to mark it as a proposed reconstruction.
A hyphen at the end shows that the form is a root that normally takes suffixes. In
this family of languages consonants are much easier to reconstruct than
vowels. The symbol @ means a central schwa vowel.
Emile Benveniste, Indo-European Language and Society (London 1973). [Contains cultural as well as linguistic material.]
Carl D. Buck, A Dictionary of Selected Synonyms in the Principal Indo-European Languages (Chicago 1949). [A wonderful old reference work. Lists and discusses synonyms and cognates for a variety of ideas (arranged topically) in over 30 Indo-European languages. Now available in an affordable paperback reprint edition.]
N. E. Collinge, The Laws of Indo-European (Amsterdam 1985). [Catalogs real and alleged sound changes in IE families and languages. Fairly technical.]
Antoine Meillet (trans. S. N. Rosenberg), The Indo-European
Dialects (Huntsville 1967). [This and the two following works
are by one of the great masters of the field, but are still relatively
clear and accessible.]
----- (trans. Gordon B. Ford, Jr.), The Comparative Method in
Historical Linguistics (Paris 1967).
-----, Introduction a l'etude comparative des langues
indo-europeennes (Paris 1937).
Holger Pedersen, The Discovery of Language (Bloomington 1959). [Includes historical perspective on how these discoveries were made.]
Andrew Sihler, New Comparative Grammar of Greek and Latin (New York 1995).
Oswald Szemerenyi, Comparative-historical linguistics : Indo-European and Finno-Ugric (Amsterdam 1993).
Calvert Watkins (ed.), The American Heritage Dictionary of Indo-European Roots (Boston 1985). [Note the extensive introductory essay. Much of the same material can be found in the first and third editions of AHD.]
Werner Winter (ed.), Evidence for Laryngeals (The Hague 1965). [Evidence from the various IE languages bearing on Saussure's laryngeal theory discussed above. Highly technical.]