Lost Language Decipherment using Computers

The headline reads “Software that automatically deciphers ancient language developed.” The language in question is Ugaritic, which was used in Syria from the 14th through the 12th century BCE. To find out whether such a technique can be used to decipher the Indus script, we need to understand how Ugaritic was deciphered.
The language itself was deciphered manually decades earlier. What helped the manual decipherment was the fact that Ugaritic is similar to Hebrew and Aramaic. The first two Ugaritic letters were decoded by mapping them to Hebrew letters, and based on this information a few other words were also deciphered. Then one word inscribed on an axe was guessed to be “axe”, which turned out to be a lucky guess.
There were two inputs to the computer program: a corpus of the lost language and a lexicon of the related language. The output was a mapping between the alphabets of the known language and Ugaritic, along with a translation between Ugaritic words and their cognates in the known language. The program was able to map 29 of the 30 letters accurately. It also deduced the cognates in Hebrew for about 60% of the words.
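To make the two inputs and the letter-mapping output concrete, here is a minimal sketch in Python. It is not the published model, which is considerably more sophisticated. It swaps in plain frequency-rank matching, and the function names and toy data are made up for illustration.

```python
from collections import Counter

def rank_by_frequency(words):
    """Return the letters used in `words`, most frequent first."""
    counts = Counter(ch for word in words for ch in word)
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ch for ch, _ in ranked]

def guess_letter_mapping(lost_corpus, known_lexicon):
    """Pair letters of the lost script with known letters by frequency rank.

    Assumption for illustration only: the two languages use their letters
    with similar relative frequencies.
    """
    lost_ranked = rank_by_frequency(lost_corpus)
    known_ranked = rank_by_frequency(known_lexicon)
    return dict(zip(lost_ranked, known_ranked))

def transliterate(word, mapping):
    """Rewrite a lost-language word in the known alphabet ('?' if unmapped)."""
    return "".join(mapping.get(ch, "?") for ch in word)

# Toy data: "x", "y", "z" stand in for undeciphered signs.
lost_corpus = ["xxx", "xy", "xyz"]
known_lexicon = ["aaa", "ab", "abc"]

mapping = guess_letter_mapping(lost_corpus, known_lexicon)
print(mapping)
print(transliterate("zyx", mapping))
```

The sketch only works because there is a second input, the lexicon of a related known language, to match against.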
But when it comes to the Indus script, both the script and the language are unknown, so there is no second input to the program. Still, that has not prevented researchers from applying various techniques to gain insight into what the script represents. In the 1960s the Soviets and Finns used mathematical models to find order in the symbols. Taking this further, Subhash Kak did a mathematical analysis of the Indus script and the oldest Indian script, Brahmi. When a table containing the ten most commonly occurring Sanskrit phonemes (from ten thousand words) was compared to the ten most commonly occurring Indus symbols, there was a convincing similarity, even though Brahmi came a millennium after the Indus script. Surprisingly, some of the characters, like the fish, looked similar too.
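The frequency-table comparison can be sketched in a few lines of Python. This is only an illustration of the general idea, not Kak's actual procedure. The sign names and phonemes below are placeholders, and a real comparison would use the full sign corpus and the ten-thousand-word sample.

```python
from collections import Counter

def top_n(tokens, n=10):
    """Return the n most frequent tokens with their counts, most frequent first."""
    counts = Counter(tokens)
    # Ties are broken alphabetically so the table is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]

def pair_by_rank(tokens_a, tokens_b, n=10):
    """Line up the two frequency tables row by row, rank 1 with rank 1, and so on."""
    table_a = top_n(tokens_a, n)
    table_b = top_n(tokens_b, n)
    return [(a, b) for (a, _), (b, _) in zip(table_a, table_b)]

# Placeholder data: hypothetical sign labels and phonemes, not real counts.
indus_signs = ["fish", "fish", "jar", "fish", "jar", "man"]
phonemes = ["a", "a", "a", "k", "k", "t"]

for sign, phoneme in pair_by_rank(indus_signs, phonemes, n=3):
    print(sign, "<->", phoneme)
```

Pairing by rank only suggests candidates. Whether the paired signs actually look alike or sound alike has to be judged separately.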
But that’s it. Current research is focused not on comparing the Indus script with a known language, but on finding out whether the Indus script encodes a language at all.
References:

  1. Benjamin Snyder, Regina Barzilay, and Kevin Knight, A statistical model for lost language decipherment, in Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (Uppsala, Sweden: Association for Computational Linguistics, 2010), 1048-1057.
  2. Subhash C. Kak, A frequency analysis of the Indus script, Cryptologia 12, no. 3 (1988): 129.
  3. Subhash C. Kak, Indus and Brahmi: further connections, Cryptologia 14, no. 2 (1990): 169.
