Wednesday 26 October 2011

Copiale Cipher: scientists crack mysterious

The Copiale Cipher could be something out of a Dan Brown novel or a 21st-century update on the Indiana Jones story arc. A yellowing 18th-century manuscript consisting of a mystifying mix of alien symbols and Greek and Roman letters, the Copiale Cipher has been confounding cryptographers since its discovery in the archives of a university in the former East Germany immediately following the Cold War.


That is, until Kevin Knight, a computer scientist at the Information Sciences Institute at the University of Southern California and expert in natural language processing, and two colleagues from Uppsala University, Beáta Megyesi and Christiane Schaefer, decided to take a crack at the cipher.


The 105-page document contains 75,000 characters, all of which are encrypted except those that make up two items: the name "Philipp 1866" appears on the flyleaf, while "Copiales 3," the origin of the manuscript's moniker, is inscribed as a note on the last page.


"I don't have much experience in cryptography," said Knight in an interview. "My background is primarily in computational linguistics and machine translation."


Named for one of just two non-coded inscriptions in the document, this mysterious manuscript is 105 pages long and is bound in gold and green brocade paper. The manuscript consists of roughly 75,000 characters. These characters are handwritten very neatly but consist of a perplexing mix of upper- and lower-case Roman letters, along with a large assortment of more abstract symbols (see sample pages above). In total, the Cipher contains 90 distinct characters, including 26 unaccented Roman letters. Adding to the confusion is the lack of spacing between words.


Dr Knight, who primarily conducts research in computational linguistics and machine translation, doesn't have much experience in cryptography. Undeterred, he began collaborating this year with two Swedish linguists, Beáta Megyesi and Christiane Schaefer of Uppsala University, with the goal of deciphering the Copiale Cipher.


After a few dead ends, the team realised that the Roman characters marked the spaces between words, whilst the abstract symbols carried the actual information. They also discovered that a colon indicated that the previous consonant was doubled. Once they predicted that the Cipher was an encryption of the German language and subjected it to a word-frequency analysis, things quickly fell into place. The team could finally read the text of the Cipher.
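To make those observations concrete, here is a minimal Python sketch of the three ideas: Roman letters as word separators, symbol counting, and the colon doubling rule. It is an illustration only, not the team's actual software. The token format and the symbol labels (`tri`, `lam`, `sq`) are invented for this example, as are the function names.

```python
from collections import Counter

# ASSUMPTION: the manuscript has been transcribed into a list of tokens.
# Single Roman letters represent the Roman characters; longer labels such
# as "tri" or "lam" are hypothetical names for the abstract symbols.
ROMAN = set("abcdefghijklmnopqrstuvwxyz")


def split_words(tokens):
    """Treat Roman-letter tokens as word separators and keep the rest."""
    words, current = [], []
    for tok in tokens:
        if tok.lower() in ROMAN:
            # A Roman letter ends the current word (if one is in progress).
            if current:
                words.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        words.append(current)
    return words


def symbol_frequencies(words):
    """Count how often each abstract symbol occurs across all words."""
    return Counter(sym for word in words for sym in word)


def expand_doubling(letters):
    """Apply the colon rule to decoded letters: ':' repeats the previous one."""
    out = []
    for ch in letters:
        if ch == ":" and out:
            out.append(out[-1])
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    tokens = ["tri", "lam", "a", "tri", "sq", "b", "lam", "tri"]
    words = split_words(tokens)
    print(words)
    print(symbol_frequencies(words).most_common())
    print(expand_doubling(list("kom:en")))
```

Comparing the resulting symbol counts against the known letter frequencies of a candidate language, German in this case, is the kind of frequency analysis the paragraph above describes. Mapping symbols to actual letters is left out here because the sketch makes no claim about the real key.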


Dr Knight and his colleagues found that the Copiale Cipher describes the rituals and some of the political ideals of a German secret society from the 1730s. They also learned that this society was fascinated by eye surgery and ophthalmology, although none of its members were practitioners.


But why should we care about a dusty old book, written by members of a German secret society, that no one could read?


"This opens up a window for people who study the history of ideas and the history of secret societies," says Dr Knight. He cites several modern examples of challenging ciphers, such as the communications from the still-unidentified Zodiac Killer to the California police in the 1960s and 1970s, and the Kryptos sculpture, located on the grounds of the C.I.A. headquarters in the United States, which has been only partly decoded.


Dr Knight also points out that there are other such ancient enciphered texts, particularly the famous Voynich Manuscript, a 240-page volume that has confounded cryptographers for centuries. This document was recently dated back to the early 1400s.


"There are these books and ancient languages of real historical value that contain historical information that we just can't get out yet, and that's of interest to a lot of people," Dr Knight explains in the video interview embedded below. For example, historians think that secret societies played a role in revolutions, but their importance is not known at this time because so many documents are enciphered.




Currently, Dr Knight is collaborating with his former graduate student, Sujith Ravi, who just received his PhD in computer science from USC this year. Together, they are working on translation as a cryptographic problem, an approach that could improve human language translation and may also be useful in translating languages that are not currently spoken by humans, including ancient languages. (Fans of ancient texts will want to check out similar work on the Indus Script by Dr Rajesh Rao.) In my opinion, possibly the most exciting application of this technology is the potential for deciphering animal communication.
