The question answering (QA) research group at Carnegie Mellon University has recently released an open source version of its Ephyra question answering system. The software draws on the Web to answer questions posed in natural language, taking syntax, word order, and tone into account, and applies a series of algorithms to select the most context-appropriate and statistically likely responses. The group has made the code publicly available in the hope of receiving feedback and evaluations from other researchers.
Ephyra retrieves answers to natural language questions from the Web and other sources. The open source version, OpenEphyra, is almost identical to the system that has been evaluated in the TREC question answering track (http://trec.nist.gov/), except that the developers had to exclude some third-party tools and code with specific hardware requirements. The result is a platform-independent, easy-to-use system that runs on a standard desktop computer and can be evaluated on questions from the TREC 8-15 evaluations.
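To make the retrieve-then-select idea concrete, here is a toy sketch in Python. It is not OpenEphyra's code or API. The snippets are hard-coded stand-ins for retrieved Web text, and the function names, the stopword list, and the "most frequently mentioned candidate wins" rule are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-ins for text retrieved from the Web.
SNIPPETS = [
    "Pittsburgh is home to Carnegie Mellon University.",
    "Carnegie Mellon University is located in Pittsburgh, Pennsylvania.",
    "Founded by Andrew Carnegie, the university sits in Pittsburgh.",
]

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "is", "in", "of", "a", "an", "to", "by",
             "where", "what", "who"}


def keywords(question):
    """Question analysis: the lowercase content words of the question."""
    words = re.findall(r"[A-Za-z]+", question.lower())
    return {w for w in words if w not in STOPWORDS}


def candidates(snippet, question_words):
    """Answer extraction: capitalized words that do not repeat the question."""
    found = re.findall(r"\b[A-Z][a-z]+\b", snippet)
    return [w for w in found
            if w.lower() not in question_words and w.lower() not in STOPWORDS]


def best_answer(question, snippets):
    """Answer selection: pick the candidate mentioned in the most snippets."""
    question_words = keywords(question)
    votes = Counter()
    for snippet in snippets:
        # Count each candidate at most once per snippet.
        votes.update(set(candidates(snippet, question_words)))
    return votes.most_common(1)[0][0] if votes else None


print(best_answer("Where is Carnegie Mellon University located?", SNIPPETS))
```

Counting how many independent snippets mention a candidate is only one simple way to rank answers. A full system would add steps such as question typing and more careful answer extraction.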
While possible applications of the software may span the social and behavioral sciences as a whole, it appears particularly useful to anthropology and linguistics as a reconstruction experiment: building the necessary systems of cognition from the inside out. Examining the subsystems of language in terms of efficiency and necessity may be a useful instrument for understanding our own language acquisition mechanisms. It may also help us better discern the differences between animal communication and modern human language.