Natural Language Processing


·         See also: Spoken Dialog Systems

·         See also: Mixed-Initiative Dialog

·         See also: Virtual Humans and Collaborative Agents


·         Guinn, C. I., Contract, "STENO: Mobile App for Classroom Discussion", University of North Carolina Wilmington, $9,974.00, Funded. (start: August 1, 2016, end: December 31, 2016).

·         Ricanek, K., Guinn, C. I., Grant, "CASIS Summer 2016", Federal, $25,000.00, Funded. (start: July 2016, end: September 2016).

·         Guinn, C. I., Natural Language Processing for Longitudinal Exposure Data, RTI International, Original Source of Funds: Environmental Protection Agency, August 1, 2004 – March 31, 2008, $29,586.00.


·         Kline, D., Grimsman, A., Vetter, R. and Guinn, C. (2019) Literary Analysis Tool: Text Analytics for Creative Writers. In: Proceedings of the Conference on Information Systems Applied Research, v. 12, n. 5213, Cleveland, Ohio.

·         Guinn, C., Singer, B., and A. Habash (2014) A Comparison of Syntax, Semantics, and Pragmatics in Spoken Language among Residents with Alzheimer’s Disease in Managed-Care Facilities, Proceedings of 2014 IEEE Symposium on Computational Intelligence in Healthcare and E-Health, IEEE, pp. 98-103.

·         Dunn, E., & Guinn, C. I. (2013) Computational Methods for Determining the Similarity between Ancient Greek Manuscripts, Proceedings of International Conference on Artificial Intelligence, CSREA Press, Athens, GA, pp. 496-502.


·         Guinn, C. I., & Habash, A. (2012) Language Analysis of Speakers with Dementia of the Alzheimer’s Type, AAAI Press, Vol. FS-120-01.

·         Komisin, M. and Guinn, C. (2012) Identifying Personality Types Using Document Classification Methods, Proceedings of the 25th International Florida Artificial Intelligence Research Society Conference (FLAIRS-25), AAAI Press.

·         Green, N. L., Guinn, C., and R. W. Smith (2012). Assisting Social Conversation between Persons with Alzheimer’s Disease and their Conversational Partners. Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies, (part of NAACL-HLT 2012), Montreal, Canada, June 8, 2012, pages 37-46.

·         Guinn, C. I. and D. Rayburn-Reeves (2009). Remote Monitoring of Activity, Location, and Exertion Levels, Virtual Healthcare Interaction -- AAAI Fall Symposium, ed. by N. Green and D. Scott, Technical Report FS-09-97, pp. 20-27.

·         Rayburn-Reeves, D., and C. Guinn (2008). Improving Upon Semantic Classification of Spoken Diary Entries Using Pragmatic Context Information, Proceedings of the 2008 International Conference on Artificial Intelligence (ICAI'08), July 2008.

·         Guinn, C.I. and D. Rayburn-Reeves (2007). Monitoring Physical Exertion, Activity, and Location Using a Spoken Diary and Heart Rate Monitor, Proceedings of 3rd National Conference on Environmental Science and Technology, Greensboro, NC, USA. 

·         Guinn, C., Crist, D., and H. Werth (2006). A Comparison of Hand-Crafted Semantic Grammars Versus Statistical Natural Language Parsing in Domain-Specific Voice Transcription, Proceedings of Computational Intelligence, Ed. B. Kovalerchuk, San Francisco, CA, pp. 490-495.


·         A Trainable System for the Extraction of Meaning from Text, with Amit Bagga, Joyce Chai, Alan Biermann, and Alan W. Hui, in Proceedings of CASCON '95, 1995.


This project is developing a trainable system that can extract meaning from texts in different domains (e.g., various Internet newsgroups). The system performs partial parsing based on a large dictionary of approximately 150,000 words. It assists the user in extracting a semantic network representation for each member of a set of training articles drawn from a large database. Based on the user's training, the system builds statistical tables, a knowledge base, and a set of rules mirroring the user's actions, and then generalizes these rules. Using statistically based semantic classification, the system applies the rules to new articles from the database to build semantic networks automatically.
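The statistically based classification step can be pictured with a minimal sketch. All function names and training data below are hypothetical illustrations, not taken from the actual system: label counts gathered during user training become a most-frequent-label rule that is applied to words in new articles.

```python
from collections import Counter, defaultdict

def train(examples):
    """Tally semantic labels per word from user-labeled training pairs.

    examples: list of (word, semantic_label) pairs, e.g. from the
    user's semantic-network annotations (hypothetical format).
    """
    counts = defaultdict(Counter)
    for word, label in examples:
        counts[word.lower()][label] += 1
    return counts

def classify(counts, word, default="unknown"):
    """Pick the most frequent semantic label seen for this word."""
    c = counts.get(word.lower())
    return c.most_common(1)[0][0] if c else default

# Toy training data in the spirit of a for-sale newsgroup.
stats = train([("Honda", "vehicle"), ("Civic", "vehicle"),
               ("$4000", "price"), ("$3500", "price")])
print(classify(stats, "honda"))   # -> vehicle
```

A real system would combine such word-level statistics with the partial-parse structure before committing a slot to the semantic network.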

·         Natural Language Processing in Virtual Reality, with R. Jorge Montoya, Modern Simulation and Training, pp. 44-55, June 1998.


Technological advances in areas such as transportation, communications, and science are rapidly changing our world, and the rate of change will only increase in the 21st century. Innovations in training will be needed to meet these new requirements. Not only must soldiers and workers become proficient with these new technologies, but shrinking manpower also requires more cross-training, self-paced training, and distance learning. Two key technologies that can help reduce the burden on instructors and increase the efficiency and independence of trainees are virtual reality simulators and natural language processing. This paper focuses on the design of a virtual reality trainer that uses a spoken natural language interface with the trainee.
RTI has developed the Advanced Maintenance Assistant and Trainer (AMAT) with ACT II funding for the Army Combat Service Support (CSS) Battlelab. AMAT integrates spoken language processing, virtual reality, multimedia, and instructional technologies to train and assist the turret mechanic in diagnosis and maintenance of the M1A1 Abrams Tank in a hands-busy, eyes-busy environment. AMAT is a technology concept demonstration and an extension of RTI’s Virtual Maintenance Trainer (VMAT), which was developed for training National Guard organizational mechanics. VMAT is currently deployed in a number of National Guard training facilities. The AMAT project demonstrates the integration of spoken human-machine dialogue with visual virtual reality in implementing intelligent assistant and training systems. To accomplish this goal, RTI researchers have implemented the following features:

· Speech recognition on a Pentium-based PC,
· Error-correcting parsers that can correctly handle utterances that fall outside the grammar,
· Dynamic natural language grammars that change as the situation context changes,
· Spoken message interpretation that can resolve pronoun usage and incomplete sentences,
· Spoken message reliability processing that lets AMAT compute the likelihood that it properly understood the trainee (a score it can use to ask for repeats or confirmations),
· Goal-driven dialogue behavior so that the computer is directing the conversation to satisfy either the user-defined or computer-defined objectives,
· Voice-activated movement in the virtual environment, and
· Voice synthesis on a Pentium-based PC.
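The reliability-processing feature above can be illustrated with a small sketch. The thresholds and function name here are assumptions for illustration, not AMAT's actual values: a recognition-confidence score is mapped to one of three dialogue moves.

```python
def dialogue_action(confidence, accept=0.85, confirm=0.5):
    """Map a speech-recognition confidence score in [0, 1] to a
    dialogue move: accept the utterance, ask for confirmation,
    or ask the trainee to repeat. Thresholds are illustrative."""
    if confidence >= accept:
        return "accept"       # proceed with the interpreted command
    if confidence >= confirm:
        return "confirm"      # e.g., "Did you say 'open the breech'?"
    return "repeat"           # e.g., "Please say that again."

print(dialogue_action(0.92))  # -> accept
print(dialogue_action(0.30))  # -> repeat
```

Tying the choice of move to a reliability score keeps the goal-driven dialogue moving while limiting the cost of misrecognitions.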

·         Two Dimensional Generalization in Information Extraction, with Joyce Yue Chai, Alan W. Biermann, AAAI/IAAI, pp. 431-438, 1999.


In a user-trained information extraction system, the cost of creating the rules for information extraction can be greatly reduced by maximizing the effectiveness of user inputs. If the user specifies one example of a desired extraction, our system automatically tries a variety of generalizations of this rule, including generalizations of the terms and permutations of the ordering of significant words. Where modifications of the rules are successful, those rules are incorporated into the extraction set. The theory of such generalizations and a measure of their usefulness are described.
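As a rough sketch of this generalization strategy (the rule representation and taxonomy below are hypothetical, not the paper's actual formalism), candidate rules can be produced from a single user example by generalizing individual terms up a taxonomy and permuting the significant words:

```python
from itertools import permutations

def generalize(rule, taxonomy):
    """Yield candidate rules derived from one user-specified rule:
    the original, single-term generalizations (a word replaced by
    its parent class), and reorderings of the significant words."""
    yield tuple(rule)
    for i, word in enumerate(rule):
        if word in taxonomy:                       # generalize one term
            yield tuple(rule[:i] + [taxonomy[word]] + rule[i + 1:])
    for p in permutations(rule):                   # permute word order
        if list(p) != rule:
            yield p

# Hypothetical taxonomy: "Civic" generalizes to the class CAR.
taxonomy = {"Civic": "CAR"}
candidates = set(generalize(["sell", "Civic"], taxonomy))
# In the described system, each candidate would then be tested against
# training data, and only the successful ones kept in the extraction set.
```

The candidate set for this toy rule contains the original, the term-generalized rule, and one reordering; the filtering step is what keeps the generalization from over-applying.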