Lucene
Information retrieval (IR): inverted-index system for Google-like search over a large corpus of documents
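A minimal sketch of the indexing idea noted above (an inverted index, sometimes called a reverse index), in plain Python. This is not Lucene's API: the names tokenize, build_inverted_index, and search are hypothetical, the sample documents are made up for illustration, and a real engine adds analyzers, ranking, and on-disk storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Illustrative documents only (assumption, not from the notes).
docs = {
    1: "Lucene builds an inverted index over documents",
    2: "NLTK offers tagging and chunking",
    3: "An inverted index maps terms to documents",
}
index = build_inverted_index(docs)
matches = search(index, "inverted index")
```

Looking up a term is a dictionary access rather than a scan of every document, which is why this structure suits search over a large corpus.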
Entity/concept recognition
True NLP
NLTK
True natural language understanding: Tagging, Chunking, Named-entity recognition
NegEx
Detects negated findings in radiology reports
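A rough sketch of trigger-based negation detection in the spirit of NegEx, in plain Python. It is a simplification under stated assumptions: the trigger list is a small illustrative subset, the 5-token window is an assumed parameter, and is_negated is a hypothetical helper rather than part of any NegEx distribution.

```python
import re

# Illustrative subset of pre-negation trigger phrases (assumption).
NEGATION_TRIGGERS = ["no evidence of", "negative for", "without", "denies", "no"]
# Number of tokens before the finding that are scanned for a trigger (assumption).
WINDOW = 5


def is_negated(sentence, finding):
    """Return True if a trigger phrase occurs within WINDOW tokens before the finding."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    target = re.findall(r"[a-z]+", finding.lower())
    n = len(target)
    if n == 0:
        return False
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != target:
            continue
        # Tokens immediately preceding this occurrence of the finding.
        window = tokens[max(0, start - WINDOW):start]
        for trigger in NEGATION_TRIGGERS:
            parts = trigger.split()
            for i in range(len(window) - len(parts) + 1):
                if window[i:i + len(parts)] == parts:
                    return True
    return False


negated = is_negated("No evidence of pneumothorax.", "pneumothorax")
affirmed = is_negated("There is a small pleural effusion.", "pleural effusion")
```

This sketch only looks backwards from the finding and ignores phrases that end a negation's scope (such as "but"), so it should be read as an illustration of the approach, not a usable classifier.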
MALLET
Java-based statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text
MetaMap
Matches strings in free text to biomedical concepts in UMLS through concept identification, to create structured data for classification/categorization etc.
LingPipe
Computational linguistics, e.g. finding names of things in news, classifying Twitter search results, suggesting correct spellings of queries
Seems pretty robust with statistical analysis
Natural Language Processing with Python (basics)
Authors: Bird, Steven; Klein, Ewan; Loper, Edward
Amazon summary: This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.
Speech and Language Processing ("Bible")
Authors: Jurafsky, Daniel; Martin, James H.
Amazon summary: An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology – at all levels and with all modern technologies – this book takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. Builds each chapter around one or more worked examples demonstrating the main idea of the chapter, using the examples to illustrate the relative strengths and weaknesses of various approaches. Adds coverage of statistical sequence labeling, information extraction, question answering and summarization, advanced topics in speech recognition, speech synthesis. Revises coverage of language modeling, formal grammars, statistical parsing, machine translation, and dialog processing. A useful reference for professionals in any of the areas of speech and language processing.