Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
For undergraduate or advanced undergraduate courses in Classical Natural Language Processing, Statistical Natural Language Processing, Speech Recognition, Computational Linguistics, and Human Language Processing.
An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology - at all levels and with all modern technologies - this text takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. The authors cover areas that traditionally are taught in different courses, to describe a unified vision of speech and language processing. Emphasis is on practical applications and scientific evaluation. An accompanying Website contains teaching materials for instructors, with pointers to language processing resources on the Web. The Second Edition offers a significant amount of new and extended material.
We can do this by dividing a corpus into a training set and a test set. Then we train the two different N-gram models on the training set and see which one better models the test set. But what does it mean to “model the test set”?
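One common way to make "model the test set" concrete is perplexity: the model that assigns the test set a higher probability (lower perplexity) is the better fit. The sketch below is only an illustration of that idea, not the book's code. The toy sentences, the add-one smoothing, the `<unk>` handling, and the function names `train_ngram_counts` and `perplexity` are all assumptions made for the example.

```python
import math
from collections import Counter


def train_ngram_counts(sentences, n):
    """Count n-grams and their (n-1)-word contexts over padded sentences."""
    ngrams, contexts = Counter(), Counter()
    # Vocabulary of words the model can predict, plus end and unknown markers.
    vocab = {"</s>", "<unk>"}
    for sent in sentences:
        vocab.update(sent)
        tokens = ["<s>"] * (n - 1) + list(sent) + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            ngrams[context + (tokens[i],)] += 1
            contexts[context] += 1
    return ngrams, contexts, vocab


def perplexity(sentences, n, ngrams, contexts, vocab):
    """Perplexity of the sentences under an add-one smoothed n-gram model."""
    log_prob, count = 0.0, 0
    v = len(vocab)
    for sent in sentences:
        # Words never seen in training are mapped to the unknown marker.
        words = [w if w in vocab else "<unk>" for w in sent]
        tokens = ["<s>"] * (n - 1) + words + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            p = (ngrams[context + (tokens[i],)] + 1) / (contexts[context] + v)
            log_prob += math.log(p)
            count += 1
    return math.exp(-log_prob / count)


# Hypothetical toy data: train both models on the same training set,
# then compare them on the same held-out test set.
train = [["i", "like", "green", "eggs"], ["i", "like", "ham"], ["sam", "likes", "ham"]]
test = [["i", "like", "eggs"]]

unigram = train_ngram_counts(train, 1)
bigram = train_ngram_counts(train, 2)
pp_unigram = perplexity(test, 1, *unigram)
pp_bigram = perplexity(test, 2, *bigram)
print("unigram perplexity:", pp_unigram)
print("bigram perplexity:", pp_bigram)
```

Whichever model prints the lower perplexity is the one that better models this test set. On a corpus this small the numbers mean little; the point is only the train/test protocol.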
function TBL(corpus) returns transforms-queue
  INITIALIZE-WITH-MOST-LIKELY-TAGS(corpus)
  until end condition is met do
    templates←GENERATE-POTENTIAL-RELEVANT-TEMPLATES
    best-transform←GET-BEST-TRANSFORM(corpus,templates) ...
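The excerpt above is truncated, so the following is a separate, minimal sketch of transformation-based learning rather than a completion of the book's pseudocode. It assumes a single rule template ("change tag A to B when the previous tag is Z"), scores candidate rules by errors on the training corpus, and stops when no rule reduces the error or a rule limit is reached. The function names, the `max_rules` parameter, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def most_likely_tags(corpus):
    """Map each word to the tag it carries most often in the corpus."""
    counts = defaultdict(Counter)
    for word, tag in corpus:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}


def apply_transform(tags, transform):
    """Change from_tag to to_tag wherever the preceding tag is prev_tag."""
    from_tag, to_tag, prev_tag = transform
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def count_errors(tags, gold):
    """Number of positions where the current tag differs from the gold tag."""
    return sum(t != g for t, g in zip(tags, gold))


def tbl(corpus, max_rules=10):
    """Greedily learn an ordered list of transformations (assumed stop rule)."""
    words = [w for w, _ in corpus]
    gold = [t for _, t in corpus]
    lexicon = most_likely_tags(corpus)
    current = [lexicon[w] for w in words]  # initial most-likely-tag labelling
    tagset = sorted(set(gold))
    transforms_queue = []
    while len(transforms_queue) < max_rules:
        best, best_tags, best_err = None, None, count_errors(current, gold)
        # Instantiate the single template with every combination of tags.
        for a in tagset:
            for b in tagset:
                if a == b:
                    continue
                for z in tagset:
                    candidate = (a, b, z)
                    new_tags = apply_transform(current, candidate)
                    err = count_errors(new_tags, gold)
                    if err < best_err:
                        best, best_tags, best_err = candidate, new_tags, err
        if best is None:  # no rule improves the tagging: stop
            break
        transforms_queue.append(best)
        current = best_tags
    return transforms_queue, current


# Hypothetical toy corpus: "race" is usually a noun, but a verb after "to".
corpus = [("the", "DT"), ("race", "NN"), ("the", "DT"), ("race", "NN"),
          ("to", "TO"), ("race", "VB"), ("the", "DT"), ("race", "NN")]
rules, final_tags = tbl(corpus)
print(rules)
```

On this toy corpus the initial labelling tags every "race" as NN, and the learner recovers the single rule that changes NN to VB when the previous tag is TO.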
Time-aligned transcription: Another useful resource is a phonetically annotated corpus, in which a collection of waveforms is hand-labeled with the corresponding string of phones. Three important phonetic corpora in English are the TIMIT ...
What people are saying
The previous best book on NLP was James Allen's (1995), which was considered ambitious at the time because it covered syntax, semantics and some pragmatics. But Martin and Jurafsky is far more ambitious, because it covers speech recognition as well, and has far expanded coverage of language generation and translation. It also covers the great advances in statistical techniques that have marked the last decade. It is a beautiful synthesis that will reward the experienced expert in the field with new insights and new connections in the form of historical notes that are not well known. And it is well-written and clear enough that even the beginning student can follow it through. Before this book, you would have had to read Allen's book, Charniak's short book on statistical NLP, something on speech recognition, and something else on generation and translation. Like squeezing clowns into a circus car, Jurafsky and Martin somehow, improbably, manage to squeeze this all into one book, but in a way that is elegant and holds together perfectly; not at all the hodge-podge that one might expect. I expect that this book will be seen as one of the landmarks that pushes the field forward.

It's worth comparing this book to the other recent NLP text: Manning and Schütze. Jurafsky and Martin cover much more ground, including many aspects that are ignored by Manning and Schütze. So if you want a general overview of natural language, if you want to know about the syntax of English, or the intricacies of dialog, if you are teaching or taking a general NLP course, then Jurafsky and Martin is the one for you. But if your needs are more focused on the algorithms for lower-level text processing with statistical techniques, or if you want to build a specific practical application, then Manning and Schütze is far more comprehensive and likely to have your answer. If you're a serious student or professional in NLP, you just have to have both.
Words and Transducers