Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
For undergraduate or advanced undergraduate courses in Classical Natural Language Processing, Statistical Natural Language Processing, Speech Recognition, Computational Linguistics, and Human Language Processing.
An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology - at all levels and with all modern technologies - this text takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corporations. The authors cover areas that traditionally are taught in different courses, to describe a unified vision of speech and language processing. Emphasis is on practical applications and scientific evaluation. An accompanying Website contains teaching materials for instructors, with pointers to language processing resources on the Web. The Second Edition offers a significant amount of new and extended material.
Click on the "Resources" tab to View Downloadable Files:
Results 1-3 of 74
Text normalization Sentence tokenization To generate a phonemic internal
representation, we must first preprocess or normalize the raw text in a variety of
ways. We'll need to break the input text into sentences, deal with the
idiosyncracies of ...
without drinking SENTENCE BOUNDARY water due to the flood (pause) many
communities are still cut off... 200ms 200ms 200ms 200ms Figure 10.17
Candidate sentence boundaries computed at each inter-word boundary showing
The main advantage of the HMM alignment model is that there are well-
understood algorithms for decoding and training. For decoding, we can use the
Viterbi algorithm ofChapters5and6tofind the best (Viterbi) alignment for a
sentence pair (F ...
What people are saying - Write a review
The previous best book on NLP was James Allen's (1995), which was considered ambitious at the time because it covered syntax, semantics and some pragmatics. But Martin and Jurafsky is far more ambitious, because it covers speech recognition as well, and has far expanded coverage of language generation and translation. It also covers the great advances in statistical techniques that have marked the last decade. It is a beautiful synthesis that will reward the experienced expert in the field with new insights and new connections in the form of historical notes that are not well known. And it is well-written and clear enough that even the beginning student can follow it through. Before this book, you would have had to read Allen's book, Charniak's short book on statistical NLP, something on speech recognition, and something else on generation and translation. Like squeezing clowns into a circus car, Jurafsky and Martin somehow, improbably, manage to squeeze this all into one book, but in a way that is elegant and holds together perfectly; not at all the hodge-podge that one might expect. I expect that this book will be seen as one of the landmarks that pushes the field forward. It's worth comparing this book to the other recent NLP text: Manning and Shutze. Jurafsky and Martin cover much more ground, including many aspects that are ignored by Manning and Schutze. So if you want a general overview of natural language, if you want to know about the syntax of English, or the intricacies of dialog, if you are teaching or taking a general NLP course, then Jurafsky and Martin is the one for you. But if your needs are more focused on the algorithms for lower-level text processing with statistical techniques, or if you want to build a specific practical application, then Manning and Schutze is far more comprehensive and likely to have your answer. If you're a serious student or professional in NLP, you just have to have both.
Words and Transducers
26 other sections not shown