Natural Language Processing for Online Applications: Text retrieval, extraction and categorization. Second revised edition

This text covers the technologies of document retrieval, information extraction, and text categorization in a way that highlights their commonalities in terms of both general principles and practical concerns. It assumes some mathematical background on the part of the reader, but the chapters typically begin with a non-mathematical account of the key issues. Current research topics are covered only to the extent that they inform current applications; detailed coverage of longer-term research and more theoretical treatments should be sought elsewhere, and the many pointers at the ends of the chapters let the reader explore the literature. The book does, however, maintain a strong emphasis on evaluation in every chapter, in terms of both methodology and the results of controlled experimentation.