Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization
This text covers the emerging technologies of document retrieval, information extraction, and text categorization in a way that highlights commonalities in terms of both general principles and practical issues. It seeks to satisfy a need on the part of technology practitioners in the Internet space, faced with difficult decisions as to what research has been done and what the best practices are. It is not intended as a vendor guide (such things are quickly out of date), or as a recipe for building applications (such recipes are very context-dependent). But it does identify the key technologies, the issues involved, and the strengths and weaknesses of the various approaches. There is also a strong emphasis on evaluation in every chapter, both in terms of methodology (how to evaluate) and what controlled experimentation and industrial experience have to tell us.
From inside the book
Page 97
... table avoids the duplication of effort commonly found in less sophisticated algorithms and aids efficiency. CYK is ... table of size n. The table is accessed by subscripts in the range [1, n], and V_{i,j} denotes the cell in the ith ...
Page 98
... Table 3.6. The wfsst corresponds to all but the top line of the table shown above. Row 1 of the table consists of the lexical items in the sentence to be parsed - in this case "The court denies the motion." (We will omit row ...
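The snippets above describe CYK as a table-driven parser whose cells are addressed by subscripts in the range [1, n]. A minimal sketch of that idea follows, using the example sentence without its final period. The toy grammar, the names `LEXICON`, `BINARY_RULES`, `cyk`, and `recognizes`, and the convention that cell (i, j) holds the categories spanning words i through j are all assumptions made for illustration; they are not taken from the book.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (an assumption, not the book's grammar).
LEXICON = {
    "the": {"Det"},
    "court": {"N"},
    "motion": {"N"},
    "denies": {"V"},
}
BINARY_RULES = {
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
}


def cyk(words):
    """Fill a table mapping (i, j) to the categories spanning words i..j.

    Indices are 1-based and inclusive, so the whole sentence is cell (1, n).
    """
    n = len(words)
    table = defaultdict(set)
    # Spans of length 1: look each word up in the lexicon.
    for i, word in enumerate(words, start=1):
        table[(i, i)] |= LEXICON.get(word.lower(), set())
    # Longer spans: combine two adjacent, already-filled sub-spans.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):  # split point: (i, k) and (k + 1, j)
                for left in table[(i, k)]:
                    for right in table[(k + 1, j)]:
                        table[(i, j)] |= BINARY_RULES.get((left, right), set())
    return table


def recognizes(words, start="S"):
    """Return True if the start symbol spans the entire input."""
    return start in cyk(words)[(1, len(words))]


if __name__ == "__main__":
    sentence = "The court denies the motion".split()
    print(recognizes(sentence))
```

Because each cell is computed once and then reused by every longer span that contains it, the table avoids re-deriving the same constituent repeatedly, which is the efficiency point the Page 97 snippet makes.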
Page 105
... Table 3.9.

Table 3.9 Filled template for a 'Procedure' event

Procedure
  Type:    petition
  Purpose: strike
  Party:   defendant
  Outcome: denied

These data objects are built by searching the well-formed substring table after the parse is ...
Common terms and phrases
algorithm analysis anaphora applications approach assigned automatic Boolean Chapter classifiers cluster collection combination computed Conference contain context corefer coreference court decision tree document retrieval estimate evaluation example FASTUS filtering finite frequency given grammar identify information extraction information retrieval linear classifiers linguistic Machine Learning match measure Message Understanding Conference methods Microsoft Naïve Bayes named entity Natural Language Processing non-relevant NOT-A-NAME noun groups noun phrase occur parser parsing patterns performance probabilistic probability problem Proceedings pronoun proper names query expansion query terms ranked retrieval recall and precision regular expressions relevance feedback relevant documents represent rules score search engine Section semantic sentence Sidebar simple statistical structure summary syntactic Table tagged taggers task techniques template text categorization text classification text mining tf-idf topic TREC typically vector space vector space model weight vector words