Information extraction  

From The Art and Popular Culture Encyclopedia


Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing, such as automatic annotation and content extraction from images, audio, video, and documents, can also be seen as information extraction.

Due to the difficulty of the problem, current approaches to IE (as of 2010) focus on narrowly restricted domains. An example is the extraction of corporate mergers from newswire reports, as denoted by the formal relation:

<math>\mathrm{MergerBetween}(company_1, company_2, date)</math>,

from an online news sentence such as:

"Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp."
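
The mapping from this sentence to the relation can be illustrated with a minimal sketch. It is not a real IE system: it assumes a single hand-written pattern for one phrasing, a small hypothetical list of company suffixes, and a caller-supplied publication date against which "Yesterday" is resolved. The names extract_merger, _COMPANY, and _PATTERN, and the date used in the usage lines, are made up for illustration.

```python
import datetime
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MergerBetween:
    """One filled-in instance of the MergerBetween(company_1, company_2, date) relation."""
    company_1: str
    company_2: str
    date: datetime.date


# Assumption: a company name is one or more capitalised words followed by a
# known suffix. Real systems use named-entity recognition instead.
_COMPANY = r"\b(?:[A-Z][\w&]*\s)+(?:Inc|Corp|Ltd|LLC)\."

# Assumption: only the phrasing "<A> announced their/its acquisition of <B>" is covered.
_PATTERN = re.compile(
    rf"(?P<acquirer>{_COMPANY}) announced (?:their|its) acquisition of (?P<target>{_COMPANY})"
)


def extract_merger(sentence: str, published: datetime.date) -> Optional[MergerBetween]:
    """Return the merger described in `sentence`, or None if the pattern does not match."""
    match = _PATTERN.search(sentence)
    if match is None:
        return None
    # "Yesterday" is the only relative date expression this sketch resolves;
    # otherwise the publication date itself is used.
    if re.search(r"\byesterday\b", sentence, re.IGNORECASE):
        when = published - datetime.timedelta(days=1)
    else:
        when = published
    return MergerBetween(match["acquirer"], match["target"], when)


sentence = "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp."
# The publication date below is a hypothetical value chosen for the example.
fact = extract_merger(sentence, published=datetime.date(2010, 5, 4))
print(fact)
```

Restricting the pattern to one phrasing mirrors the point made above: the narrower the domain, the more tractable the extraction.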

A broad goal of IE is to allow computation to be done on the previously unstructured data. A more specific goal is to allow logical reasoning to draw inferences based on the logical content of the input data. Structured data is semantically well-defined data from a chosen target domain, interpreted with respect to category and context.
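
Once facts are in structured form, simple inferences become ordinary computation. The sketch below is only an illustration under stated assumptions: it treats company_1 as the acquirer and company_2 as the acquired company, applies one hypothetical rule (control passes along chains of acquisitions), and ignores dates. The function name controlled_by and the company "Baz Ltd." are invented for the example.

```python
import datetime
from typing import Iterable, NamedTuple, Set


class MergerBetween(NamedTuple):
    company_1: str  # assumed here to be the acquirer
    company_2: str  # assumed here to be the acquired company
    date: datetime.date


def controlled_by(facts: Iterable[MergerBetween], owner: str) -> Set[str]:
    """Return every company `owner` controls directly or through a chain of acquisitions."""
    # Index the facts by acquirer so each lookup is a dictionary access.
    direct = {}
    for fact in facts:
        direct.setdefault(fact.company_1, set()).add(fact.company_2)

    # Depth-first traversal over the acquisition graph (dates are ignored).
    seen = set()
    stack = [owner]
    while stack:
        current = stack.pop()
        for target in direct.get(current, ()):
            if target not in seen and target != owner:
                seen.add(target)
                stack.append(target)
    return seen


# Hypothetical extracted facts; the dates are placeholders.
facts = [
    MergerBetween("Foo Inc.", "Bar Corp.", datetime.date(2010, 5, 3)),
    MergerBetween("Bar Corp.", "Baz Ltd.", datetime.date(2009, 1, 15)),
]
print(sorted(controlled_by(facts, "Foo Inc.")))
```

Neither input sentence would state that Foo Inc. controls Baz Ltd.; the conclusion follows only after both facts have been extracted into the same structured form.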





Unless indicated otherwise, the text in this article is either based on Wikipedia article "Information extraction" or another language Wikipedia page thereof used under the terms of the GNU Free Documentation License; or on research by Jahsonic and friends. See Art and Popular Culture's copyright notice.
