
 
 
 

Interwoven MetaTagger

As organizations seek ways to efficiently leverage the ever-increasing amount of information available on the Web, it is becoming crucial to develop methods that make content "intelligent". By making content intelligent, organizations can deliver content that is appropriate and relevant to each potential audience. This ability is key to obtaining maximum value from search, personalization, syndication, and portal applications.

Interwoven MetaTagger automates the process of adding intelligence to content by creating rich, descriptive metadata from articles, documents, or Web pages. MetaTagger automatically categorizes these content items by subject and then identifies them so that they can be linked to other relevant information. This next-generation content intelligence solution provides applications that enable content creators to categorize their content interactively and allows developers to create procedures for recognizing and classifying content automatically.

Based on industry-standard or custom controlled vocabularies, MetaTagger can suggest appropriate metadata for content. Through either a semi-automated or fully automated approach, MetaTagger is able to add intelligence to content for use later in run-time search, personalization, syndication, and portal applications.

Controlled Vocabularies

MetaTagger uses controlled vocabularies to provide precision and consistency in tagging metadata. The human expertise that goes into building a controlled vocabulary ultimately enables more accurate metadata to be applied to assets automatically. Because the vocabularies are controlled, arbitrary metadata cannot be applied to content, which keeps metadata consistent across all assets.

MetaTagger supports multiple vocabularies, whether industry-standard or custom. Three vocabularies are included with MetaTagger: Public Companies, Geographic Locations, and Industrial Codes. MetaTagger vocabularies are expressed in XML, which enables the import of any new or existing custom vocabularies.
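To illustrate why an XML representation makes vocabularies easy to import, here is a minimal sketch in Python. The element names (`vocabulary`, `term`, `label`, `synonym`) and the function `load_vocabulary` are assumptions made for this example; the actual MetaTagger vocabulary schema is not described here and may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary layout, invented for illustration only.
# The real MetaTagger XML schema is not documented in this text.
SAMPLE_VOCABULARY = """
<vocabulary name="Geographic Locations">
  <term id="us-ny">
    <label>New York</label>
    <synonym>NYC</synonym>
    <synonym>New York City</synonym>
  </term>
  <term id="us-ca">
    <label>California</label>
  </term>
</vocabulary>
"""


def load_vocabulary(xml_text):
    """Return (vocabulary name, {lowercase surface form: preferred label})."""
    root = ET.fromstring(xml_text.strip())
    forms = {}
    for term in root.findall("term"):
        label = term.findtext("label")
        # The preferred label maps to itself.
        forms[label.lower()] = label
        # Each synonym maps back to the preferred label.
        for synonym in term.findall("synonym"):
            forms[synonym.text.lower()] = label
    return root.get("name"), forms


name, forms = load_vocabulary(SAMPLE_VOCABULARY)
```

Mapping every synonym back to one preferred label is one way a controlled vocabulary could keep tags consistent: whichever surface form appears in the content, the same label is applied.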

Content Classification and Recognition

MetaTagger categorizes text according to multiple schemes. It uses a training set of pre-categorized texts to learn which words and phrases occur in each category. It then analyzes the content and tags the asset with one or more appropriate subject categories. MetaTagger recognizers automatically scan content and identify words or phrases, such as products and services, company names, persons, and locations, that match entries in the controlled vocabulary. Assets are then automatically tagged with all matches in the vocabulary.
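The two ideas above, learning categories from pre-categorized texts and recognizing vocabulary entries in content, can be sketched in a few lines of Python. This is not MetaTagger's algorithm, which is not described here. The classifier below is a plain multinomial naive Bayes with add-one smoothing, chosen only because it is a well-known way to learn from a training set, and it returns a single best category rather than several. All function names and sample data are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count words per category from (category, text) training pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for category, text in examples:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(text))
    known_words = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, known_words


def classify(model, text):
    """Return the most probable category (naive Bayes, add-one smoothing)."""
    word_counts, doc_counts, known_words = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for category in doc_counts:
        total_words = sum(word_counts[category].values())
        score = math.log(doc_counts[category] / total_docs)
        for word in tokenize(text):
            if word in known_words:  # ignore words never seen in training
                count = word_counts[category][word] + 1
                score += math.log(count / (total_words + len(known_words)))
        scores[category] = score
    return max(scores, key=scores.get)


def recognize(text, terms):
    """Return the vocabulary terms that occur in text as whole words or phrases."""
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


# Hypothetical training set and controlled vocabulary.
training_set = [
    ("Finance", "shares rose as the bank reported strong quarterly earnings"),
    ("Finance", "investors sold stock after the earnings report"),
    ("Sports", "the team won the match with a late goal"),
    ("Sports", "the coach praised the team after the game"),
]
vocabulary_terms = ["Acme Corp", "New York", "California"]

article = "Acme Corp shares fell in New York after weak earnings"
model = train(training_set)
category = classify(model, article)
entities = recognize(article, vocabulary_terms)
```

In this sketch the article would be tagged with a subject category from `classify` and with every vocabulary match from `recognize`, mirroring the two kinds of metadata described above.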

Vocabulary Search and Browse

MetaTagger enables users to search or browse for terms in any given vocabulary. This is critically important in a semi-automated model, where MetaTagger automatically suggests metadata and an author or subject-matter expert refines the metadata based on personal or organizational knowledge. Users can add to, remove, or replace automatically suggested metadata with the results of a search or browse.
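The semi-automated model can be pictured as a small refinement step: suggested terms come in, a person searches the vocabulary, and the final tags are restricted to vocabulary entries. The sketch below is an illustration under that assumption only; `search_vocabulary`, `refine`, and the sample data are hypothetical and do not represent a MetaTagger interface.

```python
# Hypothetical controlled vocabulary for illustration.
VOCABULARY = ["California", "New Jersey", "New York", "Texas"]


def search_vocabulary(vocabulary, query):
    """Return vocabulary terms containing the query, case-insensitively."""
    q = query.lower()
    return sorted(term for term in vocabulary if q in term.lower())


def refine(suggested, vocabulary, add=(), remove=()):
    """Apply an editor's additions and removals to suggested metadata.

    Additions must come from the controlled vocabulary, so arbitrary
    terms cannot be applied.
    """
    allowed = set(vocabulary)
    for term in add:
        if term not in allowed:
            raise ValueError(f"{term!r} is not in the controlled vocabulary")
    dropped = set(remove)
    result = [term for term in suggested if term not in dropped]
    for term in add:
        if term not in result:
            result.append(term)
    return result


# Automatically suggested metadata, then refined by a subject-matter expert.
suggested = ["New York", "Texas"]
matches = search_vocabulary(VOCABULARY, "new")
final = refine(suggested, VOCABULARY, add=["New Jersey"], remove=["Texas"])
```

Rejecting additions that are not in the vocabulary is the design choice that preserves consistency even when people edit the suggestions by hand.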

Automated Processing

MetaTagger can be configured to apply metadata to assets in either a semi-automated or a fully automated manner. The fully automated mode provides organizations with a robust, efficient, and rapid mechanism for applying metadata to current or legacy content, incoming syndication feeds, or disparate corporate assets. TeamSite itself leverages MetaTagger automation within workflows to assign metadata to assets without necessarily involving people in the process.
