Knowledge Discovery and Management

Coping with information overload requires an innovative and multi-faceted approach.

In systems that search for documents relevant to a user's task, higher recall (finding more of the relevant documents) or higher precision (returning fewer irrelevant documents) can dramatically improve analysts' ability to extract meaningful information from text. ICCI engineers have implemented and tuned large-scale search applications using Oracle XML text indexing, Google search appliances, and the open-source Lucene text indexing engine. The open-source Solr project is used on top of Lucene to implement faceted browsing.
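As a rough illustration of the two measures, the sketch below computes precision and recall for a single query. The function name and the document IDs are hypothetical and chosen only for this example; they do not describe any ICCI system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document IDs the search system returned
    relevant:  document IDs that are actually relevant to the task
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were found

    # Precision: share of returned documents that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: four documents returned, three truly relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc1", "doc3", "doc5"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned documents are relevant, and two of the three relevant documents were found. Returning fewer irrelevant documents raises the first figure; finding the missing relevant document raises the second.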

Entity extraction identifies meaning from the actual content of a document, providing insight into its nature and purpose. Relation extraction refines the process further by identifying how the extracted entities are associated with one another. Past performance has also included advanced Ajax-based UIs that let analysts tag entities efficiently, and custom entity extractors that expose standard web service interfaces. Extracted entities and relationships are stored in ontological repositories that offer efficient storage and retrieval and organize the information in a meaningful hierarchical structure.
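One way to picture a store that keeps entities in a type hierarchy alongside their relationships is the minimal in-memory sketch below. The class, method, type, and entity names are all assumptions made for illustration; this is not ICCI's repository design and it uses no particular ontology product.

```python
from collections import defaultdict


class EntityStore:
    """Toy store: a type hierarchy, typed entities, and relation triples."""

    def __init__(self):
        self.parent = {}                    # type name -> parent type (or None)
        self.entities = {}                  # entity name -> type name
        self.relations = defaultdict(set)   # (subject, predicate) -> objects

    def add_type(self, type_name, parent=None):
        self.parent[type_name] = parent

    def add_entity(self, name, type_name):
        if type_name not in self.parent:
            raise KeyError(f"unknown type: {type_name}")
        self.entities[name] = type_name

    def relate(self, subject, predicate, obj):
        self.relations[(subject, predicate)].add(obj)

    def is_a(self, name, type_name):
        # Walk up the hierarchy from the entity's own type.
        current = self.entities.get(name)
        while current is not None:
            if current == type_name:
                return True
            current = self.parent.get(current)
        return False

    def entities_of_type(self, type_name):
        return sorted(n for n in self.entities if self.is_a(n, type_name))


# Hypothetical types, entities, and one extracted relationship.
store = EntityStore()
store.add_type("Agent")
store.add_type("Person", parent="Agent")
store.add_type("Organization", parent="Agent")
store.add_entity("Alice Example", "Person")
store.add_entity("Example Corp", "Organization")
store.relate("Alice Example", "works_for", "Example Corp")

print(store.entities_of_type("Agent"))
```

Because lookups walk the hierarchy, a query for the broad type "Agent" returns both the person and the organization, while a query for "Person" returns only the former.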

For research applications, ICCI developed an agent-based processing framework that runs analytics over large volumes of data, generating additional metadata that enhances search performance. Technology from this effort has been transitioned to operational systems that process extremely high volumes of data while supporting thousands of analysts.

Integrated Computer Concepts, Inc. 135 National Business Parkway, #210, Annapolis Junction, MD 20701
(tel) 240.295.0370 (fax) 240.295.0369