A computational framework for retrieval of document fragments based on decomposition schemes in engineering information management

S. Liu, C. A. McMahon*, M. J. Darlington, S. J. Culley, P. J. Wild

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Retrieval of document fragments has great potential for application in engineering information management. Engineers frequently have neither the time nor the inclination to sift through long documents for small pieces of useful information, yet the information they seek is often presented in one or more long documents. Supporting the delivery of the right information, in the right format and in the right quantity, motivates the search for better ways of handling document sub-components, or fragments. Document fragment retrieval can be facilitated using modern computational technologies. This paper proposes a novel framework for information access that utilises state-of-the-art computational technologies and introduces the use of multiple document structure views through decomposition schemes. The framework integrates document structure study, mark-up technologies, automated fragment extraction, faceted classification and a document navigation mechanism to retrieve specific document fragments using precise, complex queries. These disparate elements have been brought together in an exploratory Engineering Document Content Management System (EDCMS). Investigations with this system on representative engineering documents have shown that information users can access and retrieve document content at fragment level rather than at document level: both through data in a document and through document metadata, from different perspectives and at different granularities, and simultaneously across multiple documents as well as within a single document.
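
As a rough illustration of fragment-level retrieval over marked-up documents, the following minimal Python sketch selects fragments from several XML documents by facet values. It is not the EDCMS implementation: the element names, the `type` and `component` facet attributes, the sample documents and the `retrieve_fragments` helper are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Two toy documents. Element names and facet attributes are hypothetical
# and chosen only for illustration.
DOCS = {
    "report_a.xml": """<report>
  <section type="requirement" component="pump"><p>Flow rate shall exceed 5 l/s.</p></section>
  <section type="test-result" component="pump"><p>Measured flow rate: 5.4 l/s.</p></section>
</report>""",
    "report_b.xml": """<report>
  <section type="requirement" component="valve"><p>Valve shall close within 2 s.</p></section>
  <section type="test-result" component="valve"><p>Measured closing time: 1.7 s.</p></section>
</report>""",
}


def retrieve_fragments(docs, tag, **facets):
    """Return (document name, fragment text) pairs for every element named
    `tag` whose attributes match all of the given facet values."""
    hits = []
    for name, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter(tag):
            if all(elem.get(key) == value for key, value in facets.items()):
                # Collapse the fragment's text content to a single line.
                text = " ".join("".join(elem.itertext()).split())
                hits.append((name, text))
    return hits


# Query across multiple documents at fragment level.
requirements = retrieve_fragments(DOCS, "section", type="requirement")
# Combine two facets to narrow the query to a single fragment.
pump_results = retrieve_fragments(
    DOCS, "section", type="test-result", component="pump"
)
```

Here `requirements` gathers the requirement fragments from both documents, while `pump_results` combines two facets to return a single fragment. A real system would presumably draw its facets from a classification scheme and its fragment boundaries from a decomposition scheme rather than from hard-coded attributes.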

Original language: English
Pages (from-to): 401-413
Number of pages: 13
Journal: Advanced Engineering Informatics
Volume: 20
Issue number: 4
Publication status: Published - Oct 2006

ASJC Scopus subject areas

  • Information Systems
  • Artificial Intelligence

Keywords

  • Computational framework
  • Decomposition schemes
  • Document mark-up
  • Faceted classification
  • Fragment retrieval
  • XML
