An automatic mark-up approach for structured document retrieval in engineering design

S. Liu*, C. A. McMahon, M. J. Darlington, S. J. Culley, P. J. Wild

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Information and knowledge retrieval has been recognized as a key issue in engineering design. A great deal of design-related information used and generated within engineering companies is formally recorded in documents. These documents become more useful if they are structured in a consistent way so that they can be retrieved and their contents accessed more effectively. Achieving useful structure in electronic documents relies on embedding some sort of mark-up or coding that is computer-understandable. Manual mark-up is time-consuming and costly. This paper proposes a knowledge engineering approach to automatic document mark-up employing XML (the eXtensible Mark-up Language) to 'tag' explicitly the structural information. The focus here is on long and complex engineering documents. A three-level model is explored to achieve automatic semantic mark-up using a set of document decomposition schemes. The model includes a strategic level which identifies document typographical features based on such things as styles, inference or templates; a tactical level to define the rules to realize semantic mark-up according to the document features; and an operational level to perform the computational implementation of the mark-up rules. By making document structure explicit, information retrieval can be made more focused by returning not just whole documents but the document components that are most relevant or of most interest to the engineering designer, and information relevant to the designer's need both with respect to document structure and content, not content alone. In addition, interpretation of useful structure by the human user can be hardwired into documents, which allows us to move closer to true semantic level retrieval.
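The three-level model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the strategic level has already labelled each paragraph with a typographical style, and it uses hypothetical style names, tag names and sample text throughout. The tactical level is reduced to a lookup table from style to XML tag, and the operational level applies that table with Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Strategic level (assumed input): paragraphs already labelled with a
# typographical style. The style names and sample text are made up.
PARAGRAPHS = [
    ("Title", "Gearbox design report"),
    ("Heading 1", "Requirements"),
    ("Body", "The gearbox shall transmit 5 kW."),
    ("Heading 1", "Concept selection"),
    ("Body", "Two concepts were compared."),
]

# Tactical level: hypothetical rules mapping a style to an XML tag.
STYLE_TO_TAG = {"Title": "title", "Heading 1": "heading", "Body": "para"}


def mark_up(paragraphs, rules=STYLE_TO_TAG):
    """Operational level: apply the rules and build an XML tree."""
    root = ET.Element("document")
    section = None  # the section currently being filled, if any
    for style, text in paragraphs:
        tag = rules.get(style, "para")  # unknown styles fall back to <para>
        if tag == "title":
            ET.SubElement(root, "title").text = text
        elif tag == "heading":
            # Each heading opens a new <section> component.
            section = ET.SubElement(root, "section")
            ET.SubElement(section, "heading").text = text
        else:
            parent = section if section is not None else root
            ET.SubElement(parent, tag).text = text
    return root


def find_sections(root, keyword):
    """Return the <section> elements whose heading contains the keyword."""
    keyword = keyword.lower()
    return [
        s for s in root.findall("section")
        if keyword in (s.findtext("heading") or "").lower()
    ]


doc = mark_up(PARAGRAPHS)
print(ET.tostring(doc, encoding="unicode"))
```

With the structure made explicit in this way, a query such as `find_sections(doc, "requirements")` returns only the matching document component rather than the whole document, which is the kind of focused retrieval the abstract argues for.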

Original language: English
Pages (from-to): 418-425
Number of pages: 8
Journal: International Journal of Advanced Manufacturing Technology
Volume: 38
Issue number: 3-4
Publication status: Published - Aug 2008

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Mechanical Engineering
  • Computer Science Applications
  • Industrial and Manufacturing Engineering

Keywords

  • Automatic mark-up
  • Document decomposition
  • Engineering design
  • Knowledge engineering approach
  • Structured document retrieval
  • XML
