An approach for document fragment retrieval and its formatting issue in engineering information management

Shaofeng Liu*, Chris A. McMahon, Mansur J. Darlington, Steve J. Culley, Peter J. Wild

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings published in a book › peer-review

Abstract

This paper discusses engineering document fragment mark-up supported by the use of the eXtensible Stylesheet Language - Formatting Objects (XSL-FO). XSL-FO can be used to convert the native format representation of documents such as Word, Excel and PDF into XML. Once in XML, document fragments can be retrieved at will in response to a search query. The paper models the process of document fragment retrieval, based on the authors' decomposition scheme approach, and addresses the issue of converting documents into XML. Additionally, the use of document templates is discussed as a means of ensuring that the transformed XML documents are compliant with the decomposition schemes. Automating the reformatting of documents into XML and using templates help make implementation of a document-fragment approach to retrieval more resource efficient, and so make its adoption in industry more practicable.
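
As a rough illustration of the retrieval idea described in the abstract, the following minimal Python sketch searches an already-converted XML document for fragments that match a query, and checks that every fragment has the structure a decomposition scheme might require. It is not the authors' implementation: the element names (`report`, `section`, `title`, `para`), the sample content, and the functions `retrieve_fragments` and `conforms_to_scheme` are all assumptions made for this example, and simple substring matching stands in for whatever search mechanism the paper uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML as it might look after conversion from a native format.
# The element names are illustrative only, not the authors' decomposition scheme.
SAMPLE_XML = """
<report>
  <section id="s1">
    <title>Material selection</title>
    <para>Aluminium alloy was chosen for the housing.</para>
  </section>
  <section id="s2">
    <title>Test results</title>
    <para>Fatigue testing of the housing passed at 10000 cycles.</para>
  </section>
</report>
"""


def retrieve_fragments(xml_text, query, fragment_tag="section"):
    """Return (id, title) for each fragment whose text contains the query."""
    root = ET.fromstring(xml_text)
    hits = []
    for fragment in root.iter(fragment_tag):
        # Join all text inside the fragment and compare case-insensitively.
        text = " ".join(fragment.itertext()).lower()
        if query.lower() in text:
            hits.append((fragment.get("id"), fragment.findtext("title")))
    return hits


def conforms_to_scheme(xml_text, fragment_tag="section", required=("title", "para")):
    """Check that every fragment contains the child elements the assumed scheme requires."""
    root = ET.fromstring(xml_text)
    return all(
        fragment.find(child) is not None
        for fragment in root.iter(fragment_tag)
        for child in required
    )


if __name__ == "__main__":
    print(retrieve_fragments(SAMPLE_XML, "fatigue"))
    print(conforms_to_scheme(SAMPLE_XML))
```

In this sketch the conformance check plays the role that document templates play in the abstract: a document authored against a template would be expected to pass it, so that fragment-level queries can rely on a known structure.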

Original language: English
Title of host publication: Computational Science and Its Applications - ICCSA 2006
Subtitle of host publication: International Conference, Proceedings - Part II
Publisher: Springer Verlag
Pages: 279-287
Number of pages: 9
ISBN (Print): 3540340726, 9783540340720
DOIs
Publication status: Published - May 2006
Event: ICCSA 2006: International Conference on Computational Science and Its Applications - Glasgow, United Kingdom
Duration: 8 May 2006 to 11 May 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3981 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: ICCSA 2006: International Conference on Computational Science and Its Applications
Country/Territory: United Kingdom
City: Glasgow
Period: 8/05/06 to 11/05/06

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
