Towards more personalized web: Extraction and integration of dynamic content from the Web

Kowalkiewicz, Marek, Orlowska, Maria E., Kaczmarek, Tomasz and Abramowicz, Witold (2006). Towards more personalized web: Extraction and integration of dynamic content from the Web. In: X. Zhou, J. Li, H. Shen, M. Kitsuregawa and Y. Zhang, Lecture Notes in Computer Science: Frontiers of WWW Research and Development: Proceedings of the 8th Asia Pacific Web Conference (APWeb 2006). Frontiers of WWW Research and Development: APWeb 2006: 8th Asia-Pacific Web Conference (APWeb 2006), Harbin, China, (668-679). 16-18 January, 2006. doi:10.1007/11610113


Author Kowalkiewicz, Marek
Orlowska, Maria E.
Kaczmarek, Tomasz
Abramowicz, Witold
Title of paper Towards more personalized web: Extraction and integration of dynamic content from the Web
Conference name Frontiers of WWW Research and Development: APWeb 2006: 8th Asia-Pacific Web Conference (APWeb 2006)
Conference location Harbin, China
Conference dates 16-18 January, 2006
Proceedings title Lecture Notes in Computer Science: Frontiers of WWW Research and Development: Proceedings of the 8th Asia Pacific Web Conference (APWeb 2006)
Journal name Frontiers of WWW Research and Development - APWeb 2006, Proceedings
Place of Publication Berlin, Germany
Publisher Springer-Verlag
Publication Year 2006
Sub-type Fully published paper
DOI 10.1007/11610113
ISBN 3540311424
ISSN 0302-9743
Editor X. Zhou
J. Li
H. Shen
M. Kitsuregawa
Y. Zhang
Volume 3841
Start page 668
End page 679
Total pages 12
Collection year 2006
Language eng
Abstract/Summary Information and content integration are believed to be a possible solution to the problem of information overload on the Internet. The article gives an overview of a simple solution for integrating information and content on the Web. Previous approaches to content extraction and integration are discussed, followed by the introduction of a novel technology, based on XML processing, that deals with these problems. The article includes lessons learned from solving the issues of changing webpage layout, incompatibility with HTML standards, and multiplicity of the results returned. The method, which adopts relative XPath queries over the DOM tree, proves to be more robust than previous approaches to Web information integration. Furthermore, the prototype implementation demonstrates a simplicity that enables non-professional users to easily adopt this approach in their day-to-day information management routines.
Subjects E1
280103 Information Storage, Retrieval and Management
700103 Information processing services
Q-Index Code E1

 
Citation counts: TR Web of Science Citation Count Cited 1 time in Thomson Reuters Web of Science
Scopus Citation Count Cited 0 times in Scopus
Created: Thu, 23 Aug 2007, 21:46:39 EST