Dimensional reduction of web-traffic data

Nikulin, Vladimir (2006). Dimensional reduction of web-traffic data. In: Belur V. Dasarathy, Proceedings of SPIE Vol. 6241. Data Mining, Intrusion Detection, Information Assurance, and Networks Security, Orlando, Florida, USA, (J2410-J2410). 17-18 April 2006. doi:10.1117/12.664767


Author Nikulin, Vladimir
Title of paper Dimensional reduction of web-traffic data
Conference name Data Mining, Intrusion Detection, Information Assurance, and Networks Security
Conference location Orlando, Florida, USA
Conference dates 17-18 April 2006
Proceedings title Proceedings of SPIE Vol. 6241
Journal name Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006
Place of Publication Bellingham, Washington, U.S.A.
Publisher SPIE
Publication Year 2006
Sub-type Fully published paper
DOI 10.1117/12.664767
Open Access Status File (Publisher version)
ISBN 0-8194-6297-7
ISSN 0277-786X
Editor Belur V. Dasarathy
Volume 6241
Issue 62410J
Start page J2410
End page J2410
Total pages 12
Language eng
Formatted Abstract/Summary
Dimensional reduction may be an effective way to compress data without loss of essential information. It may also be useful for smoothing data and reducing random noise. The model presented in this paper was motivated by the structure of the msweb web-traffic dataset from the UCI archive. The proposal is to reduce the dimension (the number of web-areas, or vroots, in use) through an unsupervised learning process that maximizes a specially defined average log-likelihood divergence. Two different web-areas are merged if they frequently appear together in the same sessions. The roles of the web-areas in the merging process are not symmetrical: the web-area or cluster with the larger weight acts as an attractor and stimulates merging, whereas the smaller cluster tries to keep its independence. In both cases the strength of attraction or resistance depends on the weights of the corresponding clusters. This strategy prevents the creation of one very large cluster and helps to reduce the number of non-significant clusters. The proposed method is illustrated with two synthetic examples. The first is based on an ideal vlink matrix, which characterizes the weights of the vroots and the relations between them. The vlink matrix for the second example was generated using a specially designed web-traffic simulator.
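The merging idea in the abstract can be illustrated with a rough Python sketch. It is not the paper's method: the abstract does not give the log-likelihood divergence, so the score used here (the share of the lighter area's sessions that also contain the heavier area) is a simple stand-in chosen only for illustration. The function names vlink_counts, merge_step and reduce_areas, the threshold parameter, and the toy sessions are all hypothetical. The sketch assumes that an area's weight is the number of sessions in which it appears and that a vlink-style matrix holds these weights on the diagonal and co-occurrence counts off the diagonal.

```python
from collections import Counter
from itertools import combinations


def vlink_counts(sessions):
    """Count sessions per area (weights) and per unordered pair of areas."""
    weight = Counter()  # assumed diagonal of a vlink-style matrix
    co = Counter()      # assumed off-diagonal entries, keyed by (a, b) with a < b
    for session in sessions:
        areas = sorted(set(session))
        weight.update(areas)
        co.update(combinations(areas, 2))
    return weight, co


def merge_step(sessions, threshold):
    """Merge the best-scoring pair, lighter area into heavier, if any beats threshold."""
    weight, co = vlink_counts(sessions)
    best, best_score = None, threshold
    for (a, b), n in sorted(co.items()):
        small, big = (a, b) if weight[a] <= weight[b] else (b, a)
        # Stand-in score (NOT the paper's divergence): fraction of the
        # lighter area's sessions in which the heavier area also appears.
        score = n / weight[small]
        if score > best_score:
            best, best_score = (small, big), score
    if best is None:
        return sessions, None
    small, big = best
    relabelled = [{big if a == small else a for a in s} for s in sessions]
    return relabelled, best


def reduce_areas(sessions, threshold=0.6):
    """Repeat merge_step until no pair scores above the threshold."""
    sessions = [set(s) for s in sessions]
    merges = []
    while True:
        sessions, pair = merge_step(sessions, threshold)
        if pair is None:
            return sessions, merges
        merges.append(pair)


# Toy data: each session is the set of area ids visited in it.
toy_sessions = [{0, 1}, {0, 1}, {0, 1, 2}, {0}, {2, 3}, {3}, {3}]
reduced, merges = reduce_areas(toy_sessions, threshold=0.6)
print(merges)
print(sorted({a for s in reduced for a in s}))
```

In this toy run, areas that almost always occur together collapse into the heavier one, while areas that share only a minority of their sessions stay separate. A faithful implementation would replace the stand-in score with the divergence defined in the paper.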
Subjects 0199 Other Mathematical Sciences
Keyword data compression
distance-based clustering
log-likelihood
web-traffic data
Q-Index Code EX

Created: Wed, 07 Jul 2010, 00:58:41 EST by Ms May Balasaize on behalf of Faculty of Science