Approximation and exploration of streaming time series data

Xie, Qing (2013). Approximation and exploration of streaming time series data PhD Thesis, School of Information Technology and Electrical Engineering, The University of Queensland.

Attached Files (Some files may be inaccessible until you login with your UQ eSpace credentials)
Name Description MIMEType Size Downloads
s4182045_phd_finalthesis.pdf Thesis full text application/pdf 1.74MB 14
Author Xie, Qing
Thesis Title Approximation and exploration of streaming time series data
School, Centre or Institute School of Information Technology and Electrical Engineering
Institution The University of Queensland
Publication date 2013-02
Thesis type PhD Thesis
Supervisor Xiaofang Zhou
Chaoyi Pang
Total pages 192
Total colour pages 25
Total black and white pages 167
Language eng
Subjects 080609 Information Systems Management
Formatted abstract
Streaming time series data have been attracting huge interests in database area and data mining research for decades, due to the its important role in modeling the data properties in various real applications such as processing medical, financial and scientific data. Although existing database techniques have been extended for processing streaming data, the Data Stream Management System can only handle limited number of operations. The practical applications introduce more challenges on the adaptivity and efficiency and desire novel and advanced techniques for online processing. This work utilizes the approximation techniques to address some challenging problems for streaming time series data. Based on some essential studies of the piecewise linear approximation on streaming data, this work further addresses some advanced applications in stream mining, and designs efficient algorithms making use of linear approximation technique.
      Along with the review of current research, this thesis addresses the optimized error-bound linear approximation on data stream, which is a fundamental problem for advanced applications. By summarizing several essential properties of the linear function and the convex hull bounding the feasible region, this work interprets the proposed optimal algorithms, which produce the Piecewise Linear Representation with linear time and space complexity. The comparison with existing works demonstrates the superiority of the proposed solutions.
      The buffer technique has been applied to optimize error-bound linear representation in streaming scenario. In multi-stream environment, the overall buffer should be divided and allocated to each stream. Since buffer size can affect the approximation quality, this thesis addresses the allocation of the limited overall buffer resource, which aims to optimize the overall approximation quality by adjusting the buffer assigned to each stream. By discovering the statistical relationship between the data behavior and the buffer demand, this work designs a dynamic framework to decide the buffer allocation, which is adaptive to the actual buffer demand and optimizes the overall approximation quality.
      This thesis also focuses on the detection of local correlation for streaming data, which is an important issue in stream mining. Such problem is more challenging when there can be certain time delay between correlated subsequences, which is called lag correlation. Based on the fundamental results of Fast Fourier Transform, this work designs the basic solution to detect correlation supporting time delay. This thesis further utilizes the results of linear approximation to estimate the linear relationship between subsequences, which can effectively accelerate the correlation detection, and satisfy the online requirement.
     This work also puts efforts to apply approximation technique in real applications. The continuous detection of online video near-duplicates can be modeled as the continuous query on streaming data, however the high dimensionality of video data results in the failure of conventional pattern matching solutions. By discovering the inherent relationship between video frames, this work designs an approximation signature named Video Trend Stream to effectively reduce the dimensionality. This thesis discovers the information of content structure in video data, and proposes cut signature to accelerate the online processing.
Keyword Data stream
Time series
Performance analysis
Continuous queries
Duplicate detection
Multimedia data

Citation counts: Google Scholar Search Google Scholar
Created: Tue, 19 Mar 2013, 17:14:49 EST by Qing Xie on behalf of Scholarly Communication and Digitisation Service