Advanced Data Mining Methods for Electricity Customer Behaviour Analysis in Power Utility Companies

Ms Anisah Nizar (2008). Advanced Data Mining Methods for Electricity Customer Behaviour Analysis in Power Utility Companies. PhD Thesis, School of Information Technol and Elec Engineering, The University of Queensland.

Attached Files
Name Description MIMEType Size Downloads
n40915548_phd_abstract.pdf 40915548_phd_thesis_abstract application/pdf 10.85KB 275
n40915548_phd_totalthesis.pdf 40915548_phd_total_thesis application/pdf 18.84MB 60
Author Ms Anisah Nizar
Thesis Title Advanced Data Mining Methods for Electricity Customer Behaviour Analysis in Power Utility Companies
School, Centre or Institute School of Information Technol and Elec Engineering
Institution The University of Queensland
Publication date 2008-08
Thesis type PhD Thesis
Supervisor Prof. Zhao Yang Dong
Prof. Penelope Sanderson
Total pages 241
Total colour pages 41
Total black and white pages 200
Subjects 280000 Information, Computing and Communication Sciences
Formatted abstract
Non-Technical Loss (NTL) arising through power theft and other customer malfeasances is a
universal problem in the electricity supply industry. Such loss may occur by a number of means,
such as meter tampering, illegal connections, billing irregularities, and unpaid bills. The associated
identification, detection, and prediction procedures are important for many utilities, particularly
those in such developing countries as Malaysia, Thailand, and Indonesia. NTL is mainly a
consequence of power theft and involves customer management processes that include recognizing
a number of means of consciously defrauding the utility concerned. Currently, most solutions are ad
hoc and can only be implemented after a long period of detection and observation.
NTL activity is considered such a serious problem for many electricity utilities worldwide because
it not only affects the profitability and credibility of the companies concerned, but also because it
increases the cost of electricity to their customers. There is, therefore, a crucial need to minimize
the extent and impact of this problem in the interests of both the utilities themselves, including the
Tenaga Nasional Berhad (TNB) of Malaysia that was the focus of the present study, and their
electricity customers. The aim of the research reported here was to develop an analytical framework
of customers’ behaviour to assist in the identification, detection, and prediction of NTL activity.
These objectives were to be achieved by pursuing significant deviations in the load consumption
behaviour of customers signalled by means of data-mining techniques. The framework for NTL
analysis developed as the main outcome of this research will have significant benefits for electricity
utilities and their customers.
Many load profile studies have used data-mining techniques, pattern recognition, and statistical
techniques to obtain knowledge from customer load records. Having such knowledge concerning
customers’ consumption behaviour is very important in the electricity supply business as it proves
very useful in enabling utilities to formulate tariffs and develop marketing strategies, as well as
allowing them to properly bill customers who deviate from their original contracts. The motivation
for the present research was to expand the capacity of the knowledge gathered from customers’ load
profiles to identify, detect, and predict behaviour irregularities or abnormalities that ultimately may
be due to faulty metering or to human intervention designed to perpetrate fraud in the electricity
billing process.
A series of data-mining techniques, including those comprising feature selection, classification, and
prediction, have been applied in achieving the objectives cited above by means of the proposed
analytical framework. Established techniques, including Support Vector Machine (SVM), Extreme
Learning Machine (ELM), and its variation, Online Sequential (OS)-ELM, were used in the
classification and prediction analysis. Prior to that, various feature selection algorithms available in
WEKA software, including best-first search, greedy stepwise search, and rank search, were among
the means selected to optimize the data-mining quality and performance by selecting the most apt
attributes for application in the processes of classification and prediction.
The results gathered from these analyses have proven to be promising and convincing. Three
classification techniques, ELM, OS-ELM and LIBSVM, have been used to predict customers’
behaviour and assign it to classes as normal, abnormal, and suspicious. All three classification
algorithms applied both sigmoid and RBF node functions. The comprehensive analysis and
comparison of results included distinguishing between types of days in the consumption patterns,
including weekdays, Saturdays, Sundays, and public holidays. The results have been expressed as
percentages for classification success rates and in seconds for time-processing speeds. What the
research revealed, in particular, is that the ELM and OS-ELM techniques have better time-processing
and classification rates than does LIBSVM.
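As a rough illustration of the kind of classifier named above, the following is a minimal sketch of a basic ELM with sigmoid hidden nodes, written in Python with NumPy. It is not the thesis implementation: the function names, the hidden-layer size, and the synthetic four-feature "load profiles" are all assumptions made for the example, and the mapping of class indices 0, 1, 2 to normal, abnormal, and suspicious is purely illustrative. OS-ELM, RBF nodes, and LIBSVM are not shown.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_elm(X, labels, n_classes, n_hidden=20, seed=0):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = sigmoid(X @ W + b)                       # hidden-layer output matrix
    T = np.eye(n_classes)[labels]                # one-hot targets
    beta = np.linalg.pinv(H) @ T                 # output weights via pseudo-inverse
    return W, b, beta


def predict_elm(X, model):
    W, b, beta = model
    return np.argmax(sigmoid(X @ W + b) @ beta, axis=1)


def make_toy_profiles(n_per_class, rng):
    """Synthetic stand-in data (hypothetical): three tight clusters of 4 features."""
    centres = np.array([
        [1.0, 1.0, 1.0, 1.0],  # class 0, illustratively "normal"
        [0.2, 0.2, 1.0, 1.0],  # class 1, illustratively "abnormal"
        [1.0, 1.0, 0.2, 0.2],  # class 2, illustratively "suspicious"
    ])
    X = np.vstack([c + 0.05 * rng.normal(size=(n_per_class, 4)) for c in centres])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X_train, y_train = make_toy_profiles(40, rng)
    X_test, y_test = make_toy_profiles(20, rng)
    model = train_elm(X_train, y_train, n_classes=3)
    accuracy = np.mean(predict_elm(X_test, model) == y_test)
    print(f"held-out accuracy: {accuracy:.2f}")
```

Because only the output weights are solved for, in a single pseudo-inverse step, training involves no iterative optimisation. This is the usual explanation offered for the short training times of ELM-type methods.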
In forecasting customers’ behaviour, the two prediction techniques used were ELM and OS-ELM.
Both employed two activation functions, namely sigmoid and radial basis function (RBF) nodes.
Three case studies were undertaken focusing respectively on customers with normal behaviour,
customers with combinations of abnormal, normal, and suspicious behaviour, and customers with
abnormal behaviour, with the results measured in root mean square error (RMSE) for error rates and
in seconds for time-processing speed. Ultimately, these results confirmed the appropriateness of
the emphasis on short-term load forecasting for predicting normal behaviour established in other
relevant research. However, it is the prediction of abnormal and suspicious behaviour that has the
potential to assist management in making decisions when it is based on the certainty of the
measures specified.
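For readers unfamiliar with the error measure mentioned above, RMSE is the square root of the mean squared difference between observed and forecast values. The short Python sketch below shows the calculation; the function name and the sample load values are made up for illustration and do not come from the thesis.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))


if __name__ == "__main__":
    # Hypothetical half-hourly loads (arbitrary units) and their forecasts.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 5.0]
    print(rmse(observed, predicted))
```

A lower RMSE means the forecast tracks the observed load more closely. A value of zero means the two series are identical.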
In short, the pertinent conclusion here is that NTL activity can be identified, detected, and predicted
from customers’ load behaviour using data-mining techniques. The proposed analytical framework
is suitable for use in any electricity supply company as a means of acquiring knowledge of its
customers’ existing behaviour and of monitoring changes in such behaviour. In addition, NTL
identification, detection, and prediction can provide a management tool for investigation teams so
as to enhance their decision making when targeting meters that are suspected of signalling
fraudulent activities.
Keyword Classification, Customer Load Behaviour, Data Mining, Extreme Learning Machine (ELM),
Additional Notes 65, 72, 73, 75, 76, 88, 91, 92, 105, 108, 112, 115, 119, 122, 125, 127, 129, 131, 133, 141, 142, 143, 144, 145, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 161, 162, 163, 164.

Created: Thu, 13 Nov 2008, 21:37:27 EST by Ms Anisah Nizar on behalf of Library - Information Access Service