Tribune

Monday, November 17, 2003
Feature

Predicting client’s behaviour through data mining
Jaspreet Bedi

DATA analysis is at the heart of the software development process. Its purpose is to discover previously unknown data characteristics, relationships, dependencies or trends, which become the basis of the information framework on which decisions are built. Data analysis tools rely entirely on end users to recognise the problem, and the analysis becomes quite complex if the end user fails to state the problem appropriately. Given this limitation, current DSS (decision support systems) are moving towards various types of automated alerts. These alerts are simply software agents that constantly monitor certain parameters and perform specified actions when those parameters reach pre-defined values. For example, an alert may keep a check on sales indicators and inventory levels and, when needed, send e-mail or alert messages or run appropriate programs.
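The alert mechanism described above can be illustrated with a minimal sketch. The names here (Alert, notify, check_alerts, and the parameters inventory_level and weekly_sales) are hypothetical and chosen only for illustration; a real DSS would send e-mail or launch a program where this sketch merely returns a message.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    """A rule that runs an action when a monitored parameter reaches a limit."""
    parameter: str                        # name of the monitored value (hypothetical)
    threshold: float                      # pre-defined trigger value
    action: Callable[[str, float], str]   # what to do when the alert fires


def notify(parameter: str, value: float) -> str:
    # Stand-in for sending an e-mail or running an appropriate program.
    return f"ALERT: {parameter} has fallen to {value}"


def check_alerts(readings: Dict[str, float], alerts: List[Alert]) -> List[str]:
    """Run the action of every alert whose parameter is at or below its threshold."""
    messages = []
    for alert in alerts:
        value = readings.get(alert.parameter)
        if value is not None and value <= alert.threshold:
            messages.append(alert.action(alert.parameter, value))
    return messages


alerts = [Alert("inventory_level", 20, notify), Alert("weekly_sales", 100, notify)]
readings = {"inventory_level": 12, "weekly_sales": 340}
messages = check_alerts(readings, alerts)
print(messages)
```

In this sketch only the inventory alert fires, because weekly sales are still above their threshold. The agent reacts to values someone chose in advance, which is why such alerts remain a reactive tool.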

In contrast to the traditional (reactive) DSS tools, the premise of data mining is proactive. That is, instead of relying on the end user to do the whole analysis, data mining tools automatically search the data for anomalies, their possible causes and the relationships between them, so that problems the end user has not identified can be found. Hence, data mining tools not only analyse the data and unveil problems or opportunities hidden in the data relationships but also form computer models based on their findings. This minimises end user intervention, and the user is therefore able to concentrate on the system’s findings to gain knowledge, which may yield competitive advantages. Data mining thus represents a new breed of specialised decision support tools that automate the data analysis process, thereby increasing efficiency. They are based on algorithms that form the building blocks for artificial intelligence to create knowledge.

Data mining can also be described as a methodology designed to perform knowledge discovery expeditions over database data with minimal end user intervention. The tools governing the process of data mining are, however, not standardised. Hence they can be implemented in different ways and applied to different data. In spite of the lack of precise standards, the process of data mining is said to pass through four phases:

1) In the data preparation phase, the main data sets to be used by the data mining operation are identified and cleansed of data impurities. As data in the data warehouse are already integrated and filtered, the data warehouse usually acts as the target set for data mining operations.

2) The data analysis and classification phase studies the data in order to identify common data characteristics or patterns. During this phase the data mining tool applies specific algorithms to find data groupings, classifications, clusters, sequences, data dependencies, links or relationships, and data patterns, trends and deviations.

3) The knowledge acquisition phase uses the output of the data analysis and classification phase. During this phase, the data mining tool selects the appropriate modelling or knowledge acquisition algorithms, possibly with some intervention by the end user. The algorithms used in mining are based on neural networks, decision trees, rule induction, genetic algorithms, classification and regression trees, memory-based reasoning or nearest neighbour, and data visualisation. The result of these algorithms is the generation of a computer model that reflects the behaviour of the target data set.

4) Although many data mining tools stop at the knowledge acquisition phase, others continue to the prognosis phase. In this phase, data mining findings are used to predict and forecast future business behaviour.
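The four phases can be sketched end to end on a toy data set. Everything below is an assumption made for illustration: the records are invented, the function names (prepare, analyse, acquire_model, predict) are hypothetical, and the "model" is a simple majority rule per group, standing in for the far richer algorithms a real tool would choose from.

```python
from collections import defaultdict

# Invented records: (used_card_recently, cancelled_account).
# None marks a missing value that the preparation phase must remove.
raw = [
    (True, False), (True, False), (True, True), (True, False),
    (False, True), (False, True), (False, False), (False, True),
    (None, True), (False, None),
]


def prepare(records):
    """Phase 1: cleanse the data by dropping records with missing values."""
    return [r for r in records if None not in r]


def analyse(records):
    """Phase 2: group the records and measure the cancellation rate per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [cancelled, total]
    for used, cancelled in records:
        counts[used][0] += int(cancelled)
        counts[used][1] += 1
    return {group: c / t for group, (c, t) in counts.items()}


def acquire_model(rates):
    """Phase 3: turn the rates into a model (here: predict the majority outcome)."""
    return {group: rate >= 0.5 for group, rate in rates.items()}


def predict(model, used_card_recently):
    """Phase 4: prognosis for a customer not seen before."""
    return model[used_card_recently]


clean = prepare(raw)
rates = analyse(clean)
model = acquire_model(rates)
print(rates)
print("Inactive customer likely to cancel:", predict(model, False))
```

The point of the sketch is the hand-off between phases: each function consumes only the output of the previous one, so the end user needs to step in only to inspect the resulting rates and model.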

A few examples of data mining findings can be:

1. 65 per cent of customers who did not use a particular credit card in the last six months are 88 per cent likely to cancel that account.

2. 82 per cent of customers who bought a new computer are 90 per cent likely to buy a Web camera within the next four weeks.
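The article does not say how such figures are derived. One common way to quantify a finding of the form "customers who bought X also buy Y" is through the support and confidence of an association rule, sketched below. The transactions are invented and the function name rule_metrics is hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all transactions containing both items
    confidence = share of antecedent transactions that also contain the consequent
    """
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Invented purchase histories, one set of items per customer.
transactions = [
    {"computer", "web camera"},
    {"computer", "web camera", "printer"},
    {"computer"},
    {"printer"},
    {"computer", "web camera"},
]

support, confidence = rule_metrics(transactions, "computer", "web camera")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here three of the five customers bought both items and three of the four computer buyers also bought a Web camera, so the rule has a support of 0.60 and a confidence of 0.75.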

Data mining technologies have the potential of becoming the next frontier in database development.