Data Mining and Asset Condition Management My Story - Part II
- IntelData Pty Ltd | Asset Management and Business
- Jan 3, 2016
The Solution
Big Data and Data Mining
At the time I was doing my master's in Industrial Engineering – Maintenance and Logistics. I had decided that I needed to refresh my knowledge if I was going to get the promotions I thought I really deserved. While I was struggling to get my head around the new job, balance work and family time, and find a topic for my thesis, I stumbled upon a new concept: “DATA MINING”.
One of my professors suggested that I use data mining techniques to do something in the area of marketing, maybe a new method for clustering customers or a market basket analysis. Frankly, although I found data mining very intriguing, I was already juggling too many balls. So I began to wonder what I could do with data mining that would help me in my job and, at the same time, help me complete my master's thesis.
Embarking on the Project
One of the good things we had in our system was our CMMS, which had been developed in-house by our IT Department. The CMMS data was reliable, and I could easily mine it in search of a solution to my problem. I am not going to go through all the details of the project. However, I will try to lay out the project pathway, which is similar in all data mining projects, so that if the reader wonders how to approach a data mining project, this information may be of some use. I will also introduce some of the common terminology used in the data mining world.
I started the fleet data mining project based on the phases of CRISP-DM. Below I explain what CRISP-DM is.
CRISP-DM:
To go about any project, one needs a framework. Data mining projects are no exception. The Cross Industry Standard Process for Data Mining (CRISP-DM) is a data mining process model that describes the approaches data miners commonly use to tackle data mining problems.

According to CRISP-DM, a given data mining project has a life cycle consisting of six phases, as illustrated in Figure 1. Note that the phase sequence is adaptive, which means that the next phase in the sequence often depends on the outcomes of the preceding phase. (1)
CRISP-DM: The Six Phases
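To make the idea of an adaptive sequence a little more concrete, here is a minimal Python sketch of the six phases (described in the next section) and the moves between them. The transitions follow the arrows usually drawn in the CRISP-DM diagram; the names `Phase`, `ALLOWED_MOVES` and `can_move` are my own illustrative choices and are not part of the standard.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELLING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


# Moves between phases, following the arrows of the usual CRISP-DM diagram.
# Backward moves are what make the sequence adaptive rather than strictly linear.
ALLOWED_MOVES = {
    Phase.BUSINESS_UNDERSTANDING: {Phase.DATA_UNDERSTANDING},
    Phase.DATA_UNDERSTANDING: {Phase.BUSINESS_UNDERSTANDING, Phase.DATA_PREPARATION},
    Phase.DATA_PREPARATION: {Phase.MODELLING},
    Phase.MODELLING: {Phase.DATA_PREPARATION, Phase.EVALUATION},
    Phase.EVALUATION: {Phase.BUSINESS_UNDERSTANDING, Phase.DEPLOYMENT},
    Phase.DEPLOYMENT: set(),
}


def can_move(current, nxt):
    """Return True if the diagram has an arrow from `current` to `nxt`."""
    return nxt in ALLOWED_MOVES[current]
```

For example, `can_move(Phase.MODELLING, Phase.DATA_PREPARATION)` is true, which reflects looping back to reshape the data for a particular technique.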
Here I give a short description of each of the phases and what happens during each phase. (2)
1. Business understanding phase
The first phase in the CRISP–DM standard process may also be termed the research understanding phase.
Enunciate the project objectives and requirements clearly in terms of the business or research unit as a whole.
Translate these goals and restrictions into the formulation of a data mining problem definition.
Prepare a preliminary strategy for achieving these objectives.
2. Data understanding phase
Collect the data.
Use exploratory data analysis to familiarize yourself with the data and discover initial insights.
Evaluate the quality of the data.
If desired, select interesting subsets that may contain actionable patterns.
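As a rough illustration of what this phase can look like in practice, the sketch below profiles a handful of work-order records. It is not the code from my project: the records, the field names (`vehicle_id`, `failure_type`, `repair_hours`) and the helper functions are all hypothetical, and only the Python standard library is used.

```python
from collections import Counter
from statistics import mean

# Hypothetical CMMS work-order records; field names and values are made up.
work_orders = [
    {"vehicle_id": "V01", "failure_type": "engine", "repair_hours": 5.0},
    {"vehicle_id": "V02", "failure_type": "brakes", "repair_hours": None},
    {"vehicle_id": "V01", "failure_type": "engine", "repair_hours": 7.0},
    {"vehicle_id": "V03", "failure_type": None, "repair_hours": 2.0},
]


def missing_counts(records):
    """Count missing (None) values per field: a basic data quality check."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            if value is None:
                counts[field] += 1
    return dict(counts)


def category_frequencies(records, field):
    """Frequency of each non-missing value of a categorical field."""
    return Counter(r[field] for r in records if r[field] is not None)


def numeric_summary(records, field):
    """Count, minimum, maximum and mean of the non-missing values of a numeric field."""
    values = [r[field] for r in records if r[field] is not None]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


quality = missing_counts(work_orders)
failure_mix = category_frequencies(work_orders, "failure_type")
hours = numeric_summary(work_orders, "repair_hours")
```

Even a profile this simple shows which fields have gaps and which categories dominate, which is the kind of initial insight this phase is after.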
3. Data preparation phase
Prepare from the initial raw data the final data set that is to be used for all subsequent phases. This phase is very labour intensive.
Select the cases and variables you want to analyse and that are appropriate for your analysis.
Perform transformations on certain variables, if needed.
Clean the raw data so that it is ready for the modelling tools.
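The preparation steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the records, the field names, the rule for discarding a case and the log transformation are hypothetical choices, not the ones used in the fleet project.

```python
import math

# Hypothetical raw records; field names and values are made up.
raw = [
    {"vehicle_id": "V01", "odometer_km": 120000, "repair_cost": 800.0, "notes": "ok"},
    {"vehicle_id": "V02", "odometer_km": None, "repair_cost": 150.0, "notes": ""},
    {"vehicle_id": "V03", "odometer_km": 45000, "repair_cost": -20.0, "notes": "typo?"},
    {"vehicle_id": "V04", "odometer_km": 80000, "repair_cost": 300.0, "notes": "ok"},
]

# Variables kept for the analysis (free-text notes are dropped).
KEEP = ("vehicle_id", "odometer_km", "repair_cost")


def prepare(records):
    """Select cases and variables, clean, and transform the raw records."""
    prepared = []
    for record in records:
        # Clean and select cases: skip records with missing or impossible values.
        if record["odometer_km"] is None or record["repair_cost"] < 0:
            continue
        # Select variables: keep only the fields needed for modelling.
        row = {name: record[name] for name in KEEP}
        # Transform: log(1 + cost) compresses the long right tail of costs.
        row["log_repair_cost"] = math.log1p(row["repair_cost"])
        prepared.append(row)
    return prepared


final_data = prepare(raw)
```

Dropping incomplete cases is the simplest possible cleaning rule; depending on the data, imputing the missing values may be a better choice.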
4. Modelling phase
Select and apply appropriate modelling techniques.
Calibrate model settings to optimize results.
Remember that often, several different techniques may be used for the same data mining problem.
If necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique.
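To show what "several techniques for the same problem" and "calibrating model settings" can mean, here is a small sketch that compares a mean baseline with a k-nearest-neighbour predictor and tries a few values of k. The data, the two techniques and the error measure are my own hypothetical choices for illustration. In a real project the setting would be tuned on a validation set kept separate from the data used for the final evaluation.

```python
from statistics import mean

# Hypothetical (vehicle age in years, annual repair hours) pairs; made-up values.
train = [(1, 10.0), (2, 12.0), (3, 15.0), (5, 22.0), (6, 24.0), (8, 31.0)]
holdout = [(4, 18.0), (7, 27.0)]


def baseline_predict(train_rows, x):
    """Technique 1: ignore x and always predict the training mean."""
    return mean(y for _, y in train_rows)


def knn_predict(train_rows, x, k):
    """Technique 2: average the targets of the k nearest training cases."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    return mean(y for _, y in nearest)


def mean_abs_error(predict, rows):
    """Average absolute difference between predictions and actual values."""
    return mean(abs(predict(x) - y) for x, y in rows)


baseline_error = mean_abs_error(lambda x: baseline_predict(train, x), holdout)

# Calibrate the model setting k by comparing hold-out errors.
knn_errors = {
    k: mean_abs_error(lambda x, k=k: knn_predict(train, x, k), holdout)
    for k in (1, 2, 3)
}
best_k = min(knn_errors, key=knn_errors.get)
```

The point is the workflow rather than the particular techniques: fit more than one candidate, measure them the same way, and adjust the settings before picking one.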
5. Evaluation phase
Evaluate the model or models delivered in the modelling phase for quality and effectiveness before deploying them for use in the field.
Determine whether the model in fact achieves the objectives set for it in the first phase.
Establish whether some important facet of the business or research problem has not been accounted for sufficiently.
Come to a decision regarding use of the data mining results.
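A simple way to picture the evaluation step is to score a model on held-out cases and compare the result with the target set in the business understanding phase. The sketch below does this for a failure prediction model. The labels, the predictions and the 0.7 recall target are hypothetical values chosen only for illustration.

```python
# Hypothetical hold-out results: 1 = vehicle failed in the period, 0 = it did not.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 1, 0]

# Hypothetical objective from phase 1: catch at least 70% of real failures.
TARGET_RECALL = 0.7


def precision_recall(actual_labels, predicted_labels):
    """Compute precision and recall for the positive class (label 1)."""
    pairs = list(zip(actual_labels, predicted_labels))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


precision, recall = precision_recall(actual, predicted)
meets_objective = recall >= TARGET_RECALL
```

If `meets_objective` were false, the adaptive nature of the process would send the project back to an earlier phase rather than on to deployment.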
6. Deployment phase
Make use of the models created: Model creation does not signify the completion of a project.
Example of a simple deployment: Generate a report.
Example of a more complex deployment: Implement a parallel data mining process in another department.
For businesses, the customer often carries out the deployment based on your model.
In the next post, I will explain how I went about the project and what the analysis looked like.
References:
1. Chapman, Pete, et al. CRISP-DM 1.0: Step-by-Step Data Mining Guide. SPSS, 1999, 2000.
2. Larose, Daniel T. Discovering Knowledge in Data: An Introduction to Data Mining. Hoboken, New Jersey, US: John Wiley & Sons, Inc., 2005.