
What Tasks Can Be Performed with Data Mining? | Prediction, Affinity Grouping or Association Rules, Clustering and Profiling

August 5th, 2010

Prediction

Prediction is the same as classification or estimation, except that the records are classified according to some predicted future behavior or estimated future value. In a prediction task, the only way to check the accuracy of the classification is to wait and see. The primary reason for treating prediction as a separate task from classification and estimation is that in predictive modeling there are additional issues regarding the temporal relationship of the input variables or predictors to the target variable.

Any of the techniques used for classification and estimation can be adapted for use in prediction by using training examples where the value of the variable to be predicted is already known, along with historical data for those examples. The historical data is used to build a model that explains the current observed behavior. When this model is applied to current inputs, the result is a prediction of future behavior.
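As a rough illustration of the temporal relationship described above, the sketch below builds training examples from history in which the outcome is already known, fits a deliberately simple rule, and applies it to current inputs. Everything here is assumed for illustration: the usage figures, the function names, the two-month horizon, and the "usage trend" rule are hypothetical and stand in for whatever technique is actually chosen.

```python
# Hypothetical monthly usage per customer, oldest month first (made-up numbers).
history = {
    "a": [30, 28, 25, 5, 0, 0],
    "b": [40, 42, 41, 39, 40, 38],
    "c": [20, 10, 4, 2, 0, 0],
    "d": [15, 16, 18, 17, 16, 19],
}


def make_training_example(monthly_usage, horizon):
    """Split one history into earlier inputs and a later, already known target."""
    inputs = monthly_usage[:-horizon]      # what was observable at the time
    future = monthly_usage[-horizon:]      # what happened afterwards
    left = all(u == 0 for u in future)     # assumption: zero usage means the customer left
    return inputs, left


def trend(inputs):
    """Change in usage between the first and last input month."""
    return inputs[-1] - inputs[0]


def fit_threshold(examples):
    """Place a cut-off halfway between the average trend of leavers and stayers."""
    leavers = [trend(x) for x, left in examples if left]
    stayers = [trend(x) for x, left in examples if not left]
    return (sum(leavers) / len(leavers) + sum(stayers) / len(stayers)) / 2


def predict_leaving(current_inputs, threshold):
    """Apply the fitted rule to current inputs to predict future behavior."""
    return trend(current_inputs) < threshold


examples = [make_training_example(h, horizon=2) for h in history.values()]
threshold = fit_threshold(examples)

print(predict_leaving([22, 20, 9, 3], threshold))    # sharply falling usage
print(predict_leaving([10, 11, 12, 12], threshold))  # steady usage
```

The point of the sketch is the split inside make_training_example: the inputs always come from a period before the one that defines the target, which is the same gap that separates current inputs from the future behavior being predicted.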

Examples of prediction tasks addressed by the data mining techniques discussed in this post include:

  • Predicting the size of the balance that will be transferred if a credit card prospect accepts a balance transfer offer
  • Predicting which customers will leave within the next 6 months
  • Predicting which telephone subscribers will order a value-added service such as three-way calling or voice mail

Most of the data mining techniques discussed in this post are suitable for use in prediction so long as training data is available in the proper form. The choice of technique depends on the nature of the input data, the type of value to be predicted, and the importance attached to explicability of the prediction.

Affinity Grouping or Association Rules

The task of affinity grouping is to determine which things go together. The prototypical example is determining what things go together in a shopping cart at the supermarket, the task at the heart of market basket analysis. Retail chains can use affinity grouping to plan the arrangement of items on store shelves or in a catalog so that items often purchased together will be seen together.

Affinity grouping can also be used to identify cross-selling opportunities and to design attractive packages or groupings of products and services.

Affinity grouping is one simple approach to generating rules from data. If two items, say cat food and kitty litter, occur together frequently enough, we can generate two association rules:

  • People who buy cat food also buy kitty litter with probability P1.
  • People who buy kitty litter also buy cat food with probability P2.
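The two probabilities in these rules can be estimated directly from transaction data. The sketch below is one minimal way to do that, assuming each basket is a set of item names; the baskets and the function name rule_confidence are made up for illustration.

```python
def rule_confidence(baskets, antecedent, consequent):
    """Estimate the probability that a basket containing `antecedent`
    also contains `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return None  # the rule cannot be evaluated without supporting baskets
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"cat food", "kitty litter", "milk"},
    {"cat food", "kitty litter"},
    {"cat food", "bread"},
    {"kitty litter", "cat food", "eggs"},
    {"kitty litter"},
    {"milk", "bread"},
    {"cat food", "milk"},
]

p1 = rule_confidence(baskets, "cat food", "kitty litter")
p2 = rule_confidence(baskets, "kitty litter", "cat food")
print(p1, p2)  # 0.6 0.75
```

The two values differ because each rule is conditioned on a different set of baskets: five baskets contain cat food, but only four contain kitty litter.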

Clustering

Clustering is the task of segmenting a heterogeneous population into a number of more homogeneous subgroups or clusters. What distinguishes clustering from classification is that clustering does not rely on predefined classes. In classification, each record is assigned a predefined class on the basis of a model developed through training on pre-classified examples.

In clustering, there are no predefined classes and no examples. The records are grouped together on the basis of self-similarity. It is up to the user to determine what meaning, if any, to attach to the resulting clusters. Clusters of symptoms might indicate different diseases. Clusters of customer attributes might indicate different market segments.

Clustering is often done as a prelude to some other form of data mining or modeling. For example, clustering might be the first step in a market segmentation effort: Instead of trying to come up with a one-size-fits-all rule for “what kind of promotion do customers respond to best,” first divide the customer base into clusters of people with similar buying habits, and then ask what kind of promotion works best for each cluster.
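To make the idea of grouping by self-similarity concrete, here is a small sketch using k-means on a single numeric attribute. K-means is only one of several clustering algorithms and is not named in the text above; the spending figures, the starting centers, and the function name kmeans_1d are all assumptions chosen for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center
    to the mean of its group, and repeat."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer; no class labels are supplied.
monthly_spend = [12, 15, 14, 80, 85, 90, 13, 88]
centers, clusters = kmeans_1d(monthly_spend, centers=[0, 100])
print(centers)   # [13.5, 85.75]
print(clusters)  # [[12, 15, 14, 13], [80, 85, 90, 88]]
```

The algorithm only reports that there are two groups of spenders. Deciding whether those groups mean anything, and what promotion suits each, is still left to the user.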

Profiling

Sometimes the purpose of data mining is simply to describe what is going on in a complicated database in a way that increases our understanding of the people, products, or processes that produced the data in the first place. A good enough description of a behavior will often suggest an explanation for it as well.

At the very least, a good description suggests where to start looking for an explanation. The famous gender gap in American politics is an example of how a simple description, “women support Democrats in greater numbers than do men,” can provoke large amounts of interest and further study on the part of journalists, sociologists, economists, and political scientists, not to mention candidates for public office.

What Tasks Can Be Performed with Data Mining? | Classification and Estimation

August 5th, 2010

Classification

Classification, one of the most common data mining tasks, seems to be a human imperative. In order to understand and communicate about the world, we are constantly classifying, categorizing, and grading. We divide living things into phyla, species, and genera; matter into elements; dogs into breeds; people into races; steaks and maple syrup into USDA grades.

Classification consists of examining the features of a newly presented object and assigning it to one of a predefined set of classes. The objects to be classified are generally represented by records in a database table or a file, and the act of classification consists of adding a new column with a class code of some kind.

The classification task is characterized by well-defined classes and a training set consisting of pre-classified examples. The task is to build a model of some kind that can be applied to unclassified data in order to classify it.

Examples of classification tasks that have been addressed using the techniques described in this post include:

  • Classifying credit applicants as low, medium, or high risk
  • Choosing content to be displayed on a Web page
  • Determining which phone numbers correspond to fax machines
  • Spotting fraudulent insurance claims
  • Assigning industry codes and job designations on the basis of free-text job descriptions

    Estimation

    Classification deals with discrete outcomes: yes or no; measles, rubella, or chicken pox. Estimation deals with continuously valued outcomes. Given some input data, estimation comes up with a value for some unknown continuous variable such as income, height, or credit card balance.

    In practice, estimation is often used to perform a classification task. A credit card company wishing to sell advertising space in its billing envelopes to a ski boot manufacturer might build a classification model that puts all of its cardholders into one of two classes, skier or non-skier. Another approach is to build a model that assigns each cardholder a “propensity to ski score.” This might be a value from 0 to 1 indicating the estimated probability that the cardholder is a skier. The classification task now comes down to establishing a threshold score. Anyone with a score greater than or equal to the threshold is classed as a skier, and anyone with a lower score is considered not to be a skier.

    The estimation approach has the great advantage that the individual records can be rank ordered according to the estimate. To see the importance of this, imagine that the ski boot company has budgeted for a mailing of 500,000 pieces. If the classification approach is used and 1.5 million skiers are identified, then it might simply place the ad in the bills of 500,000 people selected at random from that pool. If, on the other hand, each cardholder has a propensity to ski score, it can send the ad to the 500,000 most likely candidates.
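The difference between the two approaches can be sketched in a few lines. The names, scores, threshold, and budget below are invented for illustration, and the scores are assumed to come from some estimation model that is not shown.

```python
def classify_by_threshold(scores, threshold):
    """Turn each score into a yes/no class: skier if score >= threshold."""
    return {name: score >= threshold for name, score in scores.items()}


def top_n(scores, n):
    """Rank cardholders by score and keep the n most likely candidates."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


# Hypothetical propensity-to-ski scores between 0 and 1.
scores = {"Ann": 0.91, "Bob": 0.35, "Cy": 0.78, "Dee": 0.52, "Eli": 0.66}

skiers = classify_by_threshold(scores, threshold=0.5)
mailing = top_n(scores, n=2)

print(skiers)   # {'Ann': True, 'Bob': False, 'Cy': True, 'Dee': True, 'Eli': True}
print(mailing)  # ['Ann', 'Cy']
```

With classes alone, the four cardholders marked as skiers are indistinguishable, so a budget for two mailings would have to be spent at random among them. With scores, the budget goes to the two highest-ranked candidates.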

    Examples of estimation tasks include:

    • Estimating the number of children in a family
    • Estimating a family’s total household income
    • Estimating the lifetime value of a customer
    • Estimating the probability that someone will respond to a balance transfer solicitation

      What Is Data Mining?

      August 5th, 2010

      Data mining, as we use the term, is the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules. For the purposes of this post, we assume that the goal of data mining is to allow a corporation to improve its marketing, sales, and customer support operations through a better understanding of its customers. Keep in mind, however, that the data mining techniques and tools described here are equally applicable in fields ranging from law enforcement to radio astronomy, medicine, and industrial process control.

      In fact, hardly any of the data mining algorithms were first invented with commercial applications in mind. The commercial data miner employs a grab bag of techniques borrowed from statistics, computer science, and machine learning research. The choice of a particular combination of techniques to apply in a particular situation depends on the nature of the data mining task, the nature of the available data, and the skills and preferences of the data miner.

      Data mining comes in two flavors—directed and undirected. Directed data mining attempts to explain or categorize some particular target field such as income or response. Undirected data mining attempts to find patterns or similarities among groups of records without the use of a particular target field or collection of predefined classes.

      Data mining is largely concerned with building models. A model is simply an algorithm or set of rules that connects a collection of inputs (often in the form of fields in a corporate database) to a particular target or outcome. Regression, neural networks, decision trees, and most of the other data mining techniques discussed in this post are techniques for creating models. Under the right circumstances, a model can result in insight by providing an explanation of how outcomes of particular interest, such as placing an order or failing to pay a bill, are related to and predicted by the available facts. Models are also used to produce scores. A score is a way of expressing the findings of a model in a single number. Scores can be used to sort a list of customers from most to least loyal or most to least likely to respond or most to least likely to default on a loan.

      Technology | Playing online games

      August 3rd, 2010

      Today’s technology lets us do things in easier and more convenient ways. Everyday tasks such as chores, shopping, and hobbies have become more accessible and easier to deal with. The internet, itself a part of that technology, has caught the interest and attention of many people. It is a powerful medium that connects everything. For example, if you need to do some shopping, you can find online stores on the internet. And whenever you feel you need a hobby, you can just surf the internet and find one of your choice. Many people have already found theirs, such as playing online games.

      Playing online slots in online casino games is one of the most popular hobbies nowadays. Aside from the fun and the hours of enjoyment the games provide, there are also online casinos for real cash that give away big payouts each time a player wins. The good thing about this is that it has become a form of enjoyment, and lots of people are comfortable playing.

      Form of enjoyment

      July 1st, 2010

      Yesterday I visited my friend at his house to bond with him a little, since we have hardly seen each other. When I arrived, he was busy playing mac games. Given his age, it is funny to think that he is still addicted to playing such games. According to him, ever since he was young he has played mac kids games with his siblings and cousins. Now that mac games offer paris en ligne (online betting), he enjoys them even more, and they bring him more fun and excitement. Now I know why he would rather stay at home than go out and have fun with friends.

      Nearby optical clinic

      June 29th, 2010

      Yesterday I went to one of the optical clinics near our office. I discovered from my own experience that some opticians don’t want to give you your PD (pupillary distance) data. Maybe the reason is that they want their clients to have no choice but to buy their products, with no chance to purchase eyeglasses from other stores. I also heard that some opticians steer you away from ZenniOpt, where Zenni, the #1 online eyeglasses store, offers fashionable, high-quality, yet inexpensive eyeglasses.

      A puppy from my vacation

      May 21st, 2010

      Two weeks ago I got the chance to take a week of vacation leave from work, and I decided to make the most of it. A friend of mine had invited my sister and me to visit and stay at her house, so I took the opportunity. My friend lives out of the country, so I asked my sister to accompany me. We stayed at my friend’s house for the whole vacation and had a lot of fun. We enjoyed each day of our mini vacation. We went to different places and enjoyed eating different and new kinds of food. We explored the city and got the chance to take pictures at its historical sites.

      While touring and shopping in the city proper, I saw a mini pet shop, a really small one with only a few pets. I saw a puppy of a kind I had liked for a long time, so I grabbed the chance and bought it. Once I had bought the dog, the problem was how to bring my puppy back home. Luckily, my friend works for a cargo liner company, and she helped me send my puppy to my house. All in all, the week’s vacation was awesome, and I will surely take a vacation with my friend again soon.

      Applying for loans

      April 28th, 2010

      Seeking assistance from financing companies is getting more common these days, surely because applications are now processed more easily and much faster. Many financing companies already provide an online financing site where applications, processing, and all other transactions take place over the internet. No one needs to apply in person. Many of the financing sites do not require many documents, and sometimes there is no need to provide collateral.

      For these reasons, many of us are taking out or have taken out loans such as personal loans, payday loans, or even small business loans. I took out a payday loan once myself. I badly needed emergency cash at the time, so I simply applied for a payday loan. The application was easy, and in less than a week I received the loan I needed. As for small business loans, a friend of mine took one out two weeks ago because she was going to open a small business near her house. According to her, she submitted an application over the internet and waited three days for approval. Once she was approved, she waited a little under ten days for the funds to be released. Yesterday she opened her new small business.

      Business plans

      April 26th, 2010

      Yesterday at work my officemate told me she wants to open a small business this year. Since her family is getting bigger and her kids will be starting school soon, she needs to do something to add to her sources of income. According to her, she will start her small business this year with the help of a business loan offered online, which she applied for online. There are already many sites offering different loans online, such as personal loans and small business loans. According to my officemate, loan processing has been made easy these days; because it is available online, applying and processing are a lot easier. One good thing about this is that not many requirements are needed, unlike other financing companies, which require a lot of documents and also ask you to provide collateral. You can apply for whatever loan you want on this site, get approved in just one to three days, and have the funding released in just seven to ten days. The service is easy and fast, and you don’t have to apply in person.

      Eyeglasses that promise to give..

      April 22nd, 2010

      Since I bought the lowest price progressive glasses at the #1 online Rx glasses store, a lot of my officemates have become interested. Most of them want to purchase eyeglasses from this site. They got even more interested when they read Eric’s review of Zenni Optical, which argues that this online eyeglasses store produces high-quality, fashionable, and trendy eyeglasses that are really cheap. As they put it, why buy something expensive when you can buy the same thing cheaply? These eyeglasses are available online and promise a very high-quality product with great durability, safety, and comfort at truly reasonable and affordable prices.