Lecture 6 – 1
DSCI 4520/5240
DATA MINING
DATA MINING AT WORK:
Telstra Mobile Combats Churn with SAS®
As Australia’s largest mobile service provider, Telstra Mobile is reliant on
highly effective churn management.
In most industries the cost of retaining a customer, subscriber or client is
substantially less than the initial cost of obtaining that customer. Protecting
this investment is the essence of churn management. It really boils down to
understanding customers — what they want now and what they’re likely to
want in the future, according to SAS.
“With SAS Enterprise Miner we can examine
customer behaviour on historical and predictive
levels, which can then show us what ‘group’ of
customers are likely to churn and the causes,” says
Trish Berendsen, Telstra Mobile’s head of Customer
Relationship Management (CRM).
Modified from Slides of UNT DSCI 4520/5240
Lecture 6 – 2
Data Mining in the telecom
industry: RingaLing Telecom
RingaLing is losing 40,000 customers every month, and only winning a
few of those customers back. They are painfully aware that keeping an
existing customer can cost as little as one tenth of what it costs to
acquire a new one. They desperately need a cost-effective way of
decreasing the customer churn rate.
Until recently, RingaLing, a large public
telecommunications company, held the
monopoly for the entire telecommunications
market. Now privatized and without the
advantages of the monopolistic situation,
competition is coming from consortiums of
foreign denationalized companies, new
entrants, and cable companies who are offering
very tempting proposals to consumers.
Lecture 6 – 3
Data Mining in the telecom
industry: RingaLing Telecom
The CEO of RingaLing is worried by his
company’s falling share price and the
rate at which customers are leaving.
Despite substantial general price
reductions, loyal customers are leaving
by the thousands. The CEO gives the
marketing director six months to bring
the situation under control.
Subsequently, Martin Miner, a young and promising marketing analyst, is
summoned to the Marketing Director’s office and asked to solve this
problem. Working with the IT department, Martin explains that he needs a
way to access and analyze all the company data.
Lecture 6 – 4
Getting the lines crossed — the
difficulty of data access
RingaLing has over 50 million customer files in addition to billions of call
records, and data from both the customer service and the billing departments.
They also have some competitive information, including competitor pricing
policies and market share by area. The data resides in different offices across
the globe, in 12 different file formats, and on seven different platforms.
Martin decides the first step is the development of a SAS data warehouse,
enabling him to have access to all the data he needs in one place. Using SAS
Institute’s Rapid Warehousing Methodology, this takes only a matter of weeks.
Thanks to the data warehouse, the quality and consistency of the data are much
better. All the data is summarized and grouped in a way that makes it easy to
get a single view of each individual RingaLing customer. Even if a customer has
multiple accounts, for example a mobile phone as well as a fixed phone, the
database is smart enough to know that this is one customer instead of two.
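The single customer view can be illustrated with a minimal sketch. The case does not say how the warehouse links accounts, so this assumes every account record already carries a shared customer key; the record layout, the IDs and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical account records: (account_id, customer_id, service).
# The shared customer_id is an assumption made for this illustration.
accounts = [
    ("A-001", "C-17", "mobile"),
    ("A-002", "C-17", "fixed"),
    ("A-003", "C-42", "mobile"),
]

def single_customer_view(records):
    """Group account records by customer so each customer appears once."""
    view = defaultdict(list)
    for account_id, customer_id, service in records:
        view[customer_id].append((account_id, service))
    return dict(view)

customers = single_customer_view(accounts)
print(len(customers))  # 2 customers, although there are 3 accounts
```

In practice the hard part is usually deciding that two accounts belong to the same person when no shared key exists; this sketch skips that step.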
Lecture 6 – 5
The Data Mining Process
Now, Martin is ready to start mining.
Initially, he is interested in the probability that a given
customer will cancel their contract within the next year
and the controllable variables that might influence
that decision. If he knows this, he will be able to manage the
churn rate (the rate at which customers cancel their
subscriptions).
A sample of the data is taken using the sampling capabilities of SAS software. This
ensures that the 10% sample accurately represents the customer base as a whole.
Initially, Martin plots the probability that a customer will leave over the next year, and
finds it to be fairly consistent and somewhat depressing. Using geographical
visualization he is able to highlight certain areas where the customers are most likely
to leave, which appear to be around certain major cities. He then decides to explore the
data using a 3D scatter plot to see the relationship between bill size, the area customers live in,
and their likelihood of leaving. He notes that most of the customers at risk of leaving tend to
be those who have the highest and the lowest bills.
Note: variables = features
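The case does not say which sampling method Martin used. One common way to keep a 10% sample representative is stratified sampling, sketched below under that assumption: the same fraction is drawn from each group, so the sample keeps the mix of the full customer base. The customer records, the `region` field and the function name are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, fraction, seed=0):
    """Draw the same fraction from each stratum so the sample keeps the mix."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical customer base: 900 urban and 100 rural customers.
customers = [
    {"id": i, "region": "urban" if i < 900 else "rural"} for i in range(1000)
]
sample = stratified_sample(customers, key=lambda c: c["region"], fraction=0.10)
rural_in_sample = sum(1 for c in sample if c["region"] == "rural")
print(len(sample), rural_in_sample)
```

A plain random 10% draw would also work on a base this large; stratifying simply guarantees that small groups are not under-represented by chance.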
Lecture 6 – 6
The Data Mining Process
Martin decides to integrate some more data. First, he looks at his company’s
competitors and areas in which they provide services, as well as the range of services
provided (e.g. business, domestic, international). He then looks at their pricing policy
and product details.
Martin now uses several data mining techniques to model this information so he can
predict whether a customer is likely to leave or not. He uses decision trees to eliminate
variables which are not important, and surprisingly finds that income plays a much less
important role than he would have expected. Having identified several key variables,
he then uses neural networks to build a model which will predict whether a person is
likely to leave or not, given their characteristics.
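To make the variable-screening step concrete, here is a minimal sketch of information gain, one criterion a decision tree can use to choose which variable to split on. It is not Martin's actual SAS workflow. The toy records are invented for illustration and are only shaped to echo the case's findings (bill size matters, income matters little).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, feature, target="churned"):
    """Entropy reduction in the target obtained by splitting on one feature."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical toy records, invented for illustration only.
rows = [
    {"bill": "high", "income": "high", "churned": True},
    {"bill": "high", "income": "low", "churned": True},
    {"bill": "mid", "income": "high", "churned": False},
    {"bill": "mid", "income": "low", "churned": False},
    {"bill": "low", "income": "high", "churned": True},
    {"bill": "low", "income": "low", "churned": True},
    {"bill": "mid", "income": "high", "churned": False},
    {"bill": "mid", "income": "low", "churned": False},
]

# Rank the candidate variables; low-gain variables are candidates to drop.
ranking = sorted(["bill", "income"], key=lambda f: information_gain(rows, f), reverse=True)
print(ranking)
```

Variables that survive this kind of screening would then be passed to the predictive model, which in the case is a neural network.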
Following this, he identifies that the people who are likely to leave are typically either
those who have very large bills or very low ones. It appears that those who make
international calls are more likely to leave. Another point is that people who make
frequent calls to the same numbers are more prone to leaving.
At this stage he decides to review his findings and concludes that he needs to introduce
more data on the pricing policies of the competitors for different types of products.
Lecture 6 – 7
The Data Mining Process
Martin then creates a new model and suggests the following
strategies:
• Special tariff for frequent international callers,
enabling them to pay a lower cost per time unit.
• Low usage tariffs, giving a lower fixed price for the line
rental and then possibly higher costs for the actual calls.
• High user tariffs, giving a higher fixed price for line
rental and lower costs for the actual calls.
• Special prices for frequently called numbers.
Following the implementation of these strategies, there was a drastic reduction in the
number of customers leaving. After only three months, customer churn fell from 40,000
to only 10,000 a month. On top of this, they were able to target products to customers who fit
certain optimal profiles. This resulted in a gain of another 20,000 customers a month.
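The figures in the case can be checked with a short calculation. It assumes that both churn figures are monthly and that the 20,000 figure is the gross number of new customers per month; the variable names are illustrative.

```python
churn_before = 40_000   # customers lost per month before the new tariffs (from the case)
churn_after = 10_000    # customers lost per month three months later (from the case)
new_customers = 20_000  # customers gained per month via targeted offers (from the case)

# Relative drop in monthly churn, and net monthly change after the strategies.
churn_reduction = (churn_before - churn_after) / churn_before
net_change_after = new_customers - churn_after

print(f"Churn reduction: {churn_reduction:.0%}")     # 75%
print(f"Net monthly change: {net_change_after:+,}")  # +10,000
```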
Martin Miner has played a key role in this process and is rewarded with a large bonus
and pay raise. He is subsequently promoted, and it becomes obvious that the marketing
director is grooming him to become his replacement.
Lecture 6 – 8
Questions to Ponder
Q: What are the other possible “variables” that
• cannot be obtained from the company’s own data,
• yet will affect whether a customer will leave RingaLing?
(Hint: think about your own experience or your friends’)