2007 PAKDD AWARD
Introduction
Salford-Systems and Analytici were among the three winners of the 2007 PAKDD (Pacific-Asia Conference on Knowledge Discovery and Data Mining) Cup. The competition tasked each participant with developing a cross-sell model using data from a consumer finance company (details are given below). Using 40 attributes of credit card holders, each model had to identify the card holders most likely to take out a mortgage with the company. In a field of participants that included academic practitioners and consultants from across the globe, Analytici and Salford-Systems tied for second place (there were two runners-up).
Cross-Selling Problem
The real-world dataset for this year's competition was donated by a consumer finance company with the aim of possibly finding better solutions for a cross-selling business problem.
The company currently has a customer base of credit card customers as well as a customer base of home loan (mortgage) customers. Both of these products have been on the market for many years, although for some reason the overlap between these two customer bases is currently very small. The company would like to make use of this opportunity to cross-sell home loans to its credit card customers, but the small size of the overlap presents a challenge when trying to develop an effective scoring model to predict potential cross-sell take-ups.
A modeling dataset of 40,700 customers with 40 modeling variables (as of the point of application for the company's credit card), plus a target variable, was provided to the participants. This was a sample of customers who opened a new credit card with the company within a specific 2-year period and who did not have an existing home loan with the company. The target categorical variable "Target_Flag" had a value of 1 if the customer then opened a home loan with the company within 12 months after opening the credit card (700 random samples), and a value of 0 if otherwise (40,000 random samples). A prediction dataset (8,000 sampled cases) was also provided to participants with similar variables but withholding the target variable.
The data mining task was to produce a score for each customer in the prediction dataset, indicating a credit card customer's propensity to take up a home loan with the company (the higher the score, the higher the propensity).
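As a small illustration of the figures above, the following Python sketch works out the base response rate from the stated counts and shows what "the higher the score, the higher the propensity" means for ranking. The customer identifiers and score values are hypothetical, chosen only for illustration; they are not taken from the competition data.

```python
# Counts stated in the dataset description.
n_positive = 700      # Target_Flag = 1 (opened a home loan within 12 months)
n_negative = 40_000   # Target_Flag = 0
n_total = n_positive + n_negative

# Share of modeling-set customers who took up a home loan.
base_rate = n_positive / n_total
print(f"customers: {n_total}, base response rate: {base_rate:.2%}")

# Hypothetical scores for three prediction-set customers (illustrative only).
scores = {"cust_a": 0.012, "cust_b": 0.087, "cust_c": 0.031}

# Higher score means higher propensity, so rank in descending order of score.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With so few positives in the sample, the ordering of the scores matters more than their absolute values when choosing whom to target.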
Techniques
Using Salford-Systems' suite of data mining software and analytical routines developed by Don Cozine of Analytici and Salford-Systems, the team built and identified hundreds of models and segments. Among the results were segments with response rates of up to 9% (compared with a base response rate of 1.7%), regression splines that could be used as database triggers or smart selects, and model ensembles that average results across modeling techniques to improve model accuracy and stability.
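The two quantitative ideas in this section, segment lift and score averaging across models, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method actually used: the source does not say how the ensemble results were averaged, so a simple unweighted mean is assumed here, and the model names, customer identifiers and scores are hypothetical.

```python
from statistics import mean


def average_scores(model_scores):
    """Average each customer's score across models (unweighted mean, an assumption)."""
    customers = next(iter(model_scores.values())).keys()
    return {c: mean(s[c] for s in model_scores.values()) for c in customers}


def lift(segment_rate, base_rate):
    """Ratio of a segment's response rate to the overall base response rate."""
    return segment_rate / base_rate


# Hypothetical per-model scores for two customers (illustrative values only).
model_scores = {
    "tree": {"cust_a": 0.02, "cust_b": 0.10},
    "spline": {"cust_a": 0.04, "cust_b": 0.06},
}

ensemble = average_scores(model_scores)
print(ensemble)

# Best segment (9%) versus the base response rate (1.7%), as quoted in the text.
segment_lift = lift(0.09, 0.017)
print(f"lift of best segment: {segment_lift:.1f}x")
```

Averaging tends to dampen the idiosyncratic errors of any single technique, which is one common reason ensembles are described as more stable.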
Bios
Don Cozine is the Director of Statistical Analysis for Analytici. A graduate of Louisiana State University, Don held a Board of Regents Fellowship while pursuing graduate studies in Econometrics and Statistics. For the past ten years he has provided statistical and marketing analytical consulting and support for a variety of industries and channels.
His work includes banner advertising effectiveness and optimization for ONDCP (Office of National Drug Control Policy); multi-channel ROI (return on investment) offer and campaign optimization and effectiveness for blue-chip brands such as Barnes and Noble and American Express Travel; and statistical models and segmentation for brands such as AT&T, Hilton Honors and Costco.