In this article, we’ll show you how to apply the ideas from two previous tips to tackle a common business scenario. Let’s say you have information about regular customers to your store, including demographics as well as prior purchases. Using a model you’ve built with those customers, you want to find out which of your new customers are likely to purchase beer so you can give them coupons for a new malt beverage.
Here’s what the model might look like:
CREATE MINING MODEL Customers
(
CustomerKey LONG KEY,
Gender TEXT DISCRETE,
MaritalStatus TEXT DISCRETE,
Products TABLE PREDICT
(
Product TEXT KEY
)
)
USING Microsoft_Decision_Trees
The case level columns contain the demographic information and the nested table contains a list of products purchased by each customer.
Now, let’s build our query. First we’ll use the principles from Returning flattened results from DMX queries and Filter top-level rows with condition on a nested table column to restrict the predictions to beer:
SELECT FLATTENED t.[CustomerKey],
(SELECT $Probability AS BeerProbability
FROM Predict(Products, INCLUDE_STATISTICS)
WHERE Product = 'Beer' ) AS Prod
FROM [CustomerModel]
PREDICTION JOIN
Then, we add an outer SELECT and another condition to restrict the query to customers who have a 60% or higher likelihood of purchasing beer:
SELECT * FROM
( SELECT FLATTENED t.[CustomerKey],
(SELECT $Probability AS BeerProbability
FROM Predict(Products, INCLUDE_STATISTICS)
WHERE Product = 'Beer' ) AS Prod
FROM [CustomerModel]
PREDICTION JOIN
SHAPE {
OPENQUERY([CustomerDS],
'SELECT [MaritalStatus], [Gender], [CustomerKey]
FROM [dbo].[Customers]
ORDER BY [CustomerKey]')}
APPEND({
OPENQUERY([CustomerDS],
'SELECT [Product], [CustomerID]
FROM [dbo].[Products]
ORDER BY [CustomerID]')}
RELATE [CustomerKey] TO [CustomerID]) AS [Products] AS t
ON
[CustomerModel].[Gender] = t.[Gender] AND
[CustomerModel].[MaritalStatus] = t.[ MaritalStatus] AND
[CustomerModel].[Products].[Product] = t.[Products].[Product] ) as s
WHERE [Prod.BeerProbability] > 0.6
That’s it! This query gives us the list of customers who will be targeted for the promotional offer.
Note: The above query demonstrates the use of an outer SELECT statement to add other conditions. In this particular case, since the condition only uses columns from the model’s nested table, the outer SELECT can be eliminated by pushing the condition into the innermost WHERE clause:
(SELECT $Probability AS BeerProbability
FROM Predict(Products, INCLUDE_STATISTICS)
WHERE Product = 'Beer' AND $Probability > 0.6)