2010

Clustering of Products
By Usage for Retail
Using Association Rule

SUBHJEET KUMAR
09BS0002422
IBS KOLKATA
13/09/2010

ASSOCIATION RULE:

Association rule mining finds interesting associations and/or correlation relationships among
large sets of data items. Association rules show attribute-value conditions that occur frequently
together in a given dataset. A typical and widely used example of association rule mining is
Market Basket Analysis.

For example, data are collected using bar-code scanners in supermarkets. Such market basket
databases consist of a large number of transaction records. Each record lists all items bought by a
customer on a single purchase transaction. Managers would be interested to know if certain
groups of items are consistently purchased together. They could use this data for adjusting store
layouts (placing items optimally with respect to each other), for cross-selling, for promotions, for
catalog design and to identify customer segments based on buying patterns.

Association rules provide information of this type in the form of "if-then" statements. These
rules are computed from the data and, unlike the if-then rules of logic, association rules are
probabilistic in nature.

In addition to the antecedent (the "if" part) and the consequent (the "then" part), an association
rule has two numbers that express the degree of uncertainty about the rule. In association
analysis the antecedent and consequent are sets of items (called item sets) that are disjoint (do
not have any items in common).

The first number is called the support for the rule. The support is simply the number of
transactions that include all items in the antecedent and consequent parts of the rule. (The
support is sometimes expressed as a percentage of the total number of records in the database.)

The other number is known as the confidence of the rule. Confidence is the ratio of the number
of transactions that include all items in the consequent as well as the antecedent (namely, the
support) to the number of transactions that include all items in the antecedent.

For example, if a supermarket database has 100,000 point-of-sale transactions, out of which
2,000 include both items Bread and Jam and 800 of these include item Milk, the association rule
"If Bread and Jam are purchased then Milk is purchased on the same trip" has a support of 800
transactions (alternatively 0.8% = 800/100,000) and a confidence of 40% (=800/2,000). One way
to think of support is that it is the probability that a randomly selected transaction from the
database will contain all items in the antecedent and the consequent, whereas the confidence is
the conditional probability that a randomly selected transaction will include all the items in the
consequent given that the transaction includes all the items in the antecedent.
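The arithmetic in the Bread, Jam and Milk example can be reproduced with a short Python sketch. The function name and its parameters are illustrative only and are not part of the SAS analysis used in this report.

```python
def support_and_confidence(n_total, n_antecedent, n_both):
    """Return (support, confidence) as fractions.

    n_total:      all transactions in the database
    n_antecedent: transactions containing every antecedent item
    n_both:       transactions containing every antecedent and consequent item
    """
    support = n_both / n_total
    confidence = n_both / n_antecedent
    return support, confidence


# Counts taken from the example: 100,000 transactions, 2,000 with Bread and Jam,
# 800 of which also include Milk.
support, confidence = support_and_confidence(100_000, 2_000, 800)
print(f"support = {support:.1%}, confidence = {confidence:.0%}")
```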

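Lift is one more parameter of interest in association analysis. It is the ratio of Confidence
to Expected Confidence, where the expected confidence is the proportion of all transactions that
include the consequent. Lift therefore tells us how much the probability of the "then"
(consequent) part increases given the "if" (antecedent) part; a lift above 1 means that the
antecedent makes the consequent more likely.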
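As a minimal sketch, lift can be computed from four transaction counts, assuming that expected confidence means the share of all transactions that contain the consequent. The example above does not say how many transactions contain Milk, so the figure of 20,000 used here is purely hypothetical.

```python
def lift(n_total, n_antecedent, n_consequent, n_both):
    """Lift = confidence / expected confidence.

    Expected confidence is taken here as the fraction of all transactions
    that contain the consequent (an assumption stated in the lead-in).
    """
    confidence = n_both / n_antecedent
    expected_confidence = n_consequent / n_total
    return confidence / expected_confidence


# Bread/Jam/Milk counts from the example; the 20,000 Milk transactions
# are a hypothetical value chosen only for illustration.
milk_lift = lift(n_total=100_000, n_antecedent=2_000,
                 n_consequent=20_000, n_both=800)
print(round(milk_lift, 2))
```

Under that hypothetical count, buyers of Bread and Jam would be twice as likely to buy Milk as a randomly chosen customer.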

DESCRIPTION:

From the data of a retail store, association rules are to be derived to estimate the probability
that a particular item is purchased when some other item is selected. Formally, this is the
conditional probability of purchasing product A given that product B is purchased, i.e., P(A|B).
The association analysis was carried out with SAS 9.2 and SAS Enterprise Miner, using a market
basket analysis. In this report the product associations are taken as the output, and CNO is
taken as the ID. The minimum support (minsup) is taken as 1.8 and the minimum confidence
(minconf) for rule generation as 62%. With this confidence threshold, the associations among the
products become prominent enough to develop rules for consideration.
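The role of minsup and minconf can be illustrated with a small brute-force rule miner in Python. This is only a sketch of the general idea and not the procedure that SAS Enterprise Miner runs. The function name, the fractional thresholds and the five toy baskets are all hypothetical. The baskets reuse product codes from this report but are invented and are not the store's data.

```python
from itertools import combinations


def mine_rules(transactions, minsup, minconf):
    """Brute-force association rules with a single-item consequent.

    transactions:    list of sets of item codes
    minsup, minconf: thresholds as fractions between 0 and 1
    Returns a list of (antecedent, consequent, support, confidence, lift).
    """
    n = len(transactions)

    def freq(itemset):
        # Fraction of transactions that contain every item in itemset
        return sum(itemset <= t for t in transactions) / n

    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = freq(itemset)
            if support == 0 or support < minsup:
                continue  # item set is too rare to generate rules from
            for consequent in combo:
                antecedent = itemset - {consequent}
                confidence = support / freq(antecedent)
                if confidence >= minconf:
                    rule_lift = confidence / freq(frozenset([consequent]))
                    rules.append((tuple(sorted(antecedent)), consequent,
                                  support, confidence, rule_lift))
    return rules


# Invented toy baskets, for illustration only
toy = [
    {"LIN", "ESB", "MIL"},
    {"LIN", "ESB"},
    {"ESB", "MIL"},
    {"LIN", "GJO", "ESB"},
    {"MIL"},
]
for antecedent, consequent, sup, conf, lft in mine_rules(toy, 0.4, 0.62):
    print(antecedent, "->", consequent,
          f"support={sup:.2f} confidence={conf:.2f} lift={lft:.2f}")
```

Raising either threshold shortens the rule list. Enumerating every item set is practical only for a handful of products, which is why real tools prune the search.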

ANALYSIS:

Lift - In this report the lift values are small in all cases, which means that the antecedents
raise the probability of the consequent only modestly. Among all the rules, the lift is highest
for (AJA, 23M → MIL), so MIL is most strongly associated with customers who purchase AJA and
23M. Similarly, the lift for (LIN, ESB, 23M → MIL) is also high, which indicates that purchasers
of LIN, ESB and 23M are more likely to buy MIL.

The results show that only 9 rules are formed from the data (Appendix-1). All of these rules
have very low support, which indicates that the associations captured in the rules do not occur
very frequently in the database. The maximum support among all the rules is only 2.51%. This
implies that each rule applies to a very small set of customers, so acting on the rules on the
strength of so few customers is not strongly recommended.

If the cost of ambience is not high, however, the retail store can apply some of the rules
listed below:

Rules                   Confidence (%)   Support (%)   Lift

GWL → MIL               63.54            1.55          2.48
LIN, GJO → ESB          63.46            2.51          2.07
LIN, CAD → ESB          64.41            1.93          2.10
AJA, 23M → MIL          75.28            1.70          2.94
LIN, ESB, 23M → MIL     63.28            2.05          2.47

These rules can help the retailer decide which products to place near each other. In each rule
the antecedent is associated with sales of the consequent: LIN drives sales of MIL and ESB, and
ESB also drives sales of MIL. This can be seen further from the graph on the next page.
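Because lift is the ratio of confidence to expected confidence, the confidence and lift columns of the rules table also imply how often each consequent is bought overall. The sketch below back-calculates that share, assuming expected confidence is the overall purchase frequency of the consequent. The variable names are illustrative.

```python
# (rule, confidence %, lift) as listed in the rules table of this report
rules = [
    ("GWL -> MIL", 63.54, 2.48),
    ("LIN, GJO -> ESB", 63.46, 2.07),
    ("LIN, CAD -> ESB", 64.41, 2.10),
    ("AJA, 23M -> MIL", 75.28, 2.94),
    ("LIN, ESB, 23M -> MIL", 63.28, 2.47),
]

# expected confidence = confidence / lift (assumed to equal the overall
# share of transactions containing the consequent)
implied = [(rule, confidence / lift) for rule, confidence, lift in rules]

for rule, expected_confidence in implied:
    print(f"{rule:<22} expected confidence ~ {expected_confidence:.1f}%")
```

The implied shares are roughly 25.6% for MIL and 30.7% for ESB, which agrees with ESB being purchased more often than MIL.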

From the graph given above and from Tables 1 and 2 (in APPENDIX - 1), it is seen that ESB is
purchased most frequently, followed by MIL and LIN, while GJO has the lowest sales. The
association between antecedent and consequent is thus clearly seen in the SAS output.

Hence, to increase the sales of the product that contributes least to revenue, two products can
be combined into a single bundle and sold as a combo offer, so that the stronger product drives
sales of the weaker one. If there are no administrative difficulties, the retail store can go
for a combo offer of ESB and GJO.

Similarly, the two most frequently purchased products can be kept side by side on two separate
shelves, which can improve customer satisfaction as well as drive revenue through a higher
number of purchases.

CONCLUSION

From the data captured by retailers, it can be seen that antecedent and consequent items can be
kept near each other to increase the sales of both products. Here ESB and GJO can be sold
together as a bundle rather than handled through the retail layout. For ESB, MIL and LIN,
emphasis can be placed on the retail layout. In this way product purchases can be maximized with
the help of association rules, through layout design and product bundling, which should give a
proper ROI.

APPENDIX - 1
Table - 1

Table - 2
