
Artificial Intelligence -

Applications for Managers


Assignment 3









GROUP 1:
Sl No NAME ROLL NO
1 Ramprakash R 1311113
2 Babu H.L 1311161
3 Sreenath C.P 1311203
4 Karthik Arumugham 1311299

23 AUGUST 2014

AIAM ASSIGNMENT 3 GROUP 1 23 August 2014

Contents
Q1. Iterated prisoners dilemma (IPD) game
Q2. Decision Tree from RWEKA package using J48
Q3. Genetic Algorithms applied to travelling salesman problem
Q4. DEEPNET Training problem
Q5. NET Logo problem




Q1. Iterated prisoners dilemma (IPD) game

The iterated prisoner's dilemma (IPD) game is a prisoner's dilemma (PD) game played repeatedly.
The reinforcement learner for the IPD game is built in NetLogo. The model is a multiplayer
version of the iterated prisoner's dilemma, intended to explore the strategic implications that
emerge when the world consists entirely of prisoner's-dilemma-like interactions. The world is a
2-D grid, as shown below.

Net-Logo File:


N N N
N A N
N N N

A: Agent
N: Neighbor for the agent A

A) States:
Each cell is an agent, and each agent has eight neighbors with which it can interact. In each
round of play an agent can interact with one of the neighbors. In the NetLogo implementation,
we assume that an agent interacts with four of its neighbors.





Actions:
Co-operate: always cooperate
Defect: always defect
Other strategies, such as Tit-for-Tat, unforgiving, or random, could also be modelled to
increase the complexity of the problem.

The strategy that yields the better pay-off gets replicated. To do this, each agent keeps track
of its pay-off history, say the pay-offs in the last ten iterations. Agents compare their
pay-offs over these iterations with each other and then decide whether to switch to the better
strategy. In our runs, a longer history length led to higher pay-offs. The two strategies are
shown below,



  C
C C D
  D

C: Co-operate
D: Defect
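
The switching rule described above (keep a bounded pay-off history, compare with neighbors, adopt the better strategy) can be sketched as follows. This is a minimal Python illustration, not the NetLogo code attached to the report; the names `Agent`, `record`, `recent_total` and `maybe_switch` are hypothetical, and comparing the sum of recent pay-offs is an assumption about how "better" is measured.

```python
from collections import deque

COOPERATE, DEFECT = "C", "D"


class Agent:
    """An agent that remembers only its last `history_length` pay-offs."""

    def __init__(self, strategy, history_length=10):
        self.strategy = strategy
        # deque with maxlen silently drops the oldest pay-off when full
        self.payoffs = deque(maxlen=history_length)

    def record(self, payoff):
        self.payoffs.append(payoff)

    def recent_total(self):
        return sum(self.payoffs)


def maybe_switch(agent, neighbors):
    """Return the strategy the agent should use next.

    Assumption: the agent copies the neighbor with the highest recent
    total pay-off, but only if that total beats its own.
    """
    best = max(neighbors, key=lambda n: n.recent_total())
    if best.recent_total() > agent.recent_total():
        return best.strategy
    return agent.strategy


# Usage: a co-operator that keeps being exploited next to a defector.
a = Agent(COOPERATE, history_length=3)
b = Agent(DEFECT, history_length=3)
for _ in range(4):
    a.record(0)  # co-operating against a defector pays 0
    b.record(4)  # defecting against a co-operator pays 4
print(maybe_switch(a, [b]))  # D
```

With a longer `history_length`, a single bad round weighs less in the comparison, which is one way to read the effect of the history-length slider.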
Reward (Pay-off Matrix):
                      Neighbor's Action
                      C     D
Agent's Action   C    3     0
                 D    4     1
(C = Cooperate, D = Defect)
Since the IPD is a symmetric game, the same pay-off matrix applies to both the agent and its
partner. Each pay-off depends on the decisions taken by both. For example, when both co-operate,
each receives 3; when both defect, each receives 1. If the game is iterated, say, 10 times with
a pay-off history length of 10, the total pay-off is the sum of the 10 individual pay-offs. An
agent therefore switches to whichever strategy yields the higher return.
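
As an illustration of how the matrix is read, the sketch below looks up the pay-off for one round and sums it over ten rounds. It is written in Python for illustration only; the dictionary name `PAYOFF` and the helper names are hypothetical, while the four values come from the pay-off matrix above.

```python
# Pay-off to the agent, indexed by (agent_action, neighbor_action).
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 4,
    ("D", "D"): 1,
}


def play_round(agent_action, neighbor_action):
    """Return (agent pay-off, neighbor pay-off) for one round.

    The game is symmetric, so the neighbor's pay-off is read from the
    same matrix with the roles swapped.
    """
    return (PAYOFF[(agent_action, neighbor_action)],
            PAYOFF[(neighbor_action, agent_action)])


def total_payoff(agent_actions, neighbor_actions):
    """Sum of the agent's pay-offs over a sequence of rounds."""
    return sum(PAYOFF[(a, n)] for a, n in zip(agent_actions, neighbor_actions))


print(play_round("C", "D"))                # (0, 4)
print(total_payoff("C" * 10, "C" * 10))    # 30
print(total_payoff("D" * 10, "D" * 10))    # 10
```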
Interface:


Setup: Sets up the world to begin playing the multi-person iterated prisoner's dilemma. One of
the patches is set to defect (red) at the start.
Go once: The agents walk around the world and interact, taking only one step.
Go (loop): The agents walk around the world and interact continuously.
Slider: Sets the history length, i.e. the number of most recent pay-offs considered when taking
the next decision.
Plots:
There are 1200 patches, each of which is either red (defect) or green (co-operate).
Total Payoff: The total pay-off is a good indicator of how well a strategy is doing.
Population: The numbers of co-operators and defectors.

Output Plots:

History-length: 1


Observation: The population converges to all defectors, each with a pay-off of 1, and the total
pay-off falls to 1200.

History-length: 10



Observation: The number of defectors keeps increasing, reducing the total pay-off.

History-length: 50



Observation: Co-operation is the dominant strategy. The total pay-off remains near the optimal
3600 points.
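
The two totals quoted in the observations (1200 and 3600) follow from the patch count and the pay-off matrix. A minimal Python check, assuming each of the 1200 patches contributes one pay-off per tick to the total (the constant names are hypothetical):

```python
N_PATCHES = 1200            # number of patches stated in the Plots section
PAYOFF_BOTH_COOPERATE = 3   # (C, C) entry of the pay-off matrix
PAYOFF_BOTH_DEFECT = 1      # (D, D) entry of the pay-off matrix

# Total pay-off when every patch defects, and when every patch co-operates.
all_defect_total = N_PATCHES * PAYOFF_BOTH_DEFECT
all_cooperate_total = N_PATCHES * PAYOFF_BOTH_COOPERATE

print(all_defect_total, all_cooperate_total)  # 1200 3600
```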

Q2. Decision Tree from RWEKA package using J48
USING THE J48 FUNCTION TO CREATE DECISION TREE FROM RWEKA
PACKAGE IN R AND OBTAINING TREE WITH LOWEST ERROR VALUE
Step 1: Splitting the dataset into training and test sets
Import the transfusion data into R.
The variables in the transfusion data have been renamed as follows:
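colnames(transfusion) <- c("recency", "frequency", "monetary", "time", "donation")
20% of the dataset is used as the test set, so 80% of the rows are sampled for training.
smp_size <- floor(0.8 * nrow(transfusion))  # number of training rows (80%)
train_ind <- sample(seq_len(nrow(transfusion)), size = smp_size)  # random row indices

This creates a vector of randomly sampled row indices whose length equals 80% of the number of
rows. These rows form the training set, and the remaining 20% form the test set.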

Let us define two dataframes, train and test for the training and testing data sets respectively.
Data is split using the following commands.

train <- transfusion[train_ind, ]
test <- transfusion[-train_ind, ]
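
The same index-based split can be sketched outside R. The Python version below is illustrative only: the function name `train_test_split_indices`, the seed of 123 and the row count of 748 are assumptions (748 is chosen because it yields the 150 test rows reported under Step 5).

```python
import random


def train_test_split_indices(n_rows, train_fraction=0.8, seed=123):
    """Sample row indices for training; every other row goes to the test set."""
    rng = random.Random(seed)
    n_train = int(n_rows * train_fraction)          # floor, like floor() in R
    train_idx = sorted(rng.sample(range(n_rows), n_train))
    train_set = set(train_idx)
    test_idx = [i for i in range(n_rows) if i not in train_set]
    return train_idx, test_idx


train_idx, test_idx = train_test_split_indices(748)
print(len(train_idx), len(test_idx))  # 598 150
```

Sampling without replacement guarantees that no row appears in both sets, which is what the negative index `-train_ind` achieves in the R code.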

Step 2: Constructing the decision tree
The J48 function in the RWeka package is used to construct the decision tree as follows:
tree <- J48(donation ~ recency + frequency + monetary + time, data = train,
            control = Weka_control(R = TRUE, A = TRUE, M = 5, N = 10))
evaluate_Weka_classifier(tree, numFolds = 100, cost = matrix(c(0, 14, 10, 0), ncol = 2),
                         seed = 123, class = TRUE)
submit <- predict(tree, newdata = test)

After the results are obtained the output is saved as:
write.csv(submit,file="pred1.csv",row.names=FALSE)

Step 3: Calculating the accuracy of the results in the decision tree
The results contain a final column, donation. This is compared with the original column in the
test data to obtain the accuracy.
Parameter values used: M = 5, N = 10, cost = matrix(c(0,14,10,0),ncol=2), numFolds = 100
Test set data:
Number of 0s = 112
Number of 1s = 38
Net | No. of 0s correctly predicted | No. of 1s correctly predicted | % Accuracy (0s) | % Accuracy (1s) | No. of correctly predicted values | % Accuracy (Overall)
150 | 105 | 3 | 93.75 | 7.894737 | 108 | 72
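
The percentages in this table can be reproduced from the counts. The sketch below is a Python illustration with a hypothetical helper name, `accuracy_report`; it uses 112 actual zeros and 38 actual ones, the counts listed under Step 5, which sum to the 150 test rows.

```python
def accuracy_report(actual_zeros, actual_ones, correct_zeros, correct_ones):
    """Per-class and overall accuracy, in percent, from raw counts."""
    total = actual_zeros + actual_ones
    correct = correct_zeros + correct_ones
    return {
        "acc_zeros_pct": 100 * correct_zeros / actual_zeros,
        "acc_ones_pct": 100 * correct_ones / actual_ones,
        "correct": correct,
        "acc_overall_pct": 100 * correct / total,
    }


report = accuracy_report(actual_zeros=112, actual_ones=38,
                         correct_zeros=105, correct_ones=3)
print(report["acc_zeros_pct"])           # 93.75
print(round(report["acc_ones_pct"], 6))  # 7.894737
print(report["correct"])                 # 108
print(report["acc_overall_pct"])         # 72.0
```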

Step 4: Change parameter values and recalculate accuracy
The J48 decision tree code was re-run with the parameter values changed to
M = 15, N = 20, cost = matrix(c(0,10,14,0),ncol=2), numFolds = 80
Calculating accuracy:

Net | No. of 0s correctly predicted | No. of 1s correctly predicted | % Accuracy (0s) | % Accuracy (1s) | No. of correctly predicted values | % Accuracy (Overall)
150 | 111 | 2 | 99.10714 | 5.263158 | 113 | 75.33333

Step 5: Repeat with further parameter values and recalculate accuracy
Total Values (actual) 150
No. of 0s (actual) 112
No. of 1s (actual) 38
Net | No. of 0s correctly predicted | No. of 1s correctly predicted | % Accuracy (0s) | % Accuracy (1s) | No. of correctly predicted values | % Accuracy (Overall)
1 | 105 | 3 | 94% | 8% | 108 | 72%
2 | 111 | 2 | 99% | 5% | 113 | 75%
3 | 112 | 0 | 100% | 0% | 112 | 75%
4 | 112 | 0 | 100% | 0% | 112 | 75%
5 | 110 | 8 | 98% | 21% | 118 | 79%

Step 6: Comparing with previous training results
Compared with the results from the neural network training in the previous assignment, this
method appears to provide more accurate results.
From the 2nd assignment:
Net | No. of 0s correctly predicted | No. of 1s correctly predicted | % Accuracy (0s) | % Accuracy (1s) | No. of correctly predicted values | % Accuracy (Overall)
1 | 104 | 8 | 95% | 20% | 112 | 75%
2 | 104 | 8 | 95% | 20% | 112 | 75%
3 | 104 | 8 | 95% | 20% | 112 | 75%
4 | 104 | 8 | 95% | 20% | 112 | 75%

As seen above, the highest overall accuracy obtained with the neural network was 75%, whereas
79% accuracy is obtained using the decision trees.

Overall results:

Test Output.xlsx

R Code:

Transfusion data:
transfusion.csv



Q3. Genetic Algorithms applied to travelling salesman problem
To implement a Genetic Algorithm for the travelling salesman problem, we chose Sri Lanka; the
distances between the 20 chosen cities are attached below as a .CSV file.
The reciprocal of the tour length is used as the fitness function.
group1.csv
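
The fitness definition above can be sketched as follows. This is a minimal Python illustration, assuming a closed tour that returns to its starting city; the 4-city distance matrix `D` is hypothetical and is not the group1.csv data.

```python
def tour_length(tour, dist):
    """Length of the closed tour, including the leg back to the start city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def fitness(tour, dist):
    """Reciprocal of the tour length: shorter tours score higher."""
    return 1.0 / tour_length(tour, dist)


# Hypothetical symmetric distance matrix for four cities (illustration only).
D = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

print(tour_length([0, 1, 3, 2], D))  # 10 + 25 + 30 + 15 = 80
print(fitness([0, 1, 3, 2], D))      # 0.0125
```

Using the reciprocal turns the minimisation of tour length into the maximisation problem that a GA expects.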

R File

SETTING SEED VALUE:
To make the results of the GA search reproducible, the seed was kept at 123 throughout the
experiment. The seed is an integer that initializes the state of the random number generator.
MODEL 1:
Parameter values for this model have been set as follows:
Population size = 50
Cross over probability = 0.8
Mutation probability = 0.20
Elitism = 2 # the number of best fitness individuals to survive at each generation
The tour length of the first model was calculated using the following command:
apply(GA@solution, 1, tourLength, G) # G is the distance matrix
Tour length of the first model: 1630
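
To show what the four parameters control, here is a minimal permutation GA in Python. It is a sketch under stated assumptions, not the R GA package used in this report: binary tournament selection, order crossover and swap mutation are choices made for brevity and may differ from the operators that package applies, and the 4-city matrix `D` is hypothetical.

```python
import random


def tour_length(tour, dist):
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def order_crossover(p1, p2, rng):
    """Copy a slice from p1, then fill the gaps in the order cities appear in p2."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(child[i:j + 1])
    fill = [c for c in p2 if c not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child


def swap_mutation(tour, rng):
    """Exchange two randomly chosen cities."""
    i, j = rng.sample(range(len(tour)), 2)
    tour = tour[:]
    tour[i], tour[j] = tour[j], tour[i]
    return tour


def run_ga(dist, pop_size=50, p_crossover=0.8, p_mutation=0.2,
           elitism=2, generations=200, seed=123):
    rng = random.Random(seed)
    n = len(dist)
    fit = lambda t: 1.0 / tour_length(t, dist)   # reciprocal of tour length
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        next_pop = [t[:] for t in pop[:elitism]]  # elites survive unchanged
        while len(next_pop) < pop_size:
            # Assumption: binary tournament selection of each parent.
            p1 = max(rng.sample(pop, 2), key=fit)
            p2 = max(rng.sample(pop, 2), key=fit)
            if rng.random() < p_crossover:
                child = order_crossover(p1, p2, rng)
            else:
                child = p1[:]
            if rng.random() < p_mutation:
                child = swap_mutation(child, rng)
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fit)
    return best, tour_length(best, dist)


# Hypothetical 4-city example; the shortest of its three distinct tours is 80.
D = [[0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0]]
best, length = run_ga(D, generations=50)
print(best, length)
```

Because the elites are copied forward untouched, the best tour found so far is never lost between generations, which is why a larger elitism value tends to stabilise the search at the cost of diversity.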
A summary of the GA run was printed using the following command:
summary(GA)
+-----------------------------------+
| Genetic Algorithm |
+-----------------------------------+

GA settings:
Type = permutation
Population size = 50
Number of generations = 5000
Elitism = 2
Crossover probability = 0.8
Mutation probability = 0.2

GA results:
Iterations = 627
Fitness function value = 0.0006365372
Solutions =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
[1,] 12 8 21 11 14 18 1 19 16 15 13 20 7 4 5 3 9 10 17 2 6
[2,] 2 17 10 9 3 5 4 7 20 13 15 16 19 1 18 14 11 21 8 12 6
[3,] 9 3 5 4 7 20 13 15 16 19 1 18 14 11 21 8 12 6 2 17 10

Representation of the tour is shown below:

MODEL 2
In our second model, we have set the parameter values as follows:
Population size = 100
Cross over probability = 0.90
Mutation probability = 0.05
Elitism = 5 # the number of best fitness individuals to survive at each generation
Summary of Model 2 is shown below:
+-----------------------------------+
| Genetic Algorithm |
+-----------------------------------+

GA settings:
Type = permutation
Population size = 100
Number of generations = 5000
Elitism = 5
Crossover probability = 0.9
Mutation probability = 0.05

GA results:
Iterations = 648
Fitness function value = 0.0006188119
Solutions =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
[1,] 14 11 21 8 10 4 5 3 9 17 2 12 6 15 16 13 20 7 19 1 18
[2,] 14 11 21 8 10 5 4 3 9 17 2 12 6 15 16 13 20 7 19 1 18
[3,] 18 14 11 21 8 10 4 5 3 9 17 2 12 6 15 16 13 20 7 19 1
[4,] 1 18 14 11 21 8 10 5 4 3 9 17 2 12 6 15 16 13 20 7 19
[5,] 18 14 11 21 8 10 5 4 3 9 17 2 12 6 15 16 13 20 7 19 1


Tour Length for Model 2: 1616

Representation of Tour per MODEL 2:



MODEL 3:

Parameter values for this model are as follows:
Population size = 200
Cross over probability = 0.90
Mutation probability = 0.05
Elitism = 5 # the number of best fitness individuals to survive at each generation
Summary of Model 3 is shown below:

+-----------------------------------+
| Genetic Algorithm |
+-----------------------------------+

GA settings:
Type = permutation
Population size = 200
Number of generations = 5000
Elitism = 5
Crossover probability = 0.9
Mutation probability = 0.05

GA results:
Iterations = 557
Fitness function value = 0.0006253909
Solutions =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
[1,] 13 16 2 17 10 9 3 5 4 7 19 1 18 14 11 21 8 12 6 15 20
[2,] 20 15 2 17 10 9 3 5 4 7 19 1 18 14 11 21 8 12 6 16 13
[3,] 16 2 17 10 9 3 5 4 7 19 1 18 14 11 21 8 12 6 15 20 13
[4,] 6 15 20 13 16 2 17 10 9 3 5 4 7 19 1 18 14 11 21 8 12
[5,] 20 13 16 2 17 10 9 3 5 4 7 19 1 18 14 11 21 8 12 6 15

TOUR LENGTH FOR MODEL 3: 1690

Representation of the tour is shown below




MODEL 4:

Parameter values for this model are as follows:
Population size = 200
Cross over probability = 0.95
Mutation probability = 0.1
Elitism = 5 # the number of best fitness individuals to survive at each generation
Summary of Model 4 is shown below:
+-----------------------------------+
| Genetic Algorithm |
+-----------------------------------+

GA settings:
Type = permutation
Population size = 200
Number of generations = 5000
Elitism = 5
Crossover probability = 0.95
Mutation probability = 0.1

GA results:
Iterations = 1296
Fitness function value = 0.0006365372
Solutions =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
[1,] 19 1 18 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20 13 15 16
[2,] 1 18 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20 13 15 16 19
[3,] 16 19 1 18 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20 13 15
[4,] 15 16 19 1 18 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20 13
[5,] 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20 13 15 16 19 1 18
[6,] 13 15 16 19 1 18 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20
[7,] 18 14 11 21 8 12 6 2 17 10 9 3 5 4 7 20 13 15 16 19 1

TOUR LENGTH FOR MODEL 4: 1630

Representation of the tour is as follows:




MODEL 5:

Parameter values for this model are as follows:
Population size = 300
Cross over probability = 0.99
Mutation probability = 0.3
Elitism = 7 # the number of best fitness individuals to survive at each generation
Summary of Model 5 is shown below:
+-----------------------------------+
| Genetic Algorithm |
+-----------------------------------+

GA settings:
Type = permutation
Population size = 300
Number of generations = 5000
Elitism = 7
Crossover probability = 0.99
Mutation probability = 0.3

GA results:
Iterations = 1625
Fitness function value = 0.0006365372
Solutions =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
[1,] 19 16 15 13 20 7 4 5 3 9 10 17 2 6 12 8 21 11 14 18 1
[2,] 5 3 9 10 17 2 6 12 8 21 11 14 18 1 19 16 15 13 20 7 4
[3,] 3 9 10 17 2 6 12 8 21 11 14 18 1 19 16 15 13 20 7 4 5
[4,] 14 18 1 19 16 15 13 20 7 4 5 3 9 10 17 2 6 12 8 21 11
[5,] 10 17 2 6 12 8 21 11 14 18 1 19 16 15 13 20 7 4 5 3 9
[6,] 10 17 2 6 12 8 21 11 14 18 1 19 20 13 15 16 7 4 5 3 9
[7,] 2 6 12 8 21 11 14 18 1 19 16 15 13 20 7 4 5 3 9 10 17

TOUR LENGTH FOR MODEL 5: 1571

Representation of the Tour is as follows:

MODEL 6:

Parameter values for this model are as follows:
Population size = 50
Cross over probability = 0.7
Mutation probability = 0.1
Elitism = 3 # the number of best fitness individuals to survive at each generation
Summary of Model 6 is shown below:

+-----------------------------------+
| Genetic Algorithm |
+-----------------------------------+

GA settings:
Type = permutation
Population size = 50
Number of generations = 5000
Elitism = 3
Crossover probability = 0.7
Mutation probability = 0.1

GA results:
Iterations = 611
Fitness function value = 0.0006024096
Solutions =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21
[1,] 7 19 1 18 14 11 21 8 10 17 2 12 6 15 16 20 13 9 3 5 4
[2,] 7 19 1 18 14 11 21 8 10 17 2 12 6 15 16 13 20 9 3 5 4
[3,] 5 3 9 13 20 16 15 6 12 2 17 10 8 21 11 14 18 1 19 7 4
[4,] 19 1 18 14 11 21 8 10 17 2 12 6 15 16 13 20 9 3 5 4 7

TOUR LENGTH FOR MODEL 6: 1586

Representation of the Tour is as follows:


SUMMARY OF RESULTS:

MODEL | POPULATION | CROSSOVER PROBABILITY | MUTATION PROBABILITY | ELITISM | TOUR LENGTH
MODEL 1 | 50 | 0.8 | 0.2 | 2 | 1630
MODEL 2 | 100 | 0.9 | 0.05 | 5 | 1616
MODEL 3 | 200 | 0.9 | 0.05 | 5 | 1690
MODEL 4 | 200 | 0.95 | 0.1 | 5 | 1630
MODEL 5 | 300 | 0.99 | 0.3 | 7 | 1571
MODEL 6 | 50 | 0.7 | 0.1 | 3 | 1586
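
Since the fitness is defined as the reciprocal of the tour length, each fitness value printed in the GA summaries can be inverted as a cross-check on the tabulated tour lengths. A minimal Python sketch, assuming the fitness is exactly 1 / tour length with no further scaling; the fitness values are copied from the summaries above and the variable names are hypothetical.

```python
# "Fitness function value" as printed in each model's GA summary.
fitness_values = {
    "MODEL 1": 0.0006365372,
    "MODEL 2": 0.0006188119,
    "MODEL 3": 0.0006253909,
    "MODEL 4": 0.0006365372,
    "MODEL 5": 0.0006365372,
    "MODEL 6": 0.0006024096,
}

# Tour length implied by each fitness value, rounded to the nearest integer.
implied_length = {model: round(1 / f) for model, f in fitness_values.items()}

for model, length in implied_length.items():
    print(model, length)
```

Where an implied length differs from the tabulated one, the tabulated figure may be worth re-checking against the output of the `tourLength` command.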



OBSERVATIONS & CONCLUSION:

1. Model 5 gives the shortest tour, with a tour length of 1571.
2. From Models 2 and 3, we can see that, with everything else held the same, an increase in
population size increased the tour length.
3. Further improvements in tour length were observed as the crossover probability was
increased.
4. Proper tuning of these two factors (crossover probability and population size) can
improve the performance of the Genetic Algorithm models.
5. Models 1 and 6, which are almost identical, show that a decrease in mutation probability
and an increase in the number of survivors at each generation (elitism) improve results.

Q4. DEEPNET Training problem
1. Transfusion data:
STEP 1 : Uploading data & Defining variables:

STEP 2: SPLITTING INTO TRAIN & TEST SETS (80/20):



STEP 3: Uploading data & Defining variables:



STEP 4: TRAINING & TEST ACCURACY:


STEP 5: PREDICTION USING THE TRAINED ENSEMBLE:
The test data set was applied to the trained ensemble and the confusion matrix is shown below:




OBSERVATIONS:
1. As can be seen, the deep net is good at predicting zeros but not ones.
2. One reason for this behaviour could be the highly skewed ratio of ones to zeros in the test
data set (1:9).
3. Deep nets are particularly suited to complex data sets with a very large number of variables,
and could lead to errors when used on a small data set that does not have enough observations
in both classes.
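
The effect of class skew described in observation 2 can be illustrated with a small calculation. The Python sketch below uses hypothetical counts (90 zeros and 10 ones, matching a 1:9 ratio) rather than the actual confusion matrix; it shows that a model which always predicts zero still scores high overall accuracy while never identifying a one.

```python
def class_metrics(tn, fp, fn, tp):
    """Overall accuracy and per-class recall from a 2x2 confusion matrix.

    tn: zeros predicted as zero, fp: zeros predicted as one,
    fn: ones predicted as zero,  tp: ones predicted as one.
    """
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "recall_zeros": tn / (tn + fp),
        "recall_ones": tp / (tp + fn),
    }


# Hypothetical test set: 90 zeros, 10 ones, and every row predicted as zero.
always_zero = class_metrics(tn=90, fp=0, fn=10, tp=0)
print(always_zero)
# {'accuracy': 0.9, 'recall_zeros': 1.0, 'recall_ones': 0.0}
```

This is why per-class accuracy is a more informative measure than overall accuracy on skewed data.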



2. TURKISH stock data:
Data preprocessing:
The stock data was rearranged to match the format expected by ERSATZ. The Date and ISE (TL based)
columns were removed, since we use ISE (USD based).
ERSATZ treats the last column as the dependent variable, and the data was arranged accordingly.
The final data uploaded to the system is attached below:
data_akbilgic.csv

Screenshots of ERSATZ are shown step by step:
Step 1: UPLOADING DATA


STEP 2 : SPLITTING INTO TRAINING AND TEST DATASETS
In this step, the stock data is split into training and test sets in the ratio of 4:1 (80% Train, 20% Test)


STEP 3 : CREATING ENSEMBLE
We faced a problem in this step, as the system repeatedly gave errors.

Since the ERSATZ system worked for our previous data, we presume the following to be the reasons
for this:
Non-integer and negative values in the dependent variable
Too many classes in the dependent variable

Q5. NET Logo problem
Extend the model to show what happens when a third technology enters the fray and is
selected by users based on either affinity or neighbors?

Existing technologies are Red and Blue.
Let the third technology be Green.

As stated in the problem, a user's choice depends on either affinity towards a technology or
the choice of the neighbors. The balance between the two is set by a slider that gives the
percentage split.

The new technology, Green, follows the same principles. The required code changes were made in
the attached NetLogo model file.

The original logic assumes Red = 1 and Blue = -1:
If (sum of values of neighbors > 0) then set the color of the current patch to Red
If (sum of values of neighbors < 0) then set the color of the current patch to Blue

To accommodate the new technology Green, the new logic has an extra condition:

Red = 1, Blue = 0, Green = -1
If (sum of values of neighbors > 0) then set the color of the current patch to Red
If (sum of values of neighbors < 0) then set the color of the current patch to Green
If (sum of values of neighbors = 0) then set the color of the current patch to Blue

Logic: For a given patch, in terms of the counts of its neighbors,
If Number of Red > Number of Green, then set the color to Red
If Number of Green > Number of Red, then set the color to Green
If Number of Red = Number of Green, then set the color to Blue (Blue neighbors contribute 0 to the sum)
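
The neighbor-driven part of this rule can be sketched as follows. This is a Python illustration rather than the attached NetLogo code; the names `VALUE` and `next_color` are hypothetical, and the affinity-driven part of the choice is left out.

```python
RED, BLUE, GREEN = "red", "blue", "green"

# Value assigned to each technology, as stated above.
VALUE = {RED: 1, BLUE: 0, GREEN: -1}


def next_color(neighbor_colors):
    """Color a patch adopts from the sum of its neighbors' values."""
    total = sum(VALUE[c] for c in neighbor_colors)
    if total > 0:
        return RED      # more Red than Green neighbors
    if total < 0:
        return GREEN    # more Green than Red neighbors
    return BLUE         # Red and Green neighbors balance out


print(next_color([RED, RED, BLUE, GREEN]))    # red
print(next_color([GREEN, GREEN, BLUE, RED]))  # green
print(next_color([RED, GREEN, BLUE, BLUE]))   # blue
```

Because Blue is valued at 0, the sign of the sum depends only on whether Red or Green neighbors are more numerous.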

Hence, in the new model there is a three-way competition in which each technology competes
with equal probability, based on the Initial-Adoption and Neighbor-Affinity settings.

Observation: As an example, consider three mobile operating systems: Android, Windows and iOS.
For a given Neighbor Affinity, the market share of each technology stabilizes after a few
ticks, with no major changes thereafter. We attribute this to the market competitiveness of
all three organizations.

However, if we change the Neighbor Affinity during the run, the market shares change
substantially and the lines can cross, reversing the market dominance.


What happens when there is negative publicity for a technology? How can this be modeled
in a neighborhood?

When there is negative feedback, the technology most used by the neighbors has a negative
impact on the given patch.

To model the three technologies with negative feedback, the logic becomes:
If (sum of values of neighbors > 0) then set the color of the current patch to Green
If (sum of values of neighbors < 0) then set the color of the current patch to Red
If (sum of values of neighbors = 0) then set the color of the current patch randomly to
any of the three colors.
Note: If Number of Red = Number of Green, we cannot tell which technology is superior. Hence
we assign a random color to this patch.

Logic: For a given patch, in terms of the counts of its neighbors,
If Number of Red > Number of Green, then set the color to Green
If Number of Green > Number of Red, then set the color to Red
If Number of Red = Number of Green, then set the color randomly
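
The negative-publicity rule inverts the neighbor-driven choice and breaks ties at random. A minimal Python sketch, again illustrative rather than the attached NetLogo code; `next_color_negative` and the optional `rng` argument are hypothetical names.

```python
import random

RED, BLUE, GREEN = "red", "blue", "green"
VALUE = {RED: 1, BLUE: 0, GREEN: -1}


def next_color_negative(neighbor_colors, rng=random):
    """Color a patch adopts when the locally dominant technology repels it."""
    total = sum(VALUE[c] for c in neighbor_colors)
    if total > 0:
        return GREEN    # Red dominates nearby, so the patch moves away from Red
    if total < 0:
        return RED      # Green dominates nearby, so the patch moves away from Green
    # Tie: no technology is clearly superior, so pick any of the three.
    return rng.choice([RED, BLUE, GREEN])


print(next_color_negative([RED, RED, GREEN]))      # green
print(next_color_negative([GREEN, GREEN, BLUE]))   # red
print(next_color_negative([RED, GREEN], random.Random(123)))  # one of the three colors
```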

Hence, in the new model the three technologies compete, each with equal probability of losing
presence, based on the Initial-Adoption and Neighbor-Affinity settings.

Observation: As an example, consider three mobile operating systems: Android, Windows and iOS.

For a given Neighbor Affinity, the market share of each technology stabilizes after a few
ticks, with no major changes thereafter. We attribute this to the market competitiveness of
all three organizations.

However, if we change the Neighbor Affinity during the run, the market changes substantially:
the rich become poor and the poor become richer. Hence the weaker technology has a high chance
of increasing its market share.

Net-logo files: