Student: LEONARDO MAXIMINO BERNARDO 33209961

Dear students,
As you have noticed, the last units have no self-graded activities. A more practical
activity is proposed instead: since you already have the Weka software installed,
it is important that you reproduce the experiments covered in the units.

1 - Analyze the characteristics of the generated clusters and relate them to the rules
generated by Apriori. Describe this in a report that includes the generated rules and
clusters. (weight 3)

2 - In the second experiment, use the "IrisDataSet" database in the file "iris.csv",
which is well known for clustering experiments. Run the experiment with K-means in
Weka and determine the best number of clusters for the generated model, using the
RMS error plotted on a graph, as was done in unit 6 with database "A". (weight 3)

Download the CSV here: "iris.csv"

Enjoy the activity!

Answers

1)

REPORT

The following figure presents Python code that uses the apyori, numpy, and pandas packages to
obtain rules from the Iris dataset.
The output of the Apriori algorithm applied to the Iris dataset is organized into the
following fields:

• RelationRecord: Each RelationRecord groups the association rules found for one itemset.
• items: The items involved in the association rule.
• support: Support indicates how often the itemset appears in the dataset. For example, a
support value of 0.33 means that the itemset appears in 33% of the transactions.
• ordered_statistics: A list of OrderedStatistic objects, each representing one specific
rule.
• OrderedStatistic: Contains the information about one specific association rule.
o items_base: The antecedent (left-hand side) of the rule.
o items_add: The consequent (right-hand side) of the rule.
o confidence: Confidence measures how reliable the rule is. It is the probability that the
consequent holds given that the antecedent holds. For example, a confidence of 0.33 means that
the consequent occurs in 33% of the transactions in which the antecedent is present.
o lift: Lift measures how much more likely the consequent is when the antecedent holds,
compared with the consequent's overall frequency. Lift values greater than 1 indicate that the
antecedent increases the probability of the consequent, values less than 1 indicate that it
decreases that probability, and a value of 1 indicates independence.
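
To make the three measures above concrete, the following is a minimal sketch that computes support, confidence, and lift by hand in plain Python. It does not use apyori, and the transactions are hypothetical toy data (not the iris rules from this report); the function names `support` and `rule_metrics` are illustrative only.

```python
# Minimal sketch: support, confidence and lift computed by hand.
# The transactions are hypothetical toy data, not the rules from the report.
transactions = [
    {"petal=short", "Setosa"},
    {"petal=short", "Setosa"},
    {"petal=long", "Versicolor"},
    {"petal=long", "Virginica"},
    {"petal=long", "Virginica"},
    {"petal=short", "Versicolor"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(items_base, items_add, transactions):
    """Return (support, confidence, lift) for the rule items_base -> items_add."""
    s_rule = support(set(items_base) | set(items_add), transactions)
    s_base = support(items_base, transactions)
    s_add = support(items_add, transactions)
    confidence = s_rule / s_base   # P(consequent | antecedent)
    lift = confidence / s_add      # confidence relative to the baseline frequency
    return s_rule, confidence, lift


sup, conf, lift = rule_metrics({"petal=short"}, {"Setosa"}, transactions)
print(f"support={sup:.3f} confidence={conf:.3f} lift={lift:.3f}")
```

In this toy example the lift is above 1, so the antecedent makes the consequent more likely than its baseline frequency would suggest.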

The output of the Apriori code implemented in Python indicates that each class ('Setosa',
'Versicolor', 'Virginica') of iris flowers appears with equal frequency in the dataset (33.33%).

Comparison of the clustering results with the rules obtained with Apriori

The analysis of the generated clusters reveals distinct patterns in how the attribute values are
distributed in the Iris dataset. Using different numbers of clusters changes how the groups are
separated.

Clustering with 2 clusters:

• Cluster 0 (centroid closest to Versicolor):
o Comprises 67% of the data (100 instances).
o Higher means of sepal.length, petal.length, and petal.width than Cluster 1, but a lower mean
sepal.width (2.872 versus 3.428).
o The centroid's modal variety is Versicolor. With 100 instances, the cluster must span more
than one 50-instance class, which is consistent with Versicolor and Virginica being merged.
• Cluster 1 (centroid closest to Setosa):
o Comprises 33% of the data (50 instances).
o Lower means of sepal.length, petal.length, and petal.width than Cluster 0, but a higher mean
sepal.width.
o The centroid's modal variety is Setosa.
With 2 clusters, Setosa is clearly separated from the other two classes, while Versicolor and
Virginica share a single cluster. The uniform class distribution reported by the Apriori rules
does not reflect this distinction as clearly.

Clustering with 3 clusters:

• Cluster 0 (centroid closest to Versicolor):
o Comprises 33% of the data.
o Intermediate means of sepal.length, petal.length, and petal.width, and the lowest mean
sepal.width (2.77).
o The centroid's modal variety is Versicolor.
• Cluster 1 (centroid closest to Setosa):
o Comprises 33% of the data.
o Lowest means of sepal.length, petal.length, and petal.width, and the highest mean
sepal.width (3.428).
o The centroid's modal variety is Setosa.
• Cluster 2 (centroid closest to Virginica):
o Comprises 33% of the data.
o Highest means of sepal.length, petal.length, and petal.width, with an intermediate mean
sepal.width (2.974).
o The centroid's modal variety is Virginica.

With 3 clusters, the three classes of the Iris dataset are separated more sharply, which aligns
more closely with the Apriori rules, which suggest a uniform distribution among the classes. Note
that the Weka runs list variety among the five clustering attributes, which likely favors this
alignment between clusters and classes.

Clustering with 10 clusters:

• The 10 clusters are finer subdivisions of the data, each covering a different proportion of
the instances.
• Interpreting each cluster directly can be difficult, because the clusters do not correspond
directly to the original classes.
• Some clusters contain instances predominantly from one specific class, while others contain
a mixture of classes.

Finally, using 10 clusters yields a finer subdivision of the data with no direct
correspondence to the original classes. This approach can help identify subtler patterns in the
data, but it can complicate direct interpretation in terms of the iris classes.

In summary, while the Apriori rules treat all classes of the Iris dataset equally, clustering
reveals more distinct patterns and a clear separation between the classes, especially when
fewer clusters are used. The choice of the number of clusters depends on the goal of the
analysis and on how much class separation is needed.
2)

Number of clusters: 2

=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 2 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 7

Within cluster sum of squared errors: 62.127790750538175


Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1

(150.0) (100.0) (50.0)

===============================================

sepal.length 5.8433 6.262 5.006

sepal.width 3.0573 2.872 3.428

petal.length 3.758 4.906 1.462

petal.width 1.1993 1.676 0.246

variety Setosa Versicolor Setosa

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 100 ( 67%)

1 50 ( 33%)
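
When many runs like the one above have to be compared, the number of clusters and the within-cluster sum of squared errors can be pulled out of the saved text automatically. The following is a minimal sketch, assuming the output has been saved as plain text in the layout shown above; `parse_run` is a hypothetical helper, not part of Weka.

```python
import re

# Excerpt copied from the 2-cluster run above; a real script would read
# the saved Weka output from a file instead.
weka_output = """\
Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 2 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10
Instances: 150
Number of iterations: 7
Within cluster sum of squared errors: 62.127790750538175
"""


def parse_run(text):
    """Extract k (the -N option), the instance count and the SSE from one run."""
    k = int(re.search(r"-N\s+(\d+)", text).group(1))
    instances = int(re.search(r"Instances:\s+(\d+)", text).group(1))
    sse = float(
        re.search(
            r"Within cluster sum of squared errors:\s+([0-9.eE+-]+)", text
        ).group(1)
    )
    return {"k": k, "instances": instances, "sse": sse}


run = parse_run(weka_output)
print(run)
```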

Number of clusters: 3
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 3 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 3

Within cluster sum of squared errors: 7.801559361268048

Initial starting points (random):


Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2

(150.0) (50.0) (50.0) (50.0)

==========================================================

sepal.length 5.8433 5.936 5.006 6.588

sepal.width 3.0573 2.77 3.428 2.974

petal.length 3.758 4.26 1.462 5.552

petal.width 1.1993 1.326 0.246 2.026

variety Setosa Versicolor Setosa Virginica

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 50 ( 33%)

1 50 ( 33%)

2 50 ( 33%)
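
The centroids reported in the 3-cluster run above can be used to see which cluster a given flower would fall into. The following is a rough sketch, assuming plain Euclidean distance on the four raw numeric attributes. This is a simplification: Weka's distance function may normalize the attributes, and these runs also include the nominal variety attribute, so Weka's own assignments could differ. `nearest_cluster` is a hypothetical helper.

```python
import math

# Numeric centroids copied from the 3-cluster run above, in the order
# (sepal.length, sepal.width, petal.length, petal.width), with the modal variety.
centroids = {
    0: ((5.936, 2.77, 4.26, 1.326), "Versicolor"),
    1: ((5.006, 3.428, 1.462, 0.246), "Setosa"),
    2: ((6.588, 2.974, 5.552, 2.026), "Virginica"),
}


def nearest_cluster(sample):
    """Return the id of the centroid closest to `sample`.

    Assumption: unnormalized Euclidean distance on the numeric attributes only.
    """
    return min(centroids, key=lambda c: math.dist(sample, centroids[c][0]))


# The three initial starting points listed in the same run.
for sample in [(6.1, 2.9, 4.7, 1.4), (6.2, 2.9, 4.3, 1.3), (6.9, 3.1, 5.1, 2.3)]:
    cluster = nearest_cluster(sample)
    print(sample, "-> cluster", cluster, centroids[cluster][1])
```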

Number of clusters: 4
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 4 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 4

Within cluster sum of squared errors: 6.597925743648829

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor
Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3

(150.0) (24.0) (26.0) (50.0) (50.0)

=====================================================================

sepal.length 5.8433 6.3292 5.5731 6.588 5.006

sepal.width 3.0573 2.9792 2.5769 2.974 3.428

petal.length 3.758 4.6 3.9462 5.552 1.462

petal.width 1.1993 1.4625 1.2 2.026 0.246

variety Setosa Versicolor Versicolor Virginica Setosa

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 24 ( 16%)

1 26 ( 17%)

2 50 ( 33%)

3 50 ( 33%)

Number of clusters: 5
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 5 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 4

Within cluster sum of squared errors: 6.277659330769319

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor
Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4

(150.0) (19.0) (19.0) (50.0) (50.0) (12.0)

================================================================================

sepal.length 5.8433 5.8789 5.5526 6.588 5.006 6.6333

sepal.width 3.0573 2.9211 2.4526 2.974 3.428 3.0333

petal.length 3.758 4.4211 3.8632 5.552 1.462 4.6333

petal.width 1.1993 1.4105 1.1579 2.026 0.246 1.4583

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 19 ( 13%)

1 19 ( 13%)

2 50 ( 33%)

3 50 ( 33%)

4 12 ( 8%)

Number of clusters: 6
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 6 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 6

Within cluster sum of squared errors: 6.1159421000391125

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor
Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5

(150.0) (9.0) (17.0) (50.0) (50.0) (11.0) (13.0)

===========================================================================================

sepal.length 5.8433 6.1889 5.4706 6.588 5.006 6.6545 5.7615

sepal.width 3.0573 2.6667 2.4765 2.974 3.428 3.0455 2.9923

petal.length 3.758 4.5444 3.7941 5.552 1.462 4.6636 4.3308

petal.width 1.1993 1.3778 1.1294 2.026 0.246 1.4727 1.4231

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 9 ( 6%)

1 17 ( 11%)

2 50 ( 33%)

3 50 ( 33%)

4 11 ( 7%)

5 13 ( 9%)

Number of clusters: 7
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 7 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 6

Within cluster sum of squared errors: 5.217629646927634

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor
Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6

(150.0) (9.0) (17.0) (50.0) (14.0) (11.0) (13.0) (36.0)

======================================================================================================

sepal.length 5.8433 6.1889 5.4706 6.588 5.3786 6.6545 5.7615 4.8611

sepal.width 3.0573 2.6667 2.4765 2.974 3.8786 3.0455 2.9923 3.2528

petal.length 3.758 4.5444 3.7941 5.552 1.5071 4.6636 4.3308 1.4444

petal.width 1.1993 1.3778 1.1294 2.026 0.2786 1.4727 1.4231 0.2333

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 9 ( 6%)

1 17 ( 11%)

2 50 ( 33%)

3 14 ( 9%)

4 11 ( 7%)

5 13 ( 9%)

6 36 ( 24%)

Number of clusters: 8
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 8 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 6

Within cluster sum of squared errors: 4.859535068386244

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor
Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Cluster 7: 4.4,3,1.3,0.2,Setosa

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6 7

(150.0) (9.0) (17.0) (50.0) (7.0) (11.0) (13.0) (26.0) (17.0)

=================================================================================================================

sepal.length 5.8433 6.1889 5.4706 6.588 5.5286 6.6545 5.7615 5.0731 4.6882

sepal.width 3.0573 2.6667 2.4765 2.974 4.0429 3.0455 2.9923 3.5192 3.0353

petal.length 3.758 4.5444 3.7941 5.552 1.4714 4.6636 4.3308 1.5 1.4

petal.width 1.1993 1.3778 1.1294 2.026 0.2857 1.4727 1.4231 0.2692 0.1941

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa Setosa

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 9 ( 6%)

1 17 ( 11%)

2 50 ( 33%)

3 7 ( 5%)

4 11 ( 7%)

5 13 ( 9%)

6 26 ( 17%)

7 17 ( 11%)
Number of clusters: 9

=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 9 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 5

Within cluster sum of squared errors: 4.678874159874298


Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Cluster 7: 4.4,3,1.3,0.2,Setosa

Cluster 8: 5.5,2.4,3.7,1,Versicolor

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6 7 8

(150.0) (4.0) (15.0) (50.0) (7.0) (11.0) (9.0) (26.0) (17.0) (11.0)

=================================================================================================================

sepal.length 5.8433 6.2 5.74 6.588 5.5286 6.6636 5.9222 5.0731 4.6882 5.3909

sepal.width 3.0573 2.425 2.7933 2.974 4.0429 3.0091 3.0778 3.5192 3.0353 2.3727

petal.length 3.758 4.725 4.1467 5.552 1.4714 4.6273 4.5556 1.5 1.4 3.6364

petal.width 1.1993 1.475 1.2533 2.026 0.2857 1.4455 1.5333 0.2692 0.1941 1.0818

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa Setosa Versicolor

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 4 ( 3%)

1 15 ( 10%)

2 50 ( 33%)
3 7 ( 5%)

4 11 ( 7%)

5 9 ( 6%)

6 26 ( 17%)

7 17 ( 11%)

8 11 ( 7%)

Number of clusters: 10

=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 10 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans
======

Number of iterations: 5

Within cluster sum of squared errors: 4.587500225526149

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Cluster 7: 4.4,3,1.3,0.2,Setosa

Cluster 8: 5.5,2.4,3.7,1,Versicolor

Cluster 9: 4.3,3,1.1,0.1,Setosa

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6 7 8 9

(150.0) (4.0) (15.0) (50.0) (7.0) (11.0) (9.0) (24.0) (15.0) (11.0) (4.0)

=================================================================================================================

sepal.length 5.8433 6.2 5.74 6.588 5.5286 6.6636 5.9222 5.0958 4.78 5.3909 4.4

sepal.width 3.0573 2.425 2.7933 2.974 4.0429 3.0091 3.0778 3.5333 3.14 2.3727 2.8

petal.length 3.758 4.725 4.1467 5.552 1.4714 4.6273 4.5556 1.5083 1.4333 3.6364 1.275

petal.width 1.1993 1.475 1.2533 2.026 0.2857 1.4455 1.5333 0.2708 0.2 1.0818 0.2

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa Setosa Versicolor Setosa

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===


Clustered Instances

0 4 ( 3%)

1 15 ( 10%)

2 50 ( 33%)

3 7 ( 5%)

4 11 ( 7%)

5 9 ( 6%)

6 24 ( 16%)

7 15 ( 10%)

8 11 ( 7%)

9 4 ( 3%)
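
The within-cluster sums of squared errors reported in the runs for 2 to 10 clusters above can be turned into an RMS error and inspected for an elbow. The following is a minimal sketch under two assumptions that are not confirmed by the course material: RMS is taken as sqrt(SSE / N) with N = 150 instances, and the elbow is picked as the k with the largest change in slope of the SSE curve. A text bar chart stands in for the graph; a plotting library could be used instead.

```python
import math

N_INSTANCES = 150

# "Within cluster sum of squared errors" copied from the runs above.
sse_by_k = {
    2: 62.127790750538175,
    3: 7.801559361268048,
    4: 6.597925743648829,
    5: 6.277659330769319,
    6: 6.1159421000391125,
    7: 5.217629646927634,
    8: 4.859535068386244,
    9: 4.678874159874298,
    10: 4.587500225526149,
}

# Assumption: RMS error defined as sqrt(SSE / N).
rms_by_k = {k: math.sqrt(sse / N_INSTANCES) for k, sse in sse_by_k.items()}


def elbow(sse):
    """Return the k where the SSE curve bends most sharply.

    Heuristic: maximize (drop into k) minus (drop out of k).
    """
    ks = sorted(sse)
    best_k, best_bend = None, float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (sse[prev] - sse[k]) - (sse[k] - sse[nxt])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Crude text chart of RMS versus k.
for k in sorted(rms_by_k):
    bar = "#" * round(rms_by_k[k] * 50)
    print(f"k={k:2d}  RMS={rms_by_k[k]:.4f}  {bar}")

print("elbow candidate: k =", elbow(sse_by_k))
```

With these values the sharpest bend is at k = 3: the error falls steeply from 2 to 3 clusters and only slowly afterwards, which matches the three iris classes.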

Number of clusters: 50

=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 50 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety
Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 5

Within cluster sum of squared errors: 0.675478167559253

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Cluster 7: 4.4,3,1.3,0.2,Setosa

Cluster 8: 5.5,2.4,3.7,1,Versicolor

Cluster 9: 4.3,3,1.1,0.1,Setosa

Cluster 10: 6,2.7,5.1,1.6,Versicolor

Cluster 11: 5.7,2.5,5,2,Virginica

Cluster 12: 4.6,3.1,1.5,0.2,Setosa

Cluster 13: 7.4,2.8,6.1,1.9,Virginica

Cluster 14: 5.9,3,5.1,1.8,Virginica

Cluster 15: 6.9,3.2,5.7,2.3,Virginica

Cluster 16: 4.9,3.1,1.5,0.2,Setosa

Cluster 17: 6.7,3.3,5.7,2.5,Virginica

Cluster 18: 7.2,3.6,6.1,2.5,Virginica

Cluster 19: 7.3,2.9,6.3,1.8,Virginica

Cluster 20: 6.1,2.8,4.7,1.2,Versicolor

Cluster 21: 5,3.5,1.3,0.3,Setosa

Cluster 22: 6.3,3.3,4.7,1.6,Versicolor

Cluster 23: 5.9,3,4.2,1.5,Versicolor


Cluster 24: 5.7,3,4.2,1.2,Versicolor

Cluster 25: 6.7,3.3,5.7,2.1,Virginica

Cluster 26: 7.7,2.6,6.9,2.3,Virginica

Cluster 27: 5,3.2,1.2,0.2,Setosa

Cluster 28: 4.6,3.6,1,0.2,Setosa

Cluster 29: 6.1,2.6,5.6,1.4,Virginica

Cluster 30: 6.2,2.2,4.5,1.5,Versicolor

Cluster 31: 6.7,2.5,5.8,1.8,Virginica

Cluster 32: 6.3,2.5,4.9,1.5,Versicolor

Cluster 33: 7.7,2.8,6.7,2,Virginica

Cluster 34: 4.9,2.4,3.3,1,Versicolor

Cluster 35: 5.4,3.4,1.7,0.2,Setosa

Cluster 36: 6.3,2.8,5.1,1.5,Virginica

Cluster 37: 5.1,3.5,1.4,0.3,Setosa

Cluster 38: 4.8,3.4,1.6,0.2,Setosa

Cluster 39: 6.7,3.1,5.6,2.4,Virginica

Cluster 40: 4.7,3.2,1.6,0.2,Setosa

Cluster 41: 5,2,3.5,1,Versicolor

Cluster 42: 4.4,2.9,1.4,0.2,Setosa

Cluster 43: 6.9,3.1,5.4,2.1,Virginica

Cluster 44: 5.6,2.8,4.9,2,Virginica

Cluster 45: 7.9,3.8,6.4,2,Virginica

Cluster 46: 4.8,3.1,1.6,0.2,Setosa

Cluster 47: 5.5,2.5,4,1.3,Versicolor

Cluster 48: 5.2,2.7,3.9,1.4,Versicolor

Cluster 49: 6.7,3,5.2,2.3,Virginica

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15


16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49

(150.0) (2.0) (5.0) (2.0) (7.0) (5.0) (3.0) (6.0) (2.0) (5.0) (1.0) (1.0) (1.0) (2.0) (2.0) (6.0)
(2.0) (6.0) (3.0) (1.0) (2.0) (1.0) (9.0) (4.0) (3.0) (7.0) (3.0) (1.0) (2.0) (1.0) (2.0) (2.0) (3.0)
(1.0) (3.0) (3.0) (3.0) (3.0) (3.0) (3.0) (3.0) (3.0) (1.0) (1.0) (4.0) (5.0) (2.0) (1.0) (6.0) (1.0)
(2.0)

=================================================================================================================

sepal.length 5.8433 6.65 6.38 6.8 5.5286 6.8 6.0667 5.1333 4.4 5.66 4.3 6 4.9 4.5 7.35
6.2 6.85 4.8833 6.4333 7.2 7.2 6.1 5.0889 6.15 5.6333 5.6857 6.4667 7.7 5 4.6 6.05 6.25
6.4667 6.3 7.6667 5 5.4333 6.2667 5.0667 4.7333 6.5333 4.6667 5 4.5 6.825 5.74 7.8 4.8
5.6167 5.2 6.4

sepal.width 3.0573 2.8 2.9 3.05 4.0429 3.1 2.9333 3.7 2.95 2.42 3 2.7 2.5 3.15 2.85
3 3.2 3.0333 3.3333 3.6 3.1 2.8 3.4333 3.275 3 2.8714 3.3 2.6 3.25 3.6 2.4 2.25
2.5667 2.5 2.9333 2.4 3.4333 2.7667 3.6667 3.4 3.1 3.2 2 2.3 3.025 2.7 3.8 3.1
2.5667 2.7 2.8

petal.length 3.758 4.7 4.32 5.15 1.4714 4.74 4.6 1.4833 1.35 3.78 1.1 5.1 4.5 1.4 6.2
5.2333 5.8 1.4667 5.7667 6.1 5.9 4.7 1.4778 4.625 4.4 4.1143 5.4 6.9 1.3 1 5.3 4.45
5.3667 4.9 6.4667 3.2 1.5 4.9333 1.6667 1.6333 5.5667 1.4333 3.5 1.3 5.5 5.04 6.55 1.6
4.0833 3.9 5.6

petal.width 1.1993 1.45 1.32 2.3 0.2857 1.5 1.4333 0.2 0.2 1.04 0.1 1.6 1.7 0.2 1.85
1.8 2.3 0.1833 2.4667 2.5 1.7 1.2 0.2778 1.625 1.5 1.2429 2.1333 2.3 0.2 0.2 1.45 1.4
1.8667 1.5 2.1333 1.0333 0.2667 1.7 0.4667 0.2333 2.3 0.2 1 0.3 2.075 2.04 2.1 0.2
1.25 1.4 2.15

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa Setosa Versicolor Setosa Versicolor
Virginica Setosa Virginica Virginica Virginica Setosa Virginica Virginica Virginica Versicolor Setosa Versicolor Versicolor Versicolor
Virginica Virginica Setosa Setosa Virginica Versicolor Virginica Versicolor Virginica Versicolor Setosa Virginica Setosa Setosa
Virginica Setosa Versicolor Setosa Virginica Virginica Virginica Setosa Versicolor Versicolor Virginica

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 2 ( 1%)

1 5 ( 3%)

2 2 ( 1%)

3 7 ( 5%)

4 5 ( 3%)

5 3 ( 2%)

6 6 ( 4%)

7 2 ( 1%)

8 5 ( 3%)

9 1 ( 1%)

10 1 ( 1%)

11 1 ( 1%)
12 2 ( 1%)

13 2 ( 1%)

14 6 ( 4%)

15 2 ( 1%)

16 6 ( 4%)

17 3 ( 2%)

18 1 ( 1%)

19 2 ( 1%)

20 1 ( 1%)

21 9 ( 6%)

22 4 ( 3%)

23 3 ( 2%)

24 7 ( 5%)

25 3 ( 2%)

26 1 ( 1%)

27 2 ( 1%)

28 1 ( 1%)

29 2 ( 1%)

30 2 ( 1%)

31 3 ( 2%)

32 1 ( 1%)

33 3 ( 2%)

34 3 ( 2%)

35 3 ( 2%)

36 3 ( 2%)

37 3 ( 2%)

38 3 ( 2%)

39 3 ( 2%)

40 3 ( 2%)

41 1 ( 1%)

42 1 ( 1%)

43 4 ( 3%)

44 5 ( 3%)

45 2 ( 1%)

46 1 ( 1%)

47 6 ( 4%)

48 1 ( 1%)

49 2 ( 1%)
Number of clusters: 100

=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 100 -A
"weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data

=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 4

Within cluster sum of squared errors: 0.19572102742784764

Initial starting points (random):


Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Cluster 7: 4.4,3,1.3,0.2,Setosa

Cluster 8: 5.5,2.4,3.7,1,Versicolor

Cluster 9: 4.3,3,1.1,0.1,Setosa

Cluster 10: 6,2.7,5.1,1.6,Versicolor

Cluster 11: 5.7,2.5,5,2,Virginica

Cluster 12: 4.6,3.1,1.5,0.2,Setosa

Cluster 13: 7.4,2.8,6.1,1.9,Virginica

Cluster 14: 5.9,3,5.1,1.8,Virginica

Cluster 15: 6.9,3.2,5.7,2.3,Virginica

Cluster 16: 4.9,3.1,1.5,0.2,Setosa

Cluster 17: 6.7,3.3,5.7,2.5,Virginica

Cluster 18: 7.2,3.6,6.1,2.5,Virginica

Cluster 19: 7.3,2.9,6.3,1.8,Virginica

Cluster 20: 6.1,2.8,4.7,1.2,Versicolor

Cluster 21: 5,3.5,1.3,0.3,Setosa

Cluster 22: 6.3,3.3,4.7,1.6,Versicolor

Cluster 23: 5.9,3,4.2,1.5,Versicolor

Cluster 24: 5.7,3,4.2,1.2,Versicolor

Cluster 25: 6.7,3.3,5.7,2.1,Virginica

Cluster 26: 7.7,2.6,6.9,2.3,Virginica

Cluster 27: 5,3.2,1.2,0.2,Setosa

Cluster 28: 4.6,3.6,1,0.2,Setosa

Cluster 29: 6.1,2.6,5.6,1.4,Virginica

Cluster 30: 6.2,2.2,4.5,1.5,Versicolor

Cluster 31: 6.7,2.5,5.8,1.8,Virginica

Cluster 32: 6.3,2.5,4.9,1.5,Versicolor

Cluster 33: 7.7,2.8,6.7,2,Virginica

Cluster 34: 4.9,2.4,3.3,1,Versicolor

Cluster 35: 5.4,3.4,1.7,0.2,Setosa

Cluster 36: 6.3,2.8,5.1,1.5,Virginica


Cluster 37: 5.1,3.5,1.4,0.3,Setosa

Cluster 38: 4.8,3.4,1.6,0.2,Setosa

Cluster 39: 6.7,3.1,5.6,2.4,Virginica

Cluster 40: 4.7,3.2,1.6,0.2,Setosa

Cluster 41: 5,2,3.5,1,Versicolor

Cluster 42: 4.4,2.9,1.4,0.2,Setosa

Cluster 43: 6.9,3.1,5.4,2.1,Virginica

Cluster 44: 5.6,2.8,4.9,2,Virginica

Cluster 45: 7.9,3.8,6.4,2,Virginica

Cluster 46: 4.8,3.1,1.6,0.2,Setosa

Cluster 47: 5.5,2.5,4,1.3,Versicolor

Cluster 48: 5.2,2.7,3.9,1.4,Versicolor

Cluster 49: 6.7,3,5.2,2.3,Virginica

Cluster 50: 5.4,3.9,1.7,0.4,Setosa

Cluster 51: 6.7,3,5,1.7,Versicolor

Cluster 52: 5,3,1.6,0.2,Setosa

Cluster 53: 6.3,3.3,6,2.5,Virginica

Cluster 54: 6.8,3,5.5,2.1,Virginica

Cluster 55: 5,3.6,1.4,0.2,Setosa

Cluster 56: 5.1,3.8,1.9,0.4,Setosa

Cluster 57: 5.7,3.8,1.7,0.3,Setosa

Cluster 58: 5.8,2.8,5.1,2.4,Virginica

Cluster 59: 5.6,2.5,3.9,1.1,Versicolor

Cluster 60: 5.5,2.4,3.8,1.1,Versicolor

Cluster 61: 5.2,3.5,1.5,0.2,Setosa

Cluster 62: 5.3,3.7,1.5,0.2,Setosa

Cluster 63: 5.7,2.8,4.5,1.3,Versicolor

Cluster 64: 5.2,3.4,1.4,0.2,Setosa

Cluster 65: 5,2.3,3.3,1,Versicolor

Cluster 66: 6,3,4.8,1.8,Virginica

Cluster 67: 6.8,3.2,5.9,2.3,Virginica

Cluster 68: 5,3.3,1.4,0.2,Setosa

Cluster 69: 5.5,2.3,4,1.3,Versicolor

Cluster 70: 4.4,3.2,1.3,0.2,Setosa

Cluster 71: 4.5,2.3,1.3,0.3,Setosa

Cluster 72: 4.9,2.5,4.5,1.7,Virginica

Cluster 73: 6.3,2.9,5.6,1.8,Virginica

Cluster 74: 5,3.4,1.5,0.2,Setosa


Cluster 75: 7.7,3,6.1,2.3,Virginica

Cluster 76: 7.2,3,5.8,1.6,Virginica

Cluster 77: 5.5,3.5,1.3,0.2,Setosa

Cluster 78: 6.1,2.8,4,1.3,Versicolor

Cluster 79: 4.8,3,1.4,0.1,Setosa

Cluster 80: 6.5,3.2,5.1,2,Virginica

Cluster 81: 5.6,2.9,3.6,1.3,Versicolor

Cluster 82: 7.7,3.8,6.7,2.2,Virginica

Cluster 83: 6.3,2.5,5,1.9,Virginica

Cluster 84: 5.4,3.7,1.5,0.2,Setosa

Cluster 85: 5,3.4,1.6,0.4,Setosa

Cluster 86: 5.1,3.7,1.5,0.4,Setosa

Cluster 87: 5.4,3,4.5,1.5,Versicolor

Cluster 88: 6,2.2,5,1.5,Virginica

Cluster 89: 6.3,3.4,5.6,2.4,Virginica

Cluster 90: 5.8,2.7,3.9,1.2,Versicolor

Cluster 91: 6.4,2.8,5.6,2.2,Virginica

Cluster 92: 6.4,3.2,5.3,2.3,Virginica

Cluster 93: 4.9,3,1.4,0.2,Setosa

Cluster 94: 6.3,2.7,4.9,1.8,Virginica

Cluster 95: 5.1,2.5,3,1.1,Versicolor

Cluster 96: 5.6,3,4.5,1.5,Versicolor

Cluster 97: 5.1,3.8,1.5,0.3,Setosa

Cluster 98: 6.7,3.1,4.7,1.5,Versicolor

Cluster 99: 6.2,2.8,4.8,1.8,Virginica

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15


16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99

(150.0) (1.0) (3.0) (1.0) (4.0) (2.0) (3.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (2.0) (1.0) (1.0) (1.0) (1.0) (1.0) (4.0) (1.0) (3.0) (1.0) (1.0) (1.0) (1.0) (1.0) (2.0) (1.0)
(1.0) (2.0) (1.0) (2.0) (1.0) (1.0) (3.0) (1.0) (3.0) (1.0) (1.0) (2.0) (3.0) (1.0) (1.0) (2.0) (1.0)
(1.0) (2.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (2.0) (2.0) (2.0) (1.0) (3.0) (1.0) (1.0)
(2.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (3.0) (2.0) (1.0) (2.0) (1.0) (1.0) (1.0) (2.0) (1.0) (1.0)
(1.0) (1.0) (3.0) (1.0) (1.0) (1.0) (2.0) (3.0) (3.0) (1.0) (2.0) (2.0) (1.0) (1.0) (2.0) (4.0) (1.0)

=================================================================================================================

sepal.length 5.8433 6.5 6.4 6.9 5.55 6.95 6.0667 4.9 4.4 6 4.3 6 5.7 4.6 7.4 5.9
6.9 4.9 6.7 7.2 7.3 6.1 5 6.15 5.9 5.6667 6.7 7.7 5 4.6 6.1 6.25 6.7 6.3
7.65 4.9 5.4 6.3 5.1 4.7333 6.7 4.6667 5 4.4 6.85 5.7333 7.9 4.8 5.5 5.2 6.7
5.4 6.7 5 6.3 7.1 5 5.1 5.7 5.8 5.65 5.5 5.15 5.3 5.6667 5.2 5 6.05 6.8
5 5.5 4.4 4.5 4.9 6.4 5.05 7.7 7.2 5.5 6.1 4.8 6.5 5.6 7.7 6.3 5.4 5.0333
5.1 5.4 6 6.25 5.8 6.4333 6.4 4.85 6.35 5.1 5.6 5.1 6.7 6.2

sepal.width 3.0573 2.8 2.9 3.1 4.175 3.15 2.9333 3.6 3 2.2 3 2.7 2.5 3.1 2.8 3
3.2 3.1 3.3 3.6 2.9 2.8 3.5 3.275 3 2.9667 3.3 2.6 3.2 3.6 2.6 2.25 2.5 2.5
2.9 2.4 3.4 2.8 3.5 3.4 3.1 3.2 2 2.9 3.05 2.7333 3.8 3.1 2.55 2.7 3 3.9
3 3 3.3 3 3.6 3.8 3.8 2.8 2.55 2.4 3.5 3.7 2.7667 3.4 2.3 3 3.2 3.3 2.3
3.2 2.3 2.5 3 3.4 3 3.1 3.5 2.8 3 3.1 2.9 3.8 2.5 3.7 3.4 3.7 3 2.2
3.4 2.6667 2.8667 3.2 3 2.7 2.5 3 3.8 3 2.8

petal.length 3.758 4.6 4.4 5.1 1.4 4.8 4.6 1.4 1.3 4 1.1 5.1 5 1.5 6.1 5.1
5.7 1.5 5.7 6.1 6.3 4.7 1.3 4.625 4.2 4.1667 5.7 6.9 1.2 1 5.6 4.45 5.8 4.9
6.65 3.3 1.6 5.1 1.4 1.6333 5.6 1.4333 3.5 1.4 5.45 5.0333 6.4 1.6 4.2 3.9 5.2
1.5 5 1.6 6 5.9 1.4 1.9 1.7 5.1 3.7 3.75 1.45 1.5 4.2667 1.4 3.3 4.85 5.9
1.4 4 1.3 1.3 4.5 5.5333 1.5 6.1 5.9 1.3 4 1.4 5.15 3.6 6.7 5 1.5 1.6333
1.5 4.5 5 5.5 4 5.6667 5.3 1.4 5.1 3 4.5 1.55 4.575 4.8

petal.width 1.1993 1.5 1.3 2.3 0.225 1.45 1.4333 0.1 0.2 1 0.1 1.6 2 0.2 1.9
1.8 2.3 0.15 2.5 2.5 1.8 1.2 0.3 1.625 1.5 1.2667 2.1 2.3 0.2 0.2 1.4 1.4 1.8
1.5 2.05 1 0.3 1.5 0.3 0.2333 2.4 0.2 1 0.2 2.1 1.9333 2 0.2 1.25 1.4 2.3
0.4 1.7 0.2 2.5 2.1 0.2 0.4 0.3 2.4 1.05 1.05 0.2 0.2 1.3 0.2 1 1.8 2.3
0.2 1.3 0.2 0.3 1.7 1.8 0.2 2.3 1.7 0.2 1.3 0.1 2 1.3 2.2 1.9 0.2 0.5 0.4
1.5 1.5 2.35 1.1333 2.1667 2.3 0.25 1.85 1.1 1.5 0.25 1.425 1.8

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa Setosa Versicolor Setosa Versicolor
Virginica Setosa Virginica Virginica Virginica Setosa Virginica Virginica Virginica Versicolor Setosa Versicolor Versicolor Versicolor
Virginica Virginica Setosa Setosa Virginica Versicolor Virginica Versicolor Virginica Versicolor Setosa Virginica Setosa Setosa
Virginica Setosa Versicolor Setosa Virginica Virginica Virginica Setosa Versicolor Versicolor Virginica Setosa Versicolor Setosa
Virginica Virginica Setosa Setosa Setosa Virginica Versicolor Versicolor Setosa Setosa Versicolor Setosa Versicolor Virginica
Virginica Setosa Versicolor Setosa Setosa Virginica Virginica Setosa Virginica Virginica Setosa Versicolor Setosa Virginica
Versicolor Virginica Virginica Setosa Setosa Setosa Versicolor Virginica Virginica Versicolor Virginica Virginica Setosa Virginica
Versicolor Versicolor Setosa Versicolor Virginica

Time taken to build model (full training data) : 0.02 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 1 ( 1%)
1 3 ( 2%)

2 1 ( 1%)

3 4 ( 3%)

4 2 ( 1%)

5 3 ( 2%)

6 1 ( 1%)

7 1 ( 1%)

8 1 ( 1%)

9 1 ( 1%)

10 1 ( 1%)

11 1 ( 1%)

12 1 ( 1%)

13 1 ( 1%)

14 1 ( 1%)

15 1 ( 1%)

16 2 ( 1%)

17 1 ( 1%)

18 1 ( 1%)

19 1 ( 1%)

20 1 ( 1%)

21 1 ( 1%)

22 4 ( 3%)

23 1 ( 1%)

24 3 ( 2%)

25 1 ( 1%)

26 1 ( 1%)

27 1 ( 1%)

28 1 ( 1%)

29 1 ( 1%)

30 2 ( 1%)

31 1 ( 1%)

32 1 ( 1%)

33 2 ( 1%)

34 1 ( 1%)

35 2 ( 1%)

36 1 ( 1%)

37 1 ( 1%)

38 3 ( 2%)
39 1 ( 1%)

40 3 ( 2%)

41 1 ( 1%)

42 1 ( 1%)

43 2 ( 1%)

44 3 ( 2%)

45 1 ( 1%)

46 1 ( 1%)

47 2 ( 1%)

48 1 ( 1%)

49 1 ( 1%)

50 2 ( 1%)

51 1 ( 1%)

52 1 ( 1%)

53 1 ( 1%)

54 1 ( 1%)

55 1 ( 1%)

56 1 ( 1%)

57 1 ( 1%)

58 1 ( 1%)

59 2 ( 1%)

60 2 ( 1%)

61 2 ( 1%)

62 1 ( 1%)

63 3 ( 2%)

64 1 ( 1%)

65 1 ( 1%)

66 2 ( 1%)

67 1 ( 1%)

68 1 ( 1%)

69 1 ( 1%)

70 1 ( 1%)

71 1 ( 1%)

72 1 ( 1%)

73 3 ( 2%)

74 2 ( 1%)

75 1 ( 1%)

76 2 ( 1%)
77 1 ( 1%)

78 1 ( 1%)

79 1 ( 1%)

80 2 ( 1%)

81 1 ( 1%)

82 1 ( 1%)

83 1 ( 1%)

84 1 ( 1%)

85 3 ( 2%)

86 1 ( 1%)

87 1 ( 1%)

88 1 ( 1%)

89 2 ( 1%)

90 3 ( 2%)

91 3 ( 2%)

92 1 ( 1%)

93 2 ( 1%)

94 2 ( 1%)

95 1 ( 1%)

96 1 ( 1%)

97 2 ( 1%)

98 4 ( 3%)

99 1 ( 1%)

Number of clusters: 146
=== Run information ===

Scheme: weka.clusterers.SimpleKMeans -init 0 -max-candidates 100 -periodic-pruning 10000 -min-density 2.0 -t1 -1.25 -t2 -1.0 -N 146 -A "weka.core.EuclideanDistance -R first-last" -I 500 -num-slots 1 -S 10

Relation: iris

Instances: 150

Attributes: 5

sepal.length

sepal.width

petal.length

petal.width

variety

Test mode: evaluate on training data


=== Clustering model (full training set) ===

kMeans

======

Number of iterations: 3

Within cluster sum of squared errors: 0.008237176418015264

Initial starting points (random):

Cluster 0: 6.1,2.9,4.7,1.4,Versicolor

Cluster 1: 6.2,2.9,4.3,1.3,Versicolor

Cluster 2: 6.9,3.1,5.1,2.3,Virginica

Cluster 3: 5.5,4.2,1.4,0.2,Setosa

Cluster 4: 6.9,3.1,4.9,1.5,Versicolor

Cluster 5: 6.1,3,4.6,1.4,Versicolor

Cluster 6: 4.9,3.6,1.4,0.1,Setosa

Cluster 7: 4.4,3,1.3,0.2,Setosa

Cluster 8: 5.5,2.4,3.7,1,Versicolor

Cluster 9: 4.3,3,1.1,0.1,Setosa

Cluster 10: 6,2.7,5.1,1.6,Versicolor

Cluster 11: 5.7,2.5,5,2,Virginica

Cluster 12: 4.6,3.1,1.5,0.2,Setosa

Cluster 13: 7.4,2.8,6.1,1.9,Virginica

Cluster 14: 5.9,3,5.1,1.8,Virginica

Cluster 15: 6.9,3.2,5.7,2.3,Virginica

Cluster 16: 4.9,3.1,1.5,0.2,Setosa

Cluster 17: 6.7,3.3,5.7,2.5,Virginica

Cluster 18: 7.2,3.6,6.1,2.5,Virginica

Cluster 19: 7.3,2.9,6.3,1.8,Virginica

Cluster 20: 6.1,2.8,4.7,1.2,Versicolor

Cluster 21: 5,3.5,1.3,0.3,Setosa

Cluster 22: 6.3,3.3,4.7,1.6,Versicolor

Cluster 23: 5.9,3,4.2,1.5,Versicolor

Cluster 24: 5.7,3,4.2,1.2,Versicolor

Cluster 25: 6.7,3.3,5.7,2.1,Virginica


Cluster 26: 7.7,2.6,6.9,2.3,Virginica

Cluster 27: 5,3.2,1.2,0.2,Setosa

Cluster 28: 4.6,3.6,1,0.2,Setosa

Cluster 29: 6.1,2.6,5.6,1.4,Virginica

Cluster 30: 6.2,2.2,4.5,1.5,Versicolor

Cluster 31: 6.7,2.5,5.8,1.8,Virginica

Cluster 32: 6.3,2.5,4.9,1.5,Versicolor

Cluster 33: 7.7,2.8,6.7,2,Virginica

Cluster 34: 4.9,2.4,3.3,1,Versicolor

Cluster 35: 5.4,3.4,1.7,0.2,Setosa

Cluster 36: 6.3,2.8,5.1,1.5,Virginica

Cluster 37: 5.1,3.5,1.4,0.3,Setosa

Cluster 38: 4.8,3.4,1.6,0.2,Setosa

Cluster 39: 6.7,3.1,5.6,2.4,Virginica

Cluster 40: 4.7,3.2,1.6,0.2,Setosa

Cluster 41: 5,2,3.5,1,Versicolor

Cluster 42: 4.4,2.9,1.4,0.2,Setosa

Cluster 43: 6.9,3.1,5.4,2.1,Virginica

Cluster 44: 5.6,2.8,4.9,2,Virginica

Cluster 45: 7.9,3.8,6.4,2,Virginica

Cluster 46: 4.8,3.1,1.6,0.2,Setosa

Cluster 47: 5.5,2.5,4,1.3,Versicolor

Cluster 48: 5.2,2.7,3.9,1.4,Versicolor

Cluster 49: 6.7,3,5.2,2.3,Virginica

Cluster 50: 5.4,3.9,1.7,0.4,Setosa

Cluster 51: 6.7,3,5,1.7,Versicolor

Cluster 52: 5,3,1.6,0.2,Setosa

Cluster 53: 6.3,3.3,6,2.5,Virginica

Cluster 54: 6.8,3,5.5,2.1,Virginica

Cluster 55: 5,3.6,1.4,0.2,Setosa

Cluster 56: 5.1,3.8,1.9,0.4,Setosa

Cluster 57: 5.7,3.8,1.7,0.3,Setosa

Cluster 58: 5.8,2.8,5.1,2.4,Virginica

Cluster 59: 5.6,2.5,3.9,1.1,Versicolor

Cluster 60: 5.5,2.4,3.8,1.1,Versicolor

Cluster 61: 5.2,3.5,1.5,0.2,Setosa

Cluster 62: 5.3,3.7,1.5,0.2,Setosa

Cluster 63: 5.7,2.8,4.5,1.3,Versicolor


Cluster 64: 5.2,3.4,1.4,0.2,Setosa

Cluster 65: 5,2.3,3.3,1,Versicolor

Cluster 66: 6,3,4.8,1.8,Virginica

Cluster 67: 6.8,3.2,5.9,2.3,Virginica

Cluster 68: 5,3.3,1.4,0.2,Setosa

Cluster 69: 5.5,2.3,4,1.3,Versicolor

Cluster 70: 4.4,3.2,1.3,0.2,Setosa

Cluster 71: 4.5,2.3,1.3,0.3,Setosa

Cluster 72: 4.9,2.5,4.5,1.7,Virginica

Cluster 73: 6.3,2.9,5.6,1.8,Virginica

Cluster 74: 5,3.4,1.5,0.2,Setosa

Cluster 75: 7.7,3,6.1,2.3,Virginica

Cluster 76: 7.2,3,5.8,1.6,Virginica

Cluster 77: 5.5,3.5,1.3,0.2,Setosa

Cluster 78: 6.1,2.8,4,1.3,Versicolor

Cluster 79: 4.8,3,1.4,0.1,Setosa

Cluster 80: 6.5,3.2,5.1,2,Virginica

Cluster 81: 5.6,2.9,3.6,1.3,Versicolor

Cluster 82: 7.7,3.8,6.7,2.2,Virginica

Cluster 83: 6.3,2.5,5,1.9,Virginica

Cluster 84: 5.4,3.7,1.5,0.2,Setosa

Cluster 85: 5,3.4,1.6,0.4,Setosa

Cluster 86: 5.1,3.7,1.5,0.4,Setosa

Cluster 87: 5.4,3,4.5,1.5,Versicolor

Cluster 88: 6,2.2,5,1.5,Virginica

Cluster 89: 6.3,3.4,5.6,2.4,Virginica

Cluster 90: 5.8,2.7,3.9,1.2,Versicolor

Cluster 91: 6.4,2.8,5.6,2.2,Virginica

Cluster 92: 6.4,3.2,5.3,2.3,Virginica

Cluster 93: 4.9,3,1.4,0.2,Setosa

Cluster 94: 6.3,2.7,4.9,1.8,Virginica

Cluster 95: 5.1,2.5,3,1.1,Versicolor

Cluster 96: 5.6,3,4.5,1.5,Versicolor

Cluster 97: 5.1,3.8,1.5,0.3,Setosa

Cluster 98: 6.7,3.1,4.7,1.5,Versicolor

Cluster 99: 6.2,2.8,4.8,1.8,Virginica

Cluster 100: 5.5,2.6,4.4,1.2,Versicolor

Cluster 101: 5.9,3.2,4.8,1.8,Versicolor


Cluster 102: 6.6,3,4.4,1.4,Versicolor

Cluster 103: 5.7,2.8,4.1,1.3,Versicolor

Cluster 104: 5.4,3.4,1.5,0.4,Setosa

Cluster 105: 5.7,4.4,1.5,0.4,Setosa

Cluster 106: 4.8,3.4,1.9,0.2,Setosa

Cluster 107: 5.6,3,4.1,1.3,Versicolor

Cluster 108: 5.1,3.8,1.6,0.2,Setosa

Cluster 109: 5,3.5,1.6,0.6,Setosa

Cluster 110: 5.1,3.3,1.7,0.5,Setosa

Cluster 111: 6.7,3.1,4.4,1.4,Versicolor

Cluster 112: 5.8,2.7,4.1,1,Versicolor

Cluster 113: 4.8,3,1.4,0.3,Setosa

Cluster 114: 6,2.2,4,1,Versicolor

Cluster 115: 5.7,2.9,4.2,1.3,Versicolor

Cluster 116: 6.1,3,4.9,1.8,Virginica

Cluster 117: 4.9,3.1,1.5,0.1,Setosa

Cluster 118: 6.4,3.1,5.5,1.8,Virginica

Cluster 119: 5.7,2.6,3.5,1,Versicolor

Cluster 120: 6,3.4,4.5,1.6,Versicolor

Cluster 121: 6,2.9,4.5,1.5,Versicolor

Cluster 122: 5.1,3.4,1.5,0.2,Setosa

Cluster 123: 7,3.2,4.7,1.4,Versicolor

Cluster 124: 6.5,3,5.2,2,Virginica

Cluster 125: 5.8,2.7,5.1,1.9,Virginica

Cluster 126: 7.2,3.2,6,1.8,Virginica

Cluster 127: 5.8,2.6,4,1.2,Versicolor

Cluster 128: 6.4,2.8,5.6,2.1,Virginica

Cluster 129: 7.6,3,6.6,2.1,Virginica

Cluster 130: 5.6,2.7,4.2,1.3,Versicolor

Cluster 131: 5.1,3.5,1.4,0.2,Setosa

Cluster 132: 6.4,3.2,4.5,1.5,Versicolor

Cluster 133: 6.5,3,5.8,2.2,Virginica

Cluster 134: 6.2,3.4,5.4,2.3,Virginica

Cluster 135: 6.4,2.7,5.3,1.9,Virginica

Cluster 136: 6.3,2.3,4.4,1.3,Versicolor

Cluster 137: 4.7,3.2,1.3,0.2,Setosa

Cluster 138: 4.6,3.2,1.4,0.2,Setosa

Cluster 139: 6.5,3,5.5,1.8,Virginica


Cluster 140: 5.4,3.9,1.3,0.4,Setosa

Cluster 141: 5.2,4.1,1.5,0.1,Setosa

Cluster 142: 5.8,4,1.2,0.2,Setosa

Cluster 143: 6.5,2.8,4.6,1.5,Versicolor

Cluster 144: 6.6,2.9,4.6,1.3,Versicolor

Cluster 145: 6.4,2.9,4.3,1.3,Versicolor

Missing values globally replaced with mean/mode

Final cluster centroids:

Cluster#

Attribute Full Data 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15


16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109
110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
144 145

(150.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (2.0) (1.0) (1.0) (1.0) (1.0) (2.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (2.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0)
(1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (1.0) (2.0) (1.0)

=================================================================================================================

sepal.length 5.8433 6.1 6.2 6.9 5.5 6.9 6.1 4.9 4.4 5.5 4.3 6 5.7 4.6 7.4 5.9
6.9 4.9 6.7 7.2 7.3 6.1 5 6.3 5.9 5.7 6.7 7.7 5 4.6 6.1 6.2 6.7 6.3 7.7
4.9 5.4 6.3 5.1 4.7 6.7 4.7 5 4.4 6.85 5.6 7.9 4.8 5.5 5.2 6.7 5.4 6.7 5
6.3 7.1 5 5.1 5.7 5.8 5.6 5.5 5.2 5.3 5.7 5.2 5 6 6.8 5 5.5 4.4 4.5
4.9 6.3 5 7.7 7.2 5.5 6.1 4.8 6.5 5.6 7.7 6.3 5.4 5 5.1 5.4 6 6.3 5.8
6.4 6.4 4.9 6.3 5.1 5.6 5.1 6.7 6.2 5.5 5.9 6.6 5.7 5.4 5.7 4.8 5.6 5.1 5
5.1 6.7 5.8 4.8 6 5.7 6.1 4.9 6.4 5.7 6 6 5.1 7 6.5 5.8 7.2 5.8 6.4
7.6 5.6 5.1 6.4 6.5 6.2 6.4 6.3 4.7 4.6 6.5 5.4 5.2 5.8 6.5 6.7 6.4

sepal.width 3.0573 2.9 2.9 3.1 4.2 3.1 3 3.6 3 2.4 3 2.7 2.5 3.1 2.8 3
3.2 3.1 3.3 3.6 2.9 2.8 3.5 3.3 3 3 3.3 2.6 3.2 3.6 2.6 2.2 2.5 2.5 2.8
2.4 3.4 2.8 3.5 3.4 3.1 3.2 2 2.9 3.05 2.8 3.8 3.1 2.5 2.7 3 3.9 3 3
3.3 3 3.6 3.8 3.8 2.8 2.5 2.4 3.5 3.7 2.8 3.4 2.3 3 3.2 3.3 2.3 3.2 2.3
2.5 2.9 3.4 3 3 3.5 2.8 3 3.2 2.9 3.8 2.5 3.7 3.4 3.7 3 2.2 3.4 2.7
2.8 3.2 3 2.7 2.5 3 3.8 3.1 2.8 2.6 3.2 3 2.8 3.4 4.4 3.4 3 3.8 3.5
3.3 3.1 2.7 3 2.2 2.9 3 3.1 3.1 2.6 3.4 2.9 3.4 3.2 3 2.7 3.2 2.6 2.8
3 2.7 3.5 3.2 3 3.4 2.7 2.3 3.2 3.2 3 3.9 4.1 4 2.8 2.85 2.9

petal.length 3.758 4.7 4.3 5.1 1.4 4.9 4.6 1.4 1.3 3.7 1.1 5.1 5 1.5 6.1 5.1
5.7 1.5 5.7 6.1 6.3 4.7 1.3 4.7 4.2 4.2 5.7 6.9 1.2 1 5.6 4.5 5.8 4.9 6.7
3.3 1.7 5.1 1.4 1.5 5.6 1.6 3.5 1.4 5.45 4.9 6.4 1.6 4 3.9 5.2 1.7 5 1.6
6 5.9 1.4 1.9 1.7 5.1 3.9 3.8 1.5 1.5 4.5 1.4 3.3 4.8 5.9 1.4 4 1.3 1.3
4.5 5.6 1.5 6.1 5.8 1.3 4 1.4 5.1 3.6 6.7 5 1.5 1.6 1.5 4.5 5 5.6 3.9
5.6 5.3 1.4 4.9 3 4.5 1.5 4.7 4.8 4.4 4.8 4.4 4.1 1.5 1.5 1.9 4.1 1.6 1.6
1.7 4.4 4.1 1.4 4 4.2 4.9 1.5 5.5 3.5 4.5 4.5 1.5 4.7 5.2 5.1 6 4 5.6
6.6 4.2 1.4 4.5 5.8 5.4 5.3 4.4 1.3 1.4 5.5 1.3 1.5 1.2 4.6 4.7 4.3

petal.width 1.1993 1.4 1.3 2.3 0.2 1.5 1.4 0.1 0.2 1 0.1 1.6 2 0.2 1.9 1.8
2.3 0.2 2.5 2.5 1.8 1.2 0.3 1.6 1.5 1.2 2.1 2.3 0.2 0.2 1.4 1.5 1.8 1.5 2
1 0.2 1.5 0.3 0.25 2.4 0.2 1 0.2 2.1 2 2 0.2 1.3 1.4 2.3 0.4 1.7 0.2
2.5 2.1 0.2 0.4 0.3 2.4 1.1 1.1 0.2 0.2 1.3 0.2 1 1.8 2.3 0.2 1.3 0.2 0.3
1.7 1.8 0.2 2.3 1.6 0.2 1.3 0.1 2 1.3 2.2 1.9 0.2 0.4 0.4 1.5 1.5 2.4 1.2
2.2 2.3 0.2 1.8 1.1 1.5 0.3 1.5 1.8 1.2 1.8 1.4 1.3 0.4 0.4 0.2 1.3 0.2
0.6 0.5 1.4 1 0.3 1 1.3 1.8 0.1 1.8 1 1.6 1.5 0.2 1.4 2 1.9 1.8 1.2
2.1 2.1 1.3 0.2 1.5 2.2 2.3 1.9 1.3 0.2 0.2 1.8 0.4 0.1 0.2 1.5 1.35 1.3

variety Setosa Versicolor Versicolor Virginica Setosa Versicolor Versicolor Setosa Setosa Versicolor Setosa Versicolor
Virginica Setosa Virginica Virginica Virginica Setosa Virginica Virginica Virginica Versicolor Setosa Versicolor Versicolor Versicolor
Virginica Virginica Setosa Setosa Virginica Versicolor Virginica Versicolor Virginica Versicolor Setosa Virginica Setosa Setosa
Virginica Setosa Versicolor Setosa Virginica Virginica Virginica Setosa Versicolor Versicolor Virginica Setosa Versicolor Setosa
Virginica Virginica Setosa Setosa Setosa Virginica Versicolor Versicolor Setosa Setosa Versicolor Setosa Versicolor Virginica
Virginica Setosa Versicolor Setosa Setosa Virginica Virginica Setosa Virginica Virginica Setosa Versicolor Setosa Virginica
Versicolor Virginica Virginica Setosa Setosa Setosa Versicolor Virginica Virginica Versicolor Virginica Virginica Setosa Virginica
Versicolor Versicolor Setosa Versicolor Virginica Versicolor Versicolor Versicolor Versicolor Setosa Setosa Setosa Versicolor
Setosa Setosa Setosa Versicolor Versicolor Setosa Versicolor Versicolor Virginica Setosa Virginica Versicolor Versicolor Versicolor
Setosa Versicolor Virginica Virginica Virginica Versicolor Virginica Virginica Versicolor Setosa Versicolor Virginica Virginica Virginica
Versicolor Setosa Setosa Virginica Setosa Setosa Setosa Versicolor Versicolor Versicolor

Time taken to build model (full training data) : 0 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 1 ( 1%)

1 1 ( 1%)

2 1 ( 1%)

3 1 ( 1%)

4 1 ( 1%)

5 1 ( 1%)

6 1 ( 1%)
7 1 ( 1%)

8 1 ( 1%)

9 1 ( 1%)

10 1 ( 1%)

11 1 ( 1%)

12 1 ( 1%)

13 1 ( 1%)

14 1 ( 1%)

15 1 ( 1%)

16 1 ( 1%)

17 1 ( 1%)

18 1 ( 1%)

19 1 ( 1%)

20 1 ( 1%)

21 1 ( 1%)

22 1 ( 1%)

23 1 ( 1%)

24 1 ( 1%)

25 1 ( 1%)

26 1 ( 1%)

27 1 ( 1%)

28 1 ( 1%)

29 1 ( 1%)

30 1 ( 1%)

31 1 ( 1%)

32 1 ( 1%)

33 1 ( 1%)

34 1 ( 1%)

35 1 ( 1%)

36 1 ( 1%)

37 1 ( 1%)

38 2 ( 1%)

39 1 ( 1%)

40 1 ( 1%)

41 1 ( 1%)

42 1 ( 1%)

43 2 ( 1%)

44 1 ( 1%)
45 1 ( 1%)

46 1 ( 1%)

47 1 ( 1%)

48 1 ( 1%)

49 1 ( 1%)

50 1 ( 1%)

51 1 ( 1%)

52 1 ( 1%)

53 1 ( 1%)

54 1 ( 1%)

55 1 ( 1%)

56 1 ( 1%)

57 1 ( 1%)

58 1 ( 1%)

59 1 ( 1%)

60 1 ( 1%)

61 1 ( 1%)

62 1 ( 1%)

63 1 ( 1%)

64 1 ( 1%)

65 1 ( 1%)

66 1 ( 1%)

67 1 ( 1%)

68 1 ( 1%)

69 1 ( 1%)

70 1 ( 1%)

71 1 ( 1%)

72 1 ( 1%)

73 1 ( 1%)

74 1 ( 1%)

75 1 ( 1%)

76 1 ( 1%)

77 1 ( 1%)

78 1 ( 1%)

79 1 ( 1%)

80 1 ( 1%)

81 1 ( 1%)

82 1 ( 1%)
83 1 ( 1%)

84 1 ( 1%)

85 1 ( 1%)

86 1 ( 1%)

87 1 ( 1%)

88 1 ( 1%)

89 1 ( 1%)

90 1 ( 1%)

91 1 ( 1%)

92 1 ( 1%)

93 1 ( 1%)

94 1 ( 1%)

95 1 ( 1%)

96 1 ( 1%)

97 1 ( 1%)

98 1 ( 1%)

99 1 ( 1%)

100 1 ( 1%)

101 1 ( 1%)

102 1 ( 1%)

103 1 ( 1%)

104 1 ( 1%)

105 1 ( 1%)

106 1 ( 1%)

107 1 ( 1%)

108 1 ( 1%)

109 1 ( 1%)

110 1 ( 1%)

111 1 ( 1%)

112 1 ( 1%)

113 1 ( 1%)

114 1 ( 1%)

115 1 ( 1%)

116 1 ( 1%)

117 1 ( 1%)

118 1 ( 1%)

119 1 ( 1%)

120 1 ( 1%)
121 1 ( 1%)

122 1 ( 1%)

123 1 ( 1%)

124 1 ( 1%)

125 2 ( 1%)

126 1 ( 1%)

127 1 ( 1%)

128 1 ( 1%)

129 1 ( 1%)

130 1 ( 1%)

131 1 ( 1%)

132 1 ( 1%)

133 1 ( 1%)

134 1 ( 1%)

135 1 ( 1%)

136 1 ( 1%)

137 1 ( 1%)

138 1 ( 1%)

139 1 ( 1%)

140 1 ( 1%)

141 1 ( 1%)

142 1 ( 1%)

143 1 ( 1%)

144 2 ( 1%)

145 1 ( 1%)

Table: K-clusters vs. squared error

Python code to visualize the data from the K-clusters vs. squared error table


import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Data: one squared-error value per Weka SimpleKMeans run, in run order.
# The x values 1..12 are the positions of the runs; the last error value
# (0.0082...) is the one Weka reported for the run with 146 clusters.
data = {
    'K-clusters': list(range(1, 13)),
    'Squared error': [62.127790750538175, 7.801559361268048, 6.597925743648829, 6.277659330769319,
                      6.1159421000391125, 5.217629646927634, 4.859535068386244, 4.678874159874298,
                      4.587500225526149, 0.675478167559253, 0.19572102742784764, 0.008237176418015264]
}

# Create the DataFrame
df = pd.DataFrame(data)

# Plot the squared error against the K-clusters column
sns.set_style("whitegrid")
plt.figure(figsize=(10, 6))
sns.lineplot(data=df, x='K-clusters', y='Squared error', marker='o', color='b', markersize=8, linewidth=2)
plt.title('Squared Error vs. K-Clusters')
plt.xlabel('K-Clusters')
plt.ylabel('Squared Error')
plt.xticks(df['K-clusters'])
plt.show()
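The assignment asks for the RMS error, while the values plotted here are the within-cluster sums of squared errors reported by Weka. A minimal sketch of the conversion follows, assuming the RMS error is defined as sqrt(SSE / n) with n = 150 instances. The definition used in unit 6 is not reproduced in this report, so that formula is an assumption, and `sse_to_rms` is a hypothetical helper name.

```python
import math

N_INSTANCES = 150  # number of instances Weka reports for the iris relation


def sse_to_rms(sse: float, n: int = N_INSTANCES) -> float:
    """Convert a sum of squared errors into a root-mean-square error.

    Assumes RMS = sqrt(SSE / n); this definition is an assumption.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if sse < 0:
        raise ValueError("sse must be non-negative")
    return math.sqrt(sse / n)


# Squared-error values copied from the report, in run order.
sse_values = [62.127790750538175, 7.801559361268048, 6.597925743648829, 6.277659330769319,
              6.1159421000391125, 5.217629646927634, 4.859535068386244, 4.678874159874298,
              4.587500225526149, 0.675478167559253, 0.19572102742784764, 0.008237176418015264]

rms_values = [sse_to_rms(s) for s in sse_values]

for position, rms in enumerate(rms_values, start=1):
    print(f"run {position}: RMS = {rms:.4f}")
```

Because the square root is monotonic, the RMS curve keeps the same ordering as the squared-error curve; only the vertical scale changes.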

After plotting the graph, we can look for the "knee" of the curve, that is, the point
where its slope changes markedly. This point indicates a suitable number of clusters
for the model. Identifying it visually gives the number of clusters to use for the
dataset provided.
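The visual inspection can be complemented with numbers. The sketch below prints how much the squared error falls from each run to the next, both in absolute terms and relative to the previous value, so that changes in slope can be read off directly. The helper names `error_drops` and `relative_drops` are hypothetical, and the runs are identified by their position in the list rather than by a cluster count.

```python
from typing import List


def error_drops(errors: List[float]) -> List[float]:
    """Absolute decrease in error between consecutive runs."""
    return [prev - curr for prev, curr in zip(errors, errors[1:])]


def relative_drops(errors: List[float]) -> List[float]:
    """Decrease in error between consecutive runs, as a fraction of the previous error."""
    return [(prev - curr) / prev for prev, curr in zip(errors, errors[1:])]


# Squared-error values copied from the report, in run order.
errors = [62.127790750538175, 7.801559361268048, 6.597925743648829, 6.277659330769319,
          6.1159421000391125, 5.217629646927634, 4.859535068386244, 4.678874159874298,
          4.587500225526149, 0.675478167559253, 0.19572102742784764, 0.008237176418015264]

drops = error_drops(errors)
fractions = relative_drops(errors)

for position, (drop, fraction) in enumerate(zip(drops, fractions), start=1):
    print(f"run {position} -> run {position + 1}: drop = {drop:.4f} ({fraction:.1%})")
```

A large drop followed by a run of small ones is what the knee looks like in this listing.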

In the plot, the slope of the curve appears to change markedly at around 10 clusters.
Before 10 clusters, the squared error decreases rapidly as the number of clusters
increases; after 10 clusters, the reduction in squared error is much more gradual.
We can therefore conclude that the ideal number of clusters for the dataset provided
appears to be around 10. Dividing the data into 10 clusters probably captures the
underlying structure of the data adequately while keeping the squared error low.
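When reading the curve, it helps to remember that the within-cluster squared error can only fall as clusters are added, reaching zero when every instance has its own cluster. This is why the run with 146 clusters reports an error close to zero, and why the knee is used instead of the minimum. The sketch below illustrates this on four made-up points. `within_cluster_sse` is a hypothetical helper that works on raw coordinates, so its values are not comparable with Weka's, which may apply its own preprocessing to the attributes.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def within_cluster_sse(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Sum of squared distances from each point to the mean of its cluster."""
    groups: Dict[int, List[Point]] = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        # The centroid is the attribute-wise mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Illustrative toy data (not taken from the iris dataset).
toy_points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]

sse_one_cluster = within_cluster_sse(toy_points, [0, 0, 0, 0])
sse_two_clusters = within_cluster_sse(toy_points, [0, 0, 1, 1])
sse_four_clusters = within_cluster_sse(toy_points, [0, 1, 2, 3])

print(sse_one_cluster, sse_two_clusters, sse_four_clusters)
```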
