
Pattern Classification in Images

Scene Classification and Semantic Segmentation

Jefersson Alex dos Santos

jefersson@dcc.ufmg.br

DCC
DCC029/868 - Digital Image Processing
Supervised Classification

Lecture Outline

1 Supervised Classification

2 Deep Vs Hand-Crafted Features
Strategies to Exploit ConvNets
Experimental Analysis

3 Semantic Segmentation
Context Window-Based Approach
Fully Convolutional Neural Network


Pattern Classifier
Typical Steps

Building

Multimedia Dataset → Feature Extraction → Pattern Representation → Classifier Training

Example

Artificial Neural Networks


Support Vector Machines
Nearest Neighbors
...

How to use the classifier?

Using

Multimedia Object → Feature Extraction → Pattern Representation → Classifier → Predicted Class
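The building and using steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the gray-level histogram feature, the nearest-centroid classifier, and the names `extract_features`, `train`, and `predict` are hypothetical choices, not something the slides prescribe.

```python
from collections import defaultdict

def extract_features(image, bins=4, max_value=256):
    """Feature extraction: normalized gray-level histogram of a 2-D image."""
    hist = [0] * bins
    pixels = [p for row in image for p in row]
    for p in pixels:
        hist[p * bins // max_value] += 1
    return [h / len(pixels) for h in hist]

def train(dataset):
    """Building: average the feature vectors of each class (nearest centroid)."""
    grouped = defaultdict(list)
    for image, label in dataset:
        grouped[label].append(extract_features(image))
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}

def predict(model, image):
    """Using: extract features from a new object and return the closest class."""
    feats = extract_features(image)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))

# Toy dataset (assumed): one dark and one bright 2x2 image.
dark = [[10, 20], [30, 40]]
bright = [[200, 220], [240, 250]]
model = train([(dark, "dark"), (bright, "bright")])
prediction = predict(model, [[15, 25], [35, 60]])
```

The same `extract_features` function is used in both phases, which is the point of the diagram: the classifier only ever sees pattern representations, never raw objects.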


K-Nearest Neighbor
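The figures for this slide were not recovered, so the following is a minimal sketch of the standard k-nearest-neighbor rule rather than a reconstruction of the lecture's example. The function name `knn_predict`, the Euclidean distance, and the toy points are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query by majority vote among its k closest training samples."""
    # Rank training samples by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data (assumed): two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]
near_a = knn_predict(points, labels, (2.0, 2.0), k=3)
near_b = knn_predict(points, labels, (6.0, 6.5), k=3)
```

There is no training step beyond storing the samples; all the work happens at prediction time.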


Decision Trees


Support Vector Machines

What is the best classifier?
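For support vector machines, the usual answer to this question is the separating hyperplane with the largest margin. The slide's figure was not recovered, so the sketch below only illustrates that criterion on assumed toy data; the two candidate hyperplanes are hypothetical, and no SVM training is performed.

```python
import math

def margin(w, b, points, labels):
    """Geometric margin of the hyperplane w.x + b = 0.

    The value is negative if any point lies on the wrong side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in zip(points, labels))

# Toy linearly separable data (assumed), labels in {-1, +1}.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
labels = [-1, -1, +1, +1]

# Two hypothetical separating hyperplanes, given as (w, b).
candidates = {
    "h1": ((1.0, 0.0), -3.0),   # vertical line x = 3
    "h2": ((1.0, 1.0), -5.5),   # diagonal line x + y = 5.5
}
margins = {name: margin(w, b, points, labels)
           for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
```

Both candidates classify every point correctly; the margin is what distinguishes them.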

Deep Vs Hand-Crafted Features


Small Training Data

Challenges:
1 Deep learning needs a large amount of data to train
2 Many applications (e.g., remote sensing) typically have only a small amount of annotated data


Research Questions:
1 Is it possible to transfer features from everyday pictures to the remote sensing domain?
2 Are transferred features more effective than fully trained ones?
3 How can deep learning be better exploited for remote sensing data?

Reference
K. Nogueira, O. A. B. Penatti and J. A. dos Santos. Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recognition, 61, 539-556, 2017.
Strategies to Exploit ConvNets

1 - Fully Training
Training from scratch

Randomly initialized ConvNet, fully trained on the target dataset

[Figure: the Target Dataset feeds a ConvNet with layers conv, conv, conv, ..., fully, fully, classification]

2 - Fine-Tuning

[Figure: two ConvNets with the same layers (conv, conv, conv, ..., fully, fully, classification).
Top: a randomly initialized ConvNet trained on the Original Dataset.
Bottom: the Fine-Tuning ConvNet, trained on the Target Dataset.
Between them: transfer of trained weights; different number of classes ⇒ no weight transfer for the classification layer.
A bracket marks the possible layers to freeze.]
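The weight-transfer idea in the fine-tuning diagram can be sketched without any deep learning library. Everything here is a hypothetical stand-in: networks are plain dictionaries of flat weight lists, the layer sizes and class counts are made up, and the frozen set is an illustrative choice, not the one used in the referenced paper.

```python
import random

def init_network(layer_sizes, seed=0):
    """Randomly initialize one flat weight list per layer."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
            for name, size in layer_sizes.items()}

def transfer_weights(source, target, skip=("classification",)):
    """Copy trained weights layer by layer, except the skipped layers."""
    for name, weights in source.items():
        if name not in skip and len(weights) == len(target[name]):
            target[name] = list(weights)
    return target

def layer_sizes(num_classes):
    # Hypothetical sizes; only the classification layer depends on the class count.
    return {"conv1": 8, "conv2": 8, "conv3": 8, "fully1": 16, "fully2": 16,
            "classification": 16 * num_classes}

# Stands in for a ConvNet already trained on the original dataset.
source_net = init_network(layer_sizes(num_classes=1000), seed=1)
# The target dataset has a different number of classes.
target_net = init_network(layer_sizes(num_classes=21), seed=2)
target_net = transfer_weights(source_net, target_net)

# Frozen layers would be excluded from gradient updates during fine-tuning.
frozen = {"conv1", "conv2", "conv3"}   # illustrative choice only
trainable = [name for name in target_net if name not in frozen]
```

After the transfer, only the classification layer keeps its random initialization, which matches the "different number of classes ⇒ no weight transfer" note in the figure.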

3 - Feature Extractor

[Figure: a Pre-trained ConvNet with layers conv, conv, conv, ..., fully, fully, classification. A deep feature vector is taken from the network and passed to a separate classification step (SVM).]
Experimental Analysis

Datasets
Examples

UCMerced Land-use Dataset: (a) Agricultural, (b) Dense Residential
RS19 Dataset: (c) Beach, (d) Football Field
Brazilian Coffee Scenes Dataset: (e) Coffee, (f) Non-coffee


Some Experiments

A - Generalization Power Evaluation


B - Comparison of ConvNets Strategies
C - Comparison with Baselines


A - Generalization Power Evaluation

[Figure: bar chart of average accuracy (%), axis from 40 to 100, per feature representation on the UCMerced Land-use Dataset. The tick labels were garbled in extraction; legible names include the hand-crafted descriptors LCH, BIC, ACC, LAS, SASI and GIST and the ConvNets OverFeat, VGG16, GoogLeNet, CaffeNet and AlexNet.]

[Figure: bar chart of average accuracy (%), axis from 40 to 100, per feature representation on the RS19 Dataset; tick labels garbled in extraction.]

[Figure: bar chart of average accuracy (%), axis from 40 to 100, per feature representation on the Brazilian Coffee Scenes Dataset; tick labels garbled in extraction.]



Conclusions

Feature representations learned on computer vision datasets can be exploited in the remote sensing scenario
Deep features generalize better to aerial datasets than to agricultural ones
Agricultural images are composed of finer and more homogeneous textures and colors


B - Comparison of ConvNets Strategies

[Figures not recovered: comparison of the ConvNet strategies on the UCMerced Land-use Dataset, the RS19 Dataset, and the Brazilian Coffee Scenes Dataset.]

Misclassification examples:
Medium Residential, misclassified as Dense Residential
Commercial, misclassified as Park


Conclusions

Feature representations learned on everyday image datasets can be adjusted to the remote sensing domain
Full training was not a good strategy, possibly because of the small amount of labeled data available
Fine-tuning is usually the best strategy
Replacing the final softmax layer with an SVM gave better results


C - Comparison with Baselines

[Figures not recovered: comparison with baselines on the UCMerced Land-use Dataset, the RS19 Dataset, and the Brazilian Coffee Scenes Dataset.]


Semantic Segmentation
(a.k.a. pixelwise classification)

Convolutional Neural Networks (CNNs) are currently the state of the art in several tasks:
Bounding box detection
Keypoint detection
Image classification
The natural next step for CNNs is semantic segmentation


Semantic segmentation: assigning a category-level label to each pixel in an image
Fundamental task for complete image understanding
Difficult because global information resolves “what” while local information resolves “where”


Context Window-Based Approach


[Figure: a context window centered on the pixel to be classified]

Reference
K. Nogueira, M. Dalla Mura, J. Chanussot, W. R. Schwartz, J. A. dos Santos. Learning to semantically segment high-resolution remote sensing images. In: 23rd International Conference on Pattern Recognition (ICPR), Cancun, 2016.


Final Segmentation Process

[Figure: Original Image → Context Windows → ConvNet → Probability Map]
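The process in the figure (one context window per pixel, each window classified, the results assembled into a probability map) can be sketched as follows. The border clamping, the window size, and `toy_classifier`, which simply averages intensities in place of the ConvNet, are assumptions made for illustration only.

```python
def context_window(image, row, col, size):
    """Return the size x size window centered on (row, col).

    Coordinates outside the image are clamped to the nearest border pixel.
    `size` is assumed to be odd.
    """
    half = size // 2
    height, width = len(image), len(image[0])
    return [[image[min(max(r, 0), height - 1)][min(max(c, 0), width - 1)]
             for c in range(col - half, col + half + 1)]
            for r in range(row - half, row + half + 1)]

def toy_classifier(window):
    """Hypothetical stand-in for the ConvNet: P(class 1) = mean intensity."""
    values = [v for line in window for v in line]
    return sum(values) / len(values)

def segment(image, size=3, threshold=0.5):
    """Classify every pixel from its context window."""
    prob_map = [[toy_classifier(context_window(image, r, c, size))
                 for c in range(len(image[0]))]
                for r in range(len(image))]
    label_map = [[int(p >= threshold) for p in line] for line in prob_map]
    return prob_map, label_map

# Toy binary image (assumed): left half class 0, right half class 1.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
prob_map, label_map = segment(image)
```

Note that the classifier runs once per pixel, so overlapping windows are processed repeatedly.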


Datasets
Agriculture Dataset

(r) Image

(s) Ground-Truth


Some Results
Agriculture Dataset

True Positive True Negative False Positive False Negative


Relevance Maps

Fully Convolutional Neural Network

Fully-Convolutional Networks
Deep networks can learn both global and local semantic information
Main contributions of the paper:
Fully Convolutional Neural Network (FCN)
Built by fine-tuning existing classification networks

Reference
J. Long, E. Shelhamer, T. Darrell. Fully convolutional networks for semantic segmentation. CVPR, p. 3431-3440, Boston, USA, 2015


Fully Convolutional Neural Network

Network with only convolutional layers (convolution plus pooling)
Output has spatial correspondence with the input
Trained end-to-end for semantic segmentation
No pre- or post-processing at all



Convolutionization

A fully connected layer is not a special layer
It is a normal convolutional layer with specific parameters
Thus, every CNN can be transformed into an FCN
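The equivalence claimed above can be checked numerically: a fully connected unit over a flattened patch gives the same value as a convolution whose kernel has the size of that patch. The weights and inputs below are arbitrary illustrative numbers, and `conv2d_valid` is a plain-Python helper written for this sketch.

```python
def fully_connected(x_flat, weights, bias):
    """One fully connected unit: weighted sum of the flattened input plus bias."""
    return sum(w * x for w, x in zip(weights, x_flat)) + bias

def conv2d_valid(image, kernel, bias):
    """'Valid' 2-D convolution (cross-correlation, as used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw)) + bias
             for c in range(out_w)]
            for r in range(out_h)]

# A fully connected unit over a flattened 2x2 input...
fc_weights, fc_bias = [0.5, -1.0, 2.0, 0.25], 0.1
patch = [[1.0, 2.0], [3.0, 4.0]]
fc_out = fully_connected([v for row in patch for v in row], fc_weights, fc_bias)

# ...equals a 2x2 convolution whose kernel holds the reshaped weights.
kernel = [fc_weights[0:2], fc_weights[2:4]]
conv_out = conv2d_valid(patch, kernel, fc_bias)

# On a larger input, the same kernel yields a spatial map of outputs.
bigger = [[1.0, 2.0, 0.0], [3.0, 4.0, 1.0], [0.0, 1.0, 2.0]]
score_map = conv2d_valid(bigger, kernel, fc_bias)
```

The last call is what makes the converted network useful for segmentation: a larger input produces a grid of predictions instead of a single one.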



Upsampling

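The output has lower spatial resolution than the input image
Upsampling is needed to establish pixel-to-pixel correspondence with the input
Deconvolution (transposed convolution) can be trained as part of a neural network
It reverses the forward and backward passes of convolution (it is not an exact inverse)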
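A one-dimensional sketch of deconvolution (transposed convolution) shows how it upsamples: each input value scatters a scaled copy of the kernel into a longer output. The fixed triangle kernel is an assumption chosen so the result matches linear interpolation; in a network these weights would be learned.

```python
def transposed_conv1d(signal, kernel, stride):
    """Each input value adds a scaled copy of the kernel to the output."""
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out

coarse = [1.0, 3.0, 2.0]
# Fixed triangle kernel (assumed): behaves like linear interpolation at stride 2.
kernel = [0.5, 1.0, 0.5]
upsampled = transposed_conv1d(coarse, kernel, stride=2)
```

With stride 2, the original values reappear at every second position and the positions in between hold the average of their two neighbors.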



Skips

Combines the prediction layer with lower layers that have finer strides
The combination makes local predictions that respect global structure
Overview:
Extract features from intermediate layers
Add a 1 × 1 convolution to perform class prediction
Fuse the outputs
Deconvolution
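The overview steps can be sketched for a single class score map. This is a simplified illustration: nearest-neighbor upsampling stands in for the learned deconvolution, fusion is done by element-wise sum, and all feature values and 1 × 1 weights are made-up numbers.

```python
def upsample2x(scores):
    """Nearest-neighbor 2x upsampling (a fixed stand-in for deconvolution)."""
    out = []
    for row in scores:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def conv1x1(features, weights):
    """1 x 1 convolution: per-pixel weighted sum over channels."""
    return [[sum(w * ch for w, ch in zip(weights, pixel)) for pixel in row]
            for row in features]

def fuse(a, b):
    """Element-wise sum of two score maps of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Coarse 1 x 2 prediction from the last layer (assumed values).
coarse_scores = [[1.0, -1.0]]
# Finer 2 x 4 feature map from an intermediate layer, 2 channels per pixel.
mid_features = [[(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0)],
                [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]]
mid_scores = conv1x1(mid_features, weights=(0.5, -0.5))
fused = fuse(upsample2x(coarse_scores), mid_scores)
```

The fused map keeps the coarse layer's overall decision (positive on the left, negative on the right) while the finer layer adjusts individual pixels.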


Experiments
Metrics
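The content of this slide was not recovered. As an assumption, the sketch below computes the four metrics defined in the referenced FCN paper (pixel accuracy, mean accuracy, mean IU, and frequency-weighted IU) from a confusion matrix; the function name and the toy matrix are hypothetical, and every class is assumed to have at least one pixel.

```python
def segmentation_metrics(confusion):
    """confusion[i][j] = number of pixels of true class i predicted as class j."""
    n = len(confusion)
    true_totals = [sum(row) for row in confusion]
    pred_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    total = sum(true_totals)
    # Intersection over union per class: hits / (true + predicted - hits).
    iu = [confusion[i][i] / (true_totals[i] + pred_totals[i] - confusion[i][i])
          for i in range(n)]
    return {
        "pixel_accuracy": sum(confusion[i][i] for i in range(n)) / total,
        "mean_accuracy": sum(confusion[i][i] / true_totals[i] for i in range(n)) / n,
        "mean_iu": sum(iu) / n,
        "fw_iu": sum(true_totals[i] * iu[i] for i in range(n)) / total,
    }

# Toy two-class confusion matrix (assumed).
confusion = [[8, 2],
             [1, 9]]
metrics = segmentation_metrics(confusion)
```

Mean IU penalizes both false positives and false negatives per class, which is why it is usually lower than pixel accuracy.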


Results

[Tables and figures not recovered: Pascal VOC 2011 and 2012, NYUDv2, SIFT Flow Dataset, Pascal VOC.]


Conclusion

FCN achieved state-of-the-art results for semantic segmentation
while simultaneously simplifying and speeding up learning and inference
