Strategies to Disrupt Online Child Pornography Networks
KILA JOFFRES, MARTIN BOUCHARD, RICHARD FRANK, BRYCE WESTLAKE SIMON FRASER UNIVERSITY
Partially supported by the International Cybercrime Research Centre, SFU
Study Objectives
- To use a specially designed Web-crawler to extract online child pornography networks
- To determine which attack strategies are most effective at disrupting these networks
  - Strategies include hub, bridge, and fragmentation attacks.
  - Measures of disruption include density, clustering, compactness, and average path length.
Motivation
Child Pornography and the Web

The Web has facilitated the distribution of and access to child pornography through its
- apparent anonymity,
- global reach, and
- lack of regulation.
The United Nations estimated that there are over 4 million websites with child pornography.
Online Intervention Strategies

Current attempts to limit child exploitation have often focused on:
- Chat room stings
- Injunctions against websites hosting child pornography
- Establishing hotlines, complaint sites, and image databases
There are two problems with this approach:
- Overreliance on investigating and targeting sites in isolation
- Current enforcement efforts have met with limited success
Social network analysis can produce a more effective method of disrupting online child pornography sites.

Which node to attack?
(Figure: IPv6 network, April 2008, http://www.informationweek.com/galleries/showImage?galleryID=246&imageID=10&articleID=210600289)
The Topology of the Web

It is important to consider the topology of the Web. Online networks have two important structural features:
- a power-law degree distribution (i.e., they are scale-free), and
- small-world properties.
The Web is distinguished by a few very highly connected nodes, or hubs.
The average path length within the Web ranges from 16 to 19.
It also has a higher degree of clustering than is expected from random networks.
Identifying Attack Strategies

Scale-free (power-law) networks are resilient to random attacks but vulnerable to targeted attacks.
- For example, there are 13 root name servers in the Internet; take those out, domain names ...
Different attack strategies:
- Hub attacks remove the nodes with the most links to and from them.
- Bridge attacks remove the nodes that broker (connect) other nodes.
- Fragmentation attacks remove the nodes whose removal would sever the greatest number of connections in the network.
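The contrast between random and targeted removal can be illustrated on a toy network. The sketch below is not the study's data or software: the hub-and-spoke graph, the function name `largest_component`, and the choice of "largest connected component" as the damage indicator are all assumptions made for illustration.

```python
from collections import deque

def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

# Hypothetical undirected network: hub 0 linked to spokes 1..8,
# plus one extra tie between spokes 1 and 2.
adj = {i: set() for i in range(9)}
for spoke in range(1, 9):
    adj[0].add(spoke)
    adj[spoke].add(0)
adj[1].add(2)
adj[2].add(1)

hub = max(adj, key=lambda n: len(adj[n]))      # most-connected node
after_hub = largest_component(adj, {hub})      # targeted (hub) removal
after_spoke = largest_component(adj, {8})      # removal of a peripheral node
print(hub, after_hub, after_spoke)
```

In this toy case, removing any single spoke leaves the other eight nodes connected, whereas removing the hub leaves only the pair 1-2 linked.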
Methods
Network Extraction
The Web-crawler:
1. Takes as input a starting webpage
2. Retrieves that page and records the frequency of each keyword
3. Retrieves one of the linked pages and records keyword frequencies again
4. Retrieves one of the pages at a time until done
5. Aggregates the statistics up to the website level

Example keyword frequencies shown on the slides:

Keyword            Page 1   Page 2   Page 3   Aggregated
Websites so far    1        2        3        3
Pages so far       1        2        3        10
Boy                3        8        0        338
Girl               5        4        0        1863
Child              11       16       0        1217
Love               10       9        0        425
Teen               12       3        0        1862
Lolli*             16       13       0        833
Young              11       0        0        1506
Bath*              5        20       0        1640
Innocent           0        8        0        1891
Smooth/Hairless    20       1        0        959
Mastur*            6        9        0        1486
Sex                20       15       0        997
Penis              15       1        0        1221
Vagina             5        9        0        1610
Anal               13       12       0        662
Oral               0        6        0        1702
Naked              10       9        0        1244
Virgin             14       20       0        166

Page 3 contains none of the keywords (slide example: www.microsoft.com).

Pages are aggregated to the server (website) level, which results in a network.
(Figure: example network of three websites with 4, 3, and 3 pages, joined by 2 links, 1 link, and 2 links)
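The page-to-website aggregation step can be sketched as follows. This is only an illustration, not CENE's implementation: the function name `aggregate_to_sites`, the `example` hostnames, and the decision to treat the URL hostname as the "server" and to drop links within a website are assumptions.

```python
from collections import Counter
from urllib.parse import urlparse

def aggregate_to_sites(page_links):
    """Collapse page-to-page links into weighted website-to-website ties."""
    ties = Counter()
    for src, dst in page_links:
        a, b = urlparse(src).netloc, urlparse(dst).netloc
        if a and b and a != b:        # assumption: ignore links within a website
            ties[(a, b)] += 1
    return ties

# Hypothetical page-level links (source page, linked page).
page_links = [
    ("http://a.example/p1", "http://b.example/x"),
    ("http://a.example/p2", "http://b.example/y"),
    ("http://a.example/p1", "http://a.example/p2"),
    ("http://b.example/x", "http://c.example/"),
]
site_ties = aggregate_to_sites(page_links)
for (src, dst), weight in sorted(site_ties.items()):
    print(src, "->", dst, weight)
```

Each resulting tie carries the number of page-level links between the two websites, comparable to the "2 links" and "1 link" labels in the slide figure.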
Methods
Three limits were imposed on CENE (the Web-crawler):
1. A limit of 250,000 webpages
2. A limit of 200 websites
3. Each webpage had to contain at least 7 of 63 child pornography-related keywords
Many of these keywords were
- commonly used by the Royal Canadian Mounted Police (RCMP) to locate illegal child-related content, and
- used in our former studies of online child pornography.
Methods
Limitations of CENE:
- False positives
- Password-protected websites
Methods
For this study, two networks were extracted using different starting websites.
Network A
- was identified as girl-centered, including mostly female-related terms such as vagina, Lolita, girl, and so on.
Network B
- was identified as boy-centered, including mostly male-related terms such as penis and boy.
Methods
The goal is to identify the most effective attack strategies to disrupt online child pornography networks.
Four attack strategies were assessed:
1. Hub attacks (using the measure of degree centrality)
2. Bridge attacks (using the measure of betweenness)
3. Fragmentation attacks (using the measure developed by Borgatti)
4. Random attacks (where each node has an equal chance of being targeted)
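To make the difference between hub and bridge targets concrete, here is a minimal sketch that scores a toy directed network by out-degree, in-degree, and betweenness. It does not reproduce the software used in the study: the graph and node names are hypothetical, betweenness is computed with Brandes' algorithm without normalisation, and Borgatti's fragmentation measure is not covered.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted directed graph (unnormalised)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):         # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical network: "h" links out to three sites; "m" sits on a chain.
adj = {"h": ["x1", "x2", "x3"], "x1": ["m"], "x2": [], "x3": [],
       "m": ["y"], "y": ["z"], "z": []}

out_degree = {v: len(nbrs) for v, nbrs in adj.items()}
in_degree = {v: 0 for v in adj}
for nbrs in adj.values():
    for w in nbrs:
        in_degree[w] += 1

bridge_scores = betweenness(adj)
hub_target = max(out_degree, key=out_degree.get)
bridge_target = max(bridge_scores, key=bridge_scores.get)
print(hub_target, bridge_target)
```

In this toy graph the hub attack selects `h` (most outgoing links) while the bridge attack selects `m` (on the most shortest paths), showing that the two strategies need not pick the same website.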
Methods
The removal of websites identified by these attack strategies followed a sequential process, which involved
1. identifying the website that scored highest for one measure,
2. removing it, and
3. reanalyzing the network to identify the next top website.
This process was repeated until five websites were eliminated.
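The sequential procedure can be sketched as a loop that rescores the network after every removal. The helper names (`remove_node`, `total_degree`, `sequential_attack`), the toy graph, and the alphabetical tie-break are assumptions for illustration; any scoring function could be passed in place of total degree.

```python
def remove_node(adj, node):
    """Return a copy of the directed graph without `node` and its ties."""
    return {v: [w for w in nbrs if w != node]
            for v, nbrs in adj.items() if v != node}

def total_degree(adj):
    """In-degree plus out-degree for every node."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    for nbrs in adj.values():
        for w in nbrs:
            deg[w] += 1
    return deg

def sequential_attack(adj, score_fn, k=5):
    """Remove the top-scoring node k times, rescoring after each removal."""
    removed = []
    for _ in range(min(k, len(adj))):
        scores = score_fn(adj)
        # Ties are broken alphabetically (an arbitrary choice for this sketch).
        target = max(sorted(scores), key=scores.get)
        removed.append(target)
        adj = remove_node(adj, target)
    return removed, adj

# Hypothetical network: "b" looks important only because of its ties to "a".
adj = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": [], "d": [],
       "f": ["g", "h"], "g": [], "h": []}
removed, remaining = sequential_attack(adj, total_degree, k=2)
print(removed)
```

Here `b` has the second-highest score in the intact network, but once `a` is removed its score drops and `f` becomes the next target, which is why the network is reanalyzed after each removal.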
Methods
The impact of the attack strategies used was assessed on four outcome measures:

- Density
- Clustering coefficient
- Average path distance
- Distance-based cohesion
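A minimal sketch of how these four measures might be computed for a small directed network follows. The exact definitions used by the study's software are not given in the slides, so several choices here are assumptions: clustering is the average local coefficient with ties treated as undirected, average path distance is taken over reachable ordered pairs only, and distance-based cohesion is the mean reciprocal distance over all ordered pairs (unreachable pairs count as 0). The function names and toy graph are hypothetical, and the graph is assumed to have at least two nodes and no self-links.

```python
from collections import deque
from itertools import combinations

def distances_from(adj, s):
    """Shortest-path lengths from s to every node it can reach."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    del dist[s]
    return dist

def outcome_measures(adj):
    n = len(adj)
    pairs = n * (n - 1)                          # possible directed ties
    ties = sum(len(nbrs) for nbrs in adj.values())
    dists = [d for s in adj for d in distances_from(adj, s).values()]
    und = {v: set() for v in adj}                # undirected neighbourhoods
    for v, nbrs in adj.items():
        for w in nbrs:
            und[v].add(w)
            und[w].add(v)
    local = []
    for v, nbrs in und.items():
        k = len(nbrs)
        if k >= 2:
            linked = sum(1 for a, b in combinations(nbrs, 2) if b in und[a])
            local.append(linked / (k * (k - 1) / 2))
    return {
        "density": ties / pairs,
        "clustering": sum(local) / len(local) if local else 0.0,
        "avg_path_length": sum(dists) / len(dists) if dists else 0.0,
        "cohesion": sum(1 / d for d in dists) / pairs,
    }

# Hypothetical network: a directed three-cycle plus one isolated website.
adj = {"a": ["b"], "b": ["c"], "c": ["a"], "d": []}
measures = outcome_measures(adj)
print(measures)
```

Running the same function before and after an attack gives the kind of before/after comparison reported in the results tables.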
Results
Results - Network Descriptives

Measure                    Network A   Network B
Nodes                      46          111
Ties                       150         663
Density                    0.0725      0.0543
Clustering coefficient     0.442       0.424
Average path length        3.490       2.409
Distance-based cohesion    0.200       0.131
Centralization (out)       19.852%     21.124%
Centralization (in)        13.037%     22.041%

Definitions:
- Density: number of existing ties / number of possible ties
- Clustering coefficient: likelihood that two websites, which are linked to one particular website, are also linked to one another
- Distance-based cohesion: extent to which a network is compact (how close websites are to each other)
- Centralization: overall degree of variance in network centrality
Results - Density

                   Network A                        Network B
Measure            Density (Change)     Ties Left   Density (Change)     Ties Left
Fragmentation      0.0561 (22.62%)      92          0.0482 (11.233%)     537
Betweenness        0.0506 (30.207%)     83          0.0469 (13.627%)     522
Degree (out)       0.0500 (31.034%)     82          0.0442 (18.6%)       492
Degree (in)        0.0506 (30.207%)     83          0.0455 (16.206%)     506
Random attack      0.0732 (0.551%)      120         0.0541 (0.368%)      602

Attacks changed density more in Network A than in Network B, possibly due to the difference in network size: the removal of 5 nodes has a greater impact in the smaller Network A.
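The reported densities can be checked against the node and tie counts. The sketch below assumes directed ties, so that density is ties / (n × (n − 1)), and assumes that five websites have been removed from each network after an attack (46 → 41 nodes for Network A, 111 → 106 for Network B); the function names are hypothetical.

```python
def density(nodes, ties):
    """Directed density: existing ties over possible ordered pairs."""
    return ties / (nodes * (nodes - 1))

def pct_change(before, after):
    """Absolute percentage change relative to the starting value."""
    return abs(after - before) / before * 100

base_a = density(46, 150)     # Network A before any attack
frag_a = density(41, 92)      # Network A after the fragmentation attack
base_b = density(111, 663)    # Network B before any attack
frag_b = density(106, 537)    # Network B after the fragmentation attack

print(round(base_a, 4), round(frag_a, 4), round(base_b, 4), round(frag_b, 4))
print(round(pct_change(0.0725, 0.0561), 2))
```

Under these assumptions the computed values round to the densities in the table, and the change from 0.0725 to 0.0561 works out to about 22.62%.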
Results
Network A, before and after the out-degree attack
Results
Network B, before and after the out-degree attack
Results - Clustering Coefficient

Measure            Network A            Network B
Fragmentation      0.514 (16.289%)      0.430 (1.415%)
Betweenness        0.438 (0.09%)        0.426 (0.471%)
Degree (out)       0.429 (-2.941%)      0.422 (0.471%)
Degree (in)        0.415 (6.108%)       0.434 (2.358%)
Random attack      0.441 (0.226%)       0.432 (1.886%)

Certain attacks in this network actually increase clustering. This suggests that certain changes to the network are prone to leaving it with more tightly knit groups.
Results - Distance-Based Cohesion

Distance-based cohesion: extent to which a network is compact (how close websites are to each other)

Measure            Network A            Network B
Fragmentation      0.093 (53.50%)       0.073 (44.27%)
Betweenness        0.085 (57.50%)       0.075 (42.75%)
Degree (out)       0.103 (48.50%)       0.082 (37.40%)
Degree (in)        0.119 (40.50%)       0.085 (35.11%)
Random attack      0.207 (3.50%)        0.129 (1.35%)

Bridge attacks were very successful.
Results - Average Path Length

Measure            Network A            Network B
Fragmentation      1.852 (46.934%)      1.741 (27.729%)
Betweenness        2.014 (42.292%)      1.812 (24.782%)
Degree (out)       2.738 (21.547%)      1.980 (17.808%)
Degree (in)        3.431 (1.69%)        2.049 (14.943%)
Random attack      3.574 (2.406%)       2.414 (0.207%)

# of paths (values as shown on the slide): 1021, 230, 859

Network B was initially much less compact than Network A. Effect of attack more easily seen.
Discussion & Conclusion
Discussion
The purpose of this paper was to isolate those attack strategies (hub, bridge, fragmentation) that would maximally disrupt two online child exploitation networks.
Three general findings emerged:
1. Targeted attacks are more effective than random ones
2. For different outcome measures (density, clustering, distance), different intervention strategies are warranted
3. For different networks, different attack strategies are more or less effective
Discussion
- To reduce density and clustering: hub attacks
- To reduce network reachability: fragmentation
- To reduce network compactness: fragmentation
In certain cases, the bridge attack was almost as effective as, and in one case more effective than, other strategies for Network A.
Conclusion
This project has practical implications in terms of
- focusing the effective use of police resources, and
- decreasing the accessibility of online child pornography.
Pairing the Web-crawler with social network analyses
- assists in target prioritization, and
- identifies websites that would maximally disrupt the network.
The current study provides methodological guidelines on which to base such decisions.
Limitations
- The inclusion of false positives
- Limitations of CENE
  - # of pages
  - # of websites
- Failure to account for the content of the websites
Future Directions
- Adopting longitudinal designs
  - Tracking the way networks evolve when attacked and how they recover from, or adapt more easily to, specific attacks
- Modifying the Web-crawler to extract other networks, such as ones relating to terrorism, drug use, or other illegal behaviours
Thank you!
