KILA JOFFRES, MARTIN BOUCHARD, RICHARD FRANK, BRYCE WESTLAKE
SIMON FRASER UNIVERSITY
Study Objectives
To use a specially designed Web-crawler to extract online child pornography networks
To determine which attack strategies are most effective at disrupting these networks
Strategies include hub, bridge, and fragmentation attacks.
Measures of disruption include density, clustering, compactness, and average path length.
Motivation
Chat room stings
Injunctions against websites hosting child pornography
Establishing hotlines and complaint sites, and image databases
There are two problems with this approach:
Overreliance on investigating and targeting sites in isolation
Current enforcement efforts have been met with limited success
Social network analysis can produce a more effective
features
connected nodes or hubs
The average path length within the Web ranges from 16 to 19
It also has a higher degree of clustering than is expected from random networks
for example, there are 13 root name servers in the Internet, take those out, domain names
Different attack strategies
Hub attacks remove nodes with many links to and from other nodes
Bridge attacks remove nodes that broker (connect) other nodes
Fragmentation attacks remove the nodes whose removal would sever the greatest number of connections in the network
Methods
Network Extraction
Websites: 1
Boy Girl Child Love Teen Lolli* Young Bath* Innocent Smooth/ Hairless Mastur* Sex Penis Vagina Anal Oral Naked Virgin
Pages: 1
Frequency
3 5 11 10 12 16 11 5 0 20 6 20 15 5 13 0 10 14
Retrieve one of the linked pages
Websites: 2
Boy Girl Child Love Teen Lolli* Young Bath* Innocent Smooth/ Hairless Mastur* Sex Penis Vagina Anal Oral Naked Virgin
Pages: 2
Frequency
8 4 16 9 3 13 0 20 8 1 9 15 1 9 12 6 9 20
Retrieve one of the linked pages
Websites: 3
Boy Girl Child Love Teen Lolli* Young Bath* Innocent Smooth/ Hairless Mastur* Sex Penis Vagina Anal Oral Naked Virgin
Pages: 3
Frequency
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Retrieve one of the linked pages
www.microsoft.com
Retrieve one of the linked pages
Websites: 3 Pages: 3
Retrieve one of the pages
Websites: 3 Pages: 4
Retrieve one of the pages until done
Websites: 3 Pages: 4
Statistics are aggregated up to the website level
Websites: 3
Boy Girl Child Love Teen Lolli* Young Bath* Innocent Smooth/ Hairless Mastur* Sex Penis Vagina Anal Oral Naked Virgin
Pages: 10
Frequency
338 1863 1217 425 1862 833 1506 1640 1891 959 1486 997 1221 1610 662 1702 1244 166
Result in a
[Figure: three websites (4 pages, 3 pages, and 3 pages) connected by ties of 2 links, 1 link, and 2 links]
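The step from page-level links to website-level ties can be sketched as below. This is only an illustration under stated assumptions, not the crawler's actual code: the host name is assumed to identify a website, links within one website are assumed to be dropped, and the `example` URLs and the names `site_of` and `collapse_to_sites` are hypothetical placeholders.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Return the host part of a URL, used here as the website identifier."""
    return urlsplit(url).netloc.lower()


def collapse_to_sites(page_links):
    """Collapse (source_page, target_page) links into weighted site-to-site ties.

    Links between two pages of the same website are dropped, because they
    do not connect different websites.
    """
    ties = Counter()
    for src, dst in page_links:
        a, b = site_of(src), site_of(dst)
        if a != b:
            ties[(a, b)] += 1
    return ties


# Hypothetical page-level links between three placeholder websites.
page_links = [
    ("http://site-a.example/p1", "http://site-a.example/p2"),  # internal, dropped
    ("http://site-a.example/p1", "http://site-b.example/p1"),
    ("http://site-a.example/p3", "http://site-b.example/p2"),
    ("http://site-b.example/p1", "http://site-c.example/p1"),
]
site_ties = collapse_to_sites(page_links)
```

Each key of `site_ties` is a directed tie between two websites, and the count is the number of page-level links behind it.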
Methods
Three limits were imposed on CENE:
1. A limit of 250,000 webpages
2. A limit of 200 websites
3. Each webpage had to contain at least 7 of 63 child pornography-related keywords
Many of these keywords were:
commonly used by the Royal Canadian Mounted Police (RCMP) to locate illegal child-related content, and
used in our former studies of online child pornography
Methods
Limitations to CENE
False positives
Password-protected websites
Methods
For this study, two networks were extracted using
Network A
Identified as girl-centered, including mostly female-related terms such as vagina, Lolita, girl, and so on.
Network B
Identified as boy-centered, including mostly male-related terms such as penis and boy.
Methods
The goal is to identify the most effective attack strategies to disrupt online child pornography networks
Four attack strategies were assessed:
1. Hub attacks (using the measure of degree centrality)
2. Bridge attacks (using the measure of betweenness)
3. Fragmentation attacks (using the measure developed by Borgatti)
4. Random attacks (where each node has an equal chance of being targeted)
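How hub and bridge targets can differ is easiest to see on a toy graph. The sketch below is a generic illustration, not the study's software: it computes out-degree and in-degree, plus unnormalized betweenness using Brandes' algorithm, on a hypothetical directed network of seven placeholder nodes. Every node must appear as a key of `adj`.

```python
from collections import deque


def degree_scores(adj):
    """Out-degree and in-degree for each node of a directed graph."""
    out_deg = {v: len(nbrs) for v, nbrs in adj.items()}
    in_deg = {v: 0 for v in adj}
    for nbrs in adj.values():
        for w in nbrs:
            in_deg[w] += 1
    return out_deg, in_deg


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, directed)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical network: two tightly linked groups joined only through "m".
adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "m"],
    "m": ["c", "d"],
    "d": ["m", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
out_deg, in_deg = degree_scores(adj)
between = betweenness(adj)
hub_target = max(out_deg, key=out_deg.get)     # most outgoing ties (first on ties)
bridge_target = max(between, key=between.get)  # lies on the most shortest paths
```

Here the bridge target `m` has only two outgoing ties, fewer than `c` or `d`, yet every path between the two groups runs through it.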
Methods
The removal of websites identified by these attack
2. 3.
identifying the website that scored highest for one measure, removing it, and reanalyzing the network to identify the next top website
eliminated
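The remove-and-reanalyze procedure described above can be sketched as a loop that rescores the network after every removal. This is a minimal illustration: the scoring function (total degree), the toy graph, and the function names are hypothetical stand-ins for whichever measure is being used.

```python
def remove_node(adj, node):
    """Return a copy of the directed graph without `node` and its ties."""
    return {v: [w for w in nbrs if w != node]
            for v, nbrs in adj.items() if v != node}


def total_degree(adj):
    """Score each node by its number of incoming plus outgoing ties."""
    score = {v: len(nbrs) for v, nbrs in adj.items()}
    for nbrs in adj.values():
        for w in nbrs:
            score[w] += 1
    return score


def sequential_attack(adj, score_fn, k):
    """Remove k nodes one at a time, rescoring the network after each removal."""
    removed = []
    for _ in range(k):
        if not adj:
            break
        scores = score_fn(adj)
        target = max(scores, key=scores.get)  # first node wins on ties
        removed.append(target)
        adj = remove_node(adj, target)
    return removed, adj


# Hypothetical five-node network; every node appears as a key.
adj = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a", "d"], "d": ["c", "e"], "e": []}
removed, remaining = sequential_attack(adj, total_degree, 2)
```

On this graph a one-shot ranking would pick `a` and then `c`, but once `a` is gone `d` outranks `c`, so the sequential procedure removes `a` and `d`.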
Methods
The impact of the attack strategies used was assessed
Results
Measure                  Network A   Network B
Nodes                    46          111
Ties                     150         663
Density                  0.0725      0.0543
Clustering Coefficient   0.442       0.424
Centralization           19.852%     21.124%
                         13.037%     22.041%
Average Path Length      3.490       2.409
Compactness              0.200       0.131
Density: number of existing ties / number of possible ties
Clustering coefficient: likelihood that two websites, which are linked to one particular website, are also linked to one another
Centralization: overall degree of variance in network centrality
Compactness: extent to which a network is compact (how close websites are to each other)
Results - Density
                 Network A                      Network B
Measure          Density (Change)   Ties Left   Density (Change)   Ties Left
Fragmentation    0.0561 (22.62%)    92          0.0482 (11.233%)   537
Betweenness      0.0506 (30.207%)   83          0.0469 (13.627%)   522
Degree (Out)     0.0500 (31.034%)   82          0.0442 (18.6%)     492
Degree (In)      0.0506 (30.207%)   83          0.0455 (16.206%)   506
Random Attack    0.0732 (0.551%)    120         0.0541 (0.368%)    602
Possibly due to the differences in network size, with the removal of 5 nodes having a greater impact in the smaller Network A.
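The reported densities can be checked against the definition (existing ties divided by possible ties). The sketch below assumes ties are directed, so a network of n websites has n * (n - 1) possible ties, and that the change is computed from the rounded densities. Under those assumptions the figures for 46 nodes with 150 ties, 111 nodes with 663 ties, and the first post-attack values (92 and 537 ties left after 5 removals) are reproduced.

```python
def directed_density(n_nodes, n_ties):
    """Existing ties divided by possible ties; n * (n - 1) assumes directed ties."""
    return n_ties / (n_nodes * (n_nodes - 1))


def percent_change(before, after):
    """Absolute change relative to the pre-attack value, in percent."""
    return abs(after - before) / before * 100


REMOVED = 5  # nodes removed per attack, as stated in the text

# Network A: 46 nodes and 150 ties before the attack, 92 ties left afterwards.
before_a = round(directed_density(46, 150), 4)
after_a = round(directed_density(46 - REMOVED, 92), 4)
change_a = round(percent_change(before_a, after_a), 2)

# Network B: 111 nodes and 663 ties before the attack, 537 ties left afterwards.
before_b = round(directed_density(111, 663), 4)
after_b = round(directed_density(111 - REMOVED, 537), 4)
```

Because the denominator shrinks along with the node count, removing 5 of 46 websites changes the density of Network A far more than removing 5 of 111 changes that of Network B.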
Results - Clustering
Measure          Network A         Network B
Fragmentation    0.514 (16.289%)   0.430 (1.415%)
Betweenness      0.438 (0.09%)     0.426 (0.471%)
Degree (Out)     0.429 (2.941%)    0.422 (0.471%)
Degree (In)      0.415 (6.108%)    0.434 (2.358%)
Random Attack    0.441 (0.226%)    0.432 (1.886%)
Certain attacks in this network actually increase clustering, which suggests that certain changes to the network are prone to leaving it with more tightly knit groups
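Why removing a website can raise clustering is easiest to see on a toy graph. The sketch below is an illustration only: it ignores tie direction and averages the local clustering of nodes with at least two neighbours, which is an assumption and may differ from the coefficient used in the study. The graph and node names are hypothetical.

```python
from itertools import combinations


def undirected_neighbors(adj):
    """Symmetrize a directed adjacency mapping (assumption: direction ignored)."""
    nbrs = {v: set() for v in adj}
    for v, targets in adj.items():
        for w in targets:
            if w != v:
                nbrs[v].add(w)
                nbrs[w].add(v)
    return nbrs


def average_clustering(adj):
    """Mean local clustering over nodes that have at least two neighbours."""
    nbrs = undirected_neighbors(adj)
    values = []
    for v, ns in nbrs.items():
        if len(ns) < 2:
            continue
        pairs = list(combinations(sorted(ns), 2))
        linked = sum(1 for a, b in pairs if b in nbrs[a])
        values.append(linked / len(pairs))
    return sum(values) / len(values) if values else 0.0


# Hypothetical network: a triangle (a, b, c) plus a hub "h" with two leaves.
adj = {"a": ["b", "c", "h"], "b": ["c"], "c": [],
       "h": ["l1", "l2"], "l1": [], "l2": []}
before = average_clustering(adj)

# Remove the hub and its ties.
without_hub = {v: [w for w in t if w != "h"] for v, t in adj.items() if v != "h"}
after = average_clustering(without_hub)
```

The hub's neighbours are not linked to one another, so the hub pulls the average down; once it is removed only the tightly knit triangle remains and the coefficient rises.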
Results - Compactness
Measure          Network A        Network B
Fragmentation    0.093 (53.50%)   0.073 (44.27%)
Betweenness      0.085 (57.50%)   0.075 (42.75%)
Degree (Out)     0.103 (48.50%)   0.082 (37.40%)
Degree (In)      0.119 (40.50%)   0.085 (35.11%)
Random Attack    0.207 (3.50%)    0.129 (1.35%)
Bridge attacks were very successful
Results - Average Path Length
Measure          Network A        Network B
Random Attack    3.574 (2.406%)   2.414 (0.207%)
Other values on this slide (labels not recoverable): 1021, 230, 859
Network B was initially much less compact than Network A. Effect of attack more easily seen.
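Compactness and average path length can move in different ways, which a small example makes concrete. The sketch below assumes compactness is the mean of 1/distance over all ordered pairs of websites, with unreachable pairs counting as zero (a distance-weighted reachability measure in the spirit of Borgatti's work). Whether the study used exactly this formula is an assumption, and the chain graph is hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances (in ties) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def compactness_and_path_length(adj):
    """Return (compactness, average path length over reachable pairs)."""
    n = len(adj)
    inverse_sum = 0.0
    distance_sum = 0
    reachable_pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                inverse_sum += 1 / d
                distance_sum += d
                reachable_pairs += 1
    compactness = inverse_sum / (n * (n - 1)) if n > 1 else 0.0
    avg_path = distance_sum / reachable_pairs if reachable_pairs else 0.0
    return compactness, avg_path


# Hypothetical directed chain a -> b -> c -> d.
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
compact_before, path_before = compactness_and_path_length(chain)

# Remove "b": only the tie c -> d survives.
cut = {"a": [], "c": ["d"], "d": []}
compact_after, path_after = compactness_and_path_length(cut)
```

After the removal the average path length falls, because only one short path is left, while compactness also falls, because most pairs can no longer reach each other. A shorter average path length therefore does not by itself indicate a more compact network.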
Discussion
The purpose of this paper was to isolate those attack strategies (hub, bridge, fragmentation) that would maximally disrupt two online child exploitation networks
Three general findings emerged:
1. Targeted attacks are more effective than random ones
2. For different outcome measures (density, clustering, distance), different intervention strategies are warranted
3. For different networks, different attack strategies are more or less effective
Discussion
To reduce density and clustering: hub attacks
To reduce network reachability: fragmentation
To reduce network compactness: fragmentation
In certain cases, the bridge attack was almost as effective, and in one case more effective, than other strategies for Network A
Conclusion
This project has practical implications in terms of
Focusing the effective use of police resources, and
Decreasing the accessibility of online child pornography.
Pairing the web-crawler with social network analyses
Assists in target prioritization
Identifies websites that would maximally disrupt the network
Prioritizes targets
The current study provides methodological
Limitations
The inclusion of false positives
Limitations of CENE: limits on the number of pages and the number of websites
Future Directions
Adopting longitudinal designs
Tracking the way networks evolve when attacked and how they recover from, or adapt more easily to, specific attacks
Modifying the Web-crawler to extract other networks, such as ones relating to terrorism, drug use, or other illegal behaviours
Thank you!