version:5.3.SP1
BrowserEngine
Trademarks
FAST ESP, the FAST logos, FAST Personal Search, FAST mSearch, FAST InStream, FAST AdVisor,
FAST Marketrac, FAST ProPublish, FAST Sentimeter, FAST Scope Search, FAST Live Analytics, FAST
Contextual Insight, FAST Dynamic Merchandising, FAST SDA, FAST MetaWeb, FAST InPerspective,
GetSmart, NXT, LivePublish, Folio, FAST Unity, FAST Radar, RetrievalWare, AdMomentum, and all
other FAST product names contained herein are either registered trademarks or trademarks of Fast
Search & Transfer ASA in Norway, the United States and/or other countries. All rights reserved. This
documentation is published in the United States and/or other countries.
Sun, Sun Microsystems, the Sun Logo, all SPARC trademarks, Java, and Solaris are trademarks or
registered trademarks of Sun Microsystems, Inc. in the United States and other countries.
Netscape is a registered trademark of Netscape Communications Corporation in the United States and
other countries.
Microsoft, Windows, Visual Basic, and Internet Explorer are either registered trademarks or trademarks
of Microsoft Corporation in the United States and/or other countries.
Red Hat is a registered trademark of Red Hat, Inc.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.
AIX and IBM Classes for Unicode are registered trademarks or trademarks of International Business
Machines Corporation in the United States, other countries, or both.
HP and the names of HP products referenced herein are either registered trademarks or service marks,
or trademarks or service marks, of Hewlett-Packard Company in the United States and/or other countries.
Remedy is a registered trademark, and Magic is a trademark, of BMC Software, Inc. in the United States
and/or other countries.
XML Parser is a trademark of The Apache Software Foundation.
All other company, product, and service names are the property of their respective holders and may be
registered trademarks or trademarks in the United States and/or other countries.
Web Site
Please visit us at: http://www.fastsearch.com/
Contacting FAST
FAST
Cutler Lake Corporate Center
117 Kendrick Street, Suite 100
Needham, MA 02492 USA
Tel: +1 (781) 304-2400 (8:30am - 5:30pm EST)
Fax: +1 (781) 304-2410
Product Training
E-mail: fastuniv@microsoft.com
To access the FAST University Learning Portal, go to: http://www.fastuniversity.com/
Sales
E-mail: sales@fastsearch.com
Contents
Preface
Copyright
Contact Us
FAST Enterprise Search Platform
Chapter 1
About BrowserEngine
The BrowserEngine is a highly scalable and configurable component that extracts
links and text from JavaScript and Adobe Flash files. The BrowserEngine is used
by the FAST Enterprise Crawler and may be called from the Document Processing
pipeline.
Topics:
• About the BrowserEngine
• Architecture
Architecture
The BrowserEngine is a stand-alone ESP component, capable of processing HTML documents containing
JavaScript and Flash files. It accomplishes this by emulating a browser's internal environment, without the
need for a display.
The BrowserEngine is implemented in Java and runs as a separate process. This isolates other components
(in particular, the Enterprise Crawler) from a fatal error in the BrowserEngine. This design also allows one
component to use multiple BrowserEngines, and multiple components to share the same BrowserEngine.
The following diagram illustrates the major functional modules within the BrowserEngine, and shows the
datapaths that will be referenced in the following discussion.
returned. Otherwise, it is delivered to the JavaScript Handler. The first step is to run a user-definable page
preprocessor to initialize the DOM tree, before any processing of the page contents takes place. This allows
the BrowserEngine to simulate support for browser plug-ins such as Adobe Reader, Apple QuickTime or
Windows Media Player, and also permits initialization of settings such as User-Agent, or the screen size. The
page preprocessor is written in JavaScript, in order to provide quick and easy customization.
After the page preprocessor has initialized the DOM tree, the BrowserEngine parses the HTML document,
fetches external dependencies and populates the DOM tree with HTML elements. External dependencies,
such as scripts and frames, are looked up in a local dependency cache, or fetched indirectly via the
Enterprise Crawler, which acts as a caching proxy. The BrowserEngine can also fetch resources directly from the
network when it is used by components other than the crawler. The document is loaded just as a real browser would
load it, by executing scripts and onLoad handlers.
In addition to the page preprocessor, there is an optional script preprocessor that can modify the source code
of every snippet of JavaScript code before it is executed.
After the document is loaded, the constructed DOM tree is passed to a configurable pipeline of extractors.
The pipeline stages create a text representation of the HTML document, extract cookies, generate a document
checksum, simulate user interactions and extract links. The resulting data and metadata are returned to the calling
component.
Chapter 2
Configuring the BrowserEngine
The BrowserEngine can run out of the box with FAST ESP. However, you may
want to change the preprocessors and/or the pipeline to fit your needs.
Topics:
• Enterprise Crawler considerations
• Configuration via XML File
Changes made to this file, or any other files used by the BrowserEngine configuration, will not take effect
until the BrowserEngine is restarted.
Parameter Description
port Base port number, which is used to listen for requests from the Enterprise Crawler.
Note: The BrowserEngine also uses port number "port+1". Both ports must be free.
maxThreads The number of BrowserEngine threads created to process documents. This attribute limits the
number of documents which can be processed concurrently. Note that setting this value too
high can result in wasted CPU utilization due to scheduling, resulting in lower document
throughput. Also, it can cause the BrowserEngine to run out of Java heap space. Thus, a better
solution is to start multiple instances of the BrowserEngine.
maxQueueSize The limit on requests that may be accepted and queued, waiting for an available processing
thread. If the queue becomes full, the BrowserEngine will deny further requests from the
Enterprise Crawler until processing threads become available.
Example:
<server maxThreads="100" maxQueueSize="100" port="50000"/>
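As a rough illustration of the note on ports, the following Python sketch parses a server tag like the one above and lists the two ports that must be free. The function name ports_in_use is hypothetical; the snippet only illustrates the "port+1" rule and is not part of the BrowserEngine.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def ports_in_use(server_xml: str) -> Tuple[int, int]:
    """Return the base port and the additional port (base + 1)."""
    server = ET.fromstring(server_xml)
    base = int(server.attrib["port"])
    # The BrowserEngine also uses "port+1", so both ports must be free.
    return base, base + 1


example = '<server maxThreads="100" maxQueueSize="100" port="50000"/>'
print(ports_in_use(example))
```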
Browser Tag
Parameter Description
• Mozilla
• InternetExplorer
useSSL Specifies whether the BrowserEngine should use SSL when requesting external dependencies from
the Enterprise Crawler. The attribute should be set to false when used in a FAST ESP installation
with the crawler.
Note: This setting only affects the BrowserEngine's interactions with the Enterprise Crawler,
which may still use SSL to retrieve the dependency.
evaluationTimeout The maximum total time (in seconds) that processing a document may take, including
time spent waiting for external dependencies. Documents that take longer than the
specified value are aborted by the BrowserEngine. In this case the Enterprise Crawler will store
the original document and follow the links it finds.
terminateTimeout The maximum time (in seconds) that a thread can run before the
BrowserEngine is shut down. This prevents endlessly spinning threads that were not properly
timed out by the evaluationTimeout mechanism from hogging all system resources.
Example:
<browser type="mozilla" allowPopups="false" useSSL="true" evaluationTimeout="3600">
Browser sub-tags
Within the browser tag, there are four configurable tags:
• cache
• blacklist
• flash
• javascript
Cache
Parameter Description
size Specifies the cache size in megabytes (MB). The cache improves the performance by reducing
the traffic between the BrowserEngine and the Enterprise Crawler whenever there are external
dependencies.
ttl The maximum time (in milliseconds) that a cache entry may exist in the cache. If the cache
becomes full, cache entries are removed in a Least Recently Used order.
Example:
<cache size="25" ttl="3600000"/>
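The interaction between size, ttl and Least Recently Used eviction can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the BrowserEngine's actual cache: the class name DependencyCache, the byte-based size accounting and the injectable clock are all hypothetical.

```python
import time
from collections import OrderedDict


class DependencyCache:
    """Toy cache: entries expire after ttl_ms; the least recently used
    entry is evicted first when the size budget is exceeded."""

    def __init__(self, size_mb, ttl_ms, clock=time.monotonic):
        self.max_bytes = size_mb * 1024 * 1024
        self.ttl_seconds = ttl_ms / 1000.0
        self.clock = clock
        self.entries = OrderedDict()  # url -> (stored_at, content)
        self.used_bytes = 0

    def get(self, url):
        item = self.entries.get(url)
        if item is None:
            return None
        stored_at, content = item
        if self.clock() - stored_at > self.ttl_seconds:
            self._remove(url)  # the entry outlived its ttl
            return None
        self.entries.move_to_end(url)  # mark as most recently used
        return content

    def put(self, url, content):
        if url in self.entries:
            self._remove(url)
        self.entries[url] = (self.clock(), content)
        self.used_bytes += len(content)
        while self.used_bytes > self.max_bytes and self.entries:
            self._remove(next(iter(self.entries)))  # evict least recently used

    def _remove(self, url):
        _, content = self.entries.pop(url)
        self.used_bytes -= len(content)


cache = DependencyCache(size_mb=25, ttl_ms=3600000)
cache.put("http://www.example.com/menu.js", b"function menu() {}")
print(cache.get("http://www.example.com/menu.js"))
```

A larger size or ttl keeps more dependencies available locally, which is why both settings affect how often the Enterprise Crawler has to be contacted.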
Blacklist
Parameter Description
regexp value The blacklist tag contains a list of regular expressions used to exclude requests for external
dependencies. Before the BrowserEngine requests an external dependency, it checks if the
URI matches a regular expression. If there is a match, the request is not submitted, and the
BrowserEngine will continue to process the document without downloading the dependency. A
common usage is to block advertisements.
Example:
<blacklist>
<regexp value="as-us\.falkag\.net"/>
<regexp value="doubleclick\.net"/>
</blacklist>
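The blacklist check described above can be sketched in Python as follows. The sketch assumes that a pattern may match anywhere in the URI (the unanchored example patterns suggest this, but the manual does not state it), and the function name is_blacklisted is hypothetical.

```python
import re

# The two patterns from the example above.
BLACKLIST = [
    re.compile(r"as-us\.falkag\.net"),
    re.compile(r"doubleclick\.net"),
]


def is_blacklisted(uri, patterns=BLACKLIST):
    """Return True if any pattern matches somewhere in the URI (assumed semantics)."""
    return any(pattern.search(uri) for pattern in patterns)


print(is_blacklisted("http://ad.doubleclick.net/banner.js"))  # True
print(is_blacklisted("http://www.example.com/menu.js"))       # False
```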
JavaScript
Parameter Description
timeout Specifies the maximum time (in milliseconds) that the JavaScript engine is allowed to execute
a snippet of JavaScript code. If the timeout limit is reached the execution of the JavaScript code
will be aborted. This prevents the BrowserEngine from becoming stuck in endless loops.
scriptPreProcessor Specifies the URL or Java resource path to the script preprocessor JavaScript code.
pagePreProcessor Specifies the URL or Java resource path to the page preprocessor JavaScript code.
Example:
<javascript timeout="5000">
<pagePreProcessor src="/pagePreProcessor.js"/>
<scriptPreProcessor src="/scriptPreProcessor.js"/>
</javascript>
1. Create or modify the page preprocessor file according to your needs, and save it to the directory containing
the BrowserEngine configuration file.
2. Edit the BrowserEngine configuration file to specify this page preprocessor.
3. Restart the BrowserEngine.
Example: A page preprocessor which emulates support for the Adobe Reader.
1. Create or modify the script preprocessor file according to your needs and save it to the directory containing
the BrowserEngine configuration file.
2. Edit the BrowserEngine configuration file to specify this script preprocessor.
3. Restart the BrowserEngine.
Parameter Description
Pipeline tag
The extractor pipeline has four primary responsibilities:
• create the processed HTML document
• retrieve cookies
• create a checksum
• extract links
Additional functionality can also be included in the pipeline.
The configuration of the pipeline consists of parameters to control overall processing, and the list of extractors
to be run for each page.
Attribute Description
obeyNoIndex Specifies whether the extractors should obey the HTML noindex meta tag or not (boolean).
abortOnFailure Specifies whether the pipeline should abort if an extractor in the pipeline fails, or whether the BrowserEngine
should return the partially processed document. If set to "true" and a document fails, the document
will not be stored by the Enterprise Crawler and none of the links will be followed. If set to "false",
the document will be stored, and the extracted links may be followed (depending on the crawl
collection configuration).
Example:
<pipeline maxIterations="1" obeyNoIndex="false"
abortOnFailure="false">
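The effect of abortOnFailure can be sketched in Python as follows. This is an illustration only: the function run_pipeline and the sample extractors are hypothetical, and the sketch assumes that, when abortOnFailure is false, the remaining extractors still run after a failure, which the manual does not state.

```python
def run_pipeline(extractors, document, abort_on_failure):
    """Run extractors in the given order.

    Returns a dict of results, or None if the pipeline was aborted.
    """
    results = {}
    for name, extractor in extractors:
        try:
            results[name] = extractor(document)
        except Exception:
            if abort_on_failure:
                return None  # nothing is returned for this document
            results[name] = None  # keep the partial result and continue
    return results


def upper_case(document):
    return document.upper()


def broken(document):
    raise ValueError("extractor failed")


stages = [("upper", upper_case), ("broken", broken)]
print(run_pipeline(stages, "page", abort_on_failure=True))
print(run_pipeline(stages, "page", abort_on_failure=False))
```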
Pipeline sub-tags
Within the pipeline tag, many extractors may be defined. The BrowserEngine will execute the extractors in
the specified order. Each extractor tag has two attributes: name and class. In addition, there may be multiple
param tags.
Attribute Description
param An optional list of parameters. A param tag has three attributes: name, value and type (the data type).
HTMLOutput
The extractor generates an HTML document from the DOM tree.
Note: This extractor must always be first in the pipeline!
Example:
Cookies
The extractor extracts any cookies which have been created or modified by the executed JavaScript code.
Example:
Checksum
This extractor generates an MD5 checksum of the document. The checksum is based on the result of
HTMLOutput, with the HTML tags removed. This is the same algorithm used by default in the Enterprise
Crawler.
Example:
AttributeValueExtractor
This extractor retrieves links from HTML attributes. The AttributeValueExtractor takes a series of string
parameters. The "name" parameter is the name of the HTML tag, and "value" is the attribute within this HTML
tag to extract links from.
Example:
<extractor name="AttributeValueExtractor"
class="com.fastsearch.jscriptserver.extractors.AttributeValueExtractor">
<param name="body" value="background" type="str"/>
<param name="embed" value="src" type="str"/>
</extractor>
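The tag and attribute pairing used by this extractor can be sketched in Python as follows. This is a simplified illustration that scans HTML text with the standard html.parser module, whereas the BrowserEngine works on the constructed DOM tree; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class AttributeLinkParser(HTMLParser):
    """Collect the values of configured (tag, attribute) pairs."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules  # list of (tag, attribute) pairs
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        for name, value in attrs:
            if (tag, name) in self.rules and value:
                self.links.append(value)


def extract_attribute_links(html, rules):
    parser = AttributeLinkParser(rules)
    parser.feed(html)
    parser.close()
    return parser.links


# Same pairs as the param tags in the example above.
rules = [("body", "background"), ("embed", "src")]
page = '<html><body background="bg.gif"><embed src="intro.swf"></body></html>'
print(extract_attribute_links(page, rules))  # ['bg.gif', 'intro.swf']
```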
Clicker
The extractor attempts to simulate user input by “clicking” on elements. This extractor takes one string
parameter, "click". The parameter contains a semicolon-separated list of elements to click on.
Example:
EventHandlerRunner
This extractor gets links by triggering JavaScript events. The EventHandlerRunner class has one string
parameter, "events". The value of this parameter is a semicolon-separated list of events, which
the extractor will trigger to retrieve new links.
Example:
<extractor name="EventHandlerRunner"
class="com.fastsearch.jscriptserver.extractors.EventHandlerRunner">
<param name="events" value="onFocus; onBlur; onClick; onMouseDown;" type="str"/>
</extractor>
ScriptExtractor
The script extractor uses regular expressions to extract links from JavaScript tags.
Example:
<extractor name="ScriptExtractor"
class="com.fastsearch.jscriptserver.extractors.ScriptExtractor">
</extractor>
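Regular-expression-based link extraction from script source can be sketched in Python as follows. The two patterns are illustrative assumptions, since the expressions actually used by the ScriptExtractor are not documented here; the function name extract_script_links is hypothetical.

```python
import re

# Hypothetical patterns: location assignments and window.open() calls.
LINK_PATTERNS = [
    re.compile(r"""(?:document|window)\.location(?:\.href)?\s*=\s*['"]([^'"]+)['"]"""),
    re.compile(r"""window\.open\(\s*['"]([^'"]+)['"]"""),
]


def extract_script_links(script_source):
    """Return every URL captured by the patterns, pattern by pattern."""
    links = []
    for pattern in LINK_PATTERNS:
        links.extend(pattern.findall(script_source))
    return links


source = '''
// document.location = 'http://www.example.com/docLoc.html';
// window.open("http://www.example.com/someOpen4.html", "window name");
'''
print(extract_script_links(source))
```

Because the patterns are applied to the raw source text, links are found even in code that is commented out and would never be executed.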
FormExtractor
This extractor tries to extract links from forms by triggering the submit buttons of the forms.
Example:
<extractor name="FormExtractor"
class="com.fastsearch.jscriptserver.extractors.FormExtractor">
</extractor>
CSSExtractor
The extractor retrieves links from cascading style sheet (CSS) definitions.
Example:
</extractor>
MetaURLFinder
The MetaURLFinder extractor extracts links from within HTML meta tags.
Example:
<extractor name="MetaURLFinder"
class="com.fastsearch.jscriptserver.extractors.MetaURLFinder">
</extractor>
UserScript
The UserScript extractor makes it possible to create extractors using JavaScript. Thus, if none of the other
extractors are able to retrieve the links, you can write your own extractor. The extractor has one parameter,
"src". The parameter specifies the location of your JavaScript file. It can be a URL or a Java resource path.
Example:
<extractor name="JavaScriptExtractor"
class="com.fastsearch.jscriptserver.extractors.UserScript">
<param name="src" value="/JavaScriptExtractor.js" type="str"/>
</extractor>
Note that this script will be executed like any other script within a page. Please be cautious when naming
variables and functions. The last line in the script must be an object containing the extracted links. The object
must have named properties with their corresponding values being arrays of strings. The name of a property
is the link type, and the array is the list of URIs found for that particular link type.
Example: A user script which extracts image links from a page.
links;
Flash settings
Setting Description
config Specifies the URI to the flash configuration file, which is used to configure flash extraction.
timeout Maximum time (in milliseconds) that the BrowserEngine will use to process a flash file before
the processing is aborted.
Example:
<flash config="file:///home/user/FlashConfig.xml" timeout="5000"/>
If a Flash configuration file is not specified in the BrowserEngine configuration, the BrowserEngine will use
its default configuration for Flash processing.
Configuration file
The Flash configuration file includes an ExtractLinksFromText tag. This tag has an enabled attribute, which
can be set to true or false. Setting this attribute to true allows the BrowserEngine to identify links in the
text extracted from the Flash file. Enabling this option will increase the processing time of Flash files.
Note: Most of the links in a Flash file are not contained within the text itself, so this is just an extra
option to find additional links.
Setting Description
prefix Specifies a prefix. Tokens starting with this value will be identified as links.
suffix Specifies a suffix. Tokens ending with this value will be identified as links.
<FlashConfig>
<ExtractLinksFromText enabled="false">
<prefix> http </prefix>
<suffix> txt </suffix>
<prefix> ftp </prefix>
<suffix> js </suffix>
<suffix> html </suffix>
</ExtractLinksFromText>
</FlashConfig>
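The prefix and suffix rules can be sketched in Python as follows. This is an illustration under stated assumptions: text is split into tokens on whitespace, whitespace around prefix and suffix values is ignored, and the function name links_from_text is hypothetical. The sample configuration sets enabled to "true" so that the rules take effect.

```python
import xml.etree.ElementTree as ET

FLASH_CONFIG = """<FlashConfig>
  <ExtractLinksFromText enabled="true">
    <prefix> http </prefix>
    <suffix> txt </suffix>
    <prefix> ftp </prefix>
    <suffix> js </suffix>
    <suffix> html </suffix>
  </ExtractLinksFromText>
</FlashConfig>"""


def links_from_text(config_xml, text):
    """Return the tokens of text that start with a prefix or end with a suffix."""
    section = ET.fromstring(config_xml).find("ExtractLinksFromText")
    if section is None or section.get("enabled", "false").strip().lower() != "true":
        return []  # link extraction from text is switched off
    prefixes = tuple(p.text.strip() for p in section.findall("prefix"))
    suffixes = tuple(s.text.strip() for s in section.findall("suffix"))
    return [
        token for token in text.split()
        if token.startswith(prefixes) or token.endswith(suffixes)
    ]


print(links_from_text(FLASH_CONFIG, "See http://www.example.com/a or readme.txt today"))
```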
Example
Below is an example file.
<config>
<blacklist>
<regexp value="http://ads\."/>
<regexp value="doubleclick\.net"/>
</blacklist>
<javascript timeout="5000">
<scriptPreProcessor src="/scriptPreProcessor.js"/>
<pagePreProcessor src="/pagePreProcessor.js"/>
</javascript>
</browser>
<extractor name="HtmlOutput"
class="com.fastsearch.jscriptserver.extractors.HtmlOutput">
</extractor>
</extractor>
</extractor>
<extractor name="MetaURLFinder"
class="com.fastsearch.jscriptserver.extractors.MetaURLFinder">
</extractor>
</pipeline>
</config>
Chapter 3
Operating the BrowserEngine
This chapter describes how to perform tasks such as starting/stopping, monitoring
and logging of the BrowserEngine.
Topics:
• Starting and Stopping
• Logging
• Monitoring
• Tuning
• Restrictions
Logging
The BrowserEngine produces logs which can help determine the state of a URI or the state of the whole
system. By default, it logs to the $FASTSEARCH/var/log/browserengine directory.
To reduce network traffic, only startup, shutdown, and status messages are sent to the Log Server.
Document-level messages are therefore logged only on the node where the BrowserEngine is running.
If you are using the Enterprise Crawler with the BrowserEngine, it also produces log messages that can be
valuable in tracking down what is happening to a specific URI. Refer to the FAST Enterprise Crawler Guide
for more information.
1. Open $FASTSEARCH/components/browserengine/WEB-INF/classes/log4j.xml
2. Change the configuration to your needs and save the file.
3. Using the Node Controller, restart the BrowserEngine.
Monitoring
The BrowserEngine can currently be monitored by reading the log files and by using a set of methods exposed
through XML-RPC.
If you are using the BrowserEngine in combination with the crawler, the crawleradmin tool has an option that
displays statistics for a particular Master (Crawler) node:
$FASTSEARCH/bin/crawleradmin --browserengine
When run on an UberMaster, the output is a list of all the BrowserEngines that are used by the Master nodes.
Tuning
The BrowserEngine can easily become overloaded or run out of Java heap space, because processing
an HTML document like a browser and executing JavaScript is a heavy task. This section explains how to
modify configuration settings in order to balance the workload.
Server
Performance may be improved by changing the maxThreads setting, to increase or decrease the thread pool
size. If the BrowserEngine uses too many threads, valuable CPU cycles will be wasted on thread scheduling,
thus lowering throughput. Also, configuring the BrowserEngine with too many threads increases the probability
of running out of Java heap space. Thus, a better solution may be to run multiple BrowserEngine instances.
Configuring an engine with too few threads may also result in low throughput, as many of the threads
may be blocked waiting for external dependencies. The optimal number of threads depends on the
operating system, the hardware and the content that is crawled. To tune it, closely monitor the system
before and after each change to the thread pool size, and measure the effect of each change on
performance.
Browser
Increase the size parameter of the cache tag, or its ttl setting. This should increase the cache hit ratio, which
means that the number of requests for external scripts and frames is decreased. As a result, fewer threads
in the BrowserEngine will be blocked.
Pipeline
Configure the pipeline to use the minimal set of extractors you need. For instance, if you are only interested
in extracting image links, the default pipeline configuration would involve too much unneeded processing.
Node deployment
Move the BrowserEngine to a faster (or less heavily utilized) server, or run multiple BrowserEngine instances
on several nodes. Note that the Enterprise Crawler must be reconfigured if the BrowserEngine deployment
is changed.
Restrictions
In this section two common limitations of the BrowserEngine are discussed.
AJAX
The BrowserEngine does not fully support AJAX (Asynchronous JavaScript and XML). It will extract all links
found in XMLHttpRequest calls, thus if permitted the crawler will follow these links. However, note that it will
not try to download and execute the code
Chapter 4
BrowserEngine reference information
This chapter contains various reference information about the BrowserEngine
such as command line parameters and the XML-RPC interface.
Topics:
• BrowserEngine binary
• XML-RPC Browser Interface
• XML-RPC Status Interface
• Extractor processing examples
BrowserEngine binary
The BrowserEngine is invoked by a shell script located at:
UNIX: $FASTSEARCH/components/browserengine/bin/browserengine.sh
Windows: %FASTSEARCH%\components\browserengine\bin\browserengine.cmd
Syntax: browserengine.(sh|cmd) [options] configfile
Option Description
configfile The configuration file as a URL or Java resource path. You can specify a configuration file from
the configserver by using the following URL syntax:
configserver://<ModuleName>/<FilePath>
For instance:
configserver://BrowserEngine/BrowserConfig.xml
Note: If you want to specify a configuration file on the file system, the URL looks like this:
file:///<FilePath>/<FileName>
HTML processing
Map Browser.process(String url, byte[] content, List headers, String proxyHost, int
proxyPort, List extraHeaders)
where
Option Description
headers A list of HTTP headers where each entry in the list is a list of length two containing the name
and value of a header. As a minimum, a content-type header with text/html must be supplied.
By adding Set-Cookie headers, you can define which cookies should be available to
JavaScript on the page.
proxyHost The hostname or IP address of an HTTP proxy (if any).
extraHeaders Headers that will be sent with external dependency requests.
Returns a map containing the result (links, cookies, HTML and so on).
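A call to Browser.process from Python might look like the following sketch, which uses the standard xmlrpc.client module. The helper names build_headers and process_page are hypothetical, and the server URL passed to process_page is an assumption, since this section does not say at which address or path the XML-RPC interface is exposed.

```python
import xmlrpc.client


def build_headers(content_type="text/html", cookies=()):
    """Build the headers argument: a list of [name, value] pairs."""
    headers = [["Content-Type", content_type]]  # the minimum required header
    for cookie in cookies:
        headers.append(["Set-Cookie", cookie])  # cookies visible to scripts
    return headers


def process_page(server_url, page_url, html_bytes, proxy_host, proxy_port, cookies=()):
    """Call Browser.process on a BrowserEngine reachable at server_url (assumed)."""
    proxy = xmlrpc.client.ServerProxy(server_url)
    return proxy.Browser.process(
        page_url,
        xmlrpc.client.Binary(html_bytes),  # byte[] content
        build_headers(cookies=cookies),
        proxy_host,
        proxy_port,
        [],  # extraHeaders: none in this sketch
    )


print(build_headers(cookies=["session=abc123"]))
```

Only build_headers is exercised here; process_page requires a running BrowserEngine.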
Flash processing
byte[] Flash.process(String url, byte[] content)
where
Option Description
url The URL or some other identifier for the Flash content.
Returns a map containing statistical information about the server since it started. Example output (in
the form of a Python dictionary):
...
'Total Requests': 2,
'Failed Requests': 0,
'Percentage Statistics':
{ 'CacheHit': 50.0
},
'Pipeline Performance (ms)':
{ 'AttributeValueExtractor': {'avg': 39, 'count': 2, 'max': 54, 'min': 24, 'tot':
78},
'CSSExtractor': {'avg': 3, 'count': 2, 'max': 4, 'min': 3, 'tot': 7},
...
},
'Time Statistics (ms)':
{ 'ExternalResource': {'avg': 532, 'count': 1, 'max': 532, 'min': 532, 'tot': 532},
'PageLoading': {'avg': 1193, 'count': 2, 'max': 1990, 'min': 397, 'tot': 2387},
...
}
...
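Statistics in this form are easy to post-process. The following Python sketch finds the slowest pipeline stage and the failure rate; the sample dictionary repeats only the values shown above, and the function names are hypothetical.

```python
def slowest_stage(stats):
    """Name of the pipeline stage with the highest average time."""
    stages = stats["Pipeline Performance (ms)"]
    return max(stages, key=lambda name: stages[name]["avg"])


def failure_rate(stats):
    """Fraction of requests that failed (0.0 if there were no requests)."""
    total = stats["Total Requests"]
    return stats["Failed Requests"] / total if total else 0.0


sample = {
    "Total Requests": 2,
    "Failed Requests": 0,
    "Pipeline Performance (ms)": {
        "AttributeValueExtractor": {"avg": 39, "count": 2, "max": 54, "min": 24, "tot": 78},
        "CSSExtractor": {"avg": 3, "count": 2, "max": 4, "min": 3, "tot": 7},
    },
}
print(slowest_stage(sample), failure_rate(sample))
```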
Map threads()
Returns a map where the keys are thread IDs and the values are maps describing the work status of the
corresponding thread. Example output (in the form of a Python dictionary):
...
'pool-2-thread-43': {'started': 1180012778,
'status': 'loading_page',
'url': 'http://somewhere.com/somepage1.html'},
'pool-2-thread-44': {'status': 'idle/dead'},
'pool-2-thread-45': {'started': 1180013128,
'status': 'processing_page',
'url': 'http://somewhere.com/somepage2.html'},
...
Map getQueueStatus()
Returns a map containing two values, QueueSize and MaxQueueSize. This can be useful to determine whether
or not the BrowserEngine is overloaded.
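One way to use these two values is sketched below in Python. The 0.9 threshold and the function names are hypothetical choices for illustration; the manual does not define when a BrowserEngine counts as overloaded.

```python
def queue_fill_ratio(status):
    """Fraction of the request queue in use, from a getQueueStatus() result."""
    max_size = status["MaxQueueSize"]
    return status["QueueSize"] / max_size if max_size else 0.0


def is_overloaded(status, threshold=0.9):
    """Treat a queue that is at least `threshold` full as overloaded (assumption)."""
    return queue_fill_ratio(status) >= threshold


print(is_overloaded({"QueueSize": 95, "MaxQueueSize": 100}))  # True
print(is_overloaded({"QueueSize": 10, "MaxQueueSize": 100}))  # False
```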
void quit()
HTMLOutput
Input to extractor:
<html>
<head>
<script language="javascript">
document.writeln('standalone<br>');
function test(arg) {
document.writeln('## function test run from: '+arg+'<br>');
}
test('HEADER');
</script>
</head>
<body>
<script language="javascript">test('BODY');</script>
</body>
</html>
Output from extractor:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
standalone<br/>
## function test run from: HEADER<br/>
## function test run from: BODY<br/>
</body>
</html>
Cookies extractor
Input to extractor:
<html>
<head>
<script language="javascript">
function test() {
var param = "cookie_name_";
for (i=1; i<10; i++) {
createCookie(param+i, "val"+i, i);
}
</head>
<body>
<script language="javascript"> test() </script>
</body>
</html>
Checksum generator
Input to extractor:
<html>
<head>
<script language="javascript">
function test() {
document.writeln("<a href=\"test.html\">test.html </a>");
}
</script>
</head>
<body>
<script language="javascript"> test() </script>
</body>
</html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
<a href="test.html">test.html </a>
</body>
</html>
If the BrowserEngine is not configured, and the Enterprise Crawler generates the checksum, the result can be a
different checksum. In that case the JavaScript code of the document is not processed, so the document may
have different content.
If the Enterprise Crawler were to process an HTML document that is identical to the JavaScript-processed
document, it would generate the same checksum as the BrowserEngine.
Example: HTML used in Enterprise Crawler for checksum generation
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
<a href="test.html">test.html </a>
</body>
</html>
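The idea of a checksum over the document text with the HTML tags removed can be sketched in Python as follows. This is not the algorithm used by the BrowserEngine or the Enterprise Crawler: the tag stripping via html.parser, the whitespace normalisation and the function name text_checksum are all assumptions, and script elements receive no special treatment.

```python
import hashlib
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and ignore the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def text_checksum(html):
    """MD5 over the text content, with whitespace normalised (assumption)."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.parts).split())
    return hashlib.md5(text.encode("utf-8")).hexdigest()


processed = '<html><body><a href="test.html">test.html </a></body></html>'
static = '<html>\n<body>\n<a href="test.html">test.html </a>\n</body>\n</html>'
print(text_checksum(processed) == text_checksum(static))  # True
```

Two documents with the same text yield the same checksum regardless of how the markup is laid out, which mirrors the equivalence described above.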
AttributeValue extractor
Input to extractor:
<img src="img_src_dyn.gif">
Output from extractor:
img_src_dyn.gif
Clicker extractor
Input to extractor:
<html>
<head>
<title> JavaScript testing... </title>
<script language="javascript">
function createLink() {
var protocol = "http";
var sitename = "www.example.com";
var doc = "/cl.html";
</head>
<body>
<center>
<div id="click">
<img src="image.jpg" onclick="createLink();">
</div>
</center>
</body>
</html>
deadlink.html
http://www.example.com/cl.html
EventHandlerRunner extractor
Input to extractor:
<html>
<head>
<script language="javascript">
function createLink() {
var protocol = "http";
var sitename = "www.example.com";
var doc = "event.html";
<body>
<center>
<div id="click">
<img src="picture.jpg" onMouseOut="createLink();">
</div>
</center>
</body>
</html>
deadlink.html
http://www.example.com/event.html
Script extractor
Input to extractor:
// document.location = 'http://www.example.com/docLoc.html';
// window.open("http://www.example.com/someOpen4.html", "window name");
Output from extractor:
http://www.example.com/docLoc.html
http://www.example.com/someOpen4.html
Form extractor
Input to extractor:
action_dyn.html
action_static.html
CSS extractor
Input to extractor:
<style type="text/css">
@import "1.css";
@import url('2.css');
body{background-image: url('3.jpg')}
</style>
Output from extractor:
1.css
2.css
3.jpg
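The kind of extraction shown in this example can be approximated in Python with a regular expression. The pattern below is an illustrative assumption covering only @import with a quoted string and url(...) references; it is not the CSSExtractor's actual logic, and the function name extract_css_links is hypothetical.

```python
import re

# Alternative 1: @import "file";   Alternative 2: url('file') or url(file)
CSS_LINK = re.compile(
    r"""@import\s+['"]([^'"]+)['"]|url\(\s*['"]?([^'")\s]+)['"]?\s*\)"""
)


def extract_css_links(css):
    """Return referenced files in the order in which they appear."""
    return [m.group(1) or m.group(2) for m in CSS_LINK.finditer(css)]


css = """
@import "1.css";
@import url('2.css');
body{background-image: url('3.jpg')}
"""
print(extract_css_links(css))  # ['1.css', '2.css', '3.jpg']
```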
MetaURLFinder extractor
Input to extractor:
http://fast.no/link.html
UserScript extractor
JavaScript defined as userscript:
Input to userscript:
<html>
<body>
<script language="javascript">
var testvar = 'test.html';
vartest = 'MAGIC_'+testvar;
</script>
</body>
</html>
MAGIC_test.html.