Escolar Documentos
Profissional Documentos
Cultura Documentos
ABSTRACT concert1 . The author of the article rightly observes that the
Application-level web security refers to vulnerabilities inher- process \requires no particular technical skill"; the attack
ent in the code of a web-application itself (irrespective of the merely involves saving the HTML form to disk, modifying
technologies in which it is implemented or the security of the the price (stored in a hidden form eld) using a text editor
web-server/back-end database on which it is built). In the and reloading the HTML form back into the browser. A
last few months application-level vulnerabilities have been recent article published in ZD-Net [17] suggests that be-
exploited with serious consequences: hackers have tricked tween 30% and 40% of e-commerce sites throughout the
e-commerce sites into shipping goods for no charge, user- world are vulnerable to this simple attack. Internet Security
names and passwords have been harvested and condential Systems (ISS) identied eleven widely deployed commercial
information (such as addresses and credit-card numbers) has shopping-cart applications which suer from the vulnerabil-
been leaked. ity [14].
In this paper we investigate new tools and techniques The price changing attack is a consequence of an appli-
which address the problem of application-level web secu- cation-level security hole. We use the term application-
rity. We (i ) describe a scalable structuring mechanism facil- level web security to refer to vulnerabilities inherent in the
itating the abstraction of security policies from large web- code of a web-application itself (irrespective of the tech-
applications developed in heterogenous multi-platform envi- nology in which it is implemented or the security of the
ronments; (ii ) present a tool which assists programmers de- web-server/back-end database on which it is built). Most
velop secure applications which are resilient to a wide range application-level security holes arise because web applica-
of common attacks; and (iii ) report results and experience tions mistakenly trust data returned from a client. For
arising from our implementation of these techniques. example, in the price-changing attack, the web application
makes the invalid assumption that a user cannot modify the
price because it is stored in a hidden eld.
Categories and Subject Descriptors Application-level security vulnerabilities are well known
D.2.2 [Software Engineering]: Design Tools and Tech- and many articles have been published advising develop-
niques|modules and interfaces ; D.2.12 [Software Engi- ers on how they can be avoided [22, 23, 28]. Fixing a sin-
neering]: Interoperability|interface denition languages gle occurrence of a vulnerability is usually easy. However,
the massive number of interactions between dierent com-
General Terms ponents of a dynamic website makes application-level se-
curity challenging in general. Despite numerous eorts to
Security, Design tighten application-level security through code-review and
other software-engineering practices [18] the fact remains
Keywords that a large number of professionally designed websites still
Application-Level Web Security, Security Policy Description suer from serious application-level security holes. This ev-
Language, Component-based Design idence suggests that higher-level tools and techniques are
required to address the problem.
In this paper we present a structuring technique which
1. INTRODUCTION helps designers abstract security policies from large web ap-
On the 25th January, 2001, an article appeared in a re- plications. Our system consists of a specialised Security-
spected British newspaper entitled Security Hole Threatens Policy Description Language (SPDL) which is used to pro-
British E-tailers [13]. The article described how a journalist gram an application-level rewall (referred to as a secu-
hacked a number of e-commerce sites, successfully buying rity gateway). Security policies are written in SPDL and
goods for less than their intended prices. The attacks re- compiled for execution on the security gateway. The secu-
sulted in a number of purchases being made for 10 pence rity gateway dynamically analyses and transforms HTTP
each including an internet domain name (ivehadyou.org.uk), requests/responses to enforce the specied policy.
a \Wales Direct" calendar and tickets for a Jimmy Nail pop
1
Copyright is held by the author/owner(s). Some readers may argue that 10p is the true value of tickets
WWW2002, May 7–11, 2002, Honolulu, Hawaii, USA. to such a concert. A full discussion of this topic is outside
ACM 1-58113-449-5/02/0005. the scope of this paper.
396
The remainder of the paper is structured as follows: Sec- of a form-box called searchName. On the server-side this
tion 2 surveys a number of application-level attacks and dis- search string (stored in the variable $searchName) is used to
cusses some of the reasons why application-level vulnerabili- build an SQL query. This may involve code such as:
ties are so prevalent in practice. In Section 3 we describe the
technical details of our system for abstracting application- $query = "SELECT forenname,tel,fax,email
level web security. Our methodology is illustrated with an FROM personal
extended example in Section 4 and we discuss how the ideas WHERE surname=`$searchName'; ";
in this paper may be generalised in Section 5. We have im- However, if the user enters the following text into the
plemented the techniques discussed in this paper. The per- searchName form box:
formance of our implementation is evaluated in Section 6.
Related work is discussed in Section 7; nally, Section 8 '; SELECT password,tel,fax,email FROM personal
concludes. WHERE surname=`Sharp
397
A major cause of application-level security vulnerabilities
Web Server Security Gateway
398
Note that the reason we insert JavaScript into forms dy- validation element allows complex constraints to be en-
namically (rather than inserting it statically into les in the coded in a general purpose validation language. The content
web repository) is that many applications use server-side of the validation element is a validation expression written
code to generate forms on-the-
y. Although there is scope in a simple, call-by-value, applicative language which is es-
for analysis of web scripting languages to insert validation sentially a simply-typed subset of Standard ML [20]. (Note
code statically this is a topic for future work. that the precise details of the language are not the main
focus of this paper. In principle any language could be used
3.2 Security Policy Description Language to express validation constraints. For expository purposes,
At the top level an SPDL specication is an XML docu- we choose to make the language as simple as possible.)
ment. The DTD corresponding to SPDL is shown in Fig- The abstract syntax of the validation language is shown
ure 2. A policy element contains a series of URL and cookie in Figure 3. A well-formed validation expression has type
elements. For each URL element a number of parameters boolean. If the validation expression of parameter, p, eval-
are declared. The attributes of a parameter element with uates to true then this signies that p contains valid data;
name = p place constraints on data passed via p: conversely evaluating to false highlights a validation failure.
Badly typed validation programs are rejected by a compile-
The maxlength and minlength attributes specify the time type-checking phase (see Section 3.3). Within valida-
maximum and minimum length of data passed via p. tion expressions, the value of the eld specied in the en-
closing parameter element is referred to as this. Values of
Setting required to \Y" species that p must always other (declared) GET and POST parameters can be refer-
contain a (non-zero length) value; enced as getparam.name and postparam.name respectively.
Setting MAC to \Y" species that the value ofp must
In this way validation rules can be dependent on the values
be accompanied by a Message Authentication Code of multiple parameters.
(MAC) [27] generated by the server. This prevents A number of primitive-dened functions and binary oper-
the user from changing the value of the parameter to ators are provided. Although we do not list them all here,
arbitrary values (see Section 3.4.2). those of particular importance are outlined below:
Arithmetic operators +, -, * and / can be applied to
The type attribute species the data-type of p (either both integers and
oating point values. String con-
int, float, bool or string). catenation is represented by the inx operator ++.
The method attribute determines whether the specied con- The function format(s,regexp) returns true i s is of
straints apply to p passed as a GET-parameter (i.e. a URL the form specied by regular expression, regexp.
argument) or a POST-parameter (i.e. returned from a form).
Setting method to GETandPOST means that the constraints We provide the function mid(s,l,r) which returns the
within the parameter element are applicable to both GET substring of s which starts at character l and nishes
and POST parameters with name = p. (The GETandPOST at character r inclusively. (Characters of s are num-
option is particularly useful if parts of a web-application bered from 1).
are written in a language which does not force a distinction Functions are provided to cast between dierent types.
between GET and POST parameters with the same name| For example, String.fromInt(i) returns the string
e.g. PHP.) representation of integer i.
For example, consider the following security policy de-
scription: Function isdefined(p) takes a parameter (for exam-
ple, postparam.p or getparam.p) and returns a boolean
<policy> indicating whether p is dened (i.e. has been passed to
<URL prefix="http://example"> the URL in the HTTP request). Using an undened
<parameter name="p1" maxlength="4" parameter as an argument to any other function or op-
type="int" required="Y" erator leads to a dynamically generated error message.
MAC="N">
</parameter> Transformation rules are much simpler than validation ex-
<parameter name="p2" method="POST" pressions and are delimited by the <transformation> tag.
maxlength="3" type="string"> The contents of a transformation element nested within
</parameter> a parameter element, p, species a pipeline of transforma-
</URL> tions to be applied to data received via p. For example, if
</policy> we always wanted to apply transformation t1 followed by
t2 to parameter p passed via a given URL then our SPDL
This example species constraints on parameters passed to specication would contain:
URLs with prex \http://example". <URL prefix="...">
The rst parameter element denes constraints to be ap- <parameter name="p" ...>
plied to a parameter named p1 (either GET or POST); the <transformation> t1 | t2 </transformation>
second parameter element denes constraints to be applied </parameter>
to a POST parameter named p2. </URL>
We hope that the attributes of parameter elements cover
the majority of validation constraints that designers require. Transformations are selected from a pre-dened library. In
However, in some circumstances a greater degree of control is our current implementation we have dened the following
required: this is provided by the validation element. The transformations:
399
<!ELEMENT policy (URL*, cookie*)>
Figure 2: The XML DTD for the Security Policy Description Language
e x (variables)
j c (constants)
j (
f e1 ; : : : ; ek ) (function calls)
j getparam:c (value of GET parameters)
j postparam:c (value of POST parameters)
j this (value of this eld)
j e1 hopi e2 (binary inx operators)
j if e1 then e2 else e3 (conditionals)
j let d : : : d in e end (local declarations)
d val x : t = e (immutable bindings)
j fun f (x1 : t; : : : ; xk : t) : t = e (function denitions)
t int j float j string j bool (types)
400
EscapeSingleQuotes Replace all single quotes with their Receive
HTML character encoding. HTTP Request
style tags, <b>, <u> and <i> and anchors of the form Fail Fail Fail
<a href="..."> ... </a>).
Return Error Return
Facility is provided for the user to dene other transforma- Page HTTP Response
401
by the SPDL compiler) is inserted into HTML forms (see the client; since clients do not know the secret they cannot
Section 3.4.1), MaxLength attributes on form elements are construct their own MACs.
set according to the SPDL specication and, nally, message Before describing the algorithm used to annotate the gen-
authentication codes are generated for form elds and URL- erated HTML with MACs we make a few auxiliary de-
parameters (see Section 3.4.2) if required. nitions. Consider a list of pairs, l = [(k1 ; v1 ); : : : ; (kn ; vn )].
We dene s ort(l) to be l sorted by k-values and v als(l) to be
3.4.1 Client-side Form Validation the list [v1 ; : : : ; vn ]. Function sortVals is the composition of
For each HTML-form in the HTTP response the secu- sort and vals (i.e. s ortV als(l) = v als(s ort(l))). Appending
rity gateway inserts JavaScript code to perform validation of lists is performed by the binary inx operator `@'. Now
checks on the client's machine. (Recall that the insertion consider an HTTP request, Hr eq , which triggers response
of JavaScript is merely to enhance usability|the generated Hr es . The algorithm for annotating the body of Hr es with
402
where the parameter mac stores the message authentication on the server-side in order to generate dynamic responses
code corresponding to p1 = 4, p2 = 5 with the time-stamp to client requests. A number of attacks on web-servers (or
stored in parameter time. indeed badly congured web-servers) can result in this em-
When data is received from a client via GET/POST pa- bedded code being transmitted to the client in source form4 .
rameters then the values of those parameters which have This can be potentially devastating since it gives crackers
their MAC attribute set to \Y" are fed back into the MAC detailed information about the inner-workings of the appli-
generation algorithm described above. Note that the origi- cation. For example, in some (badly written) applications
nal time-stamp (returned from the client as a GET parame- the code contains plaintext passwords used to authenticate
ter) is also required to recompute the MAC. We compare the against back-end database servers.
recalculated MAC with the MAC returned from the client in We have demonstrated that our security gateway can pro-
order to determine whether any parameters were tampered tect against such attacks by searching for sequences of char-
with. acters which delimit embedded code in HTTP responses
When designing the MAC algorithm one of our major con- (e.g. <%, %> for ASP or <?php, ?> for PHP). Detection of
cerns was to avoid replay attacks [29] where clients replay such delimiters implies (with reasonable probability) that
messages already annotated with MACs in unexpected con- server-side code is about to leaked. Hence, if any delimiters
texts. We take two steps to avoid such attacks: are found the security gateway lters the HTTP response
1. We include a time-stamp in the MAC and do not ac- and returns a suitable error message to the client informing
cept MACs which are more than a few minutes old. them that the page they requested is unavailable.
2. Rather than generating separate MACs for each indi- 4. CASE STUDY
vidual protected eld, we generate a single MAC for all To illustrate our methodology we consider using our sys-
protected client-side state. This protects against cut- tem to secure a simple e-commerce system. Consider the
and-splice attacks (in which MAC-annotated elds are following scenario:
swapped into other messages). As a nal step in a purchasing transaction, users are sent
Despite these preventative measures, the responsibility for an HTML form requesting their surname, credit-card num-
ensuring that replay attacks are not damaging ultimately ber and its expiry date. The price and product-ID are stored
rests with the security policy designer. For example, in the in hidden form elds on the form. For example, when pur-
case study of Section 4 a MAC is generated for both the chasing a product with productID = 144264, the form sent
productID and Price elds. Although users can replay such to the client is as follows:
messages this results in multiple purchases of the same prod-
<form method="POST" action="http://www.example/buy.asp">
uct for the correct price. The key is that the MAC prevents <input type="hidden" name="price" value="423.54">
the Price and productID being modied independently. <input type="hidden" name="productID" value="144264">
<input type="text" name="surname">
3.5 Extensions <input type="text" name="CCnumber">
As well as applying the validation and transformation <input type="text" name="expires">
rules of SPDL specications, our security gateway performs </form>
a number of other tasks. Two of these are described in this
section. A single cookie, sessionKey, is used to uniquely identify
clients' sessions. Once purchases have been made, an or-
3.5.1 Restricting Values of Select Parameters der record is entered into the company's back-end database
Select parameters (delimited using the <select> tag in which can be subsequently viewed on their local intranet.
HTML forms) invite users to choose options from a pre- For the purposes of this example let us assume that the
specied list. Although web designers often make the as- system is vulnerable in the following ways:
sumption that clients will only select values present in the 1. The \modifying values of hidden form-elds" attack
list, a simple form-modication attack allows clients to sub- (as described in the Introduction) can be used to re-
mit arbitrary data in select parameters. duce the price.
The security gateway protects against such an attack, pre-
venting clients from submitting values for select parameters 2. JavaScript can be embedded in the surname eld which,
that were not present in the original HTML form. Our im- when viewed on the company's intranet, leads to cross-
plementation involves the use of a control eld which en- site scripting vulnerabilities.
codes dynamically generated lists of valid values for select
parameters. The control-eld is a hidden form-eld gen- 3. SQL can be entered into the surname eld using the
erated automatically by the security-gateway and inserted attack described in Section 2.
into forms which contain <select>s. When form parame- 4. The session key is predictable since it is created using
ters are returned to the server, the value of the control eld a time-seeded random number generator; clients can
is decoded and used to validate values of select parameters. spoof other active sessions by modifying the value of
To prevent the control eld being maliciously modied by the sessionKey cookie.
clients, its value is included in the calculation of the form's 4
message authentication code. It is amusing to observe that entering a few VBscript key-
words into a search engine (we tried `dim "<%" set con'
3.5.2 Protecting against Server Misconfiguration on Google) results in a number of matches, some of which
contain ASP code unwittingly returned from miscongured
Web applications often consist of a number of les contain- servers. Even if the problem was subsequently xed, Google
ing embedded code (e.g. PHP or VBscript) which is executed keeps a cached copy of the code!
403
The SPDL specication corresponding to the form's ac-
5. GENERALISING OUR SYSTEM We experimented with this idea by programming our se-
Whilst we advocate the use of a specialised Security Policy curity gateway directly in OCAML [16], using a comprehen-
Description Language in the majority of cases, we recognise sive HTTP library to process HTTP requests and responses.
that there are circumstances where the increased
exibility We found that, even when using a general purpose program-
of a general purpose programming language may be desir- ming language to express the security policy, using a security
able. gateway to structure an application still provides a number
In such cases we argue that using a security gateway to of advantages. In particular:
abstract security-related code is still a useful technique. In- we found the features of OCAML (notably its strict
stead of generating code for the security gateway via the type system) more conducive to writing security crit-
SPDL compiler, we observe that the security policy can be ical code than other languages more commonly used
encoded in a programming language of choice and compiled for web application development;
directly for execution on the security gateway. Although
one loses the specialised features of the SPDL, using a gen- by abstracting the security policy cleanly we gained
eral purpose programming language provides designers with the usual advantages of maintainability, clarity, in-
a greater degree of freedom. (Of course with great power creased code re-use and reduced code size.
comes great responsibility [15]. Programmers must take ex-
tra care to structure their security policy enforcement code
carefully.)
404
<policy>
<url prefix="http://www.example/buy.asp">
<parameter name="price" method="POST" maxlength="10" minlength="1" required="Y" type="float" >
<validation> this isGreaterThan 0.0 </validation>
</parameter>
<parameter name="productID" method="POST" maxlength="10" minlength="1" required="Y" type="int" />
<parameter name="surname" method="POST" maxlength="30" minlength="2" required="Y" MAC="N" type="string">
<transformation> EscapeSingleQuotes | EscapeDoubleQuotes </transformation>
</parameter>
<parameter name="CCnumber" method="POST" maxlength="16" minlength="16" MAC="N" required="Y" type="int">
<validation>
let fun first(s:string):string = String.mid(s,1,1)
fun rest(s:string):string = String.mid(s,2,String.length(s)-1)
fun double(s:string,a:bool):string =
if s="" then "" else (if a then first(s)
else String.fromInt ( Int.fromString( first(s) ) * 2 ))
++ (double (rest (s), not a))
fun sum(s:string):int =
if s="" then 0 else (Int.fromString (first(s))) + (sum (rest(s)))
in sum(double(this,false)) % 10 = 0
end
</validation>
</parameter>
<parameter name="expires" method="POST" maxlength="5" minlength="5" MAC="N" required="Y" type="string">
<validation> format(this,"\d\d/\d\d") and
Int.fromString( mid(s,1,2) ) <= 12 and Int.fromString( mid(s,1,2) ) >= 0
</validation>
</parameter>
</url>
<cookie name="sessionKey" maxlength="15" minlength="15" type="int" />
</policy>
405
Sanctum Inc. provide a product called AppShield [26]
406
On their website Sanctum claim that their \AppShield [11] eEye Digital Security. %u-encoding IDS bypass
software secures your site by blocking any type of applica- vulnerability. Advisory AD20010705,
tion manipulation through the web". Clearly this is false: if http://www.eeye.com/html/
it were possible to solve all application-level security prob- Research/Advisories/AD20010705.html.
lems with a black-box tool then there would be no need for [12] D. Flanagan. JavaScript: The Denitive Guide.
further security research.6 In contrast, we do not claim that O'Reilly. ISBN: 1565923928.
we have found a automatic x for all application-level se- [13] S. Goodley. Security hole threatens british e-tailers.
curity problems: although our tool helps to secure a web The Daily Telegraph Newspaper (UK). 25th January,
application it still requires a competent, security-aware en- 2001.
gineer to write/check the security policies by hand. http://www.telegraph.co.uk/et?pg=
Based on the research reported in this paper, we claim /et/01/1/25/ecnsecu2.html.
that our methodology provides a stronger foundation for se- [14] Internet Security Systems (ISS). Form tampering
cure web applications than conventional tools and develop- vulnerabilities in several web-based shopping cart
ment techniques. In addition, we believe that applying this applications. ISS alert.
methodology in practice would make a signicant and im- http://xforce.iss.net/alerts/advise42.php.
mediate impact to the many websites which currently suer [15] B. Jemas, B. M. Bendis, and M. Bagley. Ultimate
from application-level security vulnerabilities. Spider-man: Power and Responsibility. Marvel Books.
ISBN: 078510786X.
Acknowledgement [16] X. Leroy. The Objective Caml System Release 3.0.
This work was supported by (UK) EPSRC GR/N64256 INRIA, Rocquencourt, France, 2000.
and The Schi Foundation. Both authors are sponsored by [17] L. Lorek. New e-rip-o maneuver: Swapping price
AT&T Laboratories, Cambridge. tags. ZD-Net. 5th March, 2001.
http://www.zdnet.com/intweek/stories/
The authors wish to thank Alan Mycroft, Andrei Serjan-
news/0,4164,2692337,00.html.
tov and Richard Clayton for their valuable comments and
suggestions. [18] Microsoft. HOWTO: Review ASP code for CSSI
vulnerability.
http://support.microsoft.com/support/
9. REFERENCES kb/articles/Q253/1/19.ASP.
[1] The <bigwig> project. [19] R. Milner. A theory of type-polymorphism in
http://www.brics.dk/bigwig/. programming. Journal of Computer and System
[2] MySQL database server. http://www.mysql.com/. Sciences, 17(3), 1978.
[3] PHP hypertext preprocessor. http://www.php.net/. [20] R. Milner, M. Tofte, R. Harper, and D. MacQueen.
[4] Squid web proxy cache. http://www.squid-cache.org/. The Denition of Standard ML (Revised). MIT Press,
[5] D. Box. Simple object access protocol (SOAP) 1.1. 1997.
world wide web consortium (W3C). May 2000. [21] C. Musciano and B. Kennedy. HTML & XHTML: The
http://www.w3.org/TR/SOAP. Denitive Guide (4th Edition). O'Reilly, 2000. ISBN:
[6] C. Brabrand, A. Mller, M. Ricky, and 0-596-00026-X. See page 328.
M. Schwartzbach. Powerforms: Declarative client-side [22] R. Peteanu. Best practices for secure web
form eld validation. World Wide Web Journal, 3(4), development. Security portal.
http://securityportal.com/cover/
2000. coverstory20001030.html.
[7] CERT. Advisory CA-2000-02: malicious HTML tags [23] R. Peteanu. Best practices for secure web
embedded in client web requests. development: Technical details. Security portal.
http://www.cert.org/
http://securityportal.com/articles/webdev20001103.html.
advisories/CA-2000-02.html.
[8] CERT. Understanding malicious content mitigation [24] R. Petrusha, P. Lomax, and M. Childs. VBscript in a
nutshell: a desktop quick reference. O'Reilly. ISBN:
for web developers. 1565927206.
http://www.cert.org/tech tips/
malicious code mitigation.html. [25] R. Rivest. The MD5 message digest algorithm.
[9] R. Clayton, G. Danezis, and M. Kuhn. Real world Internet Request For Comments. April 1992. RFC
patterns of failure in anonymity systems. In 1321.
Proceedings of the Workshop on Information Hiding, [26] Sanctum Inc. AppShield white paper. March 2001.
volume 2137. Springer-Verlag, LNCS, 2001. Available from http://www.sanctuminc.com/.
[10] E. Damiani, S. De Capitani di Vimercati, [27] B. Schneier. Applied cryptography: protocols,
S. Paraboschi, and P. Samarati. Fine grained access algorithms, and sourcecode in C. John Wiley & Sons,
control for soap e-services. In Proceedings of the 10th New York, 1994.
International World Wide Web Conference, pages [28] L. D. Stein. Referer refresher.
504{513. ACM, May 2001. http://www.webtechniques.com/archives/1998/09/webm/.
[29] P. Syverson. A taxonomy of replay attacks. In
Computer Security Foundations Workshop VII. IEEE
6 Computer Society Press, 1994.
If you don't nd this argument convincing then consider
the following counter-example: deleting JavaScript valida-
tion code on the client-side is an \application manipulation
through the web" which AppShield will not detect.
407