
Risk-Based Access Control Decisions Under Uncertainty

Ian Molloy, Pau-Chen Cheng, Jorge Lobo
IBM Research {molloyim,pau,jlobo}@us.ibm.com

Luke Dickens, Alessandra Russo
Imperial College {lwd03,ar3}@doc.ic.ac.uk

Charles Morisset
Royal Holloway, University of London charles.morisset@rhul.ac.uk

Abstract: This paper addresses making access control decisions under uncertainty, when the benefit of doing so outweighs the need to guarantee the correctness of the decisions; for instance, when there are limited, costly, or failed communication channels to a policy decision point (PDP). Previously, local caching of decisions has been proposed, but when a correct decision is not available, either a PDP must be contacted or a default decision used. We improve upon this model by using learned classifiers of access control decisions. These classifiers, trained on known decisions, infer decisions when an exact match has not been cached, and use intuitive notions of utility, damage and uncertainty to determine when an inferred decision is preferred over contacting a remote PDP. There is uncertainty in the inferred decisions, introducing risk, and we propose a mechanism to quantify such uncertainty and risk. The learning component continuously refines its models based on inputs from a central PDP in cases where the risk is too high or there is too much uncertainty. We validated our models by building a prototype system and evaluating it against real-world access control data. Our experiments show that, over a range of system parameters, it is feasible to use machine learning methods to infer access control decisions. Thus our system yields several benefits, including fewer calls to the PDP, lower communication cost and latency, increased net utility, and improved system survivability.

I. INTRODUCTION

Modern access control systems rely on the PEP/PDP architecture: a Policy Enforcement Point (PEP) forwards each access request made within an application to a Policy Decision Point (PDP), which analyses this request and returns an allow/deny authorization decision. In some solutions, such as [1]–[3], the PDP is commonly implemented as a dedicated authorization server and is located on a different node than the PEPs. To enforce a consistent policy throughout the system, this architecture relies on the PEP being able to contact the PDP to query decisions, and therefore suffers from a single point of failure. In particular, key factors that affect the performance of the PEP are the latency of the communication with the PDP, the reliability and survivability of this connection (and of the PDP itself), as well as the aggregated impact of communication costs (which can include contacting a human to perform the authorization). In several contexts, such as mobile applications, these costs may be prohibitive. A number of approaches have been proposed to address these issues; one common theme is to cache access control decisions [4], [5] at the PEP, such that the PEP does not have to forward the same access request more than once to the PDP: either the PEP finds the request-decision pair in the history, or it forwards the request. Others explore the tradeoff between efficiency and accuracy in an access control context.

For example, Ni et al. [6] use a machine-learning algorithm to build a classifier of policy decisions from past request-decision pairs. Once the classifier is accurate enough, it is used as a local decision point. As with any such approximate classifier, there is a degree of uncertainty in every decision; thus, every decision made is inherently associated with a risk of being wrong, either incorrectly allowing or denying some requests.

This paper generalises the machine-learning approach of [6] by taking account of the uncertainty in each locally inferred decision. We propose a model where this uncertainty can be used to determine the risk of making this decision. This allows the policy to trade off the utility of making the local decision against the risk associated with getting it wrong: if favourable, it will enact the local decision; otherwise it defers to the central PDP. Commonly used access control models [7]–[11] define access control policies in terms of model-specific relationships between subjects and objects. Our approach is not tied to a particular model, but it uses attributes of access requests, e.g., subjects and objects, and assumes the policy can be defined using these attributes. We describe a model for quantifying uncertainty and risk, and show how these can inform access control decisions. Our work focuses on estimating the uncertainty about the correctness of the inferred access control decisions, and assumes that estimates of (or distributions over) the gain, loss and damage associated with correct and incorrect decisions are available (or can be obtained).

There are many situations in which the model can be applied, for instance the scenario described in [6], where expensive calls to a call center are made to resolve access control decisions. There is also a large body of research on the estimation of the value of resources and the cost of security breaches and misuse. [12] describes a proof-of-concept method to estimate the value of resources in an enterprise, such as jewel code, acquisition plans, trade secrets, and personal information. The estimates of these values can be continuously refined and verified by the current operations of the company. For example, IBM was granted over 5,000 patents in 2010 alone, and licenses many of these to other organizations, providing an accurate distribution of their value over time. Other examples include cases where fine-grained service level agreements are defined a priori, such as the credit card example briefly discussed in Section IV-D (see also the experiments in Section V-C).

We present a distributed access control system to illustrate and test this principle: local decision points determine whether the tradeoff between utility and uncertainty associated with a

local decision is favorable, and defer to the central PDP for a binding decision if not. This general framework allows us to explore the balance between avoiding potential errors in local decisions, and the desire to limit reliance on the central PDP. We present three methodologies for making decisions:

• Expected Utility: a risk-neutral approach that weights expected gains and damage equally.
• Risk Adjusted Utility: a pessimistic view of utility, where uncertainty about damage reduces desirability.
• Independent Risk Constraints: augments expected utility with independent risk thresholds, where high-utility decisions can be rejected if outweighed by the risk.

This system is validated with data from a large corporation, used to grant entitlements and permissions to system administrators. Classifiers are initially untrained and begin by always consulting the central PDP. Over time, the classifiers are trained using the central PDP's decisions, improving accuracy and reducing the uncertainty, yielding fewer queries. Due to space constraints, the experiments focus on the risk adjusted utility measure. Compared with the caching algorithm, our results show that our learning-based approach reduces the queries to the PDP by as much as 75% and increases the system utility up to eight-fold when the cache hit ratio is low. When the hit ratio is high, performance is comparable to the caching algorithm. Our results from this dataset show that our machine-learning based approach is robust and accurate and can viably be used for informing distributed access control decisions; more conservative measures of risk can be applied if needed. We believe these techniques can generalize to other use cases and domains, including information sharing in military and coalition environments; financial applications like credit report processing; and load balancing services like Netflix or document repositories. Compared to pure machine-learning solutions, our methods estimate the uncertainty of each classification, allowing one to arbitrarily minimize the pitfalls of using approximate decisions, while ameliorating the higher communication costs of naive caching solutions.

This paper is organized as follows. Section II briefly discusses related work. Section III discusses the architecture of access control systems based on our approach. We present in Section IV different techniques to assess the risk and uncertainty of local access control decisions. An experiment and evaluation of our approach, applied to different scenarios, is presented in Section V. Section VI concludes this paper.

II. RELATED WORK

The caching of access control decisions has been explored in [1]–[3], and the policy evaluation framework in [13] uses a caching mechanism, which can dramatically decrease the time to evaluate access control requests. Clearly, caching approaches are only valid when the cache is sure to return the same decision as the central PDP, and techniques to ensure strong cache consistency are proposed in [14]. In distributed systems, each node can build its own cache in collaboration with other nodes, improving the accuracy of each cache [15]. These caching methods return a local decision on a request

only if this exact request has been previously submitted and answered. In the context of the Bell-LaPadula model [7], Crampton et al. [4] present a method to calculate previously unseen decisions, based on the history of submitted requests and responses and the formal rules governing Bell-LaPadula. A similar approach for the RBAC model [8] is presented in [5], and an extension based on Bloom filters is used in [16]. These inference techniques require knowledge of the underlying access control model, and are most efficient for hierarchical and structured access control models. Inference algorithms have also been proposed based on the relationships of the subjects and objects in a database [17], [18]; however, the structure of the subject/object space still needs to be known. The use of machine learning to make access control decisions is proposed in [6], where the authors note the classifier can, at times, return a decision that conflicts with the central PDP. A notion of risk in access control is introduced in [19]–[21], where the distinction between allowing and denying an access is blurred; instead, each access is associated with a potential utility and damage. We reuse these notions here in order to estimate the potential impact of each access.

III. UNCERTAINTY IN ACCESS CONTROL

Figure 1 depicts the architecture of our system for access control under uncertainty. An access request req is a triple (s, o, a), where subject s accesses object o with access mode a. We assume at least one oracle node with the full policy, called the central PDP; a correct local decision is one that agrees with the central policy. Each node has a local PDP with three parts. The decision-proposer (proposer) returns a guess of the correct decision and a measure of the uncertainty. The risk-assessor (assessor) determines whether the decision is taken locally or deferred to the central PDP. The decision-resolver (resolver) returns an enforceable decision. While any discrete set of enforceable decisions is supported, here we restrict the set to Datm = {allow, deny}. A PEP passes requests to the proposer, and enforces decisions from the resolver.
Datm: atomic decisions, e.g. {allow, deny}
Dprp: proposals, e.g. Datm × [0, 1]
Ddef: deferrable decisions, e.g. Datm ∪ {defer}

Fig. 1. Architecture of a local node: the PEP passes each request req to the decision-proposer, whose proposal in Dprp is examined by the risk-assessor; the resulting deferrable decision in Ddef is turned into an enforceable decision in Datm by the decision-resolver, which queries the central PDP when the decision is defer.
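To make the data flow of Figure 1 concrete, the following sketch (in Python, the language of our prototype; all class and function names are illustrative, not part of the implementation described later) wires the three local components together:

from dataclasses import dataclass

ALLOW, DENY, DEFER = "allow", "deny", "defer"   # D_atm = {allow, deny}; D_def adds defer

@dataclass
class Proposal:                 # an element of D_prp
    decision: str               # a guess in D_atm
    uncertainty: float          # e.g. a probability in [0, 1] that the guess is correct

class LocalPDP:
    def __init__(self, proposer, assessor, central_pdp):
        self.proposer = proposer          # request -> Proposal
        self.assessor = assessor          # (request, Proposal) -> decision in D_def
        self.central_pdp = central_pdp    # oracle: request -> decision in D_atm

    def decide(self, request):
        """Decision-resolver: enforce a local decision, or defer to the central PDP."""
        proposal = self.proposer(request)
        deferrable = self.assessor(request, proposal)
        if deferrable == DEFER:
            return self.central_pdp(request)   # binding decision from the oracle
        return deferrable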

a) Decision-Proposer: A decision-proposer receives an access request and returns a proposal consisting of a guessed decision dec ∈ Datm for this request together with an uncertainty measure of the correctness of dec. Most simply, the uncertainty is a probability p, and a proposal (allow, p) is logically equivalent to (deny, 1 − p) if Datm = {allow, deny}. When p is estimated, we may want to quantify the uncertainty in p.

In our learning-based approach, the proposer is mainly a classifier produced by a machine-learning algorithm, which treats a decision in Datm as a class and assigns a request to a class based on the request's attributes; the uncertainty is a pair (α, β), where α, β ∈ [1, ∞) are the parameters of a beta distribution [22], Beta(·|α, β). This distribution captures our uncertainty about p, treating it as a random variable that approximates the true probability. The α and β values are computed from the number of correct and incorrect decisions made by the classifier when it is evaluated against the training and testing sets of the learning process. Elements of these sets are past requests tagged with the corresponding correct decisions. The beta distribution fits well with our learning-based approach and gives finer control over risk estimation; see Section IV. The learning approach makes the assumption that two requests with similar attributes lead to similar decisions. This assumption is supported by underlying principles of many access control models, a large body of work on learning access control policies, such as role mining, and best practices exercised by large organizations. Many models, such as attribute-based access control, assign rights based on attributes; one only needs to find an appropriate distance measure, such as the Hamming distance between the subjects' attributes, to capture similarity. Though it is possible to come up with a configuration that violates it, this assumption has been empirically shown to hold for concrete systems [6], and is supported by the large body of research on role mining [23], [24]. In some models, such as the Chinese Wall [9], the decision to grant an access can also depend on the context. Such models would need to be restructured to encode context into the attributes of the subjects and objects; the notion of distance between requests will be influenced by this encoding.

b) Risk Assessor: The risk-assessor takes a request and a proposal and determines whether a decision can be made locally, or should be deferred to the central PDP. The risk assessor can return any decision in the set of deferrable decisions, Ddef = Datm ∪ {defer}. A trivial assessor might have an uncertainty threshold above which all decisions are deferred to the central PDP, all other decisions being made locally. However, different requests have different potential impacts on the system. Some requests have high potential utility for the system, and wrongly denying them would not realise this. Some requests have high potential damage, and wrongly allowing them is a threat to the integrity and survival of the system. Other factors to consider are the cost of contacting the central PDP, and the system's risk appetite, which controls how risk-averse decisions should be. A risk-averse decision trades off expected utility for greater certainty in the outcome. More details on the risk assessor are given in Section IV.

c) Decision Resolver: A decision-resolver takes a request, r, and a deferrable decision, decd ∈ Ddef, and returns an enforceable decision, deca ∈ Datm. Typically, deca = decd if decd ∈ Datm; otherwise, r is deferred to the central PDP.
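As a minimal illustration of this uncertainty representation, the sketch below (hypothetical names; a simplification of the banded scheme of Section V-B) derives the beta parameters from counts of correct and incorrect decisions and reports the expected probability that a proposal is correct:

class BetaUncertainty:
    """Beta(alpha, beta) uncertainty over the probability p that a proposed decision is correct."""
    def __init__(self):
        self.alpha = 1.0   # 1 + number of correct decisions observed
        self.beta = 1.0    # 1 + number of incorrect decisions observed

    def observe(self, correct):
        if correct:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def expected_p(self):
        """Mean of Beta(alpha, beta): the expected probability that the decision is correct."""
        return self.alpha / (self.alpha + self.beta)

# e.g. a classifier that was right 18 times and wrong twice on similar past requests
u = BetaUncertainty()
for correct in [True] * 18 + [False] * 2:
    u.observe(correct)
proposal = ("allow", u.alpha, u.beta)     # a proposal (dec, alpha, beta)
print(proposal, u.expected_p())           # expected p = 19/22, roughly 0.86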

IV. ASSESSING LOCAL DECISIONS

This section considers how the assessor takes a proposal from the proposer and returns a firm decision. As discussed, the proposal may conflict with the decision that would have been made if the request were passed to the central policy. If a request that would be allowed by the central PDP is denoted as valid, then there are two kinds of errors that the local PDP could make: false-allows (allowing an invalid request) and false-denies (denying a valid request). These errors can have negative consequences: a false-allow can result in leakage of information, corruption of information, or privilege escalation; a false-deny can result in disruption of service, breach of service level agreements (SLAs), and possibly reputational damage. Making a correct decision leads to gains and benefits. Ideally, an access control system should enable as many appropriate accesses as possible, and so our primary concern is to maximize the number and value of these accesses, subject to limiting costs and damages. This trades off the potential gains of each access control decision against the costs and damages that may result from poor decisions.

Consider an assessor deciding whether to locally allow or deny a request, (s, o, a), or to pass this decision on to a central PDP. The assessor has a ternary choice to make: whether to allow the request locally, allow; to deny locally, deny; or to defer the decision to the central PDP, defer. We assume that the central PDP is an oracle, and always gives the correct decision. If the assessor decides allow, then either the request is valid (a true-allow) and the system makes a gain, g, because an appropriate access has been made; or the request is invalid and the decision has caused some amount of damage, dA, due to an inappropriate access being made. Likewise, if the assessor decides deny, then either the request was invalid, deny is the correct decision and nothing additional happens (i.e. a gain of 0), or the request was valid, the deny decision is incorrect, and a damage, dD, is incurred for a false-deny. If the assessor decides to defer and the request is valid, then the central PDP will grant the request and the system will make a gain, g, but incur a cost, c (the contact cost). If the decision is defer and the request is not valid, then the central PDP will deny the request and there will be no gain but again a contact cost, c. As the central PDP is an oracle, it never makes an incorrect decision, so there is no damage associated with this. These gains, costs and damages are shown in Table I(a).

In general, the gains, costs and damages can be probability distributions over potential values. For instance, in some cases, wrongly allowing an access only implies the probability of an attack. For simplicity in our examples, we assume these to be fixed, known values unless otherwise stated. Standard gains and costs (negative gains) are kept separate from damage, which is assumed to be associated with rare and costly negative impacts; this distinction becomes important when we consider risk measures in Sections IV-B and IV-C and/or when considering probability distributions over these values. The impacts presented in Table I(a) are not the most general that could be conceived, but are sufficiently rich to be of interest.

TABLE I
WHEN GAINS, COSTS AND DAMAGES ARE INCURRED AND THEIR INFLUENCE ON UTILITY

(a) Outcome Gains and Damage
              Request Valid            Request Invalid
              gain      damage         gain      damage
  allow       g         0              0         dA
  deny        0         dD             0         0
  defer       g − c     0              −c        0

(b) Expected Utilities
              U(·|allow, p)            U(·|deny, p′)
  allow       E(pg − (1 − p)dA)        E((1 − p′)g − p′dA)
  deny        E(−p dD)                 E(−(1 − p′)dD)
  defer       E(pg − c)                E((1 − p′)g − c)
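The following is a minimal sketch of the risk-neutral computation of Table I(b), assuming fixed, known values for g, c, dA and dD (variable and function names are illustrative): it evaluates the expected utilities for a proposal and enacts it only when it is at least as good as deferring, as described in Section IV-A.

def expected_utilities(dec, p, g, c, dA, dD):
    """Expected utility of each deferrable decision, per Table I(b)."""
    if dec == "allow":              # p: probability the request is valid
        return {"allow": p * g - (1 - p) * dA,
                "deny": -p * dD,
                "defer": p * g - c}
    else:                           # proposal (deny, p): p is the probability the request is invalid
        return {"allow": (1 - p) * g - p * dA,
                "deny": -(1 - p) * dD,
                "defer": (1 - p) * g - c}

def expected_utility_assessor(dec, p, g, c, dA, dD):
    """Risk-neutral assessor: enact the proposal only if it is at least as good as deferring."""
    U = expected_utilities(dec, p, g, c, dA, dD)
    return dec if U[dec] >= U["defer"] else "defer"

# e.g. a confident allow proposal with a modest contact cost and moderate damages
print(expected_utility_assessor("allow", p=0.98, g=10.0, c=2.0, dA=20.0, dD=5.0))   # -> allow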

For example, there may be a gain associated with denying an invalid request, e.g., an increase in system reputation, or a damage in allowing a valid request, e.g., increasing the number of copies of a document. All of the approaches described in this paper make a tradeoff between the uncertainties in classification and in the estimation of these gains and damages. To apply our methodology to a real example, we must be able to (approximately) compute these gains and damages, or find probability distributions over them. We note here that there are many applications and domains where these gains and damages can be estimated either empirically or analytically. For example, in financial service transactions like credit cards, these costs may be estimated easily or provided a priori by some agreement. Other examples include cases where fine-grained service level agreements are defined a priori. We discuss further details in Section IV-D.

A. Expected Utility

To aid this decision, the proposer attempts to classify the request, (s, o, a), to determine whether it should be allowed or denied, and includes some measure of its uncertainty in this classification. In the simplest case, the proposer can return one of two kinds of answers, e.g. (allow, p) or (deny, p′). The first response, (allow, p), means that the proposer has classified the request as an allow with probability p that this is correct, i.e. that the request is valid. Likewise, (deny, p′) means that the proposer has classified the request as a deny with probability p′ that the request is invalid. The expected utility for a decision, dec ∈ Ddef, is the expected gain of each outcome (request valid or invalid) minus the expected damage; these utilities are shown in Table I(b). In this approach, we assume a risk-neutral posture, meaning deferrable decisions are based purely on the expected utility, U(·), for each decision. The assessor simply returns the decision corresponding to the highest utility. There are cases where the assessor may not defer but return a decision contrary to the proposer's. For example, when the proposer returns (allow, p) for a request whose gain g and false-deny damage dD are small relative to the contact cost c, it may occur that U(defer) < U(deny).

Since our goal is to execute the central policy as faithfully as possible, our system always defers to the central PDP in these cases. Hence, given a request req and a decision proposer returning a decision and a probability, the expected utility assessor returns: allow, if the proposal is (allow, p) and U(allow) ≥ U(defer); deny, if the proposal is (deny, p′) and U(deny) ≥ U(defer); and defer otherwise. Thus in this first approach, we directly use the certainty of the decision to compute the expected utility and choose the decision with the highest utility. However, this approach can lead to rare but significant damages due to incorrect decisions: while the probability of an incorrect decision may be small enough to produce a small expected damage, the value of the damage itself can be significant. In the following approaches we explicitly account for the risk of such significant damage.

B. Risk Adjusted Utility

Now let us assume that we are making the same decision, but that we want to use a risk-based measure. In other words, we want to explicitly account for uncertainty in the proposal. For this, the proposer returns a beta distribution, describing uncertain knowledge about the probability of a correct classification. If the proposer classifies the request as valid then it returns (allow, α, β), where α, β ∈ [1, ∞) are the parameters of the beta distribution [22]. Likewise, if it classifies the request as invalid then it returns (deny, α, β). Again we focus just on the allow case, and assume that the assessor can return either allow or defer. If the proposer returns (allow, α, β), we denote by p = α/(α + β) (the expectation of the beta distribution) the expected probability that this decision is correct. The risk adjusted utility assumes that any uncertainty in the utility is dominated by the damage term; this can be the case when the damage is fixed, known and very large (the case we examine here), or when there is a probability distribution over damage with high variance. Taking advantage of this assumption, we use the expected values of gain and cost, but are pessimistic when considering the impact of damage and use the expected shortfall at some significance value n. For example, a 5% expected shortfall (n = 0.05) uses the average damage in the worst 5% of universes. The risk adjusted utilities for the various decisions and proposer outputs are shown in Table II(a). For our example of a fixed, known value of damage dA and a given proposal (allow, α, β), we can simplify further by using the pessimistic probability, p_n, which is essentially the expected shortfall of the probability p given α and β: we compute the value x such that the incomplete beta function of x equals n, and p_n is the expected value of p conditioned on p < x. This corresponds to taking a pessimistic view of the evidence, imagining there is a higher probability of incurring damage than would normally be expected. Therefore the probability p_n < p = α/(α + β) takes a pessimistic view of how certain the proposer is to have classified correctly when considering the impact of damage, and Û(allow) ≈ p_n·g − (1 − p_n)·dA.

TABLE II
RISK ADJUSTED UTILITIES AND RISKS FOR VARIOUS DECISIONS GIVEN PROPOSER OUTPUTS

(a) Risk Adjusted Utilities
              Û(·|allow, α, β)                    Û(·|deny, α, β)
  allow       E(pg) − ESn((1 − p)dA)              E((1 − p′)g) − ESn(p′dA)
  deny        −ESn(p dD)                          −ESn((1 − p′)dD)
  defer       E(pg − c)                           E((1 − p′)g − c)

(b) Risks
              R(·|allow, α, β)                    R(·|deny, α, β)
  allow       ESn((1 − p)dA − pg)                 ESn(p′dA − (1 − p′)g)
  deny        ESn(p dD)                           ESn((1 − p′)dD)
  defer       ESn(c − pg)                         ESn(c − (1 − p′)g)
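To illustrate the fixed-damage simplification of Table II(a), the sketch below (illustrative names; it assumes SciPy is available) computes the pessimistic probability p_n, i.e. the expectation of p conditioned on p lying below the n-th quantile of Beta(α, β), and the simplified risk adjusted utility Û(allow) ≈ p_n·g − (1 − p_n)·dA:

from scipy.stats import beta

def pessimistic_p(a, b, n=0.05):
    """Expected shortfall of p ~ Beta(a, b): E[p | p < x], where P(p < x) = n."""
    x = beta.ppf(n, a, b)                     # the n-th quantile of Beta(a, b)
    # E[p | p < x] = (a / (a + b)) * I_x(a + 1, b) / I_x(a, b), with I_x the Beta CDF
    return (a / (a + b)) * beta.cdf(x, a + 1, b) / beta.cdf(x, a, b)

def risk_adjusted_allow_utility(a, b, g, dA, n=0.05):
    """Simplified risk adjusted utility of allow for a fixed, known damage dA."""
    p_n = pessimistic_p(a, b, n)
    return p_n * g - (1 - p_n) * dA

# e.g. alpha = 19, beta = 3: the mean of p is about 0.86, but the pessimistic
# probability is noticeably lower, so a large dA can make allow unattractive
print(pessimistic_p(19, 3), risk_adjusted_allow_utility(19, 3, g=10.0, dA=100.0))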

Given a request req and a proposer returning a proposal for it, the risk adjusted utility assessor returns: allow, if the proposal is (allow, α, β) and Û(allow) ≥ Û(defer); deny, if the proposal is (deny, α, β) and Û(deny) ≥ Û(defer); and defer otherwise. This approach differs from the risk-neutral approach of the previous section in that it is risk averse, which mitigates against very large damages.

C. Independent Risk Constraints

Our second risk method, called independent risk constraints, uses the standard expected utility to rank the decisions in order of risk-neutral preference, and then rejects any which do not satisfy our risk constraint(s). The proposer classifies the request and returns either (allow, α, β) or (deny, α, β), and we calculate the expected utilities as they appear in Table I(b), using the beta-distribution mean α/(α + β) as the probability p of a correct allow, or p′ of a correct deny. The risk, R(·), for each deferrable decision is calculated independently. The risk for a decision, dec, takes a pessimistic view of the utility, and is the expected shortfall, at significance n, of the inverse of the utility. The risks for each decision and proposer output are shown in Table II(b). To simplify, as we did for the risk adjusted utility, we could assume large fixed, known damages and approximate the risk using the pessimistic probability. For example, given the proposal (allow, α, β), we calculate the risk as R(allow) ≈ (1 − p_n)dA. The decision making process involves either a fixed threshold t > 0, or, more generally, for each deferrable decision dec ∈ Ddef with utility u_dec, a threshold that depends on the utility, i.e. t(u_dec). We then take the following steps (a sketch of this selection procedure is given below):
1) Find the expected utilities and risks as shown in Tables I(b) and II(b).
2) Rank the decisions according to their utilities.
3) Check each decision dec ∈ Ddef in turn (highest utility first), and choose the first decision whose risk is below t(u_dec).
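A minimal sketch of this three-step selection, with illustrative names and under the same fixed-value simplifications: utilities and risks are assumed to have been computed per Tables I(b) and II(b), decisions are ranked by utility, and the first decision whose risk falls below the (possibly utility-dependent) threshold is chosen.

def independent_risk_assessor(utilities, risks, threshold):
    """Pick the highest-utility deferrable decision whose risk is acceptable.

    utilities, risks: dicts over {"allow", "deny", "defer"}
    threshold: function t(u) giving the maximum acceptable risk for a decision of utility u
    """
    ranked = sorted(utilities, key=utilities.get, reverse=True)    # step 2: rank by utility
    for dec in ranked:                                             # step 3: first acceptable risk
        if risks[dec] <= threshold(utilities[dec]):
            return dec
    return "defer"   # fallback; in our setting defer carries zero risk and is always acceptable

# e.g. allow has the highest utility but violates a fixed risk threshold t(u) = 5
utilities = {"allow": 7.0, "defer": 6.5, "deny": -1.0}
risks = {"allow": 12.0, "defer": 0.0, "deny": 4.0}
print(independent_risk_assessor(utilities, risks, threshold=lambda u: 5.0))   # -> defer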
For example, assume that the probability p = α/(α + β) is sufficiently close to 1 that allow is preferable to defer. Notice also that defer carries zero risk and will always be risk acceptable. Given a request req and a decision proposer returning a proposal for it, the independent risk assessor returns: allow, if the proposal is (allow, α, β), U(allow) ≥ U(defer) and R(allow) ≤ t(U(allow)); deny, if the proposal is (deny, α, β), U(deny) ≥ U(defer) and R(deny) ≤ t(U(deny)); and defer otherwise. More generally still, we may not always have a decision with zero risk, and hence we may need some way to resolve a decision when all decisions exceed the risk threshold. Further, one may wish to be able to override a decision whose risk is greater than t(u_dec) by compensating for the additional aggregate risk. For example, additional risk mitigating measures, risk escrow services, and bounding the aggregate risk and limiting its distribution [21] have been proposed. This is an orthogonal research problem and the most appropriate technique is a subject of future research.

Both risk methods have their advantages and disadvantages. For instance, risk adjusted utility behaves more like standard expected utilities, and there is always a clearly defined decision for the assessor. Conversely, the independent risk assessor keeps the two measures, utility and risk, separate; it is more flexible, and we could define multiple risk thresholds at different significance levels. However, we do not necessarily know how to deal with situations where all decisions violate the risk constraint(s).

D. Estimating Gain and Damage with Uncertainty

The instantiation of the parameters in order to address a real-world situation is a complex task, as it requires one to evaluate the damage and utility associated with access control entities. In some instances, all values are easily derivable and in a common set of units. For example, when processing credit card transactions, all values are in currencies, and rates are agreed upon a priori through formal contracts. Consider the Visa Debit Processing Service [25], [26]: the damage from a false-allow is the transaction amount, plus a Chargeback Fee, e.g., a fixed amount of $15 or a percentage, e.g., 5%; the gain is the profit margin for the items sold, minus any processing fees, e.g., 20% of the value of the transaction; the contact cost is an Authorisation Fee, a fixed value to process the transaction or a percentage (or both), e.g., $1 or 3%; and the damage of a false-deny can be estimated as a fixed fraction of the gains from customers that do not have alternative forms of payment.

In many scenarios, the actual value of gains, losses, and damages may be given in different units, e.g., currency, reputation, and physical assets, and may themselves be uncertain. Our discussion above assumes these measures are point estimates in a common set of units, for example currency, where there is further uncertainty in such a conversion. We suggest the same techniques used to calculate the risk of making an incorrect decision, described in Section IV-B, can be applied to selecting conservative, risk-adjusted conversion rates and handling the uncertainty in value distributions. If one obtains a probability density function (PDF) over values, say human life to dollars, the expected value can be used for gains, while a pessimistic value can be used for damages, resulting in conservative estimates. For example, if we take US household income as an estimator, then the median income, around $34–55K, is the expected value, and the top 5%, $157K, might be used for the risk. Clearly, our method will work best when these values are known, such as in the credit card processing use case.
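As a worked instance of the rates quoted above, the following sketch plugs them into the model for a hypothetical $100 transaction with a 20% profit margin; all figures are illustrative examples taken from the text rather than an actual fee schedule, and the false-deny fraction is an assumption:

amount, margin = 100.00, 0.20    # hypothetical transaction value and profit margin

g = margin * amount - 0.03 * amount    # gain of a true-allow: margin minus a 3% processing fee
dA = amount + 15.00                    # damage of a false-allow: transaction amount plus a $15 chargeback fee
c = 1.00                               # contact cost: a flat $1 authorisation fee in this example
dD = 0.25 * g                          # damage of a false-deny: an assumed fraction of the lost gain

print(g, dA, c, dD)                    # 17.0, 115.0, 1.0, 4.25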

When these values have distributions over some ranges, our approach is sound as long as such distributions are conservative, over-estimating damages and losses while under-estimating gains. Further, note that we are being conservative here by taking the pessimistic probability and the pessimistic damage (the worst n% of the worst n%).

V. EXPERIMENTS

In this section, we describe our prototype system and experimental results. To evaluate risk-based decision making, we devise a simple decentralized access control system consisting of a single local risk-based policy enforcement and decision point, and a central policy decision point that we call the oracle. Users submit requests, which are locally evaluated, with an estimate of the uncertainty calculated as described in Section IV-B. The system evaluates potential risks and gains, and determines when decisions can be made locally, and when the oracle must be queried at a cost. We simulate a large number of requests, e.g., 100,000, logging the requests, decisions, and uncertainty. Afterwards, we compare each decision with the oracle, and aggregate the net utility of the system, including any gains, costs and damages incurred during operation. We first describe our experimental system in more detail. For clarity, we distinguish between standard damage (of a false-allow) and loss (the damage of a false-deny).

A. Experiments

The oracle is considered infallible, i.e., it always returns correct results, and contains a real access control policy from a system that provisions administrative access to a commercial system, obtained from a corporation. The policy consists of 881 permissions and 852 users, with 25 attributes describing each user; there are 6691 allowed requests. As we do not have access to actual permission usage logs, e.g., requests or the frequency of permission use, we simulate incoming requests by sampling from the full policy, generating both valid (allow) and invalid (deny) requests. Users are selected uniformly at random; for permissions, we first determine whether the request is valid with an input probability, pv, and then, if valid (invalid), choose from the pool of valid (invalid) requests in an unbiased way. This allows us to generate sample requests with an arbitrary frequency of valid transactions, e.g. 30%, 50% or 80%, and hence to model different scenarios, such as a system under attack (mostly invalid requests), or normal usage (mostly valid requests).

B. Classifiers

Our experimental prototype is implemented in Python, and uses PyML for the support vector machine1 [27] we use as our classifier. A support vector machine (SVM) is a supervised learning method for classification and regression. Given labeled points in a k-dimensional space, an SVM finds a hyperplane (normal vector w and offset b from the origin) that separates the two classes and maximizes the minimal distance between any point and the hyperplane.
1 PyML provides a wrapper for libSVM.

The class of an unseen point x is determined by which side of the hyperplane it is on, i.e., sign(w · x − b) [27]; see Figure 2. An SVM is a linear classifier; however, a kernel can be applied first to support nonlinear classification. Kernels may result in overfitting when the number of input data points is too small, e.g., less than or equal to the degree of a polynomial kernel. To prevent this, we do not train the SVM until a sufficient number of data points are available. We use both polynomial and Gaussian kernels.

Intuitively, there is less uncertainty in a prediction the farther away a point is from the hyperplane, a distance given by |w · x − b|, and the uncertainty is highest between the support vectors (the points closest to the hyperplane).

Fig. 2. A Support Vector Machine classifier and the (α, β) calculation for uncertainty.

We define an (αi, βi) value for each band (hyper-volume) corresponding to the point i in our cached training set. Any point correctly classified closer to the hyperplane increments αi, while each point farther away that is incorrectly classified increments βi. For example, in Figure 2, the incorrectly predicted point k will increment βj, while correctly classifying a new point x would increment αi (and those of points farther away). The α and β values are only valid within their respective bands, inclusively. We use the points in the training set as our initial sample, and further update these quantities when we defer to the oracle because the uncertainty is too high. When we retrain, all α, β values are reset. The hyperplane itself has α = 1, β = 1.

In our setting, the data points represent users. We map a user to a k-dimensional space by converting their attributes into binary vectors. For example, consider a Department attribute of a company that has three values: R&D, Sales, and Marketing. These are represented by the binary vectors ⟨0, 0, 1⟩, ⟨0, 1, 0⟩, and ⟨1, 0, 0⟩, respectively. The final k-bit vector for a user is the concatenation of all attributes. In the remainder of this paper, when we refer to a subject s ∈ S, we are referring to a point in some k-dimensional space. Finally, there are |O × A| SVM classifiers, one per permission, trained on subjects s ∈ S. An alternative implementation is to define a single classifier whose input is S × O × A; however, based on the results from [6], some object-right requests have more uncertainty than others. To simplify calculating the uncertainty of the classified requests, we choose to define a classifier per permission. Each PDP based on an SVM will store a small number of data points (subjects and decisions) in a buffer for later retraining.
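The attribute encoding can be sketched as follows (hypothetical attribute values beyond the Department example; the mapping to per-permission classifiers is as described above):

# Map categorical user attributes to a concatenated one-hot vector: the k-dimensional
# point handed to the per-permission SVM classifiers.
ATTRIBUTE_VALUES = {
    "department": ["R&D", "Sales", "Marketing"],
    "location": ["US", "UK"],          # an illustrative second attribute
}

def encode_user(user):
    vec = []
    for attr, values in ATTRIBUTE_VALUES.items():
        vec += [1 if user.get(attr) == v else 0 for v in values]
    return vec

print(encode_user({"department": "Sales", "location": "UK"}))   # [0, 1, 0, 0, 1]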
Before predicting the decision using an SVM, we first check to see if the point is in the cached buffer. When it is, we return the known decision with zero uncertainty (assuming a static policy). When a new input and decision is specified by querying the oracle, it is provided to the SVM so that it may be retrained. When the number of samples exceeds the size of the buffer, a new SVM is trained and the point that is correctly classified and projected farthest from the hyperplane is discarded. Thus the buffer contains the points that most closely define the support vectors. We use polynomial (degree four) and radial basis function (RBF) kernels in our experiments.

We implement several baseline PDPs to compare against the risk-based PDP: a local PDP that will always defer to the central PDP (Default PDP), and a first-in-first-out cache (FIFO Cache), such that if the request is not in the cache, the oracle is queried. We use two FIFO cache sizes: (E) the size of the FIFO cache is equal to the total memory of the SVMs; (Inf) the FIFO cache is unbounded. We also implement three learning-based PDPs. The first one (SACMAT09) is a PDP consistent with the approach taken by Ni et al. [6], where we train an SVM for each permission and accept any decision they return, i.e., we assume there is zero uncertainty. The second one (Seeded SVM) is a PDP where an SVM is constructed for each permission and seeded with n random users granted the permission, and m random users not granted the permission. We train and test on the n + m users, and we use n = m = 10. Finally, the third one (Unseeded SVM) uses an untrained SVM created for each permission. Before the SVMs can be trained, a minimum number of allow and deny decisions must be specified by querying the oracle. The buffer of an unseeded SVM is the same as a seeded SVM. We note that our approach is not tied to any specific classifier, and we introduce a black box classifier as an example. This classifier randomly selects a level of uncertainty, and then returns a correct or incorrect decision sampled from that accuracy. The performance of our experiments further reinforces the validity of our assumption that similar requests receive similar decisions.

C. Evaluation Results

In our experiments, we first compare our general approach with a default policy that always contacts the central PDP, and with the two FIFO caches, the bounded cache (the same size as all SVMs combined) and the unbounded one (see the previous section for details). For these experiments we restrict ourselves to the seeded SVM settings and evaluate the results against the scenarios described in Figure 3 (cc = the cost of contacting the central PDP; g = the gain for a true-allow; d = the damage of a false-deny; l = the loss of a false-allow; for space limitations we only show tables where true-denies have 0 gain and 0 loss). The figure shows that, regardless of the parameters, our approach behaves well, i.e. calls to the central PDP are reduced (in some cases by as much as 75%, see Figure 3(a)) and the utility is better than in any of the other standard approaches. This confirms the independence of our approach from the setting of the parameters, making it applicable to many situations.

Fig. 3. Evaluation under different system parameters (utility and central queries versus transaction number; curves: Default PDP, Seed (1,1) SVM, Seed (n,m) SVM, Simple FIFO (E), Simple FIFO (Inf)). (a) Small cc, g = 2cc, d = l = 2g; (b) small cc, g = 4cc, d = 10g, l = 0; (c) small cc, g = 10cc, d = 2cc, l = 10g.

Fig. 4. Risk-based classifiers outperform traditional caching with high cache-miss rates. (a) 30% valid requests; (b) 50% valid requests.

Next, we consider the performance impact of the ratio of invalid versus valid requests on the classifiers. We evaluate this for two reasons. First, the ratio in a typical deployment is unknown. Second, [6] shows that SVMs result in more false-denies than false-allows, so varying the ratio may reveal useful insights. We found that the sampling rate of valid versus invalid requests impacts the rate of cache hits for the FIFO implementations, greatly affecting their performance. When the number of cache hits was low, for example when sampling more invalid requests, we explore a larger amount of the possible request space, resulting in more cache misses before a hit is returned. Conversely, at a high rate of valid requests, a larger fraction of the valid sampled requests reside in the cache (in part due to the smaller sampling size, because the fraction of allowed requests is so low). The higher number of false-denies also results in increased uncertainty, causing the SVMs to be retrained more frequently, and thus deferring more requests. The results are shown in Figure 4.

Next, we consider the potential impact of trusting the SVM classifiers without first evaluating their uncertainty, as is done in [6]. Here, the false-allows and false-denies can result in devastating losses. To illustrate this, we use the decision resolver (SACMAT09) described earlier, which accepts any predicted decision regardless of the risk. Figure 5 illustrates the potential decrease in utility caused by a small number of false-positives in the classifier.

Fig. 5. The impact of not evaluating the classifier risk.

Finally, we evaluate how well the systems perform when the communication costs become prohibitive. Here, we set the cost to contact the oracle to be five times the gain from allowing a valid request, and set the damage to be four times the communication cost. This type of scenario may correspond to a military setting where communication is expensive, e.g., requiring a satellite link, or where radio usage should be minimized to avoid triangulation by the enemy. The results, shown in Figure 6, illustrate that while the risk-based approach outperforms the FIFO caches and reduces queries by 65%, it cannot keep the utility positive. However, it should be noted that the FIFO caches lose over three times more utility due to the communication costs.

Fig. 6. Communication costs are 5 times the gains, and damages are 10 times the gains.

VI. CONCLUSION

This paper presents a new learning and risk-based architecture for distributed policy enforcement under uncertainty. The trade-off between the uncertainty and utility of decisions determines whether they can be taken locally or the central PDP must be queried. We present three approaches, Expected Utility, Risk Adjusted Utility, and Independent Risk Constraints, each representing a different treatment of risk. Our approach is validated using data from a large corporation, and consistently performs better than a naive caching mechanism in a number of different scenarios. Under ideal conditions, the rate of central PDP queries is reduced by up to 75% and an eight-fold increase in system utility is seen. Such improvements occur when the cache hit rate is low. We plan to compare our method with the inference techniques cited in Section II. Dynamic access control policies could also be considered, where the correct decision for each request can change over time. In this context, the uncertainty returned by the proposer must combine uncertainty due to classifier error with uncertainty due to possible policy updates. Finally, additional research is needed to better determine when a classifier should be retrained and when a given sample should be discarded, which is particularly challenging when dealing with dynamic policy changes.

ACKNOWLEDGMENT
This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

REFERENCES

[1] Entrust, "GetAccess design and administration guide," September 1999.
[2] G. Karjoth, "Access control with IBM Tivoli Access Manager," ACM Trans. Inf. Syst. Secur., 2003.
[3] Netegrity, "SiteMinder concepts guide," 2000.
[4] J. Crampton, W. Leung, and K. Beznosov, "The secondary and approximate authorization model and its application to Bell-LaPadula policies," in Proceedings of the 11th ACM Symposium on Access Control Models and Technologies, 2006.
[5] Q. Wei, J. Crampton, K. Beznosov, and M. Ripeanu, "Authorization recycling in RBAC systems," in Proceedings of the 13th ACM Symposium on Access Control Models and Technologies, 2008.
[6] Q. Ni, J. Lobo, S. B. Calo, P. Rohatgi, and E. Bertino, "Automating role-based provisioning by learning from examples," in SACMAT, 2009.
[7] L. LaPadula and D. Bell, "Secure computer systems: A mathematical model," Journal of Computer Security, 1996.
[8] D. F. Ferraiolo and D. R. Kuhn, "Role-based access control," in Proceedings of the 15th National Computer Security Conference, 1992.
[9] D. F. C. Brewer and M. J. Nash, "The Chinese Wall security policy," in Proceedings of the IEEE Symposium on Security and Privacy, 1989.
[10] P. Bonatti and P. Samarati, "Regulating service access and information release on the web," in Proceedings of the 7th ACM Conference on Computer and Communications Security, 2000.
[11] L. Wang, D. Wijesekera, and S. Jajodia, "A logic-based framework for attribute based access control," in Proceedings of the 2004 ACM Workshop on Formal Methods in Security Engineering, 2004.
[12] Y. Park, S. C. Gates, W. Teiken, and S. N. Chari, "System for automatic estimation of data sensitivity with applications to access control and other applications," demo at SACMAT '11, 2011.
[13] K. Borders, X. Zhao, and A. Prakash, "CPOL: high-performance policy evaluation," in Proceedings of the 12th ACM Conference on Computer and Communications Security, 2005.
[14] M. Wimmer and A. Kemper, "An authorization framework for sharing data in web service federations," in Secure Data Management, 2005.
[15] Q. Wei, M. Ripeanu, and K. Beznosov, "Cooperative secondary authorization recycling," in Proceedings of the 16th International Symposium on High Performance Distributed Computing, 2007.
[16] M. V. Tripunitara and B. Carbunar, "Efficient access enforcement in distributed role-based access control (RBAC) deployments," in Proceedings of the 14th ACM Symposium on Access Control Models and Technologies, 2009.
[17] A. Rosenthal and E. Sciore, "Administering permissions for distributed data: factoring and automated inference," in Proceedings of the Fifteenth Annual Working Conference on Database and Application Security, 2002.
[18] S. Rizvi, A. Mendelzon, S. Sudarshan, and P. Roy, "Extending query rewriting techniques for fine-grained access control," in Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, 2004.
[19] L. Zhang, A. Brodsky, and S. Jajodia, "Toward information sharing: Benefit and Risk Access Control (BARAC)," in 7th IEEE International Workshop on Policies for Distributed Systems and Networks (POLICY '06), 2006.
[20] P.-C. Cheng, P. Rohatgi, C. Keser, P. A. Karger, G. M. Wagner, and A. S. Reninger, "Fuzzy multi-level security: An experiment on quantified risk-adaptive access control," in IEEE Symposium on Security and Privacy, 2007.
[21] I. Molloy, P.-C. Cheng, and P. Rohatgi, "Trading in risk: Using markets to improve access control," in NSPW, 2008.
[22] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, 2007.
[23] I. Molloy, N. Li, J. Lobo, Y. A. Qi, and L. Dickens, "Mining roles with noisy data," in SACMAT '10, 2010.
[24] M. Frank, A. Streich, D. Basin, and J. Buhmann, "A probabilistic approach to hybrid role mining," in CCS '09: Proceedings of the 16th ACM Conference on Computer and Communications Security, 2009.
[25] Visa Debit Processing Service, "Transaction Processing, Authorization Processing," http://www.visadps.com/services/authorization processing.html.
[26] "Card-Present Merchants," Visa USA, http://usa.visa.com/merchants/risk management/card present.html.
[27] V. N. Vapnik, Statistical Learning Theory. Wiley-Interscience, September 1998, ISBN 9780471030034.
