
J Supercomput (2013) 65:185–216
DOI 10.1007/s11227-011-0711-4

Ontology as a Service (OaaS): a case for sub-ontology merging on the cloud


Andrew Flahive · David Taniar · Wenny Rahayu

Published online: 20 October 2011
© Springer Science+Business Media, LLC 2011

Abstract Cloud computing is a revolution in the information technology industry. It allows computing services to be provided as utilities. The traditional cloud services include Software as a Service, Platform as a Service, Hardware/Infrastructure as a Service, and Database as a Service. In this paper, we introduce the notion of Ontology as a Service (OaaS), whereby the ontology tailoring process is a service in the cloud. This is particularly relevant as we are moving toward Cloud 2.0, in which multiple cloud providers provide an interoperable service to customers. To illustrate OaaS, in this paper we propose sub-ontology extraction and merging, whereby multiple sub-ontologies are extracted from various source ontologies, and these extracted sub-ontologies are then merged to form a complete ontology to be used by the user. We use the Minimum extraction method to facilitate this. A walkthrough case study using the UMLS metathesaurus ontology is elaborated, and its performance in the cloud is also discussed.

Keywords Cloud computing · Ontology

1 Introduction

Cloud computing is nowadays receiving increasing attention from the information technology industry, as it is regarded by many as a revolution in computing practices
A. Flahive · W. Rahayu, Department of Computer Science and Computer Engineering, La Trobe University, Bundoora, Victoria 3086, Australia; e-mail: W.Rahayu@latrobe.edu.au
A. Flahive, e-mail: apahive@gmail.com
D. Taniar, Clayton School of Information Technology, Monash University, Clayton, Victoria 3800, Australia; e-mail: David.Taniar@monash.edu


A. Flahive et al.

through the concept of utility computing, that is, computing services provided as utilities [30, 36]. An increasing number of service providers have arisen in the market. Major players in cloud computing include Google, Amazon, and Microsoft. They provide various services, from computing power and storage to various applications. These cloud providers are enterprise clouds: enterprises that provide fee-for-service utilities.

Current services in the enterprise cloud exist in various formats. Software as a Service (SaaS) is a model whereby an application or piece of software hosted by a service provider is made accessible to customers or users through the Internet. Because the software is hosted off-site, the customer does not have to maintain or support it. The customer pays the host only when the software is used.

Following the SaaS model, Platform as a Service (PaaS) is another model whereby the host supplies all the resources (or platform) required to build applications, and the services are completely accessible through the Internet without the need to download or install software. PaaS includes services that cover the entire lifecycle of application development, from design, coding, testing, and deployment to hosting. The current stage of cloud technology still prevents a smooth transfer of a customer's platform from one cloud to another, should the customer decide to switch between clouds.

Not only software and platforms are provided as a service in the cloud; hardware and infrastructure are also a service. Hardware as a Service (HaaS), sometimes known as Infrastructure as a Service (IaaS), is a model whereby the host provides the hardware and infrastructure to the customers. Basically, HaaS/IaaS allows customers/users to rent any infrastructure resources, such as servers, network, memory, CPU cycles, storage space, etc.
These infrastructures can be dynamically adjusted depending on the needs of the applications, and resources are billed on a utility computing basis.

Databases are often regarded as a service (i.e. Databases as a Service, DaaS), whereby the host offers to host the entire database system of the customers; consequently, the complexity and cost of running a database are no longer borne by the customers/users, but by the cloud host.

All of these services are undeniably the core of cloud computing. However, cloud computing technology is progressing rapidly. Current cloud technology follows a single-enterprise-centric model, whereby one particular service for a customer is provided by one enterprise cloud. For example, PaaS allows a customer to have its platform service provided by one cloud provider. If, for instance, the customer needs a database from another cloud, a particular piece of software from another cloud, or particular hardware from a different cloud, smooth collaboration between these clouds to support the same customer need is still in its infancy.

It is expected that in the near future there will be a second wave of cloud computing, perhaps Cloud 2.0, in which multiple cloud providers provide an interoperable service to customers [36]. This may mean that the storage is in one cloud, the application is in a different cloud, and the hardware is in yet another cloud. As a result, this model promotes reusability and shareability. Customers are encouraged to search for and then use the best services from various clouds. With this philosophy in mind, in this paper we introduce the concept of Ontology as a Service (OaaS).


Ontologies are undoubtedly the lifeblood of many domain-specific applications. Some of the largest ontologies currently in use are the Gene Ontology [18, 19], the UMLS ontology [35], and the WordNet ontology [29, 37]. Each of these ontologies represents the explicit knowledge that is understood by those in the particular domain. Although ontologies represent the knowledge of a domain, using complex interconnecting relationships between seemingly disparate pieces of information, more completely than any other approach, accessing the knowledge within is not a simple task. In other words, large ontologies are very difficult to use in their entirety, largely because of their size and the complexity of the information they contain. Therefore, extracting smaller sub-ontologies, or ontology tailoring, makes the task of reusing ontologies easier and more beneficial.

It is then expected that large ontologies, such as UMLS and WordNet, will be the source of smaller, more specialized ontologies tailored for a certain application. It is naturally expected that the sources will be multiple large ontologies, from each of which a sub-ontology is extracted; these are then merged to create a smaller, tailored ontology for a specific domain. Therefore, extracting sub-ontologies, as well as merging them, is a primary process.

OaaS provides ontology as a service, whereby users are able to reuse and extract existing ontologies on the cloud, as well as merge the extracted sub-ontologies. This service is made possible through the multi-cloud platform, which is a vision to be fulfilled in the not too distant future. Different cloud platforms may host different source ontologies, and other cloud platforms may provide the algorithms to extract and merge sub-ontologies.

1.1 Scope of project

Our project on OntoCloud is an ongoing project tackling how ontologies are extracted and further processed, tailored for a specific application domain. Figure 1 shows the scope of our project.
The foundation of this project lies in sub-ontology extraction, for which we defined the processes by which a sub-ontology can be extracted. Sub-ontology extraction is often referred to as Reuse and Extract of ontology, in which parts of the source ontology are reused in the extracted sub-ontology. We have published this in Algorithmica [5]. Once sub-ontologies are extracted from the source ontology, they can be further processed. Combining this with the concept of ontology reuse, we have identified four sub-ontology operations, namely: (i) Reuse, Extract and Extend, (ii) Reuse, Extract and Add, (iii) Reuse, Extract and Merge, and (iv) Reuse, Extract and Replace. For simplicity, we use the terms Extend, Add, Merge, and Replace, respectively.

Each of these four ontology tailoring techniques is developed using two extraction methods, the Minimum and Maximum Extraction methods, which aim to make the extraction process more suitable to the cloud environment by enabling a choice of extraction method based on a particular cloud platform. The Maximum Extraction method aims to extract as many elements from the source ontology as possible in the initial stage, removing the need to revisit the source ontology. The Minimum Extraction method aims to extract as few elements as possible in the initial stage but may require further services from the cloud.


Fig. 1 Our project on OntoCloud: ontology engineering for the cloud

Figure 1 depicts the parts of the project that we have published. The Extend part of the project focusses on extending a sub-ontology with new concepts and relationships. The main rationale is that the extracted sub-ontology is normally not extensive enough to be used in its entirety in a particular domain, and hence it is necessary to add new concepts. We have published this in the Computer Standards and Interfaces journal [14].

The Add part of the project focusses on adding a sub-ontology to another complete ontology. This is often needed when the user is already using an ontology in their application but needs to extend it with terms and concepts extracted from the source ontology. The maximum and minimum methods for this process have been published in the Logic Journal of IGPL [15].

The focus of this paper is on the Merge part of the project, whereby multiple extracted sub-ontologies are merged to form a complete ontology to be used by the user. Due to the length of the proposed method, in this paper we focus only on the Minimum Extraction method of the Merge process. Other parts of the project (e.g. Maximum Extend and Merge, and the complete Replace) are being completed and their results will soon appear in various publication outlets.

Before the proposed methods are described, it is critical to have a deep understanding of the concept of ontologies that we are using and of the sub-ontology extraction methods. These are described in the Preliminaries section of this paper, followed by our proposed method. We also illustrate our proposed method in a walkthrough case study using the UMLS metathesaurus example. An analysis of the performance and applicability of our proposed method in a cloud environment is presented as well.


2 Preliminaries and our previous work

Our proposed method for ontology merging requires a complete understanding of two important concepts: (i) the concept of complete and valid ontologies, and (ii) sub-ontology extraction. The former can be used as a metric to check whether or not the extracted sub-ontology is valid and complete, whereas the latter defines the process that will be used as the basis for our proposed ontology merging.

2.1 Complete and valid ontologies

When extracting a sub-ontology from a big ontology, one important criterion for the sub-ontology is that it must be complete and valid. Our previous work has elaborated these two concepts [15]. For the sake of completeness of this paper, however, in this section we reiterate the concepts of complete and valid, because they are the foundation of ontology engineering, in particular sub-ontology extraction.

To determine the completeness of an ontology, certain semantic checks must take place. These checks ensure that the ontology maintains the meaning of the information extracted. Once an ontology is said to be complete, it is assumed to be semantically correct and accepted, based on the meaning of its contents.

Definition 1 A complete ontology (\(O\)) is defined as an ontology that is semantically perfect, with no ambiguous terms or notions. A complete ontology is one that is accepted based on the meaning of its contents rather than its interconnectivity.

Whereas a complete ontology is one that is semantically perfect, a valid ontology is one where all of its interconnections are architecturally perfect. As an ontology can be viewed as an ontology graph, a valid ontology is described as one that adheres to proper connected graph definitions.

Definition 2 A valid ontology (\(O\)) is defined as an ontology that can be viewed as a connected graph, with every concept connected to every other concept through one or more relationships.
Checking for a complete and valid ontology ensures that the ontology conforms to all of the rules regarding the meaning of its contents as well as the interconnectivity of its elements.

Ontologies are made up of four main elements: (i) concepts, (ii) properties, (iii) property mappings, and (iv) relationships. Different ontology languages specify numerous features that make up an ontology, but most of these languages comprise these four basic elements. These four elements are used in an ontology to describe relevant objects, relationships and information from the real world.

Definition 3 An ontology is made up of a set of Concepts (\(C\)), Properties (\(P\)), Property Mappings (\(T\)) and Relationships between the Concepts (\(R\)). Let \(O\) define an ontology, let \(C\) define the set of concepts in the ontology, let \(P\) define the set of properties of the concepts, let \(T\) define the set of property mappings,

Fig. 2 All Elements: Concepts (\(C\)), Properties (\(P\)), Property Mappings (\(T\)) and Relationships (\(R\))


mapping properties to concepts, and let \(R\) define the set of relationships that relate one concept to another:

\[ O = \{C, P, T, R\} \]

Concepts are the nodes or objects that identify something that exists. Relationships are used to indicate a similarity between two concepts within an ontology. They can either link two concepts together or loop back and link to the same concept. Properties provide extra features used to identify the concept. The property mapping element is similar to a relationship element, but it links a property to a concept rather than one concept to another. The four elements are illustrated in Fig. 2.

This ontology definition describes an ontology in terms of the elements that comprise it. However, the presence of these elements alone does not necessarily mean that they make up a complete and valid ontology. An ontology will be defined using an ontology graph. Before ontology graphs are introduced, a definition of a graph is provided:

Definition 4 A Graph (\(G\)) is a set of vertices (\(V\)) and edges (\(E\)). A vertex is usually conceptualized as a point. An edge is usually conceptualized as a line connecting two points or connecting one point to itself. Hence, Graph \(G = \{V, E\}\).

Definition 5 A path in a graph is a sequence of alternating vertices (\(V\)) and edges (\(E\)) in which no edges or vertices are used twice and each edge connects two different vertices. Hence, Path \(P = \{V, E\}\).

An ontology contains concepts and relationships that are all connected in some way. If there are some concepts that are disconnected, then these concepts would form a second ontology and hence fail to form a single valid ontology. A single ontology

Fig. 3 A connected Graph (\(G\))


cannot be in more than one part. A graph with the property of connectedness is called a connected graph. This is one where every vertex is connected to every other vertex in the graph via a path, \(P\). An example of a connected graph is shown in Fig. 3.

Definition 6 A Graph \(G\) is connected if every vertex is connected to every other vertex by a number of interconnecting edges.

An ontology that can be mapped as a graph is called an Ontology Graph. An ontology graph is a simplified version of an ontology, and all ontologies can be mapped to an ontology graph.

Definition 7 An ontology graph (\(G_o\)) is a connected graph derived from an ontology (\(O\)). Given an Ontology \(O = \{C, P, T, R\}\) and a Graph \(G = \{V, E\}\),

\[ G_o = \{V_o, E_o\}, \quad \text{with } V_o \subseteq C \cup P \text{ and } E_o \subseteq T \cup R. \]
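As an illustration only, the mapping from an ontology to an ontology graph and the connectedness check can be sketched in Python. The names `Ontology`, `ontology_graph` and `is_connected` are assumptions made for this sketch and do not come from the paper. The sketch also assumes that every concept and property becomes a vertex, that every edge is undirected, and that an empty graph is not connected.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """O = {C, P, T, R}; field names are illustrative, not taken from the paper."""
    concepts: set = field(default_factory=set)            # C
    properties: set = field(default_factory=set)          # P
    property_mappings: set = field(default_factory=set)   # T: (property, concept) pairs
    relationships: set = field(default_factory=set)       # R: (concept, concept) pairs


def ontology_graph(o):
    """Map an ontology to (V, E), taking V = C ∪ P and E = T ∪ R."""
    return o.concepts | o.properties, o.property_mappings | o.relationships


def is_connected(vertices, edges):
    """Breadth-first search that treats every edge as undirected."""
    if not vertices:
        return False  # assumption: an empty graph is not a valid ontology graph
    neighbours = {v: set() for v in vertices}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(vertices))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == vertices


o = Ontology(
    concepts={"C1", "C2", "C3"},
    properties={"P1"},
    property_mappings={("P1", "C1")},
    relationships={("C1", "C2")},
)
print(is_connected(*ontology_graph(o)))   # False: C3 is not linked to anything
o.relationships.add(("C2", "C3"))
print(is_connected(*ontology_graph(o)))   # True once C3 is linked
```

Only validity is checked here. Completeness is a semantic property and is not something a graph traversal can decide.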

Given the definition of an ontology in Definition 3, it is assumed that all ontologies can be mapped to an Ontology Graph (\(G_o\)) in this way. To enable an ontology to be checked for validity, its Concepts (\(C\)) and Properties (\(P\)) must become vertices and its Property Mappings (\(T\)) and Relationships (\(R\)) must become edges. Therefore, we use graph theory techniques to assist with checking the validity of ontologies.

2.2 Sub-ontology extraction

Sub-ontology extraction, which extracts a sub-ontology from a bigger ontology, is the main foundation of ontology tailoring. Our previous work has elaborated various optimization schemes for sub-ontology extraction [38]. In this section, we only summarize the important points of sub-ontology extraction that will be used as the basis for our sub-ontology merging.

The first step in extracting an ontology is to identify which individual elements are required to be reused and which should not be reused. This is done by attaching a list of associated labels to the elements within the ontology, specifying whether they should or should not be present in the final solution. This process of selecting/deselecting which elements are to be reused is called labelling. The process of labelling is defined as follows.


Definition 8 Labelling of an ontology defines the set of concepts, properties, property mappings and relationships that should or should not be reused in the resulting sub-ontology.

1. A Selected labelling indicates that the element is required to be reused and will be extracted as part of the sub-ontology.
2. A Deselected labelling indicates that the element should not be a part of the extracted sub-ontology.
3. An Undecided labelling indicates that the element is not required at this time but may be changed to selected or deselected if required to form a complete and valid sub-ontology.

Let \(L\) be the set of Labels, \(L = \{\text{selected}, \text{deselected}, \text{undecided}\}\).

Definition 9 Let \(\lambda\) define when a set of elements or an entire ontology has had labelling applied to each of its elements. The elements of an Ontology (\(O\)) are said to be labelled when

\[ \lambda(O) = \lambda(C), \lambda(P), \lambda(T), \lambda(R). \]

Once the elements within the ontology have been labelled with selected, deselected, and undecided, \(\lambda(O)\), they are ready to be extracted. However, certain checks must take place before extraction to ensure the result is a complete and valid sub-ontology.

Notation 1 If an element \(x\) within an entity is given a label (\(l\)), it can be represented as \(\lambda(x) = l\). Here labelling (\(\lambda\)) is applied to Ontology (\(O\)): \(x \in X \in \{C, P, T, R\}\).

Where certain labelled entities need to be described, the following notation is used.

Notation 2 Let \(\lambda_s\) denote all selected entities, let \(\lambda_d\) denote all deselected entities and let \(\lambda_u\) denote all undecided entities.

Given these notations and a labelling (\(\lambda\)) of Ontology (\(O\)), \(\lambda_s\) can only become a sub-ontology after it has been checked for completeness and validity, as described in Definitions 1 and 2. Therefore, being complete and valid means the following.

Notation 3 If \(\lambda_s\) is a complete and valid ontology, then \(\lambda_s \mathrel{\$} O\), where the symbol $ is used to indicate the existence of a sub-ontology.
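The labelling of Definition 8 can be pictured as a mapping from ontology elements to one of three labels. The following Python sketch is illustrative only: the names `Label`, `with_label` and `is_labelled`, and the example concepts, are assumptions and not part of the paper.

```python
from enum import Enum


class Label(Enum):
    SELECTED = "selected"
    DESELECTED = "deselected"
    UNDECIDED = "undecided"


def with_label(labelling, wanted):
    """All elements that carry the given label."""
    return {x for x, label in labelling.items() if label is wanted}


def is_labelled(elements, labelling):
    """True when every element has been given a label."""
    return all(x in labelling for x in elements)


# Hypothetical elements; a relationship is written as a (concept, concept) pair.
concepts = {"C1", "C2", "C3"}
relationships = {("C1", "C2"), ("C2", "C3")}

labelling = {
    "C1": Label.SELECTED,
    "C2": Label.UNDECIDED,
    "C3": Label.SELECTED,
    ("C1", "C2"): Label.UNDECIDED,
    ("C2", "C3"): Label.DESELECTED,
}

print(is_labelled(concepts | relationships, labelling))
print(with_label(labelling, Label.SELECTED))

# An undecided element may later be promoted if it turns out to be needed.
labelling["C2"] = Label.SELECTED
```

Keeping the label on the element, rather than keeping three separate sets, makes it cheap to change an undecided element to selected or deselected later in the process.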


Once the labelling has been applied to the base ontology, a number of concepts, properties, property mappings and relationships can be extracted as a sub-set of the base ontology. A sub-set cannot automatically be classed as a sub-ontology, but a sub-ontology can be classed as a sub-set.

Definition 10 A Sub-set (\(S\)) of an ontology (\(O\)) is a part or section of that ontology, containing a sample of its concepts, properties, property mappings and relationships. Let \(C_s \subseteq C\), let \(P_s \subseteq P\), let \(T_s \subseteq T\) and let \(R_s \subseteq R\) such that

\[ S = \{C_s, P_s, T_s, R_s\}. \]

A Sub-set (\(S\)) of an Ontology (\(O\)) is hereon defined as \(S \subseteq O\).

As there are no rules governing the validity of a sub-set of an ontology, a sub-set does not necessarily conform to the constraints of a complete and valid ontology and thus cannot be assumed to be a complete and valid sub-ontology. When attempting to reuse a domain ontology, it is important that the extracted sub-ontology is a complete and valid sub-ontology in its own right, such that certain ontology operations can be applied to it with confidence.

Definition 11 Let \(O'\) be a complete and valid sub-ontology of ontology (\(O\)), where \(C' \subseteq C\), \(P' \subseteq P\), \(T' \subseteq T\) and \(R' \subseteq R\):

\[ O' = \{C', P', T', R'\}. \]

A sub-ontology (\(O'\)) of ontology (\(O\)) is therefore defined as \(O' \mathrel{\$} O\).

Notation 4 Where \(S\) is a sub-set of \(O\) and \(O'\) is a complete and valid sub-ontology of \(O\), we have the following: if \(S \neq O'\), then \(S \mathrel{!\$} O\), where !$ indicates the negation of $.

Sub-sets and sub-ontologies are very different but share many elements from the original ontology. Their main difference is that the elements of a sub-ontology (\(O'\)) form a complete and valid ontology, whereas a sub-set (\(S\)) does not necessarily conform to this standard. A sub-set may contain a mixture of unrelated pieces of information extracted from a large ontology. A sub-ontology also contains pieces of information from a large ontology, but the pieces of information are structured and form a complete and valid ontology in their own right.
A sub-ontology is essentially an ontology that is derived from a larger ontology. A sub-set is also derived from a large ontology, but it is not necessarily an ontology itself.
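The difference between a sub-set and a sub-ontology can be made concrete with a small Python sketch. It is a sketch under stated assumptions, not the authors' implementation: the dictionary layout with keys "C", "P", "T", "R" and the helper names `is_subset` and `is_valid` are invented for illustration. Only the validity half of "complete and valid" is checked, since completeness is semantic, and an edge that points outside the sub-set is treated here as making it invalid.

```python
def is_subset(s, o):
    """Component-wise check: C_s ⊆ C, P_s ⊆ P, T_s ⊆ T, R_s ⊆ R."""
    return all(s[k] <= o[k] for k in ("C", "P", "T", "R"))


def is_valid(o):
    """Validity only: the vertices C ∪ P must be connected through the edges T ∪ R."""
    vertices = o["C"] | o["P"]
    edges = o["T"] | o["R"]
    if not vertices or any(a not in vertices or b not in vertices for a, b in edges):
        return False  # empty, or an edge points outside the element set
    reached, frontier = set(), [next(iter(vertices))]
    while frontier:
        v = frontier.pop()
        if v in reached:
            continue
        reached.add(v)
        frontier += [b for a, b in edges if a == v]
        frontier += [a for a, b in edges if b == v]
    return reached == vertices


# Hypothetical source ontology and two candidate extractions.
source = {"C": {"C1", "C2", "C3"}, "P": {"P1"},
          "T": {("P1", "C1")}, "R": {("C1", "C2"), ("C2", "C3")}}
scattered = {"C": {"C1", "C3"}, "P": set(), "T": set(), "R": set()}
connected = {"C": {"C1", "C2"}, "P": {"P1"},
             "T": {("P1", "C1")}, "R": {("C1", "C2")}}

print(is_subset(scattered, source), is_valid(scattered))   # a sub-set, but not valid
print(is_subset(connected, source), is_valid(connected))   # a sub-set that is also valid
```

Both candidates are sub-sets of the source, but only the second is connected and could therefore qualify as a sub-ontology once completeness has also been confirmed.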


Proposition 1 A sub-set (\(S\)) of ontology (\(O\)) can only be equivalent to a sub-ontology (\(O'\)) of ontology (\(O\)) if, given \(O' = \{C', P', T', R'\}\) and \(S = \{C_s, P_s, T_s, R_s\}\),

\[ S = O' \text{ only if } \{C' = C_s\} \wedge \{P' = P_s\} \wedge \{T' = T_s\} \wedge \{R' = R_s\}, \quad \text{otherwise } S \neq O'. \]

When the sub-set contains exactly the same elements as the sub-ontology, the sub-set (\(S\)) is equal to the sub-ontology (\(O'\)). If they are not equal, the sub-set (\(S\)) is not a sub-ontology of (\(O\)).

Just as labelled ontologies are checked for completeness, they must also be checked for validity before the extraction of a complete and valid sub-ontology. Validating the labelled ontology involves checking the interconnectedness of the ontology. As such, the ontology is mapped to an ontology graph, as per Definition 7. There are several checks that must take place on the extracted ontology graph to ensure the result of the extraction is a valid connected sub-ontology.

The first restriction is that of a single ontology graph, as there cannot be more than one ontology graph classed as a single ontology. As ontologies exist to specify a domain of existence, any unconnected ontology can be seen as not being part of the domain and thus being a separate ontology in its own right.

Definition 12 An Ontology Graph (\(G_o\)) is a graph representation derived from a single ontology. A single valid ontology graph cannot have separate ontologies included in its domain. Ontologies that are not connected in an Ontology Graph are deemed to be separate ontologies.

Similarly, a single valid Ontology Graph should not contain any sub-parts that are not linked in any way to the main ontology graph. Such sub-parts are called Islands. Islands are caused when certain elements from the original ontology are not selected to be included in the resulting sub-set, leaving some elements physically separated from the main ontology graph.
Definition 13 An island (\(G_i\)) is a small ontology graph derived from the original ontology (\(O\)), along with the main ontology graph (\(G_o\)), which is physically disconnected from the main ontology graph (\(G_o\)). Here \(\emptyset\) is defined as NULL. Given Ontology \(O = \{C, P, T, R\}\) and Ontology Graph \(G_o = \{V_o, E_o\}\), an Island is defined as

\[ G_i = \{V_i, E_i\}, \quad \text{with } V_i \neq \emptyset, \; G_i \cap G_o = \emptyset, \; V_i \subseteq C \cup P, \; E_i \subseteq T \cup R. \]

Islands (\(G_i\)) are sub-ontologies. If these islands are required to belong to one ontology solution, then a relationship between them must be found. Islands must be


reconnected to the main ontology graph via a path of vertices and edges from the original ontology or removed altogether. This will result in a single, valid and connected sub-ontology graph (\(O'_o\)). If they are not reconnected, then the solution fails to be a single valid ontology.

A hybrid link is often used to connect an island to the main ontology graph. A hybrid link is a path that links one concept to another. A hybrid link only contains enough concepts and relationships to connect the two concepts together. No other information is sought. This means that hybrid links are not semantically complete. This is not often an issue, though, as these elements are not part of the initial criteria and are of lower significance in the final solution.

Definition 14 A hybrid link (or a hybrid relationship) is a path (\(P\)) of concepts (\(C\)) and relationships (\(R\)) that is used to connect islands (\(G_{i(0 \ldots n)}\)) to the main ontology graph (\(G_o\)):

\[ HL = P = \{C, R\} \]

If semantic completeness were mandatory for all elements in the extracted sub-ontology, then a final completeness check would be required to ensure that all hybrid links are semantically complete.

Proposition 2 For a Sub-Ontology Graph (\(O'_o\)) to be a valid mapping of an Ontology (\(O\)) there must exist no Islands (\(G_i\)). To resolve this, either more elements must be retrieved from the original ontology (\(O\)) to attach all Islands to the main Ontology Graph (\(G_o\)) through a Hybrid Link (\(HL\)), or the Islands must be removed. This will ensure a completely connected single valid ontology graph (\(G_o\)). We have

\[ O'_o = G_o + (G_{i1} + HL_1) + (G_{i2} + HL_2) + \cdots + (G_{in} + HL_n) \]

or

\[ O'_o = G_o - G_{i1} - G_{i2} - \cdots - G_{in} \]

or any combination of the two that results in all Islands being addressed. In most cases, though, reconnection through hybrid links will be used, as islands were first created because elements in that island were required in the solution and hence must exist in the final ontology graph.
Notation 5 If the main ontology graph (\(G_o\)) is equal to the derived sub-ontology graph (\(O'_o\)), then (\(G_o\)) contains only one ontology graph and no Islands: if \(G_o + G_i = G_o\), then \(G_o = O'_o\).
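Island detection and hybrid links can be sketched with standard graph routines. The Python below is a minimal sketch, not the authors' algorithm: islands are taken to be every connected component other than the largest one, and a hybrid link is taken to be a shortest path in the source ontology found by breadth-first search. The function names `components` and `hybrid_link` and the five-concept chain are assumptions made for illustration.

```python
from collections import deque


def components(vertices, edges):
    """Connected components of an undirected graph, largest first."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)
    seen, parts = set(), []
    for v in sorted(vertices):
        if v in seen:
            continue
        part, queue = {v}, deque([v])
        while queue:
            for n in adj[queue.popleft()] - part:
                part.add(n)
                queue.append(n)
        seen |= part
        parts.append(part)
    return sorted(parts, key=len, reverse=True)


def hybrid_link(source_edges, island, main):
    """Shortest path in the source ontology from the island to the main graph."""
    adj = {}
    for a, b in source_edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    parent = {v: None for v in island}
    queue = deque(sorted(island))
    while queue:
        v = queue.popleft()
        if v in main:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]          # island end first, main-graph end last
        for n in sorted(adj.get(v, ())):
            if n not in parent:
                parent[n] = v
                queue.append(n)
    return None                         # no path: the island can only be removed


# Hypothetical source ontology: a chain A - B - C - D - E.
source_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
subset_vertices = {"A", "B", "E"}       # the selected concepts
subset_edges = [("A", "B")]             # the selected relationships

main, *islands = components(subset_vertices, subset_edges)
link = hybrid_link(source_edges, islands[0], main)
print(main, islands, link)
```

In this example the island containing E is reattached through C and D, which were not selected; they are exactly the extra elements a hybrid link pulls in.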

Fig. 4 A labelled ontology


3 Proposed ontology merging using minimum extraction method

3.1 Minimum extraction method

As described earlier in the Introduction, there are two extraction methods for ontology merging: (i) Minimum extraction, and (ii) Maximum extraction. In this paper, we focus on the Minimum extraction method, which aims to extract as few elements as possible in the initial stage, but may require further visits to the original ontology in order to extract more concepts and relationships.

This method extracts the bare minimum of elements from the large base ontology. After further operations are completed, further elements may be required. At this time the elements are requested and retrieved from the original domain ontology. The benefit is that potentially less data is retrieved if good quality labelling was applied. The drawback is that there may be further communication with the base ontology if further elements are required. These requests for more elements may reduce the overall efficiency of the ontology tailoring process if too many extra elements are required. In the end, the result is a complete and valid solution as per Definitions 1 and 2.

Given an initial labelling of ontology 1, as in Fig. 4, a minimum extraction approach to this labelled ontology would look to extract only the concepts and relationships that have been specifically selected. This would include concepts C10, C11, C16, C17, C18, C19 and the connecting relationships. These extracted elements are shown in Fig. 5. The next step, however, involves determining which other elements will be required to make a complete and valid sub-ontology. This means returning to the original labelled ontology to retrieve concepts C12, C13, C14, C15 and the connecting relationships. This will form a valid sub-ontology, but other elements may be required to achieve a complete solution. The final complete and valid sub-ontology is shown in Fig. 6.
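The two-stage behaviour described above can be simulated in Python. This is a sketch under a loudly stated assumption: the topology of Fig. 4 is not reproduced in the text, so the concepts C10 to C19 are arranged here as a simple chain purely for illustration. The function names and the revisit counter are likewise hypothetical.

```python
from collections import deque


def component_of(start, vertices, edges):
    """Vertices reachable from start using only vertices already extracted."""
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == v and y in vertices and y not in seen:
                    seen.add(y)
                    queue.append(y)
    return seen


def shortest_path_to(targets, sources, edges):
    """Breadth-first search over the full source ontology, from sources to any target."""
    parent = {s: None for s in sources}
    queue = deque(sorted(sources))
    while queue:
        v = queue.popleft()
        if v in targets:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == v and y not in parent:
                    parent[y] = v
                    queue.append(y)
    return None


def minimum_extraction(edges, selected):
    """Extract the selected concepts, then revisit the source until no island remains."""
    subset, revisits = set(selected), 0
    anchor = min(subset)
    while True:
        main = component_of(anchor, subset, edges)
        islands = subset - main
        if not islands:
            return subset, revisits
        path = shortest_path_to(islands, main, edges)
        if path is None:
            raise ValueError("an island cannot be reconnected and would have to be removed")
        subset |= set(path)
        revisits += 1


# Assumed topology (not taken from Fig. 4): a chain C10 - C11 - ... - C19.
source_edges = [(f"C{i}", f"C{i + 1}") for i in range(10, 19)]
selected = {"C10", "C11", "C16", "C17", "C18", "C19"}

result, revisits = minimum_extraction(source_edges, selected)
retrieved = sorted(result - selected)
print(retrieved, revisits)
```

Under the assumed chain, one extra visit to the source retrieves C12, C13, C14 and C15, which matches the concepts named in the text. With a different topology the retrieved set and the number of visits would differ.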

Fig. 5 Initial minimum extraction


Fig. 6 Final sub-ontology

3.2 Merging sub-ontologies using minimum extraction

Merging two related ontologies involves taking the union of the concepts of both ontologies and bridging them. One of the main issues is to find common points at which to merge the ontologies; additionally, knowledge workers must make sure to include as many merging points as possible in their initial labelling of the two ontologies to ensure a strong merge. The notation below ensures that both ontology 1 and ontology 2 are complete and valid at the start of the merging process.

Notation 6 Ensure that ontology 1 and ontology 2 are complete and valid ontologies as per Definitions 1 and 2. Here

\[ O_1 = \{C_1, P_1, T_1, R_1\} \quad \text{and} \quad O_2 = \{C_2, P_2, T_2, R_2\}. \]

Assume that \(O_1\) = Complete and Valid


and \(O_2\) = Complete and Valid. Also, \(O_1 \neq O_2\) but \(C_1 \cap C_2 \neq \emptyset\); therefore \(M_p(O_1 \cap O_2) \neq \emptyset\).

So as long as ontology 1 and ontology 2 contain at least one similar concept, the merging of sub-set 1 and sub-set 2 can occur. Ontology 1 and ontology 2 can now have initial labelling applied, so that it is clear which elements are required to be merged.

Notation 7 Apply labelling to Ontology 1 and Ontology 2 as per Definition 8. Given \(O_1 = \{C_1, P_1, T_1, R_1\}\) and \(O_2 = \{C_2, P_2, T_2, R_2\}\), applying labelling gives

\[ \lambda(O_1) = \lambda(C_1), \lambda(P_1), \lambda(T_1), \lambda(R_1) \quad \text{and} \quad \lambda(O_2) = \lambda(C_2), \lambda(P_2), \lambda(T_2), \lambda(R_2). \]

Once the labelling has been applied, the minimum extraction method will extract as few elements as possible so as to reach a complete and valid merged ontology. There are many steps in the merging process; these are summarized below.

1. Check for consistency/completeness of the initially labelled ontology 1 and ontology 2.
2. Extract all selected elements into sub-set 1 and sub-set 2.
3. Check that there is at least one similar merge point in both sub-sets.
4. Ensure validity of each sub-set so that every element is connected to a valid merge point. Retrieve any required elements from the original ontology.
5. Merge each sub-ontology of sub-set 1 to each corresponding sub-ontology of sub-set 2 at each of the merge points.
6. Check for validity of the new merged ontology.
7. Check for semantic completeness of the new merged ontology and retrieve any final elements from the original ontologies to ensure a complete and valid solution.

3.2.1 Completeness checking

Consistency checking is required to ensure that there are no conflicts in the initial labelling of ontology 1 (\(\lambda(O_1)\)) and ontology 2 (\(\lambda(O_2)\)). This process may change the labelling so as to pass this initial consistency/completeness check. If consistency cannot be reached, then the process must stop until a compromise is reached in the labelling.
In a normal sub-ontology extraction process, the ontology labellings \(\lambda(O_1)\) and \(\lambda(O_2)\) would now be checked for completeness so that they conform to a complete ontology as per Definition 1. However, as the extracted sections of \(O_1\) and \(O_2\) are


intended to be merged, some semantically incomplete sections of \(O_1\) may be made complete by merging in some sections of \(O_2\), and vice versa. So this completeness check will take place closer to the final solution.

3.2.2 Extract sub-sets

The minimum requirements for extraction are all of the selected elements of the labelled ontologies \(\lambda(O_1)\) and \(\lambda(O_2)\). Therefore all of the selected elements (\(\lambda_s\)) of the labelled ontologies are extracted into sub-set 1 (\(S_1\)) and sub-set 2 (\(S_2\)).

Notation 8 Where \(S_1\) is a sub-set of \(O_1\) and \(S_2\) is a sub-set of \(O_2\), given

\[ O_1 = \{C_1, P_1, T_1, R_1\} \quad \text{and} \quad \lambda(O_1) = \lambda(C_1), \lambda(P_1), \lambda(T_1), \lambda(R_1), \]

\[ O_2 = \{C_2, P_2, T_2, R_2\} \quad \text{and} \quad \lambda(O_2) = \lambda(C_2), \lambda(P_2), \lambda(T_2), \lambda(R_2), \]

Extract all selected elements from each:

\[ \lambda_s(O) = \lambda_s(C), \lambda_s(P), \lambda_s(T), \lambda_s(R) \]

\[ S_1 = \lambda_s(O_1), \quad S_2 = \lambda_s(O_2) \]

\[ S_1 \subseteq O_1, \quad S_2 \subseteq O_2 \]

\[ S_1 \mathrel{!\$} O_1, \quad S_2 \mathrel{!\$} O_2 \]

As a valid sub-ontology is not ensured at this time, the elements must be extracted as a sub-set, and not as a sub-ontology.

3.2.3 Check for a valid merge point

It is necessary to make sure that at each valid merge point the surrounding elements form individual sub-ontologies of their own before the merge takes place. Every element in sub-set 1 must be linked via a path to a valid merge point. Likewise, every element in sub-set 2 must also be linked via a path to a valid merge point. If the elements cannot be linked by some path to a valid merge point, then the extraction of either sub-set 1 or sub-set 2 cannot be classed as valid. More elements must be retrieved from the original ontology so as to link these outlying elements to the best valid merge point.

It is important to define such a set when it comes to merging or joining ontologies together. As ontologies may be joined at different points, it must be clear which concepts are similar so that these may be added to the list of selected concepts to be extracted if required.
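The merge-point check can be illustrated with a short Python sketch. It rests on two assumptions that the text does not fix: concepts are considered "similar" when their names are identical, and relationships are undirected. The helper names `merge_points` and `unreachable` and the example concepts are hypothetical.

```python
def merge_points(concepts_1, concepts_2):
    """Concepts common to both sub-sets (matched here by identical name)."""
    return concepts_1 & concepts_2


def unreachable(concepts, relationships, points):
    """Concepts in a sub-set that have no path to any merge point."""
    reached = set(points & concepts)
    frontier = list(reached)
    while frontier:
        v = frontier.pop()
        for a, b in relationships:
            for x, y in ((a, b), (b, a)):
                if x == v and y in concepts and y not in reached:
                    reached.add(y)
                    frontier.append(y)
    return concepts - reached


# Hypothetical sub-sets extracted from two ontologies.
concepts_1 = {"C1", "C2", "C3", "C4"}
relationships_1 = {("C1", "C2"), ("C2", "C3")}
concepts_2 = {"C3", "C7"}
relationships_2 = {("C3", "C7")}

points = merge_points(concepts_1, concepts_2)
print(points)
print(unreachable(concepts_1, relationships_1, points))   # outlying concepts in sub-set 1
print(unreachable(concepts_2, relationships_2, points))   # empty: sub-set 2 is fully linked
```

Any concept reported by `unreachable` is an outlying element: under the minimum extraction method it would trigger a further request to the original ontology for linking elements, or it would have to be dropped.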


Definition 15 Let $M_p$ be defined as the set of concepts that are common to two different ontologies, where $O_1 = \{C_1, P_1, T_1, R_1\}$ and $O_2 = \{C_2, P_2, T_2, R_2\}$:

$M_p(O_1 \cap O_2) = C_1 \cap C_2$

This means that $M_p(O_1 \cap O_2)$ is a sub-set of the concepts of both $O_1$ and $O_2$:

$M_p(O_1 \cap O_2) \subseteq C_1$ and $M_p(O_1 \cap O_2) \subseteq C_2$

As there may be several merge points, it is necessary to make sure that at each merge point the connected elements form individual complete and valid sub-ontologies on their own. Every element in sub-set 1 ($S_1$) must be linked by a path ($P$) to a merge point. If any element is not, then the extraction cannot be classed as valid at this point. More elements may need to be retrieved from the original ontology so as to link these outlying elements to the best merge point.

Proposition 3 At least one valid merge point ($M_p$) must exist in the sub-set ($S$) if it is to be merged with another ontology or sub-set.

Let there be given $S_1 = \{C_1, P_1, T_1, R_1\}$ and $O_2 = \{C_2, P_2, T_2, R_2\}$ with $S_1 \subseteq O_1$, and the set of merge points:

$M_p(O_1 \cap O_2) \neq \emptyset$

There must be at least one merge point in the set of concepts ($C_1$):

$C_1 \cap M_p(O_1 \cap O_2) \neq \emptyset$

If this condition is not satisfied, then this validity check fails, as the sub-set of ontology 1 cannot be merged with the sub-set of ontology 2 at any point. At least one merge point must exist.

Notation 9 Determining merge points as per Definition 15 and Proposition 3:

given $S_1 = \{C_1, P_1, T_1, R_1\}$ and $S_2 = \{C_2, P_2, T_2, R_2\}$, ensure a set of valid merge points:

$M_p(S_1 \cap S_2) \neq \emptyset$

One similar merge point must be contained in the selected set of concepts from sub-set 1 and the selected set of concepts from sub-set 2 (i.e. for the merge points to be valid merge points they must exist in both $C_1$ and $C_2$). If there are no intersecting concepts in the selected concepts of both $C_1$ and $C_2$, then the validity check fails, as the sub-set of ontology 1 cannot be merged with the sub-set of ontology 2 at any point. At least one valid merge point must exist in both sub-set 1 and sub-set 2.
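Definition 15 and Notation 9 reduce to a set intersection followed by a non-emptiness test. The sketch below illustrates that, assuming concepts are compared by name; the function names and the toy concept names are hypothetical, and the paper does not say how similar concepts are matched.

```python
def merge_points(concepts_1, concepts_2):
    """Concepts common to both concept sets (Definition 15)."""
    return set(concepts_1) & set(concepts_2)


def check_valid_merge_points(s1_concepts, s2_concepts):
    """Return the valid merge points of two sub-sets, or fail (Notation 9)."""
    valid = merge_points(s1_concepts, s2_concepts)
    if not valid:
        raise ValueError("no valid merge point: the sub-sets cannot be merged")
    return valid


# Toy concept sets (hypothetical).
c1 = {"a", "b", "m", "n"}            # concepts of ontology 1
c2 = {"m", "n", "x"}                 # concepts of ontology 2
s1_selected = {"a", "m"}             # selected concepts of sub-set 1
s2_selected = {"m", "x"}             # selected concepts of sub-set 2

all_merge_points = merge_points(c1, c2)                           # in both ontologies
valid_merge_points = check_valid_merge_points(s1_selected, s2_selected)
```

In this toy data `n` is a merge point of the two ontologies but not a valid merge point of the two sub-sets, because it was not selected in either sub-set.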


3.2.4 Ensure validity before merging

Assuming that there is at least one valid merge point included in the concepts of sub-set 1 and sub-set 2, this validity check must make sure that each element of each sub-set is part of a valid sub-ontology that is centred around at least one of the valid merge points. Each sub-set must contain at least one valid sub-ontology.

Proposition 4 At least one valid merge point ($M_p$) must exist in each sub-ontology ($O(i)$) of sub-set ($S_1$) if $S_1$ is to be added to another ontology.

Let there be given $S_1 = \{C_1, P_1, T_1, R_1\}$, where each $O_1(i)$ is a sub-ontology within $S_1 \subseteq O_1$, and the set of merge points:

$M_p(O_1 \cap O_2) \neq \emptyset$

There must be at least one merge point in each sub-ontology ($O_1(i)$):

$O_1(i) \cap M_p(O_1 \cap O_2) \neq \emptyset$

Given this proposition, every element should then be part of a valid sub-ontology that contains at least one merge point.

Notation 10 Check that each element is linked to a valid merge point as stated in Proposition 4:

given $S_1 = \{C_1, P_1, T_1, R_1\}$ and $S_2 = \{C_2, P_2, T_2, R_2\}$, and
$x \in X \subseteq S_1$, $a \in A \subseteq S_2$, $y \in M_p(S_1 \cap S_2)$, $b \in M_p(S_1 \cap S_2)$,
and given a path $p$ as per Definition 5:

$p = x_1, x_2, \ldots, y_n$
$p = a_1, a_2, \ldots, b_n$

If there are no valid merge points in sub-set 1 and sub-set 2 ($\{y \mid b\}_n$), then the sub-sets cannot be considered valid. In the same way, if any points along the paths $x_1, x_2, \ldots, y_n$ or $a_1, a_2, \ldots, b_n$ are not found in the sub-sets $\{S_1 \mid S_2\}$, then they also cannot be considered valid. There may be many paths that link the elements to the appropriate merge point ($\{x_1 \mid a_1\}$ to $\{y_n \mid b_n\}$), but only one is required to have each element along the path included in the sub-sets of ontology 1 and 2. As there may be only a few valid merge points, the paths may need to include many elements.
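The path check of Notation 10 can be illustrated with a breadth-first search over the source ontology. The sketch assumes the ontology can be treated as an undirected graph stored as an adjacency dictionary, and it uses the shortest path as a stand-in for the paper's "best" path, which is not defined in this section. All names and the toy graph are hypothetical.

```python
from collections import deque


def path_to_merge_point(adjacency, start, valid_merge_points):
    """Shortest path from `start` to the nearest valid merge point, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in valid_merge_points:
            return path
        for nxt in sorted(adjacency.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def missing_on_path(path, subset):
    """Elements of the path that are not yet part of the sub-set."""
    return [element for element in path if element not in subset]


# Toy source ontology 1 as an undirected graph: a - b - m - c
o1_graph = {"a": {"b"}, "b": {"a", "m"}, "m": {"b", "c"}, "c": {"m"}}
s1 = {"a", "m"}                      # extracted sub-set 1
valid = {"m"}                        # valid merge points

path = path_to_merge_point(o1_graph, "a", valid)
missing = missing_on_path(path, s1)
```

Here element `a` reaches the merge point `m` only through `b`, so the sub-set is not yet valid and `b` is a candidate for retrieval.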


3.2.5 Retrieve further elements

Every element $x$ of path $p$ that is not included in $S_1$ must be retrieved from ontology 1 and included in sub-set 1. Likewise, every element $a$ of path $p$ that is not included in $S_2$ must be retrieved from ontology 2 and included in sub-set 2.

Notation 11 Retrieve any elements $x$ that are not in $S_1$ and any elements $a$ that are not in $S_2$.

Let there be given $S_1 = \{C_1, P_1, T_1, R_1\}$ where $x \in X$ and $x \notin S_1$, and given $S_2 = \{C_2, P_2, T_2, R_2\}$ where $a \in A$ and $a \notin S_2$:

for every $x \in X$, retrieve $x$ from $O_1$ and add $x$ to $S_1$: $S_1 = S_1 + x$
for every $a \in A$, retrieve $a$ from $O_2$ and add $a$ to $S_2$: $S_2 = S_2 + a$

If there is only one valid merge point, then $S_1$ and $S_2$ should each become a valid sub-ontology around this valid merge point. Otherwise, $S_1$ and $S_2$ may contain two or more valid sub-ontologies, $O_1(i)$ of ontology 1 ($O_1$) and $O_2(j)$ of ontology 2 ($O_2$). At this stage the sub-ontologies $O_1(i)$ that make up sub-set 1 ($S_1$) and $O_2(j)$ that make up sub-set 2 ($S_2$) are valid but not complete. After the merge they may become complete, as elements may be shared between the two.

3.2.6 Merge sub-ontologies

Once sub-set 1 ($S_1$) and sub-set 2 ($S_2$) are validated, the sub-sets should contain one or more valid sub-ontologies. Each sub-ontology will have one or more valid concepts in it that will be used as merging points. The sub-ontologies will now be merged at these points. This will then form one large ontology $Q$.

Notation 12 Add each sub-ontology of sub-set 1 to each sub-ontology of sub-set 2 to form a new ontology $Q_{12}$:

where each $O_1(i)$ is a sub-ontology within $S_1 \subseteq O_1$ and each $O_2(j)$ is a sub-ontology within $S_2 \subseteq O_2$,

$Q_{12} = O_1(i) + O_2(j)$
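Notations 11 and 12 can be read as set unions: retrieved elements are added to a sub-set, and the merged ontology is the union of all sub-ontologies, so that a shared merge point appears only once. The sketch below assumes a sub-ontology is represented as a pair (nodes, edges), which is a simplification of the paper's $\{C, P, T, R\}$ structure; the function names and data are hypothetical.

```python
def add_retrieved(subset, retrieved):
    """S = S + x: return the sub-set extended with the retrieved elements."""
    return set(subset) | set(retrieved)


def merge(sub_ontologies_1, sub_ontologies_2):
    """Q12 = O1(i) + O2(j): union of the nodes and edges of every sub-ontology.

    Each sub-ontology is a (nodes, edges) pair; edges are frozensets of two
    node names, so the direction of a relationship is ignored in this sketch.
    """
    nodes, edges = set(), set()
    for sub_nodes, sub_edges in list(sub_ontologies_1) + list(sub_ontologies_2):
        nodes |= sub_nodes
        edges |= sub_edges
    return nodes, edges


def edge(a, b):
    return frozenset((a, b))


# Sub-set 1 after retrieving the missing path element "b" (toy data).
s1 = add_retrieved({"a", "m"}, ["b"])

o1_subs = [(s1, {edge("a", "b"), edge("b", "m")})]
o2_subs = [({"m", "x"}, {edge("m", "x")})]

q12_nodes, q12_edges = merge(o1_subs, o2_subs)
```

Because both sub-ontologies contain the concept `m`, the union joins them at that point without any extra bookkeeping.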

Fig. 7 The merged, unconnected sub-ontologies of Q12

The final ontology $Q_{12}$ is still far from finished. As illustrated in Fig. 7, $Q_{12}$ may contain unconnected, merged sub-ontologies. This is where the sub-ontologies from $Q_{12}$ have merged successfully at the valid merge points; however, some merged sub-ontologies may still be disconnected from the main sub-ontology. In Fig. 7, the grey merged ontology $Q_{12}(2)$ is not connected to the white merged sub-ontology $Q_{12}(1)$. More elements from either ontology 1 or ontology 2 will need to be retrieved to connect $Q_{12}(2)$ to $Q_{12}(1)$. At this stage $Q_{12}$ cannot be classed as a complete and valid ontology, as it does not form one complete connected ontology graph, as per Definition 12.

3.2.7 Merge validity

As $Q_{12}$ may contain unconnected sub-ontologies, a final validity check is required to make sure that there are no additional elements required, from either ontology 1 or ontology 2, to connect any of these unconnected sub-ontologies.

Notation 13 For $Q_{12}$ to become a valid ontology, its ontology graph ($G_{Q_{12}}$) must be checked for interconnectedness.

Given a merged sub-set $Q_{12} = \{C, P, T, R\}$ and ontology graph $G_{Q_{12}} = \{V_{iQ_{12}}, E_{iQ_{12}}\}$, with the vertices $(V_{Q_{12}} \mid V_{iQ_{12}})$ drawn from $C$ and $P$ and the edges $(E_{Q_{12}} \mid E_{iQ_{12}})$ drawn from $T$ and $R$:

$G_{Q_{12}} = \{G_i Q^{1}_{12} + HL_1, G_i Q^{2}_{12} + HL_2, \ldots, G_i Q^{k}_{12} + HL_k\}$

For each island of $Q_{12}$ there must be a hybrid link path from $G_i Q^{1}_{12}$ to $G_i Q^{k}_{12}$, thus linking all sub-ontologies (or islands) of $Q_{12}$ and forming one large valid ontology $Q_{12}$. Thus, $Q_{12}$ = valid.
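The interconnectedness check of Notation 13 amounts to finding the connected components ("islands") of the merged graph and then searching the source ontologies for a path that joins two islands. The sketch below is a minimal illustration under assumptions: the graph is undirected, edges are plain pairs of names, and a hybrid link is taken to be the shortest connecting path in the combined source graph. None of these choices, nor the function names, come from the paper.

```python
from collections import deque


def islands(nodes, edges):
    """Connected components of an undirected graph given as nodes and pairs."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    remaining, result = set(nodes), []
    while remaining:
        start = remaining.pop()
        component, stack = {start}, [start]
        while stack:
            for nxt in adjacency[stack.pop()]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        remaining -= component
        result.append(component)
    return result


def hybrid_link(source_adjacency, island_a, island_b):
    """Shortest path in the source graph from island_a to island_b, or None."""
    queue = deque([n] for n in sorted(island_a))
    seen = set(island_a)
    while queue:
        path = queue.popleft()
        if path[-1] in island_b:
            return path
        for nxt in sorted(source_adjacency.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Toy merged ontology with two islands: {a, m} and {x, y}.
q12_nodes = {"a", "m", "x", "y"}
q12_edges = {("a", "m"), ("x", "y")}

# Toy combined source graph, in which "h" connects the two islands.
source = {"a": {"m"}, "m": {"a", "h"}, "h": {"m", "x"}, "x": {"h", "y"}, "y": {"x"}}

found = islands(q12_nodes, q12_edges)
island_a = next(i for i in found if "a" in i)
island_b = next(i for i in found if "x" in i)
link = hybrid_link(source, island_a, island_b)
```

In the toy data the element `h` lies on the connecting path and would have to be retrieved from a source ontology and inserted into the merged ontology.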


The hybrid links identified must now be retrieved from either ontology 1 or ontology 2 and inserted into $Q_{12}$, forming a valid ontology $Q_{12}$.

3.2.8 Final completeness check

At this stage the ontology $Q_{12}$ cannot be considered the final solution. $Q_{12}$ is now valid, as it has a valid ontology graph, but the elements have not been checked for completeness. For $Q_{12}$ to be deemed complete and valid, a final completeness check is required for each of the elements in the merged ontology $Q_{12}$. This will ensure that all elements are semantically complete and contain no ambiguous terms. This final step involves applying completeness to the large valid ontology $Q_{12}$.

Notation 14 Check the entire $Q_{12}$ ontology for semantic completeness as per Definition 1,

where $Q_{12} = \{Q^{1}_{12} + HL_1, Q^{2}_{12} + HL_2, \ldots, Q^{k}_{12} + HL_k\}$, $S_{12} \subseteq O_1 \cup O_2$ and $S_{12} \cap Q_{12} = \emptyset$.

Apply a semantic completeness check to $Q_{12}$:

$S_{12} = Q_{12}(\text{semantic\_completeness})$
$Q = Q_{12} + S_{12}$
$Q_{12}$ = complete

To get $Q_{12}$ to this final stage, many other elements may have been retrieved from $O_1$, from $O_2$, or from both. This final retrieval may add extra transport overheads, but far fewer elements may have been transferred from the original ontologies than with a maximum approach.

If other ontologies are to be merged with these two ontologies, then the adding technique discussed in [15] should be employed to extract a sub-ontology from the third ontology and add this to the two merged ontologies. Using this method, many ontologies can be merged together to form a collaboration of information from many different domains.

4 A walkthrough case study

Healthcare has been one of the major applications in the cloud. A number of major players in the healthcare industry provide SaaS in the cloud, including Microsoft's HealthVault [26] and AdvancedMD [1]. Microsoft's HealthVault provides a software and services platform for people to manage their health data, whereas AdvancedMD supports the medical billing software needs of healthcare providers.


Fig. 8 A portion of the UMLS meta-thesaurus ontology as a connected graph

Fig. 9 The current pharmacy ontology as a connected graph

To illustrate our proposed sub-ontology extraction and merging, we use a case study based on a portion of the Unified Medical Language System (UMLS) meta-thesaurus [35]. Our case study is neither to manage personal health data nor to provide medical billing, but to construct an ontology that will be useful for a specific medical domain. As a running example, see Fig. 8, which shows a portion of the UMLS meta-thesaurus (the UMLS ontology is one of the largest in the medical domain and contains information from many sources). In our case study there is a second source ontology, a pharmacy ontology (see Fig. 9), which has previously been extracted from the UMLS. The user now wants to extract and merge sub-ontologies from both source ontologies (i.e. the UMLS and the pharmacy ontology). The aim of this case study is to show a walkthrough of the extraction and merging processes.

Table 1 shows the requirements of the sub-ontology to be extracted from the UMLS ontology, whereas the requirements of the sub-ontology to be extracted from the pharmacy ontology are shown in Table 2.

Table 1 UMLS ontology requirements
Name: Food Olive Oil Liquid Oils Fats Niacin Flaxseed Pills Vitamin B Complex Micronutrients Solid Drug Form Fatty Acids, Unsat
Status: Excluded Included Included Included Included Excluded Excluded Excluded Included Excluded Included Included
Name: Omega Fatty Acids Trans Fatty Acids Citric Acid Orderable Drugs Vitamin A Vitamin B6 Vitamin B12 Vitamin D Vitamin E Vitamin K Vitamins Solution
Status: Included Included Included Excluded Included Included Included Included Included Included Included Excluded

Table 2 Pharmacy ontology requirements
Name: Liquid Honey Bars Tamiflu Coldrex Panadol Dispensable Drugs Vitamin Shake Fatty Acids, Unsat
Status: Excluded Included Included Excluded Excluded Excluded Excluded Included Included
Name: Weight Loss Formula Omega Fatty Acids Trans Fatty Acids Protein Bars Health Bars 2 mgram 5 mgram 10 mgram Protein Shake
Status: Included Included Included Included Included Included Included Included Included

We then use the information from these two tables to label the UMLS ontology and the pharmacy ontology. The labellings of both ontologies are shown in Fig. 10 and Fig. 11. All the selected elements from both ontologies are required in the final solution. The final solution must also be one complete and valid ontology containing only information that customers are allowed to view.

The first step is to check for consistency in the labelling of the UMLS ontology and the pharmacy ontology. Any consistency issues must be rectified, as continuing the process with them would not lead to a complete and valid solution. After this initial consistency check, all of the selected elements are extracted into two sub-sets, a UMLS sub-set and a pharmacy sub-set. These two sub-sets are shown in Figs. 12 and 13.

At this point there must be at least one valid merge point in both the UMLS sub-set and the pharmacy sub-set, as per Notation 9. A valid merge point differs from a merge point in that a valid merge point must exist in both sub-sets, whereas a normal merge point may exist in only one or the other, or in neither. Even though there may be many merge points in both sub-sets, there may not be many valid merge points. At


Fig. 10 Labelling of the UMLS ontology

Fig. 11 Labelling of the pharmacy ontology

least one valid merge point must exist for the process to continue; if not, then one must be found. In this example there are three valid merge points: Fatty Acids, Unsaturated; Trans Fatty Acids; and Omega 6 Fatty Acids. Having these valid merge points enables both sub-sets to be merged at a later step. For now, a validity check is


Fig. 12 The UMLS extracted minimum sub-set

Fig. 13 The pharmacy minimum sub-set

required to ensure that each element in both sub-sets can be connected via a path to the valid merge point(s). In the case of the UMLS sub-set, 11 extra elements must be retrieved to form a valid sub-ontology. In the case of the pharmacy ontology, 15 extra elements must be retrieved to form a valid sub-ontology. These new sub-ontologies are shown in Figs. 14 and 15. Both figures show that some very long paths of elements had to be retrieved and added to the sub-sets before they could be classed as valid sub-ontologies. Now that both sub-sets have become valid sub-ontologies, they can be merged at the valid merge points identified earlier, conforming to Notation 10. This forms one large valid sub-ontology, which is shown in Fig. 16. At this stage a final validity check may still be required. Multiple sub-ontologies may exist if several valid merge points are used to merge the two sub-sets. This


Fig. 14 The valid UMLS minimum sub-ontology

Fig. 15 The valid pharmacy minimum sub-ontology

validity check will ensure that any sub-ontologies are joined to form one large valid sub-ontology. As can be seen in Fig. 16, the ontology is already valid, as each element is connected to every other element by a path. Therefore, a valid merged sub-ontology can be assumed. Several elements in the new valid diet and health ontology may not be semantically complete at this stage. Further elements may still need to be retrieved from either the UMLS ontology or the pharmacy ontology to ensure the completeness of


Fig. 16 The new valid merged diet and health ontology

Fig. 17 The new complete and valid diet and health ontology

the ontology solution. In this case the Vegetable Oil element is extracted from the UMLS ontology and included in the final solution. This ensures that the merged diet and health ontology is complete and valid. The final merged ontology solution is shown in Fig. 17.
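The three valid merge points named in the walkthrough can be recovered by intersecting the "Included" names of Tables 1 and 2. The sketch below uses only a handful of names from each table (the lists are deliberately partial, and the variable names are hypothetical); it assumes that concepts are matched by identical names, which the paper does not state.

```python
# A partial list of names marked "Included" in Table 1 (UMLS ontology).
umls_included = {
    "Fatty Acids, Unsat", "Omega Fatty Acids", "Trans Fatty Acids",
    "Citric Acid", "Vitamin A", "Vitamin D", "Vitamins",
}

# A partial list of names marked "Included" in Table 2 (pharmacy ontology).
pharmacy_included = {
    "Fatty Acids, Unsat", "Omega Fatty Acids", "Trans Fatty Acids",
    "Weight Loss Formula", "Protein Bars", "Health Bars", "Protein Shake",
}

# Valid merge points are the concepts selected in both sub-sets.
valid_merge_points = sorted(umls_included & pharmacy_included)

for name in valid_merge_points:
    print(name)
```

The intersection contains three names, in line with the three valid merge points reported in the case study.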

5 Analysis and discussions

A simulation to test our OntoCloud has been conducted, in which two local clouds each host one of the source ontologies (i.e. the UMLS and the pharmacy ontology). A sub-ontology must be extracted from each source ontology before the two are merged. The extraction process is performed in each local cloud, whereas the merging process is carried out in one cloud after the extracted sub-ontology has been exported from the other. As an experiment, we tested source ontologies of different sizes in order to highlight the differences between the two. Ontology 1 is of size 50 GB and ontology 2 is of size 10 GB. The quality of the labelling of both ontologies will be 25%,


50%, and 75%, with the selected amount being 20% for each. We developed algorithms to tailor the ontologies to various user specifications. The cost of applying different algorithms is determined by the value of the algorithm complexity variables, as these determine how much work must go into tailoring the ontology in order to reach a solution. Simple algorithms may only do surface scans of the ontology to reach a solution. These simple types do not take much processing power to complete. Complex algorithms perform more in-depth scans of the ontology and tend to involve traversing many long paths within the ontology. For complex algorithms applied to large ontologies, much data processing power is required in order to find a solution in a reasonable time frame. Both simple and complex algorithms take an amount of processing power that depends on the size of the ontology. The larger the ontology and the more complex the algorithm, the greater the amount of processing required to reach the solution. In our experiments, a number of simple and complex algorithms were tested.

With the minimum extraction method, the initial extraction phase only requires the selected elements of both ontology 1 and ontology 2. As 20% of both ontologies are labelled selected, about 10 GB must be extracted from the first cloud and about 2 GB from the second cloud. Therefore, the processing in each cloud is rather minimal. The smaller sub-ontology from the second cloud is then exported to the first cloud for merging. All elements from each sub-ontology are the minimum required by the user. However, there are elements which must still be retrieved from the source ontologies, as the new sub-set does not conform to complete and valid ontology standards; consequently, extra elements are retrieved.

The extra elements that had to be retrieved from the source ontologies amounted to an extra 2.5 GB extracted from the first cloud and an extra 10 GB extracted from the second cloud, given a labelling of around 50%. Therefore, although the second cloud initially produces only 2 GB, the second stage, where extra elements are required, adds another 10 GB. The first cloud, where the merging process also takes place, produces over both stages roughly the same amount as the second cloud. In the end, a total of about 25 GB of both source ontologies had to be retrieved. Hence, the size of the extracted sub-ontology does not make much difference when using the minimum extraction method. However, in terms of processing power, depending on the complexity of the algorithm, the processing power available in each cloud may determine where the merging process is carried out.
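The transfer volumes quoted above can be checked with a short calculation. The figures are those stated in the text; the variable names are hypothetical, and the sketch only adds up the two stages for each cloud.

```python
# Source ontology sizes in GB and the selected share in percent (from the text).
source_gb = {"cloud 1": 50, "cloud 2": 10}
selected_percent = 20

# Extra data retrieved in the second stage, in GB (from the text).
extra_gb = {"cloud 1": 2.5, "cloud 2": 10}

# Stage 1: only the selected elements are extracted.
initial_gb = {c: size * selected_percent / 100 for c, size in source_gb.items()}

# Both stages together, per cloud and overall.
per_cloud_gb = {c: initial_gb[c] + extra_gb[c] for c in source_gb}
total_gb = sum(per_cloud_gb.values())

for cloud in source_gb:
    print(cloud, initial_gb[cloud], extra_gb[cloud], per_cloud_gb[cloud])
print("total", total_gb)
```

The per-cloud sums come to 12.5 GB and 12 GB, and the overall sum to 24.5 GB, which is in line with the "roughly the same" per-cloud amounts and the total of about 25 GB reported in the text.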

6 Related work

We categorize related work in this field into two categories: (i) ontology engineering, covering ontology merging and ontology segmentation, and (ii) semantic grid ontology. These are outlined next.

6.1 Ontology merging and segmentation

Large domain ontologies are very complex. Many people have proposed methods to manage these efficiently, including efficient traversal techniques, whilst others suggest


new languages for describing large domain ontologies. These attempts, although useful, are very specialized, as their solutions focus on a particular domain, not on general domain ontologies. A few researchers, such as Dou et al. [11] and Noy [17], have realized the need for merging large domain ontologies.

Merging large domain ontologies is not an easy task either. The Chimaera tool [28] was created to assist with merging and managing ontologies. This browser-style ontology editing tool enabled the merging of two similar ontologies. It performed ontology matching semi-automatically based on name similarity and taxonomic structure.

Noy [17] summarizes some of the main issues with merging ontologies, also called semantic integration. The survey focusses on several levels, including the schema level, class matching between different ontologies, and creating a merged ontology through inferences using the bridges made between the two source ontologies. The basis for merging ontologies is ontology comparison. Noy [17] surveyed OntoMerge [11] and GLUE [10]. OntoMerge was developed to assist with semantic integration in the Semantic Web. The authors use a general-purpose inference engine to enable translations between mapped ontologies. The correspondence between two ontologies is expressed as a set of bridging axioms relating classes and properties of the two source ontologies. GLUE [10] is an example of a system that uses machine learning techniques to discover mappings between concepts in different ontologies. GLUE exploits the information in the data instances and the taxonomic structure of ontologies to generate the mapping rules. Regardless of the contributions made by GLUE toward ontology merging, domain experts are still required to check the accuracy of the simple mappings made. The domain experts often end up writing the more complex rules themselves.

Ontology interoperability and conceptual graph theory were addressed by Corbett [8]. The author discusses comparing and merging ontologies, as well as filtering large knowledge bases to obtain more relevant information. Problems arise when attempting to compare ontologies, as some of the labels have different meanings and the semantics of the types vary. The author attempts to solve this comparison problem by demonstrating a method for automated comparison of ontologies represented as concept-type hierarchies. A simple merge operation, using join and type subsumption, is used to perform ontology merging. The extension of ontologies using Conceptual Graph theory helps to strengthen the adoption of ontologies as the key to knowledge representation and reasoning.

Deriving tailored ontologies from large base ontologies involves reusing and extracting information so that individuals obtain only the specific parts of the ontology needed for their intended use. Smaller ontologies, rather than whole base ontologies, are much better suited to the user. Some initial rules are outlined in Wouters et al. [38] and Bhatt et al. [3, 4]. They discuss several of the steps taken to identify key requirements of ontologies that cannot be disregarded when extracting sub-ontologies.

Ontology segmentation is another approach. Seidenberg and Rector [31] attempt to solve a number of difficulties with large ontologies through segmentation. Smaller ontologies are easier to classify, traverse and use in day-to-day activities. They proposed a segmentation algorithm that traverses ontologies to find the best means of segmentation. Such segmentation algorithms are fast and provide sub-ontology-like


solutions but are not specifically related to the requirements of the user or intended application. The filtering system uses a coarse-grained approach and excludes individual concept or relationship selection.

6.2 Semantic grid ontology

The explosion of information on the Internet is inevitable, and most of this information is written for humans to understand, not machines. The way forward is to make the data machine-processable and machine-interpretable [6], so that machines can process the data themselves with little human interaction. Hence, the Semantic Web is regarded by many researchers as a way to solve this issue by providing context awareness and structure to web content. The goal of the Semantic Web initiative is to create a universal medium for the exchange of data, smoothly interconnecting pieces of information and applications for the global sharing of information. Many researchers argue that the core of the Semantic Web is ontologies [33]. Ontologies are the key to providing this structure to the Semantic Web, and they provide the complex semantic structure required to enable the information to be interpreted by machines. In the Semantic Web architecture, web ontology languages are built on top of RDF(S), which becomes the base of the Semantic Web [9]. It is understandable that the Semantic Web will not survive without the development and acceptance of ontologies [27].

Apart from the Semantic Web, the Grid is the other main component that has motivated the Semantic Grid. Formally, the large-scale problem-solving mechanism in virtual organizations is commonly known as the Grid. The Grid enables large computational and storage devices to be linked globally to form a network of powerful resources. The goal was to establish a globally distributed collection of heterogeneous resources [2] to allow resource sharing among many users. Grid computing focusses on the amount of computation that can be salvaged across a large distribution of heterogeneous machines. Researchers in the Grid environment have been attempting to make resources act seamlessly over a considerable distance, but so far this has fallen short of expectations. Updating data in the grid has been one of the major focusses [23, 25, 32, 34], along with replication [24] and recovery and consistency [21, 22].

In order to virtualize the Grid, Web Services and Grid Services are used. Web and Grid Services, such as OGSA (Open Grid Services Architecture) [16], are the main core of the grid architecture for sharing its distributed computing resources. These services enable capabilities to be constructed dynamically and transparently from distributed services. This opens the door to designing and developing new Grid applications with the ability to reuse existing components and information resources, allowing a coordinated approach to managing these resources.

Following the Semantic Web, and the Grid through Web and Grid Services, the Semantic Grid has evolved. The Semantic Grid has been developed due to the need for semantics in the Grid infrastructure [20]. This is an important improvement, as it allows for efficient knowledge management among its services, better enabling computers and people to work together [7]. One way to achieve the Semantic Grid is through the use of ontologies. Our previous work has presented Grid Ontology [12-14] in the context of the Semantic Grid, whereby we tested ontology processing mechanisms using grid infrastructures.


From the Grid, there is now a strong movement toward the Cloud. Although some parts are alike, the two differ considerably in their historical background and vision. The Grid was intended to solve computationally intensive problems by stealing the resources of others, whereas the Cloud has been intended to realize the concept of utility computing, whereby services are utilities. Whereas the former is commonly used in e-Science to solve gigantic problems, the latter is intended to offer solutions to a large bulk of smaller problems. Ontology engineering and tailoring are very well suited to the cloud, as an ontology can be provided as a service. It is not about cooperation, but rather about a service.

7 Conclusions and future work

In this paper, we have introduced the notion of Ontology as a Service (OaaS), which may open a new wave of services in the cloud. OaaS is a service whereby cloud providers provide not only the ontology but also the application and infrastructure to tailor the source ontology to the user's requirements. In this paper, we elaborate the ontology extraction and sub-ontology merging process. As it is more likely that multiple cloud providers host different source ontologies, OaaS, and in particular our proposed extraction and merging techniques, will provide a basic rationale for moving toward a multiple cloud providers environment, a movement that current cloud technology is driving.

Our proposed ontology extraction and merging is based on the minimum extraction method, whereby only the required concepts are initially extracted from the source ontologies. However, because the extracted information is likely to be an invalid or incomplete sub-ontology, further extractions of concepts and elements are needed. As a case study, we have demonstrated a walkthrough of the entire sub-ontology extraction and merging process using the UMLS meta-thesaurus ontology. Our analysis shows that the cloud environment is very well suited to supporting OaaS.

In the future, we plan to compare the minimum extraction method and the maximum extraction method, and to evaluate their efficiency on the cloud computing platform. Other parts of the project shown in Fig. 1 that are still being completed, such as the Replace part, will be the main focus of this project.

References
1. AdvancedMD (2010) http://www.advancedmd.com
2. Atkinson MP (2003) Databases and the grid: who challenges whom? In: BNCOD 2003, pp 1-2
3. Bhatt M, Flahive A, Wouters C, Rahayu W, Taniar D, Dillon TS (2004) A distributed approach to sub-ontology extraction. In: Proceedings of the 18th international conference on advanced information networking and applications (AINA2004), vol 1, pp 636-641
4. Bhatt M, Wouters C, Flahive A, Rahayu W, Taniar D (2004) Semantic completeness in sub-ontology extraction using distributed methods. In: Proceedings of the international conference on computational science and its applications (ICCSA2004), volume 3. Lecture notes in computer science, vol 3045. Springer, Berlin, pp 508-517
5. Bhatt M, Flahive A, Wouters C, Rahayu W, Taniar D (2006) MOVE: a distributed framework for materialized ontology view extraction. Algorithmica 45(3):457-481
6. Bussler C, Fensel D, Maedche A (2002) A conceptual architecture for semantic web enabled web services. SIGMOD Rec 31(4):24-29
7. Cannataro M, Talia D (2004) Semantics and knowledge grids: building the next-generation grid. IEEE Intell Syst 19(1):56-63
8. Corbett D (2004) Interoperability of ontologies using conceptual graph theory. In: Proceedings of the international conference on computational science (ICCS2004), pp 375-387
9. Cuenca Grau B (2004) A possible simplification of the semantic web architecture. WWW 704-713
10. Doan A, Madhavan J, Domingos P, Halevy AY (2002) Learning to map between ontologies on the semantic web. WWW 2002:662-673
11. Dou D, McDermott DV, Qi P (2005) Ontology translation on the semantic web. J Data Semant 2:35-57
12. Flahive A, Rahayu W, Taniar D, Apduhan BO (2004) A distributed ontology framework for the grid. In: Proceedings of the international conference on parallel and distributed computing: applications and technologies (PDCAT2004). Lecture notes in computer science, vol 3320. Springer, Berlin, p 68
13. Flahive A, Rahayu W, Taniar D, Apduhan BO (2005) A distributed ontology framework in the semantic grid environment. In: Proceedings of the 19th international conference on advanced information networking and applications (AINA 2005), pp 193-196
14. Flahive A, Taniar D, Rahayu W, Apduhan BO (2009) Ontology tailoring in the Semantic Grid. Comput Stand Interfaces 31(5):870-885
15. Flahive A, Taniar D, Rahayu W, Apduhan BO (2011) Ontology expansion: appending with extracted sub-ontology. Log J IGPL 19(5):618-647
16. Foster IT, Kesselman C, Nick JM, Tuecke S (2002) Grid services for distributed system integration. Computer 35(6):37-46
17. Fridman Noy N (2004) Semantic integration: a survey of ontology-based approaches. SIGMOD Rec 33(4):65-70
18. Gene Ontology Consortium (2000) Gene ontology: tool for the unification of biology. Nat Genet 25(1):25-29
19. Gene Ontology (2006) The gene ontology. http://geneontology.org
20. Goble CA, De Roure D (2004) The semantic grid: myth busting and bridge building. In: ECAI, pp 1129-1135
21. Goel S, Sharda H, Taniar D (2003) Preserving data consistency in grid databases with multiple transactions. In: Proceedings of the 2nd international workshop on grid and cooperative computing (GCC 2003), Part II. Lecture notes in computer science, vol 3033. Springer, Berlin, pp 847-854
22. Goel S, Sharda H, Taniar D (2004) Failure recovery in grid database systems. In: Proceedings of the 6th workshop on distributed computing (IWDC 2004). Lecture notes in computer science, vol 3326. Springer, Berlin, pp 27-30
23. Goel S, Sharda H, Taniar D (2004) Atomic commitment in grid database systems. In: Proceedings of the IFIP international conference on network and parallel computing (NPC 2004). Lecture notes in computer science, vol 3222. Springer, Berlin, pp 22-29
24. Goel S, Sharda H, Taniar D (2005) Replica synchronisation in grid databases. Int J Web Grid Serv 1(1):87-112
25. Goel S, Sharda H, Taniar D (2005) Atomic commitment and resilience in grid database systems. Int J Grid Util Comput 1(1):46-60
26. HealthVault (2010) http://www.healthvault.com
27. Kim H (2002) Predicting how ontologies for the semantic web will evolve. Commun ACM 45(2):48-54
28. McGuinness DL, Fikes R, Rice J, Wilder S (2000) An environment for merging and testing large ontologies. In: Proceedings of the seventh international conference principles of knowledge representation and reasoning (KR2000), pp 483-493
29. Miller GA (1995) Wordnet: a lexical database for English. Commun ACM 38(11):39-41
30. Miller M (2009) Cloud computing: web-based applications that change the way you work and collaborate online. Que Publishing, Indianapolis
31. Seidenberg J, Rector AL (2006) Web ontology segmentation: analysis, classification and use. WWW 2006:13-22
32. Taniar D, Goel S (2007) Concurrency control issues in Grid databases. Future Gener Comput Syst 23(1):154-162
33. Taniar D, Rahayu W (eds) (2006) Web semantics and ontology. Idea Group Publisher, Hershey
34. Taniar D, Leung CHC, Rahayu JW, Goel S (2008) High performance parallel database processing and grid databases. Wiley, New York
35. UMLS (2002) Unified medical language system, 13th edn. US Department of Health and Human Services, National Institutes of Health, National Library of Medicine
36. Velte AT, Toby JV, Elsenpeter R (2010) Cloud computing: a practical approach. McGraw Hill, New York
37. Wordnet (2010) Wordnet, a lexical database for the English language. http://wordnet.princeton.edu
38. Wouters C, Dillon TS, Rahayu W, Chang E (2002) A practical walkthrough of the ontology derivation rules. In: DEXA 2002, pp 259-268
