Retrieval is one of the most important phases of case-based reasoning (CBR), because the overall effectiveness of a CBR system depends on it. To solve a target problem, useful cases are retrieved from the database. To perform retrieval, CBR systems typically exploit similarity knowledge; this is called similarity-based retrieval (SBR). In SBR, similarity measures are used to approximate the usefulness of cases with respect to the target problem. In this paper, we propose and develop a retrieval strategy for hierarchical cases. First, it combines the support-count bit from multilevel (SC-BF) algorithm with the soft-matching criterion to generate soft-matching class association rules. Second, it applies the unified knowledge of similarity and association knowledge (USIMSCAR) to improve the performance of SBR. The association rules are generated using various association rule mining techniques.
Original title
A Retrieval Strategy for Case-Based Reasoning using USIMSCAR for Hierarchical Case
International Journal of Advanced Engineering Research and Technology (IJAERT)
Volume 2 Issue 2, May 2014, ISSN No.: 2348-8190
www.ijaert.org

A Retrieval Strategy for Case-Based Reasoning using USIMSCAR for Hierarchical Case

Daxa K. Patel
Department of Computer Science of Engineering and Technology, PIET, Limda
Gujarat Technology University
Vadodara, India

Abstract
Retrieval is one of the most important phases of case-based reasoning (CBR), because the overall effectiveness of a CBR system depends on it. To solve a target problem, useful cases are retrieved from the database. To perform retrieval, CBR systems typically exploit similarity knowledge; this is called similarity-based retrieval (SBR). In SBR, similarity measures are used to approximate the usefulness of cases with respect to the target problem. In this paper, we propose and develop a retrieval strategy for hierarchical cases. First, it combines the support-count bit from multilevel (SC-BF) algorithm with the soft-matching criterion to generate soft-matching class association rules. Second, it applies the unified knowledge of similarity and association knowledge (USIMSCAR) to improve the performance of SBR. The association rules are generated using various association rule mining techniques.

Keywords
CBR, Association knowledge (AK), Association rule mining (ARM), Case-Based Reasoning, Multilevel association rules, data mining

I. INTRODUCTION
The fundamental premise of case-based reasoning is that experience in the form of past cases can be leveraged to solve new problems [1]. An individual experience is called a case, and a collection of cases is stored in the case base. Typically, each case is described by a problem description and the corresponding solution description. Among the four typical phases of CBR (retrieve, reuse, revise, and retain), retrieval is the key phase, since the success of a CBR system relies heavily on retrieval performance [2].
Its aim is to retrieve useful or relevant cases that can be successfully used to solve a target problem. If the retrieved cases are not useful, the CBR system may not eventually produce a suitable solution to the problem. Typically, retrieval is achieved through a specific strategy that leverages similarity knowledge (SK), called similarity-based retrieval (SBR) [2]. In SBR, SK is used to estimate the usefulness of stored cases with respect to the target problem. SK is usually encoded via similarity measures between the problem and stored cases. Using these measures, SBR finds cases ranked by their similarity to the problem, and their solutions are then used to solve the problem.

However, SBR has two main problems. First, it depends too heavily on domain experts to define SK in practice [3]. No clear methodology or general approach for modelling such measures in an intelligent way has been developed yet. Thus, defining SK remains complicated, time-consuming, and hard to put into practice. Second, the similarity measure is very often static, so the same definition tends to be applied uniformly to all target problems.

In this paper, we propose an association analysis of cases for hierarchical cases. Association knowledge (AK) represents strongly evident, interesting relationships between known problem features and solutions shared by a large number of cases. The aim of retrieval in this paper is to retrieve a combined set of both cases and rules relevant to the target problem, where relevance is determined by a quantification method that integrates similarity knowledge and association knowledge. Association knowledge is dynamic in that, depending on the characteristics of the target problem, a different best set of rules can be chosen and leveraged for the retrieval process. This is the key strength of the unified knowledge of similarity and soft-matching class association rules (USIMSCAR).
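As a rough illustration of the similarity-based retrieval discussed in this section, the sketch below ranks stored cases by a similarity score and returns the top k. It is not taken from the paper: the weighted-average aggregation, the attribute names, the weights, and the toy case base are all assumptions chosen for the example.

```python
from typing import Dict, List, Optional, Tuple

Problem = Dict[str, object]
Case = Dict[str, object]


def local_similarity(a: object, b: object, value_range: Optional[float] = None) -> float:
    """Similarity in [0, 1] for one attribute (hypothetical definition)."""
    if value_range and isinstance(a, (int, float)) and isinstance(b, (int, float)):
        # Numeric attributes: closeness relative to an assumed value range.
        return max(0.0, 1.0 - abs(a - b) / value_range)
    # Symbolic attributes: exact match only.
    return 1.0 if a == b else 0.0


def global_similarity(query: Problem, problem: Problem,
                      weights: Dict[str, float],
                      ranges: Dict[str, float]) -> float:
    """Weighted average of the local similarities (assumed aggregation)."""
    total = sum(weights.values())
    score = sum(
        w * local_similarity(query[attr], problem[attr], ranges.get(attr))
        for attr, w in weights.items()
    )
    return score / total


def retrieve(query: Problem, case_base: List[Case],
             weights: Dict[str, float], ranges: Dict[str, float],
             k: int = 2) -> List[Tuple[float, Case]]:
    """Return the k stored cases most similar to the query, best first."""
    scored = [
        (global_similarity(query, case["problem"], weights, ranges), case)
        for case in case_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data, invented for illustration only.
CASE_BASE: List[Case] = [
    {"id": "c1", "problem": {"price": 1000, "brand": "A"}, "solution": "buy"},
    {"id": "c2", "problem": {"price": 1500, "brand": "B"}, "solution": "skip"},
    {"id": "c3", "problem": {"price": 1100, "brand": "B"}, "solution": "buy"},
]
WEIGHTS = {"price": 1.0, "brand": 1.0}
RANGES = {"price": 1000.0}
QUERY: Problem = {"price": 1000, "brand": "A"}
```

Calling `retrieve(QUERY, CASE_BASE, WEIGHTS, RANGES, k=2)` returns the two best-matching cases with their scores. Because the weights and local measures are fixed in advance, the same definition is applied to every query, which is the static behaviour noted above.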
For USIMSCAR to enable the retrieval process with hierarchical cases, two issues are addressed: the first is how to formalize similarity measures encoding similarity knowledge, and the second is how to generate soft-matching class association rules (SCARs) encoding association knowledge. A similarity measure for hierarchical cases has to be able to adequately compute the similarity between same-level cases as well as different-level cases. The generation of SCARs from hierarchical cases basically requires a mechanism that discovers frequent itemsets from cases at different levels. This issue may be addressed by using an algorithm that extends Apriori to allow mining of multilevel association rules. The support-count bit from multilevel (SC-BF) algorithm is one such algorithm; it is proposed with the aim of finding frequent itemsets at the topmost level and then progressively deepening the mining process into their frequent descendants at lower concept levels. Therefore, by integrating the SC-BF algorithm and the soft-matching criterion, SCARs are generated from hierarchical cases.

II. RELATED WORK
Similarity-based retrieval has been widely used in different case-based reasoning applications, such as medical diagnosis [4][5], IT service management [6], product recommendation [7], and personnel rostering decisions [8], to predict similar cases having an appropriate solution for the target problem. Similarity-based retrieval is typically achieved through k-nearest-neighbour (k-NN) retrieval [2]. The idea of k-NN is that retrieval is achieved by retrieving the k cases most similar to the target problem. The limitation of k-NN lies in allowing irrelevant attributes to influence the similarity computation. Approaches integrating data mining and k-NN have often been applied in case-based reasoning research to improve k-NN through two main schemes.
The first is to integrate feature selection (FS) or feature weighting (FW) into k-NN. In this context, FS is used to choose the relevant features of the cases [5], [8], FW is applied to estimate optimal weights for the original features of cases [9], [10], or their combination is used to leverage the advantages of both [4]. The second scheme is to combine data clustering with k-NN, where the structure of clustered cases is leveraged to guide retrieval towards more relevant cases [11], [12]. Given the case base, a set of clusters is constructed, where each cluster represents a group of relevant cases. For case retrieval, the similarity between the target problem and each case is combined with the relevance of the clustered group containing the case considered.

An improved clustering technique has also been applied in a case-based reasoning decision-making system. During the setup stage of the case library, the test result sets are clustered by the improved CURE_KNN algorithm to identify the central points and set up indexes of these subsets. In the retrieval process, the distance between the target case and each centre is compared to select the subset with the largest similarity. This approach is particularly effective for retrieval and maintenance of a large case library: it ensures retrieval efficiency and quality, and it overcomes the low efficiency of nearest-neighbour retrieval when searching large-scale case databases [13]. Inductive indexing and nearest-neighbour techniques can also be applied together in the retrieval phase: inductive indexing is used to retrieve the set of matching cases, and nearest neighbour is then used to rank the cases in the set according to their similarity to the target case.

III. BACKGROUND OF SIMILARITY KNOWLEDGE AND ASSOCIATION KNOWLEDGE

A. Background of Similarity Knowledge
In the case-based reasoning context, similarity knowledge is encoded via a measure that computes the similarity between the target problem Q and the stored cases.
The higher the similarity between Q and a case C, the more useful C is for problem Q. Similarity measures commonly follow the local-global principle, which decomposes the similarity measure into local similarities for individual attributes of cases and a global similarity aggregating these local similarities [14]. A global similarity function can be arbitrarily complex, but simple functions such as weighted average aggregation are usually used [14].

B. Background of AK
AK aims to represent evidently interesting relationships shared by a large number of relevant stored cases, using a combination of various data mining (DM) techniques. These are ARM [15], class ARM [16], and soft-matching ARM (SARM) [17].

1) ARM: ARM aims to mine certain interesting relationships, called associations, in a transaction database [15]. It focuses on discovering sets of highly co-occurring features shared by a large number of records in the database. In the case-based reasoning context, ARM can be used to discover interesting relationships from a given case base. A transaction and an item can be seen as a case and an attribute-value pair, respectively. Apriori [15] is one of the traditional algorithms for ARM. Interestingness measures are useful for evaluating the quality of, and ranking, the large number of association rules (ARs) extracted [18]. The main problem of ARM is to generate association rules whose support is greater than a user-defined minimum support threshold and whose confidence is greater than a user-defined minimum confidence.

2) Class ARM: Class association rules (CARs) [16] are a special subset of ARs whose consequents are restricted to a single target variable. In the case-based reasoning context, a CAR is seen as an AR whose consequent holds an item formed as the pair of the solution attribute and its value.

3) SARM: Consider a rule X → Y. A limitation of traditional ARM algorithms (e.g., Apriori [15]) is that itemsets X and Y are discovered based on the equality relation.
When dealing with items similar to each other, these algorithms may perform poorly. For example, in a supermarket sales database, Apriori cannot find rules like "80% of the customers who buy products similar to milk (e.g., cheese) and products similar to eggs (e.g., mayonnaise) also buy bread". To address this issue, the soft-matching criterion was proposed [17], where the antecedents and consequents of ARs are found by similarity assessment. Using this criterion, the problem of SARM is to find all rules of the form X → Y, where the soft support and soft confidence of each rule are not less than minsupp and minconf, respectively. Soft support and soft confidence generalize support and confidence. This generalization is done by allowing items to match as long as their similarity exceeds the user-specified minimum similarity, minsim.

IV. EXISTING SYSTEM
In case-based reasoning, a case represents a problem-solving experience from the past. A case is structured into two main parts. The first is the problem part, which contains the description characterizing the past problem. The second is the solution part, which contains the description of a suitable solution for the described problem. To represent cases formally, many CBR systems adopt the well-known attribute-value pair representation. The existing system uses two algorithms: soft-matching class association rule (SCAR) generation and the unified knowledge of similarity and soft-matching class association rules (USIMSCAR). In the SCAR algorithm, soft-matching class association rules are generated from the frequent itemsets (the soft-matching criterion is described in Section III).
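The soft-matching criterion from Section III can be made concrete with a small sketch. The code below is one plausible reading of soft support and soft confidence, not the definition from [17]: a transaction is assumed to softly contain an itemset when every item of the itemset has at least one item in the transaction whose similarity reaches minsim. The similarity table `SIMILAR` and the helper `toy_sim` are invented for illustration and reuse the milk and eggs example from the text.

```python
from typing import Callable, FrozenSet, List, Set

Item = str
Sim = Callable[[Item, Item], float]

# Hypothetical item similarities, invented for this example.
SIMILAR = {
    frozenset({"milk", "cheese"}): 0.8,
    frozenset({"eggs", "mayonnaise"}): 0.7,
}


def toy_sim(a: Item, b: Item) -> float:
    """Symmetric toy similarity: 1.0 for identical items, table lookup otherwise."""
    if a == b:
        return 1.0
    return SIMILAR.get(frozenset({a, b}), 0.0)


def soft_contains(transaction: Set[Item], itemset: FrozenSet[Item],
                  sim: Sim, minsim: float) -> bool:
    """True if every item of the itemset softly matches some transaction item."""
    return all(any(sim(x, t) >= minsim for t in transaction) for x in itemset)


def soft_support(transactions: List[Set[Item]], itemset: FrozenSet[Item],
                 sim: Sim, minsim: float) -> float:
    """Fraction of transactions that softly contain the itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if soft_contains(t, itemset, sim, minsim))
    return hits / len(transactions)


def soft_confidence(transactions: List[Set[Item]],
                    antecedent: FrozenSet[Item], consequent: FrozenSet[Item],
                    sim: Sim, minsim: float) -> float:
    """Soft support of (X union Y) divided by the soft support of X."""
    base = soft_support(transactions, antecedent, sim, minsim)
    if base == 0.0:
        return 0.0
    return soft_support(transactions, antecedent | consequent, sim, minsim) / base
```

Setting minsim to 1.0 reduces both functions to ordinary support and confidence under this toy similarity, since only identical items then match. Lowering minsim lets a transaction containing cheese and mayonnaise count towards a rule whose antecedent is milk and eggs.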
In the second algorithm, similarity knowledge and association knowledge are combined to improve the performance of similarity-based retrieval (SBR).

V. AK REPRESENTATION
This section explains the proposed system. It follows the same process as the existing case-based reasoning (CBR) system, but uses a structural representation for the case base, namely the hierarchical representation. The hierarchical representation represents each case at multiple levels of abstraction. It is a simple extension of the attribute-value pair representation that allows cases with a complex hierarchical structure to be described [23]. To represent hierarchical cases with USIMSCAR, the following two issues need to be addressed: the first is how to formalize similarity measures encoding similarity knowledge, and the second is how to generate SCARs encoding association knowledge. The similarity measure for hierarchical cases has to be able to adequately compute the similarity between same-level cases as well as different-level cases. The generation of SCARs from hierarchical cases basically requires a mechanism that discovers frequent itemsets from cases at different levels. We use the Support-Count and Bit-from multilevel (SC-BF) algorithm, which finds frequent itemsets at the topmost level and then progressively deepens the mining process into their frequent descendants at lower concept levels. By integrating the SC-BF algorithm and the soft-matching criterion, SCARs are generated from the hierarchical cases. This section presents our approach for extracting and representing association knowledge using the techniques described in Section III.
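The idea of mining the topmost level first and then deepening only into frequent descendants can be sketched as follows. This is not the SC-BF algorithm, whose internals are not given here; it is a generic top-down, level-wise miner with a brute-force Apriori-style search at each level. Encoding an item as a tuple path through the concept hierarchy (for example `("food", "milk")`) and using an absolute `min_count` threshold are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Path = Tuple[str, ...]          # hypothetical encoding: path from the root category
Itemset = FrozenSet[Path]


def generalize(transaction: Set[Path], level: int) -> Set[Path]:
    """Replace each item by its ancestor at the given hierarchy level."""
    return {item[:level] for item in transaction if len(item) >= level}


def frequent_itemsets(transactions: List[Set[Path]], min_count: int) -> Dict[Itemset, int]:
    """Level-wise (Apriori-style) search for itemsets with count >= min_count."""
    items = sorted({i for t in transactions for i in t})
    current: List[Itemset] = [frozenset([i]) for i in items]
    result: Dict[Itemset, int] = {}
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        result.update(frequent)
        keys = list(frequent)
        size = len(keys[0]) + 1 if keys else 0
        # Join frequent k-itemsets into (k+1)-candidates; counting filters them.
        current = list({a | b for a, b in combinations(keys, 2) if len(a | b) == size})
    return result


def mine_multilevel(transactions: List[Set[Path]], max_level: int,
                    min_count: int) -> Dict[int, Dict[Itemset, int]]:
    """Mine level 1 first, then descend only into children of frequent items."""
    results: Dict[int, Dict[Itemset, int]] = {}
    allowed_parents: Optional[Set[Path]] = None
    for level in range(1, max_level + 1):
        encoded = []
        for t in transactions:
            items = generalize(t, level)
            if allowed_parents is not None:
                # Prune items whose parent was not frequent one level up.
                items = {i for i in items if i[:-1] in allowed_parents}
            encoded.append(items)
        frequent = frequent_itemsets(encoded, min_count)
        results[level] = frequent
        allowed_parents = {next(iter(s)) for s in frequent if len(s) == 1}
        if not allowed_parents:
            break
    return results
```

With a threshold high enough to make a top-level category infrequent, all of its descendants are dropped before the next level is counted, which is where the saving in work comes from.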
The aim of association knowledge building is twofold: 1) to represent strongly evident associations between known problem features and solutions from the given case base, and 2) to combine these associations usefully with SK in the unified knowledge of similarity and soft-matching class association rules (USIMSCAR). Multilevel association rules are mined efficiently using concept hierarchies and the soft-matching criterion. A concept hierarchy defines a sequence of mappings from a set of low-level concepts to higher-level concepts [19]. Using the concept hierarchies, we first retrieve the frequent itemsets from the case base at the same level or at different levels, and then apply the soft-matching criterion to generate the soft-matching class association rules.

A SCAR is an implication of the form X → y, where X is a frequent itemset representing problem features that occur frequently and are discovered by the soft-matching criterion from the case base, and y is the solution item. A SCAR X → y thus implies that a target problem Q is likely to be associated with the solution contained in y if Q's problem features are sufficiently similar to X.

A concept hierarchy is represented as a tree with root D. The method uses a hierarchy-information-encoded transaction table instead of the original transaction table. This is because a DM query is usually relevant to only a portion of the transaction database rather than to all the items in it. It is beneficial to first gather the relevant set of data and then work repeatedly on this task-relevant set [20], [21]. Encoding can be performed while the task-relevant data are gathered, so no extra encoding pass is needed.

VI. RESULTS AND ANALYSIS
This section compares the existing method and the proposed method in terms of accuracy and time complexity.
For the car evaluation database, the total time taken to retrieve frequent itemsets from the hierarchical cases, generate the soft-matching rules from the frequent itemsets, and finally generate the association rules is less than that of the existing method. Accuracy is also improved for the hierarchical cases. Table I shows the accuracy of the existing method and the proposed method.

TABLE I
Attributes vs. Accuracy (%) for Car Dataset

Attributes   Existing Algorithm (USIMSCAR)   Modified Method (Hierarchical USIMSCAR)
1            15                              20
2            15                              23
3            24                              33
4            24                              30
5            30                              38
6            30                              37
7            60                              65
8            24                              29

The graphical representation for the car database is shown in Fig. 1.

Fig. 1. Accuracy for Car Dataset

The graph clearly shows that the accuracy of the proposed algorithm is improved. Time complexity is also improved in the proposed method. Table II contains the execution times for running the existing method and the proposed method. The proposed method takes less time to run the algorithm for the number of attributes entered by the user.

TABLE II
Attributes vs. Time for Car Dataset

Attributes   Existing Algorithm (USIMSCAR)   Modified Method (Hierarchical USIMSCAR)
1            1248426                         1035337
2            1376311                         0837051
3            1182619                         0661093
4            1275880                         0970096
5            1518538                         0543652
6            1211486                         0627721
7            1218748                         0457310
8            0518805                         0425308

Fig. 2 shows the graphical representation of the execution time.

Fig. 2. Execution Time for Car Dataset

The results were also analysed for the Camera database and the PC database. Table III contains the accuracy of the existing method and the proposed method for the PC database.

TABLE III
Attributes vs. Accuracy (%) for PC Dataset

Attributes   Existing Algorithm (USIMSCAR)   Modified Method (Hierarchical USIMSCAR)
1            51                              60
2            32                              38
3            43                              49
4            43                              50

Fig. 3 shows the graphical representation for the PC database.
Table IV contains the execution times for the PC database. The proposed algorithm takes less time to execute, so it improves both the execution time and the accuracy for the experimental database.

Fig. 3. Accuracy for PC Dataset

TABLE IV
Attributes vs. Time for PC Dataset

Attributes   Existing Algorithm (USIMSCAR)   Modified Method (Hierarchical USIMSCAR)
1            1639146                         0629975
2            0745293                         0395333
3            0694610                         0376362
4            0736399                         0384672

The graphical representation of the execution time for the PC database is shown in Fig. 4.

Fig. 4. Execution Time for PC Dataset

VII. ADVANTAGES OF PROPOSED SYSTEM
The proposed system takes less memory to store the data, because the data are grouped at the branch level, and it reduces the execution time. It also improves accuracy.

VIII. CONCLUSIONS AND FUTURE WORK
This paper presents case-based reasoning for hierarchically structured cases. First, frequent itemsets are retrieved from the hierarchical cases using the SC-BF algorithm, and the soft-matching criterion is then applied to generate the SCARs. Second, association knowledge and similarity knowledge are combined to improve similarity-based retrieval. The advantages of using the SC-BF algorithm are that it takes less memory to store the data, because the data are grouped at the branch level, and that it reduces the execution time. It also improves accuracy, and therefore the overall performance of case-based reasoning. As future work, USIMSCAR could also be extended to cases with complex structures such as object-oriented and semantic web-based cases [2], [22]. For USIMSCAR to run with such cases, two issues must be addressed: 1) how to define similarity measures for the cases; and 2) how to formalize AK from the cases.
ACKNOWLEDGMENT
I would like to express my deepest appreciation to Rahul Joshi, who has guided me, for the support and motivation that he has provided. He has always been willingly present whenever I needed the slightest support from him. I would not like to miss the chance to thank him for the time that he spared for me from his extremely busy schedule.

REFERENCES
[1] R. Lopez De Mantaras, D. McSherry, D. Bridge, D. Leake, B. Smyth, S. Craw, B. Faltings, M. L. Maher, M. T. Cox, K. Forbus, M. Keane, A. Aamodt, and I. Watson, "Retrieval, reuse, revise and retention in CBR," Knowl. Eng. Rev., vol. 20, no. 3, pp. 215-240, 2005.
[2] Y. Guo, J. Hu, and Y. Peng, "Research on CBR system based on data mining," Appl. Soft Comput., vol. 11, no. 8, pp. 5006-5014, 2011.
[3] H. Ahn and K.-J. Kim, "Global optimization of case-based reasoning for breast cytology diagnosis," Expert Syst. Appl., vol. 36, no. 1, pp. 724-734, 2009.
[4] B. Pandey and R. Mishra, "Case-based reasoning and data mining integrated method for the diagnosis of some neuromuscular disease," Int. J. Med. Eng. Informat., vol. 3, no. 1, pp. 1-15, 2011.
[5] Y.-B. Kang, A. Zaslavsky, S. Krishnaswamy, and C. Bartolini, "A knowledge-rich similarity measure for improving IT incident resolution process," in Proc. ACM Symp. Appl. Comput., 2010, pp. 1781-1788.
[6] F. Lorenzi and F. Ricci, "Case-based recommender systems: A unifying view," in Intelligent Techniques for Web Personalization, vol. 3169. Berlin, Germany: Springer, 2005, pp. 89-113.
[7] A. Aamodt and E. Plaza, "Case-based reasoning: Foundational issues, methodological variations, and system approaches," AI Commun., vol. 7, pp. 39-59, Mar. 1994.
[8] G. R. Beddoe and S. Petrovic, "Selecting and weighting features using a genetic algorithm in a case-based reasoning approach to personnel rostering," Eur. J. Oper. Res., vol. 175, no. 2, pp. 649-671, 2006.
[9] K. Bradley and B. Smyth, "Personalized information ordering: A case study in online recruitment," Knowl.-Based Syst., vol. 16, nos. 5-6, pp. 269-275, 2003.
[10] C. M. Vong, P. K. Wong, and W. F. Ip, "Case-based classification system with clustering for automotive engine spark ignition diagnosis," in Proc. 9th Int. Conf. Comput. Inf. Sci., Aug. 2010, pp. 17-22.
[11] F. Azuaje, W. Dubitzky, N. Black, and K. Adamson, "Discovering relevance knowledge in data: A growing cell structures approach," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 30, no. 3, pp. 448-460, Jun. 2000.
[12] Z. Y. Zhuang, L. Churilov, F. Burstein, and K. Sikaris, "Combining data mining and CBR for intelligent decision support for pathology ordering by general practitioners," Eur. J. Oper. Res., vol. 195, no. 3, pp. 662-675, 2009.
[13] L. Tong and D. Wu, "Research on optimization of case-based reasoning system," 3rd International Conference on Control, Automation and System Engineering, 2013.
[14] A. Stahl, "Learning of knowledge-intensive similarity measures in case-based reasoning," Ph.D. dissertation, Artificial Intelligence Knowledge-Based Systems Research Group, Tech. Univ. Kaiserslautern, Kaiserslautern, Germany, 2003.
[15] R. Agrawal, T. Imielinski, and A. Swami, "Mining association rules between sets of items in large databases," ACM SIGMOD Rec., vol. 22, no. 2, pp. 207-216, Jun. 1993.
[16] B. Liu, W. Hsu, and Y. Ma, "Integrating classification and association rule mining," in Knowledge Discovery and Data Mining. Berlin, Germany: Springer, 1998, pp. 80-86.
[17] U. Y. Nahm and R. J. Mooney, "Using soft-matching mined rules to improve information extraction," in Proc. AAAI Workshop Adaptive Text Extract. Mining, 2004, pp. 27-32.
[18] L. Geng and H. J. Hamilton, "Interestingness measures for data mining: A survey," ACM Comput. Surv., vol. 38, no. 3, Article 9, Sep. 2006.
[19] H. Ravi Sankar and M. M. Naidu, "An Innovative Algorithm for Mining multilevel ARs," Proceedings of the 25th IASTED International Multi-Conference on AI and Applications, February 12-14, 2007, Innsbruck, Austria.
[20] Predrag Stanišić and Savo Tomović, "Apriori Multiple Algorithm for Mining Association Rules," Information Technology and Control, vol. 37, pp. 311-320, 2008.
[21] Mehmet Kaya and Reda Alhajj, "Mining Multi-Cross-Level Fuzzy Weighted Association Rules," Second IEEE International Conference on Intelligent Systems, vol. 1, pp. 225-230, 2004.
[22] V. Nebot and R. Berlanga, "Mining association rules from semantic web data," in Proc. 23rd Int. Conf. Ind. Eng. Appl. Appl. Intell. Syst., 2010, pp. 504-513.
[23] Y. B. Kang, S. Krishnaswamy, and A. Zaslavsky, "A Retrieval Strategy for Case-Based Reasoning using Similarity and Association Knowledge," IEEE, March 20, 2013.