Ahmed et al., Int J Med Lab Res 2017, 2(2): 60-76


ISSN 2456-4400

REVIEW ARTICLE INTERNATIONAL JOURNAL OF MEDICAL LABORATORY RESEARCH (IJMLR)

APPLICATION OF DATA MINING AND KNOWLEDGE MANAGEMENT IN
SPECIAL REFERENCE TO MEDICAL INFORMATICS: A REVIEW

Yogita Gupta1, Rana Khudhair Abbas Ahmed2*, Dr. Sandeep Kumar Kautish3
1 Assistant Professor, Aryabhatta Institute of Engineering & Technology, Barnala, Punjab, India
2 Alrafidain University College, Baghdad, Iraq
3 Professor – Computer Science, Guru Kashi University, Talwandi Sabo, Bathinda, Punjab, India

Received: 11 May, 2017 / Revision: 29 May, 2017 / Accepted: 15 July, 2017


ABSTRACT: As India moves towards a digital revolution, Data Mining and Knowledge Management are two areas that attract many researchers, because both have high potential for developing new techniques applicable to different domains of human life. Medical science is one area where new inventions and developments are emerging from the integration of Data Mining and Knowledge Management tools. The aim of this paper is to present a detailed survey of the available literature on recent advancements and notable contributions in the application of Data Mining and Knowledge Management tools, with a particular focus on Medical Informatics. After introducing the aim and scope of the topic, section two of the paper revisits the concepts of Data Mining and its conventional techniques, i.e. Probabilistic and Statistical Models, Rule Induction, Neural Networks and Analytical Learning, and ends by presenting the concept of Knowledge Management and its linkage with Data Mining and the medical science field. In section three, the relevant previous works on Data Mining and Knowledge Management are critically analyzed, explained and categorized on the basis of their applicability. Section four discusses these works and highlights the advantages and disadvantages of the various methods and tools, along with their limitations and the scope for future research. The authors conclude the paper by emphasizing the security and privacy concerns of medical data and by drawing readers' attention to the validation of the medical data used for medical judgments and decisions.

KEYWORDS: Knowledge Management, Data Mining, Medical Informatics, Informatics, Medical Decisions

INTRODUCTION:
The biomedical field is one of the areas in which huge amounts of data are generated on a daily basis. The digitization of critical biomedical data in the form of reports, anatomic images and patient records has resulted in a large volume of data. With the advent of the latest computer technologies in databases, we are able to store such large amounts of data in various public and private repositories. However, merely storing data is of no use unless it is stored in an organized manner and in the right context. Conventional data storage techniques were not capable of managing mass data efficiently because they failed to ensure the availability of the right data at the right time.

Corresponding Author:
Rana Khudhair Abbas Ahmed
Alrafidain University College, Baghdad, Iraq

www.ijmlr.com/IJMLR© All right are reserved



Data Mining therefore comes forward as a way to store huge data categorically and make it available as required. Data Mining can help with data storage and retrieval, but once again data by itself has no value until we derive useful information from it, a process called Knowledge Discovery in Databases (KDD). KDD helps us understand the complex physical, structural or biological behavior of the real-world entities to which the data belongs. Knowledge Discovery is one aspect of Knowledge Management, which deals with effective and efficient access to and use of data; it supports the decision-making process and ultimately helps ensure that the right decisions are made in a given situation. Although Data Mining and Knowledge Management have been discussed briefly above, a complete understanding of both is necessary. The following sections explain the background of Data Mining and Knowledge Management and their emerging collaboration, specifically in the field of biomedical and medical informatics.

The key motivation behind this study is the nature of data mining and knowledge management tools, which support decision making and are being widely adopted in a variety of application areas, e.g. military, marketing, banking, education, disaster management, weather forecasting and many more. The authors attempt to draw readers' attention to the valuable role and significant presence of data mining and knowledge management tools in numerous fields of medical science. Such tools are not only useful in the diagnosis of many diseases, such as hepatitis, diabetes, breast cancer, skin cancer, lung cancer, kidney failure, kidney stones, heart disease and liver disorders, but also in providing forecasting information that helps in treating them. Used in combination, data mining and knowledge management tools facilitate the prevention of various possible errors.

MATERIALS AND METHODS:

Data mining:

The field of Data Mining is about 25 years old. Its earliest researchers advocated the need for new technology and tools because conventional tools for data storage and management were unable to handle mass data in the desired way and, additionally, their productivity was too low in comparison with their cost. Data mining aims to analyze a given set of data or information in order to identify novel and potentially useful patterns (Fayyad et al., 1996). More precisely, Data Mining can be defined as the process of discovering patterns, associations or relationships among data by means of analytical techniques, resulting in a model from which knowledge can be obtained.

In a broader context, Data Mining techniques fall into two categories, unsupervised and supervised. As the name suggests, unsupervised techniques are those in which there is no guiding variable, such as a hypothesis or a model. Clustering is one such unsupervised method. Supervised techniques, on the other hand, are those in which a model is built prior to the analysis and the data is then applied to the model in order to estimate its parameters. With regard to biomedical and clinical data, the best part of Data Mining algorithms is that they are predictive, i.e. they can learn from past examples. The resulting model therefore presents formalized knowledge that can guide the choice among diagnostic options. A few of the most popular data mining techniques are described in the following subsections.
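
The distinction between unsupervised and supervised techniques can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the glucose readings, the labels and the function names (`two_means`, `fit_centroids`, `predict`) are hypothetical and are not taken from any of the works surveyed here. The unsupervised part groups the readings without labels (a tiny two-cluster k-means), while the supervised part estimates one centroid per labelled class from past examples and uses them to classify a new reading.

```python
from statistics import mean

# Hypothetical toy data: fasting glucose readings (mg/dL); all values are invented.
readings = [85.0, 90.0, 95.0, 150.0, 160.0, 170.0]
labels = ["normal", "normal", "normal", "diabetic", "diabetic", "diabetic"]


def two_means(values, iterations=10):
    """Unsupervised: split values into two clusters without using any labels.

    Assumes the values contain at least two distinct numbers.
    """
    low, high = min(values), max(values)  # initial cluster centres
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return low, high


def fit_centroids(values, targets):
    """Supervised: estimate one centroid per labelled class from past examples."""
    return {
        label: mean(v for v, t in zip(values, targets) if t == label)
        for label in set(targets)
    }


def predict(centroids, value):
    """Assign a new reading to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


cluster_centres = two_means(readings)    # found without labels
model = fit_centroids(readings, labels)  # learned from labelled examples
prediction = predict(model, 155.0)       # classify a previously unseen reading
```

In this sketch the clustering step recovers two groups purely from the shape of the data, whereas the supervised step can attach a clinically meaningful name to a new reading because it was given labelled examples to learn from.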


Probabilistic and statistical models

In data mining, the use of probabilistic and statistical analysis techniques dates back to the 1990s. Statistical analysis shares data analysis and knowledge discovery goals that are quite similar to those of machine learning methods, although its origins lie outside AI research. Popular techniques used in biomedical data analysis include time series analysis, multi-dimensional scaling, regression analysis and principal component analysis. These techniques are often regarded as more established and reliable than the latest machine learning techniques, especially in the field of medical data analysis.

In the context of biomedical analysis, the Bayesian model is considered a leading model in the field. With its roots in pattern recognition (Duda and Hart, 1973), the Bayesian model stores, on the basis of training data, the probability of each class, the probability of each feature and the probability of each feature given each class; these probabilities are then used to classify new instances. When the features are assumed to be independent of one another given the class, the approach is known as the Naïve Bayesian model, which is one variation of the Bayesian model. The Naïve Bayesian model has been accepted in different domains because of its simplicity and ease of understanding (Fisher, 1987; Kononenko, 1993). In typical biomedical data mining, the Bayesian model has been used for microarray and genomic analysis because it is mathematically rigorous and allows sophisticated modeling, while remaining a fairly simple and understandable model.

Of all machine learning techniques, the support vector machine (SVM) model has become very popular in recent times because of its powerful mechanism, which is based on statistical learning theory: two or more classes are separated by a hyperplane (Vapnik, 1998) that is found from a statistical analysis of the features of the different classes. SVMs have been applied with positive results to several biomedical classification problems, for example disease state classification based on genetic variables and medical diagnosis based on patient indicators. SVMs are also used in document classification, where they have shown the best performance among several learning methods (Joachims, 1998; Yang and Liu, 1999).

Symbolic learning and rule induction

There are a number of ways in which learning can take place, but people tend to choose the easiest one, through which a concept can be understood without much effort. Symbolic learning is one such technique and has proved very effective in the learning process. It is a key process in knowledge discovery in databases (KDD) and medical data mining. Common types of symbolic learning include learning from examples, learning by being told, rote learning and learning from discovery (Carbonell et al., 1983). Learning from examples has emerged as one of the most effective tools in symbolic learning, specifically in the field of knowledge management and data mining. In this approach, a general concept description is induced that best describes the different classes of training examples. General concept descriptions can be generated using pattern identification algorithms, since patterns are essential for generating a general concept description. A couple of identification techniques are used by a number of
learning algorithms for recognizing various patterns. In this context, one of the most popular algorithms is Quinlan's ID3 decision-tree building algorithm (Quinlan, 1983). Given a set of objects, ID3 generates a decision tree that classifies all the given objects correctly. At each step, the algorithm selects the attribute that minimizes entropy, so that the objects are divided into classes in a way that minimizes the ambiguity of information. The process results in a decision tree or, equivalently, a set of production rules.

Although SVMs and neural networks are generally more effective than symbolic learning techniques, they tend to be neglected by biologists and physicians because of their complex "black-box" mechanism. Symbolic learning techniques, on the other hand, are widely popular because their results are understandable and easy to interpret. In biomedicine and bioinformatics it is essential that data interpretation techniques are easy to understand and efficient in application and results, which attracts many practitioners and researchers to utilize such techniques for the benefit of society.

Neural networks

An artificial neural network is an abstract computational model, or graph, consisting of dynamic nodes interconnected by weighted links (like the neurons of the human nervous system). It is modeled on the human nervous system and aspires to attain human-like performance. The approach to the representation of knowledge can be considered the key difference between symbolic learning and neural networks. In neural networks, knowledge is represented by interconnected neurons, weighted synapses and threshold logic units (Rumelhart et al., 1986), in contrast to the decision trees and production rules primarily used in symbolic learning techniques. Learning algorithms adjust the connection weights in the network so that unknown examples can be classified correctly. To retrieve concepts and knowledge from the network, activation algorithms can be applied over the nodes (Chen et al., 1995). Several computational models of neural networks have been developed over the years, but the most widely used is the feedforward network trained with backpropagation. In the feedforward model, information propagates only in the forward direction, from the input layer to the output layer through the hidden layer(s), if present (Rumelhart et al., 1986). There are no loops in this model, so propagating the information (a learning example) through the nodes produces a predicted output.

Analytic learning and fuzzy logic

In analytic learning, the modeling and representation of knowledge is primarily based on logical rules and reasoning, which may be achieved by applying fuzzy logic or similar techniques. Analytical proofs are generated on the basis of these logical rules and reasoning. Analytical problems need complex rules (compiled sets of proofs) in order to minimize exhaustive searching. Analytic learning represents knowledge as logical rules and performs reasoning over such rules to search for related proofs. Further,
proofs can be compiled into more complex rules, which can later be used to solve similar problems with a smaller number of searches. For example, to improve the speed of a parsing system, Samuelson et al. (1991) proposed the use of analytic learning for representing grammatical rules. Traditional analytic learning systems depend on rigid computing rules, whereas in the real world there is often no clear distinction between values and classes. Fuzzy systems are well suited to this kind of situation. Fuzzy logic and fuzzy systems allow the values of true and false to range over the real numbers from 0 to 1 (Zadeh, 1965). In various new applications, fuzzy logic principles are also used to handle imprecision and approximate reasoning. Analytic learning and fuzzy logic are therefore needed in biomedical knowledge discovery.

Big data techniques

Though the concept of Big Data is not very new, nowadays it is a major buzzword in the IT industry. Healthcare is one of the areas on which Big Data technologies have had a huge impact. The three Vs of data, i.e. velocity (the pace at which data is generated), variety and volume (P. Zikopoulos, 2011), are inherent to the data produced by the healthcare field. According to the McKinsey Global Institute, if the US healthcare sector were to use big data effectively, the sector could create more than $300 billion in value every year. Image processing, signal processing and genomics (Ashwin Belle, 2015) are the major areas in which Big Data technologies have been widely used in recent times.

Hybrid approach

The hybrid approach to knowledge discovery has become very popular nowadays, and several practical biomedical knowledge management and data mining systems adopt it. As Langley and Simon (1995) pointed out, the reasons for differentiating the paradigms are "more historical than scientific". Because the boundaries between the different paradigms are indistinct, many systems have been built that combine different approaches.

Knowledge management

Researchers (Davenport et al., 1998) have advocated Knowledge Management as an effective tool for understanding the various dimensions of organizational performance and productivity. Although there is no universally accepted definition of knowledge, a few core characteristics have been identified: knowledge is abstract in nature, inferential and essential for decision making. Healthcare data has considerable potential for generating knowledge, and the emerging tools of Data Mining and Knowledge Management have served this purpose effectively over the past decade.

Knowledge Management is a contemporary philosophy that has been utilized successfully in business in recent years. It is essentially an extension of three basic trends:
• The expansion of globalization in the business world, which forces industry to try to gain a competitive advantage over competitors and to learn how to make products "better, cheaper and faster".
• The widespread use of digital information (texts, audio, videos and pictures) with the help of the Internet,
which makes all such information available 24x7 in the desired form.
• The increasing complexity of business processes, multiplied by increasing customer demands and expectations, which compels businesses to deliver "the right information at the right time in right manner".

Knowledge Management came into existence in the late 1990s and inspired business houses to apply many revolutionary changes to all business processes, i.e. production, supply chain, sales, accounts and finance, and even human resource management. These changes concerned acquiring, storing, processing, interpreting and utilizing the data residing at various levels (both vertical and horizontal) of the business units. The popularity of Knowledge Management soon spread, and many big business houses, e.g. IBM, Shell and British Petroleum, started using it and gained huge profits from it. Knowledge Management is now an established discipline that focuses on collaboration among people, processes and technologies in such a way that it results in "knowledge", which is the driving force behind the goals of the business.

Specifically in the context of Medical Informatics, the use of knowledge management has increased drastically over the past two decades as a result of heavy investment in improving the quality of health care services. This is still not sufficient, especially when 60% of the world population is not yet in a position to benefit from such advancements in medical science because of poverty, but knowledge management has nevertheless served as a major driving force in recent advances in the medical sciences. The use and application of knowledge management in medical informatics can be divided into a few major areas. Content Management is the first of these: medical reports, prescriptions and lab reports are the contents that can be managed effectively by the use of knowledge management. It can serve as a centralized "library" with various "layers" of information, which can be disseminated by professionals working at different levels in the medical field, such as doctors, medical laboratory staff, medical data experts and researchers. Knowledge Transfer is another tool of knowledge management; it is based on knowledge sharing, which can change the behavior and outcomes of processes. Knowledge sharing is the diffusion of innovation within the processes of an organization, and this knowledge transfer promotes the adoption and implementation of critical success factors. The results obtained after the knowledge transfer process form the next area, which needs to be assessed in order to get the maximum benefit from Knowledge Management. This is referred to as tracking the results in terms of outcome measures, process measures and satisfaction measures, and interpreting the performance indicated by all three.
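
As a closing illustration for this section, the entropy-based attribute selection that the symbolic learning subsection attributes to ID3 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Quinlan's implementation: the toy records, the attribute names (`fever`, `cough`), the class labels and the helper names (`entropy`, `split_entropy`, `best_attribute`) are all hypothetical, and only the single attribute-selection step is shown, not the full recursive tree construction.

```python
from collections import Counter
from math import log2

# Hypothetical toy records: (attribute values, class label); all values are invented.
records = [
    ({"fever": "yes", "cough": "yes"}, "flu"),
    ({"fever": "yes", "cough": "no"}, "flu"),
    ({"fever": "no", "cough": "yes"}, "cold"),
    ({"fever": "no", "cough": "no"}, "cold"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_entropy(rows, attribute):
    """Weighted average entropy of the subsets produced by splitting on attribute."""
    subsets = {}
    for features, label in rows:
        subsets.setdefault(features[attribute], []).append(label)
    return sum(len(s) / len(rows) * entropy(s) for s in subsets.values())


def best_attribute(rows, attributes):
    """ID3-style choice: the attribute whose split leaves the least entropy."""
    return min(attributes, key=lambda a: split_entropy(rows, a))


chosen = best_attribute(records, ["fever", "cough"])
```

On this toy data, splitting on `fever` separates the two classes completely, so it is the attribute selected; a full ID3 implementation would repeat the same choice recursively on each resulting subset until every subset contains a single class.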

RESULTS & FINDINGS


In this section we explain the literature that was available and studied, highlighting its applicability, uses and limitations (if any). A comparative, summarized tabular statement is also given of recent advances in the biomedical field, covering almost all notable developments from 2000 to the present.


Data mining techniques

Prather et al. (1997) used Knowledge Discovery in Databases (KDD) to identify factors for improving various parameters of the prenatal care of obstetrical patients. Breault et al. (2002) applied the CART tool to a data warehouse of diabetic patients and identified factors affecting diabetes. They were the first to discover that younger age predicts poor diabetes control. Su et al. (2006) also presented a model for diabetes control using data mining tools. Table 1 also shows research of a similar type. Wilson et al. (2004) applied KDD technologies to pharmacovigilance in order to detect signals earlier than conventional methods. Lian et al. (2003) presented research on prescription probability in which they correlated the specified prescription with a preference function based on the preferences of patients in prior clinical experience. It was an excellent application of probability theory and resulted in a dose optimization framework. Susan and Warren (2000) proposed a conditional probability (CP) model for optimizing drug lists with the application of discriminant analysis and multiple linear regression. It demonstrated the relationship between diagnosis and medication and proposed a posterior probability based on a prior probability, where the former refers to what medication is needed and the latter to what diagnosis has been made. This approach is quite similar to that of Warren et al. (1998).

Emerging concepts

Many researchers have combined artificial intelligence and data mining techniques and proposed improved models for decision support in clinical situations. Huang (2007) combined Case Based Reasoning (CBR) and data mining to find a solution model for supporting chronic disease treatment and proposed a system named chronic diseases prognosis and diagnosis (CDPD). Kuo et al. (2007) discovered useful rules from different clusters of insurance databases by integrating cluster analysis and association rule mining techniques.

As medical information systems become widespread, various types of medical data are emerging, including text, documents, audio, speech, hypertext, graphics and images. New developments in the Internet era are another dimension that contributes to the growing variety of data types.
• Multimedia data is one type of data that is the subject of next-generation data mining techniques. Text mining, image mining, web mining and video mining are part of this category, in which much research is under way or has been conducted recently. Apart from medical images and signals, unstructured free text is a highly prominent area of research because such data is difficult to interpret. The process of extracting useful information and knowledge from textual documents or data is called text mining. For example, text mining techniques can be used to search different text documents for text related to the causes of diabetes. Cohen and Hunter define text mining as "the use of automated methods for exploiting the enormous amount of knowledge available in biomedical literature" (Cohen and Hunter, 2008). Semantic parsing and the hidden vector state model (Zhou et al., 2006) can be used to mine unstructured text.
• Image mining is another area in which huge amounts of medical data are available, because
medical procedures employ images as the preferred tool for medical diagnosis. Mining images is a quite different undertaking, because images are more complex data structures than other types such as text and documents. Broadly, most image mining techniques focus on searching, retrieving and comparing the similarities and differences between the features of a query image and those of stored images. One classic piece of research in image mining used different data mining techniques for tumor classification in digital mammography (Antonie et al., 2001); its findings suggest that association rules can provide better results than neural networks. Another study introduced the concept of image sequence similarity patterns (ISSP) (Zhu et al., 2003) for scanning brain images in order to discover a possible Space-Occupying Lesion (PSO). 3D informatics has become very popular over the last few years, supported by recent advancements in image enhancement techniques. Basically, 3D informatics refers to gathering, manipulating, classifying, storing, retrieving, navigating, presenting and displaying highly complex multi-dimensional data. Such a representation may include different dimensions, independent or dependent, namely position, time and scale. 3D medical informatics is a relatively new discipline compared with other mining techniques. Its origins go back to Roentgen's discovery of X-rays in 1895 and their successful use for imaging the human body. CAT scans, tomography and MRI scans arrived in the latter half of the 20th century, and 3D scanning now offers a refined, multi-dimensional view into the human body, which makes medical practice more effective and efficient.
• Another field that has expanded greatly with recent advancements in the digital revolution is audio and digital TV. Researchers are encouraged to discover and reveal useful patterns in video content; technically, this field is known as video mining. Technical staff often prefer to use a camera to record each operation, which creates a good opportunity for applying data mining techniques together with video retrieval methods. One empirical study in video mining presents a video database management framework (Kohavi et al., 2002) for video structure and event mining.
• The World Wide Web is undoubtedly the largest database that has ever existed. Hence web mining can draw on the biggest source of medical records and data. More than 80% of such data is available over the web in the form of electronic documents, which are widespread and freely accessible to everyone. The main difficulty in using this huge volume of data for knowledge discovery is the dynamic nature of web pages, i.e. their contents change constantly. Technically defined, web mining is the process of using data mining techniques to automatically retrieve and extract information from the Internet for knowledge discovery.

The table below provides a comprehensive comparative statement of research and development in Medical Informatics using data mining tools and Knowledge Management concepts.


Table 1: Comparative Statement


| Reference | Summary | Limitation, if any |
|---|---|---|
| Megalooikonomou et al. (2000) | Introduced statistical methods for knowledge discovery of patterns and associations between clinical data and images | Compatibility problems in types of images and their uses |
| Brossette et al. (2000) | Introduced Data Mining Surveillance System (DMSS), which discovers useful patterns related to infections and antimicrobial resistance from clinical data | No support for images and their interpretations |
| Antonie et al. (2001) | Classification of medical images and detection of anomalies | - |
| Coulter et al. (2001) | Presented and discovered a relation between antipsychotic drugs and myocarditis | Work was limited to a specific kind of drugs |
| Li et al. (2004) | Discovered a cancer detection method using feature detection methods and compared the obtained results in order to detect protein patterns | - |
| A.M. Wilson et al. (2004) | Earlier detection of medical signals using pharmacovigilance methods | - |
| Delen et al. (2005) | Developed prediction models on breast cancer using a dataset by combining two data mining algorithms (artificial neural networks and decision trees) along with a statistical method (logistic regression) | - |
| Su et al. (2006) | One of the prominent approaches to predicting diabetes, using four data mining methods in combination | - |
| Phillips-Wren et al. (2008) | Assessment of the utilization of healthcare resources by lung cancer patients in relation to their demographic characteristics, such as ethnic background and medical histories, to guide medical decision making and the drafting of public medical policies | - |
| Z.Y. Zhuang (2009) | Intelligent DSS for pathology test ordering by GPs with the use of Case Based Reasoning (CBR) | - |
| Lopez-Vallverdu et al. (2012) | Wearable sensing systems | Used only for personal health care; expensive technique |
| Gaura et al. (2013) | Used experimental sensor data to monitor non-clinical health data | Research performed on non-medical data |
| Huang et al. (2013) | Online data mining of abnormal period patterns from medical sensor data streams | Applicable only to online tools and difficult to interpret |
| Clifton et al. (2013) | Proposed personalized e-health monitoring using wearable sensors | Deviations in results of different samples |
| F. E. Dewey et al. (2014) | Proposed actionable recommendations for analyzing genome-scale data | Implications of current public health policies, delivery of care and its associated costs |
| T. G. Kannampallil et al. (2014) | Presented a rational analysis of information seeking behavior in critical care | - |
| T. Hussain and Q. T. Nguyen (2014) | Proposed a molecular imaging technique for cancer diagnosis and surgery | - |
| K. Bernatowicz et al. (2015) | Presented the concept of 4D CT (four-dimensional tomography) | The technology is still at the inception stage |
| C. M. C. Tempany et al. (2015) | Proposed the Advanced Multimodal Image-Guided Operating (AMIGO) Suite incorporating an angiographic X-ray system, MRI, 3D ultrasound, and PET/CT imaging | Used only for cancer therapy |
| W. Y. Hsu (2015) | Proposed a segmentation-based compression technique | Reducing the volume of data while maintaining important data such as anatomically relevant data |
| L. Qu, F. Long, and H. Peng (2015) | Presented 3D registration of biological images | Used only for segmentation and annotation of images |
| J. M. Blum, H. Joo, H. Lee, and M. Saeed (2015) | Implemented a hospital-wide waveform capture system | Quality and accuracy of data is a concern |
| M. Attin, G. Feld, H. Lemus et al. (2015) | In-hospital early detection system for cardiac arrest based on electrocardiograph parameters from telemetry along with demographic information | - |

Knowledge management tools

Artificial Intelligence techniques can be considered the first methods used for knowledge management in the biomedical field. This began in the 1970s, when the MYCIN program was developed to support medical decision making (Shortliffe et al., 1976). MYCIN was an intelligent computer program in which knowledge obtained from experts was represented as a set of programming constructs such as IF-THEN rules. It motivated another type of system, called expert systems, which became a very popular knowledge management tool between 1980 and 2000. The concept of an expert system was to feed knowledge into the system in the form of reasoning rules, which the system would then use when making decisions. Though MYCIN was very effective and an early success, it could not be used in actual clinical settings. There were two major reasons. The first was resistance from people who were unfamiliar with computer technologies; many medical practitioners did not even believe that a computer could perform better than humans. The second was the heavy cost of establishing such systems, because computers were bulky and extremely expensive machines in the 1970s. Heathfield et al. advocated patient record management systems as a highly desirable knowledge management tool in clinical settings (Heathfield et al., 1999). Dawes et al. pointed out that the major reasons behind this desire are the information needs of medical practitioners and clinical information overload (Dawes et al., 2003). Hersh et al.

(1996) classified clinical information into two broad categories: patient-specific clinical information and knowledge-based information. Both types of information are growing at a rapid pace, as a huge number of medical reports, academic research papers, books and technical reports are generated daily. Although early clinical systems were little more than ordinary data storage systems, they can be considered first-generation knowledge management systems. Ontologies have been widely used in biomedical knowledge management for over two decades. An ontology is essentially a specification of a conceptualization: it defines the existing relationships and describes the terminology of a particular domain. The major use of ontologies is to facilitate knowledge sharing among people, information networks, software and knowledge management applications. The HELP system, developed in 1991, supported the monitoring of a traditional medical record management system. The SAPHIRE system performed automatic indexing of radiology reports by using the UMLS Metathesaurus (Hersh et al., 2002).

Other than reporting and formatting clinical information, many other knowledge management resources are available in the form of research articles and reports, and most of them are accessible through digital libraries. The National Library of Medicine (NLM) provides the PubMed service, which contains over 13 million citations for biomedical articles from different medical journals. Regarding the application of different models and techniques, MARVIN (Baujard et al., 1998) is a medical information retrieval system that uses machine learning techniques. It is based on a multi-agent architecture that identifies relevant documents from a given set of web pages using machine learning methods and follows web links in order to retrieve new documents. Shatkay et al. used probabilistic similarity-based search to retrieve biomedical documents that are similar in theme (Shatkay et al., 2002). Chau and Chen applied artificial neural networks to filtering web pages (Chau and Chen, 2004). Han and Byun (2004) used a three-dimensional display for the visualization of protein interaction networks. This is very similar to Virtual Reality, which has been successfully applied to visualizing metabolic networks (Rojdestvesski, 2003).
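The IF-THEN rule representation described for MYCIN at the start of this section can be illustrated with a small sketch. The Python code below is a minimal forward-chaining rule engine written for illustration only: the rule contents, the `Rule` class and the `forward_chain` function are hypothetical names and examples, not a reconstruction of MYCIN's actual rule base or inference procedure, and the toy rules are not clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """IF every item in `conditions` is a known fact THEN assert `conclusion`."""
    conditions: frozenset
    conclusion: str


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no rule can add a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # A rule fires when all of its conditions are already known.
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                changed = True
    return known


# Hypothetical toy rules, for illustration only (not clinical guidance).
RULES = [
    Rule(frozenset({"fever", "stiff_neck"}), "suspect_meningitis"),
    Rule(frozenset({"suspect_meningitis"}), "recommend_lumbar_puncture"),
]

if __name__ == "__main__":
    print(sorted(forward_chain({"fever", "stiff_neck"}, RULES)))
```

Keeping the rules as data, separate from the inference loop, reflects the idea in the text that expert knowledge is fed into the system and then used by it when making decisions: new rules can be added without changing the engine.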

DISCUSSION AND ANALYSIS:

Effective use of data mining in the medical field is a need of the time, owing to the idiosyncratic nature of the medical profession. It demands widespread changes and transformations in processes, which in turn require a thorough understanding of the requirements of the healthcare sector. Data mining techniques can yield the requisite knowledge, which can lead to successful decisions that improve the overall success ratio of medical experiments and treatments. This will not only benefit the health of patients but will also be a value addition in the service of mankind. This review of data mining applications and knowledge management concepts has given us an overview of current practices and further challenges. Health care organizations and agencies could benefit from the effective use of these applications and find useful ideas for knowledge discovery from their own database systems. In further sub-sections we present the key advantages and disadvantages of Data Mining and Knowledge Management tools

and techniques while highlighting major limitations of the same.

Advantages of knowledge management and data mining

From the evaluation of the material above, it can be concluded that knowledge management and data mining have vast scope and advantages, which enable both techniques to support better health delivery. The advantages are listed below:

• Quality of care – Oranzo et al. (2008) suggested that the adoption of knowledge management techniques can enhance the quality of care, specifically in the health care domain. Chae et al. (2001) showed that knowledge management tools are very effective in different domains, including health insurance. Besides this, the efficiency of work can be enhanced by applying knowledge management tools in day-to-day work (Davenport et al. 2002). Similar research conducted by Goddard et al. (2004) showed that knowledge management tools are useful in public health care decision making.
• Cost Reduction – Lamout et al. (2007) argued that information sharing among the stakeholders of an organization can lead to cost-effective use of health resources. McElroy (2005) suggested that the use of knowledge management tools has significantly reduced the high cost of medical errors.
• Medical error reduction – Abidi et al. (2001) studied the impact of knowledge management practices used by medical practitioners and found that they provide effective decision support in the process of medical error reduction. Case-based reasoning can also be used for the same purpose (Montani et al., 2002). Additionally, knowledge management tools have proved to be an effective means of reducing medical prescription errors, which leads to cost reduction as well (Melymuka, 2002).
• Scope of Innovation – In recent times, innovation has proved to be the major driver of new developments and advancements in organizations, and the medical field is no exception. In fact, innovation is one of the essential parts of knowledge management, as it includes the sharing of concepts, learning and experiences so that more people can become aware of them and provide further refinements and enhancements to the same tools. Buchan & Hanka (1997) advocated distributed knowledge management paradigms for the management of clinical knowledge. Ansel et al. (2007) further identified innovation facilitation methodologies, followed by an analysis of knowledge flow barriers that may hamper effective communication in the organization.
• Mobility – The current era can be called the mobile era, and the future also belongs to mobile devices and mobility. Hubert et al. (2006) highlighted the explosion of mobile, interactive devices and e-homecare solutions. In fact, mobile services have emerged in the form of virtual communities in which knowledge is captured, shared and disseminated by the individuals involved in communication networks, including patients and medical practitioners.

Challenges of knowledge management and data mining

Based on an extensive review of the literature and recent practices, the following challenges of medical informatics have been identified:

• The growing volume of medical data is resulting in a growing number of challenges and opportunities, which include data analysis, capturing and managing data, and data mining tools. As


the diversity and volume of data increase, researchers feel a growing need to develop innovative and effective ways of capturing, retrieving and using the knowledge in this data. Effective and efficient management of this knowledge is one of the emerging challenges of the field. Gene expression is one biomedical concept that makes extensive use of data mining techniques. Medical researchers measure the expression level of all genes of a particular tissue under a given condition and conduct a comparative analysis against the expression levels of the same genes in the same tissue under different conditions. This process is known as differential gene expression. The major concern and challenge in gene expression is the management of experimental data, because a single gene expression measurement generates millions of data points.
• Another challenge is data analysis, as a number of the tools available for the purpose do not implement the latest algorithms and methods. However, a few open source collaborations are available to researchers for conducting array analysis. We can conclude that new advancements are needed to mine gene expression data effectively.
• Privacy is another significant challenge of the biomedical field that warrants attention. Broadly defined, privacy refers to controlling the release of information and eliminating any possibility of intrusion or disturbance. In the medical context, we share sensitive data and information with medical practitioners, which makes it vulnerable to becoming public in ways over which we have no control. Medical data and records contain highly sensitive and confidential information, and it is therefore a very serious concern that such data be handled with proper caution and care so that privacy and security can be assured. Researchers or data analysts who use medical records and data do not have an automatic right to use such data and must obtain the necessary approval and consent (Berman, 2002) before using it. The United States of America has enacted the Health Insurance Portability and Accountability Act (HIPAA), which imposes several rules and standards for handling patient data in electronic form. The EU Data Protection Directive imposed similar regulations in Europe. Such legal provisions ensure the ethical and legal use of medical data only for social concerns and the benefit of society.
• Security is another challenge of the field, and a further result of the growing use of the Internet. The security concern is, however, relatively new in comparison with the other challenges. Numerous threats exist inside and outside organizations, such as malicious code, i.e. viruses, worms and Trojan horses. Attacks by such malicious code result in denial of service, theft of information and other kinds of intrusion. The major cause of such attacks is often vulnerabilities in operating systems. However, other factors may also be responsible, such as a lack of trained staff or a failure to use updated antivirus software.
• The different types of data used in medical science are another concern, as data is at the core of all types of decisions, and incorrect or incomplete data cannot lead to successful results. A few concerns related to the quality of data are discussed below. In the biomedical area it is very difficult to obtain satisfactory data mining results, because the data found in biomedicine is highly multifaceted and the quality of data is one of the major challenges limiting performance in this industry. The effect of data mining depends largely on the nature (quantity and quality) of the existing data, so that we can


conclude that data mining is a data-driven approach. However, the nature of biomedical data is still quite complex. Therefore, the following concerns are identified as key findings that should be addressed in order to improve the functioning of data mining techniques:

a. Volume of data - It is very difficult for any data mining method to achieve desirable results using raw data, because biomedicine is an area in which a huge amount of data is generated daily. Biomedical experts need to pre-process the data before applying a data mining technique. Pre-processing itself has different aspects that must be handled by medical professionals, and these aspects make data handling extremely rigorous and time consuming.
b. Nature of data - The dynamic nature of the data makes it difficult to handle and manage. Adding new information and updating data (new versions) is a constant process that cannot be avoided if we wish to gain the maximum benefit from data mining. Medical data commonly has changing values, i.e. new tests may produce new types of results, which makes medical data dynamic in nature. For data collection techniques, it is very difficult to build a noise-free database (one in which all noise is eliminated). It is reasonable to assume that the amount of noise in data collected in the future will be approximately the same as in the current data. Therefore, data mining methods should be less sensitive to noise.
c. Missing attribute values - Data mining methods generally face problems with missing attribute values, because the methods require a preset dimension for each data object in the database. In medical databases, the information gathered depends on patient care activities, not on organized research protocols. As an example, the University of Wisconsin Hospitals encountered the problem of missing values in some large medical databases, such as breast cancer data sets. To address this issue, they skipped the records having missing attribute values. Another approach to this difficulty is to substitute the missing values with values of a similar kind. The missing values can also be filled in by deriving them from existing values through Artificial Intelligence methods (e.g. case-based reasoning).
d. Redundant, insignificant or inconsistent data - Data redundancy, inconsistency or insignificant data leads to data corruption. In such situations, data objects and attributes contain repeated, non-relevant values in data sets, which must be avoided by designing the database correctly. Pre-processing of data is the ideal way to obtain useful and desirable data. In the Biomedical Informatics field, databases contain huge amounts of data in textual and numeric formats; thus, data should be pre-processed in order to eliminate the possibility of redundancy and inconsistency. As an example, in medical terminology one condition or prescription may be referred to by several names (e.g. stomach pain and abdominal pain); inconsistency also commonly arises from misspelled medical terms.

From the perspective of data quality in the database, several considerations have also


been made. One is the quality of the learning mechanism, which affects the mining process in several ways. One of these is under- and over-learning, which occurs when the learning mechanism misinterprets human preferences, so that human involvement is needed to reach the target. The quality of knowledge representation is another perspective. Representation of knowledge or information in the biomedical area is a significant tool for patient data and clinical trials. The aim of knowledge representation is to make things easy to understand. If a machine is not capable of storing the knowledge it discovers and representing it, the machine cannot be called an intelligent machine. The nature of the problem also plays an important role in data mining methods. Sometimes the intelligent system or machine has insufficient time or knowledge to solve the problem and provide a correct result.

CONCLUSION:

In this paper, we provided a detailed and refined overview of the various technologies and tools of Data Mining and Knowledge Management, with emphasis on the tools and techniques used in medical informatics. In addition, the paper presents a timeline and flowing summary of all major tools and applications of data mining and knowledge management that are used in medical informatics. Over the past two decades, data mining and knowledge management have served as a major driving force for advancements in medical informatics, and we have witnessed numerous developments in the field of bioinformatics. However, medical data and information should be handled with due care in order to address the necessary security and privacy concerns. It must be a priority that patients' records and details are never used for any illegal or unethical practice. Additionally, the confidentiality and privacy of such records must not be compromised at any level as a result of recent advancements in medical fields. Another point that needs attention is the interpretations and judgments that are based on findings discovered by computers using data mining and knowledge management tools. Such interpretations must be validated in the same way as any other knowledge generated by humans. Incorrect interpretations should be publicly corrected through the media so that people understand the consequences and the actual facts of the situation.
informatics. Nonetheless, the paper presents

REFERENCES:

1. Abidi, S. S. R. (2001). "Knowledge Management in Healthcare: Towards 'Knowledge-driven' Decision-support Services," International Journal of Medical Informatics, 63, 5-18.
2. Antonie M. L., Zaiane O. R., and Coman A. (2001), "Application of data mining techniques for medical image classification," in Proceedings Second International Workshop on Multimedia Data Mining, pp. 94–101.
3. Ashwin Belle, Raghuram Thiagarajan, S. M. Reza Soroushmehr, Fatemeh Navidi, Daniel A. Beard, and Kayvan Najarian (2015), "Big Data Analytics in Healthcare," BioMed Research International, vol. 2015, Article ID 370194, 16 pages. doi:10.1155/2015/370194
4. Brossette S. E., Sprague A. P., Jones W. T., and Moser S. A. (2000), "A data mining system for infection control surveillance," Methods of Information in


Medicine, Vol. 39, No. 4-5, pp. 303–310.
5. Carbonell, J. G., Michalski, R. S., Mitchell, T. M. (1983). "An Overview of Machine Learning," in R. S. Michalski, J. G.
6. Clifton, L.; Clifton, D.A.; Pimentel, M.A.F.; Watkinson, P.J.; Tarassenko, L. (2013), Gaussian processes for personalized e-health monitoring with wearable sensors. IEEE Trans. Biomed. Eng., 60, 193–197.
7. C. M. C. Tempany, J. Jayender, T. Kapur et al. (2015), "Multimodal imaging for improved diagnosis and treatment of cancers," Cancer, vol. 121, no. 6, pp. 817–827.
8. Coulter D. M., Bate A., Meyboom R. H. B., Lindquist M., and Edwards R. (2001), "Antipsychotic drugs and heart muscle disorder in international pharmacovigilance: Data mining study," British Medical Journal, 322, 1207–1209.
9. Delen D., Walker G., and Kadam A. (2005), "Predicting breast cancer survivability: A comparison of three data mining methods," Artificial Intelligence in Medicine, 34(2), 113–27.
10. Duda, R. O. and Hart, P. E. (1973). Pattern Classification and Scene Analysis, New York: John Wiley and Sons.
11. F. E. Dewey, M. E. Grove, C. Pan et al. (2014), "Clinical interpretation and implications of whole-genome sequencing," JAMA, vol. 311, no. 10, pp. 1035–1045.
12. Fisher, D. H. (1987). "Knowledge Acquisition via Incremental Conceptual Clustering," Machine Learning, 2, 139-172.
13. Gaura, E.; Kemp, J.; Brusey, J. (2013), Leveraging knowledge from physiological data: On-body heat stress risk prediction with sensor networks. IEEE Trans. Biomed. Circuits System.
14. Huang, G.; Zhang, Y.; Cao, J.; Steyn, M.; Taraporewalla, K. (2013), Online mining abnormal period patterns from multiple medical sensor data streams. World Wide Web, doi:10.1007/s11280-013-0203-y.
15. Hubert, R. (2006). Accessibility and usability guidelines for mobile devices in home health monitoring. SIGACCESS Accessibility and Computing (84), 26-29.
16. J. M. Blum, H. Joo, H. Lee, and M. Saeed (2015), "Design and implementation of a hospital wide waveform capture system," Journal of Clinical Monitoring and Computing, vol. 29, no. 3, pp. 359–362.
17. K. Bernatowicz, P. Keall, P. Mishra, A. Knopf, A. Lomax, and J. Kipritidis (2015), "Quantifying the impact of respiratory-gated 4D CT acquisition on thoracic image quality: a digital phantom study," Medical Physics, vol. 42, no. 1, pp. 324–334.
18. Kohonen, T. (1995). Self-organizing Maps, Springer-Verlag, Berlin. Kononenko, I. (1993). "Inductive and Bayesian Learning in Medical Diagnosis," Applied Artificial Intelligence, 7, 317-337.
19. Langley, P. and Simon, H. (1995). "Applications of Machine Learning and Rule Induction," Communications of the ACM, 38(11), 55-64.
20. Li L., Tang H., Wu Z., Gong J., Gruidl M., Zou J., Tockman M., and Clark R. (2004), "Data mining techniques for cancer detection using serum proteomic profiling," Artificial Intelligence in Medicine, 32(2), 71–83.
21. L. Qu, F. Long, and H. Peng (2015), "3D registration of biological images and models: registration of microscopic images and its uses in segmentation and annotation," IEEE Signal Processing Magazine, vol. 32, no. 1, pp. 70–77.
22. M. Attin, G. Feld, H. Lemus et al. (2015), "Electrocardiogram characteristics prior to in-hospital cardiac arrest," Journal of Clinical Monitoring and Computing, vol. 29, no. 3, pp. 385–392.
23. Megalooikonomou V., Ford J., Shen L., Makedon F., and Saykin A. (2000), "Data mining in brain imaging,"


Statistical Methods in Medical Research, Vol. 9, No. 4, pp. 359–394.
24. Prather, J. C., Lobach, D. F., Goodwin, L. K., Hales, J. W., Hage, M. L., and Hammond, W. E. (1997). "Medical Data Mining: Knowledge Discovery in a Clinical Data Warehouse," in Proceedings of the AMIA Annual Symposium Fall 1997, 101-105.
25. Philips-Wren G., Sharkey P., and Morss S. (2008), "Mining lung cancer patient data to assess healthcare resource utilization," Expert Systems with Applications: An International Journal, 35(4), 1611–1619.
26. P. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis (2011), Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill Osborne Media.
27. Rumelhart, D. E., Hinton, G. E., and McClelland, J. L. (1986a). "A General Framework for Parallel Distributed Processing," in D. E. Rumelhart, J. L. McClelland, and the PDP Research Group (Eds.), Parallel Distributed Processing, pp. 45-76, Cambridge, MA: The MIT Press.
28. Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986b). "Learning Internal Representations by Error Propagation," in D. E. Rumelhart, J. L. McClelland, and the PDP Research Group (Eds.), Parallel Distributed Processing, pp. 318-362, Cambridge, MA: The MIT Press.
29. Su, C. T., Yang, C. H., Hsu, K. H., and Chiu, W. K. (2006), "Data mining for the diagnosis of type II diabetes from three-dimensional body surface anthropometrical scanning data," Computers & Mathematics with Applications, 51(6–7), 1075–1092.
30. T. Hussain and Q. T. Nguyen (2014), "Molecular imaging for cancer diagnosis and surgery," Advanced Drug Delivery Reviews, vol. 66, pp. 90–100.
31. T. G. Kannampallil, A. Franklin, T. Cohen, and T. G. Buchman (2014), "Sub-optimal patterns of information use: a rational analysis of information seeking behavior in critical care," in Cognitive Informatics in Health and Biomedicine, pp. 389–408, Springer, London, UK.
32. W. Y. Hsu (2015), "Segmentation-based compression: new frontiers of telemedicine in telecommunication," Telematics and Informatics, vol. 32, no. 3, pp. 475–485.

CONFLICT OF INTEREST: Authors declared no conflict of interest
