Yogita Gupta¹, Rana Khudhair Abbas Ahmed²*, Dr. Sandeep Kumar Kautish³
¹ Assistant Professor, Aryabhatta Institute of Engineering & Technology, Barnala, Punjab, India
² Alrafidain University College, Baghdad, Iraq
³ Professor – Computer Science, Guru Kashi University, Talwandi Sabo, Bathinda, Punjab, India
INTRODUCTION:
The biomedical field is one of the areas in which huge volumes of data are generated on a daily basis. The digitization of critical biomedical data in the form of reports, anatomic images and patient records has resulted in large amounts of data. With the advent of modern database technologies, we are able to store such large amounts of data in various public and private repositories. However, merely storing data is of no use unless it is organized according to the right context. Conventional data storage techniques could not manage mass data efficiently because they failed to ensure the availability of the right data at the right time.
Corresponding Author:
Rana Khudhair Abbas Ahmed
Alrafidain University College, Baghdad, Iraq
which makes all such information available 24*7 in a desirable form.
The increasing complexity of business processes, multiplied by increasing customer demands and expectations, compels businesses to deliver "the right information at the right time in the right manner".

Knowledge Management came into existence in the late 1990s and inspired businesses to apply many revolutionary changes across all business processes, i.e. production, supply chain, sales, accounts and finance, and even human resource management. These changes concerned how to acquire, store, process, interpret and utilize the data residing at various levels (both vertical and horizontal) of the business units. The popularity of Knowledge Management soon spread, and many large companies, e.g. IBM, Shell and British Petroleum, adopted it and gained substantial profits from it. Knowledge Management is now an established discipline that focuses on collaboration among people, processes and technologies in such a way that it results in "knowledge", which is the driving force behind the goals of the business.

Specifically in the context of Medical Informatics, the use of knowledge management has increased drastically over the past two decades as a result of heavy investments in improving the quality of health care services. Though this is not sufficient, especially when 60% of the world population is still not in a position to utilize such advancements in medical science due to poverty, it is a fact that knowledge management has served as a major driving force in recent advances in medical sciences. The use and application of knowledge management in medical informatics can be divided into a few major areas. Content Management is the first of these: medical reports, prescriptions and lab reports are the contents that can be effectively managed through knowledge management. It can serve as a centralized "library" with various "layers" of information that can be disseminated by professionals working at different levels in the medical field, such as doctors, medical laboratory staff, medical data experts and researchers. Knowledge Transfer is another tool of knowledge management; it is based on knowledge sharing, which can change the behavior and outcomes of processes. Knowledge sharing is the process of diffusion of innovation within the processes of an organization; this can be referred to as knowledge transfer, and it supports the adoption and implementation of critical success factors. The results obtained from this knowledge transfer process are the next area that needs to be assessed in order to get the maximum benefit from Knowledge Management. This is referred to as tracking the results in terms of outcome measures, process measures and satisfaction measures, which together describe performance and support its interpretation.
Prather et al. (1997) used Knowledge Discovery in Databases (KDD) to identify factors for improving various parameters of prenatal care of obstetrical patients. Breault et al. (2002) applied the CART tool to a data warehouse of diabetic patients and identified factors affecting diabetes. They were the first to discover that younger age predicts poor control of diabetes. Su et al. (2006) also presented a model for diabetes control using data mining tools. Table 1 also shows one study of a similar type. Wilson et al. (2004) applied KDD technologies to pharmacovigilance in order to detect signals earlier than conventional methods. Lian et al. (2003) presented research on prescription probability, correlating the specified prescription with a preference function based on the preferences of patients in prior clinical experiences. It was an excellent application of probability theory and resulted in a dose optimization framework. Susan and Warren (2000) proposed a conditional probability (CP) model for optimizing drug lists with the application of discriminant analysis and multiple linear regression. It demonstrated the relationship between diagnosis and medication and proposed a posterior probability based on a prior probability, where the former refers to which medication is needed and the latter to which diagnosis has been made. This approach is quite similar to that of Warren et al. (1998).

Emerging concepts

Many researchers have combined artificial intelligence and data mining techniques and proposed improved models for decision support in clinical situations. Huang (2007) combined Case Based Reasoning (CBR) and data mining to find a solution model for supporting chronic disease treatment and proposed a model named the chronic prognosis and diagnosis (CDPD) system. Kuo et al. (2007) discovered useful rules from different clusters of insurance databases by integrating cluster analysis and association rule mining techniques.

As medical information systems become widespread, various types of medical data are emerging, including text, documents, audio, speech, hypertext, graphics and images. Similarly, new developments of the Internet era are another dimension contributing to the growing variety of data types. Multimedia data is one type of data addressed by next-generation data mining techniques. Text mining, image mining, web mining and video mining are also part of the same category, in which much research is ongoing or has been conducted in recent times. Apart from medical images and signals, unstructured free text is a highly prominent area of research because such data is difficult to interpret. The process of extracting useful information and knowledge from textual documents or data is called Text Mining. For example, we can use text mining techniques to search for text related to the causes of diabetes across different documents. Cohen and Hunter define text mining as "the use of automated methods for exploiting the enormous amount of knowledge available in biomedical literature" (Cohen and Hunter, 2008). Semantic parsing and the hidden vector state model (Zhou et al., 2006) can be used to mine given unstructured text.
Image mining is another area where huge medical data is available because
Artificial Intelligence techniques can be considered the first methods used for knowledge management in the biomedical field. This started in the 1970s, when the MYCIN program was developed to support medical decision making (Shortliffe et al., 1976). MYCIN was an intelligent computer program in which knowledge obtained from experts was represented as a set of programming constructs such as IF-THEN rules. It motivated another type of system, called expert systems, which became a very popular knowledge management tool between 1980 and 2000. The concept of an expert system was to feed knowledge into the system in the form of reasoning that the system would then use when making decisions. Though MYCIN was very effective and was an early success, it could not be used in actual clinical settings, for two major reasons. The first was resistance from people who were unfamiliar with computer technologies; many medical practitioners did not even believe that a computer could perform better than humans. The second was the heavy cost of establishing such systems, because computers were bulky and extremely expensive machines in the 1970s. Heathfield et al. advocated that patient record management systems are a highly desirable knowledge management tool in clinical settings (Heathfield et al., 1999). Dawes et al. pointed out that the major reasons behind this desire are the information needs of medical practitioners and clinical information overload (Dawes et al., 2003). Hersh et al.
(1996) classified clinical information into two broad categories: clinical information specifically related to patients, and knowledge-based information. Both types of information are growing at a rapid pace, as a huge number of medical reports, academic research papers, books and technical reports are generated on a daily basis. Though early clinical systems were nothing more than ordinary data storage systems, they can be considered first-generation knowledge management systems.

The use of ontologies in biomedical knowledge management has been widespread for over two decades. An ontology is basically a specification of a conceptualization, which defines the existing relationships and describes the terminology of a particular domain. The major use of ontologies is to facilitate knowledge sharing among people, information networks, software and knowledge management applications. The HELP System, developed in 1991, supported the monitoring of a traditional medical record management system. The SAPHIRE system was able to perform automatic indexing of radiology reports by utilizing the UMLS Metathesaurus (Hersh et al., 2002).

Other than reporting and formatting clinical information, there are many other knowledge management system tools, available in the form of research articles and reports, most of them accessible through digital library techniques. The National Library of Medicine (NLM) provides the PubMed service, which contains over 13 million citations for biomedical articles from different medical journals. Regarding the application of different models and techniques, MARVIN is one of the medical information retrieval systems that use machine learning techniques (Baujard et al., 1998). It is based on a multi-agent architecture that selects relevant documents from given sets of web pages using machine learning methods and follows web links in order to retrieve new documents. Shatkay et al. used probabilistic similarity-based search to retrieve biomedical documents that are similar in terms of themes (Shatkay et al., 2002). Chau and Chen applied artificial neural networks for filtering web pages (Chau and Chen, 2004). Three-dimensional display has been used for visualization of protein interaction networks (Han and Byun, 2004). This is very similar to Virtual Reality, which has been successfully implemented for visualizing metabolic networks (Rojdestvesski, 2003).
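The IF-THEN style of knowledge representation described above for MYCIN and later expert systems can be illustrated with a minimal sketch. It assumes a simple forward-chaining loop over toy rules; MYCIN itself used backward chaining with certainty factors, so this is not a reconstruction of it. The `Rule` class, the `forward_chain` function and the rule contents are hypothetical names chosen for illustration only and are not clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One IF-THEN rule: IF all conditions are known facts THEN add the conclusion."""
    conditions: frozenset
    conclusion: str


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule can derive a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # A rule fires when every one of its conditions is already known.
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                changed = True
    return known


# Hypothetical toy rules (assumption, for illustration only).
RULES = [
    Rule(frozenset({"fever", "stiff_neck"}), "suspect_meningitis"),
    Rule(frozenset({"suspect_meningitis"}), "order_lumbar_puncture"),
]

derived = forward_chain({"fever", "stiff_neck"}, RULES)
```

In this sketch the expert knowledge lives entirely in `RULES`, so it can be reviewed or extended without touching the inference loop, which is the separation the expert-system approach relies on.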
and techniques while highlighting major limitations of the same.

Advantages of knowledge management and data mining

From the evaluation of the material above, it can be concluded that knowledge management and data mining have vast scope and advantages that enable both techniques to provide better health care delivery. The advantages are listed below.

Quality of care – Oranzo et al. (2008) suggested that the adoption of knowledge management techniques can enhance the quality of care, specifically in the health care domain. Chae et al. (2001) showed that knowledge management tools are very effective in different domains, including health insurance. Besides this, the efficiency of work can be enhanced by applying knowledge management tools in day-to-day work (Davenport et al., 2002). Similar research conducted by Goddard et al. (2004) showed that knowledge management tools are useful in public health care decision making.

Cost Reduction – Lamout et al. (2007) argued that information sharing among the stakeholders of an organization can lead to cost-effective use of health resources. McElroy (2005) suggested that the use of knowledge management tools has significantly reduced the high cost of medical errors.

Medical error reduction – Abidi et al. (2001) studied the impact of knowledge management practices used by medical practitioners and found that they provided effective decision support in the process of medical error reduction. Case based reasoning can also be used for the same purpose (Montani et al., 2002). Additionally, knowledge management tools have proved to be an effective tool for reducing medical prescription errors, which leads to cost reduction as well (Melymuka, 2002).

Scope of Innovation – In recent times, innovation has proved to be the major driving tool for new developments and advancements in organizations, and the medical field is no exception. In fact, innovation is one of the essential parts of knowledge management, as it includes the sharing of concepts, learning and experiences so that more people become aware of them and can further refine and enhance the same tools. Buchan & Hanka (1997) advocated distributed knowledge management paradigms for the management of clinical knowledge. Ansel et al. (2007) further identified innovation facilitation methodologies, which are followed by an analysis of knowledge flow barriers that may hamper effective communication in the organization.

Mobility – The current era can be referred to as the mobile era, and the future also belongs to mobile devices and mobility. Hubert et al. (2006) highlighted the explosion of mobile, interactive devices and e-homecare solutions. In fact, mobile services have emerged in the form of virtual communities in which knowledge is captured, shared and disseminated by the individuals involved in communication networks, including patients and medical practitioners.

Challenges of knowledge management and data mining

Based on an extensive review of the literature and recent practices, the following challenges of medical informatics have been identified.
The growing volume of medical data is resulting in a growing number of challenges and opportunities, which include data analysis, capturing and managing data, and data mining tools. As
the diversity and volume of data increase, researchers feel a growing need to develop innovative and effective ways of capturing, retrieving and using the knowledge in this data. Effective and efficient management of this knowledge is one of the emerging challenges of the field. Gene expression is one biomedical concept that uses data mining techniques extensively. Medical researchers measure the level of expression of all genes of a particular tissue under a given condition and compare it with the levels of the same genes in the same tissue under different conditions. This process is known as differential gene expression. The major concern and challenge in gene expression is the management of experimental data, because a single gene expression measurement generates millions of data points. Another challenge is data analysis, as many of the tools available for the purpose do not implement the latest algorithms and methods. However, a few open source collaborations are available to researchers for conducting array analysis. We can conclude that new advancements are needed to mine gene expression data effectively.

Privacy is another significant challenge of the biomedical field that warrants attention. Broadly defined, privacy refers to controlling the release of information and eliminating any possibility of intrusion or disturbance. In the medical context, we share sensitive data and information with medical practitioners, which makes it vulnerable to becoming public in ways over which we have no control. Medical data and records contain extremely sensitive and confidential information; it is therefore a very serious concern that such data be handled with proper caution and care so that its privacy and security are assured. Researchers or data analysts who use medical records and data do not have an automatic right to use such data and are bound to obtain the necessary approval and consent (Berman, 2002) before using it. The United States of America has enacted the Health Insurance Portability and Accountability Act (HIPAA), which imposes several rules and standards for handling patient data in electronic form. The EU Data Protection Directive imposed similar regulations in Europe. Such legal provisions ensure the ethical and legal use of medical data solely for social concerns and the benefit of society.

Security is another challenge of the field, and is a further result of the growing use of the Internet. However, the security concern is relatively new in comparison with the other challenges. Numerous threats exist inside and outside organizations, such as malicious code, e.g. viruses, worms and Trojan horses. Attacks by such malicious code result in denial of service, theft of information and other kinds of intrusion. The major reason for such attacks is often vulnerabilities in operating systems. However, other factors may also be responsible, such as a lack of trained staff or failure to use updated antivirus software.

The different types of data used in medical science are another issue of concern, as data is at the core of all types of decisions, and incorrect or incomplete data cannot lead to successful results. A few of these concerns relate to the qualitative aspects of data. In the biomedical area, it is very difficult to obtain satisfactory data mining results because the data found in biomedicine is very multifaceted, and the quality of data is one of the major challenges in boosting performance in this industry. The effect of data mining depends largely on the nature (quantity and quality) of the existing data, so that we can
conclude that data mining is a data-driven approach. However, the nature of biomedical data is still quite complex. Therefore, the following concerns are identified as the key findings of the chapter in order to improve the functioning of data mining techniques:

a. Volume of data - It is very difficult for any data mining method to achieve desirable results using raw data, because biomedicine is an area in which huge volumes of data are generated on a daily basis. Biomedical experts need to pre-process the data before applying a data mining technique. In this context, pre-processing also has different aspects to be handled by medical professionals, and these aspects make data handling extremely rigorous and time consuming.
b. Nature of data - The dynamic nature of data also makes it difficult to handle and manage. Adding new information and updating data (new versions) is a constant process that cannot be avoided if we wish to gain the maximum benefit from data mining. Medical data commonly has changing values, i.e. new tests may produce new types of results, which makes medical data dynamic in nature. With current data collection techniques, it is very difficult to build a noise-free database (one in which all noise has been eliminated). It is reasonable to assume that the amount of noise in data collected in the future will be approximately the same as in the current data. Therefore, data mining methods should be less sensitive to noise.
c. Missing attribute values - Data mining methods generally face problems with missing attribute values, because the methods require a preset dimension for each data object in the database. In the context of medical databases, the information collected depends on patient care activities, not on organized research protocols. As an example, the University of Wisconsin Hospitals encountered the problem of missing values in some large medical databases such as breast cancer data sets. To address this issue, they skipped the records having missing attribute values. Another approach to eliminating this difficulty is to substitute the missing values with similar values. The missing values can be replaced by inferring them from existing values through artificial intelligence methods (e.g. case-based reasoning).
d. Redundant, insignificant or inconsistent data - Data redundancy, inconsistency or insignificant data leads to data corruption. In such a situation, data objects and attributes contain repeated, non-relevant values in the data sets; this must be avoided by designing the database correctly. Pre-processing the data is the ideal way to obtain useful and desirable data. In the Biomedical Informatics field, databases contain huge amounts of data in textual and numeric format, so data should be pre-processed in order to eliminate possible redundancy and inconsistency. As an example, in medical terminology one condition or prescription may be referred to by many names (e.g. stomach pain and abdominal pain); the same problem is commonly caused by misspelled medical terms.

From the perspective of data quality in the database, several considerations have also
been made. The quality of the learning mechanism is one of them; it affects the mining process in several ways, one of which is under- and over-learning. This condition occurs when the learning mechanism misinterprets human preferences and therefore needs human intervention to accomplish its target. The quality of knowledge representation is another perspective. Representation of knowledge or information in the biomedical area is a significant tool for patient data and clinical trials. The aim of knowledge representation is to make things easy to understand. If a machine is not capable of storing the knowledge it has discovered and of representing it, then it cannot be called an intelligent machine. The nature of the problem also plays an important role in data mining methods. Sometimes the intelligent system or machine has insufficient time or knowledge to solve the problem and provide a correct result.

CONCLUSION:

In this paper, we provided a detailed and refined overview of various technologies and tools of Data Mining and Knowledge Management, with emphasis on the tools and techniques used in medical informatics. In addition, the paper presents a timeline and a flowing summary of all major tools and applications of data mining and knowledge management used in medical informatics. Over the past two decades, data mining and knowledge management have served as a major driving force for advancements in the medical informatics field, and we have witnessed numerous developments in the field of bioinformatics. However, it is also recommended that medical data and information be handled with due care in order to address the necessary security and privacy concerns. It must be a priority that patients' records and details are not used for any illegal or unethical practice. Additionally, the confidentiality and privacy of such records must not be compromised at any level as a result of recent advancements in medical fields. Another point that needs attention is the interpretations and judgments based on findings discovered by computers using data mining and knowledge management tools. Such interpretations must be validated in the same way as any other knowledge generated by humans. Incorrect interpretations should be reported through the media so that people understand their consequences and the actual facts of the situation.
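The two ways of handling missing attribute values described under item c (skipping incomplete records, or substituting a value) can be sketched as follows. This is a minimal illustration under stated assumptions: the records and field names are hypothetical toy data, and mean substitution is an assumed stand-in for "similar values"; the text also mentions case-based reasoning, which is not shown here.

```python
from statistics import mean

# Hypothetical toy records (assumption); None marks a missing attribute value.
records = [
    {"age": 52, "tumor_size": 2.0},
    {"age": 47, "tumor_size": None},
    {"age": 61, "tumor_size": 4.0},
]


def drop_incomplete(rows):
    """Strategy 1: skip every record that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute_mean(rows, field):
    """Strategy 2: replace missing values of `field` with the mean of the observed ones."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = mean(observed)
    # Build new dicts so the input records are left unchanged.
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


complete_only = drop_incomplete(records)
imputed = impute_mean(records, "tumor_size")
```

Skipping records is simple but shrinks the data set, whereas substitution keeps every record at the cost of introducing estimated values, so the choice depends on how much data can be spared.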
REFERENCES:
Medicine, Vol. 39, No. 4-5, pp. 303–310
5. Carbonell, J. G., Michalski, R. S., Mitchell, T. M. (1983). "An Overview of Machine Learning," in R. S. Michalski, J. G.
6. Clifton, L.; Clifton, D.A.; Pimentel, M.A.F.; Watkinson, P.J.; Tarassenko, L. (2013). Gaussian processes for personalized e-health monitoring with wearable sensors. IEEE Trans. Biomed. Eng., 60, 193–197.
7. C. M. C. Tempany, J. Jayender, T. Kapur et al. (2015), "Multimodal imaging for improved diagnosis and treatment of cancers," Cancer, vol. 121, no. 6, pp. 817–827
8. Coulter D. M., Bate A., Meyboom R. H. B., Lindquist M., and Edwards R. (2001), "Antipsychotic drugs and heart muscle disorder in international pharmacovigilance: Data mining study," British Medical Journal, 322, 1207–1209.
9. Delen D., Walker G., and Kadam A. (2005), "Predicting breast cancer survivability: A comparison of three data mining methods," Artificial Intelligence in Medicine, 34(2), 113–27.
10. Duda, R. O. and Hart, P. E. (1973). Pattern Classification and Scene Analysis, New York: John Wiley and Sons.
11. F. E. Dewey, M. E. Grove, C. Pan et al. (2014), "Clinical interpretation and implications of whole-genome sequencing," JAMA, vol. 311, no. 10, pp. 1035–1045
12. Fisher, D. H. (1987). "Knowledge Acquisition via Incremental Conceptual Clustering," Machine Learning, 2, 139-172.
13. Gaura, E.; Kemp, J.; Brusey, J. (2013), Leveraging knowledge from physiological data: On-body heat stress risk prediction with sensor networks. IEEE Trans. Biomed. Circuits System.
14. Huang, G.; Zhang, Y.; Cao, J.; Steyn, M.; Taraporewalla, K. (2013), Online mining abnormal period patterns from multiple medical sensor data streams. World Wide Web, doi:10.1007/s11280-013-0203-y.
15. Hubert, R. (2006). Accessibility and usability guidelines for mobile devices in home health monitoring. SIGACCESS Accessibility and Computing (84), 26-29.
16. J. M. Blum, H. Joo, H. Lee, and M. Saeed (2015), "Design and implementation of a hospital wide waveform capture system," Journal of Clinical Monitoring and Computing, vol. 29, no. 3, pp. 359–362
17. K. Bernatowicz, P. Keall, P. Mishra, A. Knopf, A. Lomax, and J. Kipritidis (2015), "Quantifying the impact of respiratory-gated 4D CT acquisition on thoracic image quality: a digital phantom study," Medical Physics, vol. 42, no. 1, pp. 324–334
18. Kohonen, T. (1995). Self-organizing Maps, Springer-Verlag, Berlin.
Kononenko, I. (1993). "Inductive and Bayesian Learning in Medical Diagnosis," Applied Artificial Intelligence, 7, 317-337
19. Langley, P. and Simon, H. (1995). "Applications of Machine Learning and Rule Induction," Communications of the ACM, 38(11), 55-64.
20. Li L., Tang H., Wu Z., Gong J., Gruidl M., Zou J., Tockman M., and Clark R. (2004), "Data mining techniques for cancer detection using serum proteomic profiling," Artificial Intelligence in Medicine, 32(2), 71–83.
21. L. Qu, F. Long, and H. Peng (2015), "3D registration of biological images and models: registration of microscopic images and its uses in segmentation and annotation," IEEE Signal Processing Magazine, vol. 32, no. 1, pp. 70–77
22. M. Attin, G. Feld, H. Lemus et al. (2015), "Electrocardiogram characteristics prior to in-hospital cardiac arrest," Journal of Clinical Monitoring and Computing, vol. 29, no. 3, pp. 385–392
23. Megalooikonomou V., Ford J., Shen L., Makedon F., and Saykin A. (2000), "Data mining in brain imaging,"