The Automatic Discovery of Alarm Rules for the Validation of Microbiological Data

E. Lamma (a), M. Manservigi (b), P. Mello (c), A. Nanetti (d), F. Riguzzi (a), S. Storari (a)

(a) Dipartimento di Ingegneria, Università di Ferrara, Ferrara, Italy
(b) Dianoema S.p.A., Bologna, Italy
(c) D.E.I.S., Università di Bologna, Bologna, Italy
(d) Clinical, Specialist and Experimental Medicine Department, Microbiology section, Università di Bologna, Bologna, Italy

Abstract
In this work, we describe a project, jointly started by the University of Bologna and Dianoema S.p.A., to build a system able to validate microbiological data. Within the project we have experimented with data mining techniques in order to automatically discover association rules from microbiological data, and to obtain from them alarm rules to be used for data validation. To this purpose, we have exploited the WEKA system and applied it to a database containing data about bacterial antibiograms. Discovered association rules are then transformed into alarm rules, to be used for data validation within an expert system named ESMIS. Among the automatically produced alarm rules, we have identified some already considered in ESMIS and suggested by experts according to the NCCLS compendium, and new rules which were not present in that report, but were recommended by the interviewed microbiologists.

Keywords: Data mining, Knowledge Based System, Microbiology, Microbiological Data validation

Today, in a modern hospital microbiological laboratory, the process of producing analysis results is similar to an assembly line where both efficiency and quality are fundamental. With respect to efficiency, in Italy a great number of hospitals manage microbiological analysis results by means of a software system named Italab C/S, developed by Dianoema S.p.A., an Italian information technology company operating in the Health Care market. Italab C/S is a Laboratory Information System based on a Client/Server architecture, which manages all the activities of the various analysis laboratories of the hospital. Italab C/S stores all the information concerning patients, the analysis requests and the analysis results. In particular, for [...]

• information about the patient: sex, age, hospital unit where the patient has been admitted;
• the kind of material (specimen) to be analysed (e.g., blood, urine, saliva, pus, etc.) and its origin (the body part where the specimen was collected);
• the date when the specimen was collected (often substituted with the analysis request date);
• for every different bacterium identified, its species [...]

For each isolated bacterium, the antibiogram represents its resistance to a series of antibiotics. The set of antibiotics used to test bacterial resistance can be defined by the user, and the antibiogram is a vector of couples (antibiotic, resistance), where four types of resistance can be recorded: R when resistant, I when intermediate, S when susceptible [...]

The antibiogram is not uniquely identified given the bacterium species, but can vary significantly for bacteria of the same species. This is due to the fact that bacteria of the same species may have evolved differently and have developed different resistances to antibiotics. However, very often groups of antibiotics give similar responses when tested on a given bacterium species, regardless of its strains.

With respect to the quality of the results produced through microbiological analysis, an important step of the entire process is validation. Some instruments already execute intelligent controls on performed antibiotic test results, but these controls are limited because they have no information about the specimen, patient characteristics and infection history. A system capable of using all available information may represent a better support for laboratory personnel in the validation task. This system should also control the application of standard antibiotic testing guidelines: these guidelines, used by almost all microbiological laboratories, suggest antibiotic test execution methods and result interpretation. Examples of problems that this system should manage are: automatic correction of antibiotic results for particular species that present in vitro susceptibility but in vivo resistance, controls on the list of tested antibiotics, and predictions of test results for a group of antibiotics using some representative antibiotic (e.g., Tetracycline is representative for all Tetracyclines).
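As an illustration of the antibiogram representation and of the "representative antibiotic" prediction mentioned above, the following minimal Python sketch may help. It is not taken from Italab C/S or ESMIS: the names `Resistance`, `HYPOTHETICAL_GROUPS` and `complete_antibiogram` are assumptions made for this example, the group membership is purely illustrative, and only the three resistance values named in the text are modelled.

```python
from enum import Enum


class Resistance(Enum):
    """Resistance values named in the text (the fourth recorded type is not modelled)."""
    R = "resistant"
    I = "intermediate"
    S = "susceptible"


# Hypothetical grouping, for illustration only: representative -> group members.
HYPOTHETICAL_GROUPS = {"Tetracycline": ("Doxycycline", "Minocycline")}


def complete_antibiogram(couples, groups=HYPOTHETICAL_GROUPS):
    """Take an antibiogram as a list of (antibiotic, resistance) couples and
    return a dict in which untested group members inherit the result of
    their representative antibiotic. Tested results are never overwritten."""
    results = dict(couples)
    for representative, members in groups.items():
        if representative in results:
            for member in members:
                results.setdefault(member, results[representative])
    return results


antibiogram = [("Tetracycline", Resistance.R), ("Oxacillin", Resistance.S)]
completed = complete_antibiogram(antibiogram)
print(completed["Doxycycline"].value)
```

Keeping the antibiogram as a list of couples mirrors the description in the text, while the dictionary returned by the helper makes lookups by antibiotic name straightforward.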
In the validation task, one would like the system to check the results reported in antibiograms in order to verify the presence of inconsistencies and alarming situations (e.g., some results for given antibiotics should be in accordance with one another, or the result with respect to an antibiotic is not the expected and usual one, but some unexpected [...]

To guide this task, NCCLS [1], an international standards organization recognised by almost all laboratories as the reference for routine work, publishes an annual compendium, titled “Performance Standards for Antimicrobial Susceptibility Testing” [2], containing testing guidelines for microbiological laboratories. For each species, the NCCLS guidelines basically consist of a table that specifies the antibiotics to be tested, a table that specifies how to interpret the antibiotic tests, and a list of exceptions regarding particular antibiotic test results. Nonetheless, the validation task, when performed manually, can be long and difficult, and a laboratory management system helping microbiologists in this task would be very useful.

During the last few years, many surveillance systems have been developed in order to validate and monitor microbiological analysis results, and to identify infective and epidemiological events early.

Within a joint project between the University of Bologna and Dianoema S.p.A., we have implemented an expert system (named ESMIS [3]) for validating microbiological data and generating alarms for critical situations. ESMIS has been built by following a knowledge-based approach. One of the main and well-known problems in building expert systems is knowledge acquisition. In general, this is a very time-consuming and hard task. With respect to building the ESMIS knowledge base, we were interested in extracting knowledge about anomalous situations of resistance to antibiotics by isolated bacteria, in order to generate suitable alarm rules. This kind of knowledge can be extracted by hand in accordance with the NCCLS documents and by intensive colloquia with experts in microbiology (and this approach has been followed in building the first [...]

Another approach could be the use of the existing database, where a large number of antibiograms is stored, in order to automatically extract "rules" representing anomalous situations. This latter approach, described in this paper, is not antithetic but complementary to the former one. It can be very effective in validating the ESMIS knowledge base, and also in extending this knowledge base by "discovering" new rules not yet considered by official documents. Last but not least, these newly discovered rules, taking into account the history of the specific laboratory, are better tailored to the hospital situation, and this is very important since some resistances to antibiotics are specific to particular, local hospital environments. In this work, we report on the application of data mining techniques in ESMIS. In particular, we have experimented with these techniques in order to automatically discover association rules to be used for the validation of microbiological data and for the generation of alarming situations. Other details about the ESMIS architecture and knowledge base can be [...]

The paper is organised as follows. Section 2 describes the discovery of association rules by exploiting the APRIORI algorithm and the WEKA system. Section 3 shows how alarm rules are generated from the discovered association rules. Section 4 describes the experiments done. Related work is surveyed in Section 5. We conclude and mention [...]

Discovery of Association Rules

Association rules describe correlations of events and can be regarded as probabilistic rules. "Correlation of events" means that events are frequently observed together. A good example from real life is databases of sales transactions, which are very frequently used by the marketing departments of many companies because knowledge about sets of items frequently bought together is useful for developing successful marketing strategies.

The problem of discovering association rules can be [...]

Let I = {i_1, i_2, ..., i_m} be a set of literals, called items. A transaction T is a set of items such that T ⊆ I. A database of transactions D is a set of transactions and is [...]

Table 1: Schema of a database of transactions
Transaction ID | [...]

Let an itemset X be a set of items such that X ⊆ I. We say that a transaction T contains an itemset X if X ⊆ T.

An association rule is an implication of the form X ⇒ Y, where X and Y are itemsets and X ∩ Y = ∅.

• The rule X ⇒ Y holds with confidence c in database D if and only if c% of the transactions in D that contain X also contain Y.
• The rule X ⇒ Y has support s in transaction set D if and only if s% of the transactions in D contain X ∪ Y.

Given a set of transactions D, the task of mining association rules can be reformulated as finding all association rules with at least a minimum support (called minsup) and a minimum confidence (called minconf), where minsup and minconf are user-specified values.

The task of discovering association rules can be considered [...]

1. Find all itemsets that have transaction support above minimum support. The support for an itemset is the number of transactions that contain the itemset. Itemsets with minimum support are called large itemsets, all others are called small itemsets. This subtask is addressed by the algorithm APRIORI [4].
2. Generate all association rules with minimum support and confidence from the set of all large itemsets. This subtask can be addressed by a straightforward [...]
   - For each large itemset l, find all non-empty subsets of l.
   - For each such subset a of l, output the rule a ⇒ (l − a) iff the ratio of support(l) to support(a) is at least minconf.

The APRIORI Algorithm

The APRIORI algorithm discovers large itemsets by means of multiple passes over the data.

• In the first pass, APRIORI counts the support of individual items and determines which of them are large.
• Each subsequent pass starts with a seed set represented by the itemsets found to be large in the previous pass. From this set it generates the new potentially large itemsets, called candidate itemsets. The actual support for these candidate itemsets is counted during a new pass over the data.
• At the end of the pass, we determine which of the candidate itemsets are actually large. These itemsets become the seed for the next pass. This process continues until no large itemsets are found.

For the sake of completeness, the algorithm is reported in [...]

Notation:
k-itemset: An itemset having k items.
L_k: Set of large k-itemsets (those with minimum support).
C_k: Set of candidate k-itemsets (potentially large itemsets).

[...]
C_t = subset(C_k, t); // Candidates contained in t
[...]

The apriori-gen function takes L_{k-1}, the set of all large (k-1)-itemsets, as an argument, and returns a set of candidates for being large k-itemsets. It exploits the fact that expanding an itemset can never increase its support: a k-itemset can be large only if all of its (k-1)-subsets are large. So apriori-gen generates only candidates with this property, which can easily be achieved given the set L_{k-1}.

Learning Association Rules by WEKA

In order to learn association rules for validating microbiological data, we have exploited the WEKA system [5], a collection of machine learning algorithms for solving real-world data mining problems. WEKA is written in Java and runs on almost any platform. WEKA is open source software issued under the GNU General Public License.

WEKA contains algorithms for performing classification, numeric prediction, clustering and learning association rules.

As regards association rule learning, WEKA employs a version of the APRIORI algorithm that is able to learn association rules from a generic table (like Table 2) with n attributes.

Table 2: Example of a table for knowledge extraction
Attribute_1 | Attribute_2 | ... | Attribute_n

In this case, an association rule is a rule of the form

A_1=v_A1, A_2=v_A2, …, A_j=v_Aj ⇒ B_1=v_B1, B_2=v_B2, …, B_k=v_Bk

where A_1, A_2, …, A_j, B_1, B_2, …, B_k are attribute names and v_A1, v_A2, …, v_Aj, v_B1, v_B2, …, v_Bk are values such that v_Al (v_Bh) belongs to the domain of the attribute A_l (B_h).

In practice, each record is considered as a transaction and each possible equivalence Attribute=Value as an item. WEKA's version of the APRIORI algorithm works as if Table 2 were first transformed into a transaction database with the schema of Table 3:

Table 3: Example of a table from which WEKA extracts [...]
Transaction ID | [...]

and the standard version of the APRIORI algorithm is then applied.

The algorithm in WEKA takes into account two numbers: the number of records verifying the rule antecedent (NA), and the number of records verifying both the antecedent and the consequent of the rule (NR). Starting from these two values, confidence and support are assigned to the rule as the ratios NR/NA and NR/N respectively (where N is the total number of records considered). Rules are generated and presented by decreasing value of the confidence (for the sake of simplicity, we omitted support and confidence in the reported alarm rules).
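The mining pipeline described in this section (records turned into Attribute=Value items, level-wise discovery of large itemsets, then rule generation with confidence NR/NA and support NR/N) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not WEKA's implementation: the function names `to_transactions`, `apriori` and `generate_rules` are hypothetical, the toy records are invented, and the candidate generation is a simplified join-and-prune that follows the property stated for apriori-gen.

```python
from itertools import combinations


def to_transactions(records):
    """Each record (dict attribute -> value) becomes a set of 'Attribute=Value' items."""
    return [frozenset(f"{a}={v}" for a, v in rec.items()) for rec in records]


def apriori(transactions, minsup):
    """Return {itemset: count} for all large itemsets (count / N >= minsup)."""
    n = len(transactions)
    counts = {}
    for t in transactions:                       # first pass: single items
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    large = {s: c for s, c in counts.items() if c / n >= minsup}
    result = dict(large)
    k = 2
    while large:
        seed = set(large)
        # Simplified apriori-gen: join (k-1)-itemsets, then keep only the
        # candidates whose (k-1)-subsets are all large.
        candidates = set()
        for a in seed:
            for b in seed:
                u = a | b
                if len(u) == k and all(
                    frozenset(s) in seed for s in combinations(u, k - 1)
                ):
                    candidates.add(u)
        counts = {c: 0 for c in candidates}
        for t in transactions:                   # new pass over the data
            for c in candidates:
                if c <= t:                       # candidate contained in t
                    counts[c] += 1
        large = {s: c for s, c in counts.items() if c / n >= minsup}
        result.update(large)
        k += 1
    return result


def generate_rules(large, n, minconf):
    """Return (antecedent, consequent, support, confidence) tuples,
    sorted by decreasing confidence."""
    found = []
    for itemset, n_r in large.items():
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                antecedent = frozenset(subset)
                n_a = large[antecedent]          # subsets of a large itemset are large
                confidence = n_r / n_a           # NR / NA
                if confidence >= minconf:
                    found.append((antecedent, itemset - antecedent, n_r / n, confidence))
    return sorted(found, key=lambda rule: -rule[3])


# Invented toy records, for illustration only.
records = [
    {"Oxacillin": "R", "Penicillin": "R"},
    {"Oxacillin": "R", "Penicillin": "R"},
    {"Oxacillin": "S", "Penicillin": "R"},
    {"Oxacillin": "S", "Penicillin": "S"},
]
transactions = to_transactions(records)
for antecedent, consequent, support, confidence in generate_rules(
    apriori(transactions, minsup=0.5), len(transactions), minconf=0.9
):
    print(sorted(antecedent), "=>", sorted(consequent), support, confidence)
```

Storing the counts of all large itemsets in one dictionary is a design choice made here so that rule generation needs no further pass over the data: NA for any antecedent is simply looked up.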
Generation of Alarm Rules

Discovered association rules can be transformed into alarm rules, to be used for data validation, as follows.

We have first applied filtering to the discovered rules, in order to consider the most general ones among them. A rule R1 is more general than a second rule R2 if they have the same consequent, but the conditions in R1’s antecedent are a (proper) subset of those in R2’s antecedent. For instance, among the four rules below:

[...]

rule 1 is the most general, rule 4 is the most specific, and rules 2 and 3 are intermediate (and not comparable with each other).

To the selected most general rules, we have then applied syntactic transformations in order to produce alarm rules, to be used in ESMIS [3]. Alarm rules have been obtained by considering that an association rule of the kind:

X ⇒ Y

represents a regular (and usually quite frequent) situation, [...] where the consequent is complemented and moved to the antecedent, represents an abnormal situation. When X and not Y simultaneously occur, an alarm has to be raised because the usual value for Y should be true instead of [...]

In order to apply this kind of transformation, when Y is a singleton condition, we have considered the result for an antibiotic in an antibiogram as two-valued, where R is the complementary value of S and vice versa. For instance, the [...]

Otherwise, when Y is a composed condition, e.g.:

[...]

we just move its negation to the body of the alarm rule. In this case, for the sample rule n. 537 above, we obtain the [...]

537’. Oxacillin=R, not([Amoxicillin+ClavulanicAcid=R, Penicillin=R]) ==> alarm([Amoxicillin+ClavulanicAcid=R, [...]

Experimental Results

We have applied WEKA to an Italab C/S database containing data about bacterial antibiograms. We have considered all the bacteria belonging to the species Staphylococcus aureus and Escherichia coli, and four species belonging to the Enterobacteriaceae. All the data have been collected from the Clinical, Specialist and Experimental Medicine Department of the University of Bologna, in Bologna, Italy. We report on the experiments in the [...]

Staphylococcus aureus

The considered dataset for Staphylococcus aureus contains 7009 records having as attributes 41 different antibiotics, plus the site of the considered sample, patient sex, hospital department hosting the patient and information about the therapy for the patient.

First experiments have been done by running the system with decreasing values for minimum support and confidence. In particular, we first ran the system with minimum support equal to 0.5, 0.4, 0.3 and 0.2, and confidence equal to 0.9. These experiments did not produce any known rule, nor any new rule confirmed by experts. Then, we chose to further decrease the requested minsup, and ran the system with minimum support equal to 0.1 and minconf equal to 0.9. With this experiment, among the produced alarm rules, we have identified some rules already suggested by the NCCLS report, and already considered in the ESMIS knowledge base. In particular, we have discovered those rules which relate to each other the results of two classes of antibiotics, i.e., Oxacillin and Penicillin (when a bacterium is resistant to Oxacillin it must also be resistant to any kind of Penicillin), and the resistance results for Oxacillin and Penicillin with β-lactamase inhibition (when a bacterium is resistant to Oxacillin it must also be resistant to any Penicillin with β-lactamase inhibition). For instance, the following two instances of these general rules were found:

[...] ==> alarm([Amoxicillin+ClavulanicAcid=R, Penicillin=R] [...]
[...] ==> alarm([Amoxicillin+ClavulanicAcid=R, [...]

The discovery of this set of rules confirms both part of the content of the NCCLS compendium and part of the rules elicited by the experts and already considered in ESMIS.

Furthermore, the experiment has also discovered new rules which were not present in the NCCLS report or in the ESMIS knowledge base, but have been validated and recommended by the interviewed microbiologists, in particular:

1080’. Teicoplanin=S, Vancomycin=R [...]
1539’. Vancomycin=S, Teicoplanin=R ==> alarm(Teicoplanin=S)

which relate to each other the results in an antibiogram of two (last-generation) antibiotics (i.e., Teicoplanin and Vancomycin).

Further experiments for Staphylococcus aureus have been done by filtering the data and removing from the dataset useless antibiograms, i.e., all those for which the bacterium was always susceptible to each antibiotic in the antibiogram (apart from Penicillin, to which Staphylococcus aureus can be sometimes susceptible and sometimes resistant). This filtering was suggested by the interviewed microbiologists, and reduced the dataset to 3734 records. With this last experiment (done with decreasing minimum support, down to 0.1, and minimum confidence equal to 0.9) we have again discovered the above-mentioned rules 537’, 1080’ and 1539’, but with a higher minimum support, since noisy and useless data had been removed from the database.

Escherichia coli

The considered dataset for Escherichia coli contains 7165 records having as attributes 25 different antibiotics, plus the site of the considered sample, patient sex and information on the hospital department hosting the patient.

The most significant experiments were done for this bacterium by deleting from the dataset useless antibiograms, i.e., all those for which the bacterium was always susceptible to each antibiotic in the antibiogram. From the remaining data (3285 records), with a minimum support equal to 0.8 (and confidence equal to 1), a new couple of rules was discovered, and confirmed by [...]

This couple of rules relates to each other the results of two classes of antibiotics, i.e., Cefotaxime and Ceftazidime (when a bacterium is susceptible to Cefotaxime it must also be susceptible to Ceftazidime, and vice versa).

With lower support, but with confidence equal to 1, we have also discovered rules already considered in ESMIS in accordance with the NCCLS compendium, e.g. those relating, when the bacterium was isolated from the urinary tract, the resistance to Piperacillin with the resistance to [...]

Enterobacteriaceae

We have also done further experiments by considering four different bacteria belonging to the same family (Enterobacteriaceae, in particular). The considered dataset contains 3387 records having as attributes the bacteria species, 28 different antibiotics, plus patient sex and information about the therapy for the patient.

Also for Enterobacteriaceae, the most significant experiments were done by deleting from the dataset useless antibiograms, i.e., all those for which the considered bacteria were always susceptible to each antibiotic in the antibiogram. From the remaining data (2656 records), with support equal to 0.68 (and confidence equal to 1), we have rediscovered the couple of rules relating to each other the results of Cefotaxime and Ceftazidime (previously discovered for Escherichia coli). With lower support, but with confidence still equal to 1, we have also discovered rules already considered in ESMIS in accordance with the NCCLS compendium, e.g. those relating the resistance to Cefotaxime with the resistance to [...]

Related Work

During the last few years, many surveillance systems have been developed in order to monitor microbiological analysis results and to identify infection and epidemiological events early. Some of them also encompass data validation according to the NCCLS compendium. We survey the most significant among them.

WHONET 5 [6] is database software for the management of microbiology laboratory test results. The software was developed for the management of routine laboratory results but has also been used for research studies. Software development has focused on data analysis, particularly of the results of antimicrobial susceptibility testing.

GermWatcher [7] is an expert system which applies both local and international culture-based criteria for detecting potential nosocomial infections. Its knowledge base was obtained by the analysis of some documents, written by CDC’s NNIS [8] (Center for Disease Control, National Nosocomial Infection Surveillance), providing explicit culture-based and clinical-based definitions for the most [...]

TheraTrac 2 [9] is a system for microbiological data validation and real-time alarming. It directly interacts with Vitek, an expert system for test results validation that is integrated in particular analytical instruments.

All the systems mentioned above use international standard guidelines in order to define controls to be executed on [...]

Our data mining approach is deeply integrated with the expert system ESMIS [3], under development within a joint project between the University of Bologna and Dianoema S.p.A. ESMIS is able to validate microbiological data according to the NCCLS document. In particular, given a newly isolated bacterium, ESMIS performs five main tasks: (i) validates the culture results; (ii) identifies the most suitable antibiotics list; (iii) issues alarms regarding the newly isolated bacterium; (iv) issues alarms regarding the patient's clinical situation; and (v) identifies epidemic events inside the hospital. Furthermore, ESMIS is also able to consider alarm rules discovered through the application of data mining techniques, when confirmed by the expert microbiologists. In this respect, ESMIS is able both to consider standard validation rules as they are stated in the NCCLS documents and to extend itself and embrace new rules once they have been discovered starting from data that are peculiar to a given hospital (or region).

In the past, the University of Bologna and Dianoema S.p.A. have designed and implemented an expert system for the validation of clinical analysis [10] named DNSEV (Expert System for clinical result Validation). DNSEV has been developed in order to improve the quality of the validation process performed by a specific Laboratory Information System, which is an Italab C/S database. Quality improvement of the validation process has led to a decrease in the time required by medical doctors in the validation task of clinical analysis data, permitting them to direct their energies toward other important tasks. In DNSEV the medical laboratory expertise on the validation process is translated into rules that perform all the necessary checks on analysis results. The reasoning made by the new system is documented in order to explain it to the medical team. The type of reasoning and the rules used are clearly shown and easy to change by a laboratory expert manager.

Previous work on the detection of data inconsistencies at the level of every patient record has been done by considering the application of inductive learning on a database of atherosclerotic coronary heart disease patients [11]. In particular, confirmation rules for the detection of outliers are discovered in that work by exploiting inductive methods. The authors also consider the application of [...]

In [12], data mining techniques are applied to patient data collected from several hospitals over three years in order to discover associations, e.g., within diagnoses and medical treatment, with the purpose of enhancing medical quality [...]

Finally, as concerns the application of data mining techniques to microbiological data, two previous works have considered the analysis of microbiological data ([3] [...]

In [14] the system PTAH is presented, which was developed for the analysis of antibiogram data in order to help medical doctors in the prescription of antibiotics for the cure of nosocomial infections. PTAH performs four types of analysis:

[...]
• effectiveness of antibiotics over time

In [9] the demographic clustering algorithm that is enclosed in Intelligent Miner [15,16] is applied in order to find interesting clusters of antibiograms.

We differ from these works because we consider the problem of discovering potential correlations among the tests of different antibiotics, to be used later on for result [...]

Conclusions

In this paper we have described the application of data mining techniques in order to automatically discover association rules from microbiological data, and obtain from them alarm rules for data validation. This has been done within a project, supported by MURST, jointly started by the University of Bologna and Dianoema S.p.A. Among the automatically discovered alarm rules, we have identified some already considered in the knowledge base of the expert system ESMIS (to be used for monitoring microbiological data) and suggested by experts according to the NCCLS compendium. Furthermore, we have also discovered new rules which were not present in that report, but were recommended by the interviewed microbiologists.

We are currently extending the ESMIS knowledge base by considering other bacterium species, by interviewing experts and by applying, in parallel, the WEKA system to a database containing data about various bacteria.

Acknowledgements

This work has been partially supported by Dianoema S.p.A. under MURST (Ministero dell’università e della ricerca scientifica e tecnologica) Project n. 23204/DSPAR/99. The authors would like to thank Giovanni Pizzi of Dianoema S.p.A. The authors are indebted to Massimo Perelli for his help in doing the experiments.

References
[1] NCCLS, National Committee for Clinical Laboratory [...]
[2] Mary Jane Ferraro et al., Performance Standards for Antimicrobial Susceptibility Testing; Eleventh Informational Supplement, NCCLS document M100-[...]
[3] E. Lamma, P. Mello, A. Nanetti, G. Poli, F. Riguzzi, S. Storari, An Expert System for Microbiological Data Validation and Surveillance, to appear in Proceedings of ISMDA 2001, Lecture Notes in Computer Science, [...]
[4] R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules, Proceedings of the 20th International Conference on Very Large [...]
[5] I.H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, October 1999.
[6] [...] communicable disease surveillance and response, WHONET 5 - Microbiology Laboratory Database [...]
[7] M.G. Kahn, S.A. Steib, V.J. Fraser, W.C. Dunghan, An Expert System for Culture-Based Infection Control Surveillance, Washington University, 1992.
[8] Center for Disease Control National Nosocomial Infection Surveillance, CDC NNIS, www.cdc.org, 9 [...]
[9] Theratrac, Biomerieux, see at web site: http://www.theratrac.com, 9 July 2001.
[10] M. Boari, E. Lamma, P. Mello, S. Storari, S. Monesi, An Expert System Approach for Clinical Analysis Result Validation, Proceedings of ICAI2000, Las Vegas, Nevada, CSREA Press, USA, 2000.
[11] D. Gamberger, N. Lavrac, G. Krstacic, T. Smuc, Inconsistency tests for patient records in a coronary heart disease database, Proceedings of IDAMAP2000.
[12] W. Stuhlinger, O. Hogl, H. Stoyan, M. Muller, Intelligent data mining for medical quality management, Proceedings of IDAMAP2000.
[13] E. Lamma, M. Manservigi, P. Mello, F. Riguzzi, R. Serra, S. Storari, A System for Monitoring Nosocomial Infections, Proceedings of [...]
[14] M. Bohanec, M. Rems, S. Slavec, B. Urh, PTAH: A system for supporting nosocomial infection therapy, in N. Lavrac, E. Keravnou, B. Zupan (eds), "Intelligent Data Analysis in Medicine and Pharmacology", Kluwer Academic Publishers, 1997.
[15] [...] http://www.software.ibm.com/data/iminer/fordata, 9 July 2001.
[16] Cabena, Hadjinian, Stadler, Verhees, Zanasi, Discovering Data Mining – from concept to implementation, Prentice Hall – IBM.

Address for correspondence
[...] Tel. ++39 051 2093818 Fax. ++39 051 2093073

Source: http://www.biolab.si/idamap/idamap2001/papers/lamma.pdf
