Prof. Dr. Wil van der Aalst
Process mining has been successfully applied in the healthcare domain and has helped to uncover various insights for improving healthcare processes. While the benefits of process mining are widely acknowledged, many people rightfully have concerns about irresponsible uses of personal data.
Healthcare information systems contain highly sensitive information and healthcare regulations often require protection of data privacy. The need to comply with strict privacy requirements may result in a decreased data utility for analysis. Until recently, data privacy issues did not get much attention in the process mining community; however, several privacy-preserving data transformation techniques have been proposed in the data mining community. Many similarities between data mining and process mining exist, but there are key differences that make privacy-preserving data mining techniques unsuitable to anonymise process data without adaptations.
In this article, we analyse data privacy and utility requirements for healthcare process data and assess the suitability of privacy-preserving data transformation methods to anonymise healthcare data. We demonstrate how some of these anonymisation methods affect various process mining results using three publicly available healthcare event logs.
We describe a framework for privacy-preserving process mining that can support healthcare process mining analyses. We also advocate the recording of privacy metadata to capture information about privacy-preserving transformations performed on an event log. Process mining is a specialised form of data-driven analytics where process data, collated from different IT systems typically available in organisations, are analysed to uncover the real behaviour and performance of business operations [ 1 ].
Process mining has been successfully applied in the healthcare domain and has helped to uncover insights for improving the operational efficiency of healthcare processes and evidence-informed decision making [ 2 , 3 , 4 , 5 , 6 ]. A recent literature review [ 3 ] identified articles reporting applications of various process mining techniques in the healthcare domain. While the potential benefits of data analytics are widely acknowledged, many people have grave concerns about irresponsible use of their data.
Healthcare data can include highly sensitive attributes, and hence the privacy of such data needs to be protected. Society's increased concern with protecting the privacy of personal data is reflected in the growing number of privacy regulations that were recently introduced or updated by governments around the world.
These government regulations provide general governance principles for the collection, storage, and use of personal data. Failure to comply with data privacy regulations can lead to significant penalties. Data privacy requirements are also often included in legislation that regulates the healthcare sector. The need to comply with strict data privacy requirements often results in decreased data utility. For example, an Australian bill allows individuals to delete their electronic health records at any time.
While this protects privacy, the quality and value of data analysis may decrease. The need to consider data privacy in process mining and to develop privacy-aware tools was already raised in the Process Mining Manifesto [ 7 ]. Until recently, however, the process mining community largely overlooked the problem.
Privacy considerations are well known in the field of data mining, and several privacy-preserving data transformation techniques have been proposed [ 13 , 14 ]. Although there are many similarities between data mining and process mining, some key differences exist that make some of the well-known privacy-preserving data mining techniques unsuitable for transforming process data.
For example, the addition of noise to a data set may have an unpredictable impact on the accuracy of all kinds of process mining analyses.
In this article, we analyse data privacy and utility requirements for process data typically recorded in the healthcare domain, assess the suitability of privacy-preserving data transformation methods proposed in the data mining and process mining fields to anonymise healthcare process data, and evaluate the impact of selected privacy-preserving methods on process mining results.
The results of the analyses and the evaluation showed that the problem of privacy protection for healthcare data while preserving data utility for process mining analyses is challenging. As a possible solution to the problem, we propose a privacy-preserving process mining framework which is based on the use of privacy metadata, and we propose a privacy extension for XES logs.
This journal article presents an extended version of the workshop paper presented at PODS4H [ 15 ] with two new additions (Section 5 and Section 7). Section 5 presents new insights from a detailed evaluation conducted on three healthcare event logs. Section 7 describes the proposed privacy metadata for XES logs. In addition to the new research contributions presented in these two sections, the related work discussion in Section 2 has been extensively revised.
This article is organised as follows. We present related work (Section 2), analyse data privacy and utility requirements for healthcare process data (Section 3), and assess the suitability of existing privacy-preserving methods to anonymise healthcare process data (Section 4). We then evaluate the impact of some generic data transformation approaches on the results of various process mining methods applied to three publicly available healthcare event logs (Section 5), describe the proposed privacy-preserving process mining framework (Section 6), and describe the proposed privacy extension (Section 7).
Section 8 concludes the paper. In this section, we first provide an overview of privacy-preserving data mining (Section 2.1) and then discuss existing privacy-preserving approaches proposed by the process mining community (Section 2.2). Privacy and access control considerations are well known in several research communities, including the statistical, database, cryptographic, and data mining communities. Several data transformation techniques, access control mechanisms, and frameworks to preserve data privacy were proposed by these communities [ 13 , 14 , 16 , 17 ].
The data mining community is concerned with protecting the privacy of personal information that may be recorded about individuals. Distributed privacy-preserving data mining (PPDM) methods, which aim to protect the privacy of multiple data owners who wish to conduct analysis of combined data sets without disclosing their data to other data owners [ 13 ], originated in the database and cryptographic communities [ 17 ].
For example, data swapping, suppression, noise addition, and k-anonymity are discussed in both the statistical disclosure control (SDC) literature [ 18 ] and the PPDM literature [ 13 ]. In this article, we use the term PPDM to refer to all privacy-preserving methods regardless of their origin. To preserve data privacy, privacy-preserving methods usually reduce the representation accuracy of the data [ 13 ]. Such data modifications can affect the quality of analysis results. The effectiveness of the transformed data for analyses is often quantified explicitly as its utility [ 13 ].
The main challenge of privacy-preserving methods is to minimise privacy risks while maximising data utility [ 13 , 18 , 19 ]. Most privacy-preserving methods aim to minimise risks of identity disclosure or sensitive attribute disclosure [ 13 , 19 ].
Identity disclosure happens when an individual can be linked to a specific record via its attributes. Sensitive attribute disclosure happens when the value of some sensitive attribute is discovered by an adversary.
Many privacy-preserving data transformation methods originated in the statistics community. For example, distance-based approaches quantify the level of protection by computing the distance between the original data set and the transformed data set [ 18 ]. Alternatively, privacy guarantees can be specified in terms of k-anonymity: a data set satisfies k-anonymity if each record in the data set is indistinguishable from at least k−1 other records.
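The k-anonymity condition can be checked mechanically by grouping records on their quasi-identifier values. The sketch below is illustrative only: the records are toy dictionaries, and attribute names such as `age_range` and `postcode` are invented for the example, not taken from the article.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """A data set is k-anonymous if every combination of
    quasi-identifier values occurs in at least k records."""
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Toy patient table; 'age_range' and 'postcode' act as quasi-identifiers.
records = [
    {"age_range": "30-39", "postcode": "52062", "diagnosis": "flu"},
    {"age_range": "30-39", "postcode": "52062", "diagnosis": "asthma"},
    {"age_range": "40-49", "postcode": "52064", "diagnosis": "flu"},
]

# The (40-49, 52064) group has only one record, so 2-anonymity fails.
print(is_k_anonymous(records, ["age_range", "postcode"], 2))  # False
```

Grouping on the quasi-identifier tuple rather than on single attributes matters: it is the combination of values that can single out an individual.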
We discuss methods from both categories in detail in Section 2. Methods for measuring data utility either assess information loss by quantifying differences between the original and anonymised data [ 18 ] or are designed for specific applications; for example, utility can be assessed by comparing classification accuracy [ 20 ] or regression coefficients [ 18 ] obtained from the original and anonymised data.
Utility measures designed for specific applications are more informative, as different data analysis methods have different data requirements [ 18 , 20 ]. Privacy-preserving data mining techniques can be generic or specific [ 14 ]. Generic approaches can provide anonymisation (in this article, anonymisation refers to any method that can protect data privacy by modifying records without introducing new values). In specific approaches, privacy preservation is tailored to specific data mining algorithms.
Furthermore, the outputs of some data mining algorithms can also be sensitive, and methods that anonymise such outputs have been proposed. Finally, distributed privacy-preserving methods are proposed for scenarios in which multiple data owners wish to derive insights from combined data without compromising the privacy of their portions of the data [ 13 ].
Such methods often use cryptographic protocols for secure multi-party computation (SMC) [ 13 ]. In this article, we focus on protecting the privacy of process data within a healthcare organisation; distributed privacy scenarios are considered outside the scope of this work. Furthermore, we do not analyse specific PPDM methods and methods for protecting output privacy, as they are tailored to specific data mining algorithms and are not applicable to other data or process mining algorithms.
In this subsection, we describe generic privacy-preserving data transformation approaches, such as data swapping, noise addition, suppression, generalisation, and micro-aggregation [ 13 , 18 ].
We evaluate the suitability of these approaches to anonymise process data in Section 4. Data swapping protects privacy by adding a degree of uncertainty to a data set: uncertainty is introduced into individual records by swapping the true values of sensitive attributes between subsets of records [ 16 ]. This method allows anonymisation of both numerical and categorical attributes. Noise addition can also be used for both numerical and categorical data [ 14 ]. White noise is generated using a random distribution, often either uniform or Gaussian.
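The data-swapping idea can be sketched in a few lines: permuting one sensitive attribute across records preserves the overall value distribution while breaking the link between a record and its true value. The `patients` data below is invented for the example, and records are assumed to be plain dictionaries.

```python
import random

def swap_attribute(records, attribute, seed=42):
    """Randomly permute (swap) the values of one sensitive attribute
    across records: the overall distribution of values is preserved,
    but record-level assignments are broken."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    values = [r[attribute] for r in records]
    rng.shuffle(values)
    return [{**r, attribute: v} for r, v in zip(records, values)]

patients = [
    {"id": 1, "diagnosis": "flu"},
    {"id": 2, "diagnosis": "asthma"},
    {"id": 3, "diagnosis": "diabetes"},
]
# The multiset of diagnoses is unchanged even though assignments moved.
swapped = swap_attribute(patients, "diagnosis")
```

Note that aggregate analyses over the swapped attribute (counts, frequencies) are unaffected, while any analysis relying on the pairing of `id` and `diagnosis` loses accuracy, which is exactly the privacy/utility trade-off discussed above.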
Adding noise to categorical values is more complex and can be achieved, for example, using clustering-based techniques [ 22 ]. This method preserves the aggregate distribution of the attribute values; however, the randomisation distorts individual records.
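For a numeric attribute, noise addition reduces to perturbing each value with a zero-mean random draw. A minimal sketch, with invented length-of-stay values and a fixed seed used only to make the example reproducible:

```python
import random

def add_gaussian_noise(values, sigma, seed=0):
    """Perturb each numeric value with zero-mean Gaussian (white) noise;
    individual values are distorted while aggregate statistics such as
    the mean are approximately preserved for large samples."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]

lengths_of_stay = [3, 5, 2, 8, 4]  # toy data: days in hospital, per patient
noisy = add_gaussian_noise(lengths_of_stay, sigma=1.0)
```

The choice of `sigma` controls the trade-off: larger noise gives more privacy for each individual value but degrades the accuracy of any analysis of individual records.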
Suppression anonymises data by omission. Values can be removed under three types of data suppression [ 13 ]. The most common type is column suppression, which targets highly sensitive attributes whose values directly identify an individual.
Alternatively, row suppression is used when outlier records are infrequent and difficult to anonymise. Value suppression omits selected sensitive attribute values. Generalisation methods replace data values with approximate values making it difficult for adversaries to identify records with full confidence [ 13 ].
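The three suppression types described above reduce to simple record transformations. A minimal sketch over toy dictionary records; the attribute names and the "rare" predicate are invented for illustration:

```python
def suppress_column(records, attribute):
    """Column suppression: drop a directly identifying attribute
    (e.g. a patient name) from every record."""
    return [{k: v for k, v in r.items() if k != attribute} for r in records]

def suppress_rows(records, is_outlier):
    """Row suppression: drop infrequent outlier records entirely."""
    return [r for r in records if not is_outlier(r)]

def suppress_values(records, attribute, should_hide, placeholder="*"):
    """Value suppression: replace only selected sensitive values
    with a placeholder."""
    return [{**r, attribute: placeholder} if should_hide(r[attribute]) else r
            for r in records]

patients = [
    {"name": "Alice", "age": 34, "diagnosis": "flu"},
    {"name": "Bob",   "age": 81, "diagnosis": "rare disease X"},
]
no_names = suppress_column(patients, "name")
masked = suppress_values(patients, "diagnosis", lambda d: d.startswith("rare"))
```

Value suppression retains the most utility of the three, since only the values that pose a disclosure risk are removed.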
The process of generalising usually includes the construction of a generalisation hierarchy, which is a predefined classification of values at decreasing levels of granularity.
For numeric data, values are sorted into numerical ranges. For categorical data, a domain expert creates semantically meaningful generalisations using a tree structure. Micro-aggregation methods consist of two steps: partition and aggregation [ 23 ].
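Both generalisation variants can be sketched as follows. The bucket width of 10 and the one-level diagnosis hierarchy are invented examples of what a domain expert might define:

```python
def generalise_age(age, width=10):
    """Numeric generalisation: replace an exact age with a coarser
    range, e.g. 37 -> '30-39' for a bucket width of 10."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def generalise_diagnosis(diagnosis, hierarchy):
    """Categorical generalisation: replace a specific value with its
    parent in a domain-expert-defined generalisation hierarchy."""
    return hierarchy.get(diagnosis, diagnosis)

# A toy one-level hierarchy mapping specific diagnoses to a broader class.
hierarchy = {"asthma": "respiratory disease", "flu": "respiratory disease"}
```

In a full hierarchy there would be several levels of decreasing granularity, and an anonymisation algorithm would pick the least general level that still meets the privacy requirement.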
The partition step organises the original records into clusters of similar records. An aggregation operator (e.g., the mean) is then used to compute a collective value, and the original values in each cluster are replaced with this collective value. Micro-aggregation can be applied to both continuous and categorical data without the need to manually create generalised categories. Many approaches for achieving k-anonymity have been proposed; they often use suppression, generalisation, or micro-aggregation.
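The two micro-aggregation steps can be sketched for a numeric attribute as follows, partitioning sorted values into clusters of at least k and using the mean as the aggregation operator (a common choice, though not the only one):

```python
def micro_aggregate(values, k):
    """Partition: group sorted values into clusters of at least k.
    Aggregation: replace each value with its cluster mean."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = list(values)
    start = 0
    while start < len(order):
        # Take the next k indices, or everything left if fewer than
        # 2k values remain (so no cluster is smaller than k).
        end = len(order) if len(order) - start < 2 * k else start + k
        cluster = order[start:end]
        mean = sum(values[i] for i in cluster) / len(cluster)
        for i in cluster:
            result[i] = mean
        start = end
    return result
```

Because each reported value is now shared by at least k records, the transformed attribute itself satisfies a k-anonymity-style guarantee while staying numerically close to the originals.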
However, k-anonymity alone may not prevent sensitive attribute disclosure: a data set may satisfy k-anonymity while a group of records with identical key attributes shares the same value of a sensitive attribute (e.g., the same diagnosis), so although one cannot link an individual to a specific record, the diagnosis is still disclosed. The l-diversity model addresses this by requiring each such group to contain several distinct, well-represented sensitive values. The t-closeness model further requires that the distance between the distribution of a sensitive attribute in an equivalence class and the distribution of the attribute in the whole data set does not exceed a threshold t [ 17 ].
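The homogeneity problem just described — a k-anonymous group whose members all share one sensitive value — is exactly what an l-diversity check detects. A minimal sketch over toy records (attribute names are illustrative, not from the article):

```python
from collections import defaultdict

def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """Check that every equivalence class (records sharing the same
    quasi-identifier values) contains at least l distinct values of
    the sensitive attribute."""
    classes = defaultdict(set)
    for r in records:
        classes[tuple(r[a] for a in quasi_identifiers)].add(r[sensitive])
    return all(len(values) >= l for values in classes.values())

# 2-anonymous but NOT 2-diverse: both records in the single
# equivalence class carry the same diagnosis, so it is disclosed.
records = [
    {"age_range": "30-39", "postcode": "52062", "diagnosis": "flu"},
    {"age_range": "30-39", "postcode": "52062", "diagnosis": "flu"},
]
```

A t-closeness check would additionally compare the within-class distribution of `diagnosis` against the whole-data-set distribution, which requires choosing a distance measure between distributions and is omitted here.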
The assumption that a record contains all information about an individual does not hold for process execution data, in which personal information can be scattered across multiple records and there can be dependencies between such records (we discuss this in detail in Section 3).
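The "scattered across multiple records" point is easy to see with an event log: all events of a case share a case identifier, so the full trace can be reassembled even if a single event is anonymised in isolation. A toy illustration (the activities and timestamps are invented):

```python
# Toy event log: each row is one event; events of a case share a case id.
event_log = [
    {"case": "c1", "activity": "Register",   "timestamp": "2021-03-01T09:00"},
    {"case": "c1", "activity": "Blood test", "timestamp": "2021-03-01T09:30"},
    {"case": "c1", "activity": "Surgery",    "timestamp": "2021-03-02T08:00"},
    {"case": "c2", "activity": "Register",   "timestamp": "2021-03-01T10:00"},
]

def trace_of(case_id, log):
    """Reassemble the full trace of a case by grouping on the case id
    and ordering the events by timestamp."""
    events = sorted((e for e in log if e["case"] == case_id),
                    key=lambda e: e["timestamp"])
    return [e["activity"] for e in events]

# Anonymising one event in isolation does not help: the remaining events
# of the same case still reconstruct a potentially unique sequence.
print(trace_of("c1", event_log))  # ['Register', 'Blood test', 'Surgery']
```

An unusual activity sequence can act as a quasi-identifier in itself, which is why record-level PPDM techniques need adaptation before they can safely anonymise process data.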