Scattered Case

Description

  • Key process steps are missing from the event log but are recorded elsewhere
    • The log therefore does not give a complete picture of the activities involved in a case
  • The pattern is therefore concerned with constructing a complete picture of the cases in a log by pulling and merging information from different sources
    • The key challenge is linking information from sources that may not use the same case identifiers (the so-called ‘record linkage’ problem)
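To make the record linkage problem concrete, the following is a minimal sketch, assuming two sources that hold the same patient under different case identifiers and share only a name and a date of birth. The field names (`id`, `name`, `dob`), the sample records, and the 0.85 threshold are illustrative assumptions, and string similarity with exact date-of-birth blocking is only one of many possible linkage techniques.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(left, right, threshold=0.85):
    """Pair each left record with its best-matching right record.

    Candidates must share the same date of birth (a simple blocking rule)
    and reach the name-similarity threshold (an assumed value).
    """
    links = []
    for l in left:
        best, best_score = None, threshold
        for r in right:
            if l["dob"] != r["dob"]:
                continue  # different date of birth: not a candidate
            score = similarity(l["name"], r["name"])
            if score >= best_score:
                best, best_score = r, score
        if best is not None:
            links.append((l["id"], best["id"]))
    return links


# Hypothetical records from two contributing systems
ward = [{"id": "W-1", "name": "Jane Smith", "dob": "1980-02-01"}]
lab = [
    {"id": "L-7", "name": "Jane Smyth", "dob": "1980-02-01"},
    {"id": "L-8", "name": "John Doe", "dob": "1975-06-30"},
]
links = link_records(ward, lab)
```

Here the ward case `W-1` is linked to the lab case `L-7` despite the spelling difference in the name; a stricter threshold would miss the link, and a looser one would risk linking different patients.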

Effect

  • If not addressed properly, this event log imperfection pattern will result in discovered process models that represent only a fraction of the total process, because the traces in the event log are incomplete

Data Quality Issues

I12 - Incorrect data: relationship
  • The associations between events and cases are logged incorrectly from the domain perspective. That is, within each contributing system's log, events are correctly ascribed to cases, but because there is no common case identifier at the domain level, the events cannot be properly merged into cases when the contributing systems' logs are combined into a consolidated process-level log.

Manifestation and Detection

  • Pattern signature: (i) the absence of a single log that provides all the expected activity names, (ii) different frequencies of known predecessor and successor activities, and (iii) the existence of ‘gaps’ in the activities recorded in the log, e.g. a record of a blood test being ordered, but no record of blood being taken or tested, or of test results being returned.
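Signature items (ii) and (iii) can be checked mechanically. The following is a minimal sketch, assuming the log is available as (case identifier, activity) pairs and that the analyst supplies the known predecessor and successor pairs. The activity names and the helper `find_gaps` are hypothetical.

```python
from collections import defaultdict


def find_gaps(events, expected_pairs):
    """For each (predecessor, successor) pair, list the cases that
    contain the predecessor but never the successor."""
    activities = defaultdict(set)
    for case_id, activity in events:
        activities[case_id].add(activity)
    gaps = {}
    for pred, succ in expected_pairs:
        gaps[(pred, succ)] = sorted(
            case for case, acts in activities.items()
            if pred in acts and succ not in acts
        )
    return gaps


# Hypothetical log: (case id, activity)
log = [
    ("c1", "Order blood test"), ("c1", "Take blood"), ("c1", "Return results"),
    ("c2", "Order blood test"),
    ("c3", "Order blood test"), ("c3", "Take blood"),
]
pairs = [("Order blood test", "Take blood"), ("Take blood", "Return results")]
gaps = find_gaps(log, pairs)
```

A large share of cases listed under a pair suggests that the successor activity is recorded in another system rather than that it never happened.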

Remedy

  • Use an appropriate record linkage technique
  • Approach to merging multiple source logs:
    • If the different logs use identical case identifiers, simply merge them
    • If the different logs use different case identifiers, but a global, unique identifier exists, use the global identifier to link the records in the various source logs
    • Otherwise, use a standard record linkage technique
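The three branches above can be sketched as one merge routine. This is a minimal illustration, assuming each event is a dict with `case`, `activity` and `time` keys; the function `merge_logs`, its parameters, and the sample data are hypothetical, and `link` merely stands in for whatever record linkage technique is chosen.

```python
def merge_logs(logs, global_ids=None, link=None):
    """Merge source logs into one consolidated log.

    logs:       dict mapping source name -> list of event dicts
    global_ids: optional dict mapping (source, local case id) -> global id
    link:       optional callable (source, event) -> case id, used when
                no global identifier is known for the event's case
    """
    merged = []
    for source, events in logs.items():
        for ev in events:
            key = (source, ev["case"])
            if global_ids is not None and key in global_ids:
                case = global_ids[key]   # a global identifier exists
            elif link is not None:
                case = link(source, ev)  # fall back to record linkage
            else:
                case = ev["case"]        # identifiers assumed identical
            merged.append({**ev, "case": case, "source": source})
    # Order events by case and then by time to form traces
    merged.sort(key=lambda e: (e["case"], e["time"]))
    return merged


# Hypothetical source logs with different local case identifiers
ward = [{"case": "W-1", "activity": "Order blood test", "time": 1}]
lab = [{"case": "L-7", "activity": "Take blood", "time": 2}]
ids = {("ward", "W-1"): "P100", ("lab", "L-7"): "P100"}
merged = merge_logs({"ward": ward, "lab": lab}, global_ids=ids)
```

Keeping the `source` field on every merged event makes it possible to trace a suspicious event back to the system it came from.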

Side-effects of Remedy

  • Where no global unique identifier can be determined and a record linkage algorithm has been applied, events may be attributed to the wrong cases because of false positives and false negatives generated by the linkage algorithm.
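One way to gauge this side-effect is to compare the links produced by the algorithm against a small, manually verified sample. The sketch below assumes such a sample exists; the function `linkage_errors` and the identifier pairs are hypothetical.

```python
def linkage_errors(predicted, truth):
    """Compare predicted links with verified links.

    Both arguments are collections of (id in source A, id in source B).
    """
    predicted, truth = set(predicted), set(truth)
    true_pos = predicted & truth
    return {
        "false_positives": predicted - truth,  # links that should not exist
        "false_negatives": truth - predicted,  # links that were missed
        "precision": len(true_pos) / len(predicted) if predicted else 0.0,
        "recall": len(true_pos) / len(truth) if truth else 0.0,
    }


# Hypothetical links: algorithm output versus a verified sample
predicted = [("W-1", "L-7"), ("W-2", "L-9")]
verified = [("W-1", "L-7"), ("W-2", "L-8")]
report = linkage_errors(predicted, verified)
```

A false positive merges events of different cases into one trace, while a false negative leaves a case split across traces, so both distort the discovered process model.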