Data Input Validation

Validation of BPMN data policies using Sparx Enterprise Architect and Enterprise Hub

Posted by steve on June 06, 2020
Enterprise Lineage | Data Lineage | Pathwise Complexity

Problem

‘Know your customer’ (KYC) and ‘three lines of defence’ (3LoD) are two common business modelling problems that require that, under all circumstances, data must be provided before the information is used.

For KYC, we need to ensure that the {due-diligence, credit-check, embargo, politically exposed persons, financial-crime} checks are all performed before trades are executed for the client. These checks are normally undertaken before a trading account is set up, but in an increasingly complex world clients can be introduced via brokers, white-label services or as counterparties in derivatives bought from another institution. In a small number of cases the KYC issues only become apparent in the back office, when trades must be settled. In these cases there is no direct link between KYC and the settlement activity, so a database must be checked. The challenge for process design is to ensure that the database entry has always been written before it is needed.

For 3LoD, we need to ensure that treasury provision or hedging has been completed before regulatory exposure reporting for BCBS239, so that the risk profile reconciles with the trading book.

Both of these cases are examples of the "read before write" problem, where a process is defective if there are scenarios in which information is needed before it has been provided. The conventional strategy is intensive process quality assurance to ensure that the linked set of processes across a value chain includes the appropriate activities. The problem is that an upstream process might be changed without the downstream processes being re-validated. This becomes an issue when systems are implemented and the gaps create data-quality exceptions during normal business.
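
To make the failure mode concrete, here is a minimal sketch that assumes a single linear path of steps, each listing the data items it reads and writes. The `Step` type, the `readsBeforeWrite` function and the step names are hypothetical and exist only for this illustration; they are not part of Enterprise Hub or Sparx.

```fsharp
// Hypothetical type for illustration only; not part of any Sparx or Enterprise Hub API.
type Step = { Name : string; Reads : string list; Writes : string list }

/// Returns (step name, data item) for every read that no earlier step on the path has written.
let readsBeforeWrite (path : Step list) : (string * string) list =
    let folder (written : Set<string>, issues : (string * string) list) (step : Step) =
        let missing =
            step.Reads
            |> List.filter (fun item -> not (Set.contains item written))
            |> List.map (fun item -> (step.Name, item))
        // Writes of this step become visible only to later steps.
        (Set.union written (Set.ofList step.Writes), issues @ missing)
    path |> List.fold folder (Set.empty, []) |> snd

// A settlement path on which nobody has written the KYC record.
let settlementPath =
    [ { Name = "Execute trade"; Reads = []; Writes = [ "Trade" ] }
      { Name = "Settle trade"; Reads = [ "Trade"; "KYC record" ]; Writes = [] } ]

let issues = readsBeforeWrite settlementPath
// issues = [ ("Settle trade", "KYC record") ]
```

A real process model is a graph rather than a single path, which is why the check has to search back through the control flows instead of walking one list.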

A Solution

The Enterprise Hub addresses this need with the sample DataInputValidation.fs script, which continuously monitors the process model (either in real time as models are changed, or overnight for the entire model) to find cases where a Data-store (drum logo) or Data-Object (page logo) is being read (by a BPMN DataInputAssociation). It then recursively searches back through the control flows to find an instance where the Object/Store is being written (by a BPMN DataOutputAssociation). Where no such instance can be found, an issue is created in the repository, where it can feed continuous process improvement.

The algorithm of the script is just a few lines of recursive F# code that uses EA.Gen.Model to search back through a Sparx process model (through activities, decisions, messages, intermediate-events) to find the write activity.

    let rec findTarget (start : Element) (target : Element) (last : string) (path : int list) : bool =
        // `last` is the ObjectType of the element visited just before `start`, so this
        // matches only when the data element is reached from an Activity that writes it.
        if start.Id = target.Id && last = "Activity" then
            true
        // Already visited on this path: stop, so that loops in the flow terminate.
        elif List.contains start.Id path then
            false
        else
            // Data elements written by this element (outgoing DataOutputAssociation).
            let written =
                start.StartConnectors
                |> Seq.filter (fun c -> c.ConnectorType = "Dependency" && c.Stereotype = "DataOutputAssociation")
                |> Seq.map (fun c -> c.EndElement)
                |> Seq.toList
            // Predecessors of this element in the control flow (incoming ControlFlow).
            let predecessors =
                start.EndConnectors
                |> Seq.filter (fun c -> c.ConnectorType = "ControlFlow")
                |> Seq.map (fun c -> c.StartElement)
                |> Seq.toList
            // Succeed as soon as any candidate leads to the target.
            written @ predecessors
            |> List.exists (fun next -> findTarget next target start.ObjectType (start.Id :: path))
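
To try the search without a Sparx repository, the sketch below restates it against small stand-in record types. The `Element` and `Connector` records, the `element` and `connect` helpers, the "DataStore" type label and the sample model are all assumptions made for this illustration; they only mimic the members that the script uses from EA.Gen.Model and are not that library's actual types.

```fsharp
// Hypothetical stand-ins for the EA.Gen.Model types: only the members the search uses.
type Element =
    { Id : int
      Name : string
      ObjectType : string
      mutable StartConnectors : Connector list   // connectors leaving this element
      mutable EndConnectors : Connector list }   // connectors arriving at this element

and Connector =
    { ConnectorType : string
      Stereotype : string
      StartElement : Element
      EndElement : Element }

let element elementId name objectType =
    { Id = elementId; Name = name; ObjectType = objectType; StartConnectors = []; EndConnectors = [] }

// Registers a connector on both of its ends.
let connect connectorType stereotype (source : Element) (target : Element) =
    let c = { ConnectorType = connectorType; Stereotype = stereotype; StartElement = source; EndElement = target }
    source.StartConnectors <- c :: source.StartConnectors
    target.EndConnectors <- c :: target.EndConnectors

// The same backward search as in the article, restated for the stand-in types.
let rec findTarget (start : Element) (target : Element) (last : string) (path : int list) : bool =
    if start.Id = target.Id && last = "Activity" then
        true
    elif List.contains start.Id path then
        false
    else
        let written =
            start.StartConnectors
            |> Seq.filter (fun c -> c.ConnectorType = "Dependency" && c.Stereotype = "DataOutputAssociation")
            |> Seq.map (fun c -> c.EndElement)
            |> Seq.toList
        let predecessors =
            start.EndConnectors
            |> Seq.filter (fun c -> c.ConnectorType = "ControlFlow")
            |> Seq.map (fun c -> c.StartElement)
            |> Seq.toList
        written @ predecessors
        |> List.exists (fun next -> findTarget next target start.ObjectType (start.Id :: path))

// Sample model: Collect -> Approve -> Settle, where only Collect writes the KYC record.
let collect = element 1 "Collect KYC documents" "Activity"
let approve = element 2 "Approve client" "Activity"
let settle = element 3 "Settle trade" "Activity"
let kycRecord = element 4 "KYC record" "DataStore"
let embargoList = element 5 "Embargo list" "DataStore"

connect "ControlFlow" "" collect approve
connect "ControlFlow" "" approve settle
connect "Dependency" "DataOutputAssociation" collect kycRecord

let kycWritten = findTarget settle kycRecord "" []         // true: Collect writes it upstream
let embargoWritten = findTarget settle embargoList "" []   // false: nothing upstream writes it
```

In this sketch the search starts at the reading activity with an empty `last` and an empty `path`; a `false` result is the case for which the script would raise an issue.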

The sample can easily be changed to handle sources other than a standard Activity or to provide additional validation. For KYC and 3LoD it can likewise be adapted to validate client-specific rules, such as the distinction between the creation of a risk, the mitigation of a risk and an audit point (end-event) where risks are highlighted.
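
One way to picture such a change is to pass the rule for what counts as a writer into the search as a function. The sketch below is an assumption-laden simplification: it uses plain integer ids and maps instead of EA.Gen.Model elements, and the `Model` record, the `hasWriter` function and the "StartEvent" label are hypothetical names chosen for this example.

```fsharp
// Hypothetical, simplified model: plain ids and maps instead of EA.Gen.Model elements.
type Model =
    { Kinds : Map<int, string>            // node id -> kind label, e.g. "Activity"
      Predecessors : Map<int, int list>   // node id -> ids of nodes flowing into it
      Writers : Map<int, int list> }      // data element id -> ids of nodes that write it

/// Searches back from `fromId` for a node that writes `dataId` and whose kind satisfies `isWriter`.
let rec hasWriter (isWriter : string -> bool) (model : Model) (dataId : int) (visited : Set<int>) (fromId : int) : bool =
    if Set.contains fromId visited then
        false   // already seen on this path, so loops terminate
    else
        let writesHere =
            model.Writers
            |> Map.tryFind dataId
            |> Option.defaultValue []
            |> List.contains fromId
        if writesHere && isWriter (Map.find fromId model.Kinds) then
            true
        else
            model.Predecessors
            |> Map.tryFind fromId
            |> Option.defaultValue []
            |> List.exists (hasWriter isWriter model dataId (Set.add fromId visited))

// Data element 10 is written only by the start event (node 1).
let model =
    { Kinds = Map.ofList [ (1, "StartEvent"); (2, "Activity"); (3, "Activity") ]
      Predecessors = Map.ofList [ (2, [ 1 ]); (3, [ 2 ]) ]
      Writers = Map.ofList [ (10, [ 1 ]) ] }

let activitiesOnly kind = (kind = "Activity")
let activitiesOrStartEvents kind = (kind = "Activity" || kind = "StartEvent")

let strict = hasWriter activitiesOnly model 10 Set.empty 3             // false
let relaxed = hasWriter activitiesOrStartEvents model 10 Set.empty 3   // true
```

Keeping the rule as a parameter means client-specific policies can be swapped without touching the traversal itself.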

Example

When reading the Sparx sample process for Nobel prizes, it is not immediately apparent that “Completed Nomination Forms” has no process that creates it, but the DataInputValidation script highlights the gap.