<rss version="2.0">
  <channel>
    <title>Blog</title>
    <link>https://www.cepheis.com/blog</link>
    <description><![CDATA[News from Cepheis]]></description>
    <item>
      <title>Enterprise Transitive Edge</title>
      <link>https://www.cepheis.com/blog/blog/enterprise-transitive-edge</link>
      <description><![CDATA[<p><a href="https://www.cepheis.com/blog/blog/transitive-edge">Transitive edge</a> introduced the idea of a transitive edge as a projection of the edges between nodes that can be used to hide the details of the internal connections within a graph, allowing you to focus on the start and end of the graph.  Genealogy family trees use just two basic edge types {mother, father} and one derived edge (child) - which in <code>Hiperspace</code> are views projected from relational facts: <code>Person { Name: 'Mark', Mother: 'Mary', Father: 'John'}</code> can be projected as a <em>node</em> and four <em>edges</em> <code>{ Mark -(mother)-&gt; Mary; Mark -(father)-&gt; John;  Mark &lt;-(child)- Mary; Mark &lt;-(child)- John; }</code> that do not need to be stored to be accessed.</p>
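<p>The projection can be sketched in a few lines of Python (an illustrative sketch only; the record shape and edge names are assumptions, not the <code>Hiperspace</code> API):</p>

```python
# Project a relational fact into a node and four derived edges.
# The dictionary shape and edge names are illustrative only.
def project_edges(person):
    name = person["Name"]
    return [
        (name, "mother", person["Mother"]),
        (name, "father", person["Father"]),
        (person["Mother"], "child", name),
        (person["Father"], "child", name),
    ]

mark = {"Name": "Mark", "Mother": "Mary", "Father": "John"}
# Four edges that never need to be stored to be accessed
print(project_edges(mark))
```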
<h2>Problem</h2>
<p>Suppose you have an <em>organisation</em> with three divisions, each with <strong>100</strong> <em>employees</em>, spread across 10 <em>countries</em> with <strong>30</strong> <em>employees</em> in each.  If the <em>chief executive</em> has <strong>3</strong> <em>divisional</em> and <strong>10</strong> <em>country</em> reports, how many <em>people</em> in total report to the <em>chief</em>? The logical answer is <strong>300</strong>, provided you don't double count!  You can achieve the aggregate by one of:</p>
<ul>
<li>Use a global list in addition to the <em>division</em> and <em>country</em> lists</li>
<li>Use domain knowledge: count only division or country reporting, but not both</li>
<li>Use split allocation and aggregate the allocations - a problem arises if you later introduce a split allocation between <em>country</em> and <em>region</em>, leading to double counting</li>
<li>Use transitive reporting: ignore the intermediate edges and focus on <code>employee -&gt; chief</code></li>
</ul>
<p>Transitive Edges solve all of these problems by <em>hiding</em> the intermediate edges.  The problem becomes exponentially more complex when we consider Enterprise Architecture and the myriad connections between nodes.</p>
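<p>The double-counting pitfall can be shown with a toy Python sketch, using the numbers from the example above:</p>

```python
# 300 employees each report to the chief through both a division
# and a country hierarchy (toy data, numbers from the text).
employees = {f"emp{i}" for i in range(300)}
via_division = set(employees)   # aggregated up the division hierarchy
via_country = set(employees)    # aggregated up the country hierarchy

# Naive aggregation over both hierarchies double counts:
print(len(via_division) + len(via_country))   # 600

# A transitive edge employee -> chief hides the intermediate edges,
# so each employee contributes exactly one relation:
transitive = via_division | via_country
print(len(transitive))                        # 300
```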
<p><img src="/blog/media/Sites/hiperspace/togaf.png"></p>
<p>If you want to prioritise <em>work-packages</em> for a <em>System</em> by the <em>business goal</em> being addressed, there are myriad different paths between the <em>work-package</em> and <em>business-goal</em>, with potentially many duplicates.  What we really want is to transitively project all the paths between <em>work-package</em> and <em>goal</em> and prioritise by <em>business goal</em>.</p>
<h2>Solution</h2>
<p>The <code>Hiperspace</code> <code>TransitiveEdge</code> provides a mechanism to declare <em>Goals</em> as an extension property of <em>Work-Package</em>.  The <a href="https://github.com/channell/Hiperspace/blob/master/examples/TOGAF/TOGAF.hilang">TOGAF</a> sample includes the <code>segment</code> <code>Togaf.Has.WorkPackage</code>, which has been extended with a property <code>StrategicEdges</code> that is derived from a function, and <code>Goals</code> that projects the set of <code>TransitiveEdge</code> as <em>Goal</em> properties.</p>
<pre><code>segment Togaf.Has.WorkPackage : Togaf.Base 
    = Node        ( SKey = SKey, Name = Name, TypeName = "AF-WorkPackage"),
      Edge        (From = owner, To = this, Name = Name, TypeName = "AF-Has-WorkPackage") ,
      Togaf.Edge2 (From = this, To = owner, Name = Name, TypeName = "AF-WorkPackage-For") ,
      Graph.TransitiveEdge = StrategicEdges 
[
    "All Edges that can be projected as Transitative Edges to a Business Goal"
    @Once
    StrategicEdges  = StrategicEdge(this),
    Goals           = Goals(StrategicEdges)
];
</code></pre>
<p>The functions use the <code>Graph.Route</code> definition:</p>
<table>
<thead>
<tr>
<th>From Type</th>
<th>To Type</th>
<th>^</th>
<th>Edge Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>AF-CourseOfAction</td>
<td>AF-Goal</td>
<td></td>
<td>AF-CourseOfAction-Goal</td>
</tr>
<tr>
<td>AF-Capability</td>
<td>AF-CourseOfAction</td>
<td></td>
<td>AF-Capability-Related</td>
</tr>
<tr>
<td>AF-Function</td>
<td>AF-CourseOfAction</td>
<td></td>
<td>AF-Function-CourseOfAction</td>
</tr>
<tr>
<td>AF-Capability</td>
<td>AF-Capability</td>
<td>^</td>
<td>AF-Capability-Part</td>
</tr>
<tr>
<td>AF-Function</td>
<td>AF-Function</td>
<td>^</td>
<td>AF-Function-Part</td>
</tr>
<tr>
<td>AF-Process</td>
<td>AF-Function</td>
<td></td>
<td>AF-Process-Function</td>
</tr>
<tr>
<td>AF-Process</td>
<td>AF-Capability</td>
<td></td>
<td>AF-Process-Capability</td>
</tr>
<tr>
<td>AF-Activity</td>
<td>AF-Process</td>
<td></td>
<td>AF-Activity-Process</td>
</tr>
<tr>
<td>AF-Service</td>
<td>AF-Activity</td>
<td></td>
<td>AF-Activity-Service</td>
</tr>
<tr>
<td>AF-System</td>
<td>AF-Service</td>
<td></td>
<td>AF-System-Service</td>
</tr>
<tr>
<td>AF-Component</td>
<td>AF-System</td>
<td></td>
<td>AF-Component-System</td>
</tr>
<tr>
<td>AF-Deployed</td>
<td>AF-Component</td>
<td></td>
<td>AF-Deployed-Component</td>
</tr>
<tr>
<td>AF-Platform</td>
<td>AF-Service</td>
<td></td>
<td>*</td>
</tr>
<tr>
<td>AF-Platform</td>
<td>AF-Platform</td>
<td>^</td>
<td>*</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Function</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Capability</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Goal</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Activity</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Process</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-CourseOfAction</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Service</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-System</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Component</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Deployed</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Platform</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
</tbody>
</table>
<p>"<strong>^</strong>" denotes recursive search up through the hierarchy of {<em>platform</em>, <em>function</em>, <em>business-capability</em>}.</p>
<p>The <strong>strategic goal</strong> for each and every <em>work-package</em> can be found by:</p>
<ol>
<li>Reading the <code>Goals</code> (set) property of <code>WorkPackage</code></li>
<li>Querying the view <code>TransitiveEdges</code> with SQL: "<code>SELECT Name, "From", To, Width, Length FROM TransitiveEdges WHERE To.TypeName = 'AF-Goal';</code>"</li>
<li>Using a SQL join of <code>WorkPackage</code>: "<code>SELECT w.*, e.* FROM WorkPackages AS w, w.StrategicEdges AS e;</code>" - the join here appears to be a cross-join, but only because the sub-table <code>StrategicEdges</code> has an implicit join to the work-package.</li>
</ol>
<h2>Execution</h2>
<p><code>Hiperspace</code> uses a parallel <a href="https://en.wikipedia.org/wiki/Breadth-first_search">breadth-first search</a> to search the graph of nodes, with a recursive back-search to eliminate cyclic paths.  Each <code>Edge</code> set is accessed with a parallel search of each element <code>SetSpace</code> that provides the <code>Edge</code> view, which (for very large datasets) can also search <em>partitions</em> and <em>generations</em> in parallel.</p>
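<p>The shape of the search can be sketched sequentially in Python (Hiperspace performs it in parallel; the function and data shapes here are illustrative assumptions only):</p>

```python
from collections import deque

# Breadth-first enumeration of paths, dropping any step that would
# revisit a node already on the current path (cycle elimination).
def paths(start, targets, edges, max_length):
    results = []
    queue = deque([(start,)])
    while queue:
        path = queue.popleft()
        if len(path) > max_length:
            continue
        node = path[-1]
        if node in targets and len(path) > 1:
            results.append(path)
            continue
        for frm, to in edges:
            if frm == node and to not in path:  # back-search: no cycles
                queue.append(path + (to,))
    return results

# C -> A closes a cycle, but the search still terminates
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(paths("A", {"D"}, edges, 4))   # [('A', 'B', 'C', 'D')]
```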
<p>Parallel search, and direct access to facts rather than bulk data retrieval, means that even huge datasets are processed very quickly.</p>
]]></description>
      <pubDate>Sat, 29 Mar 2025 14:24:19 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/blog/enterprise-transitive-edge</guid>
    </item>
    <item>
      <title>Transitive Edge</title>
      <link>https://www.cepheis.com/blog/blog/transitive-edge</link>
      <description><![CDATA[<p>I've been developing a new feature for Hiperspace that extends the <code>Node</code> and <code>Edge</code> model with a new type, <code>TransitiveEdge</code>, that can be transitively extended to project relationships as a <em>Butterfly</em> graph that shows end-to-end relationships between nodes as relations that do not need a recursive query to examine.</p>
<p>If you wanted to show the <em>costs</em> for an <strong>organization</strong> of operating a <strong>service</strong>, you'd need to include costs associated with:</p>
<ul>
<li>The activities performed with the service</li>
<li>The activities performed by other functions to support the front-line activities</li>
<li>The applications, databases and other software used by the service</li>
<li>The hardware the software is hosted on</li>
<li>The cost of space within a data-centre, and power usage</li>
</ul>
<p>Aggregating all this information would require a traversal of the entire graph of information, and allocation of a proportion of the costs.  The aggregation would cover a number of disciplines and potentially a number of steps to accumulate the information.</p>
<p>The problem of finding relevant information can be simplified using a <code>graph-view</code> that treats all the data as nodes and edges, but that still leaves the problem of how to recursively query the data.  <strong>Transitive-Edge</strong> addresses this problem by folding the entire graph of edges into a single relation that can be queried directly.  If <em>Cost</em> is transitive then A -&gt; B -&gt; C -&gt; D can be transitively projected as A -&gt; D.</p>
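<p>The folding can be sketched in Python (illustrative only; the edge and cost shapes are assumptions, not the Hiperspace types):</p>

```python
# Fold a chain of (from, to, cost) edges into one transitive edge.
# If Cost is transitive, A -> B -> C -> D projects as A -> D with
# the costs accumulated along the chain.
def fold(chain):
    total = sum(cost for _, _, cost in chain)
    return (chain[0][0], chain[-1][1], total)

chain = [("A", "B", 10.0), ("B", "C", 5.0), ("C", "D", 2.5)]
print(fold(chain))  # ('A', 'D', 17.5)
```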
<h1>Transitive-Edge</h1>
<p>Transitive Edge uses the principle that an edge can be <em>projected</em> to a transitive-edge provided it follows one of the rules of the transitive route.  For our purposes, those rules can be simplified to a set of {from-type, edge-type, to-type} tuples. A family-tree is a good simple example: it uses just three <code>Edge</code> types {father, mother, child}, but a whole family tree can be projected from them using the Transitive Edge <em>relation</em>.</p>
<p>In this example <em>Mark</em> is the child of <em>Mary</em>, who is the child of <em>Jane</em>, who is the child of <em>Eve</em>, through the <em>Mother</em> <code>Edge</code>. <em>Mark</em> has a transitive <code>Edge</code> <strong>relation</strong> to <em>Eve</em> because each of the edges between them follows the relation route.</p>
<p><img src="/blog/media/Sites/hiperspace/Transative Edge.png"></p>
<p>The data-model extends the <code>Node</code> and <code>Edge</code> model, adding four additional fields:</p>
<table>
<thead>
<tr>
<th>Role</th>
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Projects</td>
<td>Edge</td>
<td>The final Edge that is transitively projected to a high-level relationship</td>
</tr>
<tr>
<td>Extends</td>
<td>Source</td>
<td>The Transitive Edge that has been extended to provide this connection</td>
</tr>
<tr>
<td></td>
<td>Length</td>
<td>The shortest number of Edges that have been traversed for this view, in the example Mark -&gt; Eve the length is 3</td>
</tr>
<tr>
<td></td>
<td>Width</td>
<td>The number of distinct routes that are summarized by this Transitive Edge</td>
</tr>
</tbody>
</table>
<p><img src="/blog/media/Sites/hiperspace/TransitativeEdge.png"></p>
<p>While <em>Length</em> and <em>Width</em> are mostly incidental in this example, they are very useful when considering cost allocation for application usage.</p>
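<p>How <em>Length</em> and <em>Width</em> relate to the underlying routes can be sketched as follows (illustrative Python, not the Hiperspace data-model types):</p>

```python
# Length = shortest number of edges traversed; Width = number of
# distinct routes summarized by the transitive edge.
def summarize(routes):
    length = min(len(route) for route in routes)
    width = len(routes)
    return length, width

# Mark -> Mary -> Jane -> Eve is one route of three Mother edges
print(summarize([["mother", "mother", "mother"]]))            # (3, 1)
# A sibling is reached by two routes (via Mother and via Father)
print(summarize([["mother", "child"], ["father", "child"]]))  # (2, 2)
```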
<h2>Implementation</h2>
<p>The Hiperspace architecture makes it relatively simple and extremely fast to project these transitive edges without the need to store them in an intermediate database.  The Hiperspace <code>Graph.PathFunctions.Paths()</code> function takes the following parameters:</p>
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>root</td>
<td>Node, or any element that can be viewed as a Node</td>
<td>the topic of the search for Transitive Edges</td>
</tr>
<tr>
<td>route</td>
<td>A Route model, with rules</td>
<td>the Transitive Route model used to project Edges as Transitive Edges</td>
</tr>
<tr>
<td>length</td>
<td>int</td>
<td>The maximum number of edges to consider in the parallel search for Transitive Edges</td>
</tr>
<tr>
<td>targets</td>
<td>Set of Node TypeName</td>
<td>the end target types that should be returned</td>
</tr>
</tbody>
</table>
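<p>A hypothetical Python analogue of those parameters can make the idea concrete. All names and data shapes here are assumptions for illustration, not the real <code>Graph.PathFunctions.Paths()</code> implementation; the route rules reuse the TOGAF-style type names from the companion post:</p>

```python
# Rule-filtered reachability: an edge can extend a path only when
# (from-type, to-type, edge-type) matches a route rule ('*' matches
# any edge type).  Sketch only, not the Hiperspace API.
def find_paths(root, rules, max_length, targets, edges, node_type):
    results, frontier = [], [[root]]
    for _ in range(max_length):
        next_frontier = []
        for path in frontier:
            for frm, edge_type, to in edges:
                if frm != path[-1] or to in path:
                    continue
                ok = any(f == node_type[frm] and t == node_type[to]
                         and e in ("*", edge_type)
                         for f, t, e in rules)
                if not ok:
                    continue
                new_path = path + [to]
                if node_type[to] in targets:
                    results.append(new_path)
                else:
                    next_frontier.append(new_path)
        frontier = next_frontier
    return results

# Illustrative route rules and toy graph data
rules = [
    ("AF-WorkPackage", "AF-Function", "AF-WorkPackage-For"),
    ("AF-Function", "AF-CourseOfAction", "AF-Function-CourseOfAction"),
    ("AF-CourseOfAction", "AF-Goal", "AF-CourseOfAction-Goal"),
]
node_type = {"wp1": "AF-WorkPackage", "fn1": "AF-Function",
             "ca1": "AF-CourseOfAction", "g1": "AF-Goal"}
edges = [("wp1", "AF-WorkPackage-For", "fn1"),
         ("fn1", "AF-Function-CourseOfAction", "ca1"),
         ("ca1", "AF-CourseOfAction-Goal", "g1")]

print(find_paths("wp1", rules, 4, {"AF-Goal"}, edges, node_type))
# [['wp1', 'fn1', 'ca1', 'g1']]
```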
<h2>Derived Edge</h2>
<p>For family-trees, a further derivation is possible to define {Brother, Sister, Grandmother, Grandfather, Aunt, Uncle, Cousin} edges from the transitive relations for a person by examining the Transitive-Edge or Edges that are referenced by it.</p>
<table>
<thead>
<tr>
<th>Length</th>
<th>Width</th>
<th>Edges</th>
<th>Name</th>
</tr>
</thead>
<tbody>
<tr>
<td>2</td>
<td>2</td>
<td>Parent -&gt; Child</td>
<td><strong>Brother</strong>, <strong>Sister</strong> (depending on gender). Width is <em>2</em> since the routes are by Mother and Father</td>
</tr>
<tr>
<td>2</td>
<td>1</td>
<td>Parent -&gt; Child</td>
<td><strong>Half-Brother</strong>, <strong>Half-Sister</strong>.  Width is <em>1</em> since the route is by either Mother or Father, but not both</td>
</tr>
<tr>
<td>2</td>
<td></td>
<td>Parent -&gt; Parent</td>
<td><strong>Grandmother</strong>, <strong>Grandfather</strong></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td>Parent -&gt; Parent -&gt; Child</td>
<td><strong>Aunt</strong>, <strong>Uncle</strong></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td>Parent -&gt; Child -&gt; Child</td>
<td><strong>Niece</strong>, <strong>Nephew</strong></td>
</tr>
<tr>
<td>..</td>
<td>..</td>
<td>..</td>
<td>..</td>
</tr>
</tbody>
</table>
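<p>The derivations in the table can be sketched as a simple rule function (illustrative Python; the tuple shapes and relation names are assumptions):</p>

```python
# Derive a named relation from the Length, Width and edge sequence
# of a transitive edge, following the table above.
def derive(length, width, edges):
    if edges == ("Parent", "Child") and length == 2:
        return "Sibling" if width == 2 else "Half-Sibling"
    if edges == ("Parent", "Parent"):
        return "Grandparent"
    if edges == ("Parent", "Parent", "Child"):
        return "Aunt/Uncle"
    if edges == ("Parent", "Child", "Child"):
        return "Niece/Nephew"
    return None

print(derive(2, 2, ("Parent", "Child")))   # Sibling
print(derive(2, 1, ("Parent", "Child")))   # Half-Sibling
```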
<p>Family-trees normally include a line between two people for <strong>marriage</strong>, but there is no unique term to describe the relationship; somewhat amusingly, <strong>ChatGPT</strong> can be persuaded to hallucinate that this relationship is <a href="https://cepheis.blob.core.windows.net/$web/metabonkers.html">Bonkers</a>.</p>
]]></description>
      <pubDate>Fri, 28 Mar 2025 18:39:33 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/blog/transitive-edge</guid>
    </item>
    <item>
      <title>Hiperspace Notebook</title>
      <link>https://www.cepheis.com/blog/blog/hiperspace-notebook</link>
      <description><![CDATA[<p><a href="https://jupyter.org/">Jupyter</a> notebooks are a great technology for interactive development in Python, with built-in integration for graph visualisation tools, particularly for data science. Today, they are most commonly used with Visual Studio Code.  With the popularity of Jupyter notebooks, Microsoft has added <a href="https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode">Polyglot</a> notebooks to extend the technology to languages other than Python, notably C#, F# and R.</p>
<p>For the <a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/notebook.ipynb">Cousins</a> sample we're using the F# interactive environment, because F# is a functional language designed for interactive development rather than an adapted scripting language.  The script closely follows the <a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/Test.cs">C# unit test</a>.</p>
<p>The notebook presents a family tree of eleven people as a graph of nodes with parent/child relations:</p>
<table>
<thead>
<tr>
<th>Family tree</th>
<th>Graph View</th>
</tr>
</thead>
<tbody>
<tr>
<td><img src="/blog/media/Sites/hiperspace/cousins-white.svg"></td>
<td><img src="/blog/media/Sites/hiperspace/diagrams/graph.png"></td>
</tr>
</tbody>
</table>
<p>When inferred relations (<em>cousins, aunts, grandparents</em>) are added, the graph grows, obscuring the hierarchical nature of the graph.</p>
<p><img src="/blog/media/Sites/hiperspace/diagrams/graph-all.png"></p>
<p>We often expend so much effort extracting, shaping, and loading data into graph stores that we forget it is just a means to an end.</p>
<p><img src="/blog/media/Sites/hiperspace/diagrams/Graph-Lucy.png"></p>
<p><a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/notebook.ipynb">Cousins</a>  was built using F#, but could just as easily have been built with Python using pythonnet (<a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/test.py">test.py</a>)</p>
]]></description>
      <pubDate>Thu, 08 Aug 2024 19:49:06 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/blog/hiperspace-notebook</guid>
    </item>
    <item>
      <title>Hiperspace</title>
      <link>https://www.cepheis.com/blog/hiperspace</link>
      <description><![CDATA[<p>Hiperspace is an acronym for “high performance space”: it provides higher performance than conventional database/object storage, but is accessed transparently as if the data were already in memory. The name is similar to <a href="https://en.wikipedia.org/wiki/Hyperspace">hyperspace</a> in science fiction (a way to reduce the latency of moving from one point in space to another), <a href="https://www.w3.org/TR/html401/struct/links.html">hyperlink</a> in the World-Wide-Web (transparent navigation from one page to another), and the <a href="https://www.ibm.com/support/pages/overview-hiperspace-caching-pdse">Hiperspace</a> expanded memory for IBM mainframes.</p>
<p>It's not called HiperspaceDB because it is also applicable to ephemeral use cases that don’t need durable storage, but need access to be faster than reloading everything whenever it changes, or alternate views (Graph/History) without explicit handling:</p>
<ul>
<li><em>Low latency</em> direct access to information</li>
<li><em>Larger space</em> than virtual memory</li>
<li><em>Simpler</em> than a cache service (which needs whole objects to be serialized)</li>
<li><em>Viewable</em> as a <a href="https://en.wikipedia.org/wiki/Graph_theory">graph</a> without transformation</li>
<li><em>History</em> of Elements for point-in-time view of data</li>
<li><em>Delta</em> views of OLAP aggregates with history</li>
<li><em>Horizon</em> global filtering of context (e.g. approval status)</li>
</ul>
<p>The runtime is pure open-source (<a href="https://github.com/channell/Hiperspace">GitHub</a>) and can be deployed from <a href="https://www.nuget.org/packages/Hiperspace">NuGet</a>.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:20:02 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/hiperspace</guid>
    </item>
    <item>
      <title>Cephei Orleans</title>
      <link>https://www.cepheis.com/blog/cephei-orleans</link>
      <description><![CDATA[<h1>Foreword</h1>
<p>In an earlier post on the <a href="https://www.cepheis.com/CellFramework">Cell Framework</a> we described the paradigm of a <code>Model</code> as a collection of <code>Cell</code>s that <em>can</em> be used as a class that industrialises a spreadsheet, where values are guaranteed to be consistent with the underlying values that the formulas are based on, but calculated in parallel.  This matches well with a market scenario where any number of changes to underlying instruments can alter the value/price of calculations.</p>
<p>We showed that models can be built up from basic models (e.g. Floating Rate Bond) to represent models that include an entire portfolio of trades, which can then provide high-level values for continuous hedging and liquidity-driven risk appetite for near-real-time (where a <code>SessionStream</code> ensures that a fresh compute-intensive risk calculation does not start until the last session has completed).</p>
<p><code>Model</code> and <code>Cell</code> provide <code>IObservable</code>/<code>IObserver</code> subscription for an event-stream architecture, where events are passed between Nano-Servers through to active actors.  Nano-Server is used here to mean a small block of logic that runs within a Micro-Service (modern <a href="https://en.wikipedia.org/wiki/Service-oriented_architecture">Service Oriented Architecture</a>) that is itself deployed to a cluster of computers like <a href="https://kubernetes.io/">Kubernetes</a>. The <a href="https://www.cepheis.com/CellFramework">Cell Framework</a> is a building block for this kind of architecture.
<img src="/blog/media/blogs/Blogs/Cephei/Cell-graph.png"></p>
<h1>Orleans</h1>
<p>Within the realm of massive on-line gaming, <a href="https://dotnet.github.io/orleans/index.html">Orleans</a> provides a framework for <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Orleans-MSR-TR-2014-41.pdf">Distributed Virtual Actors</a> that run on a large number of load-balanced servers, where each <a href="https://dotnet.github.io/orleans/Documentation/grains/index.html">Grain</a> is a Nano-Server or <a href="https://en.wikipedia.org/wiki/Digital_twin">Digital Twin</a> for a remote device.
While <a href="https://dotnet.github.io/orleans/index.html">Orleans</a> is intended for millions of low-cost Nano-Services that cooperate, it also provides a rich hosting environment for scheduling any kind of work over massive clusters of Micro-Services, with production-level telemetry and instrumentation.</p>
<h1>Cephei.Orleans</h1>
<p>The <code>Cephei.Orleans</code> <a href="https://www.nuget.org/packages?q=channell">Nuget Package</a> <em>will</em> provide a <a href="https://dotnet.github.io/orleans/1.5/Documentation/Getting-Started-With-Orleans/Grains.html">Grain</a> class <code>ModelGrain&lt;T&gt;</code> to host a <code>Model</code> within an Orleans cluster (e.g. <code>ModelGrain&lt;FloatingRateBond&gt;</code>) that can provide the fabric for real-time-risk.
<code>ModelGrain</code> requires <strong>no</strong> changes to <a href="https://www.nuget.org/packages/Cephei.Cell/">Cephei.Cell</a> because Orleans also uses the asynchronous event-oriented model. <code>Cephei.Cell</code> directly supports asynchronous notification through <code>IObservable</code> subscriptions, with overlapping <code>SessionStream</code>s to prevent blocking for continuous streams of data through the NanoService.
<code>Cephei.Orleans</code> will provide all the plumbing to allow thousands of Models to collaborate in a managed compute-fabric.</p>
<h1>Enterprise Architecture</h1>
<p>While an event-fabric <em>appears</em> to be a complex graph of objects that cannot be presented within the document set needed for management and regulatory reporting, <a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> can be simply demonstrated using UML trace-relationships and automatic derivation through an <a href="https://www.cepheis.com/EnterpriseHub">Enterprise Hub</a></p>
<h1>Cephei.QL</h1>
<p><code>Cephei.Orleans</code> will be released with the upcoming update to the [Cephei.QL] wrapper around the latest version of <a href="https://www.quantlib.org/extensions.shtml">Quantlib</a>, together with the <a href="https://www.cepheis.com/Product/Xl">Cephei.Excel</a> Excel add-in that enables the development of F# models using Microsoft Excel as an editor.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:22:10 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-orleans</guid>
    </item>
    <item>
      <title>Implementing Enterprise Lineage</title>
      <link>https://www.cepheis.com/blog/implementing-enterprise-lineage</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a></h6>
<p>In <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a>, we demonstrated that <code>&lt;&lt;trace&gt;&gt;</code> relationships can be a superior alternative for documenting <a href="https://www.cepheis.com/blog/data-lineage">data-lineage</a>, and provide a framework for <a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a>.  This blog is concerned with the automatic derivation of lineage constraints for <code>&lt;&lt;metadata&gt;&gt;</code> <a href="https://sparxsystems.com/enterprise_architect_user_guide/15.0/model_domains/informationitem.html">Information Items</a> and the implementation.</p>
<h1>Nature of the problem</h1>
<p><code>&lt;&lt;trace&gt;&gt;</code> dependency is inherently a graph problem, but within software architecture the scale of the problem is a finite graph, so it does not need to be transferred to, or interrogated with, Graph Database technology – any 64-bit operating system can hold the entire graph in memory while processing.</p>
<p>Modelling oddities like <code>[Chicken]</code> has a trace dependency to <code>[Egg]</code> but <code>[Egg]</code> has a trace dependency to <code>[Chicken]</code> can be ignored because no additional information is provided by recursive search – and can be a useful construct when an in-memory object is sourced from a database row, and the database row is sourced from the in-memory object.</p>
<p>The trace-graph can be expressed (in <a href="https://en.wikipedia.org/wiki/JSON">JSON</a>) as an array of objects, where each object is either a text node or a trace-graph of other dependencies – a flattened array of all dependencies can be produced by removing all the <code>[</code> and <code>]</code> from the array.</p>
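<p>A minimal Python sketch of that flattening (the JSON content here is illustrative):</p>

```python
import json

# A trace-graph as nested JSON: each entry is either a text node or
# another trace-graph of dependencies.
graph = json.loads('["report", ["warehouse", ["feed-a"], ["feed-b"]]]')

# Flattening is equivalent to removing all the '[' and ']':
def flatten(node):
    for item in node:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

print(list(flatten(graph)))  # ['report', 'warehouse', 'feed-a', 'feed-b']
```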
<p>Lineage is not concerned with the process/algorithm of any transformation, but with the list of reagents – the information used.</p>
<h1>Domain Model</h1>
<p>{{ "Hub/metamodel.png" | asset_url | img_tag }}</p>
<p>This implementation uses <a href="https://sparxsystems.com/">Sparx Enterprise Architect</a>, which stores the repository information in a normalised relational database that can be accessed by an <a href="https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/entity-data-model">Entity Data Model</a> using <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> to provide an object-graph view of the repository database.</p>
<h1>Derivation</h1>
<p>The derivation consists of three parts: recursive derivation of Attribute data-lineage; recursive derivation of Information Item data-lineage constraints; and derivation of Enterprise Lineage.</p>
<h2>Attribute Data-Lineage</h2>
<p>Starting from the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item, recursively search through <code>&lt;&lt;trace&gt;&gt;</code> abstractions to find common attributes (either by name or alias) that imply lineage. This is a separate pair of functions, <code>dataReferences</code> and <code>mapDataLineage</code>, to allow all data lineage to be refreshed for all entities once when derivation is scheduled.  The script can be tailored to site-specific <em>requirements</em>.</p>
<p>It’s generally recommended that overloaded names like “Name” and “Id” are not used in Data-warehouses, because Business Intelligence tools (PowerBI, Tableau, Qlikview, etc) will assume that they represent the same domain (if absolutely necessary a copy-columns view object can avoid implied lineage by using a different name – this is preferable to changing the script).</p>
<p>The result of the recursive search is stored as <code>AttributeTag</code>.</p>
<p>There is no fragile dependency on the trace graph being free of Chicken/Egg loops.</p>
<h2>Data-Lineage</h2>
<p>Starting from the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item, the <code>dataLineageReport</code> function recursively gathers all attribute lineage Tag values into an in-memory dictionary so that every object needed for Lineage is referenceable.</p>
<p>For every attribute of every element referenced by the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item a Lineage constraint is created by recursively searching the dictionary to expand the lineage constraint until all source reagents are added.</p>
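<p>The recursive expansion can be sketched in Python (the dictionary content and names are illustrative, not the Enterprise Hub types):</p>

```python
# In-memory dictionary of attribute lineage: each attribute maps to
# the attributes it is directly derived from.
lineage = {
    "report.total": ["warehouse.amount"],
    "warehouse.amount": ["feed_a.value", "feed_b.value"],
}

# Recursively expand a constraint until only source reagents remain.
def reagents(attr, seen=None):
    seen = seen if seen is not None else set()
    out = set()
    for src in lineage.get(attr, []):
        if src in seen:
            continue          # ignore Chicken/Egg loops
        seen.add(src)
        deeper = reagents(src, seen)
        out |= deeper or {src}
    return out

print(sorted(reagents("report.total")))  # ['feed_a.value', 'feed_b.value']
```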
<h2>Enterprise Lineage</h2>
<p>Starting from the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item, all <code>&lt;&lt;flow&gt;&gt;</code> Information Flow references are recursively gathered into an in-memory dictionary of elements referenced (including components, classes, actors, processes, etc).</p>
<p>For every data entity referenced by a <code>&lt;&lt;flow&gt;&gt;</code> a dictionary is produced of data references with their onward <code>&lt;&lt;trace&gt;&gt;</code> references.  This dictionary is then recursively expanded to include a reference to every source reagent.</p>
<p>For every data-entity included in every trace reference to the <code>&lt;&lt;metadata&gt;&gt;</code> item, a constraint is created by recursively searching the flow dictionary and filtering flows with the rule:</p>
<blockquote>
<p>For each Element in the flow if the next flow conveys a data entity with common data-lineage to the previous flow and has common data-lineage with the constraint item, then it is <em>inferred</em> that this is an extension of the flow lineage</p>
</blockquote>
<h1>Operation</h1>
<p>The Lineage.fs script is included as an example with the <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a> and scheduled either as a real-time change trigger (where a complex graph is calculated in a few seconds); scheduled refresh job or both</p>
<pre><code>&lt;!-- realtime scheduling on change --&gt; 
&lt;connection name="name"&gt;
  &lt;triggers&gt;
    &lt;trigger class="EA.Gen.Hub.Script.ScriptJob" assembly="EA.Gen.Hub.Script" description="lineage" workflow=".\Lineage.fs" elementClass="Element" type="InformationItem"/&gt;
  &lt;/triggers&gt;
&lt;/connection&gt;

&lt;!-- batch scheduling via Quartz --&gt; 
&lt;schedule&gt;
  &lt;job name="PathwiseComplexity" startup="true" startAt="01:00" interval="01:00" frequency="Daily" connections="name"&gt;
    &lt;trigger class="EA.Gen.Hub.Script.ScriptJob" assembly="EA.Gen.Hub.Script" description="lineage" workflow=".\Lineage.fs" /&gt;
  &lt;/job&gt;
&lt;/schedule&gt;
</code></pre>
<h1>Summary</h1>
<p>This kind of highly recursive and data-intensive function cannot reasonably be performed within a client-side addin, but is fast and efficient when scheduled through an <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a>.</p>
<p>When combined with the change-governance capability of the <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a> it is possible to include lineage reviews in the governance process.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:23:42 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/implementing-enterprise-lineage</guid>
    </item>
    <item>
      <title>Enterprise Lineage</title>
      <link>https://www.cepheis.com/blog/enterprise-lineage</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">Implementing Enterprise Lineage</a> |  <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a></h6>
<p>In <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> we highlighted that data lineage can be captured using UML <code>&lt;&lt;trace&gt;&gt;</code> relationships and field-level lineage can be projected from trace + common attribute names.  This blog is concerned with using those relationships to automatically derive systems lineage.</p>
<h1>Application Landscape</h1>
<p>A core deliverable for any mature Enterprise Architecture practice is the application landscape, which shows applications within the context of the platforms and layers that they are built upon, clustered so that collaborators appear next to each other.  Applications are connected by <code>&lt;&lt;Flow&gt;&gt;</code> relationships that highlight the data passed between them.
Often only the high-level flows are used, but the detail flows inform the layout of the diagram.</p>
<h2>Service/Message/Event Buses</h2>
<p>A spider-web of flows between systems (where the number of flows grows quadratically with the number of applications) is often used to demonstrate the complexity of the Current-State compared to the simplicity of a service/message bus in the target state.
Service buses compound the difficulty of mapping lineage because it becomes common to label input/output simply as <code>message</code> and presume that interpretation of the data is a usage problem – this is erroneous:</p>
<ul>
<li>Generalising at the landscape level pushes the actual architecture work down to the applications and converts the landscape from a diagram into a picture.</li>
<li>Over-generalisation militates against analysis of availability or exception escalation.</li>
<li>Automatic lineage projects this as a complex dependency graph where every system is potentially dependent on every other system.  This is <strong>not</strong> a deficiency of the lineage graph, but highlights the need to be more specific.</li>
</ul>
<h3>Event Buses</h3>
<p>Quality event buses derive from the market-data pub/sub pattern rather than the message-queue pattern because they are time-ordered and newer events always supersede prior events.
In lineage terms, event buses appear to exacerbate the service-bus complexity problem, but they can highlight design faults: the presence of event-lineage in an externally facing interface indicates data-leakage rather than complexity – fixing the leakage fault removes the complexity.</p>
<h1>Deriving Enterprise Lineage</h1>
<p><img src="/blog/media/blogs/Blogs/Hub/enterprise-lineage.png"></p>
<p>In object oriented analysis and design it is axiomatic that if <code>[A]</code> inherits from <code>[B]</code> and <code>[B]</code> inherits from <code>[C]</code> then <code>[A]</code> also inherits from <code>[C]</code> (even if <code>[B]</code> completely replaces the behaviour of <code>[C]</code>).
Applying inheritance semantics to data trace, <code>[A]</code> &lt;- <code>[B]</code> &lt;- <code>[C]</code> implies <code>[A]</code> &lt;- <code>[C]</code>.  In a trading scenario it is common to take the closing-price of one geographical market and rarely use it, because the opening-price of the next geographic market is usually available first; but in 1% of cases (holidays/disasters) an opening price is not available and the prior closing-price is used.</p>
<p>The diagram shows how enterprise lineage can be inferred from trace relationships.  Although <code>Reporting</code> consumes <code>Fact</code> from the <code>data-warehouse</code>, <code>FI Trade</code> is inferred to be in the <code>Fact</code> lineage because <code>Fact</code> has a trace relationship to <code>FI Trade</code>.
It will be demonstrated that the enterprise lineage can be automatically and recursively derived from the flow relationships and the data trace relationships.
With enterprise lineage, lineage does not need to be confined to components but can be extended to Actors (Bloomberg in the example), code classes/interfaces, and even use-cases and business processes for manual sourcing.
Examining the lineage at the <code>Reporting</code> component boundary highlights (without consideration of internal systems complexity) that <code>Bloomberg</code> contributes to <code>Price</code> and that <code>Fact</code> includes an element of manual data-entry.
From a governance/regulatory perspective, it is not necessary to see the detailed workings to know that price-sourcing and operational-risk need to be considered.  Demonstrating that lineage is automatically derived from detailed flows means that a detailed audit is not required to confirm the veracity of lineage summaries.</p>
<h6>Price lineage generated automatically from above diagram</h6>
<pre><code>["SimpleBank.Component.Reporting", 
    [
        ["SimpleBank.Component.Data Warehouse",
            [
                ["SimpleBank.Component.Message Bus", 
                    [
                        ["SimpleBank.Component.MarketData",
                            ["SimpleBank.Class Diagram.MarketData.Bloomberg"]
                        ]
                    ]
                ],
                "SimpleBank.Component.FI Trading", 
                "SimpleBank.Component.EQ Trading"
            ]
        ]
    ]
]
</code></pre>
<h6>Fact lineage generated automatically from above diagram</h6>
<pre><code>["SimpleBank.Component.Reporting", 
    [   
        ["SimpleBank.Component.Data Warehouse", 
            [   
                ["SimpleBank.Component.Message Bus", 
                    ["SimpleBank.Component.FX Trading"]],
                ["SimpleBank.Component.FI Trading", 
                    [
                        ["SimpleBank.Use Case.Execute Swap OTC", 
                            ["SimpleBank.Process.Execute Trade"]]
                    ]
                ], 
                "SimpleBank.Component.EQ Trading"
            ]
        ]
    ]
]
</code></pre>
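<p>These listings can also be consumed programmatically, for example to flatten a lineage tree into the distinct list of contributing systems; a minimal sketch, modelling the nested arrays as a recursive record (the <code>Lineage</code> type is illustrative, not the Hub's internal representation):</p>

```fsharp
// model the nested lineage arrays as a recursive record
type Lineage = { System : string; Sources : Lineage list }
let leaf name = { System = name; Sources = [] }

// flatten a lineage tree into the list of contributing systems
let rec contributors l = l.System :: List.collect contributors l.Sources

// the Fact lineage above, abbreviated
let fact =
    { System = "Reporting"
      Sources =
        [ { System = "Data Warehouse"
            Sources =
              [ { System = "Message Bus"; Sources = [ leaf "FX Trading" ] }
                leaf "EQ Trading" ] } ] }

// contributors fact yields
// ["Reporting"; "Data Warehouse"; "Message Bus"; "FX Trading"; "EQ Trading"]
```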
<h1>Conclusion</h1>
<p><a href="https://www.cepheis.com/blog/data-lineage">An alternative approach</a> to data-lineage highlighted that <code>&lt;&lt;trace&gt;&gt;</code> relationships are superior when your objective is to provide lineage analysis rather than design.  Using trace for data-lineage is an enabler for enterprise lineage.
Enterprise data-lineage is well suited to enterprise architecture because it separates the solution domain from the enterprise domain, where lineage analysis becomes an additional part of architecture review and feedback to solutions architects and designers is addressed through improvement processes.  The <a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">next article</a> will show how lineage is generated automatically through an <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a>.</p>
]]></description>
      <pubDate>Sun, 14 Jul 2024 17:39:34 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/enterprise-lineage</guid>
    </item>
    <item>
      <title>Data Lineage</title>
      <link>https://www.cepheis.com/blog/data-lineage</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">Implementing Enterprise Lineage</a></h6>
<h1>Background</h1>
<p><a href="https://en.wikipedia.org/wiki/Data_lineage">Data lineage</a> is an important concept in information technology because it provides 'meta' information about data that enables us to see where a value came from, how it was manipulated/consolidated and what the information can reasonably be used for.
"Balance" for cash-accounts is different from "Balance" of all accounts (including loans &amp; mortgages) and different from "Balance" including the fungible value of assets held as security – knowing the lineage of a value determines in which context it can be used.</p>
<p>Aside from the constituents and usage of data, lineage enables us to build confidence in the values:</p>
<ul>
<li>back-testing an individual value to determine the quality of the information (an aggregate of credit and debit accounts is functionally dependent on the accounting convention used), and currency values cannot be aggregated without applying an exchange rate.</li>
<li>establishing that the sources of a value are complete and that consistent conversions are applied.</li>
</ul>
<p>It is a common mistake to conflate internal and external data-lineage requirements and treat them as the same problem.  This can lead to an internal data-processing approach to lineage, where the external requirement is only concerned with the ultimate inputs and outputs.</p>
<p><em>Internal data-lineage is important for internal quality and testing, but in large organisations (with hundreds of process steps) it obscures the external data-lineage, and might not meet external needs.</em></p>
<h1>Internal Lineage</h1>
<p>The traditional approach to data-lineage is to map the association from each attribute source to each attribute destination using database <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">Extract/Transform/Load</a> (ETL); <a href="https://en.wikipedia.org/wiki/Enterprise_application_integration">Enterprise Application Integration</a> (EAI); or in heterogeneous environments stand-alone <a href="https://en.wikipedia.org/wiki/Data_dictionary">Data-Dictionaries</a> (with pre-built data loaders for common ETL/EAI scenarios).</p>
<p>It is also possible to document these relationships in UML modelling tools (like <a href="https://www.sparxsystems.com">Sparx Enterprise Architect</a>) using attribute specific associations between classes.  The diagram at the top of the page shows the ETL mapping for a data transfer. All ETL/EAI tools obscure detailed calculations, partly because of tool limitations, but mostly because complex calculations  (<a href="https://en.wikipedia.org/wiki/Value_at_risk">VaR</a>, <a href="https://en.wikipedia.org/wiki/XVA"><em>x</em>VA</a>, <a href="https://en.wikipedia.org/wiki/Risk-weighted_asset">RWA</a>, <em>etc</em> ) cannot be represented as lineage graphs.</p>
<p>In all cases, tooling support is critical to reduce the effort of mapping, but often at great cost.</p>
<h1>External Lineage</h1>
<p><img src="/blog/media/blogs/Blogs/Hub/data-lineage-1.png"></p>
<p>An alternative <em>(simpler)</em> approach is to separate the details of field-level derivation from the summary view of metadata lineage, taking advantage of the fact that in 95% of cases attributes have the same name (<em>in finance the Reuters instrument code is always RIC</em>) and in the remaining cases an alias can be used (<em>ETL tools always presume this to start</em>).  The detailed mapping can be replaced by a single trace reference.</p>
<p>The advantage of using <code>&lt;&lt;trace&gt;&gt;</code> references is that the diagrams remain domain focused (<em>without the clutter of detail</em>), changes to attributes <em>imply</em> changes to lineage, and updates do not need to be cascaded through a myriad of related diagrams. In <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">UML</a> the lineage can be represented as <code>&lt;&lt;metadata&gt;&gt;</code> <a href="https://www.sparxsystems.com/enterprise_architect_user_guide/14.0/model_domains/informationitem.html">Information Items</a> where constraints represent the lineage rules.</p>
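<p>The name/alias projection described above can be sketched directly; a minimal sketch, assuming attribute names are held as plain lists and aliases as a map (all names here are illustrative):</p>

```fsharp
// attributes on either side of a single trace relationship (illustrative names)
let sourceAttrs = [ "RIC"; "Close_Price"; "Ccy" ]
let targetAttrs = [ "RIC"; "ClosingPrice"; "Ccy" ]
let aliases = Map.ofList [ ("ClosingPrice", "Close_Price") ]

// match a target attribute to a source attribute by name, then by alias
let viaAlias target source =
    if List.contains source sourceAttrs then Some (target, source) else None
let resolve target =
    if List.contains target sourceAttrs then Some (target, target)
    else Option.bind (viaAlias target) (Map.tryFind target aliases)

// field-level lineage projected from the trace; unmatched attributes are gaps to review
let fieldLineage = List.choose resolve targetAttrs
// yields [("RIC", "RIC"); ("ClosingPrice", "Close_Price"); ("Ccy", "Ccy")]
```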
<p>It will be demonstrated that the constraints can be automatically derived by recursively scanning all trace relationships to algorithmically summarise the lineage in real-time in an <a href="https://www.cepheis.com/EA/Hub">enterprise hub</a>.  In our solution, the lineage constraint is represented as a <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> array graph that can be imported into visualisation tools if required.
Whilst a raw JSON graph does not provide a compelling presentation, it does allow gaps to be highlighted.</p>
<pre><code>["SimpleBank.Databases.WareHouse.Price.MARKET_PRICE", ["SimpleBank.Databases.Trading.LSEPrice.Price"], ["SimpleBank.Class Diagram.MarketData.BloombergPrice.OpeningPrice", [["SimpleBank.Class Diagram.MarketData.Bloomberg API.OpeningPrice"]]], ["SimpleBank.Class Diagram.MarketData.BloombergPrice.ClosingPrice", [["SimpleBank.Class Diagram.MarketData.Bloomberg API.ClosingPrice"]]], ["SimpleBank.Class Diagram.MarketData.BloombergPrice.SpotPrice", [["SimpleBank.Class Diagram.MarketData.Bloomberg API.SpotPrice"]]], ["SimpleBank.Class Diagram.MarketData.RuetersPrice.Bid_Price"], ["SimpleBank.Class Diagram.MarketData.RuetersPrice.Close_Price"], ["SimpleBank.Databases.Trading.LSEPrice.Price"], ["SimpleBank.Class Diagram.MarketData.RuetersPrice.Price"]]
</code></pre>
<h1>Derivation</h1>
<p>While the automatic derivation of data lineage from source is valuable for reporting and analysis in its own right, its wider value is that reliable <a href="https://www.cepheis.com/blog/enterprise-lineage">enterprise-lineage</a> can now be properly derived from UML flows (with referenced content).</p>
<p>While a regulatory reporting application will commonly source Price/Trade data from a data warehouse, when the warehouse price has trace references to an LSE Equity Price and Bloomberg-Price it allows the systems lineage to be inferred from the data-flows. <a href="https://en.wikipedia.org/wiki/Enterprise_service_bus">Enterprise Service Bus</a>, market-data pub/sub and <a href="https://en.wikipedia.org/wiki/Event-driven_architecture">Event Driven Architecture</a> can obscure the flow of data through systems because a provider will often be unaware of usage.</p>
<p>It will be demonstrated that system lineage can be inferred from <a href="https://sparxsystems.com/enterprise_architect_user_guide/15.0/model_domains/informationflow.html">Information Flow</a> between systems and where a transformation is implied by data lineage.
<img src="/blog/media/blogs/Blogs/Hub/system-lineage.png"></p>
<h1>Usage</h1>
<p><code>&lt;&lt;metadata&gt;&gt;</code> summaries are most useful at the end-point where lineage needs to be demonstrated to sponsors and regulators - they can also be used at any level as a quality review for internal lineage.</p>
<p>Real-time automatic derivation allows continuous improvement of domain-specific models, and overcomes the need to bridge the gap between ETL/EAI/DD tools and modern service/event oriented architectures.</p>
<h1>Conclusion</h1>
<p>Early attempts to meet the regulatory needs of <a href="https://en.wikipedia.org/wiki/BCBS_239">BCBS 239</a> have focused on using internal lineage approaches in a <em><a href="https://en.wikipedia.org/wiki/Depth-first_search">depth first search</a></em> for lineage information.</p>
<p>Regulators are generally more interested in a <em><a href="https://en.wikipedia.org/wiki/Breadth-first_search">breadth first search</a></em> demonstration of product coverage and commonality of price sources, rather than the size and cost of data-governance offices.</p>
<p>It is not unreasonable to conclude that failure to demonstrate lineage for <a href="https://en.wikipedia.org/wiki/BCBS_239">BCBS 239</a> using internal lineage <em>implies</em> that it will fail for <a href="https://en.wikipedia.org/wiki/FRTB">FRTB</a>... a better approach is available, and should be taken.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:26:11 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/data-lineage</guid>
    </item>
    <item>
      <title>Quantitative Enterprise Architecture </title>
      <link>https://www.cepheis.com/blog/quantitative-enterprise-architecture</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a> | <a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a></h6>
<h1>Background</h1>
<p>Much has been written about Quantitative Enterprise Architecture, largely starting with the <a href="https://en.wikipedia.org/wiki/Capability_Maturity_Model">Capability Maturity Model</a> from the US Software Engineering Institute in the 1980s, through <a href="https://en.wikipedia.org/wiki/Six_Sigma">Six-Sigma</a> process optimisation, through to structured systems testing across a range of business capabilities. The challenge of CMMi level-3 (<em>processes are defined and being followed</em>) is that the assessment is subjective.  CMMi Level-4 introduces the capability to capture metrics from the development process, while CMMi Level-5 introduces the capability to use the metrics to improve the development process.</p>
<p><a href="https://www.opengroup.org/togaf">Togaf</a> and <a href="https://www.zachman.com/about-the-zachman-framework">Zachman</a> provide frameworks that accentuate the need to consider architecture early, with a focus on the principle that addressing risk early reduces overall cost while providing evidence of progress at the early stages of a project.</p>
<h1>Model Driven Architecture</h1>
<p><a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> was conceived as a mechanism to add value to architecture activities, based on the observations:</p>
<ol>
<li>That requirements coverage and scope management can be quantitatively measured if there is a maintained trace relationship between the requirements and deliverables</li>
<li>That the net-value of a specification can go from positive to negative if the document is misleading because it is wrong</li>
<li>That database design rigour and principles accrue value when applied to system behaviour</li>
<li>That much code is boilerplate, where the same concept needs to be applied to databases, classes, interfaces and the different abstractions needed at different levels of an architecture</li>
<li>That most integration test failings are due to mismatches in the implementations at component boundaries.</li>
</ol>
<p>The big drivers for <a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> were:</p>
<ul>
<li>Inability to represent EJB interfaces as normal class diagrams (an enterprise bean does not directly implement its interface, but relies on the EJB container to provide glue)</li>
<li>Limitation of the <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">Unified Modelling Language</a> (UML) that many abstractions (e.g. ternary associations) cannot be mapped to code</li>
</ul>
<p><a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> is most commonly manifested as model transformations (PIM to PSM) and code generation.</p>
<h1>Architecture Compliance</h1>
<p>Past malfeasance has led regulators to demand that institutions demonstrate evidence of data-sourcing, to ensure that all systems are covered with consistent pricing sources and that mandated regulatory processes are being performed.</p>
<p>In the absence of an established Enterprise Architecture, various point-solution databases are developed.</p>
<h1>Quantitative Enterprise Architecture</h1>
<p>Combining <a href="https://en.wikipedia.org/wiki/Capability_Maturity_Model_Integration">CMMi</a>, <a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> and Architecture Compliance introduces the network effect: information combined into an Architecture Repository can meet multiple needs from a single source, but two issues become apparent:</p>
<ul>
<li>Different models progress at different rates, and it becomes difficult to coordinate them manually</li>
<li>The “forest and trees” pattern emerges where a mass of detail obscures the areas that need focus</li>
</ul>
<p>The approach taken with <a href="https://www.cepheis.com/EA/Hub">Enterprise Hub</a> is to consolidate information from different domain sources, and then use automated metrics to prioritize areas for detailed analysis.
The metrics that are important change over time, so flexible frameworks and tools are needed for rapid evolution; Enterprise Hub provides a runtime environment and two frameworks for Analysts/Developers:</p>
<ul>
<li>Workflow Foundation - process oriented workflow tool for business analysts to provide simple procedural tools</li>
<li>F# Script Foundation - a code-oriented tool for developers to provide complex (recursive) analysis. Functional languages like F# are ideally suited to the analytical problems of graph/diagram interpretation because they prevent the kind of side effects that cause imperative programs to fail with highly recursive problems.</li>
</ul>
<p>With each foundation, scripts are automatically executed within the Hub, resulting in metric properties and issues being created that feed into the review cycle for architecture work.  The Enterprise Hub includes working samples that provide real-world solutions to common quantitative architecture problems:</p>
<ul>
<li><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> - Recursive derivation of enterprise and data lineage as lineage objects change, and when scheduled.</li>
<li><a href="https://www.cepheis.com/blog/implementing-pathwise-compelxity">Pathwise Complexity</a> - Recursive Process complexity metrics, with property banding {high, normal, low} and issues (for high complexity)</li>
<li><a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a> - Recursive search for process issues that cannot be identified with normal QA</li>
<li>ChangeStatus - XAML Workflow job to apply change-request status changes to the related items</li>
</ul>
<p><img src="/blog/media/blogs/Blogs/Hub/hub.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:27:49 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/quantitative-enterprise-architecture</guid>
    </item>
    <item>
      <title>Implementing Pathwise Complexity</title>
      <link>https://www.cepheis.com/blog/implementing-pathwise-compelxity</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a> | <a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">Implementing Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a></h6>
<h1>Background</h1>
<p><a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a> is one technique to measure the complexity (and by inference quality) of business processes, but it is dependent on rigour in the modelling process, and needs to be amended when a less mature approach has been adopted.  In an <a href="https://www.cepheis.com/blog/pathwise-complexity">earlier blog</a> I demonstrated that pathwise complexity can drive the need to move beyond black-box process interactions and highlight the wider collaboration needed with counterparties.</p>
<p>A common pattern in long value-chains is to use intermediate events (like off-page connectors in a flow-chart) to extend a customer journey: in these scenarios the technique needs to be amended to meet site-specific conventions.  Algorithmic precision can be traded for greater coverage of immature process maps – it is recommended to add start/end events to value-chains, because intermediate events (in mature maps) highlight service interactions rather than off-page links.
The Enterprise Hub facilitates this with scripted workflows that can be evolved as the site becomes more mature.  Included with the server are three variants of the script to calculate the metric.</p>
<h1>Overview</h1>
<p>The script is either run in real-time (from update triggers) or overnight for an entire model.  Pathwise Complexity is calculated and stored as metric objects for processes.  From the spread of complexity scores, three bands are derived and stored as indicator properties; then for ‘high’ complexity processes, issues are created to feed into continuous improvement efforts.</p>
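<p>The banding step can be sketched as a simple quartile split; a minimal sketch assuming the scores are already computed (the quarter split is illustrative - the actual bands are derived from the spread of scores in the repository):</p>

```fsharp
// band processes by complexity score: bottom quarter 'Low', top quarter 'High'
let band scores =
    let ordered = List.sortBy snd scores
    let quarter = List.length ordered / 4
    let low, rest = List.splitAt quarter ordered
    let normal, high = List.splitAt (List.length rest - quarter) rest
    let tag name entry = (fst entry, name)
    List.map (tag "Low") low @ List.map (tag "Normal") normal @ List.map (tag "High") high

// band [("A", 1.0); ("B", 5.0); ("C", 9.0); ("D", 40.0)]
// yields [("A", "Low"); ("B", "Normal"); ("C", "Normal"); ("D", "High")]
```

<p>The 'High' band is then used to create issue records for the continuous-improvement review.</p>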
<h2>Implementation</h2>
<p>The full implementation is 250 lines of functional F# code (F# is used for economy of expression, and because it avoids the errors common with imperative languages). <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> provides the database binding to view a Sparx repository as a graph of interconnected nodes.<br>
The core algorithm recursively searches through the connections on a diagram from each <em>start-event</em>, accumulating a list of paths each time an <em>end-event</em> is found.  Rather than a simple recursive search with re-entry checking that would filter loops, the <em>cycle</em> function breaks the path into splices of routes delimited by the current node; then if there are any duplicates the path can be rejected.</p>
<pre><code>        // start elements
        let starts = 
            diagRefs 
            |&gt;  List.filter (fun i -&gt;  i.Stereotype = "StartEvent" ) 

        // Find number of cycles of the same pattern
        let cycle (e : Element) (l : Element list) =
            if l = [] then 0
            else
                let id = e.Id 
                let state : (int list list * int list) = ([id],[]) 
                let splice = 
                    l
                    |&gt;  List.filter (fun i -&gt;  not (i.Stereotype = "StartEvent" || i.Stereotype = "EndEvent"))
                    |&gt;  List.fold (fun (h,w) y -&gt;  if y.Id = id then ([[y.Id] @ w] @ h, []) else (h, [y.Id] @ w)) state
                let t = List.rev (fst splice)
                let sets =
                    List.tail t @ [List.head t @ (snd splice)]
                List.map (fun i -&gt;  List.fold (fun x y -&gt;  if i = y then x + 1 else x) 0 sets) sets
                |&gt;  List.fold (fun e y -&gt;  (e + y - 1)) 0

        // seek recursively until EndEvents are found or cycle detected
        let rec findEnd (e : Element) (l : Element list) (paths : Element list list) = //: Element list list =
            let reenter = List.filter (fun (i :  Element) -&gt;  i.Id = e.Id) l |&gt;  List.length  
            if reenter &gt;= 1 &amp;&amp; ((cycle e l) &gt; 0) then   // skip path if it loops
                paths
            elif e.Stereotype = "EndEvent" then
                [([e] @ l)] @ paths // add this path to the sets
            else
                e.StartConnectors 
                |&gt; Seq.toArray
                |&gt; Array.filter (fun c -&gt; c.ConnectorType = "ControlFlow" &amp;&amp; c.EndElementId.HasValue)
                |&gt; Array.map (fun c -&gt; c.EndElementId.Value)
                |&gt; Array.filter (fun c -&gt; refMap.ContainsKey (c))
                |&gt; Array.map (fun i -&gt; refMap.[i])                            // only connections on the decomposition diagrams
                |&gt; Array.fold (fun a v -&gt; (findEnd v ([e] @ l) a)) paths

        starts 
        |&gt; List.fold (fun a v -&gt; (findEnd v [] a)) [] 
        |&gt; Seq.map (fun i -&gt; Seq.ofList i) 
</code></pre>
<p>A two-line amendment to the starts list and the <code>EndEvent</code> stereotype filter allows paths between intermediate events to be included, while the addition of <code>&amp;&amp; c.Stereotype = "SequenceFlow"</code> to the connector-type filter blocks traversal of message-flow paths.</p>
<h1>Deployment</h1>
<p>The script is deployed to the enterprise hub by adding a scheduled job (for the whole repository: load balanced by Quartz.net in a cluster) or as a trigger on a connection for real-time server-side update.</p>
<pre><code>      &lt;job name="PathwiseComplexity" startup="true" startAt="01:00" interval="01:00" frequency="Daily" connections="ER"&gt;
        &lt;trigger class="EA.Gen.Hub.Script.ScriptJob" assembly="EA.Gen.Hub.Script" description="validation" workflow="C:\Users\steve\source\repos\EA.Gen\EA.Gen.Hub.Script\PathwiseComplexity.fs"  /&gt;
      &lt;/job&gt;
</code></pre>
<p>Pathwise Complexity is an example of the kind of complex data-intensive analysis/metrics that really needs to run unattended on a server.  Pathwise Complexity is an archetype for the kind of quantitative enterprise architecture that is possible with {Sparx Enterprise Architect, <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> graph view of data, Enterprise Hub hosting environment}.  The full code will be familiar to any programmer familiar with {F#, OCaml, Haskell, Scala} – it is worth the effort to use functional languages to avoid unexpected side-effects.</p>
<pre><code>namespace EA.Gen.Hub.Script

open EA.Gen.Hub.Model
open EA.Gen.Model
open System.Linq
open System
open Quartz
open System.Threading.Tasks
open Serilog

(*
    Summary : Calculate the Pathwise complexity of an element from the decomposition diagram of the object
*)
type PathwiseComplexity () =
    class
    let findPaths (element : Element) (diagrams : Diagram seq) : Element seq seq =
   
        // all references from diagrams
        let diagRefs = 
            let els (d : Diagram) = 
                d.Elements 
                |&gt; Seq.map (fun i -&gt;  i.Element)
                |&gt; Seq.filter (fun i -&gt; not ( i = null))
                |&gt; Seq.toList
            diagrams
            |&gt; Seq.fold (fun a y -&gt; (els y) @ a) [] 
        
        // references as a dictionary for lookup
        let refMap = 
            diagRefs
            |&gt;  List.map (fun i -&gt; (i.Id, i))
            |&gt;  Map.ofList

        // start elements
        let starts = 
            diagRefs 
            |&gt;  List.filter (fun i -&gt;  i.Stereotype = "StartEvent" ) 

        // Find number of cycles of the same pattern
        let cycle (e : Element) (l : Element list) =
            if l = [] then 0
            else
                let id = e.Id 
                let state : (int list list * int list) = ([id],[]) 
                let splice = 
                    l
                    |&gt;  List.filter (fun i -&gt;  not (i.Stereotype = "StartEvent" || i.Stereotype = "EndEvent"))
                    |&gt;  List.fold (fun (h,w) y -&gt;  if y.Id = id then ([[y.Id] @ w] @ h, []) else (h, [y.Id] @ w)) state
                let t = List.rev (fst splice)
                let sets =
                    List.tail t @ [List.head t @ (snd splice)]
                List.map (fun i -&gt;  List.fold (fun x y -&gt;  if i = y then x + 1 else x) 0 sets) sets
                |&gt;  List.fold (fun e y -&gt;  (e + y - 1)) 0

        // seek recursively until EndEvents are found or cycle detected
        let rec findEnd (e : Element) (l : Element list) (paths : Element list list) = //: Element list list =
            let reenter = List.filter (fun (i :  Element) -&gt;  i.Id = e.Id) l |&gt;  List.length  
            if reenter &gt;= 1 &amp;&amp; ((cycle e l) &gt; 0) then   // skip path if it loops
                paths
            elif e.Stereotype = "EndEvent" then
                [([e] @ l)] @ paths // add this path to the sets
            else
                e.StartConnectors 
                |&gt; Seq.toArray
                |&gt; Array.filter (fun c -&gt; c.ConnectorType = "ControlFlow" &amp;&amp; c.EndElementId.HasValue)
                |&gt; Array.map (fun c -&gt; c.EndElementId.Value)
                |&gt; Array.filter (fun c -&gt; refMap.ContainsKey (c))
                |&gt; Array.map (fun i -&gt; refMap.[i])                            // only connections on the decomposition diagrams
                |&gt; Array.fold (fun a v -&gt; (findEnd v ([e] @ l) a)) paths

        starts 
        |&gt; List.fold (fun a v -&gt; (findEnd v [] a)) [] 
        |&gt; Seq.map (fun i -&gt; Seq.ofList i)

    (* Summary : apply the complexity metric to database *)
    let metric (db : Sparx) (e : Element) values : unit = 
        
        let set = query {for r in e.Metrics do
                            where (r.MetricType = "Complexity")
                            select (r.Metric,r)}
                  |&gt; Map.ofSeq

        let findOrCreate name = 
            match set.TryFind name with
            | Some m -&gt; m
            | None   -&gt;  let o = new ObjectMetric()
                         o.Metric &lt;- name 
                         o.MetricType &lt;- "Complexity"
                         e.Metrics.Add (o)
                         o
        let apply (p,m,t) = 
            (findOrCreate "Paths").EValue               &lt;- Nullable&lt;float&gt;(float(p))
            (findOrCreate "Longest Path").EValue        &lt;- Nullable&lt;float&gt;(float(m))
            (findOrCreate "Total Path Length").EValue   &lt;- Nullable&lt;float&gt;(float(t))

        apply values

    let fromMetric (m : ObjectMetric) =
        match m.Metric with
        | "Paths"             -&gt; (int(m.EValue.Value),0,0)
        | "Longest Path"      -&gt; (0,int(m.EValue.Value),0)
        | "Total Path Length" -&gt; (0,0,int(m.EValue.Value))
        | _                   -&gt; (0,0,0)

    let createProperty (id : int) (rag : string) = 
        let o = new ObjectProperty ()
        o.ElementId &lt;- Nullable&lt;int&gt;(id)
        o.Property &lt;- "Pathwise Complexity" 
        o.Value &lt;- rag
        o

    let createIssue (id : int) (db : Sparx) = 
        let o = new ObjectProblem()
        o.ElementId &lt;- id
        o.Problem &lt;- "High Pathwise Complexity"
        o.ProblemType &lt;- "Issue"
        o.DateReported &lt;- new Nullable&lt;DateTime&gt; (DateTime.Now)
        o.Status &lt;- "New"
        db.ObjectProblems.Add o |&gt; ignore


    let createIssues (db : Sparx) = 
        query {for p in query {for p in db.ObjectProperties do 
                               where (p.Property = "Pathwise Complexity" &amp;&amp; p.Value = "High" &amp;&amp; p.ElementId.HasValue)
                               select p} do
               leftOuterJoin i in query {for p in db.ObjectProblems do 
                                         where (p.Problem = "High Pathwise Complexity")
                                         select p} on (p.ElementId.Value = i.ElementId) into g
               for r in g do
               select (p.ElementId.Value,r)}
        |&gt; Seq.iter (fun (p,r) -&gt; if r = null then createIssue p db)

    let measure (e : Element) (db : Sparx) = 

        let aggregatePaths e d = 
            let pathmap = findPaths e d
            let paths = pathmap |&gt; Seq.length
            let totalmax =
                pathmap 
                |&gt; Seq.map Seq.length
                |&gt; Seq.fold (fun (m,t) y -&gt; ((if y &gt; m then y else m),(t + y))) (0,0) (*max,total*)
            (paths, (fst totalmax), (snd totalmax))
        
        Serilog.Log.Information ("Measuring {0} {1}", e.Name, e.GUID)

        let diagrams =
            // PDATA1 holds the decomposition diagram id for an Activity
            let did = if not (e.PDATA1 = null) &amp;&amp; (Seq.fold (fun a y -&gt; if a then a else Char.IsDigit(y)) false e.PDATA1) then int e.PDATA1 else 0
            if did &gt; 0 then
                query {for d in db.Diagrams do
                        where (d.Id = did)
                        select d}
                |&gt; Seq.toArray
            elif e.ObjectType = "Package" then
                query {for p in db.Packages do
                       join d in db.Diagrams on (p.Id = d.PackageId.Value)
                       where (p.GUID = e.GUID)
                       select d}
                |&gt; Seq.toArray
            else 
                [||]
        let aggregate = aggregatePaths e diagrams
        if not (aggregate = (0,0,0)) then 
            Serilog.Log.Information ("Measured {0} {1}", e.Name, aggregate)
            metric db e aggregate

    // Summary: Execute the measurement for the activity elements 
    let execute (elementClass : ElementClass) (repo: string) (guid : string ) : unit  =

        use db = new Sparx(repo)

        query {for e in db.Elements do
               where (e.GUID = guid &amp;&amp; (e.ObjectType = "Activity" || e.ObjectType = "Package"))
               select e}
        |&gt; Seq.iter (fun i -&gt; measure i db |&gt; ignore)

    // Summary: Execute the measurement for all packages that contain measures
    let executeAll (repo: string) : unit =

        use db = new Sparx(repo)
        try
            let childMetrics pid = 
                let max a b = if a &gt; b then a else b
                query {for e in db.Elements do
                       join m in db.ObjectMetrics on (e.Id = m.ElementId)
                       where (e.PackageId.Value = pid)
                       select m}
                |&gt; Seq.map (fun i -&gt; fromMetric i)
                |&gt; Seq.fold (fun (ap,am,at) (p,m,t) -&gt; (ap + p,max am m, at + t)) (0,0,0)

            // measure all activities
            let elements = 
                query {for e in query {for e in db.Elements do
                                       where (e.ObjectType = "Activity" || e.ObjectType = "Package")
                                       select e} do
                       leftOuterJoin i in query {for p in db.ObjectProblems do 
                                                 where (p.Problem = "High Pathwise Complexity")
                                                 select p} on (e.Id = i.ElementId) into g
                       for r in g do
                       select (e,r)}
                |&gt; Seq.toArray
            elements |&gt; Array.iter (fun (e,r) -&gt; if r = null then measure e db)

            let complexitySet =
                query {for m in db.ObjectMetrics do
                       where (m.Metric = "Paths" &amp;&amp; m.MetricType = "Complexity")
                       select (m.ElementId, m.EValue.Value)}
                |&gt; Seq.toArray
            let avg = (Array.fold (fun a (e,v) -&gt; a + v ) (0.0) complexitySet) / float complexitySet.Length
            let stdev = sqrt ((Array.fold (fun a (e,v)-&gt; a + (float v - avg) ** 2.0) 0.0 complexitySet) / float complexitySet.Length)
            let low = avg - stdev
            let high = avg + stdev

            let Rag v = 
                if (v &gt; high) then 
                    "High"
                elif (v &lt; low) then 
                    "Low"
                else
                    "Normal"

            let curTag =
                query {for t in db.ObjectProperties do
                       where (t.Property = "Pathwise Complexity" &amp;&amp; t.ElementId.HasValue)
                       select (t.ElementId.Value,t)}
                |&gt; Map.ofSeq
        
            complexitySet
            |&gt; Array.filter (fun (e,v) -&gt; curTag.ContainsKey(e))
            |&gt; Array.iter (fun (e,v) -&gt; curTag.[e].Value &lt;- (Rag v))
        
            complexitySet
            |&gt; Array.filter (fun (e,v) -&gt; not (curTag.ContainsKey(e)))
            |&gt; Array.map (fun (e,v) -&gt; createProperty e (Rag v))
            |&gt; (fun p -&gt; db.ObjectProperties.AddRange (p))
            |&gt; ignore

            createIssues db

            db.SaveChanges () |&gt; ignore
        with 
            | :? Exception as e -&gt; Serilog.Log.Error (e, "Pathwise Complexity error {0}", e.Message )
                

    (* Implement the trigger methods that would be called by the dynamic script job*)
    interface ITriggerScript with 
        member this.ExecuteTrigger ( elementClass: ElementClass, repo : string, guid : string) =
            execute elementClass repo guid

        member this.ExecuteAll ( repo : string, onlyModified : bool ) =
            executeAll repo 
end
</code></pre>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:32:22 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/implementing-pathwise-compelxity</guid>
    </item>
    <item>
      <title>Data Input Validation</title>
      <link>https://www.cepheis.com/blog/data-input-validation</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> |  <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a></h6>
<h1>Problem</h1>
<p>'Know your customer' (<a href="https://en.wikipedia.org/wiki/Know_your_customer">KYC</a>) and the 'three lines of defence' (<a href="https://wiki.treasurers.org/wiki/Three_Lines_of_Defence_Model">3LoD</a>) are two common business modelling problems that require that, <em>under all circumstances</em>, data must be provided prior to the use of the information.</p>
<p>For KYC, we need to ensure that the {due-diligence, credit-check, embargo, politically exposed persons, financial-crime} checks are all performed before trades are executed for the client.  These checks are normally undertaken before a trading account is set up; but in an increasingly complex world clients can be introduced via brokers, white-label services, or as counterparties in derivatives bought from another institution.  In a small number of cases the KYC issues only become apparent in the back-office when trades must be settled.  In these cases there is no direct link between KYC and the settlement activity, so a database must be checked.  The challenge for process design is to ensure that the database entry has always been written before it is needed.</p>
<p>For 3LoD, we need to ensure that treasury provision or hedging has been completed before regulatory exposure reporting under <a href="https://en.wikipedia.org/wiki/BCBS_239">BCBS239</a>, so that the risk profile reconciles with the trading book.</p>
<p>Both of these cases are examples of the "read before write" problem, where a process can be defective if there are scenarios in which information is needed before it is provided.  The conventional strategy is intensive process quality assurance to ensure that the linked set of processes across a value-chain includes the appropriate activities.  The problem is that an upstream process might be changed without the downstream processes being re-validated, and the gap only becomes an issue when systems are implemented and data-quality exceptions arise during normal business.</p>
<h1>A Solution</h1>
<p>The <a href="https://www.cepheis.com/EA/Hub">enterprise hub</a> addresses this need with the sample DataInputValidation.fs script, which continuously monitors the process model (either in real time as models are changed, or overnight for the entire model) to find cases where a Data Store (drum icon) or Data Object (page icon) is read (by a BPMN DataInputAssociation), and recursively searches back through the control flows to find an instance where the Object/Store is written (by a BPMN DataOutputAssociation).  Where no such instance can be found, an issue is created in the repository that can be used to drive continuous process improvement.</p>
<p>The algorithm of the script is just a few lines of recursive F# code that uses <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> to search back through a Sparx process model (through activities, decisions, messages and intermediate events) to find the write activity.</p>
<pre><code>    let rec findTarget (start : Element) (target : Element) (last : string) (path : int list) : Boolean = 
        let recurs i l = 
            List.fold (fun a y -&gt; if a then a else (y = i)) false l

        if start.Id = target.Id &amp;&amp;  last = "Activity" then // we're only interested in activities that write 
            true
        elif recurs start.Id path then
            false
        else
            (start.StartConnectors 
                |&gt; Seq.filter (fun i -&gt; i.ConnectorType = "Dependency" &amp;&amp; i.Stereotype = "DataOutputAssociation")
                |&gt; Seq.map (fun i -&gt; i.EndElement)
                |&gt; Seq.toList)
            @
            (start.EndConnectors
                |&gt; Seq.filter (fun i -&gt; i.ConnectorType = "ControlFlow")
                |&gt; Seq.map (fun i -&gt; i.StartElement)
                |&gt; Seq.toList)
            |&gt; List.fold (fun a y -&gt; if a then a else findTarget y target start.ObjectType ([start.Id] @ path)) false
</code></pre>
<p>The sample can easily be changed to handle sources other than a standard Activity, or to provide additional validation.
For KYC and 3LoD, the process can easily be changed to validate client-specific rules, such as the distinction between the creation of a risk, the mitigation of a risk, and the audit point (end-event) where risks are highlighted.</p>
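<p>As an illustrative sketch of such a change (the <code>MiniElement</code> type and <code>isWriter</code> helper below are hypothetical stand-ins for the EA.Gen.Model types, not part of the sample script), the <code>last = "Activity"</code> test can be generalised to a configurable set of writer types:</p>
<pre><code>// Hypothetical stand-in for EA.Gen.Model's Element
type MiniElement = { Id : int; ObjectType : string }

// Object types that count as a valid writer of a Data Store/Object
// (assumption: SubProcess should also count as a writer)
let writerTypes = set [ "Activity"; "SubProcess" ]

let isWriter (start : MiniElement) (target : MiniElement) (last : string) =
    start.Id = target.Id &amp;&amp; Set.contains last writerTypes
</code></pre>
<p>Substituting <code>isWriter</code> for the equality test in <code>findTarget</code> would then accept a write from either object type.</p>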
<h1>Example</h1>
<p>When reading the Sparx sample process for Nobel prizes, it is not immediately apparent that “Completed Nomination Forms” has no process to create it; this gap is highlighted by the DataInputValidation script.</p>
<p><img src="/blog/media/blogs/Blogs/Hub/DataValidation.png">
<img src="/blog/media/blogs/Blogs/Hub/DataValidation2.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:33:17 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/data-input-validation</guid>
    </item>
    <item>
      <title>Pathwise Complexity</title>
      <link>https://www.cepheis.com/blog/pathwise-complexity</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/implementing-pathwise-compelxity">Implementing Pathwise Complexity</a> | <a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a></h6>
<p>Pathwise Complexity is a technique to measure the complexity of a <a href="https://en.wikipedia.org/wiki/Business_Process_Model_and_Notation">BPMN</a> process, with emphasis on decisions and particularly on the consequences of re-work/re-check.  All well-formed business processes begin with one or more start-events, and proceed through a sequence of activities, decisions, gateways and intermediate events until they end at one or more end-events: business processes are closed graphs that have a finite number of paths from start to finish.</p>
<p>Pathwise Complexity assists process improvement by identifying the processes that are most receptive to improvement.  Process improvement techniques like Six-Sigma are iterative – if you fix the worst processes first, eventually they will all be good.</p>
<p>Pathwise complexity is inspired by <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">Cyclomatic complexity</a> in software engineering, which uses the total number of nodes and the total number of connections between them to compute a score, but with these distinctions:</p>
<ul>
<li>Cyclomatic complexity explicitly ignores the algorithmic complexity of iterating over a list, rather than treating each item in the list as a distinct activity.  In software engineering, exceptions can be raised at a number of points that cannot be mapped to an orderly flowchart from start to finish.</li>
<li>Activities in software have low cost relative to activities in a business process and business decisions are more expensive than activities.</li>
<li>Rework in a business process is exponentially more expensive than preparatory activities and non-looping decisions.</li>
</ul>
<p>Pathwise Complexity makes two axiomatic assumptions:</p>
<ul>
<li>A path that does not reach an end is not complex; it simply does not work. The complexity score is not a substitute for quality – quality assurance metrics also need to be captured.</li>
<li>Actors are intelligent enough not to keep making the same decision over and over again unless some input to the decision has changed.</li>
</ul>
<p>Pathwise complexity is calculated by:</p>
<ol>
<li>Map every distinct path from start to finish, branching into separate paths at decisions and gateways</li>
<li>If an activity is reached that was previously visited in the path, check that it was visited by a different route from last time. <code>[1]</code>-&gt;<code>[2]</code>-&gt;<code>[3]</code>-&gt;<code>[2]</code> is valid because <code>[1]</code>-&gt;<code>[2]</code> and <code>[3]</code>-&gt;<code>[2]</code> are different routes, but <code>[1]</code>-&gt;<code>[2]</code>-&gt;<code>[3]</code>-&gt;<code>[2]</code>-&gt;<code>[3]</code> is not, because the route <code>[2]</code>-&gt;<code>[3]</code> is cyclic and therefore invalid.</li>
<li>Count the total number of distinct paths</li>
<li>Calculate the standard deviation of distinct-path counts across all processes, and use the bands to classify them into {High, Normal, Low}</li>
</ol>
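<p>Steps 3 and 4 can be sketched in a few lines of F# (illustrative only; the <code>classify</code> function is hypothetical, mirroring the banding used by the hub script):</p>
<pre><code>// Band processes into {High, Normal, Low} by distinct-path count,
// using mean +/- one standard deviation as the band limits
let classify (pathCounts : (string * int) list) =
    let values = pathCounts |&gt; List.map (snd &gt;&gt; float)
    let avg   = List.average values
    let stdev = sqrt (values |&gt; List.averageBy (fun v -&gt; (v - avg) ** 2.0))
    let rag v =
        if   v &gt; avg + stdev then "High"
        elif v &lt; avg - stdev then "Low"
        else "Normal"
    pathCounts |&gt; List.map (fun (name, p) -&gt; name, rag (float p))
</code></pre>
<p>Applied to the path counts in the examples below, <code>classify ["Evaluation",1; "Implement",5; "Incident",128]</code> bands the 128-path process as High and the others as Normal.</p>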
<p>There is a high correlation between pathwise-complex processes and processes that can be improved.  <a href="https://www.cepheis.com/blog/pathwise-complexity">Automatic calculation</a> is critical to it being a viable approach.</p>
<h1>Examples</h1>
<p>The Evaluation process has low Pathwise complexity because there is one path
<img src="/blog/media/blogs/Blogs/Hub/install.png"></p>
<p>The Implement Process has Normal Pathwise Complexity because there are five distinct paths
<img src="/blog/media/blogs/Blogs/Hub/Implement.png">
The Incident Management process has High Pathwise Complexity because there are 128 paths (due to the feedback loop to VIP Customer, who might raise additional questions from the answer).
The solution to the high pathwise complexity is not to model 'VIP Customer' as a black-box, but instead to model it as a dialogue where the customer is expected to review the answer before closure.
<img src="/blog/media/blogs/Blogs/Hub/incident.png">
The limit to the methodology is that black-box lanes often embed hidden assumptions about the time ordering of behaviour, which results in the “E-mail voting example” reporting a Pathwise complexity of <strong>7,164,676</strong>
because “Receive Vote” (in pink) can be sent at any time.  Either black boxes are a defective design pattern, or messages need to be excluded from complexity analysis.
<img src="/blog/media/blogs/Blogs/Hub/e-voting.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:34:27 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/pathwise-complexity</guid>
    </item>
    <item>
      <title>The dirty truth of data lakes</title>
      <link>https://www.cepheis.com/blog/the-dirty-truth-of-data-lakes</link>
      <description><![CDATA[<p>The architecture principles that drove the creation of the data-lake paradigm were:</p>
<ol>
<li>you cannot determine all the future use-cases for data at the point of capture</li>
<li>the opportunity cost from analysis delays is higher than the cost of storing the data</li>
<li>the operational cost of archiving data is higher than the cost of retaining it</li>
<li>the operational cost of updating data is higher than processing cost of aggregating it.</li>
</ol>
<p>These principles gave rise to the data-lake pattern, where considerable investment by web-scale companies {Google, Amazon, Alibaba, Facebook, etc.} continues to accrue new insights from the click-stream of users; this in turn led to the widespread adoption of Hadoop and its many derivatives as a data-storage pattern.</p>
<p>The allegory of a lake is appealing because lakes store water for later use, but also imply an effortless natural process rather than the effort and cost of building a reservoir.  The data-sewer is the anti-pattern of the data-lake: you really do not know whether you have built a lake or a sewer until you try to accrue value from it.</p>
<p>The reason for this post is to highlight three traits that lead to data-sewers rather than data-lakes:</p>
<ol>
<li>missing the opportunity to undertake rudimentary up-front analysis and normalisation, either by missing commonality (treating {facility, loan, mortgage, repo,..} as exceptions) or by misclassifying (treating a derivative contract as a legal agreement rather than an instrument)</li>
<li>missing the opportunity to map the lifecycle (hedging, securitisation, late-booking)</li>
<li>missing the opportunity to move to a real-time event-model, with a focus on batch cycles.</li>
</ol>
<p>The architecture mistake is to see Hadoop as a paradigm shift in technology rather than as a (potentially) cheaper data-warehouse.  When cloud providers offer hybrid solutions that combine traditional MPP databases (SQL Server PDW, Oracle Exadata, etc.) with Hadoop/Spark/Kafka integration, and block-storage replacing HDFS, it is not unreasonable for business sponsors to question whether all the effort was a waste of time.</p>
<p><a href="https://news.microsoft.com/innovation-stories/ignite-2019-azure-synapse/">Microsoft Synapse</a> is one example of a technology advance obsoleting the investments of chief data offices.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:35:07 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/the-dirty-truth-of-data-lakes</guid>
    </item>
    <item>
      <title>Cephei Modelling</title>
      <link>https://www.cepheis.com/blog/cephei-modelling</link>
      <description><![CDATA[<p>Cephei is a low-code solution for financial modelling, from prototyping through to real-time pricing and risk. Traders and Developers can work on the same models because parallel calculation eliminates performance issues of a flexible framework.</p>
<h1>Event driven approach</h1>
<p>Today the most successful businesses are event-driven, responding to customer needs with just-in-time provision of services and agile change.  This is not always reflected in application services; as we move to cloud services, systems need to change to meet the current and future challenges of business.</p>
<p>The <a href="https://www.cepheis.com/CellFramework">Cephei</a> approach is to model services as components that respond to (market) events and raise events automatically as calculated cells change – much like a spreadsheet but on a desktop, server or cloud.</p>
<p>Cephei models are constructed as building blocks that hide cells that are only used internally. Models can be assembled from other models as parts, building high-level models that represent the business domain.</p>
<h2>Cephei addin</h2>
<p>The Cephei Excel addin enables Excel to be used to build models, and test them with live data from internal sources and market data from Bloomberg or Reuters.</p>
<p><strong><a href="https://www.cepheis.com/CellFramework">Cephei</a></strong> is not a conventional Excel addin (which would rely on VBA macros to prevent Excel from hanging during modelling).  It has its own parallel calculation engine that updates Excel in the background.</p>
<p>Cephei provides 15,000 Finance functions from <a href="https://www.quantlib.org/">Quantlib</a>, but can be re-built using domain-specific libraries.  The addin can be re-generated from any (C++/C#/F#) financial library using the Cephei.Gen code generator.  The Quantlib functions can be viewed as a proof of concept that demonstrates that any library can be used by Cephei.</p>
<p>Cells are added to the Cephei Model by <em>creator</em> functions like <code>=_FixedRateBond("bond1",..)</code>, properties are accessed by functions like <code>=_FixedRateBond_cleanPrice("clean1", "bond1")</code>, and values are viewed in Excel by <code>+_Value("clean1")</code>.  In the background, Cephei maintains the information needed to generate source code for each function from the Cephei menu.</p>
<p>Contact <a>feedback@cepheis.com</a> or <a>call</a> for evaluation, integration, advice or development services.</p>
<p>See <a href="https://www.cepheis.com/blog/cephei-blotters">Blotters</a> for examples with live ticking data, and <a href="https://www.cepheis.com/blog/sample-models">Sample Models</a> for code generation and reuse.</p>
<table>
<thead>
<tr>
<th>File</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/XL32.zip">Cephei.xll</a></td>
<td>Addin providing Cephei functions to the 32-bit version of Excel</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/XL32.zip">Cephei64.xll</a></td>
<td>Addin providing Cephei functions to the 64-bit version of Excel</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/BondBlotter.xlsx">BondBlotter.xlsx</a></td>
<td>BondBlotter Example spreadsheet showing ticking changes without blocking Excel editing</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Bond1.xlsx">Bond1.xlsx</a></td>
<td><a href="https://www.cepheis.com/blog/sample-models">Bond1</a> Scratchpad spreadsheet for modelling {Zero, Fixed, Floating} bonds</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Globals.xlsx">Globals.xlsx</a></td>
<td><a>Globals</a> Extract from Bond1.xlsx to generate globals.fs</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Cephei.Model.zip">Cephei.Model.zip</a></td>
<td>F# code generated from the three models above, together with the compiled Cephei.Model.dll that is used in the following examples. Cephei.Model.dll needs to be copied to the same directory as Cephei.xll for the following examples</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/BondPricer.xlsx">BondPricer.xlsx</a></td>
<td>BondPricer Extract from Bond1.xlsx to generate BondPricer.fs</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Bond1_Fixed.xlsx">Bond1_Fixed.xlsx</a></td>
<td><a href="https://www.cepheis.com/blog/sample-models">Bond1_Fixed</a> Extract from Bond1.xlsx to generate FixedRateBond.fs</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Bond_Fixed_Portfolio.xlsx">Bond_Fixed_Portfolio.xlsx</a></td>
<td>Bond_Fixed_Portfolio Example spreadsheet showing ticking changes to a model referencing compiled versions for the three models above</td>
</tr>
</tbody>
</table>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:35:56 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-modelling</guid>
    </item>
    <item>
      <title>Cephei Cell</title>
      <link>https://www.cepheis.com/blog/CellFramework</link>
      <description><![CDATA[<p>This article introduces the <a href="https://www.nuget.org/packages/Cephei.Cell/">Cephei.Cell library</a>, released recently to the NuGet package manager with source on <a href="https://github.com/channell/Cephei">GitHub</a>, honouring a promise made to Don Syme many years ago.  It demonstrates the efficiency of developing mathematical models in F#.</p>
<h1>Background</h1>
<p>The “Cell Framework” started fifteen years ago as a mechanism to make Monte Carlo simulations execute fast enough for interactive calculation of the Potential and Expected Exposure of derivative {swap, swaption, cap, floor} trades before execution, to ensure they were profitable enough to balance the exposure with a CDS trade.  Monte Carlo simulations are compute-intensive because thousands of alternate scenarios must be calculated for each time-point.  With a minor change to use Cells (with asynchronous calculation) for NPV calculation it was possible to halve the time taken to risk a trade.  At the time cells implemented a <a href="https://en.wikipedia.org/wiki/Futures_and_promises">future promise pattern</a>, but it was apparent that the gain in overall calculation speed far outweighed the cost of constructing and scheduling tasks for reasonably expensive operations, and that re-using the cells for further time-points would allow more operations to be performed in parallel.</p>
<p>The second version replaced the Mutex lock with a “Latch Lock” pattern (used internally by the Oracle RDBMS) where objects are only explicitly locked if there is contention between threads.  This version introduced event subscriptions to propagate changes to dependent Cells like a spreadsheet, and a profiling mechanism to identify dependencies without the time (or errors) of defining dependencies in code.  This was used for Early Warning of Liquidity risk, where any number of movements in the price of {equity, FX, futures} instruments could trigger action to tighten risk appetite.</p>
<p>The third version combines the Latch-lock with state transition to provide lockless concurrency; <a href="https://en.wikipedia.org/wiki/Work_stealing">eager-steal</a> for waitless calculation on modern multi-core servers; moves history from within Cells to the session to remove garbage-collection contention; and adds initialisation-time profiling of <a href="https://en.wikipedia.org/wiki/Closure_(computer_programming)">closures</a>.
<code>Cell</code> is significantly faster for large complex calculations where a number of factors can change, and is well-suited to streaming price calculation and real-time risk derivation.</p>
<h1>Framework or Kernel</h1>
<p>The term “Cell Framework” is used because <a href="https://www.nuget.org/packages/Cephei.Cell/">Cephei.Cell</a> is the foundation for <a href="https://www.nuget.org/packages/Cephei.QL/">Cephei.QL</a>, which is being updated for .NET 5 to remove the Windows/Wine dependency. Cephei uses code generation to wrap underlying C++ quantitative finance functions into higher-level abstractions, allowing an Excel addin to be used to define a financial model that can be saved directly as a functional program – Excel becomes an editor for functional code.</p>
<p>While the Cell Framework replicates the <a href="https://en.wikipedia.org/wiki/Futures_and_promises">promise pattern</a> and the paradigm of spreadsheet cells, it is also a foundation for a different way of thinking about software building blocks, one that accentuates functional relationships between values rather than linear paths of derivation.</p>
<p>While <code>Cell</code> provides a mechanism to automatically calculate a number of functions in parallel, a <code>Model</code> provides a mechanism to encapsulate complexity and surface only the values that are input or output. A <code>Model</code> can contain other Models to build high-level abstractions for {Asset-Class agnostic Trade, Portfolio, Book, Ledger, etc}.  Cells within a <code>Model</code> can be changed at runtime (like a spreadsheet).</p>
<p><code>Session</code> provides a mechanism to group together changes to input values (e.g. a market data feed) without duplicate calculations (the same as the pattern of manual calculation in Excel), while <code>SessionStream</code> adds overlapping sessions for continuous calculation of high-level values (like RWA) in near-real-time.
<code>Cell</code> and <code>Model</code> provide the <code>IObservable</code>/<code>IObserver</code> pattern for event linkage with stream-based calculation.</p>
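<p>The event linkage can be pictured with plain .NET events, which expose the same <code>IObservable&lt;T&gt;</code> surface (this is a generic sketch of the pattern, not the Cephei.Cell API itself):</p>
<pre><code>open System

// A plain .NET event publishes an IObservable&lt;float&gt; surface
let changed = Event&lt;float&gt;()
let observable : IObservable&lt;float&gt; = changed.Publish :&gt; IObservable&lt;float&gt;

// Subscribe an observer; the callback fires on every published change
let mutable last = 0.0
let subscription = observable |&gt; Observable.subscribe (fun v -&gt; last &lt;- v)

changed.Trigger 42.0        // pushes the change to all observers; last = 42.0
subscription.Dispose ()     // unsubscribe when no longer interested
</code></pre>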
<h1>Implementation</h1>
<p><code>Cell</code> uses lockless concurrency, with thread synchronisation via <code>ManualResetEvent</code> for contention (when a value is needed but calculation has already commenced), using processor cache-bypass <a href="https://en.wikipedia.org/wiki/Compare-and-swap">Compare &amp; Swap</a> for the <code>SpinLock</code> and State pointer.</p>
<h2>Cell&lt;T&gt;</h2>
<p><code>Cell&lt;T&gt;</code> is implemented as a <a href="https://en.wikipedia.org/wiki/Finite-state_machine">Finite State Machine</a> where Operations and Events cause atomic state-transitions, ensuring that under no circumstance is a <code>Dirty</code> value read from a <code>Cell</code>, with Operating System events only used when threads are blocking for a value from a calculation currently being performed.
{{ "Cephei/Cell-state.png" | asset_url | img_tag }}</p>
<p>There are three specialisations of <code>Cell&lt;T&gt;</code> that are instantiated either through the <code>Cell</code> module, or (in the case of <code>CellEmpty</code>) through a <code>Model</code> that includes forward-references to Cells that have not been defined at that point in the <code>Model</code>.</p>
<h2>CellFast</h2>
<p>When it is known in advance that Cells and their references will not be redefined at runtime, and their references are not forward-referenced, <code>CellFast&lt;T&gt;</code> can be used to bypass runtime profiling of cells.  In this scenario <a href="https://en.wikipedia.org/wiki/Closure_(computer_programming)">closures</a> are inspected at instantiation to extract the cells that this <code>cell</code> is dependent on.</p>
<pre><code>let calculation_cell =
  let build (p : ICell&lt;'t&gt;) =
    Cell.CreateFast (fun () -&gt; some_complex_calculation p.Value)
  build referenced_cell
</code></pre>
<p>In this example <code>referenced_cell</code> is captured by the closure rather than the wider model (the <code>Cell.CreateFast</code> factory method should be used to avoid the calculation re-evaluating whenever <em>any</em> value in the <code>Model</code> changes); it will default to a <code>Cell&lt;T&gt;</code> object if there are no parameters for instantiation-time profiling.  <code>CellFast&lt;T&gt;</code> should only be used if you are comfortable with advanced functional-programming concepts.</p>
<h2>CellSpot</h2>
<p><code>CellSpot&lt;T&gt;</code> is a further specialisation of <code>CellFast&lt;T&gt;</code> where it is known in advance that the <code>Cell</code> will never be redefined, and the latest (spot) value should always be used (typically the FX rate for a portfolio).
{{ "Cephei/cell-class.png" | asset_url | img_tag }}</p>
<h2>Cell</h2>
<p>The static module <code>Cell</code> provides factory functions to create cells from F# using type-inference, plus a Thread Local stack of Cells currently being profiled.</p>
<blockquote>
<p>Any Cell that reads the content of another Cell while evaluating its function is, by definition, dependent on it.
For Boolean conditional logic, the condition needs to be in a separate cell in order for the expression to re-profile when the Boolean value changes:</p>
</blockquote>
<pre><code>let dependant_cell = Cell.Create (fun () -&gt; 
		if cond_cell.Value then 
			equity_trade.Value 
		else 
			credit_trade.Value)
</code></pre>
<h2>Model</h2>
<p>{{ "Cephei/cell-model.png" | asset_url | img_tag }}
<code>Model</code> provides a tree structured dictionary of cells in an overall model, but is different from a plain dictionary in a number of respects:</p>
<ul>
<li>It collects all events from each of the Cells (&amp; Models) within it, enabling a single subscription for changes to any part of a model</li>
<li>It provides <code>model.As&lt;T&gt;</code>("reference name") for type coercion of the value being referenced from different parts of the model, which also enables forward reference of cells that have not yet been defined within a model</li>
<li>Names passed to a model lookup use <code>'|'</code> as a delimiter, so "equity|hsbc|lon|fair_value_price" resolves to a reference to the <code>Cell</code> fair_value_price within the hierarchy of the model</li>
</ul>
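<p>As an illustrative sketch of the lookup described above (the hierarchy and cell name are hypothetical; <code>As&lt;T&gt;</code> and the <code>'|'</code> delimiter are as described):</p>
<pre><code>open Cephei.Cell

// given a Model assembled elsewhere with an equity|hsbc|lon hierarchy,
// resolve a typed reference to one cell deep inside it
let fairValuePrice (root : Model) : ICell&lt;float&gt; =
    // path segments are separated by '|'; if the target has not been
    // defined yet, a forward reference (CellEmpty) is returned and
    // bound when the cell is eventually created
    root.As&lt;float&gt; "equity|hsbc|lon|fair_value_price"
</code></pre>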
<h2>Session</h2>
<p>{{ "Cephei/cell-session.png" | asset_url | img_tag }}
<code>Session</code> provides consistency: derived cells are calculated with the same set of values that were current when the session was opened.  Typically a change to the spot value of a bond future will trigger changes to the price of quoted IRS instruments and long-dated government bonds, which appear to ripple along a yield curve and could trigger multiple valuations of dependent instruments – unless relative-value arbitrage is being sought, it is better to snap all price changes together and calculate once.</p>
<p><code>Session</code> has a <code>Current</code> thread-static reference that allows assignment to the value of a cell to be implicitly part of the session, with calculation delayed until the session is disposed – in this context <code>IDisposable.Dispose()</code> is not a euphemism for delete, because the session is passed through the event-notification methods and kept alive until all Cells have left the session, when it finally becomes eligible for garbage collection.<br>
Important values are:</p>
<ul>
<li>Scope - Cells that have joined the session in response to the "JoinSession" event or a Join call</li>
<li>Values - the boxed values of the cells referenced in the session, ensuring consistent reads</li>
</ul>
<p>The only way to guarantee that the value returned from <code>cell.Value</code> is consistent with the session (later sessions might be scheduled first) is to read the value using a <code>SessionObserver</code></p>
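<p>A sketch of the session pattern (the <code>Session</code> constructor argument and exact signatures are assumptions; the implicit join via <code>Current</code> is as described above):</p>
<pre><code>open Cephei.Cell

let eurUsd = Cell.CreateValue 1.0812
let gbpUsd = Cell.CreateValue 1.2695

// group several market-data updates into one consistent recalculation
let snap () =
    use session = new Session ("price-snap")
    // while Session.Current is set, assignments implicitly join the session
    eurUsd.Value &lt;- 1.0835
    gbpUsd.Value &lt;- 1.2710
    // Dispose() closes the session: dependent cells calculate once, and
    // the session stays alive until every joined Cell has left it
</code></pre>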
<h3>SessionStream</h3>
<p><code>SessionStream</code> provides a proxy to an interlaced stream of sessions that returns <code>_current</code> for <code>GetValue</code> calls and <code>_next</code> for <code>SetValue</code> calls, starting calculation of <code>_next</code> once the <code>_current</code> session has completed.</p>
<p><code>SessionStream</code> allows an event-stream subscriber to hold a single session open, with consistent sets of calculations provided as quickly as each calculation completes.
This is designed for real-time risk, where high-level portfolio calculations take time and cannot keep up with fast-moving markets.  An example would be a liquidity barometer: a decline in the liquidity-coverage ratio triggers an uptick in internal treasury cross-charging interest rates, which has the effect of reducing risk appetite and the quantity of trades executed.</p>
<h1>Usage</h1>
<p>The example of a floating-rate bond that provides NPV, CleanPrice and DirtyPrice from observations of deposit rates from one week to one year and swap rates from two to fifteen years allows the model to be used for:</p>
<ul>
<li>what-if analysis</li>
<li>market quotes</li>
<li>back-testing</li>
<li>real-time-risk through a simulation model</li>
<li>liquidity risk</li>
</ul>
<p>All without any changes to the quantitative model.  Changing the subscription used to provide deposit and swap rates, and adding onward subscriptions for the NPV, clean and dirty prices, allows the model to be used for different scenarios.</p>
<pre><code>namespace SampleModels

open Cephei.Cell
open Cephei.QL

type FloatingBondModel () as this =
    inherit Model ()
(* ... implementation ... *)
    // Index model for collection access
    do this.Bind()

    // Externally visible properties 
    member this.NPV         = NPV
    member this.CleanPrice  = CleanPrice
    member this.DirtyPrice  = DirtyPrice
    member this.Deposit1W   = d1wQuote    
    member this.Deposit1M   = d1mQuote
    member this.Deposit3M   = d3mQuote
    member this.Deposit6M   = d6mQuote
    member this.Deposit9M   = d9mQuote
    member this.Deposit1Y   = d1yQuote
    member this.Swap2Y      = s2yQuote
    member this.Swap3Y      = s3yQuote
    member this.Swap5Y      = s5yQuote
    member this.Swap10Y     = s10yQuote
    member this.Swap15Y     = s15yQuote
</code></pre>
<p>The full implementation of the model using <a href="https://www.nuget.org/packages/Cephei.QL">Cephei.QL</a> for <a href="https://www.quantlib.org/">QuantLib</a> functions demonstrates why F# has been selected as the model scripting language.</p>
<pre><code>namespace SampleModels

open Cephei.Cell
open Cephei.QL

type FloatingBondModel () as this =
    inherit Model ()

    let calendar            = Fun.Times.Calendars.TARGET.Create ()
    let settlementDate      = DateTime (2018, 9, 18)
    let fixingDays          = 3u
    let settlementDays      = 3u
    let todaysDate          = Cell.Create (fun ()-&gt; calendar.Advance (settlementDate, -(int fixingDays), QL.Times.TimeUnitEnum.Days, None, None))
    let S                   = Cell.Create (fun () -&gt; Fun.Sessions.Create (todaysDate.Value))

    (*
        Rate helpers
    *)
    let zc3mQuote           = 0.0096
    let zc6mQuote           = 0.0145
    let zc1yQuote           = 0.0194

    let zc3mRate            = Fun.Quotes.SimpleQuote.Create (Some zc3mQuote)
    let zc6mRate            = Fun.Quotes.SimpleQuote.Create (Some zc6mQuote)
    let zc1yRate            = Fun.Quotes.SimpleQuote.Create (Some zc1yQuote)

    let zcBondsDayCounter   = Fun.Times.Daycounters.Actual365Fixed.Create () 

    let months m            = Fun.Times.Period.Create (m, QL.Times.TimeUnitEnum.Months)
    let ModifiedFollowing   = QL.Times.BusinessDayConventionEnum.ModifiedFollowing
    let zc3m                = Fun.Termstructures.Yield.DepositRateHelper.Create (zc3mRate, (months 3), fixingDays, calendar, ModifiedFollowing, true, zcBondsDayCounter) :&gt; QL.Termstructures.Yield.IRateHelper
    let zc6m                = Fun.Termstructures.Yield.DepositRateHelper.Create (zc6mRate, (months 6), fixingDays, calendar, ModifiedFollowing, true, zcBondsDayCounter) :&gt; QL.Termstructures.Yield.IRateHelper
    let zc1y                = Fun.Termstructures.Yield.DepositRateHelper.Create (zc1yRate, (months 12), fixingDays, calendar, ModifiedFollowing, true, zcBondsDayCounter) :&gt; QL.Termstructures.Yield.IRateHelper

    let termStrucDayCounter = Fun.Times.Daycounters.ActualActual.Create (Some QL.Times.Daycounters.ActualActual.ConventionEnum.ISDA)
    let tolerance           = 1.0e-15
    
    (*
        Bond Data
    *)
    let redemption          = 100.0

    let issueDates          = [ DateTime (2015, 3, 15)
                              ; DateTime (2015, 6, 15)
                              ; DateTime (2016, 6, 30)
                              ; DateTime (2012, 11, 15)
                              ; DateTime (1997, 5, 15)
                              ]

    let maturities          = [ DateTime (2020, 8, 31)
                              ; DateTime (2021, 8, 31)
                              ; DateTime (2023, 8, 31)
                              ; DateTime (2028, 8, 31)
                              ; DateTime (2048, 5, 15)
                              ]
 
    let couponRates         = [ 0.02375
                              ; 0.04625
                              ; 0.03125
                              ; 0.04000
                              ; 0.04500
                              ]

    let marketQuotes        = [ 100.390625
                              ; 106.21875
                              ; 100.59375
                              ; 101.6875
                              ; 102.140625
                              ]

    let two2one a b         = List.map2 (fun e y -&gt; (e,y)) a b
    let combine             = two2one (two2one issueDates maturities) (two2one couponRates marketQuotes)

    let quote               = List.map (fun q -&gt; Fun.Quotes.SimpleQuote.Create (Some q)) marketQuotes
    let quoteRate r         = Fun.Quotes.SimpleQuote.Create (Some r)
    let usCalendar          = Fun.Times.Calendars.UnitedStates.Create (Some QL.Times.Calendars.UnitedStates.MarketEnum.GovernmentBond)
    let unadjusted          = QL.Times.BusinessDayConventionEnum.Unadjusted 
    let backward            = QL.Times.DateGeneration.RuleEnum.Backward
    let semiannual          = Fun.Times.Period.Create (QL.Times.FrequencyEnum.Semiannual)
    let schedule i m        = Fun.Times.Schedule.Create (i, m, semiannual, usCalendar, unadjusted, unadjusted, backward, false, None, None)
    let coupons q           = Fun.Doubles.CreateVector ([q])
    let actualActualBond    = Fun.Times.Daycounters.ActualActual.Create (Some QL.Times.Daycounters.ActualActual.ConventionEnum.Bond)
    let fixedHelper q s c i = Fun.Termstructures.Yield.FixedRateBondHelper.Create (q, settlementDays, redemption, s, c, actualActualBond, Some unadjusted, Some redemption, Some i) :&gt; QL.Termstructures.Yield.IRateHelper
    let schedules           = List.map2 schedule issueDates maturities
    let rateHelpers         = List.map (fun ((i,m),(c,q))-&gt; fixedHelper (quoteRate q) (schedule i m) (coupons c) i) combine

    let bondInstruments     = Fun.Vector ([zc3m;zc6m;zc1y] @ rateHelpers)
    let bondTermStructure   = Fun.Termstructures.Yield.PiecewiseYieldCurveDiscountLogLinear.Create (settlementDate, bondInstruments, termStrucDayCounter, tolerance)

    (*
        curve building
    *)

    // Building of the Libor forecasting curve
    // deposits
    let d1wQuote            = Cell.CreateValue 0.043375
    let d1mQuote            = Cell.CreateValue 0.031875
    let d3mQuote            = Cell.CreateValue 0.0320375
    let d6mQuote            = Cell.CreateValue 0.03385
    let d9mQuote            = Cell.CreateValue 0.0338125
    let d1yQuote            = Cell.CreateValue 0.0335125
    // swaps
    let s2yQuote            = Cell.CreateValue 0.0295
    let s3yQuote            = Cell.CreateValue 0.0323
    let s5yQuote            = Cell.CreateValue 0.0359
    let s10yQuote           = Cell.CreateValue 0.0412
    let s15yQuote           = Cell.CreateValue 0.0433

    // SimpleQuote stores a value which can be manually changed;
    // other Quote subclasses could read the value from a database
    // or some kind of data feed.

    // deposits

    let d1wRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d1wQuote.Value))
    let d1mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d1mQuote.Value))
    let d3mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d3mQuote.Value))
    let d6mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d6mQuote.Value))
    let d9mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d9mQuote.Value))
    let d1yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d1yQuote.Value))
    // swaps
    let s2yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s2yQuote.Value))
    let s3yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s3yQuote.Value))
    let s5yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s5yQuote.Value))
    let s10yRate            = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s10yQuote.Value))
    let s15yRate            = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s15yQuote.Value))

    let depositDayCounter   = Fun.Times.Daycounters.Actual360.Create ()

    let period n t          = match t with
                              | 'w' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Weeks)
                              | 'm' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Months)
                              | 'y' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Years) 
                              | 'd' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Days) 
                              | _   -&gt; raise (new Exception ("invalid period type"))

    let d1w                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d1wRate.Value, (period 1 'w'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d1m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d1mRate.Value, (period 1 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d3m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d3mRate.Value, (period 3 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d6m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d6mRate.Value, (period 6 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d9m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d9mRate.Value, (period 9 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d1y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d1yRate.Value, (period 1 'y'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))

    // setup swaps
    let annual              = Cell.Create (fun () -&gt; QL.Times.FrequencyEnum.Annual)
    let thirty360European   = Cell.Create (fun () -&gt; Fun.Times.Daycounters.Thirty360.Create (Some QL.Times.Daycounters.Thirty360.ConventionEnum.European))
    let forwardStart        = Cell.Create (fun () -&gt; period 1 'd')
    let swFloatingLegIndex  = Cell.Create (fun () -&gt; Fun.Indexes.Ibor.Euribor.Create (period 6 'm'))
    let s2y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s2yRate.Value, (period 2 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s3y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s3yRate.Value, (period 3 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s5y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s5yRate.Value, (period 5 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s10y                = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s10yRate.Value, (period 10 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s15y                = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s15yRate.Value, (period 15 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))

    (*
        Curve building
    *)
    let tr p : QL.Termstructures.Yield.IRateHelper = p :&gt; QL.Termstructures.Yield.IRateHelper
    let depoSwapInstruments = Cell.Create (fun () -&gt; Fun.Vector ([tr d1w.Value; tr d1m.Value;tr d3m.Value;tr d6m.Value;tr d9m.Value;tr d1y.Value;tr s2y.Value;tr s3y.Value;tr s5y.Value;tr s10y.Value;tr s15y.Value]))
    let depoSwapTermStructure = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.PiecewiseYieldCurveDiscountLogLinear.Create (settlementDate, depoSwapInstruments.Value, termStrucDayCounter, tolerance))

    (*
        Bonds to be priced
    *)
    let faceAmount          = 100.0
    let bondEngine          = Fun.Pricingengines.Bond.DiscountingBondEngine.Create(Some (bondTermStructure :&gt; QL.Termstructures.IYieldTermStructure), None) 
    let following           = QL.Times.BusinessDayConventionEnum.Following 

    // Floating rate bond (3M USD Libor + 0.1%)
    // Should and will be priced on another curve later...

    let libor3m             = Cell.Create (fun () -&gt;
                              let t = Fun.Indexes.Ibor.USDLibor.Create ((period 3 'm'), Some (depoSwapTermStructure.Value :&gt; QL.Termstructures.IYieldTermStructure))
                              t.AddFixing (new DateTime(2018,07,17), 0.0278625, None) :?&gt; QL.Indexes.IIborIndex)
    let quarterly           = Fun.Times.Period.Create (QL.Times.FrequencyEnum.Quarterly)
    let usNYSE              = Fun.Times.Calendars.UnitedStates.Create (Some QL.Times.Calendars.UnitedStates.MarketEnum.NYSE)
    let floatingBondSchedule = Fun.Times.Schedule.Create (DateTime(2015,10,21), DateTime(2020,10,21), quarterly, usNYSE, unadjusted, unadjusted, QL.Times.DateGeneration.RuleEnum.Backward, true, None, None)

    // Coupon pricers

    let pricer              = Fun.Cashflows.BlackIborCouponPricer.Create (None)
    let volatility          = 0.0
    let Actual365Fixed      = Fun.Times.Daycounters.Actual365Fixed.Create ();
    let vol                 = Fun.Termstructures.Volatility.Optionlet.ConstantOptionletVolatility.Create (settlementDate, calendar, ModifiedFollowing,volatility, Actual365Fixed)
    //use fluent interface to set capvol
    let capletPricer        = pricer.SetCapletVolatility (Some (vol :&gt; QL.Termstructures.Volatility.Optionlet.IOptionletVolatilityStructure))

    let floatingRateBond    = Cell.Create (fun () -&gt; Fun.Instruments.Bonds.FloatingRateBond.Create (capletPricer, settlementDays, faceAmount, floatingBondSchedule, libor3m.Value, depositDayCounter, Some ModifiedFollowing, Some 2u, Some (coupons 1.0), Some (coupons 0.001), None, None, Some true, Some faceAmount, Some (DateTime(2015, 10, 21)), bondEngine))

    let NPV                 = Cell.Create (fun () -&gt; floatingRateBond.Value.NPV)
    let CleanPrice          = Cell.Create (fun () -&gt; floatingRateBond.Value.CleanPrice())
    let DirtyPrice          = Cell.Create (fun () -&gt; floatingRateBond.Value.DirtyPrice())

    // Index model for collection access
    do this.Bind()

    // Externally visible properties 
    member this.NPV         = NPV
    member this.CleanPrice  = CleanPrice
    member this.DirtyPrice  = DirtyPrice
    member this.Deposit1W   = d1wQuote    
    member this.Deposit1M   = d1mQuote
    member this.Deposit3M   = d3mQuote
    member this.Deposit6M   = d6mQuote
    member this.Deposit9M   = d9mQuote
    member this.Deposit1Y   = d1yQuote
    member this.Swap2Y      = s2yQuote
    member this.Swap3Y      = s3yQuote
    member this.Swap5Y      = s5yQuote
    member this.Swap10Y     = s10yQuote
    member this.Swap15Y     = s15yQuote
</code></pre>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:38:50 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/CellFramework</guid>
    </item>
    <item>
      <title>Cephei Sample</title>
      <link>https://www.cepheis.com/blog/cephei-sample</link>
      <description><![CDATA[<h1>Introduction</h1>
<p>Cephei was conceived on the premise that Excel is not <strong>inherently</strong> a bad tool for <em>prototyping</em> a <code>model</code>, but it is a poor tool for <em>operating</em> a business that depends on the <code>model</code>.</p>
<p>The architectural impedance issue is that it has historically been difficult to translate the <em>prototype</em> model from a spreadsheet into a reliable application.  Cephei addresses this by replicating the <code>Cell</code> notification paradigm and designing the quantitative function implementations with code-generation metadata.</p>
<p>Cephei.XL can be viewed as a <a href="https://en.wikipedia.org/wiki/Low-code_development_platform">Low-code</a> development environment for a <a href="https://en.wikipedia.org/wiki/Digital_twin">Financial Digital Twin</a></p>
<p>Cephei can be used directly with QuantLib analytics, or as an archetype when integrating with another financial library</p>
<h1>Cephei.XL Solution</h1>
<p>The Cephei.XL solution provides a comprehensive Quantitative Finance library of 15,000 functions that can be used like a traditional Excel XLL addin, but without the need to disable automatic calculation, because formulas are refreshed by <a href="https://docs.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function">RTD</a> only when they change, and <strong>always</strong> in the background.</p>
<p>At any point the corresponding source code (model and Excel addin functions) can be generated from the Cephei menu and compiled to executable code for deployment to a pricing server or <a href="https://en.wikipedia.org/wiki/Digital_twin">Financial Digital Twin</a>.
Cephei models use the <a href="https://www.cepheis.com/CellFramework">Cephei Cell Framework</a> to mirror the automatic dependency tracking of Excel, but with parallel calculation within Excel or a server application like <code>Cephei.Orleans</code></p>
<h1>Sample</h1>
<p>The simplest example uses a fixed-rate bond, entered interactively in Excel using the Excel Formula wizard, with standard evaluation and validation as each formula is entered.  All Cephei functions are prefixed with <code>_</code>; mnemonics are prefixed with <code>+</code> for model parameters and <code>-</code> for private formulas that are not exported as properties or Excel addin functions</p>
<h2>Spreadsheet formula</h2>
<p><a href="https://cepheis.blob.core.windows.net/$web/samples/Sample.xlsx">Sample.xlsx</a> open with  <a href="https://www.cepheis.com/Download/XL32">Cephei 32-bit</a>  or <a href="https://www.cepheis.com/Download/XL64">Cephei 64-bit</a>
<img src="/blog/media/blogs/Blogs/Cephei/Sample-n.png"></p>
<h4>Formula view</h4>
<p><img src="/blog/media/blogs/Blogs/Cephei/sample-f.png"></p>
<h2>Generated Source Code</h2>
<pre><code>namespace Cephei.Models

open QLNet
open Cephei.QL
open Cephei.QL.Util
open Cephei.Cell
open Cephei.Cell.Generic
open System
open System.Collections

type FixedBond 
    ( Tenor : ICell&lt;Int32&gt;
    , Maturity : ICell&lt;Date&gt;
    , FixedAmount : ICell&lt;Double&gt;
    ) as this =
    inherit Model ()

(* functions *)
    let _Calendar = Fun.TARGET()
    let _Today = (value DateTime.Today)
    let _clock = Fun.Date1 (triv (fun () -&gt; int (_Today.Value.ToOADate())))
    let _PriceDay = _Calendar.Adjust _clock (value BusinessDayConvention.Following)
    let _DayCount = Fun.ActualActual1 (value ActualActual.Convention.ISMA) (value (null :&gt; Schedule))
    let _Quote = Fun.SimpleQuote1 (triv (fun () -&gt; toNullable (0.03)))
    let _Tenor = Tenor
    let _Frequency = Fun.Period2 (value Frequency.Annual)
    let _FlatForward = Fun.FlatForward _PriceDay (triv (fun () -&gt; _Quote.Value :&gt; Quote)) (triv (fun () -&gt; _DayCount.Value :&gt; DayCounter))
    let _Maturity = Maturity
    let _Coupon = cell (new Generic.List&lt;double&gt;([| Convert.ToDouble(0.02); Convert.ToDouble(0.05); Convert.ToDouble(0.08)|]))
    let _ExCoupon = Fun.Period1()
    let _Settlement = (value (Convert.ToInt32(0)))
    let _FixedAmount = FixedAmount
    let _Engine = Fun.DiscountingBondEngine (triv (fun () -&gt; toHandle&lt;YieldTermStructure&gt; (_FlatForward.Value))) (triv (fun () -&gt; toNullable (true)))
    let _Schedule = Fun.Schedule _PriceDay _Maturity _Frequency (triv (fun () -&gt; _Calendar.Value :&gt; Calendar)) (value BusinessDayConvention.Unadjusted) (value BusinessDayConvention.Unadjusted) (value DateGeneration.Rule.Backward) (value false) (value (null :&gt; Date)) (value (null :&gt; Date))
    let _Bond = Fun.FixedRateBond _Settlement _FixedAmount _Schedule _Coupon (triv (fun () -&gt; _DayCount.Value :&gt; DayCounter)) (value BusinessDayConvention.ModifiedFollowing) _FixedAmount _PriceDay (triv (fun () -&gt; _Calendar.Value :&gt; Calendar)) _ExCoupon (triv (fun () -&gt; _Calendar.Value :&gt; Calendar)) (value BusinessDayConvention.Following) (value false) (triv (fun () -&gt; _Engine.Value :&gt; IPricingEngine)) _PriceDay
    let _CleanPrice = _Bond.CleanPrice()
    let _DirtyPrice = _Bond.DirtyPrice()
    let _NPV = _Bond.NPV()
    let _Cash = _Bond.CASH()

    do this.Bind ()

(* Externally visible/bindable properties *)
    member this.Today = _Today
    member this.Quote = _Quote
    member this.Tenor = _Tenor
    member this.Frequency = _Frequency
    member this.Maturity = _Maturity
    member this.FixedAmount = _FixedAmount
    member this.clock = _clock
    member this.CleanPrice = _CleanPrice
    member this.DirtyPrice = _DirtyPrice
    member this.NPV = _NPV
    member this.Cash = _Cash

#if EXCEL
module FixedBondFunction =

    [&lt;ExcelFunction(Name="__FixedBond", Description="Create a FixedBond",Category="Cephei Models", IsThreadSafe = false, IsExceptionSafe=true)&gt;]
    let FixedBond_create
        ([&lt;ExcelArgument(Name="Mnemonic",Description = "Identifer for the value")&gt;] 
         mnemonic : string)
        ([&lt;ExcelArgument(Name="__Tenor",Description = "reference to Int32")&gt;]
        Tenor : obj)
        ([&lt;ExcelArgument(Name="__Maturity",Description = "reference to Date")&gt;]
        Maturity : obj)
        ([&lt;ExcelArgument(Name="__FixedAmount",Description = "reference to Double")&gt;]
        FixedAmount : obj)

        = 
        if not (Model.IsInFunctionWizard()) then

            try
                let _Tenor = Helper.toCell&lt;Int32&gt; Tenor "Tenor"
                let _Maturity = Helper.toCell&lt;Date&gt; Maturity "Maturity"
                let _FixedAmount = Helper.toCell&lt;Double&gt; FixedAmount "FixedAmount"

                let builder (current : ICell) = withMnemonic mnemonic (new FixedBond
                                                            ( _Tenor.cell
                                                            , _Maturity.cell
                                                            , _FixedAmount.cell
                                                            )
                                                       ) :&gt; ICell
                let format (i : ICell) (l:string) = Helper.Range.fromModel (i :?&gt; FixedBond) l
                let source () = Helper.sourceFold "new FixedBond"
                                               [| _Tenor.source
                                               ;  _Maturity.source
                                               ;  _FixedAmount.source
                                               |]

                let hash = Helper.hashFold
                                [| _Tenor.cell
                                ;  _Maturity.cell
                                ;  _FixedAmount.cell
                                |]
                Model.specify 
                    { mnemonic = Model.formatMnemonic mnemonic
                    ; creator = builder
                    ; subscriber = Helper.subscriberModel&lt;FixedBond&gt; format
                    ; source = source 
                    ; hash = hash
                    } :?&gt; string
                        with
                        | _ as e -&gt;  "#" + e.Message
        else
            "&lt;WIZ&gt;"
</code></pre>
<h1>Download</h1>
<p>The Excel addin can be downloaded from <a href="https://www.cepheis.com/">Cepheis</a></p>
<ul>
<li><a href="https://cepheis.blob.core.windows.net/$web/XL32.zip">Excel (32-bit)</a></li>
<li><a href="https://cepheis.blob.core.windows.net/$web/XL64.zip">Excel (64-bit)</a></li>
<li><a href="https://github.com/channell/Cephei/">Source code</a><br>
The <strong>public</strong> version of the addin includes basic telemetry to track errors and usage of functions by user (but not data) in order to assist with support during evaluation.  A release version without telemetry is available upon request</li>
</ul>
<h1>Summary</h1>
<p>Whilst Cephei can be used directly as a Quantitative Finance library for structuring, pricing and as a summary blotter, it is designed to be used:</p>
<ul>
<li>As a model editor for prototyping with ticking market data, saving directly to code that can be deployed without additional development</li>
<li>As a recipe editor for model parts that are generated and compiled for use within more complex models (the sample model encapsulates the additional objects – schedule, termsheet, pricing engine – needed for QuantLib)</li>
<li>As a foundation for risk simulation models – the Cell framework is designed for massively parallel Monte Carlo simulation of exposure for real-time risk (Cephei is named after <a href="https://en.wikipedia.org/wiki/Delta_Cephei">Delta Cephei</a>)</li>
</ul>
<p>Further information is available at <a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> , <a href="https://www.cepheis.com/blog/introducing-cephei-xl">Cephei.XL</a> and <a href="https://www.cepheis.com/CellFramework">Cephei Cell</a></p>
<p>Contact <a>feedback@cepheis.com</a></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:42:59 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-sample</guid>
    </item>
    <item>
      <title>Introducing Cephei.XL</title>
      <link>https://www.cepheis.com/blog/introducing-cephei-xl</link>
      <description><![CDATA[<h1>Background</h1>
<p>For more than a quarter of a century Microsoft Excel has been the preferred “desktop” for traders and fund managers to develop financial models for bond and derivative trades.  Excel provides integration for market-data vendors to supply <strong>Real-Time Data</strong> (<a href="https://docs.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function">RTD</a>) directly into spreadsheets, and integration for advanced finance libraries to build models.
Spreadsheet models were/<em>are</em> a key enabler for the development of models for financing contracts, but for more than ten years there have been programmes to replace these models with applications that are controlled by product control, compliant with regulatory commitments and re-valued for risk exposure at market close.<br>
There has been considerable resistance to this process because the applications do not provide the flexibility of building on a desktop and testing with live prices.</p>
<h2>Another approach</h2>
<p>Cephei was conceived from the perspective that Excel is not <em>inherently evil</em> - it is unrivalled for the ability to prototype and test models – but conversion to reliable models is difficult: <strong>if</strong> we can automate the process of generating code from models, we don’t necessarily need to re-implement in imperative languages.</p>
<h1>Cephei Solution</h1>
<p>The Cephei solution is to replicate the good facets of spreadsheets, and to design the finance libraries to enable the <em>automatic generation</em> of code from models:</p>
<ol>
<li>The <a href="https://www.cepheis.com/CellFramework">Cephei Cell Framework</a> mirrors the spreadsheet notion that a <code>Cell</code> can contain a value or a formula, and that formulas are updated automatically whenever an underlying value changes.  The <code>Cell</code> framework is faster than imperative calculation because calculations are performed in parallel</li>
<li><a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> is a building block combining the <a href="https://www.cepheis.com/CellFramework">Cephei Cell Framework</a> and the <a href="https://www.quantlib.org/">QuantLib</a> open-source quantitative finance library</li>
<li><a href="https://www.cepheis.com/blog/introducing-cephei-xl">Cephei.XL</a> is a complete Excel add-in providing access to all functions of <a href="https://www.quantlib.org/">QuantLib</a> from Excel</li>
</ol>
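<p>The <code>Cell</code> notion above can be sketched minimally. The following is an illustrative Python toy (not the real F#/.NET <code>Cephei.Cell</code> API, which is parallel and thread-safe; all names here are invented): a cell holds either a value or a formula over other cells, and setting a value marks every dependant formula for recalculation.</p>

```python
# Illustrative sketch only: a value-or-formula cell with change propagation.
# Invented names; the real Cephei.Cell framework is F#/.NET and parallel.
class Cell:
    def __init__(self, value=None, formula=None, deps=()):
        self._value, self.formula = value, formula
        self._dirty = formula is not None
        self._dependants = []
        for d in deps:                      # register for invalidation
            d._dependants.append(self)

    @property
    def value(self):
        if self._dirty:                     # recalculate lazily on first read
            self._value, self._dirty = self.formula(), False
        return self._value

    @value.setter
    def value(self, v):
        self._value = v
        self._invalidate()

    def _invalidate(self):
        for d in self._dependants:          # mark the whole downstream graph
            if not d._dirty:
                d._dirty = True
                d._invalidate()

# spreadsheet-style usage: `price` tracks `quote` automatically
quote = Cell(value=100.0)
spread = Cell(value=0.5)
price = Cell(formula=lambda: quote.value + spread.value, deps=(quote, spread))
assert price.value == 100.5
quote.value = 101.0                         # invalidates `price`
assert price.value == 101.5
```

<p>The real framework additionally performs the recalculation on a thread-pool rather than lazily on read, which is where the parallel speed-up comes from.</p>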
<h2>Cephei.XL overview</h2>
<p>Cephei.XL uses <a href="https://docs.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function">RTD</a> as an object cache to avoid the need for complex logic to track formula changes as functions are called over and over again by the Excel calculation engine.
The Cephei Cell Framework ensures that when a formula is changed, all dependent cells are re-calculated in parallel, with updates passed back to Excel as RTD data-change notifications.
Irrespective of the complexity of the model, the interactive performance of Excel (<em>apart from the initial start-up time of adding 15,000 functions to Excel</em>) remains responsive – all calculation is performed in parallel by the <code>Cell</code> thread-pool.
While Cephei.XL can be used as an efficient Quant library that interoperates well with Bloomberg, it is designed from the foundation with code generation (“save as code”) in mind: Cephei.XL does not create handle references; instead, all functions take a Mnemonic parameter that is used as a property name when code is generated.</p>
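<p>The mnemonic/handle idea can be sketched as a cache keyed by mnemonic. This is a hypothetical Python illustration of the concept only; <code>define</code> and <code>value</code> are invented stand-ins, not the add-in's API.</p>

```python
# Hypothetical sketch of a mnemonic-keyed object cache: worksheet functions
# return the mnemonic as the handle, so repeated recalculation of the same
# formula reuses the cached object instead of rebuilding it.
_cache = {}

def define(mnemonic, factory):
    """Create (or reuse) the object for `mnemonic`; return the handle."""
    if mnemonic not in _cache:
        _cache[mnemonic] = factory()
    return mnemonic

def value(mnemonic):
    """Dereference a handle, in the spirit of `_value(mnemonic)`."""
    return _cache[mnemonic]

builds = []
define("CURVE.EUR", lambda: builds.append(1) or "eur-curve")
define("CURVE.EUR", lambda: builds.append(1) or "eur-curve")  # recalc: cache hit
assert value("CURVE.EUR") == "eur-curve"
assert len(builds) == 1                     # the object was built only once
```

<p>Because the handle is the mnemonic itself, the same name can later become a property name when the spreadsheet is saved as code.</p>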
<h2>Implementation</h2>
<p>Cephei.XL exports 15,000 worksheet functions from 600,000 lines of code, all of which start with <code>_</code>. The simplest are <code>_clock()</code> and <code>_today()</code>, which update every second or whenever the machine clock rolls into another day.
All functions (except <code>_value(mnemonic)</code> and <code>_value_range(mnemonic, layout)</code>) return the handle provided.<br>
The <code>_value_range()</code> layout is one of:</p>
<ol>
<li>“C” for column layout</li>
<li>“R” for row layout</li>
<li>“CT” for columns with titles</li>
<li>“RT” for rows with titles</li>
</ol>
<p>When the layout includes “T” for a complex object (e.g. <code>Bond</code>), all properties (dirtyprice, cleanprice, yield, cash, etc.) are provided in a table.
The alpha code can be downloaded from <a href="https://www.cepheis.com/">Cepheis</a>:</p>
<ul>
<li><a href="https://www.cepheis.com/Download/XL32">Excel (32-bit)</a></li>
<li><a href="https://www.cepheis.com/Download/XL64">Excel (64-bit)</a></li>
<li><a href="https://www.cepheis.com/Download/XL32Debug">Excel (32-bit debug)</a></li>
<li><a href="https://www.cepheis.com/Download/XL64Debug">Excel (64-bit debug)</a></li>
<li><a href="https://github.com/channell/Cephei/">Source code</a></li>
</ul>
<h2>Positioning</h2>
<p>Cephei.XL is designed to be used like an interactive editor – when the model is complete, you can click generate to translate it to an F# model that can be deployed to a server (<code>Cephei.Orleans</code> will provide a “Financial Digital Twin” for hosting active cloud models).
QuantLib is a fairly comprehensive library of financial methods that are widely used <em>or copied</em> in investment banking, buy-side valuation and <a href="https://www.thegoldensource.com/solutions/banks-brokers/ipv-pruval/">IPV</a>.  Cephei.XL can be used directly, or as a proof-of-concept of how Quant add-ins <em>should</em> be done.</p>
<h1>Model Driven Architecture</h1>
<p><a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> and <a href="https://www.cepheis.com/blog/introducing-cephei-xl">Cephei.XL</a> are examples of the benefit of using model-driven architecture: most of the code has been produced using code generation from a <a href="https://sparxsystems.com/">Sparx Enterprise Architect</a> software model of the underlying QuantLib functions.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:43:29 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/introducing-cephei-xl</guid>
    </item>
    <item>
      <title>Enterprise Architecture global scalability</title>
      <link>https://www.cepheis.com/blog/enterprise-architecture-global-scalability</link>
<description><![CDATA[<p>Enterprise Architecture covers a range of practices, from business strategy through the structure and organisation of applications to detailed design using enterprise frameworks.  While Business, Applications and Technology Architecture are often developed independently to address different stakeholder needs, the value of modelling is enhanced when different views are reconciled through realisation, trace and dependency relationships.
In complex organisations, high-level taxonomies carry more influence when they represent the aggregation of applications and services, while application criticality is better understood when related to high-level functions and critical processes.</p>
<h1>A Universal Bank example</h1>
<p>A universal bank (covering Retail, Corporate, Investment and Wealth) commissioned a number of initiatives using <a href="https://sparxsystems.com/">Sparx Enterprise Architect</a> to model the organisation in distinct repositories.</p>
<h3>Service Architecture</h3>
<p>Focused on common modelling of applications with a common layout of charter, requirements, high-level design, and functional and detailed design.  Using a common repository allowed dependencies between applications and common frameworks to be highlighted.
Detailed design of fifteen thousand application components was captured using UML notation, with <a href="https://www.sparxsystems.com/resources/mdg_tech/">MDG meta-model customisation</a> for additional properties</p>
<h3>Process Architecture</h3>
<p>Focused on the processes performed across the organisation, from client onboarding, through trading, to risk management and financial accounting.  The Enterprise Process Model was organised around a hierarchical process taxonomy that provided a <em>Bank on a Page</em> process view, with <a href="https://www.sparxsystems.com/resources/mdg_tech/">MDG meta-model customisation</a> for references to common applications, organisational hierarchy and other business taxonomies.
Detailed process modelling of twenty thousand activities, decision gateways and events was captured using <a href="https://sparxsystems.com/enterprise_architect_user_guide/14.0/model_domains/bpmn_1_4.html">BPMN</a> notation.</p>
<h3>Functional Architecture</h3>
<p>Focused on the Functional Taxonomy to identify sponsorship for application and services, to coordinate the governance of change with escalation points to mitigate the impact of issues.
Functional Architecture provided the current and future state of the business and allowed common capabilities to be identified, and transitioned to standard functions/services.</p>
<h3>Domain Architecture</h3>
<p>Focused on end-to-end design for specific domains/applications without the imposition of taxonomies or structure; but with the freedom to mix function, data, process, application design as appropriate for each initiative.</p>
<h2>Enterprise Architecture</h2>
<p>The challenge for Enterprise Architecture was to bring together different views and enrich each viewpoint with context provided by the different models.  Rationalisation served to bridge the gaps between the viewpoints and provide fresh impetus to keep models current as design moved through implementation to maintenance. Rationalisation faced a number of challenges:</p>
<ul>
<li>Size and complexity of models and metamodel customisation tested, and exceeded, the capability of <a href="https://sparxsystems.com/resources/share.html">XMI</a> transfer of repositories when projects did not share the same package hierarchy but had dependencies between projects</li>
<li>The need to retain domain-specific viewpoints, so that different aspects were not referenced inappropriately (such as a <code>Customer</code> Actor, component, table or class being referenced when <code>Customer</code> process lane was needed)</li>
<li>The need for <a href="https://www.sparxsystems.com/resources/mdg_tech/">MDG meta-model customisation</a> to continue to evolve to meet domain presentation needs</li>
<li>Inability to coordinate/schedule the suspension of modelling activities to allow consolidation</li>
<li>Performance of huge models for interactive modelling activities</li>
</ul>
<h3>Value</h3>
<p>While the impediments to enterprise rationalisation were considerable, the potential benefits of requirement traceability and enterprise lineage were also considerable.
<a href="https://sparxsystems.com/">Sparx Enterprise Architect</a> is almost unique in its ability to model an organisation from high-level Enterprise Architecture <em>and</em> detailed application design through to the real-time interaction of timing-sensitive applications. Sparx Enterprise Architect allows an interbank-interface class to be related to the strategic function, requirement and data that is needed for regulatory reporting.</p>
<h1>Solution</h1>
<p>The initial solution was a batch database <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">ETL</a> process, intended as a one-time load, but evolved into the need for an <a href="https://www.cepheis.com/EA/Hub">Enterprise Hub</a> to provide continuous replication between models.
Today, the <a href="https://www.cepheis.com/EA/Hub">Enterprise Hub</a> provides continuous replication between domain-specific models, and an enterprise repository, applying viewpoint rules to changes:</p>
<ol>
<li>Status filters to prevent unfinished changes being replicated, and to separate <code>as-is</code> and <code>to-be</code> viewpoints</li>
<li>Content filters to prevent application or database content being replicated to a consolidated process viewpoint</li>
</ol>
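<p>The two viewpoint rules can be sketched as filters over a change stream. This is a hypothetical Python illustration; the statuses, element types and field names are invented, not the Enterprise Hub's actual configuration.</p>

```python
# Hypothetical sketch of viewpoint rules applied during replication:
# a status filter drops unfinished changes, and a content filter drops
# element types that do not belong in the target viewpoint.
APPROVED = {"Approved", "Implemented"}
PROCESS_VIEWPOINT_EXCLUDES = {"Component", "Table", "Class"}

def replicate(changes, excludes=PROCESS_VIEWPOINT_EXCLUDES):
    """Yield only changes that pass both viewpoint rules."""
    for change in changes:
        if change["status"] not in APPROVED:
            continue                      # rule 1: status filter
        if change["type"] in excludes:
            continue                      # rule 2: content filter
        yield change

changes = [
    {"name": "Customer", "type": "Table", "status": "Approved"},    # filtered
    {"name": "Onboarding", "type": "Process", "status": "Draft"},   # filtered
    {"name": "Onboarding", "type": "Process", "status": "Approved"},
]
assert [c["name"] for c in replicate(changes)] == ["Onboarding"]
```

<p>In this toy, only the approved process element reaches the consolidated viewpoint; the database table and the draft change are both held back.</p>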
<p>The Enterprise Hub also allows for globally distributed repositories with local performance.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:44:15 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/enterprise-architecture-global-scalability</guid>
    </item>
    <item>
      <title>Implementing Cephei.QL</title>
      <link>https://www.cepheis.com/blog/implementing-cephei-ql</link>
      <description><![CDATA[<h1>Background</h1>
<p><a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> is a large project, consisting of over two thousand classes and twenty thousand <a href="https://en.wikipedia.org/wiki/Closure_(computer_programming)">closures</a>. It has been implemented in F# because parallel execution needs to be functionally immutable to avoid co-mutation of data on separate threads, and because the problem is simpler when types can be inferred in many places.</p>
<p>Whilst there is a huge amount of code that needs to be produced, with some exceptions the pattern of mapping an underlying library to a higher-level abstraction is common.</p>
<h1>Overview</h1>
<p>This blog is about how model-driven architecture can be applied to the problem of creating a library that integrates a number of underlying sources.  Traditionally, architecture modelling tools approach this problem by translating classes to a metadata-rich source where the use of templates allows the implementation to be derived at runtime.</p>
<p>For Cephei.QL, code generation has been used because:</p>
<ul>
<li>The target language (F#) is not supported by the current generation of software engineering tools.</li>
<li>The permutations are more complex than is supported by templates and would require a runtime code generator to be used.</li>
<li>Runtime generation is not appropriate for high-performance production environments.</li>
<li>The target library is intended to provide recipes for specific implementations, that can be tailored to specific problems.</li>
</ul>
<p>While this example is concerned with the generation of abstractions over a large financial library (library -&gt; cell models -&gt; serialisation -&gt; Excel integration), the problem is the same as implementing data structures that need to be mapped from JavaScript through application tiers to data persistence.</p>
<h2>Components</h2>
<p>Sparx Enterprise Architect is the leading tool for software engineering because of the wide range of languages supported and its tools for <a href="https://en.wikipedia.org/wiki/Model-driven_architecture">Model-Driven Architecture</a>. In this scenario it is also extremely strong because the underlying repository supports SQL queries and therefore rich database integration.</p>
<p><a href="https://www.nuget.org/packages/EA.Gen.Model/">EA.Gen.Model</a> is a .NET class library (that we developed years ago) that uses the <a href="https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/entity-data-model">Entity Data Model</a> to provide a high-level object-graph view of the underlying Sparx Repository, with <a href="https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/introduction-to-linq-queries">Linq Queries</a>.
There are many templating technologies based on the principles for web-server pages that generate HTML for browsers, but the one we use is <a href="https://docs.microsoft.com/en-us/visualstudio/modeling/code-generation-and-t4-text-templates?view=vs-2019">T4 Text Templates</a> because it can be used within a Build Server (<a href="https://docs.microsoft.com/en-us/azure/devops/pipelines/overview?view=azure-devops-2020">TFS</a>, <a href="https://www.jetbrains.com/teamcity/">Teamcity</a>, <a href="https://www.jenkins.io/">Jenkins</a>, etc) without having to configure open access.</p>
<h1>Implementation</h1>
<p>The full Visual Studio 2019 code for <a href="https://www.cepheis.com/Product/Gen">Cephei.Gen</a> is available from <a href="https://github.com/channell/Cephei/tree/master/Cephei.Gen/">GitHub</a>, but includes the C++ code generator that was previously used to wrap <a href="https://www.quantlib.org/">QuantLib</a> before switching to the C# port <a href="https://github.com/amaggiulli/qlnet">QLNet</a>. The <a href="https://github.com/channell/Cephei/blob/master/Cephei.Gen/NetModel/Class.cs">NetModel/Class.cs</a> provides the data model for classes, and <a href="https://github.com/channell/Cephei/blob/master/Cephei.Gen/NetQL/Class.tt">NetQL/Class.tt</a> provides the template for the bulk of the Cephei.QL code.</p>
<p>After reverse engineering the source code into Sparx, the code generator can be run to generate F# code.</p>
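<p>The generation step can be sketched as template expansion over a class description read from the repository. The Python below is a simplified, hypothetical stand-in for the T4 template; the emitted F# shape is illustrative, not the actual <code>NetQL/Class.tt</code> output.</p>

```python
# Simplified stand-in for the T4-based generator: read a class description
# (as it might come from the Sparx repository tables) and expand a text
# template into wrapper source.  The emitted F# shape is illustrative only.
FS_TEMPLATE = """type {name}Model ({params}) =
    inherit Model ()
{members}
"""

def generate(cls):
    """Expand the template for one class description."""
    params = ", ".join(f"{p} : ICell" for p in cls["properties"])
    members = "\n".join(f"    member this.{p.capitalize()} = {p}"
                        for p in cls["properties"])
    return FS_TEMPLATE.format(name=cls["name"], params=params, members=members)

bond = {"name": "Bond", "properties": ["coupon", "maturity"]}
print(generate(bond))
```

<p>The real generator walks the reverse-engineered model via EA.Gen.Model rather than a dictionary, but the principle is the same: one template, many classes.</p>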
<h1>Conclusion</h1>
<p>For a wide variety of problems, model-driven architecture can be used to accrue greater value from the models we create during systems design, increasing productivity and reducing the errors introduced when code is written by hand.</p>
<p><img src="/blog/media/blogs/Blogs/Cephei/Cephei.QL.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:44:51 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/implementing-cephei-ql</guid>
    </item>
    <item>
      <title>Cephei.QL</title>
      <link>https://www.cepheis.com/blog/cephei-ql</link>
      <description><![CDATA[<h1>Overview</h1>
<p>With <a href="https://www.cepheis.com/CellFramework">Cephei.Cell</a> we introduced a Cell Framework that allows computationally intensive problems to be declared as a series of functional definitions that the runtime calculates in parallel.
<a href="https://www.nuget.org/packages/Cephei.QL/">Cephei.QL</a> applies the Cell Framework to the <a href="https://github.com/amaggiulli/QLNet">QLNet</a> port of the <a href="https://www.quantlib.org/">QuantLib</a> quantitative finance library to provide a series of pre-canned model building blocks for all QuantLib classes that can be assembled into complete models for a financial instrument or portfolio. The full source is available on <a href="https://github.com/channell/Cephei">GitHub</a></p>
<p>Each Quant class is wrapped as a <code>Model</code>, each property is wrapped as an <code>ICell&lt;&gt;</code>, and each function is wrapped as a function taking <code>ICell&lt;&gt;</code> parameters and returning an <code>ICell&lt;&gt;</code> wrapper of the return value.  There are three exceptions:</p>
<ol>
<li><code>IPricingEngine</code> is added to the constructor of Instruments that use a pricing engine to follow a functional idiom</li>
<li>Evaluation Date is added to the constructor of Instruments so that a common source of date is available to all properties/functions that need a date for pricing, and is common across all instruments that will be valued together.</li>
<li>Methods that do not return a value instead return the reference object, to allow them to be chained together as fluent functions (e.g. <code>FedFund</code>.<code>AddFixing</code> returns <code>FedFund</code>)</li>
</ol>
<p>Despite the overhead of constructing Cells rather than evaluating functions imperatively, the overall performance of a non-trivial model is significantly quicker because evaluation is performed in parallel, with rendezvous happening when prices need to be aggregated.</p>
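<p>The wrapping rule (functions take <code>ICell&lt;&gt;</code> parameters and return an <code>ICell&lt;&gt;</code>) can be sketched by lifting an ordinary function to cells. This Python toy uses invented names (<code>Cell</code>, <code>lift</code>); the real wrappers are generated F#.</p>

```python
# Sketch of the wrapping rule: an ordinary function over values becomes a
# function over cells that returns a cell recomputing from its inputs.
# `Cell` and `lift` are invented for illustration; the real ICell is F#/.NET.
class Cell:
    def __init__(self, value=None, formula=None):
        self._value, self.formula = value, formula
    @property
    def value(self):
        return self.formula() if self.formula else self._value
    @value.setter
    def value(self, v):
        self._value = v

def lift(fn):
    """Turn f(a, b, ...) into f(cell_a, cell_b, ...) returning a Cell."""
    return lambda *cells: Cell(formula=lambda: fn(*(c.value for c in cells)))

# e.g. a holding amount that always reflects its inputs
holding = lift(lambda face, qty: face * qty)
face, qty = Cell(value=100), Cell(value=3)
total = holding(face, qty)
assert total.value == 300
qty.value = 5                               # recomputed on the next read
assert total.value == 500
```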
<h2>Rationale</h2>
<p>QuantLib is not a natural library for functional wrapping (because of the internal observer/observable pattern), but Cephei.QL demonstrates that any library can be wrapped for functional definition.
<code>Cephei.QL</code> (and <code>Cephei.Cell</code>) encapsulates the dependencies within the model, and only exposes value cells that can be edited and function cells that provide results that <em>always</em> reflect the value sources, irrespective of the change.
Financial models that are presented as a dictionary of values with <code>IObservable</code>/<code>IObserver</code> event linkage can be wired together without needing to know the internal structure, and used as financial building blocks.</p>
<h2>Purpose</h2>
<p>Cephei.QL provides a building block for Cephei.XL (which supports the construction of models in spreadsheets that can be saved as F# code), and for embedding in systems, including “Financial Digital Twins” for real-time risk.</p>
<h1>Usage</h1>
<p>The <a href="https://www.nuget.org/packages/Cephei.QL/">Cephei.QL</a> NuGet package includes the Cephei.QL assembly with 2,000+ models, and a <code>cell</code> module that provides functions to create models, together with utility functions: <code>cell</code> to construct cells with type inference, and <code>triv</code> for trivial (lookup) functions. Behaviour depends on the value of <code>Cell.Lazy</code> (false for construction at definition time) and <code>Cell.Parallel</code> (true for parallel execution).
Each model is declared with three sections:</p>
<ul>
<li>Parameters : references to the source cells.</li>
<li>Functions : declarative functions that perform calculations using other cells within the model.</li>
<li>Externally visible binding: cells and cell functions that can be bound and serialised to backing store.</li>
</ul>
<h2>Example</h2>
<p>This model is based on the QuantLib Bond example. It uses <code>Cephei.QL</code> blocks to represent a small portfolio of Fixed Rate Bonds with different {tenor, coupon rates, payment frequencies, and yield rates}, allows the {Face Value, Quantity, Redemption} to be edited, and provides a Market Clean Price.</p>
<p>External to the Bond model, two models are provided for business-wide properties and market conditions that are used to change the valuation through event propagation.</p>
<h3>Business Standards</h3>
<pre><code>type BusinessStandards () as this =
    inherit Model ()

    let accrualConvention           = value BusinessDayConvention.Unadjusted
    let paymentConvention           = value BusinessDayConvention.ModifiedFollowing
    let settlementDays              = value 3
    let dayCount                    = value (new ActualActual (ActualActual.Convention.ISMA) :&gt; DayCounter)
    let includeSettlementDate       = value (new System.Nullable&lt;bool&gt; (true))

    do this.Bind ()

    member this.AccrualConvention   = accrualConvention
    member this.PaymentConvention   = paymentConvention
    member this.SettlementDays      = settlementDays
    member this.DayCount            = dayCount
    member this.IncludeSettlement   = includeSettlementDate
</code></pre>
<h3>Market Condition</h3>
<pre><code>type MarketCondition 
    ( standards                     : BusinessStandards ) as this =
    inherit Model ()

    let toNullable (v : double)     = new System.Nullable&lt;double&gt; (v)

    let calendar                    = Fun.TARGET()
    let clockDate                   = value Date.Today;
    let convention                  = value BusinessDayConvention.Following
    let today                       = calendar.Adjust clockDate convention

    do this.Bind ()

    member this.Today               = today
    member this.Calendar            = calendar
    member this.ClockDate           = clockDate
</code></pre>
<p>This trivial model provides a clock date that is incremented in the test example, and calculates a cashflow date using a calendar and date-adjustment convention.  Whenever the clock date is changed, the update to <code>Today</code> is sent to all cells dependent on this value.</p>
<h3>Bond</h3>
<pre><code>type BondPortfolio 
    ( standards                     : BusinessStandards 
    , marketCondition               : MarketCondition
    ) as this =
    inherit Model ()

    let calendar                    = triv (fun () -&gt; marketCondition.Calendar.Value :&gt; Calendar)
(* … *)
    
    let makeBond issue length coupon  (frequency : ICell&lt;Period&gt;) yieldVal = 
        let today = marketCondition.Today.Value
        let dated = triv (fun () -&gt; today)      // don't reset on valuation date
        let nullDate = value (null :&gt; Date)
        let maturity = marketCondition.Calendar.Advance1 dated length years standards.PaymentConvention eom 
        let schedule = Fun.Schedule dated maturity frequency calendar standards.AccrualConvention standards.AccrualConvention dateGenerationRule eom nullDate nullDate
        let yieldCurve = triv (fun () -&gt; (makeYield yieldVal))
        let engine = Fun.DiscountingBondEngine yieldCurve standards.IncludeSettlement 
        let castEngine = triv (fun () -&gt; engine.Value :&gt; IPricingEngine)
        let exCouponPeriod = value (null :&gt; Period)
        let b = Fun.FixedRateBond standards.SettlementDays faceAmount schedule coupon bondDayCount standards.PaymentConvention redemption issue calendar exCouponPeriod calendar convention eom castEngine marketCondition.Today
        b.Mnemonic &lt;- "B" + id.ToString()
        id &lt;- id + 1
        b

    let bonds = 
        seq {for l in lengths do
                for c in coupons do 
                    for f in frequencies do
                        for y in yields do
                            (l,c,f, y)}
        |&gt; Seq.map (fun (l,c,f, y) -&gt; makeBond marketCondition.Today l c f y)
        |&gt; Seq.toArray

    let cleanPrices                 = bonds |&gt; Array.map (fun i -&gt; i.CleanPrice) 

    let cleanPrice                  = cell (fun () -&gt; cleanPrices |&gt; Seq.fold (fun a y -&gt; a + y.Value * quantity.Value) 0.0)
        
    do this.Bind ()

    member this.Amount              = faceAmount
    member this.Quantity            = quantity
    member this.Redemption          = redemption

    member this.CleanPrice          = cleanPrice
</code></pre>
<p>This model uses the business standards, market conditions and a set of permutations to build Fixed Rate Bonds by constructing the schedule, yield curve and engine. The user of the model does not need to know the intermediate steps that QuantLib uses to build a Bond.  Refactoring the yield-curve functionality to be shared via market conditions is a simple task that is transparent to users.</p>
<h5>Test</h5>
<pre><code>    [&lt;TestMethod&gt;]
    member this.TestLazy () =

        let lots = 
            seq { for n in 1..60 do
                    new BondPortfolio (standards, market)}
            |&gt; Seq.toList

        let r = 
            seq { for c in 0..100 do
                    market.ClockDate.Value &lt;- market.ClockDate.Value + c
                    let cleanPrice = lazy (lots |&gt; List.fold (fun a y -&gt; a + y.CleanPrice.Value) 0.0 )
                    Console.WriteLine ("Lazy, {1}, {0}", cleanPrice.Value, market.Today.Value)
                    cleanPrice
                    } |&gt; Seq.toArray

        Assert.IsTrue(true);
</code></pre>
<p>The test case generates 60 portfolios and retrieves the clean price for 100 time points, but the event could equally be market prices, or a what-if of changing the settlement period.  Enabling <code>Cell.Parallel &lt;- true</code> reduced runtime by a factor of four on my workstation.
The key take-away is that, even with the cost of propagating calculations on background threads, the runtime is still shorter on the multi-core computers that are now common.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:45:25 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-ql</guid>
    </item>
  </channel>
</rss>