<rss version="2.0">
  <channel>
    <title>Blog</title>
    <link>https://www.cepheis.com/blog</link>
    <description><![CDATA[News from Cepheis]]></description>
    <item>
      <title>Enterprise Transitive Edge</title>
      <link>https://www.cepheis.com/blog/blog/enterprise-transitive-edge</link>
      <description><![CDATA[<p><a href="https://www.cepheis.com/blog/blog/transitive-edge">Transitive edge</a> introduced the idea of a transitive edge as a projection of the edges between nodes that can be used to hide the details of the internal connections within a graph, allowing you to focus on the start and end of the graph.  Genealogy family trees use just two basic edge types {mother, father} and one derived edge (child) - which in <code>Hiperspace</code> are views projected from relational facts: <code>Person { Name: 'Mark', Mother: 'Mary', Father: 'John'}</code> can be projected as a <em>node</em> and four <em>edges</em> <code>{ Mark -(mother)-&gt; Mary; Mark -(father)-&gt; John;  Mark &lt;-(child)- Mary; Mark &lt;-(child)- John; }</code> that do not need to be stored to be accessed.</p>
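<p>The projection can be sketched in a few lines of Python (an illustrative sketch only; the record shape and edge names are assumptions, not the <code>Hiperspace</code> API):</p>

```python
# Project a relational fact into a node and four derived edges.
# The dictionary shape and edge names are illustrative only.
def project_edges(person):
    name = person["Name"]
    return [
        (name, "mother", person["Mother"]),
        (name, "father", person["Father"]),
        (person["Mother"], "child", name),
        (person["Father"], "child", name),
    ]

mark = {"Name": "Mark", "Mother": "Mary", "Father": "John"}
# Four edges that never need to be stored to be accessed
print(project_edges(mark))
```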
<h2>Problem</h2>
<p>Suppose you have an <em>organisation</em> with three divisions, each with <strong>100</strong> <em>employees</em>, spread across 10 <em>countries</em> with <strong>30</strong> <em>employees</em> in each.  If the <em>chief executive</em> has <strong>3</strong> <em>divisional</em> and <strong>10</strong> <em>country</em> reports, how many <em>people</em> in total report to the <em>chief</em>? The logical answer is <strong>300</strong>, provided you don't double count!  You can achieve the aggregate by one of:</p>
<ul>
<li>Use a global list in addition to the <em>division</em> and <em>country</em> lists</li>
<li>Use domain knowledge: count only division or country reporting, but not both</li>
<li>Use split allocation and aggregate the allocations - a problem arises if you later introduce a split allocation between <em>country</em> and <em>region</em>, leading to double counting</li>
<li>Use transitive reporting: ignore the intermediate edges and focus on <code>employee -&gt; chief</code></li>
</ul>
<p>Transitive Edges solve all of these problems by <em>hiding</em> the intermediate edges.  The problem becomes exponentially more complex when we consider Enterprise Architecture and the myriad connections between nodes.</p>
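<p>The double-counting pitfall can be shown with a toy Python sketch, using the numbers from the example above:</p>

```python
# 300 employees each report to the chief through both a division
# and a country hierarchy (toy data, numbers from the text).
employees = {f"emp{i}" for i in range(300)}
via_division = set(employees)   # aggregated up the division hierarchy
via_country = set(employees)    # aggregated up the country hierarchy

# Naive aggregation over both hierarchies double counts:
print(len(via_division) + len(via_country))   # 600

# A transitive edge employee -> chief hides the intermediate edges,
# so each employee contributes exactly one relation:
transitive = via_division | via_country
print(len(transitive))                        # 300
```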
<p><img src="/blog/media/Sites/hiperspace/togaf.png"></p>
<p>If you want to prioritise <em>work-packages</em> for a <em>System</em> by the <em>business goal</em> being addressed, there are myriad different paths between the <em>work-package</em> and <em>business-goal</em>, with potentially many duplicates.  What we really want is to transitively project all the paths between <em>work-package</em> and <em>goal</em> and prioritise by <em>business goal</em>.</p>
<h2>Solution</h2>
<p>The <code>Hiperspace</code> <code>TransitiveEdge</code> provides a mechanism to declare <em>Goals</em> as an extension property of <em>Work-Package</em>.  The <a href="https://github.com/channell/Hiperspace/blob/master/examples/TOGAF/TOGAF.hilang">TOGAF</a> sample includes the <code>segment</code> <code>Togaf.Has.WorkPackage</code>, which has been extended with a property <code>StrategicEdges</code> that is derived from a function, and <code>Goals</code> that projects the set of <code>TransitiveEdge</code> as <em>Goal</em> properties.</p>
<pre><code>segment Togaf.Has.WorkPackage : Togaf.Base 
    = Node        ( SKey = SKey, Name = Name, TypeName = "AF-WorkPackage"),
      Edge        (From = owner, To = this, Name = Name, TypeName = "AF-Has-WorkPackage") ,
      Togaf.Edge2 (From = this, To = owner, Name = Name, TypeName = "AF-WorkPackage-For") ,
      Graph.TransitiveEdge = StrategicEdges 
[
    "All Edges that can be projected as Transitative Edges to a Business Goal"
    @Once
    StrategicEdges  = StrategicEdge(this),
    Goals           = Goals(StrategicEdges)
];
</code></pre>
<p>The functions use the <code>Graph.Route</code> definition:</p>
<table>
<thead>
<tr>
<th>From Type</th>
<th>To Type</th>
<th>^</th>
<th>Edge Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>AF-CourseOfAction</td>
<td>AF-Goal</td>
<td></td>
<td>AF-CourseOfAction-Goal</td>
</tr>
<tr>
<td>AF-Capability</td>
<td>AF-CourseOfAction</td>
<td></td>
<td>AF-Capability-Related</td>
</tr>
<tr>
<td>AF-Function</td>
<td>AF-CourseOfAction</td>
<td></td>
<td>AF-Function-CourseOfAction</td>
</tr>
<tr>
<td>AF-Capability</td>
<td>AF-Capability</td>
<td>^</td>
<td>AF-Capability-Part</td>
</tr>
<tr>
<td>AF-Function</td>
<td>AF-Function</td>
<td>^</td>
<td>AF-Function-Part</td>
</tr>
<tr>
<td>AF-Process</td>
<td>AF-Function</td>
<td></td>
<td>AF-Process-Function</td>
</tr>
<tr>
<td>AF-Process</td>
<td>AF-Capability</td>
<td></td>
<td>AF-Process-Capability</td>
</tr>
<tr>
<td>AF-Activity</td>
<td>AF-Process</td>
<td></td>
<td>AF-Activity-Process</td>
</tr>
<tr>
<td>AF-Service</td>
<td>AF-Activity</td>
<td></td>
<td>AF-Activity-Service</td>
</tr>
<tr>
<td>AF-System</td>
<td>AF-Service</td>
<td></td>
<td>AF-System-Service</td>
</tr>
<tr>
<td>AF-Component</td>
<td>AF-System</td>
<td></td>
<td>AF-Component-System</td>
</tr>
<tr>
<td>AF-Deployed</td>
<td>AF-Component</td>
<td></td>
<td>AF-Deployed-Component</td>
</tr>
<tr>
<td>AF-Platform</td>
<td>AF-Service</td>
<td></td>
<td>*</td>
</tr>
<tr>
<td>AF-Platform</td>
<td>AF-Platform</td>
<td>^</td>
<td>*</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Function</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Capability</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Goal</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Activity</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Process</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-CourseOfAction</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Service</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-System</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Component</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Deployed</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
<tr>
<td>AF-WorkPackage</td>
<td>AF-Platform</td>
<td></td>
<td>AF-WorkPackage-For</td>
</tr>
</tbody>
</table>
<p>"<strong>^</strong>" denotes recursive search up through the hierarchy of {<em>platform</em>, <em>function</em>, <em>business-capability</em>}.</p>
<p>The <strong>strategic goal</strong> for each and every <em>work-package</em> can be found by:</p>
<ol>
<li>Reading the <code>Goals</code> (set) property of <code>WorkPackage</code></li>
<li>Querying the view <code>TransitiveEdges</code> with SQL: "<code>SELECT Name, "From", To, Width, Length FROM TransitiveEdges WHERE To.TypeName = 'AF-Goal';</code>"</li>
<li>Using a SQL join of <code>WorkPackage</code>: "<code>SELECT w.*, e.* FROM WorkPackages AS w, w.StrategicEdges AS e;</code>" - the join here appears to be a cross-join, but only because the sub-table <code>StrategicEdges</code> has an implicit join to the work-package.</li>
</ol>
<h2>Execution</h2>
<p><code>Hiperspace</code> uses a parallel <a href="https://en.wikipedia.org/wiki/Breadth-first_search">breadth-first search</a> to search the graph of nodes, with a recursive back-search to eliminate cyclic paths.  Each <code>Edge</code> set is accessed with a parallel search of each element <code>SetSpace</code> that provides the <code>Edge</code> view, which (for very large datasets) can also search <em>partitions</em> and <em>generations</em> in parallel.</p>
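<p>The shape of the search can be sketched sequentially in Python (Hiperspace performs it in parallel; the function and data shapes here are illustrative assumptions only):</p>

```python
from collections import deque

# Breadth-first enumeration of paths, dropping any step that would
# revisit a node already on the current path (cycle elimination).
def paths(start, targets, edges, max_length):
    results = []
    queue = deque([(start,)])
    while queue:
        path = queue.popleft()
        if len(path) > max_length:
            continue
        node = path[-1]
        if node in targets and len(path) > 1:
            results.append(path)
            continue
        for frm, to in edges:
            if frm == node and to not in path:  # back-search: no cycles
                queue.append(path + (to,))
    return results

# C -> A closes a cycle, but the search still terminates
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(paths("A", {"D"}, edges, 4))   # [('A', 'B', 'C', 'D')]
```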
<p>Parallel search, and direct access to facts rather than bulk data retrieval, means that even huge datasets are processed very quickly.</p>
]]></description>
      <pubDate>Sat, 29 Mar 2025 14:24:19 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/blog/enterprise-transitive-edge</guid>
    </item>
    <item>
      <title>Transitive Edge</title>
      <link>https://www.cepheis.com/blog/blog/transitive-edge</link>
      <description><![CDATA[<p>I've been developing a new feature for Hiperspace that extends the <code>Node</code> and <code>Edge</code> model with a new type, <code>TransitiveEdge</code>, that can be transitively extended to project relationships as a <em>Butterfly</em> graph that shows end-to-end relationships between nodes as relations that do not need a recursive query to examine.</p>
<p>If you wanted to show the <em>costs</em> for an <strong>organization</strong> of operating a <strong>service</strong>, you'd need to include costs associated with:</p>
<ul>
<li>The activities performed with the service</li>
<li>The activities performed by other functions to support the front-line activities</li>
<li>The applications, databases and other software used by the service</li>
<li>The hardware the software is hosted on</li>
<li>The cost of space within a data-centre, and power usage</li>
</ul>
<p>Aggregating all this information would require a traversal of the entire graph of information, and allocation of a proportion of the costs.  The aggregation would cover a number of disciplines and potentially a number of steps to accumulate the information.</p>
<p>The problem of finding relevant information can be simplified using a <code>graph-view</code> that treats all the data as nodes and edges, but that still leaves the problem of how to recursively query the data.  <strong>Transitive-Edge</strong> addresses this problem by folding the entire graph of edges into a single relation that can be queried directly.  If <em>Cost</em> is transitive then A -&gt; B -&gt; C -&gt; D can be transitively projected as A -&gt; D.</p>
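<p>The folding can be sketched in Python (illustrative only; the edge and cost shapes are assumptions, not the Hiperspace types):</p>

```python
# Fold a chain of (from, to, cost) edges into one transitive edge.
# If Cost is transitive, A -> B -> C -> D projects as A -> D with
# the costs accumulated along the chain.
def fold(chain):
    total = sum(cost for _, _, cost in chain)
    return (chain[0][0], chain[-1][1], total)

chain = [("A", "B", 10.0), ("B", "C", 5.0), ("C", "D", 2.5)]
print(fold(chain))  # ('A', 'D', 17.5)
```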
<h1>Transitive-Edge</h1>
<p>Transitive Edge uses the principle that an edge can be <em>projected</em> to a transitive-edge provided it follows one of the rules of the transitive route.  For our purposes, those rules can be simplified to a set of {from-type, edge-type, to-type} tuples. A family-tree is a good simple example: it uses just three <code>Edge</code> types {father, mother, child}, but a whole family tree can be projected from them using the Transitive Edge <em>relation</em>.</p>
<p>In this example <em>Mark</em> is the child of <em>Mary</em>, who is the child of <em>Jane</em>, who is the child of <em>Eve</em>, through the <em>Mother</em> <code>Edge</code>. <em>Mark</em> has a transitive <code>Edge</code> <strong>relation</strong> to <em>Eve</em> because each of the edges between them follows the relation route.</p>
<p><img src="/blog/media/Sites/hiperspace/Transative Edge.png"></p>
<p>The data-model extends the <code>Node</code> and <code>Edge</code> model, adding four additional fields:</p>
<table>
<thead>
<tr>
<th>Role</th>
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Projects</td>
<td>Edge</td>
<td>The final Edge that is transitively projected to a high-level relationship</td>
</tr>
<tr>
<td>Extends</td>
<td>Source</td>
<td>The Transitive Edge that has been extended to provide this connection</td>
</tr>
<tr>
<td></td>
<td>Length</td>
<td>The shortest number of Edges that have been traversed for this view, in the example Mark -&gt; Eve the length is 3</td>
</tr>
<tr>
<td></td>
<td>Width</td>
<td>The number of distinct routes that are summarized by this Transitive Edge</td>
</tr>
</tbody>
</table>
<p><img src="/blog/media/Sites/hiperspace/TransitativeEdge.png"></p>
<p>While <em>Length</em> and <em>Width</em> are mostly incidental in this example, they are very useful when considering cost allocation for application usage.</p>
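<p>How <em>Length</em> and <em>Width</em> relate to the underlying routes can be sketched as follows (illustrative Python, not the Hiperspace data-model types):</p>

```python
# Length = shortest number of edges traversed; Width = number of
# distinct routes summarized by the transitive edge.
def summarize(routes):
    length = min(len(route) for route in routes)
    width = len(routes)
    return length, width

# Mark -> Mary -> Jane -> Eve is one route of three Mother edges
print(summarize([["mother", "mother", "mother"]]))            # (3, 1)
# A sibling is reached by two routes (via Mother and via Father)
print(summarize([["mother", "child"], ["father", "child"]]))  # (2, 2)
```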
<h2>Implementation</h2>
<p>The Hiperspace architecture makes it relatively simple and extremely fast to project these transitive edges without the need to store them in an intermediate database.  The Hiperspace <code>Graph.PathFunctions.Paths()</code> function takes the following parameters:</p>
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>root</td>
<td>Node, or any element that can be viewed as a Node</td>
<td>the topic of the search for Transitive Edges</td>
</tr>
<tr>
<td>route</td>
<td>A Route model, with rules</td>
<td>the Transitive Route model used to project Edges as Transitive Edges</td>
</tr>
<tr>
<td>length</td>
<td>int</td>
<td>The maximum number of edges to consider in the parallel search for Transitive Edges</td>
</tr>
<tr>
<td>targets</td>
<td>Set of Node TypeName</td>
<td>the end target types that should be returned</td>
</tr>
</tbody>
</table>
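<p>A hypothetical Python analogue of those parameters can make the idea concrete. All names and data shapes here are assumptions for illustration, not the real <code>Graph.PathFunctions.Paths()</code> implementation; the route rules reuse the TOGAF-style type names from the companion post:</p>

```python
# Rule-filtered reachability: an edge can extend a path only when
# (from-type, to-type, edge-type) matches a route rule ('*' matches
# any edge type).  Sketch only, not the Hiperspace API.
def find_paths(root, rules, max_length, targets, edges, node_type):
    results, frontier = [], [[root]]
    for _ in range(max_length):
        next_frontier = []
        for path in frontier:
            for frm, edge_type, to in edges:
                if frm != path[-1] or to in path:
                    continue
                ok = any(f == node_type[frm] and t == node_type[to]
                         and e in ("*", edge_type)
                         for f, t, e in rules)
                if not ok:
                    continue
                new_path = path + [to]
                if node_type[to] in targets:
                    results.append(new_path)
                else:
                    next_frontier.append(new_path)
        frontier = next_frontier
    return results

# Illustrative route rules and toy graph data
rules = [
    ("AF-WorkPackage", "AF-Function", "AF-WorkPackage-For"),
    ("AF-Function", "AF-CourseOfAction", "AF-Function-CourseOfAction"),
    ("AF-CourseOfAction", "AF-Goal", "AF-CourseOfAction-Goal"),
]
node_type = {"wp1": "AF-WorkPackage", "fn1": "AF-Function",
             "ca1": "AF-CourseOfAction", "g1": "AF-Goal"}
edges = [("wp1", "AF-WorkPackage-For", "fn1"),
         ("fn1", "AF-Function-CourseOfAction", "ca1"),
         ("ca1", "AF-CourseOfAction-Goal", "g1")]

print(find_paths("wp1", rules, 4, {"AF-Goal"}, edges, node_type))
# [['wp1', 'fn1', 'ca1', 'g1']]
```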
<h2>Derived Edge</h2>
<p>For family-trees, a further derivation is possible to define {Brother, Sister, Grandmother, Grandfather, Aunt, Uncle, Cousin} edges from the transitive relations for a person by examining the Transitive-Edge or Edges that are referenced by it.</p>
<table>
<thead>
<tr>
<th>Length</th>
<th>Width</th>
<th>Edges</th>
<th>Name</th>
</tr>
</thead>
<tbody>
<tr>
<td>2</td>
<td>2</td>
<td>Parent -&gt; Child</td>
<td><strong>Brother</strong>, <strong>Sister</strong> (depending on gender). Width is <em>2</em> since the routes are by Mother and Father</td>
</tr>
<tr>
<td>2</td>
<td>1</td>
<td>Parent -&gt; Child</td>
<td><strong>Half-Brother</strong>, <strong>Half-Sister</strong>.  Width is <em>1</em> since the route is by either Mother or Father, but not both</td>
</tr>
<tr>
<td>2</td>
<td></td>
<td>Parent -&gt; Parent</td>
<td><strong>Grandmother</strong>, <strong>Grandfather</strong></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td>Parent -&gt; Parent -&gt; Child</td>
<td><strong>Aunt</strong>, <strong>Uncle</strong></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td>Parent -&gt; Child -&gt; Child</td>
<td><strong>Niece</strong>, <strong>Nephew</strong></td>
</tr>
<tr>
<td>..</td>
<td>..</td>
<td>..</td>
<td>..</td>
</tr>
</tbody>
</table>
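<p>The derivations in the table can be sketched as a simple rule function (illustrative Python; the tuple shapes and relation names are assumptions):</p>

```python
# Derive a named relation from the Length, Width and edge sequence
# of a transitive edge, following the table above.
def derive(length, width, edges):
    if edges == ("Parent", "Child") and length == 2:
        return "Sibling" if width == 2 else "Half-Sibling"
    if edges == ("Parent", "Parent"):
        return "Grandparent"
    if edges == ("Parent", "Parent", "Child"):
        return "Aunt/Uncle"
    if edges == ("Parent", "Child", "Child"):
        return "Niece/Nephew"
    return None

print(derive(2, 2, ("Parent", "Child")))   # Sibling
print(derive(2, 1, ("Parent", "Child")))   # Half-Sibling
```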
<p>Family-trees normally include a line between two people for <strong>marriage</strong>, but there is no unique term to describe the relationship; somewhat amusingly, <strong>ChatGPT</strong> can be persuaded to hallucinate that this relationship is <a href="https://cepheis.blob.core.windows.net/$web/metabonkers.html">Bonkers</a>.</p>
]]></description>
      <pubDate>Fri, 28 Mar 2025 18:39:33 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/blog/transitive-edge</guid>
    </item>
    <item>
      <title>Hiperspace Notebook</title>
      <link>https://www.cepheis.com/blog/blog/hiperspace-notebook</link>
      <description><![CDATA[<p><a href="https://jupyter.org/">Jupyter</a> notebooks are a great technology for interactive development in Python, with built-in integration for graph visualisation tools, particularly for data science. Today, they are most commonly used with Visual Studio Code.  With the popularity of Jupyter notebooks, Microsoft has added <a href="https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode">Polyglot</a> notebooks to extend the technology to languages other than Python, notably C#, F# and R.</p>
<p>For the <a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/notebook.ipynb">Cousins</a> sample we're using the F# interactive environment, because F# is a functional language designed for interactive development rather than an adapted scripting language.  The script closely follows the <a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/Test.cs">C# unit test</a>.</p>
<p>The notebook presents a family tree of eleven people as a graph of nodes with parent/child relations:</p>
<table>
<thead>
<tr>
<th>Family tree</th>
<th>Graph View</th>
</tr>
</thead>
<tbody>
<tr>
<td><img src="/blog/media/Sites/hiperspace/cousins-white.svg"></td>
<td><img src="/blog/media/Sites/hiperspace/diagrams/graph.png"></td>
</tr>
</tbody>
</table>
<p>When inferred relations (<em>cousins, aunts, grandparents</em>) are added, the graph grows, obscuring the hierarchical nature of the graph.</p>
<p><img src="/blog/media/Sites/hiperspace/diagrams/graph-all.png"></p>
<p>We often expend so much effort extracting, shaping, and loading data into graph stores that we forget it is just a means to an end.</p>
<p><img src="/blog/media/Sites/hiperspace/diagrams/Graph-Lucy.png"></p>
<p><a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/notebook.ipynb">Cousins</a>  was built using F#, but could just as easily have been built with Python using pythonnet (<a href="https://github.com/channell/Hiperspace/blob/master/examples/CousinProblem/test.py">test.py</a>)</p>
]]></description>
      <pubDate>Thu, 08 Aug 2024 19:49:06 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/blog/hiperspace-notebook</guid>
    </item>
    <item>
      <title>Hiperspace</title>
      <link>https://www.cepheis.com/blog/hiperspace</link>
      <description><![CDATA[<p>Hiperspace is an acronym for “high performance space”: it provides higher performance than conventional database/object storage, but is accessed transparently as if the data were already in memory. The name is similar to <a href="https://en.wikipedia.org/wiki/Hyperspace">hyperspace</a> in science fiction (a way to reduce the latency of moving from one point in space to another), <a href="https://www.w3.org/TR/html401/struct/links.html">hyperlink</a> in the World-Wide-Web (transparent navigation from one page to another), and the <a href="https://www.ibm.com/support/pages/overview-hiperspace-caching-pdse">Hiperspace</a> expanded memory for IBM mainframes.</p>
<p>It's not called HiperspaceDB because it is also applicable to ephemeral use cases that don’t need durable storage, but need access to be faster than reloading everything whenever it changes, or alternate views (Graph/History) without explicit handling:</p>
<ul>
<li><em>Low latency</em> direct access to information</li>
<li><em>Larger space</em> than virtual memory</li>
<li><em>Simpler</em> than a cache service (which needs whole objects to be serialized)</li>
<li><em>Viewable</em> as a <a href="https://en.wikipedia.org/wiki/Graph_theory">graph</a> without transformation</li>
<li><em>History</em> of Elements for point-in-time view of data</li>
<li><em>Delta</em> views of OLAP aggregates with history</li>
<li><em>Horizon</em> global filtering of context (e.g. approval status)</li>
</ul>
<p>The runtime is pure open-source (<a href="https://github.com/channell/Hiperspace">GitHub</a>) and can be deployed from <a href="https://www.nuget.org/packages/Hiperspace">NuGet</a>.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:20:02 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/hiperspace</guid>
    </item>
    <item>
      <title>Cephei Orleans</title>
      <link>https://www.cepheis.com/blog/cephei-orleans</link>
      <description><![CDATA[<h1>Foreword</h1>
<p>In an earlier post on the <a href="https://www.cepheis.com/CellFramework">Cell Framework</a> we described the paradigm of a <code>Model</code> as a collection of <code>Cell</code>s that <em>can</em> be used as a class that industrialises a spreadsheet, where values are guaranteed to be consistent with the underlying values that the formulas are based on, but calculated in parallel.  This matches well with a market scenario where any number of changes to underlying instruments can alter the value/price of calculations.</p>
<p>We showed that models can be built up from basic models (e.g. Floating Rate Bond) to represent models that include an entire portfolio of trades, which can then provide high-level values for continuous hedging and liquidity-driven risk appetite for near-real-time (where a <code>SessionStream</code> ensures that a fresh compute-intensive risk calculation does not start until the last session has completed).</p>
<p><code>Model</code> and <code>Cell</code> provide <code>IObservable</code>/<code>IObserver</code> subscription for an event-stream architecture, where events are passed between Nano-Servers through to active actors.  Nano-Server is used here to mean a small block of logic that runs within a Micro-Service (modern <a href="https://en.wikipedia.org/wiki/Service-oriented_architecture">Service Oriented Architecture</a>) that is itself deployed to a cluster of computers like <a href="https://kubernetes.io/">Kubernetes</a>. The <a href="https://www.cepheis.com/CellFramework">Cell Framework</a> is a building block for this kind of architecture.
<img src="/blog/media/blogs/Blogs/Cephei/Cell-graph.png"></p>
<h1>Orleans</h1>
<p>Within the realm of massive on-line gaming, <a href="https://dotnet.github.io/orleans/index.html">Orleans</a> provides a framework for <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Orleans-MSR-TR-2014-41.pdf">Distributed Virtual Actors</a> that run on a large number of load-balanced servers, where each <a href="https://dotnet.github.io/orleans/Documentation/grains/index.html">Grain</a> is a Nano-Server or <a href="https://en.wikipedia.org/wiki/Digital_twin">Digital Twin</a> for a remote device.
While <a href="https://dotnet.github.io/orleans/index.html">Orleans</a> is intended for millions of low-cost Nano-Services that cooperate, it also provides a rich hosting environment for scheduling any kind of work over massive clusters of Micro-Services, with production-level telemetry and instrumentation.</p>
<h1>Cephei.Orleans</h1>
<p>The <code>Cephei.Orleans</code> <a href="https://www.nuget.org/packages?q=channell">Nuget Package</a> <em>will</em> provide a <a href="https://dotnet.github.io/orleans/1.5/Documentation/Getting-Started-With-Orleans/Grains.html">Grain</a> class <code>ModelGrain&lt;T&gt;</code> to host a <code>Model</code> within an Orleans cluster (e.g. <code>ModelGrain&lt;FloatingRateBond&gt;</code>) that can provide the fabric for real-time-risk.
<code>ModelGrain</code> requires <strong>no</strong> changes to <a href="https://www.nuget.org/packages/Cephei.Cell/">Cephei.Cell</a> because Orleans also uses the asynchronous event-oriented model. <code>Cephei.Cell</code> directly supports asynchronous notification through <code>IObservable</code> subscriptions, with overlapping <code>SessionStream</code>s to prevent blocking for continuous streams of data through the NanoService.
<code>Cephei.Orleans</code> will provide all the plumbing to allow thousands of Models to collaborate in a managed compute-fabric.</p>
<h1>Enterprise Architecture</h1>
<p>While an event-fabric <em>appears</em> to be a complex graph of objects that cannot be presented within the document set needed for management and regulatory reporting, <a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> can be simply demonstrated using UML trace-relationships and automatic derivation through an <a href="https://www.cepheis.com/EnterpriseHub">Enterprise Hub</a></p>
<h1>Cephei.QL</h1>
<p><code>Cephei.Orleans</code> will be released with the upcoming update to the [Cephei.QL] wrapper around the latest version of <a href="https://www.quantlib.org/extensions.shtml">Quantlib</a>, together with the <a href="https://www.cepheis.com/Product/Xl">Cephei.Excel</a> Excel add-in that enables the development of F# models using Microsoft Excel as an editor.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:22:10 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-orleans</guid>
    </item>
    <item>
      <title>Implementing Enterprise Lineage</title>
      <link>https://www.cepheis.com/blog/implementing-enterprise-lineage</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a></h6>
<p>In <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a>, we demonstrated that <code>&lt;&lt;trace&gt;&gt;</code> relationships can be a superior alternative for documenting <a href="https://www.cepheis.com/blog/data-lineage">data-lineage</a>, and provide a framework for <a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a>.  This blog is concerned with the automatic derivation of lineage constraints for <code>&lt;&lt;metadata&gt;&gt;</code> <a href="https://sparxsystems.com/enterprise_architect_user_guide/15.0/model_domains/informationitem.html">Information Items</a> and the implementation.</p>
<h1>Nature of the problem</h1>
<p><code>&lt;&lt;trace&gt;&gt;</code> dependency is inherently a graph problem, but within software architecture the scale of the problem is a finite graph, so it does not need to be transferred to, or interrogated with, Graph Database technology – any 64-bit operating system can hold the entire graph in memory while processing.</p>
<p>Modelling oddities like <code>[Chicken]</code> has a trace dependency to <code>[Egg]</code> but <code>[Egg]</code> has a trace dependency to <code>[Chicken]</code> can be ignored because no additional information is provided by recursive search – and can be a useful construct when an in-memory object is sourced from a database row, and the database row is sourced from the in-memory object.</p>
<p>The trace-graph can be expressed (in <a href="https://en.wikipedia.org/wiki/JSON">JSON</a>) as an array of objects, where each object is either a text node or a trace-graph of other dependencies – a flattened array of all dependencies can be produced by removing all the <code>[</code> and <code>]</code> from the array.</p>
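<p>A minimal Python sketch of that flattening (the JSON content here is illustrative):</p>

```python
import json

# A trace-graph as nested JSON: each entry is either a text node or
# another trace-graph of dependencies.
graph = json.loads('["report", ["warehouse", ["feed-a"], ["feed-b"]]]')

# Flattening is equivalent to removing all the '[' and ']':
def flatten(node):
    for item in node:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

print(list(flatten(graph)))  # ['report', 'warehouse', 'feed-a', 'feed-b']
```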
<p>Lineage is not concerned with the process/algorithm of any transformation, but with the list of reagents – the information used.</p>
<h1>Domain Model</h1>
<p>{{ "Hub/metamodel.png" | asset_url | img_tag }}</p>
<p>This implementation uses <a href="https://sparxsystems.com/">Sparx Enterprise Architect</a>, which stores the repository information in a normalised relational database that can be accessed by an <a href="https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/entity-data-model">Entity Data Model</a> using <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> to provide an object-graph view of the repository database.</p>
<h1>Derivation</h1>
<p>The derivation consists of three parts: recursive derivation of Attribute data-lineage; recursive derivation of Information Item data-lineage constraints; and derivation of Enterprise Lineage.</p>
<h2>Attribute Data-Lineage</h2>
<p>Starting from the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item, recursively search through <code>&lt;&lt;trace&gt;&gt;</code> abstractions to find common attributes (either by name or alias) that imply lineage. This is a separate pair of functions, <code>dataReferences</code> and <code>mapDataLineage</code>, to allow all data lineage to be refreshed for all entities once when derivation is scheduled.  The script can be tailored to site-specific <em>requirements</em>.</p>
<p>It’s generally recommended that overloaded names like “Name” and “Id” are not used in Data-warehouses, because Business Intelligence tools (PowerBI, Tableau, Qlikview, etc) will assume that they represent the same domain (if absolutely necessary a copy-columns view object can avoid implied lineage by using a different name – this is preferable to changing the script).</p>
<p>The result of the recursive search is stored as <code>AttributeTag</code>.</p>
<p>There is no fragile dependency on the trace graph being free of Chicken/Egg loops.</p>
<h2>Data-Lineage</h2>
<p>Starting from the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item, the <code>dataLineageReport</code> function recursively gathers all attribute lineage Tag values into an in-memory dictionary so that every object needed for Lineage is referenceable.</p>
<p>For every attribute of every element referenced by the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item a Lineage constraint is created by recursively searching the dictionary to expand the lineage constraint until all source reagents are added.</p>
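<p>The recursive expansion can be sketched in Python (the dictionary content and names are illustrative, not the Enterprise Hub types):</p>

```python
# In-memory dictionary of attribute lineage: each attribute maps to
# the attributes it is directly derived from.
lineage = {
    "report.total": ["warehouse.amount"],
    "warehouse.amount": ["feed_a.value", "feed_b.value"],
}

# Recursively expand a constraint until only source reagents remain.
def reagents(attr, seen=None):
    seen = seen if seen is not None else set()
    out = set()
    for src in lineage.get(attr, []):
        if src in seen:
            continue          # ignore Chicken/Egg loops
        seen.add(src)
        deeper = reagents(src, seen)
        out |= deeper or {src}
    return out

print(sorted(reagents("report.total")))  # ['feed_a.value', 'feed_b.value']
```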
<h2>Enterprise Lineage</h2>
<p>Starting from the <code>&lt;&lt;metadata&gt;&gt;</code> Information Item, all <code>&lt;&lt;flow&gt;&gt;</code> Information Flow references are recursively gathered into an in-memory dictionary of elements referenced (including components, classes, actors, processes, etc).</p>
<p>For every data entity referenced by a <code>&lt;&lt;flow&gt;&gt;</code> a dictionary is produced of data references with their onward <code>&lt;&lt;trace&gt;&gt;</code> references.  This dictionary is then recursively expanded to include a reference to every source reagent.</p>
<p>For every data-entity included in every trace reference to the <code>&lt;&lt;metadata&gt;&gt;</code> item, a constraint is created by recursively searching the flow dictionary and filtering flows with the rule:</p>
<blockquote>
<p>For each Element in the flow if the next flow conveys a data entity with common data-lineage to the previous flow and has common data-lineage with the constraint item, then it is <em>inferred</em> that this is an extension of the flow lineage</p>
</blockquote>
<h1>Operation</h1>
<p>The Lineage.fs script is included as an example with the <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a> and scheduled either as a real-time change trigger (where a complex graph is calculated in a few seconds); scheduled refresh job or both</p>
<pre><code>&lt;!-- realtime scheduling on change --&gt; 
&lt;connection name="name"&gt;
  &lt;triggers&gt;
    &lt;trigger class="EA.Gen.Hub.Script.ScriptJob" assembly="EA.Gen.Hub.Script" description="lineage" workflow=".\Lineage.fs" elementClass="Element" type="InformationItem"/&gt;
  &lt;/triggers&gt;
&lt;/connection&gt;

&lt;!-- batch scheduling via Quartz --&gt; 
&lt;schedule&gt;
  &lt;job name="PathwiseComplexity" startup="true" startAt="01:00" interval="01:00" frequency="Daily" connections="name"&gt;
    &lt;trigger class="EA.Gen.Hub.Script.ScriptJob" assembly="EA.Gen.Hub.Script" description="lineage" workflow=".\Lineage.fs" /&gt;
  &lt;/job&gt;
&lt;/schedule&gt;
</code></pre>
<h1>Summary</h1>
<p>This kind of highly recursive and data-intensive function cannot reasonably be performed within a client-side addin, but is fast and efficient when scheduled through an <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a>.</p>
<p>When combined with the change-governance capability of the <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a> it is possible to include lineage reviews in the governance process.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:23:42 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/implementing-enterprise-lineage</guid>
    </item>
    <item>
      <title>Enterprise Lineage</title>
      <link>https://www.cepheis.com/blog/enterprise-lineage</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">Implementing Enterprise Lineage</a> |  <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a></h6>
<p>In <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> we highlighted that data lineage can be captured using UML <code>&lt;&lt;trace&gt;&gt;</code> relationships and field-level lineage can be projected from trace + common attribute names.  This blog is concerned with using those relationships to automatically derive systems lineage.</p>
<h1>Application Landscape</h1>
<p>A core deliverable for any mature Enterprise Architecture practice is the application landscape, which shows applications within the context of the platforms and layers that they are built upon, clustered so that collaborators appear next to each other.  Applications are connected by <code>&lt;&lt;Flow&gt;&gt;</code> relationships that highlight the data passed between them.
Often only the high-level flows are used, but the detail flows inform the layout of the diagram.</p>
<h2>Service/Message/Event Buses</h2>
<p>A spider-web of flows between systems (where the number of flows grows quadratically with the number of applications) is often used to demonstrate the complexity of the Current-State compared to the simplicity of a service/message bus in the target state.
Service buses compound the difficulty of mapping lineage because it becomes common to label input/output simply as <code>message</code> and presume that interpretation of the data is a usage problem – this is erroneous:</p>
<ul>
<li>Generalising at the landscape level pushes the actual architecture work down to the applications and converts the landscape from a diagram into a picture.</li>
<li>Over-generalisation militates against analysis of availability or exception escalation.</li>
<li>Automatic lineage projects this as a complex dependency graph where every system is potentially dependent on every other system.  This is <strong>not</strong> a deficiency of the lineage graph, but highlights the need to be more specific.</li>
</ul>
<h3>Event Buses</h3>
<p>Quality event buses derive from the market-data pub/sub pattern rather than the message-queue pattern because they are time-ordered and newer events always supersede prior events.
In lineage terms, event buses appear to exacerbate the service-bus complexity problem, but they can highlight design faults: the presence of event-lineage in an externally facing interface indicates data-leakage rather than complexity – fixing the leakage fault removes the complexity.</p>
<h1>Deriving Enterprise Lineage</h1>
<p><img src="/blog/media/blogs/Blogs/Hub/enterprise-lineage.png"></p>
<p>In object oriented analysis and design it is axiomatic that if <code>[A]</code> inherits from <code>[B]</code> and <code>[B]</code> inherits from <code>[C]</code> then <code>[A]</code> also inherits from <code>[C]</code> (even if <code>[B]</code> completely replaces the behaviour of <code>[C]</code>).
Applying inheritance semantics to data trace, <code>[A]</code> &lt;- <code>[B]</code> &lt;- <code>[C]</code> implies <code>[A]</code> &lt;- <code>[C]</code>.  In a trading scenario it is common to take the closing-price of one geographical market and rarely use it, because the opening-price of the next geographic market is usually available first; but in 1% of cases (holidays/disasters) an opening price is not available and the prior closing-price is used.</p>
<p>The diagram shows how enterprise lineage can be inferred from trace relationships.  Although <code>Reporting</code> consumes <code>Fact</code> from the <code>data-warehouse</code>, <code>FI Trade</code> is inferred to be in the <code>Fact</code> lineage because <code>Fact</code> has a trace relationship to <code>FI Trade</code>.
It will be demonstrated that the enterprise lineage can be automatically and recursively derived from the flow relationships and the data trace relationships.
With enterprise lineage, lineage does not need to be confined to components but can be extended to Actors (Bloomberg in the example), code classes/interfaces, and even use-cases and business processes for manual sourcing.
Examining the lineage at the <code>Reporting</code> component boundary highlights (without consideration of internal systems complexity) that <code>Bloomberg</code> contributes to <code>Price</code> and that <code>Fact</code> includes an element of manual data-entry.
From a governance/regulatory perspective, it is not necessary to see the detailed workings to know that price-sourcing and operational-risk need to be considered.  Demonstrating that lineage is automatically derived from detailed flows means that a detailed audit is not required to confirm the veracity of lineage summaries.</p>
<h6>Price lineage generated automatically from above diagram</h6>
<pre><code>["SimpleBank.Component.Reporting", 
    [
        ["SimpleBank.Component.Data Warehouse",
            [
                ["SimpleBank.Component.Message Bus", 
                    [
                        ["SimpleBank.Component.MarketData",
                            ["SimpleBank.Class Diagram.MarketData.Bloomberg"]
                        ]
                    ]
                ],
                "SimpleBank.Component.FI Trading", 
                "SimpleBank.Component.EQ Trading"
            ]
        ]
    ]
]
</code></pre>
<h6>Fact lineage generated automatically from above diagram</h6>
<pre><code>["SimpleBank.Component.Reporting", 
    [   
        ["SimpleBank.Component.Data Warehouse", 
            [   
                ["SimpleBank.Component.Message Bus", 
                    ["SimpleBank.Component.FX Trading"]],
                ["SimpleBank.Component.FI Trading", 
                    [
                        ["SimpleBank.Use Case.Execute Swap OTC", 
                            ["SimpleBank.Process.Execute Trade"]]
                    ]
                ], 
                "SimpleBank.Component.EQ Trading"
            ]
        ]
    ]
]
</code></pre>
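<p>These listings can also be consumed programmatically, for example to flatten a lineage tree into the distinct list of contributing systems; a minimal sketch, modelling the nested arrays as a recursive record (the <code>Lineage</code> type is illustrative, not the Hub's internal representation):</p>

```fsharp
// model the nested lineage arrays as a recursive record
type Lineage = { System : string; Sources : Lineage list }
let leaf name = { System = name; Sources = [] }

// flatten a lineage tree into the list of contributing systems
let rec contributors l = l.System :: List.collect contributors l.Sources

// the Fact lineage above, abbreviated
let fact =
    { System = "Reporting"
      Sources =
        [ { System = "Data Warehouse"
            Sources =
              [ { System = "Message Bus"; Sources = [ leaf "FX Trading" ] }
                leaf "EQ Trading" ] } ] }

// contributors fact yields
// ["Reporting"; "Data Warehouse"; "Message Bus"; "FX Trading"; "EQ Trading"]
```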
<h1>Conclusion</h1>
<p><a href="https://www.cepheis.com/blog/data-lineage">An alternative approach</a> to data-lineage highlighted that <code>&lt;&lt;trace&gt;&gt;</code> relationships are superior when your objective is to provide lineage analysis rather than design.  Using trace for data-lineage is an enabler for enterprise lineage.
Enterprise data-lineage is well suited to enterprise architecture because it separates the solution domain from the enterprise domain, where lineage analysis becomes an additional part of architecture review and feedback to solutions architects and designers is addressed through improvement processes.  The <a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">next article</a> will show how lineage is generated automatically through an <a href="https://cepheis.blob.core.windows.net/$web/HubServer.zip">Enterprise Hub</a>.</p>
]]></description>
      <pubDate>Sun, 14 Jul 2024 17:39:34 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/enterprise-lineage</guid>
    </item>
    <item>
      <title>Data Lineage</title>
      <link>https://www.cepheis.com/blog/data-lineage</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">Implementing Enterprise Lineage</a></h6>
<h1>Background</h1>
<p><a href="https://en.wikipedia.org/wiki/Data_lineage">Data lineage</a> is an important concept in information technology because it provides 'meta' information about data that enables us to see where a value came from, how it was manipulated/consolidated and what the information can reasonably be used for.
"Balance" for cash-accounts is different from "Balance" of all accounts (including loans &amp; mortgages) and different from "Balance" including the fungible value of assets held as security – knowing the lineage of a value determines in which context it can be used.</p>
<p>Aside from the constituents and usage of data, lineage enables us to build confidence in the values:</p>
<ul>
<li>back-testing an individual value to determine the quality of the information (an aggregate of credit and debit accounts is functionally dependent on the accounting convention used), and currency values cannot be aggregated without applying an exchange rate.</li>
<li>establishing that the sources of a value are complete and that consistent conversions are applied.</li>
</ul>
<p>It is a common mistake to conflate internal and external data-lineage requirements and treat them as the same problem.  This can lead to an internal data-processing approach to lineage, where the external requirement is only concerned with the ultimate inputs and outputs.</p>
<p><em>Internal data-lineage is important for internal quality and testing, but in large organisations (with hundreds of process steps) it obscures the external data-lineage, and might not meet external needs.</em></p>
<h1>Internal Lineage</h1>
<p>The traditional approach to data-lineage is to map the association from each attribute source to each attribute destination using database <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">Extract/Transform/Load</a> (ETL); <a href="https://en.wikipedia.org/wiki/Enterprise_application_integration">Enterprise Application Integration</a> (EAI); or in heterogeneous environments stand-alone <a href="https://en.wikipedia.org/wiki/Data_dictionary">Data-Dictionaries</a> (with pre-built data loaders for common ETL/EAI scenarios).</p>
<p>It is also possible to document these relationships in UML modelling tools (like <a href="https://www.sparxsystems.com">Sparx Enterprise Architect</a>) using attribute specific associations between classes.  The diagram at the top of the page shows the ETL mapping for a data transfer. All ETL/EAI tools obscure detailed calculations, partly because of tool limitations, but mostly because complex calculations  (<a href="https://en.wikipedia.org/wiki/Value_at_risk">VaR</a>, <a href="https://en.wikipedia.org/wiki/XVA"><em>x</em>VA</a>, <a href="https://en.wikipedia.org/wiki/Risk-weighted_asset">RWA</a>, <em>etc</em> ) cannot be represented as lineage graphs.</p>
<p>In all cases, tooling support is critical to reduce the effort of mapping, but often at great cost.</p>
<h1>External Lineage</h1>
<p><img src="/blog/media/blogs/Blogs/Hub/data-lineage-1.png"></p>
<p>An alternative <em>(simpler)</em> approach is to separate the details of field-level derivation from the summary view of metadata lineage, taking advantage of the fact that in 95% of cases attributes have the same name (<em>in finance the Reuters instrument code is always RIC</em>) and in the remaining cases an alias can be used (<em>ETL tools always presume this to start</em>).  The detailed mapping can be replaced by a single trace reference.</p>
<p>The advantage of using <code>&lt;&lt;trace&gt;&gt;</code> references is that the diagrams remain domain focused (<em>without the clutter of detail</em>), changes to attributes <em>imply</em> changes to lineage, and updates do not need to be cascaded through a myriad of related diagrams. In <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">UML</a> the lineage can be represented as <code>&lt;&lt;metadata&gt;&gt;</code> <a href="https://www.sparxsystems.com/enterprise_architect_user_guide/14.0/model_domains/informationitem.html">Information Items</a> where constraints represent the lineage rules.</p>
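<p>The name/alias projection described above can be sketched directly; a minimal sketch, assuming attribute names are held as plain lists and aliases as a map (all names here are illustrative):</p>

```fsharp
// attributes on either side of a single trace relationship (illustrative names)
let sourceAttrs = [ "RIC"; "Close_Price"; "Ccy" ]
let targetAttrs = [ "RIC"; "ClosingPrice"; "Ccy" ]
let aliases = Map.ofList [ ("ClosingPrice", "Close_Price") ]

// match a target attribute to a source attribute by name, then by alias
let viaAlias target source =
    if List.contains source sourceAttrs then Some (target, source) else None
let resolve target =
    if List.contains target sourceAttrs then Some (target, target)
    else Option.bind (viaAlias target) (Map.tryFind target aliases)

// field-level lineage projected from the trace; unmatched attributes are gaps to review
let fieldLineage = List.choose resolve targetAttrs
// yields [("RIC", "RIC"); ("ClosingPrice", "Close_Price"); ("Ccy", "Ccy")]
```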
<p>It will be demonstrated that the constraints can be automatically derived by recursively scanning all trace relationships to algorithmically summarise the lineage in real-time in an <a href="https://www.cepheis.com/EA/Hub">enterprise hub</a>.  In our solution, the lineage constraint is represented as a <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> array graph that can be imported into visualisation tools if required.
Whilst a raw JSON graph does not provide a compelling presentation, it does allow gaps to be highlighted.</p>
<pre><code>["SimpleBank.Databases.WareHouse.Price.MARKET_PRICE", ["SimpleBank.Databases.Trading.LSEPrice.Price"], ["SimpleBank.Class Diagram.MarketData.BloombergPrice.OpeningPrice", [["SimpleBank.Class Diagram.MarketData.Bloomberg API.OpeningPrice"]]], ["SimpleBank.Class Diagram.MarketData.BloombergPrice.ClosingPrice", [["SimpleBank.Class Diagram.MarketData.Bloomberg API.ClosingPrice"]]], ["SimpleBank.Class Diagram.MarketData.BloombergPrice.SpotPrice", [["SimpleBank.Class Diagram.MarketData.Bloomberg API.SpotPrice"]]], ["SimpleBank.Class Diagram.MarketData.RuetersPrice.Bid_Price"], ["SimpleBank.Class Diagram.MarketData.RuetersPrice.Close_Price"], ["SimpleBank.Databases.Trading.LSEPrice.Price"], ["SimpleBank.Class Diagram.MarketData.RuetersPrice.Price"]]
</code></pre>
<h1>Derivation</h1>
<p>While the automatic derivation of data lineage from source is valuable for reporting and analysis in its own right, its wider value is that reliable <a href="https://www.cepheis.com/blog/enterprise-lineage">enterprise-lineage</a> can now be properly derived from UML flows (with referenced content).</p>
<p>While a regulatory reporting application will commonly source Price/Trade data from a data warehouse, when the warehouse price has trace references to an LSE Equity Price and Bloomberg-Price it allows the systems lineage to be inferred from the data-flows. <a href="https://en.wikipedia.org/wiki/Enterprise_service_bus">Enterprise Service Bus</a>, market-data pub/sub and <a href="https://en.wikipedia.org/wiki/Event-driven_architecture">Event Driven Architecture</a> can obscure the flow of data through systems because a provider will often be unaware of usage.</p>
<p>It will be demonstrated that system lineage can be inferred from <a href="https://sparxsystems.com/enterprise_architect_user_guide/15.0/model_domains/informationflow.html">Information Flow</a> between systems and where a transformation is implied by data lineage.
<img src="/blog/media/blogs/Blogs/Hub/system-lineage.png"></p>
<h1>Usage</h1>
<p><code>&lt;&lt;metadata&gt;&gt;</code> summaries are most useful at the end-point where lineage needs to be demonstrated to sponsors and regulators - they can also be used at any level as a quality review for internal lineage.</p>
<p>Real-time automatic derivation allows continuous improvement of domain-specific models, and overcomes the need to bridge the gap between ETL/EAI/DD tools and modern service/event oriented architectures.</p>
<h1>Conclusion</h1>
<p>Early attempts to meet the regulatory needs of <a href="https://en.wikipedia.org/wiki/BCBS_239">BCBS 239</a> have focused on using internal lineage approaches in a <em><a href="https://en.wikipedia.org/wiki/Depth-first_search">depth first search</a></em> for lineage information.</p>
<p>Regulators are generally more interested in a <em><a href="https://en.wikipedia.org/wiki/Breadth-first_search">breadth first search</a></em> demonstration of product coverage and commonality of price sources, rather than the size and cost of data-governance offices.</p>
<p>It is not unreasonable to conclude that failure to demonstrate lineage for <a href="https://en.wikipedia.org/wiki/BCBS_239">BCBS 239</a> using internal lineage <em>implies</em> that it will fail for <a href="https://en.wikipedia.org/wiki/FRTB">FRTB</a>... a better approach is available, and should be taken.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:26:11 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/data-lineage</guid>
    </item>
    <item>
      <title>Quantitative Enterprise Architecture </title>
      <link>https://www.cepheis.com/blog/quantitative-enterprise-architecture</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a> | <a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a></h6>
<h1>Background</h1>
<p>Much has been written about Quantitative Enterprise Architecture, largely starting with the <a href="https://en.wikipedia.org/wiki/Capability_Maturity_Model">Capability Maturity Model</a> from the US Software Engineering Institute in the 1980s, through <a href="https://en.wikipedia.org/wiki/Six_Sigma">Six-Sigma</a> process optimisation, through to structured systems testing across a range of business capabilities. The challenge of CMMi level-3 (<em>processes are defined and being followed</em>) is that the assessment is subjective.  CMMi Level-4 introduces the capability to capture metrics from the development process, while CMMi Level-5 introduces the capability to use the metrics to improve the development process.</p>
<p><a href="https://www.opengroup.org/togaf">Togaf</a> and <a href="https://www.zachman.com/about-the-zachman-framework">Zachman</a> provide frameworks that accentuate the need to consider architecture early, with a focus on the principle that addressing risk early reduces overall cost while providing evidence of progress at the early stages of a project.</p>
<h1>Model Driven Architecture</h1>
<p><a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> was conceived as a mechanism to add value to architecture activities, based on the observations:</p>
<ol>
<li>That requirements coverage and scope management can be quantitatively measured if there is a maintained trace relationship between the requirements and deliverables</li>
<li>That the net-value of a specification can go from positive to negative if the document is misleading because it is wrong</li>
<li>That database design rigour and principles accrue value when applied to system behaviour</li>
<li>That much code is boilerplate, where the same concept needs to be applied to databases, classes, interfaces and the different abstractions needed at different levels of an architecture</li>
<li>That most integration test failings are due to mismatches in the implementations at component boundaries.</li>
</ol>
<p>The big drivers for <a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> were:</p>
<ul>
<li>Inability to represent EJB interfaces as normal class diagrams (an enterprise bean does not directly implement its interface, but relies on the EJB container to provide glue)</li>
<li>Limitation of the <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">Unified Modelling Language</a> (UML) that many abstractions (e.g. ternary associations) cannot be mapped to code</li>
</ul>
<p><a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> is most commonly manifested as model transformations (PIM to PSM) and code generation.</p>
<h1>Architecture Compliance</h1>
<p>Past malfeasance has led regulators to demand that institutions demonstrate evidence of data-sourcing, to ensure that all systems are covered with consistent pricing sources and that mandated regulatory processes are being performed.</p>
<p>In the absence of an established Enterprise Architecture, various point-solution databases are developed.</p>
<h1>Quantitative Enterprise Architecture</h1>
<p>Combining <a href="https://en.wikipedia.org/wiki/Capability_Maturity_Model_Integration">CMMi</a>, <a href="https://en.wikipedia.org/wiki/Model-driven_architecture">MDA</a> and Architecture Compliance introduces the network effect: information combined into an Architecture Repository can meet multiple needs from a single source, but two issues become apparent:</p>
<ul>
<li>Different models progress at different rates, and it becomes difficult to coordinate them manually</li>
<li>The “forest and trees” pattern emerges where a mass of detail obscures the areas that need focus</li>
</ul>
<p>The approach taken with <a href="https://www.cepheis.com/EA/Hub">Enterprise Hub</a> is to consolidate information from different domain sources, and then use automated metrics to prioritize areas for detailed analysis.
The metrics that are important change over time, so flexible frameworks and tools are needed for rapid evolution; Enterprise Hub provides a runtime environment and two frameworks for Analysts/Developers:</p>
<ul>
<li>Workflow Foundation - process oriented workflow tool for business analysts to provide simple procedural tools</li>
<li>F# Script Foundation - a code-oriented tool for developers to provide complex (recursive) analysis. Functional languages like F# are ideally suited to the analytical problems of graph/diagram interpretation because they prevent the kind of side effects that cause imperative programs to fail with highly recursive problems.</li>
</ul>
<p>With each foundation, scripts are automatically executed within the Hub, resulting in metric properties and issues being created that feed into the review cycle for architecture work.  The Enterprise Hub includes working samples that provide real-world solutions to common quantitative architecture problems:</p>
<ul>
<li><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> - Recursive derivation of enterprise and data lineage as lineage objects change, and when scheduled.</li>
<li><a href="https://www.cepheis.com/blog/implementing-pathwise-compelxity">Pathwise Complexity</a> - Recursive Process complexity metrics, with property banding {high, normal, low} and issues (for high complexity)</li>
<li><a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a> - Recursive search for process issues that cannot be identified with normal QA</li>
<li>ChangeStatus - XAML Workflow job to apply change-request status changes to the related items</li>
</ul>
<p><img src="/blog/media/blogs/Blogs/Hub/hub.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:27:49 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/quantitative-enterprise-architecture</guid>
    </item>
    <item>
      <title>Implementing Pathwise Complexity</title>
      <link>https://www.cepheis.com/blog/implementing-pathwise-compelxity</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a> | <a href="https://www.cepheis.com/blog/implementing-enterprise-lineage">Implementing Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a></h6>
<h1>Background</h1>
<p><a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a> is one technique to measure the complexity (and by inference quality) of business processes, but it is dependent on rigour in the modelling process, and needs to be amended when a less mature approach has been adopted.  In an <a href="https://www.cepheis.com/blog/pathwise-complexity">earlier blog</a> I demonstrated that pathwise complexity can drive the need to move beyond black-box process interactions and highlight the wider collaboration needed with counterparties.</p>
<p>A common pattern in long value-chains is to use intermediate events (like off-page connectors in a flow-chart) to extend a customer journey: in these scenarios the technique needs to be amended to meet site-specific conventions.  Algorithmic precision can be traded for greater coverage of immature process maps – it is recommended to add start/end events to value-chains, because intermediate events (in mature maps) highlight service interactions rather than off-page links.
The Enterprise Hub facilitates this with scripted workflows that can be evolved as the site becomes more mature.  Included with the server are three variants of the script to calculate the metric.</p>
<h1>Overview</h1>
<p>The script is either run in real-time (from update triggers) or overnight for an entire model.  Pathwise Complexity is calculated and stored as metric objects for processes.  From the spread of complexity scores, three bands are derived and stored as indicator properties; then for ‘high’ complexity processes, issues are created to feed into continuous improvement efforts.</p>
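<p>The banding step can be sketched as a simple quartile split; a minimal sketch assuming the scores are already computed (the quarter split is illustrative - the actual bands are derived from the spread of scores in the repository):</p>

```fsharp
// band processes by complexity score: bottom quarter 'Low', top quarter 'High'
let band scores =
    let ordered = List.sortBy snd scores
    let quarter = List.length ordered / 4
    let low, rest = List.splitAt quarter ordered
    let normal, high = List.splitAt (List.length rest - quarter) rest
    let tag name entry = (fst entry, name)
    List.map (tag "Low") low @ List.map (tag "Normal") normal @ List.map (tag "High") high

// band [("A", 1.0); ("B", 5.0); ("C", 9.0); ("D", 40.0)]
// yields [("A", "Low"); ("B", "Normal"); ("C", "Normal"); ("D", "High")]
```

<p>The 'High' band is then used to create issue records for the continuous-improvement review.</p>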
<h2>Implementation</h2>
<p>The full implementation is 250 lines of functional F# code (F# is used for economy of expression, and because it avoids the errors common with imperative languages). <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> provides the database binding to view a Sparx repository as a graph of interconnected nodes.<br>
The core algorithm recursively searches through the connections on a diagram from each <em>start-event</em>, accumulating a list of paths each time an <em>end-event</em> is found.  Rather than a simple recursive search with re-entry checking that would filter loops, the <em>cycle</em> function breaks the path into splices of routes delimited by the current node; then if there are any duplicates the path can be rejected.</p>
<pre><code>        // start elements
        let starts = 
            diagRefs 
            |&gt;  List.filter (fun i -&gt;  i.Stereotype = "StartEvent" ) 

        // Find number of cycles of the same pattern
        let cycle (e : Element) (l : Element list) =
            if l = [] then 0
            else
                let id = e.Id 
                let state : (int list list * int list) = ([id],[]) 
                let splice = 
                    l
                    |&gt;  List.filter (fun i -&gt;  not (i.Stereotype = "StartEvent" || i.Stereotype = "EndEvent"))
                    |&gt;  List.fold (fun (h,w) y -&gt;  if y.Id = id then ([[y.Id] @ w] @ h, []) else (h, [y.Id] @ w)) state
                let t = List.rev (fst splice)
                let sets =
                    List.tail t @ [List.head t @ (snd splice)]
                List.map (fun i -&gt;  List.fold (fun x y -&gt;  if i = y then x + 1 else x) 0 sets) sets
                |&gt;  List.fold (fun e y -&gt;  (e + y - 1)) 0

        // seek recursively until EndEvents are found or cycle detected
        let rec findEnd (e : Element) (l : Element list) (paths : Element list list) = //: Element list list =
            let reenter = List.filter (fun (i :  Element) -&gt;  i.Id = e.Id) l |&gt;  List.length  
            if reenter &gt;= 1 &amp;&amp; ((cycle e l) &gt; 0) then   // skip path if it loops
                paths
            elif e.Stereotype = "EndEvent" then
                [([e] @ l)] @ paths // add this path to the sets
            else
                e.StartConnectors 
                |&gt; Seq.toArray
                |&gt; Array.filter (fun c -&gt; c.ConnectorType = "ControlFlow" &amp;&amp; c.EndElementId.HasValue)
                |&gt; Array.map (fun c -&gt; c.EndElementId.Value)
                |&gt; Array.filter (fun c -&gt; refMap.ContainsKey (c))
                |&gt; Array.map (fun i -&gt; refMap.[i])                            // only connections on the decomposition diagrams
                |&gt; Array.fold (fun a v -&gt; (findEnd v ([e] @ l) a)) paths

        starts 
        |&gt; List.fold (fun a v -&gt; (findEnd v [] a)) [] 
        |&gt; Seq.map (fun i -&gt; Seq.ofList i) 
</code></pre>
<p>A two-line amendment to the starts list and the <code>EndEvent</code> stereotype filter allows paths between intermediate events to be included, while the addition of <code>&amp;&amp; c.Stereotype = "SequenceFlow"</code> to the connector-type filter blocks traversal of message-flow paths.</p>
<h1>Deployment</h1>
<p>The script is deployed to the enterprise hub by adding a scheduled job (for the whole repository: load balanced by Quartz.net in a cluster) or as a trigger on a connection for real-time server-side update.</p>
<pre><code>      &lt;job name="PathwiseComplexity" startup="true" startAt="01:00" interval="01:00" frequency="Daily" connections="ER"&gt;
        &lt;trigger class="EA.Gen.Hub.Script.ScriptJob" assembly="EA.Gen.Hub.Script" description="validation" workflow="C:\Users\steve\source\repos\EA.Gen\EA.Gen.Hub.Script\PathwiseComplexity.fs"  /&gt;
      &lt;/job&gt;
</code></pre>
<p>Pathwise Complexity is an example of the kind of complex data-intensive analysis/metrics that really needs to run unattended on a server.  Pathwise Complexity is an archetype for the kind of quantitative enterprise architecture that is possible with {Sparx Enterprise Architect, <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> graph view of data, Enterprise Hub hosting environment}.  The full code will be familiar to any programmer familiar with {F#, OCaml, Haskell, Scala} – it is worth the effort to use functional languages to avoid unexpected side-effects.</p>
<pre><code>namespace EA.Gen.Hub.Script

open EA.Gen.Hub.Model
open EA.Gen.Model
open System.Linq
open System
open Quartz
open System.Threading.Tasks
open Serilog

(*
    Summary : Calculate the Pathwise complexity of an element from the decomposition diagram of the object
*)
type PathwiseComplexity () =
    class
    let findPaths (element : Element) (diagrams : Diagram seq) : Element seq seq =
   
        // all references from diagrams
        let diagRefs = 
            let els (d : Diagram) = 
                d.Elements 
                |&gt; Seq.map (fun i -&gt;  i.Element)
                |&gt; Seq.filter (fun i -&gt; not ( i = null))
                |&gt; Seq.toList
            diagrams
            |&gt; Seq.fold (fun a y -&gt; (els y) @ a) [] 
        
        // references as a dictionary for lookup
        let refMap = 
            diagRefs
            |&gt;  List.map (fun i -&gt; (i.Id, i))
            |&gt;  Map.ofList

        // start elements
        let starts = 
            diagRefs 
            |&gt;  List.filter (fun i -&gt;  i.Stereotype = "StartEvent" ) 

        // Find number of cycles of the same pattern
        let cycle (e : Element) (l : Element list) =
            if l = [] then 0
            else
                let id = e.Id 
                let state : (int list list * int list) = ([id],[]) 
                let splice = 
                    l
                    |&gt;  List.filter (fun i -&gt;  not (i.Stereotype = "StartEvent" || i.Stereotype = "EndEvent"))
                    |&gt;  List.fold (fun (h,w) y -&gt;  if y.Id = id then ([[y.Id] @ w] @ h, []) else (h, [y.Id] @ w)) state
                let t = List.rev (fst splice)
                let sets =
                    List.tail t @ [List.head t @ (snd splice)]
                List.map (fun i -&gt;  List.fold (fun x y -&gt;  if i = y then x + 1 else x) 0 sets) sets
                |&gt;  List.fold (fun e y -&gt;  (e + y - 1)) 0

        // seek recursively until EndEvents are found or cycle detected
        let rec findEnd (e : Element) (l : Element list) (paths : Element list list) = //: Element list list =
            let reenter = List.filter (fun (i :  Element) -&gt;  i.Id = e.Id) l |&gt;  List.length  
            if reenter &gt;= 1 &amp;&amp; ((cycle e l) &gt; 0) then   // skip path if it loops
                paths
            elif e.Stereotype = "EndEvent" then
                [([e] @ l)] @ paths // add this path to the sets
            else
                e.StartConnectors 
                |&gt; Seq.toArray
                |&gt; Array.filter (fun c -&gt; c.ConnectorType = "ControlFlow" &amp;&amp; c.EndElementId.HasValue)
                |&gt; Array.map (fun c -&gt; c.EndElementId.Value)
                |&gt; Array.filter (fun c -&gt; refMap.ContainsKey (c))
                |&gt; Array.map (fun i -&gt; refMap.[i])                            // only connections on the decomposition diagrams
                |&gt; Array.fold (fun a v -&gt; (findEnd v ([e] @ l) a)) paths

        starts 
        |&gt; List.fold (fun a v -&gt; (findEnd v [] a)) [] 
        |&gt; Seq.map (fun i -&gt; Seq.ofList i)

    (* Summary : apply the complexity metric to database *)
    let metric (db : Sparx) (e : Element) values : unit = 
        
        let set = query {for r in e.Metrics do
                            where (r.MetricType = "Complexity")
                            select (r.Metric,r)}
                  |&gt; Map.ofSeq

        let findOrCreate name = 
            match set.TryFind name with
            | Some m -&gt; m
            | None   -&gt;  let o = new ObjectMetric()
                         o.Metric &lt;- name 
                         o.MetricType &lt;- "Complexity"
                         e.Metrics.Add (o)
                         o
        let apply (p,m,t) = 
            (findOrCreate "Paths").EValue               &lt;- Nullable&lt;float&gt;(float(p))
            (findOrCreate "Longest Path").EValue        &lt;- Nullable&lt;float&gt;(float(m))
            (findOrCreate "Total Path Length").EValue   &lt;- Nullable&lt;float&gt;(float(t))

        apply values

    let fromMetric (m : ObjectMetric) =
        match m.Metric with
        | "Paths"             -&gt; (int(m.EValue.Value),0,0)
        | "Longest Path"      -&gt; (0,int(m.EValue.Value),0)
        | "Total Path Length" -&gt; (0,0,int(m.EValue.Value))
        | _                   -&gt; (0,0,0)

    let createProperty (id : int) (rag : string) = 
        let o = new ObjectProperty ()
        o.ElementId &lt;- Nullable&lt;int&gt;(id)
        o.Property &lt;- "Pathwise Complexity" 
        o.Value &lt;- rag
        o

    let createIssue (id : int) (db : Sparx) = 
        let o = new ObjectProblem()
        o.ElementId &lt;- id
        o.Problem &lt;- "High Pathwise Complexity"
        o.ProblemType &lt;- "Issue"
        o.DateReported &lt;- new Nullable&lt;DateTime&gt; (DateTime.Now)
        o.Status &lt;- "New"
        db.ObjectProblems.Add o |&gt; ignore


    let createIssues (db : Sparx) = 
        query {for p in query {for p in db.ObjectProperties do 
                               where (p.Property = "Pathwise Complexity" &amp;&amp; p.Value = "High" &amp;&amp; p.ElementId.HasValue)
                               select p} do
               leftOuterJoin i in query {for p in db.ObjectProblems do 
                                         where (p.Problem = "High Pathwise Complexity")
                                         select p} on (p.ElementId.Value = i.ElementId) into g
               for r in g do
               select (p.ElementId.Value,r)}
        |&gt; Seq.iter (fun (p,r) -&gt; if r = null then createIssue p db)

    let measure (e : Element) (db : Sparx) = 

        let aggregatePaths e d = 
            let pathmap = findPaths e d
            let paths = pathmap |&gt; Seq.length
            let totalmax =
                pathmap 
                |&gt; Seq.map Seq.length
                |&gt; Seq.fold (fun (m,t) y -&gt; ((if y &gt; m then y else m),(t + y))) (0,0) (*max,total*)
            (paths, (fst totalmax), (snd totalmax))
        
        Serilog.Log.Information ("Measuring {0} {1}", e.Name, e.GUID)

        let diagrams =
            // PDATA1 holds the decomposition diagram id for an Activity
            let did = if not (e.PDATA1 = null) &amp;&amp; (Seq.fold (fun a y -&gt; if a then a else Char.IsDigit(y)) false e.PDATA1) then int e.PDATA1 else 0
            if did &gt; 0 then
                query {for d in db.Diagrams do
                        where (d.Id = did)
                        select d}
                |&gt; Seq.toArray
            elif e.ObjectType = "Package" then
                query {for p in db.Packages do
                       join d in db.Diagrams on (p.Id = d.PackageId.Value)
                       where (p.GUID = e.GUID)
                       select d}
                |&gt; Seq.toArray
            else 
                [||]
        let aggregate = aggregatePaths e diagrams
        if not (aggregate = (0,0,0)) then 
            Serilog.Log.Information ("Measured {0} {1}", e.Name, aggregate)
            metric db e aggregate

    // Summary: Execute the measurement for the activity elements 
    let execute (elementClass : ElementClass) (repo: string) (guid : string ) : unit  =

        use db = new Sparx(repo)

        query {for e in db.Elements do
               where (e.GUID = guid &amp;&amp; (e.ObjectType = "Activity" || e.ObjectType = "Package"))
               select e}
        |&gt; Seq.iter (fun i -&gt; measure i db |&gt; ignore)

    // Summary: Execute the measurement for all packages that contain measures
    let executeAll (repo: string) : unit =

        use db = new Sparx(repo)
        try
            let childMetrics pid = 
                let max a b = if a &gt; b then a else b
                query {for e in db.Elements do
                       join m in db.ObjectMetrics on (e.Id = m.ElementId)
                       where (e.PackageId.Value = pid)
                       select m}
                |&gt; Seq.map (fun i -&gt; fromMetric i)
                |&gt; Seq.fold (fun (ap,am,at) (p,m,t) -&gt; (ap + p,max am m, at + t)) (0,0,0)

            // measure all activities
            let elements = 
                query {for e in query {for e in db.Elements do
                                       where (e.ObjectType = "Activity" || e.ObjectType = "Package")
                                       select e} do
                       leftOuterJoin i in query {for p in db.ObjectProblems do 
                                                 where (p.Problem = "High Pathwise Complexity")
                                                 select p} on (e.Id = i.ElementId) into g
                       for r in g do
                       select (e,r)}
                |&gt; Seq.toArray
            elements |&gt; Array.iter (fun (e,r) -&gt; if r = null then measure e db)

            let complexitySet =
                query {for m in db.ObjectMetrics do
                       where (m.Metric = "Paths" &amp;&amp; m.MetricType = "Complexity")
                       select (m.ElementId, m.EValue.Value)}
                |&gt; Seq.toArray
            let avg = (Array.fold (fun a (e,v) -&gt; a + v ) (0.0) complexitySet) / float complexitySet.Length
            let stdev = sqrt ((Array.fold (fun a (e,v)-&gt; a + (float v - avg) ** 2.0) 0.0 complexitySet) / float complexitySet.Length)
            let low = avg - stdev
            let high = avg + stdev

            let Rag v = 
                if (v &gt; high) then 
                    "High"
                elif (v &lt; low) then 
                    "Low"
                else
                    "Normal"

            let curTag =
                query {for t in db.ObjectProperties do
                       where (t.Property = "Pathwise Complexity" &amp;&amp; t.ElementId.HasValue)
                       select (t.ElementId.Value,t)}
                |&gt; Map.ofSeq
        
            complexitySet
            |&gt; Array.filter (fun (e,v) -&gt; curTag.ContainsKey(e))
            |&gt; Array.iter (fun (e,v) -&gt; curTag.[e].Value &lt;- (Rag v))
        
            complexitySet
            |&gt; Array.filter (fun (e,v) -&gt; not (curTag.ContainsKey(e)))
            |&gt; Array.map (fun (e,v) -&gt; createProperty e (Rag v))
            |&gt; (fun p -&gt; db.ObjectProperties.AddRange (p))
            |&gt; ignore

            createIssues db

            db.SaveChanges () |&gt; ignore
        with 
            | :? Exception as e -&gt; Serilog.Log.Error (e, "Pathwise Complexity error {0}", e.Message )
                

    (* Implement the trigger methods that would be called by the dynamic script job*)
    interface ITriggerScript with 
        member this.ExecuteTrigger ( elementClass: ElementClass, repo : string, guid : string) =
            execute elementClass repo guid

        member this.ExecuteAll ( repo : string, onlyModified : bool ) =
            executeAll repo 
end
</code></pre>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:32:22 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/implementing-pathwise-compelxity</guid>
    </item>
    <item>
      <title>Data Input Validation</title>
      <link>https://www.cepheis.com/blog/data-input-validation</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> |  <a href="https://www.cepheis.com/blog/data-lineage">Data Lineage</a> | <a href="https://www.cepheis.com/blog/pathwise-complexity">Pathwise Complexity</a></h6>
<h1>Problem</h1>
<p>'Know your customer' (<a href="https://en.wikipedia.org/wiki/Know_your_customer">KYC</a>) and the 'three lines of defence' (<a href="https://wiki.treasurers.org/wiki/Three_Lines_of_Defence_Model">3LoD</a>) are two common business modelling problems that require that, <em>under all circumstances</em>, data must be provided prior to the use of the information.</p>
<p>For KYC, we need to ensure that the {due-diligence, credit-check, embargo, politically exposed persons, financial-crime} checks are all performed before trades are executed for the client.  These checks are normally undertaken before a trading account is set up; but in an increasingly complex world clients can be introduced via brokers, white-label services, or as counterparties in derivatives bought from another institution.  In a small number of cases the KYC issues only become apparent in the back-office when trades must be settled.  In these cases there is no direct link between KYC and the settlement activity, so a database must be checked.  The challenge for process design is to ensure that the database entry has always been written before it is needed.</p>
<p>For 3LoD, we need to ensure that treasury provision or hedging has been completed before regulatory exposure reporting under <a href="https://en.wikipedia.org/wiki/BCBS_239">BCBS239</a>, so that the risk profile reconciles with the trading book.</p>
<p>Both of these cases are examples of the "read before write" problem, where a process can be defective if there are scenarios in which information is needed before it is provided.  The conventional strategy is intensive process quality assurance to ensure that the linked set of processes across a value-chain includes the appropriate activities.  The problem is that an upstream process might be changed without the downstream processes being re-validated, and the gap only becomes an issue when systems are implemented and data-quality exceptions arise during normal business.</p>
<h1>A Solution</h1>
<p>The <a href="https://www.cepheis.com/EA/Hub">enterprise hub</a> addresses this need with the sample DataInputValidation.fs script, which continuously monitors the process model (either in real time as models are changed, or overnight for the entire model) to find cases where a Data Store (drum icon) or Data Object (page icon) is read (by a BPMN DataInputAssociation), and recursively searches back through the control flows to find an instance where the Object/Store is written (by a BPMN DataOutputAssociation).  Where no such instance can be found, an issue is created in the repository that can be used to drive continuous process improvement.</p>
<p>The algorithm of the script is just a few lines of recursive F# code that uses <a href="https://www.nuget.org/packages/EA.Gen.Model">EA.Gen.Model</a> to search back through a Sparx process model (through activities, decisions, messages and intermediate events) to find the write activity.</p>
<pre><code>    let rec findTarget (start : Element) (target : Element) (last : string) (path : int list) : Boolean = 
        let recurs i l = 
            List.fold (fun a y -&gt; if a then a else (y = i)) false l

        if start.Id = target.Id &amp;&amp;  last = "Activity" then // we're only interested in activities that write 
            true
        elif recurs start.Id path then
            false
        else
            (start.StartConnectors 
                |&gt; Seq.filter (fun i -&gt; i.ConnectorType = "Dependency" &amp;&amp; i.Stereotype = "DataOutputAssociation")
                |&gt; Seq.map (fun i -&gt; i.EndElement)
                |&gt; Seq.toList)
            @
            (start.EndConnectors
                |&gt; Seq.filter (fun i -&gt; i.ConnectorType = "ControlFlow")
                |&gt; Seq.map (fun i -&gt; i.StartElement)
                |&gt; Seq.toList)
            |&gt; List.fold (fun a y -&gt; if a then a else findTarget y target start.ObjectType ([start.Id] @ path)) false
</code></pre>
<p>The sample can easily be changed to handle sources other than a standard Activity, or to provide additional validation.
For KYC and 3LoD, the process can easily be changed to validate client-specific rules, such as the distinction between the creation of a risk, the mitigation of a risk, and the audit point (end-event) where risks are highlighted.</p>
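<p>As an illustrative sketch of such a change (the <code>MiniElement</code> type and <code>isWriter</code> helper below are hypothetical stand-ins for the EA.Gen.Model types, not part of the sample script), the <code>last = "Activity"</code> test can be generalised to a configurable set of writer types:</p>
<pre><code>// Hypothetical stand-in for EA.Gen.Model's Element
type MiniElement = { Id : int; ObjectType : string }

// Object types that count as a valid writer of a Data Store/Object
// (assumption: SubProcess should also count as a writer)
let writerTypes = set [ "Activity"; "SubProcess" ]

let isWriter (start : MiniElement) (target : MiniElement) (last : string) =
    start.Id = target.Id &amp;&amp; Set.contains last writerTypes
</code></pre>
<p>Substituting <code>isWriter</code> for the equality test in <code>findTarget</code> would then accept a write from either object type.</p>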
<h1>Example</h1>
<p>When reading the Sparx sample process for Nobel prizes, it is not immediately apparent that “Completed Nomination Forms” has no process to create it; this gap is highlighted by the DataInputValidation script.</p>
<p><img src="/blog/media/blogs/Blogs/Hub/DataValidation.png">
<img src="/blog/media/blogs/Blogs/Hub/DataValidation2.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:33:17 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/data-input-validation</guid>
    </item>
    <item>
      <title>Pathwise Complexity</title>
      <link>https://www.cepheis.com/blog/pathwise-complexity</link>
      <description><![CDATA[<h6><a href="https://www.cepheis.com/blog/enterprise-lineage">Enterprise Lineage</a> | <a href="https://www.cepheis.com/blog/implementing-pathwise-compelxity">Implementing Pathwise Complexity</a> | <a href="https://www.cepheis.com/blog/data-input-validation">Data Input Validation</a></h6>
<p>Pathwise Complexity is a technique to measure the complexity of a <a href="https://en.wikipedia.org/wiki/Business_Process_Model_and_Notation">BPMN</a> process, with emphasis on decisions and particularly on the consequences of re-work/re-check.  All well-formed business processes begin with one or more start-events, and proceed through a sequence of activities, decisions, gateways and intermediate events until they end at one or more end-events: business processes are closed graphs that have a finite number of paths from start to finish.</p>
<p>Pathwise Complexity assists process improvement by identifying the processes that are most receptive to improvement.  Process improvement techniques like Six-Sigma are iterative – if you fix the worst processes first, eventually they will all be good.</p>
<p>Pathwise complexity is inspired by <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">Cyclomatic complexity</a> in software engineering, which uses the total number of nodes and the total number of connections between them to compute a score, but with these distinctions:</p>
<ul>
<li>Cyclomatic complexity explicitly ignores the algorithmic complexity of iterating over a list, rather than treating each item in the list as a distinct activity.  In software engineering, exceptions can be raised at a number of points that cannot be mapped to an orderly flowchart from start to finish.</li>
<li>Activities in software have low cost relative to activities in a business process and business decisions are more expensive than activities.</li>
<li>Rework in a business process is exponentially more expensive than preparatory activities and non-looping decisions.</li>
</ul>
<p>Pathwise Complexity makes two axiomatic assumptions:</p>
<ul>
<li>A path that does not reach an end is not complex; it simply does not work. The complexity score is not a substitute for quality – quality assurance metrics also need to be captured.</li>
<li>Actors are intelligent enough not to keep making the same decision over and over again unless some input to the decision has changed.</li>
</ul>
<p>Pathwise complexity is calculated by:</p>
<ol>
<li>Map every distinct path from start to finish, branching into separate paths at decisions and gateways</li>
<li>If an activity is reached that was previously visited in the path, check that it was visited by a different route from last time. <code>[1]</code>-&gt;<code>[2]</code>-&gt;<code>[3]</code>-&gt;<code>[2]</code> is valid because <code>[1]</code>-&gt;<code>[2]</code> and <code>[3]</code>-&gt;<code>[2]</code> are different routes, but <code>[1]</code>-&gt;<code>[2]</code>-&gt;<code>[3]</code>-&gt;<code>[2]</code>-&gt;<code>[3]</code> is not, because the route <code>[2]</code>-&gt;<code>[3]</code> is cyclic and therefore invalid.</li>
<li>Count the total number of distinct paths</li>
<li>Calculate the standard deviation of distinct-path counts across all processes, and use the bands to classify them into {High, Normal, Low}</li>
</ol>
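<p>Steps 3 and 4 can be sketched in a few lines of F# (illustrative only; the <code>classify</code> function is hypothetical, mirroring the banding used by the hub script):</p>
<pre><code>// Band processes into {High, Normal, Low} by distinct-path count,
// using mean +/- one standard deviation as the band limits
let classify (pathCounts : (string * int) list) =
    let values = pathCounts |&gt; List.map (snd &gt;&gt; float)
    let avg   = List.average values
    let stdev = sqrt (values |&gt; List.averageBy (fun v -&gt; (v - avg) ** 2.0))
    let rag v =
        if   v &gt; avg + stdev then "High"
        elif v &lt; avg - stdev then "Low"
        else "Normal"
    pathCounts |&gt; List.map (fun (name, p) -&gt; name, rag (float p))
</code></pre>
<p>Applied to the path counts in the examples below, <code>classify ["Evaluation",1; "Implement",5; "Incident",128]</code> bands the 128-path process as High and the others as Normal.</p>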
<p>There is a high correlation between pathwise-complex processes and processes that can be improved.  <a href="https://www.cepheis.com/blog/pathwise-complexity">Automatic calculation</a> is critical to it being a viable approach.</p>
<h1>Examples</h1>
<p>The Evaluation process has low Pathwise complexity because there is one path
<img src="/blog/media/blogs/Blogs/Hub/install.png"></p>
<p>The Implement Process has Normal Pathwise Complexity because there are five distinct paths
<img src="/blog/media/blogs/Blogs/Hub/Implement.png">
The Incident Management process has High Pathwise Complexity because there are 128 paths (due to the feedback loop to VIP Customer, who might raise additional questions from the answer).
The solution to the high pathwise complexity is not to model 'VIP Customer' as a black-box, but instead to model it as a dialogue where the customer is expected to review the answer before closure.
<img src="/blog/media/blogs/Blogs/Hub/incident.png">
The limit to the methodology is that black-box lanes often embed hidden assumptions about the time ordering of behaviour, which results in the “E-mail voting example” reporting a Pathwise complexity of <strong>7,164,676</strong>
because “Receive Vote” (in pink) can be sent at any time.  Either black boxes are a defective design pattern, or messages need to be excluded from complexity analysis.
<img src="/blog/media/blogs/Blogs/Hub/e-voting.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:34:27 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/pathwise-complexity</guid>
    </item>
    <item>
      <title>The dirty truth of data lakes</title>
      <link>https://www.cepheis.com/blog/the-dirty-truth-of-data-lakes</link>
      <description><![CDATA[<p>The architecture principles that drove the creation of the data-lake paradigm were:</p>
<ol>
<li>you cannot determine all the future use-cases for data at the point of capture</li>
<li>the opportunity cost from analysis delays is higher than the cost of storing the data</li>
<li>the operational cost of archiving data is higher than the cost of retaining it</li>
<li>the operational cost of updating data is higher than processing cost of aggregating it.</li>
</ol>
<p>These principles gave rise to the data-lake pattern, where considerable investment by web-scale companies {Google, Amazon, Alibaba, Facebook, etc.} continues to accrue new insights from the click-stream of users; this in turn led to the widespread adoption of Hadoop and its many derivatives as a data-storage pattern.</p>
<p>The allegory of a lake is appealing because lakes store water for later use, but also imply an effortless natural process rather than the effort and cost of building a reservoir.  The data-sewer is the anti-pattern of the data-lake: you really do not know whether you have built a lake or a sewer until you try to accrue value from it.</p>
<p>The reason for this post is to highlight three traits that lead to data-sewers rather than data-lakes:</p>
<ol>
<li>missing the opportunity to undertake rudimentary up-front analysis and normalisation, either by missing commonality (treating {facility, loan, mortgage, repo,..} as exceptions) or by misclassifying (treating a derivative contract as a legal agreement rather than an instrument)</li>
<li>missing the opportunity to map the lifecycle (hedging, securitisation, late-booking)</li>
<li>missing the opportunity to move to a real-time event-model, with a focus on batch cycles.</li>
</ol>
<p>The architecture mistake is to see Hadoop as a paradigm shift in technology rather than as a (potentially) cheaper data-warehouse.  When cloud providers offer hybrid solutions that combine traditional MPP databases (SQL Server PDW, Oracle Exadata, etc.) with Hadoop/Spark/Kafka integration, and block-storage replacing HDFS, it is not unreasonable for business sponsors to question whether all the effort was a waste of time.</p>
<p><a href="https://news.microsoft.com/innovation-stories/ignite-2019-azure-synapse/">Microsoft Synapse</a> is one example of a technology advance obsoleting the investments of chief data offices.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:35:07 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/the-dirty-truth-of-data-lakes</guid>
    </item>
    <item>
      <title>Cephei Modelling</title>
      <link>https://www.cepheis.com/blog/cephei-modelling</link>
      <description><![CDATA[<p>Cephei is a low-code solution for financial modelling, from prototyping through to real-time pricing and risk. Traders and Developers can work on the same models because parallel calculation eliminates performance issues of a flexible framework.</p>
<h1>Event driven approach</h1>
<p>Today the most successful businesses are event-driven, responding to customer needs with just-in-time provision of services and agile change.  This is not always reflected in application services; as we move to cloud services, systems need to change to meet the current and future challenges of business.</p>
<p>The <a href="https://www.cepheis.com/CellFramework">Cephei</a> approach is to model services as components that respond to (market) events and raise events automatically as calculated cells change – much like a spreadsheet but on a desktop, server or cloud.</p>
<p>Cephei models are constructed as building blocks that hide cells that are only used internally. Models can be assembled from other models as parts, building high-level models that represent the business domain.</p>
<h2>Cephei addin</h2>
<p>The Cephei Excel addin enables Excel to be used to build models, and test them with live data from internal sources and market data from Bloomberg or Reuters.</p>
<p><strong><a href="https://www.cepheis.com/CellFramework">Cephei</a></strong> is not a conventional Excel addin (which would rely on VBA macros to prevent Excel from hanging during modelling).  It has its own parallel calculation engine that updates Excel in the background.</p>
<p>Cephei provides 15,000 Finance functions from <a href="https://www.quantlib.org/">Quantlib</a>, but can be re-built using domain-specific libraries.  The addin can be re-generated from any (C++/C#/F#) financial library using the Cephei.Gen code generator.  The Quantlib functions can be viewed as a proof of concept that demonstrates that any library can be used by Cephei.</p>
<p>Cells are added to the Cephei Model by <em>creator</em> functions like <code>=_FixedRateBond("bond1",..)</code>, properties are accessed by functions like <code>=_FixedRateBond_cleanPrice("clean1", "bond1")</code>, and values are viewed in Excel by <code>+_Value("clean1")</code>.  In the background, Cephei maintains the information needed to generate source code for each function from the Cephei menu.</p>
<p>Contact <a>feedback@cepheis.com</a> or <a>call</a> for evaluation, integration, advice or development services.</p>
<p>See <a href="https://www.cepheis.com/blog/cephei-blotters">Blotters</a> for examples with live ticking data, and <a href="https://www.cepheis.com/blog/sample-models">Sample Models</a> for code generation and reuse.</p>
<table>
<thead>
<tr>
<th>File</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/XL32.zip">Cephei.xll</a></td>
<td>Addin providing Cephei functions to the 32-bit version of Excel</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/XL32.zip">Cephei64.xll</a></td>
<td>Addin providing Cephei functions to the 64-bit version of Excel</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/BondBlotter.xlsx">BondBlotter.xlsx</a></td>
<td>BondBlotter Example spreadsheet showing ticking changes without blocking Excel editing</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Bond1.xlsx">Bond1.xlsx</a></td>
<td><a href="https://www.cepheis.com/blog/sample-models">Bond1</a> Scratchpad spreadsheet for modelling {Zero, Fixed, Floating} bonds</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Globals.xlsx">Globals.xlsx</a></td>
<td><a>Globals</a> Extract from Bond1.xlsx to generate globals.fs</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Cephei.Model.zip">Cephei.Model.zip</a></td>
<td>F# code generated from the three models above, together with the compiled Cephei.Model.dll that is used in the following examples. Cephei.Model.dll needs to be copied to the same directory as Cephei.xll for the following examples</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/BondPricer.xlsx">BondPricer.xlsx</a></td>
<td>BondPricer Extract from Bond1.xlsx to generate BondPricer.fs</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Bond1_Fixed.xlsx">Bond1_Fixed.xlsx</a></td>
<td><a href="https://www.cepheis.com/blog/sample-models">Bond1_Fixed</a> Extract from Bond1.xlsx to generate FixedRateBond.fs</td>
</tr>
<tr>
<td><a href="https://cepheis.blob.core.windows.net/$web/samples/Bond_Fixed_Portfolio.xlsx">Bond_Fixed_Portfolio.xlsx</a></td>
<td>Bond_Fixed_Portfolio Example spreadsheet showing ticking changes to a model referencing compiled versions for the three models above</td>
</tr>
</tbody>
</table>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:35:56 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-modelling</guid>
    </item>
    <item>
      <title>Cephei Cell</title>
      <link>https://www.cepheis.com/blog/CellFramework</link>
      <description><![CDATA[<p>This article introduces the <a href="https://www.nuget.org/packages/Cephei.Cell/">Cephei.Cell library</a>, released recently to the NuGet package manager with source on <a href="https://github.com/channell/Cephei">GitHub</a>, honouring a promise made to Don Syme many years ago.  It demonstrates the efficiency of developing mathematical models in F#.</p>
<h1>Background</h1>
<p>The “Cell Framework” started fifteen years ago as a mechanism to make Monte Carlo simulations execute fast enough for interactive calculation of the Potential and Expected Exposure of derivative {swap, swaption, cap, floor} trades before execution, to ensure they were profitable enough to balance the exposure with a CDS trade.  Monte Carlo simulations are compute-intensive because thousands of alternate scenarios must be calculated for each time-point.  With a minor change to use Cells (with asynchronous calculation) for NPV calculation it was possible to halve the time taken to risk a trade.  At the time cells implemented a <a href="https://en.wikipedia.org/wiki/Futures_and_promises">future promise pattern</a>, but it was apparent that the gain in overall calculation speed far outweighed the cost of constructing and scheduling tasks for reasonably expensive operations, and that re-using the cells for further time-points would allow more operations to be performed in parallel.</p>
<p>The second version replaced the Mutex lock with a “Latch Lock” pattern (used internally by the Oracle RDBMS) where objects are only explicitly locked if there is contention between threads.  This version introduced event subscriptions to propagate changes to dependent Cells like a spreadsheet, and a profiling mechanism to identify dependencies without the time (or errors) of defining dependencies in code.  This was used for Early Warning of Liquidity risk, where any number of movements in the price of {equity, FX, futures} instruments could trigger action to tighten risk appetite.</p>
<p>The third version combines the Latch-lock with state transition to provide lockless concurrency; <a href="https://en.wikipedia.org/wiki/Work_stealing">eager-steal</a> for waitless calculation on modern multi-core servers; moves history from within Cells to the session to remove garbage-collection contention; and adds initialisation-time profiling of <a href="https://en.wikipedia.org/wiki/Closure_(computer_programming)">closures</a>.
<code>Cell</code> is significantly faster for large complex calculations where a number of factors can change, and is well-suited to streaming price calculation and real-time risk derivation.</p>
<h1>Framework or Kernel</h1>
<p>The term “Cell Framework” is used because <a href="https://www.nuget.org/packages/Cephei.Cell/">Cephei.Cell</a> is the foundation for <a href="https://www.nuget.org/packages/Cephei.QL/">Cephei.QL</a>, which is being updated for .NET 5 to remove the Windows/Wine dependency. Cephei uses code generation to wrap underlying C++ quantitative finance functions into higher-level abstractions, allowing an Excel addin to be used to define a financial model that can be saved directly as a functional program – Excel becomes an editor for functional code.</p>
<p>While the Cell Framework replicates the <a href="https://en.wikipedia.org/wiki/Futures_and_promises">promise pattern</a> and the paradigm of spreadsheet cells, it is also a foundation for a different way of thinking about software building blocks, one that accentuates functional relationships between values rather than linear paths of derivation.</p>
<p>While <code>Cell</code> provides a mechanism to automatically calculate a number of functions in parallel, a <code>Model</code> provides a mechanism to encapsulate complexity and surface only the values that are input or output. A <code>Model</code> can contain other Models to build high-level abstractions for {Asset-Class agnostic Trade, Portfolio, Book, Ledger, etc}.  Cells within a <code>Model</code> can be changed at runtime (like a spreadsheet).</p>
<p><code>Session</code> provides a mechanism to group together changes to input values (e.g. a market data feed) without duplicate calculations (the same as the pattern of manual calculation in Excel), while <code>SessionStream</code> adds overlapping sessions for continuous calculation of high-level values (like RWA) in near-real-time.
<code>Cell</code> and <code>Model</code> provide the <code>IObservable</code>/<code>IObserver</code> pattern for event linkage with stream-based calculation.</p>
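<p>The event linkage can be pictured with plain .NET events, which expose the same <code>IObservable&lt;T&gt;</code> surface (this is a generic sketch of the pattern, not the Cephei.Cell API itself):</p>
<pre><code>open System

// A plain .NET event publishes an IObservable&lt;float&gt; surface
let changed = Event&lt;float&gt;()
let observable : IObservable&lt;float&gt; = changed.Publish :&gt; IObservable&lt;float&gt;

// Subscribe an observer; the callback fires on every published change
let mutable last = 0.0
let subscription = observable |&gt; Observable.subscribe (fun v -&gt; last &lt;- v)

changed.Trigger 42.0        // pushes the change to all observers; last = 42.0
subscription.Dispose ()     // unsubscribe when no longer interested
</code></pre>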
<h1>Implementation</h1>
<p><code>Cell</code> uses lockless concurrency, with thread synchronisation via <code>ManualResetEvent</code> for contention (when a value is needed but calculation has already commenced), using processor cache-bypass <a href="https://en.wikipedia.org/wiki/Compare-and-swap">Compare &amp; Swap</a> for the <code>SpinLock</code> and State pointer.</p>
<h2>Cell&lt;T&gt;</h2>
<p><code>Cell&lt;T&gt;</code> is implemented as a <a href="https://en.wikipedia.org/wiki/Finite-state_machine">Finite State Machine</a> where Operations and Events cause atomic state-transitions, ensuring that under no circumstance is a <code>Dirty</code> value read from a <code>Cell</code>, with Operating System events only used when threads are blocking for a value from a calculation currently being performed.
{{ "Cephei/Cell-state.png" | asset_url | img_tag }}</p>
<p>There are three specialisations of <code>Cell&lt;T&gt;</code> that are instantiated either through the <code>Cell</code> module, or (in the case of <code>CellEmpty</code>) through a <code>Model</code> that includes forward-references to Cells that have not been defined at that point in the <code>Model</code>.</p>
<h2>CellFast</h2>
<p>When it is known in advance that Cells and their references will not be redefined at runtime, and their references are not forward-referenced, <code>CellFast&lt;T&gt;</code> can be used to bypass runtime profiling of cells.  In this scenario <a href="https://en.wikipedia.org/wiki/Closure_(computer_programming)">closures</a> are inspected at instantiation to extract the cells that this <code>cell</code> is dependent on.</p>
<pre><code>let calculation_cell =
  let build (p : ICell&lt;'t&gt;) =
    Cell.CreateFast (fun () -&gt; some_complex_calculation p.Value)
  build referenced_cell
</code></pre>
<p>In this example <code>referenced_cell</code> is captured by the closure rather than the wider model (the <code>Cell.CreateFast</code> factory method should be used to avoid the calculation re-evaluating whenever <em>any</em> value in the <code>Model</code> changes); it will default to a <code>Cell&lt;T&gt;</code> object if there are no parameters for instantiation-time profiling.  <code>CellFast&lt;T&gt;</code> should only be used if you are comfortable with advanced functional-programming concepts.</p>
<h2>CellSpot</h2>
<p><code>CellSpot&lt;T&gt;</code> is a further specialisation of <code>CellFast&lt;T&gt;</code> where it is known in advance that the <code>Cell</code> will never be redefined, and the latest (spot) value should always be used (typically the FX rate for a portfolio).
{{ "Cephei/cell-class.png" | asset_url | img_tag }}</p>
<h2>Cell</h2>
<p>The static module <code>Cell</code> provides factory functions to create cells from F# using type-inference, plus a Thread Local stack of Cells currently being profiled.</p>
<blockquote>
<p>Any Cell that reads the content of another Cell while evaluating its function is, by definition, dependent on it.
For Boolean conditional logic, the condition needs to be in a separate cell in order for the expression to re-profile when the Boolean value changes:</p>
</blockquote>
<pre><code>let dependant_cell = Cell.Create (fun () -&gt; 
		if cond_cell.Value then 
			equity_trade.Value 
		else 
			credit_trade.Value)
</code></pre>
<h2>Model</h2>
<p>{{ "Cephei/cell-model.png" | asset_url | img_tag }}
<code>Model</code> provides a tree structured dictionary of cells in an overall model, but is different from a plain dictionary in a number of respects:</p>
<ul>
<li>It collects all events from each of the Cells (&amp; Models) within it, enabling a single subscription for changes to any part of a model</li>
<li>It provides <code>model.As&lt;T&gt;</code>("reference name") for type coercion of the value being referenced from different parts of the model, which also enables forward reference of cells that have not yet been defined within a model</li>
<li>Names passed to a model lookup use <code>'|'</code> as a delimiter, so "equity|hsbc|lon|fair_value_price" resolves to a reference to the <code>Cell</code> fair_value_price within the hierarchy of the model</li>
</ul>
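<p>As an illustrative sketch of the lookup described above (the hierarchy and cell name are hypothetical; <code>As&lt;T&gt;</code> and the <code>'|'</code> delimiter are as described):</p>
<pre><code>open Cephei.Cell

// given a Model assembled elsewhere with an equity|hsbc|lon hierarchy,
// resolve a typed reference to one cell deep inside it
let fairValuePrice (root : Model) : ICell&lt;float&gt; =
    // path segments are separated by '|'; if the target has not been
    // defined yet, a forward reference (CellEmpty) is returned and
    // bound when the cell is eventually created
    root.As&lt;float&gt; "equity|hsbc|lon|fair_value_price"
</code></pre>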
<h2>Session</h2>
<p>{{ "Cephei/cell-session.png" | asset_url | img_tag }}
<code>Session</code> provides consistency: derived cells are calculated with the same set of values that were current when the session was opened.  Typically a change to the spot value of a bond future will trigger changes to the price of quoted IRS instruments and long-dated government bonds, which appear to ripple along a yield curve and could trigger multiple valuations of dependent instruments – unless relative-value arbitrage is being sought, it is better to snap all price changes together and calculate once.</p>
<p><code>Session</code> has a <code>Current</code> thread-static reference that allows assignment to the value of a cell to be implicitly part of the session, with calculation delayed until the session is disposed – in this context <code>IDisposable.Dispose()</code> is not a euphemism for delete, because the session is passed through the event-notification methods and kept alive until all Cells have left the session, when it finally becomes eligible for garbage collection.<br>
Important values are:</p>
<ul>
<li>Scope - Cells that have joined the session in response to the "JoinSession" event or a Join call</li>
<li>Values - the boxed values of the cells referenced in the session, ensuring consistent reads</li>
</ul>
<p>The only way to guarantee that the value returned from <code>cell.Value</code> is consistent with the session (later sessions might be scheduled first) is to read the value using a <code>SessionObserver</code></p>
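<p>A sketch of the session pattern (the <code>Session</code> constructor argument and exact signatures are assumptions; the implicit join via <code>Current</code> is as described above):</p>
<pre><code>open Cephei.Cell

let eurUsd = Cell.CreateValue 1.0812
let gbpUsd = Cell.CreateValue 1.2695

// group several market-data updates into one consistent recalculation
let snap () =
    use session = new Session ("price-snap")
    // while Session.Current is set, assignments implicitly join the session
    eurUsd.Value &lt;- 1.0835
    gbpUsd.Value &lt;- 1.2710
    // Dispose() closes the session: dependent cells calculate once, and
    // the session stays alive until every joined Cell has left it
</code></pre>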
<h3>SessionStream</h3>
<p><code>SessionStream</code> provides a proxy to an interlaced stream of sessions that returns <code>_current</code> for <code>GetValue</code> calls and <code>_next</code> for <code>SetValue</code> calls, starting calculation of <code>_next</code> once the <code>_current</code> session has completed.</p>
<p><code>SessionStream</code> allows an event-stream subscriber to hold a single session open, with consistent sets of calculations provided as quickly as each calculation completes.
This is designed for real-time risk, where high-level portfolio calculations take time and cannot keep up with fast-moving markets.  An example would be a liquidity barometer: a decline in the liquidity-coverage ratio triggers an uptick in internal treasury cross-charging interest rates, which has the effect of reducing risk appetite and the quantity of trades executed.</p>
<h1>Usage</h1>
<p>The example of a floating-rate bond that provides NPV, CleanPrice and DirtyPrice from observations of deposit rates from one week to one year and swap rates from two to fifteen years allows the model to be used for:</p>
<ul>
<li>what-if analysis</li>
<li>market quotes</li>
<li>back-testing</li>
<li>real-time-risk through a simulation model</li>
<li>liquidity risk</li>
</ul>
<p>All without any changes to the quantitative model.  Changing the subscription used to provide deposit and swap rates, and adding onward subscriptions for the NPV, clean and dirty prices, allows the model to be used for different scenarios.</p>
<pre><code>namespace SampleModels

open Cephei.Cell
open Cephei.QL

type FloatingBondModel () as this =
    inherit Model ()
(* ... implementation ... *)
    // Index model for collection access
    do this.Bind()

    // Externally visible properties 
    member this.NPV         = NPV
    member this.CleanPrice  = CleanPrice
    member this.DirtyPrice  = DirtyPrice
    member this.Deposit1W   = d1wQuote    
    member this.Deposit1M   = d1mQuote
    member this.Deposit3M   = d3mQuote
    member this.Deposit6M   = d6mQuote
    member this.Deposit9M   = d9mQuote
    member this.Deposit1Y   = d1yQuote
    member this.Swap2Y      = s2yQuote
    member this.Swap3Y      = s3yQuote
    member this.Swap5Y      = s5yQuote
    member this.Swap10Y     = s10yQuote
    member this.Swap15Y     = s15yQuote
</code></pre>
<p>The full implementation of the model using <a href="https://www.nuget.org/packages/Cephei.QL">Cephei.QL</a> for <a href="https://www.quantlib.org/">QuantLib</a> functions demonstrates why F# has been selected as the model scripting language.</p>
<pre><code>namespace SampleModels

open Cephei.Cell
open Cephei.QL

type FloatingBondModel () as this =
    inherit Model ()

    let calendar            = Fun.Times.Calendars.TARGET.Create ()
    let settlementDate      = DateTime (2018, 9, 18)
    let fixingDays          = 3u
    let settlementDays      = 3u
    let todaysDate          = Cell.Create (fun ()-&gt; calendar.Advance (settlementDate, -(int fixingDays), QL.Times.TimeUnitEnum.Days, None, None))
    let S                   = Cell.Create (fun () -&gt; Fun.Sessions.Create (todaysDate.Value))

    (*
        Rate helpers
    *)
    let zc3mQuote           = 0.0096
    let zc6mQuote           = 0.0145
    let zc1yQuote           = 0.0194

    let zc3mRate            = Fun.Quotes.SimpleQuote.Create (Some zc3mQuote)
    let zc6mRate            = Fun.Quotes.SimpleQuote.Create (Some zc6mQuote)
    let zc1yRate            = Fun.Quotes.SimpleQuote.Create (Some zc1yQuote)

    let zcBondsDayCounter   = Fun.Times.Daycounters.Actual365Fixed.Create () 

    let months m            = Fun.Times.Period.Create (m, QL.Times.TimeUnitEnum.Months)
    let ModifiedFollowing   = QL.Times.BusinessDayConventionEnum.ModifiedFollowing
    let zc3m                = Fun.Termstructures.Yield.DepositRateHelper.Create (zc3mRate, (months 3), fixingDays, calendar, ModifiedFollowing, true, zcBondsDayCounter) :&gt; QL.Termstructures.Yield.IRateHelper
    let zc6m                = Fun.Termstructures.Yield.DepositRateHelper.Create (zc6mRate, (months 6), fixingDays, calendar, ModifiedFollowing, true, zcBondsDayCounter) :&gt; QL.Termstructures.Yield.IRateHelper
    let zc1y                = Fun.Termstructures.Yield.DepositRateHelper.Create (zc1yRate, (months 12), fixingDays, calendar, ModifiedFollowing, true, zcBondsDayCounter) :&gt; QL.Termstructures.Yield.IRateHelper

    let termStrucDayCounter = Fun.Times.Daycounters.ActualActual.Create (Some QL.Times.Daycounters.ActualActual.ConventionEnum.ISDA)
    let tolerance           = 1.0e-15
    
    (*
        Bond Data
    *)
    let redemption          = 100.0

    let issueDates          = [ DateTime (2015, 3, 15)
                              ; DateTime (2015, 6, 15)
                              ; DateTime (2016, 6, 30)
                              ; DateTime (2012, 11, 15)
                              ; DateTime (1997, 5, 15)
                              ]

    let maturities          = [ DateTime (2020, 8, 31)
                              ; DateTime (2021, 8, 31)
                              ; DateTime (2023, 8, 31)
                              ; DateTime (2028, 8, 31)
                              ; DateTime (2048, 5, 15)
                              ]
 
    let couponRates         = [ 0.02375
                              ; 0.04625
                              ; 0.03125
                              ; 0.04000
                              ; 0.04500
                              ]

    let marketQuotes        = [ 100.390625
                              ; 106.21875
                              ; 100.59375
                              ; 101.6875
                              ; 102.140625
                              ]

    let two2one a b         = List.map2 (fun e y -&gt; (e,y)) a b
    let combine             = two2one (two2one issueDates maturities) (two2one couponRates marketQuotes)

    let quote               = List.map (fun q -&gt; Fun.Quotes.SimpleQuote.Create (Some q)) marketQuotes
    let quoteRate r         = Fun.Quotes.SimpleQuote.Create (Some r)
    let usCalendar          = Fun.Times.Calendars.UnitedStates.Create (Some QL.Times.Calendars.UnitedStates.MarketEnum.GovernmentBond)
    let unadjusted          = QL.Times.BusinessDayConventionEnum.Unadjusted 
    let backward            = QL.Times.DateGeneration.RuleEnum.Backward
    let semiannual          = Fun.Times.Period.Create (QL.Times.FrequencyEnum.Semiannual)
    let schedule i m        = Fun.Times.Schedule.Create (i, m, semiannual, usCalendar, unadjusted, unadjusted, backward, false, None, None)
    let coupons q           = Fun.Doubles.CreateVector ([q])
    let actualActualBond    = Fun.Times.Daycounters.ActualActual.Create (Some QL.Times.Daycounters.ActualActual.ConventionEnum.Bond)
    let fixedHelper q s c i = Fun.Termstructures.Yield.FixedRateBondHelper.Create (q, settlementDays, redemption, s, c, actualActualBond, Some unadjusted, Some redemption, Some i) :&gt; QL.Termstructures.Yield.IRateHelper
    let schedules           = List.map2 schedule issueDates maturities
    let rateHelpers         = List.map (fun ((i,m),(c,q))-&gt; fixedHelper (quoteRate q) (schedule i m) (coupons c) i) combine

    let bondInstruments     = Fun.Vector ([zc3m;zc6m;zc1y] @ rateHelpers)
    let bondTermStructure   = Fun.Termstructures.Yield.PiecewiseYieldCurveDiscountLogLinear.Create (settlementDate, bondInstruments, termStrucDayCounter, tolerance)

    (*
        curve building
    *)

    // Building of the Libor forecasting curve
    // deposits
    let d1wQuote            = Cell.CreateValue 0.043375
    let d1mQuote            = Cell.CreateValue 0.031875
    let d3mQuote            = Cell.CreateValue 0.0320375
    let d6mQuote            = Cell.CreateValue 0.03385
    let d9mQuote            = Cell.CreateValue 0.0338125
    let d1yQuote            = Cell.CreateValue 0.0335125
    // swaps
    let s2yQuote            = Cell.CreateValue 0.0295
    let s3yQuote            = Cell.CreateValue 0.0323
    let s5yQuote            = Cell.CreateValue 0.0359
    let s10yQuote           = Cell.CreateValue 0.0412
    let s15yQuote           = Cell.CreateValue 0.0433

    // SimpleQuote stores a value which can be manually changed;
    // other Quote subclasses could read the value from a database
    // or some kind of data feed.

    // deposits

    let d1wRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d1wQuote.Value))
    let d1mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d1mQuote.Value))
    let d3mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d3mQuote.Value))
    let d6mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d6mQuote.Value))
    let d9mRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d9mQuote.Value))
    let d1yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some d1yQuote.Value))
    // swaps
    let s2yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s2yQuote.Value))
    let s3yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s3yQuote.Value))
    let s5yRate             = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s5yQuote.Value))
    let s10yRate            = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s10yQuote.Value))
    let s15yRate            = Cell.Create (fun () -&gt; Fun.Quotes.SimpleQuote.Create (Some s15yQuote.Value))

    let depositDayCounter   = Fun.Times.Daycounters.Actual360.Create ()

    let period n t          = match t with
                              | 'w' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Weeks)
                              | 'm' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Months)
                              | 'y' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Years) 
                              | 'd' -&gt; Fun.Times.Period.Create (n, QL.Times.TimeUnitEnum.Days) 
                              | _   -&gt; raise (new Exception ("invalid period type"))

    let d1w                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d1wRate.Value, (period 1 'w'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d1m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d1mRate.Value, (period 1 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d3m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d3mRate.Value, (period 3 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d6m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d6mRate.Value, (period 6 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d9m                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d9mRate.Value, (period 9 'm'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))
    let d1y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.DepositRateHelper.Create (d1yRate.Value, (period 1 'y'), fixingDays, calendar, ModifiedFollowing, true, depositDayCounter))

    // setup swaps
    let annual              = Cell.Create (fun () -&gt; QL.Times.FrequencyEnum.Annual)
    let thirty360European   = Cell.Create (fun () -&gt; Fun.Times.Daycounters.Thirty360.Create (Some QL.Times.Daycounters.Thirty360.ConventionEnum.European))
    let forwardStart        = Cell.Create (fun () -&gt; period 1 'd')
    let swFloatingLegIndex  = Cell.Create (fun () -&gt; Fun.Indexes.Ibor.Euribor.Create (period 6 'm'))
    let s2y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s2yRate.Value, (period 2 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s3y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s3yRate.Value, (period 3 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s5y                 = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s5yRate.Value, (period 5 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s10y                = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s10yRate.Value, (period 10 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))
    let s15y                = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.SwapRateHelper.Create (s15yRate.Value, (period 15 'y'), calendar, annual.Value, unadjusted, thirty360European.Value, swFloatingLegIndex.Value, None, Some forwardStart.Value, None))

    (*
        Curve building
    *)
    let tr p : QL.Termstructures.Yield.IRateHelper = p :&gt; QL.Termstructures.Yield.IRateHelper
    let depoSwapInstruments = Cell.Create (fun () -&gt; Fun.Vector ([tr d1w.Value; tr d1m.Value;tr d3m.Value;tr d6m.Value;tr d9m.Value;tr d1y.Value;tr s2y.Value;tr s3y.Value;tr s5y.Value;tr s10y.Value;tr s15y.Value]))
    let depoSwapTermStructure = Cell.Create (fun () -&gt; Fun.Termstructures.Yield.PiecewiseYieldCurveDiscountLogLinear.Create (settlementDate, depoSwapInstruments.Value, termStrucDayCounter, tolerance))

    (*
        Bonds to be priced
    *)
    let faceAmount          = 100.0
    let bondEngine          = Fun.Pricingengines.Bond.DiscountingBondEngine.Create(Some (bondTermStructure :&gt; QL.Termstructures.IYieldTermStructure), None) 
    let following           = QL.Times.BusinessDayConventionEnum.Following 

    // Floating rate bond (3M USD Libor + 0.1%)
    // Should and will be priced on another curve later...

    let libor3m             = Cell.Create (fun () -&gt;
                              let t = Fun.Indexes.Ibor.USDLibor.Create ((period 3 'm'), Some (depoSwapTermStructure.Value :&gt; QL.Termstructures.IYieldTermStructure))
                              t.AddFixing (new DateTime(2018,07,17), 0.0278625, None) :?&gt; QL.Indexes.IIborIndex)
    let quarterly           = Fun.Times.Period.Create (QL.Times.FrequencyEnum.Quarterly)
    let usNYSE              = Fun.Times.Calendars.UnitedStates.Create (Some QL.Times.Calendars.UnitedStates.MarketEnum.NYSE)
    let floatingBondSchedule = Fun.Times.Schedule.Create (DateTime(2015,10,21), DateTime(2020,10,21), quarterly, usNYSE, unadjusted, unadjusted, QL.Times.DateGeneration.RuleEnum.Backward, true, None, None)

    // Coupon pricers

    let pricer              = Fun.Cashflows.BlackIborCouponPricer.Create (None)
    let volatility          = 0.0
    let Actual365Fixed      = Fun.Times.Daycounters.Actual365Fixed.Create ();
    let vol                 = Fun.Termstructures.Volatility.Optionlet.ConstantOptionletVolatility.Create (settlementDate, calendar, ModifiedFollowing,volatility, Actual365Fixed)
    //use fluent interface to set capvol
    let capletPricer        = pricer.SetCapletVolatility (Some (vol :&gt; QL.Termstructures.Volatility.Optionlet.IOptionletVolatilityStructure))

    let floatingRateBond    = Cell.Create (fun () -&gt; Fun.Instruments.Bonds.FloatingRateBond.Create (capletPricer, settlementDays, faceAmount, floatingBondSchedule, libor3m.Value, depositDayCounter, Some ModifiedFollowing, Some 2u, Some (coupons 1.0), Some (coupons 0.001), None, None, Some true, Some faceAmount, Some (DateTime(2015, 10, 21)), bondEngine))

    let NPV                 = Cell.Create (fun () -&gt; floatingRateBond.Value.NPV)
    let CleanPrice          = Cell.Create (fun () -&gt; floatingRateBond.Value.CleanPrice())
    let DirtyPrice          = Cell.Create (fun () -&gt; floatingRateBond.Value.DirtyPrice())

    // Index model for collection access
    do this.Bind()

    // Externally visible properties 
    member this.NPV         = NPV
    member this.CleanPrice  = CleanPrice
    member this.DirtyPrice  = DirtyPrice
    member this.Deposit1W   = d1wQuote    
    member this.Deposit1M   = d1mQuote
    member this.Deposit3M   = d3mQuote
    member this.Deposit6M   = d6mQuote
    member this.Deposit9M   = d9mQuote
    member this.Deposit1Y   = d1yQuote
    member this.Swap2Y      = s2yQuote
    member this.Swap3Y      = s3yQuote
    member this.Swap5Y      = s5yQuote
    member this.Swap10Y     = s10yQuote
    member this.Swap15Y     = s15yQuote
</code></pre>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:38:50 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/CellFramework</guid>
    </item>
    <item>
      <title>Cephei Sample</title>
      <link>https://www.cepheis.com/blog/cephei-sample</link>
      <description><![CDATA[<h1>Introduction</h1>
<p>Cephei was conceived on the premise that Excel is not <strong>inherently</strong> a bad tool for <em>prototyping</em> a <code>model</code>, but it is a poor tool for <em>operating</em> a business that depends on the <code>model</code>.</p>
<p>The architectural impedance issue is that it has historically been difficult to translate the <em>prototype</em> model from a spreadsheet into a reliable application.  Cephei addresses this by replicating the <code>Cell</code> notification paradigm and designing the quantitative function implementations with code-generation metadata.</p>
<p>Cephei.XL can be viewed as a <a href="https://en.wikipedia.org/wiki/Low-code_development_platform">Low-code</a> development environment for a <a href="https://en.wikipedia.org/wiki/Digital_twin">Financial Digital Twin</a></p>
<p>Cephei can be used directly with QuantLib analytics, or as an archetype when integrating with another financial library</p>
<h1>Cephei.XL Solution</h1>
<p>The Cephei.XL solution provides a comprehensive Quantitative Finance library of 15,000 functions that can be used like a traditional Excel XLL addin, but without the need to disable automatic calculation, because formulas are refreshed by <a href="https://docs.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function">RTD</a> only when they change, and <strong>always</strong> in the background.</p>
<p>At any point the corresponding source code (model and Excel addin functions) can be generated from the Cephei menu and compiled to executable code for deployment to a pricing server or <a href="https://en.wikipedia.org/wiki/Digital_twin">Financial Digital Twin</a>.
Cephei models use the <a href="https://www.cepheis.com/CellFramework">Cephei Cell Framework</a> to mirror the automatic dependency tracking of Excel, but with parallel calculation within Excel or a server application like <code>Cephei.Orleans</code></p>
<h1>Sample</h1>
<p>The simplest example uses a fixed-rate bond, entered interactively in Excel using the Excel Formula wizard, with standard evaluation and validation as each formula is entered.  All Cephei functions are prefixed with <code>_</code>; mnemonics are prefixed with <code>+</code> for model parameters and <code>-</code> for private formulas that are not exported as properties or Excel addin functions</p>
<h2>Spreadsheet formula</h2>
<p><a href="https://cepheis.blob.core.windows.net/$web/samples/Sample.xlsx">Sample.xlsx</a> open with  <a href="https://www.cepheis.com/Download/XL32">Cephei 32-bit</a>  or <a href="https://www.cepheis.com/Download/XL64">Cephei 64-bit</a>
<img src="/blog/media/blogs/Blogs/Cephei/Sample-n.png"></p>
<h4>Formula view</h4>
<p><img src="/blog/media/blogs/Blogs/Cephei/sample-f.png"></p>
<h2>Generated Source Code</h2>
<pre><code>namespace Cephei.Models

open QLNet
open Cephei.QL
open Cephei.QL.Util
open Cephei.Cell
open Cephei.Cell.Generic
open System
open System.Collections

type FixedBond 
    ( Tenor : ICell&lt;Int32&gt;
    , Maturity : ICell&lt;Date&gt;
    , FixedAmount : ICell&lt;Double&gt;
    ) as this =
    inherit Model ()

(* functions *)
    let _Calendar = Fun.TARGET()
    let _Today = (value DateTime.Today)
    let _clock = Fun.Date1 (triv (fun () -&gt; int (_Today.Value.ToOADate())))
    let _PriceDay = _Calendar.Adjust _clock (value BusinessDayConvention.Following)
    let _DayCount = Fun.ActualActual1 (value ActualActual.Convention.ISMA) (value (null :&gt; Schedule))
    let _Quote = Fun.SimpleQuote1 (triv (fun () -&gt; toNullable (0.03)))
    let _Tenor = Tenor
    let _Frequency = Fun.Period2 (value Frequency.Annual)
    let _FlatForward = Fun.FlatForward _PriceDay (triv (fun () -&gt; _Quote.Value :&gt; Quote)) (triv (fun () -&gt; _DayCount.Value :&gt; DayCounter))
    let _Maturity = Maturity
    let _Coupon = cell (new Generic.List&lt;double&gt;([| Convert.ToDouble(0.02); Convert.ToDouble(0.05); Convert.ToDouble(0.08)|]))
    let _ExCoupon = Fun.Period1()
    let _Settlement = (value (Convert.ToInt32(0)))
    let _FixedAmount = FixedAmount
    let _Engine = Fun.DiscountingBondEngine (triv (fun () -&gt; toHandle&lt;YieldTermStructure&gt; (_FlatForward.Value))) (triv (fun () -&gt; toNullable (true)))
    let _Schedule = Fun.Schedule _PriceDay _Maturity _Frequency (triv (fun () -&gt; _Calendar.Value :&gt; Calendar)) (value BusinessDayConvention.Unadjusted) (value BusinessDayConvention.Unadjusted) (value DateGeneration.Rule.Backward) (value false) (value (null :&gt; Date)) (value (null :&gt; Date))
    let _Bond = Fun.FixedRateBond _Settlement _FixedAmount _Schedule _Coupon (triv (fun () -&gt; _DayCount.Value :&gt; DayCounter)) (value BusinessDayConvention.ModifiedFollowing) _FixedAmount _PriceDay (triv (fun () -&gt; _Calendar.Value :&gt; Calendar)) _ExCoupon (triv (fun () -&gt; _Calendar.Value :&gt; Calendar)) (value BusinessDayConvention.Following) (value false) (triv (fun () -&gt; _Engine.Value :&gt; IPricingEngine)) _PriceDay
    let _CleanPrice = _Bond.CleanPrice()
    let _DirtyPrice = _Bond.DirtyPrice()
    let _NPV = _Bond.NPV()
    let _Cash = _Bond.CASH()

    do this.Bind ()

(* Externally visible/bindable properties *)
    member this.Today = _Today
    member this.Quote = _Quote
    member this.Tenor = _Tenor
    member this.Frequency = _Frequency
    member this.Maturity = _Maturity
    member this.FixedAmount = _FixedAmount
    member this.clock = _clock
    member this.CleanPrice = _CleanPrice
    member this.DirtyPrice = _DirtyPrice
    member this.NPV = _NPV
    member this.Cash = _Cash

#if EXCEL
module FixedBondFunction =

    [&lt;ExcelFunction(Name="__FixedBond", Description="Create a FixedBond",Category="Cephei Models", IsThreadSafe = false, IsExceptionSafe=true)&gt;]
    let FixedBond_create
        ([&lt;ExcelArgument(Name="Mnemonic",Description = "Identifer for the value")&gt;] 
         mnemonic : string)
        ([&lt;ExcelArgument(Name="__Tenor",Description = "reference to Int32")&gt;]
        Tenor : obj)
        ([&lt;ExcelArgument(Name="__Maturity",Description = "reference to Date")&gt;]
        Maturity : obj)
        ([&lt;ExcelArgument(Name="__FixedAmount",Description = "reference to Double")&gt;]
        FixedAmount : obj)

        = 
        if not (Model.IsInFunctionWizard()) then

            try
                let _Tenor = Helper.toCell&lt;Int32&gt; Tenor "Tenor"
                let _Maturity = Helper.toCell&lt;Date&gt; Maturity "Maturity"
                let _FixedAmount = Helper.toCell&lt;Double&gt; FixedAmount "FixedAmount"

                let builder (current : ICell) = withMnemonic mnemonic (new FixedBond
                                                            ( _Tenor.cell
                                                            , _Maturity.cell
                                                            , _FixedAmount.cell
                                                            )
                                                       ) :&gt; ICell
                let format (i : ICell) (l:string) = Helper.Range.fromModel (i :?&gt; FixedBond) l
                let source () = Helper.sourceFold "new FixedBond"
                                               [| _Tenor.source
                                               ;  _Maturity.source
                                               ;  _FixedAmount.source
                                               |]

                let hash = Helper.hashFold
                                [| _Tenor.cell
                                ;  _Maturity.cell
                                ;  _FixedAmount.cell
                                |]
                Model.specify 
                    { mnemonic = Model.formatMnemonic mnemonic
                    ; creator = builder
                    ; subscriber = Helper.subscriberModel&lt;FixedBond&gt; format
                    ; source = source 
                    ; hash = hash
                    } :?&gt; string
                        with
                        | _ as e -&gt;  "#" + e.Message
        else
            "&lt;WIZ&gt;"
</code></pre>
<h1>Download</h1>
<p>The Excel addin can be downloaded from <a href="https://www.cepheis.com/">Cepheis</a></p>
<ul>
<li><a href="https://cepheis.blob.core.windows.net/$web/XL32.zip">Excel (32-bit)</a></li>
<li><a href="https://cepheis.blob.core.windows.net/$web/XL64.zip">Excel (64-bit)</a></li>
<li><a href="https://github.com/channell/Cephei/">Source code</a><br>
The <strong>public</strong> version of the addin includes basic telemetry to track errors and usage of functions by user (but not data) in order to assist with support during evaluation.  A release version without telemetry is available upon request</li>
</ul>
<h1>Summary</h1>
<p>Whilst Cephei can be used directly as a Quantitative Finance library for structuring, pricing and as a summary blotter, it is designed to be used:</p>
<ul>
<li>As a model editor for prototyping with ticking market data, saving directly to code that can be deployed without additional development</li>
<li>As a recipe editor for model parts that are generated and compiled for use within more complex models (the sample model encapsulates the additional objects – schedule, termsheet, pricing engine – needed for QuantLib)</li>
<li>As a foundation for risk simulation models – the Cell framework is designed for massively parallel Monte Carlo simulation of exposure for real-time risk (Cephei is named after <a href="https://en.wikipedia.org/wiki/Delta_Cephei">Delta Cephei</a>)</li>
</ul>
<p>Further information is available at <a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> , <a href="https://www.cepheis.com/blog/introducing-cephei-xl">Cephei.XL</a> and <a href="https://www.cepheis.com/CellFramework">Cephei Cell</a></p>
<p>Contact <a>feedback@cepheis.com</a></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:42:59 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-sample</guid>
    </item>
    <item>
      <title>Introducing Cephei.XL</title>
      <link>https://www.cepheis.com/blog/introducing-cephei-xl</link>
      <description><![CDATA[<h1>Background</h1>
<p>For more than a quarter of a century Microsoft Excel has been the preferred “desktop” for traders and fund managers to develop financial models for bond and derivative trades.  Excel provides integration for market-data vendors to supply <strong>Real-Time Data</strong> (<a href="https://docs.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function">RTD</a>) directly into spreadsheets, and integration for advanced finance libraries to build models.
Spreadsheet models were/<em>are</em> a key enabler for the development of models for financing contracts, but for more than ten years there have been programmes to replace these models with applications that are controlled by product control, compliant with regulatory commitments and re-valued for risk exposure at market close.<br>
There has been considerable resistance to this process because the applications do not provide the flexibility of building on a desktop and testing with live prices.</p>
<h2>Another approach</h2>
<p>Cephei was conceived from the perspective that Excel is not <em>inherently evil</em> - it is unrivalled for the ability to prototype and test models – but conversion to reliable models is difficult: <strong>if</strong> we can automate the process of generating code from models, we don’t necessarily need to re-implement in imperative languages.</p>
<h1>Cephei Solution</h1>
<p>The Cephei solution is to replicate the good facets of spreadsheets, and to design the finance libraries to enable the <em>automatic generation</em> of code from models:</p>
<ol>
<li>The <a href="https://www.cepheis.com/CellFramework">Cephei Cell Framework</a> mirrors the spreadsheet notion that a <code>Cell</code> can contain a value or a formula, and that formulas are updated automatically whenever an underlying value changes.  The <code>Cell</code> framework is faster than imperative calculation because calculations are performed in parallel</li>
<li><a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> is a building block combining the <a href="https://www.cepheis.com/CellFramework">Cephei Cell Framework</a> and the <a href="https://www.quantlib.org/">QuantLib</a> open-source quantitative finance library</li>
<li><a href="https://www.cepheis.com/blog/introducing-cephei-xl">Cephei.XL</a> is a complete Excel add-in providing access to all functions of <a href="https://www.quantlib.org/">QuantLib</a> from Excel</li>
</ol>
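<p>The <code>Cell</code> notion above can be sketched minimally. The following is an illustrative Python toy (not the real F#/.NET <code>Cephei.Cell</code> API, which is parallel and thread-safe; all names here are invented): a cell holds either a value or a formula over other cells, and setting a value marks every dependant formula for recalculation.</p>

```python
# Illustrative sketch only: a value-or-formula cell with change propagation.
# Invented names; the real Cephei.Cell framework is F#/.NET and parallel.
class Cell:
    def __init__(self, value=None, formula=None, deps=()):
        self._value, self.formula = value, formula
        self._dirty = formula is not None
        self._dependants = []
        for d in deps:                      # register for invalidation
            d._dependants.append(self)

    @property
    def value(self):
        if self._dirty:                     # recalculate lazily on first read
            self._value, self._dirty = self.formula(), False
        return self._value

    @value.setter
    def value(self, v):
        self._value = v
        self._invalidate()

    def _invalidate(self):
        for d in self._dependants:          # mark the whole downstream graph
            if not d._dirty:
                d._dirty = True
                d._invalidate()

# spreadsheet-style usage: `price` tracks `quote` automatically
quote = Cell(value=100.0)
spread = Cell(value=0.5)
price = Cell(formula=lambda: quote.value + spread.value, deps=(quote, spread))
assert price.value == 100.5
quote.value = 101.0                         # invalidates `price`
assert price.value == 101.5
```

<p>The real framework additionally performs the recalculation on a thread-pool rather than lazily on read, which is where the parallel speed-up comes from.</p>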
<h2>Cephei.XL overview</h2>
<p>Cephei.XL uses <a href="https://docs.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function">RTD</a> as an object cache to avoid the need for complex logic to track formula changes as functions are called over and over again by the Excel calculation engine.
The Cephei Cell Framework ensures that when a formula is changed, all dependent cells are re-calculated in parallel, with updates passed back to Excel as RTD data-change notifications.
Irrespective of the complexity of the model, the interactive performance of Excel (<em>apart from the initial start-up time of adding 15,000 functions to Excel</em>) remains responsive – all calculation is performed in parallel by the <code>Cell</code> thread-pool.
While Cephei.XL can be used as an efficient Quant library that interoperates well with Bloomberg, it is designed from the foundation with code generation (“save as code”) in mind: Cephei.XL does not create handle references; instead, all functions take a Mnemonic parameter that is used as a property name when code is generated.</p>
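<p>The mnemonic/handle idea can be sketched as a cache keyed by mnemonic. This is a hypothetical Python illustration of the concept only; <code>define</code> and <code>value</code> are invented stand-ins, not the add-in's API.</p>

```python
# Hypothetical sketch of a mnemonic-keyed object cache: worksheet functions
# return the mnemonic as the handle, so repeated recalculation of the same
# formula reuses the cached object instead of rebuilding it.
_cache = {}

def define(mnemonic, factory):
    """Create (or reuse) the object for `mnemonic`; return the handle."""
    if mnemonic not in _cache:
        _cache[mnemonic] = factory()
    return mnemonic

def value(mnemonic):
    """Dereference a handle, in the spirit of `_value(mnemonic)`."""
    return _cache[mnemonic]

builds = []
define("CURVE.EUR", lambda: builds.append(1) or "eur-curve")
define("CURVE.EUR", lambda: builds.append(1) or "eur-curve")  # recalc: cache hit
assert value("CURVE.EUR") == "eur-curve"
assert len(builds) == 1                     # the object was built only once
```

<p>Because the handle is the mnemonic itself, the same name can later become a property name when the spreadsheet is saved as code.</p>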
<h2>Implementation</h2>
<p>Cephei.XL exports 15,000 worksheet functions from 600,000 lines of code, all of which start with <code>_</code>. The simplest are <code>_clock()</code> and <code>_today()</code>, which update every second or whenever the machine clock rolls into another day.
All functions (except <code>_value(mnemonic)</code> and <code>_value_range(mnemonic, layout)</code>) return the handle provided.<br>
The <code>_value_range()</code> layout is one of:</p>
<ol>
<li>“C” for column layout</li>
<li>“R” for row layout</li>
<li>“CT” for columns with titles</li>
<li>“RT” for rows with titles</li>
</ol>
<p>When the layout includes “T” for a complex object (e.g. <code>Bond</code>), all properties (dirtyprice, cleanprice, yield, cash, etc.) are provided in a table.
The alpha code can be downloaded from <a href="https://www.cepheis.com/">Cepheis</a>:</p>
<ul>
<li><a href="https://www.cepheis.com/Download/XL32">Excel (32-bit)</a></li>
<li><a href="https://www.cepheis.com/Download/XL64">Excel (64-bit)</a></li>
<li><a href="https://www.cepheis.com/Download/XL32Debug">Excel (32-bit debug)</a></li>
<li><a href="https://www.cepheis.com/Download/XL64Debug">Excel (64-bit debug)</a></li>
<li><a href="https://github.com/channell/Cephei/">Source code</a></li>
</ul>
<h2>Positioning</h2>
<p>Cephei.XL is designed to be used like an interactive editor – when the model is complete, you can click generate to translate it to an F# model that can be deployed to a server (<code>Cephei.Orleans</code> will provide a “Financial Digital Twin” for hosting active cloud models).
QuantLib is a fairly comprehensive library of financial methods that are widely used <em>or copied</em> in investment banking, buy-side valuation and <a href="https://www.thegoldensource.com/solutions/banks-brokers/ipv-pruval/">IPV</a>.  Cephei.XL can be used directly, or as a proof-of-concept of how Quant add-ins <em>should</em> be done.</p>
<h1>Model Driven Architecture</h1>
<p><a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> and <a href="https://www.cepheis.com/blog/introducing-cephei-xl">Cephei.XL</a> are examples of the benefit of using model-driven architecture: most of the code has been produced using code generation from a <a href="https://sparxsystems.com/">Sparx Enterprise Architect</a> software model of the underlying QuantLib functions.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:43:29 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/introducing-cephei-xl</guid>
    </item>
    <item>
      <title>Enterprise Architecture global scalability</title>
      <link>https://www.cepheis.com/blog/enterprise-architecture-global-scalability</link>
<description><![CDATA[<p>Enterprise Architecture covers a range of practices, from business strategy through the structure and organisation of applications to detailed design using enterprise frameworks.  While Business, Applications and Technology Architecture are often developed independently to address different stakeholder needs, the value of modelling is enhanced when different views are reconciled through realisation, trace and dependency relationships.
In complex organisations, high-level taxonomies carry more influence when they represent the aggregation of applications and services, while application criticality is better understood when related to high-level functions and critical processes.</p>
<h1>A Universal Bank example</h1>
<p>A universal bank (covering Retail, Corporate, Investment and Wealth) commissioned a number of initiatives using <a href="https://sparxsystems.com/">Sparx Enterprise Architect</a> to model the organisation in distinct repositories.</p>
<h3>Service Architecture</h3>
<p>Focused on common modelling of applications with a common layout of charter, requirements, high-level design, and functional and detailed design.  Using a common repository allowed dependencies between applications and common frameworks to be highlighted.
Detailed design of fifteen thousand application components was captured using UML notation, with <a href="https://www.sparxsystems.com/resources/mdg_tech/">MDG meta-model customisation</a> for additional properties</p>
<h3>Process Architecture</h3>
<p>Focused on the processes performed across the organisation, from client onboarding, through trading, to risk management and financial accounting.  The Enterprise Process Model was organised around a hierarchical process taxonomy that provided a <em>Bank on a Page</em> process view, with <a href="https://www.sparxsystems.com/resources/mdg_tech/">MDG meta-model customisation</a> for references to common applications, organisational hierarchy and other business taxonomies.
Detailed process modelling of twenty thousand activities, decision gateways and events was captured using <a href="https://sparxsystems.com/enterprise_architect_user_guide/14.0/model_domains/bpmn_1_4.html">BPMN</a> notation.</p>
<h3>Functional Architecture</h3>
<p>Focused on the Functional Taxonomy to identify sponsorship for application and services, to coordinate the governance of change with escalation points to mitigate the impact of issues.
Functional Architecture provided the current and future state of the business and allowed common capabilities to be identified, and transitioned to standard functions/services.</p>
<h3>Domain Architecture</h3>
<p>Focused on end-to-end design for specific domains/applications without the imposition of taxonomies or structure; but with the freedom to mix function, data, process, application design as appropriate for each initiative.</p>
<h2>Enterprise Architecture</h2>
<p>The challenge for Enterprise Architecture was to bring together different views and enrich each viewpoint with context provided by the different models.  Rationalisation served to bridge the gaps between the viewpoints and provide fresh impetus to keep models current as design moved through implementation to maintenance. Rationalisation faced a number of challenges:</p>
<ul>
<li>Size and complexity of models and metamodel customisation tested, and exceeded, the capability of <a href="https://sparxsystems.com/resources/share.html">XMI</a> transfer of repositories when projects did not share the same package hierarchy but had dependencies between projects</li>
<li>The need to retain domain-specific viewpoints, so that different aspects were not referenced inappropriately (such as a <code>Customer</code> Actor, component, table or class being referenced when <code>Customer</code> process lane was needed)</li>
<li>The need for <a href="https://www.sparxsystems.com/resources/mdg_tech/">MDG meta-model customisation</a> to continue to evolve to meet domain presentation needs</li>
<li>Inability to coordinate/schedule the suspension of modelling activities to allow consolidation</li>
<li>Performance of huge models for interactive modelling activities</li>
</ul>
<h3>Value</h3>
<p>While the impediments to enterprise rationalisation were considerable, the potential benefits of requirement traceability and enterprise lineage were also considerable.
<a href="https://sparxsystems.com/">Sparx Enterprise Architect</a> is almost unique in its ability to model an organisation from high-level Enterprise Architecture <em>and</em> detailed application design through to the real-time interaction of timing-sensitive applications. Sparx Enterprise Architect allows an interbank-interface class to be related to the strategic function, requirement and data that is needed for regulatory reporting.</p>
<h1>Solution</h1>
<p>The initial solution was a batch database <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">ETL</a> process, intended as a one-time load, but evolved into the need for an <a href="https://www.cepheis.com/EA/Hub">Enterprise Hub</a> to provide continuous replication between models.
Today, the <a href="https://www.cepheis.com/EA/Hub">Enterprise Hub</a> provides continuous replication between domain-specific models, and an enterprise repository, applying viewpoint rules to changes:</p>
<ol>
<li>Status filters to prevent unfinished changes being replicated, and to separate <code>as-is</code> and <code>to-be</code> viewpoints</li>
<li>Content filters to prevent application or database content being replicated to a consolidated process viewpoint</li>
</ol>
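<p>The two viewpoint rules can be sketched as filters over a change stream. This is a hypothetical Python illustration; the statuses, element types and field names are invented, not the Enterprise Hub's actual configuration.</p>

```python
# Hypothetical sketch of viewpoint rules applied during replication:
# a status filter drops unfinished changes, and a content filter drops
# element types that do not belong in the target viewpoint.
APPROVED = {"Approved", "Implemented"}
PROCESS_VIEWPOINT_EXCLUDES = {"Component", "Table", "Class"}

def replicate(changes, excludes=PROCESS_VIEWPOINT_EXCLUDES):
    """Yield only changes that pass both viewpoint rules."""
    for change in changes:
        if change["status"] not in APPROVED:
            continue                      # rule 1: status filter
        if change["type"] in excludes:
            continue                      # rule 2: content filter
        yield change

changes = [
    {"name": "Customer", "type": "Table", "status": "Approved"},    # filtered
    {"name": "Onboarding", "type": "Process", "status": "Draft"},   # filtered
    {"name": "Onboarding", "type": "Process", "status": "Approved"},
]
assert [c["name"] for c in replicate(changes)] == ["Onboarding"]
```

<p>In this toy, only the approved process element reaches the consolidated viewpoint; the database table and the draft change are both held back.</p>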
<p>The Enterprise Hub also allows for globally distributed repositories with local performance.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:44:15 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/enterprise-architecture-global-scalability</guid>
    </item>
    <item>
      <title>Implementing Cephei.QL</title>
      <link>https://www.cepheis.com/blog/implementing-cephei-ql</link>
      <description><![CDATA[<h1>Background</h1>
<p><a href="https://www.cepheis.com/blog/cephei-ql">Cephei.QL</a> is a large project, consisting of over two thousand classes and twenty thousand <a href="https://en.wikipedia.org/wiki/Closure_(computer_programming)">closures</a>. It has been implemented in F# because parallel execution needs to be functionally immutable to avoid co-mutation of data on separate threads, and because the problem is simpler when types can be inferred in many places.</p>
<p>Whilst there is a huge amount of code that needs to be produced, with some exceptions the pattern of mapping an underlying library to a higher-level abstraction is common.</p>
<h1>Overview</h1>
<p>This blog is about how model-driven architecture can be applied to the problem of creating a library that integrates a number of underlying sources.  Traditionally, architecture modelling tools approach this problem by translating classes to a metadata-rich source where the use of templates allows the implementation to be derived at runtime.</p>
<p>For Cephei.QL, code generation has been used because:</p>
<ul>
<li>The target language (F#) is not supported by the current generation of software engineering tools.</li>
<li>The permutations are more complex than is supported by templates and would require a runtime code generator to be used.</li>
<li>Runtime generation is not appropriate for high-performance production environments.</li>
<li>The target library is intended to provide recipes for specific implementations, that can be tailored to specific problems.</li>
</ul>
<p>While this example is concerned with the generation of abstractions over a large financial library (library -&gt; cell models -&gt; serialisation -&gt; Excel integration), the problem is the same as implementing data structures that need to be mapped from JavaScript through application tiers to data persistence.</p>
<h2>Components</h2>
<p>Sparx Enterprise Architect is the leading tool for software engineering because of the wide range of languages supported and its tools for <a href="https://en.wikipedia.org/wiki/Model-driven_architecture">Model-Driven Architecture</a>. In this scenario it is also extremely strong because the underlying repository supports SQL queries and therefore rich database integration.</p>
<p><a href="https://www.nuget.org/packages/EA.Gen.Model/">EA.Gen.Model</a> is a .NET class library (that we developed years ago) that uses the <a href="https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/entity-data-model">Entity Data Model</a> to provide a high-level object-graph view of the underlying Sparx Repository, with <a href="https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/introduction-to-linq-queries">Linq Queries</a>.
There are many templating technologies based on the principles for web-server pages that generate HTML for browsers, but the one we use is <a href="https://docs.microsoft.com/en-us/visualstudio/modeling/code-generation-and-t4-text-templates?view=vs-2019">T4 Text Templates</a> because it can be used within a Build Server (<a href="https://docs.microsoft.com/en-us/azure/devops/pipelines/overview?view=azure-devops-2020">TFS</a>, <a href="https://www.jetbrains.com/teamcity/">Teamcity</a>, <a href="https://www.jenkins.io/">Jenkins</a>, etc) without having to configure open access.</p>
<h1>Implementation</h1>
<p>The full Visual Studio 2019 code for <a href="https://www.cepheis.com/Product/Gen">Cephei.Gen</a> is available from <a href="https://github.com/channell/Cephei/tree/master/Cephei.Gen/">GitHub</a>, but includes the C++ code generator that was previously used to wrap <a href="https://www.quantlib.org/">QuantLib</a> before switching to the C# port <a href="https://github.com/amaggiulli/qlnet">QLNet</a>. The <a href="https://github.com/channell/Cephei/blob/master/Cephei.Gen/NetModel/Class.cs">NetModel/Class.cs</a> provides the data model for classes, and <a href="https://github.com/channell/Cephei/blob/master/Cephei.Gen/NetQL/Class.tt">NetQL/Class.tt</a> provides the template for the bulk of the Cephei.QL code.</p>
<p>After reverse engineering the source code into Sparx, the code generator can be run to generate F# code.</p>
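<p>The generation step can be sketched as template expansion over a class description read from the repository. The Python below is a simplified, hypothetical stand-in for the T4 template; the emitted F# shape is illustrative, not the actual <code>NetQL/Class.tt</code> output.</p>

```python
# Simplified stand-in for the T4-based generator: read a class description
# (as it might come from the Sparx repository tables) and expand a text
# template into wrapper source.  The emitted F# shape is illustrative only.
FS_TEMPLATE = """type {name}Model ({params}) =
    inherit Model ()
{members}
"""

def generate(cls):
    """Expand the template for one class description."""
    params = ", ".join(f"{p} : ICell" for p in cls["properties"])
    members = "\n".join(f"    member this.{p.capitalize()} = {p}"
                        for p in cls["properties"])
    return FS_TEMPLATE.format(name=cls["name"], params=params, members=members)

bond = {"name": "Bond", "properties": ["coupon", "maturity"]}
print(generate(bond))
```

<p>The real generator walks the reverse-engineered model via EA.Gen.Model rather than a dictionary, but the principle is the same: one template, many classes.</p>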
<h1>Conclusion</h1>
<p>For a wide variety of problems, model-driven architecture can be used to accrue greater value from the models we create during systems design, increasing productivity and reducing the errors introduced when code is written by hand.</p>
<p><img src="/blog/media/blogs/Blogs/Cephei/Cephei.QL.png"></p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:44:51 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/implementing-cephei-ql</guid>
    </item>
    <item>
      <title>Cephei.QL</title>
      <link>https://www.cepheis.com/blog/cephei-ql</link>
      <description><![CDATA[<h1>Overview</h1>
<p>With <a href="https://www.cepheis.com/CellFramework">Cephei.Cell</a> we introduced a Cell Framework that allows computationally intensive problems to be declared as a series of functional definitions that the runtime calculates in parallel.
<a href="https://www.nuget.org/packages/Cephei.QL/">Cephei.QL</a> applies the Cell Framework to the <a href="https://github.com/amaggiulli/QLNet">QLNet</a> port of the <a href="https://www.quantlib.org/">QuantLib</a> quantitative finance library to provide a series of pre-canned model building blocks for all QuantLib classes that can be assembled into complete models for a financial instrument or portfolio. The full source is available on <a href="https://github.com/channell/Cephei">GitHub</a></p>
<p>Each Quant class is wrapped as a <code>Model</code>, each property is wrapped as an <code>ICell&lt;&gt;</code>, and each function is wrapped as a function taking <code>ICell&lt;&gt;</code> parameters and returning an <code>ICell&lt;&gt;</code> wrapper of the return value.  There are three exceptions:</p>
<ol>
<li><code>IPricingEngine</code> is added to the constructor of Instruments that use a pricing engine to follow a functional idiom</li>
<li>Evaluation Date is added to the constructor of Instruments so that a common source of date is available to all properties/functions that need a date for pricing, and is common across all instruments that will be valued together.</li>
<li>Methods that do not return a value instead return the reference object, to allow them to be chained together as fluent functions (e.g. <code>FedFund</code>.<code>AddFixing</code> returns <code>FedFund</code>)</li>
</ol>
<p>Despite the overhead of constructing Cells rather than evaluating functions imperatively, the overall performance of a non-trivial model is significantly quicker because evaluation is performed in parallel, with rendezvous happening when prices need to be aggregated.</p>
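<p>The wrapping rule (functions take <code>ICell&lt;&gt;</code> parameters and return an <code>ICell&lt;&gt;</code>) can be sketched by lifting an ordinary function to cells. This Python toy uses invented names (<code>Cell</code>, <code>lift</code>); the real wrappers are generated F#.</p>

```python
# Sketch of the wrapping rule: an ordinary function over values becomes a
# function over cells that returns a cell recomputing from its inputs.
# `Cell` and `lift` are invented for illustration; the real ICell is F#/.NET.
class Cell:
    def __init__(self, value=None, formula=None):
        self._value, self.formula = value, formula
    @property
    def value(self):
        return self.formula() if self.formula else self._value
    @value.setter
    def value(self, v):
        self._value = v

def lift(fn):
    """Turn f(a, b, ...) into f(cell_a, cell_b, ...) returning a Cell."""
    return lambda *cells: Cell(formula=lambda: fn(*(c.value for c in cells)))

# e.g. a holding amount that always reflects its inputs
holding = lift(lambda face, qty: face * qty)
face, qty = Cell(value=100), Cell(value=3)
total = holding(face, qty)
assert total.value == 300
qty.value = 5                               # recomputed on the next read
assert total.value == 500
```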
<h2>Rationale</h2>
<p>QuantLib is not a natural library for functional wrapping (because of the internal observer/observable pattern), but Cephei.QL demonstrates that any library can be wrapped for functional definition.
<code>Cephei.QL</code> (and <code>Cephei.Cell</code>) encapsulates the dependencies within the model, and only exposes value cells that can be edited and function cells that provide results that <em>always</em> reflect the value sources, irrespective of the change.
Financial models that are presented as a dictionary of values with <code>IObservable</code>/<code>IObserver</code> event linkage can be wired together without needing to know the internal structure, and used as financial building blocks.</p>
<h2>Purpose</h2>
<p>Cephei.QL provides a building block for Cephei.XL (which supports the construction of models in spreadsheets that can be saved as F# code), and for embedding in systems, including “Financial Digital Twins” for real-time risk.</p>
<h1>Usage</h1>
<p>The <a href="https://www.nuget.org/packages/Cephei.QL/">Cephei.QL</a> NuGet package includes the Cephei.QL assembly with 2,000+ models, and a <code>cell</code> module that provides functions to create models, together with utility functions: <code>cell</code> to construct cells with type inference, and <code>triv</code> for trivial (lookup) functions. Behaviour depends on the value of <code>Cell.Lazy</code> (false for construction at definition time) and <code>Cell.Parallel</code> (true for parallel execution).
Each model is declared with three sections:</p>
<ul>
<li>Parameters : references to the source cells.</li>
<li>Functions : declarative functions that perform calculations using other cells within the model.</li>
<li>Externally visible binding: cells and cell functions that can be bound and serialised to backing store.</li>
</ul>
<h2>Example</h2>
<p>This model is based on the QuantLib Bond example. It uses <code>Cephei.QL</code> blocks to represent a small portfolio of Fixed Rate Bonds with different {tenor, coupon rates, payment frequencies, and yield rates}, allows the {Face Value, Quantity, Redemption} to be edited, and provides a Market Clean Price.</p>
<p>External to the Bond model, two models are provided for business-wide properties and market conditions that are used to change the valuation through event propagation.</p>
<h3>Business Standards</h3>
<pre><code>type BusinessStandards () as this =
    inherit Model ()

    let accrualConvention           = value BusinessDayConvention.Unadjusted
    let paymentConvention           = value BusinessDayConvention.ModifiedFollowing
    let settlementDays              = value 3
    let dayCount                    = value (new ActualActual (ActualActual.Convention.ISMA) :&gt; DayCounter)
    let includeSettlementDate       = value (new System.Nullable&lt;bool&gt; (true))

    do this.Bind ()

    member this.AccrualConvention   = accrualConvention
    member this.PaymentConvention   = paymentConvention
    member this.SettlementDays      = settlementDays
    member this.DayCount            = dayCount
    member this.IncludeSettlement   = includeSettlementDate
</code></pre>
<h3>Market Condition</h3>
<pre><code>type MarketCondition 
    ( standards                     : BusinessStandards ) as this =
    inherit Model ()

    let toNullable (v : double)     = new System.Nullable&lt;double&gt; (v)

    let calendar                    = Fun.TARGET()
    let clockDate                   = value Date.Today;
    let convention                  = value BusinessDayConvention.Following
    let today                       = calendar.Adjust clockDate convention

    do this.Bind ()

    member this.Today               = today
    member this.Calendar            = calendar
    member this.ClockDate           = clockDate
</code></pre>
<p>This trivial model provides a clock date that is incremented in the test example, and calculates a cashflow date using a calendar and date-adjustment convention.  Whenever the clock date is changed, the update to <code>Today</code> is sent to all cells dependent on this value.</p>
<h3>Bond</h3>
<pre><code>type BondPortfolio 
    ( standards                     : BusinessStandards 
    , marketCondition               : MarketCondition
    ) as this =
    inherit Model ()

    let calendar                    = triv (fun () -&gt; marketCondition.Calendar.Value :&gt; Calendar)
(* … *)
    
    let makeBond issue length coupon  (frequency : ICell&lt;Period&gt;) yieldVal = 
        let today = marketCondition.Today.Value
        let dated = triv (fun () -&gt; today)      // don't reset on valuation date
        let nullDate = value (null :&gt; Date)
        let maturity = marketCondition.Calendar.Advance1 dated length years standards.PaymentConvention eom 
        let schedule = Fun.Schedule dated maturity frequency calendar standards.AccrualConvention standards.AccrualConvention dateGenerationRule eom nullDate nullDate
        let yieldCurve = triv (fun () -&gt; (makeYield yieldVal))
        let engine = Fun.DiscountingBondEngine yieldCurve standards.IncludeSettlement 
        let castEngine = triv (fun () -&gt; engine.Value :&gt; IPricingEngine)
        let exCouponPeriod = value (null :&gt; Period)
        let b = Fun.FixedRateBond standards.SettlementDays faceAmount schedule coupon bondDayCount standards.PaymentConvention redemption issue calendar exCouponPeriod calendar convention eom castEngine marketCondition.Today
        b.Mnemonic &lt;- "B" + id.ToString()
        id &lt;- id + 1
        b

    let bonds = 
        seq {for l in lengths do
                for c in coupons do 
                    for f in frequencies do
                        for y in yields do
                            (l,c,f, y)}
        |&gt; Seq.map (fun (l,c,f, y) -&gt; makeBond marketCondition.Today l c f y)
        |&gt; Seq.toArray

    let cleanPrices                 = bonds |&gt; Array.map (fun i -&gt; i.CleanPrice) 

    let cleanPrice                  = cell (fun () -&gt; cleanPrices |&gt; Seq.fold (fun a y -&gt; a + y.Value * quantity.Value) 0.0)
        
    do this.Bind ()

    member this.Amount              = faceAmount
    member this.Quantity            = quantity
    member this.Redemption          = redemption

    member this.CleanPrice          = cleanPrice
</code></pre>
<p>This model uses the business standards, market conditions and a set of permutations to build Fixed Rate Bonds by constructing the schedule, yield curve and engine. The user of the model does not need to know the intermediate steps that QuantLib uses to build a Bond.  Refactoring the yield-curve functionality to be shared via market conditions is a simple task that is transparent to users.</p>
<h5>Test</h5>
<pre><code>    [&lt;TestMethod&gt;]
    member this.TestLazy () =

        let lots = 
            seq { for n in 1..60 do
                    new BondPortfolio (standards, market)}
            |&gt; Seq.toList

        let r = 
            seq { for c in 0..100 do
                    market.ClockDate.Value &lt;- market.ClockDate.Value + c
                    let cleanPrice = lazy (lots |&gt; List.fold (fun a y -&gt; a + y.CleanPrice.Value) 0.0 )
                    Console.WriteLine ("Lazy, {1}, {0}", cleanPrice.Value, market.Today.Value)
                    cleanPrice
                    } |&gt; Seq.toArray

        Assert.IsTrue(true);
</code></pre>
<p>The test case generates 60 portfolios and retrieves the clean price for 100 time points, but the event could equally be market prices, or a what-if of changing the settlement period.  Enabling <code>Cell.Parallel &lt;- true</code> reduced runtime by a factor of four on my workstation.
The key take-away is that, even with the cost of propagating calculations on background threads, the runtime is still shorter on the multi-core computers that are now common.</p>
]]></description>
      <pubDate>Sat, 13 Jul 2024 21:45:25 GMT</pubDate>
      <guid isPermaLink="true">https://www.cepheis.com/blog/cephei-ql</guid>
    </item>
  </channel>
</rss>