Hiperspace is an object technology that uses a key-addressable store to expand an application data model beyond the limits of memory that can be directly referenced in main memory. Elements are not duplicated or changed to match database shapes.
Elements are serialized directly using “Protocol Buffers” to and from a key/value structure for storage in memory stores including CXL expanded and pooled memory, shared cache, local SSD or durable key-value databases. Elements that are not currently being used are released from main memory, and transparently (and quickly) reloaded when referenced. Memory stores allow exabytes of data to be addressed.
Hiperspace uses compile-time code generation to map domain Elements to key/value stores, with lazy loading of references and an Entity Framework-consistent context for complex queries.
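As a rough illustration of lazy loading against a key/value store, the sketch below keeps only the key of a referenced element and fetches the value on first access. It is written in Python for brevity; `KeyValueStore`, `LazyRef` and the key `address/42` are hypothetical names invented for this example and are not part of the Hiperspace API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class KeyValueStore:
    """Hypothetical stand-in for a storage driver: bytes in, bytes out."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}
        self.reads = 0  # counts lookups so lazy loading is observable

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        self.reads += 1
        return self._data.get(key)


@dataclass
class LazyRef:
    """Holds only the key of a referenced element; loads the value on first use."""

    store: KeyValueStore
    key: bytes
    _value: Optional[bytes] = field(default=None, repr=False)
    _loaded: bool = field(default=False, repr=False)

    @property
    def value(self) -> Optional[bytes]:
        if not self._loaded:
            self._value = self.store.get(self.key)
            self._loaded = True
        return self._value

    def release(self) -> None:
        """Drop the cached value; the next access reloads it from the store."""
        self._value = None
        self._loaded = False


store = KeyValueStore()
store.put(b"address/42", b"1 Main Street")

ref = LazyRef(store, b"address/42")
reads_before_access = store.reads   # nothing has been fetched yet
first = ref.value                   # first access triggers one lookup
second = ref.value                  # served from the in-memory copy
reads_after_access = store.reads
ref.release()                       # element released from main memory
third = ref.value                   # reloaded transparently
reads_after_reload = store.reads
```

The counter on the store makes the behaviour visible: no lookup happens until the reference is used, repeated access costs nothing extra, and a released reference is reloaded on demand.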
Modern high-performance systems avoid mutability for concurrency, but emulate the experience of mutability for applications. In memory, “copy on write” is used to avoid partial changes, while databases write to new locations and update indexes once complete.
Hiperspace uses two techniques to provide the appearance of mutability whilst retaining lockless concurrency: Aspect for single properties added later, and Segment for multiple values added as needed.
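One way to picture the two techniques is an append-only store in which existing entries are never overwritten and later information is added under new keys. The Python sketch below is an assumption-laden illustration only: `WriteOnceStore`, the tuple key layout and the `"aspect"` / `"segment"` markers are invented for the example and do not describe how Hiperspace actually lays out keys.

```python
from typing import Dict, List, Optional, Tuple


class WriteOnceStore:
    """Hypothetical append-only store: a key is bound once and never overwritten."""

    def __init__(self) -> None:
        self._data: Dict[Tuple, object] = {}

    def bind(self, key: Tuple, value: object) -> bool:
        """Store value under key unless the key is taken; stored data never changes."""
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key: Tuple) -> Optional[object]:
        return self._data.get(key)

    def scan(self, prefix: Tuple) -> List[Tuple[Tuple, object]]:
        """Return all entries whose key starts with prefix, in key order."""
        n = len(prefix)
        hits = [(k, v) for k, v in self._data.items() if k[:n] == prefix]
        return sorted(hits, key=lambda kv: kv[0])


store = WriteOnceStore()
customer = ("Customer", 1)
store.bind(customer, {"name": "Acme"})

# Aspect-like: a single property added later, stored under its own key.
store.bind(customer + ("aspect", "rating"), "AA")

# Segment-like: many values added as needed, each under a distinct key.
store.bind(customer + ("segment", "note", 1), "first call")
store.bind(customer + ("segment", "note", 2), "follow-up")

# The original element is never rewritten, so readers need no locks.
overwrite_accepted = store.bind(customer, {"name": "Changed"})
notes = [v for _, v in store.scan(customer + ("segment", "note"))]
```

Because nothing is updated in place, a reader can never observe a half-written element, which is the property the lockless design relies on.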
Immutability avoids the need for complex database systems at runtime.
Fundamental Review of the Trading Book is a challenging financial services regulation because it requires the historical retention of information for back testing of model changes. Back testing provides banks and regulators with confidence that value-at-risk projections are consistent with the risk observed once time has moved on. The standard approach is to warehouse each daily dataset separately in a data lake and reuse it as needed. Hiperspace addresses the need using:
Hiperspace uses the fastest available serialization technology to convert in-memory objects to key/value for memory drivers. Summary or simulation models that are computationally intensive to build can be stored in a local hiperspace and re-used as needed, irrespective of size (when large CXL or SSD is used).
Hiperspace uses low-latency LSM stores such as RocksDB for storage, using performant and efficient serialization.
Hiperspace stores arbitrarily complex objects from simple key/value facts to arbitrarily complex hierarchical documents that can run to many gigabytes in size for each object. Unlike conventional document databases, a document can be stored as:
segment
Where a conventional document database will update an entire document, Hiperspace can store only the parts that have changed. For a complex contract document whose price changes many times, the price can be marked as a version segment, with only the additional price being stored for each change.
With Hiperspace, rules like “a Customer name is mandatory” are not enforced as constraints when an Element is created, but through two complementary features:
Hiperspace supports multiple versions of a data schema through Views that allow the structure of an entity to appear to be updated without altering the data. Hiperspace supports evolution in two ways:
Aspect extents that are not known by older code and are effectively stored as an addendum by new code
Adding an additional Aspect does not change the stored structure of an entity because it is stored as an addendum; older code is unaware that the aspect is available.
entity Instrument (…) {…} [RWA : Valuation];
If an Aspect needs a different implementation for different versions of the schema, views can be used:
entity Instrument #1 (…) {…};
becomes
view Instrument #2 {…, IPV};
entity InstrumentV1 = Instrument () #1 (…) {…} [IPV = retroPV(Price)];
entity InstrumentV2 = Instrument () #3 (…) {…} [IPV : Valuation];
Older code is unaware of the change, while newer code adds to InstrumentV2. Reading Instrument yields all entities that present the view.
Version evolution uses segments to contain each version. Because versioning is such a common case, there is built-in support for Versioned:
entity Instrument (…) {…} [TermSheet : Versioned<ByDate<TermSheet>>];
In this scenario TermSheet is a complex document that uses the Versioned<> segment (with a ByDate<> timestamp) and stores every version in a segment. The Value property always returns the latest, and Versions is a set of dated historical TermSheets.
As-at historical views are possible using horizon filters.
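The behaviour described above (a latest value, a set of dated versions, and an as-at view) can be sketched in a few lines of Python. The class below is a hypothetical analogue written for illustration; the method name `as_at` and the rule "latest version dated on or before the horizon" are assumptions, since the text does not specify how Hiperspace horizon filters are evaluated.

```python
from bisect import bisect_right
from datetime import date
from typing import Generic, List, Optional, Tuple, TypeVar

T = TypeVar("T")


class Versioned(Generic[T]):
    """Hypothetical analogue of a dated version segment: versions are only appended."""

    def __init__(self) -> None:
        self._dates: List[date] = []  # kept sorted ascending
        self._items: List[T] = []     # parallel to _dates

    def add(self, as_of: date, item: T) -> None:
        """Record a new version; earlier versions are left untouched."""
        i = bisect_right(self._dates, as_of)
        self._dates.insert(i, as_of)
        self._items.insert(i, item)

    @property
    def value(self) -> Optional[T]:
        """The latest version, or None if nothing has been stored."""
        return self._items[-1] if self._items else None

    @property
    def versions(self) -> List[Tuple[date, T]]:
        """All dated versions, oldest first."""
        return list(zip(self._dates, self._items))

    def as_at(self, horizon: date) -> Optional[T]:
        """Latest version dated on or before horizon (assumed filter rule)."""
        i = bisect_right(self._dates, horizon)
        return self._items[i - 1] if i else None


term_sheet: Versioned[str] = Versioned()
term_sheet.add(date(2024, 1, 10), "v1")
term_sheet.add(date(2024, 3, 1), "v2")
term_sheet.add(date(2024, 2, 1), "v1.1")  # late-arriving version is slotted in by date

latest = term_sheet.value
mid_february = term_sheet.as_at(date(2024, 2, 15))
before_any = term_sheet.as_at(date(2023, 12, 31))
```

Here `latest` is the March version, while the mid-February horizon sees only what was dated up to that point.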
The core runtime Hiperspace component is very small (20k), containing the minimal classes and interfaces that are needed to connect a domain model to an underlying storage driver. Hiperspace includes a utility driver (GenerationSpace) that chains together multiple read drivers with a single write driver, allowing different storage tiers to be used for historical data.
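The chaining idea can be sketched as a store that sends every write to one tier and lets reads fall through a list of older tiers. This Python sketch is illustrative only: `DictSpace` and `ChainedSpace` are hypothetical names, and the choice to consult the write tier first on reads is an assumption rather than documented GenerationSpace behaviour.

```python
from typing import Dict, Iterable, Optional


class DictSpace:
    """Hypothetical in-memory tier standing in for a real storage driver."""

    def __init__(self, initial: Optional[Dict[str, str]] = None) -> None:
        self.data: Dict[str, str] = dict(initial or {})

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def put(self, key: str, value: str) -> None:
        self.data[key] = value


class ChainedSpace:
    """Writes go to one tier; reads try that tier, then each read tier in order."""

    def __init__(self, write_tier: DictSpace, read_tiers: Iterable[DictSpace]) -> None:
        self.write_tier = write_tier
        self.read_tiers = list(read_tiers)

    def get(self, key: str) -> Optional[str]:
        for tier in [self.write_tier, *self.read_tiers]:
            value = tier.get(key)
            if value is not None:
                return value  # first tier holding the key wins
        return None

    def put(self, key: str, value: str) -> None:
        self.write_tier.put(key, value)  # historical tiers are never modified


last_year = DictSpace({"trade/1": "2023 snapshot"})
archive = DictSpace({"trade/0": "2022 snapshot", "trade/1": "stale copy"})
current = DictSpace()

space = ChainedSpace(current, [last_year, archive])
space.put("trade/2", "today")
```

With this arrangement the historical tiers can sit on cheaper or slower storage while new data lands in the fastest tier.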
All drivers inherit from the HiperSpace abstract base class to insulate domain models from the underlying technology.
RocksDB is a remarkable technology, derived from Google's LevelDB and optimized by Facebook for the lowest possible latency when writing to SSD devices. RocksDB uses a log-structured merge (LSM) tree to stream updates while maintaining fast key access. It is used both as a key/value database and as a storage engine for relational databases, message stores, blockchains and various analytical services. The use of LSM optimizes the performance and life of SSD devices. Hiperspace.Rocks uses RocksDB to store elements in durable SSD memory.
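To make the LSM idea concrete, the toy Python sketch below buffers writes in memory, flushes them as immutable sorted runs, and answers reads from the newest data first. It is a teaching model only: `TinyLSM` is a hypothetical class, it keeps everything in process memory, and it does not reflect RocksDB's actual API or on-disk format.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple


class TinyLSM:
    """Toy log-structured merge store: a memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, str] = {}
        self.runs: List[List[Tuple[str, str]]] = []  # oldest first, newest last
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Write the memtable out as one sorted run (a sequential write on real media)."""
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):    # newest run first
            i = bisect_left(run, (key,))   # binary search within the sorted run
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self) -> None:
        """Merge all runs into one, keeping the newest value for each key."""
        merged: Dict[str, str] = {}
        for run in self.runs:              # oldest first, so newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", "1")
db.put("b", "2")   # memtable is full: flushed as the first run
db.put("a", "3")   # newer value for "a"
db.put("c", "4")   # flushed as the second run
runs_before_compaction = len(db.runs)
newest_a = db.get("a")
db.compact()
```

Updates never modify an existing run, so writes stay sequential, and compaction later discards superseded values.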
The Heap driver provides the simplest hiperspace, storing objects in the managed process heap. It exists for testing purposes, and also for benchmarking the performance of other drivers. The Heap driver uses more memory, and is slower than the Rocks driver.
The Redis driver uses the shared in-memory caching technology provided as a service in Azure, AWS, GCP and other cloud platforms. For durable caching, Redis uses RocksDB to optimize performance of SSD devices; for this reason Redis should only be used for transient elements.
At the cutting edge of technology are “High Bandwidth Memory” and “Processing-in-Memory” that bring together
PIM is intended for the huge quantities of data needed for AI training, but can be applied to any data-processing problem that uses key/value access. While PIM is currently niche, it will provide a foundation for a future generation of high-performance databases. Hiperspace.PIM will provide a bridge from the high-level view of Hiperspace to emerging KV-SSD.
HiLang is a minimal high-level language to describe the schema of a domain, taking inspiration from protobuf (.proto models) for hierarchical structures and SQL DML for entities, relations and views.
Elements can have keys (…), values {…} and extensions […] of arbitrary complexity. An example shows the benefit of the language:
entity Customer = Node (SKey = Skey, Name = Name, NodeType = "Customer" ) #100
(
    Id : int #1
)
{
    Name : string #2,
    Address : Address #3,
    SIC : int #4,
    valid = Id = null || Name = null || Limit = null ? false : true #5
}
[
    Accounts : Account (Owner = self) #105,
    Orders : Order (Customer = self) #106
];
In this example Customer can be viewed as a Node, has an Id key, several data properties, a derived property for validation, and extension relationships to Account and Order. Address is either stored with Customer, or as a key-reference (depending on the declaration of Address). HiLang distinguishes between keys and values to enable code generation of serialization and lazy loading of referenced elements. Extensions appear as properties of Customer with lookup code behind.
"#2" indicates the proto number used to serialize Name to and from the store.
The Hilang language parser/validator/transform/generation is implemented in F# with Roslyn Source Generator integration for simple integration into development projects.
.NET developers (the primary target) can include hilang files within a standard C# project with a reference to the Hilang NuGet package, with any source errors highlighted directly within the development environment.
Compile-time generation enables older versions of code to reference the stored data, while newer versions use an updated (but compatible) schema.
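The compatibility described here follows from numbered fields: in the standard Protocol Buffers wire format each value is prefixed by a tag computed as `(field_number << 3) | wire_type`, so a reader can skip numbers it does not recognise. The Python sketch below hand-encodes two fields from the Customer example (Name as #2, SIC as #4) and decodes them with an "old" and a "new" reader. Whether Hiperspace lays out its stored bytes exactly this way is an assumption; the helper names are hypothetical, and the decoder handles only varint and UTF-8 string fields.

```python
from typing import Dict, Tuple


def encode_varint(n: int) -> bytes:
    """Base-128 varint from the Protocol Buffers wire format (non-negative n only)."""
    if n < 0:
        raise ValueError("this sketch only encodes non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def int_field(number: int, value: int) -> bytes:
    return encode_varint((number << 3) | 0) + encode_varint(value)  # wire type 0


def string_field(number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data  # wire type 2


def decode(buf: bytes, known: Dict[int, str]) -> Dict[str, object]:
    """Decode fields listed in known (number -> name); silently skip the rest."""
    out: Dict[str, object] = {}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 7
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")  # sketch: assume text
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        if number in known:
            out[known[number]] = value
    return out


record = string_field(2, "Acme") + int_field(4, 7372)   # Name #2, SIC #4
old_view = decode(record, {2: "Name"})                  # older code: SIC unknown
new_view = decode(record, {2: "Name", 4: "SIC"})        # newer code: both fields
```

The older reader still recovers `Name` from a record written by newer code, because the unknown field number is skipped rather than treated as an error.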
For targets other than .NET, hilangc is a command-line program that generates source code for {Java, C++, XMI, documentation}.