History & Delta
History
Hiperspace efficiently stores versions of entities, so the latest version always appears first in the store, avoiding the need to lock versions for update. `SubSpace` provides an `AsAt` parameter to view history at a point in time (including indexes).
Many databases implement temporal data with a master table and a history table: when a row is updated, the before image is copied to the history table before writing. This provides fast access to current records and much slower access to history (often because the history table does not have indexes). Historical queries are achieved by unioning the main table and the history table and applying the condition that the target date must fall between the valid-from and valid-to dates of the row. This works well when the vast majority of queries are for the current date, but poorly when history needs to be considered.
Hiperspace takes a different approach and stores versions inline with the current view. When stored, the `AsAt` timestamp is transformed from a tick count (`version.Ticks`; in .NET, the number of 100-nanosecond intervals since 1 January 0001) to "eternity - timestamp" (`ulong.MaxValue - (ulong)version.Ticks`), so that the most current version always appears first in the ordered index of the underlying key/value store.
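The effect of the inversion can be illustrated with a minimal sketch, written in Python rather than against the Hiperspace API. The helper names (`dotnet_ticks`, `version_key`) and the key layout (name bytes, a zero separator, then the inverted timestamp as 8 big-endian bytes) are assumptions made for illustration; only the "eternity - timestamp" step comes from the text above.

```python
from datetime import datetime

ULONG_MAX = 2**64 - 1        # stands in for ulong.MaxValue
EPOCH = datetime(1, 1, 1)    # origin of .NET DateTime.Ticks

def dotnet_ticks(moment: datetime) -> int:
    """100-nanosecond intervals since 0001-01-01, like DateTime.Ticks."""
    delta = moment - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10

def version_key(name: str, as_at: datetime) -> bytes:
    """Hypothetical key layout: name, separator, then 'eternity - timestamp'."""
    inverted = ULONG_MAX - dotnet_ticks(as_at)
    return name.encode("utf-8") + b"\x00" + inverted.to_bytes(8, "big")

# Store four versions of the same entity and let the byte order sort them.
keys = sorted(version_key("Adam", datetime(2024, 7, day)) for day in (20, 21, 22, 23))

# Decode the timestamps again, in stored order: the newest version comes first.
stored_order = [ULONG_MAX - int.from_bytes(k[-8:], "big") for k in keys]
```

Because the inverted value is written big-endian, plain byte-wise ordering of the keys is enough to place the latest version first; no separate "current" pointer has to be maintained.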
For a scan of all people, once a version is found that satisfies the `AsAt` timestamp, the index is searched for the next key higher than the current key value, so storing all versions together adds minimal overhead.
For historical queries (a `SubSpace` opened with an `AsAt` parameter), versions are visited until a valid value is found:
| Name | AsAt | Current Visit | History Visit |
|---|---|---|---|
| Adam | 23/07/24 | 1 | 1 |
| Adam | 22/07/24 | | 2 |
| Adam | 21/07/24 | | |
| Adam | 20/07/24 | | |
| Lucy | 23/07/24 | 2 | 3 |
| Lucy | 22/07/24 | | 4 |
| Lucy | 21/07/24 | | |
| Lucy | 20/07/24 | | |
| Mark | 23/07/24 | 3 | 5 |
| Mark | 22/07/24 | | 6 |
| Mark | 21/07/24 | | |
| Mark | 20/07/24 | | |
Without an `AsAt` parameter to `SubSpace`, three keys are visited, but for `AsAt` 22/07/24 six keys are visited.
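The visit counts above can be reproduced with a small simulation. This is a sketch under stated assumptions, not Hiperspace code: the store is modelled as a sorted Python list of `(name, inverted date)` tuples, `scan` and `MAX_ORDINAL` are hypothetical names, and the "search for the next key" step is modelled with a binary search.

```python
from bisect import bisect_right
from datetime import date

MAX_ORDINAL = date.max.toordinal()   # plays the role of "eternity"

def key(name, as_at):
    # Inverted date, so newer versions sort first within a name.
    return (name, MAX_ORDINAL - as_at.toordinal())

# Three people with four daily versions each, as in the table.
store = sorted(
    key(name, date(2024, 7, day))
    for name in ("Adam", "Lucy", "Mark")
    for day in (20, 21, 22, 23)
)

def scan(store, as_at=None):
    """Return (versions found, number of keys visited)."""
    found, visited, i = [], 0, 0
    # Without as_at every version qualifies, so the first key per name wins.
    limit = 0 if as_at is None else MAX_ORDINAL - as_at.toordinal()
    while i < len(store):
        name, inverted = store[i]
        visited += 1
        if inverted >= limit:
            # Version is at or before the requested time: accept it, then
            # seek past all remaining (older) versions of this name.
            found.append((name, date.fromordinal(MAX_ORDINAL - inverted)))
            i = bisect_right(store, (name, MAX_ORDINAL))
        else:
            i += 1   # version is too new; try the next older one
    return found, visited

current, current_visits = scan(store)
history, history_visits = scan(store, as_at=date(2024, 7, 22))
```

Here `current_visits` corresponds to the "Current Visit" column and `history_visits` to the "History Visit" column; older versions that follow an accepted one are never read.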
Contrast this with a conventional temporal table, which must union the current and history tables for the query and scan all values of history before applying the residual condition.
The other advantage of the Hiperspace approach is that there is no locking when updating a value, and no update to indexes. Even in the trivial Person example, Person has two alternate key indexes. A conventional temporal table would have:
- Lock Row
- Read Person
- Read MotherChild index
- Read FatherChild index
- Write Person (if changed)
- Write MotherChild index (if changed)
- Write FatherChild index (if changed)
- Write History
- Unlock row
For Hiperspace this would be a lock-less write:
- Read Person
- Write Person (if changed)
- Write MotherChild index (if changed)
- Write FatherChild index (if changed)
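The lock-less sequence can be sketched as a toy append-only store. Everything here is an illustrative assumption rather than the Hiperspace implementation: the class `AppendOnlyStore`, the function `save_person`, the key shapes, and the use of small integers as version numbers in place of real timestamps.

```python
from bisect import bisect_left, insort

ETERNITY = 2**64 - 1   # stands in for ulong.MaxValue

class AppendOnlyStore:
    """Toy ordered key/value store that only ever adds new keys."""

    def __init__(self):
        self.keys = []     # sorted (key, ETERNITY - version) pairs
        self.values = {}   # (key, ETERNITY - version) -> value

    def latest(self, key):
        # The newest version has the smallest inverted value, so it is first.
        i = bisect_left(self.keys, (key, 0))
        if i < len(self.keys) and self.keys[i][0] == key:
            return self.values[self.keys[i]]
        return None

    def append(self, key, version, value):
        full_key = (key, ETERNITY - version)
        insort(self.keys, full_key)
        self.values[full_key] = value

def save_person(store, person, version):
    """Read, then append only what changed. Returns the number of writes."""
    name = person["name"]
    previous = store.latest(("Person", name))            # Read Person
    if previous == person:
        return 0                                          # nothing changed
    store.append(("Person", name), version, person)      # Write Person
    writes = 1
    for index, parent in (("MotherChild", "mother"), ("FatherChild", "father")):
        if previous is None or previous[parent] != person[parent]:
            # Write the index entry only if the indexed value changed;
            # the old entry is left in place, never updated or locked.
            store.append((index, person[parent], name), version, name)
            writes += 1
    return writes

store = AppendOnlyStore()
first = save_person(store, {"name": "Cain", "mother": "Eve", "father": "Adam"}, 1)
same = save_person(store, {"name": "Cain", "mother": "Eve", "father": "Adam"}, 2)
changed = save_person(store, {"name": "Cain", "mother": "Eve", "father": "Seth"}, 3)
```

In this sketch the first save writes the person and both index entries, an unchanged save writes nothing, and changing only the father writes the person and one new index entry. Nothing already stored is modified, which is why no lock is needed.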
Additional space would be used because the historical indexes are not updated, but avoiding updates means that the rows are always appended, and the store does not need to keep space between rows for updates. Lock-less updates avoid the contention that slows down high-performance databases.
Because rows are not updated, GenerationSpace can be used to aggregate a number of historical Hiperspaces for current, daily-close, weekly-close, monthly-close and yearly-close, without the need to tombstone updated entries. SessionSpace can be used to efficiently roll up history if it is definitely not required.
Delta
Hiperspace `SubSpace` provides a `DeltaFrom` parameter that filters out all versioned objects that have not changed since the given time. This is intended to support continuous aggregation for OLAP, where only the delta from the timestamp is included in the view of the space.
`DeltaFrom` is supported by the `@DeltaIndex` property (which indexes objects by datetime stamp) and the `deltasum()`, `deltamax()`, `deltamin()` and `deltacount()` functions.
When an Entity / Segment / Aspect has the `@DeltaIndex` property, an additional DeltaIndex is created that contains the timestamp and Element key. This index is used whenever the `SubSpace` is opened with the `DeltaFrom` parameter.
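A rough model of how an index of (timestamp, key) pairs can answer a delta query is sketched below. The class `DeltaIndexedStore` and its methods are hypothetical, timestamps are plain integers, and the sketch assumes that "changed since" includes changes made exactly at the given timestamp; the actual Hiperspace behaviour may differ.

```python
from bisect import bisect_left, insort

class DeltaIndexedStore:
    """Toy store with an extra index ordered by (timestamp, key)."""

    def __init__(self):
        self.delta_index = []   # sorted (timestamp, key) pairs
        self.versions = {}      # key -> list of (timestamp, value)

    def put(self, key, timestamp, value):
        insort(self.delta_index, (timestamp, key))
        self.versions.setdefault(key, []).append((timestamp, value))

    def delta_from(self, since):
        """Latest value of every key that has a version at or after `since`."""
        # (since,) sorts before every (since, key), so this finds the first
        # index entry whose timestamp is >= since without a full scan.
        start = bisect_left(self.delta_index, (since,))
        changed = {key for _, key in self.delta_index[start:]}
        return {
            key: max(self.versions[key], key=lambda v: v[0])[1]
            for key in sorted(changed)
        }

space = DeltaIndexedStore()
space.put("A", 1, 10)
space.put("B", 2, 20)
space.put("A", 3, 11)
space.put("C", 4, 30)
delta = space.delta_from(3)   # "B" has not changed since 3, so it is left out
```

Only the tail of the index is read, so the cost of a delta query depends on how much has changed rather than on the total size of the space, which is the property continuous aggregation relies on.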