Consistency and Referential Integrity


When relational databases were introduced at the end of the 1970's a principal challenge was the maintenance of consistency between updates of summary values and the detailed rows that they are related to. An order row would typically include a Cost (Balance) column that reflected the sum of all related Costs from order lines. Consistency is maintained through transaction scope that would ensure:

  • Order and order_lines would be created all-or-nothing.
  • Update/Insert/Delete of order_lines would include transactional updates to the Order summary (commonly implemented with database triggers).
  • Updates to product.price would be applied to order_line and order in a transaction (again by triggers)

It quickly became apparent that referential integrity also needs to be provided within transactions to prevent:

  • Deletion of orders that did not also include deletion of order_lines.
  • Deletion of products that were referenced by order_lines.

Transactional integrity is the main reason that relational databases perform poorly in distributed scenarios and require separation of client and server in separate tiers and complex lock managers within the Servers.


Hiperspace takes a very different approach to consistency, that reflects the changing performance characteristics of modern computing:

  • Elements are immutable (but can be versioned)
  • Summaries can be calculated in real-time at the point of observation, and do not need to be stored
  • Invalid data can be hidden by horizon filters to provide a consistent view of data.
  • Distributed Transactions are rarely implemented with distributed locks but instead use Optimistic concurrency control (usually by a change-date column) that ensures that a row has not been updated between read and write of the update.

Hiperspace uses the immutability of elements to ensure that queries are always consistent to a point in time, and does not need transactions to ensure consistency.


Worked example

The hilang schema for Sales.Order, Sales.OrderLine, and Sales.Product below provides the domain model described above

entity Sales.Order : Versioned (Id : Int32)
  [Lines : Sales.OrderLine, Cost = sum(Lines.Cost)];
	
segment Sales.OrderLine : Versioned (Line : Int32) 
  { Product : Sales.Product,
    Units : Decimal, Cost = Units * Product.Price };
		
entity Sales.Product : Versioned (In : Int32) 
  {Name : String, 
   Price : Decimal }
  [OrderLines : Sales.OrderLine (Product = this) /* implicitly creates an index on OrderLine */];

with Horizon filter

public override Horizon[]? Horizon =>
[
    // hide deleted 
    new Horizon<Sales.Order>((order, context, user, read) => order.Deleted != read),
    new Horizon<Sales.OrderLine>((line, context, user, read) => line.Deleted != read),
    new Horizon<Sales.Product>((product, context, user, read) => product.Deleted != read),

    // consitency 
    new Horizon<Sales.Order>((order, context, user, read) => order.Deleted == false || (order.Lines.Count == 0 && order.Deleted)),
    new Horizon<Sales.OrderLine>((line, context, user, read) => line.owner != null && line.Product != null),
    new Horizon<Sales.Product>((product, context, user, read) => product.Deleted == false || (product.OrderLines.Count == 0 && product.Deleted )),
];

provides a consistent view of data and referential integrity without the need for transactional locking (provided you Bind Sales.OrderLine before Sales.Order):

  • Sales.Order, Sales.OrderLine, and Sales.Product are all versioned, and appear to remove values that have been deleted by setting the Deleted field to true.
  • Sales.OrderLine cannot be created (or versioned) if the referenced element is missing from the domain view.
  • Sales.Order cannot be deleted if there are any Sales.OrderLine elements in the domain.
  • Sales.Product cannot be deleted if there are any Sales.OrderLine elements in the domain.
  • Sales.OrderLine.Cost is calculated on-demand from the current Product
  • Sales.Order.Cost is calculated on-demand from sum of all Sales.OrderLine.Cost
  • Concurent update of Sales.OrderLine.Product and Sales.OrderLine.Unit by different sessions, is blocked because the second update does not have a AsWas value that is the same as the durable AsAt timestamp.

The hilang schema is equivalent to the SQL schema:

CREATE TABLE Orders
(
	Id INTEGER NOT NULL
);

CREATE TABLE Order_Lines
(
	Order_Id INTEGER NOT NULL,
	Line INTEGER NOT NULL,
	Units NUMERIC(5,3) NULL,
	Product_Id INTEGER NULL
);

CREATE TABLE Products
(
	Id INTEGER NOT NULL,
	Name VARCHAR(255) NULL,
	Price MONEY NULL
);

ALTER TABLE Order_Lines ADD CONSTRAINT PK_Order_Lines
	PRIMARY KEY (Order_Id,Line);

CREATE INDEX IXFK_Order_Lines_Products ON Order_Lines (Product_Id ASC);

ALTER TABLE Orders ADD CONSTRAINT PK_Orders
	PRIMARY KEY (Id);

ALTER TABLE Products ADD CONSTRAINT PK_Products
	PRIMARY KEY (Id);

ALTER TABLE Order_Lines ADD CONSTRAINT FK_Order_Lines_Orders
	FOREIGN KEY (Order_Id) REFERENCES Orders (Id) ON DELETE Set Null;

ALTER TABLE Order_Lines ADD CONSTRAINT FK_Order_Lines_Products
	FOREIGN KEY (Product_Id) REFERENCES Products (Id) ON DELETE Set Null;

CREATE VIEW V_Orders AS
  SELECT o.*, SUM (ol.Unit * p.Price) AS Cost
  FROM Orders AS o 
  INNER JOIN Order_Lines AS ol ON o.Id = ol.Order_Id
  INNER JOIN Products AS p ON ol.Product_Id = p.Id
  GROUP BY o.Id;

CREATE VIEW V_Order_Lines AS
  SELECT ol.*, ol.Unit * p.Price AS Cost
  FROM Order_Lines AS ol
  INNER JOIN Product AS p ON ol.Product_Id = p.Id;

Summary

By providing consistency without the need for transactions Hiperspace can provide a much higher throughput of business activity + the historical view can be viewed, by opening the domain space using the AsAt parameter, with fast access (because the index on OrderLine for Product is not changed with versions). Some Relational Databases can replicate Hiperspace history with temporal tables, but the schema is much more complex.

Hiperspace fast on-demand loading of references and of caching ensures that referential integrity can be enforced on the client.

Copyright © Cepheis 2024