Saturday, May 31, 2014

ideas: Write-Only Database: Datomic

Rich Hickey, creator of the Clojure programming language, has also designed a matching database. Datomic has some quite special properties, based on principles of simplicity. By the way, he also gave a special presentation explaining the difference between "simple" and "easy".

Datomic Content on InfoQ
Datomic is based on facts called "datoms" (like "data atom" :), each consisting of:
  • Identity of the Entity
  • Attribute / property
  • Value (could be simple or an array also)
  • Transaction, that defines time and may also include creator etc.
  • (plus "Add/Retract" flag?)
All data in Datomic are represented as a tree of such values.
The only "schema" are definitions of Attributes.
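To make the shape of a datom concrete, here is a minimal Python sketch. It is not Datomic's API: the `Datom` class, its field names, and the sample attribute names are assumptions chosen only to mirror the list above.

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    """Illustrative stand-in for a datom; not Datomic's actual API."""
    entity: int          # identity of the entity
    attribute: str       # attribute / property name
    value: Any           # the asserted value
    tx: int              # transaction id, which also orders facts in time
    added: bool = True   # True = add (assert), False = retract


# Two facts about the same hypothetical entity 42, asserted in transaction 1000.
facts = [
    Datom(42, "person/name", "Alice", 1000),
    Datom(42, "person/address", "1 Main St", 1000),
]

for fact in facts:
    print(fact)
```

In this sketch an entity is simply whatever set of facts shares the same `entity` id; there is no separate record or row.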

The crucial distinction from most other information systems is the requirement to include time (in the form of a transaction).
That is based on the observation that no information is complete without stating when it was added to the system.
Here that is crucial, since "changes" are in fact new assignments of values to the same Identity and Attribute, just at a different time.

For example, the field "Address" can hold one address at one time, and later, when the person moves, a new value can be assigned. The old value stays in the system, which preserves consistency and enables easy historical queries based on a snapshot of the database state at any earlier time.
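The address example can be sketched as an append-only log plus an "as of" lookup. This is only an illustration under assumptions: the `Datom` tuple, the `value_as_of` helper, and the transaction numbers are hypothetical, and the lookup handles single-valued attributes only.

```python
from typing import Any, NamedTuple, Optional


class Datom(NamedTuple):
    entity: int
    attribute: str
    value: Any
    tx: int       # transaction id, used here as the time axis
    added: bool   # True = add, False = retract


# Append-only log: the move is recorded as new facts, nothing is overwritten.
log = [
    Datom(42, "person/address", "1 Main St", 1000, True),
    Datom(42, "person/address", "1 Main St", 2000, False),  # retract old value
    Datom(42, "person/address", "9 Oak Ave", 2000, True),   # assert new value
]


def value_as_of(log, entity, attribute, tx) -> Optional[Any]:
    """Return the value of a single-valued attribute as of transaction `tx`."""
    current = None
    # sorted() is stable, so facts within one transaction keep their log order.
    for d in sorted(log, key=lambda d: d.tx):
        if d.tx > tx or (d.entity, d.attribute) != (entity, attribute):
            continue
        if d.added:
            current = d.value
        elif current == d.value:
            current = None
    return current


print(value_as_of(log, 42, "person/address", 1500))
print(value_as_of(log, 42, "person/address", 2000))
```

Asking for the address as of transaction 1500 sees only the first fact, while asking as of 2000 sees the retraction and the new assertion, so both the old and the new state remain queryable.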

Based on this "information architecture", the system separates out the "write" function, which is transactional and very scalable since it does nothing else. Reads are served by separate "readers", which are again simple and can be added as needed to handle the load.

There are many other nice properties, such as the ability to represent queries / programs as data (based on Clojure's Lisp heritage), and a nice ability to match various data models, including
  • relational
  • object oriented
  • key-values
  • multi-value (arrays) properties
Datomic queries are written in a dialect of Datalog. Datalog is a truly declarative logic programming language that is syntactically a subset of Prolog. It is often used as a query language for deductive databases.
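The "queries as data" idea can be illustrated with a toy pattern matcher in Python. This is a sketch under assumptions, not Datalog proper and not Datomic's query engine: it supports only conjunctions of (entity, attribute, value) patterns, with no rules or recursion, and the `match` and `query` helpers and the sample facts are hypothetical.

```python
def match(pattern, fact, bindings):
    """Try to extend `bindings` so `pattern` matches `fact`; None on failure."""
    result = dict(bindings)
    for term, actual in zip(pattern, fact):
        if isinstance(term, str) and term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, actual) != actual:
                return None
        elif term != actual:
            return None
    return result


def query(patterns, facts):
    """Return every variable binding that satisfies all patterns."""
    results = [{}]
    for pattern in patterns:
        results = [
            extended
            for bindings in results
            for fact in facts
            if (extended := match(pattern, fact, bindings)) is not None
        ]
    return results


facts = [
    (1, "person/name", "Alice"),
    (1, "person/city", "Oslo"),
    (2, "person/name", "Bob"),
    (2, "person/city", "Rome"),
]

# The query itself is plain data: a list of patterns with "?" variables.
who_lives_in_oslo = [("?e", "person/city", "Oslo"), ("?e", "person/name", "?name")]
names = [b["?name"] for b in query(who_lives_in_oslo, facts)]
print(names)
```

Because the query is an ordinary list of tuples, a program can build, inspect, or transform it like any other value, which is the property the Lisp heritage makes natural.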

The Architecture of Datomic: @ InfoQ
"This representation has obvious similarities to the Subject/Predicate/Object data model of RDF statements. However, without a temporal notion or proper representation of retraction, RDF statements are insufficient for representing historical information. Being oriented toward business information systems, Datomic adopts the closed-world assumption, avoiding the challenges of universal naming, open-world, shared semantics etc of the semantic web. A Datom is a minimal and sufficient representation of a fact."

Note: I think that "add/remove" is not necessary.


The Impedance Mismatch is Our Fault @ InfoQ
Stuart Dabbs Halloway explains what the impedance mismatch is and what can be done to solve it in the context of RDBMS, OOP, and NoSQL.

Deconstructing the Database @ InfoQ

Datomic - Home

download Datomic (free)
While the technology strives to be simple, the versions are a bit "complected", presumably to help the business...

Datomic @ YouTube

Datomic is an almost perfect data management platform, with one significant shortcoming: it does not come with a data synchronization tool for distributed environments, or with identity resolution. That is for a good reason: to keep it simple.

It is likely possible to combine the platform with Semantic Web-like, ontology-based brokers.

The key issue, in my opinion, is that such a nice system is not separated from its "abstract" information model. It would be nice to have a simple specification, such as HTML/HTTP is for the WWW, and then prominent implementation(s)...
