Thursday, August 26, 2010

InfoQ: Graph Databases, NOSQL and Neo4j

Not-only-SQL "databases" are, along with HTML5, the basis for the "next big thing" (or bubble) in technology...

While the first objective of these non-relational data systems is "horizontal scalability", that is, handling very large amounts of data on commodity systems, along the way they also change the way we think about data storage and APIs.

Currently there is a wide variety of NoSQL databases, much more diverse than SQL DBs. But they do share something: there is a "theory" behind distributed systems, similar to the relational theory behind SQL databases.

The CAP theorem can also help classify the various NoSQL databases...

  • Strong Consistency: all clients see the same version of the data, even on updates to the dataset - e.g. by means of the two-phase commit protocol (XA transactions), and ACID,
  • High Availability: all clients can always find at least one copy of the requested data, even if some of the machines in a cluster are down,
  • Partition-tolerance: the total system keeps its characteristic even when being deployed on different servers, transparent to the client.

    The CAP-Theorem postulates that only two of the three different aspects of scaling out can be achieved fully at the same time.
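
To make the "two out of three" idea concrete, here is a minimal Python sketch that enumerates the possible pairs of CAP properties and, for each pair, the property that gets sacrificed. The names `CAP_PROPERTIES`, `cap_pairs` and `sacrificed` are made up for this illustration and do not come from the InfoQ article or from any database's API.

```python
from itertools import combinations

# The three properties named in the list above.
CAP_PROPERTIES = ("Consistency", "Availability", "Partition-tolerance")


def cap_pairs():
    """Return every two-property combination of the three CAP properties."""
    return [frozenset(pair) for pair in combinations(CAP_PROPERTIES, 2)]


def sacrificed(pair):
    """Return the one property left out when `pair` is fully achieved."""
    (missing,) = set(CAP_PROPERTIES) - set(pair)
    return missing


if __name__ == "__main__":
    for pair in cap_pairs():
        print(" + ".join(sorted(pair)), "-> gives up", sacrificed(pair))
```

Running it lists three combinations, which is why CAP is handy as a rough classification scheme: a given data store can be placed in one of three buckets according to the property it compromises on.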