Saturday, May 31, 2014

2014 Internet Trends

A lot of interesting data and observations...

slides: 2014 Internet Trends — Kleiner Perkins Caufield Byers: by Mary Meeker

ideas: dual touch phones!

Brilliant design sketches for interactivity using the back of the phone as a touch-sensitive input device.
If there is a patent, it could make somebody rich...
Otherwise, it will make the world a better place!

Super Fun Happy Time Blog -:

double tap on touch pad

game played with double touch phone

Free Data Mining Books

12 Free (as in beer) Data Mining Books | christonard:

Each of these is free-as-in-beer, which means you can download the complete version without expectation for anything in return. I think most of them are available for purchase as well, if you prefer a hard copy. Some of them include code samples in R, Python or MATLAB.

Bitcoin: $300 M / day and growing

Bitcoin Set to Overtake eBay's PayPal in Transaction Volumes:

"bitcoin is fast establishing itself as the currency to use globally and instantly to make purchases or payments over the internet recording nearly $300m (£178m, €220m) daily in transactions. 

 "Whenever you have an instrument that trades over 300 million US dollars a day, it must be recognized," Peter Tasca, CEO of Laureate Trust, said in a statement. 

 "The digital currency works, Bitcoin has greater volume transactions than Western Union and we anticipate it will overtake PayPal later this year."

PayPal processes payments totalling $315.3m every day, according to Statistic Brain.


distributed database: cockroachdb

cockroachdb/cockroach · GitHub:

"Cockroach is a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services."

book: DevOps: A Software Architect’s Perspective

An interesting way to develop a book and receive feedback while in progress.
Pages shared as slides (images) @ SlideShare

DevOps: A Software Architect’s Perspective:
The book will later be published through a publisher.

ideas: Write-Only Database: Datomic

Rich Hickey, creator of the Clojure programming language, has also designed a matching database. Datomic has some quite special properties, based on principles of simplicity. By the way, he also gave a presentation explaining the difference between "simple" and "easy".

Datomic Content on InfoQ
Datomic is based on facts called "datoms" (like "data atoms" :), each consisting of:
  • Identity of the Entity
  • Attribute / property
  • Value (could be simple or an array also)
  • Transaction, that defines time and may also include creator etc.
  • (plus "Add/Retract" flag?)
All data in Datomic are represented as a tree of such values.
The only "schema" are definitions of Attributes.
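As a rough illustration of the datom shape listed above, here is a minimal Python sketch. It is not Datomic's API: the `Datom` class, the field names, the `attributes_of` helper and the attribute strings are all hypothetical, chosen only to mirror the five parts of a fact.

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    """Hypothetical stand-in for one fact; field names are assumptions."""
    entity: int          # identity of the entity
    attribute: str       # attribute / property name
    value: Any           # the value asserted
    tx: int              # transaction id, which also orders facts in time
    added: bool = True   # True = assertion, False = retraction


# Two facts about the same entity, recorded in the same transaction.
facts = [
    Datom(1, "person/name", "Alice", tx=1000),
    Datom(1, "person/address", "12 Oak St", tx=1000),
]


def attributes_of(entity: int, datoms: list) -> dict:
    """Collect the asserted attribute values of one entity."""
    return {d.attribute: d.value
            for d in datoms
            if d.entity == entity and d.added}


print(attributes_of(1, facts))
```

The entity "record" is not stored anywhere; it is just the set of facts that share the same entity identity.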

The crucial distinction from most other information systems is the requirement to include time (in the form of a transaction).
That is based on the observation that no piece of information is complete without defining when it was added to the system.
In this case that is crucial, since "changes" are in fact new assignments of values to the same Identity and Attribute, just at a different time.

For example, the field "Address" can hold one address at one time, and some time later, when the person moves, a new value can be assigned. The old value stays in the system, which preserves consistency and enables easy historical queries against a snapshot of the database state at any previous time.
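The address example can be sketched in a few lines of Python. This is a simplified illustration, not Datomic's query API: the `Datom` tuple, the `value_as_of` helper and the sample addresses are made up, and the sketch assumes that a larger transaction number means a later time and that retractions are ignored.

```python
from typing import Any, NamedTuple, Optional


class Datom(NamedTuple):
    """Hypothetical fact tuple; a larger tx is assumed to mean a later time."""
    entity: int
    attribute: str
    value: Any
    tx: int


# An append-only log: the move is recorded as a new fact, nothing is overwritten.
log = [
    Datom(1, "address", "12 Oak St", tx=1),
    Datom(1, "address", "7 Elm Ave", tx=5),
]


def value_as_of(log: list, entity: int, attribute: str, tx: int) -> Optional[Any]:
    """Return the latest value recorded at or before transaction tx."""
    candidates = [d for d in log
                  if d.entity == entity and d.attribute == attribute and d.tx <= tx]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d.tx).value


print(value_as_of(log, 1, "address", tx=3))  # the address before the move
print(value_as_of(log, 1, "address", tx=5))  # the address after the move
```

Because old facts are never deleted, asking about any earlier point in time is the same kind of lookup as asking about the present.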

Based on the described "information architecture", the system separates out the "write" function, which is transactional and very scalable, since it does not do anything else. Reading from the system is handled by separate "readers", which are again simple and can be added as needed to handle the load.
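A toy, in-process Python sketch of that split between one writer and many readers follows. It does not reflect how Datomic is actually implemented or deployed: the `Writer` and `Reader` classes and their methods are hypothetical, and the only point is that readers work on immutable snapshots and never need to coordinate with the writer.

```python
class Writer:
    """Single writer: its only job is to append facts under a new transaction id."""

    def __init__(self):
        self._log = ()       # immutable tuple of (entity, attribute, value, tx)
        self._next_tx = 1

    def transact(self, facts):
        """Append (entity, attribute, value) facts; return the transaction id."""
        tx = self._next_tx
        self._log = self._log + tuple((e, a, v, tx) for e, a, v in facts)
        self._next_tx += 1
        return tx

    def snapshot(self):
        """Hand out the current log; tuples cannot be mutated by readers."""
        return self._log


class Reader:
    """Reader bound to one snapshot; any number of these can be created."""

    def __init__(self, snapshot):
        self._snapshot = snapshot

    def lookup(self, entity, attribute):
        """Return the most recent value of the attribute in this snapshot."""
        matches = [(tx, v) for (e, a, v, tx) in self._snapshot
                   if e == entity and a == attribute]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0])[1]


writer = Writer()
writer.transact([(1, "address", "12 Oak St")])
old_reader = Reader(writer.snapshot())

writer.transact([(1, "address", "7 Elm Ave")])
new_reader = Reader(writer.snapshot())

print(old_reader.lookup(1, "address"))  # still sees the earlier state
print(new_reader.lookup(1, "address"))  # sees the later state
```

A reader created before the second write keeps answering from its own snapshot, which is why more readers can be added without slowing the writer down.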

There are many other nice properties, such as the ability to represent queries / programs as data (based on Clojure's Lisp heritage), and the ability to match various data models, including:
  • relational
  • object oriented
  • key-values
  • multi-valued (array) properties
Datalog is a truly declarative logic programming language that syntactically is a subset of Prolog. It is often used as a query language for deductive databases.
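To make "queries as data" a bit more concrete, here is a small Python sketch of Datalog-style pattern matching over (entity, attribute, value) facts. It is only an illustration of the idea, not Datomic's query engine or syntax: the `match` and `query` functions, the `?name` variable convention and the sample facts are all assumptions.

```python
def match(pattern, fact, bindings):
    """Try to unify one pattern with one fact; return new bindings or None."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):   # a logic variable
            if p in new and new[p] != f:
                return None                            # conflicts with earlier binding
            new[p] = f
        elif p != f:                                   # a constant that must match
            return None
    return new


def query(where, facts):
    """Join all patterns in `where`; return one bindings dict per solution."""
    results = [{}]
    for pattern in where:
        next_results = []
        for bindings in results:
            for fact in facts:
                extended = match(pattern, fact, bindings)
                if extended is not None:
                    next_results.append(extended)
        results = next_results
    return results


facts = [
    (1, "name", "Alice"), (1, "city", "Paris"),
    (2, "name", "Bob"),   (2, "city", "Oslo"),
]

# The query itself is plain data: a list of patterns sharing the variable ?e.
where = [("?e", "city", "Paris"), ("?e", "name", "?name")]

print(query(where, facts))
```

Since the query is an ordinary list of tuples, a program can build, inspect or combine queries the same way it handles any other data.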

The Architecture of Datomic: @ InfoQ
"This representation has obvious similarities to the Subject/Predicate/Object data model of RDF statements. However, without a temporal notion or proper representation of retraction, RDF statements are insufficient for representing historical information. Being oriented toward business information systems, Datomic adopts the closed-world assumption, avoiding the challenges of universal naming, open-world, shared semantics etc of the semantic web. A Datom is a minimal and sufficient representation of a fact."

Note: I think that "add/remove" is not necessary.


The Impedance Mismatch is Our Fault @ InfoQ
Stuart Dabbs Halloway explains what the impedance mismatch is and what can be done to solve it in the context of RDBMS, OOP, and NoSQL.

Deconstructing the Database @ InfoQ

Datomic - Home

download Datomic (free)
While the technology strives to be simple, the versions are a bit "complected",
presumably to help the business...

Datomic @ YouTube

Datomic is an almost perfect data management platform, with one significant shortcoming: it does not come with a data synchronization tool for distributed environments, or with identity resolution. That is for a good reason: to keep it simple.

It is likely possible to combine the platform with Semantic Web-like, ontology-based brokers.

The key issue, in my opinion, is that such a nice system is not separated from its "abstract" information model. It would be nice to have a simple specification, such as HTML/HTTP is for the WWW, and then prominent implementation(s)...