Saturday, February 15, 2014

Graph Databases & Object people vs AI people

This post goes way back in time. I was working with a client in Seattle, and had just "got the OO bug". I was seeing everything in terms of objects, reasoning from the intension, and delighted that every piece of software I created was a member of a "Class", so that all I needed to know was which class something belonged to and I knew what it could "do". It seemed so orderly, so logical and, at some gut level, so wrong. It didn't fit with my mental model of the world, but I couldn't really get a handle on why. Until I met John H.
Now John came from the artificial intelligence world, and he had a tendency to reason from the extension - i.e. the instances of things - rather than from what "Class" they belonged to. I railed against this, and against all the references to frames and other terms that were important to the AI practitioners. There I was, trying to learn a different approach, and now someone was suggesting that it might not be right either (whatever "right" means). To him, inheritance wasn't so much something that happened at the class level; it was closer to what I would have called instantiation. In other words, an instance could be created from some other instance (perhaps), or just kind of arbitrarily. It all felt so foreign.
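To make the contrast concrete, here is a minimal sketch of that instance-first way of thinking, written in Python purely for illustration. It is not a reconstruction of anything John actually built; the `Instance` type, the `prototype` link and the slot names are all hypothetical. Each thing is just a bag of slots, optionally derived from another instance, and lookup falls back to whatever it was derived from.

```python
class Instance:
    """A hypothetical frame-like thing: named slots plus an optional
    link to the instance it was derived from. No class hierarchy."""

    def __init__(self, name, prototype=None, **slots):
        self.name = name
        self.prototype = prototype  # another Instance, or None
        self.slots = dict(slots)

    def get(self, slot):
        # Look in our own slots first, then walk back through the
        # chain of instances we were derived from.
        node = self
        while node is not None:
            if slot in node.slots:
                return node.slots[slot]
            node = node.prototype
        raise KeyError(slot)


# Illustrative data only: pingu is derived from tweety, not from a "Bird" class.
tweety = Instance("tweety", can_fly=True, legs=2)
pingu = Instance("pingu", prototype=tweety, can_fly=False)
```

With this sketch, `pingu.get("legs")` falls through to `tweety` and `pingu.get("can_fly")` uses the overriding slot on `pingu` itself, which is roughly the "inheritance as instantiation" idea that felt so foreign.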
Also, at the time I taught a lot of data modeling, and was somewhat dissatisfied with the state of that art as well, because I couldn't specify the constraints well in the data models. (Aside: I must have been no fun to hang out with - everything that I was learning was unsatisfactory.) It became too easy to overly generalize or overly specialize a model without really having a good idea what was going on. And then there was time. How do we do data modeling taking time into account? A really hard problem in some cases. Modeling roles was tricky too, because we like inheritance and hierarchies as organizing principles, but with roles those principles become a whole lot harder to apply.
Peter Chen's E/R modeling notation started to help me, because it properly separated implementation details (keys, and foreign keys especially) from the necessary business concepts. (Professor Chen was one of the sponsors of my "Expert in Field" visa application in 1984.) Relationships became first class (and could be reified), so that properties belonging to a relationship could be described. We could easily see where the value production was (matched to the creation of associative entities) versus the cost propositions (managing the static, or what is now called "Master", data).
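A small sketch may help show what a reified relationship looks like. It is in Python for illustration only, and the entity names (`Customer`, `Product`, `OrderLine`) and their fields are assumptions of mine, not taken from any real model. The point is that the relationship between a customer and a product becomes an associative entity of its own, and the quantity, date and price belong to that relationship rather than to either end.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Customer:
    """Relatively static "Master" data (hypothetical fields)."""
    customer_id: str
    name: str


@dataclass(frozen=True)
class Product:
    """Relatively static "Master" data (hypothetical fields)."""
    sku: str
    description: str


@dataclass(frozen=True)
class OrderLine:
    """Associative entity: the reified Customer-Product relationship.
    Its properties describe the relationship, not either participant."""
    customer: Customer
    product: Product
    ordered_on: date
    quantity: int
    unit_price_cents: int

    def total_cents(self) -> int:
        # The business value lives on the relationship itself.
        return self.quantity * self.unit_price_cents


# Illustrative data only.
acme = Customer("C1", "Acme Ltd")
widget = Product("W-100", "Widget")
line = OrderLine(acme, widget, date(2014, 2, 15), quantity=3, unit_price_cents=1250)
```

Note that there are no keys or foreign keys in sight: `OrderLine` simply refers to its two participants, which is the separation of business concept from implementation detail that the E/R notation encourages.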
All of this mental discomfort and anguish finally came together this week when I started looking at Graph Databases (and specifically Neo4J).
Suddenly the ideas of E/R from Dr. Chen, the navigational simplicity of walking a graph, and the "multiple typing" of data elements started to come together into a cohesive whole. Not so much for managing the data of record, but for providing a safe place where it could be analysed, traversed and understood.
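Here is a rough sketch of how those three ideas fit together, using a tiny in-memory graph in plain Python. To be clear, this is not Neo4J's API; the `Graph` class, the labels and the relationship types are all made up for illustration. Nodes can carry several labels at once (the "multiple typing"), relationships carry their own properties (as in E/R), and answering a question is just a matter of walking edges.

```python
from collections import deque


class Graph:
    """A toy property graph, for illustration only."""

    def __init__(self):
        self.labels = {}  # node id -> set of labels
        self.edges = {}   # node id -> list of (rel_type, target, props)

    def add_node(self, node_id, *labels):
        # A node may have any number of labels: no single class hierarchy.
        self.labels.setdefault(node_id, set()).update(labels)

    def add_edge(self, source, rel_type, target, **props):
        # Relationships are first class and carry their own properties.
        self.edges.setdefault(source, []).append((rel_type, target, props))

    def reachable(self, start, rel_type):
        """Breadth-first walk from start, following only rel_type edges.
        Returns the nodes found, in the order they were discovered."""
        seen = {start}
        order = []
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for kind, target, _props in self.edges.get(node, []):
                if kind == rel_type and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


# Illustrative data only.
g = Graph()
g.add_node("alice", "Person", "Employee", "Customer")
g.add_node("bob", "Person", "Employee")
g.add_node("carol", "Person")
g.add_edge("alice", "REPORTS_TO", "bob", since=2012)
g.add_edge("bob", "REPORTS_TO", "carol", since=2010)
g.add_edge("alice", "BOUGHT_FROM", "carol", amount=40)

chain = g.reachable("alice", "REPORTS_TO")
```

In this sketch `alice` is a Person, an Employee and a Customer all at once, which is exactly the kind of role modeling that strains an inheritance hierarchy, and `chain` holds her reporting line found by traversal rather than by joins.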
So I am definitely feeling better about the data that I have to manage and worry about. Clojure + Neo4J, here I come...
Oh, and it is time to dust off this gem too: Logic, Algebra and Databases by Peter Gray, published in 1984. Maybe I will finally understand it!
