Big Data - Use as a Predictor? Or Not?
Dec 22, 2015
I love reading articles like this http://www.nytimes.com/2014/04/07/opinion/eight-no-nine-problems-with-big-data.html?_r=0 because it both prompts positive thought (how to make it work) and at the same time reinforces the complexity of the environment we live in, especially when it comes to developing leading factors to improve safety and performance.
Most people are aware of the linear view of causality (if this, then that), and in a number of scenarios it is indeed applicable. However, the majority of incidents and accidents involve humans, and as a consequence I believe that this linear view is too simplistic.
My opinion is that causality should be considered more like an object that requires a critical mass of contributory factors before the adverse event happens. Unfortunately, we aren't particularly great at spotting where the little bits and pieces that make up that critical mass are coming from. Sure, we can spot the big ones, but you can get the same effect by adding lots and lots of little factors together and, as we know, humans are poor at spotting small changes.
Whilst Big Data might be able to provide some correlation between those small, almost indiscernible, factors coming together, two things are needed before we can use it to prevent future incidents from happening. The first is detail of the context, which means that detail needs to be captured somehow. The second is some way of validating the outputs to prove that the correlation reflects causality and did not simply line up by fluke. Detail takes time to capture and humans are naturally lazy. I am also not sure how you validate such a system without the feedback loop impacting behaviour: risk perception and acceptance are dynamic, because once we recognise something we change our behaviours to minimise the risk!
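To make the fluke problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the "factors" and the "outcome" are pure random noise, and the function names, the counts (200 factors, 30 observations) and the 0.36 cut-off are made up, with the cut-off chosen as roughly the conventional 5% significance level for 30 observations. It does not model any real safety data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_fluke_correlations(n_factors=200, n_observations=30,
                             threshold=0.36, seed=1):
    """Count random factors that 'correlate' with a random outcome.

    Hypothetical setup: nothing here is causally linked to anything else,
    so every factor that clears the threshold is a fluke by construction.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_observations)]
    hits = 0
    for _ in range(n_factors):
        factor = [rng.gauss(0, 1) for _ in range(n_observations)]
        if abs(pearson(factor, outcome)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    print(count_fluke_correlations(), "of 200 unrelated factors look correlated")
```

The more small factors you screen, the more of these chance matches you should expect, which is why the outputs need validating against fresh data before anyone acts on them.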
There was a recent podcast from Todd Conklin with a guest who was using Big Data to determine the prescribing habits of physicians in the US, and there was a suggestion that it might be possible to do the same in safety and performance. My main criticism of something like this is that a doctor has a bounded set of options in both diseases and drugs, and therefore over time that single doctor's data and behaviour can be assessed. Now consider multiple individuals from different cultural backgrounds, with different perceptions of risk and different behaviours, in dynamic environments. How many possible permutations are there in which to (a) determine correlation and then, more importantly, (b) what are the chances that those same parameters will be encountered again?
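A rough bit of arithmetic shows how quickly those permutations grow. The sketch below is illustrative only: the factor names and the number of levels for each are hypothetical, deliberately small, and not taken from the podcast or from any dataset. It also assumes every combination is equally likely, which real worksites will not honour.

```python
import math


def context_count(levels_per_factor):
    """Number of distinct combinations when each factor takes one of its levels."""
    return math.prod(levels_per_factor.values())


# Hypothetical bounded case: a fixed list of diagnoses and drugs.
prescribing = {"diagnosis": 20, "drug": 15}

# Hypothetical worksite: more factors, each with only a handful of levels.
worksite = {
    "cultural_background": 10,
    "risk_perception": 5,
    "experience_level": 5,
    "task_type": 20,
    "weather": 6,
    "fatigue": 4,
    "time_pressure": 3,
    "team_makeup": 10,
}

if __name__ == "__main__":
    for name, factors in (("prescribing", prescribing), ("worksite", worksite)):
        total = context_count(factors)
        print(f"{name}: {total:,} combinations; a given one recurs "
              f"about 1 time in {total:,} if all are equally likely")
```

With these made-up numbers the bounded case has 300 combinations and the worksite has 3,600,000, and that is before allowing for the fact that people and environments change over time.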
I am sure that Big Data will be able to help predict leading factors at some point, but I also believe significant improvements are needed before we can use it as a predictive tool for human behaviour and incident reduction. Supervisors can observe a scene and see behaviours, then relate those observations to their experience, and subsequently have a good idea of what is going to happen. However, that is most likely because they have had years of watching people. How do you teach such experience to an AI system when these observations are not written down?!