

My two cents on anomaly detection, AI neural networks and the future of all

“All models are wrong but some of them are useful” (George Box).

Data science is just a new name for some old math with new computing power. Yet the hype is not totally unjustified.

The market

On the supply side: cheap computing power, open-source software, the availability of huge data sets and accessible knowledge are making it a whole new ball game.

On the demand side of the curve: organizations are dealing with growing amounts of data as machine instrumentation grows, IoT becomes established, and ecosystems emerge that generate revenue directly from data collection and analysis (Google, Facebook). Data-driven decision making has a well-established track record, yet the immense quantity, diversity and rate of data aggregation create a growing need for solutions.

The end result is a “Blue Ocean” that should restructure the information revolution into the knowledge revolution. There is simply no choice; the market dynamics are pulling the equilibrium to the right.

The technological gap

Data analysts have customarily been the ones who bridged the gap between data availability and the demand for meaning. Data analysis can simply be described as giving meaning to the data. Meaning is our model of reality, and the data analyst’s job is to choose the model / hypothesis wisely and test it against the available data / evidence. If the model can’t be rejected, then there is a good “probability” that the model is useful.
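
To make that loop concrete, here is a minimal sketch of testing a hypothesis against evidence; the toy dataset and the choice of a one-sample t-test are illustrative assumptions, not a recommendation of a particular test.

```python
# Pick a model ("the true mean is 0"), then test it against the evidence.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
evidence = rng.normal(loc=0.2, scale=1.0, size=200)  # stand-in for real measurements

# A one-sample t-test asks whether the evidence is compatible with the model.
t_stat, p_value = stats.ttest_1samp(evidence, popmean=0.0)

if p_value < 0.05:
    print(f"p={p_value:.3f}: the model is rejected, choose another hypothesis")
else:
    print(f"p={p_value:.3f}: the model survives, so it may well be useful")
```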

(Figure: demand and supply graph)

However, data analysts are human (well, most of the time, since this profession has a tendency to attract the tails of the normal distribution). And humans have limited processing ability. More importantly, they don’t scale well, as demonstrated by the “Mythical Man-Month” paradigm as well as a quick look at the efficiency of large organizations. One can’t really scale reasoning without getting stuck in the Condorcet paradox or Arrow’s impossibility theorem.

So bridging the gap is a question of replacing the human intellect used in problem solving. Most of the talk now is led by aggressive pessimists like Elon Musk or Stephen Hawking, top dogs by nature, for whom the idea of Skynet controlling their existence is intimidating. But for most of us, replacing an incompetent or corrupt government with an unbiased, predictable entity, and getting UBI while watching reality shows instead of sweating our lives away in the job trenches, is a promising future.

Automating the process

Currently, data science is composed of various algorithms that are useful in different situations. Analysts first create the algorithms, usually in academia. Then they choose which algorithm to apply in each context, which is more of an engineering problem, and they implement it on actual data, including dealing with messy and missing values. Lastly, they review the results and revise the model accordingly until they are satisfied with the delivery. This is no different from “regular science”, but can we implement this process, less the scientist?
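
Before answering, here is a rough sketch of that manual loop; the toy dataset, the mean-imputation step and the logistic-regression choice are illustrative assumptions, not a prescribed stack.

```python
# The analyst's loop: pick an algorithm, clean the data, fit, review, repeat.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 5))
y = (X[:, 0] > 0).astype(int)              # the target, known before the data got messy
X[rng.random(X.shape) < 0.05] = np.nan     # messy data: 5% of the values are missing

# Engineering choices made by a human: how to impute, which model to fit.
pipeline = make_pipeline(SimpleImputer(strategy="mean"), LogisticRegression())

# Review step: if the score disappoints, go back and change the model.
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.2f}")
```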

It turns out, we can.

Let’s start with the data, because in the beginning there was only data. Extracting features from data (feature engineering) can be done nicely with deep learning. This means that extracting concepts from the data, including the concept of anomaly, can be automated and made data-driven using unsupervised learning (you can’t do it supervised, because then you would have to keep a human in the process). Google’s AlphaGo implements self-training algorithms that enable it to learn from itself. This reinforcement learning is not new, but since one can now automatically build features of features, the ability for abstraction is born: from the picture you recognize the lines, from the lines you compose the face, and from the relations between the facial features you can map the smile and give it a positive value function.
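
As a minimal sketch of unsupervised feature extraction, assuming a Keras/TensorFlow setup: an autoencoder learns a compressed representation of unlabeled data, and the learned codes are the automatically engineered features. The toy data and layer sizes here are illustrative.

```python
import numpy as np
from tensorflow.keras import layers, models

# Toy unlabeled data: 1000 samples with 20 raw dimensions each.
x = np.random.rand(1000, 20).astype("float32")

# Encoder compresses 20 inputs into 4 learned features; the decoder reconstructs them.
inputs = layers.Input(shape=(20,))
code = layers.Dense(8, activation="relu")(inputs)
code = layers.Dense(4, activation="relu")(code)
out = layers.Dense(8, activation="relu")(code)
out = layers.Dense(20, activation="sigmoid")(out)

autoencoder = models.Model(inputs, out)
encoder = models.Model(inputs, code)

# No labels anywhere: the network is trained to reproduce its own input.
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x, x, epochs=10, batch_size=32, verbose=0)

features = encoder.predict(x)
print(features.shape)  # (1000, 4): the automatically engineered features
```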

Anomaly detection is literally the weird kid on the block. Anomaly detection (also called outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or to other items in a data set. This means that once your model is tuned, whatever is left outside of it is an anomaly; and if the number of anomalies has statistical significance, then you may have to revise your model.
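
A minimal sketch of that idea, assuming the simplest possible model (a sample mean and standard deviation) and an arbitrary 3-sigma cut-off: whatever the tuned model cannot account for gets flagged.

```python
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=1000)   # data that fits the expected pattern
outliers = rng.normal(loc=8.0, scale=0.5, size=5)    # a few points that do not
data = np.concatenate([normal, outliers])

# "Tune the model": here the whole model is a mean and a standard deviation.
mu, sigma = data.mean(), data.std()

# Anything the model places far outside its expected pattern is an anomaly.
z_scores = np.abs(data - mu) / sigma
anomalies = data[z_scores > 3]

print(f"{anomalies.size} anomalies out of {data.size} points")
# If that count turns out to be statistically significant, revise the model.
```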

But where is the model George Box talked about? The model is the neural network and the weights learned on its edges. This means that by dropping the symbolic logic of the “model” and reshaping the network weights through back-propagation, you get to restructure the model itself. Again, this is not novel; back-propagation was popularized in 1986 by Rumelhart, Hinton and Williams, but only recently has modern computing power made it practical. This gradient descent method is not guaranteed to find the global minimum, yet it is good enough to be crowned as useful in most situations.
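
A toy illustration of that reshaping, assuming the smallest possible “network” (one weight and one bias) and a hand-picked learning rate: the error gradient flows back and directly adjusts the weights that are the model.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=(200, 1))  # ground truth: w=3, b=0.5

w, b = 0.0, 0.0   # the entire "model" is these two numbers
lr = 0.1          # learning rate, chosen by a human

for _ in range(500):
    pred = w * x + b
    err = pred - y
    # Gradients of the mean-squared error with respect to w and b.
    grad_w = 2 * np.mean(err * x)
    grad_b = 2 * np.mean(err)
    # Descend along the gradient: the weights, i.e. the model, get reshaped.
    w -= lr * grad_w
    b -= lr * grad_b

# This toy problem is convex, so descent reaches the global minimum;
# deep networks usually settle for a local one that is good enough.
print(round(w, 2), round(b, 2))  # roughly 3.0 and 0.5
```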

Still, not everything is automated yet. The neural network topology is still defined by humans (convolutional NN vs. recurrent NN, for example), but the ability to grow the network topology as an abstraction layer is the next step ahead.
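
To make “topology defined by humans” concrete, here is a sketch of the kind of choice a person still makes today, assuming Keras; the shapes and layer sizes are arbitrary.

```python
from tensorflow.keras import layers, models

# Human decision #1: a convolutional topology for spatially structured input.
cnn = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),
])

# Human decision #2: a recurrent topology for sequential input.
rnn = models.Sequential([
    layers.Input(shape=(100, 8)),
    layers.LSTM(16),
    layers.Dense(1, activation="sigmoid"),
])

# Which one to use, and how deep or wide, is still picked by a person.
print(cnn.count_params(), rnn.count_params())
```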

Sum of all things

  • Data science’s symbolic algorithms are going to be dropped in favor of deep learning neural networks.
  • Data scientists are a rare breed, but their popularity will be short-lived due to human cognitive and evolutionary limitations.
  • People will be dropped from major decision processes due to bounded rationality.
  • We are all going to be living on Universal Basic Income, driven by automated cars to destinations we don’t care about, on routes not chosen by us.
  • The majority of us will think it’s paradise; only a few anomalies will resist assimilation.