Machine Learning

Outliers are not liars: they have significance in the peripheral vision of business leaders

Mannat Soni

Nov 22nd 2021 - 2 min read

Kindly ignore the outliers created due to experimental errors while reading this article.

“Just because I haven’t found my distribution yet, am I an Outlier?”

Wise men say life is like math: most problems have mathematical solutions, but if all your data points lie on the regression line, something is wrong. Every aspect of life has its outliers. They live in the extremes, showcasing both negative and positive possibilities. Yet our reflexive response has been to discriminate against them and discard them.

Why?

There is a concept called the King and Pauper effect.

As per Wikipedia, in statistics, economics, and econophysics, the King effect refers to the phenomenon where the top one or two members of a ranked set show up as outliers. These top one or two members are unexpectedly large because they do not conform to the statistical distribution or rank-distribution which the remainder of the set obeys. Likewise, the Pauper effect describes the corresponding behavior of observations in the lowest ranks. Traditional understanding therefore treats outliers as abnormal and judges them only by their distance from the normal, or the mean, of the distribution.

I still wonder why we have not been able to look at them as something “outside the box”. Before we delve deeper into that idea, let us first explore what an outlier is. Breaking it down piece by piece should give us a deeper perspective.

O.U.T.L.I.E.R: Something that lives outside? Or something that lies, outside?

Most data scientists would define an outlier as a data point that lies at the extremes of a data series or distribution: a value that is either very small or very large and may therefore distort the overall observations drawn from the series. Outliers are also described as extremes because they lie at either end of a data series.

Outlier observations lie at abnormal distances from the other values in a random sample from a population. They are usually treated as abnormal because they do not conform to the norm and, ipso facto, distort the overall picture of the process: their very high or very low values pull measures of central tendency, the mean in particular, towards the extremes.
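To make that pull on the central tendency concrete, here is a minimal sketch using Python's standard `statistics` module. The numbers are invented purely for illustration (a hypothetical series of daily order counts with one extreme day); they are not data from any real process.

```python
from statistics import mean, median

# Hypothetical daily order counts, invented for illustration only.
typical = [48, 50, 51, 52, 49, 50, 53]

# The same series with one invented extreme observation appended.
with_outlier = typical + [400]

mean_before = mean(typical)          # about 50.43
mean_after = mean(with_outlier)      # 94.125
median_before = median(typical)      # 50
median_after = median(with_outlier)  # 50.5

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

A single extreme value nearly doubles the mean while the median barely moves. That sensitivity is the usual reason outliers get dropped, and it is also why they are so easy to notice.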

Outliers differ significantly from other data points and observations. They are thus discarded from the data series.

This is where my argument begins. We discard outliers because they are potentially so influential that they would shift the mean away from where it would otherwise sit. But could they instead show us the extremes a situation is capable of reaching? Let me ask you a question. Would you consider an earthquake an outlier because it is not supposed to be an “average” event? Or was Einstein an outlier because his IQ was nowhere near the average IQ of a human being?

If you do, would you want to remove such events because they don’t serve the “average” data? If yes, when will we rise from such mediocrity? The vision that I carry speaks for itself. It is not for those who want to complete everyday tasks. It is for those who want to build an enterprise.

The point is that an outlier screams out its value to tell you the positive or negative potential an event carries. Whether we capitalize on it or remove it because of the volatility it causes is now in our hands.

Outliers are not liars; they have future value. Not all outliers are false values. They cannot always be attributed to accidents, incidents, or coincidences in a statistical study. Outliers may occur in a distribution without any random chance, sampling fluke, data collection error, or experimental mistake. They may be genuine observations that simply happen to lie at extreme distances from the central tendencies and the interquartile range. It is in such cases that the judgment and interpretive skill of business leaders come into play. Managers see what is visible to normal eyes, and their comfort generally lies with the trend; it is for astute leaders to pick up these observations and respect them for what they reflect. If these are actual occurrences within a process distribution, then they not only indicate the realm of what the process makes possible but also mark the waypoints along which its future contours may take shape, or are already taking shape. Thus, while outliers are often excluded from regression models, they should not be excluded from a leader’s thinking, because they show what may be feasible in the future.
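One way to act on this in practice is to flag extreme observations for review instead of deleting them. The sketch below uses the conventional 1.5 × IQR fences, which is a common rule of thumb and an assumption on my part, not the only way to define an outlier. The function name `flag_outliers` and the sample data are hypothetical.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Pair each value with True if it falls outside the IQR fences.

    k=1.5 is the conventional multiplier; treat it as a tunable assumption.
    Nothing is removed: every observation is returned with a flag.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]

# Hypothetical daily order counts, invented for illustration only.
orders = [48, 50, 51, 52, 49, 50, 53, 400]

flagged = flag_outliers(orders)
to_review = [v for v, is_outlier in flagged if is_outlier]
print("keep for the regression:", [v for v, o in flagged if not o])
print("hand to a human for a closer look:", to_review)
```

The analyst can still fit a model on the unflagged points, while the flagged ones go to someone who can ask whether they are measurement errors or an early sign of where the process is heading.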

Today’s outliers may be tomorrow’s central tendencies.

They also show that the process under study has the potential to generate such outcomes, however good, bad, or ugly they may be. Outliers should not be discarded totally. They have a story encased within them that needs to be decoded in detail. Outliers therefore indicate that there are regions of the distribution where a certain theory, belief, or status quo wisdom might not hold. There is something more to explore and leverage. Outliers may have been spawned by flaws in the existing theory that generated the assumed family of probability distributions, leaving some observations far from the central tendencies of the data. Where outliers suggest that the population has a heavy-tailed distribution, the tails cannot be cut off merely to simplify the data. Outlier analysis may stimulate a paradigm shift in process management.

However, it is for leaders to perceive what is normal.

It is thus for analysts to decide what is normal or abnormal.

But one conviction remains: what is normal for managers and what is normal for leaders cannot be the same. Leaders with depth of vision, who can perceive the invisible frontiers of possibility, can use these outlier observations to build futures and craft tomorrows for their businesses. Outliers are signposts to the future of the process under study. They are beacons of an invisible spectrum of opportunities and challenges that presently lie outside the normal process range but may occur in times to come. They are thus golden nuggets for peripheral vision, a very important quality of leadership in industry. The majority of data points indicate the present state of the process, but outliers indicate the shape the process can take in the future, in both a positive and a negative sense. Outliers can help in designing future business strategies, either to accelerate the positive possibilities or to make the interventions needed to pre-empt negative pitfalls.

The soul of this blog lies in the argument that outliers have a story couched in them. Decode it if you can!

about the author

Mannat is a computer science engineer from Panjab University. She is passionate about data science, machine learning, and creative scientific writing.