When a basic question is asked about a complex matter, the answer is always manifold. The question loses its ability to shed light on the matter, and no single response can answer it even to a small extent. Still, there is a solution to even the darkest problems. The key is to consider whether the question is suitable in the first place. Examining the question’s adequacy is vital, since you cannot learn everything by asking a single question. All of this applies to data science in its own way. Although its complexity is acknowledged across fields, part of the market still insists that data science is simple. Whether it is as complex as rocket science or as simple as breathing, the best way to understand it is to discuss it in detail.
Why Is Data Science Complex?
The complexity of data science comes from its roots. Statistics, computer science, economics, and mathematics form the core of data science, and none of these fields is anywhere close to simple. Their inherent complexity comes from their subject matter: natural behavior. This may look like a flimsy argument, especially for computer science and mathematics, but deep down the working areas and target groups of all these fields are human beings. Data science’s intrinsic complexity therefore comes from its roots in human behavior. In other words, because individuals are unpredictable, or at least hard to predict, every field of science that deals with them is complex.
Even though the roots of data science are enough to establish its complexity, we must also discuss the complexity that is characteristic of data science itself: inconsistent sample data, biases in sampling, natural problems in the population data, and the lack of power to examine the data. All of these are so common in the natural environment of data science that some of them are treated as a given. Besides, as if finding a remedy for them were not hard enough, some of these problems cannot be solved without intervening in the population data, which is almost always impossible. And even if you do reach a solution by intervening in the population, it will probably create more problems than it removes.
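To make the point about sampling bias concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the population is synthetic (normally distributed values with a mean of 50), and the bias mechanism (only units with values above 50 can be reached) is hypothetical. It is not a model of any real data set.

```python
import random
import statistics


def simulate_sampling_bias(seed=0, population_size=10_000, sample_size=500):
    """Compare a random sample with a biased one drawn from the same population.

    All numbers here are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)

    # Hypothetical population: values scattered around 50.
    population = [rng.gauss(50, 10) for _ in range(population_size)]
    true_mean = statistics.fmean(population)

    # Unbiased sample: every unit has the same chance of being selected.
    random_sample = rng.sample(population, sample_size)

    # Biased sample: assume only units with values above 50 are reachable.
    reachable = [value for value in population if value > 50]
    biased_sample = rng.sample(reachable, sample_size)

    return (
        true_mean,
        statistics.fmean(random_sample),
        statistics.fmean(biased_sample),
    )


true_mean, random_mean, biased_mean = simulate_sampling_bias()
print(f"population mean:    {true_mean:.2f}")
print(f"random sample mean: {random_mean:.2f}")
print(f"biased sample mean: {biased_mean:.2f}")
```

In this sketch the random sample lands close to the population mean, while the biased sample overshoots it. Collecting more data the same way does not help, because the error comes from who can be reached and not from how many are reached.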
What Can We Do to Simplify It?
There is also more than one answer to that question. The one that comes to mind immediately is: why are we trying to simplify it? Although it is a question rather than an answer, it points to the right way of thinking. You cannot oversimplify something without losing its vital parts. Whether you do it for the sake of simplicity or for the sake of humanity, when you try to simplify a field of science, what you end up with is no longer science. So, rather than trying to simplify something just because it looks hard, a better approach is to enhance the capabilities of its operators, the data scientists.
To ask the right question about the complexity of data science, one must understand that this complexity is not artificial but intrinsic. Consider cybersecurity, a field related to data science: it is made artificially complex so that it can do its job. It has inherent complexities too, of course, but the artificial complexity is necessary for it to function properly. Data science, on the other hand, is complex because its problems expand exponentially while its solution mechanisms expand linearly. Data science is thus complex by nature, without needing any artificial complexity.
The best thing we can do to simplify data science is to improve the methods it uses. Beyond that, we should leave oversimplified fantasies aside and concentrate on developing data science as a whole, since the only way to achieve a simple data science is to understand it better.