Dmitry Kaminskiy, General Partner, Deep Knowledge Group
For investors, return on investment is increasingly dependent on insights from big data.
Today's reality of sophisticated data collection and analytics is very different from the early 1900s, when Mark Twain popularized the saying: "There are three kinds of lies: lies, damned lies, and statistics."
That shift is most visible in the investment sector, where the data science approach is now prevalent. Investors who pour money into projects and companies without using insights from big data to guide their decisions may not see a significant return on their investment.
Public companies are more likely to use data science than their private counterparts, which still rely heavily on outdated investment tools and methods with limited use of machine learning algorithms and data collection techniques.
What is data science?
Data science combines programming expertise with an in-depth understanding of statistics and mathematics to extract meaningful insights from data. Data scientists must master many fields, including computing, statistical analysis, machine learning, deep learning, data visualization, mathematics, and programming.
Does data science have a future?
The answer is definitely yes. Data collection methods and capabilities continue to evolve, contributing to ever-increasing amounts of data. In addition, the increasing ability to automate data analysis is leading to widespread adoption of data science as a key measure of performance within companies.
The global data science market was estimated at $95.30 billion in 2021 and is projected to reach $322.90 billion in 2026, a compound annual growth rate of 27.7% over the forecast period. At that rate, the market would more than triple in just five years.
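The growth figures can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes five annual compounding periods between 2021 and 2026, and the function names `cagr` and `project` are made up for this example.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / periods) - 1


def project(start_value: float, rate: float, periods: int) -> float:
    """Value reached after compounding `rate` for `periods` years."""
    return start_value * (1 + rate) ** periods


# Figures quoted in the article, in billions of US dollars.
market_2021 = 95.30
market_2026 = 322.90
years = 5  # assumption: 2021 -> 2026 counted as five compounding periods

implied_rate = cagr(market_2021, market_2026, years)
projected_2026 = project(market_2021, 0.277, years)
growth_multiple = market_2026 / market_2021

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"2026 value at 27.7% per year: ${projected_2026:.1f}B")
print(f"Growth multiple: {growth_multiple:.2f}x")
```

Under that assumption the implied rate comes out close to the quoted 27.7%, and the market ends at a little over three times its 2021 size.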
Additional factors driving market growth include increasing reliance on cloud-based solutions, growing application of data science in various sectors, and the need to extract deep insights from big data to gain a competitive business advantage.
Data science reliability
Using data science is a bit like using Google to search for something. Google provides answers, but there is no guarantee that these answers are accurate or sufficient, which raises the question: "Who decides whether a particular answer is the correct answer?"
Because data science is currently very subjective, people are rethinking how to evaluate the answers it produces and are refining multiple ways to formulate solutions, with the aim of achieving the best results. However, most of these methods deal only with the computational validity of the models, and a mathematically correct answer can still be completely meaningless.
Suppose x = 5 and y = 2. It follows that x/y = 2.5, but if x is a number of oranges and y is a temperature, the number 2.5 makes no sense at all. In the same vein, many of the answers that Google provides are also meaningless. So how can we overcome these challenges?
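The oranges-and-temperature example can be made concrete with a small sketch. Everything in it is hypothetical: `Quantity`, `MEANINGFUL_RATIOS`, and `ratio` are illustrative names, and the list of "meaningful" ratios stands in for the judgment an analyst would supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number together with a label describing what it measures."""
    value: float
    unit: str


# Hypothetical list of (numerator, denominator) pairs that an analyst
# has judged to be meaningful ratios. This encodes human common sense.
MEANINGFUL_RATIOS = {
    ("oranges", "crates"),
    ("revenue_usd", "employees"),
}


def ratio(numerator: Quantity, denominator: Quantity) -> tuple[float, bool]:
    """Return the quotient and whether that quotient is known to be meaningful."""
    result = numerator.value / denominator.value
    meaningful = (numerator.unit, denominator.unit) in MEANINGFUL_RATIOS
    return result, meaningful


x = Quantity(5, "oranges")
y = Quantity(2, "degrees_celsius")

value, meaningful = ratio(x, y)
print(value, meaningful)  # the arithmetic succeeds, the meaning check does not
```

The division itself never fails. Only the extra check, which is supplied by a person, flags that the result should not be trusted.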
Common sense is one of the most important elements in data science
Even when a search yields several results, we can often use common sense to dismiss some answers after examining the first few results. This is not because we know the right answer, but because we know which answers cannot be correct.
How do we make better investment decisions using data science?
The answer to this question is a three-pronged approach: better models, better data, and explainable AI. Interpretability refers to how well people can understand the decisions a machine learning system makes, and to designing systems whose decisions are easy to understand or explain.
First, more focused models must be built around a specific analytical intent. Biotechnology is a good example: many data science solutions in the field are built from scratch to address one specific problem. This helps avoid the interpretability problem that arises when a model built to analyze one set of data is modified and used to analyze another, which often produces unexpected flaws.
Second, building custom models takes a lot of time and a lot of data. For example, financial institutions generally suffer from a lack of data and a lack of time. To overcome these problems, one must realize that raw data is not the only data, and there is no way that data systems can advance without in-depth research.
Many financial institutions do not have dedicated research and development departments to help solve specific problems. Therefore, when it comes to data, especially in the case of private companies, alternative data is valuable. It is often much easier to obtain than raw data. With the right means, alternative data can be effectively transformed and used to shed light on different areas of a company's current and future performance.
Third, it is necessary to employ explainable AI. Increasingly sophisticated models are often used without a good understanding of how they work.
While catchphrases like "deep learning" and "artificial intelligence" sound impressive when showing investors a range of models, that does not necessarily mean all of these models outperform the classics. Furthermore, it is important to realize that an extra 3% in accuracy is not always a sufficient reason to abandon an interpretable, if less accurate, model.
Major trends in data science by 2025
Data science changes almost daily, from data management to deep technology. The sector is set to face significant changes, and it is important to keep up with trends to ensure the authenticity, ethics, and accuracy of data science. Some of the fastest growing data science trends include:
Data science is relevant to many sectors
In addition to its important role in the investment industry, data science is also an essential element for making important decisions across a variety of sectors, including finance, agriculture, biotechnology, marketing, human resources, healthcare, government programs, and more.
Data science has contributed to solving complex problems in many areas, such as forecasting agricultural development and crop success, identifying drug toxins, and budgeting and allocating capital.
How is data collected in data science?
Various techniques are used, including transaction tracking, social media monitoring, surveys, interviews, focus groups, observation and deduction, online tracking, and other methods.
At Deep Knowledge Group, for example, AI is the main driver behind how we formulate and implement our strategy. Today we operate as a computation-driven deep technology company, ensuring that analytics, data science, and artificial intelligence functions are prioritized in our operations through continuous investment in these resources.
Over the past 9 years, we have established 10 sectoral analytical subsidiaries that act as analytical powerhouses to study and understand relevant sectors, and enable stakeholders to make informed strategic moves within these sectors.
Read the original Arabic version in Fortune Arabia here.