Data mining as a cognitive tool: Capabilities and limits
- DOI: http://dx.doi.org/10.21511/kpm.05(1).2021.01
- Article Info: Volume 5 2021, Issue #1, pp. 1-13
This work is licensed under a Creative Commons Attribution 4.0 International License
Given the large volumes of digitized empirical data, a critical challenge is to identify the hidden and non-obvious patterns in them, which makes it possible to gain new knowledge. To use data mining (DM) methods efficiently, one needs to know their capabilities and limits as a cognitive tool. The paper aims to specify the capabilities and limits of DM methods within the methodology of scientific cognition. This will make DM methods more effective both for experts in the field and for professionals in other fields who analyze empirical data. The paper proposes supplementing the existing classification of cognitive levels with the level of empirical regularity (ER), or provisional hypothesis. If an ER is generated by a DM software algorithm, it can be called a man-machine hypothesis. The place of DM in the classification of the levels of empirical cognition is thereby determined. The paper presents a scheme of the relationship between the cognitive levels that supplements the well-known classification schemes, demonstrates the maximum capabilities of DM methods, and shows the possibility of a transition from practice to the scientific method through the generation of ER, then from ER to hypotheses, and from hypotheses to the scientific method. In terms of the methodology of scientific cognition, the most critical finding is that the limit of any DM method is the ER level: when any software based on DM methods is applied, the level of cognition achieved is the ER level.
- JEL Classification: D80, D83
- References: 32
- Tables: 0
- Figures: 4
- Figure 1. General scheme of cognition
- Figure 2. Correlation between thinking, reality, and sign systems
- Figure 3. Relationship between the levels of cognition
- Figure 4. Search scheme for hidden empirical regularities