Professor Jerzy Stefanowski: Our scientists are invisible, research is floundering, we are rarely cited in the world
To evaluate the state of research in Poland in the field of machine learning, especially in the area of exploration of massive and complex data (Big Data), it is necessary to check if that research is visible in the world.
Research communities measure this by the number of articles accepted at the best global conferences and published in prestigious journals.
Elite conferences on broadly understood artificial intelligence include ECAI [Editor’s note: European Conference on AI], IJCAI [Editor’s note: International Joint Conference on Artificial Intelligence] and AAAI Conference on Artificial Intelligence.
As far as machine learning is concerned, the most prestigious conferences are Neural Information Processing Systems (NIPS), the International Conference on Machine Learning (ICML) and ECML PKDD; other main conferences include ACM SIGKDD (data exploration) and some IEEE conferences, e.g. IEEE Data Mining or the less prestigious Big Data. SIAM Data Mining, known for its high standards, also ranks among the most renowned symposia.
In the last five to seven years only a handful of authors with Polish affiliations have presented papers at the above-mentioned conferences, all of them from three centers: Warsaw (the Institute of Computer Science – Polish Academy of Sciences and Warsaw University), Poznan University of Technology, and Wrocław University of Science and Technology.
Very few papers by Polish authors appear in the world’s most renowned journals, even if we compare Poland to other European countries of similar potential. In relative terms, Poland performs best at ECML PKDD [European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases], but, as a long-term participant, I have to say that this success is owed to a handful of people.
It is also instructive to check how many experienced researchers have held senior positions in organizing these conferences and in supervising the reviews and decisions on the fate of submitted articles. For example, professor Jacek Koronacki was one of the program chairs at ECML PKDD in 2007; ten years later the position of track chair was held by the author of this paper. Over a period of more than ten years the slightly lesser role of area chair has been given only twice (to professor Szymon Jaroszewicz and to the author of this paper), and, in the case of IJCAI, over a similar period only three people from Poland have held the position of senior PC member (professors Krzysztof Dembczyński, Jerzy Stefanowski and Piotr Faliszewski). Similarly, there have been just a few Poles among the regular reviewers (so-called PC members) of ICML, NIPS or SIAM Data Mining.
Polish names are scarce
If you consider the articles published in the most important and influential scientific journals in the above-mentioned field, e.g. “Journal of Machine Learning Research”, “IEEE Transactions on Neural Networks and Learning Systems”, “Machine Learning”, “Journal of Data Mining and Knowledge Discovery”, “IEEE Transactions on Pattern Analysis and Machine Intelligence”, the conclusion is similar. Polish names in such journals are scarce: Wojciech Kotłowski and Krzysztof Krawiec from Poznan University of Technology; Michał Woźniak and his colleagues as well as Przemysław Kazienko from Wrocław University of Science and Technology; Leszek Rutkowski and his colleagues from Czestochowa University of Technology; Jan Mielniczuk from the Institute of Computer Science – Polish Academy of Sciences.
Slightly more articles appear in less prestigious journals on computational intelligence and so-called soft/fuzzy computing, but those fields of study are not closely related to the problems addressed in machine learning and, especially, Big Data.
Even bearing in mind the flaws of so-called citation indexes, only a few scientists from Poland, both closely and loosely connected with artificial intelligence, have relatively high Hirsch indexes and citation counts (see the so-called “100 top best cited Polish scientists” ranking by Google Scholar published two years ago). The situation looks somewhat better if you take into account Polish scientists working abroad, e.g. professor Jacek Żurada or professor Witold Pedrycz, and if you include their publications in the field of computational intelligence.
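For readers unfamiliar with the Hirsch index mentioned above: a researcher has index h if h of their papers have each been cited at least h times. The short sketch below illustrates that definition only; the function name and the citation counts are made up for the example and do not describe any researcher named in this article.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # later papers are cited even less, so h cannot grow
    return h


# Hypothetical citation counts for six papers (illustrative only).
example_counts = [25, 8, 5, 3, 3, 1]
print(h_index(example_counts))
```

The example also hints at one of the flaws of such indexes: a single very highly cited paper (here, 25 citations) raises the index no more than a paper cited just often enough to clear the threshold.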
Fewer than ten
Polish scientists are very rarely members of the boards of international AI and ML associations. Only three Poles hold the ECAI (EuroAi) title, one of them, professor Stanisław Matwin, being a scientist with foreign affiliation (in my opinion he is currently the most recognized scientist of Polish origin in the international community).
Summing up, the international visibility of research conducted in Poland on machine learning and the exploration of complex and massive data (such as Big Data) is too low: the group of internationally visible researchers does not exceed 10 people. Apart from that group, there are also researchers of Polish origin who went abroad at different stages of their careers and achieved significant success.
If you take into account the awards and distinctions in all editions of the Polish Artificial Intelligence Society competition for the best doctoral thesis (all editions are listed on the website http://www.pssi.agh.edu.pl/pl:konkurs), three centers appear dominant:
- Poznan University of Technology (three main awards, four distinctions);
- University of Warsaw (two awards, three distinctions);
- Wrocław University of Science and Technology (two awards).
Distinctions were also awarded to doctoral graduates from the Warsaw University of Technology (four times), AGH University of Science and Technology (twice) and Silesian University of Technology, Institute of Computer Science – Polish Academy of Sciences, and Systems Research Institute – Polish Academy of Sciences (once each).
Machine learning and advanced data exploration are covered mainly in the theses of doctoral students from Poznan University of Technology, Wrocław University of Science and Technology, and Warsaw University of Technology. Research conducted at other centers has focused on more standard areas of artificial intelligence (natural language and text processing, non-standard logics for multi-agent systems, robotics).
The youth are coming
In many other centers, it is the young generation that has achieved interesting results in recent years. Here are some examples:
• Analysis of social networks or advanced graph structures – e.g. Wrocław University of Science and Technology (team led by professor Kazienko), Warsaw University (Tomasz Michalak), AGH University of Science and Technology (Piotr Faliszewski), Polish-Japanese Academy of Information Technology (Adam Wierzbicki) and Poznan University of Technology (Mikołaj Morzy);
• Exploration of big graph models for internet networks and searching for information in large text repositories – mainly Institute of Computer Science – Polish Academy of Sciences (team led by professor Kłopotek, but also Marcin Sydow); research on large text repositories and on applying ideas from association rule methods to natural language processing tasks was also conducted at the Warsaw University of Technology (team led by professor Rybiński);
• Attribute and feature selection methods, signal decomposition into essential components, especially for high-dimensional data, notably biomedical data, with applications in bioinformatics and molecular biology – e.g. Warsaw University, Institute of Computer Science – Polish Academy of Sciences, Silesian University of Technology, Jagiellonian University, Nicolaus Copernicus University in Toruń.
• Learning classifier ensembles, complex output structures (e.g. multi-label learning), inclusion of semantic correlations between data elements (ordinal classification) – e.g. Wrocław University of Science and Technology, Poznan University of Technology, Institute of Computer Science – Polish Academy of Sciences, Nicolaus Copernicus University in Toruń.
• Learning from non-stationary data streams (prediction, clustering, searching for relevant features) – mainly Wrocław University of Science and Technology, Poznan University of Technology and Czestochowa University of Technology.
• Data modeling in the process of integration of various data sources in the context of Big Data (mainly Silesian University of Technology and Poznan University of Technology).
• Adaptation of learning methods to different types of data deficiencies (inaccuracy, incompleteness or partially contradictory data, information granulation) – e.g. University of Warsaw, Institute of Computer Science – Polish Academy of Sciences, Systems Research Institute – Polish Academy of Sciences, Silesian University of Technology, Poznan University of Technology, Czestochowa University of Technology and Warsaw University of Technology; data distribution complexity (e.g. imbalance, decomposition of classes): Wrocław University of Science and Technology, Institute of Computer Science – Polish Academy of Sciences.
• Development and applications of deep neural networks in robotics, control, machine vision and image recognition (currently many different workgroups, e.g. Warsaw University of Technology, Silesian University of Technology, Gdańsk University of Technology, Poznan University of Technology, AGH University of Science and Technology, Jagiellonian University).
• Big multimedia data processing for biomedical applications – mainly Gdańsk University of Technology.
• Big data processing with the use of machine learning within the scope of neurocognitive science and research focused on human brain imaging – e.g. Neurocognitive Laboratory, Nicolaus Copernicus University in Toruń; Warsaw University project; and Institute of Experimental Biology – Polish Academy of Sciences.
Unfortunately, the research community in Poland is highly atomized. Cooperation between the teams is not as smooth as it should be, joint research projects are scarce, and there are no doctoral exchange programs. Internationalization is weak. It could be improved if, for example, renowned foreign scientists were temporarily employed as visiting professors or co-supervisors of doctoral students. The team from the Wrocław University of Science and Technology is an exception here; resources from the ENGINE project also made it possible to organize so-called international doctoral schools.
Polish machine learning researchers are only occasionally involved in big international projects. Although many academic conferences, including international ones, are held in Poland, they are not the most important events in the field discussed. As a matter of fact, since 2007, i.e. since professor Matwin and professor Koronacki managed to organize ECML PKDD in Warsaw, no other major conference on broadly understood machine learning and knowledge discovery/data mining has been held in Poland.
Sometimes interesting events are organized in parallel with significant conferences. For instance, during the IFIP world congress in September 2018 a special Oxford-style debate on the ethical and regulatory aspects of AI was organized, and talks were held between representatives of the Polish Artificial Intelligence Society and the leaders of the IFIP TC12 group [International Federation for Information Processing Technical Committee on Artificial Intelligence], which is responsible for artificial intelligence.
Cooperation with industry and public administration is also extremely unsatisfactory. The scientific community does not participate in big projects of key importance for the economy and society. Instead, it obtains small orders from a limited number of companies or gets involved in the projects of the National Center for Research and Development.
Overall negative balance
On the other hand, over the past two or three years a significant number of young people (employees of companies and students) have become fascinated by newly created machine learning libraries and data mining software and by their potential commercial applications. This is reflected in the growing number of ML meetups (of very uneven quality), conferences such as Polish People in Machine Learning, and various informal seminar groups of young doctoral students.
One of the initiatives adopting a more scientifically-oriented approach is the Polish Learning Systems Group (in the organization of which the author of this report has been involved).
Such informal and, in a way, random initiatives show that the community of young enthusiasts is growing rapidly. They may form a foundation for applications used in companies or start-ups, based on selecting and adapting ready-made software that is often developed in other countries.
Unfortunately, minor positive aspects are not enough to change my final opinion: today in Poland research on machine learning and data mining (the exploration of complex and massive data such as Big Data) is too limited, and the number of renowned researchers working in our country is definitely too small.
This paper is an adaptation of a part of the study by professor Jerzy Stefanowski entitled “Development of artificial intelligence and machine learning in the context of Big Data” (Poznań 2018), prepared for the National Information Processing Institute.