UT & Swedbank Data Science Seminar “When, Why and How? The Importance of Business Intelligence”

Last week I had the pleasure of taking part in a Data Science Seminar titled “When, Why and How? The Importance of Business Intelligence”. In this seminar, organized by the Institute of Computer Science (University of Tartu) in cooperation with Swedbank, we (me, Mohamad Gharib, Jurgen Koitsalu, Igor Artemtsuk) discussed the importance of BI with some focus on data quality. More precisely, two of the four talks were delivered by representatives of the University of Tartu and were more theoretical in nature; we both decided to focus our talks on data quality (although for my talk this was not the main focus this time). The other two talks were delivered by representatives of Swedbank, who mainly elaborated on BI – what it can give, what it already gives, how it is achieved and much more. These talks were followed by a panel moderated by Prof. Marlon Dumas.

In a bit more detail, in my presentation I talked about:

  • “Data warehouse vs. data lake – what are they and what is the difference between them?” – in very few words: structured vs. unstructured, static vs. dynamic (real-time data), schema-on-write vs. schema-on-read, ETL vs. ELT. I further elaborated on their goals and purposes, their target audiences, and their pros and cons.
  • “Is the data warehouse the only data repository suitable for BI?” – no, (today) data lakes can also be suitable. Even more, both are considered the key to “a single version of the truth”. However, if descriptive BI is the only purpose, it might still be better to stay with a data warehouse. But if you want predictive BI or want to use your data for ML (or do not have a specific idea of how you want to use the data, but want to be able to explore it effectively and efficiently), a data warehouse might not be the best option.
  • “So, the data lake will save me a lot of resources, because I do not have to worry about how to store/allocate the data – just put it in one storage and voilà?!” – no, in this case your data lake will turn into a data swamp! And you are forgetting about the data quality you should (must!) be thinking of!
  • “But how do you prevent the data lake from becoming a data swamp?” – in short and simple terms, proper data governance and metadata management are the answer (but it is not as easy as it sounds – do not forget about your data engineer and be friendly with him [always… literally always :D]), and also think about the culture in your organization.
  • “So, the use of a data warehouse is the key to high-quality data?” – no, it is not! Having ETL does not guarantee the quality of your data (transform & load is not data quality management). Think about data quality regardless of the repository!
  • “Are data warehouses and data lakes the only options to consider, or are we missing something?” – we are! The data lakehouse!
  • “If a data lakehouse combines the benefits of a data warehouse and a data lake, is it a silver bullet?” – no, it is not! It is another option (relatively immature) to consider that may be the best fit for you, but it is not a panacea. Dealing with data is (still) not easy…
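For readers who prefer code to slogans, here is a minimal sketch of the schema-on-write vs. schema-on-read distinction from the list above. It is only an illustration under my own assumptions: the `SCHEMA` dictionary, the "payments" records and all function names are hypothetical, plain Python lists stand in for the repositories, and no real warehouse or lake product works exactly like this.

```python
import json

# Hypothetical schema for a "payments" table: column name -> required type.
SCHEMA = {"customer_id": int, "amount": float}


def write_to_warehouse(table, record):
    """Schema-on-write: validate the structure before the record is stored."""
    for column, expected_type in SCHEMA.items():
        if not isinstance(record.get(column), expected_type):
            raise ValueError(f"column {column!r} must be {expected_type.__name__}")
    # Only the columns named in the schema are kept.
    table.append({column: record[column] for column in SCHEMA})


def write_to_lake(lake, raw_line):
    """Schema-on-read, write side: store the raw payload untouched."""
    lake.append(raw_line)


def read_from_lake(lake):
    """Schema-on-read, read side: apply the structure only when querying."""
    rows = []
    for raw_line in lake:
        try:
            payload = json.loads(raw_line)
            rows.append({"customer_id": int(payload["customer_id"]),
                         "amount": float(payload["amount"])})
        except (ValueError, KeyError, TypeError):
            continue  # a real pipeline would log or quarantine such records
    return rows


warehouse, lake = [], []
write_to_warehouse(warehouse, {"customer_id": 1, "amount": -250.0})
write_to_lake(lake, '{"customer_id": "2", "amount": "19.90", "note": "free text"}')
write_to_lake(lake, "not even JSON")
print(read_from_lake(lake))
```

Note that the negative amount passes the schema check and lands in the "warehouse" without complaint, while the lake happily accepts a line that is not even JSON. Structural validation, wherever it happens, is not the same as data quality management, so quality rules still have to be defined and checked regardless of the repository.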

In addition, in this talk I briefly introduced our ongoing research into integrating the data lake as a data repository with data wrangling, aiming at increased data quality in information systems (IS). In short, this is somewhat like an improved data lakehouse, where we emphasize that data governance and data wrangling need to be integrated to really get the benefits that data lakehouses promise (although we still call it a data lake, since the data lakehouse, although not a brand-new concept, is still debated a lot, including, but not limited to, its very definition).

In turn, my colleague Mohamad Gharib discussed what data quality (DQ) and, more specifically, data quality requirements are and why they really matter, and provided a very interesting perspective on how to define high-quality data, which would then serve as the basis for defining these requirements.

All in all, although we did not know each other before and had a very limited idea of what each of us would talk about, we all agreed that this seminar turned out to be very coherent: we and our talks complemented each other, extending points that had been touched on but not thoroughly elaborated. This allowed us not only to make the seminar a success, but also to have a very lively discussion (although most of this discussion took place during the coffee break – as usually happens – so, unfortunately, it is not available in the recordings, the link to which is given below).

The recordings are available here.

First paper in my Special Issue “Hybrid Data-Driven and Physical Modelling for Energy-Related Problems: Towards Smarter Energy Management”

As an academic editor of the Special Issue “Hybrid Data-Driven and Physical Modelling for Energy-Related Problems: Towards Smarter Energy Management” (Energies, MDPI; Impact Factor: 3.004 (2020); 5-Year Impact Factor: 3.085 (2020)), I am glad to announce that the first article has been published!

This issue, which is very special for me, belongs to the section “A8: Artificial Intelligence and Smart Energy” and seeks the latest research on advances in the field, covering both data-related topics and next-generation power electronic techniques and their applications. All accepted papers are published in Open Access, which significantly contributes to their visibility, and are submitted for indexing in both Scopus and Web of Science in addition to other databases. The journal also falls in Q1 (CiteScore, Control and Optimization).

Topics of interest for this Special Issue include, but are not limited to:

  • energy management systems
  • data-driven approaches to energy-related issues
  • advances in energy analytics, including open data on energy, its benefits, re-uses and impact
  • Big Data management in the context of energy data
  • machine learning (ML) techniques
  • physical modelling for energy-related problems
  • disruptive technology of renewable energy
  • blockchain for Internet of energy management
  • the role of the “energy” within the context of Industry 4.0 and Sustainable Goals
  • energy data management in the context of the internet of things (IoT)
  • energy data management via distributed systems
  • smart grid and microgrid
  • sustainable electrical energy systems
  • hybrid and electric vehicles.

This Special Issue is of particular importance, given that energy-related issues are becoming more and more relevant today, including the topic of disruptive technologies, where renewable energy is referred to as one of the 12 most significant disruptive technologies. The topic is no longer limited to energy production/generation, storage and supply; it is becoming broader and now includes close links with the electrification of transport, including electric vehicles (smart and green transportation), industrial automation, energy storage systems, data storage and data management systems. Being very common and, at the same time, disruptive and new, energy-related issues involve both the adaptation of well-known foundations to recent trends and the optimization of methods and techniques already in use, as well as the introduction of completely new methods and the development of new applications, thereby promoting open innovation and smarter living.

The first paper accepted for publication in the SI is entitled “Machine Learning Schemes for Anomaly Detection in Solar Power Plants” and is authored by Mariam Ibrahim (German Jordanian University), Ahmad Alsheikh (Deggendorf Institute of Technology), Feras M. Awaysheh (University of Tartu Institute of Computer Science) and Mohammad Dahman Alshehri (Taif University). The paper deals with anomaly detection in photovoltaic (PV) systems by evaluating the performance of different machine learning schemes – AutoEncoder Long Short-Term Memory (AE-LSTM), Facebook-Prophet, and Isolation Forest – and applying them to detect anomalies in photovoltaic components. These models allow the authors to distinguish the healthy and abnormal actual behaviors of the PV system. The results provide clear insights for making an informed decision, especially regarding the experimental trade-offs in such a complex solution space. The issue explored is all the more topical considering the rapid industrial growth in solar energy, driven by the increasing interest in renewable power for smart grids and plants.

I am very glad and honored to be an academic editor of both this issue (together with my colleagues from Germany and Greece) and this paper in particular.

Looking forward to further submissions (with a particular focus on advances in the field covering both data-related topics and next-generation power electronic techniques and their applications)!

First International Electronic Governance with Emerging Technologies Conference (EGETC)

As the general co-chair of the First International Conference in Electronic Governance with Emerging Technologies (EGETC-2022), I sincerely invite you to consider participating in this event as authors and presenters or as attendees.

Over the last decade, the importance of emerging technologies in government and public administrations has grown significantly. The growing demand for services that better meet changing user expectations for responsiveness and personalization, coupled with higher expectations of the role of government in the digital age, calls for a technologically mature public sector. There are many emerging technologies serving as enablers of new forms of governance and of novel applications in traditional governance functions, whose role has been witnessed across various domains, including healthcare, medicine, education, tourism, and industry.

The aim of the First International Conference in Electronic Governance with Emerging Technologies (EGETC-2022) is to provide a forum for academics, scholars, and practitioners from academia and industry to share and exchange recent developments in the domain of eGovernment and the governance of digital organizations, and to shed light on emerging research trends and their applications.

Topics of interest include, but are not limited to:

  • Intelligent systems for coordination in crisis emergency management
  • Distributed ledgers and Blockchains: governance, decentralized autonomous organizations (DAO)
  • Machine Learning (ML), Artificial Intelligence (AI) and Big Data management for Public Sector
  • Privacy, security and legal Informatics – AI and Law
  • Open Data, Open Government Data: transparency, trust, public participation, co-creation and Open Innovation
  • Digital transformation and Society 5.0
  • Linked Data, Linked Open Data (LOD)
  • Semantic E-government applications
  • Public Sector Knowledge Representation
  • Decision Support Systems (DSS) in Digital Governance
  • Natural Language Processing (NLP)
  • Cloud Computing
  • Bots, Automation agents, Self-learning systems 
  • Cryptocurrencies and incentive mechanism design
  • Multimedia and multilingual systems

Importantly, in order to ensure the widest possible participation of communities, regardless of the availability of funding, the conference does not foresee any charges, i.e. authors and presenters as well as attendees/listeners are welcome without registration fees.

First International Electronic Governance with Emerging Technologies  Conference (author: Anastasija Nikiforova)

In addition to the great team of organizers and the program committee, which includes a rich list of outstanding experts, participants of this event will have an opportunity to enjoy keynote speeches by Prof. Marijn Janssen – Full Professor in ICT & Governance at TU Delft, Netherlands; Dr. B K Murthy – CEO, Innovation and Technology Foundation, IIT Bhilai; and Prof. Luis Martinez – Full Professor, University of Jaén, Jaén, Spain. More information on their talks will follow…

Accepted papers presented at EGETC-2022 will be published in the proceedings by Springer in the Lecture Notes in Computer Science series (approval pending…). A shortlist of the best papers will be invited for post-conference publication in Government Information Quarterly (GIQ), Elsevier, Q1, CiteScore: 11.6, Impact Factor: 7.279, and Technological Forecasting and Social Change, An International Journal, Elsevier, Q1, CiteScore: 12.1, Impact Factor: 8.593.

If you are interested in submitting a paper, add the submission date, May 30, to your calendar; the event itself will take place on September 12-14, 2022.

Due to the unpredictability of the current situation in light of the pandemic, we expect to have a hybrid event, i.e. both online and on-site participation will be possible. For the latter mode, we will be very glad to meet participants who are able to attend the event in person in peaceful and spectacular Tamaulipas, Mexico, in mid-September 2022. Hope to meet you there!