CyberCommando’s meetup and my talk on Internet of Things Search Engines and their role in detecting vulnerable open data sources

October is Cybersecurity Awareness Month, as part of which CyberCommando’s meetup 2023 took place in the very heart of Latvia – Riga, where I delivered an invited talk devoted to IoTSE, entitled “What do Internet of Things Search Engines know about you? or IoTSE as a vulnerable open data sources detection tool”.

CyberCommando’s meetup organizers claim it to be the most anticipated vendor-independent industry event in the realm of cybersecurity – a conference designed to empower local and regional IT security professionals facing the evolving challenges of the digital age by bringing together high-level ICT professionals from local, regional, and international businesses, governments and government agencies, tech communities, and the financial, public and critical infrastructure sectors. The meetup covered a broad set of topics, from the development of ICT security skills and awareness raising, to modern market developments and numerous technological solutions in the Cloud, Data, Mobility, Network, Application, Endpoint, Identity & Access, and SecOps areas, to corporate and government strategies and the future of the sector. It featured three parallel sessions and numerous talks delivered by 20+ local and international experts, including but not limited to IT-Harvest, Radware, DeepInstinct, Pentera, ForeScout Technologies, CERT.LV, and ESET. It was a great honor to add the University of Tartu to this list, which I represented by delivering my talk on the main stage 🙂

Let’s turn to my talk – “What do Internet of Things Search Engines know about you? or IoTSE as a vulnerable open data sources detection tool”. Luckily, very few attendees had known about or used OSINT (Open Source INTelligence) or Internet of Things Search Engines (IoTSE) (though perhaps they were just too shy to raise their hands when I asked), so, hopefully, this was a good choice of topic. So, what was it about?

Today, there are billions of interconnected devices that form Cyber-Physical Systems (CPS), Internet of Things (IoT) and Industrial Internet of Things (IIoT) ecosystems. As the number of devices and systems in use and the volume and value of data increase, the risks of security breaches increase as well.

As I discussed previously, this “has become even more relevant in terms of the COVID-19 pandemic, when in addition to affecting the health, lives, and lifestyle of billions of citizens globally, making it even more digitized, it has had a significant impact on business [3]. This is especially the case because of the challenges companies have faced in maintaining business continuity in this so-called “new normal”. However, in addition to those cybersecurity threats that are caused by changes directly related to the pandemic and its consequences, many previously known threats have become even more desirable targets for intruders and hackers. Every year millions of personal records become available online [4-6]. Lallie et al. [3] have compiled statistics on the state of the cybersecurity horizon during the pandemic, which clearly indicate a significant increase in such threats. As an example, Shi [7] reported a 600% increase in phishing attacks in March 2020, just a few months after the start of the pandemic, when some countries were not even affected yet. Miles [8], however, reported that 2021 saw a record-breaking number of data compromises, where “the number of data compromises was up more than 68% when compared to 2020”, and LinkedIn was the most exploited brand in phishing attacks, followed by DHL, Google, Microsoft, FedEx, WhatsApp, Amazon, Maersk, AliExpress and Apple.”

And while Risk Based Security & Flashpoint (2021) [5] suggest that the vulnerability landscape is returning to normal, thanks in part to various activities, such as the #WashYourCyberHands INTERPOL campaign and “vaccinate your organization” movements, another trigger closely related to cybersecurity that is now affecting the world is geopolitical upheaval. Additionally, according to Cybersecurity Ventures, by 2025 cybercrime will cost the world economy around $10.5 trillion annually, up from $3 trillion in 2015. Moreover, we are at risk of what is called a Cyber Apocalypse or Cyber Armageddon, as was discussed during the World Economic Forum (and according to Forbes), which is considered very likely to happen in the coming two years (hopefully, it will not).

According to Forbes, the key drivers for this are the ongoing digitization of society, behavioral changes due to the COVID-19 pandemic, political instability such as wars, and the global economic downturn, while the WEF relates this to the fact that technology is becoming more complex – in particular, breakthrough technologies such as AI (considering the current state of the art, I would also stress the role of quantum computing here). I would add that this “complexity” is two-fold: technologies become more advanced while at the same time easier to use, including those that can be used to detect and expose vulnerabilities. Meanwhile, although society is being digitized, it tends to lack digital literacy, data literacy and security literacy.

Hence, when we ask what should be done to tackle the associated issues, the answer is also multi-fold. Some recommendations being actively discussed, including by Forbes and Accenture, are to “secure the core”, which involves ensuring that security and resilience are built into every aspect of the organization; understanding that cybersecurity is not something discussed only within the IT department but rather at all levels of the organization; addressing the skills shortage within the cybersecurity domain; and utilizing automation where possible.

To put it simply:

  • (cyber)security governance
  • digital literacy
  • cybersecurity is not a one-time event, but a continuous process
  • automation whenever possible
  • «security first!» as a principle for all artifacts, processes and ecosystem
  • preferably – «security-by-design» and «absolute security», which, of course, is rather a utopia, but still something we have to strive for (despite knowing it is impossible to fully achieve).

Or even simpler, as I typically say – “security to every home!”.

In the light of the above, i.e., “security first!” as a principle for all artifacts and the need to “secure the core” – are our data management systems always protected by default (i.e., secure-by-design)? While it may sound surprising and weird in 2023, the fact is that while various security protection mechanisms have been widely implemented, such a “primitive” artifact as the data management system seems to have been rather neglected, and the number of unprotected or insufficiently protected data sources is enormous. Recent research has demonstrated that weak data protection, and weak database protection in particular, is one of the key security threats [4,6,9-11]. According to a list drawn up by Bekker [5] and Identity Force on major security breaches in 2020, a large number of data leaks occur due to unsecured databases. As an example:

  • Estee Lauder – 440 million customer records
  • Prestige Software hotel reservation platform – over 10 million hotel guests, including Expedia, Hotels.com, Booking.com, Agoda etc.
  • U.K.-based Security Firm gained data of Adobe, Twitter, Tumblr, LinkedIn etc. and their users, with a total of over 5 billion records
  • Marijuana Dispensaries – 85 000 medical patient and recreational user records

to name just a few… At times this is due to their (mis)configuration, at times due to vulnerabilities in products or services for which additional security mechanisms would be required. Sometimes, of course, it is due to very targeted attacks, against which the rest of this post will be of limited value. Let’s rather focus on those very critical cases referred to above, especially in the context of the aforementioned fact that recent advances in ICT have decreased the complexity of searching for connected devices on the Internet and made access to them easy even for novices, thanks to the widespread popularity of step-by-step guides on how to use IoTSE – aka Internet of Everything (IoE) or Open Source Intelligence (OSINT) Search Engines such as Shodan, BinaryEdge, Censys, ZoomEye, Hunter, Greynoise and IoTCrawler – to find and gain access to insufficiently protected webcams, routers, databases, refrigerators, power plants, and even wind turbines. As a result, OSINT has been recognized as one of the five major categories of CTI (Cyber Threat Intelligence) sources (at times more than five are named, but OSINT remains part of these X categories), along with Human Intelligence (HUMINT), Counter Intelligence, Internal Intelligence and Finished Intelligence (FINTEL).
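To make the low entry barrier concrete: locating exposed services with an IoTSE usually amounts to composing a short search query. The sketch below builds Shodan-style queries for common database products; the filter keywords (product:, port:, country:) follow Shodan’s query syntax, the product-to-port mapping simply lists the services’ well-known default ports, and a real assessment should of course only target assets you are authorized to examine.

```python
# Illustrative sketch: composing IoTSE (Shodan-style) search queries
# for exposed database services.

DEFAULT_PORTS = {
    "MongoDB": 27017,
    "Elasticsearch": 9200,
    "Redis": 6379,
    "Memcached": 11211,
    "MySQL": 3306,
    "PostgreSQL": 5432,
}

def iotse_query(product, country=None):
    """Build a Shodan-style search query for a given database product."""
    parts = ['product:"%s"' % product, "port:%d" % DEFAULT_PORTS[product]]
    if country:
        parts.append("country:%s" % country)  # two-letter code, e.g. LV
    return " ".join(parts)

print(iotse_query("MongoDB", country="LV"))
# product:"MongoDB" port:27017 country:LV
```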

While these tools may represent a security risk, they also provide many positive and security-enhancing opportunities. They provide an overview of network security, i.e., of the devices connected to the Internet within the company; they are useful for market research and adapting business strategies; they allow one to track the growing number of smart devices representing the IoT world and to track ransomware – the number and nature of devices affected by it – and therefore allow one to determine the appropriate protective actions in the light of current trends. However, almost every one of these white-hat-oriented objectives can also be exploited by black-hatters.

In this talk I raised several questions that can be at least partly answered with the help of IoTSE, such as:

  • Is the data source visible and even accessible outside the organization?
  • What data can be gathered from it, and what is their “value” for external actors, such as attackers and fraudsters? I.e., can these data pose a threat to the organization by being used to deploy an attack?
  • Are stronger security mechanisms needed? Is the vulnerability related to internal (mis)configuration or to the database in use?

To answer the above questions, I referred to the study conducted by me and my former student – Artjoms Daškevičs (a very talented student, whose bachelor thesis was even nominated for the best Computer Science thesis in Latvia) – some time ago. As part of that study, an Internet of Things Search Engine- (IoTSE-) based tool called ShoBEVODSDT (Shodan- and Binary Edge-based Vulnerable Open Data Sources Detection Tool) was developed. This “toy example” of IoTSE conducts a passive assessment – it does not harm the databases but rather checks for potentially existing bottlenecks or weaknesses which, if an attack were to take place, could be exposed. It allows either a comprehensive analysis of all unprotected data sources falling into the list of predefined data sources – MySQL, PostgreSQL, MongoDB, Redis, Elasticsearch, CouchDB, Cassandra and Memcached – or the definition of an IP range to examine what can be seen about the data source from outside the organization (read more in (Daskevics and Nikiforova, 2021)).
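The passive idea can be illustrated roughly as follows (my own simplified sketch, not the tool’s actual code; the category names and “compromise markers” are illustrative only): given only what a harmless, read-only connection attempt reveals, the data source is classified without touching its content.

```python
# Hypothetical sketch of a passive assessment in the spirit of ShoBEVODSDT:
# classify a data source from what a read-only probe reveals.

RANSOM_MARKERS = ("read_me_to_recover", "ransom", "bitcoin")

def classify_source(connected, auth_required, sample_names):
    """sample_names: table/index/key names visible without authentication."""
    if not connected:
        return "not reachable"            # invisible from outside: best case
    if auth_required:
        return "reachable, protected"     # visible, but access is controlled
    visible = " ".join(sample_names).lower()
    if any(marker in visible for marker in RANSOM_MARKERS):
        return "open, already compromised"  # e.g. ransom-note names left behind
    return "open, readable"               # unprotected: data can be gathered

print(classify_source(True, False, ["users", "READ_ME_TO_RECOVER_DATA"]))
# open, already compromised
```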

The remainder was mostly built around four questions (and articles / book chapters) that we addressed with its help, namely:

  • Which data sources have proven to be the most vulnerable and visible outside the organization?
  • What data can be gathered from open data sources (if any), and what is their “value” for external actors, such as attackers and fraudsters? Can these data pose a threat to the organization by being used to deploy an attack?

This part was built around our conference paper and this book chapter. In short (for a somewhat longer answer refer to the article), the number of data sources accessible outside the organization is less than 2% (more than 98% of data sources are not accessible via a simple IoTSE tool). However, some data sources may pose risks to organizations: 12% of open data sources – i.e., data sources the IoTSE tool was able to reach – were already compromised or contain data that can be used to compromise them. Elasticsearch and Memcached had the highest ratio of instances to which it was possible to connect, while MongoDB, PostgreSQL and Elasticsearch demonstrated the most negative trend in terms of already compromised databases (not compromised by us, of course).

In addition, we might be interested in comparing SQL and NoSQL databases, where the latter are less likely to provide security measures, including sometimes very primitive and simple ones such as authentication, authorization (Sahafizadeh et al., 2015) and data encryption. This is what we explored in the book chapter. We were not able to find significant differences. From the “most secure” service viewpoint, CouchDB demonstrated very good security results among the NoSQL databases, and MySQL among the relational databases. However, if a developer needs to use Redis or Memcached, additional security mechanisms and/or activities should be introduced to protect them. It must be understood, however, that these results cannot be broadly generalized with regard to the security of the data storage facilities themselves; they mostly demonstrate how many data storage holders were concerned about the security of their facilities, since many of these facilities offer a series of built-in mechanisms that could be applied. As for the “most insecure” service, Elasticsearch is characterized by weaker and less frequently used security protection mechanisms, which means that the database holder should be wary when using it. A similar conclusion can be drawn for Memcached (although this contradicts CVE Details), for which the total number of vulnerabilities found was the highest. However, the risk of these vulnerabilities was lower compared to Elasticsearch, so it can be assumed that CVE Details either does not count such “low-level” weaknesses or has not yet identified them. In the future, an in-depth analysis of what CVE Details counts as a vulnerability, and further exploration of the correlation with our results, could be carried out.
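As an aside, for Redis – one of the services flagged above as needing extra protection – the difference between an open and a protected instance is visible already in the reply to a harmless PING. Below is a minimal sketch of interpreting the wire-protocol reply (socket plumbing omitted); the reply prefixes are Redis’s actual ones, while the category names are my own:

```python
# Interpreting a Redis reply to PING to gauge exposure (sketch).
# "+PONG" means unauthenticated commands are accepted; "-NOAUTH" means
# requirepass is set; "-DENIED" is Redis's protected-mode refusal.

def redis_exposure(reply: bytes) -> str:
    if reply.startswith(b"+PONG"):
        return "open"            # anyone can run commands: needs protection
    if reply.startswith(b"-NOAUTH"):
        return "auth required"   # password configured
    if reply.startswith(b"-DENIED"):
        return "protected mode"  # refuses external unauthenticated clients
    return "unknown"

print(redis_exposure(b"+PONG\r\n"))                             # open
print(redis_exposure(b"-NOAUTH Authentication required.\r\n"))  # auth required
```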

The next question we were interested in was:

  • Which Baltic country – Latvia, Lithuania or Estonia – has the most open & vulnerable data sources? And will the technological development of Estonia be visible here as well?

This question was raised and partially answered in another conference paper. It is impossible to give an unambiguous answer here: while Latvia showed the highest ratio of successful connections (and Estonia the lowest), Lithuania showed the most negative result in terms of already compromised data sources, and Estonia – in terms of sensitive and non-sensitive data. Estonia, however, had the largest number of data sources from which data could not be obtained (with Latvia having a slightly lower but still relatively good result in this regard). And based on the average value of the data that could be obtained from these data sources, Lithuania again demonstrated the most negative result, which, however, differed only slightly from the results demonstrated by Estonia and Latvia (which may be a statistical error, since the total number of data sources found by our tool differed significantly across these countries). When examining the specific data sources most likely causing the lower results, they vary from one country to another, so it is impossible to single out a most insecure database that is the root of all problems.

And one more question I raised was:

  • Do “traditional” vulnerability registries provide a sufficiently comprehensive view of DBMS security, or should DBMSs be subject to intensive and dynamic inspection by their owners?

This was covered in the book chapter, which provides a comparative analysis of the results extracted from the CVE database against the results obtained by applying the IoTSE-based tool. Not surprisingly, the results are in most cases complementary, and one source cannot completely replace the other. This is not only due to the scope limitations of both sources – CVE Details covers some databases not covered by ShoBEVODSDT and provides insights on a more diverse set of vulnerabilities, while not providing the most up-to-date information and offering very limited insight on MySQL. At the same time, there are cases when both sources refer to the same security-related issue and its frequency, which can be seen as a trend and acted upon by users, who should take action to secure databases that definitely do not comply with the “secure by design” principle. This refers to MongoDB, PostgreSQL and Redis.

All in all, it can be said that the answers to some of these questions may seem obvious or expected; however, as our research has shown, firstly, not all of them are obvious to everyone (i.e., there are no secure-by-design databases/data sources, so the data source owner has to think about their security), and, secondly, not all of these “obvious” answers are 100% correct.

All in all, both the talk and these studies show an obvious reality which, however, is not always visible to the company. While “this may seem surprising in light of current advances, the first step that still needs to be taken when thinking about data security is to make sure that the database uses the basic security features […] Ignorance or non-awareness can have serious consequences leading to data leakages if these vulnerabilities are exploited.” Data security and appropriate database configuration is not only about NoSQL, which is typically considered to be much less secure, but also about RDBMS: this study has shown that RDBMS are also relatively susceptible to various types of vulnerabilities. Moreover, there is no “secure by design” database, which is not surprising, since absolute security is known to be impossible. However, this does not mean that actions should not be taken to improve it. More precisely, it should be a continuous process consisting of a set of interrelated steps, sometimes referred to as “reveal-prioritize-remediate”. It should be noted that 85% of breaches in 2021 were due to the human factor, with social engineering recognized as the most popular pattern [12]. The reason for this is that even in the case of highly developed and mature data and system protection mechanisms (e.g., IDS), the human factor remains very difficult to control. Therefore, education and training of system users regarding digital literacy, as well as the definition, implementation and maintenance of security policies and a risk management strategy, must complement technical advances.

Or, to put it even more simply, once again: digital literacy “to every home”, cybersecurity is not a one-time event but a continuous process, automation whenever possible, cybersecurity governance, the “security first!” principle for all artifacts, processes and ecosystems, and, preferably, the “security-by-design” principle whenever and wherever possible. Or, as I concluded the talk – “We have got to start locking that door!” (by Ross, F.R.I.E.N.D.S) – before we act as Commando.

Big thanks go to the organizers of the event, esp. to Andris Soroka, and to the sponsors who supported such a wonderful event – HeadTechnology, ForeScout, LogPoint, DeepInstinct, IT-Harvest, Pentera, GTB Technologies, Stellar Cyber, Appgate, OneSpan, ESET Digital Security, Veriato, Radware, Riseba, Ministry of Defence of Latvia, CERT.LV, Latvijas Sertificēto Personas Datu Aizsardzības Speciālistu Asociācija, Dati Group, Latvijas Kiberpsiholoģijas Asociācija, Optimcom, Vidzeme University of Applied Sciences, Stallion, ITEksperts, Kingston Technology.

P.S. If, considering the topics I typically cover, you are wondering why I am talking about security this time, let me briefly answer. First, for those who know me better, it is a well-known fact that cybersecurity was my first choice in the big IT world – it was, is and probably will remain my passion, although now it is rather a hobby. It was also the central part of my duties in one of my previous workplaces, incl. the one where I worked with the organizer of this event (oh, my first honeypot…). Second, and related to the first point, this was the topic addressing which one of my professors (during the first or second year of my studies) told me that I must become a researcher (“yes, sure 😀 😀 😀 you must be kidding” was my thought at that point, but I do not laugh at this “ridiculous joke” anymore, and am rather grateful that I was noticed so early and was then constantly reminded about this by other colleagues, which resulted in the current version of me). Third, the data quality and open data that I talk about a lot are all about the value of data, while the two main prerequisites for this value are (1) data quality and (2) data security, so, in fact, data security is an inevitable component that we must think and talk about.

References:

The International Conference on Intelligent Metaverse Technologies & Applications (iMeta) and the 8th IEEE International Conference on Fog and Mobile Edge Computing (FMEC) in Tartu

This year we – the University of Tartu, Institute of Computer Science – have the pleasure of hosting FMEC 2023, taking place in conjunction with iMETA, where iMETA, as you can guess, is associated with the metaverse (more precisely, the International Conference on Intelligent Metaverse Technologies & Applications), while FMEC stands for the Eighth IEEE International Conference on Fog and Mobile Edge Computing.

The FMEC 2023 conference aims to investigate the opportunities and requirements for Mobile Edge Computing dominance, and seeks novel contributions that help mitigate Mobile Edge Computing challenges. That is, the objective of FMEC 2023 is to provide a forum for scientists, engineers, and researchers to discuss and exchange new ideas, novel results and experience on all aspects of Fog and Mobile Edge Computing (FMEC), covering its major areas, which include, but are not limited to, the following tracks:

  • Track 1: Fog and Mobile Edge Computing fuels Smart Mobility
  • Track 2: Edge-Cloud Continuum and Networking
  • Track 3: Industrial Fog and Mobile Edge Computing Applications
  • Track 4: Trustworthy AI for Edge and Fog Computing
  • Track 5: Security and privacy in Fog and Mobile Edge Computing
  • Track 6: Decentralized Data Management and Streaming Systems in FMEC
  • Track 7: FMEC General Track

The iMETA conference, in turn, aims to provide attendees with a comprehensive understanding of the communication, computing, and system requirements of the metaverse. Through keynote speeches, panel discussions, and presentations, attendees had the opportunity to engage with experts and learn about the latest developments and future trends in the field, covering areas such as:

  • AI
  • Security and Privacy
  • Networking and Communications
  • Systems and Computing
  • Multimedia and Computer Vision
  • Immersive Technologies and Services
  • Storage and Processing

As part of these conferences, I had the pleasure of chairing one of the sessions, where the room was carefully selected by the organizers to make me feel as if I were at home – we were located in the so-called Baltic rooms of the VSpa conference center, i.e., Estonia, Lithuania, and Latvia, so guess which room the session took place in? Bingo, Latvia! All in all, 5 talks were delivered:

  • Federated Object Detection for Quality Inspection in Shared Production by Vinit Hegiste
  • Federated Bayesian Network Ensembles by Florian van Daalen
  • Hyperparameters Optimization for Federated Learning System: Speech Emotion Recognition Case Study by Mohammadreza Mohammadi
  • Towards Energy-Aware Federated Traffic Prediction for Cellular Networks by Vasileios Perifanis
  • RegAgg: A Scalable Approach for Efficient Weight Aggregation in Federated Lesion Segmentation of Brain MRIs by Muhammad Irfan Khan, Esa Alhoniemi, Elina Kontio, Suleiman A. Khan and Mojtaba Jafaritadi

Each of the above was followed by a very lively discussion, which continued also after the session. This, in turn, was followed by an insightful keynote delivered by Mérouane Debbah on “Immersive Media and Massive Twinning: Advancing Towards the Metaverse”.

Also, thanks to our colleagues from EEVR (Estonian VR and AR Association), I briefly went to my school times and chemistry lessons having a bit of fun – good point, I’ve always loved them (nerd and weirdo, I know…).

Thanks to the entire FMEC and iMETA organizing team!

💬💬💬 Contributed talk for QWorld Quantum Science Days 2023 (QSD 2023)

In the very last days of May 2023, I had yet another experience – I delivered a contributed talk at QWorld Quantum Science Days 2023 (QSD 2023) titled “Framework for understanding quantum computing use cases from a multidisciplinary perspective and future research directions” (Ukpabi, D.C., Karjaluoto, H., Botticher, A., Nikiforova, A., Petrescu, D.I., Schindler, P., Valtenbergs, V., Lehmann, L., & Yakaryılmaz, A.). The talk is, in fact, based on the paper we made publicly available some time ago and developed even earlier, when together with colleagues from Germany, Spain, Finland, Romania, and Latvia we built a consortium and submitted a project proposal to the CHANSE call “Transformations: Social and Cultural Dynamics in the Digital Age”. We got much further than I expected – in fact, we were notified that we would not be granted the funding only at the very last stage, having gone through all the intermediate evaluation rounds, which was already fascinating news (at least for me). While working on the proposal and building our network, we conducted a preliminary analysis of the area, which, regardless of the outcome of the application, we decided to continue and bring to at least some logical end. We liked our result, so we decided to make it publicly available. And now, a few years on, we submitted our work to QSD 2023 and were accepted. It was a big surprise, and I, as the person delegated by our team to present our study, delivered this talk, where I finally familiarized the audience with our findings. What was my surprise when my talk, which followed immediately after the keynote “Let’s talk about Quantum; Societal readiness through science communication research”, delivered on behalf of Quantum DELTA NL by Julia Cramer, turned out to be in a very similar direction?
It is also worth mentioning a very interesting coincidence: while the keynote elaborated on the DELTA that stands for five major quantum hubs, namely Delft, Eindhoven, Leiden, Twente and Amsterdam, I was preparing the last things for my presentation in the Delta building – the name of the building my office is located in. In both cases, no connection with COVID-19 😀

🤔 What is the paper about?

There has been increasing awareness of the tremendous opportunities inherent in quantum computing. It is expected that the speed and efficiency of quantum computing will significantly impact the Internet of Things, cryptography, finance, and marketing. Accordingly, there has been increased quantum computing research funding from national and regional governments and private firms. However, ❗❗❗ critical concerns regarding legal, political, and business-related policies germane to quantum computing adoption exist ❗❗❗

Since this is an emerging and highly technical domain, most of the existing studies focus heavily on the technical aspects of quantum computing. In contrast, our study highlights its practical and social use cases, which are needed given the increased interest of governments. More specifically, our study offers a multidisciplinary review of quantum computing, drawing on the expertise of scholars from a wide range of disciplines, whose insights coalesce into a framework that simplifies the understanding of quantum computing, identifies possible areas of market disruption and offers empirically based recommendations that are critical for forecasting, planning, and strategically positioning quantum computing for accelerated diffusion.

"Framework for understanding quantum computing use cases from a multidisciplinary perspective and future research directions" (Ukpabi, D.C., Karjaluoto, H., Botticher, A., Nikiforova, A., Petrescu, D.I., Schindler, P., Valtenbergs, V., Lehmann, L., & Yakaryılmaz, A)

To this end, we conducted gray literature research, whose outputs were then structured in accordance with Dwivedi et al., 2021 (Dwivedi et al. (2021). Setting the future of digital and social media marketing research: Perspectives and research propositions. International Journal of Information Management, 59, 102168), which embodies three broad areas – environment, users, and application areas – and the dominant sub-themes presented in the figure below. We found that for application areas, business and finance, renewable energy, medicine & pharmaceuticals, and manufacturing are currently the hottest. For environment, we found subdomains such as ecosystem, security, jurisprudence, and institutional change & geopolitics. And for users, nothing surprising – as is typical: customers, firms, countries. We then dived into each of those areas, and later came up with the most popular, the most promising, and the most overlooked topics.

Sounds interesting? Read the paper here, find slides here, watch video here.

Quantum Science Days is an annual, international, and virtual scientific conference organized by QWorld (Association) to provide opportunities to the quantum community to present and discuss their research results at all levels (from short projects to thesis work to research publications), and to get to know each other. The third edition (QSD2023) included 7 invited speakers, 10 thematic talks on “Building an Open Quantum Ecosystem”, 31 contributed talks, an industrial demo session by Classiq, and a career talk on quantum. QSD2023 was sponsored by Unitary Fund & Classiq and supported by Latvian Quantum Initiative.


Guest Lecture for the Federal University of Technology – Paraná (UTFPR) on Open Data Ecosystems in and for sustainable development of data-driven smart cities and Society 5.0

Today (May 16, 2023), I had the pleasure of delivering one more guest lecture for master’s and doctoral students of the Federal University of Technology – Paraná (Universidade Tecnológica Federal do Paraná (UTFPR)) as part of the Smart Cities course delivered by prof. Regina Negri Pagani. This time the topic of my lecture was “Open Data Ecosystems in and for sustainable development of data-driven smart cities and Society 5.0”.

As part of this lecture we talked about the open data and open government data (OGD) phenomena and how they have evolved over the years, what the open data ecosystem is and what constitutes it. I then tried to put this in the context of Brazil, reflecting on the current state of the art of open government and OGD in Brazil and its cities, referring to the Open Government Partnership (Brazil was one of the founding countries of OGP), existing OGD, transparency and central bank portals, studies that explored the effects of predictors of citizens’ attitudes and intention to use OGD (*by de Souza, Ariel Antônio Conceição, Marcia Juliana d’Angelo, and Raimundo Nonato Lima Filho), factors influencing civil servants’ intention to disclose data (**by Fernando Kleiman, Sylvia J.T. Jansen, Sebastiaan Meijer, Marijn Janssen), as well as the relationship between transparency and open data initiatives in five Brazilian cities (identifying that they are not related for these five cities) (***by Araújo, Ana Carolina, Lucas Reis, and Rafael Cardoso Sampaio).

Then, presenting the concepts of Smart Cities and their “generations”, Sustainable Cities and Sustainable Smart Cities, as well as Society 5.0 (aka Super Smart Society and Society of imagination), I highlighted the overlaps and interweavings of the above and how the development of one contributes to the other, i.e. how interrelated they are and how complex this large ecosystem is.

The remaining part of the lecture focused on the topic of open data ecosystems, starting with the current state of the art around the topic, i.e. differing and similar definitions, components, characteristics, etc., and finally the study we conducted some time ago with my colleagues from the Czech Republic, Poland, Finland, Germany, and Latvia, namely “Transparency of open data ecosystems in smart cities: Definition and assessment of the maturity of transparency in 22 smart cities”**** published in Sustainable Cities and Society (Elsevier), in which we:

  • developed a benchmarking framework consisting of 36 features to assess the level of transparency of open data ecosystems in smart cities, by adapting the transparency-by-design framework for open data portals (*****by Lněnička and Nikiforova, 2021);
  • investigated smart city data portals’ compliance with the transparency requirements by applying the developed framework to 34 portals representing 22 smart cities, which allowed determining the level of transparency maturity at the general, individual, and group levels;
  • developed a four-level transparency maturity model that classifies a portal as developing, defined, managed, or integrated, thereby allowing key issues to be identified, transformed into corrective actions for the agenda, and used to navigate toward the set of more competitive portals;
  • ranked the portals based on their transparency maturity, thereby allowing more successful portals to be identified and used as examples for improving overall or feature-wise performance, with recommendations for identifying and improving the current maturity level and specific features;
  • conceptualized an open data ecosystem in the context of a smart city (!!!) and determined its key components, considering the data-centric and data-driven infrastructure and other components and relationships, using the system theory approach;
  • on the basis of the dominant components of the data infrastructure, defined five types of current open data ecosystems (see below), thereby opening up a new horizon for research in the area of sustainable and socially resilient smart cities by means of open data and citizen-centered open smart city governance.
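To give a concrete feel for how such an assessment could work, here is a minimal, purely illustrative Python sketch: a portal is scored on a set of binary transparency features and the score is mapped to one of the four maturity levels. The feature names and the threshold values are my own assumptions for illustration, not the ones used in the paper.

```python
# Illustrative sketch: score a portal on binary transparency features
# and map the result to one of four maturity levels.
# Feature names and thresholds below are hypothetical, not from the study.

MATURITY_LEVELS = [
    (0.25, "developing"),
    (0.50, "defined"),
    (0.75, "managed"),
    (1.00, "integrated"),
]

def transparency_score(feature_results: dict) -> float:
    """Share of transparency features the portal satisfies (0.0-1.0)."""
    return sum(feature_results.values()) / len(feature_results)

def maturity_level(score: float) -> str:
    """Map a score to a maturity level using the (assumed) thresholds."""
    for upper_bound, level in MATURITY_LEVELS:
        if score <= upper_bound:
            return level
    return MATURITY_LEVELS[-1][1]

# Hypothetical assessment of one smart city portal:
portal = {
    "machine_readable_formats": True,
    "open_license_stated": True,
    "metadata_completeness": True,
    "data_request_log_public": False,
    "api_access": True,
    "update_frequency_stated": False,
}

score = transparency_score(portal)
print(f"score={score:.2f}, level={maturity_level(score)}")
```

In the actual study the framework has 36 features and the portals are also ranked against each other; the sketch only shows the scoring-and-classification idea.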

Our definition of an open data ecosystem in the smart city context, established based on the knowledge and experience of the experts involved and the observations made during the study, is:

“systematic efforts to integrate ICT and technologies into city life to deliver citizen-centric, better-quality services and solutions to city problems with open data published through the data-centric and data-driven infrastructure.”

The concepts that affect and shape the ecosystem are:

  • stakeholders and their roles,
  • phases of the data lifecycle, in which a stakeholder participates in the ecosystem,
  • technical and technological infrastructure,
  • generic services and platforms,
  • human capacities and skills of both providers and consumers,
  • smart city domains (thematic categories) as the targeted areas for data reuse,
  • externalities affecting goals, policy, and resources,
  • level of (de)centralization of data sources – development, restrictions,
  • perception of importance and support from public officials,
  • user interface, user experience, and usability.
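As a rough illustration only, the shaping concepts listed above could be captured in a simple data model; all class and field names here are my own shorthand for the listed concepts, not an artifact of the study.

```python
from dataclasses import dataclass

# Illustrative data model of the concepts that shape an open data
# ecosystem in a smart city; names are my own shorthand.

@dataclass
class Stakeholder:
    name: str
    role: str                   # e.g. provider, consumer, intermediary
    lifecycle_phases: list      # phases of the data lifecycle involved in

@dataclass
class OpenDataEcosystem:
    stakeholders: list          # list[Stakeholder]
    infrastructure: list        # technical and technological components
    services_and_platforms: list
    smart_city_domains: list    # thematic categories targeted for reuse
    externalities: list         # factors affecting goals, policy, resources
    decentralization_level: str # e.g. "centralized", "decentralized"
    official_support: str       # perceived importance among public officials
    ux_notes: str               # user interface / experience / usability

# A hypothetical, minimal instance:
eco = OpenDataEcosystem(
    stakeholders=[Stakeholder("city data office", "provider", ["publish"])],
    infrastructure=["OGD portal"],
    services_and_platforms=["open data API"],
    smart_city_domains=["transport", "air quality"],
    externalities=["national open data policy"],
    decentralization_level="centralized",
    official_support="high",
    ux_notes="single portal with a searchable catalog",
)
print(eco.decentralization_level)
```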

As for the types of current open data ecosystems, we identified five types, as follows:

  • type#1: the city’s OGD portal is the center of the data infrastructure, and all OGD, including those labeled as smart, are published and centralized through it. For this type of open data ecosystem, other websites that had previously provided open data or other services to access public sector information have been replaced by the OGD portal. The focus is on datasets, providing features to work with them, reuse them, and make all data requests transparent in one place;
  • type#2: this ecosystem also usually has the OGD portal as the central point, but other portals and platforms publish open data. The smart data portal and online city dashboards focusing on different dimensions such as transport, health, air quality, etc., are important components of this ecosystem;
  • type#3: a decentralized type of ecosystem that includes many components such as an OGD portal, smart data portal, geodata portal, etc. However, this increases the ecosystem’s complexity, making it more difficult to manage and less usable for stakeholders;
  • type#4: the smart city portal focused on projects and services is usually the center of this ecosystem, but it is not the priority to provide data and appropriate features to reuse them. Most services are developed by public sector organizations, research institutions, or businesses and provided to citizens;
  • type#5: apart from the city’s OGD portal, there are additional transparency-, participation-, collaboration-, and cooperation-oriented websites and portals to support the formation and improvement of relations between stakeholders. This type of ecosystem is focused on processes to improve open data reuse.

Sounds interesting? Read the article here and see other recommended articles below! 🙂

This was then wrapped up by emphasizing key overlooked topics that receive too little attention, although they are crucial for a sustainable public data ecosystem.

And I can only hope that this lecture was at least a little bit as interesting as my dear colleague prof. Regina Negri Pagani characterized it! It is always a pleasure to hear her feedback, as her comments are so gentle and inspiring! There is nothing better than hearing such wonderful and positive feedback and receiving an immediate invitation to the next editions of this course, which I will gladly accept; this was the 2nd edition of the course in which I served as a guest lecturer, and I will definitely be glad to make this yet another good tradition!

References:

*de Souza, A. A. C., d’Angelo, M. J., & Lima Filho, R. N. (2022). Effects of predictors of citizens’ attitudes and intention to use open government data and Government 2.0. Government Information Quarterly, 39(2), 101663.

**Kleiman, F., Jansen, S. J. T., Meijer, S., & Janssen, M. (2023). Understanding civil servants’ intentions to open data: Factors influencing behavior to disclose data. Information Technology & People.

***Araújo, A. C., Reis, L., & Sampaio, R. C. (2016). Do transparency and open data walk together? An analysis of initiatives in five Brazilian capitals. Media Studies, 7(14).

****Lnenicka, M., Nikiforova, A., Luterek, M., Azeroual, O., Ukpabi, D., Valtenbergs, V., & Machova, R. (2022). Transparency of open data ecosystems in smart cities: Definition and assessment of the maturity of transparency in 22 smart cities. Sustainable Cities and Society, 82, 103906.

*****Lnenicka, M., & Nikiforova, A. (2021). Transparency-by-design: What is the role of open data portals?. Telematics and Informatics, 61, 101605.

Some other studies you might be interested in:

Rii Forum 2023 “Innovation 5.0: Navigating shocks and crises in uncertain times Technology-Business-Society” & a plenary debate “Advances in ICT & the Society”

Last week, I had an unforgettable experience at the Research and Innovation Forum (RiiForum) in Krakow, Poland, which I posted about previously, serving as a plenary speaker and session chair. It was another great experience to take part in an absolutely amazing plenary session titled “Advances in ICT & the Society: threading the thin line between progress, development and mental health”, where we – Prof. Dr. Yves Wautelet, Prof. Dr. Marek Krzystanek, Karolina Laurentowska & Prof. Marek Pawlicki – discussed the disruptive technologies that have entered our professional lives in recent years: how they affected us and our colleagues, how they affect(ed) society and its specific groups, including their mental health, and the general perception of technology, i.e. whether it is an enemy of humanity or rather a friend and support, and how to make sure the latter is the case. From this we developed a discussion around AI, ChatGPT, the Metaverse, and blockchain, even slightly touching on quantum computing. Of course, all this was placed in the context of democracy and freedoms/liberties. All in all, we approached the topic of governance and policy-making, which is too often reactive rather than proactive and, in turn, leads to many negative consequences, and also elaborated on engineering practices.

To sum up: emerging and disruptive technologies, blockchain, AI, the Metaverse, digital competencies, education, liberty, democracy, openness, engagement, inclusivity, Industry 5.0, Society 5.0 – this is not a list of buzzwords, but a list of topics that we, both plenary speakers and the audience, managed to cover and continued to discuss throughout the whole conference. Rich enough, isn’t it?

And the day did not end there, continuing with several super insightful sessions; the one I enjoyed most was, of course, the one I chaired. It featured three high-quality talks, each followed by a rich discussion thanks to an excellent audience, despite this being the last session of the day (before the dinner), namely:

  • Privacy in smart cities using VOSviewer: a bibliometric analysis by Xhimi Hysa, Gianluca Maria Guazzo, Vilma Cekani, Pierangelo Rosati
  • Public policy of innovation in China by Krzysztof Karwowski, Anna Visvizi
  • How Human-Centric solutions and Artificial Intelligence meet smart cities in Industry 5.0 by Tamai Ramirez, Sandra Amador, Antonio Macia-Lillo, Higinio Mora

And last but not least, Krakow surprised me a lot (in a positive sense, of course) – it was my first time in Poland, and I am absolutely glad that it happened in such a beautiful city as Krakow, a place with rich history and culture! Thank you, dear RiiForum2023 organizers – Anna Visvizi, Vincenzo Corvello, Orlando Troisi, Mara Grimaldi, Giovanni Baldi, and everyone who was involved – it is always a pleasure to be a part of this community!