ICEGOV2022 workshop: Identification of high-value dataset determinants: is there a silver bullet?

This year, the 15th International Conference on Theory and Practice of Electronic Governance, known as ICEGOV2022, will focus on “Digital Governance for Social, Economic, and Environmental Prosperity”. We (Charalampos Alexopoulos, Nina Rizun, Magdalena Ciesielska and I) are glad to announce our own community-based, participatory, interactive workshop aimed at identifying High-Value Dataset (HVD) determinants towards efficient sustainability-oriented data-driven development.

Briefly about the workshop, our motivation, our objective and why we want to make you a part of it…

Today, Open Government Data (OGD) are seen as one of the trends that can potentially benefit the economy, improve the quality, efficiency, and transparency of public services, and transform our lives by contributing to efficient sustainability-oriented data-driven development. Neither their scope nor the actors who can work with them are subject to any restrictions. In addition to these “classical” benefits, OGD are considered drivers and promoters of Industry 4.0 and Society 5.0 [1,2], including Smart City trends. OGD are also a driver of economic growth: according to [3], the open data market size in 2020 was estimated at €184 billion, and it is expected to grow in the coming years, reaching between €199.51 and €334.21 billion in 2025. However, the achievement of these benefits is closely linked to the “value” of the data, i.e. the extent to which the data provided by public agencies are interesting, useful and valuable for reuse, creating value for society and the economy. High data availability, however, can disorient users when deciding which sources are best suited to their needs [4]. Practice shows that the majority of datasets available on OGD portals are not used, and only a few datasets create value for users [5], [6]. This is in line with Quarati and Martino [4], who provided a snapshot of the use of 15 OGD portals based on the available usage indicators. It also applies to Latvia [7,8]. In other words, to benefit from OGD, countries should open data cleverly, prioritising quality and data value over quantity, since the benefits of OGD can only be obtained if the data are re-used and transformed into value.

This is where the concept of “high-value datasets” comes in, pointing to the data that would create the highest value for society and the economy. High-value data are defined as the data “the re-use of which is associated with important benefits for society, the environment and the economy, in particular because of their suitability for the creation of value-added services, applications and new, high-quality and decent jobs, and of the number of potential beneficiaries of the value-added services and applications based on those datasets” [9]. Although the PSI Directive is a step in this direction by announcing six categories [9], these categories appear to be generic and do not take into account the national perspective, i.e. the nature of these datasets will depend to a large extent on the country concerned [10,11].
It is therefore important to support the identification of high-value datasets, which would enhance the interest of OGD users by allowing data to be transformed into innovative solutions and services. Research suggests that different perspectives on identifying “high-value datasets” appear in the literature and that there is no consensus on the most comprehensive one, so a number of activities covering these perspectives will be carried out, with the perspectives first identified within the workshop.

This workshop aims to open a discussion on the identification of high-value datasets in order to reach a common understanding of how this could be done in general terms, i.e. which activities would lead to a better understanding and a clearer vision of what the most valuable datasets are for the society and economy of a particular country, and how they can be identified (how? who? etc.). The topic is very important these days, given that opening up datasets with high potential for use and re-use is expected to facilitate the creation of new products or services with positive economic and social impact [12]. However, identifying these data is a complicated task, particularly where country-specific datasets are concerned.

This workshop continues the work presented at ICEGOV2021 [13], where a first step in this direction was taken by conducting a survey of individual users and SMEs in Latvia, aimed at clarifying their level of awareness of the existence of OGD, their usage habits, and their overall level of satisfaction with the value of OGD and their potential. This time we aim to develop a framework for the identification of high-value datasets (and their determinants) as the result of a comprehensive study conducted jointly with ICEGOV participants. All in all, the objective of the workshop is to raise awareness of the HVD issue and establish a network of the major stakeholders around it, and to allow each participant to think about whether and how HVD are determined in their country and how this can be improved with the help of portal owners, data publishers, data owners and citizens. Our main motivation is that, as members of the ICEGOV community, we could jointly answer the following questions, which represent the objectives of the workshop:

  1. How can the “value” of open data be defined?
  2. What are the current indicators for determining the value of data? Can they be used to identify valuable datasets to be opened? What are the country-specific high-value determinants (aspects) participants can think of?
  3. How can high-value datasets be identified? What mechanisms and/or methods should be put in place to allow their determination? Could there be an automated way to gather information for HVD? Can they be identified by third parties, e.g. researchers and enthusiasts, and by potential data publishers, i.e. data owners?
  4. What should be the scope of the framework, i.e. who is the target audience that should be made aware of HVD by applying this framework: public officials / servants, data owners, intermediaries? (This will be discussed with the participants, or will steer our discussion, depending on the participants and their profiles.)

More precisely, the following “procedure” is expected to be followed:

  • STEP 0 (optional, conducted by participants): participants are invited to familiarise themselves with the open data portals of their country (broader coverage, i.e. of more than their own country, is welcome) by inspecting their current state in terms of both content, i.e. the data available, and functionality, with particular interest in features related to HVD determination (if any), including citizen-engagement-oriented features, features that allow the current interest of users to be tracked, etc.
  • STEP 1: A brief introduction to the current state of the art [approximately 45 minutes]: how HVD are seen by the PSI Directive and what tasks are set for countries regarding the determination and opening of HVD; how countries are coping with this (both from grey literature and from personal experience in Latvia); what approaches and methods for determining HVD are known and why there is no uniform method / framework. A brief overview of the results of a survey of individual users and small and medium-sized enterprises (SMEs) in Latvia on their view of the current state of the data, i.e. to what extent the data meet their needs, what data might be useful for them, and how the availability of such data would affect their willingness to use them. An overview of the Deloitte report on HVD: what methodology and indicators were used, and what the results of the study are.
  • STEP 2: Considering the diversity of perceptions of the term “value” (depending on the domain, actor, etc.), a brainstorming (idea generation) discussion is expected to produce as many definitions as possible, which are then used to derive more comprehensive definition(s) that consider different perspectives (domain- and actor-related) [approximately 30-45 minutes]
  • STEP 3: Brainstorming discussion of current methods / mechanisms for determining the current value of data and determining HVD [approximately 20-30 minutes]
  • STEP 4: Brainstorming (idea generation) on potential methods / mechanisms for determining the current value of data and determining HVD [approximately 20-30 minutes]
  • STEP 5: Iterative filtering of the features, methods and approaches that could constitute the framework for the determination of high-value datasets, in the form of a Delphi-like analysis [approximately 45 minutes]
  • STEP 6: Agenda for future research, networking [approximately 30 minutes]
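
The iterative filtering in STEP 5 could, for instance, be run as a simple rating exercise. The following is only a minimal sketch and not necessarily the procedure the workshop will follow: it assumes that each participant rates every candidate determinant on a 1 to 5 scale in each round, that a candidate is accepted when the median rating reaches an upper threshold, rejected when the median falls to a lower threshold, and re-rated in the next round otherwise. The function names, the thresholds and the sample candidates are all hypothetical.

```python
from statistics import median

# Hypothetical thresholds on a 1-5 rating scale (assumptions, not workshop rules).
KEEP_AT = 4.0   # median at or above this: candidate is accepted
DROP_AT = 2.0   # median at or below this: candidate is rejected


def filter_round(ratings):
    """Split candidates into accepted, rejected and undecided for one round.

    ratings maps a candidate name to the list of participant ratings.
    """
    accepted, rejected, undecided = [], [], []
    for name, scores in ratings.items():
        m = median(scores)
        if m >= KEEP_AT:
            accepted.append(name)
        elif m <= DROP_AT:
            rejected.append(name)
        else:
            undecided.append(name)
    return accepted, rejected, undecided


def delphi_like(rounds):
    """Run successive rounds; only undecided candidates are re-rated."""
    accepted, rejected = [], []
    undecided = None
    for ratings in rounds:
        if undecided is not None:
            # Ignore ratings for candidates that were already decided.
            ratings = {n: s for n, s in ratings.items() if n in undecided}
        acc, rej, undecided = filter_round(ratings)
        accepted += acc
        rejected += rej
        if not undecided:
            break
    return accepted, rejected, undecided


# Made-up example data: two rounds, four hypothetical determinants.
round_1 = {
    "reuse potential": [5, 4, 4, 5],
    "download count": [3, 4, 2, 3],
    "file format": [1, 2, 2, 1],
    "national priority": [3, 3, 4, 4],
}
round_2 = {
    "download count": [4, 4, 3, 5],
    "national priority": [3, 3, 3, 4],
}
accepted, rejected, undecided = delphi_like([round_1, round_2])
print(accepted, rejected, undecided)
```

The median is used here because it is less sensitive to a single extreme rating than the mean; any candidate still undecided after the last round would be a natural item for the open discussion.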

This is a community-based, participatory, interactive workshop aimed at engaging participants. Instead of asking participants to write a paper to be presented later at the workshop in a sit-and-listen format, we expect to establish a lively and interesting discussion of novel ideas, answering existing questions and raising new ones. The audience of the workshop is ICEGOV participants, without restriction on the domain they represent or on their affiliation, interests, knowledge and experience. Both OGD experts and those who are not familiar with OGD are welcome.

Join us this October (4 – 7 October 2022)!

References:

  1. Bargiotti, L., De Keyzer, M., Goedertier, S., & Loutas, N. (2014). Value based prioritisation of Open Government Data investments. European Public Sector Information Platform.
  2. Bertot, J. C., McDermott, P., & Smith, T. (2012, January). Measurement of open government: Metrics and process. In 2012 45th Hawaii International Conference on System Sciences (pp. 2491-2499). IEEE.
  3. Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information.
  4. European Commission, The Digital Economy and Society Index (DESI), online, https://ec.europa.eu/digital-single-market/en/digital-economy-and-society-index-desi, last accessed: 7.04.2021.
  5. Gagliardi, D., Schina, L., Sarcinella, M. L., Mangialardi, G., Niglia, F., & Corallo, A. (2017). Information and communication technologies and public participation: interactive maps and value added for citizens. Government Information Quarterly, 34(1), 153-166.
  6. Huyer, E., & Blank, M. (2020). Analytical Report 15: High-value datasets: understanding the perspective of data providers. Luxembourg: Publications Office of the European Union. doi:10.2830/363773
  7. Kampars, J., Zdravkovic, J., Stirna, J., & Grabis, J. (2020). Extending organizational capabilities with Open Data to support sustainable and dynamic business ecosystems. Software and Systems Modeling, 19(2), 371-398.
  8. Kotsev, A., Cetl, V., Dusart, J., & Mavridis, D. (2018). Data-driven Economies in Central and Eastern Europe
  9. Kucera, J., Chlapek, D., Klímek, J., & Necaský, M. (2015). Methodologies and Best Practices for Open Data Publication. In DATESO (pp. 52-64).
  10. McBride, K., Toots, M., Kalvet, T., & Krimmer, R. (2019). Turning Open Government Data into Public Value: Testing the COPS Framework for the Co-creation of OGD-Driven Public Services. In Governance Models for Creating Public Value in Open Data Initiatives (pp. 3-31). Springer, Cham.
  11. Nikiforova, A., & Lnenicka, M. (2021). A multi-perspective knowledge-driven approach for analysis of the demand side of the Open Government Data portal. Government Information Quarterly, 101622.
  12. Ruijer, E., Détienne, F., Baker, M., Groff, J., & Meijer, A. J. (2020). The politics of open government data: Understanding organizational responses to pressure for more transparency. The American Review of Public Administration, 50(3), 260-274.
  13. Nikiforova, A. (2021, October). Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia. In 14th International Conference on Theory and Practice of Electronic Governance (pp. 367-372).


Editorial Board Member of Data & Policy (Cambridge University Press)

In July 2022, I was elected by the Syndicate of Cambridge University Press as an Editorial Board Member of the journal Data & Policy. Data & Policy is a peer-reviewed, open access venue dedicated to the potential of data science to address important policy challenges. For more information about the goal and vision of the journal, read the editorial “Data & Policy: A new venue to study and explore policy–data interaction” by Stefaan G. Verhulst, Zeynep Engin, and Jon Crowcroft. More precisely, I act as an Area Editor of the “Focus on Data-driven Transformations in Policy and Governance” area (with the proud short name “Area 1”). This Area focuses on the high-level vision for the philosophy, ideation, formulation and implementation of new approaches leading to paradigm shifts, innovation and efficiency gains in collective decision-making processes. Topics include, but are not limited to:

  • Data-driven innovation in public, private and voluntary sector governance and policy-making at all levels (international, national and local): applications for real-time management, future planning, and rethinking/reframing governance and policy-making in the digital era;
  • Data and evidence-based policy-making;
  • Government-private sector-citizen interactions: data and digital power dynamics, asymmetry of information; democracy, public opinion and deliberation; citizen services;
  • Interactions between human, institutional and algorithmic decision-making processes, psychology and behaviour of decision-making;
  • Global policy-making: global existential debates on utilizing data-driven innovation with impact beyond individual institutions and states;
  • Socio-technical and cyber-physical systems, and their policy and governance implications.

The remaining areas represent more specifically the current applications, methodologies and strategies which underpin the broad aims of Data & Policy’s vision: Area 2 “Data Technologies and Analytics for Policy and Governance”, Area 3 “Policy Frameworks, Governance and Management of Data-driven Innovations”, Area 4 “Ethics, Equity and Trust in Policy Data Interactions”, Area 5 “Algorithmic Governance”, Area 6 “Data to Tackle Global Issues and Dynamic Societal Threats”.

Editorial committees of Data & Policy (Area 1)

We are interested in four types of submission:

  • Research articles use rigorous methods to investigate how data science can inform or impact policy by, for example, improving situation analysis, predictions, public service design, and/or the legitimacy and/or effectiveness of policy making. Published research articles are typically reviewed by three peer reviewers: two assessing the academic or methodological rigour of the paper, and one providing an interdisciplinary or policy-specific perspective. (Approx 8,000 words in length).
  • Commentaries are shorter articles that discuss and/or problematize an issue relevant to the Data & Policy scope. Commentaries are typically reviewed by two peer reviewers. (Approx 4,000 words in length).
  • Translational articles are focused on the transfer of knowledge from research to practice and from practice to research. See our guide to writing translational papers. (Approx 6,000 words in length).
  • Replication studies examine previously published research, whether in Data & Policy or elsewhere, and report on an attempt to replicate findings.

Read more about Data & Policy and consider submitting your contribution!

Moreover, as part of this journal, we (the Data & Policy community) organize the “Data for Policy: Ecosystems of innovation and virtual-physical interactions” conference in a hybrid physical-virtual format, with one-day, in-person conferences held in three regions: Asia (Hong Kong), America (Seattle) and Europe (Brussels). I sincerely recommend that you consider it and, preferably, attend! While this is already the seventh edition of the conference, this is the first year I am taking part in its organization, so I am especially excited about and interested in its success!

Data for policy, Area Editors

In addition to its six established Standard Tracks, and reflecting its three-regions model this year, the Data for Policy 2022 conference highlights “Ecosystems of innovation and virtual-physical interactions” as its theme. Distinct geopolitical and virtual-physical ecosystems are emerging as everyday operations and important socio-economic decisions are increasingly outsourced to digital systems. For example, the US’s open market approach empowering multinational digital corporations contrasts with greater central government control in the Chinese digital ecosystem, and radically differs from Europe’s priority on individual rights, personal privacy and digital sovereignty. Other localised ecosystems are emerging around national priorities: India focuses on the domestic economy, and Russia prioritises public and national security. The Global South remains underrepresented in the global debate. The developmental trajectory for the different ecosystems will shape future governance models, democratic values, and the provision of citizen services. In an envisioned ‘metaverse’ future, boundaries between physical and virtual spaces will become even more blurred, further underlining the need to scrutinise and challenge the various systems of governance.

The Data for Policy conference series is the premier global forum for multidisciplinary and cross-sector discussions around the theories, applications and implications of data science innovation in governance and the public sector. Its associated journal, Data & Policy, published by Cambridge University Press, has quickly established itself as a major venue for publishing research in the field of data-policy interactions. Data for Policy is a non-profit initiative, registered as a community interest company in the UK, supported by sustainer partners Cambridge University Press, the Alan Turing Institute and the Office for National Statistics.

Read more about Data for Policy and become a part of it!