2023-03-09

Agenda

  • The DAUF project - “Datadriven Analys och Uppföljning av KTHs Forskning” (Data-driven Analysis and Follow-up of KTH's Research)
  • A demo of the “KTH Publication Analysis app” with project data from CASE
  • Overview of available data sources for projects at KTH
  • Your questions and feedback

About the project

  • Collaboration within VS, between KTH-Library, RSO and ITA.
  • Agile model with two-week sprints.
  • Creating services and tools for presentation of data, improved data flows and connecting data sources within KTH.

Background and progress

DAUF - background and context

Trend towards better overview of research outputs as well as integration of systems.

Example from Stanford

  Stanford RIALTO project (YouTube)

Related, currently at KTH

Progress - What is new since the last demo meeting

  • KTH Publication Analysis app
    • Allowing exploration of co-publication and collaboration at KTH
    • For ad hoc research groups
    • Rolling data
    • CASE project data
  • ABM, Annual Bibliometric Monitoring
    • Sustainable Development Goals
  • Mobilization of project data from other sources
    • CASE - “Case Management System for Research Projects” @ KTH
    • Swecris
    • CORDIS (EU projects)
    • Other sources

Demo of KTH Publication Analysis app

KTH Publication Analysis app

The idea is that the analysis app will complement the Annual bibliometric monitoring (ABM), and answer questions like:

  • Who are the main collaborating partners to the division of Electric Power and Energy Systems?

  • How large is the publication output of the department of Gene technology over time, in co-publication with German organisations?

  • Which senior staff at CBH are collaborating with MIT, and how does the bibliometric performance of this research compare to the CBH baseline?

  • What is the publication output and project count for an ad hoc grouping of staff related to a particular project?

KTH Publication Analysis app - collaboration

KTH Publication Analysis app - project view

Projects related to staff from ITM:Energy technology

Example based on TECoSA

The TECoSA (Trustworthy Edge Computing Systems and Applications) Centre, running since 2020, is used as an example, based on 9 PIs.

  • 462 publications 2010-2022

  • 137 publications 2020-2022 (i.e. from the TECoSA period)

  • 28 projects in CASE - “Case Management System for Research Projects” @ KTH

  • 14 projects 2020-2022 (there seem to be more in CASE, but some lack data on duration)

  • Info on funders and SDGs for projects

  • Picture of collaborations, through co-publication

Data for projects at KTH

DAUF and data about projects at KTH

DAUF has been building bottom-up data consolidation and analytics for KTH, making links between Researchers <-> Outputs <-> Projects <-> Organisations

“Which projects are active? Which researchers are involved? From which organisations? Grant sizes?”
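
A minimal sketch of how such linked records could answer the questions above. The `Project` class, its field names and the toy records are hypothetical and only serve as illustration; they are not the DAUF data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Project:
    # Hypothetical record layout, not the actual DAUF schema.
    project_id: str
    start: date
    end: date
    researcher_ids: List[str] = field(default_factory=list)
    organisation_ids: List[str] = field(default_factory=list)
    grant_sek: Optional[int] = None  # None when the grant size is unknown


def active_projects(projects, on):
    """Return the projects whose start/end interval contains the date `on`."""
    return [p for p in projects if p.start <= on <= p.end]


# Made-up toy records, for illustration only.
projects = [
    Project("P1", date(2020, 1, 1), date(2022, 12, 31), ["r1", "r2"], ["org1"], 1_000_000),
    Project("P2", date(2023, 1, 1), date(2025, 12, 31), ["r2"], ["org1", "org2"]),
]

active = active_projects(projects, date(2023, 3, 9))
researchers = sorted({r for p in active for r in p.researcher_ids})
organisations = sorted({o for p in active for o in p.organisation_ids})
```

Keeping the grant size optional reflects that some sources may not report it; a missing value is kept as unknown rather than treated as zero.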

Definition of “project”?

There is no single “source of truth” for all projects at KTH. Data requirements depend on who asks the questions and for what purpose.

  • What is a project, really? It is an amorphous/vague concept:
    • Externally financed research activity, i.e. requires that funder(s) and funding exists?
    • A call which has (not yet?) been awarded financing?
    • Internal ongoing “project work” - internally financed?
    • A way to topically group publications and their authors together?

What are the most important questions you have in relation to projects at KTH?

Mobilized data for KTH projects

Object storage / S3 used

Tooling and components

External data sources for KTH Projects - Overview

CASE: a Research Management process in Edge

  • Internal source at KTH (Efecte Edge) - a “Case Management System for Research Projects”
  • Keeps track of agreements, contracts, legal documents
  • Could be a good source for internally financed projects
  • Can complement what is missing from other open sources
  • Comparatively more diverse (project types, funding organizations)

CASE (I/II) - fields and missingness

CASE (II/II)

  • Encompasses projects that have received funding from 29 different agencies, covering a time frame from 2009 to 2024

  • Information on funding organizations is available, but it is incomplete and needs improvement (e.g. 22% are NA)
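
Missingness per field, as in the NA share mentioned above, could be computed roughly as follows. This is a sketch under assumptions: the field names and records are invented, and treating both `None` and empty strings as missing is our choice, not necessarily how CASE data was assessed.

```python
def missing_share(records, fields):
    """Return, per field, the share (0..1) of records where it is None or empty."""
    total = len(records)
    shares = {}
    for name in fields:
        missing = sum(1 for r in records if r.get(name) in (None, ""))
        shares[name] = missing / total if total else 0.0
    return shares


# Made-up toy records with hypothetical field names.
records = [
    {"title": "Toy project A", "funder": "Funder X", "end_date": "2022-12-31"},
    {"title": "Toy project B", "funder": None, "end_date": ""},
    {"title": "Toy project C", "funder": "Funder Y", "end_date": None},
    {"title": "Toy project D", "funder": "Funder X", "end_date": "2024-06-30"},
]

shares = missing_share(records, ["title", "funder", "end_date"])
```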

Swecris, CORDIS, OpenAIRE

Synergies and Challenges of Integrating Heterogeneous KTH Project Datasets

  • Swecris, CORDIS and OpenAIRE have better data quality than CASE (see image below).

  • They differ in the information they hold, but share some overlapping fields.

  • The data quality is however far from perfect. For example, the percentage of missing information for Primary Researcher and Leading Role in Swecris is around 20%, which is moderately high.

Combining data sources for KTH projects

Overlap - Unions and intersections for different data sources

CASE versus Swecris - 174 exact matches found
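
The overlap between two sources can be sketched with plain set operations on a shared key. The key used for the matching reported here is not stated, so the `ref` field below is a hypothetical reference number, and the light normalisation (lower-casing, collapsing whitespace) is an assumption.

```python
def normalise(key):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(key.lower().split())


def overlap(source_a, source_b, key_field):
    """Compare two lists of records on one key; records without a key are skipped."""
    keys_a = {normalise(r[key_field]) for r in source_a if r.get(key_field)}
    keys_b = {normalise(r[key_field]) for r in source_b if r.get(key_field)}
    return {
        "both": keys_a & keys_b,      # intersection: matched in both sources
        "only_a": keys_a - keys_b,    # present only in the first source
        "only_b": keys_b - keys_a,    # present only in the second source
        "union": keys_a | keys_b,     # all distinct keys
    }


# Made-up toy records; "ref" is a hypothetical shared reference number.
case = [{"ref": "2019-00001"}, {"ref": " 2020-00002 "}, {"ref": None}, {"ref": "X-3"}]
swecris = [{"ref": "2019-00001"}, {"ref": "2020-00002"}, {"ref": "2021-00004"}]

result = overlap(case, swecris, "ref")
```

Records that lack the key cannot be matched at all in this approach, which is one way that missing data could lower the match rate.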

Why are projects not being matched at a high rate?

Why the discrepancy?

Discussion, Questions, Feedback

Future work and directions

Determined based on your input and directions provided by product owners.

  • Subject area analysis in KTH Publication Analysis app
    • UKÄ/SCB categories
    • Topic clusters and keywords
    • Journal classifications
  • Improved filtering
  • Preliminary merged project list (based on CASE + external sources)

Questions and Answers

Please provide your input!

  • Do you need to follow up on projects from both the “contract” perspective and the “research area/topic” perspective?
  • Do you have ideas for additional reference datasets that are useful for analyzing your projects at KTH?
  • Would parameterized interactive reports be useful in addition to the raw data and the app?
  • Should CASE (or other internal KTH sources) be used to complement external sources for project data?
  • Other questions from the Zoom chat
  • Suggestions and comments?

If you prefer to provide written feedback, please use the following jamboard.

Thank you for attending!