LD Connect: Sharing Publication Metadata for Scientific Advancement

31 May 2022 | Martin van Aken and Hodan Mursal, IOS Press, Amsterdam, NL
As a scientific, technical, and medical publisher, IOS Press stands at the forefront of disseminating research in areas such as health and computing science, among many others. Beyond publishing high-quality scholarly journals and books, we need our production process to provide structured, searchable, and actionable information. All IOS Press publication metadata is available via the linked data portal LD Connect. At the IOS Press 35th anniversary symposium in March 2022, we demonstrated its power and the progress the company has made in taking advantage of this technology. In this post, we share further insights and invite you to dive in and explore LD Connect!

Tags: knowledge graph, LD Connect, linked data, IOS Press 35


Discover the tools that provide compelling insights and reveal connections within vast quantities of data


IOS Press publishes hundreds of journal articles and book chapters each month, in both print and digital formats. Digitization, while still adhering to formats that are optimized for print use (such as PDF files), is only the first step towards the goal of creating searchable, structured, and actionable scientific knowledge. Information can be buried inside a large folder of PDF files as easily as – or actually even more easily than – in a bookcase, with the same outcome of making it unusable.

LD Connect supports the move towards a knowledge-centric model. Imagine a world where all published information could be searched and connected, across many journals, authors, and articles. IOS Press' own knowledge graph – which is what LD Connect is – could one day be part of a much larger knowledge graph of all scientific knowledge, accessible to researchers across the boundaries of publishers, universities, and countries – the entire scholarly information ecosystem.

Discover the knowledge graph

For now, we are thrilled to be able to share details about the IOS Press structured data, which is machine-readable and can be interlinked with other data – making it useful through semantic queries. Using this linked data, we have developed tools for users to dive in and gain insights from all the published content. This is done through the knowledge graph. The IOS Press linked data portal LD Connect builds a powerful knowledge graph using links between the data known as “triples” in the form of subject–predicate–object expressions. By enriching and fostering the interlinking of data, contextual relationships among authors, institutions, and research areas can be visualized and interpreted and new links uncovered.

LD Connect visual for linked data discoveries
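To make the idea of triples concrete, here is a minimal Python sketch of a few subject–predicate–object statements and a pattern match over them. The identifiers and predicate names (`paper:1`, `hasAuthor`, `affiliatedWith`, and so on) are invented for illustration and are not the actual LD Connect vocabulary; a real knowledge graph would live in a dedicated triple store rather than a Python list.

```python
from typing import Optional

# A triple is one (subject, predicate, object) statement.
Triple = tuple[str, str, str]

# Hypothetical identifiers and predicates, invented for illustration only.
TRIPLES: list[Triple] = [
    ("paper:1", "hasTitle", "An example article"),
    ("paper:1", "hasAuthor", "author:a"),
    ("paper:2", "hasAuthor", "author:a"),
    ("paper:2", "hasAuthor", "author:b"),
    ("author:a", "affiliatedWith", "org:x"),
    ("author:b", "affiliatedWith", "org:y"),
]


def match(triples: list[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def coauthors(triples: list[Triple], author: str) -> set[str]:
    """Follow links author -> papers -> other authors of those papers."""
    papers = {subj for subj, _, _ in match(triples, p="hasAuthor", o=author)}
    return {obj
            for paper in papers
            for _, _, obj in match(triples, s=paper, p="hasAuthor")
            if obj != author}


if __name__ == "__main__":
    print(sorted(coauthors(TRIPLES, "author:a")))
```

Because every statement has the same three-part shape, following a relationship (here, from an author to co-authors) is just a matter of chaining pattern matches, which is what makes the data easy to interlink.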

About LD Connect

LD Connect is the IOS Press linked data portal containing machine-readable metadata for all our digital publications. In collaboration with the research community, we are enriching and connecting human- and machine-readable data in more meaningful ways to contribute to an increased understanding of published research. Our datasets include, for example, metadata of journal articles and book chapters, authors, affiliations, countries, volumes, issues, series, pre-press and publication dates, ISSNs, DOIs, accessibility, keywords, pages, and abstracts.

We offer powerful semantic search tools to query and visualize data. Using artificial intelligence (AI)-powered embeddings derived from full text, LD Connect constructs links between the data. These links reveal contextual relationships among authors, institutions, and research areas that can be visualized and interpreted, and they help uncover new relationships.


What powers LD Connect?

As part of our continued commitment to linked data, a new section has been incorporated into the IOS Press website that provides background on this initiative and explains the capabilities of LD Connect, which we hope will help you understand the benefits and what you can gain from using it. You can explore the simple search and gain insights as highlighted later in this post.

First, we look at what powers our linked data portal. At its core, LD Connect is a database of all IOS Press metadata, i.e., information about a given paper (journal article or book chapter) – such as title, publication date, abstract, keywords, authors and their affiliations, and more.

It is a powerful database with all the information stored in a structured way. Underlying the database, we use a custom vocabulary and web standards while describing our data in order to make that data even more discoverable, accessible, linkable, and interoperable with other datasets. Affiliations are geocoded and we are working towards authors, as well as affiliations, being disambiguated using our co-reference resolution script. With the help of machine learning techniques, the data conversion pipeline keeps on improving as more data are added.

In short, this structure allows us to ask the database questions such as:

How many papers were published in the Statistical Journal of the IAOS in 2019?
What is the proportion of open access papers published in the Journal of Alzheimer's Disease over time?
How many authors who published papers on biology are from Canada?

LD Connect’s technology facilitates communication between human- and machine-readable data. These questions are called “queries,” and a query must be written in a language the database can understand. For most databases, that language is SQL (Structured Query Language). Our database uses SPARQL (SPARQL Protocol and RDF Query Language) to retrieve information. Here is the SPARQL version of the first query (Figure 1).

Figure 1: Options for an expert level semantic search (click visual to input searches via the LD Connect website)

This can be explained as follows:

We tell LD Connect what we want to see in the output – in this case the paper (i.e., the article identifier), the title, and the journal in which it is published
We then explain where to find that information, i.e., that the paper is part of an issue, the issue part of a volume, and finally the volume is part of a journal
Then, we ask for the results to be filtered to the specific journal using its three-letter identifier, “sji”

The result is a list like the one shown in Figure 2.

Figure 2: Example of the behind-the-scenes steps showing the results of a semantic search

For such advanced search functionalities, we recommend using the expert level "semantic search" option via the linked data portal.
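The three steps described above (choose the output, follow the paper-to-journal chain, filter on the journal) can be sketched roughly as follows. This is not the query shown in Figure 1: the `ex:` namespace, the property names `ex:title`, `ex:partOf`, and `ex:identifier`, and the endpoint URL are all assumptions made for illustration, and the real LD Connect vocabulary may differ. The Python wrapper only builds the query text and a request URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical namespace; the real LD Connect vocabulary may differ.
PREFIX = "PREFIX ex: <http://example.org/vocab#>"


def build_query(journal_id: str) -> str:
    """Build a SELECT query following paper -> issue -> volume -> journal."""
    # Only accept plain alphanumeric identifiers so the value can be
    # embedded in the query text without escaping.
    if not journal_id.isalnum():
        raise ValueError("journal_id must be alphanumeric")
    return f"""{PREFIX}
SELECT ?paper ?title ?journal
WHERE {{
  ?paper   ex:title      ?title ;
           ex:partOf     ?issue .
  ?issue   ex:partOf     ?volume .
  ?volume  ex:partOf     ?journal .
  ?journal ex:identifier ?id .
  FILTER (?id = "{journal_id}")
}}"""


def request_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a SPARQL protocol GET."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    q = build_query("sji")
    print(q)
    # Placeholder endpoint, not the actual LD Connect address.
    print(request_url("https://example.org/sparql", q))
```

The `SELECT` line corresponds to the first step, the chain of `ex:partOf` patterns to the second, and the `FILTER` to the third. Restricting the results to a single year, as the original question asks, would need one more pattern on the publication date.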

Explore LD Connect for yourself

You may be wondering if LD Connect is only intended for data scientists – and the answer is a definite “no.” LD Connect is open to all users. Being presented with all IOS Press metadata might sound overwhelming, but what is important to remember is that within that data, there is information that we hope will be of interest to you.

Perhaps you have seen applications of our data tools in action without realizing it. LD Connect is already powering the automated feed into the IOS Press website, whereby the “latest articles” are featured on the journal pages in a block that is filled dynamically by querying LD Connect for all articles in the latest available issue – see for example: Journal of Huntington’s Disease. It also powers the extraction of data that we are using to communicate actions for the Sustainable Development Goals.

Now, via the IOS Press website, we have made tools available that let you experience LD Connect first-hand: you can run a simple search and gain insights via visualizations.

SEARCH

We invite you to extract information by carrying out your own searches to discover inferences and links within the data that are relevant to you. For the majority of users, the easiest way to do that is using the "simple search" functionality, which allows you to browse the data with minimum effort and brings you maximum return. We share the screenshot in Figure 3 to show this easy-to-use tool.

Figure 3: To use the search tool, you can simply choose your criteria from the dropdown box (click to view larger on the IOS Press website)

If you would like tips about what you can search for in order to get the most out of LD Connect, click here!


INSIGHTS

Another key aspect is the ability to gain insights from the datasets. LD Connect contains large amounts of quantitative, spatial, and textual information, and to help users make sense of it all, we turn to visuals. Data visualization can provide compelling insights and help us see connections within vast quantities of data that we would otherwise not perceive. That is why we have been developing tools that visualize the data for human consumption.

Here, we provide two examples that you can interact with on the IOS Press website to gain insights. Figure 4 shows a screenshot of the tool that extracts author data from LD Connect and allows you to visualize the connections and geographic locations of authors (based on their affiliations) for papers in any specific journal. You can select a publication from the drop-down list and watch as the results are revealed. You can see the top collaborations based on location.

Figure 4: Visualization showing the top collaborations, based on location, per title (click to view larger on the IOS Press website)
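As a rough illustration of how "top collaborations based on location" could be derived from affiliation data, the sketch below counts how often two countries appear together on the same paper. This is an assumption about one plausible approach, not a description of how the IOS Press tool is implemented, and the sample country codes are made up.

```python
from collections import Counter
from itertools import combinations


def top_collaborations(papers: list[list[str]], n: int = 3):
    """Count country pairs that co-occur on a paper.

    `papers` holds one list per paper, containing the country of each
    author affiliation on that paper.
    """
    pairs: Counter = Counter()
    for countries in papers:
        # Deduplicate and sort so ("BE", "NL") and ("NL", "BE") count as one pair.
        unique = sorted(set(countries))
        pairs.update(combinations(unique, 2))
    return pairs.most_common(n)


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = [["NL", "BE"], ["NL", "BE", "CA"], ["CA", "NL"], ["NL"]]
    for pair, count in top_collaborations(sample):
        print(pair, count)
```

A paper with authors from a single country contributes no pair, so only cross-country collaborations are counted in this sketch.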

Figure 5 shows a screenshot of the “treemap” tool that extracts journal data from LD Connect and gives you a true representation of how fields of coverage and publishing trends have changed in the journals over the years. The various boxes represent the journals, with their size being relative to the number of articles published in each. When assessing all IOS Press journals (shown below), it is clear that we publish the majority of journal articles in the Medicine & Health category, with Computing Science as the second largest category. Jump to the IOS Press website and select a year from the drop-down list to see how these categories have gained prominence in recent years.

Figure 5: Visualization based on subject category and quantities of published papers (click to view larger on the IOS Press website)
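The numbers behind a treemap like this can be thought of as article counts per journal, grouped by subject category, for a chosen year. The sketch below shows that aggregation under stated assumptions: the record layout `(journal, category, year)` and the sample journal names are hypothetical, and the actual tool may compute its box sizes differently.

```python
from collections import Counter, defaultdict


def treemap_sizes(records, year: int) -> dict:
    """Return {category: {journal: article_count}} for one publication year.

    Each record is a (journal, category, year) tuple describing one article.
    """
    sizes = defaultdict(Counter)
    for journal, category, pub_year in records:
        if pub_year == year:
            sizes[category][journal] += 1
    return {category: dict(counts) for category, counts in sizes.items()}


def category_totals(sizes: dict) -> dict:
    """Sum the journal counts in each category (the area of each group)."""
    return {category: sum(journals.values())
            for category, journals in sizes.items()}


if __name__ == "__main__":
    # Invented sample records for illustration only.
    records = [
        ("Journal A", "Medicine & Health", 2021),
        ("Journal A", "Medicine & Health", 2021),
        ("Journal B", "Medicine & Health", 2021),
        ("Journal C", "Computing Science", 2021),
        ("Journal C", "Computing Science", 2020),
    ]
    sizes = treemap_sizes(records, 2021)
    print(sizes)
    print(category_totals(sizes))
```

Running the aggregation for each year in turn gives the series of counts needed to see how a category's share changes over time.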

For additional visualization tools, you can interact with the LD Connect presentation from the IOS Press anniversary symposium!


What is next for LD Connect?

We’re just getting started – we have a lot of ambition for LD Connect to provide more and more value and additional tools. Among the plans on the horizon are:

Construct further discovery tools that can unlock the relationships and patterns embedded in the data – and help accelerate research
Utilize LD Connect to work at a nanopublication scale where the “machine-interpretable formal semantics cover the main scientific claims the work is making” (source: the Data Science special issue, as covered in the previous Labs post here)
Integrate further automated feeds for data to be incorporated into the IOS Press site; one example that is planned is to gain insights into the content published in our journals relating to the Sustainable Development Goals

Interested in knowing more about how LD Connect could help with your research or with monitoring the scholarly publishing landscape? We welcome feedback on how we could provide more value, and you are invited to get in touch (with "LD Connect" in the subject line) or reply in a comment to this post below. We look forward to your input.

We hope you enjoy exploring the IOS Press LD Connect section!

About the Authors

Martin van Aken has spent his career as a software engineer. He works as a developer for IOS Press on the linked data portal LD Connect, maximizing its potential for data usage and extracting analytics. He is based in Belgium.

Hodan Mursal is a data analyst and, while working at IOS Press, was responsible for LD Connect and other analytical tools. Since contributing to this post, Hodan has moved to another Dutch company to further her career as a data scientist.
