International Journal of Production Research
ISSN: 0020-7543 (Print) 1366-588X (Online)
Journal homepage: www.tandfonline.com/journals/tprs20

Enhancing supply chain visibility with generative AI: an exploratory case study on relationship prediction in knowledge graphs

Ge Zheng (a) and Alexandra Brintrup (a,b)
(a) Supply Chain AI Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom; (b) Alan Turing Institute, London, United Kingdom

To cite this article: Ge Zheng & Alexandra Brintrup (13 Aug 2025): Enhancing supply chain visibility with generative AI: an exploratory case study on relationship prediction in knowledge graphs, International Journal of Production Research, DOI: 10.1080/00207543.2025.2543964

© 2025 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/). Published online: 13 Aug 2025.

ABSTRACT
A key stumbling block in effective supply chain risk management for companies and policymakers is a lack of visibility into interdependent supply network relationships. Relationship prediction, also called link prediction, is an emergent area of supply chain surveillance research that aims to increase the visibility of supply chains using data-driven techniques. Existing methods have been successful at predicting relationships but struggle to extract the context in which these relationships are embedded, such as the products being supplied or the locations they are supplied from. Lack of context prevents practitioners from distinguishing transactional relations from established supply chain relations, hindering accurate estimations of risk. In this work, we develop a new Generative Artificial Intelligence (GenAI)-enhanced machine learning framework that leverages pre-trained language models as embedding models, combined with machine learning models, to predict supply chain relationships within knowledge graphs. By integrating Generative AI techniques, our approach captures the nuanced semantic relationships between entities, thereby improving supply chain visibility and facilitating more precise risk management. Using data from a real case study, we show that GenAI-enhanced link prediction surpasses all benchmarks, and we demonstrate how GenAI models can be explored and used effectively in supply chain risk management.

ARTICLE HISTORY: Received 15 November 2024; Accepted 30 July 2025
KEYWORDS: Generative artificial intelligence (GenAI); pretrained language models (pretrained LMs); supply chain visibility; link prediction; knowledge graph (KG); machine learning

1.
Introduction

Global supply chains emerge as companies buy products from one another to produce and deliver their own (Bellamy and Basole 2013). They play a critical role in almost every aspect of our daily lives: 80% of global trade flows through multinational corporations, and one in five jobs worldwide is tied to global supply chains. Increased volatility and geopolitical tension in recent years have shown how vulnerable we are to supply chain disruptions, with major shortages impacting our food, medicines, and supply of electric batteries. In tandem, there is rising awareness of the exposure of global supply chains to human rights violations and unsustainable environmental practices, with US and European policymakers proposing legislative measures that demand comprehensive supply chain traceability (Küblböck 2013).

One of the key stumbling blocks in beginning to address these concerns is a lack of knowledge of interdependent supply chain connections. Most companies have limited visibility beyond their direct connections. Increasing visibility in supply chains has been a rich area of research in the past decade, with multiple technical innovations having been proposed, such as electronic product codes, radio frequency identification, and blockchain technologies. Although these have been successful to some extent, their reach is typically limited to one or two tiers at most. That is because, to adopt tracking technology, companies need to be willing to share data, and there is little incentive for companies to share data on whom they purchase from, for various reasons. Companies typically view their own supply chains as a competitive advantage, and fear that disclosing information could result in their buyers working directly with their suppliers or reveal their pricing structure; or they may simply not wish their manufacturing and purchasing practices to be known to the buyer.

CONTACT Alexandra Brintrup ab702@cam.ac.uk
More recently, researchers have proposed a new solution to this problem: using data-driven methods to 'estimate' who supplies whom, rather than relying on the willingness of companies to share data. Termed 'Digital Supply Chain Surveillance' (Brintrup et al. 2024), these methods include network reconstruction (Mungo et al. 2023), web scraping to recognise supply relationships in text obtained from news articles and company annual reports (AlMahri, Xu, and Brintrup 2024; Wichmann et al. 2018), and machine learning methods for predicting relationships (formally, link prediction). Most current methods focus on a single type of relationship, such as firm-level networks that map supply or buy relationships between firms (Brintrup et al. 2018; Mungo et al. 2023). While these approaches provide valuable insights, they offer a limited understanding of supply chains because they do not consider the multifaceted interactions and dependencies that exist between different entities, thereby restricting a comprehensive understanding of the entire network.
Considering the supply chain as an interconnected network of entities and relationships, we can construct supply-chain-relevant data into a supply chain knowledge graph that captures the complex relationships and attributes associated with supply chain entities. The multiple types of relationships in such a knowledge graph, such as manufacturing processes required to produce a product, product flows, and types of partnerships, can contribute to supply chain visibility and a comprehensive understanding of supply chain dynamics. It can also reveal hidden patterns, identify potential bottlenecks, and support the development of strategies to enhance network resilience.

Generative Artificial Intelligence (GenAI), a branch of machine learning, is designed to create new content, ideas, or data (known as synthetic data) by learning patterns from existing data. Unlike traditional Artificial Intelligence (AI), where the output depends on the given inputs, GenAI can generate novel outputs. Examples include generative machine learning models used for synthetic data generation (Zhang et al. 2018) and large language models such as ChatGPT (Open AI 2024), Copilot (Microsoft 2023), Gemini (Google DeepMind 2023), and LLaMA (Touvron et al. 2023), used to generate human-like text, images, audio, and even videos. GenAI has gained tremendous attention as a powerful tool in recent years and has been used in various fields. For instance, Zholus et al. (2024) explored how a language model can be used to accurately create molecular structures, facilitating the drug discovery process. Zhao et al. (2024) introduced a self-attention Generative Adversarial Network (SAGAN) model to generate synthetic data to address class imbalance in financial transaction data, which was then used for credit card fraud detection. Gayam (2023) investigated how GenAI contributes to the creation of music and visual art.
In the context of supply chain operation management, GenAI has been hypothesised to empower the human workforce, improve project management processes, and help optimise manufacturing and supply chain procedures (Mohammed and Skibniewski 2023). Early adopters Walmart and Maersk integrated GenAI into their operations to optimise pricing negotiations (Jackson et al. 2024). The logistics company C.H. Robinson is exploring GenAI for automating freight shipment (Business Wire 2024), while Ryder System (2024) leveraged GenAI to power chatbots that handle customer inquiries.

These early reports on GenAI inspire us to explore its potential in enhancing supply chain visibility, particularly by predicting relationships within supply chain knowledge graphs. GenAI models, including but not limited to pre-trained language models, are trained on extensive datasets containing diverse text-based information, which enables them to find relevant patterns in unstructured data such as emails, reports, contracts, and social media posts. This ability might allow them to elicit the complex structure and patterns of relationships within networks. GenAI models employ sophisticated neural network architectures, such as transformers, which allow the models to handle complex, non-linear relationships. These architectures also use mechanisms like self-attention to capture dependencies and interactions between different parts of the data. In supply chain networks, where relationships between entities (e.g. suppliers, manufacturers, distributors) are intricate and multifaceted, GenAI's ability to model these complexities has great potential to enhance supply chain visibility.
However, one area of concern in applying GenAI to industrial applications has been so-called 'hallucinations', where GenAI generates non-factual or inaccurate outputs because these models are primarily optimised for language fluency and pattern recognition rather than strict adherence to factual data (Huang et al. 2023). The phenomenon of 'hallucination' results from the biases and inaccuracies in the massive amounts of training data drawn from various sources, and also from extrapolation beyond the constraints of the training data (Bender et al. 2021). In the context of supply chain visibility, hallucinations can be particularly problematic. For example, when predicting relationships in a supply chain knowledge graph, a model prone to hallucinations may infer non-existent relationships between suppliers, manufacturers, and customers. These errors could lead to misguided decisions, potentially disrupting operations, misinforming risk assessments, or causing inefficient resource allocation in a system where precision is critical. Thus, it is essential to ensure that the model's predictions are grounded in actual verified relationships. To address this challenge, a hybrid approach can be adopted: using a GenAI model as an embedding model to encode data into vectors, followed by applying machine learning models for relationship prediction on the knowledge graph as a 'factual anchor'. This strategy leverages the strengths of the
In this paper, we explore the potential of GenAIs in enhancing supply chain visibility by integrating pre- trained language models with machine learning models to predict relationships within supply chain knowledge graphs.We also introduce a new term, ‘quintuplet’, to rep- resent more intricate relationships within the knowledge graph. Unlike traditional triplets that capture a single relationship between two entities, quintuplets condense multiple triplets to provide a deeper understanding of the supply chain network. After transferring regular triplets based knowledge graph into quintuplets based one, we are able to generate textual descriptions passed onto a pre-trained language model to retrieve vectorised rela- tional knowledge learned by the pre-trained language model from large amount of website information, which are then further learned using a machine learning model to predict quintuplets. The process thus allows us to com- bine structured knowledge representation by a knowl- edge graph, with the general knowledge base that can be used to augment the graph to make additional inference. Combining GenAIs and Knowledge Graphs is pow- erful and goes beyond the state of the art in the sup- ply chain domain, because we mitigate hallucination effects of GenAIs by restricting its use to augment struc- tured prior data, and also allow additional contextual knowledge to arise from it, which would not have arisen by merely using knowledge graph completion methods. Thus our contribution extends the current state of the art in supply chain relationship prediction for visibility and also allows us to provide a use case in the application of GenAIs to supply chain management research. We com- pare our method to existing benchmarks with a use case in electronic vehicle battery supply chains, and present experimental results validatedwith a range of pre-trained language models and machine learning methods. 
Our method surpasses all existing benchmarks in accuracy, and also yields richer information.

The rest of this paper is organised as follows. Section 2 reviews relevant existing work in the literature, including GenAI and existing link prediction methods in supply chain networks. Section 3 presents our proposed approach for predicting multiple connected relationships in supply chain networks, including the problem definition, preliminaries, and an explanation of the proposed framework. Section 4 uses a case study to evaluate the proposed approach, while Section 5 concludes this work and explains its managerial implications, limitations, and potential future research directions.

2. Related works

Generative artificial intelligence and link prediction in supply chain networks are the two main topics relevant to the approach proposed in this work; both are reviewed below.

2.1. Generative artificial intelligence

Generative Artificial Intelligence (GenAI) refers to a class of machine learning designed to generate new content, including text, images, music, and even videos, by learning patterns from existing data (Ooi et al. 2023). GenAI models include Generative Adversarial Networks (GANs) (Goodfellow et al. 2014), Variational Autoencoders (VAEs) (Kingma and Welling 2019), Transformer-based models such as ChatGPT (Open AI 2024), LLaMA (Touvron et al. 2023), DALL-E (Open AI 2022), and so on. These models have been applied in various fields, leading to significant achievements and efficiencies. One of the most significant capabilities of GenAI is natural language generation. For example, Transformer-based models like ChatGPT (Open AI 2024) have demonstrated remarkable abilities in tasks ranging from drafting emails to writing code, showcasing the versatility of GenAI in understanding and generating natural language.
Noy and Zhang (2023) reported that ChatGPT can substantially raise productivity, decreasing the average time spent on mid-level professional writing tasks by 40% and increasing output quality by 18%. Apart from natural language generation, GenAI models have been used to generate realistic images. For instance, GANs have been used to create high-resolution, photorealistic images that are indistinguishable from real photographs (Karras, Laine, and Aila 2019). Such capabilities have enabled applications in fields like art generation (Louie et al. 2020), virtual reality (Hashim et al. 2023), and data augmentation for training other AI models (Gowda and Rao 2024).

As in many other domains, the capabilities of GenAI have recently been explored in the field of supply chain management, although very few studies exist. Those that do exist do not yet report technical performance, but rather explore potential benefits and applications. Fosso Wamba et al. (2023) explored the benefits, challenges, and trends associated with GenAI technologies like ChatGPT in Supply Chain and Operation Management (SCOM) by surveying practitioners in the United Kingdom and the United States. This study reveals increased efficiency among GenAI adopters compared to non-adopters and highlights that the integration of GenAI can significantly enhance overall supply chain performance. A subsequent study (Fosso Wamba et al. 2024) extended the exploration by additionally mapping the maturity levels of GenAI projects across supply chains and identified the specific operational benefits and challenges that organisations need to overcome. Meanwhile, Jackson et al. (2024) provided a comprehensive understanding of both AI and GenAI functionalities and applications in the SCOM context.
It also offers a practical framework for both practitioners and researchers to identify where and how AI and GenAI can be applied in SCOM to enhance decision-making processes, optimise operations, prioritise investments, and develop the necessary skills.

In industry, early-adopter companies have reported enhanced supply chain task performance with GenAI. Mars applied a GenAI platform offered by Celonis (2024) to optimise truck loads, reducing manual effort by 80% and improving delivery efficiency (du Preez 2023). Amazon leveraged GenAI to streamline and improve the delivery process (Meiyappan and Bales 2021), while FedEx applied GenAI to generate more precise package arrival estimates (CNBC Evolve Global Summit 2023). In the fast fashion sector, Shein leveraged GenAI to understand changes in customer demand and interest, allowing it to adjust its supply chain in real time (Astha 2023). A report from Holger et al. (2023) shows that GenAI could add up to $275 billion to the operating profits of the apparel, fashion, and luxury sectors in the next three to five years.

Inspired by such achievements of GenAI in both academia and industry, we explore how GenAI models can enhance supply chain visibility, a crucial challenge in supply chains (Pichler et al. 2023). Language models, as a type of GenAI model, have been highlighted as having great potential to improve supply chain performance (Aguero and Nelson 2024; Srivastava et al. 2024). One of the earlier applications in the supply chain domain has been supply chain mapping, wherein Wichmann et al. (2018) automatically extracted structured supply chain information from unstructured natural text to answer questions such as 'who supplies whom with what from where?', indicating that language models can help extract relationships between entities in supply chain networks. Several studies (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al.
2019; Safavi and Koutra 2021) have demonstrated that pretrained language models (pretrained LMs) encode rich relational knowledge, enabling the recovery of factual relationships from their internal representations. This points to a potential opportunity to retrieve relational knowledge about entities in supply chain networks from pretrained LMs in order to predict hidden dependencies.

As pretrained LMs are trained on massive data from diverse sources, the knowledge learned by these models is general and may not suffice for a task that requires specialised domain knowledge. Two solutions can address this problem: training a language model for a specific task, or fine-tuning a pretrained LM for a specific task. The former requires large computational resources and data, and is also time-consuming, prompting researchers to advocate the latter option of fine-tuning (Fichtel, Kalo, and Balke 2021; Yasunaga, Leskovec, and Liang 2022). Relevant examples include Yasunaga, Leskovec, and Liang (2022), who fine-tuned a BERT model to predict links among documents, and Tan et al. (2024), who developed a uniform framework to fine-tune language models on multiple tasks, including link prediction. Results from both works showed that fine-tuned pretrained LMs can accurately predict links. However, fine-tuning pretrained LMs requires expensive NLP expertise, and most companies in supply chains are small and midsize enterprises (SMEs) whose budgets cannot accommodate such expensive experts. Additionally, SMEs typically lack access to sufficient high-quality labelled data for effective fine-tuning and are also constrained by limited computational resources.

In addition, another concern when using pretrained LMs is the issue of 'hallucination', which leads to generated outputs that are non-factual or inaccurate (Huang et al. 2023).
The phenomenon of 'hallucination' results from the biases and inaccuracies in the massive amounts of training data drawn from various sources. In the supply chain context, such hallucinations can lead to erroneous demand forecasts or misinterpretation of supply chain relationships, potentially resulting in operational disruptions and financial losses. For instance, a generative model might incorrectly predict a surge in demand based on fabricated trends, leading to overproduction and increased inventory costs. Recent studies (Agrawal et al. 2023; Guan et al. 2024; Martino, Iannelli, and Truong 2023) have highlighted the potential of knowledge graphs to mitigate hallucination. By organising information into entities and relationships, creating a network of interconnected facts, knowledge graphs offer a structured and verifiable repository of factual information that language models can reference to maintain consistency and accuracy in their generated outputs.

Building on prior research highlighting the relational knowledge learned by pretrained LMs (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al. 2019; Safavi and Koutra 2021), and addressing the hallucination issue that knowledge graphs can alleviate, we develop a new approach that combines pretrained LMs with traditional machine learning techniques to predict multiple interconnected relationships within supply chain networks represented by knowledge graphs. This new approach does not require fine-tuning. The pre-trained language models are used to convert textual data into high-dimensional vector embeddings that capture semantic meanings and contextual nuances. These embeddings are rich in linguistic information but may lack domain-specific factual accuracy when used in isolation, leading to hallucinations.
The machine learning models learn to map the semantic embeddings to the factual relationships represented in the knowledge graph. This integration ensures that the predictions are not based solely on language patterns but are also aligned with the actual data from the knowledge graph. The knowledge graph acts as a factual anchor, constraining the model to produce outputs consistent with known supply chain relationships. It reduces the risk of hallucinations by providing a factual basis for relationship predictions, enhancing the reliability and accuracy of the model, and it leverages the strengths of pre-trained language models in understanding and encoding contextual information while counteracting their tendency to generate incorrect information when used alone.

2.2. Link prediction in supply networks

Supply chains are complex networks that exhibit non-linear interactions and interdependencies among various entities, processes, and resources (Choi, Dooley, and Rungtusanatham 2001). The large scale and non-linear nature of these networks often hinder their visibility, which, in turn, makes the identification of potential risks challenging. Researchers have shown that predicting relationships between entities within a supply chain network can contribute to improved visibility (Brintrup et al. 2018). Formally, researchers framed the problem of identifying relationships in a supply chain as a link prediction problem on a graph. Link prediction is widely used to solve problems in various domains, such as social networks, where it predicts potential connections between individuals based on existing ties, interests, or behaviours (Hasan and Zaki 2011); recommendation systems, where it identifies associations among product sales (Su et al. 2020); and biological networks, where it predicts interactions among genes, proteins, and other biological entities (Coşkun and Koyutürk 2021).
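As a concrete illustration of the link-prediction task itself, a minimal similarity-based scorer can rank candidate node pairs on a toy graph. The graph, node names, and the common-neighbour scoring rule below are illustrative assumptions, not taken from any of the cited studies; the example also makes visible why the supply chain literature treats such scores with caution, since the most 'similar' node pairs here are firms occupying the same tier.

```python
from itertools import combinations

# Toy undirected supply network: two buyers (A, B) sharing suppliers S1 and
# S2, plus an extra customer C of B. The graph is illustrative.
edges = {("A", "S1"), ("A", "S2"), ("B", "S1"), ("B", "S2"), ("B", "C")}
nodes = {n for e in edges for n in e}
neigh = {n: {m for e in edges if n in e for m in e if m != n} for n in nodes}

def common_neighbours(u, v):
    # Similarity-based score: the number of shared neighbours.
    return len(neigh[u] & neigh[v])

# Rank every non-edge by the score. The top-scoring pairs are the most
# similar nodes (the buyer pair A-B and the supplier pair S1-S2), which in
# a supply chain are often competitors rather than trading partners: the
# critique that motivates learning-based approaches.
scores = sorted(
    ((common_neighbours(u, v), u, v)
     for u, v in combinations(sorted(nodes), 2)
     if (u, v) not in edges and (v, u) not in edges),
    reverse=True)
print(scores[:2])
```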
The approaches used in the majority of these works are similarity-based or learning-based. Similarity-based approaches predict the connection of two nodes by leveraging the similarity of characteristics between the nodes (Zareie and Sakellariou 2020), while learning-based approaches involve training machine learning models to infer the likelihood of connections between nodes based on node- and network-level features (Ahmed, ElKorany, and Bahgat 2016). Supply chain researchers have argued that similarity-based approaches are unsuitable for predicting links in this domain, because similar companies usually do not connect due to competition, and have advocated learning-based approaches instead.

Brintrup et al. (2018) used two machine learning models to develop a link prediction approach to predict supplier interdependencies in a manufacturing supply network. Kosasih and Brintrup (2022) developed a machine learning model using a Graph Neural Network (GNN) to detect potential links that are unknown to the buyer in an automotive supply chain network. Later, Kosasih et al. (2022) proposed a neurosymbolic machine learning method using a combination of GNN and knowledge graph reasoning to predict multiple types of links in two supply chain networks. In the same year, Brockmann, Elson Kosasih, and Brintrup (2022) also performed link prediction using a GNN, but on an uncertain supply chain knowledge graph. Brintrup et al. (2018) and Kosasih and Brintrup (2022) predicted one type of relationship; Kosasih et al. (2022) and Brockmann, Elson Kosasih, and Brintrup (2022) predicted multiple types of relationships on knowledge graphs. Furthermore, Mungo et al. (2023) treated production network reconstruction as link prediction and used Gradient Boosting to predict hidden relationships to reconstruct the production network. Another work focusing on firm-level network reconstruction is from Ialongo et al.
(2022), in which a generalised maximum-entropy reconstruction method is introduced to reconstruct the firm-level network based on partial information.

While these approaches have been very valuable, the mere identification of a transactional relationship between companies does not provide sufficient contextual information for actionable insights. Consider the case where Toyota and Hewlett Packard (HP) are predicted to share a transactional relationship: it might be that HP has sold a large number of office printing devices to Toyota, or it might be that Toyota uses HP's 3D printing material in its production. For an analyst looking to identify whether a disruption at HP would cause an issue for the automotive industry, the two types of relationships would have different implications for actual production output. Similar issues arise when we consider other contextual information, such as production locations. In this work, we focus on adding context to identified relationships by using a type of GenAI model, pretrained LMs, as a knowledge base (Srivastava et al. 2024).

Our hypothesis is that doing so will allow us to combine the structured factual knowledge obtainable from the ontological representations afforded by knowledge graphs with the general unstructured knowledge base obtainable from pretrained LMs, thereby mitigating the hallucination issue caused by using pretrained LMs alone. Moreover, it will allow us to enhance supply chain visibility through a more complete supply chain knowledge graph for a comprehensive understanding of supply chain dynamics. A comprehensive understanding of supply chain relationship dynamics, facilitated by relationship prediction, might offer significant strategic and operational advantages.
Conceptualising relationship prediction as a form of 'Digital Supply Chain Surveillance', an industrial survey carried out in the UK has suggested that the use of digital data and AI can allow early identification of vulnerabilities and bottlenecks, which in turn would help dynamically plan for proactive risk mitigation and contingency planning, thus enhancing the overall resilience and agility of the supply chain (Brintrup et al. 2024). Proactive resilience planning was cited as one of the top three advantages for which UK manufacturers were hoping to use surveillance technology. The authors also found that a detailed understanding of supply chain interconnections can facilitate strategic decision-making, for example by refining supplier selection processes and supporting more effective negotiation. Knowledge of a supplier's other connections, both horizontal and vertical, can help the buyer understand whether the supplier is connected to its competitors, which might be helpful in negotiation, especially in capacity-constrained contexts. Similarly, downstream connections may inform compliance with sustainability standards and regulations such as the Modern Slavery Act (Act 2015; Caspersz et al. 2022).

2.3. Summary of research gaps

Despite significant advances in both topics discussed above, several gaps remain, especially with respect to production research. Regarding link prediction for supply chains, existing approaches suffer from three key shortcomings. First, they focus on predicting whether a relationship exists but lack the ability to infer contextual attributes (e.g. relationship type, product-specific dependencies). Second, these methods depend entirely on structured knowledge graphs, which are often incomplete or outdated in dynamic supply chains. Third, they ignore the wealth of unstructured data (e.g. news articles, procurement contracts) that encodes latent relationships.
In terms of GenAI for supply chain link prediction, while it has shown promise in supply chain applications, existing studies and industry implementations exhibit critical limitations. First, most works focus narrowly on productivity gains (e.g. reducing manual effort in logistics) but fail to address the fundamental challenge of hallucination in relational tasks like supply chain mapping. Second, current approaches rely heavily on fine-tuning pretrained LMs, which assumes access to labelled data, NLP expertise, and computational resources that are prohibitive for SMEs. Third, while knowledge graphs are recognised as potential anchors for factual accuracy, no previous work integrates pretrained LMs with supply chain knowledge graphs in a zero-shot framework (i.e. without fine-tuning) to mitigate hallucinations while preserving scalability.

This work aims to address these gaps by integrating pretrained LMs with knowledge graphs. Unlike prior GenAI applications, our method avoids fine-tuning, making it accessible to SMEs, while leveraging knowledge graphs as a 'grounding' mechanism to counteract hallucinations. Unlike traditional link prediction models, our approach enriches predictions with contextual semantics from unstructured data (via LM embeddings) while maintaining factual consistency through knowledge graph constraints. This hybrid methodology is the first to simultaneously resolve incompleteness in knowledge graphs and inaccuracies in LM output, enhancing supply chain visibility.

3. Combining pretrained language models and knowledge graphs

3.1. Preliminaries: from triplets to quintuplets

As mentioned earlier, the first set of studies in the literature solved the supply chain link prediction problem on a graph with links representing relationships between companies (Brintrup et al. 2018; Cai et al. 2021; Kazemi and Poole 2018; Kosasih and Brintrup 2022).
This approach aimed at learning the connection patterns surrounding two companies in order to estimate a connection between them (Figure 1). The second set of studies represents supply chain information as a knowledge graph (KG) with multiple types of links (Kosasih et al. 2022; Rossi et al. 2021). A KG is represented by a heterogeneous graph G and its ontology O, the former being the actual data and the latter its schema. A KG can also be seen as a collection of facts represented by predicate logic statements. A KG is based on an ontology that defines data types and attributes, with a relational taxonomy. Each item in the data is an entity (or a node in a graph), and the relationships between entities are edges, or links. In previous works, KGs have been used to model edges such as who-produces-what and who-has-what-certification, in addition to buyer-supplier links (Figure 1), the structures of which then inform one another. Both of the above approaches introduce triplets to describe and predict relationships.

Figure 1. (a) company-level relationships in a supply chain network (Brintrup et al. 2018); (b) multiple relationships including (company, supplies to, company), (company, has, certificate) and (company, has, product) in a knowledge graph (Kosasih et al. 2022); (c) quintuplet-based relationships on a knowledge graph. For example, company A supplies product 1 to company B, and company A with certificate 1 has product 1.

A triplet, also known as a triple or a statement, consists of three components: subject, predicate, and object, and is used to define the relationship between a subject and an object (Figure 1). Singular links are analysed one at a time, so we can predict 'what a company produces' and 'to whom it sells', but not contextual information such as 'which product a company sells to its buyer in which location'. Understanding context in supply chains is important to accurately predict risks.
To define context, we introduce a new term, 'quintuplet', where information represented by three triplets, (Company A, has_product, Product 1), (Product 1, purchased_by, Company B) and (Company A, supplies_to, Company B), can be condensed into (Company A, supplies, Product 1, to, Company B).

Given a knowledge graph G(V, E), V is the set of entities and E ⊆ (V × V) is the set of relationships. The relationship between entity v1 and entity v2 in a knowledge graph can be represented by a triplet (v1, ε1,2, v2), in which v1, v2 ∈ V and ε1,2 ∈ E. In contrast, a quintuplet is written as (v1, ε1,2, v2, ε2,3, v3), where v1, v2, v3 ∈ V and ε1,2, ε2,3 ∈ E, and this becomes the target of the prediction. Compared to a triplet representing one relationship, a quintuplet describes multiple connected relationships. The multiple connected relationships and the connected entities in a quintuplet can represent a small subgraph of the knowledge graph, providing contextual information for an entity or relationship.

3.2. The pretrained LM-based machine learning framework

We propose a pretrained LM-based machine learning framework which transforms a knowledge graph described by triplets into quintuplets to generate composed texts, and then sends these composed snippets of text to a pretrained LM to retrieve the relational knowledge that has been learned a priori (Figure 2). The retrieved relational knowledge is represented by vectors of fixed length, which are further learned by a machine learning model to predict the multiple connected relationships of entities represented by quintuplets in supply chain networks. We begin by constructing a knowledge graph from already known data that characterises the supply chain. This may involve, but is not limited to, a priori known supply-buy relationships, products and certifications, and is to a large extent determined by the data available to the researcher.
The original data used to construct supply chain knowledge graphs are commonly collected from various sources such as Enterprise Resource Planning (ERP) systems, transaction records, market reports, and social media (see Brintrup et al. 2024 for a review). The supply chain data used in this work was collected by a third-party data provider. To construct a supply chain knowledge graph, we need to define the ontology using the data that can characterise supply chain information. In this case, the ontology includes the definitions of entities, i.e. companies, products and certificates, and the relationships between these entities, such as has_product, purchased_by, has_cert, and supplies_to, shown in Figure 2.

Figure 2. The pretrained LM-enhanced supply chain link prediction framework.

The knowledge defined by the ontology is structured into triplets, and each triplet is an instance of the relationships and entities defined by the ontology. We define the quintuplets to represent the contextual knowledge that we aim to predict. As a case example, we use product flow on supply-buy links; however, a quintuplet can also represent contextual information on locations, types and volumes of transactions, depending on the question at hand. We then reconstruct relationships originally represented by triplets into quintuplets. For example, three triplets (Company A, has_product, Product 1), (Product 1, purchased_by, Company B) and (Company A, supplies_to, Company B) can be used to generate a quintuplet of the sort: (Company A, supplies, Product 1, to, Company B). Another example is (Company A, with, Certificate 1, has, Product 1), generated from two triplets, (Company A, has_cert, Certificate 1) and (Company A, has_product, Product 1). The next step involves transferring quintuplets into composed snippets of text with a user-defined schema.
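As an illustration of this reconstruction step, the three triplets above can be condensed into a quintuplet with a few lines of Python. This is a minimal sketch: the function name `compose_quintuplets` and the plain-tuple representation are our own illustrative choices, not the authors' implementation.

```python
# Sketch: condense (A, has_product, P), (P, purchased_by, B) and
# (A, supplies_to, B) into the quintuplet (A, 'supplies', P, 'to', B).

def compose_quintuplets(triplets):
    has_product = {(s, o) for s, r, o in triplets if r == "has_product"}
    purchased_by = {(s, o) for s, r, o in triplets if r == "purchased_by"}
    supplies_to = {(s, o) for s, r, o in triplets if r == "supplies_to"}
    quintuplets = []
    for company_a, product in has_product:
        for product2, company_b in purchased_by:
            if product == product2 and (company_a, company_b) in supplies_to:
                quintuplets.append((company_a, "supplies", product, "to", company_b))
    return quintuplets

triplets = [
    ("Company A", "has_product", "Product 1"),
    ("Product 1", "purchased_by", "Company B"),
    ("Company A", "supplies_to", "Company B"),
]
print(compose_quintuplets(triplets))
# → [('Company A', 'supplies', 'Product 1', 'to', 'Company B')]
```

The same pattern would be applied to the certificate example, matching has_cert and has_product triplets that share a company.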
For example, a quintuplet (Company A, supplies, Product 1, to, Company B) can be transferred into 'Company A supplies product 1 to company B' or 'Company A has product 1 and supplies it to company B'. The text used to represent a quintuplet can include different types of sentences but needs to be contextually accurate and consistent. The composed text is then sent to a pretrained LM for embedding, retrieving hidden relational knowledge previously learned by the model.

The embeddings of quintuplets with retrieved relational knowledge are used to train a suitable machine learning model for quintuplet prediction. As pretrained LMs cannot directly predict factual relationships in a supply chain, we use a machine learning model for this purpose. The resulting trained model can then be used for quintuplet prediction.

3.2.1. Language model selection

We select five general open-source pretrained LMs for experimentation with the following considerations:

(1) Diversity: We test several pretrained LMs to investigate whether our approach is applicable across the state of the art.
(2) Model size: Many existing works in the field of NLP (Narayanan et al. 2021; Shin et al. 2020; Shoeybi et al. 2019) suggest that larger model size can lead to improved performance. Therefore, pretrained LMs with different model sizes are selected for experimentation.
(3) Output dimensions: Increasing the output dimension of an LM can potentially capture more complex patterns and nuances in the data but cannot guarantee better representation by default (Kenton and Toutanova 2019). Thus, the selected pretrained LMs have different dimensions for their representations.

Based on the considerations above, we select five pretrained LMs that were all developed on the basis of the Transformer (Vaswani et al. 2017) but have different model sizes and output dimensions (Table 1).
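The quintuplet-to-text step can be sketched as a simple template fill. The template wording and the function name `quintuplet_to_text` are illustrative, not the exact schema used in the paper; the commented-out embedding call assumes the sentence-transformers package.

```python
# Sketch: turn a quintuplet into a composed text snippet via a user-defined template.

def quintuplet_to_text(q):
    subj, rel1, mid, rel2, obj = q
    return f"{subj} {rel1} {mid} {rel2} {obj}."

text = quintuplet_to_text(("Company A", "supplies", "Product 1", "to", "Company B"))
print(text)  # → Company A supplies Product 1 to Company B.

# The snippet would then be embedded with a pretrained LM, e.g. (assuming
# sentence-transformers is installed):
#   from sentence_transformers import SentenceTransformer
#   emb = SentenceTransformer("all-MiniLM-L6-v2").encode([text])  # shape (1, 384)
```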
Among the selected pretrained LMs, 'paraphrase-albert-small-v2' is the smallest at 43MB and is a six-layer version of 'albert-base-v2', which originates from Lan et al. (2019) and aims to address GPU/TPU memory limitations and long training times by lowering model size. Compared to the original BERT, 'albert-base-v2' introduces two parameter reduction techniques to reduce memory consumption and increase training speed.

Table 1. Selected pretrained language models.

Model Name | Model Size | Training Data Size | Dimensions | Max Sequence Length | Reference
distiluse-base-multilingual-cased-v2 | 480MB | 1 million sentence pairs (15 languages) | 512 | 128 (tokens) | Reimers and Gurevych (2019)
all-distilroberta-v1 | 290MB | over 1 billion pairs | 768 | 512 (tokens) | Liu et al. (2019)
all-MiniLM-L12-v2 | 120MB | over 1 billion pairs | 384 | 256 (tokens) | Wang et al. (2020)
all-MiniLM-L6-v2 | 80MB | over 1 billion pairs | 384 | 256 (tokens) | Wang et al. (2020)
paraphrase-albert-small-v2 | 43MB | 16 GB of uncompressed text | 768 | 256 (tokens) | Lan et al. (2019)

The second smallest model is 'all-MiniLM-L6-v2', a six-layer version of Wang et al. (2020), developed by compressing a large Transformer-based pretrained model using a simple but effective approach called deep self-attention distillation (Wang et al. 2020). It introduces the concepts of the student model and the teacher model. 'all-MiniLM-L6-v2', referred to as the student model in Wang et al. (2020), is trained by mimicking the self-attention module, in Transformer networks, of the large language model referred to as the teacher model, and also by distilling the self-attention module of the last Transformer layer of the teacher model. In addition, 'all-MiniLM-L6-v2' keeps only 50% of the parameters of the teacher model but retains more than 99% of its accuracy on several benchmark tasks (Wang et al. 2020). 'all-MiniLM-L12-v2' is similar to 'all-MiniLM-L6-v2' but is a 12-layer version of Wang et al.
(2020), leading to a bigger model size with more parameters. 'all-distilroberta-v1' is a distilled version of the BERT base model in Kenton and Toutanova (2019). It is smaller and faster than BERT but developed using the BERT base model as a teacher. Compared to 'paraphrase-albert-small-v2', 'all-MiniLM-L6-v2' and 'all-MiniLM-L12-v2', this model is larger, but it is smaller than 'distiluse-base-multilingual-cased-v2'. 'distiluse-base-multilingual-cased-v2' is also a modification of the pretrained BERT network, but is trained on data in 15 languages (Reimers and Gurevych 2019), whereas all the other models are trained in English.

3.2.2. Machine learning model selection

Five machine learning models are selected based on their suitability for application to link prediction and on past works. These include Artificial Neural Network (ANN) (Yegnanarayana 2009), Convolutional Neural Network (CNN) (Albawi, Mohammed, and Al-Zawi 2017), Logistic Regression (LogReg) (Kleinbaum and Klein 2002), Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber 1997), and AutoEncoder (Wang et al. 2014). Their architectures and parameter settings are presented in Table 2.

Table 2. The architectures and parameters of five selected common machine learning models.

ANN: Linear1 (300) → BatchNorm1 (300) → ReLU1 → Linear2 (300) → BatchNorm2 (300) → ReLU2 → Linear3 (300) → BatchNorm3 (300) → ReLU3 → FC (2) → Softmax
CNN1D: Conv1D1 (32 kernels, size 7, stride 2) → ReLU1 → AvgPooling1 (size 7, stride 2) → BatchNorm1 (32) → Conv1D2 (64, size 7, stride 1) → ReLU2 → AvgPooling2 (size 7, stride 1) → BatchNorm2 (64) → Conv1D3 (64, size 7, stride 1) → ReLU3 → AvgPooling3 (size 7, stride 1) → BatchNorm3 (64) → Flatten → FC (2) → Softmax
LogReg: Linear1 (200) → Linear2 (2) → Sigmoid
LSTM: LSTM1 (16, 16, bidirectional) → LSTM2 (16, 16, bidirectional) → FC (2) → Softmax
AutoEncoder: Encoder: Linear1 (96) → ReLU1 → Linear2 (48) → ReLU2; Decoder: Linear1 (48) → ReLU1 → Linear2 (96) → ReLU2; then FC (2) → Softmax

The ANN is composed of three linear layers with 300 neurons each, and each linear layer is followed by a Batch Normalization layer (BatchNorm) and a Rectified Linear Unit ('ReLU') activation function. BatchNorm aims to normalise the output of each linear layer to ensure a more stable training process, while the 'ReLU' function contributes to accelerating the training phase and mitigating the problem of vanishing gradients. The output from the last 'ReLU' activation function is then sent to a fully-connected (FC) layer before being coupled with a 'Softmax' function to produce the final prediction.

The CNN model consists of three convolutional layers, each followed by the 'ReLU' function, an Average Pooling layer (AvgPooling), and a BatchNorm layer. 'ReLU' and BatchNorm function the same as in the ANN, while the AvgPooling layer serves to decrease the dimensionality of the features output by the 'ReLU'. Similar to the ANN, an FC layer followed by a Softmax function is used to output the final prediction. Considering computational cost and performance, the first Conv1D layer is configured with 32 kernels of size 7 and stride 2, while the second and third Conv1D layers use 64 kernels of size 7 and stride 1.

The LogReg model has two linear layers, followed by a logistic function ('Sigmoid'). The first linear layer with 200 neurons and the second with 2 neurons are used to analyse the relationship between the output and input features. The LSTM model contains two bidirectional LSTM layers, followed by the FC layer. The input and hidden sizes in each layer are set to 16.
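The ANN in Table 2 can be sketched in PyTorch as follows. This is a minimal illustration under our own assumptions: the input dimension (384, matching e.g. 'all-MiniLM-L6-v2' embeddings) and the builder function `build_ann` are illustrative choices, not the authors' code.

```python
# Sketch of the ANN from Table 2: three Linear(300)+BatchNorm+ReLU blocks,
# then an FC layer with 2 neurons; Softmax applied to obtain class probabilities.
import torch
import torch.nn as nn

def build_ann(in_dim: int) -> nn.Sequential:
    layers = []
    prev = in_dim
    for _ in range(3):
        layers += [nn.Linear(prev, 300), nn.BatchNorm1d(300), nn.ReLU()]
        prev = 300
    layers += [nn.Linear(300, 2)]  # FC layer with 2 neurons (binary classification)
    return nn.Sequential(*layers)

model = build_ann(384)
out = torch.softmax(model(torch.randn(4, 384)), dim=1)  # batch of 4 embeddings
print(out.shape)  # torch.Size([4, 2])
```

The other models in Table 2 would be assembled analogously, e.g. the LogReg variant as two `nn.Linear` layers followed by `torch.sigmoid`.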
The AutoEncoder model consists of an encoder for transforming the input to a compressed representation, a decoder for reconstructing the original input from the encoded representation, and an FC layer followed by the Softmax function for the final link prediction. Both encoder and decoder are composed of two linear layers, each followed by the 'ReLU' function. To encode the input into a compressed representation, the numbers of neurons in the two linear layers are 96 and 48 respectively. In the decoder, two linear layers with 48 and 96 neurons respectively are used to reconstruct the original input. Since our task considers link prediction as a binary classification problem, all models, excluding LogReg, use an FC layer with 2 neurons.

4. Case study

The case study used to evaluate the proposed approach comes from the automotive sector, where companies produce car parts, such as engines, front axles and fuel tanks, and sell them to car manufacturing companies around the world. The dataset has been used as a benchmark for the link prediction problem within supply chain networks in previous works and therefore offers potential for cross-comparison (Brintrup et al. 2018; Kosasih and Brintrup 2022; Kosasih et al. 2022). The dataset comprises 43,131 companies spanning 72 countries and producing 927 distinct products, each company associated with one or more of 5 certification types, along with the relationships among these entities (Table 3).

Table 3. Basic descriptions of Marklines data.

Entity | Example | Unique Number
Company | Hamenz For German Tech. Ind. (S.A.E.) | 43,131
Country | Egypt | 79
Certificate | ISO9001, QS9000, ISO/TS16949 | 5
Product | Piston Ring Machining | 927

We separated the data at the country level so as to evaluate our approach over multiple heterogeneous datasets. As shown in Table 4, each partition has different numbers of companies, products, certificates and relationships. 27 datasets have thus been generated. As a starting point, we use the same knowledge graph ontology as previous works, with three types of entities: companies, products, and certificates, and four types of relationships (triplets): (company, has_product, product), (company, has_cert, certificate), (company, supplies_to, company) and (product, purchased_by, company) (see Figure 1(c)).

The four triplets are used to generate two quintuplets for the evaluation of the proposed approach. The two quintuplets are (company, supplies, product, to, company) and (company, with, certificate, has, product). The prediction problem is thus the existence of a given quintuplet. Therefore, we consider the relationship prediction in a quintuplet as a binary classification problem. Next, we explain how we generate positive and negative relationships based on quintuplets to train the models, followed by experimental settings and results.

4.1. Generating training data

We refer to actual relationships in a quintuplet as positive relationships and non-existing relationships as negative relationships. To train a machine learning model, both positive and negative relationships are needed. Negative quintuplets are generated from the same triplets that were used to produce positive quintuplets. Consider the following positive quintuplet, (company A, supplies, product 1, to, company B): three negative quintuplets can be generated by replacing any one of the three entities, company A, product 1 and company B, using one that does not connect the other two. Given two lists of unique entities, i.e.
the list of unique companies and the list of unique products, for (company A, supplies, product 1, to, company B) we randomly select a company, company C, that is connected to neither company A nor company B, from the list of unique companies to replace company A or company B. Alternatively, we can randomly select a product, product 2, that is connected to neither company A nor company B, and use it to replace product 1. Therefore, we can generate three negative quintuplets: (company C, supplies, product 1, to, company B), (company A, supplies, product 1, to, company C) and (company A, supplies, product 2, to, company B). In addition, an incorrect relationship direction is also considered a negative quintuplet, such as (company B, supplies, product 1, to, company A).

Table 4. Data description for each country.

Country Name | Num-Company | Num-Product | Num-Certificate | Num-Relations
AUSTRALIA | 150 | 251 | 4 | 1,997
AUSTRIA | 195 | 196 | 5 | 3,199
BELGIUM | 293 | 245 | 5 | 4,704
BRAZIL | 603 | 458 | 5 | 10,705
CANADA | 390 | 359 | 5 | 22,734
CHINA | 10,500 | 877 | 5 | 330,180
CZECH REPUBLIC | 329 | 281 | 5 | 3,395
FRANCE | 518 | 510 | 5 | 21,052
GERMANY | 1,965 | 769 | 5 | 114,793
HUNGARY | 363 | 283 | 5 | 3,820
INDIA | 1,509 | 646 | 5 | 61,539
INDONESIA | 486 | 380 | 4 | 12,020
ITALY | 389 | 425 | 5 | 15,264
JAPAN | 5,068 | 890 | 5 | 215,514
KOREA | 1,195 | 706 | 5 | 38,295
MALAYSIA | 384 | 355 | 4 | 5,481
MEXICO | 463 | 395 | 5 | 7,900
POLAND | 456 | 303 | 5 | 3,467
RUSSIA | 213 | 235 | 4 | 2,038
SOUTH AFRICA | 185 | 242 | 5 | 2,623
SPAIN | 282 | 385 | 5 | 12,044
SWEDEN | 236 | 224 | 5 | 6,802
TAIWAN | 777 | 488 | 5 | 15,982
THAILAND | 1,296 | 535 | 4 | 20,439
TURKEY | 449 | 369 | 5 | 13,141
U.S.A. | 2,577 | 800 | 5 | 91,292
UK | 932 | 502 | 5 | 15,836

Based on the above method, each positive quintuplet can be used to produce several negative quintuplets. If all possible negative quintuplets are used, negatives far exceed positives.
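The entity-corruption scheme described above can be sketched as follows. This is a simplified illustration: the function name `corrupt` and the set of known facts `connected` are our own constructs, not the authors' implementation.

```python
# Sketch: generate one negative quintuplet by replacing the head company,
# the product, or the tail company with an entity not connected to the others.
import random

def corrupt(positive, companies, products, connected):
    """`connected` holds all true (company, product, company) facts."""
    a, _, p, _, b = positive
    while True:
        slot = random.choice(["head", "product", "tail"])
        if slot == "head":
            cand = (random.choice(companies), "supplies", p, "to", b)
        elif slot == "tail":
            cand = (a, "supplies", p, "to", random.choice(companies))
        else:
            cand = (a, "supplies", random.choice(products), "to", b)
        if (cand[0], cand[2], cand[4]) not in connected:  # must be a non-fact
            return cand

connected = {("Company A", "Product 1", "Company B")}
neg = corrupt(("Company A", "supplies", "Product 1", "to", "Company B"),
              ["Company A", "Company B", "Company C"],
              ["Product 1", "Product 2"], connected)
print(neg)
```

Reversed direction, e.g. (company B, supplies, product 1, to, company A), falls out naturally here, since the reversed fact is not in `connected`.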
This naturally imbalanced situation results from a characteristic of real-world network structures, in which the vast majority of possible node pairs in the graph do not have a direct link (Bacilieri et al. 2023; Mungo et al. 2023). To take this imbalance into account and ensure robust and reliable model training, we randomly select one negative quintuplet per positive quintuplet to train the model, in order to have a balanced dataset (Kosasih and Brintrup 2022).

4.2. Experimental settings

Experimental settings in this work include benchmark settings and model training settings. The benchmark settings aim to evaluate whether machine learning models, with the help of pretrained LMs, can provide more accurate relationship predictions in supply chain networks, while the model training settings aim to set the optimal parameters during the model training phase.

4.2.1. Benchmarks

To evaluate the effectiveness of our proposed approach, we set machine learning models without the help of pretrained LMs as benchmarks. The proposed approach is designed to power machine learning models with pretrained LMs, so we also select five pretrained LMs (cf. Section 3) to test the approach.

4.2.2. Settings of model training

As the language models used in this work are pretrained models, we only need to set the parameters used to train the machine learning models, which are shown in Table 2. The parameters for the training phase include the number of epochs, E, the batch size, B, and the learning rate, r. To ensure the uniformity of experiments and follow common guidelines in machine learning training (Yang and Shami 2020), we set B to 64 and r to 0.001 for all five machine learning models. For the number of epochs, E, we use an early-stopping strategy that ends the training process if the training loss decreases but the validation loss increases for 10 consecutive epochs; this determines E and also prevents overfitting.
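The early-stopping rule above can be sketched as a small helper. This is an illustrative class (our own naming), not the authors' code; it stops once validation loss has failed to improve for `patience` consecutive epochs.

```python
# Sketch of the early-stopping rule: stop after `patience` consecutive epochs
# without improvement in validation loss.
class EarlyStopping:
    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=3)
losses = [0.9, 0.8, 0.7, 0.71, 0.72, 0.73]  # validation loss rises after epoch 3
flags = [stopper.step(l) for l in losses]
print(flags)  # → [False, False, False, False, False, True]
```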
We consider the problem of relationship prediction in supply chain networks as a binary classification problem, as mentioned earlier. We thus use the Cross-Entropy Loss as the loss function for all machine learning model training. Adam (Kingma and Ba 2014) is selected as the optimiser for all machine learning models. In addition, following common rules for splitting a dataset into training, validation and testing sets, we use 70%, 10% and 20% of the relationships present in each data partition. All experiments are run on a desktop with an Intel Core i9-9900K CPU and a GeForce RTX 2080 Ti GPU card with 11GB physical memory. PyTorch is used to develop and train all models.

4.2.3. Performance metrics

Common metrics, including accuracy, precision, recall and f-score, used to evaluate the performance of a classification approach, are also used to evaluate our proposed approach. Table 5 shows the confusion matrix used to calculate the four metrics. In our case, TP and TN respectively represent positive and negative relationships that are predicted correctly, while FP and FN respectively represent negative relationships incorrectly predicted as positive and positive relationships incorrectly predicted as negative.

Table 5. Confusion Matrix.

              | Predicted Positive  | Predicted Negative
Real Positive | True Positive (TP)  | False Negative (FN)
Real Negative | False Positive (FP) | True Negative (TN)

Based on Table 5, the four common metrics can be calculated as:

• accuracy = (TP + TN) / (TP + TN + FP + FN) describes the ratio of correct relationship predictions to the total number of relationships.
• precision = TP / (TP + FP) stands for the ratio of accurate predictions of positive relationships to the total number of predicted positive relationships.
• recall = TP / (TP + FN) represents the ratio of accurate predictions of positive relationships to the total number of positive relationships.
• f-score shows the equilibrium between precision and recall, 2 × Precision × Recall / (Precision + Recall).
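For concreteness, the four metrics can be computed directly from confusion-matrix counts; the counts below are made-up example values.

```python
# Sketch: accuracy, precision, recall and f-score from confusion-matrix counts.
def metrics(tp: int, tn: int, fp: int, fn: int):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

acc, pre, rec, f1 = metrics(tp=80, tn=70, fp=30, fn=20)
print(round(acc, 3), round(pre, 3), round(rec, 3), round(f1, 3))
# → 0.75 0.727 0.8 0.762
```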
When we split the dataset into training, validation and testing sets, we randomly shuffle all positive and negative relationships for fairness. This process may lead to an imbalanced testing dataset even though the overall dataset is balanced. Therefore, to truly reflect the performance of our approach, we use the weighted f-score shown in Equation (1) in place of the commonly used f-score:

fw-score = Σ_k w_k × (2 × Precision_k × Recall_k) / (Precision_k + Recall_k)   (1)

where w_k is the ratio of relationships in class k over all relationships and is equal to n_k / N. N is the total number of relationships, while n_k is the number of relationships in class k.

In addition, we expect the developed approach to consider the importance of positive and negative relationships equally. Thus, we follow Grandini, Bagli, and Visani (2020), who compared metrics for multi-class classification, and use

balanced accuracy weighted = w_p × TP / (TP + FN) + w_n × TN / (TN + FP)

instead of the commonly used accuracy, where w_p and w_n respectively represent the ratio of positive relationships and the ratio of negative relationships in the testing dataset (note that w_p + w_n = 1). Balanced accuracy weighted not only shows the ability of the model to predict positive relationships but also reflects its ability to predict negative relationships.

4.3. Experimental results and discussions

4.3.1. Benchmark comparison between pretrained LM-enhanced link prediction and general machine learning models

Tables 6 and 7 respectively show the results achieved by the machine learning models when predicting the quintuplets (company, supplies, product, to, company) and (company, with, certificate, has, product), while Tables 8 and 9 present the results achieved by the pretrained LM-enhanced approach.
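The weighted metrics defined above can be sketched as small helper functions; the per-class precision/recall values and counts below are hypothetical examples, not results from the paper.

```python
# Sketch of Equation (1) (weighted f-score) and weighted balanced accuracy,
# with class weights taken as the class ratios in the test set.
def fw_score(per_class):
    """per_class: list of (n_k, precision_k, recall_k); weights w_k = n_k / N."""
    n_total = sum(n for n, _, _ in per_class)
    return sum((n / n_total) * 2 * p * r / (p + r) for n, p, r in per_class)

def balanced_accuracy_weighted(tp, tn, fp, fn):
    n_pos, n_neg = tp + fn, tn + fp
    wp, wn = n_pos / (n_pos + n_neg), n_neg / (n_pos + n_neg)
    return wp * tp / (tp + fn) + wn * tn / (tn + fp)

print(round(fw_score([(100, 0.9, 0.8), (100, 0.7, 0.6)]), 4))  # → 0.7466
print(balanced_accuracy_weighted(80, 70, 30, 20))              # → 0.75
```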
Based on the balanced accuracy weighted (Accbw), precision (Pre) and recall (Rec) values in these tables, an obvious finding is that pretrained LM-enhanced link prediction outperforms all general machine learning models for both quintuplets on all datasets. This finding can also be observed in Figures 3 and 4, which show the fw-score achieved by our proposed approach and by general machine learning models for link prediction in (company, supplies, product, to, company) and (company, with, certificate, has, product). These results indicate that pretrained LMs can indeed help general machine learning models achieve better link predictions in supply chain networks, confirming the observation that machine learning models benefit from the relational knowledge learned by pretrained LMs (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al. 2019; Safavi and Koutra 2021). This improved prediction accuracy is critical for mapping out complex inter-firm relationships, thus enabling organisations to gain a more comprehensive understanding of their supply chain dynamics.

In addition, the pretrained LM-enhanced approach presents more consistent results across different quintuplets, compared to the results obtained by direct application of machine learning models, which yield poor performance in predicting quintuplets of (company, with, certificate, has, product) compared to quintuplets of (company, supplies, product, to, company). The enhanced consistency across different quintuplets and across datasets of varying sizes suggests that our approach can function reliably under conditions of data scarcity or heterogeneity, ensuring robust performance in real-world supply chain environments.

Table 6. Results for the quintuplet of (company, supplies, product, to, company).
Country Name | LogReg (Accbw/Pre/Rec) | LSTM (Accbw/Pre/Rec) | CNN (Accbw/Pre/Rec) | AutoEncoder (Accbw/Pre/Rec) | ANN (Accbw/Pre/Rec)
AUSTRALIA 0.7953/0.7971/0.7936 0.8106/0.8106/0.8103 0.7872/0.7876/0.7873 0.8653/0.8675/0.8640 0.9037/0.9063/0.9024
AUSTRIA 0.7615/0.7619/0.7616 0.8853/0.8853/0.8853 0.9073/0.9075/0.9073 0.9199/0.9225/0.9200 0.9393/0.9418/0.9393
BELGIUM 0.8379/0.8375/0.8375 0.8748/0.8749/0.8756 0.8969/0.8968/0.8972 0.9251/0.9264/0.9267 0.9391/0.9392/0.9400
BRAZIL 0.8285/0.8292/0.8285 0.8672/0.8679/0.8672 0.9410/0.9414/0.9410 0.8899/0.8915/0.8898 0.9144/0.9152/0.9144
CANADA 0.8822/0.8857/0.8821 0.9245/0.9266/0.9246 0.9449/0.9458/0.9449 0.9702/0.9703/0.9702 0.9758/0.9759/0.9758
CHINA 0.8558/0.8558/0.8558 0.8698/0.8703/0.8697 0.8849/0.8852/0.8848 0.8841/0.8846/0.8839 0.8908/0.8909/0.8907
CZECH REPUBLIC 0.8115/0.8116/0.8115 0.8007/0.8013/0.8008 0.8227/0.8239/0.8229 0.8833/0.8847/0.8834 0.9168/0.9182/0.9170
FRANCE 0.8418/0.8427/0.8417 0.9083/0.9093/0.9082 0.9265/0.9272/0.9265 0.9240/0.9261/0.9239 0.9325/0.9347/0.9324
GERMANY 0.8437/0.8466/0.8442 0.9062/0.9079/0.9060 0.9187/0.9189/0.9187 0.9012/0.9014/0.9012 0.9145/0.9146/0.9145
HUNGARY 0.8321/0.8323/0.8329 0.8650/0.8649/0.8656 0.8989/0.8988/0.8995 0.8820/0.8828/0.8829 0.8992/0.9000/0.9000
INDIA 0.8236/0.8256/0.8237 0.8749/0.8752/0.8749 0.8939/0.8940/0.8939 0.8736/0.8745/0.8736 0.9037/0.9039/0.9037
INDONESIA 0.8514/0.8522/0.8512 0.8686/0.8705/0.8684 0.9209/0.9217/0.9207 0.9130/0.9141/0.9128 0.9230/0.9240/0.9228
ITALY 0.8548/0.8554/0.8547 0.9138/0.9153/0.9137 0.9212/0.9233/0.9210 0.9242/0.9259/0.9241 0.9331/0.9347/0.9330
JAPAN 0.8469/0.8533/0.8469 0.8643/0.8662/0.8643 0.9112/0.9115/0.9112 0.8986/0.8990/0.8987 0.9080/0.9083/0.9080
KOREA 0.8782/0.8784/0.8782 0.9243/0.9249/0.9243 0.9434/0.9435/0.9434 0.9195/0.9195/0.9195 0.9311/0.9311/0.9311
MALAYSIA 0.8187/0.8206/0.8197 0.8645/0.8665/0.8654 0.9057/0.9061/0.9062 0.8950/0.8959/0.8956 0.9211/0.9225/0.9219
MEXICO 0.8513/0.8522/0.8514 0.9198/0.9204/0.9198 0.9267/0.9274/0.9266 0.9129/0.9132/0.9128 0.9361/0.9363/0.9361
POLAND 0.7763/0.7765/0.7741 0.8461/0.8458/0.8457 0.8510/0.8514/0.8497 0.8615/0.8619/0.8608 0.8760/0.8834/0.8737
RUSSIA 0.8276/0.8277/0.8272 0.8512/0.8523/0.8505 0.8726/0.8730/0.8722 0.8823/0.8834/0.8817 0.8937/0.8978/0.8935
SOUTH AFRICA 0.7716/0.7722/0.7730 0.8542/0.8554/0.8522 0.8971/0.8987/0.8954 0.8919/0.8925/0.8911 0.9044/0.9062/0.9029
SPAIN 0.8310/0.8351/0.8311 0.9030/0.9050/0.9031 0.9324/0.9341/0.9324 0.9365/0.9383/0.9365 0.9528/0.9542/0.9528
SWEDEN 0.8456/0.8459/0.8454 0.9160/0.9169/0.9157 0.9127/0.9157/0.9122 0.9543/0.9549/0.9542 0.9667/0.9674/0.9665
TAIWAN 0.8408/0.8408/0.8409 0.9219/0.9223/0.9217 0.9424/0.9424/0.9424 0.9091/0.9094/0.9089 0.9310/0.9311/0.9310
THAILAND 0.8614/0.8618/0.8617 0.8731/0.8745/0.8727 0.9141/0.9145/0.9139 0.8927/0.8928/0.8926 0.9055/0.9060/0.9053
TURKEY 0.8446/0.8449/0.8444 0.8725/0.8752/0.8718 0.9062/0.9075/0.9057 0.8975/0.8992/0.8970 0.9183/0.9192/0.9180
U.S.A. 0.8585/0.8603/0.8582 0.8859/0.8891/0.8862 0.9306/0.9306/0.9306 0.9139/0.9145/0.9138 0.9351/0.9352/0.9351
UK 0.8558/0.8559/0.8560 0.8673/0.8717/0.8680 0.9147/0.9148/0.9148 0.9073/0.9081/0.9075 0.9188/0.9191/0.9190

Table 7. Results for the quintuplet of (company, with, certificate, has, product).
Country Name LogReg (Accbw/Pre/Rec) LSTM (Accbw/Pre/Rec) CNN (Accbw/Pre/Rec) AutoEncoder (Accbw/Pre/Rec) ANN (Accbw/Pre/Rec) AUSTRALIA 0.7052/0.7063/0.7052 0.6742/0.6750/0.6742 0.7052/0.7057/0.7052 0.7768/0.7780/0.7768 0.7987/0.8029/0.7987 AUSTRIA 0.6891/0.6901/0.6894 0.6698/0.6704/0.6695 0.6583/0.6622/0.6576 0.7708/0.7721/0.7710 0.8068/0.8072/0.8067 BELGIUM 0.7148/0.7162/0.7138 0.6784/0.6787/0.6778 0.7107/0.7115/0.7099 0.7651/0.7677/0.7642 0.7956/0.7981/0.7946 BRAZIL 0.7878/0.8012/0.7829 0.7908/0.8097/0.7852 0.8348/0.8361/0.8335 0.8185/0.8293/0.8147 0.8406/0.8440/0.8384 CANADA 0.7766/0.7801/0.7699 0.7684/0.7672/0.7683 0.7924/0.7911/0.7913 0.8225/0.8269/0.8176 0.8572/0.8585/0.8541 CHINA 0.7752/0.7844/0.7812 0.7961/0.7958/0.7971 0.8103/0.8097/0.8100 0.8058/0.8059/0.8038 0.8044/0.8050/0.8043 CZECH REPUBLIC 0.7087/0.7114/0.7092 0.7261/0.7262/0.7262 0.7076/0.7081/0.7075 0.7431/0.7442/0.7430 0.7984/0.7988/0.7983 FRANCE 0.7743/0.7795/0.7716 0.8011/0.8032/0.7996 0.8183/0.8205/0.8168 0.8121/0.8177/0.8098 0.8262/0.8279/0.8249 GERMANY 0.7247/0.7248/0.7246 0.7610/0.7612/0.7609 0.7702/0.7708/0.7703 0.7787/0.7816/0.7781 0.7897/0.7904/0.7893 HUNGARY 0.6435/0.6450/0.6454 0.6658/0.6648/0.6649 0.7969/0.7967/0.7956 0.7353/0.7351/0.7344 0.7661/0.7654/0.7657 INDIA 0.7354/0.7413/0.7380 0.7716/0.7715/0.7712 0.8135/0.8143/0.8142 0.8067/0.8080/0.8055 0.8130/0.8135/0.8121 INDONESIA 0.7364/0.7409/0.7368 0.7278/0.7308/0.7281 0.7695/0.7710/0.7697 0.7705/0.7911/0.7713 0.8177/0.8220/0.8180 ITALY 0.7499/0.7546/0.7516 0.7283/0.7285/0.7286 0.8036/0.8050/0.8045 0.8257/0.8332/0.8278 0.8419/0.8451/0.8433 JAPAN 0.8413/0.8412/0.8399 0.8506/0.8529/0.8478 0.8492/0.8499/0.8474 0.8563/0.8584/0.8537 0.8590/0.8615/0.8562 KOREA 0.7466/0.7469/0.7466 0.7799/0.7804/0.7800 0.8173/0.8186/0.8173 0.7990/0.8046/0.7991 0.8039/0.8045/0.8039 MALAYSIA 0.6856/0.6864/0.6861 0.6937/0.6980/0.6953 0.7219/0.7253/0.7232 0.7358/0.7449/0.7377 0.7552/0.7619/0.7570 MEXICO 0.7342/0.7368/0.7340 0.6884/0.6896/0.6882 
0.7729/0.7736/0.7729 0.7917/0.7982/0.7914 0.8214/0.8222/0.8213 POLAND 0.7339/0.7328/0.7339 0.7301/0.7285/0.7269 0.7842/0.7828/0.7828 0.7729/0.7725/0.7708 0.7954/0.7947/0.7926 RUSSIA 0.8536/0.8577/0.8488 0.7651/0.7640/0.7635 0.6365/0.6359/0.6353 0.8568/0.8569/0.8553 0.8693/0.8711/0.8663 SOUTH AFRICA 0.6576/0.6584/0.6579 0.7023/0.7049/0.7016 0.7918/0.7944/0.7914 0.7545/0.7584/0.7540 0.7793/0.7837/0.7787 SPAIN 0.7017/0.7037/0.7033 0.7592/0.7603/0.7574 0.7868/0.7875/0.7862 0.8234/0.8244/0.8232 0.8394/0.8406/0.8383 SWEDEN 0.6786/0.6784/0.6787 0.6828/0.6825/0.6817 0.6984/0.7007/0.6955 0.7531/0.7587/0.7500 0.8000/0.8057/0.7970 TAIWAN 0.7218/0.7221/0.7212 0.7639/0.7674/0.7647 0.8301/0.8309/0.8305 0.7963/0.7983/0.7970 0.8037/0.8043/0.8040 THAILAND 0.7064/0.7151/0.7097 0.7056/0.7056/0.7052 0.7483/0.7502/0.7496 0.7659/0.7666/0.7658 0.7824/0.7833/0.7810 TURKEY 0.7273/0.7279/0.7279 0.7047/0.7048/0.7039 0.8175/0.8212/0.8161 0.7796/0.7850/0.7779 0.8058/0.8066/0.8051 U.S.A. 0.7795/0.7863/0.7821 0.8148/0.8146/0.8147 0.8330/0.8333/0.8323 0.8378/0.8384/0.8372 0.8400/0.8409/0.8392 UK 0.7590/0.7612/0.7602 0.7932/0.7938/0.7934 0.8342/0.8356/0.8335 0.8117/0.8138/0.8106 0.8269/0.8277/0.8262

4.3.2. Comparisons between machine learning models

Based on the results achieved by the machine learning models shown in Tables 6 and 7 and Figures 3(a) and 4(a), we find that, on all datasets, all machine learning models perform better at predicting relationships in the quintuplet (company, supplies, product, to, company) than in (company, with, certificate, has, product). This is because each country's dataset contains more samples of (company, supplies, product, to, company) than of (company, with, certificate, has, product), which can be indirectly observed from the large number of companies and the small numbers of certificates and products shown in Table 3. Among the different machine learning models, ANN, CNN1D and AutoEncoder predict relationships more
Table 8. Pretrained LM-enhanced machine learning, ‘all-MiniLM-L12-v2’, for different machine learning models for the quintuplet (company, supplies, product, to, company). Country Name LogReg (Accbw/Pre/Rec) LSTM (Accbw/Pre/Rec) CNN (Accbw/Pre/Rec) AutoEncoder (Accbw/Pre/Rec) ANN (Accbw/Pre/Rec) AUSTRALIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9991/0.9990/0.9991 AUSTRIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9614/0.9658/0.9615 BELGIUM 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9982/0.9981/0.9982 BRAZIL 1.0000/1.0000/1.0000 0.9990/0.9990/0.9990 0.9996/0.9996/0.9996 0.9997/0.9997/0.9997 0.9994/0.9994/0.9994 CANADA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 CHINA 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9998/0.9998/0.9998 0.9486/0.9239/0.9492 0.9999/0.9999/0.9999 CZECH REPUBLIC 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 FRANCE 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9999/0.9999/0.9999 0.9989/0.9989/0.9989 GERMANY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9998/0.9998/0.9998 HUNGARY 1.0000/1.0000/1.0000 0.9985/0.9985/0.9986 0.9998/0.9998/0.9998 1.0000/1.0000/1.0000 0.9989/0.9988/0.9989 INDIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9993/0.9993/0.9993 0.9999/0.9999/0.9999 INDONESIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9983/0.9983/0.9983 ITALY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 JAPAN 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9999/0.9999/0.9999 0.9980/0.9980/0.9980 0.9998/0.9998/0.9998 KOREA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 MALAYSIA
1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9997/0.9997/0.9997 0.9999/0.9999/0.9999 0.9959/0.9960/0.9958 MEXICO 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 POLAND 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9979/0.9979/0.9979 RUSSIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9997/0.9997/0.9996 1.0000/1.0000/1.0000 0.9852/0.9881/0.9856 SOUTH AFRICA 1.0000/1.0000/1.0000 0.9995/0.9995/0.9994 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 SPAIN 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 SWEDEN 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 TAIWAN 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9992/0.9992/0.9992 THAILAND 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 TURKEY 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 U.S.A. 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9997/0.9997/0.9997 0.9998/0.9998/0.9998 0.9998/0.9998/0.9998 UK 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9994/0.9994/0.9994 Table 9. 
Pretrained LM-enhancedmachine learning, ‘all-MiniLM-L12-v2’, for different machine learning models to predict relationships in the quintuplet of (company, with, certificate, has, product) Country Name LogReg (Accbw/Pre/Rec) LSTM (Accbw/Pre/Rec) CNN (Accbw/Pre/Rec) AutoEncoder (Accbw/Pre/Rec) ANN (Accbw/Pre/Rec) AUSTRALIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 AUSTRIA 1.0000/1.0000/1.0000 0.9990/0.9990/0.9989 0.9990/0.9990/0.9989 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 BELGIUM 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9992/0.9992/0.9992 0.9984/0.9984/0.9985 BRAZIL 1.0000/1.0000/1.0000 0.9899/0.9899/0.9900 0.9984/0.9984/0.9983 0.9992/0.9992/0.9992 0.9995/0.9995/0.9995 CANADA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9990/0.9990/0.9990 CHINA 1.0000/1.0000/1.0000 0.9996/0.9996/0.9996 0.9992/0.9992/0.9992 0.9992/0.9992/0.9992 0.9998/0.9998/0.9998 CZECH REPUBLIC 1.0000/1.0000/1.0000 0.9991/0.9991/0.9991 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9993/0.9993/0.9993 FRANCE 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9996/0.9996/0.9996 GERMANY 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 0.9996/0.9996/0.9996 0.9998/0.9998/0.9998 0.9994/0.9994/0.9994 HUNGARY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 INDIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9997/0.9997/0.9997 0.9997/0.9996/0.9997 INDONESIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 ITALY 1.0000/1.0000/1.0000 0.9993/0.9993/0.9993 0.9999/0.9999/0.9999 1.0000/1.0000/1.0000 0.9987/0.9987/0.9987 JAPAN 0.9998/0.9998/0.9999 0.9998/0.9998/0.9999 0.9999/0.9999/0.9999 0.9993/0.9993/0.9994 0.9992/0.9992/0.9993 KOREA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 
1.0000/1.0000/1.0000 MALAYSIA 0.9984/0.9985/0.9984 0.9984/0.9984/0.9985 1.0000/1.0000/1.0000 0.9984/0.9985/0.9984 0.9970/0.9971/0.9970 MEXICO 0.9983/0.9983/0.9983 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9988/0.9988/0.9988 0.9979/0.9979/0.9979 POLAND 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9983/0.9982/0.9985 0.9993/0.9993/0.9994 0.9992/0.9992/0.9992 RUSSIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9990/0.9990/0.9989 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 SOUTH AFRICA 0.9983/0.9983/0.9982 0.9983/0.9983/0.9982 0.9974/0.9974/0.9974 0.9983/0.9983/0.9982 0.9983/0.9983/0.9982 SPAIN 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9991/0.9991/0.9992 0.9993/0.9993/0.9993 0.9995/0.9995/0.9995 SWEDEN 0.9984/0.9984/0.9985 0.9984/0.9984/0.9985 0.9995/0.9995/0.9995 0.9979/0.9980/0.9979 0.9984/0.9984/0.9985 TAIWAN 1.0000/1.0000/1.0000 0.9991/0.9991/0.9991 0.9997/0.9997/0.9997 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 THAILAND 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 TURKEY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 U.S.A. 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 0.9998/0.9998/0.9998 0.9998/0.9998/0.9998 0.9995/0.9995/0.9995 UK 1.0000/1.0000/1.0000 0.9989/0.9989/0.9989 0.9993/0.9993/0.9993 1.0000/1.0000/1.0000 0.9997/0.9997/0.9998 accurately in both types of quintuplets than LSTM and LogReg. This finding matches a common conclusion from many existing works (Abu-Nimeh et al. 2007; Caruana and Niculescu-Mizil 2006; Zheng, Ivanov, and Brintrup 2024; Zheng, Kong, and Brintrup 2023) showing that ANN and CNN are better than LogReg and LSTM at binary classification tasks. This observation can help practitioners with model selection during deployment. Figure 3.
(a) results on the quintuplet (company, supplies, product, to, company) achieved by different machine learning models; (b) results achieved by pretrained LM-enhanced machine learning models with ‘all-MiniLM-L12-v2’. Figure 4. (a) shows the results of predicting relationships in the quintuplet (company, with, certificate, has, product) achieved by five machine learning models on all countries’ datasets, while (b) presents the results achieved by our proposed approach using five machine learning models empowered by the pretrained LM ‘all-MiniLM-L12-v2’. Table 10. Results of predicting relationships in the quintuplet (company, supplies, product, to, company) (referred to as X) and the quintuplet (company, with, certificate, has, product) (referred to as Y), both achieved using pretrained LM-enhanced CNN. all-MiniLM-L6-v2 all-MiniLM-L12-v2 all-distilroberta-v1 paraphrase-albert-small-v2 distiluse-base-multilingual-cased-v2 Country (X / Y) (X / Y) (X / Y) (X / Y) (X / Y) AUSTRALIA 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 AUSTRIA 1.0000/1.0000 1.0000/0.9990 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 BELGIUM 1.0000/1.0000 1.0000/1.0000 0.9994/1.0000 1.0000/1.0000 1.0000/1.0000 BRAZIL 0.9994/0.9997 0.9996/0.9983 1.0000/0.9997 1.0000/0.9999 1.0000/1.0000 CANADA 1.0000/1.0000 1.0000/1.0000 1.0000/0.9998 1.0000/1.0000 1.0000/1.0000 CHINA 0.9999/0.9998 0.9998/0.9992 1.0000/0.9998 1.0000/0.9999 1.0000/1.0000 CZECH REPUBLIC 1.0000/1.0000 1.0000/1.0000 0.9996/0.9996 1.0000/0.9996 1.0000/1.0000 FRANCE 0.9997/1.0000 0.9998/1.0000 1.0000/0.9998 1.0000/0.9993 1.0000/1.0000 GERMANY 0.9999/0.9998 1.0000/0.9996 1.0000/1.0000 1.0000/0.9999 1.0000/1.0000 HUNGARY 1.0000/1.0000 0.9998/1.0000 1.0000/0.9998 1.0000/1.0000 1.0000/1.0000 INDIA 1.0000/0.9999 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 INDONESIA 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/0.9983 1.0000/1.0000 ITALY
1.0000/0.9999 0.9998/0.9999 1.0000/1.0000 1.0000/0.9986 1.0000/1.0000 JAPAN 1.0000/1.0000 0.9999/0.9999 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 KOREA 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 MALAYSIA 0.9993/0.9977 0.9997/1.0000 1.0000/0.9995 1.0000/0.9997 1.0000/1.0000 MEXICO 1.0000/1.0000 1.0000/0.9998 1.0000/0.9991 1.0000/1.0000 1.0000/1.0000 POLAND 1.0000/1.0000 1.0000/0.9983 0.9987/1.0000 1.0000/1.0000 1.0000/1.0000 RUSSIA 1.0000/1.0000 0.9997/0.9990 1.0000/1.0000 1.0000/0.9963 1.0000/1.0000 SOUTH AFRICA 1.0000/0.9997 1.0000/0.9974 0.9997/0.9991 1.0000/1.0000 1.0000/1.0000 SPAIN 1.0000/1.0000 1.0000/0.9991 1.0000/0.9998 1.0000/0.9995 1.0000/1.0000 SWEDEN 1.0000/0.9984 1.0000/0.9995 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 TAIWAN 1.0000/0.9998 1.0000/0.9997 1.0000/0.9998 1.0000/0.9998 1.0000/1.0000 THAILAND 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 TURKEY 0.9999/1.0000 1.0000/1.0000 1.0000/0.9997 0.9999/0.9999 1.0000/1.0000 U.S.A. 0.9998/0.9998 0.9997/0.9998 0.9998/0.9998 0.9999/0.9999 0.9999/1.0000 UK 1.0000/1.0000 1.0000/0.9993 1.0000/1.0000 0.9998/0.9997 1.0000/1.0000

4.3.3. Comparisons between pretrained language models

Table 10 shows that the pretrained LMs used to enhance the CNN model provide higher prediction accuracy on both types of quintuplets. Although all five pretrained LMs improve relationship prediction accuracy, their performances differ subtly. For example, in predicting (company, supplies, product, to, company) (see the X columns in Table 10), ‘distiluse-base-multilingual-cased-v2’ outperforms the other four. The same can be observed for predicting the quintuplet (company, with, certificate, has, product) (see the Y columns in Table 10). The performance of ‘distiluse-base-multilingual-cased-v2’ is also more consistent across all tasks than that of the other four models.
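For illustration, the sketch below shows how a quintuplet might be verbalised before being embedded by any of the five pretrained sentence encoders compared in Table 10. The templates and helper names are our own illustrative assumptions, not the exact implementation used in the case study.

```python
# Illustrative only: verbalising a quintuplet so that any of the five
# pretrained sentence encoders compared in Table 10 can embed it.
# The templates and helper names below are assumptions, not the paper's code.

QUINTUPLET_TEMPLATES = {
    "supplies": "{h} supplies {m} to {t}.",  # (company, supplies, product, to, company)
    "certificate": "{h} with certificate {m} has product {t}.",  # (company, with, certificate, has, product)
}

ENCODER_CHECKPOINTS = [
    "all-MiniLM-L6-v2",
    "all-MiniLM-L12-v2",
    "all-distilroberta-v1",
    "paraphrase-albert-small-v2",
    "distiluse-base-multilingual-cased-v2",
]

def quintuplet_to_text(head, relation, middle, tail):
    """Turn a quintuplet into the sentence handed to the sentence encoder."""
    return QUINTUPLET_TEMPLATES[relation].format(h=head, m=middle, t=tail)

# The embedding step would then need the sentence-transformers package, e.g.:
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("distiluse-base-multilingual-cased-v2")
#   vec = model.encode(quintuplet_to_text("Company A", "supplies", "Product 1", "Company B"))
```

Swapping the checkpoint string is the only change needed to move between the five encoders; in our comparison, the multilingual ‘distiluse-base-multilingual-cased-v2’ was the most consistent.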
This pretrained LM is trained on data in 15 languages (Reimers and Gurevych 2019), providing richer knowledge hidden in different languages. Particularly when using language models as knowledge bases, models trained on multilingual data can learn better representations than those trained on monolingual data (Kassner, Dufter, and Schütze 2021; Pratap et al. 2020). This indicates an advantage that is particularly applicable to global supply chain networks, and it also explains why the multilingual model outperforms the monolingual models in our case study. Although the primary language of the dataset is English, it was collected from various countries, and certain elements, such as company names, contain characters from other languages. These multilingual components present challenges for monolingual language models, which are less adept at processing non-English characters than multilingual language models. As a result, the enhanced predictive performance observed with multilingual language models can be partially attributed to their ability to interpret and handle these diverse linguistic features effectively. Regarding the relationship predictions in the different types of quintuplets, four of the five pretrained LMs, excluding ‘distiluse-base-multilingual-cased-v2’, predict better on (company, supplies, product, to, company) than on (company, with, certificate, has, product). This is because relationships in the former describe network-level information in supply chains, while relationships in the latter represent information internal to a company. Network-level information in supply chains, such as relationships between companies, tends to be more accessible and observable than the internal information of a specific company, because network-level information can be inferred from public sources and industry publications.
Internal information about a company, such as specific product certificates or product processes, is more sensitive and not readily available to external observers. Even so, our pretrained LM-enhanced models still provide high prediction accuracy compared to traditional ML methods. This directly contributes to greater supply chain visibility and supports more effective risk management and strategic planning.

4.3.4. Summary of findings

Our findings can be summarised as follows:
• Enhancing link prediction models with pretrained LMs outperforms all five benchmarks, indicating that pretrained LMs can indeed help common machine learning models achieve better relationship predictions in supply chain networks, owing to the relational knowledge learned by these pretrained LMs.
• A pretrained LM-enhanced machine learning model is more consistent on prediction tasks than a machine learning model alone, and is less affected by differences in dataset size.
• Pretrained LM-enhanced models are better at predicting relationships that rely on network-level information than relationships that rely on internal-company information. This is because network-level information in supply chains tends to be more accessible and observable from public sources than the internal information of a specific entity in the network. As such, pretrained LM enhancement works better in cases where we predict who supplies which product to whom than in cases where we predict, for example, which quality certification a company may have for which product.
• ANN, CNN and AutoEncoder, which are commonly good at solving binary classification problems, can predict relationships in supply chain networks more accurately than LSTM and LogReg in our case.
• Pretrained multilingual LMs benefit common machine learning models more than monolingual LMs, because they are trained on multilingual data and learn better representations.

5.
Conclusions, managerial implications, limitations, discussions, and future works

5.1. Conclusions

Relationship prediction, also called link prediction or supply network reconstruction, is an emergent area of ‘digital supply chain surveillance’ research that aims to increase the visibility of supply networks using data-driven techniques, without having to rely on the willingness of supply chain actors to share information. Although many of the proposed methods have been very successful in reconstructing supply-buy relationships, the context in which these relationships are embedded has thus far lacked attention. This hinders researchers and practitioners from taking full advantage of these methods, as they cannot accurately differentiate between a transactional relationship and the established supply relationships that characterise the physical resources needed to produce a product. As such, estimations of resilience and of distance to malicious actors and harmful practices remain inaccurate. Recently, Generative AI (GenAI) methods such as LLMs have become popular for eliciting information patterns from natural language data. There is also much hype about their potential in SCM. However, we cannot simply ask an LLM whether a supply relationship exists, due to their hallucination problem. Hence we need approaches that combine the power of GenAI with structured, guaranteeable methods when it comes to supply network surveillance. To date, there have been no studies on the use of LLMs for supply network surveillance. In this work, we developed a novel framework for predicting complex, multi-relational interactions in supply chain networks by integrating GenAI with machine learning. We introduced a new term, ‘quintuplet’, a structured representation that extends traditional triplets by embedding contextual information such as product flows and multi-hop dependencies. Compared to conventional triplets, which capture isolated relationships (e.g.
(company A, supplies, company B)), quintuplets model interconnected chains (e.g. (company A, supplies, Product 1, to, company B)), enabling holistic visibility into end-to-end supply chain dynamics. We formulate link prediction as a binary classification task, aiming to predict whether a quintuplet exists or not, to address the inherent incompleteness of supply chain knowledge graphs. Our work advances the literature in three key ways. First, we bridge the gap between natural language processing (NLP) and supply chain knowledge graph research by demonstrating that untuned pretrained LMs can generate semantically rich embeddings with relational knowledge while mitigating hallucinations through knowledge graph anchoring. While prior studies highlight hallucination risks in generative tasks (e.g. text synthesis; Huang et al. 2023), we extend mitigation strategies to structured link prediction by training machine learning models to map language model embeddings to knowledge graph-verified relationships. This regulates predictions to align with domain-specific facts. Second, quintuplets address knowledge graph sparsity by enabling the inference of indirect relationships (e.g. multi-tier supplier dependencies) through contextualised chains. Third, we empirically validate that pretrained LMs encode latent relational knowledge relevant to supply chains (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al. 2019; Safavi and Koutra 2021), a finding that invites further exploration of LMs for tasks like risk prediction and sustainability analytics. A real-world practical case study is used to evaluate the proposed approach with comparative benchmarks that use machine learning methods without pretrained LM enhancement.
Results show that pretrained LM-enhanced quintuplet prediction surpasses all benchmarks and provides consistent performance across all datasets, with the advantage of providing contextual information that allows stakeholders to track the movements of products in a global network. More importantly, our method avoids LM fine-tuning, making it deployable for organisations lacking NLP expertise or extensive textual data. This scalability bridges a longstanding gap between academic GenAI research and industrial supply chain applications, where manual effort and reactive decision-making remain prevalent, and also opens the door to more practical work on solving supply chain challenges with language models.

5.2. Managerial implications

Beyond the successful real-world case study evaluation mentioned above, our work also yields several practical implications for supply chain management. For example, organisations could consider integrating their existing enterprise resource planning (ERP) systems, inventory databases, and other structured data sources with unstructured data (e.g. supplier communications, market reports) to construct comprehensive supply chain knowledge graphs. This integration will enhance visibility into complex relationships and dependencies, enabling more accurate forecasting and risk management. Besides, supply chain managers, logistics coordinators, and procurement teams can work together to adapt their decision-support systems to incorporate GenAI and machine learning models. By embedding our quintuplet-based link prediction approach within these systems, organisations can proactively identify hidden dependencies and potential disruptions, leading to better-informed production planning and inventory management. With improved supply chain visibility, organisations can use the insights from our framework to strengthen supplier relationships and manage risks more effectively.
For instance, by identifying previously unrecognised dependencies, companies can diversify their supplier base, negotiate more robust contracts, and prepare contingency plans to mitigate disruptions.

5.3. Limitations, discussions, and future works

Although the integration of pretrained LMs with a supply chain knowledge graph significantly enhances link prediction capabilities and contributes to improved supply chain visibility, several limitations should still be considered. First, the performance of the proposed model is inherently dependent on the quality and completeness of the underlying knowledge graph; inaccuracies or missing links can propagate errors throughout the prediction process, thereby compromising overall reliability and potentially causing the pretrained LMs to generate ‘hallucinated’ outputs. These erroneous outputs may lead to misguided decision-making if not adequately managed, underscoring the necessity for a comprehensive and high-quality supply chain knowledge graph. Second, although pretrained LMs excel at capturing rich contextual information from text, they are typically trained on general-purpose corpora, which may not fully encapsulate the specialised terminologies and nuances inherent in supply chain contexts. Third, with regard to scalability, our current framework employs a single-shot (zero-shot) approach rather than a multi-shot strategy (e.g. few-shot or iterative query refinement). While multi-shot approaches could potentially enhance accuracy by providing richer contextual guidance and iterative refinement, they also entail significantly higher computational costs. Each additional ‘shot’ increases token processing requirements and inference latency in a linear fashion. For instance, a five-shot prompt could require approximately five times more tokens, and thus proportionally longer inference times, than a single-shot prompt.
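The linear cost growth can be made concrete with a back-of-the-envelope token estimate; the counts used here (400 tokens per in-context example, 40 for the query) are invented for illustration, not measured from our system:

```python
# Back-of-the-envelope prompt-size estimate. Token counts are invented
# placeholders; real counts depend on the tokenizer and prompt wording.

def prompt_tokens(n_shots, tokens_per_example, query_tokens):
    """Total tokens for an n-shot prompt: n in-context examples plus the query."""
    return n_shots * tokens_per_example + query_tokens

single_shot = prompt_tokens(1, 400, 40)  # 440 tokens
five_shot = prompt_tokens(5, 400, 40)    # 2040 tokens
# When the in-context examples dominate the query, a five-shot prompt needs
# roughly five times the tokens of a single-shot prompt (about 4.6x here),
# and inference latency grows roughly in step with prompt length.
```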
Given the typical resource constraints of exploratory studies, our implementation prioritises computational efficiency and feasibility on standard hardware. Fourth, the selection of relatively small-to-medium pretrained language models, as detailed in Table 1, reflects a deliberate balance between performance and practicality. Although larger models, such as the full BERT or cloud-based models, might offer enhanced performance due to their greater number of parameters, their use would require substantially more memory and computational resources, potentially complicating reproducibility. Moreover, the deployment of cloud-based models raises additional concerns regarding data confidentiality, particularly when dealing with sensitive supply chain information. Fifth, while our framework leverages a relatively small pretrained language model without fine-tuning, the results demonstrate meaningful performance improvements attributable to the injection of semantic context into the link prediction process (see Tables 6–10). The generative AI component plays an integral role by producing rich embeddings that encode domain-specific knowledge, thus enabling the downstream classifier to achieve superior predictive accuracy and cross-context consistency compared to baseline methods. Although larger models or fine-tuning could potentially yield further improvements, our choice of a smaller model was driven by practical constraints and the objective of demonstrating a deployable solution.
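This division of labour, a frozen encoder producing embeddings and a lightweight downstream classifier consuming them, can be sketched end to end. To keep the sketch dependency-free, a hashed character-trigram vector stands in for the pretrained LM embedding and a nearest-centroid rule stands in for the trained ANN/CNN classifier; both stand-ins, and all names and data, are our own simplifications rather than the paper's implementation.

```python
# Dependency-free sketch of the pipeline: verbalise a quintuplet, embed it,
# and classify the embedding as "link exists" (1) or not (0).
# The hashed-trigram "embedding" and nearest-centroid "classifier" are
# illustrative stand-ins for the pretrained LM and the trained ANN/CNN.

import hashlib

DIM = 64  # stand-in embedding dimensionality

def embed(text):
    """Hashed character-trigram vector, L2-normalised (NOT a real LM)."""
    vec = [0.0] * DIM
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(pos_texts, neg_texts):
    """'Training' reduces to one centroid per class in this sketch."""
    return (centroid([embed(t) for t in pos_texts]),
            centroid([embed(t) for t in neg_texts]))

def predict(model, text):
    """Assign the class whose centroid is nearest in squared distance."""
    pos_c, neg_c = model
    v = embed(text)
    d_pos = sum((a - b) ** 2 for a, b in zip(v, pos_c))
    d_neg = sum((a - b) ** 2 for a, b in zip(v, neg_c))
    return 1 if d_pos <= d_neg else 0

# Toy verbalised quintuplets; all names are invented.
pos = ["alpha supplies widget to beta", "gamma supplies widget to delta"]
neg = ["alpha has certificate for widget", "gamma has certificate for widget"]
model = train(pos, neg)
```

On these toy sentences the stand-in pipeline separates the two quintuplet patterns; in the paper, the same role is played by sentence-encoder embeddings feeding the benchmarked classifiers.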
Future work will address these limitations by focussing on: (1) constructing a more accurate and comprehensive supply chain knowledge graph; (2) evaluating the applicability of our approach across a broader range of industrial use cases; (3) exploring additional forms of contextual knowledge derived from pretrained language models to further advance supply chain management practices; (4) examining the trade-offs between efficiency and accuracy when employing a multi-shot framework; and (5) assessing the performance gains from using larger language models relative to the increased computational requirements.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

This work was supported by the Engineering and Physical Sciences Research Council (grant number EP/W019868/1).

Notes on contributors

Dr. Ge Zheng is currently a Research Associate in the Supply Chain AI Lab (SCAIL), led by Professor Alexandra Brintrup, at the Institute for Manufacturing (IfM), Department of Engineering, University of Cambridge, UK. She received her PhD degree in Computer Science with a full PhD scholarship from Bournemouth University, UK, and an MSc degree in Electronic Engineering with an Academic Excellence International Masters Scholarship from the University of Essex. Her research areas involve supply chain risk prediction, pattern recognition, intelligent transportation systems, and healthcare applications.

Alexandra Brintrup is a Professor of Digital Manufacturing and head of the Supply Chain AI Lab (SCAIL) at the Institute for Manufacturing (IfM), Department of Engineering, University of Cambridge, UK. She has a PhD in Artificial Intelligence, an MSc in Applied Maths and Computing, and a BEng in Manufacturing Engineering. She specialises in distributed negotiation, machine learning, autonomous systems, and nature-inspired optimisation in complex supply networks.
Data availability statement

Due to the commercially sensitive nature of this research, supporting data is not available.

ORCID

Ge Zheng http://orcid.org/0000-0002-9983-7120
Alexandra Brintrup http://orcid.org/0000-0002-4189-2434

References

Abu-Nimeh, S., D. Nappa, X. Wang, and S. Nair. 2007. “A Comparison of Machine Learning Techniques for Phishing Detection.” In Proceedings of the Anti-Phishing Working Groups 2nd Annual ECrime Researchers Summit, 60–69. Pittsburgh, PA, USA, October 04–05, 2007. Act, S. 2015. “Modern Slavery Act.” United Kingdom Parliament. Agrawal, G., T. Kumarage, Z. Alghami, and H. Liu. 2023. “Can Knowledge Graphs Reduce Hallucinations in LLMs?: A Survey.” arXiv preprint arXiv:2311.07914. Aguero, D., and S. D. Nelson. 2024. “The Potential Application of Large Language Models in Pharmaceutical Supply Chain Management.” The Journal of Pediatric Pharmacology and Therapeutics 29 (2): 200–205. https://doi.org/10.5863/1551-6776-29.2.200. Ahmed, C., A. ElKorany, and R. Bahgat. 2016. “A Supervised Learning Approach to Link Prediction in Twitter.” Social Network Analysis and Mining 6 (1): 1–11. https://doi.org/10.1007/s13278-016-0333-1. Albawi, S., T. A. Mohammed, and S. Al-Zawi. 2017. “Understanding of a Convolutional Neural Network.” In 2017 International Conference on Engineering and Technology (ICET), 1–6. IEEE. AlMahri, S., L. Xu, and A. Brintrup. 2024. “Enhancing Supply Chain Visibility with Knowledge Graphs and Large Language Models.” arXiv preprint arXiv:2408.07705. Astha, Rajvanshi. 2023. “How AI Could Transform Fast Fashion for the Better–and Worse.” Accessed November 12, 2024. Bacilieri, A., A. Borsos, P. Astudillo-Estevez, and F. Lafond. 2023. “Firm-Level Production Networks: What Do We (Really) Know.” INET Oxford Working Paper 2023. Bellamy, M. A., and R. C. Basole. 2013. “Network Analysis of Supply Chain Systems: A Systematic Review and Future Research.” Systems Engineering 16 (2): 235–249.
https://doi.org/10.1002/sys.v16.2. Bender, E. M., T. Gebru, A. McMillan-Major, and S. Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. Virtual Event, Canada, March 3–10, 2021. Bouraoui, Z., J. Camacho-Collados, and S. Schockaert. 2020. “Inducing Relational Knowledge from Bert.” In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 7456–7463. New York, USA, February 7–12, 2020. Brintrup, A., E. Kosasih, P. Schaffer, G. Zheng, G. Demirel, and B. L. MacCarthy. 2024. “Digital Supply Chain Surveillance Using Artificial Intelligence: Definitions, Opportunities and Risks.” International Journal of Production Research 62 (13): 1–22. Brintrup, A., P. Wichmann, P. Woodall, D. McFarlane, E. Nicks, and W. Krechel. 2018. “Predicting Hidden Links in Supply Networks.” Complexity 2018:1–12. https://doi.org/10.1155/cplx.v2018.1. Brockmann, N., E. Elson Kosasih, and A. Brintrup. 2022. “Supply Chain Link Prediction on Uncertain Knowledge Graph.” ACM SIGKDD Explorations Newsletter 24 (2): 124–130. https://doi.org/10.1145/3575637.3575655. BusinessWire. 2024. “Using Generative AI, C.H. Robinson Has Achieved Automation across the Entire Lifecycle of a Freight Shipment.” Accessed November 11, 2024. Cai, L., J. Li, J. Wang, and S. Ji. 2021. “Line Graph Neural Networks for Link Prediction.” IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (9): 5103–5113. Caruana, R., and A. Niculescu-Mizil. 2006. “An Empirical Comparison of Supervised Learning Algorithms.” In Proceedings of the 23rd International Conference on Machine Learning, 161–168.
Pittsburgh, PA, USA. Caspersz, D., H. Cullen, M. C. Davis, D. Jog, F. McGaughey, D. Singhal, M. Sumner, and H. Voss. 2022. “Modern Slavery in Global Value Chains: A Global Factory and Governance Perspective.” Journal of Industrial Relations 64 (2): 177–199. https://doi.org/10.1177/00221856211054586. Celonis. 2024. “Process Mining Meets Generative AI: Celonis Rides Industry Wave to Democratize Core Tech.” Accessed November 12, 2024. Choi, T. Y., K. J. Dooley, and M. Rungtusanatham. 2001. “Supply Networks and Complex Adaptive Systems: Control versus Emergence.” Journal of Operations Management 19 (3): 351–366. https://doi.org/10.1016/S0272-6963(00)0006 8-1. CNBCEvolve Global Summit. 2023. “Fedex at 50:What’s Driv- ing Transformation?” Accessed November 12, 2024. Coşkun, M., and M. Koyutürk. 2021. “Node Similarity- Based Graph Convolution for Link Prediction in Bio- logical Networks.” Bioinformatics 37 (23): 4501–4508. https://doi.org/10.1093/bioinformatics/btab464. du Preez, Derek. 2023. “Mars Develops a Sweet Tooth for Celonis Process Intelligence.” Accessed November 12, 2024. Fichtel, L., J.-C. Kalo, and W.-T. Balke. 2021. “Prompt Tuning or Fine-Tuning-Investigating Relational Knowledge in pre- Trained LanguageModels.” In 3rd Conference on Automated Knowledge Base Construction. Virtual, October 4–8, 2021. Fosso Wamba, S., C. Guthrie, M. M. Queiroz, and S. Min- ner. 2024. “Chatgpt and Generative Artificial Intelligence: An Exploratory Study of Key Benefits and Challenges in Operations and Supply Chain Management.” Interna- tional Journal of Production Research 62 (16): 5676–5696. https://doi.org/10.1080/00207543.2023.2294116. FossoWamba, S., M.M. Queiroz, C. J. C. Jabbour, and C. V. Shi. 2023. “Are Both Generative Ai and Chatgpt Game Changers for 21st-Century Operations and Supply Chain Excellence?” International Journal of Production Economics 265:109015. https://doi.org/10.1016/j.ijpe.2023.109015. Gayam, S. R. 2023. 
“Enhancing Creative Industries with Gen- erative Ai: Techniques for Music Composition, Art Genera- tion, and Interactive Media.” Journal of Machine Learning in Pharmaceutical Research 3 (1): 54–88. Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde- Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. “Gener- ative Adversarial Nets.” In Advances in Neural Information Processing Systems 27.Montreal, Quebec, Canada,December 8–13, 2014. Google DeepMind. 2023. “Gemini Models.” Accessed Novem- ber 13, 2024. Gowda, S. R., and Y. R. Rao, 2024. “Data Augmentation Using Generative-Ai.” Journal of Innovative Image Processing 6 (3): 273–289. https://doi.org/10.36548/jiip. Grandini, M., E. Bagli, and G. Visani. 2020. “Metrics for Multi-class Classification: An Overview.” arXiv preprint arXiv:2008.05756. Guan, X., Y. Liu, H. Lin, Y. Lu, B. He, X. Han, and L. Sun. 2024. “Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting.” In Pro- ceedings of the AAAI Conference on Artificial Intelligence, Vol. 38, 18126–18134. Vancouver, Canada, February 20–27, 2024. Hasan, M. A., and M. J. Zaki. 2011. “A Survey of Link Prediction in Social Networks.” In Social Network Data Analytics, edited by C. Aggarwal. Boston, MA: Springer. https://doi.org/10.1007/978-1-4419-8462-3_9 Hashim, M. E. A., W. A. W. Mustafa, N. S. Prameswari, M. M. Ghani, and H. F. Hanafi. 2023. “Revolutionizing Virtual Reality with Generative Ai: An in-depth Review.” Journal of Advanced Research in Computing and Applications 30 (1): 19–30. https://doi.org/10.37934/arca.30.1.1930. Hochreiter, S., and J. Schmidhuber. 1997. “Long short- term Memory.” Neural Computation 9 (8): 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735. Holger, H., K. Theodora, R. Roger, and T. Kimberly. 2023. “While Still Nascent, Generative AI Has the Potential to Help Fashion Businesses Become More Productive, Get to Market Faster, and Serve Customers Better. 
The Time to Explore the Technology Is Now.” Accessed November 12, 2024. Huang, L., W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, et al. 2023. “A Survey on Hallucination in Large Lan- guage Models: Principles, Taxonomy, Challenges, and Open Questions.” arXiv preprint arXiv:2311.05232. Ialongo, L.N., C. deValk, E.Marchese, F. Jansen,H. Zmarrou, T. Squartini, and D. Garlaschelli. 2022. “Reconstructing Firm- Level Interactions in the Dutch Input–output Network from Production Constraints.” Scientific Reports 12 (1): 11847. https://doi.org/10.1038/s41598-022-13996-3. Jackson, I., D. Ivanov, A. Dolgui, and J. Namdar. 2024. “Gener- ative Artificial Intelligence in Supply Chain and Operations Management: A Capability-Based Framework for Analysis and Implementation.” International Journal of Production Research 62 (17): 1–26. Karras, T., S. Laine, and T. Aila. 2019. “A Style-Based Gener- ator Architecture for Generative Adversarial Networks.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4401–4410. Long Beach, CA, USA, June 15–20, 2019. Kassner, N., P. Dufter, and H. Schütze. 2021. “Multilingual Lama: Investigating Knowledge in Multilingual Pretrained Language Models.” In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 3250–3258. Kyiv, Ukraine, April 19–23, 2021. Kazemi, S.M., andD. Poole. 2018. “Simple Embedding for Link Prediction in Knowledge Graphs.” Advances in Neural Infor- mation Processing Systems 31. Montreal, Canada, December 3–8, 2018. Kenton, J. D. M.-W. C., and L. K. Toutanova. 2019. “Bert: Pre- Training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of NAACL-HLT, 4171–4186. Minneapolis, Minnesota, USA, June 2–7, 2019. Kingma, D. P., and J. Ba. 2014. “Adam: AMethod for Stochastic Optimization.” arXiv preprint arXiv:1412.6980. 
https://doi.org/10.1145/3575637.3575655 https://doi.org/10.1177/00221856211054586 https://doi.org/10.1016/S0272-6963(00)00068-1 https://doi.org/10.1093/bioinformatics/btab464 https://doi.org/10.1080/00207543.2023.2294116 https://doi.org/10.1016/j.ijpe.2023.109015 https://doi.org/10.36548/jiip https://doi.org/10.1007/978-1-4419-8462-3_9 https://doi.org/10.37934/arca.30.1.1930 https://doi.org/10.1162/neco.1997.9.8.1735 https://doi.org/10.1038/s41598-022-13996-3 22 G. ZHENG AND A. BRINTRUP Kingma, D. P., and M. Welling. 2019. “An Introduction to Variational Autoencoders.” Foundations and Trends® in Machine Learning 12 (4): 307–392. https://doi.org/10.1561/ 2200000056. Kleinbaum, D. G., and M. Klein. 2002. Logistic Regression: A Self-Learning Text. New York, NY: Springer New York. Kosasih, E. E., and A. Brintrup. 2022. “A Machine Learning Approach for Predicting Hidden Links in Supply Chain with Graph Neural Networks.” International Journal of Produc- tion Research 60 (17): 5380–5393. https://doi.org/10.1080/00 207543.2021.1956697. Kosasih, E. E., F. Margaroli, S. Gelli, A. Aziz, N. Wildgoose, and A. Brintrup. 2022. “Towards Knowledge Graph Reason- ing for Supply Chain Risk Management Using Graph Neural Networks.” International Journal of Production Research 62 (15): 1–17. Küblböck, K. 2013. The EU Raw Materials Initiative: Scope and Critical Assessment. Technical Report, ÖFSE Briefing Paper. Lan, Z., M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut. 2019. “Albert: A Lite Bert for Self-supervised Learning of Language Representations.” arXiv preprint arXiv:1909.11942. Liu, Y., M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. 2019. “Roberta: A Robustly Optimized Bert Pretraining Approach.” arXiv preprint arXiv:1907.11692. Louie, R., A. Coenen, C. Z. Huang, M. Terry, and C. J. Cai. 2020. 
“Novice-AI Music Co-creation via AI-Steering Tools for DeepGenerativeModels.” In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13. Hawaii, USA, April 25–30, 2020. Martino, A., M. Iannelli, and C. Truong. 2023. “Knowledge Injection to Counter Large Language Model (LLM) Hallu- cination.” In European Semantic Web Conference, 182–185. Springer. Hersonissos, Greece, May 28th–June 1st, 2023. Meiyappan, P., and M. Bales. 2021. Position Paper: Reducing Amazon’s PackagingWaste UsingMultimodal Deep Learning. Amazon Science. https://www.amazon.science/publications/ position-paper-reducing-amazons-packaging-wasteusing- multimodal-deep-learning Microsoft. 2023. “Empower Your Organization with Copilot.” Accessed November 13, 2024. Mohammed, M. Y., and M. J. Skibniewski. 2023. “The Role of Generative Ai in Managing Industry Projects: Transform- ing Industry 4.0 into Industry 5.0 Driven Economy.” Law and Business 3 (1): 27–41. https://doi.org/10.2478/law-2023- 0006. Mungo, L., F. Lafond, P. Astudillo-Estévez, and J. D. Farmer. 2023. “Reconstructing Production Networks UsingMachine Learning.” Journal of Economic Dynamics and Control 148:104607. https://doi.org/10.1016/j.jedc.2023.104607. Narayanan, D., M. Shoeybi, J. Casper, P. LeGresley, M. Patwary, V. Korthikanti, D. Vainbrand, et al. 2021. “Efficient Large- Scale Language Model Training on GPU Clusters Using Megatron-LM.” In Proceedings of the International Confer- ence for High Performance Computing, Networking, Storage and Analysis, 1–15. St. Louis, Missouri, USA, November 14–19, 2021. Noy, S., and W. Zhang. 2023. “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence.” Science 381 (6654): 187–192. https://doi.org/10.1126/scien ce.adh2586. Ooi, K.-B., G. W.-H. Tan, M. Al-Emran, M. A. Al-Sharafi, A. Capatina, A. Chakraborty, Y. K. Dwivedi, et al. 2023. 
“The Potential of Generative Artificial Intelligence across Disciplines: Perspectives and Future Directions.” Journal of Computer Information Systems 65 (1): 1–32. Open AI. 2022. “Dall.e2.” Accessed Novemver 11, 2024. OpenAI. 2024. “Chatgpt – ReleaseNotes.” AccessedNovember 11, 2024. Petroni, F., T. Rocktäschel, S. Riedel, P. Lewis, A. Bakhtin, Y. Wu, and A. Miller. 2019. “Language Models as Knowledge Bases?” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th Inter- national Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguis- tics. Pichler, A., C. Diem, A. Brintrup, F. Lafond, G. Mager- man, G. Buiten, T. Y. Choi, V. M. Carvalho, J. D. Farmer, and S. Thurner. 2023. “Building an Alliance to Map Global Supply Networks.” Science 382 (6668): 270–272. https://doi.org/10.1126/science.adi7521. Pratap, V., A. Sriram, P. Tomasello, A. Hannun, V. Liptchinsky, G. Synnaeve, andR.Collobert. 2020. “MassivelyMultilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters.” arXiv preprint arXiv:2007.03001. Reimers, N., and I. Gurevych.November, 2019. “Sentence-Bert: Sentence Embeddings Using Siamese Bert-Networks.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. Rossi, A., D. Barbosa, D. Firmani, A. Matinata, and P. Meri- aldo. 2021. “Knowledge Graph Embedding for Link Pre- diction: A Comparative Analysis.” ACM Transactions on Knowledge Discovery from Data (TKDD) 15 (2): 1–49. https://doi.org/10.1145/3424672. Ryder System. 2024. “6 Ways Generative AI Is Boosting Logis- tics.”’ Accessed November 11, 2024. Safavi, T., and D. Koutra. 2021. “Relational World Knowledge Representation in Contextual Language Models: A Review.” In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 1053–1067. Shin, H.-C., Y. Zhang, E. Bakhturina, R. Puri, M. Patwary, M. Shoeybi, and R. 
Mani. 2020. “Biomegatron: Larger Biomed- ical Domain Language Model.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Pro- cessing (EMNLP), 4700–4706. Shoeybi, M., M. Patwary, R. Puri, P. LeGresley, J. Casper, and B. Catanzaro. 2019. “Megatron-LM: Training Multi- billion Parameter Language Models Using Model Paral- lelism.” arXiv preprint arXiv:1909.08053. Srivastava, S. K., S. Routray, S. Bag, S. Gupta, and J. Z. Zhang. 2024. “Exploring the Potential of Large Language Models in Supply Chain Management: A Study Using Big Data.” Jour- nal of Global Information Management (JGIM) 32 (1): 1–29. https://doi.org/10.4018/JGIM. Su, Z., X. Zheng, J. Ai, Y. Shen, and X. Zhang. 2020. “Link Prediction in Recommender Systems Based on Vector Sim- ilarity.” Physica A: Statistical Mechanics and Its Applications 560:125154. https://doi.org/10.1016/j.physa.2020.125154. Tan, Y., Z. Zhou, H. Lv, W. Liu, and C. Yang. 2024. “Walklm: A Uniform Language Model Fine-Tuning Framework for Attributed Graph Embedding.” In Advances in Neural Infor- mation Processing Systems 36. https://doi.org/10.1561/2200000056 https://doi.org/10.1080/00207543.2021.1956697 https://www.amazon.science/publications/position-paper-reducing-amazons-packaging-wasteusing-multimodal-deep-learning https://doi.org/10.2478/law-2023-0006 https://doi.org/10.1016/j.jedc.2023.104607 https://doi.org/10.1126/science.adh2586 https://doi.org/10.1126/science.adi7521 https://doi.org/10.1145/3424672 https://doi.org/10.4018/JGIM https://doi.org/10.1016/j.physa.2020.125154 INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH 23 Touvron, H., T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, et al. 2023. “Llama: Open and Efficient Foundation Language Models.’ arXiv preprint arXiv:2302.13971. Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. 2017. 
“Attention Is All You Need.” In Proceedings of the 31st International Confer- ence on Neural Information Processing Systems, 6000–6010. Wang,W., Y. Huang, Y.Wang, and L.Wang. 2014. “Generalized Autoencoder: A Neural Network Framework for Dimen- sionality Reduction.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 490–497. Wang,W., F.Wei, L. Dong, H. Bao, N. Yang, andM. Zhou. 2020. “Minilm: Deep Self-attention Distillation for Task-Agnostic Compression of pre-trained Transformers.” Advances in Neural Information Processing Systems 33:5776–5788. Wichmann, P., A. Brintrup, S. Baker, P. Woodall, and D. McFarlane. 2018. “Towards Automatically Generating Sup- ply Chain Maps from Natural Language Text.” IFAC- PapersOnLine 51 (11): 1726–1731. https://doi.org/10.1016/j. ifacol.2018.08.207. Yang, L., and A. Shami. 2020. “On Hyperparameter Optimiza- tion ofMachine Learning Algorithms: Theory and Practice.” Neurocomputing 415:295–316. https://doi.org/10.1016/j.ne ucom.2020.07.061. Yasunaga, M., J. Leskovec, and P. Liang. 2022. “Linkbert: Pretraining Language Models with Document Links.” In Proceedings of the 60th Annual Meeting of the Associa- tion for Computational Linguistics (Volume 1: Long Papers), 8003–8016. Yegnanarayana, B. 2009.Artificial Neural Networks. PHI Learn- ing Pvt. Ltd. Zareie, A., and R. Sakellariou. 2020. “Similarity-Based Link Prediction in Social Networks Using Latent Relation- ships between the Users.” Scientific Reports 10 (1): 20137. https://doi.org/10.1038/s41598-020-76799-4. Zhang, C., S. R. Kuppannagari, R. Kannan, and V. K. Prasanna. 2018. “Generative Adversarial Network for Synthetic Time Series Data Generation in Smart Grids.” In 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 1–6. IEEE. Zhao, C., X. Sun,M.Wu, and L. Kang. 2024. 
1. Introduction
2. Related works
2.1. Generative artificial intelligence
2.2. Link prediction in supply networks
2.3. Summary of research gaps
3. Combining pretrained language models and knowledge graphs
3.1. Preliminaries: from triplets to quintuplets
3.2. The pretrained LM-based machine learning framework
3.2.1. Language model selection
3.2.2. Machine learning model selection
4. Case study
4.1. Generating training data
4.2. Experimental settings
4.2.1. Benchmarks
4.2.2. Settings of model training
4.2.3. Performance metrics
4.3. Experimental results and discussions
4.3.1. Benchmark comparison between pretrained LM-enhanced link prediction and general machine learning models
4.3.2. Comparisons between machine learning models
4.3.3. Comparisons between pretrained language models
4.3.4. Summary of findings
5. Conclusions, managerial implications, limitations, discussions, and future works
5.1. Conclusions
5.2. Managerial implications
5.3. Limitations, discussions, and future works
Disclosure statement
Funding
ORCID
Data availability statement
References