International Journal of Production Research
ISSN: 0020-7543 (Print) 1366-588X (Online)
Journal homepage: www.tandfonline.com/journals/tprs20

Enhancing supply chain visibility with generative AI: an exploratory case study on relationship prediction in knowledge graphs

Ge Zheng (a) and Alexandra Brintrup (a,b)
(a) Supply Chain AI Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom; (b) Alan Turing Institute, London, United Kingdom

To cite this article: Ge Zheng & Alexandra Brintrup (13 Aug 2025): Enhancing supply chain visibility with generative AI: an exploratory case study on relationship prediction in knowledge graphs, International Journal of Production Research, DOI: 10.1080/00207543.2025.2543964

© 2025 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/). Published online: 13 Aug 2025.

ABSTRACT
A key stumbling block in effective supply chain risk management for companies and policymakers is a lack of visibility into interdependent supply network relationships. Relationship prediction, also called link prediction, is an emergent area of supply chain surveillance research that aims to increase the visibility of supply chains using data-driven techniques. Existing methods have been successful at predicting relationships but struggle to extract the context in which these relationships are embedded, such as the products being supplied or the locations they are supplied from. Lack of context prevents practitioners from distinguishing transactional relations from established supply chain relations, hindering accurate estimations of risk. In this work, we develop a new Generative Artificial Intelligence (GenAI)-enhanced machine learning framework that leverages pre-trained language models as embedding models, combined with machine learning models, to predict supply chain relationships within knowledge graphs. By integrating Generative AI techniques, our approach captures the nuanced semantic relationships between entities, thereby improving supply chain visibility and facilitating more precise risk management. Using data from a real case study, we show that GenAI-enhanced link prediction surpasses all benchmarks, and we demonstrate how GenAI models can be explored and used effectively in supply chain risk management.

ARTICLE HISTORY: Received 15 November 2024; Accepted 30 July 2025
KEYWORDS: Generative artificial intelligence (GenAI); pretrained language models (pretrained LMs); supply chain visibility; link prediction; knowledge graph (KG); machine learning

1.
Introduction

Global supply chains emerge as companies buy products from one another to produce and deliver their own (Bellamy and Basole 2013). They play a critical role in almost every aspect of our daily lives: 80% of global trade flows through multinational corporations, and one in five jobs worldwide is tied to global supply chains. Increased volatility and geopolitical tension in recent years have shown how vulnerable we are to supply chain disruptions, with major shortages impacting our food, medicines, and supply of electric batteries. In tandem, there is rising awareness of the exposure of global supply chains to human rights violations and unsustainable environmental practices, with US and European policymakers proposing legislative measures that demand comprehensive supply chain traceability (Küblböck 2013).

One of the key stumbling blocks in beginning to address these concerns is a lack of knowledge of interdependent supply chain connections. Most companies have limited visibility beyond their direct connections. Increasing visibility in supply chains has been a rich area of research in the past decade, with multiple technical innovations having been proposed, such as electronic product codes, radio frequency identification, and blockchain technologies. Although these have been successful to some extent, their reach is typically limited to one or two tiers at most. That is because, to adopt tracking technology, companies need to be willing to share data, and there is little incentive for companies to share data on whom they purchase from, for various reasons. Companies typically view their own supply chains as a competitive advantage, and fear that disclosing information could result in their buyers working directly with their suppliers or reveal their pricing structure; or they may simply not wish their manufacturing and purchasing practices to be known to the buyer.

CONTACT Alexandra Brintrup ab702@cam.ac.uk
More recently, researchers have proposed a new solution to this problem: using data-driven methods to 'estimate' who supplies whom, rather than relying on the willingness of companies to share data. Termed 'Digital Supply Chain Surveillance' (Brintrup et al. 2024), these methods include network reconstruction (Mungo et al. 2023), web scraping to recognise supply relationships in text obtained from news articles and company annual reports (AlMahri, Xu, and Brintrup 2024; Wichmann et al. 2018), and machine learning methods for predicting relationships (formally, link prediction). Most current methods focus on a single type of relationship, such as firm-level networks that map supply or buy relationships between firms (Brintrup et al. 2018; Mungo et al. 2023). While these approaches provide valuable insights, they offer a limited understanding of supply chains because they do not consider the multifaceted interactions and dependencies that exist between different entities, thereby restricting a comprehensive understanding of the entire network.
Considering the supply chain as an interconnected network of entities and relationships, we can construct supply-chain-relevant data into a supply chain knowledge graph that captures the complex relationships and attributes associated with supply chain entities. The multiple types of relationships in such a knowledge graph, such as manufacturing processes required to produce a product, product flows, and types of partnerships, can contribute to supply chain visibility and a comprehensive understanding of supply chain dynamics. It can also reveal hidden patterns, identify potential bottlenecks, and support the development of strategies to enhance network resilience.

Generative Artificial Intelligence (GenAI), a branch of machine learning, is designed to create new content, ideas, or data (known as synthetic data) by learning patterns from existing data. Unlike traditional Artificial Intelligence (AI), where the output depends on the given inputs, GenAI can generate novel outputs. Examples include generative machine learning models used for synthetic data generation (Zhang et al. 2018) and large language models such as ChatGPT (Open AI 2024), Copilot (Microsoft 2023), Gemini (Google DeepMind 2023), and LLaMA (Touvron et al. 2023), used to generate human-like text, images, audio, and even videos. GenAI has gained tremendous attention as a powerful tool in recent years and has been used in various fields. For instance, Zholus et al. (2024) explored how a language model can be used to accurately create molecular structures, facilitating the drug discovery process. Zhao et al. (2024) introduced a self-attention Generative Adversarial Network (SAGAN) model to generate synthetic data to address class imbalance in financial transaction data, which was then used for credit card fraud detection. Gayam (2023) investigated how GenAI contributes to the creation of music and visual art.
In the context of supply chain operation management, GenAI has been hypothesised to empower the human workforce, improve project management processes, and help optimise manufacturing and supply chain procedures (Mohammed and Skibniewski 2023). Early adopters Walmart and Maersk integrated GenAI into their operations to optimise pricing negotiations (Jackson et al. 2024). The logistics company C.H. Robinson is exploring GenAI for automating freight shipment (Business Wire 2024), while Ryder System (2024) leveraged GenAI to power chatbots that handle customer inquiries.

These early reports on GenAI inspire us to explore its potential in enhancing supply chain visibility, particularly by predicting relationships within supply chain knowledge graphs. GenAI models, including but not limited to pre-trained language models, are trained on extensive datasets containing diverse text-based information, which enables them to find relevant patterns in unstructured data such as emails, reports, contracts, and social media posts. This ability might allow them to elicit the complex structure and patterns of relationships within networks. GenAI models employ sophisticated neural network architectures, such as transformers, which allow the models to handle complex, non-linear relationships. These architectures also use mechanisms like self-attention to capture dependencies and interactions between different parts of the data. In supply chain networks, where relationships between entities (e.g. suppliers, manufacturers, distributors) are intricate and multifaceted, GenAI's ability to model these complexities has great potential to enhance supply chain visibility.
However, one area of concern in applying GenAI to industrial applications has been so-called 'hallucinations', where GenAI generates non-factual or inaccurate outputs because these models are primarily optimised for language fluency and pattern recognition rather than strict adherence to factual data (Huang et al. 2023). The phenomenon of 'hallucination' results from the biases and inaccuracies in the massive amounts of training data drawn from various sources, and also from extrapolation beyond the constraints of the training data (Bender et al. 2021). In the context of supply chain visibility, hallucinations can be particularly problematic. For example, when predicting relationships in a supply chain knowledge graph, a model prone to hallucinations may infer non-existent relationships between suppliers, manufacturers, and customers. These errors could lead to misguided decisions, potentially disrupting operations, misinforming risk assessments, or causing inefficient resource allocation in a system where precision is critical. Thus, it is essential to ensure that the model's predictions are grounded in actual verified relationships. To address this challenge, a hybrid approach can be adopted: using a GenAI model as an embedding model to encode data into vectors, followed by applying machine learning models for relationship prediction on the knowledge graph as a 'factual anchor'. This strategy leverages the strengths of the
In this paper, we explore the potential of GenAIs in enhancing supply chain visibility by integrating pre- trained language models with machine learning models to predict relationships within supply chain knowledge graphs.We also introduce a new term, ‘quintuplet’, to rep- resent more intricate relationships within the knowledge graph. Unlike traditional triplets that capture a single relationship between two entities, quintuplets condense multiple triplets to provide a deeper understanding of the supply chain network. After transferring regular triplets based knowledge graph into quintuplets based one, we are able to generate textual descriptions passed onto a pre-trained language model to retrieve vectorised rela- tional knowledge learned by the pre-trained language model from large amount of website information, which are then further learned using a machine learning model to predict quintuplets. The process thus allows us to com- bine structured knowledge representation by a knowl- edge graph, with the general knowledge base that can be used to augment the graph to make additional inference. Combining GenAIs and Knowledge Graphs is pow- erful and goes beyond the state of the art in the sup- ply chain domain, because we mitigate hallucination effects of GenAIs by restricting its use to augment struc- tured prior data, and also allow additional contextual knowledge to arise from it, which would not have arisen by merely using knowledge graph completion methods. Thus our contribution extends the current state of the art in supply chain relationship prediction for visibility and also allows us to provide a use case in the application of GenAIs to supply chain management research. We com- pare our method to existing benchmarks with a use case in electronic vehicle battery supply chains, and present experimental results validatedwith a range of pre-trained language models and machine learning methods. 
Our method surpasses all existing benchmarks in accuracy, and also yields richer information.

The rest of this paper is organised as follows. Section 2 reviews relevant existing work in the literature, including GenAI and existing link prediction methods in supply chain networks. Section 3 presents our proposed approach for predicting multiple connected relationships in supply chain networks, including the problem definition, preliminaries, and an explanation of the proposed framework. Section 4 uses a case study to evaluate the proposed approach, while Section 5 concludes this work and explains its managerial implications, limitations, and potential future research directions.

2. Related works

Generative artificial intelligence and link prediction in supply chain networks are the two main topics relevant to the approach proposed in this work; both are reviewed below.

2.1. Generative artificial intelligence

Generative Artificial Intelligence (GenAI) refers to a class of machine learning designed to generate new content, including text, images, music, and even videos, by learning patterns from existing data (Ooi et al. 2023). GenAI models include Generative Adversarial Networks (GANs) (Goodfellow et al. 2014), Variational Autoencoders (VAEs) (Kingma and Welling 2019), Transformer-based models such as ChatGPT (Open AI 2024), LLaMA (Touvron et al. 2023), DALL-E (Open AI 2022), and so on. These models have been applied in various fields, leading to significant achievements and efficiencies. One of the most significant capabilities of GenAI is natural language generation. For example, Transformer-based models like ChatGPT (Open AI 2024) have demonstrated remarkable abilities in tasks ranging from drafting emails to writing code, showcasing the versatility of GenAI in understanding and generating natural language.
Noy and Zhang (2023) reported that ChatGPT can substantially raise productivity, decreasing the average time spent on mid-level professional writing tasks by 40% and increasing output quality by 18%. Apart from natural language generation, GenAI models have been used to generate realistic images. For instance, GANs have been used to create high-resolution, photorealistic images that are indistinguishable from real photographs (Karras, Laine, and Aila 2019). Such capabilities have enabled applications in fields like art generation (Louie et al. 2020), virtual reality (Hashim et al. 2023), and data augmentation for training other AI models (Gowda and Rao 2024).

As in many other domains, the capabilities of GenAI have recently been explored in the field of supply chain management, although very few studies exist. Those that do exist do not yet report technical performance, but rather explore potential benefits and applications. Fosso Wamba et al. (2023) explored the benefits, challenges, and trends associated with GenAI technologies like ChatGPT in Supply Chain and Operation Management (SCOM) by surveying practitioners in the United Kingdom and the United States. This study reveals increased efficiency among GenAI adopters compared to non-adopters and highlights that the integration of GenAI can significantly enhance overall supply chain performance. A subsequent study (Fosso Wamba et al. 2024) extended the exploration by additionally mapping the maturity levels of GenAI projects across supply chains and identified the specific operational benefits and challenges that organisations need to overcome. Meanwhile, Jackson et al. (2024) provided a comprehensive understanding of both AI and GenAI functionalities and applications in the SCOM context.
It also offers a practical framework for both practitioners and researchers to identify where and how AI and GenAI can be applied in SCOM to enhance decision-making processes, optimise operations, prioritise investments, and develop the necessary skills.

In industry, early-adopter companies have reported enhanced supply chain task performance with GenAI. Mars applied a GenAI platform offered by Celonis (2024) to optimise truck loads, reducing manual effort by 80% and improving delivery efficiency (du Preez 2023). Amazon leveraged GenAI to streamline and improve the delivery process (Meiyappan and Bales 2021), while FedEx applied GenAI to generate more precise package arrival estimates (CNBC Evolve Global Summit 2023). In the fast fashion sector, Shein leveraged GenAI to understand changes in customer demand and interest, allowing it to adjust its supply chain in real time (Astha 2023). A report from Holger et al. (2023) shows that GenAI could add up to $275 billion to the operating profits of the apparel, fashion, and luxury sectors in the next three to five years.

Inspired by such achievements of GenAI in both academia and industry, we explore how GenAI models can enhance supply chain visibility, a crucial challenge in supply chains (Pichler et al. 2023). Language models, as a type of GenAI model, have been highlighted as having great potential to improve supply chain performance (Aguero and Nelson 2024; Srivastava et al. 2024). One of the earlier applications in the supply chain domain has been supply chain mapping, wherein Wichmann et al. (2018) automatically extracted structured supply chain information from unstructured natural text to answer questions such as 'who supplies whom with what from where?', indicating that language models can help extract relationships between entities in supply chain networks. Several studies (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al.
2019; Safavi and Koutra 2021) have demonstrated that pretrained language models (pretrained LMs) encode rich relational knowledge, enabling the recovery of factual relationships from their internal representations. This points to a potential opportunity to retrieve relational knowledge about entities in supply chain networks from pretrained LMs in order to predict hidden dependencies.

As pretrained LMs are trained on massive data from diverse sources, the knowledge learned by these models is general and may not suffice for a task that requires specialised domain knowledge. Two solutions can address this problem: training a language model for a specific task, or fine-tuning a pretrained LM for a specific task. The former requires large computational resources and data, and is also time-consuming, prompting researchers to advocate the latter option of fine-tuning (Fichtel, Kalo, and Balke 2021; Yasunaga, Leskovec, and Liang 2022). Relevant examples include Yasunaga, Leskovec, and Liang (2022), who fine-tuned a BERT model to predict links among documents, and Tan et al. (2024), who developed a uniform framework to fine-tune language models on multiple tasks, including link prediction. Results from both works showed that fine-tuned pretrained LMs can accurately predict links. However, fine-tuning pretrained LMs requires expensive NLP expertise, and most companies in supply chains are small and midsize enterprises (SMEs) whose budgets cannot accommodate such expensive experts. Additionally, SMEs typically lack access to sufficient high-quality labelled data for effective fine-tuning and are also constrained by limited computational resources.

In addition, another concern when using pretrained LMs is the issue of 'hallucination', which leads to generated outputs that are non-factual or inaccurate (Huang et al. 2023).
The phenomenon of 'hallucination' results from the biases and inaccuracies in the massive amounts of training data drawn from various sources. In the supply chain context, such hallucinations can lead to erroneous demand forecasts or misinterpretation of supply chain relationships, potentially resulting in operational disruptions and financial losses. For instance, a generative model might incorrectly predict a surge in demand based on fabricated trends, leading to overproduction and increased inventory costs. Recent studies (Agrawal et al. 2023; Guan et al. 2024; Martino, Iannelli, and Truong 2023) have highlighted the potential of knowledge graphs to mitigate hallucination. By organising information into entities and relationships, creating a network of interconnected facts, knowledge graphs offer a structured and verifiable repository of factual information that language models can reference to maintain consistency and accuracy in their generated outputs.

Building on prior research highlighting the relational knowledge learned by pretrained LMs (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al. 2019; Safavi and Koutra 2021), and addressing the hallucination issue that knowledge graphs can alleviate, we develop a new approach that combines pretrained LMs with traditional machine learning techniques to predict multiple interconnected relationships within supply chain networks represented by knowledge graphs. This new approach does not require fine-tuning. The pre-trained language models are used to convert textual data into high-dimensional vector embeddings that capture semantic meanings and contextual nuances. These embeddings are rich in linguistic information but may lack domain-specific factual accuracy when used in isolation, leading to hallucinations.
The machine learning models learn to map the semantic embeddings to the factual relationships represented in the knowledge graph. This integration ensures that the predictions are not based solely on language patterns but are also aligned with the actual data from the knowledge graph. The knowledge graph acts as a factual anchor, constraining the model to produce outputs consistent with known supply chain relationships. It reduces the risk of hallucinations by providing a factual basis for relationship predictions, enhancing the reliability and accuracy of the model, and it leverages the strengths of pre-trained language models in understanding and encoding contextual information while counteracting their tendency to generate incorrect information when used alone.

2.2. Link prediction in supply networks

Supply chains are complex networks that exhibit non-linear interactions and interdependencies among various entities, processes, and resources (Choi, Dooley, and Rungtusanatham 2001). The large scale and non-linear nature of these networks often hinder their visibility, which, in turn, makes the identification of potential risks challenging. Researchers have shown that predicting relationships between entities within a supply chain network can contribute to improved visibility (Brintrup et al. 2018). Formally, researchers framed the problem of identifying relationships in a supply chain as a link prediction problem on a graph. Link prediction is widely used to solve problems in various domains, such as social networks, where it predicts potential connections between individuals based on existing ties, interests, or behaviours (Hasan and Zaki 2011); recommendation systems, where it identifies associations among product sales (Su et al. 2020); and biological networks, where it predicts interactions among genes, proteins, and other biological entities (Coşkun and Koyutürk 2021).
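As a concrete illustration of the link-prediction task itself, a minimal similarity-based scorer can rank candidate node pairs on a toy graph. The graph, node names, and the common-neighbour scoring rule below are illustrative assumptions, not taken from any of the cited studies; the example also makes visible why the supply chain literature treats such scores with caution, since the most 'similar' node pairs here are firms occupying the same tier.

```python
from itertools import combinations

# Toy undirected supply network: two buyers (A, B) sharing suppliers S1 and
# S2, plus an extra customer C of B. The graph is illustrative.
edges = {("A", "S1"), ("A", "S2"), ("B", "S1"), ("B", "S2"), ("B", "C")}
nodes = {n for e in edges for n in e}
neigh = {n: {m for e in edges if n in e for m in e if m != n} for n in nodes}

def common_neighbours(u, v):
    # Similarity-based score: the number of shared neighbours.
    return len(neigh[u] & neigh[v])

# Rank every non-edge by the score. The top-scoring pairs are the most
# similar nodes (the buyer pair A-B and the supplier pair S1-S2), which in
# a supply chain are often competitors rather than trading partners: the
# critique that motivates learning-based approaches.
scores = sorted(
    ((common_neighbours(u, v), u, v)
     for u, v in combinations(sorted(nodes), 2)
     if (u, v) not in edges and (v, u) not in edges),
    reverse=True)
print(scores[:2])
```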
The approaches used in the majority of these works are similarity-based or learning-based. Similarity-based approaches predict the connection of two nodes by leveraging the similarity of characteristics between the nodes (Zareie and Sakellariou 2020), while learning-based approaches involve training machine learning models to infer the likelihood of connections between nodes based on node- and network-level features (Ahmed, ElKorany, and Bahgat 2016). Supply chain researchers have argued that similarity-based approaches are unsuitable for predicting links in this domain, because similar companies usually do not connect due to competition, and have advocated learning-based approaches instead.

Brintrup et al. (2018) used two machine learning models to develop a link prediction approach to predict supplier interdependencies in a manufacturing supply network. Kosasih and Brintrup (2022) developed a machine learning model using a Graph Neural Network (GNN) to detect potential links that are unknown to the buyer in an automotive supply chain network. Later, Kosasih et al. (2022) proposed a neurosymbolic machine learning method using a combination of GNN and knowledge graph reasoning to predict multiple types of links in two supply chain networks. In the same year, Brockmann, Elson Kosasih, and Brintrup (2022) also performed link prediction using a GNN, but on an uncertain supply chain knowledge graph. Brintrup et al. (2018) and Kosasih and Brintrup (2022) predicted one type of relationship; Kosasih et al. (2022) and Brockmann, Elson Kosasih, and Brintrup (2022) predicted multiple types of relationships on knowledge graphs. Furthermore, Mungo et al. (2023) treated production network reconstruction as link prediction and used Gradient Boosting to predict hidden relationships to reconstruct the production network. Another work focusing on firm-level network reconstruction is from Ialongo et al.
(2022), in which a generalised maximum-entropy reconstruction method is introduced to reconstruct the firm-level network based on partial information.

While these approaches have been very valuable, the mere identification of a transactional relationship between companies does not provide sufficient contextual information for actionable insights. Consider the case where Toyota and Hewlett Packard (HP) are predicted to share a transactional relationship: it might be that HP has sold a large number of office printing devices to Toyota, or it might be that Toyota uses HP's 3D printing material in its production. For an analyst looking to identify whether a disruption at HP would cause an issue for the automotive industry, the two types of relationships would have different implications for actual production output. Similar issues arise when we consider other contextual information, such as production locations. In this work, we focus on adding context to identified relationships by using a type of GenAI model, pretrained LMs, as a knowledge base (Srivastava et al. 2024).

Our hypothesis is that doing so will allow us to combine the structured factual knowledge obtainable from the ontological representations afforded by knowledge graphs with the general unstructured knowledge base obtainable from pretrained LMs, thereby mitigating the hallucination issue caused by using pretrained LMs alone. Moreover, it will allow us to enhance supply chain visibility through a more complete supply chain knowledge graph for a comprehensive understanding of supply chain dynamics. A comprehensive understanding of supply chain relationship dynamics, facilitated by relationship prediction, might offer significant strategic and operational advantages.
Conceptualising relationship prediction as a form of 'Digital Supply Chain Surveillance', an industrial survey carried out in the UK has suggested that the use of digital data and AI can allow early identification of vulnerabilities and bottlenecks, which in turn would help dynamically plan for proactive risk mitigation and contingency planning, thus enhancing the overall resilience and agility of the supply chain (Brintrup et al. 2024). Proactive resilience planning was cited as one of the top three advantages for which UK manufacturers were hoping to use surveillance technology. The authors also found that a detailed understanding of supply chain interconnections can facilitate strategic decision-making, for example by refining supplier selection processes and supporting more effective negotiation. Knowledge of a supplier's other connections, both horizontal and vertical, can help the buyer understand whether the supplier is connected to its competitors, which might be helpful in negotiation, especially in capacity-constrained contexts. Similarly, downstream connections may inform compliance with sustainability standards and regulations such as the Modern Slavery Act (Act 2015; Caspersz et al. 2022).

2.3. Summary of research gaps

Despite significant advances in both topics discussed above, several gaps remain, especially with respect to production research. Regarding link prediction for supply chains, existing approaches suffer from three key shortcomings. First, they focus on predicting whether a relationship exists but lack the ability to infer contextual attributes (e.g. relationship type, product-specific dependencies). Second, these methods depend entirely on structured knowledge graphs, which are often incomplete or outdated in dynamic supply chains. Third, they ignore the wealth of unstructured data (e.g. news articles, procurement contracts) that encodes latent relationships.
In terms of GenAI for supply chain link prediction, while it has shown promise in supply chain applications, existing studies and industry implementations exhibit critical limitations. First, most works focus narrowly on productivity gains (e.g. reducing manual effort in logistics) but fail to address the fundamental challenge of hallucination in relational tasks like supply chain mapping. Second, current approaches rely heavily on fine-tuning pretrained LMs, which assumes access to labelled data, NLP expertise, and computational resources that are prohibitive for SMEs. Third, while knowledge graphs are recognised as potential anchors for factual accuracy, no previous work integrates pretrained LMs with supply chain knowledge graphs in a zero-shot framework (i.e. without fine-tuning) to mitigate hallucinations while preserving scalability.

This work aims to address these gaps by integrating pretrained LMs with knowledge graphs. Unlike prior GenAI applications, our method avoids fine-tuning, making it accessible to SMEs, while leveraging knowledge graphs as a 'grounding' mechanism to counteract hallucinations. Unlike traditional link prediction models, our approach enriches predictions with contextual semantics from unstructured data (via LM embeddings) while maintaining factual consistency through knowledge graph constraints. This hybrid methodology is the first to simultaneously resolve incompleteness in knowledge graphs and inaccuracies in LM output, enhancing supply chain visibility.

3. Combining pretrained language models and knowledge graphs

3.1. Preliminaries: from triplets to quintuplets

As mentioned earlier, the first set of studies in the literature solved the supply chain link prediction problem on a graph with links representing relationships between companies (Brintrup et al. 2018; Cai et al. 2021; Kazemi and Poole 2018; Kosasih and Brintrup 2022).
This approach aimed at learning the connection patterns surrounding two companies in order to estimate a connection between them (Figure 1). The second set of studies represents supply chain information as a knowledge graph (KG) with multiple types of links (Kosasih et al. 2022; Rossi et al. 2021). A KG is represented by a heterogeneous graph G and its ontology O, the former being the actual data and the latter its schema. A KG can also be seen as a collection of facts represented by predicate logic statements. A KG is based on an ontology that defines data types and attributes, with a relational taxonomy. Each item in the data is an entity (or a node in a graph), and the relationships between entities are edges, or links. In previous works, KGs have been used to model edges such as who-produces-what and who-has-what-certification, in addition to buyer-supplier links (Figure 1), the structures of which then inform one another. Both of the above approaches introduce triplets to describe and predict relationships.

Figure 1. (a) company-level relationships in a supply chain network (Brintrup et al. 2018); (b) multiple relationships including (company, supplies to, company), (company, has, certificate) and (company, has, product) in a knowledge graph (Kosasih et al. 2022); (c) quintuplet-based relationships on a knowledge graph. For example, company A supplies product 1 to company B, and company A with certificate 1 has product 1.

A triplet, also known as a triple or a statement, consists of three components: subject, predicate, and object, and is used to define the relationship between a subject and an object (Figure 1). Singular links are analysed one at a time, so we can predict 'what a company produces' and 'to whom it sells', but not contextual information such as 'which product a company sells to its buyer in which location'. Understanding context in supply chains is important to accurately predict risks.
To define context, we introduce a new term, 'quintuplet', where information represented by three triplets, (Company A, has_product, Product 1), (Product 1, purchased_by, Company B) and (Company A, supplies_to, Company B), can be condensed into (Company A, supplies, Product 1, to, Company B).

Given a knowledge graph G(V, E), V is the set of entities and E ⊆ (V × V) is the set of relationships. The relationship between entity v1 and entity v2 in a knowledge graph can be represented by a triplet (v1, ε1,2, v2), in which v1, v2 ∈ V and ε1,2 ∈ E. In contrast, a quintuplet is written as (v1, ε1,2, v2, ε2,3, v3), where v1, v2, v3 ∈ V and ε1,2, ε2,3 ∈ E, and this becomes the target of the prediction. Compared to a triplet representing one relationship, a quintuplet describes multiple connected relationships. The multiple connected relationships and the connected entities in a quintuplet can represent a small subgraph of the knowledge graph, providing contextual information for an entity or relationship.

3.2. The pretrained LM-based machine learning framework

We propose a pretrained LM-based machine learning framework which transforms a knowledge graph described by triplets into quintuplets to generate composed texts, and then sends these composed snippets of text to a pretrained LM to retrieve the relational knowledge that has been learned a priori (Figure 2). The retrieved relational knowledge is represented by vectors of fixed length, which are further learned by a machine learning model to predict the multiple connected relationships of entities represented by quintuplets in supply chain networks. We begin by constructing a knowledge graph from already known data that characterises the supply chain. This may involve, but is not limited to, a priori known supply-buy relationships, products and certifications, and is to a large extent determined by the data available to the researcher.
The original data used to construct supply chain knowledge graphs are commonly collected from various sources such as Enterprise Resource Planning (ERP) systems, transaction records, market reports, and social media (see Brintrup et al. 2024 for a review). The supply chain data used in this work was collected by a third-party data provider. To construct a supply chain knowledge graph, we need to define the ontology using the data that can characterise supply chain information. In this case, the ontology includes the definitions of entities, i.e. companies, products and certificates, and the relationships between these entities, such as has_product, purchased_by, has_cert, and supplies_to, shown in Figure 2.

Figure 2. The pretrained LM-enhanced supply chain link prediction framework.

The knowledge defined by the ontology is structured into triplets, and each triplet is an instance of the relationships and entities defined by the ontology. We define the quintuplets to represent the contextual knowledge that we aim to predict. As a case example, we use product flow on supply-buy links; however, a quintuplet can also represent contextual information on locations, types and volumes of transactions, depending on the question at hand. We then reconstruct relationships originally represented by triplets into quintuplets. For example, three triplets (Company A, has_product, Product 1), (Product 1, purchased_by, Company B) and (Company A, supplies_to, Company B) can be used to generate a quintuplet of the sort: (Company A, supplies, Product 1, to, Company B). Another example is (Company A, with, Certificate 1, has, Product 1), generated from two triplets, (Company A, has_cert, Certificate 1) and (Company A, has_product, Product 1). The next step involves transferring quintuplets into composed snippets of text with a user-defined schema.
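As an illustration of this reconstruction step, the three triplets above can be condensed into a quintuplet with a few lines of Python. This is a minimal sketch: the function name `compose_quintuplets` and the plain-tuple representation are our own illustrative choices, not the authors' implementation.

```python
# Sketch: condense (A, has_product, P), (P, purchased_by, B) and
# (A, supplies_to, B) into the quintuplet (A, 'supplies', P, 'to', B).

def compose_quintuplets(triplets):
    has_product = {(s, o) for s, r, o in triplets if r == "has_product"}
    purchased_by = {(s, o) for s, r, o in triplets if r == "purchased_by"}
    supplies_to = {(s, o) for s, r, o in triplets if r == "supplies_to"}
    quintuplets = []
    for company_a, product in has_product:
        for product2, company_b in purchased_by:
            if product == product2 and (company_a, company_b) in supplies_to:
                quintuplets.append((company_a, "supplies", product, "to", company_b))
    return quintuplets

triplets = [
    ("Company A", "has_product", "Product 1"),
    ("Product 1", "purchased_by", "Company B"),
    ("Company A", "supplies_to", "Company B"),
]
print(compose_quintuplets(triplets))
# → [('Company A', 'supplies', 'Product 1', 'to', 'Company B')]
```

The same pattern would be applied to the certificate example, matching has_cert and has_product triplets that share a company.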
For example, a quintuplet (Company A, supplies, Product 1, to, Company B) can be transferred into 'Company A supplies product 1 to company B' or 'Company A has product 1 and supplies it to company B'. The text used to represent a quintuplet can include different types of sentences but needs to be contextually accurate and consistent. The composed text is then sent to a pretrained LM for embedding, retrieving hidden relational knowledge previously learned by the model.

The embeddings of quintuplets with retrieved relational knowledge are used to train a suitable machine learning model for quintuplet prediction. As pretrained LMs cannot directly predict factual relationships in a supply chain, we use a machine learning model for this purpose. The resulting trained model can then be used for quintuplet prediction.

3.2.1. Language model selection

We select five general open-source pretrained LMs for experimentation with the following considerations:

(1) Diversity: We test several pretrained LMs to investigate whether our approach is applicable across the state of the art.
(2) Model size: Many existing works in the field of NLP (Narayanan et al. 2021; Shin et al. 2020; Shoeybi et al. 2019) suggest that larger model size can lead to improved performance. Therefore, pretrained LMs with different model sizes are selected for experimentation.
(3) Output dimensions: Increasing the output dimension of an LM can potentially capture more complex patterns and nuances in the data but cannot guarantee better representation by default (Kenton and Toutanova 2019). Thus, the selected pretrained LMs have different dimensions for their representations.

Based on the considerations above, we select five pretrained LMs that were all developed on the basis of the Transformer (Vaswani et al. 2017) but have different model sizes and output dimensions (Table 1).
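The quintuplet-to-text step can be sketched as a simple template fill. The template wording and the function name `quintuplet_to_text` are illustrative, not the exact schema used in the paper; the commented-out embedding call assumes the sentence-transformers package.

```python
# Sketch: turn a quintuplet into a composed text snippet via a user-defined template.

def quintuplet_to_text(q):
    subj, rel1, mid, rel2, obj = q
    return f"{subj} {rel1} {mid} {rel2} {obj}."

text = quintuplet_to_text(("Company A", "supplies", "Product 1", "to", "Company B"))
print(text)  # → Company A supplies Product 1 to Company B.

# The snippet would then be embedded with a pretrained LM, e.g. (assuming
# sentence-transformers is installed):
#   from sentence_transformers import SentenceTransformer
#   emb = SentenceTransformer("all-MiniLM-L6-v2").encode([text])  # shape (1, 384)
```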
Among the selected pretrained LMs, 'paraphrase-albert-small-v2' is the smallest at 43MB and is a six-layer version of 'albert-base-v2', which originates from Lan et al. (2019) and aims to address GPU/TPU memory limitations and long training times by lowering model size. Compared to the original BERT, 'albert-base-v2' introduces two parameter reduction techniques to reduce memory consumption and increase training speed.

Table 1. Selected pretrained language models.

Model Name | Model Size | Training Data Size | Dimensions | Max Sequence Length | Reference
distiluse-base-multilingual-cased-v2 | 480MB | 1 million sentence pairs (15 languages) | 512 | 128 (tokens) | Reimers and Gurevych (2019)
all-distilroberta-v1 | 290MB | over 1 billion pairs | 768 | 512 (tokens) | Liu et al. (2019)
all-MiniLM-L12-v2 | 120MB | over 1 billion pairs | 384 | 256 (tokens) | Wang et al. (2020)
all-MiniLM-L6-v2 | 80MB | over 1 billion pairs | 384 | 256 (tokens) | Wang et al. (2020)
paraphrase-albert-small-v2 | 43MB | 16 GB of uncompressed text | 768 | 256 (tokens) | Lan et al. (2019)

The second smallest model is 'all-MiniLM-L6-v2', a six-layer version of Wang et al. (2020), developed by compressing a large Transformer-based pretrained model using a simple but effective approach called deep self-attention distillation (Wang et al. 2020). It introduces the concepts of the student model and the teacher model. 'all-MiniLM-L6-v2', referred to as the student model in Wang et al. (2020), is trained by mimicking the self-attention module, in Transformer networks, of the large language model referred to as the teacher model, and also by distilling the self-attention module of the last Transformer layer of the teacher model. In addition, 'all-MiniLM-L6-v2' keeps only 50% of the parameters of the teacher model but retains more than 99% of its accuracy on several benchmark tasks (Wang et al. 2020). 'all-MiniLM-L12-v2' is similar to 'all-MiniLM-L6-v2' but is a 12-layer version of Wang et al.
(2020), leading to a bigger model size with more parameters. 'all-distilroberta-v1' is a distilled version of the BERT base model in Kenton and Toutanova (2019). It is smaller and faster than BERT but developed using the BERT base model as a teacher. Compared to 'paraphrase-albert-small-v2', 'all-MiniLM-L6-v2' and 'all-MiniLM-L12-v2', this model is larger, but it is smaller than 'distiluse-base-multilingual-cased-v2'. 'distiluse-base-multilingual-cased-v2' is also a modification of the pretrained BERT network, but is trained on data in 15 languages (Reimers and Gurevych 2019), whereas all the other models are trained in English.

3.2.2. Machine learning model selection

Five machine learning models are selected based on their suitability for application to link prediction and on past works. These include Artificial Neural Network (ANN) (Yegnanarayana 2009), Convolutional Neural Network (CNN) (Albawi, Mohammed, and Al-Zawi 2017), Logistic Regression (LogReg) (Kleinbaum and Klein 2002), Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber 1997), and AutoEncoder (Wang et al. 2014). Their architectures and parameter settings are presented in Table 2.

Table 2. The architectures and parameters of five selected common machine learning models.

ANN: Linear1 (300) → BatchNorm1 (300) → ReLU1 → Linear2 (300) → BatchNorm2 (300) → ReLU2 → Linear3 (300) → BatchNorm3 (300) → ReLU3 → FC (2) → Softmax
CNN1D: Conv1D1 (32 kernels, size 7, stride 2) → ReLU1 → AvgPooling1 (size 7, stride 2) → BatchNorm1 (32) → Conv1D2 (64, size 7, stride 1) → ReLU2 → AvgPooling2 (size 7, stride 1) → BatchNorm2 (64) → Conv1D3 (64, size 7, stride 1) → ReLU3 → AvgPooling3 (size 7, stride 1) → BatchNorm3 (64) → Flatten → FC (2) → Softmax
LogReg: Linear1 (200) → Linear2 (2) → Sigmoid
LSTM: LSTM1 (16, 16, bidirectional) → LSTM2 (16, 16, bidirectional) → FC (2) → Softmax
AutoEncoder: Encoder: Linear1 (96) → ReLU1 → Linear2 (48) → ReLU2; Decoder: Linear1 (48) → ReLU1 → Linear2 (96) → ReLU2; then FC (2) → Softmax

The ANN is composed of three linear layers with 300 neurons each, and each linear layer is followed by a Batch Normalization layer (BatchNorm) and a Rectified Linear Unit ('ReLU') activation function. BatchNorm aims to normalise the output of each linear layer to ensure a more stable training process, while the 'ReLU' function contributes to accelerating the training phase and mitigating the problem of vanishing gradients. The output from the last 'ReLU' activation function is then sent to a fully-connected (FC) layer before being coupled with a 'Softmax' function to produce the final prediction.

The CNN model consists of three convolutional layers, each followed by the 'ReLU' function, an Average Pooling layer (AvgPooling), and a BatchNorm layer. 'ReLU' and BatchNorm function the same as in the ANN, while the AvgPooling layer serves to decrease the dimensionality of the features output by the 'ReLU'. Similar to the ANN, an FC layer followed by a Softmax function is used to output the final prediction. Considering computational cost and performance, the first Conv1D layer is configured with 32 kernels of size 7 and stride 2, while the second and third Conv1D layers use 64 kernels of size 7 and stride 1.

The LogReg model has two linear layers, followed by a logistic function ('Sigmoid'). The first linear layer with 200 neurons and the second with 2 neurons are used to analyse the relationship between the output and input features. The LSTM model contains two bidirectional LSTM layers, followed by the FC layer. The input and hidden sizes in each layer are set to 16.
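The ANN in Table 2 can be sketched in PyTorch as follows. This is a minimal illustration under our own assumptions: the input dimension (384, matching e.g. 'all-MiniLM-L6-v2' embeddings) and the builder function `build_ann` are illustrative choices, not the authors' code.

```python
# Sketch of the ANN from Table 2: three Linear(300)+BatchNorm+ReLU blocks,
# then an FC layer with 2 neurons; Softmax applied to obtain class probabilities.
import torch
import torch.nn as nn

def build_ann(in_dim: int) -> nn.Sequential:
    layers = []
    prev = in_dim
    for _ in range(3):
        layers += [nn.Linear(prev, 300), nn.BatchNorm1d(300), nn.ReLU()]
        prev = 300
    layers += [nn.Linear(300, 2)]  # FC layer with 2 neurons (binary classification)
    return nn.Sequential(*layers)

model = build_ann(384)
out = torch.softmax(model(torch.randn(4, 384)), dim=1)  # batch of 4 embeddings
print(out.shape)  # torch.Size([4, 2])
```

The other models in Table 2 would be assembled analogously, e.g. the LogReg variant as two `nn.Linear` layers followed by `torch.sigmoid`.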
The AutoEncoder model consists of an encoder for transforming the input to a compressed representation, a decoder for reconstructing the original input from the encoded representation, and an FC layer followed by the Softmax function for the final link prediction. Both encoder and decoder are composed of two linear layers, each followed by the 'ReLU' function. To encode the input into a compressed representation, the numbers of neurons in the two linear layers are 96 and 48 respectively. In the decoder, two linear layers with 48 and 96 neurons respectively are used to reconstruct the original input. Since our task considers link prediction as a binary classification problem, all models, excluding LogReg, use an FC layer with 2 neurons.

4. Case study

The case study used to evaluate the proposed approach comes from the automotive sector, where companies produce car parts, such as engines, front axles and fuel tanks, and sell them to car manufacturing companies around the world. The dataset has been used as a benchmark for the link prediction problem within supply chain networks in previous works and therefore offers potential for cross-comparison (Brintrup et al. 2018; Kosasih and Brintrup 2022; Kosasih et al. 2022). The dataset comprises 43,131 companies spanning 72 countries and producing 927 distinct products, each company associated with one or more of 5 certification types, along with the relationships among these entities (Table 3).

Table 3. Basic descriptions of Marklines data.

Entity | Example | Unique Number
Company | Hamenz For German Tech. Ind. (S.A.E.) | 43,131
Country | Egypt | 79
Certificate | ISO9001, QS9000, ISO/TS16949 | 5
Product | Piston Ring Machining | 927

We separated the data at the country level so as to evaluate our approach over multiple heterogeneous datasets. As shown in Table 4, each partition has different numbers of companies, products, certificates and relationships. 27 datasets have thus been generated. As a starting point, we use the same knowledge graph ontology as previous works, with three types of entities: companies, products, and certificates, and four types of relationships (triplets): (company, has_product, product), (company, has_cert, certificate), (company, supplies_to, company) and (product, purchased_by, company) (see Figure 1(c)).

The four triplets are used to generate two quintuplets for the evaluation of the proposed approach. The two quintuplets are (company, supplies, product, to, company) and (company, with, certificate, has, product). The prediction problem is thus the existence of a given quintuplet. Therefore, we consider the relationship prediction in a quintuplet as a binary classification problem. Next, we explain how we generate positive and negative relationships based on quintuplets to train the models, followed by experimental settings and results.

4.1. Generating training data

We refer to actual relationships in a quintuplet as positive relationships and non-existing relationships as negative relationships. To train a machine learning model, both positive and negative relationships are needed. Negative quintuplets are generated from the same triplets that were used to produce positive quintuplets. Consider the following positive quintuplet, (company A, supplies, product 1, to, company B): three negative quintuplets can be generated by replacing any one of the three entities, company A, product 1 and company B, using one that does not connect the other two. Given two lists of unique entities, i.e.
the list of unique companies and the list of unique products, for (company A, supplies, product 1, to, company B) we randomly select a company, company C, that is connected to neither company A nor company B, from the list of unique companies to replace company A or company B. Alternatively, we can randomly select a product, product 2, that is connected to neither company A nor company B, and use it to replace product 1. Therefore, we can generate three negative quintuplets: (company C, supplies, product 1, to, company B), (company A, supplies, product 1, to, company C) and (company A, supplies, product 2, to, company B). In addition, an incorrect relationship direction is also considered a negative quintuplet, such as (company B, supplies, product 1, to, company A).

Table 4. Data description for each country.

Country Name | Num-Company | Num-Product | Num-Certificate | Num-Relations
AUSTRALIA | 150 | 251 | 4 | 1,997
AUSTRIA | 195 | 196 | 5 | 3,199
BELGIUM | 293 | 245 | 5 | 4,704
BRAZIL | 603 | 458 | 5 | 10,705
CANADA | 390 | 359 | 5 | 22,734
CHINA | 10,500 | 877 | 5 | 330,180
CZECH REPUBLIC | 329 | 281 | 5 | 3,395
FRANCE | 518 | 510 | 5 | 21,052
GERMANY | 1,965 | 769 | 5 | 114,793
HUNGARY | 363 | 283 | 5 | 3,820
INDIA | 1,509 | 646 | 5 | 61,539
INDONESIA | 486 | 380 | 4 | 12,020
ITALY | 389 | 425 | 5 | 15,264
JAPAN | 5,068 | 890 | 5 | 215,514
KOREA | 1,195 | 706 | 5 | 38,295
MALAYSIA | 384 | 355 | 4 | 5,481
MEXICO | 463 | 395 | 5 | 7,900
POLAND | 456 | 303 | 5 | 3,467
RUSSIA | 213 | 235 | 4 | 2,038
SOUTH AFRICA | 185 | 242 | 5 | 2,623
SPAIN | 282 | 385 | 5 | 12,044
SWEDEN | 236 | 224 | 5 | 6,802
TAIWAN | 777 | 488 | 5 | 15,982
THAILAND | 1,296 | 535 | 4 | 20,439
TURKEY | 449 | 369 | 5 | 13,141
U.S.A. | 2,577 | 800 | 5 | 91,292
UK | 932 | 502 | 5 | 15,836

Based on the above method, each positive quintuplet can be used to produce several negative quintuplets. If all possible negative quintuplets are used, negatives far exceed positives.
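The entity-corruption scheme described above can be sketched as follows. This is a simplified illustration: the function name `corrupt` and the set of known facts `connected` are our own constructs, not the authors' implementation.

```python
# Sketch: generate one negative quintuplet by replacing the head company,
# the product, or the tail company with an entity not connected to the others.
import random

def corrupt(positive, companies, products, connected):
    """`connected` holds all true (company, product, company) facts."""
    a, _, p, _, b = positive
    while True:
        slot = random.choice(["head", "product", "tail"])
        if slot == "head":
            cand = (random.choice(companies), "supplies", p, "to", b)
        elif slot == "tail":
            cand = (a, "supplies", p, "to", random.choice(companies))
        else:
            cand = (a, "supplies", random.choice(products), "to", b)
        if (cand[0], cand[2], cand[4]) not in connected:  # must be a non-fact
            return cand

connected = {("Company A", "Product 1", "Company B")}
neg = corrupt(("Company A", "supplies", "Product 1", "to", "Company B"),
              ["Company A", "Company B", "Company C"],
              ["Product 1", "Product 2"], connected)
print(neg)
```

Reversed direction, e.g. (company B, supplies, product 1, to, company A), falls out naturally here, since the reversed fact is not in `connected`.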
This naturally imbalanced situation results from a characteristic of real-world network structures, in which the vast majority of possible node pairs in the graph do not have a direct link (Bacilieri et al. 2023; Mungo et al. 2023). To take this imbalance into account and ensure robust and reliable model training, we randomly select one negative quintuplet per positive quintuplet to train the model, in order to have a balanced dataset (Kosasih and Brintrup 2022).

4.2. Experimental settings

Experimental settings in this work include benchmark settings and model training settings. The benchmark settings aim to evaluate whether machine learning models, with the help of pretrained LMs, can provide more accurate relationship predictions in supply chain networks, while the model training settings aim to set the optimal parameters during the model training phase.

4.2.1. Benchmarks

To evaluate the effectiveness of our proposed approach, we set machine learning models without the help of pretrained LMs as benchmarks. The proposed approach is designed to power machine learning models with pretrained LMs, so we also select five pretrained LMs (cf. Section 3) to test the approach.

4.2.2. Settings of model training

As the language models used in this work are pretrained models, we only need to set the parameters used to train the machine learning models, which are shown in Table 2. The parameters for the training phase include the number of epochs, E, the batch size, B, and the learning rate, r. To ensure the uniformity of experiments and follow common guidelines in machine learning training (Yang and Shami 2020), we set B to 64 and r to 0.001 for all five machine learning models. For the number of epochs, E, we use an early-stopping strategy that ends the training process if the training loss decreases but the validation loss increases for 10 consecutive epochs; this determines E and also prevents overfitting.
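The early-stopping rule above can be sketched as a small helper. This is an illustrative class (our own naming), not the authors' code; it stops once validation loss has failed to improve for `patience` consecutive epochs.

```python
# Sketch of the early-stopping rule: stop after `patience` consecutive epochs
# without improvement in validation loss.
class EarlyStopping:
    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=3)
losses = [0.9, 0.8, 0.7, 0.71, 0.72, 0.73]  # validation loss rises after epoch 3
flags = [stopper.step(l) for l in losses]
print(flags)  # → [False, False, False, False, False, True]
```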
We consider the problem of relationship prediction in supply chain networks as a binary classification problem, as mentioned earlier. We thus use the Cross-Entropy Loss as the loss function for all machine learning model training. Adam (Kingma and Ba 2014) is selected as the optimiser for all machine learning models. In addition, following common rules for splitting a dataset into training, validation and testing sets, we use 70%, 10% and 20% of the relationships present in each data partition. All experiments are run on a desktop with an Intel Core i9-9900K CPU and a GeForce RTX 2080 Ti GPU card with 11GB physical memory. PyTorch is used to develop and train all models.

4.2.3. Performance metrics

Common metrics, including accuracy, precision, recall and f-score, used to evaluate the performance of a classification approach, are also used to evaluate our proposed approach. Table 5 shows the confusion matrix used to calculate the four metrics. In our case, TP and TN respectively represent positive and negative relationships that are predicted correctly, while FP and FN respectively represent negative relationships incorrectly predicted as positive and positive relationships incorrectly predicted as negative.

Table 5. Confusion Matrix.

              | Predicted Positive  | Predicted Negative
Real Positive | True Positive (TP)  | False Negative (FN)
Real Negative | False Positive (FP) | True Negative (TN)

Based on Table 5, the four common metrics can be calculated as:

• accuracy = (TP + TN) / (TP + TN + FP + FN) describes the ratio of correct relationship predictions to the total number of relationships.
• precision = TP / (TP + FP) stands for the ratio of accurate predictions of positive relationships to the total number of predicted positive relationships.
• recall = TP / (TP + FN) represents the ratio of accurate predictions of positive relationships to the total number of positive relationships.
• f-score shows the equilibrium between precision and recall, 2 × Precision × Recall / (Precision + Recall).
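For concreteness, the four metrics can be computed directly from confusion-matrix counts; the counts below are made-up example values.

```python
# Sketch: accuracy, precision, recall and f-score from confusion-matrix counts.
def metrics(tp: int, tn: int, fp: int, fn: int):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

acc, pre, rec, f1 = metrics(tp=80, tn=70, fp=30, fn=20)
print(round(acc, 3), round(pre, 3), round(rec, 3), round(f1, 3))
# → 0.75 0.727 0.8 0.762
```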
When we split the dataset into training, validation and testing sets, we randomly shuffle all positive and negative relationships for fairness. This process may lead to an imbalanced testing dataset even though the overall dataset is balanced. Therefore, to truly reflect the performance of our approach, we use the weighted f-score shown in Equation (1) in place of the commonly used f-score:

fw-score = Σ_k w_k × (2 × Precision_k × Recall_k) / (Precision_k + Recall_k)   (1)

where w_k is the ratio of relationships in class k over all relationships and is equal to n_k / N. N is the total number of relationships, while n_k is the number of relationships in class k.

In addition, we expect the developed approach to consider the importance of positive and negative relationships equally. Thus, we follow Grandini, Bagli, and Visani (2020), who compared metrics for multi-class classification, and use

balanced accuracy weighted = w_p × TP / (TP + FN) + w_n × TN / (TN + FP)

instead of the commonly used accuracy, where w_p and w_n respectively represent the ratio of positive relationships and the ratio of negative relationships in the testing dataset (note that w_p + w_n = 1). Balanced accuracy weighted not only shows the ability of the model to predict positive relationships but also reflects its ability to predict negative relationships.

4.3. Experimental results and discussions

4.3.1. Benchmark comparison between pretrained LM-enhanced link prediction and general machine learning models

Tables 6 and 7 respectively show the results achieved by the machine learning models when predicting the quintuplets (company, supplies, product, to, company) and (company, with, certificate, has, product), while Tables 8 and 9 present the results achieved by the pretrained LM-enhanced approach.
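The weighted metrics defined above can be sketched as small helper functions; the per-class precision/recall values and counts below are hypothetical examples, not results from the paper.

```python
# Sketch of Equation (1) (weighted f-score) and weighted balanced accuracy,
# with class weights taken as the class ratios in the test set.
def fw_score(per_class):
    """per_class: list of (n_k, precision_k, recall_k); weights w_k = n_k / N."""
    n_total = sum(n for n, _, _ in per_class)
    return sum((n / n_total) * 2 * p * r / (p + r) for n, p, r in per_class)

def balanced_accuracy_weighted(tp, tn, fp, fn):
    n_pos, n_neg = tp + fn, tn + fp
    wp, wn = n_pos / (n_pos + n_neg), n_neg / (n_pos + n_neg)
    return wp * tp / (tp + fn) + wn * tn / (tn + fp)

print(round(fw_score([(100, 0.9, 0.8), (100, 0.7, 0.6)]), 4))  # → 0.7466
print(balanced_accuracy_weighted(80, 70, 30, 20))              # → 0.75
```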
Based on the balanced accuracy weighted (Accbw), precision (Pre) and recall (Rec) values in these tables, an obvious finding is that pretrained LM-enhanced link prediction outperforms all general machine learning models for both quintuplets on all datasets. This finding can also be observed in Figures 3 and 4, which show the fw-score achieved by our proposed approach and by general machine learning models for link prediction in (company, supplies, product, to, company) and (company, with, certificate, has, product). These results indicate that pretrained LMs can indeed help general machine learning models achieve better link predictions in supply chain networks, confirming the observation that machine learning models benefit from the relational knowledge learned by pretrained LMs (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al. 2019; Safavi and Koutra 2021). This improved prediction accuracy is critical for mapping out complex inter-firm relationships, thus enabling organisations to gain a more comprehensive understanding of their supply chain dynamics.

In addition, the pretrained LM-enhanced approach presents more consistent results across different quintuplets, compared to the results obtained by direct application of machine learning models, which yield poor performance in predicting quintuplets of (company, with, certificate, has, product) compared to quintuplets of (company, supplies, product, to, company). The enhanced consistency across different quintuplets and across datasets of varying sizes suggests that our approach can function reliably under conditions of data scarcity or heterogeneity, ensuring robust performance in real-world supply chain environments.

Table 6. Results for the quintuplet of (company, supplies, product, to, company).
Country Name | LogReg (Accbw/Pre/Rec) | LSTM (Accbw/Pre/Rec) | CNN (Accbw/Pre/Rec) | AutoEncoder (Accbw/Pre/Rec) | ANN (Accbw/Pre/Rec)
AUSTRALIA 0.7953/0.7971/0.7936 0.8106/0.8106/0.8103 0.7872/0.7876/0.7873 0.8653/0.8675/0.8640 0.9037/0.9063/0.9024
AUSTRIA 0.7615/0.7619/0.7616 0.8853/0.8853/0.8853 0.9073/0.9075/0.9073 0.9199/0.9225/0.9200 0.9393/0.9418/0.9393
BELGIUM 0.8379/0.8375/0.8375 0.8748/0.8749/0.8756 0.8969/0.8968/0.8972 0.9251/0.9264/0.9267 0.9391/0.9392/0.9400
BRAZIL 0.8285/0.8292/0.8285 0.8672/0.8679/0.8672 0.9410/0.9414/0.9410 0.8899/0.8915/0.8898 0.9144/0.9152/0.9144
CANADA 0.8822/0.8857/0.8821 0.9245/0.9266/0.9246 0.9449/0.9458/0.9449 0.9702/0.9703/0.9702 0.9758/0.9759/0.9758
CHINA 0.8558/0.8558/0.8558 0.8698/0.8703/0.8697 0.8849/0.8852/0.8848 0.8841/0.8846/0.8839 0.8908/0.8909/0.8907
CZECH REPUBLIC 0.8115/0.8116/0.8115 0.8007/0.8013/0.8008 0.8227/0.8239/0.8229 0.8833/0.8847/0.8834 0.9168/0.9182/0.9170
FRANCE 0.8418/0.8427/0.8417 0.9083/0.9093/0.9082 0.9265/0.9272/0.9265 0.9240/0.9261/0.9239 0.9325/0.9347/0.9324
GERMANY 0.8437/0.8466/0.8442 0.9062/0.9079/0.9060 0.9187/0.9189/0.9187 0.9012/0.9014/0.9012 0.9145/0.9146/0.9145
HUNGARY 0.8321/0.8323/0.8329 0.8650/0.8649/0.8656 0.8989/0.8988/0.8995 0.8820/0.8828/0.8829 0.8992/0.9000/0.9000
INDIA 0.8236/0.8256/0.8237 0.8749/0.8752/0.8749 0.8939/0.8940/0.8939 0.8736/0.8745/0.8736 0.9037/0.9039/0.9037
INDONESIA 0.8514/0.8522/0.8512 0.8686/0.8705/0.8684 0.9209/0.9217/0.9207 0.9130/0.9141/0.9128 0.9230/0.9240/0.9228
ITALY 0.8548/0.8554/0.8547 0.9138/0.9153/0.9137 0.9212/0.9233/0.9210 0.9242/0.9259/0.9241 0.9331/0.9347/0.9330
JAPAN 0.8469/0.8533/0.8469 0.8643/0.8662/0.8643 0.9112/0.9115/0.9112 0.8986/0.8990/0.8987 0.9080/0.9083/0.9080
KOREA 0.8782/0.8784/0.8782 0.9243/0.9249/0.9243 0.9434/0.9435/0.9434 0.9195/0.9195/0.9195 0.9311/0.9311/0.9311
MALAYSIA 0.8187/0.8206/0.8197 0.8645/0.8665/0.8654 0.9057/0.9061/0.9062 0.8950/0.8959/0.8956 0.9211/0.9225/0.9219
MEXICO 0.8513/0.8522/0.8514 0.9198/0.9204/0.9198 0.9267/0.9274/0.9266 0.9129/0.9132/0.9128 0.9361/0.9363/0.9361
POLAND 0.7763/0.7765/0.7741 0.8461/0.8458/0.8457 0.8510/0.8514/0.8497 0.8615/0.8619/0.8608 0.8760/0.8834/0.8737
RUSSIA 0.8276/0.8277/0.8272 0.8512/0.8523/0.8505 0.8726/0.8730/0.8722 0.8823/0.8834/0.8817 0.8937/0.8978/0.8935
SOUTH AFRICA 0.7716/0.7722/0.7730 0.8542/0.8554/0.8522 0.8971/0.8987/0.8954 0.8919/0.8925/0.8911 0.9044/0.9062/0.9029
SPAIN 0.8310/0.8351/0.8311 0.9030/0.9050/0.9031 0.9324/0.9341/0.9324 0.9365/0.9383/0.9365 0.9528/0.9542/0.9528
SWEDEN 0.8456/0.8459/0.8454 0.9160/0.9169/0.9157 0.9127/0.9157/0.9122 0.9543/0.9549/0.9542 0.9667/0.9674/0.9665
TAIWAN 0.8408/0.8408/0.8409 0.9219/0.9223/0.9217 0.9424/0.9424/0.9424 0.9091/0.9094/0.9089 0.9310/0.9311/0.9310
THAILAND 0.8614/0.8618/0.8617 0.8731/0.8745/0.8727 0.9141/0.9145/0.9139 0.8927/0.8928/0.8926 0.9055/0.9060/0.9053
TURKEY 0.8446/0.8449/0.8444 0.8725/0.8752/0.8718 0.9062/0.9075/0.9057 0.8975/0.8992/0.8970 0.9183/0.9192/0.9180
U.S.A. 0.8585/0.8603/0.8582 0.8859/0.8891/0.8862 0.9306/0.9306/0.9306 0.9139/0.9145/0.9138 0.9351/0.9352/0.9351
UK 0.8558/0.8559/0.8560 0.8673/0.8717/0.8680 0.9147/0.9148/0.9148 0.9073/0.9081/0.9075 0.9188/0.9191/0.9190

Table 7. Results for the quintuplet of (company, with, certificate, has, product).
Country Name LogReg (Accbw/Pre/Rec) LSTM (Accbw/Pre/Rec) CNN (Accbw/Pre/Rec) AutoEncoder (Accbw/Pre/Rec) ANN (Accbw/Pre/Rec) AUSTRALIA 0.7052/0.7063/0.7052 0.6742/0.6750/0.6742 0.7052/0.7057/0.7052 0.7768/0.7780/0.7768 0.7987/0.8029/0.7987 AUSTRIA 0.6891/0.6901/0.6894 0.6698/0.6704/0.6695 0.6583/0.6622/0.6576 0.7708/0.7721/0.7710 0.8068/0.8072/0.8067 BELGIUM 0.7148/0.7162/0.7138 0.6784/0.6787/0.6778 0.7107/0.7115/0.7099 0.7651/0.7677/0.7642 0.7956/0.7981/0.7946 BRAZIL 0.7878/0.8012/0.7829 0.7908/0.8097/0.7852 0.8348/0.8361/0.8335 0.8185/0.8293/0.8147 0.8406/0.8440/0.8384 CANADA 0.7766/0.7801/0.7699 0.7684/0.7672/0.7683 0.7924/0.7911/0.7913 0.8225/0.8269/0.8176 0.8572/0.8585/0.8541 CHINA 0.7752/0.7844/0.7812 0.7961/0.7958/0.7971 0.8103/0.8097/0.8100 0.8058/0.8059/0.8038 0.8044/0.8050/0.8043 CZECH REPUBLIC 0.7087/0.7114/0.7092 0.7261/0.7262/0.7262 0.7076/0.7081/0.7075 0.7431/0.7442/0.7430 0.7984/0.7988/0.7983 FRANCE 0.7743/0.7795/0.7716 0.8011/0.8032/0.7996 0.8183/0.8205/0.8168 0.8121/0.8177/0.8098 0.8262/0.8279/0.8249 GERMANY 0.7247/0.7248/0.7246 0.7610/0.7612/0.7609 0.7702/0.7708/0.7703 0.7787/0.7816/0.7781 0.7897/0.7904/0.7893 HUNGARY 0.6435/0.6450/0.6454 0.6658/0.6648/0.6649 0.7969/0.7967/0.7956 0.7353/0.7351/0.7344 0.7661/0.7654/0.7657 INDIA 0.7354/0.7413/0.7380 0.7716/0.7715/0.7712 0.8135/0.8143/0.8142 0.8067/0.8080/0.8055 0.8130/0.8135/0.8121 INDONESIA 0.7364/0.7409/0.7368 0.7278/0.7308/0.7281 0.7695/0.7710/0.7697 0.7705/0.7911/0.7713 0.8177/0.8220/0.8180 ITALY 0.7499/0.7546/0.7516 0.7283/0.7285/0.7286 0.8036/0.8050/0.8045 0.8257/0.8332/0.8278 0.8419/0.8451/0.8433 JAPAN 0.8413/0.8412/0.8399 0.8506/0.8529/0.8478 0.8492/0.8499/0.8474 0.8563/0.8584/0.8537 0.8590/0.8615/0.8562 KOREA 0.7466/0.7469/0.7466 0.7799/0.7804/0.7800 0.8173/0.8186/0.8173 0.7990/0.8046/0.7991 0.8039/0.8045/0.8039 MALAYSIA 0.6856/0.6864/0.6861 0.6937/0.6980/0.6953 0.7219/0.7253/0.7232 0.7358/0.7449/0.7377 0.7552/0.7619/0.7570 MEXICO 0.7342/0.7368/0.7340 0.6884/0.6896/0.6882 
0.7729/0.7736/0.7729 0.7917/0.7982/0.7914 0.8214/0.8222/0.8213 POLAND 0.7339/0.7328/0.7339 0.7301/0.7285/0.7269 0.7842/0.7828/0.7828 0.7729/0.7725/0.7708 0.7954/0.7947/0.7926 RUSSIA 0.8536/0.8577/0.8488 0.7651/0.7640/0.7635 0.6365/0.6359/0.6353 0.8568/0.8569/0.8553 0.8693/0.8711/0.8663 SOUTH AFRICA 0.6576/0.6584/0.6579 0.7023/0.7049/0.7016 0.7918/0.7944/0.7914 0.7545/0.7584/0.7540 0.7793/0.7837/0.7787 SPAIN 0.7017/0.7037/0.7033 0.7592/0.7603/0.7574 0.7868/0.7875/0.7862 0.8234/0.8244/0.8232 0.8394/0.8406/0.8383 SWEDEN 0.6786/0.6784/0.6787 0.6828/0.6825/0.6817 0.6984/0.7007/0.6955 0.7531/0.7587/0.7500 0.8000/0.8057/0.7970 TAIWAN 0.7218/0.7221/0.7212 0.7639/0.7674/0.7647 0.8301/0.8309/0.8305 0.7963/0.7983/0.7970 0.8037/0.8043/0.8040 THAILAND 0.7064/0.7151/0.7097 0.7056/0.7056/0.7052 0.7483/0.7502/0.7496 0.7659/0.7666/0.7658 0.7824/0.7833/0.7810 TURKEY 0.7273/0.7279/0.7279 0.7047/0.7048/0.7039 0.8175/0.8212/0.8161 0.7796/0.7850/0.7779 0.8058/0.8066/0.8051 U.S.A. 0.7795/0.7863/0.7821 0.8148/0.8146/0.8147 0.8330/0.8333/0.8323 0.8378/0.8384/0.8372 0.8400/0.8409/0.8392 UK 0.7590/0.7612/0.7602 0.7932/0.7938/0.7934 0.8342/0.8356/0.8335 0.8117/0.8138/0.8106 0.8269/0.8277/0.8262

4.3.2. Comparisons between machine learning models

Based on the results achieved by the machine learning models shown in Tables 6 and 7 and Figures 3(a) and 4(a), we find that, on all datasets, all machine learning models perform better at predicting relationships in the quintuplet (company, supplies, product, to, company) than in (company, with, certificate, has, product). This is because each country's dataset contains more samples of (company, supplies, product, to, company) than of (company, with, certificate, has, product), which can be indirectly observed from the large number of companies and the small numbers of certificates and products shown in Table 3. Among the different machine learning models, ANN, CNN1D and AutoEncoder predict relationships more
Table 8. Pretrained LM-enhanced machine learning, ‘all-MiniLM-L12-v2’, for different machine learning models for the quintuplet (company, supplies, product, to, company). Country Name LogReg (Accbw/Pre/Rec) LSTM (Accbw/Pre/Rec) CNN (Accbw/Pre/Rec) AutoEncoder (Accbw/Pre/Rec) ANN (Accbw/Pre/Rec) AUSTRALIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9991/0.9990/0.9991 AUSTRIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9614/0.9658/0.9615 BELGIUM 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9982/0.9981/0.9982 BRAZIL 1.0000/1.0000/1.0000 0.9990/0.9990/0.9990 0.9996/0.9996/0.9996 0.9997/0.9997/0.9997 0.9994/0.9994/0.9994 CANADA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 CHINA 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9998/0.9998/0.9998 0.9486/0.9239/0.9492 0.9999/0.9999/0.9999 CZECH REPUBLIC 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 FRANCE 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9999/0.9999/0.9999 0.9989/0.9989/0.9989 GERMANY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9998/0.9998/0.9998 HUNGARY 1.0000/1.0000/1.0000 0.9985/0.9985/0.9986 0.9998/0.9998/0.9998 1.0000/1.0000/1.0000 0.9989/0.9988/0.9989 INDIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9993/0.9993/0.9993 0.9999/0.9999/0.9999 INDONESIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9983/0.9983/0.9983 ITALY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 JAPAN 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9999/0.9999/0.9999 0.9980/0.9980/0.9980 0.9998/0.9998/0.9998 KOREA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 MALAYSIA
1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9997/0.9997/0.9997 0.9999/0.9999/0.9999 0.9959/0.9960/0.9958 MEXICO 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 POLAND 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9979/0.9979/0.9979 RUSSIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9997/0.9997/0.9996 1.0000/1.0000/1.0000 0.9852/0.9881/0.9856 SOUTH AFRICA 1.0000/1.0000/1.0000 0.9995/0.9995/0.9994 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 SPAIN 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 SWEDEN 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 TAIWAN 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9992/0.9992/0.9992 THAILAND 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 TURKEY 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 U.S.A. 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9997/0.9997/0.9997 0.9998/0.9998/0.9998 0.9998/0.9998/0.9998 UK 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 0.9994/0.9994/0.9994 Table 9. 
Pretrained LM-enhancedmachine learning, ‘all-MiniLM-L12-v2’, for different machine learning models to predict relationships in the quintuplet of (company, with, certificate, has, product) Country Name LogReg (Accbw/Pre/Rec) LSTM (Accbw/Pre/Rec) CNN (Accbw/Pre/Rec) AutoEncoder (Accbw/Pre/Rec) ANN (Accbw/Pre/Rec) AUSTRALIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 AUSTRIA 1.0000/1.0000/1.0000 0.9990/0.9990/0.9989 0.9990/0.9990/0.9989 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 BELGIUM 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9992/0.9992/0.9992 0.9984/0.9984/0.9985 BRAZIL 1.0000/1.0000/1.0000 0.9899/0.9899/0.9900 0.9984/0.9984/0.9983 0.9992/0.9992/0.9992 0.9995/0.9995/0.9995 CANADA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9990/0.9990/0.9990 CHINA 1.0000/1.0000/1.0000 0.9996/0.9996/0.9996 0.9992/0.9992/0.9992 0.9992/0.9992/0.9992 0.9998/0.9998/0.9998 CZECH REPUBLIC 1.0000/1.0000/1.0000 0.9991/0.9991/0.9991 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9993/0.9993/0.9993 FRANCE 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9996/0.9996/0.9996 GERMANY 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 0.9996/0.9996/0.9996 0.9998/0.9998/0.9998 0.9994/0.9994/0.9994 HUNGARY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 INDIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9997/0.9997/0.9997 0.9997/0.9996/0.9997 INDONESIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 ITALY 1.0000/1.0000/1.0000 0.9993/0.9993/0.9993 0.9999/0.9999/0.9999 1.0000/1.0000/1.0000 0.9987/0.9987/0.9987 JAPAN 0.9998/0.9998/0.9999 0.9998/0.9998/0.9999 0.9999/0.9999/0.9999 0.9993/0.9993/0.9994 0.9992/0.9992/0.9993 KOREA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 
1.0000/1.0000/1.0000 MALAYSIA 0.9984/0.9985/0.9984 0.9984/0.9984/0.9985 1.0000/1.0000/1.0000 0.9984/0.9985/0.9984 0.9970/0.9971/0.9970 MEXICO 0.9983/0.9983/0.9983 1.0000/1.0000/1.0000 0.9998/0.9998/0.9998 0.9988/0.9988/0.9988 0.9979/0.9979/0.9979 POLAND 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9983/0.9982/0.9985 0.9993/0.9993/0.9994 0.9992/0.9992/0.9992 RUSSIA 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9990/0.9990/0.9989 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 SOUTH AFRICA 0.9983/0.9983/0.9982 0.9983/0.9983/0.9982 0.9974/0.9974/0.9974 0.9983/0.9983/0.9982 0.9983/0.9983/0.9982 SPAIN 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9991/0.9991/0.9992 0.9993/0.9993/0.9993 0.9995/0.9995/0.9995 SWEDEN 0.9984/0.9984/0.9985 0.9984/0.9984/0.9985 0.9995/0.9995/0.9995 0.9979/0.9980/0.9979 0.9984/0.9984/0.9985 TAIWAN 1.0000/1.0000/1.0000 0.9991/0.9991/0.9991 0.9997/0.9997/0.9997 1.0000/1.0000/1.0000 0.9995/0.9995/0.9995 THAILAND 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 TURKEY 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 1.0000/1.0000/1.0000 0.9999/0.9999/0.9999 U.S.A. 1.0000/1.0000/1.0000 0.9994/0.9994/0.9994 0.9998/0.9998/0.9998 0.9998/0.9998/0.9998 0.9995/0.9995/0.9995 UK 1.0000/1.0000/1.0000 0.9989/0.9989/0.9989 0.9993/0.9993/0.9993 1.0000/1.0000/1.0000 0.9997/0.9997/0.9998 accurately in both types of quintuplets than LSTM and LogReg. This finding matches a common conclusion from many existing works (Abu-Nimeh et al. 2007; Caruana and Niculescu-Mizil 2006; Zheng, Ivanov, and Brintrup 2024; Zheng, Kong, and Brintrup 2023) showing that ANN and CNN are better than LogReg and LSTM at binary classification tasks. This observation can help practitioners with model selection during deployment. Figure 3.
(a) results on the quintuplet (company, supplies, product, to, company) achieved by different machine learning models; (b) results achieved by pretrained LM-enhanced machine learning models with ‘all-MiniLM-L12-v2’. Figure 4. (a) shows the results of predicting relationships in the quintuplet (company, with, certificate, has, product) achieved by five machine learning models on all countries’ datasets, while (b) presents the results achieved by our proposed approach using five machine learning models empowered by the pretrained LM ‘all-MiniLM-L12-v2’. Table 10. Results of predicting relationships in the quintuplet (company, supplies, product, to, company) (referred to as X) and the quintuplet (company, with, certificate, has, product) (referred to as Y), both achieved using pretrained LM-enhanced CNN. all-MiniLM-L6-v2 all-MiniLM-L12-v2 all-distilroberta-v1 paraphrase-albert-small-v2 distiluse-base-multilingual-cased-v2 Country (X / Y) (X / Y) (X / Y) (X / Y) (X / Y) AUSTRALIA 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 AUSTRIA 1.0000/1.0000 1.0000/0.9990 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 BELGIUM 1.0000/1.0000 1.0000/1.0000 0.9994/1.0000 1.0000/1.0000 1.0000/1.0000 BRAZIL 0.9994/0.9997 0.9996/0.9983 1.0000/0.9997 1.0000/0.9999 1.0000/1.0000 CANADA 1.0000/1.0000 1.0000/1.0000 1.0000/0.9998 1.0000/1.0000 1.0000/1.0000 CHINA 0.9999/0.9998 0.9998/0.9992 1.0000/0.9998 1.0000/0.9999 1.0000/1.0000 CZECH REPUBLIC 1.0000/1.0000 1.0000/1.0000 0.9996/0.9996 1.0000/0.9996 1.0000/1.0000 FRANCE 0.9997/1.0000 0.9998/1.0000 1.0000/0.9998 1.0000/0.9993 1.0000/1.0000 GERMANY 0.9999/0.9998 1.0000/0.9996 1.0000/1.0000 1.0000/0.9999 1.0000/1.0000 HUNGARY 1.0000/1.0000 0.9998/1.0000 1.0000/0.9998 1.0000/1.0000 1.0000/1.0000 INDIA 1.0000/0.9999 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 INDONESIA 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/0.9983 1.0000/1.0000 ITALY
1.0000/0.9999 0.9998/0.9999 1.0000/1.0000 1.0000/0.9986 1.0000/1.0000 JAPAN 1.0000/1.0000 0.9999/0.9999 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 KOREA 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 MALAYSIA 0.9993/0.9977 0.9997/1.0000 1.0000/0.9995 1.0000/0.9997 1.0000/1.0000 MEXICO 1.0000/1.0000 1.0000/0.9998 1.0000/0.9991 1.0000/1.0000 1.0000/1.0000 POLAND 1.0000/1.0000 1.0000/0.9983 0.9987/1.0000 1.0000/1.0000 1.0000/1.0000 RUSSIA 1.0000/1.0000 0.9997/0.9990 1.0000/1.0000 1.0000/0.9963 1.0000/1.0000 SOUTH AFRICA 1.0000/0.9997 1.0000/0.9974 0.9997/0.9991 1.0000/1.0000 1.0000/1.0000 SPAIN 1.0000/1.0000 1.0000/0.9991 1.0000/0.9998 1.0000/0.9995 1.0000/1.0000 SWEDEN 1.0000/0.9984 1.0000/0.9995 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 TAIWAN 1.0000/0.9998 1.0000/0.9997 1.0000/0.9998 1.0000/0.9998 1.0000/1.0000 THAILAND 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 1.0000/1.0000 TURKEY 0.9999/1.0000 1.0000/1.0000 1.0000/0.9997 0.9999/0.9999 1.0000/1.0000 U.S.A. 0.9998/0.9998 0.9997/0.9998 0.9998/0.9998 0.9999/0.9999 0.9999/1.0000 UK 1.0000/1.0000 1.0000/0.9993 1.0000/1.0000 0.9998/0.9997 1.0000/1.0000

4.3.3. Comparisons between pretrained language models

Table 10 shows that the pretrained LMs used to enhance the CNN model provide higher prediction accuracy on both types of quintuplets. Although all five pretrained LMs improve relationship prediction accuracy, their performances differ subtly. For example, in predicting (company, supplies, product, to, company) (see the X columns in Table 10), ‘distiluse-base-multilingual-cased-v2’ outperforms the other four. The same can be observed for predicting the quintuplet (company, with, certificate, has, product) (see the Y columns in Table 10). The performance of ‘distiluse-base-multilingual-cased-v2’ is also more consistent across all tasks than that of the other four models.
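For illustration, the sketch below shows how a quintuplet might be verbalised before being embedded by any of the five pretrained sentence encoders compared in Table 10. The templates and helper names are our own illustrative assumptions, not the exact implementation used in the case study.

```python
# Illustrative only: verbalising a quintuplet so that any of the five
# pretrained sentence encoders compared in Table 10 can embed it.
# The templates and helper names below are assumptions, not the paper's code.

QUINTUPLET_TEMPLATES = {
    "supplies": "{h} supplies {m} to {t}.",  # (company, supplies, product, to, company)
    "certificate": "{h} with certificate {m} has product {t}.",  # (company, with, certificate, has, product)
}

ENCODER_CHECKPOINTS = [
    "all-MiniLM-L6-v2",
    "all-MiniLM-L12-v2",
    "all-distilroberta-v1",
    "paraphrase-albert-small-v2",
    "distiluse-base-multilingual-cased-v2",
]

def quintuplet_to_text(head, relation, middle, tail):
    """Turn a quintuplet into the sentence handed to the sentence encoder."""
    return QUINTUPLET_TEMPLATES[relation].format(h=head, m=middle, t=tail)

# The embedding step would then need the sentence-transformers package, e.g.:
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("distiluse-base-multilingual-cased-v2")
#   vec = model.encode(quintuplet_to_text("Company A", "supplies", "Product 1", "Company B"))
```

Swapping the checkpoint string is the only change needed to move between the five encoders; in our comparison, the multilingual ‘distiluse-base-multilingual-cased-v2’ was the most consistent.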
This pretrained LM is trained on data in 15 languages (Reimers and Gurevych 2019), providing richer knowledge hidden in different languages. Particularly when using language models as knowledge bases, models trained on multilingual data can learn better representations than those trained on monolingual data (Kassner, Dufter, and Schütze 2021; Pratap et al. 2020). This indicates an advantage that is particularly applicable to global supply chain networks, and it also explains why the multilingual model outperforms the monolingual models in our case study. Although the primary language of the dataset is English, it was collected from various countries, and certain elements, such as company names, contain characters from other languages. These multilingual components present challenges for monolingual language models, which are less adept at processing non-English characters than multilingual language models. As a result, the enhanced predictive performance observed with multilingual language models can be partially attributed to their ability to interpret and handle these diverse linguistic features effectively. Regarding the relationship predictions in the different types of quintuplets, four of the five pretrained LMs, excluding ‘distiluse-base-multilingual-cased-v2’, predict better on (company, supplies, product, to, company) than on (company, with, certificate, has, product). This is because relationships in the former describe network-level information in supply chains, while relationships in the latter represent information internal to a company. Network-level information in supply chains, such as relationships between companies, tends to be more accessible and observable than the internal information of a specific company, because network-level information can be inferred from public sources and industry publications.
Internal information about a company, such as specific product certificates or product processes, is more sensitive and not readily available to external observers. Even so, our pretrained LM-enhanced models still provide high prediction accuracy compared to traditional ML methods. This directly contributes to greater supply chain visibility and supports more effective risk management and strategic planning.

4.3.4. Summary of findings

Our findings can be summarised as follows:
• Enhancing link prediction models with pretrained LMs outperforms all five benchmarks, indicating that pretrained LMs can indeed help common machine learning models achieve better relationship predictions in supply chain networks, owing to the relational knowledge learned by these pretrained LMs.
• A pretrained LM-enhanced machine learning model is more consistent on prediction tasks than a machine learning model alone, and is less affected by differences in dataset size.
• Pretrained LM-enhanced models are better at predicting relationships that rely on network-level information than relationships that rely on internal-company information. This is because network-level information in supply chains tends to be more accessible and observable from public sources than the internal information of a specific entity in the network. As such, pretrained LM enhancement works better in cases where we predict who supplies which product to whom than in cases where we predict, for example, which quality certification a company may have for which product.
• ANN, CNN and AutoEncoder, which are commonly good at solving binary classification problems, can predict relationships in supply chain networks more accurately than LSTM and LogReg in our case.
• Pretrained multilingual LMs benefit common machine learning models more than monolingual LMs, because they are trained on multilingual data and learn better representations.

5.
Conclusions, managerial implications, limitations, discussions, and future works

5.1. Conclusions

Relationship prediction, also called link prediction or supply network reconstruction, is an emergent area of ‘digital supply chain surveillance’ research that aims to increase the visibility of supply networks using data-driven techniques, without having to rely on the willingness of supply chain actors to share information. Although many of the proposed methods have been very successful in reconstructing supply-buy relationships, the context in which these relationships are embedded has thus far lacked attention. This hinders researchers and practitioners from taking full advantage of these methods, as they cannot accurately differentiate between a transactional relationship and the established supply relationships that characterise the physical resources needed to produce a product. As such, estimations of resilience and of distance to malicious actors and harmful practices remain inaccurate. Recently, Generative AI (GenAI) methods such as LLMs have become popular for eliciting information patterns from natural language data. There is also much hype about their potential in SCM. However, we cannot simply ask an LLM whether a supply relationship exists, due to their hallucination problem. Hence we need approaches that combine the power of GenAI with structured, guaranteeable methods when it comes to supply network surveillance. To date, there have been no studies on the use of LLMs for supply network surveillance. In this work, we developed a novel framework for predicting complex, multi-relational interactions in supply chain networks by integrating GenAI with machine learning. We introduced a new term, ‘quintuplet’, a structured representation that extends traditional triplets by embedding contextual information such as product flows and multi-hop dependencies. Compared to conventional triplets, which capture isolated relationships (e.g.
(company A, supplies, company B)), quintuplets model interconnected chains (e.g. (company A, supplies, Product 1, to, company B)), enabling holistic visibility into end-to-end supply chain dynamics. We formulate link prediction as a binary classification task, aiming to predict whether a quintuplet exists or not, to address the inherent incompleteness of supply chain knowledge graphs. Our work advances the literature in three key ways. First, we bridge the gap between natural language processing (NLP) and supply chain knowledge graph research by demonstrating that untuned pretrained LMs can generate semantically rich embeddings with relational knowledge while mitigating hallucinations through knowledge graph anchoring. While prior studies highlight hallucination risks in generative tasks (e.g. text synthesis; Huang et al. 2023), we extend mitigation strategies to structured link prediction by training machine learning models to map language model embeddings to knowledge graph-verified relationships. This regulates predictions to align with domain-specific facts. Second, quintuplets address knowledge graph sparsity by enabling the inference of indirect relationships (e.g. multi-tier supplier dependencies) through contextualised chains. Third, we empirically validate that pretrained LMs encode latent relational knowledge relevant to supply chains (Bouraoui, Camacho-Collados, and Schockaert 2020; Petroni et al. 2019; Safavi and Koutra 2021), a finding that invites further exploration of LMs for tasks like risk prediction and sustainability analytics. A real-world practical case study is used to evaluate the proposed approach with comparative benchmarks that use machine learning methods without pretrained LM enhancement.
Results show that pretrained LM-enhanced quintuplet prediction surpasses all benchmarks and provides consistent performance across all datasets, with the advantage of providing contextual information that allows stakeholders to track the movements of products in a global network. More importantly, our method avoids LM fine-tuning, making it deployable for organisations lacking NLP expertise or extensive textual data. This scalability bridges a longstanding gap between academic GenAI research and industrial supply chain applications, where manual effort and reactive decision-making remain prevalent, and also opens the door to more practical work on solving supply chain challenges with language models.

5.2. Managerial implications

Beyond the successful real-world case study evaluation mentioned above, our work also yields several practical implications for supply chain management. For example, organisations could consider integrating their existing enterprise resource planning (ERP) systems, inventory databases, and other structured data sources with unstructured data (e.g. supplier communications, market reports) to construct comprehensive supply chain knowledge graphs. This integration will enhance visibility into complex relationships and dependencies, enabling more accurate forecasting and risk management. Besides, supply chain managers, logistics coordinators, and procurement teams can work together to adapt their decision-support systems to incorporate GenAI and machine learning models. By embedding our quintuplet-based link prediction approach within these systems, organisations can proactively identify hidden dependencies and potential disruptions, leading to better-informed production planning and inventory management. With improved supply chain visibility, organisations can use the insights from our framework to strengthen supplier relationships and manage risks more effectively.
For instance, by identifying previously unrecognised dependencies, companies can diversify their supplier base, negotiate more robust contracts, and prepare contingency plans to mitigate disruptions.

5.3. Limitations, discussions, and future works

Although the integration of pretrained LMs with a supply chain knowledge graph significantly enhances link prediction capabilities and contributes to improved supply chain visibility, several limitations should still be considered. First, the performance of the proposed model is inherently dependent on the quality and completeness of the underlying knowledge graph; inaccuracies or missing links can propagate errors throughout the prediction process, thereby compromising overall reliability and potentially causing the pretrained LMs to generate ‘hallucinated’ outputs. These erroneous outputs may lead to misguided decision-making if not adequately managed, underscoring the necessity for a comprehensive and high-quality supply chain knowledge graph. Second, although pretrained LMs excel at capturing rich contextual information from text, they are typically trained on general-purpose corpora, which may not fully encapsulate the specialised terminologies and nuances inherent in supply chain contexts. Third, with regard to scalability, our current framework employs a single-shot (zero-shot) approach rather than a multi-shot strategy (e.g. few-shot or iterative query refinement). While multi-shot approaches could potentially enhance accuracy by providing richer contextual guidance and iterative refinement, they also entail significantly higher computational costs. Each additional ‘shot’ increases token processing requirements and inference latency in a linear fashion. For instance, a five-shot prompt could require approximately five times more tokens, and thus proportionally longer inference times, than a single-shot prompt.
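The linear cost growth can be made concrete with a back-of-the-envelope token estimate; the counts used here (400 tokens per in-context example, 40 for the query) are invented for illustration, not measured from our system:

```python
# Back-of-the-envelope prompt-size estimate. Token counts are invented
# placeholders; real counts depend on the tokenizer and prompt wording.

def prompt_tokens(n_shots, tokens_per_example, query_tokens):
    """Total tokens for an n-shot prompt: n in-context examples plus the query."""
    return n_shots * tokens_per_example + query_tokens

single_shot = prompt_tokens(1, 400, 40)  # 440 tokens
five_shot = prompt_tokens(5, 400, 40)    # 2040 tokens
# When the in-context examples dominate the query, a five-shot prompt needs
# roughly five times the tokens of a single-shot prompt (about 4.6x here),
# and inference latency grows roughly in step with prompt length.
```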
Given the typical resource constraints of exploratory studies, our implementation prioritises computational efficiency and feasibility on standard hardware. Fourth, the selection of relatively small-to-medium pretrained language models, as detailed in Table 1, reflects a deliberate balance between performance and practicality. Although larger models, such as the full BERT or cloud-based models, might offer enhanced performance due to their greater number of parameters, their use would require substantially more memory and computational resources, potentially complicating reproducibility. Moreover, the deployment of cloud-based models raises additional concerns regarding data confidentiality, particularly when dealing with sensitive supply chain information. Fifth, while our framework leverages a relatively small pretrained language model without fine-tuning, the results demonstrate meaningful performance improvements attributable to the injection of semantic context into the link prediction process (see Tables 6–10). The generative AI component plays an integral role by producing rich embeddings that encode domain-specific knowledge, thus enabling the downstream classifier to achieve superior predictive accuracy and cross-context consistency compared to baseline methods. Although larger models or fine-tuning could potentially yield further improvements, our choice of a smaller model was driven by practical constraints and the objective of demonstrating a deployable solution.
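This division of labour, a frozen encoder producing embeddings and a lightweight downstream classifier consuming them, can be sketched end to end. To keep the sketch dependency-free, a hashed character-trigram vector stands in for the pretrained LM embedding and a nearest-centroid rule stands in for the trained ANN/CNN classifier; both stand-ins, and all names and data, are our own simplifications rather than the paper's implementation.

```python
# Dependency-free sketch of the pipeline: verbalise a quintuplet, embed it,
# and classify the embedding as "link exists" (1) or not (0).
# The hashed-trigram "embedding" and nearest-centroid "classifier" are
# illustrative stand-ins for the pretrained LM and the trained ANN/CNN.

import hashlib

DIM = 64  # stand-in embedding dimensionality

def embed(text):
    """Hashed character-trigram vector, L2-normalised (NOT a real LM)."""
    vec = [0.0] * DIM
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(pos_texts, neg_texts):
    """'Training' reduces to one centroid per class in this sketch."""
    return (centroid([embed(t) for t in pos_texts]),
            centroid([embed(t) for t in neg_texts]))

def predict(model, text):
    """Assign the class whose centroid is nearest in squared distance."""
    pos_c, neg_c = model
    v = embed(text)
    d_pos = sum((a - b) ** 2 for a, b in zip(v, pos_c))
    d_neg = sum((a - b) ** 2 for a, b in zip(v, neg_c))
    return 1 if d_pos <= d_neg else 0

# Toy verbalised quintuplets; all names are invented.
pos = ["alpha supplies widget to beta", "gamma supplies widget to delta"]
neg = ["alpha has certificate for widget", "gamma has certificate for widget"]
model = train(pos, neg)
```

On these toy sentences the stand-in pipeline separates the two quintuplet patterns; in the paper, the same role is played by sentence-encoder embeddings feeding the benchmarked classifiers.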
Future work will address these limitations by focussing on: (1) constructing a more accurate and comprehensive supply chain knowledge graph; (2) evaluating the applicability of our approach across a broader range of industrial use cases; (3) exploring additional forms of contextual knowledge derived from pretrained language models to further advance supply chain management practices; (4) examining the trade-offs between efficiency and accuracy when employing a multi-shot framework; and (5) assessing the performance gains from using larger language models relative to the increased computational requirements.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

This work was supported by the Engineering and Physical Sciences Research Council (grant number EP/W019868/1).

Notes on contributors

Dr. Ge Zheng is currently a Research Associate in the Supply Chain AI Lab (SCAIL), led by Professor Alexandra Brintrup, at the Institute for Manufacturing (IfM), Department of Engineering, University of Cambridge, UK. She received her PhD degree in Computer Science with a full PhD scholarship from Bournemouth University, UK, and an MSc degree in Electronic Engineering with an Academic Excellence International Masters Scholarship from the University of Essex. Her research areas involve supply chain risk prediction, pattern recognition, intelligent transportation systems, and healthcare applications.

Alexandra Brintrup is a Professor of Digital Manufacturing and head of the Supply Chain AI Lab (SCAIL) at the Institute for Manufacturing (IfM), Department of Engineering, University of Cambridge, UK. She has a PhD in Artificial Intelligence, an MSc in Applied Maths and Computing, and a BEng in Manufacturing Engineering. She specialises in distributed negotiation, machine learning, autonomous systems, and nature-inspired optimisation in complex supply networks.
Data availability statement

Due to the commercially sensitive nature of this research, supporting data is not available.

ORCID

Ge Zheng http://orcid.org/0000-0002-9983-7120
Alexandra Brintrup http://orcid.org/0000-0002-4189-2434

References

Abu-Nimeh, S., D. Nappa, X. Wang, and S. Nair. 2007. “A Comparison of Machine Learning Techniques for Phishing Detection.” In Proceedings of the Anti-Phishing Working Groups 2nd Annual ECrime Researchers Summit, 60–69. Pittsburgh, PA, USA, October 04–05, 2007. Act, S. 2015. “Modern Slavery Act.” United Kingdom Parliament. Agrawal, G., T. Kumarage, Z. Alghami, and H. Liu. 2023. “Can Knowledge Graphs Reduce Hallucinations in LLMs?: A Survey.” arXiv preprint arXiv:2311.07914. Aguero, D., and S. D. Nelson. 2024. “The Potential Application of Large Language Models in Pharmaceutical Supply Chain Management.” The Journal of Pediatric Pharmacology and Therapeutics 29 (2): 200–205. https://doi.org/10.5863/1551-6776-29.2.200. Ahmed, C., A. ElKorany, and R. Bahgat. 2016. “A Supervised Learning Approach to Link Prediction in Twitter.” Social Network Analysis and Mining 6 (1): 1–11. https://doi.org/10.1007/s13278-016-0333-1. Albawi, S., T. A. Mohammed, and S. Al-Zawi. 2017. “Understanding of a Convolutional Neural Network.” In 2017 International Conference on Engineering and Technology (ICET), 1–6. IEEE. AlMahri, S., L. Xu, and A. Brintrup. 2024. “Enhancing Supply Chain Visibility with Knowledge Graphs and Large Language Models.” arXiv preprint arXiv:2408.07705. Astha, Rajvanshi. 2023. “How AI Could Transform Fast Fashion for the Better–and Worse.” Accessed November 12, 2024. Bacilieri, A., A. Borsos, P. Astudillo-Estevez, and F. Lafond. 2023. “Firm-Level Production Networks: What Do We (Really) Know.” INET Oxford Working Paper 2023. Bellamy, M. A., and R. C. Basole. 2013. “Network Analysis of Supply Chain Systems: A Systematic Review and Future Research.” Systems Engineering 16 (2): 235–249.
https://doi.org/10.1002/sys.v16.2. Bender, E. M., T. Gebru, A. McMillan-Major, and S. Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. Virtual Event, Canada, March 3–10, 2021. Bouraoui, Z., J. Camacho-Collados, and S. Schockaert. 2020. “Inducing Relational Knowledge from Bert.” In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 7456–7463. New York, USA, February 7–12, 2020. Brintrup, A., E. Kosasih, P. Schaffer, G. Zheng, G. Demirel, and B. L. MacCarthy. 2024. “Digital Supply Chain Surveillance Using Artificial Intelligence: Definitions, Opportunities and Risks.” International Journal of Production Research 62 (13): 1–22. Brintrup, A., P. Wichmann, P. Woodall, D. McFarlane, E. Nicks, and W. Krechel. 2018. “Predicting Hidden Links in Supply Networks.” Complexity 2018:1–12. https://doi.org/10.1155/cplx.v2018.1. Brockmann, N., E. Elson Kosasih, and A. Brintrup. 2022. “Supply Chain Link Prediction on Uncertain Knowledge Graph.” ACM SIGKDD Explorations Newsletter 24 (2): 124–130. https://doi.org/10.1145/3575637.3575655. BusinessWire. 2024. “Using Generative AI, C.H. Robinson Has Achieved Automation across the Entire Lifecycle of a Freight Shipment.” Accessed November 11, 2024. Cai, L., J. Li, J. Wang, and S. Ji. 2021. “Line Graph Neural Networks for Link Prediction.” IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (9): 5103–5113. Caruana, R., and A. Niculescu-Mizil. 2006. “An Empirical Comparison of Supervised Learning Algorithms.” In Proceedings of the 23rd International Conference on Machine Learning, 161–168.
Pittsburgh, PA, USA. Caspersz, D., H. Cullen, M. C. Davis, D. Jog, F. McGaughey, D. Singhal, M. Sumner, and H. Voss. 2022. “Modern Slavery in Global Value Chains: A Global Factory and Governance Perspective.” Journal of Industrial Relations 64 (2): 177–199. https://doi.org/10.1177/00221856211054586. Celonis. 2024. “Process Mining Meets Generative AI: Celonis Rides Industry Wave to Democratize Core Tech.” Accessed November 12, 2024. Choi, T. Y., K. J. Dooley, and M. Rungtusanatham. 2001. “Supply Networks and Complex Adaptive Systems: Control versus Emergence.” Journal of Operations Management 19 (3): 351–366. https://doi.org/10.1016/S0272-6963(00)0006 8-1. CNBCEvolve Global Summit. 2023. “Fedex at 50:What’s Driv- ing Transformation?” Accessed November 12, 2024. Coşkun, M., and M. Koyutürk. 2021. “Node Similarity- Based Graph Convolution for Link Prediction in Bio- logical Networks.” Bioinformatics 37 (23): 4501–4508. https://doi.org/10.1093/bioinformatics/btab464. du Preez, Derek. 2023. “Mars Develops a Sweet Tooth for Celonis Process Intelligence.” Accessed November 12, 2024. Fichtel, L., J.-C. Kalo, and W.-T. Balke. 2021. “Prompt Tuning or Fine-Tuning-Investigating Relational Knowledge in pre- Trained LanguageModels.” In 3rd Conference on Automated Knowledge Base Construction. Virtual, October 4–8, 2021. Fosso Wamba, S., C. Guthrie, M. M. Queiroz, and S. Min- ner. 2024. “Chatgpt and Generative Artificial Intelligence: An Exploratory Study of Key Benefits and Challenges in Operations and Supply Chain Management.” Interna- tional Journal of Production Research 62 (16): 5676–5696. https://doi.org/10.1080/00207543.2023.2294116. FossoWamba, S., M.M. Queiroz, C. J. C. Jabbour, and C. V. Shi. 2023. “Are Both Generative Ai and Chatgpt Game Changers for 21st-Century Operations and Supply Chain Excellence?” International Journal of Production Economics 265:109015. https://doi.org/10.1016/j.ijpe.2023.109015. Gayam, S. R. 2023. 
“Enhancing Creative Industries with Gen- erative Ai: Techniques for Music Composition, Art Genera- tion, and Interactive Media.” Journal of Machine Learning in Pharmaceutical Research 3 (1): 54–88. Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde- Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. “Gener- ative Adversarial Nets.” In Advances in Neural Information Processing Systems 27.Montreal, Quebec, Canada,December 8–13, 2014. Google DeepMind. 2023. “Gemini Models.” Accessed Novem- ber 13, 2024. Gowda, S. R., and Y. R. Rao, 2024. “Data Augmentation Using Generative-Ai.” Journal of Innovative Image Processing 6 (3): 273–289. https://doi.org/10.36548/jiip. Grandini, M., E. Bagli, and G. Visani. 2020. “Metrics for Multi-class Classification: An Overview.” arXiv preprint arXiv:2008.05756. Guan, X., Y. Liu, H. Lin, Y. Lu, B. He, X. Han, and L. Sun. 2024. “Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting.” In Pro- ceedings of the AAAI Conference on Artificial Intelligence, Vol. 38, 18126–18134. Vancouver, Canada, February 20–27, 2024. Hasan, M. A., and M. J. Zaki. 2011. “A Survey of Link Prediction in Social Networks.” In Social Network Data Analytics, edited by C. Aggarwal. Boston, MA: Springer. https://doi.org/10.1007/978-1-4419-8462-3_9 Hashim, M. E. A., W. A. W. Mustafa, N. S. Prameswari, M. M. Ghani, and H. F. Hanafi. 2023. “Revolutionizing Virtual Reality with Generative Ai: An in-depth Review.” Journal of Advanced Research in Computing and Applications 30 (1): 19–30. https://doi.org/10.37934/arca.30.1.1930. Hochreiter, S., and J. Schmidhuber. 1997. “Long short- term Memory.” Neural Computation 9 (8): 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735. Holger, H., K. Theodora, R. Roger, and T. Kimberly. 2023. “While Still Nascent, Generative AI Has the Potential to Help Fashion Businesses Become More Productive, Get to Market Faster, and Serve Customers Better. 
The Time to Explore the Technology Is Now.” Accessed November 12, 2024. Huang, L., W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, et al. 2023. “A Survey on Hallucination in Large Lan- guage Models: Principles, Taxonomy, Challenges, and Open Questions.” arXiv preprint arXiv:2311.05232. Ialongo, L.N., C. deValk, E.Marchese, F. Jansen,H. Zmarrou, T. Squartini, and D. Garlaschelli. 2022. “Reconstructing Firm- Level Interactions in the Dutch Input–output Network from Production Constraints.” Scientific Reports 12 (1): 11847. https://doi.org/10.1038/s41598-022-13996-3. Jackson, I., D. Ivanov, A. Dolgui, and J. Namdar. 2024. “Gener- ative Artificial Intelligence in Supply Chain and Operations Management: A Capability-Based Framework for Analysis and Implementation.” International Journal of Production Research 62 (17): 1–26. Karras, T., S. Laine, and T. Aila. 2019. “A Style-Based Gener- ator Architecture for Generative Adversarial Networks.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4401–4410. Long Beach, CA, USA, June 15–20, 2019. Kassner, N., P. Dufter, and H. Schütze. 2021. “Multilingual Lama: Investigating Knowledge in Multilingual Pretrained Language Models.” In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 3250–3258. Kyiv, Ukraine, April 19–23, 2021. Kazemi, S.M., andD. Poole. 2018. “Simple Embedding for Link Prediction in Knowledge Graphs.” Advances in Neural Infor- mation Processing Systems 31. Montreal, Canada, December 3–8, 2018. Kenton, J. D. M.-W. C., and L. K. Toutanova. 2019. “Bert: Pre- Training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of NAACL-HLT, 4171–4186. Minneapolis, Minnesota, USA, June 2–7, 2019. Kingma, D. P., and J. Ba. 2014. “Adam: AMethod for Stochastic Optimization.” arXiv preprint arXiv:1412.6980. 
https://doi.org/10.1145/3575637.3575655 https://doi.org/10.1177/00221856211054586 https://doi.org/10.1016/S0272-6963(00)00068-1 https://doi.org/10.1093/bioinformatics/btab464 https://doi.org/10.1080/00207543.2023.2294116 https://doi.org/10.1016/j.ijpe.2023.109015 https://doi.org/10.36548/jiip https://doi.org/10.1007/978-1-4419-8462-3_9 https://doi.org/10.37934/arca.30.1.1930 https://doi.org/10.1162/neco.1997.9.8.1735 https://doi.org/10.1038/s41598-022-13996-3 22 G. ZHENG AND A. BRINTRUP Kingma, D. P., and M. Welling. 2019. “An Introduction to Variational Autoencoders.” Foundations and Trends® in Machine Learning 12 (4): 307–392. https://doi.org/10.1561/ 2200000056. Kleinbaum, D. G., and M. Klein. 2002. Logistic Regression: A Self-Learning Text. New York, NY: Springer New York. Kosasih, E. E., and A. Brintrup. 2022. “A Machine Learning Approach for Predicting Hidden Links in Supply Chain with Graph Neural Networks.” International Journal of Produc- tion Research 60 (17): 5380–5393. https://doi.org/10.1080/00 207543.2021.1956697. Kosasih, E. E., F. Margaroli, S. Gelli, A. Aziz, N. Wildgoose, and A. Brintrup. 2022. “Towards Knowledge Graph Reason- ing for Supply Chain Risk Management Using Graph Neural Networks.” International Journal of Production Research 62 (15): 1–17. Küblböck, K. 2013. The EU Raw Materials Initiative: Scope and Critical Assessment. Technical Report, ÖFSE Briefing Paper. Lan, Z., M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut. 2019. “Albert: A Lite Bert for Self-supervised Learning of Language Representations.” arXiv preprint arXiv:1909.11942. Liu, Y., M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. 2019. “Roberta: A Robustly Optimized Bert Pretraining Approach.” arXiv preprint arXiv:1907.11692. Louie, R., A. Coenen, C. Z. Huang, M. Terry, and C. J. Cai. 2020. 
“Novice-AI Music Co-creation via AI-Steering Tools for DeepGenerativeModels.” In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13. Hawaii, USA, April 25–30, 2020. Martino, A., M. Iannelli, and C. Truong. 2023. “Knowledge Injection to Counter Large Language Model (LLM) Hallu- cination.” In European Semantic Web Conference, 182–185. Springer. Hersonissos, Greece, May 28th–June 1st, 2023. Meiyappan, P., and M. Bales. 2021. Position Paper: Reducing Amazon’s PackagingWaste UsingMultimodal Deep Learning. Amazon Science. https://www.amazon.science/publications/ position-paper-reducing-amazons-packaging-wasteusing- multimodal-deep-learning Microsoft. 2023. “Empower Your Organization with Copilot.” Accessed November 13, 2024. Mohammed, M. Y., and M. J. Skibniewski. 2023. “The Role of Generative Ai in Managing Industry Projects: Transform- ing Industry 4.0 into Industry 5.0 Driven Economy.” Law and Business 3 (1): 27–41. https://doi.org/10.2478/law-2023- 0006. Mungo, L., F. Lafond, P. Astudillo-Estévez, and J. D. Farmer. 2023. “Reconstructing Production Networks UsingMachine Learning.” Journal of Economic Dynamics and Control 148:104607. https://doi.org/10.1016/j.jedc.2023.104607. Narayanan, D., M. Shoeybi, J. Casper, P. LeGresley, M. Patwary, V. Korthikanti, D. Vainbrand, et al. 2021. “Efficient Large- Scale Language Model Training on GPU Clusters Using Megatron-LM.” In Proceedings of the International Confer- ence for High Performance Computing, Networking, Storage and Analysis, 1–15. St. Louis, Missouri, USA, November 14–19, 2021. Noy, S., and W. Zhang. 2023. “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence.” Science 381 (6654): 187–192. https://doi.org/10.1126/scien ce.adh2586. Ooi, K.-B., G. W.-H. Tan, M. Al-Emran, M. A. Al-Sharafi, A. Capatina, A. Chakraborty, Y. K. Dwivedi, et al. 2023. 
“The Potential of Generative Artificial Intelligence across Disciplines: Perspectives and Future Directions.” Journal of Computer Information Systems 65 (1): 1–32. Open AI. 2022. “Dall.e2.” Accessed Novemver 11, 2024. OpenAI. 2024. “Chatgpt – ReleaseNotes.” AccessedNovember 11, 2024. Petroni, F., T. Rocktäschel, S. Riedel, P. Lewis, A. Bakhtin, Y. Wu, and A. Miller. 2019. “Language Models as Knowledge Bases?” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th Inter- national Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguis- tics. Pichler, A., C. Diem, A. Brintrup, F. Lafond, G. Mager- man, G. Buiten, T. Y. Choi, V. M. Carvalho, J. D. Farmer, and S. Thurner. 2023. “Building an Alliance to Map Global Supply Networks.” Science 382 (6668): 270–272. https://doi.org/10.1126/science.adi7521. Pratap, V., A. Sriram, P. Tomasello, A. Hannun, V. Liptchinsky, G. Synnaeve, andR.Collobert. 2020. “MassivelyMultilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters.” arXiv preprint arXiv:2007.03001. Reimers, N., and I. Gurevych.November, 2019. “Sentence-Bert: Sentence Embeddings Using Siamese Bert-Networks.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. Rossi, A., D. Barbosa, D. Firmani, A. Matinata, and P. Meri- aldo. 2021. “Knowledge Graph Embedding for Link Pre- diction: A Comparative Analysis.” ACM Transactions on Knowledge Discovery from Data (TKDD) 15 (2): 1–49. https://doi.org/10.1145/3424672. Ryder System. 2024. “6 Ways Generative AI Is Boosting Logis- tics.”’ Accessed November 11, 2024. Safavi, T., and D. Koutra. 2021. “Relational World Knowledge Representation in Contextual Language Models: A Review.” In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 1053–1067. Shin, H.-C., Y. Zhang, E. Bakhturina, R. Puri, M. Patwary, M. Shoeybi, and R. 
Mani. 2020. “Biomegatron: Larger Biomed- ical Domain Language Model.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Pro- cessing (EMNLP), 4700–4706. Shoeybi, M., M. Patwary, R. Puri, P. LeGresley, J. Casper, and B. Catanzaro. 2019. “Megatron-LM: Training Multi- billion Parameter Language Models Using Model Paral- lelism.” arXiv preprint arXiv:1909.08053. Srivastava, S. K., S. Routray, S. Bag, S. Gupta, and J. Z. Zhang. 2024. “Exploring the Potential of Large Language Models in Supply Chain Management: A Study Using Big Data.” Jour- nal of Global Information Management (JGIM) 32 (1): 1–29. https://doi.org/10.4018/JGIM. Su, Z., X. Zheng, J. Ai, Y. Shen, and X. Zhang. 2020. “Link Prediction in Recommender Systems Based on Vector Sim- ilarity.” Physica A: Statistical Mechanics and Its Applications 560:125154. https://doi.org/10.1016/j.physa.2020.125154. Tan, Y., Z. Zhou, H. Lv, W. Liu, and C. Yang. 2024. “Walklm: A Uniform Language Model Fine-Tuning Framework for Attributed Graph Embedding.” In Advances in Neural Infor- mation Processing Systems 36. https://doi.org/10.1561/2200000056 https://doi.org/10.1080/00207543.2021.1956697 https://www.amazon.science/publications/position-paper-reducing-amazons-packaging-wasteusing-multimodal-deep-learning https://doi.org/10.2478/law-2023-0006 https://doi.org/10.1016/j.jedc.2023.104607 https://doi.org/10.1126/science.adh2586 https://doi.org/10.1126/science.adi7521 https://doi.org/10.1145/3424672 https://doi.org/10.4018/JGIM https://doi.org/10.1016/j.physa.2020.125154 INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH 23 Touvron, H., T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, et al. 2023. “Llama: Open and Efficient Foundation Language Models.’ arXiv preprint arXiv:2302.13971. Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. 2017. 
“Attention Is All You Need.” In Proceedings of the 31st International Confer- ence on Neural Information Processing Systems, 6000–6010. Wang,W., Y. Huang, Y.Wang, and L.Wang. 2014. “Generalized Autoencoder: A Neural Network Framework for Dimen- sionality Reduction.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 490–497. Wang,W., F.Wei, L. Dong, H. Bao, N. Yang, andM. Zhou. 2020. “Minilm: Deep Self-attention Distillation for Task-Agnostic Compression of pre-trained Transformers.” Advances in Neural Information Processing Systems 33:5776–5788. Wichmann, P., A. Brintrup, S. Baker, P. Woodall, and D. McFarlane. 2018. “Towards Automatically Generating Sup- ply Chain Maps from Natural Language Text.” IFAC- PapersOnLine 51 (11): 1726–1731. https://doi.org/10.1016/j. ifacol.2018.08.207. Yang, L., and A. Shami. 2020. “On Hyperparameter Optimiza- tion ofMachine Learning Algorithms: Theory and Practice.” Neurocomputing 415:295–316. https://doi.org/10.1016/j.ne ucom.2020.07.061. Yasunaga, M., J. Leskovec, and P. Liang. 2022. “Linkbert: Pretraining Language Models with Document Links.” In Proceedings of the 60th Annual Meeting of the Associa- tion for Computational Linguistics (Volume 1: Long Papers), 8003–8016. Yegnanarayana, B. 2009.Artificial Neural Networks. PHI Learn- ing Pvt. Ltd. Zareie, A., and R. Sakellariou. 2020. “Similarity-Based Link Prediction in Social Networks Using Latent Relation- ships between the Users.” Scientific Reports 10 (1): 20137. https://doi.org/10.1038/s41598-020-76799-4. Zhang, C., S. R. Kuppannagari, R. Kannan, and V. K. Prasanna. 2018. “Generative Adversarial Network for Synthetic Time Series Data Generation in Smart Grids.” In 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 1–6. IEEE. Zhao, C., X. Sun,M.Wu, and L. Kang. 2024. 
1. Introduction
2. Related works
2.1. Generative artificial intelligence
2.2. Link prediction in supply networks
2.3. Summary of research gaps
3. Combining pretrained language models and knowledge graphs
3.1. Preliminaries: from triplets to quintuplets
3.2. The pretrained LM-based machine learning framework
3.2.1. Language model selection
3.2.2. Machine learning model selection
4. Case study
4.1. Generating training data
4.2. Experimental settings
4.2.1. Benchmarks
4.2.2. Settings of model training
4.2.3. Performance metrics
4.3. Experimental results and discussions
4.3.1. Benchmark comparison between pretrained LM-enhanced link prediction and general machine learning models
4.3.2. Comparisons between machine learning models
4.3.3. Comparisons between pretrained language models
4.3.4. Summary of findings
5. Conclusions, managerial implications, limitations, discussions, and future works
5.1. Conclusions
5.2. Managerial implications
5.3. Limitations, discussions, and future works
Disclosure statement
Funding
ORCID
Data availability statement
References