IUPAC Technical Report

Mark Archibald, Ian Bruno, Stuart Chalk, Antony N. Davies, Robert M. Hanson*, Stefan Kuhn,
Robert J. Lancashire and Henry S. Rzepa

FAIRSpec-ready spectroscopic data
collections – advice for researchers, authors,
and data managers (IUPAC Technical Report)
https://doi.org/10.1515/pac-2025-0409
Received January 4, 2025; accepted September 3, 2025

Abstract: In this Technical Report, we introduce the application of FAIR (findable, accessible, interoperable,
and reusable) data management in the form of a “FAIRSpec-ready spectroscopic data collection” – that is, a
collection of instrument data, chemical structure representations, and related digital items that is ready to be
automatically or semi-automatically extracted for metadata that will allow the production of an IUPAC
FAIRSpec Finding Aid. Associating this finding aid with the collection produces an IUPAC FAIRSpec Data
Collection. The challenge we set for researchers is relatively simple: tomaintain their data in a form that allows
critical metadata to be extracted in a discipline-specific way, increasing the probability that the data will be
findable and reusable both during the research process and after publication. We focus on a few specific
suggestions that researchers can use to maximize the “fairness” of their spectroscopic data collection. Most
importantly, following these guidelines ensures that instrument datasets are unambiguously associated with
the chemical structure. The guidelines promote the inclusion of the instrument dataset itself in the collection
and describe ways of organizing the collection such that automated metadata creation is possible. In these
guidelines, we emphasize the importance of systematically organizing data throughout the entire research
process, not just at the time of publication.

Keywords: Data management; FAIR data; FAIR data management; FAIRSpec; finding Aids; spectroscopy.

Article note: This manuscript was prepared in the framework of IUPAC project 2019-031-1-024 of the Committee on Publications and
Cheminformatics Data Standards.

*Corresponding author: Robert M. Hanson, Department of Chemistry, St Olaf College, Northfield, MN, USA, e-mail: hansonr@stolaf.edu.
https://orcid.org/0000-0001-5411-2356
Mark Archibald, Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, Cambridge, CB4 0WF, UK,
e-mail: archibaldm@rsc.org. https://orcid.org/0000-0001-8687-7134
Ian Bruno, Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK, e-mail: bruno@ccdc.cam.ac.uk.
https://orcid.org/0000-0003-4901-9936
Stuart Chalk, Department of Chemistry and Biochemistry, University of North Florida, Jacksonville, FL, USA, e-mail: schalk@unf.edu.
https://orcid.org/0000-0002-0703-7776
Antony N. Davies, SERC, Sustainable Environment Research Centre, Faculty of Computing, Engineering and Science, University of South
Wales, Cardiff, UK, e-mail: antonyndavies@gmail.com. https://orcid.org/0000-0002-3119-4202
Stefan Kuhn, Institute of Computer Science, University of Tartu, Narva mnt 18, 51009, Tartu, Estonia, e-mail: stefan.kuhn@ut.ee. https://
orcid.org/0000-0002-5990-4157
Robert J. Lancashire, Department of Chemistry, The University of the West Indies, Kingston 7, Mona Campus, Jamaica,
e-mail: rjlanc@gmail.com. https://orcid.org/0000-0002-6780-3903
Henry S. Rzepa,Department of Chemistry, Imperial College, Molecular Sciences Research Hub, White City Campus, Wood Lane, London,W12
0BZ, England, e-mail: h.rzepa@imperial.ac. https://orcid.org/0000-0002-8635-8390

Pure Appl. Chem. 2025; 97(11): 1479–1510

© 2025 IUPAC & De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
License.

https://doi.org/10.1515/pac-2025-0409
mailto:hansonr@stolaf.edu
https://orcid.org/0000-0001-5411-2356
mailto:archibaldm@rsc.org
https://orcid.org/0000-0001-8687-7134
mailto:bruno@ccdc.cam.ac.uk
https://orcid.org/0000-0003-4901-9936
mailto:schalk@unf.edu
https://orcid.org/0000-0002-0703-7776
mailto:antonyndavies@gmail.com
https://orcid.org/0000-0002-3119-4202
mailto:stefan.kuhn@ut.ee
https://orcid.org/0000-0002-5990-4157
https://orcid.org/0000-0002-5990-4157
mailto:rjlanc@gmail.com
https://orcid.org/0000-0002-6780-3903
mailto:h.rzepa@imperial.ac
https://orcid.org/0000-0002-8635-8390


CONTENTS
1 Introduction ............................................................................................................................................ 1480

1.1 The IUPAC “FAIRSpec” project ..............................................................................................................................1480
1.2 Organization of this report ..................................................................................................................................... 1482
1.3 Intended audience .................................................................................................................................................... 1483
1.4 Management of spectroscopic data ...................................................................................................................... 1483
1.5 FAIR management of spectroscopic data ...........................................................................................................1484
1.6 Not just for publication: the FAIR data workflow ........................................................................................... 1485
1.7 Sample-vs. compound-based collections ............................................................................................................. 1485
1.8 Descriptive and relational metadata ...................................................................................................................1486

2 Guidance for digital chemical structure representations in a FAIRSpec-ready data collection .....1486
2.1 Digital chemical structure representations .......................................................................................................1486
2.2 Generating digital chemical structure representations .................................................................................1488

3 Guidance for instrument dataset representations in a FAIRSpec-ready data collection ................ 1490
4 Guidance for the organization of digital items in a FAIRSpec-ready data collection .......................1492

4.1 The IUPAC FAIRSpec finding aid ........................................................................................................................... 1492
4.2 The FAIRSpec-ready collection ..............................................................................................................................1494

5 Addition of curated metadata to a FAIRSpec-ready collection .......................................................... 1500
5.1 Internal metadata records ......................................................................................................................................1500
5.2 Registered metadata records ................................................................................................................................. 1502

6 Guidance for managers of laboratory instrument management systems and electronic laboratory
notebooks ..................................................................................................................................................1503

7 Summary ...................................................................................................................................................1504
Appendix: Terminology .................................................................................................................................1505
Compound association ...................................................................................................................................1505
Dataset .............................................................................................................................................................1506
Digital item ......................................................................................................................................................1506
Digital object ...................................................................................................................................................1506
Instrument dataset .........................................................................................................................................1506
IUPAC FAIRSpec data collection ...................................................................................................................1506
IUPAC FAIRSpec Finding Aid .........................................................................................................................1507
Persistent identifier .......................................................................................................................................1507
Representation ...............................................................................................................................................1507
List of abbreviations ......................................................................................................................................1507
References .......................................................................................................................................................1508

1 Introduction

1.1 The IUPAC “FAIRSpec” project

The IUPAC ProjectDevelopment of a Standard for FAIR DataManagement of Spectroscopic Data1 was organized in
2019 in order to promote the adoption of FAIR (findable, accessible, interoperable, and reusable) data manage-
ment principles in chemical sciences throughout the data production, publication, and post-publication process.
The overall goals of this ongoing project include:
– the development of a clear set of FAIR Data Management Principles specific to chemistry-related spectro-

scopic data;
– the design of a means of describing a spectroscopic data collection that distills critical metadata associated

with the digital items in the collection, thus providing a standardized, accessible method of exploring the
contents of a data collection without the need to individually access items in the collection (the IUPAC
FAIRSpec Finding Aid2,3);

1480 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


– providing researchers, authors, and data managers with guidelines for the organization of spectra and
associated chemical structure information that allows machine-assisted curation of the data, creating the
necessary link between chemical structure and spectroscopic data that is often key to its analysis and
discussion; and

– the specification of a standardized set of metadata keys and values that will allow a broad range of services
that can efficiently search for and retrieve spectroscopic datasets of interest.

Previously, we have introduced a set of FAIRmanagement principles relating to spectroscopic data in chemistry.4

The five main principles and their associated corollaries are shown in Fig. 1.
In this Technical Report, we focus on the penultimate goal of our project – providing guidelines for the creation

of what we are calling FAIRSpec-ready data collections. As summarized in Fig. 2, we propose best practices in
relation to the five FAIRSpec principles as applied to developing and managing spectroscopic data collections.

Of primary relevance to this report is the recognition that context is important. Spectroscopy data objects
that are part of a collection generally relate to one ormore chemical structures, through diverse relationships, the
recognition of which may only develop over time. Further, data must be organized to enable accessibility, and
associated metadata must be well designed to facilitate interoperability. It is important to value all data formats,
including commonly used proprietary formats as well as recognized standards published by national or inter-
national standardization bodies such as IUPAC.

We have chosen to discuss these guidelines prior to describing specifications for the details of our proposed
IUPAC FAIRSpec Finding Aid (which will be the focus of a second Technical Report currently in development)
because we feel that one of the most important components of the curation necessary to produce an IUPAC

Fig. 1: The five FAIRSpec principles and their corollaries.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1481


FAIRSpec Data Collection is a critical minimal level of curation that can be applied to essentially any working
collection of spectroscopic data, whether or not an IUPAC FAIRSpec Finding Aid is ever generated. Effective data
management starts immediately after spectroscopic data are generated within a laboratory and continues
through publication and beyond. If this minimal curation is done well, over time, many of the goals of FAIR data
management can then be accomplished efficiently or even automatically in later stages of the process, however
that workflow might ultimately be defined. If done poorly, the effort of data preservation may be painfully time
consuming, and meaningful extraction of metadata might even become impossible.

While these guidelines might seem extensive upon first sight, we wish to emphasize that the creation of a
FAIRSpec-ready collection can be simplicity itself. Successful efforts can be as involved as implementing a fully
“data-aware” laboratory management system or as simple as just maintaining a set of file directories on an
instrument, provided appropriate chemical structure representations are added consistently.

1.2 Organization of this report

The guidelines presented here cover four specific areas:
– guidance for the generation of structural representations that accompany those datasets (Section 2),
– guidance for best practices in the preparation of spectroscopic datasets, particularly in regard to their

associated descriptive and relational metadata (Section 3),
– guidelines for the organization of digital items in the collections containing spectroscopic data and their

associated structural representations (Section 4),
– guidance for maximizing the potential for registering metadata with recognized metadata management

agencies (Section 5), and
– guidance for data and laboratory managers for ensuring that the data collected by their researchers are

maintained in a fashion that optimizes the collections’ findability, availability, interoperability, and reus-
ability both within their institution and more generally.

Fig. 2: Best practices in spectroscopic data management based on the five FAIRSpec principles.

1482 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


These guidelines are not intended to cover every possible sort of data or structure representation. In several cases
involving structures, such as organometallic compounds and polymers, we recognize that there is no perfect
solution. As such, these guidelines should be taken as starting points for development of further guidance.

Acronyms and some of the terminology used here are defined more fully in the appendix to this Technical
Report.

1.3 Intended audience

The primary intended audience of this Technical Report includes:
– practicing researchers who create and work with experimental spectroscopic datasets and are interested in

best practices relating to the management of their growing data collections
– principal investigatorswith an interest inmaximizing their data’s potential to be found, used, and referenced

both internally within their institution and by members of their community
– institutional staff responsible for working with researchers to utilize instruments that collect

spectroscopic data
– librarians working with researchers to develop best practices in relation to data management within their

institution
– institutional repository managers with responsibilities that include ensuring the highest level of FAIR data

management within their institutions
– journal editors and publishing house staff tasked with developing guidance for authors and reviewers in the

creation and review of electronic supplementary information (ESI) datasets associated with scientific pub-
lications involving spectroscopic data

– developers of electronic laboratory notebooks (ELNs) and other services associated with the scientific en-
terprise such as integrated laboratory management platforms and data validation services

– funding agencies interested in providing guidance to their grantees in developing practical wide-reaching
FAIR data management plans, particularly in the field of chemistry

– anyone working outside the context of chemistry-related spectroscopy interested in developing similar
guidelines for FAIR data management within their own discipline

1.4 Management of spectroscopic data

Spectroscopic data are key components of many chemistry endeavors both in academia and industry. Spectro-
scopic analysis forms a principal function in the “proof of structure” in most areas of experimental chemistry,
answering questions such as “What have Imade?”, “Howpure is it?”, and “Whydidn’t this reactionwork theway I
expected it to?” Spectroscopic data provide evidence required for publication of scientific results by journal
publishers and is the basis for subsequent experimental replication or extrapolation of results.

Unfortunately, up until very recently, it has been the norm that such data are provided only in packaged
document form, vendor-proprietary formats, or in reduced or processed form as a “spectrum”, perhaps even just
an image of a spectrum, rather than as the primary or raw instrumental output. The result is that published data
are often significantly less useful than they could potentially be.

As primary products of funded research, plans for the maintenance and sharing of spectroscopic data are
increasingly becoming a requirement of funding agencies. For example, the U.S. National Science Foundation
(NSF) guide to proposal preparation specifies:

Proposals must include a document of no more than two pages uploaded under “Data Management Plan”…. This supplementary
document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results…
and may include … the standards to be used for data and metadata format and content.5

and

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1483


Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the
primary data, samples, physical collections, and other supporting materials created or gathered in the course of work under NSF
awards.6

The NSF Directorate for Mathematical and Physical Sciences Division of Chemistry provides more specific
guidance to the chemistry community as well.7

Publishers have also started to at least recommend that authors include their data as ESI for publications,8–10

though only minimal advice for how to organize those data have been provided.
In strongly regulated industries such as pharmaceutical research and manufacturing, there are long-

standing legal requirements to conserve and be able to produce on demand original analytical data such as
spectra for auditors.11, 12 These data may well be required long after the measuring instruments themselves have
been retired.

In both contexts – academic and industrial – this issue is compounded by the international chemistry
community’s lack of an agreed-upon, standardized mechanism for organizing and sharing collections of spec-
troscopic data. Instead, every research group is left to their own devices to save and share data in whatever
organizational scheme they choose to implement, often as a single monolithic document supplementing a pub-
lication, the ESI. In this form, the data cannot be said to be easily discoverable or findable other than by direct
association via the landing page of the journal article reporting the results. The “data” are likely accessible to
humans only via copy/paste operations, if these are even permitted by the format. When raw or minimally
processed data are provided, they tend to be in the form of an idiosyncratically organized archive file. Such
“collections” are not sufficient for optimal broad-context findability or ease of re-use.

To a practicing chemist, it should be obvious that digital entities coming from a laboratory instrument
constitute “primary data” (referred to herein as “datasets”). However, the word data as used here is a broader
term. For our purposes, data includes processed data such as spectra, as well as derived data and analyses. These
might include peak lists or chemical shift splitting, and integration descriptions in nuclear magnetic resonance
(NMR) spectroscopy, as well as 1D and 2D spectral assignments in relation to molecular structure. Chemical
structure representations such as molfiles (MOL) and structure-data files (SDF) also fall in this category.

1.5 FAIR management of spectroscopic data

Relatively recent discussions of data management have emphasized what are referred to as FAIR data man-
agement guidelines.13 It is important to emphasize that what we are referring to as “FAIR” here is not somuch the
data themselves as it is the management of that data, and in particular, the production and organization of the
metadata associated with experimental data.Metadata are digital items that play the role of documentation for
data, resource discovery, and contextualization. Metadata accompanies electronic data in describing the data,
making internal relationships among related digital items in a collection, and relating that collection to
other work.

Thus, we prefer to refer to “FAIR data management” rather than “FAIR data” itself as a way of emphasizing
that we are not just referring to specific file formats (experimental “datasets”) when using these terms, but rather
to how those datasets are described and organized as part of data collections. We focus here on the development
of a rich metadata context associated with experimental datasets.

It is important to note at the outset that these widely discussed FAIR guidelines were designed not only to
facilitate access to such data by humans, but also to allow autonomous discovery and re-use by machines in
contexts such as artificial intelligence and machine learning. “FAIR” has been said to also refer to Fully Artificial-
Intelligence Ready, and the acronym FAIR2 (“FAIR-squared”) has been suggested for Findable, Accessible, Inter-
operable, Reusable, and Fully Artificial-Intelligence Ready.14 Unfortunately, in the chemical spectroscopies, the
current forms of ESI as representations of data are rarely if at all conformant with FAIR guidelines or optimal for
machine processing.

1484 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


An important aspect of FAIR data management relates to how ametadata record is stored and shared. There
are twomainmodels for the use ofmetadata. The onemost associatedwith findable, accessible, interoperable, and
reusable focuses on the public distribution of data and its associatedmetadata. The general FAIR guidelines focus
on a largely discipline-blind model where a digital object is registered with an appropriate registration agency in
exchange for a persistent identifier (PID, normally in the form of a digital object identifier (DOI) from the DOI
Foundation,15 and that record is then aggregated into ametadata store, where it can also be indexed and searched.
The metadata records include full file paths (URLs) or database references to the components or collections as
held in an appropriate repository. This first model is primarily focused on the post-publication finding and
relating of data and their collections on the web.

1.6 Not just for publication: the FAIR data workflow

A more nuanced model for FAIR data management is a private, locally relevant model discussed in detail below.
This second discipline-specific model, which we introduce in these guidelines, is one where the metadata are
stored on a file system or within a local database where they can be summarized by a local IUPAC FAIRSpec
Finding Aid (discussed more specifically in Section 4). This highly structured document describes the accumu-
lating data collection in amachine-readable format. The IUPAC FAIRSpec Finding Aid can then serve as a starting
point for both the mostly discipline-blind PID-based model as well as a much more fine-grained private or public
discipline-specific purposes.

In the everyday activities of a research laboratory, it is common to acquire numerous spectroscopic data
taken by multiple researchers, over multiple timeframes, of samples containing compounds having known or
unknown structure. Many researchers already spend significant time curating their data collections, whether
that be as complex as a dedicated institutional system or as simple as a hand-written laboratory notebook or
spreadsheet. ELNs, when used properly, also serve an important role in the organization of the metadata
associated with spectroscopic data. However, with few exceptions16 such tools (and themetadata they collect) are
largely unstandardized, making reusability a problem.

Thus, well before any publication is in sight, there is a need for more standardized workflows for the
organization of spectroscopic data collections, particularly in relation to their associated samples and the putative
structures of their associated compounds. Every experimental chemist understands the importance of both
organizing and communicating such information. The guidelines we present here suggest simple ways that
researchers can better organize their data and associated metadata to provide added value throughout the
research project lifetime. Indeed, evenwell past the point of project completion and publication, when digital data
collections are stored in and made accessible by repositories, there is a great need for standardization of the
metadata associated with spectroscopic data. These guidelines provide the basis for a workflow to produce data
collections that are ready to be processed – and, indeed, might be regularly processed privately and locally – to
make FAIRSpec-ready collections that are useful throughout the research process.

1.7 Sample-vs. compound-based collections

It is important to understand that the nature of a spectroscopic collection may change over time and be different
in different contexts. Specifically, a spectroscopic data collection in its initial context of a laboratory setting tends
to be sample oriented. A sample obtained from a reaction or other source is analyzed using an instrument. Data
are obtained in the digital form. At this stage, it is not necessarily appropriate to assign any correlation to a
chemical structure. However, it is critical that we have a reference to the sample unambiguously associated with
the spectral data. Methods of accomplishing this task are discussed in Sections 4.2 and 5.1, below.

As time progresses, in association with internal reports and external publications, the chemical structure is
generally associated with spectra. These collections may retain sample-to-spectra associations, but now they add
associations referred to here as “compounds”, as discussed in Sections 4.2. These compounds are described

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1485


digitally as associations made between structure representations and instrument datasets. Section 2 describes in
detail our advice for the creation and introduction of structure representations.

1.8 Descriptive and relational metadata

Metadata is central to this discussion. We distinguish two types of metadata – descriptive and relational.
Descriptive metadata in the context of spectroscopy constitute information such as the type of instrument

used, the temperature, the solvent, and other details of the analytical methods involved in acquiring and pro-
cessing the actual “experimental data.” The description also includes declarations of themedia types that the data
are held in, which will help identify whether it is primary or raw (lossless) data directly captured from an
instrument, or whether it has already been processed (possibly with some loss of information) in some form, as
for example in a conversion of a time domain or induction form into a frequency domain or spectral form.

Relational metadata include information that ties or associates experimental spectroscopic data to their
relevant context or provenance – notebook page and sample references, compound numbers in a publication,
researcher and organizational identifiers, proposed chemical structure details and references to both related
spectra and other analytical techniques. Relational metadata can also be at the collection level, indicating re-
lationships to other collections such as other datasets or to journal publications. Relational metadata canprovide
licensing information together with broader information about sponsoring and research organizations. One of the
most important aspects of relational metadata is that it can be registered with agencies such as DataCite17 specif-
ically for the purposes of improving wide-ranging findability and accessibility. Relational metadata records
themselves at the collection level can then be associated with their own individual persistent identifiers.

Relationalmetadata are dynamic and are likely to increasewith time as the project evolves, with the addition
of new and related spectra, association with other forms of data such as computational models, and as con-
nections to chemical structure aremade or revised. Even after publication, the relationships amongst the items in
a collection are likely to change as related publications appear, and additionalmetadatamay be needed to refer to
later publications or datasets as well. Descriptive metadata, in contrast, are likely to be more static (and, in many
cases, unchangeable for ethical reasons), being produced once and then not changed again.

2 Guidance for digital chemical structure representations in a
FAIRSpec-ready data collection

2.1 Digital chemical structure representations

We start with guidelines for creating chemical structure representation specifically in relation to spectroscopic
data within a FAIRSpec-ready collection.

There are many ways in which a structure can be represented, ranging from an image or diagram, through
electronic formats that capture detailed atomic coordinates and connectivities, to linear representations,
chemical names, and other standard identifiers. In practice, such representations perform many functions.
Authors use images (or “drawings”), for example, to convey aspects of chemical structure and bonding that relate
to their finding. Cheminformaticians use digital representations to reference chemical properties. An important
purpose of a digital structure representation in the context of spectroscopic data collections is to provide the
critical digitalmetadata that allowfinding and discussing the relationship between structure and spectroscopy. In
cases where detailed post-acquisition spectral analysis has been carried out, specific structure representations
are necessary to correlate specific atoms and/or functional groups in a compound’s structure with specific signals
in the spectroscopic data.

Digital structure representations conveying the chemical structures of compounds associated with spec-
troscopic data might include one or more of the file types given in Table 1, where pros and cons of each are

1486 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


mentioned. The most useful structure representations in the current context describe the molecule in terms of
atoms and bonds with 2D or 3D coordinates (e.g., MDL-MOL18, 19 or CDXML20) allowing automated generation of
representations that can be included in metadata such as SMILES (simplified molecular input line entry sys-
tem)21–23 and InChI (International Chemical Identifier)24 when appropriate. Although simple images are poten-
tially valuable representations in a variety of contexts, theymust not be the sole representation for a compound’s
structure, as they cannot generally be reliably converted to any of the other formats.

The key point here is not that one representation is inherently better than another, rather that we value the
presence of multiple representations of a structure within a collection. We encourage researchers to provide as
many structure representations as they are reasonably able to – 2D drawingfile and an image, or a 2D structure as
well as a 3D structure. In terms of a collection being FAIRSpec-ready, we do not need all the possible represen-
tations, only enough to generate additional ones through automated workflows. For example, a single CDXML
representation can be used to generate a SMILES, an InChI, a PNG image, and both 2D- and 3D-molfiles. We do not
(and inmany cases could not) seek to dictate a single correct method of structural representation; rather we have
tried to present here methods most likely to preserve chemical information after machine processing and to
highlight common pitfalls where information would be scrambled or lost.

Table : Common digital chemical structure representations – pros and cons.

Representation type Considerations

MDL-MOL Version


Benefits: high interoperability, can contain D or D coordinates
Limitations: limited ability to convey bonding

MDL-MOL Version


Benefits: all the features of Version , with expanded capabilities to describe special bonding types and multi-
atom connection bonding (as in ferrocene)
Limitations: less widely implemented to date in toolkits and informatics platforms.

CDXML Benefits: well-specified structural format generated and read by popular drawing programs, allows for the
expression of “nicknames” such as Ph (phenyl) and TBS (tert-butyldimethylsilyl), highly versatile in expressing
nuances of bonding, and may provide warnings of inappropriate bonding or charge states
Limitations: generally, less interoperable than MOL (at least at the time of this writing); ambiguity can arise from
using bonding and atomelements for nonmolecular depictions, such as titles, labels, and drawn lines; specification is
no longer formally maintained, but the last published working specification version is available.

CDX Benefits: well-specified binary equivalent of CDXML, easily converted to CDXML with open-source tools
Limitations: binary format lacks the human readability aspects of MOL and CDXML; see note for CDXML in relation to
specification

SMILES Benefits: D (character string) compact format, generally well-specified standard, allows for stereochemical ambi-
guity, allows explicit double bond or aromatic bond descriptions, easily interconvertible with D or D formats within
the range of its applicability
Limitations: there is no universally accepted “canonical” form; different toolkits and toolkit versions differ in in-
terpretations, particularly of what “aromatic” means and where it is applicable. An IUPAC project to formalize
guidelines for reliable use of SMILES is currently in progress.

InChI Benefits: D (character string) compact format, canonical, options can include or not include hydrogen and ste-
reochemical “layers”, easily derivable from D or D formats within the range of its applicability
Limitations: depending upon the presence of layers, cannot always be converted unambiguously to D or D
structure representations or to handle tautomeric isomers; does not currently encode bonding details for metal-
containing compounds or some advanced stereochemistry features. Projects to address priority limitations are
currently in progress., 

Image Benefits: provides a readily interpretable and displayable option for finding aid viewers
Limitations: does not generally allow for reliable error-free conversion to any of the other representations;
generally, not acceptable as the sole structure representation in a collection

Chemical Name Benefits: particularly the IUPAC preferred name for a compound, when appropriate, is the gold standard for
unambiguous description of a chemical entity.
Limitations: prone to errors that are not easily discoverable; requires complex processes for converting to other
forms of chemical identifiers.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1487


Thus, all chemical structure representations involve priorities and trade-offs. When a chemist draws a
chemical structure, the purpose is generally to allow unambiguous communication with other chemists. Addi-
tionally, though, the structures we draw could be used also to communicate unambiguously with machines.
However, the qualities of a drawn structure that make it unambiguous to a human might introduce ambiguity
when interpreted by a machine. The trade-offs and considerations described below are often necessary because
the ecosystem of tools to draw, interchange, and process chemical structures electronically lacks the functionality
(or broadly agreed upon methods) to handle some of the more nuanced aspects of structural representation.
Ultimately, in the context of FAIR data management, the goal is to use a chemical structure to generate the
metadata that enables both humans and machines to communicate efficiently and unambiguously with each
other. Thus, a focus on unambiguousmachine interpretation can lead to slightly different priorities for drawing a
structure than chemists might be used to.

2.2 Generating digital chemical structure representations

What follows is a set of suggestions for best practices in producing the chemical structure representations
associated with spectroscopic datasets.
1. Provide multiple representations of the same structure when feasible. The IUPAC FAIRSpec Principles

embrace a variety and multitude of structure representations. Many of these structure representations are
interconvertible. Thus, if a FAIRSpec-ready data collection includes only a CDXML or MOL representation,
that can generally be sufficient to generate all the other representations for inclusion in an IUPAC FAIRSpec
Data Collection, such as one or more forms of SMILES or InChI. Nonetheless, the presence of multiple
representations for a given structure allows for increased interoperability and improved opportunities for
data and metadata validation.

2. Include only one structure perfile.Drawing files containingmultiple structures (such as reaction schemes
or tables or figures for publication) generally cannot be processed by automated methods. Particularly in
relation to spectroscopic collections, correlating specific structures with specific spectra requires that in-
dividual structures have their own digital representations (i.e., files). In the same vein, generic labels such as
“R” or “X” (“Markush” drawings) should not be used to represent multiple compounds that differ only at
specific locations, because doing so does not allow the sort of direct association with spectroscopic data we
need in this context.

3. Produce structure representations free of annotations. In preparation for extraction of metadata from a
structure, do not annotate structures with labels or text boxes. Numbers providing compound numbers or
adjacent to atoms to identify specific atoms are generally not interpretable by software. Hydrogen bonds and
other “weak” bonds, though perhaps important to a discussion, are not advisable for these structure-data
associations. Notations such as (R), (S), racemic, scalemic, 93 % ee, 3:1 dr, etc. should not be part of the
structure representation itself. If such annotations are important to the discussion, provide a separate
representation that includes them. Chemical names should not be included with structures, as this can
sometimes result in errors in processing, and can lead to unmanageable image widths. In general, chemical
names are not necessary in FAIRSpec-ready collections, as they can be generated from well-made structure
or drawing files.
To the extent that it is useful within the given context, we suggest that the place for structure-specific
annotation is in metadata, not within the structural representation itself. As such, we describe in Section 5.1
several simple ways of adding annotations in a way that allows them to be associated with structural
representations without being hidden within them.
Most importantly, it is advisable to adhere to IUPAC-recommended structure depictions as much as possible.
IUPAC recommendations are available for graphical representation of chemical structures28 and the
depiction of stereochemistry.29 Ongoing IUPAC efforts are expected to expand upon these guidelines spe-
cifically in relation to machine-readable formats.30

1488 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


4. Take care when using abbreviations in a structure. Abbreviated atom labels (e.g., ‘OTHP’, meaning
“tetrahydropyranyloxy”) may cause problems when converting from a drawing program’s native format to
other structure representations, such as MOL, SMILES, and InChI. It is advisable to look for errors in the
drawing program. An abbreviated group flagged as a possible error by the program typically means that the
drawing program cannot interpret the chemical meaning of the abbreviation. If in doubt, expand all ab-
breviations, at least temporarily, just to check. Or, if the program allows, check that the calculated molecular
formula matches the intended structure. For example, a drawing program may interpret “PMB” as phos-
phorus-metal-boron, whereas the author intended it to mean para-methoxybenzyl. Expansion of the label or
checking the molecular formula would quickly identify this as a problem.

5. For mixtures of compounds, consider the spectroscopic context. Here we wish to distinguish between
structure representation, compound, and sample. For our purposes here, a structure representation is a digital
item, a series of bytes, whether that be a SMILES in the form of a string of characters or a MOL file or an
image. We contrast that to a compound, which may or may not have an associated chemical structure or
spectrum. Thus, we speak of “the structure of a compound” or “this compound’s spectrum.” Essentially, the
term compound has an associative nature. It is the connecting link between a spectrum and its associated
structure. Chemical samples, on the other hand – the actual starting point for experimental
spectroscopy – are generally mixtures of compounds, and the “compounds” themselves might even be
mixtures. It is not uncommon, for example, to see in publications the phrase Compound 3c was a 10:1 mixture
of diastereomers.Whether or not this is correct usage of the term “compound” is not for us to say. After much
discussion within our project group, we have come to the following context-based consensus:
a. For a single compound for which the NMR spectrum shows (in the author’s opinion) minor impurities or

residual solvent, only include the structure of the principal component. In the case of a compound
reported with “high enantiomeric excess”, include only the structure of the major enantiomer if, in the
given context, the minor enantiomer is nothing more than an undesired impurity.

b. In contrast, if the context emphasizes the stereochemical nature of themixture (for example, the analysis
is from chromatography that separates and identifies enantiomers, or the NMR spectrum clearly in-
dicates signals from two diastereomers and is used to determine their ratio), multiple structures should
be included in individual files. Additional compound association-level descriptive metadata could pro-
vide information regarding the nature of the mixture.

c. For racemates, include only the structure of one of the enantiomers unless the enantiomers are
distinguishable by the associated spectroscopic data. Representation of racemates is a special case that is
trivially depictable for human consumption but much more difficult to achieve for machine-readability.
Current machine-readable formats lack a reliable, consistent way to store the information that the
structure is racemic.We note that InChI has a flag for racemates, but it is not part of standard InChI. MOL
files may use the “chiral flag” set to zero to indicate a racemic mixture, but this feature has not been
implemented reliably in the past.31

Differentiation between use cases a, b, and c, may well be driven by the environment in which the work was
carried out. For example, in pharmaceutical research and manufacturing, it may well be a regulatory
requirement that the spectroscopic data is measured specifically to identify and quantify extremely low-
concentration compoundswithin amixture such aswhen reporting toxicmetabolite levels in a product. Here
not only must the compound identification be carried through to the final documentation, it is essential that
chiral information, where appropriate, is correctly reported as this can have a direct bearing on the toxicity
of the compound identified.
The underlying principle here is that data and metadata should not be mixed in the same representation
unless the representations themselves demand it as part of a standard. Providing metadata separate from
data makes the metadata itself significantly more findable. The FAIRSpec Finding Aid allows for any amount
of additional annotation to be associated with structures as described in Section 5.1.

6. Do the best you can with compounds that present unsolved challenges for structure description. Few
structure representations properly represent coordination and dative bonds or allow for multicenter at-
tachments (as for many inorganic and organometallic compounds). Even when they do, it is not always

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1489


obvious how to generate the proper representation formachine readability. At the very least, these structure
representations should be accompanied by an image representation, as it is quite likely that neither SMILES
nor InChI can properly describe them. Other cases exist (such as atropisomerism) where generating suitable
machine-readable representations remains an unsolved challenge. Again, the guiding principle here is that
the best representation is one that conveys the intention of the creator, whatever that representation might
be. To the extent that this can bemore than an image, all the better. For example, if a drawing program is used
to produce the image, provide both the drawing program’s native representation and the image.

7. Use standard descriptions for macromolecules, supplemented with images. Macromolecules such as
proteins are always best represented by established formats such as Protein Data Bank (PDB),32 Macromo-
lecular Crystallographic Information File (mmCIF),33 or BinaryCIF,34 (the latter two being more extensible
and more actively maintained). Additionally, complex biomolecules may be represented in a SMILES-like
linear string using the Hierarchical Editing Language for Macromolecules (HELM),35 a machine-readable
linear notation supported by IUPAC and the Pistoia Alliance.36 Nonetheless, images are welcome additional
representations that can be used to supplement and provide meaningful additional interpretation and
annotation of these formats. As for organic and inorganic polymers, and network solids, we give no specific
guidance other than to provide “appropriate and meaningful” digital representations, whatever that might
mean in the context of the collection, even if that is only an image. Additional metadata can be used to be
more descriptive.

3 Guidance for instrument dataset representations in a FAIRSpec-
ready data collection

In this section, we outline ways in which the instrument dataset itself can be optimized for incorporation into an
IUPAC FAIRSpec Data Collection. It is not for these guidelines to elevate one digital representation over another at
the instrument level. On this point, we refer to the original recommendations for data representations as given in
the FAIR Guiding Principles for scientific data management and stewardship as elaborated by GO FAIR.37

Specifically:
F2. (Findability) Data are described with rich metadata
R1.3. (Reusability) Data should meet domain-relevant community standards
“Data” in the GO FAIR context means much more than just instrument “datasets”. It includes chemical

sample, structure, and analysis representations, as well as all the metadata associated with the collection.
Nonetheless, with these goals in mind, specifically in relation to instrument datasets, we suggest:
1. The original instrument dataset is an important representation. There is no question that the potentially

most important digital representation of an instrument dataset is the one that came from the instrument
itself. If it is practical to provide this primary data, it should be provided. For NMR, this implies that the free
induction decay (FID) is made available.

2. Multiple representations are valued. If we consider the likely prospects for data reuse, no single data
representation will always be the most valuable in all contexts (Table 2). Thus, in the case of NMR spec-
troscopy, just providing proprietary FID data is not generally as reusable as providing the FID data, a
transformed frequency-domain spectrum, and a peak listing.

3. Generally, do not combine a molecular structure with its associated spectrum within a single rep-
resentation. While this might seem to be the obvious thing to do, and it can be handled in certain cases
during metadata extraction, we suggest that it is bad practice to create “hard-coded” relationships between
structure and instrument datasets thatmay ormay not be thefinal story. The IUPAC FAIRSpecData Collection
separates structures from spectra to associate them inmore flexible ways, for example, when the need is for
the structure only. Keeping these aspects separate digitally allows for easier later-stage reinterpretation of
the data.

1490 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


4. Package only one dataset per file. While some digital formats allow combining multiple instrument
datasets for the important work of comparative analysis, to be FAIRSpec-ready, just as for structures, there
should only be one item per representation. For example, it is not sufficient to have a single file that contains
all the spectra in a collection, as each collectionmay have its own specific associations (all the spectra for this
compound, all the spectra of this type, etc.), and the ability to repackage data into (initially unknowable)
different collections is an important aspect of the IUPAC FAIRSpec Data Collection. In the case where one file
contains both a spectrum and structure representations, it is particularly critical that there be only one
spectrum with only one structure (and that both are extractable independently).

Table : Digital dataset representation examples.

Representation type Considerations

Original instrument data (for example, the NMR FID and associated
parameter files)

Benefits: highest integrity with no information loss; high potential for
reuse; allows for alternative and/or automated (re)analysis; potential
for generation of all other representations; allows possibility of fraud
detection; allows the most reliable automated metadata extraction.
Limitations: vendor-specific format may require licensing access to a
vendor-specific reader; may not express important aspects of the
data processing used in the analysis; must contain enough of the key
parameters for further processing; may no longer be documented;
shortest expected lifetime.

Original data exported to an alternative standardized format, such as
JCAMP-DX(FID) NMR-Star or nmrML

Benefits: highest possibility of reuse and interoperability; vendor-
agnostic open format; allows for automated production of additional
representations; recognized by regulators as potentially the longest
expected lifetime.
Limitations: Depending on implementations potential for loss of data
resolution or precision; potential for loss of some metadata fields.

Instrument-processed data such as transformed NMR data in the
form of a spectrum or spectra

Benefits: most generally and immediately informative to the practicing
scientist or educator; can be examined in detail and repurposed.
Limitations: does not allow for early-processing adjustments, such as in
NMR spectroscopy phasing, line broadening or use of non Fourier-
transform methods; may require proprietary software to read; infor-
mation loss compared to original data; less scope for fraud detection.

Third party-processed data Benefits: concise; convenient if this is the standard process in each
laboratory; may include meaningful annotation added by the origi-
nator such as integration or peak identification”; may be the only
method of getting FAIR metadata to be associated with the data.
Limitations: formatmay require proprietary software to read; possibly
limited ability to extract key metadata from proprietary formats; may
disallow alternative analysis; may be more subject to fraud or other
sorts of spectral editing (removal of solvent peaks, for example)

Inline string description (for example, in NMR spectroscopy) a string
describing field strength, solvent, chemical shifts, coupling constants,
and integration

Benefits: concise; a common requirement for publication as part of
the experimental details; distills the essential features of the spec-
trum.
Limitations: minimally informative.

Peak listing or peak table Benefits: easilymachine-readable; possibly all that is needed for some
forms of reuse.
Limitations: minimal semantic information; requires additional
context and metadata for interpretation.

Image Benefits: most immediately identifiable and informative to working
chemists, educators, and students; excellent for accompanying more
robust representations.
Limitations: almost certainly reduced resolution; no additional pro-
cessing possible; susceptible to crude data editing; does not generally
allow for conversion to any of the other representations; not
acceptable as the sole structure representation in a collection; often
cannot represent the details used by automated processing software.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1491


4 Guidance for the organization of digital items in a FAIRSpec-ready
data collection

The essential aspect of a FAIRSpec-ready data collection is that it is organized in a systematic manner that allows
(1) machine-based extraction of key metadata and (2) unambiguous association of specific spectra with specific
chemical structures (to the extent that such an association is possible). More specifically, the data and associated
metadata should, with perhaps a small amount of additional curation, allow for the creation of an IUPAC
FAIRSpec Finding Aid. We briefly introduce this metadata document first and then discuss the key elements of a
FAIRSpec-ready collection.

4.1 The IUPAC FAIRSpec finding aid

An IUPAC FAIRSpec Finding Aid is a document that describes in detail the contents of a spectroscopic collection.
The structure of the document is based on the IUPAC FAIRSpec Metadata Object Model,41 which describes a small
set of abstract objects (for example, “samples”, “structures”, “spectra”, “compounds”, and “analyses”). In addition,
the model describes how the various types of digital representations of these abstract objects (CDXML and MOL
files, instrumental data sets, PDF reports, etc.) are related and how they are to be described by structured
metadata.

A key feature of the IUPAC FAIRSpec Finding Aid is that it is extensible.While it is expected to contain certain
metadata in an IUPAC-specified format, the FAIRSpec Metadata Object Model allows for the addition of whatever
additional metadata is needed, depending upon the context. Thus, a key feature of a FAIRSpec-ready collection is
that it provides for additional (“non-extractable”) metadata in a standardized format. We will see how this
additional metadata can be represented in Section 5.

A FAIRSpec-ready data collection provides a means to create such a finding aid and its associated collection
and landing page via automation.Wewant to be able to pass the FAIRSpec-ready data collection to a software tool
(perhaps a public or private web site or a local software application), that can read the collection’s digital items,
extract the key descriptive and relational metadata, and create what we are calling an IUPAC FAIRSpec Finding
Aid. In the process, the extractormay generate additional representations, such as images of structures, predicted
spectra, or peak listings. The extractor would only be limited by its sophistication, but it could only work if the
FAIRSpec-ready collection is properly organized. This section describes how this organization might be achieved.

The outline of an IUPAC FAIRSpec Finding Aid serialized as JavaScript object notation (JSON) is shown in
Fig. 3. The opening view of a web-based landing page that interprets the JSON is shown in Fig. 4,42 which was
created for a recent publication in inorganic chemistry.43 Without going into extensive detail here, the essence of
an IUPAC FAIRSpec Finding Aid is that it consists of the following parts:
– A mandatory header section that

– identifies the document as an IUPAC FAIRSpec Finding Aid
– identifies related work, such as publications or related data collections
– identifies the target repositories and pointers to local or remote data items
– specifies licensing and other access-related aspects of the collection

– An optional section listing individual samples, each with its own set of representations, key metadata, and
identifier

– An optional section listing individual structures, eachwith its own set of representations, keymetadata, and
identifier

– An optional section listing individual experimentally or computationally derived datasets, each with its
own set of representations, key metadata, and identifier. (The specification allows for predicted, simulated,
and experimental spectra to be represented, as long as they are identified as such in their associated
metadata.)

1492 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


– An optional compounds section making one-to-one, one-to-many, or many-to-one associations among items
on two ormore of the above lists. For example, a collection of sample-structure, sample-spectra, or structure-
spectrum relationships.

– An optional section listing individual structure-spectral analyses, each with its own set of specific repre-
sentations, relating specific spectra and their related structures

– Additional custom sections as needed

Fig. 5 and 6 illustrate the sort of interpretation that a relatively simple web-based landing page might make of
structure and spectrum entries, respectively.

Fig. 4: The top of a web page designed to interpret IUPAC FAIRSpec Finding Aids. In this case, the web page reads the IUPAC FAIRSpec
Finding Aid shown in Fig. 3. Clicking the search link allows text, substructure, and property searching of the collection in relation to
compounds, structures, and spectra.

Fig. 3: The top-level view of an IUPAC FAIRSpec Finding Aid as JSON as displayed in a web browser. The document was created by an
automated process that extracted metadata from a single ZIP file associated with a publication and contains compound-based structure-
spectra associations.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1493


In principle, except for the header, any combination of the additional sections is possible. Thus, a FAIRSpec-
ready collection could be as simple as a single structure representation and a single spectrum or as complicated as
the full set of structures, spectra, and analyses associated with a project or publication.

4.2 The FAIRSpec-ready collection

The automated process of adding an IUPAC FAIRSpec Finding Aid to a FAIRSpec-ready collection creates an IUPAC
FAIRSpec Data Collection, which can then be searched and interpreted as shown above. The goal, then, is to create
well-characterized FAIRSpec-ready collections, whether they be sample- or compound-based collections for use
within a research group during the course of a project, or “final” collections associated with publications. The
guidelines presented here for such FAIRSpec-ready collections have the following features:
1. The most important characteristic of a FAIRSpec-ready data collection is that it is organized sys-

tematically and consistently. In principle, any systematic organization that conveys appropriate re-
lationships can be processed to become an IUPAC FAIRSpec Data Collection with an associated IUPAC
FAIRSpec Finding Aid. Nonetheless, the organizational principles described herein specifically illustrate a
small number of suitable FAIRSpec-ready data collection organizations.

2. Organization will depend upon context. The FAIRSpec-ready Data Collection created on the day a dataset
is generated will likely look quite different from one that ultimately is used in creating the IUPAC FAIRSpec
Data Collection associated with a publication. At the beginning of an endeavor, for example, there is
typically just a physical sample. A spectrum is taken. Theremay be the expectation of a certain structure, but
perhaps not. If nothing else, it is hoped that the spectrum will at least support a hypothesis relating to
chemical structure. Itmay be the case that the structure is truly unknown, and it is only after the spectrum is
analyzed that it is “known”. (More precisely, only with the help of spectroscopy can the structure be
hypothesized.) Thus, an initial collection may be just one or more spectra and their associated sample
identifier. One or more (plausible) structures are added a week later. More spectra are taken. More samples
are created and analyzed. Everything is sample-based at first, but then, later, the key organizing principle
shifts to chemical structure – chemical “compounds”. Ultimately, upon publication, the key organizing
principle will be “compound number in the article,” and only selected spectrawill be included. Importantly,
the underlying data have not changed. (We hope!) Only the organization has changed. And, importantly, the
FAIRSpec-ready collection has probably not changed at all, except for some key metadata.

3. Extractor utilities may impose their own conventions. Individual tools developed to extract metadata
from a data collection to create an IUPAC FAIRSpec Data Collection and its associated IUPAC FAIRSpec

Fig. 5: An example of a structure entry based on the presence of a single CDXML file in the FAIRSpec-ready collection. The automated
extractor has added a PNG image, a MOL file, standard and fixed-H InChIs, InChIKey, SMILES, molecular formula, and links to interactive
predicted spectra.

1494 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


Finding Aid may develop their own specific requirements that go beyond what is suggested here. However,
an FAIRSpec-ready data collection must follow the conventions described here. Metadata tools that refer to
themselves as “FAIRSpec-ready extractors”may enforce occurrences of “should” in what follows as “must”
but may not impose additional restrictions that contradict these conventions. For example, an extractor
might allowUnicode characters outside the specified range for identifiers, but if it does, itmust change those
characters to the allowed set for identifiers described for IUPAC FAIRSpec Data Collections.

Fig. 6: A data entry including an x-ray structure determination and an NMR spectrum with several representations. IFD Properties are
extracted from the FAIRSpec-ready collection automatically in the process of creating the IUPAC FAIRSpec Finding Aid. PDF representations
are viewable within the web browser. (Interactive spectral representations might also be available via a web browser, but that feature was
not implemented in this particular case.)

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1495


4. Unique identifiers should be used for samples, structures, and datasets. One characteristic of the IUPAC
FAIRSpec Finding Aid is that it involves “pointing” to digital representations. This is analogous to file names in a
file system – and, for that matter, may be exactly that. For example, a compound might have a name “C3” as its
identifier; an NMR dataset might be referred to as C3_H1_NMR. It is not important that these identifiers be
semantic – that is, that they conveymeaning, such as this example does – and, in fact, it is often desirable that they
not be descriptive. They simply need to be unique. The following guidelines for identifiers should be followed:
a. Characters used throughout the metadata should be represented in the UTF-8 character set. This is the

standard encoding on the web and in many software applications, allowing for the full range of
international language characters.

b. Characters specifically in relation to identifiers should be limited to 7-bit ASCII characters. IUPAC FAIRSpec
Finding Aids utilize string-based identifiers for cross-referencing digital objects. For flexibility of pro-
cessing, identifiers should only utilize simple alphanumeric [A-Za-z0-9] ASCII characters along with a
limited set of punctuation, namely “+-’.,_()[]”. Single spaces are allowed only if not leading or trailing;
multiple sequential spaces characters, tab, and new-line characters are not allowed within identifiers.

5. The organization of files should reflect the types of spectroscopy used. Subdirectories such as “NMR”,
“IR” (infrared), “UVVIS” (ultraviolet–visible spectroscopy), and “HRMS” (high-resolution mass spectrom-
etry) can assist in both human andmachine readability. These specific examples of analytical techniques are
not meant to be exclusive. The IUPAC FAIRSpec Finding Aid in principle can describe any spectroscopic (or
non-spectroscopic) dataset. The key for FAIRSpec-ready collections is that whatever techniques are
described, they are described consistentlywithin the collection.We focus primarily onNMR spectroscopy in
this report simply because we have had more experience with that in our working group to date.

6. Initially, data should be organized by unique sample identifiers. If a collection is being created at a stage
in the research endeavor when the compound’s structure has not been identified, the appropriate associ-
ation is to a specific sample. This might be a unique identifier automatically created by an ELN or some
encoding used by a research group internally. For example, “RMH-III-23.rf-0.65” might be sufficient
immediately after chromatographic isolation, with no structure representations provided.
Thus, we might have a sample-based data collection organized by instrumental method and sample ID:

NMR/
RMH-IV-13b/
RMH-IV-13c/
RMH-IV-13d/

IR/
HSR-II-112/
HSR-II-113/
HSR-II-114/

or, alternatively, the other way around:

RMH-IV-13b/
NMR/
IR/

RMH-IV-13c/
NMR/
IR/

RMH-IV-13d/
NMR/
IR/

1496 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


HSR-II-112/
NMR/
IR/

HSR-II-113/
NMR/
IR/

HSR-II-114/
NMR/
IR/

etc.,

where unique laboratory sample identifiers are used instead of compound identifiers, because at the stage
in the process, that is all that is available. Whatever system is used, it must be systematic, with unique
identifiers (think “file paths”) for each item in the collection.
Alternatively, if an instrument vendor or ELN provides a means of associating a sample ID with a spectrum,
this should be used, as it will provide a well-characterized method of automatically associating a sample ID
with spectral data. The common method of working a sample ID into a title field is not advised, because,
though it might be easily human readable, co-opting a title field to indicate something as important as a
sample ID is fundamentally unsound.

7. At the point of publication, some representative subset of the sample-based FAIRSpec-ready
collection will be used to make a publication-oriented compound-based FAIRSpec-ready collection.
This allows for the natural relationship of “structure of a compound”, and “spectrum of a compound”,with a
direct relationship to typical publications in the field. For example:

1a/
NMR/

1H-NMR/
13C-NMR/

2a/
NMR/

1H-NMR/
13C-NMR/

Where now “1a” and “2a” are references to compounds described in a specific publication. Here we have
unique identifiers of the form 2a/NMR/1H-NMR/xxxx. The file system-like hierarchy provides unique
identifiers for all representations. Note that the file paths shown in this example are themselves semantic.
Thefile path itself is conveying the relationship between a structure and spectrum that will become codified
in the IUPAC FAIRSpec Finding Aid. This is not absolutely necessary, but it is advisable both in terms of
human readability and ease of machine processing. Only selected representative spectra will be included,
often from more than one sample of a given compound.

Organizing by compound or sample identifier could be facilitated by ELNs that allow such modes. We
feel that the benefits of doing so, including the ELN-based creation of the IUPAC FAIRSpec Finding Aid based
on chemical compounds, would be considerable. This principle could also be included in the specifications
for future designs of “IUPAC FAIRSpec-Compliant” ELNs.

8. The organization of files should minimize the duplication of information. Data collections are often
used inmultiple contexts. Ongoing research collections are generally sample-based and created long before
any knowledge of final publication compound numbers. Data published for one manuscript, with a specific

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1497


set of compound numbers, might also be referred to in another manuscript, with different compound
numbers. If maximum flexibility is desired, paths such as “3a/3a-NMR/3a-1HNMR” should be avoided, using
simply “3a/NMR/1HNMR” instead. In thisway, if “3a” is not thefinal compoundnumber in the publication, or
the data are copied to a collection for a different publication, there is only one directory to rename.

9. Associating a FAIRSpec-ready collection with chemical structure is a straightforward process. At
whatever point in a research program the identity of an isolated or reference compound is established, the
association of spectrum with a sample can remain, but an additional association with a chemical structure
becomes appropriate. This association can be simplicity itself. All that is needed is awell-crafted (see Section
2) CDXML or MOL representation. The next several points discuss simple methods to accomplish this
association.

10. Structure representations can be added along the path to a dataset. One of the simplest ways to
associate structures with spectra is to place a structure representation within the directory path leading to
the spectral data. Thus, using the example given above, we might have:
RMH-IV-13b/

structure.cdxml
NMR/
IR/
UVVIS/
HRMS/

RMH-IV-13c/
structure.cdxml
NMR/
IR/
UVVIS/

RMH-IV-13d/
structure.cdxml

… etc.
This is enough to make the key association between the structure and spectrum in simple cases and has the
advantage that changes in structure assignment are trivial to implement. Just replace the file.

11. Structure representations can be integrated into datasets if desired. Some vendors allow one or more
structure files to be added to an instrument dataset. For example:

RMH-IV-13b/
NMR/

NMR-2024.04.15a/
structure.mol

NMR-2024.04.15b/
structure.mol

…etc.
This can work, but it suffers from the issue that there is potential for considerable unnecessary duplication.
Should the structural hypothesis bemodified, thenwe have the error-prone challenge ofmodifying all of the
structural representations within this directory path as well. Nonetheless, this is a standard way to intro-
duce structures into datasets that is recognized also by some vendors within their own software. Thus, a
structure introduced for display in the vendor’s system can be effortlessly introduced also into an IUPAC
FAIRSpec Finding Aid.

12. Structure representations can be placed in a parallel set of directories. Perhaps the cleanest way to
associate structureswith spectra is to keep them separate but to provide associated identifiers. This could be
compound based, as in:

1498 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


data/
RH0013/

NMR/
IR/

RH0014/
NMR/
IR/

… …

structures/
RH0013/

structure.cdxml
RH0014/

structure.cdxml

…

This method has the advantage that there is no duplication. It is particularly useful for compound-based
collections. Unique local lab-based compound identifiers (e.g., RH0013) make the metadata connection
between structures and spectra. Amodification of a structure will be registered for all its associated spectra
automatically. At some later date, a correlation can be made between the lab-based compound identifiers
and the compound numbering used in a specific publication.

13. Compound associations that are mixtures preferably should identify as a mixture using a “+” sign
and include separately identified structure representations, one for each component of themixture,
within a “structures” subdirectory. Thus:

3c+3c’/
structures/

3c.cdxml
3c’.cdxml

NMR/
…

This is enough tomake it clear that themultiple CDXMLfiles are of different structures, as opposed to simply
alternative representations of the same compound.

14. Alternatively, and less preferred, if the context demands, isomeric mixtures may be described by
ambiguous structure representations. In special cases, there might be little or no practical relevance of
the exact stereochemistry of a structure. In that case, no special effort should be expended to detail it. For
example, if the manuscript refers to “Compound 3c” as having a tetrahydopyranyl protecting group (which
has a stereocenter) and no analysis was carried out that determined the stereochemistry of this group, a
single CDXMLfilewith no stereochemistry indicated (just the nickname “THP”) can be provided. However, if
the manuscript refers to “Compound 3c” as a “3:1 mixture of E/Z isomers”, then two separate CDXML files,
one Z and one E must be provided. It is not appropriate to show a single structure with ambiguous notation
or to give structures for both isomers in the same file. We need the extraction process to identify both
isomers independently, without the need to search files for multiple structures. It is not necessary to
describe the extent of the mixture within the structural context. Metadata can be added later to the
compound association itself annotating the specifics of the mixture. Future guidance may allow for a more
systematic way of describing mixtures, for example, with a mixtures InChI (MInChI)44, 45 or using a method
designed specifically for IUPAC FAIRSpec Finding Aids.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1499


5 Addition of curated metadata to a FAIRSpec-ready collection

5.1 Internal metadata records

While structure and instrument data representations in a spectroscopic data collection can be significant sources
for the automated extraction of descriptive metadata, not all descriptive metadata can be gathered this way. In
addition, relationalmetadata records, such as related sample identifiers, are less easily produced or extracted. In
addition, by its very nature, a relational metadata record is not associated with one specific digital item. Thus, its
proper place is not “within” any of the digital items it relates to. For example, we have seen how a file structure
itself can be the basis of relational metadata. (“This structure is associated with this spectrum because they are in
the same directory.”) But, in general, addition of relational metadata will require curation.

Ultimately, key/valuemetadata associatedwith an IUPAC FAIRSpecData Collection (not just a FAIRSpec-ready
one) will be expressed primarily in terms of the IUPAC FAIRData (IFD) Model-specified standard. Thus, for
example, descriptive metadata associated with an NMR experiment in an IUPAC FAIRSpec Finding Aid might
appear as shown in Fig. 7, where the full key name for nmr.expt_solvent, for example, is

IFD.property.dataobject.fairspec.nmr.expt_solvent.
It is not expected that a FAIRSpec-ready collection utilizes such standardized keys. And, in fact, it is preferred that
metadata such as these, that can be extracted automatically from an instrumental dataset via automation, not be
provided explicitly in a FAIRSpec-ready collection at all. For example, the solvent in the originating primary
dataset was given as “MeOD”. The extraction software has matched that to an InChI, a SMILES, and a common
name, improving searchability of the collection.

Fig. 7: Descriptive metadata associated with a Bruker NMR instrument dataset as found in JSON format in an IUPAC FAIRSpec Finding Aid.
Several of the values are generated during metadata extraction, including additional solvent identifiers and the spectrometer nominal
frequency. Note that units for numerical values are given in the specification of the finding aid, not the finding aid itself. The precision of
these numbers does not necessarily represent the experimental precision due to limitations of the JSON format as well as limitations of the
originating vendormetadata. For an unambiguous description of data precision, onemust inspect the data objects themselves, not just the
finding aid.

1500 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


It is important to understand that the IUPAC FAIRSpec Finding Aid allows for both standardized and “ad hoc”
properties (listed in an IUPAC FAIRSpec Finding Aid as “attributes”). Thus, additional metadata that does not fit
easily into the standard can always be added. (Suchmetadata simply would not have any IFD.property prefix.) To
the extent that ad hoc metadata key/value pairs can be mapped to current or future IUPAC FAIRData Standard
pairs, that mapping would be done later, during the automated or semi-automated curation process of metadata
extraction. Examples of attributes from a third-party software representation of a spectrum are shown in Fig. 8.

We provide here suggestions for adding additional ad hoc metadata based on our implementation tests.
1. Consistency is important. In all cases, identifiers and keys should be unique and consistently expressed

(including spelling and capitalization). Values should use a consistent, if not standardized, vocabulary.

Fig. 8: Attributes extractable from a third-party vendor thatmay ormay not have equivalences as IFD.property items. Some of these values
correlate directly with IFD dataset properties, butmany do not. Though not standardized, these additionalmetadatamay be valuable within
certain contexts.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1501


2. Point-specific metadata files can provide additional descriptive or relational metadata not easily
otherwise conveyed.Key: valuemetadata pairs can be listed in a simple textfile accompanying structures or
data within a collection. For instrument datasets, this is already the practice of some spectrometer vendors,
who have adopted a format resembling the IUPAC JCAMP-DX format, storingmetadata specific to that dataset
in the form of “private” ##$KEY=VALUE pairs. A generalized format could just be a collection of single lines of
the form key=value, (popularized in Java properties files) where the keys should be systematically defined
and consistent throughout the collection. For example, If the file metadata.properties (in the form of a Java
properties file) containing only

sample_id=RMH-IV-23c

were added to an instrument dataset, then later automated curation could codify that reference as an
IFD.property.dataobject.originating_sample_id and use that to create an IUPAC FAIRData Sample object with
IFD.property.sample.id “RMH-IV-23c”, thus associating the specified sample with this dataset.
Such a file (perhaps created automatically by an ELN) for a specific structure representation might contain
additional metadata such as:

smiles=CC1=CCC(CC1)C(=C)C
inchi=InChI=1S/C10H16/c1-8(2)10-6-4-9(3)5-7-10/h4,10H,1,5–7H2,2–3H3
chebi_id=CHEBI:15384
mf=C10H16

that could be used to complement automated extraction. In addition, such explicitly added metadata can be
used to validate accompanying structure representations such as MOL or CDXML files, confirming that the
machine reading of a structure has been successful.
Such a file containing

structure_stereochemistry=relative
contained in a structure directory or even in one of the top levels of the collectionwould be enough to convey
the understanding that all chiral structures unless otherwise indicated are to be considered racemic.

3. Awell-organized “primary” spreadsheet can efficiently conveymetadata relationships.Metadata items
are essentially key:value pairs. A convenientway to represent such pairs is the standard spreadsheet practice
where column headers are keys, and each row contains the set of values associated with a given item
(generally identified in the first column). Particularly for a whole collection, a primary internal metadata
record in the form of a spreadsheet can be efficient. Standard open file formats such as CSV (comma-
separated values),46 TSV (tab-separated values),47 ODS (OpenDocument spreadsheet),48 or XLSX (Office Open
XML SpreadsheetML file format),49 are easily extractable via automation for the metadata they contain.

Thus, we might have something like what is shown in Fig. 9, associating publication compound identi-
fiers with lab-local identifiers, database identifiers, and repository PIDs. When (or if) incorporated into an
IUPAC FAIRSpec Finding Aid, these properties would be either mapped to standardized IUPAC FAIRSpec
metadata keys or included as additional properties.

5.2 Registered metadata records

Quite possibly, each collection ultimately might be associated with a primary registered metadata record con-
forming to a declared schema such as the DataCite Schema.50 This metadata record allows the connection to be
made to additional metadata records as appropriate both within the collection and to associated works. These
metadata records should be as chemically rich as possible. An example of how a single DOI reference can be used
to generate a full IUPAC FAIRSpec Finding Aid for a collection is given on the FAIRSpec GitHub pages.51

1502 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


Registering a metadata record for individual instrumental datasets allows for the formation of a unique PID
to be associated with individual parts of the dataset, enabling a higher probability of findability via automated
search engine “bots”. Such registered records should include the unique path or the database reference to the
exact location of the dataset itself - whether located on an institutional, specialist, or generalist repository – thus
enabling potentially widespread accessibility to the dataset itself. This also allows for distributed data storage,
and it also can include selective privacy settings for both pre- and post-publication.

The criteria for selecting an appropriate repository for registering the dataset metadata record should
include some consideration of whether the entry of rich chemical metadata is adequately supported either by the
repository human user interface or by a repository application programming interface (API). APIs are essential
for use by automated systems such as an ELN. The repository should include processes for automatic generation
and then registration of the primary metadata record with an appropriate authority, where the master copy will
be kept and indexed to facilitate findability. Any local repository copy of the master metadata record should
automatically be kept synchronized with the registered version.

Ideally, the repository would offer an option to produce IUPAC FAIRSpec Finding Aids for its various col-
lections, and these would also be registered with an agency as part of the overall collection. If the repository is
associated with a database, it could also produce specialized IUPAC FAIRSpec Finding Aids in response to search
queries as a way of standardizing API calls among various repositories, building them only as needed in response
to queries.

Amore in-depth discussion of metadata registration optimized for spectroscopy will be presented in a future
publication that discusses specific implementations that we have worked on during the development of these
guidelines.

6 Guidance for managers of laboratory instrument management
systems and electronic laboratory notebooks

Facilities implementing laboratory instrument management systems (LIMS) and/or ELNs require specific guid-
ance. When a laboratory user creates a spectroscopic dataset within an electronic tracking environment, they
may not have (or want!) access to the raw data files themselves. In addition, in principle, these systems are
generally amajor part of theworkflowall theway fromacquisition to publication (and possibly beyond, if a public

Fig. 9: A page in a spreadsheet with metadata that can be extracted using automation for additional metadata attributes associated with
compounds in a published IUPAC FAIRSpec Data Collection. The spreadsheet provides both human- and machine-readable content.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1503


institutional data repository is also part of the system). In such cases, it may be that an experimentalist might
hand-curate a collection by retrieving datasets and structures, creating “personal” file-based collections such as
described in Section 4.2. But that is not ideal.

Effectively, a LIMS or ELN presents a unique opportunity, with data-management capabilities potentially
already maintaining a FAIRSpec-ready collection. Spectra are keyed to sample IDs; sample provenance is well
described. Structures are allowed to be associated with spectra by internal database associations. Metadata
intended for local searching of the database is likely already being extracted. All of the necessary components are
there. What is needed is a mechanism for generating an IUPAC FAIRSpec Finding Aid automatically from such a
system.

Managers and developers of such systems are encouraged to read these guidelines, provide guidance for
future IUPAC efforts in this area, and ensure that their systems are “FAIRSpec-ready” – that they maintain
sufficient capabilities to export sample, structure, and spectroscopic metadata along with the spectroscopic data
itself in amanner that is consistent with these guidelines. In addition, these systems need to allow for appropriate
inputs. For example, users need to be trained and encouraged to create conforming structural representations (as
described in Section 2) so that when the time comes, a clear and unambiguous association can be made between
structure and spectrum.

Software development in this area will be important and potentially quite valuable. A LIMS or ELN that
exports an IUPAC FAIRSpec Finding Aid that includes API references to data objects in its database can signifi-
cantly improve the system’s potential for overall findability, accessibility, interoperability, and reusability, both
at the private/local and public/global levels. Thus, an ELN could automatically create local IUPAC FAIRSpec
Finding Aids. These could provide real-time snapshots of the stored sample-based data focused on specific
compounds or reactions, research groupmembers, or any other association that can be queried. To the extent that
these references are later redirected to a public institutional or generalist repository database, publication-ready
compound-based products such as full data packages with accompanying IUPAC FAIRSpec Finding Aids could be
generated automatically. In fact, provided the references are to a public repository, all that would need to be
published alongside amanuscriptwould be the PID of a landing page for an IUPAC FAIRSpec Finding Aid. The data
would never have to be “collected” in the form of a monolithic data file at all.

Finally, LIMS and ELN production of IUPAC FAIRSpec Finding Aids for local use can be the basis for
straightforward exchange of data and metadata among otherwise disparate systems, since an IUPAC FAIRSpec
Finding Aid can express any additional metadata that is needed, not just “IUPAC-defined” metadata.

7 Summary

We have provided a set of guidelines for the FAIR management of spectroscopic data collections specific to the
domain of chemistry. These guidelines cover the generation of machine-readable structure and dataset repre-
sentations, the organization of what we refer to as the FAIRSpec-ready collection, and suggestions for ways to
incorporate additional metadata into the collection. This collection optimally would be able to be curated
automatically by software to create an IUPAC FAIRSpec Finding Aid.

We have emphasized that private local FAIRSpec-ready collections can be developed automatically or semi-
automatically concurrently with laboratory research, not just at the time of publication. The potential benefits of
this “best practice” include the ability on an ongoing basis, with or without use of an ELN or LIMS, structure or
substructure searches for related spectroscopic data, the ability to filter a collection for spectra of a certain sort or
with a given set of property values, andmany other possibilities. In fact, the local benefit of even themostminimal
curation (just associating instrument datasets and analysis with specific samples or chemical structures) can be
significant.

Researchers, authors, and data managers do not have to wait until the overall IUPAC FAIRSpec Metadata
Model is formalized or do they ever have to become experts in its implementation to benefit from these simple
guidelines. The key premise here is that a small amount of forward-thinking organization and consistency can go
a long way to optimizing the findability, interoperability, accessibility, and reusability of spectroscopic data

1504 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections


collections. As a bonus, publishing of IUPAC FAIRSpec Data Collections and their associated IUPAC FAIRSpec
Finding Aids allows for longer-term and more diverse exposure for both researchers and their publications and
allows for the future reuse of data for purposes in creative ways not imagined by their originator.

Research ethics: Not applicable.
Informed consent: Not applicable.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and
approved its submission.
Use of Large Language Models, AI and Machine Learning Tools: None declared.
Conflict of interest: The authors state no conflict of interest.
Research funding: None declared.
Data availability: Not applicable.

Appendix: Terminology

In this appendix, we refer to several terms that have multiple meanings in common usage but specific meanings
within this context:

Compound association

Ametadata object with a unique identifier associating one ormore spectra with one ormore structures within an
IUPAC FAIRSpec Data Collection.
Note 1: Definitions of “compound” abound. The National Cancer Institute defines a compound as “a substance
made from two ormore different elements that have been chemically joined”.52 PubChemdescribes a “Compound
record” as pointing to “at least one Substance record”.53 The IUPAC Gold Book defines a “racemic compound” as a
“crystalline racemate in which the two enantiomers [chirally related “molecular entities”] are present in equal
amounts in a well-defined arrangement within the lattice of a homogeneous crystalline addition compound”.54

Common parlance in published works in the area of organic and inorganic chemistry refer to “Compound XXX,
which was a mixture of diastereomers” and “pure compounds” (implying “a compound” can be “impure”). A
search at an international patent site for “polymer compound” within the “front page” field returns over 1300
results.55 Notice that none of these definitions define specifically what makes one compound different from
another. None answer some of the most basic questions: Can a compound be a mixture of compounds? Is a
mixture of polymers a compound? We opt in these guidelines for a practical, inclusive definition of compound
specifically in the context of compound associations.
Note 2: The unique identifier may be as simple as a sequential number – 1, 2, 3,…. Alternatively, the originator of
the collection might choose to provide a meaningful identifier within the context of the collection. For example,
the compound association ID might refer to a compound number in a related publication (“2a”) or a mixture of
compounds (“2a+2b”) associated with one or more spectra. Within an association that relates to more than one
structure, the structure representations referred to in the association will provide the details of that
relationship – whether these are diastereomers, enantiomers, or completely constitutionally different
compounds. If a mixture, regardless of the ID, the structure representations will provide the details of that
relationship – whether these are diastereomers, enantiomers, or completely constitutionally different
compounds.

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1505


Dataset

A digital item or a collection of digital items that is derived from laboratory analysis (see Instrumental Dataset,
below) or from computation. The two general forms of computed datasets are spectroscopic predictions based on
one ormore proposed chemical structures (such as can be generated at the nmrdb.orgwebsite56) and simulations
based on experimental or predicted parameters, such as NMR frequencies, chemical shifts, and coupling
constants.

Digital item

A collection of bytes that may be local to a digital collection or may be part of a remotely accessed collection.
Note 1: Digital items are commonly implemented as “files” on a filesystem or named items within a compressed
archive (for example, an archive with ZIP, TAR, or TGZ format).

Digital object

A digital item that has associated metadata.
Note 1: In the current context, we make the distinction between the digital items in FAIRSpec-ready data
collections that do not have associated IUPAC FAIRSpec metadata, and digital objects in IUPAC FAIRSpec Data
Collections, which do.

Instrument dataset

A dataset that comprises the raw or minimally transformed data arising from laboratory analysis.
Example 1: A file directory containing NMR FID data or real- and imaginary-valued transformed data, parameter
files, and accompanying metadata.
Example 2: A ZIP archive of such a file directory.

IUPAC FAIRSpec data collection

A collection of digital objects referenced by an IUPAC FAIRSpec Finding Aid and consisting of one or more
representations relating to sample, structure, spectral data, or analysis, and conforms with specifications as
determined by IUPAC.
Note 1: An IUPAC FAIRSpec Data Collection is generally sample-based or compound-based, depending upon its
contexts. Sample-based collections make clear associations between laboratory samples and spectral data on a
one-to-many basis. Compound-based collections associate specific chemical structures with spectral data on a
many-to-many basis.
Note 2: An IUPAC FAIRSpec Data Collection need not be limited to the types of representationmentioned in Note 1.
Any sort of data or metadata may be included. For example, data need not be limited to experimental
spectroscopic data. Results of calculations and simulations can be part of the collection. The expectation, however,
is that such results are ultimately relevant to spectroscopy.

1506 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections

http://nmrdb.org


IUPAC FAIRSpec Finding Aid

A digital item consisting of metadata that summarizes the contents of an IUPAC FAIRSpec Data Collection listing
and making associations among representations relating to sample, structure, spectral data, and analyses, and
conforms with specifications as determined by IUPAC.
Note 1: In general, afinding aid is a document that assists users in locating items in a digital archive. Awidely used
standard developed by the US Library of Congress exists for archive-related digital finding aids and is used
extensively throughout the archival community.57 We extend that concept to locating digital items in a
spectroscopic data collection.
Note 2: An IUPAC FAIRSpec Finding Aid need not be limited to IUPAC-definedmetadata. The format allows for any
amount of additional metadata to be included.
Note 3: An IUPAC FAIRSpec Finding Aid may contain representations as well as metadata. The format allows for
reasonably short representations such as character strings, images, and individual-spectrum PDF documents to
be integrated directly into the finding aid for better findability and reusability.

Persistent identifier

A publicly registered character string providing a long-lasting reference to a digital object and its associated
metadata.
Example 1: 10.1515/pac-2021-2009

Representation

A digital object present within a collection and identifiable using an identifier that is unique within the context of
the collection.
Note 1: Representations may be individual digital items within a collection referenced by the IUPAC FAIRSpec
Finding Aid. Alternatively, they can be character strings (including Base64-encoded58 byte arrays) that are
contained within the finding aid itself.
Example 1: A MOL file may be present within the collection and have the unique file name “3aa/structures/
3aa.mol”
Example 2: A PNG (portable network graphics) image may reside as a file or be included within an IUPAC
FAIRSpec Finding Aid as a Base64-encoded string so that is more easily accessible or because it has been extracted
from a more complex dataset.
Example 3: A SMILES string representation such as “C(O)CCC” need not be in a file of its own; it can be provided in
the IUPAC FAIRSpec Finding Aid as the data value of an IFD.representation.structure.SMILES representation.

List of abbreviations

API Application programming interface
ASCII American standard code for information interchange
CDX ChemDraw exchange format
CDXML ChemDraw XML format
DOI Digital object identifier
ELN Electronic laboratory notebook
ESI Electronic supplementary information
FAIR Findable, accessible, interoperable, and reusable
FID Free induction decay
HELM Hierarchical editing language for macromolecules
HRMS High-resolution mass spectrometry

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1507


IFD IUPAC FAIRData
IR Infrared
JCAMP-DX Joint Committee on Atomic and Molecular Physical Data-Data Exchange format
JSON JavaScript object notation
LIMS Laboratory instrument management system
MDL Molecular Design Limited
MInChI Mixtures InChI
mmCIF Macromolecular crystallographic information file
MOL Molfile
NMR Nuclear magnetic resonance
ODS OpenDocument spreadsheet
PDF Portable document format
PID Persistent identifier
PNG Portable network graphics
SDF Structure-data file
SMILES Simplified molecular input line entry system
URL Uniform resource locator
UTF-8 Unicode transformation format – 8-bit
UVVIS Ultraviolet–visible spectroscopy
XLSX Office Open XML SpreadsheetML
XML Extensible markup language

References

1. Development of a Standard for FAIR Data Management of Spectroscopic Data. Project Details. https://iupac.org/project/2019-031-1-024
(accessed 2025-07-29).

2. IUPAC/IUPAC-FAIRSpec GitHub project. https://github.com/IUPAC/IUPAC-FAIRSpec (accessed 2025-07-29).
3. Several Examples of IUPAC FAIRSpec Finding Aids and Their Associated Landing Pages Can Be Found at IUPAC/IUPAC-FAIRSpec GitHub

Project Site Web Pages. https://iupac.github.io/IUPAC-FAIRSpec (accessed 2025-07-29).
4. Hanson, R. M.; Jeannerat, D.; Archibald, M.; Bruno, I. J.; Chalk, S. J.; Davies, A. N.; Lancashire, R. J.; Lang, J.; Rzepa, H. S. IUPAC Specification

for the FAIR Management of Spectroscopic Data in Chemistry (IUPAC FAIRSpec) – Guiding Principles. Pure Appl. Chem. 2022, 94 (6),
623–636. https://doi.org/10.1515/pac-2021-2009.

5. Chapter II: Proposal Preparation Instructions - Proposal & Award Policies & Procedures Guide (PAPPG) (NSF 23-1) | NSF – National Science
Foundation. 2023. https://new.nsf.gov/policies/pappg/23-1/ch-2-proposal-preparation (accessed 2025-07-29).

6. Dissemination and Sharing of Research Results |NSF – National Science Foundation. https://www.nsf.gov/funding/data-management-plan#:
∼:text=funded%20investigators%20are%20expected%20to%20share (accessed 2025-07-29).

7. CHE Data Management and Sharing Plan Guidance. https://www.nsf.gov/mps/che/data-management-sharing-plans (accessed 2025-07-
29).

8. Data sharing: Guidance for Best Practice and Reproducibility of Experimental Data, Royal Society of Chemistry https://www.rsc.org/
publishing/publish-with-us/publish-a-journal-article/data-sharing (accessed 2025-07-29).

9. Research Data Sharing, Springer Nature https://support.springernature.com/en/support/solutions/folders/6000238326 (accessed 2025-
07-29).

10. ACS Research Data Guidelines https://researcher-resources.acs.org/publish/data_guidelines#organic_chem_data (accessed 2025-07-
29).

11. US Government Code of Federal Regulations 21 CFR 11.10 Electronic Records – Controls for Closed Systems https://www.ecfr.gov/current/title-
21/chapter-I/subchapter-A/part-11/subpart-B/section-11.10 (accessed 2025-07-29).

12. Patel, K. T.; Chotai, N. P. Documentation and Records: Harmonized GMPRequirements. J. Young Pharm. 2011, 3 (2), 138–150. Available as a
National Library of Medicine online article as https://pmc.ncbi.nlm.nih.gov/articles/PMC3122044 (accessed 07 29, 2025).

13. Wilkinson, M. D.; Dumontier, M.; Aalbersberg, Ij. J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L. B.;
Bourne, P. E.; Bouwman, J.; Brookes, A. J.; Clark, T.; Crosas, M.; Dillo, I.; Dumon, O.; Edmunds, S.; Evelo, C. T.; Finkers, R.;
Gonzalez-Beltran, A.; Gray, A. J. G.; Groth, P.; Goble, C.; Grethe, J. S.; Heringa, J.; ’t Hoen, P. A. C.; Hooft, R.; Kuhn, T.; Kok, R.; Kok, J.;
Lusher, S. J.; Martone, M. E.; Mons, A.; Packer, A. L.; Persson, B.; Rocca-Serra, P.; Roos, M.; van Schaik, R.; Sansone, S.-A.; Schultes, E.;
Sengstag, T.; Slater, T.; Strawn, G.; Swertz, M. A.; Thompson, M.; van der Lei, J.; van Mulligen, E.; Velterop, J.; Waagmeester, A.;
Wittenburg, P.; Wolstencroft, K.; Zhao, J.; Mons, B. The FAIR Guiding Principles for Scientific DataManagement and Stewardship. Sci. Data
2016, 3 (1), 160018. https://doi.org/10.1038/sdata.2016.18.

14. The FAIR2 Specification https://www.fair2.ai/specification (accessed 2025-07-29).
15. DOI Foundation, https://www.doi.org (accessed 2025-07-29).

1508 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections

https://iupac.org/project/2019-031-1-024
https://github.com/IUPAC/IUPAC-FAIRSpec
https://iupac.github.io/IUPAC-FAIRSpec
https://doi.org/10.1515/pac-2021-2009
https://new.nsf.gov/policies/pappg/23-1/ch-2-proposal-preparation
https://www.nsf.gov/funding/data-management-plan#:~:text=funded%20investigators%20are%20expected%20to%20share
https://www.nsf.gov/funding/data-management-plan#:~:text=funded%20investigators%20are%20expected%20to%20share
https://www.nsf.gov/mps/che/data-management-sharing-plans
https://www.rsc.org/publishing/publish-with-us/publish-a-journal-article/data-sharing
https://www.rsc.org/publishing/publish-with-us/publish-a-journal-article/data-sharing
https://support.springernature.com/en/support/solutions/folders/6000238326
https://researcher-resources.acs.org/publish/data_guidelines#organic_chem_data
https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11/subpart-B/section-11.10
https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11/subpart-B/section-11.10
https://pmc.ncbi.nlm.nih.gov/articles/PMC3122044
https://doi.org/10.1038/sdata.2016.18
https://www.fair2.ai/specification
https://www.doi.org


16. Tremouilhac, P.; Nguyen, A.; Huang, Y.-C.; Kotov, S.; Lütjohann, D. S.; Hübsch, F.; Jung, N.; Bräse, S. Chemotion ELN: an Open Source
Electronic Lab Notebook for Chemists in Academia. J. Cheminf. 2017, 9 (1), 54. https://doi.org/10.1186/s13321-017-0240-0.

17. DataCite: Connecting Research, Advancing Knowledge. https://www.datacite.org (accessed 2025-07-29).
18. Dalby, A.; Nourse, J. G.; Hounshell, W. D.; Gushurst, A. K. I.; Grier, D. L.; Leland, B. A.; Laufer, J. Description of Several Chemical Structure

File Formats Used by Computer Programs Developed atMolecular Design Limited. J. Chem. Inf. Comput. Sci. 1992, 32 (3), 244–255. https://
doi.org/10.1021/ci00007a012.

19. CTfile Formats. https://www.daylight.com/meetings/mug05/Kappler/ctfile.pdf (accessed 2025-07-29).
20. CDX Format Specification. https://iupac.github.io/IUPAC-FAIRSpec/cdx_sdk (accessed 2025-07-29).
21. Weininger, D. SMILES, A Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf.

Model. 1988, 28 (1), 31–36. https://doi.org/10.1021/ci00057a005.
22. Daylight Theory: SMILES. https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html (accessed 2025-07-29).
23. OpenSMILES Specification. http://opensmiles.org/opensmiles.html (accessed 2025-07-29).
24. Heller, S.; McNaught, A.; Stein, S.; Tchekhovskoi, D.; Pletnev, I. InChI - the Worldwide Chemical Structure Identifier Standard. J. Cheminf.

2013, 5 (1), 7. https://doi.org/10.1186/1758-2946-5-7.
25. IUPAC SMILES+ Specification. IUPAC | International Union of Pure and Applied Chemistry. https://iupac.org/project/2019-002-2-024

(accessed 2025-07-29).
26. InChI Requirements for Representation of Organometallic and Coordination Compound Structures. IUPAC | International Union of Pure and

Applied Chemistry. https://iupac.org/project/2009-040-2-800 (accessed 2025-07-29).
27. Enhanced recognition and Encoding of Stereoconfiguration by InChI Tools. IUPAC | International Union of Pure and Applied Chemistry.

https://iupac.org/project/2019-017-2-800 (accessed 2025-07-29).
28. Brecher, J. Graphical Representation Standards for Chemical Structure Diagrams (IUPAC Recommendations 2008). Pure Appl. Chem. 2008,

80 (2), 277–410. https://doi.org/10.1351/pac200880020277.
29. Brecher, J. Graphical Representation of Stereochemical Configuration (IUPAC Recommendations 2006). Pure Appl. Chem. 2006, 78 (10),

1897–1970. https://doi.org/10.1351/pac200678101897.
30. Committee on Publications and Cheminformatics Data Standards. https://iupac.org/body/024 (accessed 2025-07-29).
31. Apodaca, R. L. Stereochemistry and the V2000 Molfile Format, 2021. http://depth-first.com/articles/2021/12/29/stereochemistry-and-the-

v2000-molfile-format (accessed 2025-07-29).
32. Atomic Coordinate Entry Format Version 3.3. https://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html (accessed

2025-07-29).
33. Bourne, P. E.; Berman, H. M.; McMahon, B.; Watenpaugh, K. D.; Westbrook, J. D.; Fitzgerald, P. M. D. Macromolecular Crystallographic

Information File, In Methods in Enzymology; Macromolecular Crystallography Part B; Academic Press, Vol. 277, 1997; pp. 571–590.
34. Molstar/BinaryCIF, 2024. https://github.com/molstar/BinaryCIF (accessed 2025-07-29).
35. Zhang, T.; Li, H.; Xi, H.; Stanton, R. V.; Rotstein, S. H. HELM: A Hierarchical Notation Language for Complex Biomolecule Structure

Representation. J. Chem. Inf. Model. 2012, 52 (10), 2796–2806. https://doi.org/10.1021/ci3001925.
36. IUPAC Subcommittee on HELM. https://iupac.org/body/803 (accessed 2025-07-29).
37. FAIR Principles. GO FAIR. https://www.go-fair.org/fair-principles (accessed 2025-07-29).
38. Grasselli, J. G. JCAMP-DX, a Standard Format for Exchange of Infrared Spectra in Computer Readable Form (Recommendations 1991). Pure

Appl. Chem. 1991, 63 (12), 1781–92. https://doi.org/10.1351/pac199163121781.
39. Ulrich, E. L.; Baskaran, K.; Dashti, H.; Ioannidis, Y. E.; Livny, M.; Romero, P. R.; Maziuk, D.; Wedell, J. R.; Yao, H.; Eghbalnia, H. R.; Hoch, J. C.;

Markley, J. L. NMR-STAR: Comprehensive Ontology for Representing, Archiving and Exchanging Data from Nuclear Magnetic Resonance
Spectroscopic Experiments. J. Biomol. NMR 2019, 73 (1), 5–9. https://doi.org/10.1007/s10858-018-0220-3.

40. nmrML - Home https://nmrml.org (accessed 2025-07-29).
41. IUPAC_FAIRSpec_Specification_draft.pdf in https://github.com/IUPAC/IUPAC-FAIRSpec/tree/main/documents/specifications (accessed

2025-07-29).
42. IUPAC FAIRSpec GitHub Example Icl-13086, https://iupac.github.io/IUPAC-FAIRSpec#icl-10386 (accessed 2025-07-29).
43. Mies, T.; White, A. J. P.; Rzepa, H. S.; Barluzzi, L.; Devgan, M.; Layfield, R. A.; Barrett, A. G. M. Syntheses and Characterization of Diverse

Main Group, Transition, Lanthanide and Actinide Metal Complexes of Ethyl-3-Oxo-2,3-Dihydro-1h-Pyrazole-4-Carboxylate and Related
Bidentate Ligands. Inorg. Chem. 2023, 62, 13253–76. https://doi.org/10.1021/acs.inorgchem.3c01506.

44. InChI Extension forMixture Composition. IUPAC | International Union of Pure and Applied Chemistry. https://iupac.org/project/2015-025-4-
800 (accessed 2025-07-29).

45. Clark, A. M.; McEwen, L. R.; Gedeck, P.; Bunin, B. A. CapturingMixture Composition: an OpenMachine-Readable Format for Representing
Mixed Substances. J. Cheminf. 2019, 11 (1), 33. https://doi.org/10.1186/s13321-019-0357-4.

46. Common Format and MIME Type for Comma-Separated Values (CSV) Files. https://www.ietf.org/rfc/rfc4180.txt (accessed 2025-07-29).
47. Definition of Tab-Separated-Values (tsv). https://www.iana.org/assignments/media-types/text/tab-separated-values (accessed 2025-07-

29).
48. OpenDocument Spreadsheet (ODS). https://esco.ec.europa.eu/en/about-esco/escopedia/escopedia/opendocument-spreadsheet-ods

(accessed 2025-07-29).
49. Structure of a SpreadsheetML Document. 2023. https://learn.microsoft.com/en-us/office/open-xml/spreadsheet/structure-of-a-

spreadsheetml-document (accessed 2025-07-29).

M. Archibald et al.: FAIRSpec-ready spectroscopic data collections 1509

https://doi.org/10.1186/s13321-017-0240-0
https://www.datacite.org
https://doi.org/10.1021/ci00007a012
https://doi.org/10.1021/ci00007a012
https://www.daylight.com/meetings/mug05/Kappler/ctfile.pdf
https://iupac.github.io/IUPAC-FAIRSpec/cdx_sdk
https://doi.org/10.1021/ci00057a005
https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
http://opensmiles.org/opensmiles.html
https://doi.org/10.1186/1758-2946-5-7
https://iupac.org/project/2019-002-2-024
https://iupac.org/project/2009-040-2-800
https://iupac.org/project/2019-017-2-800
https://doi.org/10.1351/pac200880020277
https://doi.org/10.1351/pac200678101897
https://iupac.org/body/024
http://depth-first.com/articles/2021/12/29/stereochemistry-and-the-v2000-molfile-format
http://depth-first.com/articles/2021/12/29/stereochemistry-and-the-v2000-molfile-format
https://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html
https://github.com/molstar/BinaryCIF
https://doi.org/10.1021/ci3001925
https://iupac.org/body/803
https://www.go-fair.org/fair-principles
https://doi.org/10.1351/pac199163121781
https://doi.org/10.1007/s10858-018-0220-3
https://nmrml.org
https://github.com/IUPAC/IUPAC-FAIRSpec/tree/main/documents/specifications
https://iupac.github.io/IUPAC-FAIRSpec#icl-10386
https://doi.org/10.1021/acs.inorgchem.3c01506
https://iupac.org/project/2015-025-4-800
https://iupac.org/project/2015-025-4-800
https://doi.org/10.1186/s13321-019-0357-4
https://www.ietf.org/rfc/rfc4180.txt
https://www.iana.org/assignments/media-types/text/tab-separated-values
https://esco.ec.europa.eu/en/about-esco/escopedia/escopedia/opendocument-spreadsheet-ods
https://learn.microsoft.com/en-us/office/open-xml/spreadsheet/structure-of-a-spreadsheetml-document
https://learn.microsoft.com/en-us/office/open-xml/spreadsheet/structure-of-a-spreadsheetml-document


50. DataCite Metadata Working Group. DataCite Metadata Schema Documentation for the Publication and Citation of Research Data and
Other Research Outputs v4.5. 2024. https://doi.org/10.14454/G8E5-6293. An example of a DataCite metadata record can be found at
https://data.datacite.org/application/vnd.datacite.datacite+json/10.14469/hpc/10386 (accessed 2025-07-29).

51. Demo 2025.5 examples2/icl-10386-crawl-only. https://iupac.github.io/IUPAC-FAIRSpec/#10386co (accessed 2025-07-29).
52. Compound. https://www.cancer.gov/publications/dictionaries/cancer-terms/def/compound (accessed 2025-07-29).
53. PubChem Compounds https://pubchem.ncbi.nlm.nih.gov/docs/compounds (accessed 2025-07-29).
54. IUPAC Gold Book Racemic Compound. https://goldbook.iupac.org/terms/view/R05027 (accessed 2025-07-29).
55. WIPO Patentscope https://patentscope.wipo.int (accessed 2025-07-29).
56. nmrdb.org Tools for NMR Spectroscopists https://www.nmrdb.org (accessed 2025-07-29).
57. EAD: Encoded Archival Description https://www.loc.gov/ead (accessed 2025-07-29).
58. RFC 47648: The Base16, Base32, and Base64 Data Encodings https://datatracker.ietf.org/doc/html/rfc4648 (accessed 2025-07-29).

1510 M. Archibald et al.: FAIRSpec-ready spectroscopic data collections

https://doi.org/10.14454/G8E5-6293
https://data.datacite.org/application/vnd.datacite.datacite+json/10.14469/hpc/10386
https://iupac.github.io/IUPAC-FAIRSpec/#10386co
https://www.cancer.gov/publications/dictionaries/cancer-terms/def/compound
https://pubchem.ncbi.nlm.nih.gov/docs/compounds
https://goldbook.iupac.org/terms/view/R05027
https://patentscope.wipo.int
https://www.nmrdb.org
https://www.loc.gov/ead
https://datatracker.ietf.org/doc/html/rfc4648

	FAIRSpec-ready spectroscopic data collections &#x2013; advice for researchers, authors, and data managers (IUPAC Technical Report)
	1 Introduction
	1.1 The IUPAC &#x201c;FAIRSpec&#x201d; project
	1.2 Organization of this report
	1.3 Intended audience
	1.4 Management of spectroscopic data
	1.5 FAIR management of spectroscopic data
	1.6 Not just for publication: the FAIR data workflow
	1.7 Sample-vs. compound-based collections
	1.8 Descriptive and relational metadata

	2 Guidance for digital chemical structure representations in a FAIRSpec-ready data collection
	2.1 Digital chemical structure representations
	2.2 Generating digital chemical structure representations

	3 Guidance for instrument dataset representations in a FAIRSpec-ready data collection
	4 Guidance for the organization of digital items in a FAIRSpec-ready data collection
	4.1 The IUPAC FAIRSpec finding aid
	4.2 The FAIRSpec-ready collection

	5 Addition of curated metadata to a FAIRSpec-ready collection
	5.1 Internal metadata records
	5.2 Registered metadata records

	6 Guidance for managers of laboratory instrument management systems and electronic laboratory notebooks
	7 Summary
	Appendix: Terminology
	Compound association
	Dataset
	Digital item
	Digital object
	Instrument dataset
	IUPAC FAIRSpec data collection
	IUPAC FAIRSpec Finding Aid
	Persistent identifier
	Representation
	List of abbreviations
	References


<<
  /ASCII85EncodePages false
  /AllowTransparency false
  /AutoPositionEPSFiles true
  /AutoRotatePages /None
  /Binding /Left
  /CalGrayProfile (Dot Gain 20%)
  /CalRGBProfile (sRGB IEC61966-2.1)
  /CalCMYKProfile (Euroscale Coated v2)
  /sRGBProfile (sRGB IEC61966-2.1)
  /CannotEmbedFontPolicy /Warning
  /CompatibilityLevel 1.7
  /CompressObjects /Tags
  /CompressPages true
  /ConvertImagesToIndexed true
  /PassThroughJPEGImages false
  /CreateJobTicket false
  /DefaultRenderingIntent /Default
  /DetectBlends true
  /DetectCurves 0.1000
  /ColorConversionStrategy /sRGB
  /DoThumbnails true
  /EmbedAllFonts true
  /EmbedOpenType false
  /ParseICCProfilesInComments true
  /EmbedJobOptions true
  /DSCReportingLevel 0
  /EmitDSCWarnings false
  /EndPage -1
  /ImageMemory 1048576
  /LockDistillerParams false
  /MaxSubsetPct 35
  /Optimize true
  /OPM 1
  /ParseDSCComments true
  /ParseDSCCommentsForDocInfo true
  /PreserveCopyPage true
  /PreserveDICMYKValues true
  /PreserveEPSInfo true
  /PreserveFlatness false
  /PreserveHalftoneInfo false
  /PreserveOPIComments false
  /PreserveOverprintSettings true
  /StartPage 1
  /SubsetFonts true
  /TransferFunctionInfo /Apply
  /UCRandBGInfo /Remove
  /UsePrologue false
  /ColorSettingsFile ()
  /AlwaysEmbed [ true
  ]
  /NeverEmbed [ true
  ]
  /AntiAliasColorImages false
  /CropColorImages false
  /ColorImageMinResolution 300
  /ColorImageMinResolutionPolicy /OK
  /DownsampleColorImages true
  /ColorImageDownsampleType /Bicubic
  /ColorImageResolution 300
  /ColorImageDepth -1
  /ColorImageMinDownsampleDepth 1
  /ColorImageDownsampleThreshold 1.50000
  /EncodeColorImages true
  /ColorImageFilter /DCTEncode
  /AutoFilterColorImages true
  /ColorImageAutoFilterStrategy /JPEG
  /ColorACSImageDict <<
    /QFactor 0.15
    /HSamples [1 1 1 1] /VSamples [1 1 1 1]
  >>
  /ColorImageDict <<
    /QFactor 0.15
    /HSamples [1 1 1 1] /VSamples [1 1 1 1]
  >>
  /JPEG2000ColorACSImageDict <<
    /TileWidth 256
    /TileHeight 256
    /Quality 10
  >>
  /JPEG2000ColorImageDict <<
    /TileWidth 256
    /TileHeight 256
    /Quality 30
  >>
  /AntiAliasGrayImages false
  /CropGrayImages false
  /GrayImageMinResolution 300
  /GrayImageMinResolutionPolicy /OK
  /DownsampleGrayImages true
  /GrayImageDownsampleType /Bicubic
  /GrayImageResolution 300
  /GrayImageDepth -1
  /GrayImageMinDownsampleDepth 2
  /GrayImageDownsampleThreshold 1.50000
  /EncodeGrayImages true
  /GrayImageFilter /DCTEncode
  /AutoFilterGrayImages true
  /GrayImageAutoFilterStrategy /JPEG
  /GrayACSImageDict <<
    /QFactor 0.15
    /HSamples [1 1 1 1] /VSamples [1 1 1 1]
  >>
  /GrayImageDict <<
    /QFactor 0.15
    /HSamples [1 1 1 1] /VSamples [1 1 1 1]
  >>
  /JPEG2000GrayACSImageDict <<
    /TileWidth 256
    /TileHeight 256
    /Quality 30
  >>
  /JPEG2000GrayImageDict <<
    /TileWidth 256
    /TileHeight 256
    /Quality 30
  >>
  /AntiAliasMonoImages false
  /CropMonoImages false
  /MonoImageMinResolution 600
  /MonoImageMinResolutionPolicy /OK
  /DownsampleMonoImages true
  /MonoImageDownsampleType /Bicubic
  /MonoImageResolution 1000
  /MonoImageDepth -1
  /MonoImageDownsampleThreshold 1.10000
  /EncodeMonoImages true
  /MonoImageFilter /CCITTFaxEncode
  /MonoImageDict <<
    /K -1
  >>
  /AllowPSXObjects false
  /CheckCompliance [
    /None
  ]
  /PDFX1aCheck false
  /PDFX3Check false
  /PDFXCompliantPDFOnly false
  /PDFXNoTrimBoxError false
  /PDFXTrimBoxToMediaBoxOffset [
    0.00000
    0.00000
    0.00000
    0.00000
  ]
  /PDFXSetBleedBoxToMediaBox true
  /PDFXBleedBoxToTrimBoxOffset [
    0.00000
    0.00000
    0.00000
    0.00000
  ]
  /PDFXOutputIntentProfile (None)
  /PDFXOutputConditionIdentifier ()
  /PDFXOutputCondition ()
  /PDFXRegistryName ()
  /PDFXTrapped /False

  /CreateJDFFile false
  /Description <<
    /DEU <FEFF00280073006500650020006700650072006d0061006e002000620065006c006f00770029000d005500730065002000740068006500730065002000730065007400740069006e0067007300200074006f002000700072006f006400750063006500200063006f006e00740065006e00740020007000720069006e00740069006e0067002000660069006c006500730020006100630063006f007200640069006e006700200074006f002000740068006500200064006100740061002000640065006c0069007600650072007900200072006500710075006900720065006d0065006e007400730020006f00660020004400650020004700720075007900740065007200200028004a006f00750072006e0061006c002000500072006f00640075006300740069006f006e002900200044006100740065003a002000300033002f00300031002f0032003000310035002e0020005400720061006e00730070006100720065006e0063006900650073002000610072006500200072006500640075006300650064002c002000520047004200200069006d0061006700650073002000610072006500200063006f006e00760065007200740065006400200069006e0074006f002000490053004f00200043006f0061007400650064002000760032002e002000410020005000440046002f0058002d0031006100200069007300200063007200650061007400650064002e000d005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f005f000d000d00560065007200770065006e00640065006e0020005300690065002000640069006500730065002000450069006e007300740065006c006c0075006e00670065006e002c00200075006d00200044007200750063006b0076006f0072006c006100670065006e0020006600fc0072002000640065006e00200049006e00680061006c0074002000670065006d00e400df002000640065006e00200044006100740065006e0061006e006c006900650066006500720075006e0067007300620065007300740069006d006d0075006e00670065006e00200076006f006e0020004400450020004700520055005900540045005200200028004a006f00750072006e0061006c002000500072006f00640075006300740069006f006e00290020005300740061006e0064003a002000300031002e00300033002e00320030003100350020007a0075002000650072007a0065007500670065006e002e0020005400720061006e00730070006100720065006e007a0065006e002000770065007200640065006e00200072006500640075007a0069006500720074002c0020005200470042002d00420069006c006400650072002000770065007200640065006e00200069006e002000490053004f00200043006f00610074006500640020007600320020006b006f006e00760065007200740069006500720074002e00200045007300200077006900720064002000650069006e00650020005000440046002f0058002d00310061002000650072007a0065007500670074002e>
    /ENU ()
    /ENN ()
  >>
  /Namespace [
    (Adobe)
    (Common)
    (1.0)
  ]
  /OtherNamespaces [
    <<
      /AsReaderSpreads false
      /CropImagesToFrames true
      /ErrorControl /WarnAndContinue
      /FlattenerIgnoreSpreadOverrides false
      /IncludeGuidesGrids false
      /IncludeNonPrinting false
      /IncludeSlug false
      /Namespace [
        (Adobe)
        (InDesign)
        (4.0)
      ]
      /OmitPlacedBitmaps false
      /OmitPlacedEPS false
      /OmitPlacedPDF false
      /SimulateOverprint /Legacy
    >>
    <<
      /AllowImageBreaks true
      /AllowTableBreaks true
      /ExpandPage false
      /HonorBaseURL true
      /HonorRolloverEffect false
      /IgnoreHTMLPageBreaks false
      /IncludeHeaderFooter false
      /MarginOffset [
        0
        0
        0
        0
      ]
      /MetadataAuthor ()
      /MetadataKeywords ()
      /MetadataSubject ()
      /MetadataTitle ()
      /MetricPageSize [
        0
        0
      ]
      /MetricUnit /inch
      /MobileCompatible 0
      /Namespace [
        (Adobe)
        (GoLive)
        (8.0)
      ]
      /OpenZoomToHTMLFontSize false
      /PageOrientation /Portrait
      /RemoveBackground false
      /ShrinkContent true
      /TreatColorsAs /MainMonitorColors
      /UseEmbeddedProfiles false
      /UseHTMLTitleAsMetadata true
    >>
    <<
      /AddBleedMarks false
      /AddColorBars false
      /AddCropMarks false
      /AddPageInfo false
      /AddRegMarks false
      /BleedOffset [
        0
        0
        0
        0
      ]
      /ConvertColors /ConvertToCMYK
      /DestinationProfileName (ISO Coated v2 \(ECI\))
      /DestinationProfileSelector /UseName
      /Downsample16BitImages true
      /FlattenerPreset <<
        /ClipComplexRegions true
        /ConvertStrokesToOutlines false
        /ConvertTextToOutlines false
        /GradientResolution 300
        /LineArtTextResolution 1200
        /PresetName <FEFF005B0048006F006800650020004100750066006C00F600730075006E0067005D>
        /PresetSelector /HighResolution
        /RasterVectorBalance 1
      >>
      /FormElements true
      /GenerateStructure false
      /IncludeBookmarks false
      /IncludeHyperlinks false
      /IncludeInteractive false
      /IncludeLayers false
      /IncludeProfiles false
      /MarksOffset 8.503940
      /MarksWeight 0.250000
      /MultimediaHandling /UseObjectSettings
      /Namespace [
        (Adobe)
        (CreativeSuite)
        (2.0)
      ]
      /PDFXOutputIntentProfileSelector /UseName
      /PageMarksFile /RomanDefault
      /PreserveEditing true
      /UntaggedCMYKHandling /LeaveUntagged
      /UntaggedRGBHandling /UseDocumentProfile
      /UseDocumentBleed false
    >>
  ]
>> setdistillerparams
<<
  /HWResolution [600 600]
  /PageSize [595.276 841.890]
>> setpagedevice