Behavioral/Cognitive
Auditory Artificial Grammar Learning in Macaque and
Marmoset Monkeys
Benjamin Wilson,1,2 Heather Slater,1,2 Yukiko Kikuchi,1,2 Alice E. Milne,1,2 William D. Marslen-Wilson,3
Kenny Smith,4 and Christopher I. Petkov1,2
1Institute of Neuroscience, and 2Centre for Behaviour and Evolution, Newcastle University, Newcastle upon Tyne, NE2 4HH, United Kingdom, 3Department
of Psychology, University of Cambridge, Cambridge, CB2 3EB, United Kingdom, and 4School of Philosophy, Psychology and Language Sciences, University
of Edinburgh, Edinburgh, EH8 9AD, United Kingdom
Artificial grammars (AG) are designed to emulate aspects of the structure of language, and AG learning (AGL) paradigms can be used to
study the extent of nonhuman animals’ structure-learning capabilities. However, different AG structures have been used with nonhuman
animals and are difficult to compare across studies and species. We developed a simple quantitative parameter space, which we used to
summarize previous nonhuman animal AGL results. This was used to highlight an under-studied AG with a forward-branching structure,
designed to model certain aspects of the nondeterministic nature of word transitions in natural language and animal song. We tested
whether two monkey species could learn aspects of this auditory AG. After habituating the monkeys to the AG, analysis of video recordings
showed that common marmosets (New World monkeys) differentiated between well formed, correct testing sequences and those
violating the AG structure based primarily on simple learning strategies. By comparison, Rhesus macaques (Old World monkeys) showed
evidence for deeper levels of AGL. A novel eye-tracking approach confirmed this result in the macaques and demonstrated evidence for
more complex AGL. This study provides evidence for a previously unknown level of AGL complexity in Old World monkeys that seems
less evident in New World monkeys, which are more distant evolutionary relatives to humans. The findings allow for the development of
both marmosets and macaques as neurobiological model systems to study different aspects of AGL at the neuronal level.
Introduction
Language is a uniquely human trait with poorly understood evo-
lutionary origins (Bickerton and Szathmary, 2009; Hurford,
2012). Because of its complexity in meaning (“semantics”) and
structure (“syntax”), natural language cannot be directly investi-
gated in nonhuman animals. However, theoretical work has
identified distinct computations related to language that can be
comparatively studied (Hauser et al., 2002; Bickerton and Szath-
mary, 2009; Hurford, 2012). Initial approaches studied referen-
tial communication in animals, which has inspired work on how
neurons process communication signals (Seyfarth et al., 1980;
Tian et al., 2001). Recently, songbirds have been viewed as prom-
ising neurobiological model systems because, like humans and a
few other animal species, they are vocal learners and can produce
songs with “syntax-like” structure (Berwick et al., 2011). Yet,
vocal production learning appears to have occurred by conver-
gent evolution rather than by common descent, since nonhuman
primates and most other species have more limited vocal production
capabilities (Petkov and Jarvis, 2012). This has raised questions
regarding whether nonhuman primates might be able to
learn structural patterns with sufficient levels of complexity to
provide novel insights on language precursors and their study in
animal models.
Artificial grammars (AG) can be created to emulate certain
aspects of the structure of natural language or simpler “rule-based”
structures that some animals might be able to learn. These
can be comparatively studied using AG learning (AGL) para-
digms (Fitch and Hauser, 2004). In such studies human partici-
pants or nonhuman animals have no a priori knowledge about
the structure of the AG. Yet, by being habituated to or trained
with exemplary sequences of sensory stimuli generated by the
AG, the relationship between the elements in the sequence can be
acquired [sometimes also referred to as “statistical learning” (Saf-
fran et al., 1996, 1999)]. Differential responses to novel well
formed (correct) sequences compared with those that violate the
AG structure suggest that some aspect of the AG structure was
learned. Although several nonhuman animal AGL studies have
been conducted, cross-species comparisons between different
nonhuman primates species are needed (Fitch and Hauser, 2004;
Received June 7, 2013; revised Sept. 5, 2013; accepted Sept. 11, 2013.
Author contributions: B.W. and C.I.P. designed research; B.W., H.S., A.E.M., and C.I.P. performed research; Y.K.,
W.D.M.-W. and K.S. contributed unpublished reagents/analytic tools; B.W., H.S., A.E.M., and C.I.P. analyzed data;
B.W. and C.I.P. wrote the paper.
This work was supported by a Project Grant from the Wellcome Trust to C.I.P. (WT092606/Z/10/Z). We thank M.
Collison for help with pilot experiments, H. Bassirat for help with the experiments, and T. Griffiths, A. Rees, T.
Smulders, and C. Perrodin for useful discussions or comments on previous versions of this manuscript. We thank A.
Waddle, L. Watson, and L. Reed for animal husbandry, P. Flecknell for veterinary care, and V. Willey for customized
machine work.
The authors declare no competing financial interests.
This article is freely available online through the J Neurosci Author Open Choice option.
Correspondence should be addressed to Dr. Christopher Petkov, Institute of Neuroscience, Henry Wellcome
Building, Newcastle University, Framlington Place, Newcastle upon Tyne, NE2 4HH, U.K. E-mail:
chris.petkov@ncl.ac.uk.
DOI:10.1523/JNEUROSCI.2414-13.2013
Copyright © 2013 Wilson et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution and reproduction in
any medium provided that the original work is properly attributed.
The Journal of Neuroscience, November 27, 2013 • 33(48):18825–18835 • 18825
Saffran et al., 2008; Petkov and Jarvis, 2012). We asked whether
New World monkeys (sharing a last common ancestor, LCA,
with humans ~40 million years ago) would have better, comparable,
or worse AGL capabilities than Old World monkeys (LCA
~25 million years ago).
This study first compared different AG structures and animal
AGL results within a quantitative parameter space, which identi-
fied gaps in our understanding. To address these gaps, we studied
New and Old World monkeys (respectively, common marmosets
and Rhesus macaques) using a refined AGL approach based on
rating videotaped animal responses. We obtained evidence that
both species notice violations of the AG, but while macaques
show sensitivity to more complex aspects of the AG, the marmo-
sets’ responses are based largely on simpler strategies. We devel-
oped a novel eye-tracking technique to further investigate the
extent of the AGL in individual macaques, the results of which
supported the video-coding results and further ruled out simple
learning strategies in the macaques.
Materials and Methods
This research study abides by the recommendations of the Weatherall
report on “The use of nonhuman primates in research.” The study has
been approved by the U.K. Home Office and abides by the Animal Sci-
entific Procedures Act (1986) of the United Kingdom.
A quantitative parameter space to compare artificial
grammar complexity
It is important to quantify some of the dimensions within which AGs can
vary, so that different AG structures can be compared or meaningfully
varied, rather than being arbitrarily designed or redesigned. The Formal
Language Hierarchy and its more recent variants have suggested categorical
distinctions between grammars of different levels of complexity
(Chomsky, 1957; Berwick et al., 2011). However, a number of groups
have emphasized the need for alternative complexity measures to evaluate
syntactic complexity (de Vries et al., 2011; Hurford, 2012; Jäger and
Rogers, 2012; Petkov and Wilson, 2012). Even “finite-state grammars”
(FSGs), holding the lowest place in the Formal Language Hierarchy, can
have considerable variability in structural complexity, which we aim to
better understand within a quantitative parameter space.
One important variation in complexity between AGs is in the number
of stimulus classes or elements that contribute to the AG structure. Al-
though human studies have used a variety of AGs (Reber, 1967; Saffran et
al., 2008; Uddén et al., 2012), many studies with nonhuman animals have
focused on structural relationships between two stimulus classes: i.e., A
and B (Fitch and Hauser, 2004; Friederici et al., 2006; Gentner et al., 2006;
Murphy et al., 2008; Hauser and Glynn, 2009). Such AGs require the
participants to learn how several stimuli are subdivided into the two
classes—e.g., based on salient acoustic features such as the gender of the
speaker (Fitch and Hauser, 2004)—before the participants can learn to
recognize well formed sequences from the AG structure. These structures
are represented by filled circles in the left part of the parameter space in
Figure 1A. Other AG studies do not rely on such binary categorization,
and instead employ multiple elements that contribute to the structure,
which we term “structural elements”; open circles in Figure 1A (Reber,
1967; Saffran et al., 2008; Abe and Watanabe, 2011). Several structural
elements typically contribute to the structure of such AGs (Reber, 1967).
This can be used to generate a wide variety of sequences without requir-
ing participants to perceptually categorize stimuli into two different
classes. Accordingly, the first dimension in Figure 1A is the number of
stimulus classes (in reference to studies using AB-type structures) or
structural elements (in reference to studies which do not require catego-
rization of stimuli) that contribute to the AG structure.
A second key source of variation between AGs is the degree of predict-
ability or determinism of the structure, reflecting the extent to which
each stimulus class or structural element can be predicted by the preced-
ing element(s). The sequence of words or phrases in human language is
generally nondeterministic, making it important to understand how far
nonhuman animals are sensitive to similar properties in the sequences
generated by a given AG. The songs of some songbird species, for exam-
ple, can range from stereotyped and deterministic to much more vari-
able. This can be quantified by calculating their structural linearity
(Honda and Okanoya, 1999), given by the following:
\[
\text{Linearity} = \frac{\text{Number of stimulus classes or structural elements} + 1}{\text{Number of legal transitions}}
\]
A linearity index of 1.0 describes an entirely predictable, deterministic
AG, where each structural element can be preceded and followed by only
Figure 1. Comparing different AG structures and the current paradigm. A, Mapping of AGs
previously used to test nonhuman animals, including the original AG designed by Reber (1967).
These are plotted as the number of unique stimulus classes (filled circles) or structural elements
(open circles) that contribute to the structure as a function of the linearity of the structure (see
Materials and Methods). The black line subdividing the shaded regions denotes the maximum
possible structural nonlinearity (i.e., random patterns devoid of structure). The checkmarks
highlight regions of the parameter space for which there is evidence that the different animal
species (labeled text in A) can learn that particular level of structural complexity. Crosses or
question marks highlight uncertainty regarding whether the labeled species can learn those
aspects, see text. Figure references: 1: Abe and Watanabe (2011), 2: Fitch and Hauser (2004), 3:
Gentner et al. (2006), 4: Hauser and Glynn (2009), 5: Murphy et al. (2008), 6: Reber (1967), 7:
Saffran et al. (2008), 8: van Heijningen et al. (2009), 9: Stobbe et al. (2012). B, The AG structure
used here contains five unique elements and multiple forward branching relationships. Correct
sequences (strings of nonsense words) are generated by following any path of arrows from
START to END. Violation sequences do not follow the arrows. The AG was used to create 9
habituation sequences. All experiments began with a habituation phase followed by a testing
phase. The testing sequences that follow the AG (“Correct”) or do not follow the AG (“Violation”)
are also shown.
one legal transition. The equation above includes transitions between
structural elements and also to and from the start or end of the sequence.
The number of structural elements considered in this equation contains
an additional token so that a manifestly linear AG, e.g., Start → A → B →
End, has 2 structural elements (A and B) and three transitions (→),
where Linearity = (2 + 1)/3 = 1.0. Thus, the second dimension in Figure
1A is a measure of structural linearity.
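The linearity index can be checked with a few lines of Python. The sketch below is illustrative only: the function name `linearity` and the transition lists are our own, and the second example assumes that the legal transitions of the AG in Figure 1B are exactly those listed in Table 1.

```python
def linearity(transitions):
    """Linearity = (number of structural elements + 1) / number of legal transitions.

    `transitions` is a collection of (source, target) pairs, including the
    transitions from "Start" and to "End".
    """
    transitions = set(transitions)
    # Structural elements are all nodes other than the Start/End markers.
    elements = {node for pair in transitions for node in pair} - {"Start", "End"}
    return (len(elements) + 1) / len(transitions)


# Manifestly linear AG from the text: Start -> A -> B -> End.
linear_ag = [("Start", "A"), ("A", "B"), ("B", "End")]
print(linearity(linear_ag))  # 1.0

# Assumed transition set for the AG in Figure 1B (taken from Table 1).
branching_ag = [
    ("Start", "A"), ("A", "C"), ("A", "D"), ("D", "C"),
    ("C", "F"), ("C", "G"), ("C", "End"),
    ("F", "C"), ("F", "End"), ("G", "F"), ("G", "End"),
]
print(round(linearity(branching_ag), 3))  # 0.545
```

Under that assumption the forward-branching AG has 5 elements and 11 legal transitions, so its index falls well below the value of 1.0 obtained for a fully deterministic structure.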
Quantifying AG complexity and evaluating animal AGL
capabilities
On this two-dimensional space, we first mapped the AG structures con-
taining just two stimulus classes: (AB)n and AnBn. The (AB)n structures
produce sequences of the form ABAB (where n = 2) and the AnBn structures
produce the sequence AABB (Fitch and Hauser, 2004; Gentner et
al., 2006). We also mapped three-element long structures based on the
A/B classes of stimuli, producing sequences such as ABA, AAB, ABB
(Murphy et al., 2008; Hauser and Glynn, 2009). See lower-left area of
Figure 1A. The (AB)n and the A/B structures are relatively linear, with
only one transition that is not entirely predictable based on the prior
stimulus class, e.g., (AB)n must begin with A, A is then always followed by
B, and B can be followed by either A or “End”. Every species tested
appears able to learn this type of AG structure, either implicitly or explicitly:
songbirds (Gentner et al., 2006; van Heijningen et al., 2009; Stobbe et
al., 2012), rodents (Murphy et al., 2008), New and Old World monkeys
(Fitch and Hauser, 2004; Hauser and Glynn, 2009). This suggests that
many species are capable of learning AG structures based on these rela-
tively linear, predictable, structural relationships.
By comparison, AnBn structures (e.g., AABB, where n = 2) are more
nonlinear since A may be followed by either A or B (Fig. 1A). After
training, a number of avian species were able to detect violations in both
(AB)n and AnBn structures (up to n = 4) (Gentner et al., 2006; van
Heijningen et al., 2009; Stobbe et al., 2012). However, tamarin monkeys
(a New World monkey species) showed dishabituation responses only to
violations of the (AB)n structure but not to the AnBn structure (where n =
2) (Fitch and Hauser, 2004). It is unclear whether these differences between
monkeys and birds result from the difference between learning by
training or habituation (i.e., explicit vs implicit forms of learning) or
reflect a genuine cross-species difference in AGL capabilities. However,
the results do not provide evidence that tamarin monkeys are able to
learn less linear AG structures of this type.
The other AG structures, mapped in the right half of Figure 1A, consist
of several structural elements, offering considerable variation in the sequences
that can be generated by each AG. For example, the two-stimulus
class (AB)n structure generates the fixed sequence of the form
ABAB (where n = 2), with the bulk of the learning effort used to identify
ABAB (where n 2), with the bulk of the learning effort used to identify
how the different stimuli fit into the two classes. By comparison, the AG
structure in “Reber-like” AGs can produce a variety of sequences and
sequence lengths, e.g., “TPTXVS” or “VXVPXXVS” (Reber, 1967). Since
several structural elements contribute to the AG, Reber-like structures
are typically less linear and deterministic than those that can be generated
by two-stimulus class AGs (Fig. 1A). While AGs such as those inhabiting
the upper right quadrant in Figure 1A are learned with relative ease by
human participants (Reber, 1967; Friederici et al., 2002; Petersson et al.,
2012), they have not been tested with nonhuman animals and might
prove very difficult for them to learn. For this study of implicit AGL in
nonhuman primates we focused on a Reber-like AG developed by Saffran
et al. (2008). In terms of linearity, this AG structure (Fig. 1B) falls between
the structure used by Reber (1967) and the two-stimulus class
structures. This AG can generate sequences of variable length and the
order of the elements varies between sequences. The structure contains
both optional and obligatory elements including a considerable variety of
transitional probabilities between elements (Fig. 1B; Table 1).
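A forward-branching AG of this kind can be written down as a table of legal transitions and used to test whether a sequence is well formed. The sketch below assumes that the transitions listed in Table 1 are the complete set of legal transitions; the names `LEGAL` and `is_grammatical` are hypothetical and do not come from the original study.

```python
# Legal transitions, assumed to be exactly those listed in Table 1.
LEGAL = {
    "Start": {"A"},
    "A": {"C", "D"},
    "D": {"C"},
    "C": {"F", "G", "End"},
    "F": {"C", "End"},
    "G": {"F", "End"},
}


def is_grammatical(sequence):
    """Return True if every transition, from Start through End, is legal."""
    path = ["Start", *sequence, "End"]
    return all(b in LEGAL.get(a, set()) for a, b in zip(path, path[1:]))


# Test sequences as listed in Table 1.
correct = ["ACGFC", "ADCFCG", "ACFCG", "ADCGFC"]
violation = ["AFGCD", "AFCDGC", "FADGC", "DCAFGC"]
print([is_grammatical(s) for s in correct])    # [True, True, True, True]
print([is_grammatical(s) for s in violation])  # [False, False, False, False]
```

Optional elements (D, F, G) and the variable sequence length both follow from the branching in the table: A can lead to C directly or via D, and C, F and G can each end the sequence.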
Two previous studies have attempted to determine whether nonhu-
man animals can learn AGs with similarly nondeterministic structure. In
the first study, after tamarin monkeys were habituated to sequences gen-
erated by the AG, the only evidence for significant dishabituation re-
sponses to violations of the AG structure was obtained when the animals
were tested with the same “correct” sequences to which they had been
habituated (Saffran et al., 2008). Thus, the dishabituation responses of
these New World monkeys may be based primarily on the novelty of the
violation sequences. In our experimental design we incorporated both
“familiar” and “novel” correct (well formed) testing sequences to determine
whether macaques (Old World monkeys) and/or marmosets (New
World monkeys) would distinguish between sequences only on the basis
of familiarity. Second, in a study testing Bengalese finches on a related AG
structure (Abe and Watanabe, 2011), it has been noted that the testing
sequences used differed significantly in their acoustic properties between
conditions. All correct sequences were acoustically very similar to each
other but the violation sequences differed considerably (Beckers et al.,
2012). Thus, the animals could have responded differently to the test
sequences based solely on acoustical differences. To address this, our
experimental design involved selecting violation sequences that violate
the AG structure at multiple positions in the sequences, and we con-
trolled for acoustic differences between correct and violation sequences
(see Stimuli, below). Last, to better clarify what parts of the sequence the
animals monitor for violations (van Heijningen et al., 2009), this study
incorporated two different types of violation sequences: those that “begin
with A” (like the well formed, correct sequences) and those that “do not
begin with A” (violate the sequence structure from the very first element,
Fig. 1B).
Video-coding experiments
Stimuli. Each of the stimulus sequences shown in Figure 1B was created
by digitally combining recordings of naturally spoken nonsense words
produced by a female speaker based on an AG structure developed by
Saffran et al. (2008) (Fig. 1B). The nonsense words were recorded with an
Edirol R-09HR (Roland) sound recorder. The amplitude of the recorded
sounds was root-mean-square (RMS) balanced. The nonsense word
stimuli were combined into habituation and testing sequences using cus-
tomized Matlab scripts [100 ms interstimulus intervals (ISI)]. The
sounds were presented to the animals using Cortex software (Salk Insti-
tute) at 75 dB SPL (calibrated with an XL2 sound level meter, NTI
Audio). We confirmed that the power spectrum density of the nonsense
word stimuli was well within the audible range of both macaques and
marmosets [i.e., at least 30 dB above both species’ hearing threshold in
the range of 100–5000 Hz (Pfingst et al., 1975, 1978; Lonsbury-Martin
and Martin, 1981; Bennett et al., 1983; Seiden,
1958)]. The duration of the naturally spoken nonsense word stimuli
within the sequences varied (Klor = 0.64 s; Jux = 0.62 s; Cav = 0.56 s;
Biff = 0.40 s; Dupp = 0.39 s). Thus, we confirmed that the duration of
the sequences could not be used as a cue, as follows. The correct and
violation sequence sets were balanced in the number of elements in the
sequences (Fig. 1B) and the mean sequence lengths (SD) of the sequences
were comparable: correct sequences, 3.14 (0.42) s; violation sequences,
3.25 (0.28) s. Also, we confirmed that there was no significant
difference in sequence sound duration between correct and violation
sequences (independent samples t test, t(6) = 0.435, p = 0.68), or in the
Table 1. Transitional probabilities between elements and for test sequences
Transition | Transitional probability (TP) | Test sequences | Average TP
Start–A | 1.00 | |
A–C | 0.56 | ACGFC | 0.57 (a)
A–D | 0.44 | ADCFCG | 0.59 (a)
D–C | 1.00 | ACFCG | 0.54 (a)
C–F | 0.36 | ADCGFC | 0.62 (a)
C–G | 0.43 | |
C–End | 0.21 | |
F–C | 0.56 | AFGCD | 0.17 (b)
F–End | 0.44 | AFCDGC | 0.25 (b)
G–F | 0.67 | FADGC | 0.11 (b)
G–End | 0.33 | DCAFGC | 0.17 (b)
The transitional probability (TP) of every legal transition between elements was calculated based on the frequency
of their occurrence within the habituation sequences. Higher TPs represent more common transitions. The average
TP of each test sequence is also shown, highlighting the higher average TPs in the correct (a) than in the violation (b)
sequences.
duration of the individual elements present in correct versus violation
sequences (t(42) = 0.609, p = 0.55).
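The average TPs reported in Table 1 can be recomputed from the listed transition values. The sketch below makes two assumptions that the text does not state explicitly: the average runs over every transition in a sequence, including those from Start and to End, and any transition absent from Table 1 (an illegal one) contributes a TP of 0. The names `TP` and `average_tp` are ours.

```python
# Transitional probabilities of the legal transitions, copied from Table 1.
TP = {
    ("Start", "A"): 1.00,
    ("A", "C"): 0.56, ("A", "D"): 0.44,
    ("D", "C"): 1.00,
    ("C", "F"): 0.36, ("C", "G"): 0.43, ("C", "End"): 0.21,
    ("F", "C"): 0.56, ("F", "End"): 0.44,
    ("G", "F"): 0.67, ("G", "End"): 0.33,
}


def average_tp(sequence):
    """Mean TP over all transitions of a sequence, Start and End included.

    Assumption: transitions absent from Table 1 (illegal ones) count as 0.
    """
    path = ["Start", *sequence, "End"]
    pairs = list(zip(path, path[1:]))
    return sum(TP.get(pair, 0.0) for pair in pairs) / len(pairs)


for seq in ["ACGFC", "ADCFCG", "ACFCG", "ADCGFC",
            "AFGCD", "AFCDGC", "FADGC", "DCAFGC"]:
    print(seq, round(average_tp(seq), 2))
```

Under these assumptions the rounded values agree with the averages listed in Table 1 for all eight test sequences, which suggests that this is how the averages were obtained.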
Moreover, further steps were taken in designing the sequence sets to
balance for acoustical differences, by either balancing for the presence of
the different elements (A, C, D, F, G), as far as possible, or analytically
confirming that acoustical differences could not explain the reported
results. The A, F, and G elements were balanced so that they occurred
equally often in each of the correct and violation sequences (Fig. 1B).
Half of the violation and correct sequences were also balanced for the
presence of the C and D elements, but it was difficult to achieve this
balance in the other half of the sequences without introducing other
potential confounds. Nonetheless, we found that acoustical differences
cannot explain the results for the following reasons. First, macaque eye-
tracking results by acoustical element (see Fig. 4; Results) showed that a
comparable pattern of stronger responses to elements in violation versus
correct sequences were made in response to all of the elements. There-
fore, the macaques do not simply respond strongly to certain elements,
but their responses vary based on the type of sequence in which they
occurred (correct or violation). Second, an analysis of the average eye
position in response to the C and D elements [ANOVA factors: element
(C or D), condition (correct or violation) and monkey] showed the
expected main effect of condition (p < 0.001) and monkey (p = 0.008),
but no effect was observed for the element factor (p = 0.13) and no
interactions were seen between the elements and condition or monkey
(all p values > 0.1). Therefore, the responses cannot be explained by a
preference for any acoustical element but can be explained by the context
in which the element occurs.
Participants: Rhesus macaques. Thirteen male Rhesus macaques
(Macaca mulatta) participated in this experiment. The macaques were in
two separate group-housed colonies. The animals were individually sep-
arated in these colonies for testing, wherever possible.
Participants: Common marmosets. Four common marmosets (Calli-
thrix jacchus) participated in this experiment. The marmosets were in a
single group-housed colony. The animals were individually separated in
the colony for testing, wherever possible.
Habituation phase. During the habituation phase, the animals were
presented with habituation sequences in a randomized order (Fig. 1B).
The sequences were presented from a concealed audio speaker (rate of 9
sequences/min; intersequence interval = 4 s). Habituation occurred for
2 h on the afternoon before the experiment, when the animals were quiet
and relaxed, but a few hours before the lights would be turned off for
them to sleep. The following morning the animals were rehabituated to
the sequences presented in a randomized order for 10 min, immediately
before the start of the experiment.
Test phase. Video cameras were set up early in the morning to allow the
animals to become habituated to their presence. During testing, a ran-
domly selected test sequence of the eight (correct or violation) sequences
(Fig. 1B) was individually presented (4 times each, for a total of 32 testing
trials; at an average rate of 1/min; intersequence intervals ranged between
45 and 75 s). Each animal’s orienting responses were video recorded for
offline analysis (JVC and Sony digital video cameras; 720 × 576 resolution;
25 frames/s). To obtain sufficient results with the four marmosets
that were available for testing, the animals were tested on four separate
occasions, with at least 1 week separating the testing sessions. No differences
were observed between monkeys or testing sessions, suggesting that
the results could not be explained by any learning effect or individual
differences (see Video-coding procedure, below).
Video-coding procedure. We refined the traditional video-coding procedure
to minimize subjectivity in video-coding analysis. First, the audio
track for each video was digitally scrambled so that it was not possible to
identify the sequence condition. The videos from each animal were in-
dependently blind-coded by three raters (coauthors: A.E.M., H.S., and
B.W.). Each rater coded orienting responses based on eye, head, and/or
body movements in the direction of the concealed audio speaker that
presented the stimulus sequences. The strength of the orienting responses
was recorded on a five-point Likert scale: 1 = no orienting
response; 2 = probably no response; 3 = ambiguous response; 4 =
probable orienting response; 5 = definite orienting response. All analyses
were based on trials on which a majority of raters (2 of 3) agreed that an
unambiguous response (strength ≥ 4) occurred. We analyzed the proportion
of trials on which the animals unambiguously responded and the
average duration of the orienting response in these trials (Fig. 2).
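The trial-inclusion rule can be expressed compactly. The following is a minimal sketch with hypothetical ratings; the function name `unambiguous_response` and the example data are ours, and the threshold of 4 reflects the "probable" and "definite" points of the Likert scale described above.

```python
def unambiguous_response(ratings, threshold=4, majority=2):
    """True if at least `majority` raters scored the trial at or above `threshold`."""
    return sum(r >= threshold for r in ratings) >= majority


# Hypothetical ratings (1-5 Likert scale) from three raters for four trials.
trials = [(5, 4, 2), (3, 3, 4), (1, 1, 1), (4, 5, 5)]
kept = [t for t in trials if unambiguous_response(t)]
print(kept)                     # [(5, 4, 2), (4, 5, 5)]
print(len(kept) / len(trials))  # 0.5
```

Only the trials retained by this rule would then contribute to the proportion and duration measures.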
To understand the variability between the four marmosets and four
testing sessions, these were included as factors within two repeated-
measures ANOVAs (RM-ANOVAs; because of the limited degrees of
freedom all factors could not be included in the same model). One RM-
ANOVA modeled the between-subject “condition” (2 levels: correct vs
violation sequences) factor and the within-subject factor of “session” (4
levels). The other analysis modeled the between-subject condition factor
and the within-subject factor of “marmoset” (4 levels). Both analyses
revealed a significant main effect of condition, but neither showed any
effect of marmoset, session or interactions with condition (all p values
> 0.4). These additional results confirm that we observed stable performance
between animals and across sessions, suggesting a homogeneous
dataset was available for further analysis with no significant differences
between animals or testing sessions.
Inter-rater reliability: macaques. Three raters coded all of the videos.
Inter-rater reliability was calculated pairwise between the raters. In the
macaque experiment, the raters, on average, had exact agreement on the
strength of the response (on the five point scale) on 75.4% of the trials
and were within one response point from each other on 85.1% of the
trials. Also, Cohen’s Kappa (Landis and Koch, 1977) revealed “substantial”
agreement, K = 0.67. Only trials on which a majority of the raters
agreed that an unambiguous response had occurred were included in
further analyses. The macaques were rated as unambiguously responding
to 14.7% of all recorded trials by a majority of raters resulting in 16
grammatical and 45 ungrammatical response trials (total of 61) used for
analysis.
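For readers unfamiliar with the statistic, pairwise inter-rater agreement can be sketched as follows. This is an illustration only: it uses unweighted Cohen's kappa and a simple mean over rater pairs, whereas the text does not specify whether a weighted variant was used or how the pairwise values were combined. The function names and ratings are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohens_kappa(r1, r2):
    """Unweighted Cohen's kappa for two raters' categorical scores."""
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(raters):
    """Average kappa over all pairs of raters."""
    pairs = list(combinations(raters, 2))
    return sum(cohens_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical Likert scores (1-5) from three raters on six trials.
ratings = [
    [1, 1, 4, 5, 2, 1],
    [1, 2, 4, 5, 2, 1],
    [1, 1, 4, 4, 2, 1],
]
print(round(mean_pairwise_kappa(ratings), 2))
```

Kappa corrects the raw percentage agreement for the agreement expected by chance, which is why it is lower than the exact-agreement percentages reported above.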
Inter-rater reliability: marmosets. Three raters coded all of the videos.
Inter-rater reliability was calculated pairwise between the three raters. In
the marmoset experiment, the raters had exact agreement on the strength
of the response (on the five point scale) on 49.8% of the trials and were
within one response point from each other on 80% of the trials. Cohen’s
Kappa (Landis and Koch, 1977) revealed “fair” to “moderate” agreement,
K = 0.39. The marmosets were rated as unambiguously responding
on 22.8% of all of the recorded trials by a majority of raters resulting
in 60 grammatical and 57 ungrammatical response trials (total of 117)
used for analysis. These numbers in comparison to those of macaques
(above) indicate that the marmoset data were not statistically underpowered
in relation to those that were available for analysis from the macaques.
See Results for further details.
Eye-tracking experiment
Participants. Three adult male Rhesus macaques (Macaca mulatta) participated; they had
previously been trained on a fixation task and acclimated to head immobilization.
Stimuli. The stimulus sequences were identical to those used in the
video-coding experiment (Fig. 1B).
Procedures. Animals were seated in a primate chair 60 cm in front of a
computer monitor, displaying a fixation circle, and two audio speakers
(Creative Inspire T10) horizontally positioned at ±30° visual angle (Fig.
3A). Following 25% of successful fixation trials, a stimulus sequence was
presented from either the left or the right audio speaker, and eye tracking
data were recorded (Fig. 3).
Habituation phase. During each habituation phase with each animal,
one of seven sets of habituation sounds was randomly selected and played
to the animal over both audio speakers for 30 min. Each of the habitua-
tion sets consisted of the nine habituation sequences presented in a
randomized order (Fig. 1B); rate of presentation: 9 sequences/min;
intersequence interval = 4 s.
Testing phase. Following the habituation phase was a testing run consisting
of multiple trials. Each trial began when the animal engaged a red fixation
spot in the center of the screen to center the eyes. If the animal continuously
fixated for 2 s it was given a juice reward for fixating, and 25% of the successful
fixation trials were followed by a testing trial in which a randomly selected
testing sequence (of the 8 possible, see Fig. 1B) was randomly presented from
either the right or the left audio speaker. The trials on which a testing sequence
was presented were separated by on average four trials where no test
sequence was presented and the animal only fixated. Eye-tracking data were
collected throughout the fixation and test sequence presentation periods
(220 Hz infra-red eye tracker, Arrington Research). Experimental data were
collected in 1–5 separate testing runs per day. Each testing run included at
least eight trials (one presentation of each test sequence in a randomized
order, see Fig. 1B). The animal was given a short break between each testing
run, during which the animal listened to a new randomized set of habitua-
tion sequences for 5 min to rehabituate him to the AG structure. After this,
another testing run began if the animal remained motivated to fixate to start
each trial.
Eye-tracking experiment: data analysis. The three macaques partic-
ipated in 25, 25, and 26 testing runs, respectively. Only the first eight
trials of each testing run were used for further analysis since all of the
animals completed these. The eye-tracking data for each trial con-
tained both a 2 s baseline period during which the animal fixated on
the central fixation spot and a subsequent period during which the
test sequence was randomly presented from one of the two audio
speakers. Significant looking responses to the test sequences were
defined individually for each animal as looks toward the presenting
audio speaker (left or right) exceeding 3 SDs of the variability in the
baseline eye fixation period. The analysis included the time from
stimulus onset up to the point when the animal looked in the opposite
direction for >200 ms. This identified when the animal seemed to
lose interest in the test sequence and looked >3 SDs of baseline
variability toward the opposite, silent audio speaker (Fig. 3B). The
length of the response window for the three monkeys (M) was as
follows: M1 = 2128 ms, M2 = 2984 ms, M3 = 4180 ms. The data were
also analyzed using a fixed 3000 ms window and the pattern of results
was comparable to those with the individually defined analysis win-
dows. Within the response period, we analyzed durations of re-
sponses, defined as the proportion of time in the analysis window that
the animal spent looking toward the presenting audio speaker, be-
yond 3 SDs of the baseline fixation period (Fig. 3B). We also analyzed
the average eye deflections in the direction of the presenting speaker.
For analysis of the average looking-response to individual elements,
the window was the time during which the element was presented
with an adjustment for how long, on average, it took the animal to
breach the 3 SD criterion to look toward the presenting speaker at the
start of the test sequence.
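The looking-response criterion described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the function names, the toy data, the use of the population SD, and the choice to end the window at the start of a sustained look away are all assumptions made for illustration. Only the 3 SD criterion, the 200 ms look-away rule, and the 220 Hz sampling rate come from the text.

```python
from statistics import mean, pstdev

SAMPLE_RATE_HZ = 220  # eye-tracker sampling rate stated in the text


def look_thresholds(baseline, n_sd=3.0):
    """Return (away, toward) thresholds: baseline mean -/+ n_sd baseline SDs."""
    centre = mean(baseline)
    spread = pstdev(baseline)  # assumption: population SD of the baseline samples
    return centre - n_sd * spread, centre + n_sd * spread


def window_end(trace, away_threshold, min_samples):
    """Index at which a sustained look away from the speaker begins.

    Returns len(trace) if no run of min_samples consecutive samples
    falls below away_threshold.
    """
    run = 0
    for i, x in enumerate(trace):
        run = run + 1 if x < away_threshold else 0
        if run >= min_samples:
            return i - min_samples + 1
    return len(trace)


def response_duration(trace, toward_threshold):
    """Proportion of samples in the window beyond the toward-speaker threshold."""
    if not trace:
        return 0.0
    return sum(1 for x in trace if x > toward_threshold) / len(trace)


# Hypothetical toy data (arbitrary units); positive = toward the presenting speaker.
baseline = [0.0, 1.0, -1.0, 0.0]
trace = [0.0, 3.0, 3.0, 0.0, -3.0, -3.0, -3.0, 0.0]
away, toward = look_thresholds(baseline)
min_samples = 3  # toy value; 200 ms at 220 Hz would be round(0.2 * SAMPLE_RATE_HZ) = 44
end = window_end(trace, away, min_samples)
duration = response_duration(trace[:end], toward)
```

With real recordings, `min_samples` would be set from the 200 ms rule and the sampling rate, and the thresholds would be computed separately for each animal, as the text specifies.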
Eye-tracking experiment: analysis of specific violation sequences. To
assess whether the macaques were sensitive to subtle, additional vio-
lations in later parts of the testing sequences, we analyzed whether the
macaques were sensitive to differences between two sequences (see
Fig. 5, i and ii), which begin identically but then differ in their number
of violations later in the sequences. Mean difference plots between
sequence i and ii were generated for each monkey (see Fig. 5), across
the sequence repetitions (each sequence was repeated, respectively,
25, 25, and 26 times in macaques 1, 2, and 3). Then, 95% confidence
intervals were generated using a bootstrapping procedure as follows.
Within the early part of the sequence, during the presentation of the
first two elements that are identical between the sequences, we created
a data matrix of the eye-traces within this period (time) by the num-
ber of repeats of the two sequences. We then shuffled the sequence
labels 1000 times to generate the null-hypothesis distribution of dif-
ferences to determine the 5 and 95% confidence intervals (CI; see Fig.
5). Deviations of the difference in eye trace below the 5% CI reflect
responses in favor of sequence i with the fewer violations; differences
above the 95% CI would show a preference for sequence ii. Last, for
any significant deflection below the 5% or above the 95% CI, we
calculated the area (representing both the time and magnitude of the
deviation across the CI) that breached this significance threshold.
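The label-shuffling procedure and the area measure described above can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function names, the pooling of null differences across time points into a single pair of bounds, the percentile indexing, and the `dt_ms` sample spacing are assumptions. The sign convention (sequence ii minus sequence i) follows Figure 5.

```python
import random


def mean_trace(traces):
    """Average a list of equal-length eye traces sample by sample."""
    n = len(traces)
    return [sum(col) / n for col in zip(*traces)]


def null_bounds(traces_i, traces_ii, n_perm=1000, seed=0):
    """5th and 95th percentiles of the label-shuffled difference (ii minus i).

    Assumption: null differences are pooled across time points, giving one
    lower and one upper bound for the whole period.
    """
    rng = random.Random(seed)
    pooled = list(traces_i) + list(traces_ii)
    n_i = len(traces_i)
    diffs = []
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # reassign sequence labels at random
        m_i = mean_trace(shuffled[:n_i])
        m_ii = mean_trace(shuffled[n_i:])
        diffs.extend(b - a for a, b in zip(m_i, m_ii))
    diffs.sort()
    lo = diffs[int(0.05 * (len(diffs) - 1))]
    hi = diffs[int(0.95 * (len(diffs) - 1))]
    return lo, hi


def area_beyond(diff_trace, lo, hi, dt_ms):
    """Return (area below lo, area above hi) for an observed difference trace.

    Each area sums the excursion beyond the bound times the sample spacing,
    so it reflects both the magnitude and the duration of the deviation.
    """
    below = sum((lo - d) * dt_ms for d in diff_trace if d < lo)
    above = sum((d - hi) * dt_ms for d in diff_trace if d > hi)
    return below, above
```

In this sketch, an area below the lower bound would correspond to looks in favor of sequence i and an area above the upper bound to looks in favor of sequence ii.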
Figure 2. Video-coding experiment results in Rhesus macaques and common marmosets. A, C, Mean proportion of trials (SE across animals) on which the Rhesus macaques and common
marmosets made unambiguous looking-responses as evaluated by a majority of 3 raters (see Materials and Methods). Subpanels indicate responses to correct and violation sequences, main panels
display results to specific subsets of the correct or violation conditions. B, D, Mean response durations (SE across animals) in macaques and marmosets in response to correct and violation
sequences (subpanels) and to the four subcategories of stimuli. Both Bonferroni and LSD post hoc tests are reported for all significant contrasts, *p < 0.05; **p < 0.01; ***p < 0.001.
Results
Video-coding experiments
After habituating the monkeys to exemplary sequences following
the AG structure, we tested both macaques’ and marmosets’ ori-
enting responses to well formed (correct) sequences that followed
the AG structure, compared with sequences that violated the AG
structure in certain ways. The animals’ responses were video-
taped. To minimize experimental bias in the analysis of video-
taped responses, three raters blind coded all of the videos before
analysis and only trials in which there was majority rater consen-
sus were analyzed (see Materials and Methods).
The 13 macaques showed a significantly higher proportion of
orienting responses to the violation sequences than the correct
sequences (paired samples t test, t(12) = 7.898, p < 0.001; Fig. 2A,
subpanel). We analyzed the two different types of correct and
violation sequences to clarify whether the observed effect de-
pends on simpler strategies such as familiarity and/or the animals
only noticing violations in the first sequence element (i.e., se-
quences that, unlike the correct sequences, “do not begin with A,”
Fig. 1B). We used an RM-ANOVA with four levels of the se-
quence condition factor: “familiar,” “novel,” “begin with A,” and
“do not begin with A” (Fig. 2A). Within the main effect for se-
quence condition (F(3,36) = 9.146, p < 0.001; Fig. 2A), Bonferroni
comparisons showed differences between several key contrasts,
including between “novel” correct sequences and violation se-
quences that “begin with A,” i.e., those that cannot be identified
by either familiarity or an unexpected initial element (Bonferroni
corrected, p = 0.03; Fig. 2A). No differences were observed be-
tween “familiar” and “novel” correct sequences (p = 1.0) or
violation sequences that “begin with A” or “do not begin with A”
(p = 1.0). A similar pattern of effects was observed when analyz-
ing the duration of responses (correct and violation sequences:
t(12) = 2.330, p = 0.038; RM-ANOVA “condition” factor with 4
levels: F(3,36) = 5.276, p = 0.004; Fig. 2B). These results together
suggest that not only do the macaques respond to violations of
the AG, but also that their responses cannot be attributed only to
superficial differences between the sequences, such as novelty or
monitoring only the initial parts of the sequences.
Four marmosets were available for study; thus, to obtain suf-
ficient data for analysis, they were each tested four times. Each
Figure 3. Eye-tracking measurement of preferential looking-responses to different testing sequences. A, Schematic of macaque eye-tracking experiment. B, Average eye trace from one monkey
(SE across trials). Positive values on the horizontal axis indicate eye movements toward the audio speaker (left or right) that presented a given test sequence. The dotted line denotes 3 SDs of the
variance in eye position during fixation, which was used for analysis of significant looking-responses (shaded area is the individually defined response period; see Materials and Methods). C, Mean
eye traces (SE) to the correct and violation sequences for the same monkey. D, Group eye-tracking results including individual results by monkey: Top shows mean response duration (%) of
looking-responses to the correct and violation conditions. Bottom shows results for the “familiar” and “novel” correct test sequences and violation sequences that (like the correct sequences) “begin
with A” or those that “do not begin with A.” *p < 0.05, ***p < 0.001. a.u., Arbitrary units.
testing run was separated by at least 1 week and followed an
identical procedure to the macaque experiment, including a ha-
bituation and testing phase. First, to investigate whether there
were differences between monkeys or testing sessions, these were
entered as factors into RM-ANOVA models (see Materials and
Methods, Video coding experiment data analysis). There were no
effects of monkey or session in any of the analyses and these
factors did not interact with the experimental effects (all p val-
ues > 0.4), suggesting that the testing sessions were independent
and homogenous (i.e., there were no strong across-session learn-
ing effects). The marmosets did not discriminate between the
different conditions based on the frequency of looking responses
(p > 0.5, Fig. 2C) but did based on the duration of their looking
responses, which were significantly longer for the violation than
the correct sequences (correct vs violation sequences: t(12) =
2.142, p = 0.043; RM-ANOVA “condition” factor with 4 levels:
F(1,12) = 5.895, p = 0.032; Fig. 2D). However, unlike the macaque
results, Bonferroni comparisons only showed an effect between
the “familiar” sequences and those that “do not begin with A”
(p = 0.003). Even with a less conservative LSD correction for
multiple comparisons the only additional effect observed was
between “novel” and “do not begin with A” (p = 0.035, Fig. 2D),
therefore the marmosets’ responses appear to be based primarily
on familiarity or noticing violations at the beginning of the se-
quences. Interestingly, overall, the marmosets responded more
strongly than macaques (compare the four subpanels in Fig. 2A–
D), both in terms of the proportion of responded trials (main
effect of species, F(1,23) = 6.611, p = 0.017; main effect of condi-
tion, F(1,23) = 22.963, p < 0.001; significant interaction, F(1,23) =
12.869, p = 0.002) and the duration of responses (main effect of
species, F(1,23) = 22.162, p < 0.001; main effect of condition,
F(1,23) = 9.449, p = 0.005; no interaction, p = 0.656). These obser-
vations and the number of trials available for analysis (see Mate-
rials and Methods) suggest that the marmoset results do not
appear to have been statistically underpowered relative to the
macaque results.
In summary, the video-coding results show that the marmo-
sets respond for longer durations to sequences that violate the AG
structure. However, their results appear to stem primarily from
sensitivity to violation sequences that “do not begin with A” (cre-
ating a simple violation in the first position of the sequence). This
is shown in the duration of marmoset orienting responses by a
significant difference between “familiar” or “novel” sequences
and those violation sequences that “do not begin with A.” Rhesus
macaques, like the marmosets, responded for longer durations to
the violation compared with correct sequences. However, the
macaques showed stronger responses to violation sequences that
“begin with A” compared with “novel” correct sequences,
whereas this effect was not observed in the marmosets. These
observations reveal that the macaques’ results cannot be based on
the familiarity of the sequence or violations in the initial element
position. Eye-tracking experiments were conducted with three of
these macaques to further probe their sensitivity to the AG struc-
ture and to investigate responses at later positions in the se-
quence. Unfortunately, eye tracking with the smaller marmosets
was not technically possible for this study.
Eye-tracking experiment
In three macaques, we asked whether infrared eye-tracking mea-
surements would reveal differential looking-responses between
the correct and violation sequences. The approach is shown in
Figure 3, A and B, and Materials and Methods. The results from
the three Rhesus macaques were analyzed using an RM-ANOVA
with two factors: “monkey” and “sequence condition” (levels:
“correct” and “violation”). The results confirmed those seen in
the video coding experiment: the animals made significantly lon-
ger looking-responses to the violation sequences (significant
main effect of sequence condition: F(2,73) = 20.297, p < 0.001;
Fig. 3D). Although individual animals differed in their looking
times toward the presenting audio speaker (significant main ef-
fect of monkey, F(2,73) = 4.055, p = 0.021), there was no interac-
tion between sequence condition and monkey factors (p = 1.0).
Moreover, the eye-tracking approach revealed longer look dura-
tions to violation than correct sequences in the individual ma-
caques (t(24) = 3.137, p = 0.004; t(24) = 3.129, p = 0.005; and
t(25) = 2.023, p = 0.05, respectively).
An RM-ANOVA with four levels of the sequence condition fac-
tor: “familiar,” “novel,” “begin with A,” and “do not begin with A”
(Fig. 3D) showed a significant main effect for sequence condition
(F(3,219) = 10.057, p < 0.001). Bonferroni comparisons revealed
significant differences between: (1) “novel” and “begin with A” (p <
0.001); (2) “novel” and “do not begin with A” (p = 0.014); (3)
“familiar” and “begin with A” (p < 0.001); and (4) “familiar” and
“do not begin with A” (p = 0.012); see Figure 3D. There was no
significant difference between responses to familiar versus novel cor-
rect sequences (p = 1.0), nor any effect between responses to the
violation sequences that “begin with A” versus “do not begin with A”
(p = 1.0). Furthermore, there was no interaction between the con-
dition and monkey factors (p = 1.0). These effects were confirmed at
the individual level (main effect of condition F(3,72) = 3.715, p =
0.015; F(3,72) = 4.745, p = 0.004; F(3,75) = 5.08, p = 0.003, respec-
tively). These results recapitulate the video-coding results and sug-
gest that the macaques’ abilities to discriminate correct from
violation sequences do not depend on sequence familiarity or on rote
memorization during the habituation phase. Given that the mon-
keys seem to monitor the sequences for violations after the first ele-
ment, we asked whether they could also monitor the rest of the
sequence for possible violations. In particular, we assessed if they
responded to violations beyond the second position in the violation
sequences (at which point the branching structure of the AG be-
comes more evident, Fig. 1B).
To better determine the extent of macaque AG learning abil-
ities, we first compared eye movements in response to identical
Figure 4. Eye-tracking of looking-responses to individual elements. Group (and individual)
mean difference plot of responses to “violation” minus “correct” sequences in response to each
of the five stimulus elements (A, C, D, F, and G). Positive numbers reflect stronger looks to
violation sequences than to correct sequences.
acoustical elements, within the context of
either correct or violation sequences (Fig.
4). This was done by performing an
ANOVA with the factors of “sequence
condition” (“correct” or “violation”), “el-
ement” (A, C, D, F, or G), and “monkey”
(3 levels). Critically, the main effect of se-
quence condition (F(1,73) = 11.978, p <
0.001; Fig. 4) did not interact with element
(p > 0.1), nor was there a main effect of
element itself (p > 0.6). Thus, the stron-
ger looks to violation sequences cannot be
explained by the animals’ responses to any
individual element. Furthermore, there
was no correlation between the magni-
tude of looks to different elements and the
position in the correct or violation
sequence where the element occurred
(correct: r = −0.01, p = 0.8; violation:
r = −0.05, p = 0.3), suggesting that the
animals’ responses to the violations
were not based on the preferences for
any specific elements, or due to in-
creased responsiveness at a specific time
throughout the sequences.
We also directly investigated eye move-
ments at particular positions in the se-
quences, focusing the analysis on two
specific violation sequences (Fig. 5). These
sequences begin identically, have their initial
violation in the second position, and con-
tain the same elements in positions 3–5.
However, the elements in positions 3–5
have a different order, which generates an
additional violation in one of the sequences
at the third element (see sequence ii, Fig.
5A). We asked whether the animals were
sensitive to this later difference between the
two violation sequences. A bootstrapped
statistical analysis of the animals’ eye-traces
demonstrated that two of the macaques
showed strong significant responses in favor
of sequence ii, containing the additional vi-
olation, which resulted in an area above the
significance threshold at least a factor of 3
greater than any such preference seen either
for sequence i or during sequence positions
1–2 where the two sequences are identical.
No difference could be observed between
the sequences in macaque 3 (Fig. 5). More-
over, the variability in looks in macaques 1
and 2 to these violation sequences (both of
which start with “AF”) does not support the
notion that a special interest in the second
element “F” alone captures the animals’ attention and results in per-
sistent looking responses to all subsequent elements in the violation
sequences (Fig. 6). These results suggest that a significant sensitivity
to a subtle violation later in the sequences can be measured in a
majority of the three animals studied.
Discussion
New World monkeys (marmosets) were able to notice simple
violations of the AG, such as those in the first sequence element.
In contrast, Old World monkeys showed evidence for the capac-
ity to learn more of the nondeterministic components of the AG
throughout the sequences. We consider the results relative to
other studies and the relationships between the studied AG and
language and song structure.
AGL results in relation to previous studies
Many animals are able to recognize a single element from a set,
e.g., recognizing a vocalization from a set of vocalizations
Figure 5. Individual macaque eye-tracking sensitivity to violations at specific sequence positions (difference plots). A, Sche-
matic plot of two of the violation sequences, identifying legal transitions (black arrows) and violations (red arrows). Violation
sequence ii (green) contains one more violation than sequence i (purple) in the transition between the second and third elements
in the sequence. B–D, Average difference in looking preferences toward sequence ii (positive numbers) or sequence i (negative
numbers), across the repetitions of each sequence; respectively: 25, 25, and 26 for each animal. Vertical black lines denote stimulus
onset (at 0 ms) and the onset of element 3, where the sequences diverge. Dashed lines indicate CI (based on bootstrapped
differences, 1000 permutations, see Materials and Methods). Also shown are the areas >95% or <5% CI (bar plots, right) where
each animal made statistically significant looks in favor of either sequence.
(Moore, 2004). An increase in sequence learning complexity, of
which all animals tested appear capable, involves learning the
relationship between two different elements within a sequence,
e.g., recognizing adjacent relationships between A and B stimulus
classes in (AB)ⁿ structures (Fig. 1A). Another facet of sequencing
relationships is sensitivity to the transitional probabilities be-
tween pairs of elements in a sequence (Saffran et al., 1996; Saffran
et al., 1999). Tamarin monkeys were reported to have shown
stronger dishabituation responses to sequences of three syllable
triplets that contained low probability transitions between the
syllables, transitions which the monkeys had rarely encountered
(Hauser et al., 2001). Relatedly, Newport
et al. (2004) suggest that tamarins can
learn the relationship between the first
and last syllable in a triplet sequence with
more variable intervening elements.
Other studies that have tested nonhu-
man animals with more complex AG
structures have produced more variable
results. For instance, evidence for AⁿBⁿ
structure learning has only been obtained
in birds (Fitch and Hauser, 2004; Gentner
et al., 2006; van Heijningen et al., 2009;
Stobbe et al., 2012). The AⁿBⁿ structure is
of interest because it was designed to con-
tain nonadjacent associations between
different pairs of stimuli (Hauser et al.,
2002; Bahlmann et al., 2008). However, it
remains unclear whether any nonhuman
animal can naturally produce or learn
nonadjacent hierarchical relationships
such as those found in human language
(Berwick et al., 2011; Hurford, 2012; Jäger
and Rogers, 2012; Petersson and Hagoort,
2012). The quantitative parameter space
in this study, within which AG structures
and animal AGL studies were evaluated
(Fig. 1A), sidesteps this controversy and
highlights other interesting aspects of the
complexity of animal AGL that remain
under-studied.
The results reported here show that
both common marmosets and Rhesus
macaques are sensitive to violations of a
forward branching, nondeterministic AG.
In macaques, the results rule out trivial
explanations based on familiarity, acoustic
differences between well formed (correct)
and ill-formed (violation) sequences, and
(at least in 2 of 3 of the macaques tested)
monitoring only the initial part of the se-
quences (Figs. 3C,D, 5, 6). However, the
video-coding results suggest that marmo-
sets’ responses could be interpreted on the
basis of familiarity or on noticing only vi-
olations in the first position in a sequence.
It seems unlikely that the marmoset re-
sults would have differed even if more
marmosets had been available for testing
or if the eye-tracking techniques had been
feasible with them, for the following rea-
sons. The four marmosets were tested
over four testing sessions (each separated
by at least 1 week), yet even with the additional testing sessions
they showed no evidence of adopting a more complex AGL strat-
egy. Furthermore, the marmosets were more responsive than the
macaques, so their results cannot be attributed to reduced statis-
tical power. Also, the macaque video-coding
and eye-tracking results were complementary; therefore, it is un-
likely that an eye-tracking experiment would have yielded very
different results from the video-coding experiment in marmosets.
For instance, even macaque 3 who, unlike macaques 1 and 2,
showed no significant sensitivity in the eye tracking experiment
to the subtle violation in a later part of one of the violation se-
Figure 6. Individual macaque eye traces in response to specific violation sequences. A–C, Average eye position (degrees visual
angle ± SE) in response to violation sequences (i) and (ii) for the three macaques. These sequences are identical for the first
two elements but sequence ii then contains an additional violation before the start of element 3, relative to sequence i (Fig. 5A).
Vertical black lines denote stimulus onset (at 0 ms) and the onset of element 3. Stronger responses to violation sequence (ii) can be
seen in macaques 1 and 2 (A, B) after the onset of the third element (note the areas of separation between the colored areas of the
SEs for the two sequences) but not for macaque 3 (C) who may only show a slight preference for sequence i during the period when
both sequences are identical (element positions 1–2).
quences (Fig. 5), nonetheless noticed violations in sequences that
begin with A, which start identically to the well formed (correct)
sequences (Fig. 3D). This important observation is seen in the
macaque, but not in the marmoset, video-coding results (Fig. 2).
Interestingly, our observations of the marmoset behavior corre-
spond to those from another AGL study in New World monkeys
(cotton-top tamarins), whereby the tamarins only showed signif-
icant familiarization-based learning effects when the same AG
sequences that were used for habituation were also used for test-
ing (Saffran et al., 2008). Notably, in that study, human infants
readily learned the AGL structure (including under less predict-
able conditions) and their responses generalized to novel correct
sequences, the latter of which we only see in the macaque results.
It is of course possible that under different experimental con-
ditions—such as with operant conditioning, by using different
stimuli as elements in the AG [such as tones (Saffran et al., 1999)],
or with more exposure to the sequences (Miles and Meyer,
1956)—marmoset and tamarin monkeys might be shown to be
capable of more comprehensive learning of nondeterministic
AGs or even more complex relationships in AG structures. How-
ever, the current results motivate the hypothesis that, under
comparable experimental conditions, species more closely evolu-
tionarily related to humans or to vocal learners such as songbirds,
might have a relative advantage in the complexity of the AG
structures that they are able to learn over more distantly related
species. Such a hypothesis requires further testing with many
more species, which might refine, refute, or support it.
Distinction between vocal production and auditory
learning capacities
It might seem surprising that macaques show evidence for deeper
AGL than marmosets, given that marmosets are more vocal, and
in our study responded more frequently than the macaques.
However, it is important to distinguish between vocal production
and auditory learning since these capacities seem to be subserved
by different neurobiological pathways and mechanisms (Jarvis,
2004; Petkov and Jarvis, 2012). Regarding vocal production, ev-
idence of combinatorial calling has been reported in some Old
World monkeys [e.g., putty-nosed monkeys, Cercopithecus nicti-
tans (Arnold and Zuberbühler, 2006)], but whether macaques or
marmosets are also capable of this is currently unknown. Rather
than vocal communication, the sequence-structure learning abil-
ities that we tapped into with these implicit AGL tasks may relate
to learning processes. For example, earlier studies have suggested
that macaques were able to learn a discrimination-learning task,
including with a delay, more quickly than marmosets (Miles and
Meyer, 1956; Miles, 1957). Furthermore, it is possible that many
nonhuman primates can learn aspects of AG structures because
they can evaluate patterns in sensory input [or the structure of
social interactions (Bergman et al., 2003); like the movement
patterns of others (Schmitt, 2010)]. However, our understanding
of these abilities would benefit from more formal structural anal-
ysis and direct cross-species comparisons, such as those pre-
sented here.
Relationship of the current AG structure to natural song and
language structures
Relative to many commonly used AGs, this paradigm departs
from the requirement that stimuli are categorized into only two
stimulus classes. Rather, several elements, both obligatory and
optional, contribute to the structure, as exemplified by the orig-
inal AGL study in humans by Reber (1967). A number of AGs
used to test humans have a forward-branching structure similar
to that used here (Reber, 1967; Friederici et al., 2002; Uddén et al.,
2008). Branching structures with varying levels of predictability
or linearity can also be observed in the natural song production of
several species. For instance, zebra finches produce a relatively
linear song (Honda and Okanoya, 1999), while the songs of other
birds (Okanoya, 2004) and even some whales (Hurford, 2012)
show more branching transitions, which form phonological
“syntax-like” structures of interest to linguists and other scien-
tists (Bolhuis et al., 2010; Berwick et al., 2011).
Word transitions in sentences of natural languages are char-
acterized by nondeterminism: sentences are not fixed, predeter-
mined sequences, but vary considerably in composition, word
transitions, and length. Well-formed sentences contain obliga-
tory components (e.g., a subject and a finite verb in English de-
claratives), as well as varying numbers of optional categories
(adjectives, adverbs, etc.), the positions of which depend on the
other words in the sentence. Language learners must deal with
unpredictable variation (Kam and Newport, 2009) and appear to
have a general bias to reduce such variation during learning
(Smith and Wonnacott, 2010). Thus, the capacity to evaluate
hierarchical relationships of the types present in human language
may need to be accompanied by processes that allow us to cope
with sequence variability, which are capacities that appear to have
clearer evolutionary origins.
Conclusions
We report evidence of a novel level of AGL complexity in Old
World monkeys (Rhesus macaques). The macaque results cannot
easily be attributed to simple strategies such as responding to
acoustic differences, the novelty of sequences, or only recogniz-
ing violations early in the sequences. While the common marmo-
sets (New World monkeys) also showed dishabituation responses
to violations of the AG structure, the results failed to rule out a
reliance on simple strategies. Such behavioral results provide part
of the initial foundation required for neuronal level investiga-
tions of different aspects of syntactic precursors in primate labo-
ratory model systems such as marmoset and macaque monkeys.
References
Abe K, Watanabe D (2011) Songbirds possess the spontaneous ability to discriminate syntactic rules. Nat Neurosci 14:1067–1074.
Arnold K, Zuberbühler K (2006) Language evolution: semantic combinations in primate calls. Nature 441:303.
Bahlmann J, Schubotz RI, Friederici AD (2008) Hierarchical artificial grammar processing engages Broca’s area. Neuroimage 42:525–534.
Beckers GJ, Bolhuis JJ, Okanoya K, Berwick RC (2012) Birdsong neurolinguistics: songbird context-free grammar claim is premature. Neuroreport 23:139–145.
Bennett CL, Davis RT, Miller JM (1983) Demonstration of presbycusis across repeated measures in a nonhuman primate species. Behav Neurosci 97:602–607.
Bergman TJ, Beehner JC, Cheney DL, Seyfarth RM (2003) Hierarchical classification by rank and kinship in baboons. Science 302:1234–1236.
Berwick RC, Okanoya K, Beckers GJ, Bolhuis JJ (2011) Songs to syntax: the linguistics of birdsong. Trends Cogn Sci 15:113–121.
Bickerton D, Szathmary E (2009) Biological foundations and origin of syntax. Cambridge, MA: MIT.
Bolhuis JJ, Okanoya K, Scharff C (2010) Twitter evolution: converging mechanisms in birdsong and human speech. Nat Rev Neurosci 11:747–759.
Chomsky N (1957) Syntactic structures. The Hague: Mouton.
de Vries M, Christiansen MH, Petersson KM (2011) Learning recursion: multiple nested and crossed dependencies. Biolinguistics 5:010–035.
Fitch WT, Hauser MD (2004) Computational constraints on syntactic pro-
cessing in a nonhuman primate. Science 303:377–380. CrossRef Medline
Friederici AD, Steinhauer K, Pfeifer E (2002) Brain signatures of artificial
language processing: evidence challenging the critical period hypothesis.
Proc Natl Acad Sci U S A 99:529–534. CrossRef Medline
Friederici AD, Bahlmann J, Heim S, Schubotz RI, Anwander A (2006) The
brain differentiates human and nonhuman grammars: functional local-
ization and structural connectivity. Proc Natl Acad Sci U S A 103:2458–
2463. CrossRef Medline
Gentner TQ, Fenn KM,Margoliash D, NusbaumHC (2006) Recursive syn-
tactic pattern learning by songbirds. Nature 440:1204–1207. CrossRef
Medline
Hauser MD, Glynn D (2009) Can free-ranging rhesus monkeys (Macaca
mulatta) extract artificially created rules comprised of natural vocaliza-
tions? J Comp Psychol 123:161–167. CrossRef Medline
Hauser MD, Newport EL, Aslin RN (2001) Segmentation of the speech
stream in a nonhuman primate: statistical learning in cotton-top tama-
rins. Cognition 78:B53–64. CrossRef Medline
Hauser MD, Chomsky N, Fitch WT (2002) The faculty of language: what is
it, who has it, and how did it evolve? Science 298:1569–1579. CrossRef
Medline
Honda E, Okanoya K (1999) Acoustical and syntactical comparisons be-
tween songs of the white-backedMunia (Lonchura striata) and its domes-
ticated strain, the Bengalese finch (Lonchura striata var. domestica). Zool
Science 16:319–326. CrossRef
Kam CL, Newport EL (2009) Getting it right by getting it wrong: when
learners change languages. Cogn Psychol 59:30–66. CrossRef Medline
Hurford JR (2012) The origins of grammar: language in the light of evolu-
tion. Oxford: Oxford UP.
Jäger G, Rogers J (2012) Formal language theory: Refining the Chomsky hierarchy. Philos Trans R Soc Lond B Biol Sci 367:1956–1970.
Jarvis ED (2004) Learned birdsong and the neurobiology of human language. Ann N Y Acad Sci 1016:749–777.
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174.
Lonsbury-Martin BL, Martin GK (1981) Effects of moderately intense sound on auditory sensitivity in rhesus monkeys: behavioral and neural observations. J Neurophysiol 46:563–586.
Miles RC (1957) Delayed-response learning in the marmoset and the macaque. J Comp Physiol Psychol 50:352–355.
Miles RC, Meyer DC (1956) Learning sets in marmosets. J Comp Physiol Psychol 49:212–222.
Moore BR (2004) The evolution of learning. Biol Rev Camb Philos Soc 79:301–335.
Murphy RA, Mondragón E, Murphy VA (2008) Rule learning by rats. Science 319:1849–1851.
Newport EL, Hauser MD, Spaepen G, Aslin RN (2004) Learning at a distance II. Statistical learning of nonadjacent dependencies in a nonhuman primate. Cogn Psychol 49:85–117.
Okanoya K (2004) The Bengalese finch: a window on the behavioral neurobiology of birdsong syntax. Ann N Y Acad Sci 1016:724–735.
Petersson KM, Hagoort P (2012) The neurobiology of syntax: beyond string sets. Philos Trans R Soc Lond B Biol Sci 367:1971–1983.
Petersson KM, Folia V, Hagoort P (2012) What artificial grammar learning reveals about the neurobiology of syntax. Brain Lang 120:83–95.
Petkov CI, Jarvis ED (2012) Birds, primates, and spoken language origins: behavioral phenotypes and neurobiological substrates. Front Evol Neurosci 4:12.
Petkov CI, Wilson B (2012) On the pursuit of the brain network for proto-syntactic learning in nonhuman primates: conceptual issues and neurobiological hypotheses. Philos Trans R Soc Lond B Biol Sci 367:2077–2088.
Pfingst BE, Hienz R, Miller J (1975) Reaction-time procedure for measurement of hearing. II. Threshold functions. J Acoust Soc Am 57:431–436.
Pfingst BE, Laycock J, Flammino F, Lonsbury-Martin B, Martin G (1978) Pure-tone thresholds for rhesus-monkey. Hear Res 1:43–47.
Reber AS (1967) Implicit learning of artificial grammars. J Verb Learn Verb Behav 6:855–863.
Saffran JR, Aslin RN, Newport EL (1996) Statistical learning by 8-month-old infants. Science 274:1926–1928.
Saffran JR, Johnson EK, Aslin RN, Newport EL (1999) Statistical learning of tone sequences by human infants and adults. Cognition 70:27–52.
Saffran J, Hauser M, Seibel R, Kapfhamer J, Tsao F, Cushman F (2008) Grammatical pattern learning by human infants and cotton-top tamarin monkeys. Cognition 107:479–500.
Schmitt D (2010) Primate locomotor evolution: Biomechanical studies of primate locomotion and their implications for understanding primate neuroethology. In: Primate neuroethology (Ghazanfar AA, Platt ML, eds), pp 10–30. Oxford: Oxford UP.
Seiden HR (1958) Auditory acuity of the marmoset monkey (Hapale jacchus). Doctoral dissertation, Princeton University.
Seyfarth RM, Cheney DL, Marler P (1980) Monkey responses to three different alarm calls: evidence of predator classification and semantic communication. Science 210:801–803.
Smith K, Wonnacott E (2010) Eliminating unpredictable variation through iterated learning. Cognition 116:444–449.
Stobbe N, Westphal-Fitch G, Aust U, Fitch WT (2012) Visual artificial grammar learning: comparative research on humans, kea (Nestor notabilis) and pigeons (Columba livia). Philos Trans R Soc Lond B Biol Sci 367:1995–2006.
Tian B, Reser D, Durham A, Kustov A, Rauschecker JP (2001) Functional specialization in rhesus monkey auditory cortex. Science 292:290–293.
Uddén J, Folia V, Forkstam C, Ingvar M, Fernandez G, Overeem S, van Elswijk G, Hagoort P, Petersson KM (2008) The inferior frontal cortex in artificial syntax processing: an rTMS study. Brain Res 1224:69–78.
Uddén J, Ingvar M, Hagoort P, Petersson KM (2012) Implicit acquisition of grammars with crossed and nested non-adjacent dependencies: investigating the push-down stack model. Cogn Sci 36:1078–1101.
van Heijningen CA, de Visser J, Zuidema W, ten Cate C (2009) Simple rules can explain discrimination of putative recursive syntactic structures by a songbird species. Proc Natl Acad Sci U S A 106:20538–20543.