
FEBRUARY 02 2026

Time-varying partial loudness of noise burst sequences in
stationary noise with a similar level
Josef Schlittenlacher ; Agatha R. Cox; Brian C. J. Moore 

J. Acoust. Soc. Am. 159, 1048–1056 (2026)
https://doi.org/10.1121/10.0042387



Time-varying partial loudness of noise burst sequences
in stationary noise with a similar level

Josef Schlittenlacher,1,a) Agatha R. Cox,1 and Brian C. J. Moore2
1Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, United Kingdom
2Cambridge Hearing Group, Department of Psychology, University of Cambridge, Cambridge CB2 3EB, United Kingdom

ABSTRACT:
Loudness increases with increasing duration up to 200 ms after sound onset. This temporal integration is well
documented in quiet but less understood in the presence of other sounds and for very short durations. The present
study investigates the temporal integration of partial loudness for bursts of noise in the presence of equally intense
background noise. Level differences required for equal loudness between a reference burst duration of 20 ms and
target burst durations of 1, 2, 5, and 10 ms were obtained using a 1-up/1-down staircase procedure in the laboratory
and online for burst repetition rates of 5, 10, and 20 Hz and for rectangular and Hann-shaped bursts. All results
showed that the short-duration bursts were perceived as louder than expected from the temporal integration of
energy. The difference was equivalent to a change in level of up to 6.7 dB and was larger for higher burst repetition
rates. The difference was larger when abrupt onsets and offsets were used for both target and reference than for
bursts with a Hann window shape. Differences between experiments conducted in the laboratory and online were
small (up to 1.2 dB) but statistically significant.
© 2026 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons
Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1121/10.0042387

(Received 20 August 2025; revised 25 December 2025; accepted 13 January 2026; published online 2 February 2026)

[Editor: Pavel Zahorik] Pages: 1048–1056

I. INTRODUCTION

Most noise regulations quantify the impact of a noise

source by specifying a level in decibels that must not be

exceeded when measured in a quiet test environment. This

is an approach that does not always correspond to human

perception. The commonly used A-weighted equivalent

sound level (LAeq) is based on the average sound intensity

over long measurement periods. Some variations of the

metric, like the day-night-average sound level [Sec. 3.19 in

ANSI/ASA (2013)], the community noise equivalent level

[Sec. 3.20 in ANSI/ASA (2013)] or the day-evening-night

average sound level (European Union, 2002) add penalties

based on the time of day but still do not take into account

the distribution of intensity during the measurement period.

The latter, however, is important in many real environ-

ments, where noise sources are often similar in level to the

noisy environments in which they occur, for example in

cities. A stationary sound whose intensity is consistently

below that of the background noise will have a lower

impact on human perception than an overall equally intense

but time-varying sound whose peak intensity is well above

that of the background noise some of the time. This differ-

ence in human loudness perception in the presence of back-

ground sounds is considered by the concept and models of

partial loudness (e.g., Glasberg and Moore, 2005). The pre-

sent study investigates the partial loudness of sequences of

short noise bursts in roughly equally intense background

noise, i.e., with the signal and background at about the

same LAeq—a scenario similar to that of rotorcraft in urban

cities, with a focus on the temporal characteristics of the

noise bursts.

For simultaneous sounds with a similar spectrum, such

as a pure tone and narrowband noise centred at the same fre-

quency, the tone is essentially inaudible for signal-to-noise

ratios (SNRs) smaller than about −3 dB (e.g., Glasberg

et al., 1984; Zwicker, 1954). The exact value varies some-

what between individuals and depends on frequency. Moore

et al. (1997) modelled the masking effect within each audi-

tory filter for the average normal-hearing listener assuming

a “threshold” SNR of −3 dB for center frequencies above

500Hz and progressively higher “threshold” SNRs with

decreasing frequency below 500Hz, the threshold SNR

reaching about 10 dB at 50Hz. Experiments on the loudness

of signals presented in broadband noise showed that loud-

ness grew rapidly with increasing level once the threshold

value was exceeded, and for SNRs exceeding about 15 dB,

the partial loudness of the signal in noise became similar to

that of the signal in quiet (Zwicker, 1963; Stevens and

Guirao, 1967; Houtgast, 1974). Thus, partial masking occurs

over only a small range of SNRs, and a small change in sig-

nal level can lead to a rather large change in loudness (see

also, for example, Schroeder et al., 1979). When a pure tone

or complex tone acts as the background and a noise is the

target, the SNR at the masked threshold is somewhat lower

than when the tone is the target and the noise the
background, but the growth of partial loudness with increasing
level above the masked threshold is about equally steep
(Hellman, 1972; Gockel et al., 2002, 2003).

a) Email: j.schlittenlacher@ucl.ac.uk

All of these studies have used stationary sounds, except

for the modulations inherent to narrowband noise. For time-

varying sounds, duration has an important influence on loud-

ness. In quiet, the loudness of a sound grows with increasing

duration up to 150 to 200ms (e.g., Scharf, 1978). This effect

is known as temporal integration. Some studies reported

shorter “critical durations” above which loudness reaches its

final value, for example 65ms for noise bursts (Miller,

1948). For durations longer than the critical duration, loud-

ness does not grow further with increasing duration. Pollack

(1958) reported a critical duration of 100ms for noise bursts

presented one per second, but his data showed considerably

shorter critical durations for sequences with more bursts per

second, for example 10ms for nine bursts per second.

Studies measuring reaction time, which is thought to be gov-

erned by similar neural processes to loudness and decreases

progressively with increasing loudness, have shown a criti-

cal duration of 40ms for pure tones (Schlittenlacher and

Ellermeier, 2015).

For durations shorter than the critical duration, the

increase in loudness with duration has often been approxi-

mated with a simple model such that an increase in level of

3 dB has the same effect as a doubling of duration, as if

energy was integrated over time (e.g., Pollack, 1958;

Zwicker, 1974). This has the convenient effect for simple

metrics such as the LAeq that the metric captures the effects

of temporal integration, although only for durations shorter

than the critical duration and not taking into account effects

such as the dependence of temporal integration on absolute

level (Florentine et al., 1996). In that study, temporal integration,
measured as the level difference required for equal

loudness (LDEL) between 5-ms and 200-ms sounds, was

about 10 dB near threshold but almost 20 dB at moderate

sound pressure levels (SPLs) (about 65 dB).
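
As a worked illustration of the energy-integration rule described above, the following minimal Python sketch computes the level difference that equates the energy of two sounds of different duration. The function name is hypothetical and is not taken from any of the cited studies. For the 5-ms and 200-ms pair, pure energy integration gives about 16 dB, which lies between the measured values of about 10 dB near threshold and almost 20 dB at moderate levels.

```python
import math

def energy_integration_ldel(short_ms: float, long_ms: float) -> float:
    """Level difference (dB) that gives two sounds of different duration
    the same energy: 10 * log10(long / short)."""
    return 10.0 * math.log10(long_ms / short_ms)

# Doubling the duration is worth about 3 dB.
print(round(energy_integration_ldel(10, 20), 2))   # 3.01
# The 5-ms versus 200-ms comparison discussed in the text.
print(round(energy_integration_ldel(5, 200), 2))   # 16.02
```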

More complex loudness models incorporate temporal

integration towards the end of the processing chain, i.e., at a

higher level of the auditory pathway. The Cambridge loud-

ness models (e.g., Glasberg and Moore, 2002; Moore et al.,
2016) distinguish between instantaneous loudness, short-

term loudness for judgments of short segments such as a

syllable, and long-term loudness for judgments of longer

segments, such as a sentence. Instantaneous loudness is

thought to be a stage involved in forming the loudness per-

cept, but it may not be directly perceived. Areas in the brain

that track the instantaneous loudness of speech stimuli have

been identified using magnetoencephalography (Thwaites

et al., 2016; Thwaites et al., 2017). Instantaneous loudness
is calculated using time frames (windows) whose durations

are chosen to be as short as possible while achieving the

desired spectral resolution. The window sizes range from

2ms for high frequencies to 64ms for low frequencies. For

signals shorter than this, the windows act like smoothers and

effectively integrate intensity. Short-term loudness is calcu-

lated from instantaneous loudness using a function

resembling an automatic gain control circuit with an attack

time constant of 22ms. The attack time constant determines

the rate of increase in loudness with duration; the temporal

build-up corresponds to that of a first-order system [see

Glasberg and Moore (2002) for details]. Long-term loudness

is calculated from short-term loudness using a similar circuit

but with an attack time constant of 99ms. When a sound is

abruptly turned on, the predicted loudness builds up over a

duration equal to about two to three times the respective

time constant, plus the time associated with previous steps,

since the build-up of long-term loudness is in series with the

build-up of short-term loudness.
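
The build-up over two to three time constants can be illustrated with a minimal Python sketch of a continuous-time first-order system. This is an idealisation under stated assumptions: the published models update loudness frame by frame and use separate attack and release constants, which are not reproduced here, and the function name is hypothetical.

```python
import math

def first_order_buildup(t_ms: float, tau_ms: float) -> float:
    """Fraction of the final value reached t_ms after an abrupt onset
    by a first-order system with time constant tau_ms."""
    return 1.0 - math.exp(-t_ms / tau_ms)

TAU_SHORT_MS = 22.0  # attack time constant quoted for short-term loudness

for multiple in (1, 2, 3):
    fraction = first_order_buildup(multiple * TAU_SHORT_MS, TAU_SHORT_MS)
    print(f"{multiple} x tau: {fraction:.2f}")
# 1 x tau: 0.63
# 2 x tau: 0.86
# 3 x tau: 0.95

# A 20-ms burst ends before a single time constant has elapsed.
print(f"20 ms: {first_order_buildup(20.0, TAU_SHORT_MS):.2f}")  # 20 ms: 0.60
```

Under this idealisation, a sound shorter than the attack time constant ends while the output is still rising steeply, which is why the time constant governs the predicted growth of loudness with duration.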

Relatively little is known about the temporal integration

of partial loudness. On the one hand, one could argue that if

temporal integration is governed by high-level processes,

partial loudness may build up over time in the same way as

loudness in quiet. On the other hand, one may argue that

since the auditory system is already excited by the back-

ground sound, the loudness of any further sound presented

at a suprathreshold level may build up more quickly than for

that sound presented in quiet. Florentine et al. (1998) com-

pared the loudness of tones with equivalent rectangular

durations of 5ms and 200ms. When the tones were pre-

sented in broadband noise with a level of 60 dB SPL, the

LDEL of the two tones was about 15 to 20 dB for all audible

levels of the tone. This is smaller than the LDEL of about

25 dB for a 60-dB-SPL tone in quiet and similar to the

LDELs found near absolute threshold and for very high lev-

els in quiet. The LDELs of Florentine et al. (1998) quanti-
fied the total amount of temporal integration but did not

give information about the time course of temporal integra-

tion. Specifically, they did not measure temporal integration

over the first few milliseconds following the onset of a

sound, which may be important for impulsive sounds.

Richards (1977) measured temporal integration of partial

loudness for a 1-kHz tone with durations from 10 to 640ms,

using several tone and masker levels. For most conditions,

his results were fitted well by two straight lines, with a more

rapid growth of loudness for durations up to 80ms than for

higher durations. A peculiarity of the results was that tempo-

ral integration continued for durations up to at least 640ms,

even in quiet. For a 60-dB-SPL tone in a 50-dB-SPL noise,

temporal integration was similar to that in quiet and similar

to that expected from intensity integration, i.e., the growth

of loudness with duration corresponded to accumulating the

sound’s energy over time.

The present study was intended to contribute to knowl-

edge about the temporal integration of partial loudness for

very short sounds (noise bursts) using burst durations up to

20ms. Since purely temporal aspects of sound are usually

produced with high precision with any hardware, most of

the experiments were performed online, but one experiment

was repeated in the laboratory to quantify possible differ-

ences between online and laboratory measurements. LDELs

were estimated between bursts with durations of 1, 2, 5, and

10ms and bursts with durations of 20ms, using burst repeti-

tion rates of 5, 10, and 20Hz. Abrupt onsets and offsets


were used to produce these short durations. A final experi-

ment was conducted to investigate differences between rect-

angularly shaped bursts and Hann-window shaped bursts.

All experiments used an SNR of 0 dB, which is of particular

interest for predicting community acceptance of a new noise

source with a level similar to that of existing noise sources.

If the level of a new noise source was much higher than that

of existing noise sources, it would likely be judged as a

major nuisance; if its level was much lower, and it was

masked by the ambient noise to a large extent, its low loud-

ness would likely be tolerated. The temporal sequence of

noise bursts resembled the repetition of bursts produced by

rotorcraft, and the results may guide the sound design of

helicopters, drones, or electric vertical takeoff and landing

vehicles (“flying taxis”) regarding the shaping of their over-

all temporal envelope.

II. METHOD

Five experiments were conducted in total, four of them

online. Experiments 1a and 1b used the same stimuli and

almost identical methods but were conducted in the labora-

tory and online, respectively. Experiments 1 to 3 differed in

burst frequency, with ten bursts per second for experiment

1, five bursts per second for experiment 2, and 20 bursts per

second for experiment 3. Experiment 4 investigated if

employing rise and fall times rather than abrupt onsets and

offsets of the bursts had an effect on the LDEL. Each

participant took part in only one experiment, so burst repeti-

tion rate and location were studied between-subjects.

Experiment 3 was conducted in 2021, and preliminary

results were presented at Inter-noise (Schlittenlacher and

Moore, 2021). All other experiments were conducted in

2024 and 2025.

A. Participants

Twenty participants completed each experiment, except

for experiment 1a, which was completed by sixteen partici-

pants. The participants for experiment 1a were recruited on

campus. Nine participants for experiment 3 were recruited

via a university website. All other participants were

recruited via the “Prolific” platform (prolific.com).

Participants tested in the laboratory were checked to have

normal hearing (better than 20 dB hearing level) in both

ears. The online participants self-reported having no hearing

loss and no hearing difficulties. Nobody participated in

more than one experiment. All participants were reimbursed

for their time. Sexes and ages of the participants are shown

in Table I.

B. Stimuli

The background sound was taken from a recording of

an urban highway. Due to the high amount of traffic, it was

rather stationary. It was bandpass filtered with cutoff fre-

quencies of 250 and 4000Hz using a sixth-order

Butterworth filter. This was done to ensure a frequency

range that could be reproduced by consumer hardware in

online experiments. Two seconds were cut from the record-

ing and 20-ms raised cosine rise and fall times were applied.

This segment was “frozen,” i.e., the same for all trials and

experiments. Figure 1 shows its average spectrum (left

panel) and waveform and envelope, the latter obtained

through half-wave rectification and 50-Hz low-pass filtering

(right panel). To confirm that the background sound was
stationary, its short-term loudness level was calculated using
the model of Moore et al. (2018) and analyzed from 200 ms
after the onset (to disregard the growth of calculated loudness
after the onset of the sound) to the end, using a root
mean square level of 65 dB SPL. The short-term loudness
level ranged between 80.2 phon and 82.4 phon with a mean
of 81.4 phon and a standard deviation of 0.4 phon.

TABLE I. Age and sex of the participants for each experiment.

Experiment  Female  Male  Age range (years)  Mean age (years)
1a          12      4     19–61              32
1b          10      10    27–64              40
2           12      8     20–60              41
3           11      9     20–52              29
4           7       13    21–73              37

FIG. 1. Characteristics of the background sound. The left panel shows the average spectrum of the frozen background sound as used in the experiments
(blue line) and from its recording before bandpass filtering (black dashed line). The right panel shows the waveform of the background sound and its
envelope (blue line), obtained through half-wave rectification and 50-Hz low-pass filtering.

The target sounds were bursts cut from a synthetic noise

that had the same magnitude spectrum as the background

sound but random and thus different phases of the compo-

nents. The noise was synthesized to have a low crest factor

(so-called “low-noise” noise) using the method described by

Moore et al. (2004) and parameters (except for the desired

spectrum) used by Moore et al. (2018). This was done to

minimize the chance of waveform clipping in the online

experiments. Note that the low-noise property depends on

the specific choice of component phases for the entire broad-

band signal. In the auditory system, the broadband signal is

filtered into multiple narrow bands, and this destroys the

“low-noise” property [for details, see Chap. 1 in Moore

(2014)]. The noise bursts had durations of 1, 2, 5, 10, or

20ms and were presented in burst sequences with a total

duration of 1600ms and burst rates of 5, 10, or 20Hz. Odd-

numbered bursts were cut from the synthetic noise. Even

numbered bursts were the same as the preceding bursts

except for a phase shift of 180 degrees. This was done to

avoid any overall offset of the mean amplitude from zero.

The sequence of bursts was presented simultaneously with

the background sound, starting 200ms after the background

sound started. In experiments 1 to 3, the bursts had abrupt

onsets and offsets. In experiment 4, half of the stimuli had

bursts with raised-cosine shaped rise and fall times that

were half of the stimulus duration each, i.e., their waveform

was multiplied with a Hann window that had the same dura-

tion as each burst. The other half of the stimuli had bursts

with abrupt onsets and offsets.
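
The construction of a burst sequence can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not taken from the study: Gaussian white noise stands in for the spectrally matched low-noise noise, the sampling rate is set to 48 kHz, the first burst starts at the beginning of the sequence, and the function and parameter names are hypothetical.

```python
import math
import random

def burst_sequence(burst_ms, rate_hz, hann=False, total_ms=1600, fs=48000, seed=0):
    """Return a list of samples containing noise bursts at rate_hz.

    Odd-numbered bursts are fresh noise. Each even-numbered burst is the
    preceding burst with inverted polarity (a 180-degree phase shift),
    so that each pair of bursts sums to zero.
    """
    rng = random.Random(seed)
    n_total = int(round(total_ms * fs / 1000))
    n_burst = int(round(burst_ms * fs / 1000))
    period = int(round(fs / rate_hz))
    if hann:
        # Hann window spanning the whole burst: rise and fall times are
        # each half of the burst duration.
        window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n_burst - 1))
                  for i in range(n_burst)]
    else:
        window = [1.0] * n_burst  # abrupt onsets and offsets
    out = [0.0] * n_total
    previous = None
    for k in range(n_total // period):
        if k % 2 == 0:
            # 1st, 3rd, ... burst: new noise segment (white noise assumed).
            previous = [rng.gauss(0.0, 1.0) * w for w in window]
            burst = previous
        else:
            # 2nd, 4th, ... burst: inverted copy of the preceding burst.
            burst = [-x for x in previous]
        start = k * period
        out[start:start + n_burst] = burst
    return out

sequence = burst_sequence(burst_ms=2, rate_hz=10)
print(len(sequence))  # 76800 samples, i.e., 1600 ms at 48 kHz
```

With a 10-Hz repetition rate, this yields 16 bursts in 1600 ms, and the polarity inversion of every second burst keeps the mean amplitude of the sequence at zero.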

C. Apparatus

Experiment 1a was conducted in the laboratory. Stimuli

were generated and presented via MATLAB (The Mathworks,

Natick, MA), an RME (Haimhausen, Germany)

Hammerface sound card, and Sennheiser (Wedemark,

Germany) HD580 headphones, which have a diffuse-field

response and produce 95 dB SPL for a 1-V 1000-Hz sinusoidal
input, as measured on a KEMAR dummy head (GRAS

Sound and Vibration, Holte, Denmark). Participants sat in a

double-walled sound-attenuating booth. The SPL of the ref-

erence stimuli was set to 65 dB SPL. Experiments 1b to 4

were conducted online. Participants were asked to wear

headphones and to use a desktop computer or laptop.

Stimuli were presented via a web browser with code written

in JavaScript and the Web Audio API. For setting the refer-

ence level, participants were asked to play a female speech

sound from the LibriSpeech corpus (Panayotov et al., 2015)
at a loudness typical for speech and not to change the system

settings thereafter. The speech sound was a sentence that

lasted six seconds and was repeated until the participant

clicked a button to continue. The root-mean-square (RMS)

level of the reference stimuli was the same as the RMS level

of that speech stimulus.

D. Procedure

All experiments measured the LDEL between a refer-

ence stimulus that had noise bursts with a duration of 20ms

and a target stimulus that had shorter bursts. The sequences

of bursts were presented in the urban background sound

whose level was kept constant at 65 dB SPL (laboratory,

experiment 1a) or the reference level determined by the

speech sound (online, experiments 1b to 4). LDELs were

determined with a two-interval, two-alternative forced-

choice task in which the participants indicated which of the

two helicopter-like sounds, i.e., noise burst sequences, was

louder. Four LDELs were determined for each condition:

Either the level of the 20-ms bursts was varied or the level

of the shorter bursts was varied. The variable sound started

at an RMS level either 10 dB higher or 10 dB lower than the

RMS level of the background sound. The RMS level of the

fixed sequence of bursts was always equal to the level of the

background sound. After each response, the level of the

noise bursts in the variable sound was changed in a 1-up/1-

down procedure (Levitt, 1971) until eight reversals

occurred. The step sizes were 5 dB for the first two reversals,

3 dB for the next two reversals, and 1 dB for the remaining

four reversals. The mean level at the last four reversals was

used to estimate the LDEL.
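
The adaptive track described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: levels are expressed in dB relative to the background, the step size is taken to change immediately after the second and fourth reversals, and the simulated listener is a hypothetical deterministic one with a fixed point of subjective equality. None of the names come from the experimental software.

```python
def run_staircase(is_variable_louder, start_db, n_reversals=8):
    """Run a 1-up/1-down track and return the mean level at the last four reversals.

    is_variable_louder(level) returns True when the listener judges the
    variable sound to be the louder one at the given level.
    """
    # Step size in force while 0, 1, ..., 7 reversals have occurred.
    steps = [5.0, 5.0, 3.0, 3.0, 1.0, 1.0, 1.0, 1.0]
    level = start_db
    reversals = []
    last_direction = None
    while len(reversals) < n_reversals:
        # Judged louder: decrease the level; otherwise increase it.
        direction = -1 if is_variable_louder(level) else +1
        if last_direction is not None and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        if len(reversals) < n_reversals:
            level += direction * steps[len(reversals)]
    return sum(reversals[-4:]) / 4.0

# Hypothetical listener whose point of subjective equality is -3.4 dB.
listener = lambda level: level > -3.4
print(run_staircase(listener, start_db=+10.0))  # -3.5
print(run_staircase(listener, start_db=-10.0))  # -3.5
```

Because a 1-up/1-down rule converges on the 50% point of the psychometric function, tracks started 10 dB above and 10 dB below the background level end at the same estimate for this idealised listener. With real listeners, the spread between such tracks gives an indication of response consistency.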

In experiments 1 to 3, the target sounds had burst dura-

tions of 1, 2, 5, and 10ms. The repetition rate of the bursts

was 10Hz in experiment 1, 5Hz in experiment 2, and 20Hz

in experiment 3. In experiment 4, the target sound always

had a burst duration of 2ms, i.e., sequences of 2-ms bursts

were compared to sequences of 20-ms bursts. The burst rep-

etition rates were 5 and 20Hz, and the bursts either had

abrupt onsets and offsets or their envelopes were multiplied

by a Hann window, thus also resulting in four conditions.

Table II gives an overview of all conditions.

The combination of burst duration, burst repetition rate,

and shape of burst onsets and offsets resulted in four condi-

tions in each experiment. Since four LDELs were measured

for each condition, there were four up-down tracks per con-

dition, resulting in sixteen tracks in each experiment. The

experiments were divided into blocks, which were run in a

pseudo-random order and between which participants were

encouraged to take breaks. Each experiment lasted about

40–45min on average. Experiments 1b, 2, and 3 differed

only in the repetition rate and participants. Experiments 1a

and 1b differed in whether the experiment was done in the

laboratory or online and in one aspect of the procedure: In

experiment 1a, the four tracks for a single condition were

interleaved, i.e., presented in one block with the software

randomly choosing which of the four tracks was presented

on the next trial. In the online experiments, a block con-

sisted of one track. Thus, there were four blocks in


experiment 1a (the four conditions) and sixteen blocks (four

conditions times four tracks) in all other experiments.

E. Analysis

Unless stated otherwise, LDELs are expressed as the

RMS level of the sequence with the shorter bursts minus the

RMS level of the sequence with the 20-ms bursts; silent

intervals between bursts were included in the RMS calcula-

tion. Thus, a negative LDEL indicates that the sequence

with the shorter pulses needed less overall energy than the

sequence with longer pulses to give equal loudness. The

energy across the whole sequence of 1.6 s rather than the

power of a single burst was used to allow a direct compari-

son with noise evaluation metrics that also average the

energy across the whole duration of measurement. For

example, an LDEL of −8 dB between a sequence with 1-ms

bursts and a sequence with 20-ms bursts would indicate that

the sequence with 1-ms bursts needed 8 dB less energy for

equal loudness but the power of a single 1-ms burst was

5 dB higher than that of a 20-ms burst since the 20-ms bursts

accumulated energy during 20 times as much time

($10 \log_{10}(20) = 13$ dB). In the rest of this paper, the term

energy is used to refer to the energy during the whole 1.6 s

of the burst sequence, and the term burst power to refer to

the power that is present during a single burst. To allow an

easier comparison to studies that used burst power as the

measure, all figures contain a second panel with the same

data but with the LDELs expressed as the level difference

between the noise bursts themselves.
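
The relation between the two ways of expressing the LDEL can be written as a short Python sketch. It assumes that the target and reference sequences contain the same number of bursts and differ only in burst duration. The function and constant names are hypothetical.

```python
import math

REFERENCE_MS = 20.0  # duration of the reference bursts

def sequence_to_burst_ldel(ldel_sequence_db: float, target_ms: float) -> float:
    """Convert an LDEL based on the RMS level of the whole sequence into
    an LDEL based on burst power."""
    return ldel_sequence_db + 10.0 * math.log10(REFERENCE_MS / target_ms)

# The example from the text: -8 dB in energy for 1-ms bursts
# corresponds to a burst power that is about 5 dB higher.
print(round(sequence_to_burst_ldel(-8.0, 1.0), 1))   # 5.0
# Perfect energy integration (0 dB in energy) for 1-ms bursts.
print(round(sequence_to_burst_ldel(0.0, 1.0), 1))    # 13.0
```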

All data from a participant were excluded from the

analysis when the four LDELs for any condition differed by

more than 20 dB. This limit was chosen because it was the

difference between starting values of the up-down tracks

and random clicking without listening would result in a dis-

crepancy between tracks of that order of magnitude. This

led to the exclusion of one participant for experiment 1a,

four participants for experiment 1b, four participants for

experiment 2, three participants for experiment 3, and three

participants for experiment 4.
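
The exclusion rule can be expressed as a short Python sketch. It assumes that "differed by more than 20 dB" refers to the range (maximum minus minimum) of the four track estimates within a condition. The data layout, the function name, and the example values are hypothetical.

```python
def exclude_participant(ldels_by_condition, limit_db=20.0):
    """Return True if, in any condition, the four track estimates span
    more than limit_db."""
    return any(max(tracks) - min(tracks) > limit_db
               for tracks in ldels_by_condition.values())

# Hypothetical LDELs (dB) from four tracks per condition, keyed by burst duration.
consistent = {"1 ms": [-4.0, -3.5, -2.0, -5.5], "2 ms": [-3.0, -1.5, -2.5, -4.0]}
inconsistent = {"1 ms": [-12.0, 9.5, -3.0, 1.0], "2 ms": [-3.0, -1.5, -2.5, -4.0]}
print(exclude_participant(consistent))    # False
print(exclude_participant(inconsistent))  # True
```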

A within-subjects analysis of variance (ANOVA) was

computed for each experiment. This was a one-way

ANOVA with factor burst duration for experiments 1 to 3

and a two-way ANOVA with factors window shape and rep-

etition rate for experiment 4. An additional ANOVA with

additional between-subjects factor location was computed

for the combined data of experiments 1a and 1b to estimate

the effect size of the difference between laboratory and

online data collection.

III. RESULTS

The results of experiments 1a and 1b are shown in

Fig. 2. Average LDELs based on the RMS levels of the

whole sequences (left panel) ranged from −3.4 dB for 1-ms
target bursts to −1.3 dB for 10-ms target bursts for experiment 1a
and from −4.6 dB to −0.4 dB for the online data
from experiment 1b. A two-way ANOVA showed a significant
effect of duration, F(3,87) = 30.1, p < 0.001,
$\eta_p^2$ = 0.509, no significant effect of location, F(1,29) =
0.428, p = 0.518, $\eta_p^2$ = 0.015, but a significant interaction
between the two factors, F(3,87) = 3.51, p = 0.019,
$\eta_p^2$ = 0.108. Together with the descriptive results, this indicates
that the effect of duration on the LDEL based on RMS
level of the sequence was a little greater for the online data
and conversely a little smaller based on the burst levels. In
other words, for the shortest duration, temporal integration
was somewhat closer to that expected from integration of
intensity in the laboratory than in the online experiment.
Separate one-way ANOVAs for each experiment yielded a
significant main effect of duration in both cases, F(3,42) =
7.08, p < 0.001, $\eta_p^2$ = 0.336 for experiment 1a and F(3,45) =
26.4, p < 0.001, $\eta_p^2$ = 0.638 for experiment 1b.
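
The partial eta-squared effect sizes reported alongside each F statistic can be recovered from the F value and its degrees of freedom through the standard identity $\eta_p^2 = F \cdot df_1 / (F \cdot df_1 + df_2)$. The following minimal Python sketch, with a hypothetical function name, applies this identity to the two-way ANOVA of experiments 1a and 1b as a consistency check.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

print(round(partial_eta_squared(30.1, 3, 87), 3))   # 0.509 (duration)
print(round(partial_eta_squared(0.428, 1, 29), 3))  # 0.015 (location)
print(round(partial_eta_squared(3.51, 3, 87), 3))   # 0.108 (interaction)
```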

The results for the burst sequences with repetition rates

of 5Hz (experiment 2, red line with circles in Fig. 3) and

20Hz (experiment 3, blue line with squares in Fig. 3)

showed similar patterns. For a burst duration of 1ms, the

LDEL based on the whole sequence was −3.3 dB for the
repetition rate of 5 Hz and −6.7 dB for the repetition rate of
20 Hz. Overall, the LDELs became more negative (left
panel) or smaller (right panel) with increasing burst repetition
rate. One-way ANOVAs based on the whole sequence
yielded significant main effects of duration: F(3,45) = 20.0,
p < 0.001, $\eta_p^2$ = 0.571 for experiment 2 and F(3,48) = 85.6,
p < 0.001, $\eta_p^2$ = 0.843 for experiment 3.

The LDELs for the Hann-windowed bursts conformed

more closely to integration of energy than for the bursts

with abrupt onsets and offsets but were still negative (exper-

iment 4, Fig. 4, left panel). The LDELs were more negative

for the higher burst repetition rate. For the Hann-windowed

bursts, the rise and fall times were shorter for the 2-ms

bursts than for the 20-ms bursts since the window length

equaled the burst duration. Experiment 4 replicated the

results obtained in the previous experiments for the 2-ms

bursts with rectangular envelopes, with LDELs (based on
the whole sequence) of −1.9 dB for the 5-Hz repetition rate
(experiment 2: −1.4 dB) and −4.4 dB for the 20-Hz repetition
rate (experiment 3: −4.9 dB). A two-way within-subjects
ANOVA yielded significant main effects of window
shape, F(1,16) = 14.6, p = 0.001, $\eta_p^2$ = 0.477, and repetition
rate, F(1,16) = 18.4, p < 0.001, $\eta_p^2$ = 0.535, and a significant
interaction, F(1,16) = 6.60, p = 0.021, $\eta_p^2$ = 0.292.

TABLE II. Overview of experimental conditions.

Experiment  Location  Burst durations (ms)  Burst repetition rate (Hz)  Burst onsets and offsets
1a          Lab       1, 2, 5, 10           10                          Abrupt
1b          Online    1, 2, 5, 10           10                          Abrupt
2           Online    1, 2, 5, 10           5                           Abrupt
3           Online    1, 2, 5, 10           20                          Abrupt
4           Online    2                     5, 20                       Abrupt, Hann window

The results of all experiments were compared with pre-

dictions of a model for partial loudness (Glasberg and

Moore, 2005; but using the time constants of Moore et al.,
2018). Figure 5 shows the participants’ LDELs (abscissa)

plotted against the predicted LDELs (ordinate) based on

three different measures: The maximum of the long-term

partial loudness, which is used to estimate the overall loud-

ness of a sound sequence (Glasberg and Moore, 2002;

Moore et al., 2016); the mean of the short-term partial loud-

ness, which was used by Glasberg and Moore (2005) to esti-

mate partial loudness; and the maximum of the

instantaneous partial loudness as an extreme without any

temporal integration except for the window size of the

Fourier transforms.

Clearly, none of the measures gave accurate predictions. The LDELs predicted using the mean of the short-term partial loudness and the maximum of the long-term partial loudness differed by less than 1 dB. This is not surprising because the attack time constants for both short-term and long-term loudness are longer than the durations of the bursts used here, so both types of loudness were still building up when a burst ended. The predictions were close to the values expected from integration of energy (0 dB in the left panels of Figs. 2 to 4) but slightly positive, implying slightly slower temporal integration.

FIG. 2. LDELs of 20-ms reference bursts and target bursts with durations from 1 to 10 ms, for a 10-Hz burst repetition rate (experiment 1). Red circles and blue squares denote means across participants and runs from the laboratory (experiment 1a) and online (experiment 1b), respectively. Error bars denote ±1 standard deviation. The left panel shows LDELs based on the RMS level of the whole sequence, while the right panel shows LDELs based on the level of the noise bursts themselves. The dashed line in a given panel represents the 0-dB (no difference) line of the other panel. For example, for the left panel the dashed line represents 0-dB difference in Lpeak (or burst level).

FIG. 3. LDELs for 20-ms reference bursts as a function of target burst duration. Red circles and blue squares show mean results for 5-Hz (experiment 2) and 20-Hz (experiment 3) repetition rates. The black line shows the mean results for a 10-Hz repetition rate (experiment 1b) for comparison. Otherwise, as Fig. 2.

For the fastest temporal integration process assessed here, the maximum value of the instantaneous partial loudness, the predicted LDELs were more negative than the obtained LDELs and were close to expectations based on the burst power. For experiment 1, for example, the predicted LDELs based on the whole sequence were −10.7, −7.9, −4.3, and −1.7 dB for burst durations of 1, 2, 5, and 10 ms, respectively. The predicted values were similar for the same burst durations in experiments 2 and 3, since the window sizes of the fast Fourier transform mostly included only a single burst for all repetition rates.
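The "expectations based on the burst power" can be made explicit with a short calculation. If loudness followed burst power alone, the LDEL expressed in burst level would be 0 dB, and the corresponding LDEL expressed in the level of the whole sequence would equal the duty-cycle term 10 log10(target duration / reference duration). The Python sketch below rests on several assumptions that are not stated in this section: rectangular (abruptly gated) bursts, equal repetition rates for target and reference, a sequence level that refers to the burst sequence without the background, and an LDEL defined as target level minus reference level. The function name is hypothetical.

```python
import math


def sequence_ldel_from_burst_ldel(burst_ldel_db, target_ms, reference_ms):
    """Convert an LDEL in burst level to an LDEL in whole-sequence RMS level.

    Assumptions (not from the paper): rectangular bursts, same repetition
    rate for target and reference, background excluded from the sequence
    level, and LDEL = target level minus reference level.
    """
    duty_cycle_term_db = 10.0 * math.log10(target_ms / reference_ms)
    return burst_ldel_db + duty_cycle_term_db


REFERENCE_MS = 20.0  # reference burst duration used in the experiments

for target_ms in (1.0, 2.0, 5.0, 10.0):
    # Burst-power expectation: equal loudness at equal burst level (0 dB).
    expected = sequence_ldel_from_burst_ldel(0.0, target_ms, REFERENCE_MS)
    print(f"{target_ms:4.0f} ms target: {expected:6.1f} dB (whole-sequence level)")
```

Under these assumptions the burst-power values are about −13.0, −10.0, −6.0, and −3.0 dB for 1-, 2-, 5-, and 10-ms targets, roughly 1 to 2.5 dB more negative than the model predictions quoted above. Pure energy integration corresponds to 0 dB on the same scale.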

The results in Fig. 5 suggest that temporal integration

of partial loudness for short, repeated bursts is faster than

predicted by long-term loudness, short-term loudness, or

energy integration. Using the loudness model, a better fit to

the present data might be obtained by assuming an addi-

tional temporal integration process applied to instantaneous

loudness, with an attack time shorter than that used to esti-

mate short-term loudness. However, that is beyond the

scope of the current paper, since any change in model

parameters must not be based on data for impulsive noise

bursts alone.

IV. DISCUSSION

All experiments showed that a simple integration of energy over time, as used by various noise evaluation metrics, did not capture the LDEL values for short burst durations. The deviation of experimental results from predictions based on energy integration was up to 6.7 dB for bursts with abrupt onsets and offsets and the fastest repetition rate (see Fig. 3, left panel). The negative LDELs based on the level of the whole sequence indicate that the temporal buildup of loudness was faster than expected from energy integration. However, it was not instantaneous: shorter bursts were still less loud than longer bursts at equal burst power, and the LDELs were not as negative as would be predicted from instantaneous loudness.

FIG. 4. LDELs for 2-ms target bursts and 20-ms reference bursts for each burst shape. Red circles and blue squares show mean results for the 5-Hz and 20-Hz burst rates, respectively. Otherwise, as Fig. 2.

FIG. 5. LDELs obtained in all experiments versus predicted LDELs. Blue symbols show LDELs predicted from the maximum of the long-term partial loudness, red symbols show LDELs predicted from the mean of the short-term partial loudness, and black symbols show LDELs predicted from the maximum instantaneous partial loudness. Squares show results of experiment 1a (10-Hz burst repetition, laboratory), circles those of experiment 1b (10-Hz burst repetition, online), triangles those of experiment 2 (5-Hz burst repetition), plus signs those of experiment 3 (20-Hz burst repetition), and × characters those of experiment 4 (burst shapes).

Comparable effects of repetition rate have been found for loudness in quiet: the loudness of short bursts of noise increased with repetition rate from 0.3 to 50 Hz (Garrett, 1965). The increase was apparent even for low burst rates, i.e., from 1 to 3 Hz. This effect might be modelled by changing the release time for short-term loudness. However, to cover the low rates investigated by Garrett, a release time constant close to that for long-term loudness [750 ms in Moore et al. (2018)] would be needed. The set of available results thus suggests that the attack time (or build-up) of short-term loudness may be faster for short repeated sounds, but the release time (or decay of loudness after a sound event) may be less dependent on the specific sound or even be similar to that for long-term loudness.

The Cambridge models of loudness (Glasberg and

Moore, 2002; Moore et al., 2016) reflect the concept that the
loudness of time-varying sounds cannot be characterized by

a single value. Rather, there may be several aspects of loud-

ness, such as the momentary loudness of a segment of sound

lasting a few hundred ms such as a syllable (short-term loud-

ness) and the loudness of a longer sound such as a sentence

(long-term loudness). To estimate these different aspects of

loudness, the models incorporate a set of time constants,

representing different temporal-integration processes. It is

possible that a third aspect of loudness is applicable for reg-

ular and predictable but very short bursts of sound, as used

in the present study. Modelling this may require the use of a

very short attack time but with a longer release time to

account for integration across bursts and to account for the

finding that the LDEL for the 1-ms bursts became more neg-

ative with increasing repetition rate.
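The combination of a very short attack time and a longer release time can be illustrated with a generic one-pole smoother that switches coefficient depending on whether its input is above or below the running output. The Python sketch below is not the Cambridge loudness model: it smooths a 0/1 burst envelope and does not compute instantaneous loudness from spectra. All time constants are hypothetical values chosen only for illustration, and the function names are not from the paper.

```python
import math


def burst_train(burst_ms, rate_hz, total_ms, dt_ms=0.1):
    """Return a 0/1 envelope of rectangular bursts sampled every dt_ms."""
    period_samples = int(round(1000.0 / rate_hz / dt_ms))
    burst_samples = int(round(burst_ms / dt_ms))
    n_samples = int(round(total_ms / dt_ms))
    return [1.0 if (i % period_samples) < burst_samples else 0.0
            for i in range(n_samples)]


def smooth_attack_release(signal, attack_ms, release_ms, dt_ms=0.1):
    """One-pole smoother: attack coefficient while the input exceeds the
    running output, release coefficient otherwise."""
    a_attack = 1.0 - math.exp(-dt_ms / attack_ms)
    a_release = 1.0 - math.exp(-dt_ms / release_ms)
    state, out = 0.0, []
    for x in signal:
        coeff = a_attack if x > state else a_release
        state += coeff * (x - state)
        out.append(state)
    return out


def peak(burst_ms, rate_hz, attack_ms, release_ms):
    """Maximum of the smoothed envelope over a 2-s burst sequence."""
    env = burst_train(burst_ms, rate_hz, total_ms=2000.0)
    return max(smooth_attack_release(env, attack_ms, release_ms))


# Hypothetical time constants (ms), for illustration only.
slow = peak(1.0, 10.0, attack_ms=20.0, release_ms=50.0)
fast = peak(1.0, 10.0, attack_ms=2.0, release_ms=50.0)
print(f"1-ms bursts at 10 Hz: slow attack {slow:.2f}, fast attack {fast:.2f}")
for rate_hz in (5.0, 20.0):
    value = peak(1.0, rate_hz, attack_ms=2.0, release_ms=750.0)
    print(f"1-ms bursts at {rate_hz:.0f} Hz, fast attack, long release: {value:.2f}")
```

With these illustrative constants, shortening the attack raises the peak reached by 1-ms bursts, and a long release lets the output accumulate across bursts so that the peak grows with repetition rate. This is only the qualitative pattern described above; the numbers themselves carry no quantitative meaning for loudness.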

The present experiments used stimuli with the same

long-term spectrum for the background and target sounds.

This ensured that the LDELs resulted largely from the tem-

poral properties of the sounds. However, it may have been

more difficult for the listeners to separate the target from

background than would be the case for sounds with distinct

spectra. The impact of the background being included in the

loudness assessment was mitigated by using the same back-

ground level in both intervals of the task and by the instruc-

tions to judge the loudness of the “helicopter-like” sounds.

This instruction might have introduced a bias if participants

had a negative attitude towards helicopters. However, this

bias would have occurred for both intervals, with no or lim-

ited effect on the LDEL.

Consider next the effect that the abrupt onsets and off-

sets had on the present results. Typically, such abrupt

changes in level cause a spectral spread and increase loud-

ness. In the present experiments, this effect was mitigated

by using the same ramps for the target and the reference and

by using a broadband noise carrier, although there may have

been some spectral spread outside the passband below

250 Hz or above 4000 Hz. Experiment 4 showed a statistically significant effect of burst shape, with smaller LDELs

for the Hann-windowed bursts. However, the overall pattern

was the same as for the abruptly gated bursts. In particular,

the LDELs based on the whole sequence were still negative,

more so for the higher burst rate. The higher LDELs associ-

ated with abrupt onsets are important to consider for highly

impulsive sounds such as rotors, sonic booms, or a jackham-

mer. Impulsive sounds have often been reported to lead to

higher psychoacoustic annoyance (e.g., Torija et al., 2022;
Torija and Nicholls, 2022). This is consistent with the

results of experiment 4, which suggest that impulsive onsets

lead to faster temporal integration and thus higher loudness

for short bursts.

Another possible explanation for the fast temporal inte-

gration found in the present study is that the auditory system

was already excited by the background at the time of burst

presentation. With a background noise at a level similar to

that of the bursts, the total loudness before burst onset was

not much less than that of noise and target together. It is

possible that some of the neural activity caused by the back-

ground was perceptually assigned to the burst, increasing

the loudness of the burst and leading partial loudness to

build up very quickly in the first few milliseconds. The few

decibels of additional temporal integration in the first few

milliseconds is comparable to the difference found by

Florentine et al. (1998) between the temporal integration of

a 60 dB SPL tone in quiet and in the presence of an equally

intense masker.

The largest difference between the LDELs obtained online and in the laboratory in experiment 1 occurred for the 1-ms bursts, with LDELs of −4.6 dB (online) and −3.4 dB (laboratory; see Fig. 2). The 1.2-dB difference is somewhat larger than the 0.5-dB difference obtained between different conditions online that were replicated in experiment 4. Apart from the headphones used, the difference may come from the use of different populations. Participants in the laboratory were primarily young students, which resulted in a lower average age by eight years. Furthermore, the hearing of the laboratory participants was verified as being normal using calibrated equipment. For the online participants, normal hearing was self-reported on the day of the experiment and earlier on the recruitment platform. There may have been a small effect of absolute level, although it is likely that, on average, online participants used a level close to 65 dB SPL. It is unlikely that the less quiet environment online had an effect since the background noise level was high enough to mask many everyday sounds. Even though the interaction with location was statistically significant, the effect size of 1.2 dB was rather small and the overall pattern of the laboratory and online results was similar. Thus, we think that online experiments are suitable for studying trends in loudness perception when sounds are compared to a reference, differ only in their temporal characteristics, and are presented in background noise.

V. CONCLUSIONS

(1) Temporal integration of partial loudness for very short

noise bursts was larger than would be expected from

integration of energy. The effect was equivalent to a

level difference of 6.7 dB for 1-ms bursts compared to

20-ms bursts.

(2) Although the partial loudness of short bursts of noise

built up more rapidly than expected from integration of

energy or calculated using short-term partial loudness,

partial loudness still took some time to build up. The

LDEL values were smaller in absolute magnitude than

would be expected from burst power or calculated

instantaneous loudness.

(3) The place of the experiment, in the laboratory or online,

was associated with a statistically significant interaction.

However, the effect amounted to only 1.2 dB. The small

difference means that online studies are suitable for

studying temporal aspects of loudness perception.

ACKNOWLEDGMENTS

This work was supported by the Royal Society, Grant

RG\R2\232164. We thank two reviewers for helpful

comments on an earlier version of this paper.

AUTHOR DECLARATIONS

Conflict of Interest

The authors have no conflicts to disclose.

Ethics Approval

Ethical approval was obtained from the Department of

Speech, Hearing and Phonetic Sciences at University

College London, ID SHaPS-2023-JS-036.

DATA AVAILABILITY

The data that support the findings of this study are

available from the corresponding author upon reasonable

request.

ANSI/ASA (2013). ANSI/ASA S1.1-2013 Acoustical Terminology (American Standards Association, New York).
European Union (2002). Directive 2002/49/EC of the European Parliament and of the Council of 25 June 2002 Relating to the Assessment and Management of Environmental Noise (The European Parliament and the Council of the European Union, Brussels, Belgium).
Florentine, M., Buus, S., and Poulsen, T. (1996). “Temporal integration of loudness as a function of level,” J. Acoust. Soc. Am. 99, 1633–1644.
Florentine, M., Buus, S., and Robinson, M. (1998). “Temporal integration of loudness under partial masking,” J. Acoust. Soc. Am. 104, 999–1007.
Garrett, M. (1965). “Determination of the loudness of repeated pulses of noise,” J. Sound Vib. 2, 42–52.
Glasberg, B. R., and Moore, B. C. J. (2002). “A model of loudness applicable to time-varying sounds,” J. Audio Eng. Soc. 50, 331–342.
Glasberg, B. R., and Moore, B. C. J. (2005). “Development and evaluation of a model for predicting the audibility of time-varying sounds in the presence of background sounds,” J. Audio Eng. Soc. 53, 906–918.
Glasberg, B. R., Moore, B. C. J., Patterson, R. D., and Nimmo-Smith, I. (1984). “Dynamic range and asymmetry of the auditory filter,” J. Acoust. Soc. Am. 76, 419–427.
Gockel, H., Moore, B. C. J., and Patterson, R. D. (2002). “Asymmetry of masking between complex tones and noise: The role of temporal structure and peripheral compression,” J. Acoust. Soc. Am. 111, 2759–2770.
Gockel, H., Moore, B. C. J., and Patterson, R. D. (2003). “Asymmetry of masking between complex tones and noise: Partial loudness,” J. Acoust. Soc. Am. 114, 349–360.
Hellman, R. P. (1972). “Asymmetry of masking between noise and tone,” Percept. Psychophys. 11, 241–246.
Houtgast, T. (1974). “Lateral suppression and loudness reduction of a tone in noise,” Acustica 30, 214–221.
Levitt, H. (1971). “Transformed up-down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477.
Miller, G. A. (1948). “The perception of short bursts of noise,” J. Acoust. Soc. Am. 20, 160–170.
Moore, B. C. (2014). Auditory Processing of Temporal Fine Structure: Effects of Age and Hearing Loss (World Scientific, Singapore).
Moore, B. C. J., Glasberg, B. R., and Baer, T. (1997). “A model for the prediction of thresholds, loudness, and partial loudness,” J. Audio Eng. Soc. 45, 224–240.
Moore, B. C. J., Glasberg, B. R., and Stone, M. A. (2004). “New version of the TEN test with calibrations in dB HL,” Ear Hear. 25, 478–487.
Moore, B. C. J., Glasberg, B. R., Varathanathan, A., and Schlittenlacher, J. (2016). “A loudness model for time-varying sounds incorporating binaural inhibition,” Trends Hear. 20, 1–16.
Moore, B. C. J., Jervis, M., Harries, L., and Schlittenlacher, J. (2018). “Testing and refining a loudness model for time-varying sounds incorporating binaural inhibition,” J. Acoust. Soc. Am. 143, 1504–1513.
Panayotov, V., Chen, G., Povey, D., and Khudanpur, S. (2015). “LibriSpeech: An ASR corpus based on public domain audio books,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015, pp. 5206–5210.
Pollack, I. (1958). “Loudness of periodically interrupted white noise,” J. Acoust. Soc. Am. 30, 181–185.
Richards, A. M. (1977). “Loudness perception for short-duration tones in masking noise,” J. Speech Hear. Res. 20, 684–693.
Scharf, B. (1978). “Loudness,” in Hearing, edited by E. Carterette and M. Friedman (Academic Press, New York), pp. 187–242.
Schlittenlacher, J., and Ellermeier, W. (2015). “Simple reaction time to the onset of time-varying sounds,” Atten. Percept. Psychophys. 77, 2424–2437.
Schlittenlacher, J., and Moore, B. C. J. (2021). “Temporal integration of partial loudness of helicopter-like sounds,” in Proceedings of the INTER-NOISE and NOISE-CON Congress and Conference, pp. 4767–4772.
Schroeder, M. R., Atal, B. S., and Hall, J. L. (1979). “Optimizing digital speech coders by exploiting masking properties of the human ear,” J. Acoust. Soc. Am. 66, 1647–1652.
Stevens, S. S., and Guirao, M. (1967). “Loudness functions under inhibition,” Percept. Psychophys. 2, 459–465.
Thwaites, A., Glasberg, B. R., Nimmo-Smith, I., Marslen-Wilson, W. D., and Moore, B. C. J. (2016). “Representation of instantaneous and short-term loudness in the human cortex,” Front. Neurosci. 10, 1–11.
Thwaites, A., Schlittenlacher, J., Nimmo-Smith, I., Marslen-Wilson, W. D., and Moore, B. C. J. (2017). “Tonotopic representation of loudness in the human cortex,” Hear. Res. 344, 244–254.
Torija, A. J., Li, Z., and Chaitanya, P. (2022). “Psychoacoustic modelling of rotor noise,” J. Acoust. Soc. Am. 151, 1804–1815.
Torija, A. J., and Nicholls, R. K. (2022). “Investigation of metrics for assessing human response to drone noise,” Int. J. Environ. Res. Public Health 19, 1–19.
Zwicker, E. (1954). “Die Verdeckung von Schmalbandgeräuschen durch Sinustöne” (“Masking of narrowband noises by pure tones”), Acustica 4, 415–420.
Zwicker, E. (1963). “Über die Lautheit von ungedrosselten und gedrosselten Schallen” (“On the loudness of unmasked and partially masked sounds”), Acustica 13, 194–211.
Zwicker, E. (1974). “Die Zeitkonstanten (Grenzdauern) des Gehörs” (“Time constants (characteristic durations) of hearing”), Z. Hörgeräte-Akustik 13, 82–102.

