Bioinspir. Biomim. 19 (2024) 046015 https://doi.org/10.1088/1748-3190/ad5129

OPEN ACCESS

RECEIVED

22 December 2023

REVISED

1 May 2024

ACCEPTED FOR PUBLICATION

28 May 2024

PUBLISHED

10 June 2024

Original Content from
this work may be used
under the terms of the
Creative Commons
Attribution 4.0 licence.

Any further distribution
of this work must
maintain attribution to
the author(s) and the title
of the work, journal
citation and DOI.

PAPER

Utilising redundancy in musculoskeletal systems for adaptive
stiffness and muscle failure compensation: a model-free inverse
statics approach
Elijah Almanzor1,3, Taku Sugiyama2,3,∗, Arsen Abdulali1, Mitsuhiro Hayashibe2 and Fumiya Iida1
1 Bio-Inspired Robotics Laboratory, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
2 Neuro-Robotics Laboratory, Department of Robotics, Graduate School of Engineering, Tohoku University, Sendai 980-8579, Japan
3 These two authors contributed equally to this work and are designated as co-first authors.
∗ Author to whom any correspondence should be addressed.

E-mail: taku.sugiyama.t5@dc.tohoku.ac.jp, eda26@cam.ac.uk, hayashibe@tohoku.ac.jp, aa2335@cam.ac.uk and fi224@cam.ac.uk

Keywords:musculoskeletal system, redundancy, neural network, inverse statics, adaptive stiffness, resilience

Supplementary material for this article is available online

Abstract
Vertebrates possess a biomechanical structure with redundant muscles, enabling adaptability in
uncertain and complex environments. Harnessing this inspiration, musculoskeletal systems offer
advantages like variable stiffness and resilience to actuator failure and fatigue. Despite their
potential, the complex structure presents modelling challenges that are difficult to explicitly
formulate and control. This difficulty arises from the need for comprehensive knowledge of the
musculoskeletal system, including details such as muscle arrangement, and fully accessible muscle
and joint states. Whilst existing model-free methods do not need explicit formulations, they also
underutilise the benefits of muscle redundancy. Consequently, they necessitate retraining in the
event of muscle failure and require manual tuning of parameters to control joint stiffness limiting
their applications under unknown payloads. Presented here is a model-free local inverse statics
controller for musculoskeletal systems, employing a feedforward neural network trained on motor
babbling data. Experiments with a musculoskeletal leg model showcase the controller’s adaptability
to complex structures, including mono and bi-articulate muscles. The controller can compensate
for changes such as weight variations, muscle failures, and environmental interactions, retaining
reasonable accuracy without the need for any additional retraining.

1. Introduction

Evolution has equipped humans and animals with
the ability to function and adapt in uncertain, com-
plex environments. Intelligent behaviour emerges
from the interplay between an organism’s central
nervous system (CNS), body, and environment [1,
2]. Vertebrates, including humans, feature a bio-
mechanical structure with redundant muscles organ-
ised in agonist-antagonist pairs for motion [3]. These
muscles can be mono-articulate, influencing a single
joint directly, or multi-articulate, affecting multiple
joints [4–6]. Humans are therefore able to perform
a wide variety of complex and precise movements
by coordinating and recruiting certain muscles based

on the specific demands of a particular movement
[7]. Functional redundancy in muscles contributes to
mechanical resilience, compensating for injuries [8,
9], fatigue [10], ageing [11] and maintaining overall
functionality [12].

In healthy individuals, fatigue directly impacts
muscle performance, activations, and resulting
muscle synergies formovement [13, 14]. For instance,
fatigue in the elbow flexors leads to reduced trans-
joint inhibition of related arm muscles, indicating
compensatory use of redundancy [14]. In a sim-
ilar study [15], participants fatigued from repeated
ankle exertions during a hopping task employed two
strategies for compensation. Firstly, they relied more
on proximal bi-articulate muscles to adjust leg joint

© 2024 The Author(s). Published by IOP Publishing Ltd

https://doi.org/10.1088/1748-3190/ad5129
https://crossmark.crossref.org/dialog/?doi=10.1088/1748-3190/ad5129&domain=pdf&date_stamp=2024-6-10
https://creativecommons.org/licenses/by/4.0/
https://creativecommons.org/licenses/by/4.0/
https://orcid.org/0000-0001-8101-4217
https://orcid.org/0000-0001-9235-8779
https://orcid.org/0000-0001-6179-5706
https://orcid.org/0000-0001-9246-7190
mailto:taku.sugiyama.t5@dc.tohoku.ac.jp
mailto:eda26@cam.ac.uk
mailto:hayashibe@tohoku.ac.jp
mailto:aa2335@cam.ac.uk
mailto:fi224@cam.ac.uk
https://doi.org/10.1088/1748-3190/ad5129


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 1. The musculoskeletal leg model utilised in this paper. It is comprised of six muscles: The mono-articulate
agonist-antagonist pairs iliacus (IL) and gluteus maximus (GM), vastus intermedius (VA) and the short biceps (SB), which drive
the hip and knee joints directly, and the bi-articulate muscles, rectus femoris (RF) and the long biceps (LB). On the right is an
example of an arbitrary leg posture.

stiffness, compensating for fatigued knee and ankle
muscles. Secondly, they exhibited earlier activations
of the gastrocnemius muscle. This strategic distri-
bution of effort among muscles enhances endurance
and robustness, enabling sustained forces without
excessive fatigue [16]. Furthermore, various studies
indicate that individuals experiencing chronic pain
increase activity in other muscles to maintain motor
output while minimising use of the painful muscles
[17, 18].

Stroke often leads to weakness, spasticity, and
abnormal movement patterns in limbs opposite the
cerebral lesion, prompting compensatory strategies
[19, 20]. Stroke patients with severe arm impair-
ments exhibit entirely different muscle synergies in
the affected arm compared to the unaffected arm
[21], for adjusting hand posture [22] and for picking
up objects [23]. Comparisons of unconstrained upper
limb reaching between healthy individuals and stroke
patients reveal that stroke patients recruit additional
degrees of freedom, particularly trunkmuscles, indic-
ating exploitation of muscle and kinematic redund-
ancy in the musculoskeletal structure [19]. Subjects
with spinal cord injury also demonstrate signific-
antly different hand muscle synergies [24]. Cats also
exhibit similar compensatory strategies to maintain
limb kinematics following various peripheral nerve
injuries [25].

Musculoskeletal systems mimic the biomechan-
ical structure of organisms, controlling joints through
tendon-actuator unit agonist-antagonist pairs, unlike
traditional robots that directly drive joints with
motors (see figure 1). Similar to real muscles,
actuators in these systems exhibit compliant beha-
viour, utilising components like pneumatic artifi-
cial muscles or Hill-type muscle models [26, 27].
This compliance provides advantages similar to

their biological counterparts. By employing compli-
ant Hill-type models and co-activation patterns of
agonist-antagonist pairs [28], musculoskeletal sys-
tems can attain variable stiffness in their joints or
as a whole. This feature allows for safer interactions
with the environment compared to rigid counterparts
[29]. Moreover, they can generate greater forces than
purely soft robots, leveraging their rigid skeletons.
Furthermore, owing to their high muscle redundan-
cies, failures and breakage of muscles can be com-
pensated by others through different muscle stimula-
tion activations and combinations to achieve the same
motion [30].

The motor equivalence problem posits that
coordination and movement lack unique solutions
[31, 32]. Theoretical infinite solutions exist for the
same task space position across the kinematic joint,
muscle, activation, and neural spaces [33]. Whilst
resolving redundancy in the CNS remains a key
research focus [34], it is widely accepted that the
CNS employs both feedforward and feedback internal
models for motor control [35, 36] with signals des-
cending frommotor cortical areas to the spinal neural
structures [21]. Movement often exhibits stereotyp-
ical behaviour, such as the ‘two-thirds power law’
observed in humans and primates, where angu-
lar velocity logarithmically relates to curvature [37,
38]. Three main internal model frameworks attempt
to explain these regularities: optimal control the-
ory (OCT), passive motion paradigm (PMP), and
motion primitives (MP). OCT suggests that move-
ment regularities arise from CNS path planning,
optimising trajectories based on objective func-
tions like the minimum jerk model and its variants
[39–41], which predict movements using fifth-order
polynomials [42]. PMP [43], building on equilibrium
point hypothesis (EPH), simulates internal shifts in

2


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

musculoskeletal structure equilibrium to achieve
desired postures through joint torque calculations
based on task-related force-fields, akin to imped-
ance control. MP proposes that complex coordinated
motions result from combining stored primitives
with language-like syntax to reduce dimensionality
[44, 45]. For example, the time-varying synergies
model scales, shifts, and sums coordinated muscle
activations to generate new patterns [46, 47], often
extracting muscle synergies using non-negative mat-
rix factorisation [44].

The OCT and PMP frameworks for explaining
human limb manipulation rely heavily on extens-
ive prior knowledge of the entire system, lead-
ing to scalability issues when explaining how the
CNS recruits redundant muscles for synergistic co-
contractions, even after neurotrauma. For instance,
in cases ofmuscle failure like stroke-induced paralysis
[19], manual reformulation of the cost function for
OCT and the admittance matrix for PMP is neces-
sary. On the other hand, while a limited set of motion
primitives can significantly reduce dimensionality
and complexity, their mathematical extraction and
formalisation are required [44, 48, 49]. Consequently,
these methods generally fail to address CNS plas-
ticity, which allows individuals with neuromuscu-
lar disorders due to stroke or injuries [23, 25] or
those experiencing fatigue [15] and chronic pain [17]
to utilise motor redundancy for alternative control
strategies in achieving target reaching motions.

Another notable framework for explaining motor
redundancy is the concept of central pattern gen-
erators (CPG). Rhythmic movement like respira-
tion, walking, and swimming, is attributed to CPGs
in the spinal cord [50–52]. These CPGs generate
muscle activationsmodulated by higher-level centres,
producing synergistic muscle contractions based
on sensory responses from the agent-environment
interactions [53]. Although some evidence indicates
that certain rhythmic humanmovements persist after
neurotrauma [54], a larger impairment is found in
discrete reaching than cyclic movements, implying
distinct areas in the CNS govern rhythmic and non-
rhythmic movement [52].

Themotor redundancy problem of infinite choice
in solutions is also very present in musculoskeletal
systems. The non-linear coupling and interference
arising from muscle arrangements, multi-articulate
relationships between joints, tendons, and agonist-
antagonist muscles, coupled with the force–length
and force–velocity properties of Hill-type muscles
[55, 56], contribute to increased complexity inmuscle
stimulation solutions within an already sophistic-
ated system [57]. Controllers relying on algebraic
or numerical kinematic or dynamic models can
generate appropriate muscle activations for accur-
ate and precise manipulation [58–61]. However,
these approaches assume comprehensive a priori
knowledge of the entire system, posing challenges

when scaling to intricate morphological configur-
ations that involve a larger number of muscles
with diverse properties, geometry, and arrangements.
This limitation constrains their practical applica-
tions. Moreover, sudden changes such as muscle
failure or fatigue require the reformulation of the
model. Consequently, these controllers do not fully
exploit the functional redundancy inherent in mus-
culoskeletal systems.

To obviate the need for explicit models, model-
free methods employing machine learning have been
employed. Some studies have employed supervised
neural networks, which learn the inverse mapping
between task and muscle spaces, facilitating coordin-
ated control of the end-effector with joint-stiffness
control [27, 29, 62–64]. However, these methods
necessitate manual tuning of parameters that govern
co-contraction, restricting their applicability in the
face of unknown payloads or environmental inter-
actions. Additionally, these model-free approaches
require an additional retraining process to adapt
to the system after muscle failure. The only prior
work that undergoes this retraining process is con-
strained to joint space control [65]. While reinforce-
ment learning approaches have proven successful in
highly redundant musculoskeletal systems [66–68],
they often mandate complex reward function design
and laborious retraining after muscle failure.

Several approaches rooted in human motor con-
trol have emerged, with the majority of these incor-
porating machine learning techniques [64, 69–71].
Consequently, they encounter similar drawbacks as
model-free methods. CPG-based control has also
been explored [53], but it is confined to rhythmic
movements. Another noteworthy approach, inspired
by the EPH, has been implemented [72]. Thismethod
enables movement by altering the system’s equilib-
rium but necessitates manual tuning of reflex circuit
gains and muscle reference lengths to achieve desired
end-effector positions, limiting its application in the
task space.

In this study, we address the limitations of prior
methods by implementing a novel application of a
learning-based motor babbling approach to mus-
culoskeletal systems, facilitating control directly in
the task space [73]. This method is employed to
approximate the inverse statics (IS) mapping between
muscle activations and the task space for musculo-
skeletal structures comprising Hill-type muscle mod-
els and bi-articular arrangements. The IS control-
ler and its formulation provides an alternative view
on how such an internal model could be learnt in
biological organisms as well as artificial musculo-
skeletal systems, and offers two key contributions.
Firstly, it demonstrates resilience to muscle failure. In
contrast to prior works [65, 72], our approach does
not necessitate additional retraining when muscles
experience failure, showcasing efficiency in utilising
functional redundancy. Secondly, it autonomously

3


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

adapts joint stiffness without human intervention,
enabling control of the redundant system even in the
presence of unknown payloads. We extend this prop-
erty to demonstrate that the system can handle inter-
actions with the environment not encountered in
the motor-babbling training data without requiring
further retraining. The controller approximates the
local inversion of forward statics by linearising at the
current state using a neural network. Consequently,
it calculates muscle activation signals at each time
step to reach a desired reference, taking as input
only the target and current task space positions and
the current muscle activations, under the assumption
of quasi-static conditions. The learning and deploy-
ment process is straightforward, requiring no prior
information about the model, such as the number
of muscles, their internal states or configuration, the
overall mechanical structure, or the environment. To
validate the controller’s performance, we conducted
simulations using a musculoskeletal leg. Our exper-
imental results indicate that the controller achieves
reasonable accuracy in controlling target-reaching
and trajectory-tracking tasks, even under variations
such as changes in mechanical weight, muscle failure,
and interactions with the environment.

The following section reviews prior work relat-
ing to the control of artificial musculoskeletal sys-
tems (section 2). Section 3.1 explains the mus-
culoskeletal simulation, the implemented Hill-type
muscles (section 3.2), and the formulation of the IS
controller (section 3.3). Section 4 describes the exper-
imental setup and results. Finally, section 5, gives a
discussion, conclusions and scope for future work.

2. Related works

In this section, we review the literature on the various
methods of control for artificial musculoskeletal sys-
tems. Thesemethods fall into the categories ofmodel-
based, model-free, and biologically inspired control
schemes.

2.1. Model-based control approaches
In model-based control [74], several techniques have
been developed, employing kinematic and dynamic
models based on muscle-joint mappings [75–77].
For instance, Zhao et al [58] successfully controlled
a one-degree-of-freedom system driven by mono-
articulate pneumatic muscles using a switch-sliding
controller. Meanwhile, Kawamura et al [59] devised
a numerical joint-space controller for a redundant
muscle humanoid neck. Although their method can
achieve control even when muscles are deactivated,
it is limited to mono-articulate muscles and neces-
sitates accurate joint angle estimation via an exten-
ded Kalman filter. Tahara et al [60] applied the
virtual spring hypothesis to coordinate movement,
achieving end-effector positioning in a six-muscle
musculoskeletal system with bi-articulate muscles.

However, their approach mandates complete know-
ledge of parameters such as muscle insertion points
and fully accessible joint states. Stanev andMoustakas
[61] determined the relationship between kinemat-
ics and dynamics using null-space projection. This
approach, however, also requires fully accessible states
and is difficult to scale to more complex models due
to the time complexity of the numerical algorithm.

Analytical and numerical methods therefore face
scalability challenges in more advanced systems that
incorporate bi-articulate Hill-type muscles suscept-
ible to change such as breakage or failure. These
methods also demand prior knowledge of the sys-
tem, assuming conditions like zero gravity, minimal
friction, fully accessible muscle and joint states, and
minimal interactions, rendering them laborious and
impractical to formulate.

2.2. Model-free control approaches
Approaches that are model-free [65, 78–81] and
hybrid control [26, 29, 63, 82, 83] employ algorithms
like supervised or reinforcement learning.

2.2.1. Supervised learning
Regarding supervised learning approaches, Masuda
et al [62] employed an auto-encoder neural network
to simultaneously control their pneumatically actu-
ated robot and joint stiffness. They used random
motor babbling for learning forward and inverse kin-
ematic (IK) mappings. Similarly, Driess et al [27]
achieved task space control using goal-babbling and
sequential quadratic programming for the inverse,
incorporating neural networks. They adjusted joint
stiffness through perturbations in the Jacobian null
space. Kawaharazuka et al [28] utilised auto-encoders
for state estimation, allowing joint stiffness adjust-
ments and muscle failure detection. Jantsch et al [29]
combined numerical methods and neural networks
hierarchically to control a humanoid robot’s kinemat-
ics and approximate muscle Jacobian. Tahara et al
[63] implemented a hybrid iterative learning control
approach tolerant to sensor noise in muscle and task
space. Zhong et al [64] proposed using convex hull
theory to reduce redundant muscles, simplifying the
system without sacrificing motion accuracy. Other
works applied Gaussian process regression to model
forward dynamics in agonist-antagonist pneumatic-
ally driven joints [26].

The primary advantage of model-free approaches
lies in their ability to eliminate explicit formu-
lations, allowing for scalability to multi-articulate
muscles. Consequently, the majority of these studies
[27, 29, 62–64] enable task space control without
necessitating knowledge of muscle arrangement.
However, while they can achieve joint-stiffness con-
trol by modulating muscle co-contractions, they
rely on human intervention for parameter tuning,
restricting their applicability in scenarios involving

4


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

unknown payloads or interactions with the environ-
ment. Furthermore, akin tomodel-based approaches,
they fail to fully exploit the functional redundancy
offered by multiple muscles. Notably, among the
reviewed literature, the work of Kawahazura et al [65]
is the sole learning approach that addresses this limit-
ation. Nevertheless, their implementation mandates
an online relearning phase for robot control post-
muscle failure. Additionally, their method requires
a known geometric model to initialize their autoen-
coder network, along with internal states such as
muscle length, and is confined to joint control.

2.2.2. Reinforcement learning
In reinforcement learning, Balaghie et al [84] used
neuro-fuzzy controllers updated by critic agents for
each muscle in a simulation musculoskeletal arm.
The controller tolerated up to 20% uncertainties in
arm mass, inertia, and length for task space control.
Another study [30] developed a deep Q network-
based controller for a musculoskeletal arm model,
employing a phased target-learning framework and
human-inspired noise for stable and efficient explor-
ation of the solution space. Complex reinforce-
ment learning algorithms such as Deep Deterministic
Policy Gradient [67], proximal policy optimisation
[68], and Soft Actor-Critics [85] have also shown
promising control results.

However, reinforcement learning faces optim-
isation challenges due to the vast solution space,
requiring complex reward function design, fully
accessible muscle states, high-dimensional inputs,
and specific initial conditions, leading to a computa-
tionally intensive and laborious training process [67].
Unstable early explorations may result in unstable
actions, risking unsafe movements and potential
damage to the system or the environment.

2.3. Biologically-inspired control approaches
Regarding methods inspired by human motor con-
trol, the minimum jerk model and its variations
have been employed for trajectory optimization, gen-
erating human-like motion for rigid robots [86,
87]. However, these methods necessitate a model
of the system which results in similar disadvant-
ages as model-based approaches. The concept of
motion primitives was also applied to musculo-
skeletal systems [88]. Campbell et al [69] used a
Bayesian Interaction Primitives method for open-
loop trajectories without an analytical model. Zhong
et al [64] employed Hebbian learning in recur-
rent neural networks to modulate control signals.
Fu et al [70] extracted motor synergies from non-
optimal motor babbling data, using them for inverse
dynamics in tracking control. Another approach
used radial basis function neural networks to mod-
ulate muscle synergy amplitudes and timescales for
target-reaching [89]. These approaches utilise imit-
ation learning and hence require demonstrations to

bootstrap the learning. Another biologically inspired
approach involves an emotion-modulated learning
rule for recurrent neural networks, dynamically
adjusting hyperparameters based on reward entropy
[90]. This enhanced the accuracy of point-reaching
tasks for their musculoskeletal arm. Likewise, Chen
et al [57] developed a motor cortex-inspired recur-
rent neural network with reward-modulated learn-
ing, exhibiting the ability to learn new tasks without
catastrophic forgetting. As these approaches incor-
porate machine learning, they exhibit the same
drawbacks as the model-free techniques. CPGs have
also been employed for coordinated motion such
as playing instruments [53] and locomotion [91–
93]. CPG-based approaches, while not dependent
on kinematic or dynamic models, are restricted to
rhythmic motions. Drawing inspiration from the
EPH, Marques et al [72] employed Hebbian learn-
ing for reflex behaviour in musculoskeletal legs,
accomplishing stable hopping and point-to-point
reaching tasks via manual tuning of control gains.
Despite its ability to accommodate muscle failure,
this method necessitates re-tuning of reflex circuits.
Moreover, it lacks the inverse mapping between task
andmuscle space,making it unable to directly control
the end-effector.

3. Methods

3.1. Musculoskeletal leg model
To verify the performance and advantages of our
learning-based feedback controller in musculo-
skeletal systems for manipulation tasks, we based the
structure of our simulation experimental platform
on the seminal neuro-muscular studies by Geyer and
Herr [94], and Marques et al [72] which utilised a leg
structure.

We employed MATLAB & Simscape to construct
the musculoskeletal leg, consisting of three rigid seg-
ments: the pelvis, femur, and tibia. The masses of the
femur and tibia are 11 and 5 kg, respectively. The
pelvis mass is set to zero during training data collec-
tion and all experiments as the pelvis is simulated as
a weld joint, except for the squatting experiment in
section 4.3.4, where the pelvis mass varies between
1,3,5 kg. The femur and tibia lengths denoted as Lf
and Lt, are 0.4 and 0.34 m, respectively, and utilise
the centre of the 3D file geometry as the centre of
mass. The hip and knee joints are simulated as revol-
ute joints. Despite the inherent 3D nature of the leg,
the movements are confined to the 2D sagittal plane.

The leg is actuated by a total of six muscles.
The organization, positioning, and tendon linkages
of these muscles draw inspiration from musculo-
skeletal models outlined in [72, 94]. Employing Hill-
type models (refer to section 3.2) and the Simscape
Belts and Cables utilities, we simulate the muscles,
assuming rigid tendons (refer to figure 1). At the

5


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 2. Left: the main experimental setup for the majority of the end-effector control experiments. The pelvis is mounted above
the blue floor such that the end-effector is free to move within the leg’s operating space. Right: the musculoskeletal leg model used
in the squatting experiment in section 4.3.4. The leg’s end-effector is replaced with a flat plate with a spatial contact force against
the blue simulation floor. A flat plate was used to prevent any slippage. The angle has stiffness and damping included to maintain
stability due to the lack of other muscles.

muscle’s actuating end, the cumulative force, com-
prising active, passive, and damping forces, from any
givenHill-typemusclemodel is converted into torque
and directed to a revolute winch. This winch drives
the muscle spool, which, in turn, is linked to simula-
tion cables, representing rigid tendons that connect to
their respective skeleton segments. To ensure mech-
anical stability and prevent simulation crashes, pul-
leys were introduced to avoid collisions between the
cables and rigid bodies.

Two pairs of mono-articulate agonist-antagonist
muscles are used to actuate the hip and knee joints
directly. The iliacus (IL) and gluteus maximus (GM)
are used for the hip joint, and the vastus interme-
dius (VA) and short biceps (SB) are used for the knee
joint. Another agonist-antagonist pair is used for bi-
articulation which has their winches placed on the
pelvis, and the cable ends on the tibia. These are the
rectus femoris (RF) and the long biceps (LB).

For the motor babbling training data collection
and themajority of the experiments, the pelvis is fixed
in place using a weld joint, such that the rest of the
leg is free to move without any interactions with the
environment (see figure 2, left). Both the hip and
knee joints include a damping of 0.3 Nms deg−1 to
remove any mechanical oscillations during actuation
and maintain the quasi-static assumption. A stiffness
of 0.5 Nmdeg−1 is added to the knee joint to set the
equilibriumposition of the tibia to 45◦. This was done
to allow greater motion for the tibia during motor
babbling, particularly when the SB and VA muscles
are exerting similar amounts of force.

For the final experiment involving a squatting
motion, the foot is replaced by a simple rectangular

plate attached to a revolute ankle joint, which has a
spatial contact force model with a floor (see figure 2,
right). The intent of the experiment is not to test
the dynamic capability of the controller such as
upright balancing, but rather to test its behaviour
under unseen quasi-static environmental interac-
tions. Hence, the pelvis mounting is also replaced
with a prismatic joint such that it is only free to
move up and down in the Z direction. The ankle
joint in the model approximates the absence of ankle
muscles by having stiffness and damping values set at
5 Nms deg−1 and 3 Nmdeg−1, respectively, with an
equilibrium point at 17◦, corresponding to the leg’s
initial squatting posture. These values were found to
give sufficient mechanical support to the rest of the
leg during the squatting motion. The contact force
between the feet and the solid ground is modelled
with parameters 1× 109 N m−1 for stiffness, 1 N s
m−1 for damping, and a value of 1 for the coefficient
of static friction. The length, width and thickness of
the plate are 0.3 m, and 0.02 m respectively, with a
mass of 100 kg. These values were found empirically
and were used to prevent the leg from slipping.

3.2. Hill-type muscle model
Themuscle-tendon units (MTU) actuating the leg are
modelled asHill-typemuscles; the standard approach
for approximating biological humanmuscles [55, 56].
Muscle j receives muscle activation signal aj, and out-
puts the totalmuscle force F j

MTU (equation (1)). Each
muscle has a contractile, a spring and a damper ele-
ment, arranged in parallel (see figure 3). The con-
tractile element provides the active contraction force
based on the muscle activation signal aj to produce

6


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

the force FCE. On the other hand, the spring and
damper elements are responsible for the passive forces
of themuscle, and are dependent on themuscle length
and velocity to produce forces Fspring and Fdamper,
respectively. The spring element models the natural
behaviour of real muscles which tries to contract back
to the resting length whenever it experiences exten-
sion from external forces. The damping element res-
ists the rate of extension or contraction [95]

F j
MTU

(̂
l(t) , v̂(t) ,a(t)

)
= FCE

(̂
l(t) , v̂(t) ,a(t)

)
+ Fspring

(̂
l(t)

)
+ Fdamper (v̂(t)) . (1)

Hence, similar to biological muscles, the total out-
put force FMTU of the muscle is not only due to
the muscle activation of the contractile element, but
also the spring and damper elements which gives the
muscles passive dynamics. As a result, each muscle
can also produce passive forces without any muscle
activation. These components greatly affects the com-
plexity and hence the overall behaviour of the mus-
culoskeletal leg unlike certain tendon-driven artificial
systems, which do not explicitly address the spring
and damper elements [28]. Muscles also affect one
another. For example the force from the antagonist
will directly affect the velocity of the agonist.

The force generation of the contractile element is
modelled as follows:

FCE
(̂
l(t) , v̂(t) ,a(t)

)
= FMax · aj (t) · fl

(̂
l(t)

)
· fv (v̂(t)) (2)

where FMax is the maximum isometric force of the
muscle, l̂(t) is the muscle fiber length normalised to
the initial resting length lref, and v̂ is the fiber velo-
city which is normalized to the maximum fiber velo-
city vref. The values used for lref are the initial lengths
when the leg is at its resting posture (see figure 2, left).
The values of vref were found empirically by testing
the maximum velocity of the individual muscles. For
FMax, we used empirical values that correspond to the
joint limits of the leg when the muscle activations aj

are equal to 1. The values for lref, vref, and FMax were
chosen empirically such that they allowed for a wide
range of motions and are more specific to our mus-
culoskeletal system (see table 6, appendix).

The function fl(̂l(t)) is the muscle active force–
length behaviour, which is modelled using a nor-
mal distribution (equation (3)). The function
fv(v̂(t)) is the active force–velocity behaviour of the
muscle, which we modelled using a sigmoid function
(equation (4))

fl
(̂
l(t)

)
=

1√
2πσ

exp

−

(̂
l(t)−µ

)2

2σ2

 (3)

fv (v̂(t)) =
A

1− ebv̂(t)
(4)

where the mean µ and the variance σ2 of the normal
distribution, both set to a value of 1, and the amp-
litude A, and gain b of the sigmoid function set to
values of 1.8 and 5 respectively, such that the shape of
fl(̂l(t)) and fv(v̂(t)) reflects the typical normalised act-
ive force–length and force–velocity characteristics of
the contractile element in priorworks [56, 95, 96]. See
figure 3(b), for the force–length and force–velocities
curves.

The force output of the contractile element is
therefore dependent on the current time-stepsmuscle
activation aj, and the length and velocity of its fibers
(equation (2)). The muscle activation a scales fc and
is constrained within 0⩽ aj ⩽ 1.

The force generated by the passive spring element
(i.e. passive force–length characteristic), is modelled
as a rectified linear function which increases linearly
with K, the (unit-less) spring constant, if the norm-
alised length l̂(t) is greater than 1 (equation (5)). For
K, we used a value of 1 as muscles can only produce
contraction forces, as well as to reflect values used in
prior literature [72, 97]

Fspring
(̂
l(t)

)
= FMax ·

{
K̂l(t)

(
if l̂> 1

)
0 (otherwise) .

(5)

Similarly, we also modelled the passive damp-
ing (i.e. passive force–velocity characteristic) as a lin-
ear function as given by equation (6), which uses
the (unit-less) damping coefficient C= 1 based on
prior works [72, 95], as well as to prevent Hill-type
muscles from exhibiting unrealistic high-frequency
oscillations during the simulation

Fdamper (v̂(t)) = FMax ·Cv̂(t) . (6)

Whilst the simulation model may differ in spe-
cific values to biological structures, the musculo-
skeletal simulation itself is a vital abstracted model
which captures the important characteristics of the
musculoskeletal systems such as the highly non-
linear behaviour, passive compliance and redund-
ancy, arising from the Hill-type the mono and bi-
articulate agonist-antagonist muscle pairs.

3.3. Inverse static mapping between activations
and end-effector positions
In general, musculoskeletal systems featuring Hill-
type muscles exhibit mappings between the muscle
space, the joint space, and the end-effector space [31,
63]. The forward kinematics mapping which relates
the joint angles θ to the end-effector position x is
given by equation (7). The forward statics mapping
which relates the MTU forces FMTU to θ is given by
equation (8). The forward mapping which relates the
muscle activations a to FMTU is given by equation (1)
in section 3.2.

7


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 3. (a) Schematic of the 3-element Hill-type muscle model. (b) Characteristics of the Hill-type muscle model. On the left is
the normalised force output against the normalised length of the muscle under steady-state conditions. The passive force–length
is produced by the spring, and the active force–length is produced by the contractile element. On the right is the normalised force
against the normalised velocity of the muscle. The normalised active force–velocity curve, along with the muscle activation, scales
the contractile force. The passive force–velocity curve is due to the passive damper of the muscle.

x= f1 (θ) (7)

θ = f2 (FMTU) . (8)

The complete forward statics mapping for an
arbitrary musculoskeletal system is therefore given by
equation (9)

x= f(a) . (9)

Hence, the inverse mapping would be to find the
muscle activation solutions a for a given end-effector
pose x. We apply a learning-based inverse control
approach [98–100], which assumes steady-state con-
ditions. Obtaining the inverse of this function is non-
trivial, due to the redundancy of solutions. These
solutions are non-convexwhichmakes direct learning
of inversions unattainable [79]. However, by linear-
ising and taking the derivative at some arbitrary actu-
ator state a0, the forward differential mapping can
then be obtained, as given in equation (10), where J
is the Jacobian that maps actuator velocities ȧ to state
velocities ẋ

ẋ= J
(
a0
)
ȧ (10)

Discretisation of equation (10) using Taylor
expansion up to the first-order, leads to equation (11).
After some rearranging (equation (12)), the general-
ised Jacobian inverse G can be obtained, which maps
the current system and actuator state at time-step i, xi
and ai respectively, and the system state at time-step
i+ 1, xi+1, to the next time-steps actuator state, ai+1,
as given by equation (13)

∆x≈ J
(
a0
)
∆a (11)

J(ai)ai+1 ≈ xi+1 − f(ai)+ J(ai)ai (12)

ai+1 = G(xi+1 − xi)+ ai. (13)

Then through the process of motor babbling
where the system undergoes random actuator
movement using randomised muscle activations a,

the following mapping (equation (14)) can be
approximately learnt by a neural network:

(xi+1,xi,ai)→ (ai+1) (14)

providing spatial locality is ensured, where
|ai+1 − ai|< ϵ. The ϵ is recommended to be within
5% to 15% of the actuator input range [98]. However,
in our case, due to the non-linear behaviour of the
Hill-typemuscle, we increased this constraint to 30%.
When the neural network is implemented as a con-
troller (see figure 4), it should converge the system to
the state xi+1, providing it is physically and geomet-
rically reachable. Hence the xi+1 acts as the reference
state for the controller. Unlike rigid robots, soft and
musculoskeletal systems have compliance with envir-
onmental obstacles, hence, the current state xi is also
dependent on the environment. Due to the general-
isability of neural networks, it is expected to control
the system even without exploring the whole of the
system’s actuator space.

As we are mainly interested in the generation of
activations that result in suitable muscle coordina-
tion for a given desired target, we deliberately kept the
motion constrained in the Saggital plane. Therefore,
with the addition of the knee joint limits, there are
unique joint angle posture solutions for each desired
end-effector target position. This allows for the direct
analysis of the controller’s behaviour regarding find-
ing solutions to the muscle redundancy problem. To
track the overall posture of the leg, we outputted the
X and Z coordinate positions of the foot end-effector,
as represented by the solid sphere in figure 2.

4. Experimental setup and results

4.1. Model-free controller training and
implementation
To obtain the controller, we use a feedforward neural
network, which has three hidden layers, implemented
in PyTorch. Each layer has 1000 nodes. This network

8


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 4. The implementation of the closed-loop model-free controller. The controller takes in the current state, xi = {Xi} which
encodes the foot position, the desired final target position xi+1 = {Xi+1}, and the current muscle activations
ai = {aLBi ,aGMi ,aSBi ,aVAi ,aILi ,a

RF
i }. The controller then outputs the next time-steps muscle activations ai+1, which brings the leg

closer to the target posture provided it is kinematically feasible.

architecture was built incrementally by adding hid-
den layers until the root mean squared error on a
small 20% training data subset of themodel with only
one muscle enabled, the IL in our case, no longer
improved. Except for changing the number of the
input and the output nodes, we used the same net-
work architecture for each leg configuration, e.g. the
full 6-Muscle model uses the same architecture as the
IL-only model. We intentionally retrained a new net-
work for each muscle configuration to test the per-
formance of the control approach as the musculo-
skeletal leg increased in complexity.When implemen-
ted, the controller receives the concatenated vector
of the current muscle activation signals ai, current
states xi, and desired reference states xi+1 at time step i
(see figure 4). The controller then outputs the muscle
activation signals for the next time step ai+1, such that
the desired reference state is realised. The controller is
run at 500 Hz. A low-pass filter with a gain of 0.01 is
also applied to the output of the controller to main-
tain the quasi-static conditions.

For each control network corresponding to the
different muscle configurations, we trained for 500
epochs using the mean squared error loss function,
a 512 batch size and the ADAM optimizer. The ini-
tial learning rate was 0.0001, with a momentum of
0.9. Although we did not search for the optimum
parameters such as the control and motor-babbling
frequency, learning hyper-parameters, as well as the
amount of training data required, we have found that
these settings resulted in reasonable performance for
the controller.

We conducted various experiments to validate
the model-free inverse statics (IS) controller. The

complexity of the musculoskeletal leg was systemat-
ically increased by incrementally adding muscles to
determine their effects on the controller’s perform-
ance. This led to six muscle configuration cases. In
the order of the experiments conducted, the muscle
configuration cases were the iliacus only (IL-only),
the gluteus maximus only (GM-only), the IL with
the GM (IL+GM), the short biceps only (SB-only),
the vastus intermedius only (VA-only), the SB with
the VA (SB+VA), and finally with all the muscles
enabled, including the bi-articulate rectus femoris
(RF) and the long biceps (LB), whichwe refer to as the
six-muscle case.

To obtain the training data, we generated quasi-
static motor babbling data at a frequency of 1 Hz,
where the musculoskeletal leg would move from one
random posture to another. For each posture, we
randomly generated a final muscle activation a per
muscle, with a ramp duration that varies between
2⩽ t⩽ 10 from the previous posture’s activation.
The new randomly generated activations are within
30% of the maximum activation value of 1. The final
muscle activation is then kept at this value with a dur-
ation 5⩽ t⩽ 13, to ensure steady-state behaviours
are included in the training data. A smoothing filter
is applied to the ramp and steady activations to pre-
vent jerky movements. We generated 5000 s of train-
ing data for the single andmuscle-pairs configuration
(IL-only, GM-only, IL andGMpairs, SB-only, and VA
and SB pairs), with 1000 s for validation. Due to the
increased complexity of the leg with both mono and
bi-articulate pairs, we increased the amount of train-
ing data to 10000 s, with 2000 s of validation. The
random foot positions reached bymotor babbling for

9


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 5. The task-space X and Z positions reached by the leg’s end-effector during the motor babbling training data collection.
For each muscle configuration, the range of the workspace is shown, with the initial posture of the corresponding leg model at the
center for reference. Note that for the mono-articulate muscles which control the hip case, we fixed the knee at 45◦. Similarly, we
fixed the hip angle at 0◦, when knee-related mono-articulate muscles were explored.

Table 1. The mean average errors and their standard deviations for the 25 random target states for the mono-articulate muscles which
have direct actuation over the hip and knee joints. The values are given as a percentage with respect to the overall length of the leg
(0.74 m).

Hip Muscles

IL-only GM-only IL+GM

IS Controller PI IS Controller PI IS Controller PI

X Error (%) 0.23± 0.08 0.12± 0.18 0.08± 0.03 0.03± 0.03 0.19± 0.19 0.14± 0.23
Z Error (%) 0.09± 0.09 0.11± 0.20 0.12± 0.01 0.05± 0.04 0.08± 0.08 0.15± 0.43

Knee Muscles

SB-only VA-only SB+VA

X Error (%) 0.27± 0.05 0.00± 0.00 0.19± 0.07 0.00± 0.00 0.11± 0.11 0.00± 0.00
Z Error (%) 0.39± 0.01 0.00± 0.00 0.07± 0.03 0.00± 0.00 0.09± 0.11 0.00± 0.00

the six different muscle configurations are given in
figure 5.

4.2. Hip and knee control experiments
First, we tested the performance of the controller
on the muscles which have direct authority over the
hip and knee joints. For the hip joint control, we
started by evaluating the IL-only and the GM-only
cases, with all the other muscles removed from the
simulation leg. This was done such that we could
compare the behaviour of the model-free controller
against a simple proportional-integral (PI) feedback
controller. Then with increased leg complexity, we
evaluated the controller’s performance when both the
IL and the GM were enabled as a mono-articulate
agonist-antagonist pair (IL+GM), and again com-
pared to a PI controller. The P and I gains used for
each muscle were 1, and 2, respectively. These values
resulted in no overshoot, minimal steady-state errors
and a rise-time of less than 5 s. For each of the three
cases, we generated different 25 random target states

within the leg’s operating space to obtain the average
performance of the controller on position-reaching
tasks. We then conducted similar experiments on the
muscle cases VA, SB, and VA+SB which control the
knee joint. Similar to the hip experiments, we evalu-
ated different sets of 25 random target points against
the simple PI controller. For these experiments, the
initial state of the leg is where the hip joint is set to 0◦,
with the knee angle set to 45◦ (see figure 5).

Table 1 displays the mean absolute errors and
standard deviations between the attained final steady
states and the sets of 25 randomly selected target
states for the cases involving the mono-articulate
muscles. In general, the IS controller achieves relat-
ively low steady-state errors for the various muscle
cases, however, it does under-perform relative to the
PI controller. The point of implementing the PI con-
troller was not for a direct performance comparison,
but to assess the generated muscle activations. Note
that for the PI controller, we use the joint angles as
the reference and feedback states to move the leg to
the 25 target positions.

10


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 6. The resulting muscle activation signals for the IL+GM and VA+SB cases using the learning-based IS and the PI
controllers for arbitrary examples within the target test data. The PI controller only activates the single muscle that moves to the
target, whilst the IS controller utilises both agonist-antagonist muscles to achieve the same posture.

An interesting behaviour can be seen in figure 6,
where for the same target posture, the IS controller
utilises both agonist-antagonist muscles to realise the
target, whereas the PI controller would only activ-
ate the single muscle that actuates the leg within that
region. This shows that the IS controller is capable
of utilisingmuscle redundancy and can automatically
adjust joint stiffness without any human intervention
unlike othermethods [28]. This capability of control-
ler is the result of learning frommotor babbling train-
ing data, which included various co-contraction levels
of agonist-antagonistmuscle pairs.Hence, due to iter-
ative nature of the formulation, the controller is able
to infer appropriate co-contraction levels at each iter-
ation to reach the target position by minimising the
error between xi+1 and xi.

Note that although the musculoskeletal leg only
has a single degrees of freedom for the agonist (such
as GM) and the agonist-antagonist (such as IL+GM)
muscle configuration, a pure agonist model would
not result in the same strategy compared to the
agonist-antagonist configuration. For example, the
addition of the IL muscle (antagonist to the GM),
induces additional and opposing torque to the hip
joint from either the passive forces (from the vis-
coelastic spring and damper elements) or the con-
tractile element triggered by the muscle activation.
Hence, a GM only model would need less muscle
force, and therefore muscle activation, to bring it
closer to the target.

These advantages are further proven in later
experiments with the complete leg, such as muscle
failure (see section 4.3.3) and varying payloads with
and without external interactions (see sections 4.3.2
and 4.3.4). Due to the complexity of the complete
leg with the mono and bi-articulate muscles, we no
longer compare it to the PI controller in later exper-
iments, as that would require more complex models
and feedback controllers [58–61, 65, 75–77].

Table 2.Mean Average Error and their Standard Deviations for
the 25 random target points for the complete Six-Muscle
configuration. The values are given as a percentage with respect to
the overall length of the leg (0.74 m).

X Error (%) Z Error (%)

2.05± 1.88 1.22± 1.27

4.3. Six-muscle leg control experiments
For the complete six-muscle case, we conducted
six different experiments using the same network
without any retraining. The first three experiments
were to determine the general performance of the
controller when coordinating themuscles of themus-
culoskeletal leg for general manipulation without any
external interactions. The hip of the system is moun-
ted and the leg is free to move. The latter three exper-
iments were performed to determine the behaviour
of the controller when changes to the leg occur, such
as varying payloads due to changes in mechanical
structure, muscle failure, and even environmental
interactions.

4.3.1. Randomised target postures, randomised initial
postures and trajectory tracking
Similar to the hip and knee joint experiments, this
first six-muscle experiment was performed to obtain
the average performance of the controller on 25
random target postures within the leg’s workspace
to investigate basic control capabilities when bi-
articulate muscles are included. As seen in table 2,
the mean average errors and standard deviations are
approximately an order of magnitude higher than the
simple mono-articulate cases (see table 1). This is
likely due to the compounding effects of muscle error
during the controller training process, which propag-
ates to the overall structure with all muscles included.

A second experiment was performed to determ-
ine the generalisability of the controller for

11


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Table 3.mean average error and their standard deviations for 3 random target points when the leg is initialised at 10 random postures.
The values are given as a percentage with respect to the overall length of the leg (0.74 m).

Target 1 Target 2 Target 3

X Error (%) 3.72± 3.69 3.26± 3.58 5.04± 3.32
Z Error (%) 2.19± 1.80 1.01± 1.00 3.59± 1.31

Figure 7. An example of X and Z posture reaching target performance and the generated muscle activations for when the leg
starts at arbitrary random initial postures for Target 1. The controller can generate diverse muscle synergies to reach the target,
adapting based on the initial leg position. This demonstrates the controller’s capability to identify appropriate global solutions for
the redundancy problem. Table 3 displays the mean percentage errors corresponding to this experiment..

point-to-point reaching tasks, specifically when the
leg starts from an arbitrary posture. Here, we chose
three random target postures but initialised the leg
at 10 random starting postures. Table 3 shows the
steady-state mean average errors of the reaching tasks
from random starting points. There is a slight reduc-
tion in performance relative to the previous exper-
iment (table 2), where the leg is initialised from
the resting posture. This shows that the errors are
not only dependent on the initial position of the
end-effector but also on the target. Nevertheless, the
errors in the X and Z axes are on average approxim-
ately ⩽ 5% relative to the overall length of the leg.
This shows that the controller is easily scalable to
more complex mechanical structures with coupled
and non-linear interactions between the actuation
muscles without significant reductions in perform-
ance. Furthermore, the standard deviations are relat-
ively low, which indicates that the leg model reached
the same target position from significantly different
starting points with reasonable accuracy.

Figure 7 shows an example of the leg incorpor-
ating different muscle activations, and hence muscle
coordination, for reaching the same target position
from various initial postures.

In the third experiment, we assessed the control-
ler’s performance in tracking a specified trajectory.
An ellipse was employed to evaluate how well the
controller handled paths with both large and small

curvatures. The ellipse has the radii of 8 and 16 cm
in the X and Z axes respectively. The centre of the
ellipse is at X= 8 cm and Z=−65 cm within the
legs workspace. We also discretised the ellipse into
90 target positions. To establish appropriate control
parameters for the controller, we modified the refer-
ence posture xi+1 with either a 1 s or 15 s time delay.
This adjustment aimed to investigate the system’s
response under conditions where the target update
aligns with the frequency of motor babbling train-
ing data at 1 Hz and significantly slower intervals at
0.067 Hz (every 15 s).

The resulting behaviour for the 1 and 15 s time
delays are given in figure 8(a). The mean average res-
ultant error for the entire tracking of the ellipse is
given in table 4. The errors were calculated by taking
the difference between the position reached by the leg
at every 1 and 15 s. These are denoted by the crosses
in the figure. It can be seen that the shorter time delay
results in smoother trajectories with lower errors.
The saw-tooth pattern in the trajectory observed with
the 15 s time delay can likely be attributed to the
interplay of non-linearmuscle behavior, gravitational
effects, and suboptimal convergence characteristics of
the controller.

We also tested the repeatability of the trajectory
tracking. Figure 8(b) shows the resulting trajectory
when the controller is instructed to perform the same
ellipse five times using the 1 s time delay. Variations

12


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 8. Tracking of an arbitrary ellipse with 90 way-points, within the leg’s operating workspace. (a) Tracking performance
when the time delay to the next target xi is varied between 1 and 15 s. A smoother trajectory is achieved with a smaller time delay.
Table 4 displays the corresponding mean percentage errors. (b) Tracking repeatability performance when the leg is cycled through
the ellipse five times. The controller performs the same trajectory between the third and fifth cycles. Table 5 displays the
corresponding mean percentage errors..

Table 4. The effects of the time delay on the mean average
resultant error and standard deviations in the X and Z axes for
entire tracking of the ellipse trajectory given in figure 8(a). The
final position after each time delay is compared with the
individual waypoints of the desired trajectory to obtain the errors.
The values are given as a percentage with respect to the overall
length of the leg (0.74 m).

Time delay

1 s 15 s

Error (%) 1.92± 0.73 2.20± 1.92

in the resulting trajectory are evident only between
the first and third cycles, while the fourth and fifth
cycles replicate the third cycle. This indicates the con-
troller’s sensitivity to changes in the leg’s initial pos-
ition during trajectory tracking. In the absence of
sensor noise in the simulation and the deterministic
nature of feedforward neural networks, the leg con-
verged to a repeated final position by the end of the
second cycle. This final position then serves as the
starting point for the subsequent cycle, resulting in
the repetition of the same trajectory from the third
cycle onward. It is noteworthy, however, that minimal
errors persist and remain consistent with the earlier
experiments.

4.3.2. Performance under payload changes
For this fourth experiment, we varied the weight
of the leg’s skeleton to mimic mechanical changes
that can occur to physical systems. For example, the
replacement of parts orwhen the it is loadedwith aux-
iliary components such as a portable power source.
We varied the mass between ±20% of the original
weight. We again obtained the performance of the
controller for 25 random target postures.

Figure 9 compares the steady-state mean aver-
age errors of the target reaching tasks with different
bone weights. From figure 9(a), it can be seen that
the higher the deviation from the original weight,
the larger the error in both the X and Z axes.
Figure 9(b) shows the trajectory for an arbitrary

example. However, their standard deviations greatly
overlap, suggesting that the changes in structural
payload have a minute effect on the control per-
formance accuracy. Hence, the controller can accom-
modate such changes without retraining and without
including such information in the training data. This
is shown in figure 9(c), where the controller automat-
ically adjusts the required muscle activations for all
muscles, and hence, applies an appropriate stiffness
of the joints, without any further intervention.

4.3.3. Performance under muscle failure
Musculoskeletal systems have the advantage ofmuscle
redundancy. If a muscle undergoes fatigue or failure,
it is possible to achieve the same motion by incor-
porating other muscles. Hence, for this experiment,
we determine the generalisation of the learning-based
controller when muscles are disabled in the model.
We again obtained results for 25 random target pos-
tures. Note that the disabled muscle is still present
within the simulation, meaning it still has passive
forces from the spring and damping elements, how-
ever, it does not contract when given muscle activ-
ation a, similar to partial paralysis induced by stroke
[19]. Stroke patients experience difficulties in regulat-
ing stretch reflex thresholds of muscles. This directly
affects muscle activations and coordination, resulting
in partial actuation [20, 21, 101].

Figure 10(a) compares the steady-state mean
average errors when disabling each of the six muscles
individually. Depending on which muscle was dis-
abled, the error variations differed greatly. For
instance, disabling the GM muscle considerably
increased errors of the X and Z positions. On the
other hand, disabling the VA muscle induced almost
no change in the error for all target states. This sug-
gests that certain muscles are more prominent than
others. Nevertheless, the controller can adapt to fail-
ures of certain muscles without retraining. This can
be seen in the activation graphs in figure 10(b),
wheremuscles that are turned off are compensated by
increasing the activation of the other muscles.

13


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Table 5.mean average resultant Error in the X and Z axes and their standard deviations for the different cycles of the ellipse trajectory
tracking given in figure 8(b). The final position after each time delay is compared with the individual waypoints of the desired trajectory
to obtain the errors. The values are given as a percentage with respect to the overall length of the leg (0.74 m).

Number of cycles

1st Cycle 2nd Cycle 3rd Cycle 4th Cycle 5th Cycle

Error (%) 1.92± 0.73 3.31± 2.22 2.41± 1.39 2.41± 1.39 2.41± 1.39

Figure 9. (a): Mean average error and the standard deviations for the various weight deviations of the femur and the tibia relative
to the original weight. The Standard Deviations overlap greatly suggesting no significant changes in the controller performance.
The values are given as a percentage with respect to the overall length of the leg (0.74 (m). (b) An example trajectory for the
minimum (−20%) and maximum weight (20%) deviations against the original weight (0%). (c) Example muscle activations for
the example given in (b). The IS controller automatically changes the required muscle activation to accommodate lighter and
heavier structures.

The IS controller determines whether a muscle is
deactivated based on the current state. For example,
for the disabled GM muscle (see figure 10(b)), an
arbitrary current input to the network at time t= i
is equivalent to (xi+1,xi,ai) = (xi+1,xi,{aLBi ,aGMi =
0, ..,aRFi }). As the controller has been trained to min-
imise the error between xi+1 (the reference) and
xi (the current end-effector position), the corres-
ponding output muscle activations for the next time
step ai+1, automatically scales the muscle activations
for the non-deactivated muscles. This is the result
of the learnt pseudo-inverse approximation of the
local Jacobian which contains approximated map-
pings between xi,xi+1,ai to ai+1 in various task-space
positions.

We also tested the performance of the system
when up to three muscles were disabled (figure 11). It
is reasonable that the more muscles are disabled, the
higher the error due to more limited movement. For
example, disabling the SB and LB muscles removes
the ability to perform knee flexion.

4.3.4. Performance under external physical interaction
For the final experiment, the controller was evalu-
ated if it can tolerate interactions with the envir-
onment. The leg was controlled to track a squat-
ting motion trajectory, with its foot always in contact
with the ground (see section 3.1 for the experimental
setup). The target reference posture cycles between
−68 cm, where the end-effector is further from the

14


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 10. (a) The mean average error and standard deviations of 25 target states for the cases where each muscle has the
contractile element disabled. The values are given as a percentage with respect to the overall length of the leg (0.74 (m). The
disabling of the GM resulted in the highest error which suggests importance in the overall structure. (b) An arbitrary example of
muscle activations for the disabled GM and VA cases for the same target state. The IS controller was able to compensate for the
disabled muscles by increasing the muscle activation of the others, thereby showing the utilisation of muscle redundancy.

Figure 11. The mean average error and standard deviations of 25 target states for up to three muscles with the contractile element
disabled. The values are given as a percentage with respect to the overall length of the leg (0.74 m). As expected, the more muscles
are disabled, the higher the error. Removal of both the SB and the LB removes the capability to perform knee flexion.

pelvis coordinate axes resulting in a stood-up posture,
and−58 cm which results in the squat posture in the
Z axis. The target X axis is kept at zero. The refer-
ence is updated after every 15 s to give the controller
enough time to stabilise due to quasi-static assump-
tions. This experiment was conducted with different
payloads by varying the weight of the pelvis to 1 kg, 3
kg, and 5 kg.

Figure 12(a) shows the trajectory of the pelvis
against the squatting target reference. No slipping
was observed during all squatting. For all payload
settings, the IS controller achieved stable squatting
without any retraining, even though the training pro-
cess did not address interaction with the environ-
ment. Reasonably accurate tracking performance was
achieved when the payload was at 1 kg. A slight
reduction in tracking performance is seen, as the

payload is increased to 3 and 5 kg. However, an inter-
esting behaviour can be seen in the muscle activation
graphs at the bottom of figure 12(b) as the payload
is increased. These graphs show that the controller
automatically outputs the appropriate muscle activ-
ation signals depending on the payload similar to
our weight deviation and muscle redundancy exper-
iments in sections 4.3.2 and 4.3.3. For example, as
the payload gets heavier, the higher the RF activa-
tion signal the controller outputs at a standing pos-
ition to maintain the posture. Similarly, the control-
ler incorporates higher SB and GM activations for
the heaviest payload at 5 kg, whilst decreasing the VA
and LB activations during the squat posture. Thus,
the controller can automatically adapt the leg’s stiff-
ness required for payloads at least 30% of the overall
weight by coordinating andutilising the redundancies

15


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

accordingly. Future work will examine the limits of
this behaviour.

5. Discussion and conclusions

Musculoskeletal systems featuring agonist-antagonist
pairs of muscles offer the advantages of functional
redundancy, enabling compensation for muscle fail-
ures, and allowing for joint-stiffness adjustment by
varying co-contraction levels of muscle pairs. This
feature is particularly useful for coordinating move-
ment under diverse payloads. However, akin to their
biological counterparts, these systems also encounter
the motor equivalence problem, which posits that
coordination and movement have infinitely many
solutions.

In this study, we introduce the model-free inverse
statics (IS) controller, capable of generating suitable
and appropriate muscle activations depending on the
task requirement. This enables the musculoskeletal
leg to exhibit coordinated behaviour even in the pres-
ence of muscle failure and varying payloads without
the need for retraining, distinguishing it from other
model-free or bio-inspired approaches.

The IS controller demonstrated reasonable accur-
acy, displaying minimal error deviations across all
control experiments. The experimental results have
shown that the controller is easily scalable from
simple mono-articulate cases to more complex struc-
tures that augment the system with bi-articulate
muscles without amending the network architec-
ture. Through the straightforward process of train-
ing on new motor-babbling data on the same archi-
tecture, deployment and control of the system were
achieved without requiring any assumptions about
the arrangement, the muscle model, and the coupling
and interference of the muscles, unlike model-based
methods. The controller effectively generated appro-
priatemuscle activations to reach desired targets, even
when faced with redundancy in solutions within the
actuator and task (end-effector) space. This was val-
idated in the first three experiments involving the
complete six-muscle case. Consistently low errors and
standard deviations were achieved in target-reaching
tasks, irrespective of the initial state of the leg, as well
as in trajectory tracking. While the controller’s per-
formance in mono-articulate cases falls short com-
pared to the PI controller, the errors remain com-
parable to the accuracy and variance observed in
human arm movements in point-to-point reaching
tasks [102].

The controller’s formulation is the learning-based
equivalent of a resolved motion rate controller [79].
Leveraging the generalisability of neural networks
and the iterative nature of the approximated pseudo-
inverse of the local Jacobian, the controller can con-
verge close to desired targets, even in the pres-
ence of mechanical changes in the system [99, 100].
Consequently, the control system has proven its

capacity to compensate for inertial changes in the
musculoskeletal system, exhibiting an insignificant
reduction in performance compared to the baseline
(no weight change), as demonstrated in our weight
deviation experiment in section 4.3.2. An essen-
tial characteristic of musculoskeletal systems is their
ability to control joint stiffness based on the co-
contraction of agonist-antagonist muscles, depend-
ing on the task requirements [28]. As depicted in
figure 9(b), the controller autonomously adjusts the
stiffness of the joints for both lighter and heavier
musculoskeletal structures by modifying the gener-
ated muscle activations accordingly. This adjustment
occurs without the need for further retraining or
human intervention.

Even in the event of actuation failure or fatigue,
motion tasks remain achievable by utilising altern-
ative muscles. Therefore, it is crucial to employ a
controller capable of leveraging this redundancy.
However, previous works on learning-based con-
trollers have not fully addressed this advantage [27,
29, 62–64] or necessitate some form of re-learning
process [65, 72]. For musculoskeletal systems, a
malfunctioning muscle induces significant changes
in mechanical functionality and operation. Patients
with neuromuscular disorders still have their muscles
intact with passive dynamics. We modelled these
partial actuation scenario’s in our muscle disabling
experiment (see section 4.3.3) to determine the res-
ulting muscle synergy for coordination using our
learning-based IS controller.

We demonstrated that our controller can gen-
eralise to cases where individual muscles are deac-
tivated while maintaining performance. As the IS
controller has been trained to minimise the error
between xi+1 and xi, the deactivatedmuscles are com-
pensated by increasing the activation of the other
muscles. Similar strategies are employed by biolo-
gical systems. For example, in the study by Cheung
et al [21], where the activity of the other muscles
are increased to compensate for arm impairment, the
IS controller increased muscle activations for the bi-
articulate muscles LB and RF when either the GM or
the VA muscles were disabled as seen in figure 10(b).
In addition, similar to Bonnard et al’s [15] study
on the effects of fatigue on hopping, the biggest
increase in the activated muscles are seen in the bi-
articulate muscles to compensate for the disabled
mono-articulate muscles. This shows that exploiting
muscle redundancy is important in both biological
and artificial systems for compensatory strategies,
and that it is possible to achieve a simple internal feed-
back model which can compensate for muscle fail-
ure by learning directly on stochastic motor-babbling
movements produced by a musculoskeletal struc-
ture without requiring extensive knowledge about the
system.

The higher mean average error for the deactiv-
ated GM in figure 10(b) is likely attributed to its

16


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Figure 12. Z (height) positions and the corresponding muscle activation signals for the squatting task under varying payloads
(the weight of the pelvis). The shaded and non-shaded areas indicate the bent and stood-up postures, respectively. Although not
seen in the training data, the IS controller can account for the interaction by adjusting the muscle activation signals accordingly.
For example, the magnitude of the muscle activation signals for the RF, SB and GM increased dramatically during the bent
posture to accommodate heavier payloads without any further retraining or intervention.

importance in the embodiment andmovement of the
leg model, at least within our model. Nevertheless,
the controller can still recruit other muscles to reach
the target position by employing a different yet suit-
able muscle synergy. We also examined the control
performance when more than one muscle is dis-
abled. As depicted in figure 11, the positional error
increased with the additional disabled muscles. This
increase is attributed to the mechanical limitation
arising from disabling muscles that facilitate knee
flexion. Nonetheless, it highlights the control’s capab-
ility to tolerate the failure of more than one muscle, a
feature not accommodated by priormodel-free works
[65].

The mapping between the state posture of the leg
and the IS solution is not unique, as it is influenced
by the initial state of the leg, muscle configurations,
and the internal and external forces applied to the
system. Leveraging the advantages of joint stiffness
adaptation and the utilisation of muscle redundancy
for different solution synergies, we demonstrate that
the controller can also withstand interactions with
the environment without requiring prior informa-
tion from motor babbling data or further retraining.
This underscores the benefits of the local inverse stat-
ics strategy compared to model-free global IKs tech-
niques like goal babbling [103], which learn specific
solutions to the inverse problem. As illustrated in
figure 12, the controller can adapt its muscle activ-
ation outputs to perform a squatting motion under
varying payloads. The RF muscle activation, in par-
ticular, exemplifies this adaptability, with increased

muscle activation when standing up to maintain
posture.

In conclusion, previousmodel-basedmethods for
biological and artficial systems necessitate compre-
hensive prior knowledge of the musculoskeletal sys-
tem, relying on several assumptions such as negli-
gible friction, fully accessible muscle states, minimal
external interactions, and a static mechanical struc-
ture. On the other hand, model-free control meth-
ods, although easily scalable to the complex struc-
ture of musculoskeletal systems, are limited in har-
nessing advantages like muscle redundancy for man-
aging muscle failure and adapting to changes in
body weight. In this study, we introduce the use of
the local inverse statics formulation as an alternat-
ive view on how feedback controllers can be learned
in both biological and artificial musculoskeletal sys-
tems, that is practical and straightforward to train
and deploy. The controller maintains reasonable per-
formance even when the system’s body undergoes
substantial changes in self-configuration and physical
interactions, without the need for explicitly formu-
lating specific mappings between muscle, joint, and
end-effector spaces.

5.1. Future work
In future work, improvements to control accuracy
will be pursued through the optimisation of hyper-
parameters, including the number of nodes and hid-
den layers. The observation that a 1 s time delay yields
smoother trajectories compared to a 15 s delay for
trajectory tracking warrants further investigation in

17


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

future work. A more detailed analysis will be conduc-
ted to identify the optimal time-delay value.

Unlike traditional rigid robots, it is not straight-
forward to decouple the kinematics from the dynam-
ics in musculoskeletal systems. For example, IK con-
trol and time-dependent trajectory tracking that is
vital in production lines can be achieved through vari-
ous analytical or learning-based approaches provided
the robot uses sufficiently powerful motors that can
accommodate the mass and inertia of the links, and
with very simple interactions or payloads that is
within the performance of the motors. As a result,
the robotics community often treats IKs and dynam-
ics as separate problems and applies them based on
task requirements.

In musculoskeletal systems, infinite solutions
exist for achieving the same task space position across
the joint space, which governs kinematics, as well as
in the muscle activation and neural spaces, which
govern dynamics and forces [31]. Hence, it is not
possible to separate the desired posture from the
forces required to move to, and maintain that pos-
ture in musculoskeletal systems. Therefore, to con-
trol joints and achieve desired postures, coordina-
tion of agonist-antagonistic pairs of mono or multi-
articulate muscle type actuators (such as Hill-type
or McKibben muscles), each with their own pass-
ive (spring and damper elements) and active (con-
tractile element) dynamics, is necessary, regardless
of the speed or acceleration intensity of the move-
ment required. Hence, for musculoskeletal systems,
the primary distinction between slow (or quasi-static)
and fastermovements lies in the higher inertial forces.
However, due to the quasi-static assumptions of the
feedback controller, our current work is limited to
slower movements, with higher inertial forces cur-
rently out of scope.

We aim to address this limitation through the
addition of a feedforward component. It is widely
accepted that humans use a hybrid of feedback
and feedforward models within the CNS for fast
movements [35, 36]. Feedforward models allow for
fastermovements as they are not constantly reliant on
proprioception, and can predict for a longer horizon,
rather than a single time-step ahead. This extension
can be achieved by incorporating network architec-
tures suitable for time-series data, such as recurrent
neural networks [104] or transformers [105].

The current simulation deliberately constrains
movement to the 2D sagittal plane to simplify the ana-
lysis of the controller’s capabilities with a musculo-
skeletal structure. Due to iterative nature of the for-
mulation, which finds the IK solution to the target
goal given the systems current posture, the approach
is scalable to 3D joint kinematics through the addition
of a Z axis component in the state vector. Prior works
with kinematically hyper-redundant soft continuum
robots using the same learning controller have shown
scalability to 3D kinematics [99, 100]. Hence, future

research will involve implementing and experiment-
ing with the controller in 3D simulations and on a
physical robot as an extension of the present study.
Since the controller only requires task space state and
control inputs, a physical system would only require
a motion tracker for the end-effector, eliminating the
need for additional sensors for joint angles or internal
states of Hill-type muscles. Investigating the impact
of sensor noise on the controller’s performance for a
musculoskeletal system would be an ideal avenue for
exploration.However, it has been shown that the con-
troller can tolerate noise for 3D soft robots in prior
work [99].

Finally, the extension of the work to faster
movement also lays the foundation for extending
the work to locomotion, rather than manipula-
tion. Locomotion and manipulation are generally
very distinct problems with vastly different set of
requirements [106]. Themore dynamic requirements
of biological walking gaits among other factors such
as balancing, are currently out of scope of the paper.
Nevertheless locomotion is also an important aspect
to explore, which will be addressed by extending the
simulation to a bi-pedal structure.

Data availability statement

All data that support the findings of this study are
included within the article (and any supplementary
files).

Acknowledgment

This work was supported by the AgriFoRwArdS
Centre for Doctoral Training programme under
the UKRI grant [EP/S023917/1], the Jersey Farmers
Union, JSPS Grant-in-Aid for JSPS Fellows Grant
Number JP24KJ0338, JST SPRING [JPMJSP2114],
and the GP-Mech Program of Tohoku University. For
the purpose of open access, the author has applied
a Creative Commons Attribution (CC BY) licence to
any Author Accepted Manuscript version arising.

Conflict of interest

The authors declare that the research was conducted
in the absence of any commercial or financial rela-
tionships that could be construed as a potential con-
flict of interest.

Author contributions

Authors E A and T S made a substantial, direct
and intellectual contribution to the work as co-first
authors. They contributed equally to the develop-
ment of the simulation, carrying out the experiments
and writing the manuscript. F I supervised the work.
A A, M H, and F I contributed to the revisions. All
authors approved it for publication.

18


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

Appendix

Table 6. Parameters of muscles utilized in the experiments.

IL GM VA SB RF LB

lref (m) 0.502 612 0.407 100 0.195 512 0.194 165 0.293 156 0.247 841
vref (m s−1) 1.0 1.0 1.0 1.0 1.0 1.0
FMax (N) 220 220 320 320 760 760

ORCID iDs

Elijah Almanzor https://orcid.org/0000-0001-
8101-4217
Taku Sugiyama https://orcid.org/0000-0001-9235-
8779
Mitsuhiro Hayashibe https://orcid.org/0000-
0001-6179-5706
Fumiya Iida https://orcid.org/0000-0001-9246-
7190

References

[1] Latash M L 2010 Motor synergies and the
equilibrium-point hypothesisMotor Control 14 294–322

[2] Pfeifer R, Lungarella M and Iida F 2007 Self-organization,
embodiment and biologically inspired robotics Science
318 1088–93

[3] Marjaninejad A and Valero-Cuevas F 2019 Should
anthropomorphic systems be “redundant”? (Springer Tracts
in Advanced Robotics) vol 124 (Springer) pp 7–34

[4] Bizzi E, Chapple W and Hogan N 1982 Mechanical
properties of muscles: implications for motor control
Trends Neurosci. 5 395–8

[5] Minetti A E 1994 Contraction dynamics in antagonist
muscles J. Theor. Biol. 169 295–304

[6] Qiao H, Chen J and Huang X 2021 A survey of
brain-inspired intelligent robots: integration of vision,
decision, motion control and musculoskeletal systems IEEE
Trans. Cybern. 52 11267–80

[7] Prilutsky B I and Zatsiorsky V M 2002 Optimization-based
models of muscle coordination Exerc. Sport. Sci. Rev.
30 32–38

[8] Koshland G F, Galloway J C and Farley B 2005 Novel muscle
patterns for reaching after cervical spinal cord injury: a case
for motor redundancy Exp. Brain Res. 164 133–47

[9] Benton A M, Amiri P, Henson D P, Sivapuratharasu B,
Mcgregor A H and Bull A M J 2023 Characterization of
muscle recruitment during gait of bilateral transfemoral
and through-knee persons with limb loss Front. Bioeng.
Biotechnol. 11 1128528

[10] Singh T and Latash M L 2011 Effects of muscle fatigue on
multi-muscle synergies Exp. Brain Res. 214 335–50

[11] Savelberg H H, Verdijk L B, Willems P J and Meijer K 2007
The robustness of age-related gait adaptations: can running
counterbalance the consequences of ageing? Gait Posture
25 259–66

[12] Côté J N, Mathieu P A, Levin M F and Feldman A G 2002
Movement reorganization to compensate for fatigue during
sawing Exp. Brain Res. 146 394–8

[13] Psek J A and Cafarelli E 1993 Behavior of coactive muscles
during fatigue J. Appl. Physiol. 74 170–5

[14] Aymard C, Katz R, Lafitte C, Le Bozec S and Pénicaud A
1995 Changes in reciprocal and transjoint inhibition
induced by muscle fatigue in man Exp. Brain Res.
106 418–24

[15] Bonnard M, Sirin A V, Oddsson L and Thorstensson A
1994 Different strategies to compensate for the effects of

fatigue revealed by neuromuscular adaptation processes in
humans Neurosci. Lett. 166 101–5

[16] Avrillon S, Guilhem G, Barthelemy A and Hug F 2018
Coordination of hamstrings is individual specific and is
related to motor performance J. Appl. Physiol. 125 1069–79

[17] Lomond K V and Côté J N 2010 Movement timing and
reach to reach variability during a repetitive reaching task
in persons with chronic neck/shoulder pain and healthy
subjects Exp. Brain Res. 206 271–82

[18] Lund J P, Donga R, Widmer C G and Stohler C S 1991 The
pain-adaptation model: a discussion of the relationship
between chronic musculoskeletal pain and motor activity
Can. J. Physiol. Pharmacol. 69 683–94

[19] Roby-Brami A, Feydy A, Combeaud M, Biryukova E V,
Bussel B and Levin M F 2003 Motor compensation and
recovery for reaching in stroke patients Acta Neurol. Scand.
107 369–81

[20] Jones T A 2017 Motor compensation and its effects on
neural reorganization after stroke Nat. Rev. Neurosci.
18 267–80

[21] Cheung V C K, Turolla A, Agostini M, Silvoni S, Bennis C,
Kasi P, Paganoni S, Bonato P and Bizzi E 2012 Muscle
synergy patterns as physiological markers of motor cortical
damage Proc. Natl Acad. Sci. 109 14652–6

[22] Raghavan P, Santello M, Gordon A M and Krakauer J W
2010 Compensatory motor control after stroke: an
alternative joint strategy for object-dependent shaping of
hand posture J. Neurophysiol. 103 3034–43

[23] Roby-Brami A, Fuchs S, Mokhtari M and Bussel B 1997
Reaching and grasping strategies in hemiparetic patients
Motor Control 1 72–91

[24] Zariffa J, Steeves J and Pai D K 2012 Changes in hand
muscle synergies in subjects with spinal cord injury:
characterization and functional implications J. Spinal Cord
Med. 35 310–8

[25] Chang Y-H, Auyang A G, Scholz J P and Nichols T R 2009
Whole limb kinematics are preferentially conserved over
individual joint kinematics after peripheral nerve injury J.
Exp. Biol. 212 3511–21

[26] Büchler D, Calandra R, Schölkopf B and Peters J 2018
Control of musculoskeletal systems using learned dynamics
models IEEE Robot. Autom. Lett. 3 3161–8

[27] Driess D, Zimmermann H, Wolfen S, Suissa D, Haeufle D,
Hennes D, Toussaint M and Schmitt S 2018 Learning to
control redundant musculoskeletal systems with neural
networks and sqp: exploiting muscle properties 2018 IEEE
Int. Conf. on Robotics and Automation (ICRA) (Brisbane,
QLD, Australia, 21–25 May 2018) pp 6461–8

[28] Kawaharazuka K, Tsuzuki K, Onitsuka M, Asano Y,
Okada K, Kawasaki K and Inaba M 2020 Musculoskeletal
autoencoder: a unified online acquisition method of
intersensory networks for state estimation, control and
simulation of musculoskeletal humanoids IEEE Robot.
Autom. Lett. 5 2411–8

[29] Jäntsch M, Schmaler C, Wittmeier S, Dalamagkidis K and
Knoll A 2011 A scalable joint-space controller for
musculoskeletal robots with spherical joints 2011 IEEE Int.
Conf. on Robotics and Biomimetics (Karon Beach, Thailand,
7–11 December 2011) pp 2211–6

[30] Zhou J, Chen J, Deng H and Qiao H 2019 From rough to
precise: human-inspired phased target learning framework

19

https://orcid.org/0000-0001-8101-4217
https://orcid.org/0000-0001-8101-4217
https://orcid.org/0000-0001-8101-4217
https://orcid.org/0000-0001-9235-8779
https://orcid.org/0000-0001-9235-8779
https://orcid.org/0000-0001-9235-8779
https://orcid.org/0000-0001-6179-5706
https://orcid.org/0000-0001-6179-5706
https://orcid.org/0000-0001-6179-5706
https://orcid.org/0000-0001-9246-7190
https://orcid.org/0000-0001-9246-7190
https://orcid.org/0000-0001-9246-7190
https://doi.org/10.1123/mcj.14.3.294
https://doi.org/10.1123/mcj.14.3.294
https://doi.org/10.1126/science.1145803
https://doi.org/10.1126/science.1145803
https://doi.org/10.1007/978-3-319-93870-7_2
https://doi.org/10.1016/0166-2236(82)90221-1
https://doi.org/10.1016/0166-2236(82)90221-1
https://doi.org/10.1006/jtbi.1994.1150
https://doi.org/10.1006/jtbi.1994.1150
https://doi.org/10.1109/TCYB.2021.3071312
https://doi.org/10.1109/TCYB.2021.3071312
https://doi.org/10.1097/00003677-200201000-00007
https://doi.org/10.1097/00003677-200201000-00007
https://doi.org/10.1007/s00221-005-2218-9
https://doi.org/10.1007/s00221-005-2218-9
https://doi.org/10.3389/fbioe.2023.1128528
https://doi.org/10.3389/fbioe.2023.1128528
https://doi.org/10.1007/s00221-011-2831-8
https://doi.org/10.1007/s00221-011-2831-8
https://doi.org/10.1016/j.gaitpost.2006.04.006
https://doi.org/10.1016/j.gaitpost.2006.04.006
https://doi.org/10.1007/s00221-002-1186-6
https://doi.org/10.1007/s00221-002-1186-6
https://doi.org/10.1152/jappl.1993.74.1.170
https://doi.org/10.1152/jappl.1993.74.1.170
https://doi.org/10.1007/BF00231064
https://doi.org/10.1007/BF00231064
https://doi.org/10.1016/0304-3940(94)90850-8
https://doi.org/10.1016/0304-3940(94)90850-8
https://doi.org/10.1152/japplphysiol.00133.2018
https://doi.org/10.1152/japplphysiol.00133.2018
https://doi.org/10.1007/s00221-010-2405-1
https://doi.org/10.1007/s00221-010-2405-1
https://doi.org/10.1139/y91-102
https://doi.org/10.1139/y91-102
https://doi.org/10.1034/j.1600-0404.2003.00021.x
https://doi.org/10.1034/j.1600-0404.2003.00021.x
https://doi.org/10.1038/nrn.2017.26
https://doi.org/10.1038/nrn.2017.26
https://doi.org/10.1073/pnas.1212056109
https://doi.org/10.1073/pnas.1212056109
https://doi.org/10.1152/jn.00936.2009
https://doi.org/10.1152/jn.00936.2009
https://doi.org/10.1123/mcj.1.1.72
https://doi.org/10.1123/mcj.1.1.72
https://doi.org/10.1179/2045772312Y.0000000037
https://doi.org/10.1179/2045772312Y.0000000037
https://doi.org/10.1242/jeb.033886
https://doi.org/10.1242/jeb.033886
https://doi.org/10.1109/LRA.2018.2849601
https://doi.org/10.1109/LRA.2018.2849601
https://doi.org/10.1109/ICRA.2018.8463160
https://doi.org/10.1109/LRA.2020.2972841
https://doi.org/10.1109/LRA.2020.2972841
https://doi.org/10.1109/ROBIO.2011.6181620


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

for redundant musculoskeletal systems Front. Neurorobot.
13 61

[31] Carpenter M B 1968 The co-ordination and regulation of
movements J. Neuropathol. Exp. Neurol. 27 348–348

[32] Lashley K S 1951 The problem of serial order in behavior
vol 26 (Wiley) pp 112–46

[33] Hirashima M and Oya T 2016 How does the brain solve
muscle redundancy? Filling the gap between optimization
and muscle synergy hypotheses Neurosci. Res. 104 80–87

[34] Flash T, Meirovitch Y and Barliya A 2013 Models of human
movement: trajectory planning and inverse kinematics
studies Robot. Auton. Syst. 61 330–9

[35] Kawato M 1999 Internal models for motor control and
trajectory planning Curr. Opin. Neurobiol. 9 718–27

[36] Desmurget M and Grafton S 2000 Forward modeling
allows feedback control for fast reaching movements Trends
Cogn. Sci. 4 423–31

[37] Lacquaniti F, Terzuolo C and Viviani P 1983 The law
relating the kinematic and figural aspects of drawing
movements Acta Psychol. 54 115–30

[38] Karklinsky M and Flash T 2015 Timing of continuous
motor imagery: the two-thirds power law originates in
trajectory planning J. Neurophysiol. 113 2490–9

[39] Todorov E and Jordan M I 1998 Smoothness maximization
along a predefined path accurately predicts the speed
profiles of complex arm movements J. Neurophysiol.
80 696–714

[40] Uno Y, Kawato M and Suzuki R 1989 Formation and
control of optimal trajectory in human multijoint arm
movement Biol. Cybern. 61 89–101

[41] Harris C M and Wolpert D M 1998 Signal-dependent noise
determines motor planning Nature 394 780–4

[42] Flash T and Hogan N 1985 The coordination of arm
movements: an experimentally confirmed mathematical
model J. Neurosci. 5 1688–703

[43] Mohan V and Morasso P 2011 Passive motion paradigm:
an alternative to optimal control Front. Neurorobot. 5 4

[44] Flash T and Hochner B 2005 Motor primitives in
vertebrates and invertebrates Curr. Opin. Neurobiol.
15 660–6

[45] Sanzari M, Ntouskos V and Pirri F 2019 Discovery and
recognition of motion primitives in human activities PLoS
One 14 1–39

[46] d’Avella A, Saltiel P and Bizzi E 2003 Combinations of
muscle synergies in the construction of a natural motor
behavior Nat. Neurosci. 6 300–8

[47] Alnajjar F, Itkonen M, Berenz V, Tournier M, Nagai C and
Shimoda S 2015 Sensory synergy as environmental input
integration Front. Neurosci. 8 436

[48] Esmaeili S, Karami H, Baniasad M, Shojaeefard M and
Farahmand F 2022 The association between motor
modules and movement primitives of gait: a muscle and
kinematic synergy study J. Biomech. 134 110997

[49] Rohrer B, Fasoli S, Krebs H I, Hughes R, Volpe B,
Frontera W R, Stein J and Hogan N 2002 Movement
smoothness changes during stroke recovery J. Neurosci.
22 8297–304

[50] Taga G, Yamaguchi Y and Shimizu H 1991 Self-organized
control of bipedal locomotion by neural oscillators in
unpredictable environment Biol. Cybern. 65 147–59

[51] Mussa-Ivaldi F and Solla S 2004 Neural primitives for
motion control IEEE J. Ocean. Eng. 29 640–50

[52] Klarner T and Zehr E P 2018 Sherlock holmes and the
curious case of the human locomotor central pattern
generator J. Neurophysiol. 120 53–77

[53] Wang H, Nonaka T, Abdulali A and Iida F 2023
Coordinating upper limbs for octave playing on the piano
via neuro-musculoskeletal modeling Bioinspir. Biomim.
18 066009

[54] Leconte P, Orban de Xivry J-J, Stoquart G, Lejeune T and
Ronsse R 2016 Rhythmic arm movements are less affected
than discrete ones after a stroke Exp. Brain Res.
234 1403–17

[55] Hill A V 1938 The heat of shortening and the dynamic
constants of muscle Proc. R. Soc. B 126 136–95

[56] Heinen F, Lund M E, Rasmussen J and de Zee M 2016
Muscle-tendon unit scaling methods of hill-type
musculoskeletal models: an overview Proc. Inst. Mech. Eng.
H 230 976–84

[57] Chen J and Qiao H 2022 Motor-cortex-like recurrent
neural network and multitask learning for the control of
musculoskeletal systems IEEE Trans. Cogn. Dev. Syst.
14 424–36

[58] Zhao L, Li Q, Liu B and Cheng H 2019 Trajectory tracking
control of a one degree of freedom manipulator based on a
switched sliding mode controller with a novel extended
state observer framework IEEE Trans. Syst. Man Cybern.
Syst. 49 1110–8

[59] Kawamura M, Ookubo S, Asano Y, Kozuki T, Okada K and
Inaba M 2016 A joint-space controller based on redundant
muscle tension for multiple dof joints in musculoskeletal
humanoids 2016 IEEE-RAS 16th Int. Conf. on Humanoid
Robots (Humanoids) (Cancun, Mexico, 15–17 November
2016) pp 814–9

[60] Tahara K, Arimoto S, Sekimoto M and Luo Z-W 2009 On
control of reaching movements for musculo-skeletal
redundant arm model Appl. Bionics Biomech. 6 11–26

[61] Stanev D and Moustakas K 2018 Simulation of constrained
musculoskeletal systems in task space IEEE Trans. Biomed.
Eng. 65 307–18

[62] Masuda H, Hitzmann A, Hosoda K and Ikemoto S 2019
Common dimensional autoencoder for learning redundant
muscle-posture mappings of complex musculoskeletal
robots 2019 IEEE/RSJ Int. Conf. on Intelligent Robots and
Systems (IROS) (Macau, China, 3–8 November 2019)
pp 2545–50

[63] Tahara K, Kuboyama Y and Kurazume R 2012 Iterative
learning control for a musculoskeletal arm: utilizing
multiple space variables to improve the robustness 2012
IEEE/RSJ Int. Conf. on Intelligent Robots and Systems
(Vilamoura-Algarve, Portugal, 7–12 October 2012)
pp 4620–5

[64] Zhong S, Chen J, Niu X, Fu H and Qiao H 2020 Reducing
redundancy of musculoskeletal robot with convex hull
vertexes selection IEEE Trans. Cogn. Dev. Syst.
12 601–17

[65] Kawaharazuka K, Kawamura M, Makino S, Asano Y,
Okada K and Inaba M 2017 Antagonist inhibition control
in redundant tendon-driven structures based on human
reciprocal innervation for wide range limb motion of
musculoskeletal humanoids IEEE Robot. Autom. Lett.
2 2119–26

[66] Schumacher P, Geijtenbeek T, Caggiano V, Kumar V,
Schmitt S, Martius G and Häufle D 2023 Natural and
robust walking using reinforcement learning without
demonstrations in high-dimensional musculoskeletal
models (arXiv:2309.02976)

[67] Kidziński Łukasz, Mohanty S P, Ong C, Hicks J L,
Carroll S F, Levine S, Salathé M and Delp S L 2018 Learning
to run challenge: synthesizing physiologically accurate
motion using deep reinforcement learning (arXiv:1804.
00198)

[68] Liu R, Wang J, Chen Y, Liu Y, Wang Y and Gu J 2023
Proximal policy optimization with time-varying muscle
synergy for the control of an upper limb musculoskeletal
system IEEE Trans. Autom. Sci. Eng. 21 1–12

[69] Campbell J, Hitzmann A, Stepputtis S, Ikemoto S,
Hosoda K and Amor H B 2019 Learning interactive
behaviors for musculoskeletal robots using bayesian
interaction primitives (arXiv:1908.05552)

[70] Fu K C D, Libera F D and Ishiguro H 2015 Extracting
motor synergies from random movements for
low-dimensional task-space control of musculoskeletal
robots Bioinspir. Biomim. 10 056016

[71] Kalidindi H T, Thuruthel T G, Laschi C and Falotico E 2019
Cerebellum-inspired approach for adaptive kinematic

20

https://doi.org/10.3389/fnbot.2019.00061
https://doi.org/10.3389/fnbot.2019.00061
https://doi.org/10.1097/00005072-196804000-00011
https://doi.org/10.1097/00005072-196804000-00011
https://doi.org/10.1016/j.humov.2007.04.001
https://doi.org/10.1016/j.neures.2015.12.008
https://doi.org/10.1016/j.neures.2015.12.008
https://doi.org/10.1016/j.robot.2012.09.020
https://doi.org/10.1016/j.robot.2012.09.020
https://doi.org/10.1016/S0959-4388(99)00028-8
https://doi.org/10.1016/S0959-4388(99)00028-8
https://doi.org/10.1016/S1364-6613(00)01537-0
https://doi.org/10.1016/S1364-6613(00)01537-0
https://doi.org/10.1016/0001-6918(83)90027-6
https://doi.org/10.1016/0001-6918(83)90027-6
https://doi.org/10.1152/jn.00421.2014
https://doi.org/10.1152/jn.00421.2014
https://doi.org/10.1152/jn.1998.80.2.696
https://doi.org/10.1152/jn.1998.80.2.696
https://doi.org/10.1007/BF00204593
https://doi.org/10.1007/BF00204593
https://doi.org/10.1038/29528
https://doi.org/10.1038/29528
https://doi.org/10.1523/JNEUROSCI.05-07-01688.1985
https://doi.org/10.1523/JNEUROSCI.05-07-01688.1985
https://doi.org/10.3389/fnbot.2011.00004
https://doi.org/10.3389/fnbot.2011.00004
https://doi.org/10.1016/j.conb.2005.10.011
https://doi.org/10.1016/j.conb.2005.10.011
https://doi.org/10.1371/journal.pone.0214499
https://doi.org/10.1371/journal.pone.0214499
https://doi.org/10.1038/nn1010
https://doi.org/10.1038/nn1010
https://doi.org/10.3389/fnins.2014.00436
https://doi.org/10.3389/fnins.2014.00436
https://doi.org/10.1016/j.jbiomech.2022.110997
https://doi.org/10.1016/j.jbiomech.2022.110997
https://doi.org/10.1523/JNEUROSCI.22-18-08297.2002
https://doi.org/10.1523/JNEUROSCI.22-18-08297.2002
https://doi.org/10.1007/BF00198086
https://doi.org/10.1007/BF00198086
https://doi.org/10.1109/JOE.2004.833102
https://doi.org/10.1109/JOE.2004.833102
https://doi.org/10.1152/jn.00554.2017
https://doi.org/10.1152/jn.00554.2017
https://doi.org/10.1088/1748-3190/acfa51
https://doi.org/10.1088/1748-3190/acfa51
https://doi.org/10.1007/s00221-015-4543-y
https://doi.org/10.1007/s00221-015-4543-y
https://doi.org/10.1098/rspb.1938.0050
https://doi.org/10.1098/rspb.1938.0050
https://doi.org/10.1177/0954411916659894
https://doi.org/10.1177/0954411916659894
https://doi.org/10.1109/TCDS.2020.3045574
https://doi.org/10.1109/TCDS.2020.3045574
https://doi.org/10.1109/TSMC.2017.2719057
https://doi.org/10.1109/TSMC.2017.2719057
https://doi.org/10.1109/HUMANOIDS.2016.7803367
https://doi.org/10.1155/2009/498363
https://doi.org/10.1155/2009/498363
https://doi.org/10.1109/TBME.2017.2764630
https://doi.org/10.1109/TBME.2017.2764630
https://doi.org/10.1109/IROS40897.2019.8968605
https://doi.org/10.1109/IROS.2012.6385628
https://doi.org/10.1109/TCDS.2019.2953642
https://doi.org/10.1109/TCDS.2019.2953642
https://doi.org/10.1109/LRA.2017.2720854
https://doi.org/10.1109/LRA.2017.2720854
https://doi.org/10.48550/arXiv.2309.02976
https://arxiv.org/abs/1804.00198
https://arxiv.org/abs/1804.00198
https://doi.org/10.1109/TASE.2023.3254583
https://doi.org/10.1109/TASE.2023.3254583
https://arxiv.org/abs/1908.05552
https://doi.org/10.1088/1748-3190/10/5/056016
https://doi.org/10.1088/1748-3190/10/5/056016


Bioinspir. Biomim. 19 (2024) 046015 E Almanzor et al

control of soft robots 2019 2nd IEEE Int. Conf. on Soft
Robotics (RoboSoft) pp 684–9

[72] Marques H G, Bharadwaj A and Iida F 2014 From
spontaneous motor activity to coordinated behaviour: a
developmental model PLoS Comput. Biol. 10 e1003653

[73] Caligiore D, Ferrauto T, Parisi D, Accornero N, Capozza M
and Baldassarre G 2008 Using motor babbling and hebb
rules for modeling the development of reaching with
obstacles and grasping Int. Conf. on Cognitive Systems 2008
(Karlsruhe, Germany, 2–4 April 2008) (available at: https://
api.semanticscholar.org/CorpusID:12480235)

[74] Armanini C, Boyer F, Mathew A T, Duriez C and Renda F
2023 Soft robots modeling: a structured overview IEEE
Trans. Robot. 39 1728–48

[75] Kino H, Kikuchi S, Yahiro T and Tahara K 2009 Basic study
of biarticular muscle’s effect on muscular internal force
control based on physiological hypotheses 2009 IEEE Int.
Conf. on Robotics and Automation (Kobe, Japan, 12–17 May
2009) pp 4195–200

[76] Jacobsen S, Ko H, Iversen E and Davis C 1989 Antagonistic
control of a tendon driven manipulator Proc., 1989 Int.
Conf. on Robotics and Automation (Scottsdale, AZ, USA,
14–19 May 1989) vol 3 pp 1334–9

[77] Potkonjak V, Svetozarevic B, Jovanovic K and Holland O
2010 Control of compliant anthropomimetic robot joint
AIP Conf. Proc. 1281 1271–4

[78] Salaün C, Padois V and Sigaud O 2009 Control of
redundant robots using learned models: an operational
space control approach 2009 IEEE/RSJ Int. Conf. on
Intelligent Robots and Systems (St. Louis, MO, USA, 10–15
October 2009) pp 878–85

[79] D’Souza A, Vijayakumar S and Schaal S 2001 Learning
inverse kinematics Proc. 2001 IEEE/RSJ Int. Conf. on
Intelligent Robots and Systems. Expanding the Societal Role
of Robotics in the the Next Millennium (Cat.
No.01CH37180) (Maui, HI, USA, 29 October–3 November
2001) vol 1 pp 298–303

[80] Sugiyama T, Kutsuzawa K, Owaki D and Hayashibe M 2021
Individual deformability compensation of soft hydraulic
actuators through iterative learning-based neural network
Bioinspir. Biomim. 16 056016

[81] Sugiyama T, Kutsuzawa K, Owaki D and Hayashibe M 2024
Latent representation-based learning controller for
pneumatic and hydraulic dual actuation of pressure-driven
soft actuators Soft Robot. 11 105–17

[82] Ookubo S, Asano Y, Kozuki T, Shirai T, Okada K and
Inaba M 2015 Learning nonlinear muscle-joint state
mapping toward geometric model-free tendon driven
musculoskeletal robots 2015 IEEE-RAS 15th Int. Conf. on
Humanoid Robots (Humanoids) (Seoul, Korea (South), 3–5
November 2015) pp 765–70

[83] Dong H, Figueroa N and El Saddik A 2013 Muscle force
control of a kinematically redundant bionic arm with
real-time parameter update 2013 IEEE Int. Conf. on
Systems, Man and Cybernetics (Manchester, UK, 13–16
October 2013) pp 1640–7

[84] Hadi Balaghi M E, Vatankhah R, Broushaki M and Alasty A
2016 Adaptive optimal multi-critic based neuro-fuzzy
control of mimo human musculoskeletal arm model
Neurocomputing 173 1529–37

[85] Akimov D 2020 Distributed soft actor-critic with
multivariate reward representation and knowledge
distillation (arXiv:1911.13056)

[86] Piazzi A and Visioli A 2000 Global minimum-jerk
trajectory planning of robot manipulators IEEE Trans. Ind.
Electron. 47 140–9

[87] Lin H-I 2014 A fast and unified method to find a
minimum-jerk robot joint trajectory using particle swarm
optimization J. Intell. Robot. Syst. 75 379–92

[88] Sharif Razavian R, Ghannadi B and McPhee J 2019 A
synergy-based motor control framework for the fast
feedback control of musculoskeletal systems J. Biomech.
Eng. 141 031009

[89] Chen J and Qiao H 2021 Muscle-synergies-based
neuromuscular control for motion learning and
generalization of a musculoskeletal system IEEE Trans. Syst.
Man Cybern. Syst. 51 3993–4006

[90] Huang X, WuW, Qiao H and Ji Y 2018 Brain-inspired
motion learning in recurrent neural network with emotion
modulation IEEE Trans. Cogn. Dev. Syst. 10 1153–64

[91] Ijspeert A J, Crespi A, Ryczko D and Cabelguen J-M 2007
From swimming to walking with a salamander robot
driven by a spinal cord model Science 315 1416–20

[92] Wang T, Guo W, Li M, Zha F and Sun L 2012 CPG control
for biped hopping robot in unpredictable environment J.
Bionic Eng. 9 29–38

[93] Yu H, Gao H, Ding L, Li M, Deng Z and Liu G 2016 Gait
generation with smooth transition using CPG-based
locomotion control for hexapod walking robot IEEE Trans.
Ind. Electron. 63 5488–500

[94] Geyer H and Herr H 2010 A muscle-reflex model that
encodes principles of legged mechanics produces human
walking dynamics and muscle activities IEEE Trans. Neural.
Syst. Rehabil. Eng. 18 263–73

[95] Haeufle D, Günther M, Bayer A and Schmitt S 2014
Hill-type muscle model with serial damping and eccentric
force-velocity relation J. Biomech. 47 1531–6

[96] Meyer A J, Patten C and Fregly B J 2017 Lower extremity
EMG-driven modeling of walking with automated
adjustment of musculoskeletal geometry PLoS One
12 1–24

[97] Burdet E, Franklin D W and Milner T E 2013 Human
Robotics: Neuromechanics and Motor Control (The MIT
Press) (https://doi.org/10.7551/mitpress/9007.001.0001)

[98] George Thuruthel T, Falotico E, Manti M, Pratesi A,
Cianchetti M and Laschi C 2017 Learning closed loop
kinematic controllers for continuum manipulators in
unstructured environments Soft Robot. 4 285–96

[99] Almanzor E, Ye F, Shi J, Thuruthel T G, Wurdemann H A
and Iida F 2023 Static shape control of soft continuum
robots using deep visual inverse kinematic models IEEE
Trans. Robot. 39 2973–88

[100] Thuruthel T G, Falotico E, Cianchetti M, Renda F and
Laschi C 2016 Learning global inverse statics solution for a
redundant soft robot Proc. 13th Int. Conf. on Informatics in
Control, Automation and Robotics (Lisbon, Portugal, 29–31
July 2026) vol 2 pp 303–10 (available at: https://api.
semanticscholar.org/CorpusID:39454151)

[101] Levin M F 2016 Principles of Motor Recovery After
Neurological Injury Based on a Motor Control Theory
(Springer) pp 121–40

[102] Gribble P L, Mullin L I, Cothros N and Mattar A 2003 Role
of cocontraction in arm movement accuracy J.
Neurophysiol. 89 2396–405

[103] Rolf M, Steil J and Gienger M 2010 Goal babbling permits
direct learning of inverse kinematics IEEE Trans. Auton.
Mental Dev. 2 216–29

[104] Yu Y, Si X, Hu C and Zhang J 2019 A review of recurrent
neural networks: LSTM cells and network architectures
Neural Comput. 31 1235–70

[105] Trinh M, Behery M, Emara M, Lakemeyer G, Storms S and
Brecher C 2022 Dynamics modeling of industrial robots
using transformer networks 2022 6th IEEE Int. Conf. on
Robotic Computing (IRC) (Italy, 5–7 December 2022)
pp 164–71

[106] Reher J and Ames A D 2021 Dynamic walking: toward agile
and efficient bipedal robots Ann. Rev. Control Robot. Auton.
Syst. 4 535–72

21

https://doi.org/10.1371/journal.pcbi.1003653
https://doi.org/10.1371/journal.pcbi.1003653
https://api.semanticscholar.org/CorpusID:12480235
https://api.semanticscholar.org/CorpusID:12480235
https://doi.org/10.1109/TRO.2022.3231360
https://doi.org/10.1109/TRO.2022.3231360
https://doi.org/10.1109/ROBOT.2009.5152196
https://doi.org/10.1109/ROBOT.1989.100165
https://doi.org/10.1063/1.3497932
https://doi.org/10.1063/1.3497932
https://doi.org/10.1109/IROS.2009.5354438
https://doi.org/10.1109/IROS.2001.973374
https://doi.org/10.1088/1748-3190/ac1b6f
https://doi.org/10.1088/1748-3190/ac1b6f
https://doi.org/10.1089/soro.2022.0224
https://doi.org/10.1089/soro.2022.0224
https://doi.org/10.1109/HUMANOIDS.2015.7363456
https://doi.org/10.1109/SMC.2013.283
https://doi.org/10.1016/j.neucom.2015.09.026
https://doi.org/10.1016/j.neucom.2015.09.026
https://arxiv.org/abs/1911.13056
https://doi.org/10.1109/41.824136
https://doi.org/10.1109/41.824136
https://doi.org/10.1007/s10846-013-9982-8
https://doi.org/10.1007/s10846-013-9982-8
https://doi.org/10.1115/1.4042185
https://doi.org/10.1115/1.4042185
https://doi.org/10.1109/TSMC.2020.2966818
https://doi.org/10.1109/TSMC.2020.2966818
https://doi.org/10.1109/TCDS.2018.2843563
https://doi.org/10.1109/TCDS.2018.2843563
https://doi.org/10.1126/science.1138353
https://doi.org/10.1126/science.1138353
https://doi.org/10.1016/S1672-6529(11)60094-2
https://doi.org/10.1016/S1672-6529(11)60094-2
https://doi.org/10.1109/TIE.2016.2569489
https://doi.org/10.1109/TIE.2016.2569489
https://doi.org/10.1109/TNSRE.2010.2047592
https://doi.org/10.1109/TNSRE.2010.2047592
https://doi.org/10.1016/j.jbiomech.2014.02.009
https://doi.org/10.1016/j.jbiomech.2014.02.009
https://doi.org/10.1371/journal.pone.0179698
https://doi.org/10.1371/journal.pone.0179698
https://doi.org/10.7551/mitpress/9007.001.0001
https://doi.org/10.1089/soro.2016.0051
https://doi.org/10.1089/soro.2016.0051
https://doi.org/10.1109/TRO.2023.3275375
https://doi.org/10.1109/TRO.2023.3275375
https://api.semanticscholar.org/CorpusID:39454151
https://api.semanticscholar.org/CorpusID:39454151
https://doi.org/10.1007/978-3-319-47313-0_7
https://doi.org/10.1152/jn.01020.2002
https://doi.org/10.1152/jn.01020.2002
https://doi.org/10.1109/TAMD.2010.2062511
https://doi.org/10.1109/TAMD.2010.2062511
https://doi.org/10.1162/neco_a_01199
https://doi.org/10.1162/neco_a_01199
https://doi.org/10.1109/IRC55401.2022.00035
https://doi.org/10.1146/annurev-control-071020-045021
https://doi.org/10.1146/annurev-control-071020-045021

	Utilising redundancy in musculoskeletal systems for adaptive stiffness and muscle failure compensation: a model-free inverse statics approach
	1. Introduction
	2. Related works
	2.1. Model-based control approaches
	2.2. Model-free control approaches
	2.2.1. Supervised learning
	2.2.2. Reinforcement learning

	2.3. Biologically-inspired control approaches

	3. Methods
	3.1. Musculoskeletal leg model
	3.2. Hill-type muscle model
	3.3. Inverse static mapping between activations and end-effector positions

	4. Experimental setup and results
	4.1. Model-free controller training and implementation
	4.2. Hip and knee control experiments
	4.3. Six-muscle leg control experiments
	4.3.1. Randomised target postures, randomised initial postures and trajectory tracking
	4.3.2. Performance under payload changes
	4.3.3. Performance under muscle failure
	4.3.4. Performance under external physical interaction


	5. Discussion and conclusions
	5.1. Future work

	Appendix 
	References