
Biol Cybern (2005) DOI 10.1007/s00422-005-0025-9

ORIGINAL PAPER

E. Burdet · K. P. Tee · I. Mareels · T. E. Milner · C. M. Chew · D. W. Franklin · R. Osu · M. Kawato

Stability and motor adaptation in human arm movements

Received: 5 October 2004 / Accepted: 9 September 2005 © Springer-Verlag 2005

Abstract In control, stability captures the reproducibility of motions and the robustness to environmental and internal perturbations. This paper examines how stability can be evaluated in human movements, and possible mechanisms by which humans ensure stability. First, a measure of stability is introduced, which is simple to apply to human movements and corresponds to Lyapunov exponents. Its application to real data shows that it is able to distinguish effectively between stable and unstable dynamics. A computational model is then used to investigate stability in human arm movements, which takes into account motor output variability and computes the force to perform a task according to an inverse dynamics model. Simulation results suggest that even a large time delay does not affect movement stability as long as the reflex feedback is small relative to muscle elasticity. Simulations are also used to demonstrate that existing learning schemes, using a monotonic antisymmetric update law, cannot compensate for unstable dynamics. An impedance compensation algorithm is introduced to learn unstable dynamics, which produces similar adaptation responses to those found in experiments.

E. Burdet · K. P. Tee · C. M. Chew
Department of Mechanical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Singapore

E. Burdet (B)
Department of Bioengineering, Imperial College London, London, UK
E-mail: [email protected]
http://www.bg.ic.ac.uk/staff/burdet/

I. Mareels
Department of Electrical and Electronic Engineering, The University of Melbourne, Melbourne, Australia

T. E. Milner · D. W. Franklin
School of Kinesiology, Simon Fraser University, Burnaby, British Columbia, Canada

D. W. Franklin · R. Osu · M. Kawato
ATR Computational Neuroscience Laboratories, 2-2-2 Hikaridai “Keihanna Science city” Kyoto 619-0288, Japan

Keywords Nonautonomous dynamic system · Reflex feedback · Motor learning · Iterative learning · Nonlinear adaptive control · Impedance control

1 Introduction

1.1 Motivation and goal

Stability is critical for successful human movements. In an unstable system, slight variation in the initial conditions, noise in the control signal or external perturbations can lead to unpredictable, irreproducible and inconsistent execution of motor tasks. In contrast, in a stable system, the same or similar motor commands will lead to similar movements despite small disturbances. This means that the movement outcome can be predicted and the movement can be planned. It also becomes possible to form a library of motor commands corresponding to various tasks. Stability is particularly important for human motion because of the large variability in consecutive performances of the same action, i.e. motor output variability, and because we perform most actions in interaction with the environment which may add variability.

Stability has been rigorously defined in dynamical systems theory. Lyapunov stability means that after a small perturbation a response or movement trajectory will remain close to the undisturbed response/trajectory; asymptotic stability means that it will in addition converge to the undisturbed response/trajectory; exponential stability means that it will converge at an exponential rate (Vidyasagar 1993). However these mathematical definitions cannot be applied directly to the control of human motions. First, they require infinite time, whereas human motor control, as with any physical system, produces movements of finite duration. Even when adapted to finite time movements, the above stability concepts have limited applicability to the control of human motion, for reasons which will be explained below.

This paper will examine how stability can be quantitatively established in an experimental evaluation of human movements, and will characterize possible mechanisms by

E. Burdet et al.

which humans ensure stability. Towards the end of the movement, motion stability depends on corrective movements. This paper focuses on stability provided by muscle properties and stretch reflexes and, therefore, will analyze stability over the entire movement rather than only at the end.

1.2 Stable and unstable tasks

Interaction-free arm movements are generally stable, in the sense that when the hand is slightly perturbed during the movement it tends to return to the undisturbed trajectory (Milner 1993; Won and Hogan 1995; Gomi and Kawato 1997), corresponding to asymptotic or exponential stability. This ‘stability’ stems mainly from muscle elasticity and the stretch reflex, which produce a restoring force towards the undisturbed trajectory. However, the stabilization provided by reflexes is limited by a time delay of at least 60 ms, which means that in some cases reflexes can create instability (Jacks et al. 1988).

Moreover, to manipulate objects or use tools we have to interact with the environment and compensate for forces arising from it. Ultimately, it is the interaction between our limbs and the environment that determines whether or not a movement will be stable (Colgate and Hogan 1988). While tasks such as opening a door involve a stable interaction with the environment, many common tasks, in particular, tasks involving tools, are intrinsically unstable (Rancourt and Hogan 2001). Drilling, carving and keeping a screwdriver in the slot of a screw are just a few examples of unstable tasks. Unstable tasks are more difficult to control than stable tasks, as neuromotor noise (Slifkin and Newell 1999; Osu et al. 2004) or material irregularities can cause the tool to slip unexpectedly to one side or the other.

1.3 Common methods to infer stability are impractical

In principle, stability could be inferred similarly to the mathematical definition, from observing movement trajectories and response to perturbations. However, in human movements the trajectory is different from trial to trial so an undisturbed trajectory cannot be observed directly; in turn it is not possible to infer stability by comparing pairs of disturbed versus undisturbed trajectories. Furthermore, human motor control is a non-autonomous dynamic system, in which the command could vary from trial to trial, so we do not even know which dynamical system should be used.

In motor task execution, stability depends on the endpoint impedance that results from the spring-like property of muscles and (stretch) reflexes. Endpoint stiffness can be evaluated at static positions by measuring the restoring force to perturbations of the hand position (Mussa-Ivaldi et al. 1985). We have recently extended this method to estimate impedance during movement (Gomi and Kawato 1997; Burdet et al. 2000), although it requires at least 40 repetitions of the same movement to measure stiffness at one point during movement.

1.4 Motor learning mechanisms

Humans constantly adapt their movements to changes of internal and external conditions. Learning in novel environments, where environment interaction with the arm is stable, has been investigated extensively (Shadmehr and Mussa-Ivaldi 1994; Lackner and Dizio 1994; Shadmehr and Holcomb 1997; Krakauer et al. 1999; Kawato 1999). The experimental data show that subjects learn a feedforward compensation force necessary to overcome external dynamics. We have recently examined learning in a divergent force field that produced an unstable interaction with the arm (Burdet et al. 2001), and shown that humans improve task performance and overcome instability by increasing the mechanical impedance of the arm selectively in the unstable direction. Altogether, these results suggest that humans form appropriate internal models to compensate for force and instability arising from the interaction with the environment.

Several algorithms have been proposed to model motor learning (Albus 1971; Kawato et al. 1987; Katayama and Kawato 1993; Bhushan and Shadmehr 1999; Sanner and Kosha 1999). These learning schemes have been shown to work in stable tasks. However, no unstable task has been investigated so far.

1.5 Contributions and outline

The first contribution of this paper is to clarify the notion of stability in human motions. The notion of stability is illustrated using a computer model of arm movement control which computes the force to perform a task according to an inverse dynamics model, and comparing simulations with movements recorded during experiments (Burdet et al. 2001; Osu et al. 2003). This model enables us to study stability despite motor output variability and adaptation, which normally mask the typical criteria needed to demonstrate Lyapunov stability in real human movements. It can also be used to examine how stability depends on the time delay of reflexes.

The second contribution is to provide a quantitative measure that can be computed from measured trajectories. The problem caused by motion variability is overcome by considering an ensemble of trajectories, and characterizing a probability distribution over the ensemble similar to Franklin et al. (2003b). This provides a stability measure roughly equivalent to the Lyapunov exponent.

The third contribution is to investigate the learning mechanisms for unstable interactions; in particular, to examine whether previous algorithms represent a plausible strategy in unstable situations. Our results suggest that impedance compensation is essential to perform stable movements in unstable interactions and we introduce a learning algorithm producing an adaptation response similar to that observed in human movements.

The remainder of the paper is organized as follows. Section 2 analyzes the stability of movements performed in interaction with the environment. The computational model of the control of arm movements and its implementation are


presented in Sects. 2.1 and 2.2, respectively. This model is used in Sect. 2.3 to define the meaning of stability in human arm movements. In Sect. 2.4, a measure is introduced and used to infer stability using recorded data. The influence of time delay on stability is examined in Sect. 2.5. Thereafter, Sect. 3 examines related adaptation mechanisms. Sect. 3.1 demonstrates that existing learning schemes cannot compensate for unstable dynamics, and Sect. 3.2 describes necessary impedance compensation mechanisms.

2 Motion stability

2.1 Computational model of arm movements

The (joint space) model introduced in this section will be used to illustrate various stability concepts and measures, and to elucidate the role of learning, feedforward and feedback control in arm movements. In the following description scalars $s$ are italic, vectors $\mathbf{v}$ are bold and matrices $\mathbf{M}$ are bold capitals.

Let $\boldsymbol{\tau}_m(t)$ represent the $k$-dimensional vector of torques produced by muscles on the $k$ joints of a limb, and $\mathbf{q}(t)$ the resulting joint angle trajectories. We assume that muscle torque/force is produced according to an inverse dynamics model (IDM) of the task (Shadmehr and Mussa-Ivaldi 1994). This corresponds to the experience gained in several movements and constitutes a plan of action $\boldsymbol{\tau}_{IDM}$, which consists of the forces/torques to move the arm as well as learned forces/torques required to overcome environmental dynamics for the particular task.

When one repeats an action several times, the trajectory is never exactly the same: movements with the same action plan will have some variation (Slifkin and Newell 1999; Osu et al. 2004). We assume that the muscle force is subject to motor output variability, resulting in

$$\boldsymbol{\tau}_m(t) = \boldsymbol{\tau}_{IDM}(t) + \boldsymbol{\tau}(t), \qquad 0 \le t \le T, \tag{1}$$

where $\boldsymbol{\tau}(t)$ is a random variable. The torques are modeled as continuous functions of time, and $T$ is the time horizon over which the task is completed.

$\boldsymbol{\tau}_{IDM}(t)$ and $\boldsymbol{\tau}(t)$ could be generated using some model. However, to compare simulation results with available data recorded in experiments performed by human subjects, we prefer to identify $\boldsymbol{\tau}_{IDM}(t)$ and $\boldsymbol{\tau}(t)$ from these data. $\boldsymbol{\tau}_{IDM}$ is identified as the empirical mean of $N$ observations of muscle torque $\{\boldsymbol{\tau}_m^{(1)}, \ldots, \boldsymbol{\tau}_m^{(N)}\}$ required to execute the task under free movement conditions, i.e.,

$$\boldsymbol{\tau}_{IDM} = \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\tau}_m^{(i)}. \tag{2}$$

Noise waveforms $\boldsymbol{\tau}(t)$ are randomly selected from the set

$$\{\boldsymbol{\tau}^{(1)}, \ldots, \boldsymbol{\tau}^{(N)}\} = \{\boldsymbol{\tau}_m^{(1)} - \boldsymbol{\tau}_{IDM}, \ldots, \boldsymbol{\tau}_m^{(N)} - \boldsymbol{\tau}_{IDM}\}$$

to generate a movement in various environment dynamics. We make the simplifying assumption that torque variability will not change when the external conditions of the task are varied from the conditions prevailing under the free movement trials.

The CNS is a learning system which continuously adapts to novel dynamics resulting from the interaction with the environment. When the environment is not changing, the CNS adapts the control and produces movements with similar trajectories in consecutive trials. Let $\mathbf{q}^*(t)$ be the trajectory in learned dynamics, corresponding to the applied torque $\boldsymbol{\tau}_m$:

$$\mathbf{f}(\mathbf{q}^*(t), \dot{\mathbf{q}}^*(t), \ddot{\mathbf{q}}^*(t)) = \boldsymbol{\tau}_m(t) = \boldsymbol{\tau}_{IDM}(t) + \boldsymbol{\tau}(t), \qquad 0 \le t \le T, \tag{3}$$

where $\mathbf{f}(\mathbf{q}(t), \dot{\mathbf{q}}(t), \ddot{\mathbf{q}}(t))$ represents the torque necessary to move the limb and depends on the joint position $\mathbf{q}$, velocity $\dot{\mathbf{q}}(t)$ and acceleration $\ddot{\mathbf{q}}(t)$. The responses $\mathbf{q}^*(t)$, $\dot{\mathbf{q}}^*(t)$, and $\ddot{\mathbf{q}}^*(t)$ are random variables.

Changes in the environment dynamics $\boldsymbol{\tau}_E(\mathbf{q}(t), \dot{\mathbf{q}}(t), \ddot{\mathbf{q}}(t))$ modify the trajectory from $\mathbf{q}^*(t)$ to $\mathbf{q}(t)$ and in turn cause restoring forces $\mathbf{r}$ (Milner and Cloutier 1993; Won and Hogan 1995):

$$\boldsymbol{\tau}_E + \mathbf{f}(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}) = \boldsymbol{\tau}_m = \boldsymbol{\tau}_{IDM} + \boldsymbol{\tau} + \mathbf{r}(\mathbf{q}, \dot{\mathbf{q}}, \mathbf{q}^*, \dot{\mathbf{q}}^*). \tag{4}$$

The restoring force $\mathbf{r}$ is produced by muscle elasticity $\mathbf{r}_e$ as well as reflex forces $\mathbf{r}_r$:

$$\mathbf{r} = \mathbf{r}_e + \mathbf{r}_r. \tag{5}$$

For simplicity, we assume that the torque produced by both reflex forces and muscle elasticity can be modeled as linear functions $\mathbf{r} = \mathbf{r}(\mathbf{e}, \dot{\mathbf{e}})$ of the trajectory deviation $\mathbf{e} = \mathbf{q} - \mathbf{q}^*$ and its derivative $\dot{\mathbf{e}} = \dot{\mathbf{q}} - \dot{\mathbf{q}}^*$. Muscle elasticity is modeled as

$$\mathbf{r}_e = \mathbf{K} (\mathbf{e} + \kappa_d \dot{\mathbf{e}}), \tag{6}$$

where $\mathbf{K}$ is the intrinsic joint stiffness matrix, which increases with torque, i.e., with muscle activation (Tee et al. 2004). Reflexes are modeled as

$$\mathbf{r}_r(t) = \mathbf{G} \left[ \mathbf{e}(t - \phi) + g_d \dot{\mathbf{e}}(t - \phi) \right], \tag{7}$$

where $\mathbf{G}$ is the reflex gain matrix and $\phi$ the time delay.

Equations (1–7) can be used to simulate movements of the (possibly redundant) limbs under various scenarios for interaction forces and motor variability. This model for arm movements interacting with the environment, sketched in Fig. 1a, extends the model of Shadmehr and Mussa-Ivaldi (1994) in three respects:

– it considers motor noise inherent to motion generation.
– it incorporates a more realistic impedance model, that includes both a dependence on torque (due to muscle activation) and the time delay inherent in reflexes.
– the inverse dynamics model depends on the planned rather than on the executed trajectory. This seems to be more compatible with the nature of a feedforward motor command and the significant time delay in the sensory pathways.

Comparison of the simulated trajectories (Fig. 2) with real trajectories observed under equivalent conditions (Osu et al. 2003), as well as comparison of simulated and measured endpoint impedance (Tee et al. 2004), show that this simple


Fig. 1 Simulation of human movements to investigate motor control and learning. a The scheme of neural control and feedback error learning in novel dynamics. b Movement task involves reaching of hand towards target point while an external force is exerted on the hand through a robotic interface

model predicts the responses of the human arm interacting with novel environmental dynamics well.

2.2 Model’s implementation

The simulations were compared with available data on human arm movements and adaptation to stable and unstable dynamics from Burdet et al. (2001), Franklin et al. (2003a, b) and Osu et al. (2003). The task considered is to move the arm ahead of the body from (0, 31) cm to (0, 56) cm, as indicated in Fig. 1b, in approximately 600 ms. The planned torques and motor output variability were identified from $N = 50$ trials in free conditions.

The horizontal arm movements at shoulder height use a two-link mechanical structure with parameters defined in Table 1. For the horizontal motions of interest, gravity can be neglected, and the task dynamics are modeled by

$$\mathbf{f}(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}) := \mathbf{H}(\mathbf{q}) \ddot{\mathbf{q}} + \mathbf{C}(\mathbf{q}, \dot{\mathbf{q}}) \dot{\mathbf{q}} + \boldsymbol{\tau}_{PFM}, \tag{8}$$

where

$$\mathbf{H}(\mathbf{q}) = \begin{bmatrix} J_1 + J_2 + M_1 l_{m1}^2 + M_2 (l_1^2 + l_{m2}^2 + 2 l_1 l_{m2} \cos q_2) & J_2 + M_2 (l_{m2}^2 + l_1 l_{m2} \cos q_2) \\ J_2 + M_2 (l_{m2}^2 + l_1 l_{m2} \cos q_2) & J_2 + M_2 l_{m2}^2 \end{bmatrix} \tag{9}$$

is the inertia matrix and

$$\mathbf{C}(\mathbf{q}, \dot{\mathbf{q}}) \dot{\mathbf{q}} = \begin{bmatrix} M_2 l_1 l_{m2} \, \dot{q}_2 (2 \dot{q}_1 + \dot{q}_2) \sin(q_2) \\ M_2 l_1 l_{m2} \, \dot{q}_1^2 \sin(q_2) \end{bmatrix} \tag{10}$$

is the term corresponding to Coriolis and centrifugal forces. $q_1$ and $q_2$ denote the shoulder joint angle and elbow joint angle, respectively.

The experiments reported in Burdet et al. (2001), Franklin et al. (2003a, b), and Osu et al. (2003) used the PFM robotic interface to produce force fields on the hand during movement. The PFM is not completely ‘transparent’ to the user, thus the corresponding dynamics $\boldsymbol{\tau}_{PFM}$ have to be taken into account in the simulation. $\boldsymbol{\tau}_{PFM}$ was identified from test trajectories with a large dynamic variation (Slotine 1991) and modeled as

$$\boldsymbol{\tau}_{PFM} = \mathbf{J}(\mathbf{q})^T \left( \mathbf{M}_E \ddot{\mathbf{x}} + \mathbf{D}_d \dot{\mathbf{x}} + \tanh(200 \, \mathbf{D}_s \dot{\mathbf{x}}) \right) \tag{11}$$

where

$$\mathbf{M}_E = \begin{bmatrix} 1.516 & 0 \\ 0 & 1.404 \end{bmatrix} \text{Ns}^2/\text{m}, \quad \mathbf{D}_d = \begin{bmatrix} 10.247 & 0 \\ 0 & 7.592 \end{bmatrix} \text{Ns/m}, \quad \mathbf{D}_s = \begin{bmatrix} 0.102 & 0 \\ 0 & 0.356 \end{bmatrix} \text{Ns/m},$$

$\ddot{\mathbf{x}}$ and $\dot{\mathbf{x}}$ represent Cartesian acceleration and velocity, respectively. The Jacobian matrix transforming endpoint force into joint torque (De Wit et al. 1996) is given by

[Figure 2: panels a and b plot simulated hand paths, y [m] (0.31 to 0.55) against x [m], for the VF, NF and DF conditions in panel a and for VF and DF in panel b; an inset in panel b shows the perturbation as force in x-direction [N] against time.]

Fig. 2 Motion stability. a Simulated hand trajectories in null field (NF) and in two force fields, the velocity dependent field (VF) and divergent field (DF), without learning. b The VF interaction is asymptotically stable, as after a perturbation the trajectory remains close to the undisturbed trajectory and eventually converges to it. On the other hand, the interaction with the DF is unstable, as shown by the diverging trajectory after a small perturbation. The perturbation is a 3 N force pulse in the positive or negative x-direction

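$$\mathbf{J}(\mathbf{q}) = \left[ \frac{\partial x_i}{\partial q_j} \right] = \begin{bmatrix} -l_1 \sin q_1 - l_2 \sin(q_1 + q_2) & -l_2 \sin(q_1 + q_2) \\ l_1 \cos q_1 + l_2 \cos(q_1 + q_2) & l_2 \cos(q_1 + q_2) \end{bmatrix}. \tag{12}$$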
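As an illustration of Eq. 12, the following minimal sketch evaluates the Jacobian for the segment lengths of Table 1 and uses its transpose to map an endpoint force to joint torques. The forward kinematics function, the posture `q` and the 3 N test force are assumptions introduced here for the check; they are not taken from the paper's simulation code.

```python
import numpy as np

# Segment lengths from Table 1 (upper arm, forearm), in metres.
L1, L2 = 0.31, 0.34

def hand_position(q):
    """Assumed planar forward kinematics: (shoulder, elbow) angles -> hand (x, y)."""
    q1, q2 = q
    return np.array([
        L1 * np.cos(q1) + L2 * np.cos(q1 + q2),
        L1 * np.sin(q1) + L2 * np.sin(q1 + q2),
    ])

def jacobian(q):
    """Jacobian of Eq. 12, J[i, j] = d x_i / d q_j."""
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([
        [-L1 * s1 - L2 * s12, -L2 * s12],
        [L1 * c1 + L2 * c12, L2 * c12],
    ])

def joint_torque(q, force):
    """Map an endpoint force (N) to joint torques (N m) through J^T."""
    return jacobian(q).T @ np.asarray(force, dtype=float)

def numerical_jacobian(q, h=1e-6):
    """Central finite differences of hand_position, used only as a check."""
    q = np.asarray(q, dtype=float)
    columns = []
    for j in range(2):
        dq = np.zeros(2)
        dq[j] = h
        columns.append((hand_position(q + dq) - hand_position(q - dq)) / (2 * h))
    return np.column_stack(columns)

q = np.array([np.pi / 4, np.pi / 2])   # hypothetical posture
print(np.allclose(jacobian(q), numerical_jacobian(q), atol=1e-6))
print(joint_torque(q, [3.0, 0.0]))     # torques for a 3 N force along x
```

The finite-difference comparison is only a consistency check between Eq. 12 and the assumed kinematics; it does not validate the arm parameters themselves.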

Table 1 Parameters of the two link structure of Fig. 1b used in the simulation

            Mass (kg)   Length (m)   Center of mass from proximal joint (m)   Mass moment of inertia (kg m^2)
Upper arm   1.93        0.31         0.165                                    0.0141
Forearm     1.52        0.34         0.19                                     0.0188

For the torque resulting from muscle elasticity, a ratio $\kappa_d = \frac{1}{12}$ s of joint damping to stiffness was used, corresponding to a larger dependence on position error (Mirbagheri et al. 2000). $\mathbf{K}$ was used as the mean stiffness of five subjects measured in Gomi and Osu (1998):

$$\mathbf{K}(|\boldsymbol{\tau}_m|) = \begin{bmatrix} 10.8 + 3.18 |\tau_1| & 2.83 + 2.15 |\tau_2| \\ 2.51 + 2.34 |\tau_2| & 8.67 + 6.18 |\tau_2| \end{bmatrix} \text{Nm/rad} \tag{13}$$

where $\tau_1$ and $\tau_2$ denote shoulder and elbow torques, respectively.

For the reflexes, $\phi = 60$ ms was used as default for the time delay of the reflex response and $g_d = 2$ s was used as ratio of joint damping to stiffness, corresponding to a larger dependence on velocity error (Mirbagheri et al. 2000). $\mathbf{G} = \frac{1}{50} \mathbf{K}$ Nm/rad produces reflex gains increasing with muscle activation (Sinkjaer et al. 1988) such that the reflex contribution is approximately 25% of the total restoring force (Carter et al. 1990).

The simulations in this paper used simple Euler integration with 1 ms time step.

2.3 Stability in simulated human movements

This section explains what stability means for human movements, and illustrates the stability concepts using simulated arm movements. To this purpose, we examine the effect of a small perturbation ($\boldsymbol{\delta}$) on motion. To distinguish this small “random” perturbation from reproducible dynamics $\boldsymbol{\tau}_E$ we rewrite Eq. 4 as

$$\boldsymbol{\delta} + \boldsymbol{\tau}_E + \mathbf{f}(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}) = \boldsymbol{\tau}_{IDM} + \boldsymbol{\tau} + \mathbf{r}(\mathbf{q}, \dot{\mathbf{q}}, \mathbf{q}^*, \dot{\mathbf{q}}^*). \tag{14}$$

The aim is to arrive at a definition and measure of stability that can be used in an experimental context, hence the stability notion must be computable from the measurements, using a relatively small number of trials.

In free motion, i.e., without external force, the trajectory corresponding to $\boldsymbol{\tau}_{IDM}$ is approximately a straight line trajectory from the start to the target with a bell-shaped velocity profile. Non-negligible variations arise in repeated trials, but the ensemble of trajectories occupies a narrow cone shape around the straight line connecting the start point with the target (see Fig. 2a). The variation in the start point is neglected, corresponding to the experimental conditions. The variability from the ‘planned’ straight line trajectory does not imply that the free movement trajectory is unstable. Figure 2b clearly illustrates that small perturbations along the trajectory do not affect the entire trajectory, but are limited in time, suggesting that the planned task is stable and that the trajectory acts as an attractor.

The interaction with external dynamics shows different patterns. For example, when a position dependent divergent force field (DF) defined by

$$\boldsymbol{\tau}_E = \mathbf{J}^T \mathbf{F}_{DF}, \qquad \mathbf{F}_{DF} = - \begin{bmatrix} 450 & 0 \\ 0 & 0 \end{bmatrix} \mathbf{x} \tag{15}$$

is exerting a force $\mathbf{F}_{DF}$ on the hand during movement, we observe that the hand is pushed away from the straight line trajectory. The external dynamics amplify the motor torque variability and the trajectories occupy a far larger neighborhood around the straight line connecting start and end point (see Fig. 2a). The interaction with the external force field DF (Eq. 15) leads to instability, as is confirmed by the divergence after small perturbations applied during movement (Fig. 2b).

Not every external force interaction destabilizes. Consider for example a velocity dependent external force (VF) defined by

$$\boldsymbol{\tau}_E = \mathbf{J}^T \mathbf{F}_{VF}, \qquad \mathbf{F}_{VF} = - \begin{bmatrix} 13 & -18 \\ 18 & 13 \end{bmatrix} \dot{\mathbf{x}}. \tag{16}$$

Under the influence of this force field, the trajectories systematically deviate left from the planned straight line trajectory. Nevertheless, the task is successfully performed in that the end point is reached. Moreover, as is illustrated in Fig. 2b, small (time localized) perturbations lead to small deviations overall and the perturbed trajectories also satisfactorily reach the end point. We conclude that the VF leads to a trajectory that is stable although it differs from the interaction free ‘planned’ trajectory.

2.4 Movement deviation as a measure of stability for real movements

Inferring the stability of a particular response from observations of the response to perturbations requires the comparison of the unperturbed trajectory with the perturbed response. In experiments however, it is not possible to know the undisturbed trajectory exactly, due to the variability in repeated movements. Alternatively, one can infer stability from observing the deviation of the set of consecutive trajectories in repeated trials. The size of the set of deviations is affected by the amount of motor variability and the magnitude of unpredictable disturbances along the motion as well as by the stability properties of the overall system. The deviation will grow with the duration of the observation when the system is unstable, and remain bounded if the system is stable.

2.4.1 Simplified stability model

We use a simple conceptual model to illustrate the ideas. Let $y$ be a co-ordinate along a particular motion path, say $0 \le y \le 1$ (start to finish), which we call the reference path. Let $x$ be a coordinate locally orthogonal to this reference path. A new motion trajectory different from the reference


path due to some perturbation, starting at the same point and finishing near the end point will have a representation $x(y)$. A simple model (think of it as a linearization along the reference path of the dynamics forcing deviation away from the reference path) may be fitted as

$$\frac{d\,x(y)}{dy} = -\lambda \, x(y) + \varepsilon; \qquad x(0) = 0; \qquad 0 \le y \le 1, \tag{17}$$

with the solution

$$x(y) = \frac{\varepsilon}{\lambda} \left( 1 - e^{-\lambda y} \right). \tag{18}$$

The parameter $\lambda$, called the Lyapunov exponent, captures the stability. A positive Lyapunov exponent indicates stability, whereas a negative one indicates unstable dynamic interactions. The term $\varepsilon$ captures the variability along the path, which, in order to simplify matters, is assumed to be constant along the path. The reference path corresponds to zero perturbation. The mean deviation of the motion represented by Eq. 17 is then given by

$$e_a \equiv \int_0^1 x(y) \, dy = \frac{\varepsilon}{\lambda} \left[ y + \frac{e^{-\lambda y}}{\lambda} \right]_{y=0}^{y=1} = \frac{\varepsilon}{\lambda} \left( 1 + \frac{e^{-\lambda} - 1}{\lambda} \right) \approx \frac{\varepsilon}{2} - \frac{\varepsilon \lambda}{6} + O(\varepsilon \lambda^2). \tag{19}$$

The mean deviation grows with the size of the perturbation term. Stable interaction dynamics (positive Lyapunov exponent) decrease the effect of perturbations, and unstable dynamics (negative Lyapunov exponent) increase its effect.

Comparing the mean deviation along the path for two different interactions (different Lyapunov exponents, but same perturbation term) it follows that the difference is approximately proportional to the difference of the Lyapunov exponents:

$$e_a^{(1)} - e_a^{(2)} \approx \frac{\varepsilon}{6} \left( \lambda^{(2)} - \lambda^{(1)} \right) + O(\varepsilon \lambda^2). \tag{20}$$

A similar (mathematically more rigorous) treatment considering $\varepsilon$ as a stochastic variable would allow specifying how many trials would be theoretically required to decide on stability/instability with a given probability.

2.4.2 Empirical stability

The above discussion shows that stability is qualitatively reflected in the fact that trajectories corresponding to different trials are similar despite motor output variability. On the other hand, instability amplifies motor output variability and results in trajectories that diverge increasingly.¹ In line with the above discussion and to more quantitatively capture stability we propose using the mean absolute error of a family of $N$ movements relative to the mean path as a stability measure:

$$\mu_s = \frac{1}{N} \sum_{i=1}^{N} e_a^{(i)}, \tag{21}$$

where

$$e_a^{(i)} = \frac{1}{L} \int_0^Y \left| x^{(i)} - \bar{x} \right| dy. \tag{22}$$

¹ In some exceptional (i.e., low probability) cases, deviation caused by instability may be masked by the deviation due to motor output variability.

The deviation from neutral is measured with respect to the position along the y-axis, i.e. along the line from the starting point to the target, and not with respect to time. The deviation measured in the x-direction is $x^{(i)}(y)$, which corresponds to the $i$-th trial. $\bar{x}(y)$ is the mean over $N$ trials and plays the role of the reference trajectory. $Y$ is the y-displacement. The absolute error $e_a$ corresponds to the area between the path for the actual movement and the mean trajectory normalized by the path length $L$.

As is clear from the above discussion, such a measure is not equivalent to Lyapunov (or asymptotic or exponential) stability, but it encapsulates stability well. Furthermore, some instability may be masked by the variability between trials, i.e., will not appear in the deviation. Deviation alone is not an indication of stability as the size obviously depends on the interaction as well. However, deviation may be a more plausible criterion for physiological stability than formal criteria for stability: while there is no evidence that the central nervous system (CNS) is concerned with formal stability, the magnitude of trajectory deviation may well be considered by the CNS for planning motion (Burdet and Milner 1998; Harris and Wolpert 1998). Low deviation is critical for successful actions, as it means that movements corresponding to an action will always be similar, such that this action can be planned and the small variations can be corrected during movement.

To illustrate how this measure can be used to infer stability in real movements, we apply it to motions measured in interaction with the force fields of Eqs. 15 and 16 (Burdet et al. 2001; Osu et al. 2003). The mean stability measure is computed in NF, VF and DF for each of five subjects. $Y$ is the y-displacement when 550 ms has elapsed from start of movement. Considering the variance between the mean stability measures of the different subjects, a series of one-way ANOVA tests was used to determine whether the stability measure in VF or DF is significantly different from that in NF (Fig. 3a). As expected, the difference with the VF is not significant (P > 0.9) while that in the DF is significant (P < 0.03). This shows that the deviation measure of Eq. 21 can be used to infer motion stability. Fig. 3b shows that our simulated subject produces movements yielding similar stability measures in NF, VF and DF to that of subject “DWF”, whose parameters were used for the simulation.

The difference between the measure in the first 20 movements in the VF (Fig. 5a) and in NF is large (1/3 of the measure in NF) and tends to statistical significance (P = 0.06). Is the interaction with the VF unstable in initial trials? The reason for this apparent contradiction is that the CNS rapidly adapts to the VF. In fact, there is evidence suggesting that the inverse dynamics model, which compensates for the VF,

[Figure 3: panels a and b are box plots of the stability measure [m] (scale ×10⁻³) in NF, VF and DF, with legend entries for before and after learning and for the simulated and the real subject; panel c plots the stability measure [m] (scale ×10⁻³) against reflex delay [ms] from 40 to 220 ms.]

Fig. 3 Movement deviation as a measure of stability (defined in Sect. 2.4). a Stability measure computed for real movements in NF, VF and DF, before and after learning. We take the mean deviation for each of five subjects to form the sample distribution and do a series of one-way ANOVA relative to NF. Red dash represents the median, the box contains the 1st to 3rd quartiles, and the limits represent the maximum and minimum. The difference between DF and NF is significant (P < 0.03), showing instability in DF. After learning the difference is no longer significant (P > 0.48), corresponding to acquired stability. The difference between VF and NF is not significant (P > 0.37), as both interactions are stable. b Good match of stability measure between the simulated subject and the real subject (DWF) from which parameters were derived. c The stability is not very sensitive to long time delay in the reflexes

is formed during the initial few trials, and probably changes substantially with every new trial (Osu et al. 2003; Milner and Franklin 2005). As a consequence the trajectory changes from trial to trial. To infer the interaction with novel environmental dynamics, it is necessary to perform movements without permitting learning to occur, for example by introducing a few random VF trials in a series of NF trials, as for the “before effects” in Osu et al. (2003).
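The deviation measure of Eqs. 21 and 22 can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the analysis code used for Fig. 3: the trials are synthetic, generated from the simplified model of Eq. 18 with a random ε per trial, and the values of `lam`, `eps_sd`, the 25 cm reach and the trial count are hypothetical. The path length L is taken here as the arc length of each path, which is one possible reading of the normalization. The last two functions compare the closed form of Eq. 19 with its series approximation.

```python
import numpy as np

def trapezoid(values, y):
    """Trapezoidal rule on the grid y."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(y)))

def path_length(x, y):
    """Arc length of one hand path sampled at points (x, y)."""
    return float(np.sum(np.hypot(np.diff(x), np.diff(y))))

def stability_measure(X, y):
    """Eqs. 21-22. X has shape (N, n_samples): lateral position of each
    trial on a shared y grid. Returns (mu_s, per-trial e_a)."""
    x_mean = X.mean(axis=0)
    e_a = np.array([
        trapezoid(np.abs(x - x_mean), y) / path_length(x, y) for x in X
    ])
    return float(e_a.mean()), e_a

def synthetic_paths(lam, y, n_trials=20, eps_sd=0.02, seed=0):
    """Hypothetical ensemble: Eq. 18 with one random eps per trial."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, eps_sd, size=(n_trials, 1))
    return eps / lam * (1.0 - np.exp(-lam * y))

def mean_deviation_exact(eps, lam):
    """Closed form of Eq. 19 on 0 <= y <= 1."""
    return eps / lam * (1.0 + (np.exp(-lam) - 1.0) / lam)

def mean_deviation_approx(eps, lam):
    """Series approximation of Eq. 19, valid for small lam."""
    return eps / 2.0 - eps * lam / 6.0

y = np.linspace(0.0, 0.25, 251)   # 25 cm reach, in metres
mu_stable, _ = stability_measure(synthetic_paths(+10.0, y), y)
mu_unstable, _ = stability_measure(synthetic_paths(-10.0, y), y)
print(mu_stable, mu_unstable)
print(mean_deviation_exact(1.0, 0.1), mean_deviation_approx(1.0, 0.1))
```

Because both ensembles use the same seed, they share the same ε samples, so the difference between the two measures reflects only the sign of λ. With recorded data, `X` would instead hold the measured lateral positions resampled on a common y grid.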

2.5 Influence of reflex delay on motion stability

One of the main factors affecting the stability of a closed loop dynamic system is the time delay of feedback. However, the situation may be different for the control of the human arm, which possesses a “zero-delay feedback loop” constituted by muscle elasticity, in addition to neural feedback (McIntyre and Bizzi 1993).


Fig. 4 Simulations show that feedback error learning cannot compensate for unstable dynamics. a Hand trajectories in a stable interaction (VF) converge to straight movements after learning, but in an unstable interaction (DF) continue to diverge even after 90 trials. b The inverse dynamics model (IDM) acquired in DF (black) is small and negligible compared to that in VF (grey). The mean IDM torque pattern over the final 20 trials is displayed. c While in VF (grey) the IDM is smoothly acquired, in DF (black) the mean torque oscillates about zero with a large variance, i.e., no internal model is learned. d Illustration of why the inverse dynamic model oscillates around 0 when feedback error learning is used in unstable dynamics (Sect. 3.1)

Fig. 5 Experimental data of initial movements in VF and DF. a The analysis of signed error shows that the trajectories converge monotonically to the straight trajectory in VF, but oscillate left and right about the straight trajectory in DF. b The individual trials are displayed as dotted lines. Their mean is displayed as a solid line, and the dashed lines show the mean plus or minus one standard deviation. In DF, the sign convention for each subject was chosen so that the first trajectory deviated in the negative (i.e., left) direction

To examine how reflex feedback delay affects motion stability in our model, we varied the reflex delay and examined its effect on the deviation in response to 3 N perturbations applied between 200 and 400 ms after the onset of movements performed without motor output variability. We measured the mean deviation for perturbations applied in eight directions {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}. Fig. 3c shows that trajectory deviation is practically insensitive to even large reflex delays. The increase in mean deviation when the delay is doubled is small compared to inherent motor output variability. This can be explained by the fact that muscle elasticity plays the primary role in providing stability in our model, as it makes up about 75% of total resistance to perturbations. Effectively, stability decreases with increasing contribution from neural feedback.
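The role of the ratio between elastic and reflex contributions can be illustrated with a minimal one-degree-of-freedom sketch. This is not the two-joint arm model of the paper: it assumes a point mass at rest, an instantaneous spring and damper standing in for muscle elasticity, and a delayed position feedback standing in for the reflex. The function name `peak_deviation`, the mass, damping and stiffness values, and the choice of a single 3 N pulse lasting from 200 to 400 ms are all assumptions made for illustration.

```python
from collections import deque


def peak_deviation(k_elastic, k_reflex, delay, b=20.0, m=1.0,
                   force=3.0, dt=0.001, t_end=1.5):
    """Peak displacement [m] of a mass held by an instantaneous spring
    (k_elastic) and a delayed position feedback (k_reflex) after a force pulse.

    Hypothetical 1-DOF stand-in; all default values are assumptions.
    """
    n_delay = int(round(delay / dt))
    # history[0] always holds the position n_delay steps in the past.
    history = deque([0.0] * (n_delay + 1), maxlen=n_delay + 1)
    x, v, peak = 0.0, 0.0, 0.0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        f_ext = force if 0.2 <= t < 0.4 else 0.0   # assumed pulse timing
        x_delayed = history[0]
        a = (f_ext - b * v - k_elastic * x - k_reflex * x_delayed) / m
        v += a * dt            # semi-implicit Euler step
        x += v * dt
        history.append(x)      # oldest sample is dropped automatically
        peak = max(peak, abs(x))
    return peak


if __name__ == "__main__":
    for k_e, k_r in ((300.0, 100.0), (100.0, 300.0)):
        print("elastic", k_e, "reflex", k_r)
        for delay in (0.0, 0.03, 0.06, 0.12):
            print("  delay", delay, "s -> peak",
                  round(1000 * peak_deviation(k_e, k_r, delay), 2), "mm")
```

With the elastic term providing 75% of the total stiffness, the peak deviation in this sketch changes only modestly as the delay grows. When the proportions are reversed, the same delay produces a much larger deviation, in line with the statement that stability decreases with increasing contribution from neural feedback.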

3 Mechanisms to learn stable and unstable dynamics

3.1 A monotonic antisymmetric learning law cannot compensate for unstable dynamics

Taking the mean of the terms in Eq. 4 over movements (with the noise term averaging to zero and $f(q, \dot{q}, \ddot{q}) = \tau_{IDM}$):

$$\tau_E = r(q, \dot{q}, q^*, \dot{q}^*), \tag{23}$$

we see that the mean restoring force r corresponds to the dynamics of the task not yet incorporated into the IDM. Therefore, the IDM may be updated from r. Similarly, most algorithms from neurophysiological models (Albus 1971; Kawato et al. 1987; Katayama and Kawato 1993; Bhushan and Shadmehr 1999; Sanner and Kosha 1999) as well as nonlinear adaptive control and iterative learning applied to robotics [see Slotine (1991) and Burdet et al. (1998) for reviews] minimize the (squared) feedback error and have a monotonic learning law: a positive error produces a positive change of motor command and a negative error a negative change, which we therefore refer to as ‘monotonic antisymmetric’. To examine how such algorithms function in stable and unstable interactions, we performed simulations for the VF and DF using the following learning law:

$$\tau_{IDM}^{(i+1)} = \tau_{IDM}^{(i)} + \alpha \left( r_e^{(i)} + r_r^{(i)} \right), \tag{24}$$

where $i$ is the trial number and $0 < \alpha < 2$ the learning factor (Burdet et al. 1998). We see in Fig. 4a that such learning compensates well for the VF. After a few trials the trajectories become straight and similar to NF movements. In contrast, learning does not provide any improvement in performance in the DF. After 100 trials, the movements are still unstable and most do not reach the target. Because of the unpredictability of the unstable interaction with the DF, the dynamics experienced on one trial is not indicative of the dynamics of the next movement. As Fig. 4b, c shows, the part of the IDM corresponding to the external force will converge to the external dynamics for the VF and to zero for the DF, i.e., to the mean dynamics.

Figure 4d explains why a monotonic antisymmetric learning law cannot compensate for unstable dynamics. A small deviation in (for example) the positive direction is amplified by the instability. On the next trial, torque will be increased in the opposite direction due to learning in the previous trial, leading to a movement in the negative direction. For the same reason, the movement on the subsequent trial will again be in the opposite direction, i.e., in the positive direction. This oscillatory behavior, similar to that observed in experiments (Fig. 5), results in no IDM (Fig. 4b).²

Initial movements measured in VF and DF (Fig. 5; Osu et al. 2003), similar to the simulation of Fig. 4, suggest that the neural adaptive control mechanism involves an antisymmetric learning law. In a stable interaction (VF), the trajectories converge quickly and monotonically towards the straight line trajectory, while in an unstable interaction (DF) the trajectory oscillates to the left and right of the straight line trajectory.

² The stability of the above learning mechanism has been studied in detail in the literature on adaptive control and iterative control, see for example Anderson et al. (1986). It can be concluded that the learning will fail in the case of unstable dynamics, but may also fail in other situations, including some stable dynamics.

Fig. 6 Hand trajectories and stiffness in DF predicted by impedance matching. a Before learning (trials 1–6) and after learning (trials 91–96): after learning, movements become straight and similar to movements in NF. b Stiffness increases along the direction of instability and increases with noise level (s = 0.5, 1, 1.5, 2; scale bar 50 N/m). c The resulting deviation (stability measure [m] versus noise scaling factor) only increases slightly

3.2 Impedance compensation

We have seen that existing learning schemes from human motor control cannot compensate for unstable dynamics. However, humans can learn unstable dynamics to succeed in unstable tasks such as carving. Measurement of endpoint stiffness revealed that humans learn to perform stable movements in the DF by controlling the impedance magnitude and geometry (Burdet et al. 2001). This section presents an algorithm to realize such impedance compensation. We extend the torque dependent stiffness matrix $K_{IDM}(|\tau_m|)$ of Eq. 13 to

$$K = K_{IDM}(|\tau_m|) + K_S \tag{25}$$

such that $K_{IDM}(|\tau_m|)$ is the stiffness that arises due to the torque generated by the inverse dynamics model and $K_S$ is learned to compensate for destabilization originating in the environment. After every trial $i$, the stiffness $K$ to counteract destabilizing environmental forces is identified using

$$F^{(i)} - \bar{F} = K^{(i)} e^{(i)}, \tag{26}$$

where $e^{(i)}$ is the tracking error in the $i$-th trial and $\bar{F}$ is the mean endpoint force over the trials. The relation

$$K_S^{(i+1)} = (1 - \lambda)\, K_S^{(i)} + \lambda\, K^{(i)} \tag{27}$$

with λ ≤ 0.02 realizes a smooth update of $K_S$. The force signal can be measured directly or can be identified from the addition of the IDM and the impedance terms (Fig. 1). The tracking error e may be obtained from kinesthetic information provided by muscle spindles, and the force from Golgi tendon organs. We see in Fig. 6a that this algorithm is able to compensate for destabilization by the DF. The resulting stiffness is elongated in the direction of instability, corresponding to the experimental results (Burdet et al. 2001). In fact, stiffness will always become elongated along the direction of instability, similar to the joint space model of Tee et al. (2004). An interesting property of our model is that the magnitude


of learnt stiffness scales with the magnitude of motor noise (Fig. 6b), and the variability increases only slightly (Fig. 6c). To simulate this, the magnitude of the noise waveform was scaled with a parameter s > 0: τ  = sτ .

(28)
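The identification and smoothing steps of Eqs. 26 and 27 can be illustrated with a scalar sketch. It makes several assumptions that are not in the paper: the stiffness is a single number rather than a matrix, the environment is a hypothetical field whose force is exactly proportional to the tracking error with gain `k_env`, the mean force is taken as zero, and the tracking errors are drawn at random instead of coming from a simulated movement. The function name `learn_stiffness` and all numerical values are illustrative.

```python
import random


def learn_stiffness(k_env=300.0, lam=0.02, trials=200, seed=1):
    """Scalar illustration of Eqs. 26-27 under simplifying assumptions.

    k_env : gain of a hypothetical error-proportional field [N/m] (assumed)
    lam   : smoothing factor lambda of Eq. 27 (the text uses lambda <= 0.02)
    Returns the learned stiffness K_S after each trial.
    """
    rng = random.Random(seed)
    f_mean = 0.0   # mean endpoint force over trials, assumed zero here
    k_s = 0.0      # learned stiffness K_S, starting from no compensation
    history = []
    for _ in range(trials):
        e = rng.gauss(0.0, 0.005)        # tracking error of this trial [m]
        if abs(e) > 1e-6:                # skip trials with negligible error
            f = f_mean + k_env * e       # force experienced in the assumed field
            k_i = (f - f_mean) / e       # Eq. 26 solved for K in the scalar case
            k_s = (1.0 - lam) * k_s + lam * k_i   # Eq. 27: smooth update
        history.append(k_s)
    return history


if __name__ == "__main__":
    hist = learn_stiffness()
    for trial in (1, 10, 50, 100, 200):
        print("trial", trial, "K_S =", round(hist[trial - 1], 1), "N/m")
```

Because each trial moves $K_S$ only a fraction λ of the way towards the identified value, the learned stiffness rises gradually over many trials. A small λ trades speed of adaptation for robustness to trial-to-trial variability in the identified stiffness.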

4 Discussion

Stability is a critical factor for accurate and consistent performance, as it indicates the reproducibility of motions and the robustness to environmental as well as internal perturbations. However, it is difficult to conceptualize stability in human motion, in particular due to the variability in trials corresponding to the same planned action. Whether there is a trajectory which the neuro-mechanical control is tracking is controversial (Gomi and Kawato 1997; Hinder and Milner 2003); if such a desired trajectory exists, it is difficult to localize accurately (Won and Hogan 1995; Hodgson and Hogan 2000; Ostry and Feldman 2003). Therefore, inferring stability by perturbing the movement and making comparisons with predictions of undisturbed movements is not a viable strategy.

This study did not require any putative nominal trajectory to assess stability, but quantified stability directly from the mean deviation in consecutive movements. Application of this stability measure to real data showed that it can capture the main features of stability in a non-invasive way for a variety of interactive conditions. Movement deviation is a simple criterion to assess performance in a few trials that can be used, for example, in rehabilitation. Although our stability measure is not universal, i.e., particular unstable dynamics may have low deviation and trials with motor adaptation have to be avoided as instability may be wrongly attributed to modified motor commands, it can generally distinguish stable from unstable interactions, and can also be used to infer learning (Osu et al. 2003).

To examine stability of arm movements interacting with dynamics, we implemented a joint space model. This model computes the joint torque produced by muscles according to an inverse dynamics model of the planned movement, and considers motor output variability and torque dependent impedance.
Our simulations showed that motion stability, in the sense of reproducibility, is not much affected by reflex time delays below 300 ms, i.e., well within the physiological range. The probable reason is that both muscle elasticity and reflexes contribute to resistive forces to perturbations, and the reflex contribution is generally relatively small (Mirbagheri et al. 2000). Movement reproducibility may be particularly important to the CNS, which may use it to learn novel dynamics. In fact, existing learning schemes require stability and are based on it (Slotine 1991; Muramatsu and Watanabe 2003). However, in contrast to the ‘controller’ of human movements, which successfully compensates for unstable interactions (Burdet et al. 2001), existing learning schemes based on a monotonic antisymmetric update law fail as they can only learn the mean dynamics but are unable to attenuate very large variations

between trials. The human ‘controller’ may use impedance compensation in a similar way to what we have proposed in Sect. 3.2, to ensure successful movements without increasing trajectory deviation, despite the presence of (large) motor noise.

This model was developed assuming that the musculoskeletal system has simple joints and uses a joint-based approach. It does not consider complex muscle mechanics nor muscle geometry. Therefore, we expect that it may not be able to reproduce the adaptation to all environments equally well. In particular, the coupling of coactivation (i.e., stiffness) and reciprocal activation (i.e., force) is probably more complex than modeled here (Perreault et al. 2002). We are currently developing a biologically more realistic model in muscle space rather than in joint space, which combines control of force and impedance (Burdet et al. 2004).

Acknowledgements We thank S. Keerthi, C. J. Ong, J. Peters, V. Loo BT and an anonymous reviewer for their suggestions. This work was supported by the National University of Singapore, the National Institute of Singapore, the National Institute of Information and Communications Technology of Japan, the Natural Sciences and Engineering Research Council of Canada and the Human Frontiers Science Project.

References

Albus J (1971) A theory of cerebellar function. Math Biosci 10:25–61
Anderson BDO, Bitmead RR, Johnson CR, Kokotovic PV, Kosut R, Mareels IMY, Praly L, Riedle BD (1986) Stability of adaptive systems: averaging and passivity analysis. MIT, Boston, MA
Bhushan N, Shadmehr R (1999) Computational nature of human adaptive control during learning of reaching movements in force fields. Biol Cybern 81:39–60
Burdet E, Codourey A, Rey L (1998) Experimental evaluation of nonlinear adaptive controllers. IEEE Control Syst Magazine 18(2):39–47
Burdet E, Milner TE (1998) Quantization of human motions and learning of accurate movements. Biol Cybern 78:307–318
Burdet E, Osu R, Franklin DW, Yoshioka T, Milner TE, Kawato M (2000) A method for measuring endpoint stiffness during multi-joint arm movements. J Biomech 33:1705–1709
Burdet E, Osu R, Franklin DW, Milner TE, Kawato M (2001) The central nervous system skillfully stabilizes unstable dynamics by learning optimal impedance. Nature 414:446–449
Burdet E, Franklin DW, Osu R, Tee KP, Kawato M, Milner TE (2004) How are internal models of unstable tasks formed? In: Proceedings of IEEE international conference on engineering in medicine and biology society, IEEE/EMB, San Francisco, USA
Carter RR, Crago PE, Keith MW (1990) Stiffness regulation by reflex action in the normal human hand. J Neurophysiol 64:105–118
Colgate JE, Hogan N (1988) Robust control of dynamically interacting systems. Int J Control 48:65–88
De Wit CC, Siciliano B, Bastin G (1996) Theory of robot control. Springer, Berlin Heidelberg New York
Franklin DW, Burdet E, Osu R, Kawato M, Milner TE (2003A) Functional significance of stiffness in adaptation of multijoint arm movements to stable and unstable dynamics. Exp Brain Res 151:145–157
Franklin DW, Osu R, Burdet E, Kawato M, Milner TE (2003B) Adaptation to stable and unstable dynamics achieved by combined impedance control and inverse dynamics model. J Neurophysiol 90:3270–3282
Gomi H, Kawato M (1997) Human arm stiffness and equilibrium point trajectory during multi-joint movement. Biol Cybern 76:163–171


Gomi H, Osu R (1998) Task-dependent viscoelasticity of human multijoint arm and its spatial characteristics for interaction with environments. J Neurosci 18:8965–8978
Harris CM, Wolpert DM (1998) Signal-dependent noise determines motor planning. Nature 394:780–784
Hinder MR, Milner TE (2003) The case for an internal dynamics model versus equilibrium point control in human movement. J Physiol 549:953–963
Hodgson AJ, Hogan N (2000) Model-independent definition of attractor behaviour applicable to interactive tasks. IEEE Trans Syst Man Cybern C Appl Rev 30:105–118
Jacks A, Prochazka A, Trend PS (1988) Instability in human forearm movements studied with feed-back-controlled electrical stimulation of muscles. J Physiol 402:443–461
Katayama M, Kawato M (1993) Virtual trajectory and stiffness ellipse during multijoint arm movement predicted by neural inverse models. Biol Cybern 69:353–362
Kawato M, Furukawa K, Suzuki R (1987) A hierarchical neural-network model for control and learning of voluntary movement. Biol Cybern 57:169–185
Kawato M (1999) Internal models for motor control and trajectory planning. Curr Opin Neurobiol 9:718–727
Krakauer JW, Ghilardi MF, Ghez C (1999) Independent learning of internal models for kinematic and dynamic control of reaching. Nat Neurosci 2:1026–1031
Lackner JR, Dizio P (1994) Rapid adaptation to Coriolis force perturbations of arm trajectory. J Neurophysiol 72:299–313
McIntyre J, Bizzi E (1993) Servo hypotheses for the biological control of movement. J Mot Behav 25:193–202
Milner TE (1993) Dependence of elbow viscoelastic behaviour on speed and loading in voluntary movements. Exp Brain Res 93:177–180
Milner TE, Cloutier C (1993) Compensation for mechanically unstable loading in voluntary wrist movement. Exp Brain Res 94:522–532
Milner TE, Franklin DW (2005) Impedance control and internal model use during the initial stage of adaptation to novel dynamics. J Physiol 567:651–664
Mirbagheri MM, Barbeau M, Kearney RE (2000) Intrinsic and reflex contributions to human ankle stiffness: variation with activation level and position. Exp Brain Res 135:423–436
Muramatsu E, Watanabe K (2003) Feedback error learning control of time delay systems. In: Proceedings of annual conference of the society of instrument and control engineers, Fukui, pp 312–317

Mussa-Ivaldi FA, Hogan N, Bizzi E (1985) Neural, mechanical, and geometric factors subserving arm posture in humans. J Neurosci 5:2732–2743
Ostry DJ, Feldman AG (2003) A critical evaluation of the force control hypothesis in motor control. Exp Brain Res 221:275–288
Osu R, Burdet E, Franklin DW, Milner TE, Kawato M (2003) Different mechanisms involved in adaptation to stable and unstable dynamics. J Neurophysiol 90:3255–3269
Osu R, Kamimura N, Iwasaki H, Nakano E, Harris CM, Wada Y, Kawato M (2004) Optimal impedance control for task achievement in the presence of signal-dependent noise. J Neurophysiol 92:1199–1215
Perreault EJ, Kirsch RF, Crago PE (2002) Voluntary control of static endpoint stiffness during force regulation tasks. J Neurophysiol 87:2808–2816
Rancourt D, Hogan N (2001) Dynamics of pushing. J Mot Behav 33:351–362
Sanner RM, Kosha M (1999) A mathematical model of the adaptive control of human arm motions. Biol Cybern 80:369–382
Shadmehr R, Mussa-Ivaldi FA (1994) Adaptive representation of dynamics during learning of motor tasks. J Neurosci 14:3208–3224
Shadmehr R, Holcomb HH (1997) Neural correlates of motor memory consolidation. Science 277:821–825
Sinkjaer T, Toft E, Andreassen S, Hornemann BC (1988) Muscle stiffness in human ankle dorsiflexors: intrinsic and reflex components. J Neurophysiol 60:1110–1121
Slifkin AB, Newell KM (1999) Noise, information transmission, and force variability. J Exp Psychol Hum Percept Perform 25:837–851
Slotine JJE (1991) Applied nonlinear control. Prentice-Hall, Englewood Cliffs, NJ, USA
Tee KP, Burdet E, Chew CM, Milner TE (2004) A model of force and impedance in human arm movements. Biol Cybern 90:368–375
Won J, Hogan N (1995) Stability properties of human reaching movements. Exp Brain Res 107:125–136
Vidyasagar M (1993) Nonlinear systems analysis. Prentice Hall, Englewood Cliffs, NJ, USA