
Edited by: Matteo Zago, Polytechnic of Milan, Italy

Reviewed by: Nicola Francesco Lopomo, University of Brescia, Italy; Carlos D. Maciel, University of São Paulo, Brazil

This article was submitted to Biomechanics, a section of the journal Frontiers in Bioengineering and Biotechnology

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Modern statistics and greater computational power have opened novel possibilities for complex data analysis. While gait is the most extensively described motion in quantitative human motion analysis, descriptions of more challenging movements such as the squat or lunge are currently lacking in the literature. The hip and knee joints are exposed to high forces, and disorders of these joints cause high morbidity and costs. Pre-surgical kinetic data acquired on patient-specific anatomy are also scarce in the literature. Studying the normal inter-patient kinetic variability may encourage comparable studies and help initiate more personalized therapies within orthopedics.

Trials were performed by 50 healthy young males who were not overweight and were of approximately the same age and activity level. Spatial marker trajectories and ground reaction force registrations were imported into AnyBody Modeling System models based on subject-specific geometry and the state-of-the-art TLEM 2.0 dataset. Hip and knee joint reaction forces were obtained by simulation with an inverse dynamics approach. With these forces, a statistical model that accounts for inter-subject variability was created. To this end, we applied principal component analysis to enable variance decomposition. In this way, noise can be rejected while all waveform data are still considered, instead of relying on derived spatiotemporal parameters such as peak flexion or stride length, as done in many gait analyses. In addition, this paper is, to the authors’ knowledge, the first to investigate the generalization of kinetic model data toward the population.

Average knee reaction forces range up to 7.16 times body weight for the forward leg during the lunge. Conversely, during the squat, the load is evenly distributed. For both motions, a reliable and compact statistical model was created. In the lunge model, the first 12 modes account for 95.26% of the inter-individual population variance. For the maximal-depth squat, the first 14 modes account for 95.69%. Model accuracy increases when more principal components are included.

Our model design proved to be compact, accurate, and reliable. For population-level models in descriptive studies, the sample size must be at least 50.

In biomechanics, the safety and efficiency of novel surgical techniques, as well as the development of biocompatible products, ultimately rely on their being tested on humans through clinical trials. The complete development chain of a new surgical technique or implant and its introduction into clinical practice is both time-consuming and economically demanding. In addition, patient-specific surgery planning or implant design is known to improve the long-term outcome of an implant (

Lower limb kinetics can be estimated based on musculoskeletal models and ground force plates using inverse dynamics (

Hence, in order to create the foundations for the development and optimization of the design or the durability of orthopedic implants, it is mandatory to generate appropriate loading conditions that represent inter-patient variability across the population (

Recent developments in medical imaging have significantly increased the accuracy of three-dimensional computational anatomical representations, bringing out the anatomical differences within a given population (

While the gait cycle has been the most researched activity in the current literature, it is not particularly demanding for the lower limb joints. For the purpose of implant wear testing, implant fixation, and joint stability analysis, there are other more challenging activities commonly performed in daily living that might be of particular interest (

In sum, the purpose of this study is to build statistical models of deep squatting and forward lunging for applications in pre-clinical testing of orthopedic implants and surgery in an asymptomatic adult population and ultimately to analyze and validate the inter-individual variations in lower limb kinetics.

Fifty-three asymptomatic volunteers participated in the study. To eliminate sex and race differences and to reduce the potential influence of age and body mass index (BMI), only healthy Caucasian men who were not overweight and were aged between 17 and 25 years were included. The admission requirement was practicing sports for at least 2 h a week. After a short training, the subjects were asked to perform a smooth maximal-depth squat and a right forward lunge step five times, with a predetermined frequency and fluency. In addition, the volunteers underwent MRI of the full lower limbs. An ethics committee (Ghent University Hospital, Belgium) approved these investigations (EC2014/0286). The characteristics of our study population are listed in

Demographic and anthropometric characteristics of the study population.

Demographic descriptor | Mean (95% CI*) | Normal values |
Height (cm) | 181.79 (180.08–183.51) | Not applicable |
Weight (kg) | 71.75 (69.63–73.88) | Not applicable |
Body mass index (kg/m^{2}) | 21.70 (21.16–22.23) | 18.5–25 ( |
Sport activity (hours a week) | 3.40 (2.76–4.03) | Not applicable |
Center-edge angle (°) | 28.41 (27.19–29.63) | 25–39° ( |
Alpha angle (°) | 64.61 (62.38–66.84) | <55° ( |
Centrum-collum-diaphyseal angle or neck-shaft angle (°) | 129.24 (127.99–130.49) | 125–135° ( |
Femoral anteversion angle (°) | 9.40 (7.30–11.49) | <15° ( |

In both examinations, 28 reflective markers were attached to the skin over palpable anatomical landmarks. Skin markers are the obvious choice for investigating kinetics, but they are rather inaccurate. More accurate measurements with instrumented implants, by contrast, would raise ethical concerns.

Our motion capture acquisition strategy was based on a similar study by Deluzio et al. (

Motion capture musculoskeletal models were personalized with subject-specific bone geometry that was incorporated in a simulation model from the Twente Lower Extremity Model (TLEM 2.0) dataset (

Overview of data input for the motion capture musculoskeletal simulation model.

The output data from musculoskeletal models are numerous, multivariate, and multidimensional (

The beginning and end frames of all motion lab recordings are not useful because they contain irrelevant transients. Likewise, the position of the peak varies relative to the center of the recorded data. Hence, data alignment and trimming are essential before the subjects’ motion recordings are incorporated into a statistical model. These operations are executed using standard implementations in MATLAB (MathWorks, Natick, MA, United States).
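To make the alignment and trimming step concrete, the following Python sketch resamples one trimmed recording onto a 0 to 100% progress axis of 101 points, with the frame of peak knee flexion mapped to 50%. It is an illustration only and not the authors' MATLAB implementation: the function name `align_to_progress`, the `start` and `end` trimming frames, the piecewise-linear time warp, and the synthetic flexion curve are all assumptions made for this example.

```python
import numpy as np

def align_to_progress(knee_flexion, signal, start, end, n_half=50):
    """Resample `signal` between frames `start` and `end` onto 2 * n_half + 1
    progress points, placing the peak knee flexion frame at the midpoint.

    Hypothetical helper: the trimming frames are supplied by the caller.
    """
    knee_flexion = np.asarray(knee_flexion, dtype=float)
    signal = np.asarray(signal, dtype=float)
    # Frame of peak knee flexion inside the trimmed window.
    peak = start + int(np.argmax(knee_flexion[start:end + 1]))
    if not start < peak < end:
        raise ValueError("peak flexion must lie strictly inside the window")
    # Piecewise-linear time warp: start -> 0 %, peak -> 50 %, end -> 100 %.
    descent = np.linspace(start, peak, n_half + 1)
    ascent = np.linspace(peak, end, n_half + 1)[1:]
    query_frames = np.concatenate([descent, ascent])
    # Linear interpolation of the signal at the (fractional) query frames.
    return np.interp(query_frames, np.arange(signal.size), signal)

# Synthetic example (assumed data): flexion peaks at frame 70 of 120.
frames = np.arange(120)
flexion = 100.0 * np.exp(-((frames - 70) / 25.0) ** 2)
aligned = align_to_progress(flexion, flexion, start=10, end=110)
print(aligned.shape)
```

Applying the same warp to every force component of a recording keeps all waveforms of one subject on the same progress axis, so that they can later be stacked into a single feature vector.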

The frame recorded with the peak knee flexion angle is defined as 50% progress of the motion. Trimming is based on knee flexion. For the lunge, only the closed-chain part is considered; recordings in which the right foot is not on the right force plate are therefore excluded. The many arbitrary ways of executing an open-chain motion could be an important source of noise η. Noise is defined as artifacts arising when the input data are processed into the output data (

Interpolation is performed to ensure that all measurements are synchronized on a common time base. All trimmed measurements are expressed on a 0–50–100 progress scale, corresponding to the onset, the middle, and the finish of the motion, respectively. Each set of kinetic data is arranged in a feature vector and concatenated into a training matrix. The training data matrix

An observation expresses several dynamic parameters on a certain progress of the aligned lunge or squat motion from 0 to 100%. For each participant _{i} for the

The input matrices _{axis} consist all of 101 observations

_{o} for each observation

After normalization by row-wise standard deviation in [4] and [5] as well as mean centering in [6], a residual matrix

PCA is a powerful dimensionality reduction technique developed by Karl Pearson. It is not a method for investigating the central tendency of the data but rather their common variability. PCA is mathematically defined as an orthogonal linear transformation that maps the data such that most of the variance comes to lie in the first components. This allows us to create statistical models. Altogether, PCA is a reliable tool for capturing the salient features of waveform data (
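As a minimal sketch of the transformation described above, the Python code below standardizes a training matrix, computes the principal components through a singular value decomposition, and reconstructs the data from the first k modes. It assumes one subject per row and one feature per column (the orientation used in the article may differ), and the names `fit_pca` and `reconstruct` as well as the random test data are hypothetical.

```python
import numpy as np

def fit_pca(X):
    """PCA of standardized data (equivalent to correlation-matrix PCA).

    X has shape (n_subjects, n_features); this orientation is an assumption.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std = np.where(std > 0, std, 1.0)        # guard against constant features
    Z = (X - mean) / std                     # centered, unit-variance residuals
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigenvalues = s ** 2 / (X.shape[0] - 1)  # variance captured by each mode
    scores = U * s                           # PC scores, identical to Z @ Vt.T
    return mean, std, Vt, eigenvalues, scores

def reconstruct(mean, std, Vt, scores, k):
    """Approximate the original data from the first k modes."""
    return mean + std * (scores[:, :k] @ Vt[:k])

# Random stand-in data (assumed): 20 subjects, 30 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 30))
mean, std, Vt, eigenvalues, scores = fit_pca(X)
X_full = reconstruct(mean, std, Vt, scores, k=eigenvalues.size)
print(eigenvalues[:3] / eigenvalues.sum())   # variance share of first 3 modes
```

Because the data are standardized, the eigenvalues sum to the number of features, so each eigenvalue divided by that total gives the share of variance explained by its mode.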

Used in a statistical model, PCA makes it possible to generate population data from a small set of clinical data. The kinetic model should represent waveform data as a linear combination of vectors representing the primary modes of variation in the experimental data (

In Eq. [7], ^{T}^{T}^{T} x R^{T}^{T}xR_{ok}. Here,

The PC scores from a single waveform quantify the contribution of each feature. The variance of the scores for the ^{T} x R_{k} represents the variance of the k

The cumulative variance of each mode

The PC weight matrix

A set of patient data can be approximately reconstructed by using

As mentioned before,

Validation is defined as the process of ensuring that the dimensionality-reduced PCA model accurately represents real-world kinetics. Probably the most important problem arising with this process is the choice of the optimal number of the principal components to be retained. PCA projects the input data from a high dimensional space into a subspace of lower dimension, which can then further be divided into two separate subspaces: the

Further, four quantitative model parameters are investigated. “Goodness” measures are chosen according to the statistical shape modeling study of

The first validation test analyzes whether relevant information is retained by the model or, in other words, how well the original data can be reconstructed from the model given the number of principal components retained. Here, the root-mean-square error (RMSE) between the original training data and the reconstructed data is computed in Eq. [12] for models with 80, 90, 95, and 98% variance of the original data.
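The accuracy test can be sketched as follows in Python, under stated assumptions: the data are standardized before PCA, the number of modes is the smallest one reaching the requested variance fraction, the RMSE is taken per subject, and the result is summarized as median and interquartile range. The function name `accuracy_rmse` and the synthetic low-rank data are hypothetical; Eq. [12] itself is not reproduced here.

```python
import numpy as np

def accuracy_rmse(X, variance_fraction):
    """Reconstruct X from the modes reaching `variance_fraction` and return
    (number of modes, median RMSE, interquartile range of RMSE)."""
    X = np.asarray(X, dtype=float)
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    std = np.where(std > 0, std, 1.0)
    Z = (X - mean) / std
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest k whose cumulative variance reaches the requested fraction.
    k = min(int(np.searchsorted(cumulative, variance_fraction)) + 1, s.size)
    # Reconstruction mapped back to the original units.
    X_hat = mean + std * ((U[:, :k] * s[:k]) @ Vt[:k])
    rmse = np.sqrt(np.mean((X - X_hat) ** 2, axis=1))   # one value per subject
    q1, median, q3 = np.percentile(rmse, [25, 50, 75])
    return k, float(median), float(q3 - q1)

# Synthetic stand-in data (assumed): 50 subjects, 60 features, 5 latent modes.
rng = np.random.default_rng(1)
X = (rng.normal(size=(50, 5)) @ rng.normal(size=(5, 60))
     + 0.05 * rng.normal(size=(50, 60)))
results = {v: accuracy_rmse(X, v) for v in (0.80, 0.90, 0.95, 0.98)}
for v, (k, median, iqr) in results.items():
    print(f"{v:.0%} variance: {k} modes, RMSE {median:.4f} +/- {iqr:.4f}")
```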

The model will be compact enough if it can describe the variance in kinetic measurements with a minimal number of modes. Eq. [9] is used to describe the compactness with the cumulative variance for a certain number of modes.
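As a worked illustration of compactness, the Python sketch below computes the cumulative variance from the first 15 eigenvalues reported for the squat kinetic dataset in this article and counts the modes needed to reach a target. The total variance is not stated explicitly, so it is recovered here from the first eigenvalue and its reported share (204.85 corresponds to 33.80%); this is an assumption, as is the helper name `modes_needed`.

```python
import numpy as np

# First 15 eigenvalues of the squat model, as reported in this article.
eigenvalues = np.array([204.85, 85.11, 71.98, 58.21, 46.81, 31.26, 18.13,
                        15.18, 13.82, 9.55, 8.38, 6.37, 5.57, 4.65, 3.97])

# Assumption: total variance inferred from the first mode (33.80 % of total).
total_variance = eigenvalues[0] / 0.3380

# Cumulative share of variance explained by the leading modes.
cumulative = np.cumsum(eigenvalues) / total_variance

def modes_needed(cumulative, target):
    """Smallest number of leading modes whose cumulative variance >= target."""
    reached = np.nonzero(cumulative >= target)[0]
    if reached.size == 0:
        raise ValueError("target not reached by the listed modes")
    return int(reached[0]) + 1

needed = {target: modes_needed(cumulative, target)
          for target in (0.80, 0.90, 0.95)}
print(needed)  # {0.8: 6, 0.9: 10, 0.95: 14}
```

Under this assumption the counts agree with the dimensionality reported for the squat model at 80, 90, and 95% of variance; the 98% level cannot be checked this way because only 15 eigenvalues are listed.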

The model generalization quantifies the ability of models to represent new instances. The generalization ability is evaluated by performing a series of leave-one-out tests on the training data. The question here is: how many training samples are necessary to approach the population precisely? The generalization ability is therefore a means for

The generalization evolution gives the RMSE between the excluded subject data and the best-matched 95% variance model. The higher the _{g} test value in Eq. [13], the higher the precision of the median generalization value. Here, the number of models created for each number of training samples amounts to _{g} = 10,000.
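One possible reading of this leave-one-out procedure is sketched below in Python. For each training size, a subject is held out at random, a PCA model retaining 95% of the variance is fitted on a random subset of the remaining subjects, and the held-out waveform is reconstructed from its projection onto the retained modes. The sampling scheme, the function names, the mean-centering without standard-deviation scaling, and the small number of repetitions (200 instead of the 10,000 used in the study) are assumptions chosen to keep the example short.

```python
import numpy as np

def project_rmse(train, test, variance_fraction=0.95):
    """Fit PCA on `train`, reconstruct the unseen `test` row from the modes
    reaching `variance_fraction`, and return the RMSE."""
    mean = train.mean(axis=0)
    _, s, Vt = np.linalg.svd(train - mean, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(cumulative, variance_fraction)) + 1, s.size)
    modes = Vt[:k]
    scores = (test - mean) @ modes.T          # best-matching model instance
    test_hat = mean + scores @ modes
    return float(np.sqrt(np.mean((test - test_hat) ** 2)))

def generalization_curve(X, train_sizes, n_models=200, seed=0):
    """Median held-out RMSE for each number of training samples."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    medians = {}
    for size in train_sizes:
        errors = []
        for _ in range(n_models):
            order = rng.permutation(X.shape[0])
            held_out, train_idx = order[0], order[1:size + 1]
            errors.append(project_rmse(X[train_idx], X[held_out]))
        medians[size] = float(np.median(errors))
    return medians

# Synthetic stand-in data (assumed): 50 subjects, 60 features, 5 latent modes.
rng = np.random.default_rng(2)
X = (rng.normal(size=(50, 5)) @ rng.normal(size=(5, 60))
     + 0.05 * rng.normal(size=(50, 60)))
medians = generalization_curve(X, train_sizes=(5, 20, 40))
print(medians)
```

Plotting the median error against the number of training samples gives a curve of the kind used to judge how many subjects are needed before the held-out error approaches the in-sample accuracy.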

A population model is able to generate new data. The model specificity measures the soundness of new instances randomly generated by the developed model

We assume that the PCs of the model are normally distributed (

The RMSE is defined as the error between the virtually generated subject data and the most similar sample in the training dataset. The specificity value can be interpreted as the median approximation error of _{s} generated subjects. The higher the _{s} test value, the higher the precision of the specificity. Here, the number of models created is set to _{s} = 1,000,000.
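The specificity test can be illustrated with the Python sketch below. It draws PC scores from independent normal distributions whose variances equal the eigenvalues, builds virtual subjects from the retained modes, and records for each the RMSE to the closest training sample; the median of these values is the specificity. The function name `specificity`, the mean-centering without standard-deviation scaling, the synthetic data, and the reduced number of generated instances (500 rather than 1,000,000) are assumptions for this example.

```python
import numpy as np

def specificity(X, variance_fraction=0.95, n_generated=500, seed=0):
    """Median RMSE between randomly generated instances and their nearest
    training sample, plus the individual nearest-sample errors."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    eigenvalues = s ** 2 / (X.shape[0] - 1)
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = min(int(np.searchsorted(cumulative, variance_fraction)) + 1, s.size)
    # PC scores assumed normally distributed with variance = eigenvalue.
    scores = rng.normal(size=(n_generated, k)) * np.sqrt(eigenvalues[:k])
    generated = mean + scores @ Vt[:k]
    nearest = np.empty(n_generated)
    for i, instance in enumerate(generated):
        rmse_to_training = np.sqrt(np.mean((X - instance) ** 2, axis=1))
        nearest[i] = rmse_to_training.min()   # most similar training sample
    return float(np.median(nearest)), nearest

# Synthetic stand-in data (assumed): 50 subjects, 60 features, 5 latent modes.
rng = np.random.default_rng(3)
X = (rng.normal(size=(50, 5)) @ rng.normal(size=(5, 60))
     + 0.05 * rng.normal(size=(50, 60)))
median_error, nearest = specificity(X)
print(f"specificity (median RMSE): {median_error:.4f}")
```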

The average hip and knee peak flexion angles are 95° and 104° for the lunge and 107° and 112° for the squat, respectively. The average peak hip joint reaction force (HJRF) amounts to 3.08 times body weight (BW) for the maximal-depth squat and 4.76 BW for the lunge. The mean peak knee joint reaction forces (KJRF) are higher still: 4.52 BW for the squat and 7.16 BW for the lunge. The trimmed original waveform data from HJRF and KJRF of our musculoskeletal model are represented by gray curves in

A statistical model of kinetic output data from the AMS was made for deep squatting and another one for lunging.

Scree plot with the cumulative variance of the modes (or principal components) in the lunge (orange) and squat (purple) kinetic model.

Relation between the kinetic waveform simulation output and the squat progress for each individual sample in gray. Mean values of the measurements in green ±2 standard deviations of the first mode in red and blue. The first mode accounts for 33.80% of the inter-subject population variance. Note the different

Mean values of joint reaction forces during deep squatting in green ±2 standard deviations of the second mode in red and blue. The second mode accounts for 14.05% of the inter-subject population variance.

Mean values of joint reaction forces during deep squatting in green ±2 standard deviations of the third mode in red and blue. The third mode accounts for 11.88% of the inter-subject population variance.

Mean values of joint reaction forces during lunging in green ±2 standard deviations of the first mode in red and blue. The first mode accounts for 40.87% of the inter-subject population variance.

Mean values of joint reaction forces during lunging in green ±2 standard deviations of the second mode in red and blue. The second mode accounts for 15.07% of the inter-subject population variance.

Mean values of joint reaction forces during lunging in green ±2 standard deviations of the third mode in red and blue. The third mode accounts for 10.46% of the inter-subject population variance.

RMSE for the original squat training data versus reconstructed squat data with an increasing number of principal components on the

Validation analyses of the squat and lunge statistical kinetic model. We consider squat and lunge models that capture 80, 90, 95, and 98% of inter-individual population variance.

Validation summary | Squat model | | | | Lunge model | | | |
% of inter-variability in the population | 80% | 90% | 95% | 98% | 80% | 90% | 95% | 98% |
Model accuracy RMSE (median ± IQR**) (BW) | 0.0149 ± 0.0122 | 0.0107 ± 0.0087 | 0.0075 ± 0.0064 | 0.0054 ± 0.0048 | 0.0248 ± 0.0210 | 0.0162 ± 0.0167 | 0.0132 ± 0.0126 | 0.0082 ± 0.0083 |
Dimensionality* | 6 | 10 | 14 | 19 | 5 | 9 | 13 | 17 |
Model specificity RMSE (median ± IQR**) (BW) | 0.1582 ± 0.0943 | 0.1581 ± 0.0948 | 0.1583 ± 0.0946 | 0.1584 ± 0.0943 | 0.1291 ± 0.0831 | 0.1310 ± 0.0815 | 0.1314 ± 0.0809 | 0.1320 ± 0.0803 |

Choosing the optimal amount of principal components for the squat kinetic datasets.

PC | Eigenvalue | Percentage of variance | Cumulative variance | Rank of roots | Equality of roots |
1 | 204.85 | 33.80 | 33.80 | 0.001 | 0.001 |
2 | 85.11 | 14.04 | 47.85 | 0.001 | 0.001 |
3 | 71.98 | 11.88 | 59.73 | 0.001 | 0.001 |
4 | 58.21 | 9.61 | 69.33 | 0.001 | 0.001 |
5 | 46.81 | 7.73 | 77.06 | 0.001 | 0.001 |
6 | 31.26 | 5.16 | 82.22 | 0.001 | 0.001 |
7 | 18.13 | 2.99 | 85.21 | 0.001* | 0.001 |
8 | 15.18 | 2.50 | 87.71 | 1 | 0.001 |
9 | 13.82 | 2.28 | 89.99 | 1 | 0.001 |
10 | 9.55 | 1.58 | 91.57 | 1 | 0.001 |
11 | 8.38 | 1.38 | 92.95 | 1 | 0.001 |
12 | 6.37 | 1.05 | 94.00 | 1 | 0.001 |
13 | 5.57 | 0.92 | 94.92 | 1 | 0.001 |
14 | 4.65 | 0.77 | 95.69 | 1 | 0.005* |
15 | 3.97 | 0.66 | 96.35 | 1 | 0.078 |

Regarding

Accuracy evolution of kinetic lunge data with log–log scaling (boxplot with root-mean-square error of the reconstructed data with 95% variance versus the original training data) for different levels of prior knowledge expressed as amounts of training data in a kinetic model. The green horizontal line indicates the in-sample target accuracy.

The validation analysis confirmed that our models have a high degree of compactness and accuracy. Many types of noise reside in the higher components, and the PCA technique has adequately allowed the error variance to be rejected from the model. The meaningful variance is spread over the first 12 or 14 components. This multidimensionality describes the salient features in the data, which could eventually be linked to the varying characteristics of the study population. A common source of meaningless variance originates from data alignment. This cannot be avoided, because aligning the data more stringently would introduce supplementary noise. Since all subjects have a BMI lower than 25, skin shift errors during movements are limited (

Regarding the lunge, the model describes only the closed-chain part of the motion, for two reasons. First, femoroacetabular impingement and joint reaction forces are more pronounced at higher flexion (

The dominant mode is supposed to describe the overall variance (

The second mode of the squat model indicates that a high HJRF component in the transverse plane results in high KJRF components in the frontal and sagittal plane in order to counterbalance the downward force at the hip. The third mode correlates the depth of squatting with the joint reaction force components in the frontal plane. For the lunge, the association of the frontal joint reaction force components is mainly summarized in the second mode. Finally, according to our interpretation, the third mode of the lunge model may take alignment errors into account.

The RMSE values for model accuracy are far below 0.05 BW, as opposed to similar studies. The specificity was almost equal for models with 80, 90, 95, and 98% of variance, which calls into question the relevance of taking the model specificity into account in this setting. According to the generalization evolution, a minimum of 50 samples is enough to provide reliable models at 0.1 BW precision for both squat and lunge motion. Nevertheless, we recommend exceeding this threshold because the in-sample RMSE is still lower, especially for the squat. Note that gender, age group, BMI group, and race differences are not included here. Therefore, it is very likely that, in more heterogeneous populations such as the elderly, 50 samples will be too few to ensure reliable models.

Unfortunately, electromyography data were not collected during this study. Such data could provide information about muscle activation and muscle strength. Motor unit action potentials can be registered non-invasively using surface electromyography. It has been stated several times that muscle activation patterns depend on several aspects, such as training level and osteoarthritis (

By applying correlation matrix PCA to obtain uncorrelated maximum-variance linear combinations and given that there is only kinetic data with limited scaling differences, some more PCs are required to account for the same amount of covariance compared to classical covariance matrix PCA (

The most important limitation of the present work, however, relates to the population under investigation, namely young, male Belgian adolescents, and to the unknown extent to which the findings can be extrapolated to other populations. Nevertheless, in general terms, we expect our results to be representative, by extension, of a Western European population.

We created two models that describe kinetics from both hip and knee joint, contrary to the limited number of studies available with PCA analyses of waveform data considering the knee only (

The datasets generated for this study are available on request to the corresponding author.

The studies involving human participants were reviewed and approved by the Commission for Medical Ethics, UZ Gent. The patients/participants provided their written informed consent to participate in this study.

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

PG is an employee of AnyBody Technology. No financial benefits have been received or will be received from any commercial party related directly or indirectly to the subject of this article. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The authors would like to thank Ashwin Schouten for his contribution to the musculoskeletal modeling in the AMS.