The Madden–Julian oscillation (MJO) is a major tropical weather system and one of the largest sources of predictability for subseasonal-to-seasonal weather forecasts. Skillful prediction of the MJO has been a highly active area of research due to its large socio-economic impacts. Silini et al., herein S21, developed machine learning models to predict the MJO, for which they claimed a prediction skill of 26–27 days over all seasons and 45 days for boreal winter (December–February; DJF). If true, this would make the skill of their model competitive with that of state-of-the-art dynamical MJO prediction systems, at 20–35 days. However, here we show that the MJO prediction skill was calculated incorrectly in S21, which spuriously increased the apparent performance of their model. The correctly computed skill of their model is substantially lower than that reported in S21: the skill for all seasons drops to 11–12 days, and the skill for forecasts initialized during DJF drops to 15 days. Our findings clarify that the S21 machine learning model is not competitive with state-of-the-art numerical weather prediction models in predicting the MJO.

A correction to this article is available online at

corrected publication 2024

Prediction of the Madden–Julian oscillation (MJO) has been an active area of research since the 1990s^{1,2} because of the recognized importance of the MJO as a source of predictability for subseasonal-to-seasonal weather forecasts. There have been efforts to improve forecasts of the MJO and to standardize their assessment^{3}, including those of the WGNE MJO Task Force^{4,5}. These coordinating bodies and studies have developed frameworks and metrics that allow fair comparison of MJO simulation and prediction performance across different models. Recent studies using these comparison frameworks have documented impressive improvement in MJO prediction^{1,3}, largely due to dynamical models optimized for MJO prediction^{5}. The best dynamical models can skillfully predict the MJO beyond 30 days^{1}, with most showing 20–25 days of prediction skill^{1}.

Silini et al. (2021; hereafter S21)^{6} introduced a set of machine learning models to predict the Real-time Multivariate MJO index (RMM)^{7}, an index commonly used to identify the MJO. The RMM represents the state of the MJO with two components, RMM1 and RMM2, which correspond to the leading pair of empirical orthogonal functions of outgoing longwave radiation and zonal winds at 850 and 200 hPa. S21 trained several types of machine learning models to forecast the RMM indices, using past values of the indices as inputs and their future values as outputs. The details are described in S21.

A key metric for assessing the prediction skill of MJO forecasts that has been widely used in the weather and climate community is the bivariate correlation coefficient (COR)^{1,4}. COR at a lead time $\tau$ is calculated via Eq. (1):

$$\mathrm{COR}(\tau) = \frac{\sum_{t} \left[ a_1(t)\, b_1(t,\tau) + a_2(t)\, b_2(t,\tau) \right]}{\sqrt{\sum_{t} \left[ a_1(t)^2 + a_2(t)^2 \right]} \; \sqrt{\sum_{t} \left[ b_1(t,\tau)^2 + b_2(t,\tau)^2 \right]}} \quad (1)$$

where, for a fixed lead time $\tau$, the sums are taken over all forecasts, indexed by their verification time $t$.

Here $a_{1,2}(t)$ are the observed values of RMM1 and RMM2 at time $t$, and $b_{1,2}(t,\tau)$ are the corresponding model-predicted values of RMM1 and RMM2, respectively, for time $t$ at a lead time $\tau$.
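As a concrete illustration, the bivariate COR of Eq. (1) amounts to a few lines of array arithmetic. The sketch below (our own, using NumPy; it is not taken from the S21 code, and the array names are ours) sums over forecasts at a fixed lead time, as the equation requires:

```python
import numpy as np

def bivariate_cor(a1, a2, b1, b2):
    """Bivariate correlation (COR) per Eq. (1).

    a1, a2 : observed RMM1/RMM2, one value per forecast verification time
    b1, b2 : predicted RMM1/RMM2 at a fixed lead time, one value per forecast
    The sums run over forecasts, not over lead times.
    """
    a1, a2, b1, b2 = map(np.asarray, (a1, a2, b1, b2))
    num = np.sum(a1 * b1 + a2 * b2)
    den = np.sqrt(np.sum(a1**2 + a2**2)) * np.sqrt(np.sum(b1**2 + b2**2))
    return num / den

# Idealized check: a perfect forecast gives COR = 1, and a forecast that is
# 90 degrees out of phase in the (RMM1, RMM2) plane gives COR = 0.
t = np.arange(200) * 2 * np.pi / 50          # idealized MJO phase angle
a1, a2 = np.cos(t), np.sin(t)                # "observed" RMM1, RMM2
print(bivariate_cor(a1, a2, a1, a2))                                   # ~1.0
print(bivariate_cor(a1, a2, np.cos(t + np.pi / 2), np.sin(t + np.pi / 2)))  # ~0.0
```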

While S21 states that Eq. (1) was used to evaluate the prediction skill of their models, the quantity actually computed in their code corresponds to Eq. (2):

$$\mathrm{COR}_{\mathrm{S21}}(\tau) = \frac{1}{N} \sum_{t_0} \frac{\sum_{t=t_0}^{t_0+\tau} \left[ a_1(t)\, b_1(t_0, t-t_0) + a_2(t)\, b_2(t_0, t-t_0) \right]}{\sqrt{\sum_{t=t_0}^{t_0+\tau} \left[ a_1(t)^2 + a_2(t)^2 \right]} \; \sqrt{\sum_{t=t_0}^{t_0+\tau} \left[ b_1(t_0, t-t_0)^2 + b_2(t_0, t-t_0)^2 \right]}} \quad (2)$$

Note also that here, $t_0$ denotes the initial date of the forecast and $N$ is the number of forecasts. In Eq. (2), the inner sums are taken over lead times within a single forecast, from its initialization at $t_0$ out to lead $\tau$, rather than over all forecasts at a fixed lead time as in Eq. (1); the resulting per-forecast values are then averaged over initialization dates.

Personal communication with the authors of S21 confirmed that this error was unintentionally made during the calculation of COR. We emphasize that Eq. (2) is not a measure of prediction skill at a fixed lead time: by folding the highly correlated early portion of each forecast into the statistic for every longer lead, it systematically inflates the apparent skill.
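The inflation can be demonstrated with a synthetic example (our own construction, not data from S21): take an idealized sinusoidal RMM trajectory and forecasts whose phase error grows linearly with lead time, so that the correct Eq. (1) correlation at lead τ is exactly the cosine of the accumulated phase error. Computing the same forecasts' score per Eq. (2), summing over lead times within each forecast and then averaging over initializations, mixes the skillful early leads into every longer lead and yields a far more optimistic curve:

```python
import numpy as np

omega = 2 * np.pi / 50        # idealized 50-day MJO period (assumed)
drift = np.pi / 40            # forecast phase error per day of lead (assumed)
n_init, n_lead = 300, 31      # initialization dates; lead times 0..30 days

t0 = np.arange(n_init)
tau = np.arange(n_lead)
t = t0[:, None] + tau[None, :]            # verification time for each (t0, tau)

# Observations, and forecasts whose phase drifts away from truth with lead.
a1, a2 = np.cos(omega * t), np.sin(omega * t)
b1 = np.cos(omega * t + drift * tau)
b2 = np.sin(omega * t + drift * tau)

# Eq. (1): at each lead, sum over forecasts (axis 0).
num = (a1 * b1 + a2 * b2).sum(axis=0)
den = np.sqrt((a1**2 + a2**2).sum(axis=0)) * np.sqrt((b1**2 + b2**2).sum(axis=0))
cor_eq1 = num / den                       # equals cos(drift * tau) here

# Eq. (2): cumulative sums over lead time within each forecast (axis 1),
# then average the per-forecast values over initialization dates.
num_c = np.cumsum(a1 * b1 + a2 * b2, axis=1)
den_c = np.sqrt(np.cumsum(a1**2 + a2**2, axis=1)) * \
        np.sqrt(np.cumsum(b1**2 + b2**2, axis=1))
cor_eq2 = (num_c / den_c).mean(axis=0)

print(cor_eq1[30], cor_eq2[30])   # Eq. (1) is ~-0.71 here; Eq. (2) is ~+0.30

# "Skill" as the first lead at which COR drops below 0.5:
skill_eq1 = int(np.argmax(cor_eq1 < 0.5))
skill_eq2 = int(np.argmax(cor_eq2 < 0.5))
print(skill_eq1, skill_eq2)       # the incorrect metric markedly extends apparent skill
```

The quantitative gap depends on the assumed drift rate, but the direction of the bias does not: Eq. (2) can never fall off as fast as Eq. (1) when skill decays with lead time.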

To quantify the impact of this error on the results in S21, we compare the prediction skill of the S21 model evaluated by the correct COR via Eq. (1) with the skill evaluated via the incorrect Eq. (2).

The COR values calculated correctly (solid lines) per Eq. (1) are substantially lower, at all lead times, than those calculated via Eq. (2) (dashed lines).

In Fig. 1, the skill computed via Eq. (2) reproduces the values reported in S21 (26–27 days over all seasons^{6}). However, the correct skill computed via Eq. (1) is far lower: 11–12 days over all seasons and 15 days for forecasts initialized during DJF.

The discussion in S21 regarding the relative merits of their model is invalid because the skill was computed incorrectly. It turns out that the machine learning framework for MJO prediction in S21 is not competitive with state-of-the-art dynamical models. Their models show skill comparable to that of classical linear models of the MJO^{1,8}, which use simple linear regression techniques to forecast the RMM and achieve approximately 1–2 weeks of skill. This implies that current state-of-the-art dynamical models remain the most skillful method for MJO prediction.
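For orientation, a lag-regression baseline of the kind alluded to above can be sketched in a few lines: fit a linear map from the current (RMM1, RMM2) state to the state τ days ahead by least squares. The example below is our own illustration on an idealized, noise-free oscillation (for which the lagged map is an exact rotation, so the fit is essentially perfect); it is not a reimplementation of the cited linear models, and real RMM data would of course include noise, damping, and seasonal dependence that limit such a baseline to roughly 1–2 weeks of skill.

```python
import numpy as np

def fit_lag_regression(rmm, tau):
    """Least-squares linear map predicting RMM tau days ahead.

    rmm : array of shape (N, 2) with columns RMM1, RMM2
    Returns a (2, 2) coefficient matrix C such that rmm[t] @ C ~ rmm[t + tau].
    """
    X, Y = rmm[:-tau], rmm[tau:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

# Idealized noise-free RMM: a pure 50-day oscillation of unit amplitude.
t = np.arange(400)
omega = 2 * np.pi / 50
rmm = np.column_stack([np.cos(omega * t), np.sin(omega * t)])

tau = 10
C = fit_lag_regression(rmm, tau)
pred = rmm[:-tau] @ C
err = np.max(np.abs(pred - rmm[tau:]))   # essentially zero in this idealized case
```

In this noise-free setting the fitted matrix C is simply the rotation by the phase advanced over τ days, which is why the residual vanishes; with observed RMM the same fit would yield a damped rotation and finite forecast error.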

We also note that other machine learning models have demonstrated performance exceeding that of S21^{9–11}. This suggests that the machine learning framework of S21, which employs only the RMM index as input, may be less advantageous than existing machine learning approaches that use additional input variables, although a more detailed comparison would be needed to fully evaluate this claim. Evaluation of the S21 model using the correct equation highlights that their models do not outperform other existing machine learning approaches for MJO prediction.

We do not want to discourage or dissuade continued research on prediction of the MJO using machine learning. However, carefully adhering to the best practices established by the MJO community, and correctly employing standardized metrics to evaluate model performance, is necessary for fair comparison across different models. This is especially true given the important implications that MJO prediction has for subseasonal-to-seasonal forecasting in both scientific and applied domains.

Preceding studies have suggested that the inherent predictability of the MJO extends to 6–7 weeks^{12,13}. The initial claim of S21 that MJO prediction skill exceeds 60 days in certain phases or seasons seemed especially difficult to reconcile with these predictability estimates. Correcting the evaluation equation indeed reduced the model performance to well below these estimates. Nonetheless, both dynamical and machine learning approaches to MJO prediction should continue striving to meet these predictability limits, and future research in this area will continue to illuminate whether any modeling framework can reach this goal.

All members of the WGNE MJO Task Force are thanked for discussion of these results and for formulating this response. T.S. was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI grant 21K13991. Z.K.M. was supported by the National Science Foundation under Award No. 2020305. This work was also supported, in part, by NOAA grant NA19OAR4590151, NSF grant AGS-1841754, NOAA grant NA22OAR4590168, and NOAA grant NA22OAR4310608. D.K. was supported by the New Faculty Startup Fund from Seoul National University. H.K. was supported by the National Research Foundation of Korea (NRF) grant NRF-RS-2023-00278113.

Z.K.M., E.A.B., and E.D.M. initiated the work through discussion regarding S21’s initial findings. Z.K.M. conducted the analysis and drafted the initial document. Z.K.M. and T.S. coordinated editing. All authors contributed to discussion of the results and drafting of the final document.

RMM data used for this study is available from the Australian Government Bureau of Meteorology:

The Keras TensorFlow-trained feed-forward neural network of S21 can be found at the repository indicated in S21^{14}. Code to reproduce the results of S21 and the correct computation of COR is available at

The authors declare no competing interests.