Entropy

Article

Towards a Measure for Characterizing the Informational Content of Audio Signals and the Relation between Complexity and Auditory Encoding

Daniel Guerrero 1,*, Pedro Rivera 2, Gerardo Febres 3 and Carlos Gershenson 2,4,5

1 Posgrado en Ciencia e Ingeniería de la Computación, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
2 Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico; pedro.rivera@c3.unam.mx (P.R.); cgg@unam.mx (C.G.)
3 Departamento de Procesos y Sistemas, Universidad Simón Bolívar, Sartenejas, Baruta 1080, Miranda, Venezuela; gerardofebres@usb.ve
4 Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
5 Lakeside Labs GmbH, Lakeside Park B04, 9020 Klagenfurt am Wörthersee, Austria
* Correspondence: dguerrerog77@gmail.com

Citation: Guerrero, D.; Rivera, P.; Febres, G.; Gershenson, C. Towards a Measure for Characterizing the Informational Content of Audio Signals and the Relation between Complexity and Auditory Encoding. Entropy 2021, 23, 1613. https://doi.org/10.3390/e23121613

Academic Editor: Amos Maritan

Received: 7 October 2021
Accepted: 25 November 2021
Published: 30 November 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract: The accurate description of a complex process should take into account not only the interacting elements involved but also the scale of the description. Therefore, there cannot be a single measure for describing the associated complexity of a process nor a single metric applicable in all scenarios.
This article introduces a framework based on multiscale entropy to characterize the complexity associated with the most identifiable characteristic of songs: the melody. We are particularly interested in measuring the complexity of popular songs and identifying levels of complexity that statistically explain the listeners’ preferences. We analyze the relationship between complexity and popularity using a database of popular songs and their relative position in a preferences ranking. There is a tendency toward a positive association between complexity and acceptance (success) of a song that is, however, not significant after adjusting for multiple testing.

Keywords: multiscale complexity; entropy; information content; auditory encoding; music

1. Introduction

Despite sound’s intrinsic complexity, the human brain can decode and process it to extract valuable information from its environment. The brain can estimate distances, roughly identify the materials producing a specific sound, and even estimate the number of objects producing the sound [1,2].

The brain also assesses the different sounds it perceives and orders them according to our preferences. When hearing a sound, it is easy to classify it as pleasant or unpleasant. Even when the precise elements and processes involved in this decision are not clear, we are perfectly conscious of the final result of this evaluation. In particular, when our brain listens to music, it performs a classification process, and this classification is based on the intrinsic properties of the music. We can say that these intrinsic properties make up music’s informational content.

Some authors [3] support the idea that sound preferences are dominated by a trade-off between the simple and the complicated (the expected and the unexpected, the regular and the random). When a song is too simple, it does not generate the necessary stimuli to maintain the listener’s attention.
On the other hand, if the song is too complicated, in the sense that it offers no recognizable patterns and requires too much information to describe (as noise does), it is not attractive either. This suggests the existence of an intermediate, “optimal” balance between these two extremes.

There have been several proposals to measure and characterize the informational content of a musical segment. These approaches range from analyzing the motifs of the network associated with the transition between notes in a song [4], to the analysis of the underlying language in digital format [5]. Nevertheless, there is no clear definition that captures the complexity and informational content of a song.

The present article explores the relationship between complexity and preferences using music as our object of study. To achieve this, we define a metric to characterize the complexity of a musical segment. Then we explore to what extent the complexity of a song affects its degree of acceptance. We evaluate multiscale entropy as a candidate to characterize the complexity associated with the melody of a musical segment. Specifically, we study the correlation between multiscale entropy and the listener’s preferences considering pitch intervals (in the musical sense) at different periods of a song.

The paper is structured as follows.
In Section 2, we survey the most relevant literature on the relationship between music and complexity and describe the different approaches to the problem. Section 3 describes the complexity metrics we use, the data, and the processing transformations involved. Section 4 presents the most important findings derived from our analysis. Finally, Section 5 provides a summary of the contributions and limitations of our work. We end with some proposals for future work.

2. Background and Related Work

In 2015, Febres et al. [6] computed the informational entropy of languages by applying Shannon’s information metric, entropy [7]. This assessment of language information used words as the symbols making up languages. Later, Febres and Jaffe [5] applied similar ideas to determine the information content of songs. Since there were no words in this case, the authors analyzed the information content of music by using the language associated with the Musical Instrument Digital Interface (MIDI) format. This language contains all the necessary instructions to generate and reproduce the specified song. With this language, it was possible to estimate the informational entropy, among other useful metrics, and characterize the associated information content of the songs. This characterization makes it possible to identify the musical genre and analyze changes in music’s complexity over time.

Perez-Verdejo et al. [8] analyze music consumption patterns in Mexico using streaming statistics and audio features from the music streaming platform Spotify. The authors investigate how music features correlate with the streaming metric and compare the regional (Mexican) patterns with their global (worldwide) counterparts. They identify the features that clearly distinguish or characterize the most popular songs in Mexico.

In 2014, Gamaliel et al.
[9] introduced the concept of instrumentational complexity and showed that there exists a relationship between instrumentational complexity and album sales. They found a negative association between complexity and sales. The conclusion is that the simpler albums (as measured by their metric) tend to be associated with higher sales: simplicity sells. This measure of instrumentational complexity is based only on the number and the uniqueness of the instruments used in the song. From an information-theoretical point of view, this metric is not genuinely associated with the informational content of a musical segment. In our opinion, a measure of musical complexity must consider the intrinsic elements of the music.

Parmer et al. [10] analyze popular songs and classify them by their associated complexity. By transforming each song into a sequence of tokens, they generate a language. Then, the authors use a conditional version of Shannon’s entropy [7] to measure the complexity of a song expressed as a sequence of tokens. They found an inverted-U-shaped relationship between popularity and entropy. With this characterization, they identify the musical genre of the songs based on their entropy profile.

Overath et al. [11] show that brain activity in the Planum Temporale (a brain region typically associated with audio processing), when measured via functional Magnetic Resonance Imaging (fMRI), is positively associated with the complexity of the incoming auditory stimulus. The authors generate a series of pitch sequences with a pre-specified entropy and analyze the level of activity exhibited in the brain’s response. They show that when the entropy of the audio signal is high, so is the activity in the Planum Temporale, so there is a positive relation between signal complexity and brain activity.

In the present study, we follow the work of Carpentier et al. [12].
In that article, the authors explore the relationship between the complexity of the environment (input) and the complexity of the associated brain response (processing/decoding). A group of participants is exposed to a series of auditory stimuli while asked to perform a perceptual or emotional task. The activity in the brain response for each task is measured via fMRI. The aim is to evaluate whether the association between the stimulus’s complexity and the response’s complexity (complexity matching) explains the listener preferences. The authors found higher complexity matching during perceptual music listening tasks compared to emotional music listening tasks. This analysis is, to some extent, related to Ashby’s law of required variety [13,14] in the sense that, in order to process a complex signal, the brain must be able to use an at least equally complex decoding process. To characterize these complexities, both of the input and of the brain activity, the authors use multiscale entropy.

3. Materials and Methods

It is generally understood that a complex phenomenon lives at an intermediate point between chaos and regularity [15]. However, neither of these extremes perfectly describes a complex process. Intuitively, complexity is associated with structural richness and the meaning of the underlying process. As an example of how the complexity of a process is related to its regularity patterns, we analyze three signals: sinusoidal, pink noise (1/f noise) and white noise. Each of these processes has different structural properties and consequently different levels of complexity (Figure 1).

Figure 1. Three signals with different structural properties: sinusoidal, pink noise (1/f noise) and white noise.

We use multiscale entropy (MSE) as our measure of complexity. Before applying MSE to music analysis, we investigate some of its properties on the three signals described above.
MSE is a measure designed for the analysis of time series, and one of its most important features is that it allows for evaluation across many different process scales. As described in the works of Siegenfeld et al. [16], Allen et al. [17], Bar-Yam [18] and Febres [19], complexity depends on the scale at which the observer interprets the system. For a process to be complex, the interdependence between its elements must hold over the different scales of observation, not only at the most detailed description of the system. MSE allows this inter-scale analysis.

In addition to its mathematical properties, the other important motivation for selecting MSE as our complexity metric is that it has been applied to describe and characterize cognitive processes in experimental settings [20–23]. Based on the current literature and data availability, MSE is promising for exploring the relationship between complexity and preferences when applied to audio, and to musical analysis in particular.

MSE is itself based on sample entropy (SE), which is a measure of the degree of compressibility of a signal [24,25]. The more compressible a signal is (the fewer bits needed to represent it), the lower its SE. The intuitive definition of SE is clearly related to Siegenfeld et al.’s [16] definition of complexity and the notion of Kolmogorov complexity [26]. SE is defined in the following terms for a series $S$ consisting of $N$ elements:

$$SE = -\log\left(\frac{S_r(m+1)}{S_r(m)}\right), \tag{1}$$

where $S_r(m+1)$ is the number of pairs of subsequences of size $m+1$ with distance less than $r$, and $S_r(m)$ is the number of pairs of subsequences of size $m$ with distance less than $r$. The distance parameter $r$ is set to 20% (following [24,27]) of the standard deviation of the full series $S$, and we use the Euclidean distance. SE is thus the negative logarithm of an estimate of the conditional probability that, given a sequence of length $N$, any pair of subsequences that are similar over $m$ consecutive points will also be similar at point $m+1$.
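As an illustration of Equation (1), the following is a minimal sketch of sample entropy in plain Python. It is not the authors' implementation: the function names, the subsequence size m = 2 (the text does not state the value used), the test signals, and the brute-force pair counting are assumptions made for the example. The tolerance r is 20% of the standard deviation and distances are Euclidean, as described above.

```python
import math
import random
from statistics import pstdev


def count_similar_pairs(series, length, r, n_templates):
    """Count pairs (i < j) of subsequences of the given length closer than r."""
    templates = [series[i:i + length] for i in range(n_templates)]
    count = 0
    for i in range(n_templates):
        for j in range(i + 1, n_templates):
            # Euclidean distance between the two subsequences.
            if math.dist(templates[i], templates[j]) < r:
                count += 1
    return count


def sample_entropy(series, m=2, r_fraction=0.2):
    """SE = -log(S_r(m + 1) / S_r(m)), with r = r_fraction * std(series).

    m = 2 is an assumed default; the source text does not give its value.
    """
    r = r_fraction * pstdev(series)
    # Use the same starting points for sizes m and m + 1 so that the
    # (m + 1)-count can never exceed the m-count.
    n_templates = len(series) - m
    s_m = count_similar_pairs(series, m, r, n_templates)
    s_m1 = count_similar_pairs(series, m + 1, r, n_templates)
    if s_m == 0 or s_m1 == 0:
        return float("inf")  # no matches found: the ratio is undefined
    return -math.log(s_m1 / s_m)


# Hypothetical test signals: a regular sinusoid and Gaussian white noise.
sine = [math.sin(2 * math.pi * i / 20) for i in range(200)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

print("sinusoid   :", sample_entropy(sine))
print("white noise:", sample_entropy(noise))
```

The quadratic pair loop is only practical for short series; it is written this way to stay close to the definition rather than for speed.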
SE is therefore a measure of self-similarity. The more self-similar the series is, the more redundancy it contains and the lower its SE value. Note that, by construction, $S_r(m+1)$ will always be smaller than or equal to $S_r(m)$ (as adding a restriction can only reduce the number of coincidences), and therefore SE will be greater than or equal to zero (zero when the series is absolutely redundant).

However, SE does not fully capture the concept of complexity. It does not take into account the different scales involved in the process, and it assigns high values (high complexity) to random processes. White noise, being incompressible, obtains a high SE value. For this reason, MSE should be introduced.

To calculate MSE from SE, it is necessary to apply a reduction process in which groups of $\tau$ consecutive elements of the original series $S = \{s_1, \ldots, s_n\}$ are each aggregated into a single element of the reduced series $Y^{\tau} = \{y^{\tau}_1, \ldots, y^{\tau}_{\lfloor n/\tau \rfloor}\}$:

$$y^{\tau}_i = \frac{1}{\tau} \sum_{j=(i-1)\tau+1}^{i\tau} s_j, \tag{2}$$

where $\tau \in \{1, 2, 3, \ldots\}$ represents the scale of aggregation (number of aggregated elements). $SE_{\tau}$ is calculated for each new series $Y^{\tau}$ while varying the parameter $\tau$. These multiple $SE_{\tau}$ calculations, taken together, represent the MSE metric.

If we now calculate MSE and SE for two known signals, white noise and pink noise, we observe that the two measures give different results for the same pair of signals. Table 1 shows values of SE for a sinusoid, white noise, and pink noise. Even though pink noise has a richer structural complexity [15,28,29], white noise shows a higher SE value.

Table 1. SE for a sinusoid signal and pink and white noises.

Signal         Sample Entropy (SE)
Sinusoid       0.4675
Pink noise     1.7735
White noise    2.1752

If we calculate MSE for the signals mentioned above, we obtain not a scalar but a profile (the complexity profile) that represents the associated complexity. This profile spans each of the considered scales (20 in our case), as shown in Figure 2.
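The aggregation step of Equation (2) can be sketched as follows. This is an illustrative sketch only, assuming the usual non-overlapping windows of τ elements, with any leftover tail dropped; `coarse_grain` is a hypothetical helper name and the input series is made up. The complexity profile would then be obtained by computing SE on each coarse-grained series for τ = 1, ..., 20.

```python
def coarse_grain(series, tau):
    """Average consecutive, non-overlapping windows of tau elements.

    Elements left over at the end (fewer than tau) are discarded.
    """
    n_windows = len(series) // tau
    return [sum(series[k * tau:(k + 1) * tau]) / tau for k in range(n_windows)]


# Toy series, chosen only to make the averages easy to follow.
series = [1, 3, 2, 4, 6, 8, 5, 7, 9]
for tau in (1, 2, 3):
    print(tau, coarse_grain(series, tau))
```

At τ = 1 the series is unchanged; larger τ shortens the series, which is why long inputs are needed for stable SE estimates at coarse scales.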
Figure 2. MSE for a sinusoidal signal, pink noise, and white noise.

Now it becomes clear that the estimated complexity for white noise is not consistent across aggregation scales [17]. As the scale of aggregation increases, the white noise process reveals a simple structure that is not easily observed at the original scale. Pink noise, in contrast, maintains an almost constant complexity across all scales and is therefore more complex than white noise.

Based on its underlying properties, MSE offers a good approximation to the intrinsic complexity of a time series. We propose to use MSE in our analysis of the relation between complexity and preferences based on the following:

1. This metric is used to measure the complexity associated with brain processes. In particular, it is used to analyze the temporal activation patterns in specific brain regions [20–23].
2. It allows the analysis of time series, such as music, over different observation scales.

Based on these considerations, MSE can provide useful insights in analyzing the relation between the informational content of a musical stimulus and the cognitive processes involved in the determination of musical preferences.

3.1. Data

3.1.1. Music

The data sample used is part of the Million Song Dataset (MSD) [30], consisting of one million annotated songs (http://millionsongdataset.com/, accessed on 30 September 2021). Some of the included tags are year of release, genre, album, artist and a set of technical features per song. It is worth mentioning that the songs are already processed: there is no audio for the songs in the database, and only the extracted features are available. Table 2 shows some of the technical features included in the database.

Table 2. MSD database technical components.
Component                Description
Key                      Estimation of the key the song is in
Loudness                 General loudness of the track
Segment_pitches          Chroma features for each segment
Segments_timbre          MFCC-like features for each segment
Segments_loudness_max    Max loudness during each segment

Following the work of Overath et al. [11], we use the pitch component as the fundamental element of our analysis. It is also important to remark that pitch is one of the perceptual components of music. Therefore, there is no strict relation between the physical properties of sound and our perception of pitch [2,31] (although it is related to the frequency component). The brain determines our perception of this component, which is why it is considered a relevant element for our analysis of musical preferences.

For each song, we use the component denominated segment_pitches. This component is a matrix of shape (chroma_feature, time_segments) that determines the relative presence of each pitch class in the corresponding time interval. This matrix is called the chromatogram of the song and represents the basic melody of the song.

3.1.2. Music Preferences

To analyze the listener’s preferences we use the Billboard year-end Hot 100 charts (https://www.billboard.com/charts/year-end/2020/hot-100-songs, accessed on 30 September 2021). These are a compendium of the most popular songs for each year in the United States, and this ranking serves as a proxy for musical preferences. The basic idea is that these top songs have specific characteristics that make them different from the other songs, which lets us separate songs into two sets: high popularity and low popularity.

3.2. Data Processing

MSE is meant to be used for time series, but our data are in matrix form. For that reason, we need to apply specific data processing steps to transform the matrix data into time series. The data processing steps are:

1. For each time segment, the most representative pitches are identified.
2. The original values of the matrix are mapped into the integer interval x ∈ [1, 12] ⊆ Z.
3. Finally, the pitch dimension is collapsed to end up with a flattened matrix, i.e., a vector representing a time series of length time_segments.

The intuition behind these transformations is that in each time segment, we seek to preserve only the most representative pitch for that time segment. In this way, the matrix representing the structure of the song reduces to its most representative perceptual element in each time segment. Figure 3 illustrates this process.

Figure 3. Transformation from a chromatogram matrix to a time series.

With these transformations we obtain a time series for each song, and it is then possible to calculate its corresponding MSE.

In order to create a proxy for the listener’s preferences, we use songs from the Billboard Hot 100 list. This list includes the 100 most-listened-to songs for each year in the United States and a ranking of each song’s popularity. Since the MSD database includes songs in the range of years from 1931 to 2011, we could analyze the differences in complexity between the songs included in the Hot 100 list and those not included for each year. Because MSE generates a complexity profile associated with each song, it is then possible to compare the complexity profiles of both groups and to determine whether there is a significant difference between “successful” and “unsuccessful” songs.

4. Results

We use a sample of songs between the years 2000 and 2010. The top 100 songs are identified for each year, and their MSE is computed. MSE is also calculated for the songs not included in the Hot 100. Therefore, for each year, it is possible to separate the songs into two groups, successful and unsuccessful songs, and compare the complexity in each group.
The analysis of each time series includes scales from 1 to 20. At each scale, the complexity values of the individual series are aggregated by their average. Figure 4 summarizes the findings of our analysis, and Appendix A includes the complete results and figures for other years.

Figure 4. MSE for the year 2000.

We observe that the mean complexity is higher for songs belonging to the top group at most scales, suggesting that the songs with a better position in the ranking have slightly greater complexity than the others (at least for the songs under consideration). The mean complexity profile of the top songs is higher for each of the considered scales. However, once intra-group variance is taken into account, the distributions of the complexity profiles of the two groups overlap in many regions, as shown by Figure 5.

Figure 5. Complexity profile variance.

Due to the overlapping regions in the complexity profile distributions, it is necessary to evaluate the statistical significance of the differences identified between the complexity profiles of the top and non-top groups of songs. We use Welch’s test to evaluate the difference between two independent populations [32] and check for normality using the Shapiro–Wilk test [33]. Welch’s test is a variant of Student’s t-test that is more robust when the hypothesis of equal variance does not hold and when the sample sizes of the two populations differ. In our case, one of the groups, the top group, has only 100 observations for each year.

In addition to the standard statistical test, it is important to note that we are facing a multiple hypothesis testing scenario (as we are simultaneously testing 20 scales). It then becomes necessary to make a correction to take this into account. We use the Bonferroni correction [34] to adjust the significance results obtained with Welch’s test.
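To make the two-step procedure concrete, here is a minimal sketch in plain Python. `welch_statistic` computes Welch's t statistic together with the Welch–Satterthwaite degrees of freedom (turning these into a p-value additionally requires the Student t distribution, which is left to a statistics library), and `bonferroni_significant` applies the corrected threshold α/20 to the p-values reported in Table 3. The function names and the toy samples are assumptions for illustration, not the authors' code.

```python
from statistics import mean, variance


def welch_statistic(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def bonferroni_significant(p_values, alpha=0.05, n_tests=None):
    """Flag which p-values stay below alpha / n_tests."""
    if n_tests is None:
        n_tests = len(p_values)
    threshold = alpha / n_tests
    return threshold, {key: p < threshold for key, p in p_values.items()}


# Toy samples with unequal sizes and variances (made up for the example).
t_stat, dof = welch_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
print("t =", t_stat, "df =", dof)

# p-values reported in Table 3 (year 2000), keyed by scale.
table3_p = {6: 0.014, 7: 0.004, 8: 0.024, 11: 0.017,
            12: 0.044, 13: 0.011, 16: 0.022, 19: 0.017}
threshold, flags = bonferroni_significant(table3_p, alpha=0.05, n_tests=20)
print("threshold =", threshold)
print("scales significant after correction:", [s for s, ok in flags.items() if ok])
```

Note that `n_tests` is set to 20, the number of scales tested, rather than to the eight scales that passed the uncorrected test.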
In Figure 6, we present the results derived from the Welch test (before the Bonferroni correction).

Figure 6. Statistically significant scales after the Welch test (level 0.05).

After the Welch test, eight out of the twenty scales in the complexity profile were significant at level 0.05. It is important to note that the significant scales are distributed along the profile’s range. Nevertheless, after applying the Bonferroni correction, the significance level dropped to 0.0025 (adjusted for 20 scales), at which none of the scales was significant. Table 3 and Appendix A.2 present detailed results.

Table 3. Difference and statistical significance (year 2000).

Scale    Difference    p-value    Significant (Welch, α = 0.05)    Significant (Bonferroni, α = 0.0025)
6        0.1369        0.014      Yes                              No
7        0.1651        0.004      Yes                              No
8        0.1532        0.024      Yes                              No
11       0.1213        0.017      Yes                              No
12       0.1148        0.044      Yes                              No
13       0.1460        0.011      Yes                              No
16       0.1333        0.022      Yes                              No
19       0.1611        0.017      Yes                              No

Although the Bonferroni correction rendered all scales non-significant, this is a somewhat expected result given that many factors contribute to the success or popularity of a song. Many of these factors are not even related to the musical properties of the songs but to external factors such as advertising expenses and social trends. Nevertheless, the analysis shows that the complexity profile of the top songs tends to be above that of the non-top songs for almost all the years of the studied period, a surprising fact considering the simplicity of our approach and the musical elements we are considering.

In addition to the measured difference, the shape of the complexity profile provides an overview of some of the important characteristics of a system and its complexity-scale relationship [18,35].
Nevertheless, to further investigate and compare the differences between the two groups of songs, we evaluate the relation between the total area under the complexity profile and the rank the song obtained in the Billboard chart. To analyze this relation, we calculated the area under the complexity profile for all the songs in the two considered groups (top and non-top songs). We plotted these areas against the corresponding ranks (the logarithm of rank) for each song (Figure 7). As there is no rank information for the non-top songs, we assigned ranks to all these songs via a Monte Carlo simulation in which the overall shape of the distribution is invariant because the area of each song is kept fixed.

Figure 7. Area under the complexity profile for top songs (blue) and non-top songs (red) and its relation to log(rank) for the year 2000. For comparison purposes, white noise, pink noise, and the sinusoidal wave are included at an arbitrarily set rank.

Figure 7 shows that the density of top songs tends to lie on the high side of the area spectrum, and the average area of a top song is always greater than the average area of a non-top song for all the considered years. Interestingly, Figure 7 also suggests that the area under the complexity profile of the most preferred songs tends to be in a specific range of the spectrum (not so low and not so high). Songs at the extremes of the spectrum are not common, thus indicating that there exists a preferred level of complexity (this same pattern was observed in all sampled years). Although we are not pursuing a predictive model for successful songs, Figure 7 lets us predict that if the calculated area for a given song is extremely low or extremely high, the corresponding song will certainly not be a well-ranked one.

This finding is somewhat related to [10], where the authors find an inverted-U relationship between complexity and preferences.
Here, we found evidence that the area under the complexity profile of top songs is rarely located at the low or high extremes of the spectrum. However, as we do not have the exact rank positions for non-top songs, it is impossible to confirm the inverted-U shape. Nevertheless, our findings do not contradict the results described in [10]. We have also included in this figure the areas for the three signals (sinusoid, white noise, and pink noise) described in Section 3 as a reference to compare the complexity of a song with the complexity of the different signals.

5. Discussion

The meaning and quantification of complexity are under permanent discussion. Loosely speaking, one view suggests that the complexity of an object includes the effort needed to build the object’s description. Following this intuition, methods to estimate this description effort may include counting the number of the object’s parts, assessing the relationship among these parts, or any applicable extensive counting procedure. To avoid the effects of prejudices in these counting processes, the notion of complexity as intimately related to the amount of information in the object’s description has been accepted [18,36,37]. Complexity is, therefore, a property of the object. Nevertheless, complexity is influenced by the language used for the description and, more relevant for the scope of this work, by the scale at which the object is observed. Thus, complexity has both objective and subjective aspects. To consider the variations of complexity when the object is seen at different scales, the complexity profile [35] has been proposed. The complexity profile offers an overview of the object’s complexity interpreted at a range of scales.

Here, we have proposed a framework for analyzing the complexity associated with a song and relating this complexity to the listener’s preferences.
Our findings suggest an association between complexity and preferences in the sense that preferred (well-ranked) songs tend to have high complexity, at least for the considered songs and analyzed years. Furthermore, our results add some evidence suggesting the existence of an optimal level of complexity associated with our preferences.

In Figure 7, where we added the calculated areas for pink noise, white noise, and the sinusoid, it is worth noting that the area for pink noise is close to that of the preferred songs, and this may explain why pink noise is sometimes used for relaxation purposes. Its complexity is higher than that of white noise, yet it lacks the elements needed to distract or capture our attention. We find this insight interesting as it opens the door for the study of relaxing sounds using techniques similar to the one we have described.

Furthermore, when computing the average area for each group, we observe that the mean area is higher for the top songs than for the non-top songs. This comparison holds for every year in our sample and was evaluated using the Wilcoxon test for independent samples [38], as shown in Figure 8 (detailed analysis in Appendix A.6).

Figure 8. Average area vs. log(rank) for the two groups of songs in each year.

Although the framework presented here has some limitations and is far from describing a clear relationship (a predictive model) between complexity and preferences, it allows for a descriptive characterization of popular songs in terms of their multiscale complexity. Importantly, it provides a way to identify songs that will not be well ranked because they have extreme (low or high) complexity.

We used multiscale entropy to measure and characterize the complexity of a song’s melody, once properly processed using standard music information retrieval (MIR) tools, because this metric captures some of the critical aspects of a complex process in which we are particularly interested.
Although MSE does not provide a complete description of the complexity of a process, nor is it the only alternative for measuring complexity, it does provide an interesting and innovative way to investigate the relationship between complexity and preferences when analyzing audio or music.

We introduced this work intending to contribute to developing new methods to understand how the brain perceives and processes complex objects. Since audio spans many dimensions (time series, frequency, rhythm, and the number and type of instruments involved), we decided to use audio signals (music) as our object of analysis. Due to this broad range of possibilities, there is no clear and unique definition of the informational content associated with a song, nor a precise measure of its complexity. We hope that this work contributes to better frameworks and methodologies to analyze and understand complex processes such as music.

5.1. Limitations

We found a certain degree of association between multiscale complexity and popularity, suggesting that the complexity of popular songs tends to be located on the high side of the range. Although the results presented in this article are not entirely conclusive in the sense of providing a clear relation between complexity and preferences, this can be attributed to the following:

• The factors involved in a song becoming popular are more than we can afford to consider in a study such as this.
• Many of the factors involved are not directly associated with the complexity of the song, for example, social trends, cultural biases, spending on advertising, and sample design biases.

These exogenous factors make it difficult to compute an unbiased estimate of the relationship between music complexity and the corresponding public preferences. We believe, however, that there exists a level of music complexity at which most people will find the music pleasant.
This "optimal" level of music complexity can be estimated with the methods presented.

5.2. Future Work

The study can be extended to build a complexity metric that accounts for more musical features. Here, we limited the analysis to pitch sequences to construct a time series, and only to the most relevant pitch element. As the database includes the complete chromatogram for each song, it is possible to select different combinations of pitch elements according to their relevance. This generalization could consist of:

1. Considering a complexity profile for each level of relevance.
2. Constructing a weighted average over the distinct pitch classes involved in each time segment and calculating the complexity profile of this weighted series.

In addition to the pitch elements, the database includes timbre and loudness elements. A treatment identical to the one described for pitch might help generate the corresponding complexity profiles. Different combinations of musical elements will allow for a richer approximation of music.

One practical and interesting application of the framework we have presented is to use the complexity profiles to improve music recommender systems in streaming platforms. It is even possible to use the complexity profile to generate new music by following specific complexity patterns associated with customers' preferences.

An analysis of the complexity profiles across genres would be illuminating. It would be interesting to find out whether there is a relevant difference between two songs that belong to the same genre when one is popular (top) and the other is not (non-top), and to investigate whether each musical genre has a characteristic complexity profile. In addition to these experiments, the complexity profile could be used as a feature in predictive models, for example, to predict the genre of a song given its complexity profile. More elaborate processing and treatment are necessary to carry out this analysis.
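The two ways of building a pitch series mentioned above (keeping only the most relevant pitch element, or taking a weighted average over pitch classes in each segment) can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be a list of 12-element relevance vectors, one per time segment; the function names and the `top_k` parameter are hypothetical; and averaging pitch-class indices ignores the circular nature of pitch classes, which a real analysis would have to address.

```python
def dominant_pitch_series(chroma):
    """Index (0-11) of the most relevant pitch class in each segment."""
    return [max(range(len(segment)), key=lambda k: segment[k])
            for segment in chroma]


def weighted_pitch_series(chroma, top_k=3):
    """Relevance-weighted mean of the indices of the top_k pitch classes
    in each segment (top_k is an illustrative parameter)."""
    series = []
    for segment in chroma:
        # Rank pitch classes by relevance, strongest first.
        ranked = sorted(range(len(segment)),
                        key=lambda k: segment[k], reverse=True)[:top_k]
        total = sum(segment[k] for k in ranked)
        if total == 0:
            series.append(0.0)  # silent or empty segment
        else:
            series.append(sum(k * segment[k] for k in ranked) / total)
    return series


# Two synthetic segments: one mixing pitch classes 0 and 1, one with only 4.
chroma = [
    [1.0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1.0, 0, 0, 0, 0, 0, 0, 0],
]
dominant = dominant_pitch_series(chroma)
weighted = weighted_pitch_series(chroma, top_k=2)
print(dominant, weighted)
```

Either series could then be passed to the same multiscale entropy computation used for the melody, yielding one complexity profile per construction.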
In future work, it would also be interesting to compare different complexity metrics to determine the degree of similarity between MSE and other metrics for the same analysis. Furthermore, it would be important to evaluate how robust our results are with respect to parameter changes in the pre-processing steps, the musical elements considered, or the sampling design.

Finally, it is important to remark that music also has therapeutic properties. Our analysis found that pink noise has a complexity close to that of the preferred songs, making this a possible guide for creating music with properties in between those of pink noise and popular music, which could have better results in musical therapies. Some rehabilitation therapies use musical stimuli to treat memory and speech-related problems [39–41]. A complexity analysis relating sensory stimuli to the corresponding patient's response can help identify and select the stimulus for the appropriate treatment.

Author Contributions: Conceptualization, D.G., P.R., G.F. and C.G.; methodology, D.G.; software, D.G.; validation, D.G., P.R., G.F. and C.G.; data curation, D.G.; writing—original draft preparation, D.G.; writing—review and editing, D.G., P.R., G.F. and C.G. All authors have read and agreed to the published version of the manuscript.

Funding: This work was partially supported by UNAM's PAPIIT IN107919 and IV100120 grants.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Data from MSD can be found at https://aws.amazon.com/datasets/million-song-dataset/ (accessed on 30 September 2021). The year-end Billboard Hot 100 charts are available at https://www.billboard.com/charts/year-end/2020/hot-100-songs (accessed on 30 September 2021).

Acknowledgments: We wish to thank two anonymous reviewers whose comments helped us considerably improve this paper.
Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

MSE    multiscale entropy
SE     sample entropy
fMRI   functional magnetic resonance imaging
MSD    million song dataset
MIR    music information retrieval

Appendix A.

Appendix A.1. Complexity Profiles for the Years 2001–2010

Figure A1. Complexity profiles (2001–2010).

Appendix A.2. Scale Distributions (2000)

Figure A2. Statistically significant distributions (year 2000).

Appendix A.3. Statistically Significant Differences in Scale (2001–2010)

There were no statistically significant scales for the years 2005, 2006 and 2008. This can be assessed by observing that the respective profiles are almost completely overlapping. For the rest of the years, the statistical significance is presented in the following tables:

Table A1. Statistical significance 2001.
Scale   Calculated Difference   p-value
4       0.2066                  0.0102
5       0.1437                  0.0341
7       0.1771                  0.0014
8       0.2158                  0.0015
11      0.1481                  0.0049
13      0.2281                  0.0002
15      0.1544                  0.0217
17      0.2249                  0.0006

Table A2. Statistical significance 2002.
Scale   Calculated Difference   p-value
5       0.1480                  0.0269
6       0.1458                  0.0061
7       0.1236                  0.0307
8       0.1513                  0.0182
10      0.1415                  0.0305
15      0.1418                  0.0428

Table A3. Statistical significance 2003.
Scale   Calculated Difference   p-value
3       0.1979                  0.0032
4       0.1978                  0.0074
18      0.1434                  0.0447

Table A4. Statistical significance 2004.
Scale   Calculated Difference   p-value
1       0.1071                  0.0009
2       0.1192                  0.0091
3       0.1353                  0.0282
6       0.1163                  0.0177
12      0.1319                  0.0391
15      0.1416                  0.0132
16      0.1366                  0.0066
17      0.1037                  0.0367
18      0.1131                  0.0386
20      0.1176                  0.0260

Table A5. Statistical significance 2007.
Scale   Calculated Difference   p-value
1       0.0465                  0.0414
7       0.0955                  0.0449
8       0.0936                  0.0437
9       0.1154                  0.0174
10      0.1039                  0.0354
13      0.1243                  0.0043
18      0.1019                  0.0328
19      0.1197                  0.0107

Table A6. Statistical significance 2009.
Scale   Calculated Difference   p-value
1       -0.0598                 0.0050
2       -0.1173                 0.0002
3       -0.1628                 0.0002
16      0.0910                  0.0465

Table A7. Statistical significance 2010.
Scale   Calculated Difference   p-value
7       0.1798                  0.0145
8       0.1531                  0.0173
16      0.2233                  0.0094

Appendix A.4. Shapiro–Wilk Test for Normality in Scale Distributions (2000–2010)

The Shapiro–Wilk test was used to evaluate the normality assumption in the scale distributions used in Welch's test. When the sample was too large (as in the case of all non-top groups of songs, ∼10,000 samples), the test rendered non-significant results, but for large samples the normality assumption is less critical than it is for small samples. The p-values presented in the following tables correspond to the small samples (top songs).

Table A8. Shapiro–Wilk test. p-values per scale and year (2000–2005).
Scale   2000     2001     2002     2003     2004     2005
1       0.9329   0.0936   0.0326   0.1685   0.2854   0.0633
2       0.0365   0.1406   0.2010   0.1104   0.1980   0.0006
3       0.8561   0.6597   0.8340   0.0013   0.5701   0.7112
4       0.0699   0.1221   0.0936   0.0879   0.6430   0.4746
5       0.5008   0.3178   0.5555   0.3835   0.1539   0.2213
6       0.0006   0.0065   0.0163   0.0011   0.2359   0.0341
7       0.0020   0.0856   0.4961   0.0031   0.1052   0.2203
8       0.0021   0.4071   0.2640   0.0097   0.5182   0.6734
9       0.1027   0.1666   0.8300   0.0160   0.1497   0.3567
10      0.0455   0.9184   0.4900   0.7766   0.0203   0.8399
11      0.0153   0.0454   0.4747   0.1271   0.9733   0.6351
12      0.0131   0.1530   0.8234   0.0076   0.0030   0.9613
13      0.1261   0.6955   0.2467   0.6120   0.7281   0.1045
14      0.8757   0.1543   0.6581   0.1633   0.0269   0.5477
15      0.3005   0.1063   0.9445   0.6983   0.2720   0.5705
16      0.8187   0.1645   0.0264   0.3736   0.0928   0.7060
17      0.0322   0.0705   0.4356   0.2320   0.0775   0.6569
18      0.3397   0.8494   0.9125   0.3709   0.1332   0.6508
19      0.7292   0.7102   0.0528   0.6223   0.4121   0.6581
20      0.6156   0.1075   0.3672   0.4147   0.0047   0.4603

Table A9. Shapiro–Wilk test. p-values per scale and year (2006–2010).
Scale   2006     2007     2008     2009     2010
1       0.4237   0.0457   0.5723   0.4247   0.6414
2       0.1894   0.7549   0.1976   0.5587   0.2651
3       0.1428   0.1868   0.3652   0.2921   0.0346
4       0.4861   0.5645   0.0102   0.5316   0.6629
5       0.1318   0.7352   0.0313   0.1480   0.0420
6       0.1656   0.0006   0.0016   0.0006   0.0044
7       0.1119   0.0037   0.0021   0.0040   0.3237
8       0.5568   0.0335   0.0025   0.1747   0.1008
9       0.0157   0.3449   0.0187   0.2692   0.0422
10      0.1804   0.6223   0.0324   0.4781   0.3087
11      0.3535   0.0303   0.0002   0.6112   0.0283
12      0.8280   0.3112   0.0132   0.2647   0.9715
13      0.8021   0.5907   0.0091   0.3962   0.3695
14      0.1404   0.0152   0.0005   0.1899   0.7427
15      0.7298   0.5955   0.0001   0.6707   0.1390
16      0.5348   0.2038   0.0025   0.4316   0.1962
17      0.0560   0.1144   0.0012   0.0534   0.0428
18      0.0923   0.5496   0.1424   0.2645   0.1992
19      0.2156   0.1237   0.0576   0.0001   0.7480
20      0.7259   0.5339   0.0857   0.0367   0.2382

Appendix A.5. Reduction in Significant Scales after Bonferroni Correction

Figure A3. Statistical level needed to achieve significant scales.

Appendix A.6. Statistically Significant Differences for Area under the Complexity Profile (2000–2010)

Table A10. Significance test for area distribution between top and non-top songs in each year (Wilcoxon test, α = 0.05).
Year   Difference   p-value    Significant
2000   2.939693     0.000003   Yes
2001   2.780737     0.000030   Yes
2002   2.569101     0.000033   Yes
2003   1.967215     0.000193   Yes
2004   2.362849     0.000177   Yes
2005   1.589775     0.000940   Yes
2006   0.751036     0.143811   No
2007   2.353117     0.000004   Yes
2008   1.518561     0.000003   Yes
2009   0.892549     0.032301   Yes
2010   2.091714     0.006654   Yes

Figure A4. Average area vs. log(rank) for the two groups of songs in each year.

References

1. Presti, D. Foundational Concepts in Neuroscience: A Brain-Mind Odyssey (Norton Series on Interpersonal Neurobiology). In Foundational Concepts in Neuroscience; W. W. Norton & Company: New York, NY, USA, 2016.
2. Schnupp, J.; Nelken, I.; King, A. Auditory neuroscience: Making sense of sound. In Auditory Neuroscience; MIT Press: Cambridge, MA, USA, 2012.
3. Schoenberg, A. Theory of Harmony; University of California Press: Berkeley, CA, USA, 2010.
4. Padilla, P.; Knights, F.; Ruiz, A.T.; Tidhar, D. Identification and Evolution of Musical Style I: Hierarchical Transition Networks and Their Modular Structure. In Proceedings of the 6th International Conference on Mathematics and Computation in Music, Mexico City, Mexico, 26–29 June 2017; Agustín-Aquino, O., Lluis-Puebla, E., Montiel, M., Eds.; Springer: Berlin, Germany, 2017.
5. Febres, G.; Jaffe, K. Music viewed by its Entropy Content: A novel window for comparative analysis. PLoS ONE 2017, 12, e0185757. [CrossRef: http://doi.org/10.1371/journal.pone.0185757] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/29040288]
6. Febres, G.; Jaffé, K.; Gershenson, C. Complexity measurement of natural and artificial languages. Complexity 2015, 20, 25–48. [CrossRef: http://dx.doi.org/10.1002/cplx.21529]
7. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [CrossRef: http://dx.doi.org/10.1002/j.1538-7305.1948.tb01338.x]
8. Pérez-Verdejo, J.M.; Piña-García, C.A.; Ojeda, M.M.; Rivera-Lara, A.; Méndez-Morales, L. The rhythm of Mexico: an exploratory data analysis of Spotify's top 50. J. Comput. Soc. Sci. 2021, 4, 147–161. [CrossRef: http://dx.doi.org/10.1007/s42001-020-00070-z]
9. Percino, G.; Klimek, P.; Thurner, S. Instrumentational Complexity of Music Genres and Why Simplicity Sells. PLoS ONE 2014, 9, e115255.
10. Parmer, T.; Ahn, Y.Y. Evolution of Informational Complexity of Contemporary Western Music. arXiv 2019, arXiv:1907.04292. Available online: https://arxiv.org/abs/1907.04292 (accessed on 30 September 2021).
11. Overath, T.; Cusack, R.; Kumar, S.; von Kriegstein, K.; Warren, J.D.; Grube, M.; Carlyon, R.P.; Griffiths, T.D. An Information Theoretic Characterisation of Auditory Encoding. PLoS Biol. 2007, 5, e288. [CrossRef: http://dx.doi.org/10.1371/journal.pbio.0050288] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/17958472]
12. Carpentier, S.M.; McCulloch, A.R.; Brown, T.M.; Faber, S.E.M.; Ritter, P.; Wang, Z.; Salimpoor, V.; Shen, K.; McIntosh, A.R. Complexity Matching: Brain Signals Mirror Environment Information Patterns during Music Listening and Reward. J. Cogn. Neurosci. 2020, 32, 734–745. [CrossRef: http://dx.doi.org/10.1162/jocn_a_01508]
13. Ashby, W.R. Requisite Variety and Its Implications for The Control of Complex Systems. Cybernetica 1958, 7, 405–417.
14. Gershenson, C. Requisite Variety, Autopoiesis, and Self-organization. Kybernetes 2015, 44, 866–873. [CrossRef: http://dx.doi.org/10.1108/K-01-2015-0001]
15. Grassberger, P. Toward a Quantitative Theory of Self-generated Complexity. Int. J. Theor. Phys. 1986, 25, 907–938. [CrossRef: http://dx.doi.org/10.1007/BF00668821]
16. Siegenfeld, A.F.; Bar-Yam, Y. An Introduction to Complex Systems Science and Its Applications. Complexity 2020, 2020. [CrossRef: http://dx.doi.org/10.1155/2020/6105872]
17. Allen, B.; Stacey, B.C.; Bar-Yam, Y. Multiscale Information Theory and The Marginal Utility of Information. Entropy 2017, 19, 273. [CrossRef: http://dx.doi.org/10.3390/e19060273]
18. Bar-Yam, Y. Multiscale Complexity/Entropy. Adv. Complex Syst. 2004, 7, 47–63. [CrossRef: http://dx.doi.org/10.1142/S0219525904000068]
19. Febres, G. A Proposal about the Meaning of Scale, Scope and Resolution in the Context of the Interpretation Process. Axioms 2018, 7, 11. [CrossRef: http://dx.doi.org/10.3390/axioms7010011]
20. Costa, M.; Goldberger, A.L.; Peng, C.-K. Multiscale Entropy Analysis of Physiologic Time Series. Phys. Rev. Lett. 2002, 89, 068102. [CrossRef: http://dx.doi.org/10.1103/PhysRevLett.89.068102] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/12190613]
21. Costa, M.; Goldberger, A.L.; Peng, C.-K. Multiscale Entropy Analysis of Biological Signals. Phys. Rev. E 2005, 71, 021906. [CrossRef: http://dx.doi.org/10.1103/PhysRevE.71.021906]
22. Catarino, A.; Churches, O.; Baron-Cohen, S.; Andrade, A.; Ring, H. Atypical EEG Complexity in Autism Spectrum Conditions: A Multiscale Entropy Analysis. Clin. Neurophysiol. 2011, 122, 2375–2383.
23. Courtiol, J.; Perdikis, D.; Petkoski, S.; Müller, V.; Huys, R.; Sleimen-Malkoun, R. The multiscale entropy: Guidelines for use and interpretation in brain signal analysis. J. Neurosci. Methods 2016, 273, 175–190. [CrossRef: http://dx.doi.org/10.1016/j.jneumeth.2016.09.004]
24. Richman, J.S.; Moorman, J.R. Physiological Time-series Analysis Using Approximate Entropy and Sample Entropy. Am. J. Physiol.-Heart Circ. Physiol. 2000, 278, H2039–H2049. [CrossRef: http://dx.doi.org/10.1152/ajpheart.2000.278.6.H2039]
25. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley and Sons: Hoboken, NJ, USA, 2006.
26. Li, M.; Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications; Springer: Berlin, Germany, 2019.
27. Delgado-Bonal, A.; Marshak, A. Approximate Entropy and Sample Entropy: A Comprehensive Tutorial. Entropy 2019, 21, 541. [CrossRef: http://dx.doi.org/10.3390/e21060541] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/33267255]
28. Mandelbrot, B. Multifractals and 1/f Noise: Wild Self-affinity in Physics; Springer: Berlin, Germany, 1999.
29. Bak, P.; Tang, C.; Wiesenfeld, K. Self-organized criticality: An explanation of the 1/f noise. Phys. Rev. Lett. 1987, 59, 381.
30. Bertin-Mahieux, T.; Ellis, D.; Whitman, B.; Lamere, P. The Million Song Dataset. In Proceedings of the 12th International Society for Music Information Retrieval Conference, Miami, FL, USA, 24–28 October 2011. [CrossRef: http://dx.doi.org/10.7916/D8NZ8J07]
31. Plack, C.J.; Oxenham, A.J.; Fay, R.R.; Popper, A.N. Pitch: Neural Coding and Perception; Springer: Berlin, Germany, 2005.
32. Welch, B. The Generalization of 'Student's' Problem When Several Different Population Variances Are Involved. Biometrika 1947, 34, 28–35. [CrossRef: http://dx.doi.org/10.1093/biomet/34.1-2.28] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/20287819]
33. Shapiro, S.S.; Wilk, M.B. An Analysis of Variance Test for Normality. Biometrika 1965, 52, 591–611.
34. Bland, J.M.; Altman, D.G. Multiple significance tests: the Bonferroni method. Br. Med. J. 1995, 310, 170. [CrossRef: http://dx.doi.org/10.1136/bmj.310.6973.170] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/7833759]
35. Bar-Yam, Y. From Big Data to Important Information. Complexity 2016, 21, 73–98. [CrossRef: http://dx.doi.org/10.1002/cplx.21785]
36. Rosas, F.; Mediano, P.; Ugarte, M.; Jensen, H. An Information-Theoretic Approach to Self-Organization: Emergence of Complex Interdependencies in Coupled Dynamical Systems. Entropy 2018, 20, 793. [CrossRef: http://dx.doi.org/10.3390/e20100793]
37. Abdallah, S.A.; Plumbley, M.D. A Measure of Statistical Complexity based on Predictive Information with Application to Finite Spins Systems. Phys. Lett. A 2012, 376, 275–281. [CrossRef: http://dx.doi.org/10.1016/j.physleta.2011.10.066]
38. Mann, H.B.; Whitney, D.R. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other. Ann. Math. Stat. 1947, 18, 50–60. [CrossRef: http://dx.doi.org/10.1214/aoms/1177730491]
39. Lam, H.L.; Li, W.T.V.; Laher, I.; Wong, R.Y. Effects of Music Therapy on Patients with Dementia—A Systematic Review. Geriatrics 2020, 5, 62. [CrossRef: http://dx.doi.org/10.3390/geriatrics5040062]
40. Leggieri, M.; Thaut, M.H.; Fornazzari, L.; Schweizer, T.A.; Barfett, J.; Munoz, D.G.; Fischer, C.E. Music Intervention Approaches for Alzheimer's Disease: A Review of the Literature. Front. Neurosci. 2019, 13. [CrossRef: http://dx.doi.org/10.3389/fnins.2019.00132] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/30930728]
41. Moreno-Morales, C.; Calero, R.; Moreno-Morales, P.; Pintado, C. Music Therapy in the Treatment of Dementia: A Systematic Review and Meta-Analysis. Front. Med. 2020, 7. [CrossRef: http://dx.doi.org/10.3389/fmed.2020.00160] [PubMed: http://www.ncbi.nlm.nih.gov/pubmed/32509790]