This study uses JMP Pro 17 to analyze trends in lyrical sentiment, lyrical complexity, and musical structure among Billboard Hot 100 songs from 2000 to 2022. Our objective was to identify which factors drive chart longevity and popularity, combining audio features, lyrical analysis, and advanced statistical modeling.
We began by integrating and cleaning three large data sets sourced from Kaggle: Billboard chart rankings, Spotify audio features, and full song lyrics. Using JMP’s data preparation, visualization, and modeling tools, we ran sentiment analysis, engineered lyrical-complexity features, and examined genre-level differences with ANOVA and correlation analysis.
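As an illustration of what lyrical-complexity features can look like, here is a minimal Python sketch. It is not the JMP workflow used in the study, and the abstract does not say which complexity measures were engineered; the function name `lyrical_complexity` and the chosen measures (type-token ratio and average word length) are assumptions made only for this example.

```python
import re


def lyrical_complexity(lyrics: str) -> dict:
    """Compute simple, hypothetical complexity measures for one song's lyrics."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", lyrics.lower())
    if not words:
        return {
            "word_count": 0,
            "unique_words": 0,
            "type_token_ratio": 0.0,
            "avg_word_length": 0.0,
        }
    unique = set(words)
    return {
        "word_count": len(words),
        "unique_words": len(unique),
        # Share of distinct words: lower values indicate more repetition.
        "type_token_ratio": len(unique) / len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }
```

A highly repetitive chorus scores a low type-token ratio, while a verse with little repetition scores closer to 1.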
The core of our analysis involved building two predictive models using bootstrap forests in JMP: one to estimate how long a song stays on the chart, and one to predict whether it will reach the Top 10. These models revealed that features like lyrical complexity, acousticness, valence, tempo, and speechiness are strong predictors of commercial success. Our chart longevity model achieved an R² of 0.814, while our Top 10 classification model reached an AUC of 0.982 with only a 5.1% misclassification rate – highlighting the practical power of JMP in real-world entertainment analytics.
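For readers less familiar with the two classification metrics quoted above, the following Python sketch shows how a misclassification rate and an AUC can be computed from true labels and predicted probabilities. It is only an illustration: the function names, the 0.5 threshold, and the toy data are assumptions, and JMP computes these figures internally for its own models.

```python
def misclassification_rate(y_true, y_prob, threshold=0.5):
    """Fraction of songs whose predicted class differs from the true class."""
    # A song is predicted Top 10 when its probability meets the threshold.
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y]
    neg = [p for y, p in zip(y_true, y_prob) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = reached the Top 10, 0 = did not.
labels = [1, 0, 1, 0]
probs = [0.9, 0.4, 0.35, 0.1]
toy_error = misclassification_rate(labels, probs)
toy_auc = auc(labels, probs)
```

The pairwise formulation is quadratic in the number of songs, which is fine for a small demonstration; rank-based implementations scale better on full chart data.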
By using JMP interactively throughout, from sentiment scoring to predictive modeling, this project demonstrates how text analysis and music metadata can be combined to predict success in a creative industry. Our findings offer useful insights for artists, producers, and labels aiming to align content with evolving listener preferences.
Schedule
11:30 AM-12:15 PM
Location: Sabine
Skill level
- Beginner
- Intermediate
- Advanced