This study explores the evolving landscape of popular music using JMP Pro 17 to analyze trends in lyrical sentiment, complexity, and musical structure from Billboard Hot 100 songs spanning 2000-2022. Our objective was to identify which factors drive chart longevity and popularity, using a combination of audio features, lyrical analysis, and advanced statistical modeling.

We began by integrating and cleaning three large data sets sourced from Kaggle: Billboard chart rankings, audio features from Spotify, and full song lyrics. Using JMP’s data preparation, visualization, and modeling tools, we conducted sentiment analysis, engineered features for lyrical complexity, and examined genre-level differences using ANOVA and correlation analysis techniques.
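To make the idea of engineered lyrical-complexity features concrete, here is a minimal Python sketch. It is not the JMP workflow used in the study; the function name `lyrical_complexity` and the specific measures (word count, unique words, type-token ratio, average word length) are assumptions chosen for illustration, since the abstract does not say which complexity measures were actually built.

```python
import re


def lyrical_complexity(lyrics: str) -> dict:
    """Return simple, illustrative complexity measures for one song's lyrics.

    Hypothetical helper: these measures are assumptions, not the study's features.
    """
    # Lowercase the text and treat runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", lyrics.lower())
    if not words:
        # Avoid division by zero for empty or instrumental tracks.
        return {
            "word_count": 0,
            "unique_words": 0,
            "type_token_ratio": 0.0,
            "avg_word_length": 0.0,
        }
    unique = set(words)
    return {
        "word_count": len(words),
        "unique_words": len(unique),
        # Share of distinct words: lower values indicate more repetition.
        "type_token_ratio": len(unique) / len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }


if __name__ == "__main__":
    # Toy input invented for the example, not taken from the data sets.
    print(lyrical_complexity("La la la love"))
```

A highly repetitive chorus yields a low type-token ratio, which is one plausible way a "complexity" score could separate songs before modeling.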

The core of our analysis involved building two predictive models using bootstrap forests in JMP: one to estimate how long a song stays on the chart, and one to predict whether it will reach the Top 10. These models revealed that features like lyrical complexity, acousticness, valence, tempo, and speechiness are strong predictors of commercial success. Our chart longevity model achieved an R² of 0.814, while our Top 10 classification model reached an AUC of 0.982 with only a 5.1% misclassification rate, highlighting the practical power of JMP in real-world entertainment analytics.
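For readers less familiar with the two classification metrics quoted above, the following Python sketch shows how a misclassification rate and an AUC can be computed from labels and predicted scores. It is an illustration only: JMP reports these values directly, the function names are hypothetical, and the toy labels and scores are invented for the example rather than drawn from the study.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant for
    small illustrative inputs only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (assumed): 1 = reached the Top 10, 0 = did not.
    y_true = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.7, 0.1]
    # Classify as Top 10 when the predicted score is at least 0.5.
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(misclassification_rate(y_true, y_pred), auc(y_true, scores))
```

The misclassification rate depends on the chosen cutoff (0.5 here), whereas AUC summarizes ranking quality across all cutoffs, which is why the two numbers are reported together.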

By leveraging JMP interactively throughout, from sentiment scoring to predictive modeling, this project demonstrates how text analysis and music metadata can be combined to predict success in a creative industry. Our findings offer useful insights for artists, producers, and labels aiming to align content with evolving listener preferences.

Presented At Discovery Summit 2025

Schedule

Wednesday, Oct 22
11:30 AM-12:15 PM

Location: Sabine

Skill level

Intermediate

Published on 07-09-2025 08:59 AM by Community Manager | Updated on 09-02-2025 01:17 PM
