Data Mining iTunes

I’ll admit that I dwell pretty firmly in the past when it comes to music. I don’t really listen to the radio and none of my friends are really into music, so I don’t have a whole lot of ways to get introduced to new stuff. Ever since I ripped all my CDs into iTunes, I’ve spent a good deal of time organizing, cataloging, and rating the songs in my library and not so much time expanding it.

I have my nifty little widget over to the right there that displays what I’m listening to at any given moment, and I recently added the little star icons that display my rating of each song. You can also click the link underneath the song info to see my entire iTunes library. To make this exciting service available to you, my loyal readers, I have a nightly task set up on my PC that dumps the song information into a database on my web server.

It occurred to me that since my iTunes library was in a database, I could run some database queries to analyze my musical tastes and trends.

First off, just what does constitute a 5-star song in my book? It has to be a classic; something I’d listen to over and over and never tire of. It turns out, I’m pretty miserly in giving out the maximum honor. Of 2,270 songs, only 46 (or 2%) have been so honored:

NewArtist | Name | Album | Year
Bauhaus | She's In Parties | Burning From The Inside | 1983
Bauhaus | Bela Lugosi's Dead | 1979-83 (Disc 1) | 1986
Bauhaus | The Passion Of Lovers | 1979-83 (Disc 1) | 1986
Beautiful South, The | Rotterdam (Or Anywhere) | Blue Is The Colour | 1997
Beck | Lost Cause | Sea Change | 2002
Clash, The | London Calling | London Calling | 1979
Cure, The | A Forest | Seventeen Seconds | 1980
Cure, The | Just Like Heaven | Kiss Me, Kiss Me, Kiss Me | 1987
David Bowie | Space Oddity | Space Oddity | 1969
David Bowie | The Man Who Sold The World | The Man Who Sold The World | 1972
David Bowie | Young Americans | Sound + Vision II | 1975
David Bowie | Heroes | Heroes | 1977
David Bowie | Ashes to Ashes | Sound+Vision III | 1989
David Sylvian | Orpheus | Secrets of the Beehive | 1987
Dead Can Dance | Ullyses | A Passage In Time | 1991
Dead Can Dance | Severance | A Passage In Time | 1991
Elvis Costello & The Attractions | New Amsterdam | Best Of | 1980
Elvis Costello & The Attractions | Beyond Belief | Best Of | 1982
Fiery Furnaces, The | Straight Street | Blueberry Boat | 2004
Joy Division | Love Will Tear Us Apart | Substance 1977-1980 | 1988
Kate Bush | Cloudbusting | Hounds of Love | 1985
Kate Bush | Wuthering Heights | The Whole Story | 1986
Massive Attack | Protection | Protection | 1994
Neko Case | Lady Pilot | Blacklisted | 2002
New Order | Blue Monday | Substance [Disc 1] | 1987
Nick Cave and The Bad Seeds | From Her To Eternity | From Her To Eternity | 1984
Nick Cave and The Bad Seeds | Saint Huck | From Her To Eternity | 1984
Nick Cave and The Bad Seeds | The Carny | Your Funeral... My Trial | 1986
Nick Cave and The Bad Seeds | The Mercy Seat | Tender Prey | 1992
Portishead | Glory Box | Dummy | 1994
Public Image Limited | Seattle (DRM) | The Greatest Hits So Far | 1998
Public Image Limited | Seattle | The Greatest Hits So Far | 1998
Rolling Stones -- Keith Richards & Mick Jagger, The | Sympathy for the Devil | Beggars Banquet | 2005
Roxy Music | Avalon | Avalon | 1982
Siouxsie and The Banshees | Cities in Dust | Twice upon a Time (The Singles) | 0
Siouxsie and The Banshees | Kiss Them for Me (DRM) | Superstition | 1991
Siouxsie and The Banshees | Kiss Them for Me | Superstition | 1991
Siouxsie and The Banshees | Cities in Dust | Twice upon a Time - The Singles | 1992
Smiths, The | How Soon Is Now? | Meat is Murder | 1984
Smiths, The | Hand In Glove | The Smiths | 1984
Smiths, The | Panic | Louder Than Bombs | 1987
Tear Garden, The | Ophelia | Tired Eyes Slowly Burning | 1987
Tear Garden, The | Romulus And Venus | The Last Man To Fly | 1993
This Mortal Coil | Song To The Siren | It'll End In Tears | 1984
Tom Waits | Invitation to the Blues | Small Change | 1976
Tom Waits | Time | Rain Dogs | 1985
Tom Waits | Gun Street Girl | Rain Dogs | 1985
Tom Waits | The Briar And The Rose | The Black Rider | 1993
Tom Waits | Hold On | Mule Variations | 1999
White Stripes, The | Seven Nation Army | Elephant | 2003

Now, I’ve only rated 1,023 songs (45%), but I’m pretty sure everything that I would assign 5 stars to has been tagged.
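For anyone who wants to reproduce the five-star list and the rated share, here is a minimal sketch. It assumes the table has the columns shown in the listing above, that a 5-star rating is stored as 100 and an unrated song as 0 (the 0-100 scale is an assumption), and it uses an in-memory SQLite database with made-up placeholder rows in place of the real MySQL table.

```python
import sqlite3

# Stand-in for the phptunest_tunes table. SQLite replaces the MySQL server,
# and every row below is a made-up placeholder, not real library data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phptunest_tunes "
    "(Artist TEXT, Name TEXT, Album TEXT, Year INTEGER, Rating INTEGER)"
)
conn.executemany(
    "INSERT INTO phptunest_tunes VALUES (?, ?, ?, ?, ?)",
    [
        ("Artist A", "Song 1", "Album A", 1985, 100),
        ("Artist A", "Song 2", "Album A", 1985, 80),
        ("Artist B", "Song 3", "Album B", 1987, 100),
        ("Artist C", "Song 4", "Album C", 2002, 0),  # 0 = not rated (assumed)
    ],
)

# Songs with the maximum rating (assumed to be stored as 100).
five_star = conn.execute(
    "SELECT Artist, Name, Album, Year FROM phptunest_tunes "
    "WHERE Rating = 100 ORDER BY Artist, Year"
).fetchall()

# Library size and how many songs carry any rating at all.
total, rated = conn.execute(
    "SELECT COUNT(*), SUM(Rating > 0) FROM phptunest_tunes"
).fetchone()

print(f"{len(five_star)} five-star songs out of {total}; {rated} rated")
```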

Next, I wondered about that disappearing method of grouping songs together called the “album.” You see, kids, back in the old days when we bought music we had to buy a dozen or so songs all together, whether we liked them or not. Nowadays, with your MP3 file-swapping and hula hoops and all that nonsense, you can just focus on individual songs.

My Top 20 Albums list is based on the average rating of the songs on an album, excluding any non-rated songs. I further restricted it to only albums that had 4 or more songs to eliminate all the one-, two-, and three-offs I got from mixes or downloaded from somewhere.

AlbumRating | RatedTracks | Album | Artist
74.00 | 10 | The Dreaming | Kate Bush
73.33 | 6 | Secrets of the Beehive | David Sylvian
73.00 | 8 | For Your Pleasure | Roxy Music
71.11 | 9 | Blacklisted | Neko Case
70.50 | 10 | Floodland | The Sisters Of Mercy
70.00 | 4 | Stompin' At The Savoy | Benny Goodman
70.00 | 14 | The Smiths | The Smiths
69.09 | 11 | Franks Wild Years | Tom Waits
68.89 | 9 | 1979-83 (Disc 1) | Bauhaus
68.57 | 7 | Temptation | Holly Cole
68.57 | 7 | It'll End In Tears | This Mortal Coil
68.00 | 5 | Avalon | Roxy Music
68.00 | 5 | Welcome To The Beautiful South | The Beautiful South
67.37 | 19 | Rain Dogs | Tom Waits
67.00 | 9 | Meat is Murder | The Smiths
66.92 | 12 | Stories From The City, Stories From The Sea | PJ Harvey
66.67 | 9 | Your Funeral... My Trial | Nick Cave and The Bad Seeds
66.67 | 9 | Blueberry Boat | The Fiery Furnaces
66.32 | 19 | Best Of | Elvis Costello & The Attractions
66.13 | 15 | The Black Rider | Tom Waits

Here’s the SQL I used:
SELECT SUM(Rating) / COUNT(Name) AS AlbumRating,
       COUNT(Name) AS RatedTracks,
       Album,
       Artist
FROM phptunest_tunes
WHERE Rating > 0
GROUP BY Album
HAVING RatedTracks > 3
ORDER BY AlbumRating DESC
LIMIT 0, 20

The results were surprising, but then I realized they are skewed by the fact that I tend to only rate my favorite tracks. For example, I only rated 4 tracks on “Secrets of the Beehive” but they were all high. I would probably rate the remaining tracks lower, which would move it down the list. I don’t know enough about statistics to correct for that.
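One common way to dampen this kind of skew is a shrunken ("Bayesian") average: pad each album with a few imaginary tracks rated at the library-wide mean, so an album with only a handful of high ratings gets pulled toward the middle. The sketch below is an assumption-laden illustration rather than a real fix. The prior weight of 5 is arbitrary, the rows are made-up placeholders, and SQLite stands in for MySQL. It also says nothing about how the unrated tracks would actually score.

```python
import sqlite3

PRIOR_WEIGHT = 5.0  # assumed: number of "average" tracks each album is padded with

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phptunest_tunes (Artist TEXT, Album TEXT, Rating INTEGER)")

# Made-up placeholder rows: a short album rated uniformly high, a long album
# rated mostly high, and a long album rated low.
sample = (
    [("Artist X", "Album X", 80)] * 4
    + [("Artist Y", "Album Y", 80)] * 9
    + [("Artist Y", "Album Y", 60)] * 3
    + [("Artist Z", "Album Z", 40)] * 10
)
conn.executemany("INSERT INTO phptunest_tunes VALUES (?, ?, ?)", sample)

# AdjustedRating = (c * library_mean + sum of ratings) / (c + rated tracks)
ranked = conn.execute(
    """
    SELECT Album,
           COUNT(*) AS RatedTracks,
           AVG(Rating) AS AlbumRating,
           (:c * (SELECT AVG(Rating) FROM phptunest_tunes WHERE Rating > 0)
              + SUM(Rating)) / (:c + COUNT(*)) AS AdjustedRating
    FROM phptunest_tunes
    WHERE Rating > 0
    GROUP BY Album
    HAVING COUNT(*) > 3
    ORDER BY AdjustedRating DESC
    """,
    {"c": PRIOR_WEIGHT},
).fetchall()

for album, tracks, raw, adjusted in ranked:
    print(f"{album}: {tracks} rated, raw {raw:.2f}, adjusted {adjusted:.2f}")
```

With these placeholder rows, the short album has the highest raw average, but the longer album overtakes it once both are shrunk toward the library mean. A larger prior weight makes the adjustment more aggressive.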

Then, I wondered how varied my tastes were. I knew I had a lot of tunes from my favorite artists, but I wondered what percentage of the whole those made up. Here are the results of that, limited to artists who make up 1% or more of my entire library.

Artist | NumberOfTracks | PercentOfWhole
Tom Waits | 321 | 11.97
David Bowie | 229 | 8.54
Nick Cave and The Bad Seeds | 191 | 7.12
The Cure | 149 | 5.56
Kate Bush | 95 | 3.54
Bauhaus | 81 | 3.02
Dead Can Dance | 77 | 2.87
The Smiths | 68 | 2.54
PJ Harvey | 62 | 2.31
The Beautiful South | 46 | 1.72
David Sylvian | 37 | 1.38
The Tear Garden | 36 | 1.34
This Mortal Coil | 35 | 1.30
Belle & Sebastian | 33 | 1.23
Einst | 31 | 1.16
Siouxsie and The Banshees | 31 | 1.16
Pet Shop Boys | 30 | 1.12
The Fiery Furnaces | 30 | 1.12
Peter Murphy | 28 | 1.04
No Doubt | 27 | 1.01
The Pogues | 27 | 1.01

And, here’s the SQL:
SELECT Artist,
       COUNT(Artist) AS NumberOfTracks,
       COUNT(Artist) / (SELECT COUNT(*) FROM `phptunest_tunes`) * 100.0 AS PercentOfWhole
FROM `phptunest_tunes`
GROUP BY Artist
HAVING PercentOfWhole > .9
ORDER BY PercentOfWhole DESC

The top three — Tom Waits, David Bowie, and Nick Cave and The Bad Seeds — make up 30% of my library.

Finally, I wondered just how stuck in the past I was. I ran a simple query to count the number of tunes I own from each year and then I plotted the results on a line graph in Excel.

It’s not as bad as I thought. 2004 and 2002 have the 2nd and 4th most tunes in my library. 1987, however, was apparently the peak of appealing musical output as far as I am concerned.
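The per-year count mentioned above is not shown in the post, so the following is only a guess at its shape. It assumes a Year column on the same table, skips rows whose year is 0 (which appears to mean "not tagged"), and uses SQLite with made-up placeholder rows in place of the real MySQL data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phptunest_tunes (Name TEXT, Year INTEGER)")

# Made-up placeholder rows; Year = 0 stands for a song with no year tag.
conn.executemany(
    "INSERT INTO phptunest_tunes VALUES (?, ?)",
    [("Song 1", 1987), ("Song 2", 1987), ("Song 3", 2004), ("Song 4", 0)],
)

# One row per year with the number of tunes from that year, oldest first.
per_year = conn.execute(
    "SELECT Year, COUNT(*) AS Tunes FROM phptunest_tunes "
    "WHERE Year > 0 GROUP BY Year ORDER BY Year"
).fetchall()

# Tab-separated output pastes straight into a spreadsheet for charting.
for year, tunes in per_year:
    print(f"{year}\t{tunes}")
```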

Comments

--AHRM--
--COUGH!!!--
--bigdork--
--COUGH!!! COUGH!!!--

Actually, several years ago I sent out a questionnaire to all of my friends asking them "What kind of music would you listen to..." and then added a bunch of categories underneath (e.g. "...on a bright Sunday morning in Spring after months of rain," or "...when you're prefunking before a night on the town"). This was before the big 'meme' craze that has since become the scourge of otherwise lovely blogs everywhere. I got a lot of great answers... Perhaps you should start a questionnaire...

I think musical interest, particularly in popular music, peaks for everyone during the late teens and early twenties. For old farts like myself, that would be from about 1965 to 1976. The onset of disco and punk created a disaffection for "the latest sounds" and to this day, nothing sounds as good to me as those great albums from that heady era.

Certainly, your focus on what were your good old days confirms my theory.