## Til I Die: Seeking new music

I’ve been following the tweets from an account called Albums You Must Hear @Albums2Hear. Each tweet is an album recommended by the account owner. I’m a sucker for lists of Albums That I Must Hear Before I Die since I’m always interested in new (or not so new) music recommendations.

I wanted to assemble a list of the albums that I don’t have from this account and I was able to do so using R.

Using rtweet, it was possible to pull a list of all the albums and reorganise them so that I had a csv containing the albums with the artist and year. I could then use this to compare with a list of albums from my iTunes library. A snippet of the retrieved records is shown here (full list is here).

The code for retrieval is here. The output is a csv that can be compared with a list of your own records.


    library(rtweet)
    library(httpuv)
    library(stringr)
    # retrieve the most recent tweets from the account
    all_tweets <- get_timeline("Albums2Hear", n = 1500)
    albums <- all_tweets$text
    # strip the hashtag, then split each tweet into Artist - Album - Year URL
    albums <- gsub("#albumsyoumusthear ","",albums)
    tempdf <- as.data.frame(str_split_fixed(albums, " - ", 3))
    colnames(tempdf) <- c("Artist","Album","YearURL")
    # separate the year from the trailing URL
    tempdf2 <- as.data.frame(str_split_fixed(tempdf$YearURL, " ", 2))
    colnames(tempdf2) <- c("Year","URL")
    df <- data.frame(tempdf$Artist,tempdf$Album,tempdf2$Year)
    colnames(df) <- c("Artist","Album","Year")
    write.csv(df,file = "albums2hear.csv")

Thanks to whoever runs the account – they ask for support here.

The post title comes from ‘Til I Die by The Beach Boys from Sunflower/Surf’s Up.

## Bateman Writes: 1994

BBC 6Music recently went back in time to 1994. This made me wonder what albums released that year were my favourites. As previously described on this blog, I have this information readily available. So I quickly crunched the numbers.

I focused on full-length albums and, using play density (sum of all plays divided by number of album tracks) as a metric, I plotted out the Top 20.

There you have it. Scorn’s epic Evanescence has the highest play density of any album released in 1994 in my iTunes library. By some distance. If you haven’t heard it, this is an amazing record that broke new ground and spawned numerous musical genres. I think that record, One Last Laugh In A Place of Dying… and Ro Sham Bo would all be high on my all-time favourite list. A good year for music then, as far as I’m concerned.

Other observations: I was amazed that Definitely Maybe was up there, since I am not a big fan of Oasis. Likewise for Dummy by Portishead. Note that Oxford’s Angels and Superdeformed[…] are bootleg records.

Bubbling under: this was the top 20, but there were some great records bubbling under in the 20s and 30s. Here are the best 5.

• Heatmiser – Cop and Speeder
• Circle – Meronia
• Credit to the Nation – Take Dis
• Kyuss – Welcome to Sky Valley
• Drive Like Jehu – Yank Crime

I heard tracks from some of these bands on 6Music, but many were missing. Maybe there is something for you to investigate.

Part of a series obsessively looking at music in an obsessive manner.

## Come To California

I’ve returned from the American Society for Cell Biology 2016 meeting in San Francisco.
Despite my being a cell biologist, and people from my lab having attended this meeting numerous times, this was my first ASCB meeting. The conference was amazing, with so much excellent science and so many opportunities to meet up with people. For the areas that I work in (mitosis, cytoskeleton and membrane traffic) the meeting was pretty much made for me. Often there were two or more sessions I could have attended, but couldn’t. I’ll try to summarise some of my highlights.

One of the best talks I saw was from Dick McIntosh, who is a legend of cell biology and is still making outstanding contributions. He showed some new tomography data of growing microtubules in a number of systems which suggest that microtubules have curved protofilaments as they grow. This is in agreement with structural data and some models of MT growth, but not with many other schematic diagrams.

The “bottom-up cell biology” subgroup was one of the first I attended. Organised by Dan Fletcher and Matt Good, the theme was reconstitution of biological systems in vitro. The mix of speakers was great, with Thomas Surrey and Marileen Dogterom giving great talks on microtubule systems, and Jim Hurley and Patricia Bassereau representing membrane curvature reconstitution. Physical principles and quantitative approaches were a strong theme here and throughout the meeting, which reflects where cell biology is at right now.

I took part in a subgroup on preprints organised by Prachee Avasthi and Jessica Polka. I will try to write a separate post about this soon. This was a fun session that was also a chance to meet up with many people I had only met virtually. There was a lot of excitement about preprints at the meeting and it seemed like many attendees were aware of preprinting. I guess this is not too surprising since the ASCB have been involved with the Accelerating Science and Publishing in Biology (ASAPbio) group since the start.

Of the super huge talks I saw in the big room, the Cellular Communities session really stood out. Bonnie Bassler and Jurgen Knoblich gave fantastic talks on bacterial quorum sensing and “minibrains” respectively. The Porter Lecture, given by Eva Nogales on microtubule structure, was another highlight.

The poster sessions (which I had heard were sprawling and indigestible) were actually my favourite part of the meeting. I saw mostly new work here and had the chance to talk to quite a few presenters. My lab took three posters of different projects at various stages of publication (Laura’s work preprinted/in revision project presented by me, Nick’s work soon to submit and Gabrielle’s work soon to write up) and so we were all happy to get some useful feedback on our work. We’ve had follow-up emails and requests for collaboration, which made the long trip worthwhile. We also had a mini lab reunion with Dan Booth, one of my former students, who was presenting his work on using 3D Correlative Light Electron Microscopy to examine chromosome structure.

For those that follow me on Twitter, you may know that I like to make playlists from my iTunes library when I visit another city. This was my first time back on the west coast since 2001. Here are ten tracks selected from my San Francisco, CA playlist:

10. California Über Alles – Dead Kennedys from Fresh Fruit For Rotting Vegetables
9. San Franciscan Nights – The Animals from Winds of Change
8. Who Needs the Peace Corps? – The Mothers of Invention from We’re Only In It For The Money
7. San Francisco – Brian Wilson and Van Dyke Parks from Orange Crate Art
6. Going to California – Led Zeppelin from IV
5. Fake Tales of San Francisco – Arctic Monkeys from Whatever People Say I Am, That’s What I’m Not
4. California Hills – Ty Segall from Emotional Mugger
3. The Portland Cement Factory at Monolith California – Cul de Sac from ECIM (OK, Monolith is nearer to LA than SF but it’s a great instrumental track).
2. Come to California – Matthew Sweet from Blue Sky on Mars
1. Russian Hill – Jellyfish from Spilt Milk

Before the meeting, I went on a long walk around SF with the guys from the lab and we accidentally found ourselves on Russian Hill.

For some reason I have a higher than average number of bootlegs recorded in SF. Television (Old Waldorf 1978), Elliott Smith (Bottom of the Hill, 1998), Jellyfish (Warfield Theater 1993), My Bloody Valentine, Jimi Hendrix etc. etc.

The post title comes from #2 in my playlist.

## Bateman Writes: Eye of the Tiger

I don’t often write about music at quantixed but I recently caught Survivor’s “Eye of The Tiger” on the radio and thought it deserved a quick post.

Surely everyone knows this song: a kind of catchall motivational tune. It is loved by people in gyms with beach-unready bodies and by presidential hopefuls without permission to use it.

Written specifically for Rocky III after Sylvester Stallone was refused permission by Queen to use “Another One Bites The Dust”, it has that 1980s middle-of-the-road hard-rock-but-not-heavy-metal feel to it. The kind of track that must be filed under “guilty pleasure”. Possibly you love this song. Maybe you get ready to meet your opponents whilst listening to it? If this is you, please don’t read on.

I find it difficult listening to this track because of the timing of the intro. Not sure what I mean?

Here is a waveform of one channel for the intro. Two of the opening phrases are shown underlined. A phrase in this case is: dun, dun-dun-dun, dun-dun-dun, dun-dun-durrrr. Can you see the problem with the second of those two phrases?

Still don’t see it? In the second phrase the second of the dun-dun-duns comes in late.

I’ve overlaid the waveform again to compare phrase 1 with phrase 2. The difference is one eighth note (a quaver) and it drives me nuts. I think it’s intentional because, well, the whole band play the same thing.
I don’t think it’s a tape splice error, because the track sounds live and surely someone must have noticed. Finally, they play these phrases again in the outro and at that point the timing is correct. No, it’s intentional. Why?

From this page Jim Peterik of Survivor says:

> I started doing that now-famous dead string guitar riff and started slashing those chords to the punches we saw on the screen, and the whole song took shape in the next three days.

So my best guess is that the notes were written to match the on-screen action!

The video on YouTube is only at 220 million views (at the time of writing). Give it a listen, if my description of dun-dun-duns was not illustrative enough for you.

Notes:

• The waveform is taken from the Eye of The Tiger album version of the song. I read that the version in the movie is actually the demo version.
• I loaded it into Igor using SoundLoadWave. I made an average of the stereo channels using MatrixOp and then downsampled the wave from 44.1 kHz so it was easier to move around.

A very occasional series on music. The name Bateman Writes refers to the obsessive writings of the character Patrick Bateman in Bret Easton Ellis’s novel American Psycho. This serial killer had a penchant for middle-of-the-road rock act Huey Lewis & The News.

## My Blank Pages IV: Every Song Ever

Every Song Ever: Twenty Ways to Listen in an Age of Musical Plenty
Ben Ratliff (Farrar, Straus and Giroux)

A non-science book review for today’s post. This is a great read on “how to listen to music”. There have been hundreds of books published along these lines; the innovation here, however, is that we now live in an age of musical plenty. Every song ever recorded is available at our fingertips to listen to when, where and how we want. This means that the author can draw on Thelonious Monk, Sunn O))), Shostakovitch and Mariah Carey. And you can seek it out and find out whatever it is that they have in common.

I got hooked in Chapter 2 (discussing slowness in music).
I was reading and thinking: he should mention Sleep’s Dopesmoker, but what are the chances? I turned the page and there it was. Then I knew that we were literally on the same page and that I would enjoy whatever it was he had to say. Isn’t confirmation bias a wonderful thing (outside of science)?

A lot of writing about music is terrible, but I love it when it is done well. As it is here. I especially like reading “under the bonnet” analysis of songs. Ian MacDonald’s Revolution In The Head (or Twilight of the Gods by Wilfred Mellers as an extreme example) springs to mind. This close analysis means you can go back and find new treasures in old songs. And this is the essence of the book.

I must admit that I have thought about trying to write similar analyses of songs on quantixed. Aside from the fact that I don’t have time, I was worried it might make me seem like Patrick Bateman discussing the merits of Huey Lewis & The News in American Psycho. It’s something that’s difficult to do well and Ratliff’s analyses here are light touch and spot-on.

The short section on blast beats which mentioned D.R.I. made me smile too. Although there’s a factual error here. Ratliff talks about how singer-drummer-brother combo Kurt and Eric Brecht lock in on Draft Me when they played CBGB’s in 1984. Drummer Eric had left the band at that point to be replaced by Felix Griffin, and it is him, not Eric, duelling with vocalist Kurt. Both on LP Dealing With It and the gig at CBGB’s which was later released as an LP and video. Again it’s a band that I have a soft spot for and it was great to see them picked out.

There were a couple of quotes that I found amusing, being a CD collector and something of a completist. Here’s one:

> A friend described to me the experience of acquiring a complete CD collection of Mozart, after having had a piece-by-piece relationship with his music for most of his life. It was 175 CDs, or something like that. “I realized,” he said, “that now that I had it all, I never needed to listen to it again.”

Along the same lines, I thought this quote was pretty chilling.

> We can pretty much wave bye-bye to the completist-music-collector impulse: it had a limited run in the human brain, probably 1930 to 2010. (It still exists in a fitful way, but it doesn’t have a consensual frame: there is no style for it.) It is not only a way of buying, owning, and arranging music-related objects and experiences in one’s life, but also a distinct way of listening.

As somebody who is not a fan of streaming and still values physically owning music, I know I am out-of-step with the rest of the world. However, I think this quote is at odds with what the whole book is trying to achieve. The guy listening to music on his phone speaker on the bus, described in the intro, can’t hear and appreciate much of what is described in the book. To hear that squeak of John Bonham’s kick drum pedal on Since I’ve Been Loving You from Led Zeppelin III, you need to be listening in the old-fashioned way, rather than in the noisy and busy way most music is consumed nowadays.

It’s a great read. You can get it here.

My Blank Pages is a track by Velvet Crush. This is an occasional series of book reviews.

## A Day In The Life III

This year #paperOTD (or paper of the day for any readers not on Twitter) did not go well for me. I’ve been busy with lots of things and I’m now reviewing more grants than last year because I am doing more committee work. This means I am finding less time to read one paper per day. Nonetheless I will round up the stats for this year. I only managed to read a paper on 59.2% of available days…

The top ten journals that published the papers that I read:

• 1 Nat Commun
• 2 J Cell Biol
• 3 Nature
• 4= Cell
• 4= eLife
• 4= Traffic
• 7 Science
• 8= Dev Cell
• 8= Mol Biol Cell
• 8= Nat Cell Biol

Nature Communications has published some really nice cell biology this year and I’m not surprised it’s number one.
Also, I read more papers in Cell this year compared to last. The papers I read are mainly recent. Around 83% of the papers were published in 2015. Again, a significant fraction (42%) of the papers have statistical errors. Funnily enough there were no preprints in my top ten. I realised that I tend to read these when approving them as an affiliate (thoroughly enough for #paperOTD if they interest me) but I don’t mark them in the database.

I think my favourite paper was this one on methods to move organelles around cells using light, see also this paper for a related method.

I think I’ll try again next year to read one paper per day. I’m a bit worried that if I don’t attempt this, I simply won’t read any papers in detail.

I also resolved to read one book per month in 2015. I managed this in 2014, but fell short in 2015 just like with #paperOTD. The best book from a limited selection was Matthew Cobb’s Life’s Greatest Secret. A tale of the early days of molecular biology, as it happened. I was a bit sceptical that Matthew could bring anything new to this area of scientific history. Having read Eighth Day of Creation, and then some pale imitations, I thought that this had pretty much been covered completely. This book, however, takes a fresh perspective and it’s worth reading. Matthew has a nice writing style, animating the dusty old main characters with insightful detail as he goes along. Check it out.

This blog is going well, with readership growing all the time. I have written about this progress previously (here and here). The most popular posts are those on publishing (preprints, impact factors and publication lag times) rather than my science, but that’s OK. There is more to come on lag times in the New Year, stay tuned.

I am a fan of year-end lists as you may be able to tell. My album of the year is Battles – La Di Da Di which came out on Warp in September.
An honourable mention goes to Air Formation – Were We Ever Here EP which I bought on iTunes since the 250 copies had long gone by the time I discovered it on AC30.

Since I don’t watch TV or go to the cinema, I don’t have a pick of the year for that. When it comes to pro-cycling, of course I have an opinion. My favourite stage race was Critérium du Dauphiné Libere which was won by Chris Froome in a close contest with Tejay van Garderen. The best one-day race was a tough pick between E3 Harelbeke won by Geraint Thomas and Omloop Het Nieuwsblad won by Ian Stannard. Although E3 was a hard man’s race in tough conditions, I have to go for Stannard outfoxing three(!) Etixx Quick Step riders to take the win in Nieuwsblad. I’m a bit annoyed that those three picks all involve Team Sky and British riders….

I won’t bore everyone with my own cycling (and running) exploits in 2015. Just to say that I’ve been more active this year than in any year since 2009.

I shouldn’t need to tell you where the post title comes from. If you haven’t heard Sgt. Pepper’s Lonely Hearts Club Band by The Beatles, you need to rectify this urgently. The greatest album recorded on 4-track equipment, no question. 🙂

## Your Favorite Thing: Algorithmically Perfect Playlist

I’ve previously written about analysing my iTunes library and about generating Smart Playlists in iTunes. This post takes things a bit further by generating a “perfect playlist” outside of iTunes… it is exclusively for nerds.

How can you put together a perfect playlist? What are your favourite songs? How can you tell what they are?

Well, we could look at how many times you’ve played each song in your iTunes library (assuming this is mainly how you consume your music)… but this can be misleading. Songs that have been in there since the start (in my case, a decade ago) have had longer to accumulate plays than songs that were added last year. This problem was covered nicely in a post by Mr Science Show.
He suggests that your all-time greatest playlist can be modelled using

$$\frac{dp}{dt}=\frac{A}{Bt+N_0} + Ce^{-Dt}$$

where $$N_0$$ is the number of tracks in the library at $$t_0$$, time zero. A and B are constants, with the collection assumed to grow linearly over time. The second component (with constants C and D) is an additional correction for the fact that songs added more recently are likely to have garnered more plays and, as they age, they relax back into the general soup of the library. I used something similar to make my perfect playlist.

Calculating something like this is well beyond the scope of iTunes and so we need to do something more heavy duty. The steps below show how this can be achieved. Of course, I used IgorPro to do almost all of this. I tried to read in the iTunes Music Library.xml directly in Igor using the udStFiLrXML package, but couldn’t get it to work. So there’s a bit of ruby followed by an all-Igor workflow.

You can scroll to the bottom to find out a) whether this was worth it and b) for other stuff I discovered along the way.

All the code to do this is available here. I’ll try to put quantixed code on github from now on.

Once the data is in Igor, the strategy is to calculate the expected number of plays a track should have received if iTunes was simply set to random. We can then compare this number to the actual number of plays. The ratio of these numbers helps us to discriminate our favourite tracks. To work out the expected plays, we calculate the number of tracks in the library over time and the inverse of this gives us the probability that a given track, at that moment in the lifetime of the library, will be played. We know the total number of plays and the lifetime of the library, so if we assume that play rate is constant over time (fair assumption), this means we can calculate the expected number of plays for each track.
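
To make the expected-plays idea concrete, here is a minimal Python sketch. It is not the Igor procedure used for this post: the function name `expected_plays`, the per-day time step and the toy numbers are all assumptions made for illustration. It applies the reasoning described above: a constant play rate, with each play landing on any track present that day with equal probability.

```python
from bisect import bisect_right

def expected_plays(added_days, total_plays, lifetime_days):
    """Expected plays per track if every play picked a track uniformly at random.

    added_days: day (0-based, counted from library creation) each track was added.
    total_plays: total plays recorded across the whole library.
    lifetime_days: number of days the library has existed.
    """
    rate = total_plays / lifetime_days  # plays per day, assumed constant
    sorted_days = sorted(added_days)

    # share_from[d] = sum over days d..end of rate / (tracks present that day)
    share_from = [0.0] * (lifetime_days + 1)
    for day in range(lifetime_days - 1, -1, -1):
        n_tracks = bisect_right(sorted_days, day)  # tracks added on or before this day
        per_track = rate / n_tracks if n_tracks else 0.0
        share_from[day] = share_from[day + 1] + per_track

    # a track only collects its share from the day it was added onwards
    return [share_from[min(d, lifetime_days)] for d in added_days]

# Toy library (hypothetical numbers): two tracks from day 0, one added on day 5
added = [0, 0, 5]
actual = [60, 20, 20]
expected = expected_plays(added, total_plays=100, lifetime_days=10)
ratios = [a / e for a, e in zip(actual, expected)]  # > 1 means played more than chance
```

Sorting tracks by `ratios` in descending order would give the favourites under this simple model. The recent-track correction mentioned in the text is not included in this sketch.
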
As noted above, there is a slight snag with this methodology, because tracks added in the last few months will have a very low number of expected plays, yet are likely to have been played quite a lot. To compensate for this I used the modelling method suggested by Mr Science Show, but only for recent songs.

Hopefully that all makes sense, so now for a step-by-step guide.

Step 1: Extract data from iTunes xml file to tsv

After trying and failing to write my own script to parse the xml file, I stumbled across this on the web.

    #!/usr/bin/ruby
    require 'rubygems'
    require 'nokogiri'

    list = []
    doc = Nokogiri::XML(File.open(ARGV[0], 'r'))

    # each track is a dict of alternating key and value nodes
    doc.xpath('/plist/dict/dict/dict').each do |node|
      hash = {}
      last_key = nil
      node.children.each do |child|
        next if child.blank?
        if child.name == 'key'
          last_key = child.text
        else
          hash[last_key] = child.text
        end
      end
      list << hash
    end

    p list

This script was saved as parsenoko.rb and could be executed from the command line

    find . -name "*.xml" -exec ruby parsenoko.rb {} > playlist.csv \;

after changing to the directory containing the script and a copy of the xml file.

Step 2: A little bit of cleaning

The file starts with [ and ends with ]. Each dictionary item (dict) has been printed enclosed by {}. It’s easiest to remove these before importing to IgorPro. For my library the maximum number of keys is 38. I added a line with (ColumnA<tab>ColumnB<tab>…<tab>ColumnAL), to make sure all keys were imported correctly.

Step 3: Import into IgorPro

Import the tsv. This is preferable to csv because many tracks have commas in the track title, album title or artist name. Everything comes in as text and we will sort everything out in the next step.

    LoadWave /N=Column/O/K=2/J/V={"\t","$",0,0}


Step 4: Get Igor to sort the key values into waves corresponding to each key

This is a major type of cleaning. What we’ll do is read the key and its value. The two are separated by => and so this is used to parse and resort the values. This will convert the numeric values to numeric waves.

This is done by executing

iTunes()
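
For readers without Igor, the parsing idea in this step can be sketched in Python. This is not the `iTunes()` procedure itself. It assumes each record looks like the Ruby output described above, i.e. quoted keys and values joined by `=>`. The function name `parse_record` and the short `NUMERIC_KEYS` set are illustrative choices, and only a few of the numeric iTunes keys are listed.

```python
import re

# "key"=>"value" pairs; quoted strings may contain escaped quotes,
# and optional whitespace around => is tolerated
PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=>\s*"((?:[^"\\]|\\.)*)"')

# Illustrative subset of iTunes keys that hold whole numbers
NUMERIC_KEYS = {"Track ID", "Total Time", "Play Count", "Skip Count", "Year"}

def parse_record(line):
    """Turn one printed dict into a Python dict, converting known numeric keys."""
    record = {}
    for key, value in PAIR.findall(line):
        value = value.replace('\\"', '"')  # undo escaped quotes only
        if key in NUMERIC_KEYS and value.isdigit():
            record[key] = int(value)
        else:
            record[key] = value
    return record

example = '{"Track ID"=>"1234", "Name"=>"Hey, Joe", "Play Count"=>"7"}'
record = parse_record(example)
```

Converting by key rather than by the look of the value avoids turning a track title that happens to be a number into a numeric value.
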

Step 5: Convert timestamps to date values

iTunes stores dates in a UTC timestamp with this format 2014-10-02T20:24:10Z. It does this for Date Added, Date Modified, Last Played etc. To do anything meaningful with these, we need to convert them to date values. IgorPro uses the time in seconds from Midnight on 1st Jan 1904 as a date system. This requires double precision FP64 waves. We can parse the string containing this time stamp and convert it using

DateRead()
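
As an illustration of the conversion (not the `DateRead()` procedure itself), here is a small Python sketch. The function name `itunes_to_igor_seconds` is made up for this example, and it assumes the timestamp should be treated as UTC, as the trailing Z suggests.

```python
from datetime import datetime, timezone

# Igor's date system counts seconds from midnight on 1st Jan 1904
IGOR_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)

def itunes_to_igor_seconds(stamp):
    """Convert an iTunes timestamp such as 2014-10-02T20:24:10Z to seconds since 1904."""
    moment = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return (moment - IGOR_EPOCH).total_seconds()

seconds = itunes_to_igor_seconds("2014-10-02T20:24:10Z")
```

The result for a present-day date is a number in the billions. Single precision can only represent whole numbers exactly up to 2^24 (about 16.8 million), which is why double precision is needed to keep these values accurate to the second.
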

Step 6: Discover your favourite tracks!

We do all of this by running

Predictor()

The way this works is described above. Note that you can run whatever algorithm you like at this point to generate a list of tracks.

Step 7: Make a playlist to feed back to iTunes

The format for playlists is the M3U file. This has a simple layout which can easily be printed to a Notebook in Igor and then saved as a file for importing back into iTunes.

To do this we run

WritePlaylist(listlen)

Where the Variable listlen is the length of the playlist. In this example, listlen=50 would give the Top 50 favourite tracks.
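
A rough Python equivalent of this step is sketched here, assuming the extended M3U layout (a `#EXTM3U` header, then an `#EXTINF` line and a file path per track). The function name `build_m3u` and the dictionary keys are hypothetical, and the Igor `WritePlaylist` procedure may lay the file out differently.

```python
def build_m3u(tracks, listlen):
    """Return extended M3U text for the first `listlen` tracks of a ranked list.

    Each track is a dict with 'seconds', 'artist', 'title' and 'path' keys
    (names chosen for this sketch).
    """
    lines = ["#EXTM3U"]
    for track in tracks[:listlen]:
        lines.append(f"#EXTINF:{track['seconds']},{track['artist']} - {track['title']}")
        lines.append(track["path"])
    return "\n".join(lines) + "\n"

# Hypothetical ranked list, best track first
ranked = [
    {"seconds": 215, "artist": "Sugar", "title": "Your Favorite Thing",
     "path": "/Music/Sugar/your_favorite_thing.mp3"},
    {"seconds": 182, "artist": "Jellyfish", "title": "Russian Hill",
     "path": "/Music/Jellyfish/russian_hill.mp3"},
]
playlist_text = build_m3u(ranked, listlen=1)
```

The text can then be written to a file with an .m3u extension. Note that the Location field in the iTunes xml is stored as a file URL, so it may need converting to a plain path first.
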

So what did I find out?

My top 50 songs determined by this method were quite different to the Smart Playlist in iTunes of the Most Played tracks. The tracks at the top of the Most Played list in iTunes have disappeared in the new list and these are the ones that have been in the library for a long time and I suppose I don’t listen to that much any more. The new algorithmically designed playlist has a bunch of fresher tracks that were added in the last few years and I have listened to quite a lot. Looking through I can see music that I should explore in more detail. In short, it’s a superior playlist and one that will always change and should not go stale.

Other useful stuff

There are quite a few parsing tools on the web that vary in their usefulness. Some that deserve a mention are:

• The xml file should be readable as a plist by cocoa which is native to OSX
• Visualisation of what proportion of an iTunes library is by a given artist – bdunagan’s blog
• itunes-parser on github by phiggins
• Really nice XSLT to move the xml file to html – moveable-type
• Comprehensive but difficult to follow method in ruby.

The post title comes from “Your Favorite Thing” by Sugar from their LP “File Under: Easy Listening”

## Science songs

I thought I’d compile a list of songs related to biomedical science. These were all found in my iTunes library. I’ve missed off multiple entries for the same kind of thing, as indicated.

Neuroscience

• Grand Mal – Elliott Smith from XO Sessions
• She’s Lost Control – Joy Division from Unknown Pleasures (Epilepsy)
• Aneurysm – Nirvana from Hormoaning EP
• Serotonin – Mansun from Six
• Serotonin Smile – Ooberman from Shorley Wall EP
• Brain Damage – Pink Floyd from Dark Side of The Moon
• Paranoid Schizophrenic – The Bats from How Pop Can You Get?
• Headacher – Bear Quartet from Penny Century
• Headache – Frank Black from Teenager of the Year
• Manic Depression – Jimi Hendrix Experience and lots of other songs about depression
• Paranoid – Black Sabbath from Paranoid (thanks to Joaquin for the suggestion!)

Medical

• Cancer (interlude) – Mansun from Six
• Hepatic Tissue Fermentation – Carcass or pretty much any song in this genre of Death Metal
• Whiplash – Metallica from Kill ‘Em All
• Another Invented Disease – Manic Street Preachers from Generation Terrorists
• Broken Nose – Family from Bandstand
• Bones – Radiohead from The Bends
• Ana’s Song – Silverchair from Neon Ballroom (Anorexia Nervosa)
• 4st 7lb – Manic Street Preachers from The Holy Bible (Anorexia Nervosa)
• November Spawned A Monster – Morrissey from Bona Drag (disability)
• Castles Made of Sand – Jimi Hendrix Experience from Axis: Bold As Love (disability)
• Cardiac Arrest – Madness from 7
• Blue Veins – The Raconteurs from Broken Boy Soldiers
• Vein Melter – Herbie Hancock from Headhunters
• Scoliosis – Pond from Rock Collection (curvature of the spine)
• Taste the Blood – Mazzy Star… lots of songs with blood in the title.

Pharmaceutical

• Biotech is Godzilla – Sepultura from Chaos A.D.
• Luminol – Ryan Adams from Rock N Roll
• Feel Good Hit Of The Summer – Queens of The Stone Age from Rated R (prescription drugs of abuse)
• Stars That Play with Laughing Sam’s Dice – Jimi Hendrix Experience (and hundreds of other songs about recreational drugs)
• Tramazi Parti – Black Grape from It’s Great When You’re Straight…
• Z is for Zofirax – Wingtip Sloat from If Only For The Hatchery
• Goldfish and Paracetamol – Catatonia from International Velvet
• L Dopa – Big Black from Songs About Fucking

Genetics and molecular biology

• Genetic Reconstruction – Death from Spiritual Healing
• Genetic – Sonic Youth from 100%
• Hair and DNA – Hot Snakes from Audit in Progress
• DNA – Circle from Meronia
• Biological – Air from Talkie Walkie
• Gene by Gene – Blur from Think Tank
• My Selfish Gene – Catatonia from International Velvet
• Sheer Heart Attack – Queen (“it was the DNA that made me this way”)
• Mutantes – Os Mutantes
• The Missing Link – Napalm Death from Mentally Murdered E.P.
• Son of Mr. Green Genes – Frank Zappa from Hot Rats

Cell Biology

• Sweet Oddysee Of A Cancer Cell T’ Th’ Center Of Yer Heart – Mercury Rev from Yerself Is Steam
• Dead Embryonic Cells – Sepultura from Arise
• Cells – They Might Be Giants from Here Comes Science (songs for kids about science)
• White Blood Cells LP by The White Stripes
• Anything by The Membranes
• Soma – Smashing Pumpkins from Siamese Dream
• Golgi Apparatus – Phish from Junta
• Cell-scape LP by Melt Banana

Album covers with science images

Godflesh – Selfless. Scanning EM image of some cells growing on a microchip?

Circle – Meronia. Photograph of an ampoule?

Do you know any other science songs or album covers? Leave a comment!

## My Favorite Things

I realised recently that I’ve maintained a consistent iTunes library for ~10 years. For most of that time I’ve been listening exclusively to iTunes, rather than to music in other formats. So the library is a useful source of information about my tastes in music. It should be possible to look at who are my favourite artists, what bands need more investigation, or just to generate some interesting statistics based on my favourite music.

Play count is the central statistic here as it tells me how often I’ve listened to a certain track. It’s the equivalent of a +1/upvote/fave/like or maybe even a citation. Play count increases by one if you listen to a track all the way to the end. So if a track starts and you don’t want to hear it and you skip on to the next song, there’s no +1. There’s a caveat here in that the time a track has been in the library influences the play count to a certain extent, but that’s for another post*. The second indicator for liking a track or artist is the fact that it’s in the library. This may sound obvious, but what I mean is that artists with lots of tracks in the library are more likely to be favourite artists compared to a band with just one or two tracks in there. A caveat here is that some artists do not have long careers for a variety of reasons, which can limit the number of tracks actually available to load into the library. Check the methods at the foot of the post if you want to do the same.

What’s the most popular year? Firstly, I looked at the most popular year in the library. This question was the focus of an earlier post that found that 1971 was the best year in music. The play distribution per year can be plotted together with a summary of how many tracks and how many plays in total from each year are in the library. There’s a bias towards 90s music, which probably reflects my age, but could also be caused by my habit of collecting CD singles which peaked as a format in this decade. The average number of plays is actually pretty constant for all years (median of ~4), the mean is perhaps slightly higher for late-2000s music.

Favourite styles of music: I also looked at Genre. Which styles of music are my favourite? I plotted the total number of tracks versus the total number of plays for each Genre in the library. Size of the marker reflects the median number of plays per track for that genre. Most Genres obey a rule where total plays is a function of total tracks, but there are exceptions. Crossover, Hip-hop/Rap and Power-pop are highlighted as those with an above average number of plays. I’m not lacking in Power-pop with a few thousand tracks, but I should probably get my hands on more Crossover or Hip-Hop/Rap.

Using citation statistics to find my favourite artists: Next, I looked at who my favourite artists are. It could be argued that I should know who my favourite artists are! But tastes can change over a 10 year period and I was interested in an unbiased view of my favourite artists rather than who I think they are. A plot of Total Tracks vs Mean plays per track is reasonably informative. The artists with the highest plays per track are those with only one track in the library, e.g. Harvey Danger with Flagpole Sitta. So this statistic is pretty unreliable. Equally, I’ve got lots of tracks by Manic Street Preachers but evidently I don’t play them that often. I realised that the problem of identifying favourite artists based on these two pieces of information (plays and number of tracks) is pretty similar to assessing scientists using citation metrics (citations and number of papers). Hirsch proposed the h-index to meld these two bits of information into a single metric. It’s easily computed and I already had an Igor procedure to calculate it en masse, so I ran it on the library information.
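
For anyone wanting to try this without Igor, here is a minimal Python sketch of the h-index applied to play counts. It is not the Igor procedure mentioned above, and the function name and example numbers are made up. An artist has index h if h of their tracks have each been played at least h times.

```python
def h_index(play_counts):
    """Largest h such that h tracks have at least h plays each."""
    ranked = sorted(play_counts, reverse=True)
    h = 0
    for rank, plays in enumerate(ranked, start=1):
        if plays >= rank:
            h = rank
        else:
            break  # counts only get smaller from here, so stop
    return h

# Hypothetical artist with five tracks
artist_h = h_index([10, 8, 5, 4, 3])
```

In this toy example four tracks have at least four plays each, but there are not five tracks with five plays, so the index is 4.
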

Before doing this, I consolidated multiple versions of the same track into one. I knew that I had several versions of the same track, especially as I have multiple versions of some albums (e.g. Pet Sounds = 3 copies = mono + stereo + a cappella). The top offending track was “Baby’s Coming Back” by Jellyfish, with 11 copies! Anyway, these were consolidated before running the h-index calculation.
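The consolidation can be sketched in Python along the lines given in the Methods: key each track on the artist plus the first ten characters of the title, sum the plays per key, then split the artist back out. The tab separator and the tuple layout are my assumptions, and this is an illustration rather than the Igor code actually used.

```python
from collections import defaultdict

SEP = "\t"  # assumed separator; any character absent from artist names will do


def consolidate(tracks, prefix_len=10):
    """Merge versions of a track and return play counts grouped by artist.

    `tracks` is an iterable of (artist, title, plays) tuples.
    """
    totals = defaultdict(int)
    for artist, title, plays in tracks:
        key = artist + SEP + title[:prefix_len]
        totals[key] += plays                 # add plays across versions
    by_artist = defaultdict(list)
    for key, plays in totals.items():
        artist = key.split(SEP, 1)[0]        # recover the artist from the key
        by_artist[artist].append(plays)
    return dict(by_artist)


# Invented play counts, for illustration only
example = [
    ("Jellyfish", "Baby's Coming Back", 5),
    ("Jellyfish", "Baby's Coming Back (Live)", 3),
    ("Jellyfish", "The King Is Half-Undressed", 7),
]
merged = consolidate(example)
print(merged)
```

One caveat of matching on a short prefix is that two genuinely different tracks by the same artist get merged if their titles share the first ten characters.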

The top artist was Elliott Smith with an h-index of 32. This means he has 32 tracks that have been listened to at least 32 times each. I was amazed that Muse had the second highest h-index (I don’t consider myself a huge fan of their music) until I remembered a period where their albums were on an iPod Nano used during exercise. Amusingly (and narcissistically) my own music – the artist names are redacted – scored quite highly with two out of three bands in the top 100, which are shown here. These artists with high h-indices are the most consistently played in the library and probably constitute my favourite artists, but is the ranking correct?

The procedure also calculates the g-index for every artist. The g-index is similar to the h-index but takes into account very highly played tracks (very highly cited papers) over the h threshold. For example, The Smiths h=26. This could be 26 tracks that have been listened to exactly 26 times or they could have been listened to 90 times each. The h-index cannot reveal this, but the g-index gets to this by assessing average plays for the ranked tracks. The Smiths g=35. To find the artists that are most-played-of-the-consistently-most-played, I subtracted h from g and plotted the Top 50. This ranked list I think most closely represents my favourite artists, according to my listening habits over the last ten years.
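To make the difference concrete, here is a small Python sketch of both metrics. It takes the g-index as the largest g such that the top g tracks together have at least g² plays, and it caps g at the number of tracks, which is an assumption on my part (some variants pad with zero-play tracks). The two artists are invented.

```python
def h_index(plays):
    """Largest h such that h tracks have at least h plays each."""
    ranked = sorted(plays, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def g_index(plays):
    """Largest g such that the top g tracks total at least g*g plays."""
    ranked = sorted(plays, reverse=True)
    g = 0
    running_total = 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if running_total >= rank * rank:
            g = rank
    return g


# Two invented artists with the same h but very different g
steady = [3, 3, 3, 0, 0, 0]
one_big_hit = [30, 3, 3, 0, 0, 0]
for plays in (steady, one_big_hit):
    h, g = h_index(plays), g_index(plays)
    print(h, g, g - h)
```

Both artists have h=3, but the heavily played track lifts the second artist to g=6, so g minus h separates them in the way described above.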

Track length: Finally, I looked at the track length. I have a range of track lengths in the library, from “You Suffer” by Napalm Death (iTunes has this at 4 s, but Wikipedia says it is 1.316 s), through to epic tracks like “Blue Room” by The Orb. Most tracks are in the 3-4 min range. Plays per track indicates that this track length is optimal, with most of the highly played tracks falling within this window. The super-long tracks are rarely listened to, probably because of their length. Short tracks also have higher than average plays, probably because they are over before there is any chance to skip them.

These were the first things that sprang to mind for iTunes analysis. As I said at the top, there’s lots of information in the library to dig through, but I think this is enough for one post. And not a pie-chart in sight!

Methods: the library is in XML format and can be read/parsed this way. More easily, you can just select the whole library and copy-paste it into TextEdit and then load this into a data analysis package. In this case, IgorPro (as always). Make sure that the interesting fields are shown in the full library view (Music>Songs). To do everything in this post you need artist, track, album, genre, length, year and play count. At the time of writing, I had 21326 tracks in the library. For the h-index analysis, I consolidated multiple versions of the same track, giving 18684 tracks. This is possible by concatenating artist and the first ten characters of the track title (separated by a unique character) and adding the play counts for these concatenated versions. The artist could then be recovered by splitting on the unique character and used for the h-index calculation. It’s not very elegant, but seemed to work well. The h-index and g-index calculations were automated (previously sort-of-described here), as was most of the plot generation. The inspiration for the colour coding is from the 2013 Feltron Report.
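If you would rather parse the XML than copy-paste, the exported library is an Apple property list, so Python's standard `plistlib` can read it. The sketch below is not what I did for this post, and the key names (`Tracks`, `Name`, `Artist`, `Total Time` in milliseconds, `Play Count` and so on) are how I believe the export is laid out, so check them against your own file. The example round-trips a tiny made-up library instead of reading a real one.

```python
import plistlib


def tracks_from_library(library):
    """Pull the fields used in this post out of a parsed library plist."""
    rows = []
    for track in library.get("Tracks", {}).values():
        rows.append({
            "artist": track.get("Artist", ""),
            "track": track.get("Name", ""),
            "album": track.get("Album", ""),
            "genre": track.get("Genre", ""),
            "length_s": track.get("Total Time", 0) / 1000,  # assumed to be ms
            "year": track.get("Year"),
            "plays": track.get("Play Count", 0),  # assumed absent if never played
        })
    return rows


def load_tracks(path):
    """Read an exported library XML file from `path`."""
    with open(path, "rb") as handle:
        return tracks_from_library(plistlib.load(handle))


# Tiny made-up library, serialised and parsed again to mimic the file format
example_library = {
    "Tracks": {
        "1001": {"Name": "You Suffer", "Artist": "Napalm Death",
                 "Total Time": 4000, "Play Count": 2},
    }
}
rows = tracks_from_library(plistlib.loads(plistlib.dumps(example_library)))
print(rows[0])
```

For a real library you would call `load_tracks` with the path to your exported XML file.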

* There’s an interesting post here about modelling the ideal playlist. I worked through the ideas in that post but found that it doesn’t scale well to large libraries, especially if they’ve been going for a long time, i.e. mine.

The post title is taken from John Coltrane’s cover version of My Favorite Things from the album of the same name. Excuse the US English spelling.

## Tips from the Blog I

What is the best music to listen to while writing a manuscript or grant proposal? OK, I know that some people prefer silence and certainly most people hate radio chatter while trying to concentrate. However, if you like listening to music, setting an iPod on shuffle is no good since a track by Napalm Death can jump from the speakers and affect your concentration. Here is a strategy for a randomised music stream of the right mood and with no repetition, using iTunes.

For this you need:
A reasonably large and varied iTunes library that is properly tagged*.

1. Set up the first smart playlist to select all songs in your library that you like to listen to while writing. I do this by selecting genres that I find conducive to writing.
Conditions are:
-Match any of the following rules
-Genre contains jazz
-add as many genres as you like, e.g. shoegaze, space rock, dream pop etc.
-Don’t limit and do check live updating
I call this list Writing

2. Set up a second smart playlist that makes a randomised novel list from the first playlist
Conditions are:
-Match all of the following rules
-Playlist is Writing   //or whatever you called the 1st playlist
-Last played is not in the last 14 days    //this means once the track is played it disappears, i.e. refreshes constantly
-Limit to 50 items selected by random
-Check Live updating
I call this list Writing List

That’s it! Now play from Writing List while you write. The same strategy works for other moods, e.g. for making figures I like to listen to different music and so I have another pair for that.
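To see why the pair of playlists behaves this way, the two sets of rules can be mimicked in a few lines of Python. This is only a model of the logic, not anything iTunes runs: the track dicts and their keys are made up, and I'm assuming that a never-played track counts as "not in the last 14 days".

```python
import random
from datetime import datetime, timedelta

WRITING_GENRES = ("jazz", "shoegaze", "space rock", "dream pop")


def writing_playlist(tracks):
    """Playlist 1: match ANY rule, i.e. genre contains one of the names."""
    return [t for t in tracks
            if any(name in t["genre"].lower() for name in WRITING_GENRES)]


def writing_list(tracks, now, days=14, limit=50, rng=random):
    """Playlist 2: in playlist 1 AND not played recently, random, limited."""
    cutoff = now - timedelta(days=days)
    fresh = [t for t in writing_playlist(tracks)
             if t["last_played"] is None or t["last_played"] < cutoff]
    return rng.sample(fresh, min(limit, len(fresh)))


# Made-up tracks, for illustration only
example_tracks = [
    {"title": "A", "genre": "Jazz", "last_played": None},
    {"title": "B", "genre": "Dream Pop", "last_played": datetime(2014, 1, 10)},
    {"title": "C", "genre": "Grindcore", "last_played": None},
    {"title": "D", "genre": "Space Rock", "last_played": datetime(2013, 12, 1)},
]
picked = writing_list(example_tracks, now=datetime(2014, 1, 15))
print(sorted(t["title"] for t in picked))
```

Track B drops out because it was played five days earlier, which is the same thing that makes a track vanish from Writing List as soon as it has been played.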

After a while, the tracks that you’ve skipped (for whatever reason) clog up the playlist. Just select all and delete from the smart playlist. This refreshes the list and you can go again with a fresh set.

* If your library has only a few tracks, or has plenty of tracks but they are all of a similar genre, this tip is not for you.