Multiplex: comparing Plex libraries

This website is all about niche tech tips and this post is no exception.

I run a Plex media server. This allows me to stream my music collection when I am out and about. Plex Pass owners get the nifty Plexamp app for listening to music, which I really like. The databasing for the movie and TV show side of Plex works great, but the music side has its peculiarities.

The problem

I have a large music library. It is meticulously curated with full metadata in iTunes/Music style. Plex can handle this, but it has a couple of foibles, and I noticed that my music library had developed a few glitches. The major one was that about 70% of my compilation albums had been given the artist name of Liz Phair with Material Issue! They covered the Tra La La theme from the Banana Splits show and, due to some conflict or other, that artist name propagated to almost all compilations.

The problem is that an album can be checked as a compilation in iTunes/Music, have its Album Artist metadata left blank, and still be understood to be a compilation. Plex doesn’t understand this and treats each song as a one-track album for each artist on the compilation. For some reason this led to the unusual naming issue above.

There were some other problems with Album Artist names that I won’t bore you with.

The fix

So the first thing was to set the “Album Artist” field to Various Artists for all compilations, which can be done easily in iTunes/Music. However,

  • A rescan of the Music library after this change did not fix the database.
  • Doing the Plex Dance did not work either. This is where you move the offending media out of the folder that Plex watches, scan, clean bundles, and then move back and rescan.
  • Drastic action was needed.

This meant I needed to make a new Music Library.

Given the emotional attachment I have to my library and play history and so on, I couldn’t bring myself to delete the music library and remake it. Instead I chose to make a new library, verify it was an improvement and then either delete and replace the old library, or fix the problems in the old library.

How I did it

I first made a new music library (I called it musica for a temporary name) with all the same settings as the music library. Creating this library took a long time – I have a lot of music!!

Next, the problem was how to compare the libraries. There have been various plug-ins and scripts developed over the years (plex2csv, ExportTools) but the current approach (seemingly endorsed by Plex) is this: WebTools-NG. Installation was simple on my Mac and it was quick to log in to my server.

WebTools-NG can export each Plex library as a delimited text file. I took no chances and just asked it to dump out everything for both libraries. Each export took around 4 hours. In retrospect I could have selected only the basic information and it would have been faster.

Into R

I read the files into R. The files use pipe as a separator. The data frames are huge and have lots of irrelevant columns (69 in total), so I reduced them to the information that matters like this.

music <- read.delim("Data/music.csv",sep = "|")
musica <- read.delim("Data/musica.csv",sep = "|")

reduce_library <- function(input) {
  output <- subset(input, select = c("Album.Title",
                           "Album.Year",
                           "Title",
                           "Audio.Track.Album.No",
                           "Audio.Track.No",
                           "Audio.Track.Artist",
                           "Duration",
                           "Part.File.Combined",
                           "Part.Size.as.Bytes"))
  return(output)
}

music <- reduce_library(music)
musica <- reduce_library(musica)
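As an aside, the dotted column names above are not necessarily what is in the export itself. The sketch below uses a made-up three-column file (the header names and rows are assumptions for illustration, not real WebTools-NG output) to show how read.delim turns spaces in header names into dots, and how to confirm the row count after reading.

```r
# Write a tiny pipe-delimited file to a temporary location (made-up rows)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Album Title|Title|Part File Combined",
  "Some Album|First Song|/music/some/01.mp3",
  "Some Album|Second Song|/music/some/02.mp3"
), tmp)

# read.delim defaults to header = TRUE and check.names = TRUE,
# so spaces in the header names are replaced with dots
toy <- read.delim(tmp, sep = "|", stringsAsFactors = FALSE)

print(names(toy))  # column names as R sees them
print(nrow(toy))   # number of tracks read
```

The same nrow() call on the real data frames gives the row counts for each library.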

I could see that I had 125389 rows in music and 125376 in musica. So what was missing?

An anti-join will reveal what is in music that is missing from musica. We can use Part.File.Combined, which is the full file path to each music file, to do these comparisons.

library(dplyr)
only <- music %>%
  anti_join(musica,by = "Part.File.Combined")

This revealed that one album (a 2-disc set) had not scanned properly. Otherwise, all files were present in both libraries.
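It can be worth running the check in both directions, since the new library could also contain files that the old one lacks. Here is a minimal base R sketch of the same idea, using toy data frames with hypothetical file paths in place of the real exports.

```r
# Toy stand-ins for the two reduced libraries (hypothetical paths)
music <- data.frame(
  Part.File.Combined = c("/m/a/01.mp3", "/m/a/02.mp3", "/m/b/01.mp3"),
  Title = c("One", "Two", "Three"),
  stringsAsFactors = FALSE
)
musica <- data.frame(
  Part.File.Combined = c("/m/a/01.mp3", "/m/b/01.mp3", "/m/c/01.mp3"),
  Title = c("One", "Three", "Four"),
  stringsAsFactors = FALSE
)

# Rows of music whose path does not appear in musica (same idea as anti_join)
only_music <- music[!(music$Part.File.Combined %in% musica$Part.File.Combined), ]

# And the reverse: rows of musica whose path does not appear in music
only_musica <- musica[!(musica$Part.File.Combined %in% music$Part.File.Combined), ]

print(only_music$Part.File.Combined)
print(only_musica$Part.File.Combined)
```

With dplyr loaded, swapping the arguments to anti_join should give the reverse comparison in the same way.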

Next, the question was what other Artist Names were borked in music.

# merge music and musica to compare
all_df <- merge(x = music, y = musica, by = "Part.File.Combined", all = TRUE)
# find Artist conflicts
# (files present in only one library compare as NA and show up as all-NA rows)
check <- all_df[with(all_df, Audio.Track.Artist.x != Audio.Track.Artist.y),]
# just look at artists' albums where there is a conflict rather than per track
check <- subset(check, select = c("Audio.Track.Artist.x","Audio.Track.Artist.y"))
check <- check[!duplicated(check),]

There were 4727 tracks with Artist conflicts and these were from 155 albums.
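For anyone wanting to reproduce counts like these, here is one possible way to do it, shown on toy data with made-up names. It is a sketch and not necessarily how the numbers above were obtained. It assumes that counting distinct album titles is good enough, which would undercount if two different albums share a title, and it uses which() so that files present in only one library (NA comparisons) are left out.

```r
# Toy stand-ins for the two reduced libraries (made-up values)
music <- data.frame(
  Part.File.Combined = c("/m/comp/01.mp3", "/m/comp/02.mp3",
                         "/m/solo/01.mp3", "/m/lost/01.mp3"),
  Album.Title = c("Comp", "Comp", "Solo", "Lost"),
  Audio.Track.Artist = c("Wrong Artist", "Wrong Artist",
                         "Same Artist", "Gone Artist"),
  stringsAsFactors = FALSE
)
musica <- data.frame(
  Part.File.Combined = c("/m/comp/01.mp3", "/m/comp/02.mp3", "/m/solo/01.mp3"),
  Album.Title = c("Comp", "Comp", "Solo"),
  Audio.Track.Artist = c("Various Artists", "Various Artists", "Same Artist"),
  stringsAsFactors = FALSE
)

# Full outer join on file path; shared columns get .x (music) and .y (musica)
all_df <- merge(music, musica, by = "Part.File.Combined", all = TRUE)

# which() keeps only rows where the comparison is TRUE, dropping NA results
conflict <- which(all_df$Audio.Track.Artist.x != all_df$Audio.Track.Artist.y)
check <- all_df[conflict, ]

n_tracks <- nrow(check)                         # tracks with an artist conflict
n_albums <- length(unique(check$Album.Title.x)) # distinct album titles affected

print(n_tracks)
print(n_albums)
```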

The problems were not just limited to compilations. Several other artists had become misnamed in the original library. For many reasons, I decided to delete the original library and then rename musica to music. The only issue was then to reconnect all the apps and services to the new library.

Weirdly, the play history seems to have carried over to the new library, which was a nice surprise since I had reconciled myself to losing this info. One problem with the new library is that the waveform that can be seen for each track when it plays in Plexamp is missing until the track has been “scanned for loudness”. This is a scheduled task and eventually the waveforms will be restored. You can check the logs to verify that this process is making progress.

Conclusion

If you find problems with your music library, just go ahead and replace it. There were very few errors in the remade library and it solved the headaches I had with the original library.

The post title comes from “Multiplex” by Oliver from his album “Standing Stone”, one of those legendary unreleased/lost albums that got reissued in the 1990s. Recorded on reel-to-reel tape in a Welsh farmhouse, this album is well worth seeking out.
