Fun With Numbers: Nico Stream Rankings Predicting Summer 2013 Disk Over/Unders

Quick recap: I’m taking a look at various no-cost indicators of popularity for anime and their related goods. First, I’m checking how well each indicator corresponds to disk sales, by testing whether different applications of that statistic beat the accuracy of the null model “every v1 will sell less than 4000 disks” for a given season (66% for Fall, 59% for Summer). Later, I’ll check how well these indicators corresponded to boosts in the popularity of manga/LN source material (for works that were originally LNs/manga), to contrast their predictive abilities at both high-cost and low-cost levels of interest.
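
For concreteness, here’s a minimal sketch of that comparison in Python, using made-up placeholder outcomes rather than the real sales figures:

```python
# Hypothetical over/under outcomes, one entry per show's v1 (placeholder data).
sold_over_4k = [False, True, False, False, True, False]

# The null model predicts "under 4000" for every show, so its accuracy is
# simply the fraction of shows whose v1 actually sold under 4000.
null_accuracy = sum(not over for over in sold_over_4k) / len(sold_over_4k)

def beats_null(predictions, actuals, null_acc):
    """True if a set of over/under predictions outperforms the null model."""
    accuracy = sum(p == a for p, a in zip(predictions, actuals)) / len(actuals)
    return accuracy > null_acc
```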

Nico stream rankings are often discussed in the context of early-season indicators of disk sales, and my analysis of the Fall 2013 shows for which those numbers were available did find several versions of the data proving more accurate than the null hypothesis. Adding another season should help control for any peculiarities specific to Fall.

I computed accuracy rates for the top 5/10/15 models, using only the 25 Summer series with Nico stream data available, split into 3 averaged samples corresponding to the beginning, middle, and end of each series (the end period is labeled 9-12, but occasionally covers 9-13 or 9-10). The individual rank data was compiled here, and the sorted version can be found here. The bare-bones summary of the results is below (green means an accuracy greater than that of the null hypothesis).
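
To make the “top 5/10/15 model” concrete: sort the shows by a given Nico metric, predict that the top N sell over 4000 disks and that everything below them doesn’t, then score those guesses against the actual results. A rough Python sketch, assuming a list of actual outcomes already in metric-rank order:

```python
def top_n_accuracy(ranked_over_4k, n):
    """ranked_over_4k: actual over/under outcomes, best-ranked show first."""
    hits = 0
    for rank, actually_over in enumerate(ranked_over_4k):
        predicted_over = rank < n  # only the top N are predicted to clear 4k
        hits += predicted_over == actually_over
    return hits / len(ranked_over_4k)

# e.g. the "top 10" model's accuracy for one period's (hypothetical) ranking:
# top_n_accuracy(middle_period_ranking, 10)
```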

Note: this chart groups by period rather than by metric, whereas the Fall chart grouped by metric.

[Chart: Summer2013-nico-comp]

That’s a lot of green, corresponding to a fairly useful indicator. While some top-tier performances notably failed to translate into disk sales (Watamote was top 5 in half the metrics and top 10 in all of them), the general results point to a stat that can fairly be discussed early on as a broad-based indicator of disk-buyer popularity.

Peculiarly, the max rating percentage becomes a less accurate indicator for the later periods. This appears to be due to specific shows (Genshiken Nidaime, the Rozen Maiden reboot) rising in that metric after losing viewers while keeping the viewers who gave max scores; most likely, many of the people who had middling opinions of them checked out after the first month, leaving somewhat inflated percentages behind.
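
To see why the percentage inflates, consider some invented round numbers (not the actual figures for either show):

```python
# Early in the season: 10,000 viewers, 3,000 of whom give the max score.
early_viewers, early_max_voters = 10_000, 3_000
print(early_max_voters / early_viewers)  # 0.30 -> a 30% max-score rate

# Suppose 4,000 lukewarm viewers drop the show while the max-score voters stay:
late_viewers, late_max_voters = 6_000, 3_000
print(late_max_voters / late_viewers)    # 0.50 -> a 50% max-score rate,
# higher despite the show shedding 40% of its audience.
```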

Fun With Numbers: Nico Stream Rankings Predicting Fall 2013 Disk Over/Unders

Niconico stream rankings are one of the most frequently talked-about stats when it comes to early-season indicators of how a show might be performing. It’s been somewhat frustrating to me that they seem to get brought up at the beginning of every season, but the accuracy of their predictions never really gets revisited. By comparing their basic effectiveness with the eventual disk sales results, hopefully we can get a little more context for how those numbers should be interpreted.

Nico ranks come with a couple of numbers attached, so I’m looking at a few variations to see which ones work best. I use 3 metrics: the number of viewers at the end of the episode stream, the percentage of viewers who gave the episode a 1 (the max score), and the product of those first two (i.e. the number of users who gave the episode a max score). All 3 of these metrics were averaged over episodes 1-4 (October), episodes 5-8 (November), and episodes 9 through the finale (December; usually 9-12, occasionally 9-10 or 9-13). I then determined how accurate guesses based solely on these metrics would be if the top 5, 10, and 15 items on the list were assumed to have their first volume sell over 4k disks.
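
Here’s a rough sketch of those three metrics and the monthly averaging, assuming per-episode Nico data shaped like the hypothetical records below:

```python
episodes = [
    # (episode number, viewers at end of stream, fraction giving the max score)
    (1, 50_000, 0.55), (2, 42_000, 0.60), (3, 40_000, 0.58), (4, 38_000, 0.57),
    # ... episodes 5-8 (November) and 9 onward (December) would follow
]

def window_metrics(eps):
    """Average the three metrics over one block of episodes."""
    n = len(eps)
    avg_viewers = sum(v for _, v, _ in eps) / n
    avg_max_pct = sum(p for _, _, p in eps) / n
    avg_max_count = sum(v * p for _, v, p in eps) / n  # viewers * max-score rate
    return avg_viewers, avg_max_pct, avg_max_count

october = window_metrics([e for e in episodes if e[0] <= 4])
```

Each period’s averages then get sorted to produce the ranked lists that the top-5/10/15 guesses are read off of.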

It should be noted that only 25 of the 38 Fall shows I’m using v1 data for have available Nico rank data, 9 of which had v1s sell over 4000 disks. In the interest of not polluting the sample, I’m just checking the accuracy of the rankings within the sample, excluding the other 13. This does leave the sample with a different over/under ratio than the full seasonal sample (36% of v1s selling over 4000, instead of 34%).
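
As arithmetic, that sample shift works out as follows (the 13 full-sample overs are inferred from the 34% and 0.66 figures above):

```python
full_sample, full_overs = 38, 13  # 13/38 ~ 34% of v1s over 4000 season-wide
nico_sample, nico_overs = 25, 9   # 9/25 = 36% over among shows with Nico data

print(round(full_overs / full_sample, 2))  # 0.34
print(round(nico_overs / nico_sample, 2))  # 0.36
```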

Anyway, here’s how the various methods performed on the Fall data (detailed data here, breakdown doc here). Numbers for the ones that beat the .66 accuracy afforded by the best null hypothesis are shown in green:

[Chart: nicoran-comp]

Total number of viewers was arguably the strongest metric, beating the null hypothesis outright in October and December. Max rating % did very well when only taking the top 5, but fairly poorly otherwise (the top 15 models barely beat a coin flip). Max rating number was fairly spotty, except as a broad-strokes indicator in October and a top-tier indicator in December.

This is data taken from a fairly small sample, but these numbers do suggest that the top tier of the rankings and the total number of stream viewers are at least better than random chance at picking which shows will succeed, something that couldn’t be said for most Torne DVR lists. How impressive that is depends on how well other indicators do and how well it holds up as the sample grows.