Fun With Numbers: Berserk’s Third Movie Offers an Upper Bound on US Amazon Sales Projections

I'm officially done tracking US release data for the month of April, and the full data is here, if you care to check. I'm not posting a full summary; a lot of things got released, and I'm at the point where each month of data isn't going to revolutionize my results. But there was one neat null result I got out of this month. The BD for Berserk's third movie (Golden Age: Advent) put up the best release week of any release I've tracked to date, spending the majority of the days between April 15 and April 20 in triple-digit rankings. Despite that performance, and a healthy run of preorder rankings, it failed to crack the top 20 on the US BD charts (preorders are counted in first-week totals).

I've been collecting older BD charts via the Numbers for two months now (they're about 2-3 months delayed from the present time), and those charts (from January and February in particular) suggest that a release needs to move between 10,000 and 20,000 BD units to crack the number 20 slot in any given week. Therefore, any Amazon-rank fit I use to convert rankings into approximate sales figures should be able to take that Berserk data from March 25 through April 20 and come out with a sales figure of at most 20,000, and possibly below 10,000 (I'll have more exact figures in a few months when the Numbers catches up to that week). That's not a super-tight constraint, but it'll help me clean out some of the more egregious overestimates when fitting the data (it may or may not disqualify the fit I used on the March data, which projects the movie at just a hair under 16,000 copies).
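To make that concrete, here's a minimal sketch of the kind of sanity check I have in mind, assuming a simple power-law conversion from average daily rank to daily sales. The fit parameters and the daily ranks below are hypothetical placeholders, not my actual fit or the actual Berserk tracking data:

```python
# Hypothetical rank-to-sales conversion; a and b are placeholder fit parameters.
def daily_sales_from_rank(rank, a=60000.0, b=0.8):
    """Convert an average daily Amazon rank into an estimated daily sales figure."""
    return a * rank ** (-b)

# Placeholder tracking data: one average rank per day, March 25 through April 20
# (a roughly three-week preorder period followed by the release week).
daily_ranks = [5000] * 21 + [400] * 6

projected_total = sum(daily_sales_from_rank(r) for r in daily_ranks)

# Failing to crack the top 20 caps the preorder-plus-first-week total somewhere
# between 10,000 and 20,000 units, so a usable fit should land below that range.
print(round(projected_total), projected_total < 20000)
```

Any candidate fit that pushes this total past the 20,000 ceiling (or, more strictly, past 10,000) when run on the real Berserk numbers is a candidate for the trash bin.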

Fun With Numbers: May 2014 US Amazon Data (Initial Numbers)

This is just an infodump post for the May series I'll be tracking, compiled via Amazon's upcoming anime releases list. Not much beyond the initial numbers here. The April summary post will be up in a week or so (though it won't have updated charts – I just want to keep collecting data for the next few months before I try identifying trends again).


Fun With Numbers: Google Traffic Predicting the Fall 2013 Disk Over/Unders

Continuing my series of disk-sales predictor checks, I'm looking at whether Google search traffic for Fall 2013 series' Japanese and Romanized titles did a decent job of predicting their eventual v1 disk sales, using a simple over/under accuracy check for the top 5, 10, and 15 series each month. Results are shown below (accuracies better than the null hypothesis in green):

[Table: Google traffic over/under accuracy vs. the null hypothesis]

Google traffic seems to be a fairly strong indicator of disk sales, especially in December, where the top-10 check achieved the highest accuracy of any metric to date. That's a very positive sign for the metric; I was actually expecting it to be more indicative of casual manga/LN-sales interest. We'll see how it ultimately does there when I get the sales bump data together in a couple of weeks.

Fun With Numbers: Normalized Google Traffic for Fall 2013 Anime

A big part of my goal is to take a look at existing numbers and see what can be gleaned from them, but that doesn't mean I can't take a stab at collecting new metrics. This post is just a summary of one rarely-used metric whose efficacy I'm curious to gauge: normalized Google traffic for anime that aired in the Fall 2013 season.

Why check Google traffic? It's not complicated; most people with access to anime have access to the internet, and most people with access to the internet end up using search engines a lot. So you can (theoretically) get a good first-order approximation of how much relative interest an anime has generated by checking its search traffic volume against some predetermined reference total (the volume of the term "Fall 2013 Anime" for the month of October is used here). By using volume for both the original Japanese and Romanized titles, it's also possible to roughly separate traffic from Japan versus the rest of the world. Show data is found on this doc, and listed in the full post for the 38 Fall 2013 shows whose v1 data I've been using in predictive tests.
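For concreteness, here's a minimal sketch of the normalization itself. The volume figures are made-up placeholders rather than real Google numbers; the only substance is the ratio against the reference term:

```python
# Hypothetical October search-volume figures (placeholders, not real data).
baseline_volume = 100.0         # volume for the reference term "Fall 2013 Anime"
japanese_title_volume = 260.0   # volume for a show's original Japanese title
romanized_title_volume = 140.0  # volume for the same show's Romanized title

# Normalized traffic = a title's volume relative to the reference term.
normalized_japanese = japanese_title_volume / baseline_volume    # ~2.6
normalized_romanized = romanized_title_volume / baseline_volume  # ~1.4

# The Japanese-title figure is a rough proxy for interest in Japan; the
# Romanized-title figure, for interest in the rest of the world.
print(normalized_japanese, normalized_romanized)
```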


Fun With Numbers: Yuruyuri and the Oricon Threshold Iceberg

Towards the end of 2011, the chief editor of comic Yuri Hime, Naitarou Nakamura, revealed that Yuruyuri, a then-7-volume manga series which was adapted into an anime series in summer of that year, had sold over 1 million copies. That in itself isn’t a particularly rare feat; 49 different manga sold that many copies in 2011 alone, as did 15 individual volumes.

What makes Yuruyuri's case particularly instructive, though, is that it had never appeared on the Oricon charts until June of 2012, 6 months after hitting said million-copy mark. How is it possible for a series to hit that impressive a milestone without charting once? The short answer is that the Oricon charts are a very incomplete list. In the past 5 years' worth of manga charts, we've never had a weekly threshold below a five-digit number of copies. That means that, in theory, it's entirely possible (if not extremely likely) for a series to sell 10,000 copies per week without its fans hearing a word about it. If a hypothetical volume did that for 52 weeks, its total sales of 520,000 copies would be just 25,000 copies shy of the last entry on the 2011 top-50 individual volumes list (One Piece's first volume).

The above example is a bit extreme, but Yuruyuri's performance isn't that far off. There's a fairly strong limiting case we can look at to get an idea of how exactly Yuruyuri made it to the magic million (which required an average of ~143,000 copies sold per volume): assume the sales were entirely fueled by the anime's popularity boost. The series had 7 volumes out for the period between the anime starting to air (on July 4th) and the editor's announcement, and between July 4th and December 18th there were a total of 24 weeks of Oricon sales charts. 1,000,000 copies / 7 volumes / 24 weeks ≈ 5,950 copies/volume/week. The lowest chart threshold over that time period was 18,406 copies/week, for one week in mid-October. Even if we assume that all of those sales were packed into the 12 weeks in which the anime was airing, that only gets us to about 11,900 copies/volume/week, still short of the most generous threshold available over that span. In a less stringent case, where the manga was already halfway to a million copies and the anime provided a more moderate boost (which would still mean doubling the series' sales in a span a quarter as long as its previous 2 years in print), it would have been even easier for the series to remain entirely under the radar en route to the million-copy mark.
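Spelled out as a quick calculation, using only the figures from the paragraph above:

```python
total_copies = 1_000_000   # milestone announced by the editor
volumes = 7                # volumes in print while the anime aired
chart_weeks = 24           # weekly Oricon charts between July 4 and December 18
lowest_threshold = 18_406  # most generous weekly chart threshold in that span

print(round(total_copies / volumes / chart_weeks))  # ~5,952 copies/volume/week

# Even cramming every sale into the 12 weeks the anime was actually airing:
airing_rate = total_copies / volumes / 12
print(round(airing_rate), airing_rate < lowest_threshold)  # ~11,905, True
```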

Yuruyuri had a successful anime, averaging about 8348 disks per volume, and thus didn't need the manga success the way a lesser series might have. But it does serve as one of the more powerful counters to the idea that an anime's success in advertising a manga necessitates an appearance on the Oricon charts. It also illustrates the fact that, when we do see big boosts in sales, the true boosts might be significantly bigger than what we actually observe. The most successful manga advertisements, the crazy-charting Blue Exorcists of the world, are easy to quantify. However, many series, even those that end up as clear-cut successes from an insider's point of view, are not.* One thing that should always be kept in mind, especially when looking at manga for adults like Aoi Hana, which carries a per-volume price tag (~1030 yen) more than twice that of newer One Piece volumes (~430 yen), is that a series doesn't have to be making the Oricon charts at all to make its publisher happy.**

*I am guilty of oversimplifying these cases myself at times, so I can’t really blame other people for doing so. To wit, the gain-probabilities I name in this article are for minimum gains, not exact gains.

**Yuruyuri, by the by, runs about 930 yen/volume.

Fun With Numbers: Myanimelist Stats Predicting the Fall 2013 Disk Over/Unders

Edit: Link to source document fixed.

Like those of any other free-access site, or any single set of numbers, statistics from myanimelist should be used with caution. But that doesn't mean they shouldn't or can't be used. In this article, I'm looking at the current rank and popularity of 38 shows from the Fall 2013 season on myanimelist to see how they stack up against other methods of predicting v1 disk sales. As before, note that any accuracy below the best null hypothesis of .66 is fairly insignificant.

By the by, the myanimelist stats used in this piece were taken in late April, 2014. Odds are they've changed a bit since the outset of the season, though shifts within a season tend not to be particularly cataclysmic. Most can probably be viewed as being in a similar boat to the December Torne/eps 9-12 Nico ranks, in that they reflect a point where the audience has more information about the show (i.e. has seen more episodes). I used both rank/score and popularity/total number of users to order the shows, and then checked how many v1 sales figures a top-5/10/15 guess based on those lists correctly pegs as over or under 4000 disks. Data can be found on this doc, and the results are below (indicators greater than the null hypothesis in green):

[Table: myanimelist rank and popularity over/under accuracy]

Turns out, both indicators do about the same, beating out the null hypothesis for all 3 list sizes. Out of the metrics I've examined so far, that's the best overall performance. Of course, keep in mind that this is a fairly small sample, and there are other areas (notably source material sales) where these indicators may be telling. It's still fairly early in the project to say much, beyond the fact that mal stats are at least a little more indicative than random chance.

Fun With Numbers: Nico Stream Rankings Predicting Fall 2013 Disk Over/Unders

Niconico stream rankings are one of the most frequently talked-about stats when it comes to early-season indicators of how a show might be performing. It’s been somewhat frustrating to me that they seem to get brought up at the beginning of every season, but how accurate their predictions were has never really been revisited. By comparing their basic effectiveness with the eventual disk sales results, hopefully we can get a little more context for how those numbers should be interpreted.

Nico ranks come with several numbers attached, so I'm looking at a few of them to see which ones worked best. I use 3 metrics: the number of viewers at the end of the episode stream, the percentage of viewers who gave the episode a 1 (the max score), and the product of those first two (i.e. the number of users who gave the episode a max score). All 3 of these metrics were averaged over episodes 1-4 (October), episodes 5-8 (November), and episodes 9-13 (December). I then determined how accurate guesses based solely on these metrics would be if the top 5, 10, and 15 items on each list were assumed to have their first volume sell over 4k disks.

It should be noted that only 25 of the 38 Fall shows I’m using v1 data for have available Nico rank data, 9 of which had v1s sell over 4000 disks. In the interest of not polluting the sample, I’m just checking the accuracy of the rankings within the sample, excluding the other 13. This does leave the sample with a different over/under ratio than the full seasonal sample (36% instead of 34%).
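For reference, here's a minimal sketch of the over/under accuracy check itself. The show names, viewer counts, and sales figures are hypothetical placeholders, not actual Fall 2013 data; the same function covers any of the three metrics by swapping out the metric dictionary:

```python
def top_n_accuracy(metric, v1_sales, n, threshold=4000):
    """Guess that the top-n shows by `metric` sell over `threshold` v1 disks
    and everything else sells under, then return the fraction of correct calls."""
    predicted_over = set(sorted(metric, key=metric.get, reverse=True)[:n])
    correct = sum(
        (show in predicted_over) == (sales > threshold)
        for show, sales in v1_sales.items()
    )
    return correct / len(v1_sales)

# Placeholder example: average end-of-stream viewer counts vs. v1 disk sales.
viewers = {"Show A": 120000, "Show B": 90000, "Show C": 40000, "Show D": 15000}
sales = {"Show A": 9500, "Show B": 3200, "Show C": 5100, "Show D": 800}
print(top_n_accuracy(viewers, sales, n=2))  # 0.5 on this made-up sample
```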

Anyway, here’s how the various methods performed on the Fall data (detailed data here, breakdown doc here). Numbers for the ones that beat the .66 accuracy afforded by the best null hypothesis are shown in green:

[Table: Nico stream ranking over/under accuracy]

Total number of viewers arguably performed the best, besting the null hypothesis outright in October and December. Max rating % did very well when only taking the top 5, but fairly poorly otherwise (the top 15 models barely beat out a coin flip model). Max rating number was fairly spotty, except as a broad-strokes indicator in October and a top-tier indicator in December.

This is data taken from a fairly small sample, but these numbers do suggest that the top tier of the rankings and the total number of stream viewers are at least better than random chance at picking which shows will succeed, something that couldn't be said for most Torne DVR lists. How impressive that is depends on how well other indicators do and how well it holds up as the sample size grows.

Fun With Numbers: Torne DVR Rankings Predicting Fall 2013 Disk Over/Unders

Note: The original version of this article incorrectly stated that there were 39 shows in the Autumn list, not 38. This has been corrected.

This is the first proper article in what will hopefully be a series of individual examinations of how well various indicators can roughly predict disk sales. This segment deals with Torne rankings, a list ranking shows by how often they were recorded on the PS3's DVR. The original data can be found here, and is collected on this doc. I'm comparing these ranks with the rank of v1 disk sales to see how predictive the former were of the latter, based on an over/under-4000 accuracy model.

See this post for details, but basically, one can get 66% accuracy just by predicting that every v1 will sell under 4000 disks (25 of the 38 shows in the season were under). Hopefully, good models will do better than that. In order to test how well each metric performs, I take the top 5, 10, and 15 seasonal shows from the metric, guess they will sell over 4000 while everything else will flop, then see how many of those "guesses" end up being right. Clerical note: I'm excluding shows from different seasons (e.g. Monogatari, HxH) that end up in the Torne rankings, so my top 15 shows are the top 15 ranked shows from Fall 2013 only.

Since Torne rankings are available monthly, I also compared the accuracy of different months; one might expect later months to be closer to the final sales total, as people have had more time and information (read: episodes of the show) with which to make an informed decision. Assuming the metrics are indicative at all, of course.

Anyway, here’s how accurate the top 5/10/15 guess models were for October/November/December Torne Rankings (numbers outperforming the null hypothesis in green):

[Table: Torne ranking over/under accuracy]

The top 15 model doesn't seem to be particularly precise in general, but accuracy improved each month, and the December rankings were actually notably accurate. We'll see how these indicators stack up against others (and how well they compare with source boosts) later.

 

Fun With Numbers: Null Hypotheses Predicting Fall 2013 Anime Sales

Note: The original version of this article incorrectly stated that there were 39 shows in the Autumn list, not 38. This has been corrected.

There's a lot of gray area that can surround the performance of shows early in a season, before hard Stalker numbers and their impressive prediction model roll in and make prior predictions fairly irrelevant. I'm interested in what some of these early, less-rigorous numbers can mean, both for disk sales and lower-cost indicators, and I'm slowly putting together a list of them for Fall 2013 to test against how the first volumes of disks actually sold, along with how big a sales boost related print media got.* This post deals with very little of that; it just outlines three dummy models against which all other indicators will be judged when I get around to it.

The first criterion for a bit of data having any real meaning is its ability to do better than random or very dumb guessing, i.e. to outperform what is traditionally known as the null hypothesis. Let's say you knew roughly what the anime market looked like in 2013, and were trying to build a model to guess how the 38 shows on the list linked above would do. Let's say you're just interested in whether a show will sell more or less than 4000 copies, and you judge the success of your model by how well it pegs which shows go over and which shows go under. For reference, 13 of the 38 had their first volume sell over 4000, so the "actual" over odds are about ~34%.**
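As an illustration of what such dummy baselines look like, here's a minimal sketch computed from the figures just quoted. The all-under guess is the ~66% baseline cited in the other predictor posts; the other two are my own illustrative additions and aren't necessarily the same three models this post goes on to outline:

```python
n_shows, n_over = 38, 13     # Fall 2013 v1s; 13 sold over 4,000 disks
p_over = n_over / n_shows    # ~0.34

# Dummy baseline: guess "under 4,000" for every show.
all_under_accuracy = (n_shows - n_over) / n_shows   # ~0.66

# Dummy baseline: guess "over 4,000" for every show.
all_over_accuracy = p_over                          # ~0.34

# Illustrative extra: random guesses that say "over" 34% of the time.
random_accuracy = p_over**2 + (1 - p_over)**2       # ~0.55

print(all_under_accuracy, all_over_accuracy, random_accuracy)
```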


Fun With Numbers: The West-Side All Stars

Something I stumbled onto a while ago that made me curious was a seemingly non-trivial connection between myanimelist popularity and the sales boosts of both novels and manga attached to a given anime, which led me to some speculation as to how different Western and Japanese fanbases' preferences really are.

This time, I'm taking a look at a similar question: how many shows with high levels of Western popularity truly bomb in Japan? To answer this, I took the TV shows among the top 200 most popular on myanimelist and excluded the ones attached to any series that averaged over 4000 in disk sales, or had a novel or manga volume chart at 20,000 copies or more in its first two release weeks. What remains is, theoretically, a list of the series which failed to catch on in Japan despite catching on in the West.* Data via myanimelist, someanithing, and the Japanese BD/DVD sales wiki. Note that I count box releases as part of the disk average for pre-millennial series.
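Here's a minimal sketch of that filtering step, assuming a flat table of per-show figures. The show entries, field layout, and variable names are hypothetical placeholders rather than the actual dataset:

```python
# Placeholder records: (title, MAL popularity rank, average disks per volume,
# best first-two-week chart figure for an attached novel or manga volume).
shows = [
    ("Show A", 35, 12000, 150000),
    ("Show B", 80, 1500, 8000),
    ("Show C", 150, 900, 45000),
]

def bombed_in_japan(avg_disks, best_print_chart):
    """Keep a show only if it cleared neither the disk nor the print threshold."""
    return avg_disks <= 4000 and best_print_chart < 20000

west_side_all_stars = [
    title for title, mal_rank, disks, print_chart in shows
    if mal_rank <= 200 and bombed_in_japan(disks, print_chart)
]
print(west_side_all_stars)  # ['Show B'] in this made-up example
```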
