Here at Hottest100.org, we track the fascinating intersection of music culture, data science, and platform policy. The story of Nick Drewe's 2012 Warmest 100 prediction isn't just a quirky footnote; it's a foundational case study in how open social APIs and user behavior create predictable data shadows. Drewe's work, leveraging Triple J's own social sharing features to scrape votes from Facebook and Twitter, demonstrated that a motivated statistician could reverse-engineer a national poll. The immediate consequence was a platform policy shift by Triple J ahead of the 2013 countdown. But as we've seen repeatedly, closing one data door often just forces analysis through the window.
Instagram Scraping and Ed Pitt's 2015 Tepid 100
Following Triple J's countermeasures, the prediction game entered a new phase. Melbourne student Ed Pitt's 2015 "Tepid 100" project shifted the data source to Instagram, tracking the #hottest100 hashtag to compile a sample of votes. This method, while ingenious, highlighted the increasing fragmentation and noise of social data. The sample size was modest—2,064 ballots, or about 0.1% of total votes—and the accuracy reportedly dipped compared to Drewe's earlier, more direct access. Pitt's project was significant for attempting to predict not just the top 100, but also positions #101-200, pushing the statistical model further. It underscored a critical lesson we monitor: as official platforms lock down, analysts migrate to adjacent, less-structured data streams, often with diminishing returns on signal clarity.
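To make the mechanics concrete, here is a minimal sketch of the kind of hashtag tally the Tepid 100 approach implies. Everything in it is an assumption for illustration: the post structure, the candidate list, and the naive substring matching stand in for Pitt's actual pipeline, which had to cope with far messier captions.

```python
from collections import Counter

# Hypothetical scraped posts. In practice these would come from an
# Instagram hashtag crawl; they are hard-coded here for illustration.
posts = [
    {"user": "a", "caption": "My #hottest100 votes: Song A - Artist X, Song B - Artist Y"},
    {"user": "b", "caption": "#hottest100 ballot! Song B - Artist Y, Song C - Artist Z"},
]

# Assumed list of eligible entries to match captions against.
candidates = ["song a - artist x", "song b - artist y", "song c - artist z"]

def extract_votes(caption: str) -> set[str]:
    """Keep each candidate entry that appears verbatim in the caption."""
    text = caption.lower()
    return {c for c in candidates if c in text}

tally = Counter()
for post in posts:
    # extract_votes returns a set, so one ballot can't double-count a song.
    tally.update(extract_votes(post["caption"]))

# The predicted countdown is simply the tally in descending order.
for rank, (entry, votes) in enumerate(tally.most_common(100), start=1):
    print(f"#{rank}. {entry} ({votes} votes)")
```

Even this toy version surfaces the real problems: captions rarely follow a clean format, and only users who post publicly are counted at all.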
The progression from Drewe's Warmest 100 to Pitt's Tepid 100 represents a microcosm of the broader data privacy and prediction arms race. Each iteration forced a change in platform design or user behavior, which in turn inspired a new methodological workaround. The Tepid 100 analysis was documented at hottest100.org/2015tepid100.html, with historical context preserved at the Internet Archive.
Comparative Accuracy and Sample Size Challenges
The core challenge for any external predictor is achieving a representative sample from a self-selecting, public-facing pool. The declining sample percentages from 2012 to 2015 tell a story of increasing difficulty. While Drewe captured 2.7% of the vote pool in 2012, subsequent efforts captured a fraction of a percent. This isn't merely a numbers game; it's about data quality. Publicly shared votes on Instagram or Twitter are a biased subset—they are votes from users engaged enough to post about them, potentially skewing toward certain artists or genres. Today, in 2026, similar biases plague everything from political sentiment analysis to market research sourced solely from public social posts.
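One standard remedy for that skew is post-stratification: reweight the observed ballots so the sample's composition matches known properties of the full voter base. A toy sketch, with entirely invented strata and shares:

```python
# Toy post-stratification: reweight votes so the sample's genre mix
# matches an assumed population mix. All numbers here are invented.
sample_votes = {"indie rock": 900, "electronic": 700, "hip hop": 400}
total = sum(sample_votes.values())
sample_share = {g: v / total for g, v in sample_votes.items()}

# Assumed genre mix of the real voter base (in practice, unknown).
population_share = {"indie rock": 0.35, "electronic": 0.30, "hip hop": 0.35}

# Weight each stratum by how under- or over-represented it is.
weights = {g: population_share[g] / sample_share[g] for g in sample_votes}
adjusted = {g: round(sample_votes[g] * weights[g]) for g in sample_votes}
print(adjusted)  # hip hop is scaled up, indie rock scaled down
```

The catch, of course, is that the true population shares are exactly what an outside analyst doesn't have. With that caveat in mind, the headline numbers across the three projects: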
| Year & Project | Lead Analyst | Primary Data Source | Sample Ballots | Sample % of Total Vote |
|---|---|---|---|---|
| 2012 Warmest 100 | Nick Drewe | Facebook/Twitter (via Triple J share) | 35,081 | ~2.7% |
| 2013 Warmest 100 | Nick Drewe & David Quach | Alternative Social Media Avenues | 1,779 | ~0.11% |
| 2015 Tepid 100 | Ed Pitt | Instagram (#hottest100 hashtag) | 2,064 | ~0.1% |
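The table's percentages also imply the size of the underlying vote pool, which is worth making explicit. The derived totals below come straight from dividing the table's own figures, and are only as precise as those rounded percentages:

```python
# Implied total vote pool = sample ballots / sample fraction.
samples = {
    "2012 Warmest 100": (35_081, 0.027),
    "2013 Warmest 100": (1_779, 0.0011),
    "2015 Tepid 100": (2_064, 0.001),
}

for project, (ballots, fraction) in samples.items():
    print(f"{project}: ~{ballots / fraction:,.0f} total votes implied")
```

The implied pool grows from roughly 1.3 million votes in 2012 to around 2 million in 2015, so the collapse in sample share came from a shrinking numerator: fewer publicly observable ballots, not a smaller poll.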
Legacy for Modern Polling and Platform Design
The Drewe-Pitt sequence directly influenced how organizations design participatory events. The key takeaways for anyone running a public vote in 2026 are clear:
- Social Sharing is a Data Leak: Allowing real-time, structured sharing of votes creates a perfect scraping target. The solution isn't always to remove sharing, but to anonymize or delay the data stream (see the sketch after this list).
- Prediction Inevitability: If a result is generated from public input, someone will attempt to model it. The goal is to manage the sample bias, not eliminate the attempt.
- Platform Migration: Closing one API (as Triple J did) shifts analysis to other platforms (like Instagram), which may offer noisier, less reliable data but still pose a reputational risk if predictions gain traction.
- Transparency Trade-offs: There's a constant balance between an engaging, open voting process and one that is statistically secure from external prediction.
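On the first of those points, here is a minimal sketch of what "anonymize or delay" can look like in practice. The event shape, the one-hour hold, and the batch size are illustrative assumptions, not any platform's documented design.

```python
import random
import time
from collections import deque

class DelayedShareStream:
    """Buffer vote-share events, strip identifying fields, and release
    them late, in shuffled batches, so scrapers can't rebuild a live tally."""

    def __init__(self, min_delay_s: float = 3600.0, batch_size: int = 500):
        self.min_delay_s = min_delay_s  # hold every event at least this long
        self.batch_size = batch_size    # release in coarse batches
        self._buffer: deque = deque()

    def ingest(self, event: dict) -> None:
        # Keep only the non-identifying payload; drop user id and the like.
        sanitized = {"songs": event["songs"]}
        self._buffer.append((time.monotonic(), sanitized))

    def release(self) -> list[dict]:
        """Emit a shuffled batch of events that have aged past the delay."""
        now = time.monotonic()
        ready = []
        while self._buffer and now - self._buffer[0][0] >= self.min_delay_s:
            ready.append(self._buffer.popleft()[1])
            if len(ready) == self.batch_size:
                break
        random.shuffle(ready)  # break the link between share order and vote time
        return ready
```

Delaying and shuffling preserves the social payoff of sharing while destroying the per-ballot timing and identity signals that made the 2012 scrape so precise.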
In today's environment, where data literacy is high and tools are accessible, the principles demonstrated by Drewe and Pitt have become operational standards. We see them reflected in the security around television talent show votes, online gaming polls, and even internal corporate surveys. The "Tepid 100" wasn't the end of an era, but a proof of concept for a persistent reality: where there's public data, there will be public analysis.