Facebook uses hashtags for weakly supervised learning - knocks 2% off error rate for ImageNet


More proof of the power of huge data: Facebook pretrained on 1 billion images, using hashtag prediction as a weakly supervised task, and then fine-tuned on ImageNet.

They got 85.4% top-1 accuracy, more than 2 percentage points better than the previous state of the art.
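For anyone wondering what "hashtag prediction as a training task" could look like in practice, here is a minimal sketch. It is not Facebook's code: the vocabulary, the function names `hashtag_targets` and `softmax_cross_entropy`, and the example logits are all made up for illustration. It assumes each image's hashtags are turned into a soft target that spreads probability evenly over the tags found in a fixed vocabulary, and that the network is trained with softmax cross-entropy against that target.

```python
import math

def hashtag_targets(hashtags, vocab):
    """Turn one image's hashtags into a soft target over the vocabulary.

    Tags outside the vocabulary are ignored; the remaining k distinct tags
    each get weight 1/k (an assumption made for this sketch).
    """
    index = {tag: i for i, tag in enumerate(vocab)}
    present = sorted({index[t] for t in hashtags if t in index})
    target = [0.0] * len(vocab)
    for i in present:
        target[i] = 1.0 / len(present)
    return target

def softmax_cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and a soft target distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, target))

# Hypothetical vocabulary and a noisy user-supplied tag list.
vocab = ["#dog", "#beach", "#sunset", "#food"]
target = hashtag_targets(["#dog", "#beach", "#nofilter"], vocab)

# Stand-in for the network's output on this image.
loss = softmax_cross_entropy([2.0, 1.0, 0.1, -1.0], target)
print(target, loss)
```

After pretraining on this kind of loss, the fine-tuning step would swap the hashtag output layer for one sized to the ImageNet classes and continue training on the ImageNet labels.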


This just goes to show how important labeled data is, even when the labels aren’t 100% accurate or specific. That is why the big service providers have such an advantage right now. Access to almost unlimited processing power certainly counts for something, but I think it’s not nearly as important a differentiator as access to mountains of data and, more importantly, mountains of user-labeled data. So I guess the answer is to either develop competitive unsupervised approaches or crowdsource the data labeling task. :slight_smile: