More Twitter weirdness
Twitter commonly has the problem of duplicate tweets. That is, if you post a message, it shows up twice. After a little while, the dupe disappears, but if you delete the dupe manually, the original is gone too.
I presume what’s going on is that tweets are cached, then eventually batched to disk, and they don’t always get deleted from the cache until some time after they’re persisted. If you happen to check the page of your recent tweets in between, boom, you get two hits. But what I don’t understand is why the two versions have different timestamps.
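To make that guess concrete, here is a minimal Python sketch of the kind of setup I'm imagining. To be clear, this is a toy model of my speculation, not Twitter's actual code; the `Timeline` class and its `post`, `flush`, `evict`, `recent`, and `delete` methods are all names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Timeline:
    """Toy model (hypothetical): a write cache in front of a batched store."""
    cache: dict = field(default_factory=dict)  # tweet_id -> text, not yet evicted
    store: dict = field(default_factory=dict)  # tweet_id -> text, "on disk"
    next_id: int = 1

    def post(self, text):
        # New tweets land in the cache first.
        tweet_id = self.next_id
        self.next_id += 1
        self.cache[tweet_id] = text
        return tweet_id

    def flush(self):
        # Batch write: copies cached tweets to the store but leaves them cached.
        self.store.update(self.cache)

    def evict(self):
        # Runs some time after flush: drops cached entries already persisted.
        for tweet_id in list(self.cache):
            if tweet_id in self.store:
                del self.cache[tweet_id]

    def recent(self):
        # Naive read path: concatenates both sources without de-duplicating.
        return list(self.cache.items()) + list(self.store.items())

    def delete(self, tweet_id):
        # Both copies share one id, so deleting either removes the tweet everywhere.
        self.cache.pop(tweet_id, None)
        self.store.pop(tweet_id, None)


timeline = Timeline()
tweet_id = timeline.post("hello")
timeline.flush()
print(len(timeline.recent()))  # 2: the window between flush and evict
timeline.evict()
print(len(timeline.recent()))  # 1: the dupe has disappeared
```

In this model, deleting the "dupe" between `flush` and `evict` takes out the original as well, since both entries are the same tweet under one id. What the sketch does not account for is the two copies carrying different timestamps.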
Presumably, this could be explained at a MySQL User Conference session next month, one of whose topics will be Intelligent caching strategies using a hybrid MemCache / MySQL approach. I’m so glad they don’t use stupid strategies to do this …
Of course, caching weirdness is just one of many reasons Twitter needs to be rearchitected.
Edit: Here’s an interesting write-up of Twitter’s scaling strategies as of April 2007. Twitter wrote its own queueing technology called Starling, and later open-sourced it. Hat tip to @Tricon.
Comments
3 Responses to “More Twitter weirdness”
I can only imagine the extent to which Twitter applies creativity to scaling issues. I missed it last year but am planning to attend this year. Not because I believe in Rails, but because there should be some valuable lessons to be learned from their experience.
Hi Curt! Believe it or not, Twitter’s architecture is being discussed by Blaine Cook, one of its architects, at the MySQL conference:
http://en.oreilly.com/mysql2008/public/schedule/detail/631
Cheers,
Jay
Jay,
You just repeated a link I had in the original post — and the main subject of the post was duplication of information.
😀
Best,
CAM