The guys over at Instagram wrote a nice article about their technology stack. They use a lot of AWS services, from S3 to EC2 and ELB. Instagram is written in Django. PostgreSQL powers their databases, together with Redis for their lists and sessions. When they started sharding their PostgreSQL cluster, they switched from PostgreSQL-based geo search to Apache Solr for their geo API. Gearman runs 200 workers to move data asynchronously through the system and into their streaming data clients.
We thought it would be fun to give a sense of all the systems that power Instagram, at a high-level; you can look forward to more in-depth descriptions of some of these systems in the future. This is how our system has evolved in the just-over-1-year that we’ve been live, and while there are parts we’re always re-working, this is a glimpse of how a startup with a small engineering team can scale to our 14 million+ users in a little over a year.
via Instagram Engineering • What Powers Instagram: Hundreds of Instances, Dozens of Technologies.
No surprises, but it looks like a solid stack. I’ve used most of these tools myself in earlier projects. I was a bit surprised by the size of their instances. I would’ve used more but smaller instances, but apparently IO is not the bottleneck. Food for thought.
“Companies can see pieces [of software] as far more core to their business,” said Recordon. “But our ability to serve PHP faster is not core to our business. But cheaper/faster development tools for other companies is a real competitive advantage.” Hence, logic dictates that because there’s no business loss if Facebook’s infrastructure is open-sourced, then open-sourced it should be.
/via Facebook opens up about open-source software
Very nice quote by David Recordon, creator of OAuth amongst other things and currently Facebook’s open source guru. The rest of the interview is worth reading as well, but this quote is something to think about. Which parts of your infrastructure can you open-source?
There is an old parable about the concept of commitment when it comes to breakfast. The story goes that when looking at a plate of the traditional fare of ham and eggs, it’s obvious that the chicken is an interested party, but the pig is truly committed.
I heard this parable before and it keeps intriguing me. Am I a chicken or a pig?
When I tell this story to entrepreneurs, my point is usually to contrast the approach VCs have to start-ups as compared to entrepreneurs. The VC is an interested party, but at the end of the day, if their start-ups live or die, they typically still have their job, their office and their portfolio of other investments. The entrepreneur, on the other hand, is the pig – truly committed to the outcome, with no fallback.
But lately I’ve been thinking about the parable of the pig and the chicken in the context of the characteristics that make a great entrepreneur – and the kind of entrepreneur that we VCs in general, and my firm Flybridge Capital in particular, like to back. In short, we like to back pigs – entrepreneurs who are truly and completely committed to the outcome of their venture, have a lot of stake, and no fallback.
via Why Venture Capitalists Invest In Pigs, Not Chickens.
So the question is: Are you a Pig or a Chicken?
Do you wait to start your venture until you are fully backed, or do you start right away and see where you end up? Gary Vaynerchuk has said it over and over again: there is plenty of time after office hours to start your company. It’s all about commitment.
Image source: San Diego Shooter
There is a fairly accepted rule that 1% of a site’s users will create content and 10% will interact with it, while 100% consume the content. That means that 90% of your users will probably be logged out. Consuming users can’t become Interacting users without creating a profile. Registering has become far easier with the introduction of OpenID and the OAuth connections offered by Twitter and Facebook, but it is still a big step.
At Mobypicture we are experimenting with flows in which people leave a comment first and are then guided through the login or registration process, after which the comment is placed. This way Consuming users can say what they want to say before getting distracted by login screens.
Fred Wilson, VC and principal of Union Square Ventures, goes a step further and proposes more interaction for logged out users by giving them “phantom profiles”, storing activity against their cookies and building user profiles on logged out users. Read the rest of ‘Don’t forget your logged out users’ »
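To make the idea of storing activity against a cookie a bit more concrete, here is a minimal Python sketch. It is not how Mobypicture or Fred Wilson implement it; the class name PhantomProfiles, its methods and the in-memory dictionaries are all hypothetical, and a real site would use a database and a signed cookie.

```python
import uuid


class PhantomProfiles:
    """Hypothetical in-memory store of activity for logged out visitors."""

    def __init__(self):
        self.activity_by_cookie = {}  # cookie_id -> list of (action, target)
        self.activity_by_user = {}    # user_id -> list of (action, target)

    def new_cookie_id(self):
        # Random identifier that would be set as a cookie in the browser.
        return uuid.uuid4().hex

    def record(self, cookie_id, action, target):
        # Store the activity against the cookie, not against an account.
        self.activity_by_cookie.setdefault(cookie_id, []).append((action, target))

    def claim(self, cookie_id, user_id):
        # After login or registration, move the phantom activity to the account.
        pending = self.activity_by_cookie.pop(cookie_id, [])
        self.activity_by_user.setdefault(user_id, []).extend(pending)
        return pending


store = PhantomProfiles()
visitor = store.new_cookie_id()
store.record(visitor, "comment", "photo-123")
store.record(visitor, "like", "photo-456")
claimed = store.claim(visitor, user_id="alice")
print(claimed)
```

The claim step is what lets a visitor comment first and register afterwards: nothing is lost when the phantom profile turns into a real one.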
I’ve been reading a lot about Continuous Integration and Test-Driven Development lately, to work out the best ways to develop code in an agile way and with as much flexibility for deployment as possible. I wrote a small post earlier on Continuous Integration in PHP about how to build a CI server with Jenkins. In this post I would like to go deeper into the Why of Continuous Integration and Test-Driven Development.
In the early stage of Sugababes.nl we worked live on our production codebase, changing and testing things while users were visiting the website. That caused a lot of trouble and was soon abandoned. Ever since, we (my team at Sugababes first and Mobypicture later) have worked from SVN on development environments: we commit code and try to test it in a staging environment, but most of the time we just deploy. That is, we export parts of the codebase to our live environment. That is not an ideal situation, and it proves to be very prone to bugs that could have been prevented. Read the rest of ‘Things I learned about Deployment, Test Driven Development and Continuous Integration’ »
Backtype has built a powerful system to analyze realtime social data. They give you insights into your social influence on Twitter and Y Combinator’s Hacker News by analyzing tweets. The following graph shows the (not so impressive) stats for my blog:
Because Backtype processes all tweets containing URLs to calculate your influence, they have to process a massive amount of data from the Twitter Firehose. The Firehose can go as fast as 7000 tweets per minute during New Year’s Eve in Tokyo. That’s a massive 117 tweets per second!
The big problem with realtime data is that you cannot easily process it in batches, because the data keeps coming. When you batch this amount of data, you have to process it faster than realtime or you create an ever-growing backlog. During the World Cup final last year, when The Netherlands played against Spain, we (Mobypicture’s MobyNow) had a small flaw in our code that processed the tweets with #ned and #wk2010. In roughly 90 minutes we built up a backlog of 18 hours’ worth of processing. Because people kept using #ned and #wk2010, it was almost impossible to clear the backlog: we had to go many times faster than realtime to remove it. When displaying realtime tweets you don’t want to be more than a couple of seconds behind. With batched processing, clearing the backlog is a fight you have to fight every time you process a batch.
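The backlog problem can be put in numbers with a small Python sketch. It assumes a deliberately simple model with constant arrival and processing rates; the function name drain_time_seconds and all figures below are made up for illustration and are not measurements from MobyNow.

```python
import math


def drain_time_seconds(backlog_items, arrival_rate, processing_rate):
    """Seconds needed to clear a backlog while new items keep arriving.

    Rates are in items per second. Returns math.inf when processing is
    not faster than the incoming stream, because the backlog never shrinks.
    """
    surplus = processing_rate - arrival_rate
    if surplus <= 0:
        return math.inf
    return backlog_items / surplus


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
arrival = 100.0    # tweets arriving per second
backlog = 360_000  # tweets waiting, i.e. one hour of incoming tweets

for speedup in (1.0, 1.5, 2.0, 5.0):
    seconds = drain_time_seconds(backlog, arrival, arrival * speedup)
    print(f"{speedup}x realtime: {seconds / 3600:.2f} hours to catch up")
```

In this model, processing at exactly realtime speed never catches up, and even at twice realtime speed a backlog worth one hour of tweets takes another full hour to clear, because only the surplus capacity eats into the backlog.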
Backtype also recognized that batched processing wasn’t the way to go for their analytics. So they recently developed a new system for realtime processing, called Storm, to replace their old system of queues and workers: Read the rest of ‘Storm: The Hadoop of Realtime Processing’ »
Yesterday was the last MobileMonday Amsterdam, an inspiring bi-monthly event with great speakers and inspiring visitors. I had a lengthy talk with Martijn Rijntjes about entrepreneurship and he introduced me to something I had never heard of: The Lifestyle Entrepreneur.
Startups are the things all the cool kids do, Martijn told me, but he never felt it was the right fit for him. Martijn likes to travel and see the world, not work his ass off and give up his social life, all in favor of his internet startup. He almost felt guilty that he did not have the same dreams and goals a lot of his friends have. Until he read an article by Corbett Barr about Startup vs Lifestyle Business.
A startup business has the primary responsibility to grow as big and successful as possible, whatever the impact on the lives of the entrepreneurs. Founders of a startup are competing for success, fame and glory, although some of them, including me, are also in it because they really like the process. After hitting the jackpot, a lot of them jump back into the startup life to repeat their success. Once you’re lucky, twice you’re good, Sarah Lacy wrote, and I believe in this statement. Running a startup is a lifestyle. Read the rest of ‘Introducing the lifestyle entrepreneur’ »
When you need to display an unknown amount of text in a constrained space you may need to somehow hide text that doesn’t fit. One way is to use overflow:hidden to quite brutally hide it. Doing this works, and it works cross-browser, but it can be difficult for the user to realise that text has been hidden since there is no visual indication of it. A property from CSS3 that can help improve the situation is text-overflow.
via Clipping text with CSS3 text-overflow by 456 Berea Street
text-overflow is actually a pretty awesome CSS3 property. It does exactly what we do in programming languages all the time when a title can be too long for its space and needs a ‘…’ after a certain number of characters. But text-overflow does it in CSS. It works in Safari, Chrome, Opera, and even in Internet Explorer 7+, but it doesn’t work in Firefox until Firefox 6. (Firefox 6? Yes, Firefox 6 will be released sometime in August.) No worries, Firefox versions before 6 will respect the overflow: hidden property and will just hide the rest of the text. Read the rest of ‘Clipping text with ‘…’ using CSS3 text-overflow’ »
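For comparison, this is roughly what the programming-language version of that trick looks like. It is a minimal Python sketch; the function name truncate_title and its parameters are hypothetical. Note that it cuts at a fixed number of characters, while text-overflow clips at the rendered width of the box, which is why the CSS approach fits the layout better.

```python
def truncate_title(title, max_chars, ellipsis="…"):
    """Shorten title to at most max_chars characters, ending in an ellipsis if cut."""
    if max_chars < len(ellipsis):
        raise ValueError("max_chars must leave room for the ellipsis")
    if len(title) <= max_chars:
        return title
    keep = max_chars - len(ellipsis)
    # Drop trailing whitespace so the ellipsis sits right after the last word.
    return title[:keep].rstrip() + ellipsis


print(truncate_title("Clipping text with CSS3 text-overflow", 20))
```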