Thursday, November 13, 2008

OpenSQL Camp Starts Tomorrow!

So the good news is that the inaugural OpenSQL Camp is going to be an awesome event with mouth watering sessions by noted experts. The bad news (for me) is that I won't be attending it, which makes me sad. I cannot leave my town because my wife can go into labor anytime now.

The session list looks great! Congratulations and thanks to Baron, Sheeri, Ronald, all the sponsors and contributors for organizing the first OpenSQL Camp.

I will be watching PlanetMySQL closely for juicy blog posts. Hopefully, the one and only Sheeri is taking her camcorder!

Monday, November 10, 2008

Stack Overflow: Q&A Site

Today I discovered Stack Overflow, a collaborative site that focuses on technical questions. You can ask questions related to any language, apparently without having to register. The site is currently in beta. There are also a few MySQL questions that are currently unanswered.

Sunday, November 09, 2008

Scalability As A Functional Or Non Functional Requirement

I am currently tasked with writing a Software Requirements Specification (SRS) document for a project. Effective sharding (based on specific criteria) and scalability are key requirements of the project.

Scalability is traditionally classified as a non-functional requirement. My question to the community is: if scalability is crucial to a project, would it still be classified as a non-functional requirement? Are there cases when scalability requirements would be best classified as functional requirements?

Friday, October 31, 2008

Open Source Pony Tail

Sorry for not updating this blog regularly. My wife's due date is soon so I've been busy.

Anyway, I wanted to share this very funny interview with Jonathan Schwartz (puppet):

Thursday, October 02, 2008

Startup Scalability Strategies @ Startonomics

Tomorrow morning I am presenting a session, Startup Scalability Strategies, at Startonomics, a conference being organized by Dave McClure and Deal Maker Media. The sessions will be streamed live using UStream. Check the Startonomics website at http://startonomics.com for more details.

Also check out my guest blog post titled How Important Is Scalability, written for the Startonomics blog.

Sunday, August 24, 2008

Notes from Structure 08, Velocity and Graphing Social Patterns East

I attended several events in June of this year including Graphing Social Patterns East, Velocity and Structure 08. At each of these events, I tried to take some notes and posted them to my personal blog. I received a few pings from readers of this blog to point them to a list of these posts. It took some time but here is the list of my notes. In some cases, I have linked directly to the presentation files.

High-performance Ajax Applications: Julien Lecomte (Yahoo!) talked about how to effectively use AJAX in your applications without compromising performance.
Slideshare: High performance Ajax Applications

Stress, Load and Performance Testing in Quality Assurance: Excellent tips on stress and performance testing by Goranka Bjedov of Google.

Actionable Logging for Smoother Operation and Faster Recovery: Mandi Walls from AOL talked about logging in general including actionable logging, why it's important, logging goals, log file management, things to avoid in logs and more. (Presentation slides)

Clouds are No Substitute for Competence: Presented by Javier Soltero of Hyperic

Energy Efficient Operations: Some Challenges and Opportunities: Luiz Barroso from Google presented this very interesting and informative session about making operations energy efficient.

Innovation That Drives Opportunity for the Web Infrastructure: John Fowler (Sun Microsystems) was the speaker at this talk about Web 2.0 architectures. (Presentation slides)

Importance of Operations and Performance: Artur Bergman of Wikia talked about lessons learned while running 7000 wikis.

Jiffy: Real World Performance Measurement: In this session Scott Ruthfield talks about Jiffy, an open source tool for performance measurement and instrumentation. (Presentation slides)

KITE: Keynote Internet Testing Environment Launch: KITE was one of the interesting products launched at Velocity. KITE allows you to test from desktop to the Internet cloud. At the time of launch KITE was free. Don't know the current pricing model. (Presentation slides)

Harnessing Explosive Growth: Infrastructure Strategies and Tactics: Panelists including Sandy Jen, Akash Garg, Jeremiah Robinson, Jonathan Heiliger and James Barrese discussed strategies and tactics for handling explosive growth.

The Race to the Next Database: Overclocking and Analytics Augment Your Data Layer: At Structure 08, panelists on this session included Mayank Bawa (Aster Data Systems), Doug Judd (Zvents), Luke Lonergan (Greenplum), Damian Black (SQLstream), Dave Schrader (Teradata) and Scott Wiener (Cloud9Analytics). Each panelist provided insight into the ground breaking work their company is doing in solving data processing and handling BI challenges faced by consumers today.

Working the Clouds: NextGen Infrastructure for New Entrepreneurs: This panel on cloud computing featured panelists including Geva Perry (GigaSpaces), Jason Hoffman (Joyent), Tony Lucas (XCalibre), Lew Moorman (Rackspace), Christophe Bisciglia (Google / AppEngine) and Joe Weinman (AT&T). Christophe got grilled heavily by other panelists but he handled it pretty darn well.

Werner Vogels: Keynote at Structure 08: Dr. Vogels' keynote was one of the highlights of Structure 08. He presented a case study of Animoto and talked about the 70/30 switch among other things.

The Platform Revolution: A Look into disruptive technologies: Jonathan Yarmis (VP of Disruptive Technologies, AMR Research) talked about technology trends, social networking, mobility, mobile, cloud computing, stream computing, business models, user 2.0 and the new enterprise reality.

Green Data Centers: Bill Coleman (Cassatt Corporation) presented this session. Bill is known for being responsible for the B in BEA. (Presentation slides)

Creating Bebo Applications: Bebo is now part of OpenSocial and this presentation presented at Graphing Social Patterns talks about how to create applications for Bebo.

Open Social and Google App Engine: Patrick Chanezon (API Evangelist) and Paul McDonald (Product Manager for Google App Engine) presented a technical overview of OpenSocial and Google App Engine at Graphing Social Patterns East.

OpenSocial: Open for Business: In this session, panelists were Patrick Chanezon (Google), Paul Lindner (hi5), Max Newbould (MySpace) and Sachin Rekhi (imeem). (Also see)

Viral Marketing and Advertising Strategies for social networks: One of the best sessions at the Graphing Social Patterns conference presented by Kevin Barenblat and Jeff Ragovin. (Also see)

Mobile Social Networks: A Comparison: Benjamin Joffe's excellent, eye-opening session for anyone interested in using mobile platforms for creating social networking solutions.

Top 5 Things that fail and win on social networks: Dave McClure, chair of Graphing Social Patterns, presented this concise but very effective presentation on what fails and what wins on social networks.

Geek Metrics: Using App Analytics to Drive Distribution, Engagement, & Monetization: Dave McClure (500 Hats) moderated this panel which included Hiten Shah (CrazyEgg / KISSmetrics), Ian Swanson (Sometrics, Inc.), Albert Lai (Kontagent) and Roy Pereira (Refresh)

Social + Mobile = Sociable (Social Networks for SMS, IM & Mobile Devices): Panelists in this session included Benjamin Joffe, Ben Keighran, Gregory Cypes, Craig Dalton and Chris Butler.

Widget Strategies & Social Platforms: Hooman Radfar, CEO of Clearspring Technologies discussed the new role of widgets and how to go about creating them.

Facebook Business and Marketing Solutions: Kent Schoen talked about how to use Facebook for business and marketing.

Developing and Promoting Social Network Applications: Rules of thumb: What does FACEBOOK mean when it comes to creating and promoting applications for social networks?

Social Networks for Business and Marketing Managers: Ro Choy of Rock You! gave an overview of social networks for business managers.

Scaling MySQL - powered Web Sites by Sharding and Replication: Slides from Peter Zaitsev's session at Velocity. (Presentation slides)

Capacity Management: John Allspaw's signature presentation on capacity management. John also has a book coming out on this topic. (Presentation slides)

Structure 08 on demand: Watch the Structure 08 conference on demand at Mogulus.

LinkedIn Communication Architecture: Slides about LinkedIn's platform built in Java. (Presentation slides)

SOX Compliance: A presentation by Skye Rogers. I missed this presentation but then caught up with Skye at dinner. I wish Skye had been given more time to discuss SOX Compliance. (Presentation slides)

There were several sessions I didn't get to go to which is a sad thing. You may want to check the conference websites directly (linked at top of this post) to see if there are presentation slides available. Also, if you took notes at these sessions, please feel free to drop the links as comments to this post.

Sunday, July 20, 2008

S3 suffers major outage

"Funny how Amazon doesn't use S3 to store any assets for amazon.com" -- tweet by @gruber


Amazon's S3 suffered a major outage today knocking many websites offline. S3 outage started at approximately 12:00 PM EST and the last time I checked at 11:11PM EST, Smugmug, a popular photo hosting site that extensively uses S3, was still down.

- S3 down for more than 7 hours
- S3 outage, 7 hours and counting
- S3 down again
- Amazon failure downs Web 2.0 sites
- Amazon's S3 experiencing outage

Web Developer / Graphic Designer Job Openings

Currently, there are several great opportunities with exciting companies available in the New York area. If you're a rock star Java/PHP/Ruby developer or a pixel-obsessed designer, contact me at your earliest convenience.

Web Developer:

Give Real is a well-funded startup in the midst of an exciting period of growth and success. Our technology uses a patent pending platform that combines the ubiquity of credit card transactions and the power of social networks to create a new gifting experience.

Our primary platform is Rails, but there are programming challenges that range from SOAP APIs to Facebook application development. We are searching for full-time developers with expertise and broad experience in:

* Ruby on Rails (we also use rSpec, Starling, Memcache)
* MySQL
* xHTML & CSS, and comfort with Javascript
* Team development with tools like Git & Trac

In addition, we are also interested in candidates who have:

* Expert Javascript skills
* Java & SOAP experience
* Experience scaling with Rails, or any other web platform
* Comprehensive Linux knowledge
* UI and graphic design backgrounds

We are willing to pay top-notch developers very competitively (plus the possibility of options) to join our team and help write code that will be used by hundreds of thousands of users within a few months. We are ideally located in downtown Manhattan, less than a minute's walk from the BDFV and NRQW lines.

Also, if you know someone who may be a good fit for us (developer or graphic designer), we are offering a $1000 referral reward for anyone we hire.

Please contact us at jobs@givereal.com

Graphic Design:

Give Real is a well-funded startup in the midst of an exciting period of growth and success. Our technology uses a patent pending platform that combines the ubiquity of credit card transactions and the power of social networks to create a new gifting experience.

We're searching for full-time designers with experience in:

* Design for advertisements
* Design for consumer focused websites & applications
* xHTML & CSS coding
* HTML & design for emails
* Working on top of an MVC or template system (we use Rails)

In addition, we are also interested in candidates who have:

* Team development with tools like Git & Trac
* Comfort with Javascript programming
* Rails programming experience

We are willing to pay top-notch developers very competitively (plus the possibility of options) to join our team and help design the look and feel of a service that will be used by hundreds of thousands of users within a few months.

Also, if you know someone who may be a good fit for us (RoR developer or graphic designer), we are offering a $1000 referral reward for anyone we hire.

Please contact us at jobs@givereal.com

Sunday, July 13, 2008

Please Help Save Ivan (Needs a Bone Marrow Transplant)

Please help save Ivan, son of Andrii Nikitin (MySQL Support Engineer), who needs a bone marrow transplant. Andrii's message is below:

"My family got bad news - doctors said allogenic bone marrow transplantation is the only chance for my son Ivan.

"8 months of heavy and expensive immune suppression brought some positive results so we hoped that recovering is just question of time.

"Ivan is very brave boy - not every human meets so much suffering during whole life, like Ivan already met in his 2,5 years. But long road is still in front of us to get full recover - we are ready to come it through.

"Ukrainian clinics have no technical possibility to do such complex operation, so we need 150-250K EUR for Israel or European or US clinic. The final decision will be made considering amount we able to find. Perhaps my family is able to get ~60% of that by selling the flat where parents leave and some other goods, but we still require external help."

-- Andrii Nikitin, MySQL Engineer


Please remember, every little bit will help the family pay for Ivan's operation! Be as generous as you can.

For donation: Donation can be made through PayPal (via MySQL/Sun website)

Andrii and Ivan, our prayers are with you.

Thursday, July 03, 2008

Memcached for MySQL Webinar: Advanced Use Cases

Today at 1PM EST I am presenting the second part of memcached for MySQL webinar. I was told that the registration numbers look as good as the previous one. This one will be a bit more technical than the previous webinar. Sorry for the late notice but hope you can join!

Thursday, June 26, 2008

Chad Hurley at Startup2Startup Dinner

Tonight, I am attending Startup2Startup Dinner on Dave McClure's invitation (Thanks, Dave!). Chad Hurley, CEO and co-founder of YouTube will be speaking at this invitation only event. I will post more updates on my personal blog or you can follow me on Twitter.

Friday, June 06, 2008

Graphing Social Patterns - East

In just a few minutes, I will be leaving for Graphing Social Patterns East, a conference by O'Reilly. Dave McClure of 500 Hats is the conference chair. I plan to meet old friends and make new ones. It should be a lot of fun. More about Graphing Social Patterns.

Tuesday, June 03, 2008

Goosh: Google Shell for Geeks

Ever wish you could have a browser-based shell for Google? One that was clutter and advertising free? Say hello to Goosh, one of the coolest services to hit the web.



It even recognizes 'clear' :) For now, I am addicted to it.

Sunday, June 01, 2008

Disaster is Inevitable - Must shutdown generators

Disaster is really inevitable. Even with all the redundant power investments, ThePlanet (formerly EV1 and RackShack) had to shut down their backup generators at their H1 data center on the instructions of the fire crew. This happened after a wire short in a faulty transformer led to an explosion that knocked down one of their walls, ultimately bringing 9,000 servers down. Luckily no one was injured.

This just goes to show that just because a data center has redundant power and backup generators, it does not mean that a disaster cannot happen. IIRC, ThePlanet's last disaster was blamed on backup generators not kicking in properly.

While there was no damage to servers, I wonder how many MyISAM repairs need to be triggered once the servers do come back online?

- The Planet Status Update

Saturday, May 31, 2008

Michael Arrington Asks Twitter a Few Tough Questions

Michael Arrington of TechCrunch asks Twitter a few questions. I have only included a sample list below but you should read his blog post for all the questions:
  • Is it true that you only have a single master MySQL server running replication to two slaves, and the architecture doesn’t auto-switch to a hot backup when the master goes down?
  • Do you really have a grand total of three physical database machines that are POWERING ALL OF TWITTER?
  • Is it true that the only way you can keep Twitter alive is to have somebody sit there and watch it constantly, and then manually switch databases over and re-build when one of the slaves fail?

A 'yes' answer to any of these questions by Twitter would be disturbing to say the least. However, it wouldn't be surprising, as companies expect databases to just somehow magically work without creating and supporting a proper architecture. High availability doesn't come cheap and reputation for companies is everything.

I find it amusing that Twitter isn't even looking for a DBA. Maybe that's considered a job for the SA over there :)
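
To make the "auto-switch" question concrete, here is a minimal sketch of the decision an automatic failover has to make: when the master stops responding, promote the healthy replica that has applied the most of the master's changes. This is only an illustration, not Twitter's setup and not a built-in MySQL feature; the `Replica` class, its `applied_position` field and `choose_new_master` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical view of one replica as seen by a monitoring process."""
    name: str
    healthy: bool
    applied_position: int  # how far this replica has applied the master's changes


def choose_new_master(master_alive: bool, replicas: List[Replica]) -> Optional[Replica]:
    """Return the replica to promote, or None if no failover is needed or possible."""
    if master_alive:
        return None  # nothing to do while the master is up
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        return None  # no healthy replica left: a human has to step in
    # Promote the most up-to-date replica to minimize lost writes.
    return max(candidates, key=lambda r: r.applied_position)


replicas = [
    Replica("db2", healthy=True, applied_position=1040),
    Replica("db3", healthy=True, applied_position=1100),
]
promoted = choose_new_master(master_alive=False, replicas=replicas)
print(promoted.name if promoted else "no failover")
```

A real setup would also have to repoint the remaining replica and the application at the new master, which is exactly the manual work the questions above are complaining about.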

Thursday, May 29, 2008

Memcached Webinar - 560+ registrants

A big thank you to all those who attended the memcached webinar today on which I was a panelist. I was told that there were more than 560 registrants.

The feedback I received directly and indirectly shows that there is a lot of interest in memcached. In the future, I hope to work again with MySQL/Sun on more memcached related webinars.

If you attended the webinar and have some suggestions, comments or questions, please contact me at fmashraqi at yahoo dot com or post a comment on this blog.

Special thanks to Jimmy Guerrero, Monty Taylor, Rich Taylor, Edwin DeSouza and Alex Roedling for their hard work in arranging the webinar. Also thanks to Brian Aker, Matt Ingenthron and Trond Norbye for their assistance at various phases.

In case you missed the webinar:

Wednesday, May 28, 2008

Memcached Webinar - 420 Registrants and Counting!

Regarding my earlier post on memcached webinar, I was informed today that more than 420 registrants have signed up. Space is limited and filling up fast so if you are interested in memcached and haven't registered yet, click on the following link to register now!

Designing and Implementing Scalable Applications with Memcached and MySQL (May 29)

Monday, May 26, 2008

Presenting a Webinar on Memcached Use Cases


Quick link: register for Designing and Implementing Scalable Applications with Memcached and MySQL webinar (May 29)

Ever since its introduction, memcached has been changing the way cost-efficient caching is perceived. Some passionately love it, others cynically hate it.

Today, many large scale web 2.0 properties (including my employer) save millions of dollars by depending on memcached to bring their application response time under control and to offload pressure from databases.

There are several success stories about using memcached to speed up database driven websites. Facebook, for instance, runs the largest memcached installation and the numbers only keep increasing. In May 2007, Facebook was reportedly running 200 dedicated servers with 3TB of memory in their memcached cluster. At the "Scaling MySQL Up or Out" keynote, Facebook revealed they are now using 805 dedicated memcached servers. That's more than four times as many servers (an increase of over 300%) in less than a year!

Twitter, digg, Wikipedia, SourceForge, and even Slashdot depend on memcached to keep their users happy.

For my employer, memcached has been a crucial component of the infrastructure that has been instrumental in handling explosive growth in a cost-efficient manner. In addition, memcached has helped us offload billions of queries from our database.

To highlight several real-life use cases of memcached (see below), I will be presenting a memcached webinar on Thursday, May 29 at 1 PM EST (10 AM PST). Monty Taylor (Senior Consultant, Sun Microsystems) and Jimmy Guerrero (Sr Product Marketing Manager, Sun Microsystems - Database Group) will also be speaking at the event. Space is limited and filling up fast (200+ registrants already) so I highly recommend registering now.

In this webinar, I will be covering several use cases for memcached including (but not limited to):
  • deterministic cache
  • non-deterministic cache
  • proactive cache
  • "state" cache
  • filesystem cache replacement
Hope to "see" you at the webinar.

Note: This memcached webinar is not to be confused with the memcached webinar being presented by Ivan Zoratti on June 28.

Wednesday, May 21, 2008

Interview by Sun TV at MySQL Conference

At the MySQL Conference and Expo, right after my participation in the scaling up or scaling out keynote panel, I talked to Sun's Multimedia team about Sun and MySQL in our environment.

Recently, I found the interview on Sun's Multimedia page. The video of my discussion is embedded below:

Wednesday, May 07, 2008

Interesting Internet Usage and Social Networking Statistics

Over the weekend I took some notes from a presentation and did some research from various sources. The result was a blog post about Internet trends that I posted on my personal blog. There are some very interesting statistics about Internet usage and social networking. Also, Facebook fans will find some interesting facts as well.

Tuesday, May 06, 2008

Solaris 10 User Group Part X

Tomorrow I will be attending the Solaris 10 User Group Part X at the offices of Sun Microsystems, 101 Park Ave., New York, NY. This is an all day event and there is even a MySQL talk by Philip Antoniades. Other presenters include Ambreesh Khanna, Isaac Rozenfeld, Neal Weiss, Sunay Tripathi, Amjad Khan, Damien Farnham and Dave Teszler.

Unfortunately, the event registration is now closed, but if you're attending I look forward to meeting you.

Sun's exciting technologies

It's exciting to see how many technologies Sun is working on.

On May 1, I took a few members of our operations and database team to meet with Vasu Prakash, who is an Engagement Architect with the Global Systems Engineering division of Sun Microsystems. Vasu generously let us pick his brain regarding a wide range of exciting technologies Sun is working on and to see how they may potentially address our needs and challenges.

The following notes are my personal notes expanded with some articles from my bookmark collection.

Thumper
- Thumper (X4500) offers 48TB (SATA HDD) in a 4U chassis at around $1.30/GB, runs Solaris and ZFS, and supports RAID 0, 1, 0+1, 5 and 6 (the latter two via RAID-Z and RAID-Z2). The X4500 supports 16GB of RAM and needs 200-220 V AC for power. For non-Solaris users, other operating systems are supported as well.
- We initially evaluated Thumper as our backup storage solution but then ended up going with Sun StorageTek. I am, however, interested in evaluating it further.
- Robert Milkowski wrote a post benchmarking Thumper and found that he was able to get more than 2GB/s aggregate write throughput using raid-5 volumes! He concludes with "Woooha! It can write more data to disks than most (all?) Intel servers can read or write to memory"
- Jason Hoffman also seems pretty pleased with Thumper
- Jonathan Schwartz's blog post announcing Thumper

ZFS
- ZFS, for those who need an introduction, is a 128-bit transactional file system with self-healing capabilities, useful if you are running into the limitations of 64-bit file systems. Its capacity is 18 billion billion times that of a 64-bit file system.
- ZFS pooled storage can grow and shrink automatically.
- One of the questions I am most often asked is: if ZFS is really as good as it sounds, why hasn't it replaced UFS as the default file system for Solaris? I would love to see a blog post by a Sun insider addressing this question.
- ZFS Best Practices Guide
- ZFS Learning Center
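
The "18 billion billion" figure is simply the ratio between a 128-bit and a 64-bit address space, which a couple of lines of Python can confirm:

```python
# Ratio between a 128-bit and a 64-bit address space.
ratio = 2**128 // 2**64

print(ratio)           # 18446744073709551616
print(f"{ratio:.2e}")  # 1.84e+19, i.e. about 18 billion billion
```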

Solaris Containers
- For a really interesting project, I may need to create a couple hundred zones on a server (no this is not for a production system as we are a Redshift application). I was surprised to learn that more than 8000 zones (8191 non-global zones to be precise) can be created within a single operating system instance. Of course, if you do create a very high number of zones, don't benchmark boot time as it will take a very long time to boot up:)

SAM-FS
SAM-FS is short for Sun StorageTek Storage Archive Manager and it is a very exciting policy-based file system by Sun. According to the Sun website (it is marketing lingo but saves me the hassle):

"SAM software provides data classification, centralized meta-data management, policy based data placement, protection, migration, long-term retention, and recovery to help organizations effectively manage and utilize data according to business requirements. SAM enables users to reduce the cost of storing vast data repositories by providing a powerful, easily managed, cost-effective way to access, retain, and protect business data over its entire lifecycle. This self-protecting file system offers continuous backup and fast recovery features to help enhance productivity and improve resource utilization."

In a nutshell, if I understand correctly, SAM allows you to specify policies and then based on those policies it can move your data around from a fast-but-expensive storage to inexpensive-but-slower storage to give you the most bang for the buck. All data migration and transfer is transparent to the application. MLB is a major user of SAM. There is also an interesting case study on how MLB uses SAM.
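
To illustrate the idea (this is not SAM's actual policy syntax, which I am not reproducing here), a policy can be thought of as a rule that maps a file's attributes to a storage tier. In this sketch the tier names, the 30 and 365 day thresholds, and the `FileInfo` and `choose_tier` names are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    days_since_access: int


def choose_tier(f: FileInfo) -> str:
    """Hypothetical age-based placement policy."""
    if f.days_since_access <= 30:
        return "fast-disk"      # hot data stays on fast, expensive storage
    if f.days_since_access <= 365:
        return "capacity-disk"  # cooler data moves to cheaper disk
    return "tape-archive"       # rarely touched data goes to the cheapest tier


files = [
    FileInfo("/data/a.jpg", 3),
    FileInfo("/data/b.jpg", 90),
    FileInfo("/data/c.jpg", 800),
]
for f in files:
    print(f.path, "->", choose_tier(f))
```

The point of a product like SAM is that the application keeps using the same path while the data moves between tiers behind the scenes.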

QFS:
If NFS is your headache then QFS may provide a solution. QFS provides "nearly raw device access to information and data consolidation for read/write file sharing," according to Sun. My understanding is that using QFS requires a fibre channel to connect application servers to storage (if that's not true, can someone please correct me).

A maximum of 128 systems running QFS can share access to the same data without compromising file integrity. QFS volumes can scale up to 4PB. More QFS features are listed on Sun's site.

The main limitation to note: Mixed architecture (SPARC with x64) metadata servers are not supported for failover purposes. Neither are mixed architecture multi-reader configurations supported.

More Sun technologies I want to write about: Sun Cluster implementations in local (node to node), metro (run a fibre :) ) and global (global load balancer) modes. Sun Cluster requires common storage that should be either direct attached or attached through a SAN switch. I also want to cover failure fencing, memory mirroring and vertical threading in the M4000, Sun's Victoria Falls processors (T5140 and T5240), pNFS and, last but not least, Greenplum (claiming to be the world's best database for BI and built upon PostgreSQL). Hopefully, I will talk about them in future posts.

Saturday, May 03, 2008

New Responsibilities

During my university days when I was working towards a dual degree in Accounting and CIS, I co-founded a small managed hosting company which I ran for four years along with two other co-founders. Then I started a consulting company and eventually moved into online publishing. Things changed and after nearly nine years of being self-employed, I took over the very challenging responsibility of single-handedly managing and scaling the databases of a top 50 site (in 2006). It was definitely not an easy journey and I feel ecstatic to have helped my employer handle 6x growth and rise to being a top 13 site (using the same Alexa algorithm).

While I enjoy working with MySQL, Solaris and technology a lot, I really missed being part of the business side. Those of you who know me outside my database role know how much I crave problem solving related to day-to-day business operations, especially strategic decisions, finance, product architecture, monetization, marketing, advertising and SEO. For me, databases and scalability are a very important part of running a successful business in today's environment and I am so happy to have been a key player for my employer in that area.

In short, I wanted to be more involved in both the business side and the technology side. So I recently accepted a new role with my current employer as Director of Business Operations and Technical Strategy. In addition, I will still be leading and training our database team.

This new role will allow me to get involved with much more than just databases at my job, something I am really looking forward to. Big thanks to my management team for recognizing my skills and giving me a chance to help my company reach new levels.

Sun loses 23% of its market capitalization

Sun missed its earnings and sales estimates and as a result it lost approximately 23% of its market capitalization. Even more disturbing news is the announcement that Sun will be cutting 1500 to 2500 jobs. Eric Day raised his concerns as to whether this job cut will affect MySQL hiring, to which Marten replied and pointed to several open positions within MySQL.

Sun has an array of very interesting and useful technologies under its hood. The amount of care Sun takes for its customers is truly impressive and I hope MySQL will follow in Sun's footsteps. Yesterday, I met with a Sun engagement architect and the amount of interest he showed in the technical challenges my team faces was unmatched. I am already working on a blog post to highlight some of the technologies my team discussed with Sun's representative.

With Sun's stock now down, I think it is an excellent time to buy some JAVA stock, which closed at 12.64. I may actually place a small order myself.

Yahoo! Mail Bug? Emails disappearing upon reaching 65,535 emails in one of the folders

I am very confused.

I subscribe to several email lists including MySQL and Ruby on Rails lists. Generally, I keep my mailbox clean except for a folder in which I was archiving Ruby on Rails messages.

A few days ago I noticed that my Ruby on Rails folder reached 65535 messages. Today, I was looking to reply to an email from Keith Murphy (to which I had previously replied as well) and was surprised to find that the particular message didn't show up in my search. This particular message was sent on April 30 so I started scanning all my emails received on that day.

Surprisingly, I didn't find it even after a careful visual scan. Not only that, I noticed that several of the emails I received in the last 2 weeks were missing. My initial reply to Keith was still sitting in my Sent mail folder. My trash folder also had several emails that I had deleted but not the ones that were missing.

For the life of me I cannot figure out where these emails went. Then suddenly I noticed that the Ruby on Rails folder still has 65535 messages, which is very weird as this is an active list.

I decided to send an email with criteria that would make it land in the Ruby on Rails folder. After 6 hours, the email still hasn't shown up.

With 65,535 being a magic number (one less than 65,536, or 2^16), I believe this is a limit on the size of a Yahoo! folder. Not only that, it seems that once you hit that limit, all sorts of weird things start happening. Like, in my case, random missing emails.
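
The reason 65,535 looks so suspicious is that it is the largest value an unsigned 16-bit integer can hold. The snippet below only demonstrates that arithmetic; whether Yahoo! Mail actually stores the folder count in a 16-bit field is my guess, not something I can confirm.

```python
import struct

MAX_UINT16 = 2**16 - 1
print(MAX_UINT16)  # 65535

# Packing 65535 into an unsigned 16-bit field works...
struct.pack("<H", MAX_UINT16)

# ...but one more message no longer fits.
try:
    struct.pack("<H", MAX_UINT16 + 1)
except struct.error as exc:
    print("overflow:", exc)

# A counter that silently wraps around instead would go back to zero.
print((MAX_UINT16 + 1) & 0xFFFF)  # 0
```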

This is pretty upsetting as I am not sure how many of my emails are missing. As soon as I deleted a few emails to bring the count down to 65,535, new emails from Ruby on Rails list started arriving (although not the one I had sent myself earlier today).

Now, unfortunately, I feel paranoid, not knowing how many important emails I have lost.

So, I have decided to open a new email account, fmashraqi at yahoo, and will be updating my contacts to start sending me email at that address.

My reason for posting this here is to ask whether community members have noticed anything like it. I know 65,535 emails is an insane number of emails but at one point I was interested in archiving the list. With Yahoo! offering unlimited storage, I wonder why this limit isn't documented.

Friday, May 02, 2008

MySQL / Linux swap problem doesn't exist on Solaris 10

Right now there is a discussion on Planet MySQL regarding the MySQL / Linux swap problem. Peter Zaitsev originally brought the problem of MySQL swapping to light. Recently, Dathan Pattishall also wrote about it in his post Linux 64-bit, MySQL, Swap and Memory. Don MacAskill followed up with his post, MySQL and the Linux Swap problem, and an interesting way to get around the issue: "make swap partitions out of RAM disks." Don also points to another article by Kevin regarding using O_DIRECT to fix the swap issue.

To get to the point, some time ago I experienced a similar issue on a few of my old V210 servers running Solaris 10 and UFS with plenty of memory available. My initial thinking was that I was experiencing the same issue, so during my presentation at the MySQL Conference, Optimizing MySQL and InnoDB on Solaris 10, I pointed out that this *may* exist in Solaris 10. Luckily a Sun representative (I believe it was Matt Ingenthron) corrected me towards the end of my session and pointed out that UFS and the Solaris 10 kernel have features built to avoid just that. That confirmation from a Sun representative was authoritative. We have already decommissioned the affected servers from production so it may be some time before I can find the precise reason why we experienced the swapping issue. Note that I haven't seen this issue on any of our other V210s, V440s and T5120s in production.

Wednesday, April 30, 2008

Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community - Video

The video of one of my three sessions, "Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community", presented at MySQL Conference & Expo 2008 has been uploaded by Sheeri. I am very thankful to her for doing all the hard work and making it available.

There are a few slides that were edited out of the video for reasons beyond my control. However, you should still be able to enjoy most of the video.

There is one point related to this video that I would like to make: based on my particular experience, I was led to believe that the Solaris 10 kernel had the same issue as the Linux kernel related to swappiness and swapping, where the kernel starts putting more importance on maintaining the file system cache than on the mysqld process. However, towards the end of the session, it was pointed out by a Sun engineer (thanks!) that there must be something else going on, as UFS on Solaris 10 shouldn't exhibit this behavior and a process shouldn't be swapped out in favor of maintaining the file system cache. I am having this issue on 3 of my servers and I am currently working with Sun engineers to get to the bottom of the issue.

Velocity Conference -- Web Performance and Operations Conference

I just made my reservations to attend the Velocity Conference in Burlingame, CA. Velocity is a new two-day conference being organized by O'Reilly. I was happy to learn at lunch today that one of my good friends from CafeMom will also be attending. Over at Facebook I see Don MacAskill has RSVP'd for the event as well.

Jesse Robbins, chair of the Velocity conference, graciously provided a 20% discount coupon as a comment on my blog post.

The early registration is about to end, but I find it really interesting that many slots still mention TBC (to be confirmed). I would have expected the schedule to be fully determined by now. However, I still believe this should be a great conference to attend.

Earlier I wrote about my proposed session being rejected at Velocity Conference which was a big disappointment especially since my presentation was about a top 13 website in the world. Wasn't that the point of this conference to begin with? There are several sessions at this conference that have been presented several times at other conferences including MySQL and a little Google search turns up the slides. So some company's 'secret sauce' is worth repeating and others not? Oh well, no hard feelings. As I said, I still think there would be some interesting sessions.

Let me know if you are planning to attend the conference. I will be flying to SFO on Sunday evening and flying back on Wednesday afternoon.

Sunday, April 27, 2008

Don MacAskill - People I met at MySQL Conference

"The two metrics that are most important to me are first customer satisfaction and second growth." - Don MacAskill

Today, I noticed Don is featured on Sun's customer success stories page:


Don MacAskill is the CEO and Chief Geek of Smugmug, a photo and now hi-def video (using H.264) sharing site with a successful business model behind it.

I initially met Don last year at the MySQL Conference when my then boss told me that he was interested in meeting him. That was my introduction to Smugmug. I was impressed by SmugMug's presentation of photos and the care they took to make your photos and galleries look awesome.

This year, as members of Smugmug, my wife and I got to interact with Don on a personal level.

We had several suggestions related to how our Smugmug experience could be improved and Don listened very carefully. One of the things I was most interested in seeing implemented was blocking Smugmug subdomains from being indexed if a customer is hosting them on their own subdomain.

I was truly impressed by how much Don thinks and cares about his members. It isn't a surprise that he runs a very successful site. From my conversations with Don, it seems there are many interesting projects Don and his team are working on and I can't wait to see them implemented. Almost all of the projects we heard about were focused on customers. No wonder Smugmug has a high customer retention rate.

Technology wise, I am a fan of decisions Don has made to run Smugmug. He uses MySQL, S3, EC2 for processing and video conversion, Solaris 10 and Sun hardware.

Despite being the CEO, Don is the MySQL guy at Smugmug. His latest blog post, Death of MySQL read replication highly exaggerated, was a good-natured jab at the discussion Brian Aker started, with Arjen Lentz and me jumping in.

In the following video, Don Grantham interviews Don MacAskill (yup, two Dons together) about Smugmug's relationship with Sun and the challenges of running a successful Web 2.0 business with more than 350,000 paying customers and more than 300,000,000 photos. As you can see in the video, customer satisfaction is more important than growth to Smugmug.


Since I joined Smugmug, several of my friends including Ronald Bradford have also joined. You can view my galleries by clicking on the image below and Ronald's photos from the MySQL conference by clicking on the image underneath:

My Smugmug Gallery

Ronald's Smugmug Gallery: MySQL Conference 2008 Photos

If you use Smugmug as well, drop your Smugmug URL as a comment (of course, only if you want to share).

To stay up to date with exciting stuff happening at Smugmug, check out Don's blog.

People I met at the conference

Every year I meet a lot of new and old friends at the MySQL conference. To highlight their involvement in the MySQL community and at the conference I have decided to start a new series: "People I met at the MySQL conference."

I probably won't be able to cover everyone I met (sorry about that) but I intend to cover as many as possible. There will be no order in which I cover people. Also, there is no secret agenda and of course whatever I say is just my personal opinion. Just whenever I have a few thoughts ready about someone, they will pop out :)

Saturday, April 26, 2008

Disaster is Inevitable -- SQL Injection: Poorly Written Code and No Backups!

Let me start out by saying: the best response to a disaster is a backup you can count on.

Found a scary story today about hundreds of thousands of websites using Microsoft IIS and SQL Servers being affected by Internet-wide SQL injection attacks. The story originally reported by F-Secure is now on Slashdot as well.

On the IIS forum, panic is visible. Those who had backups are breathing a sigh of relief like one administrator who commented, "We have been hit by this as well. Lucky backup ran last night just prior to the attack."

Others without backups are just screwed.

F-Secure reports in an update to the story, "Do note that this attack doesn't use any vulnerabilities in any of those two applications. What makes this attack possible is poorly written ASP and ASPX (.net) code."

Although this attack is targeted towards IIS and SQL Server, there are lessons to be learned for sites using other servers and databases. There are several guides available on the Internet that will show you how to secure your application against SQL Injection attacks, like this one that is focused on securing PHP and MySQL applications.

In this year's "Disaster is Inevitable--Are you Ready" presentation at the MySQL Conference (Yes, I have read Baron's post), I covered a few types of disasters. However, I missed an important kind of disaster: ones that are caused by SQL Injection. My next presentation on this topic will certainly cover this. BTW, if you missed my presentation, you can thank Artem Russakovskii, who took meticulous notes that you can read.

What saddens me are comments like, "but we have all patches applied to the version we are using." There is, of course, a disconnect here as far as understanding the problem is concerned.

Patches don't secure you against SQL injection attacks; properly written code does. Sanity checking your input is very important!
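
To make "properly written code" a little more concrete, here is a minimal sketch of the difference between pasting user input into the SQL text and passing it as a bound parameter. It uses Python's built-in sqlite3 module purely so that it runs anywhere; the table, the data and the function names are made up, and the same idea applies to ASP, PHP or any other stack through its own parameterized query API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_unsafe(name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = "SELECT name, secret FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_safe(name):
    # Safer: the value is bound as a parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "x' OR '1'='1"
print(len(find_unsafe(attack)))  # the injected OR clause matches every row
print(len(find_safe(attack)))    # no user is literally named like the attack string
```

No patch level changes the outcome of the first function; only the way the query is built does.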

Replication as a backup method won't help against SQL Injection
Based on my survey, a disturbingly high number of sites use replication as their backup strategy. If replication is your sole method of backup, then beware: it isn't going to help in a SQL injection based disaster, because the injected statements are replicated to the slaves just like any other write. Unless, of course, you have time delayed slaves and are able to stop replication before the slaves are affected.
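
A toy model of why a time delayed slave buys you a window to react, assuming a slave that only applies events once they are at least a fixed number of seconds old. The class, the timestamps and the statements are all hypothetical; this is not how MySQL or any delay tool is implemented, just the idea.

```python
from dataclasses import dataclass, field


@dataclass
class DelayedReplica:
    delay: int                                   # seconds the slave lags on purpose
    applied: list = field(default_factory=list)  # statements applied so far
    stopped: bool = False

    def catch_up(self, binlog, now):
        """Apply, in order, every event that is at least `delay` seconds old."""
        if self.stopped:
            return
        for timestamp, statement in binlog[len(self.applied):]:
            if now - timestamp < self.delay:
                break
            self.applied.append(statement)


# (timestamp in seconds, statement) pairs as written on the master.
binlog = [(0, "INSERT good row"), (100, "UPDATE injected by attacker")]

live = DelayedReplica(delay=0)
delayed = DelayedReplica(delay=3600)

live.catch_up(binlog, now=3650)
delayed.catch_up(binlog, now=3650)    # only the first event is an hour old
delayed.stopped = True                # the attack is noticed, replication is stopped
delayed.catch_up(binlog, now=10000)   # nothing further gets applied

print(live.applied)
print(delayed.applied)
```

The regular slave has already applied the attacker's statement, while the delayed one still holds clean data, provided someone stops it in time.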

Every year there are a number of backup related presentations at the MySQL Conference. All but one of the following were presented this year:

- What do you mean there's no backup? -- A timeless presentation by Mike Kruckenberg and Jay Pipes originally presented in 2006.
- Backup and Recovery Basics by Kai Voigt
- MySQL Backups go near continuous by David Wartell
- MySQL Online Backup: An In-depth presentation by Chuck Bell
- Online Backup, Open Replication and a world of contribution by Lars Thalmann and Chuck Bell
- Performing MySQL Backups using LVM Snapshots by Lenz Grimmer
- Top 5 Considerations While Setting Up Your MySQL Backups

Friday, April 25, 2008

Scaling Up Or Out - Keynote at MySQL Conference 2008

At this year's MySQL Conference I was invited to be a keynote panelist at the Scaling MySQL Up Or Out keynote. Other keynote panelists included Jeff Rothschild (VP of technology at Facebook and a consulting partner with Accel Partners), Paul Tuckfield (DBA at YouTube), John Allspaw (manager of operations engineering at Flickr) and Domas Mituzas (DBA at Wikipedia). There were also representatives from MySQL (Monty Taylor) and Sun (Matt Ingenthron).

I really enjoyed being a keynote panelist with my peers. We were seated according to our Alexa ranking with the highest ranking YouTube on the right side. Even though I was representing the thirteenth largest site, our traffic compared to Facebook and YouTube was humbling.

All of the keynote panelists met early in the morning to get equipped with microphones and to go over the format.

See the video (below) to hear some funny "can't say" answers by Paul Tuckfield. I wish Google wouldn't keep him so secretive about numbers such as how many database servers they run. Does that really give out YouTube's secrets?

Following are some photos, videos and links to notes from the keynote.

Keynote: Scaling Up or Out at MySQL Conference and Expo 2008
From left to right:
Monty Taylor (MySQL), Matt Ingenthron (Sun), John Allspaw (Flickr), Farhan "Frank" Mashraqi (Fotolog), Domas Mituzas (MySQL/Wikipedia), Jeff Rothschild (Facebook) and Paul Tuckfield (YouTube)

Scaling MySQL Keynote
Jam packed ballroom during the keynote.

Above Photos copyright: James Duncan Davidson.


Kaj Arnö leads the scaling MySQL keynote panel discussion.

John Allspaw, me, Domas Mituzas
Me getting animated.

Domas Mituzas, Jeff Rothschild and Paul Tuckfield at Scaling Up Or Out Keynote
Domas Mituzas, Jeff Rothschild and Paul Tuckfield at the keynote.

Scaling MySQL Up or Out
Matt answers a question as everyone listens

More photos from the keynote session are available at http://photos.mashraqi.com.


Video of keynote session:

- Sheeri/Technocation: Download, Play

- A short video by Zack Urlocker


Notes from scaling up or out keynote:
- Biographies of keynote panelists
- Keith Murphy: Scaling MySQL - - Up or Out? Panel @ UC
- Ronald Bradford: Scaling Wisdom
- Venu Anuganti: Notes from Scaling MySQL Up or Out

Thursday, April 24, 2008

MySQL on Solaris 10 -- Buffer Overflow and Security Bypass Vulnerabilities

I found some recently discovered buffer overflow and security bypass vulnerabilities that affect MySQL on Solaris 10. According to FrSIRT, these vulnerabilities "could be exploited by attackers or malicious users to bypass security restrictions, gain knowledge of sensitive information, cause a denial of service, or execute arbitrary code." A final resolution for these vulnerabilities is pending completion according to their website.

Unfortunately, I do not currently have a FrSIRT account (need to get one ASAP) so I couldn't dig into this vulnerability further. However, I am dying to learn more about this.

Wednesday, April 23, 2008

Java getting fully Open Source

The big news coming from JavaOne is that Sun is removing the last licensing hurdles in Java. What this means is Java is becoming fully Open Source.

Java users can especially thank Sun now. Also this supports Sun's vision of Open Source.

"We've been engaging with the open-source community for Java to finish off the OpenJDK project, and the specific thing that we've been working on with them is clearing the last bits that we didn't have the rights," to distribute, Sands said.

"Over the past year, we have pretty much removed most of those encumbrances," Sands said. Work still needs to be done to offer the Java sound engine and SNMP code via open source; that effort is expected to be completed this year. Developers, though, may be able to proceed without a component like the sound engine, Sands said.

Source: Yahoo News

I think Monty found the right environment to work in.

Update: The original post mentioned "Java now fully Open Source"; however, as the article points out, Java is expected to become fully open source later this year. I wonder how much of a role the MySQL conference played in this announcement coming earlier.

Tuesday, April 22, 2008

Mashable Party at Webster Hall

I will be at the Mashable Party at Webster Hall on May 16, 2008. The party starts at 8PM and goes till 4 AM although I won't be staying till 4.

There are fewer than 100 tickets left. If you are attending and use MySQL, Solaris 10 or Sun hardware in your environment, I would love to chat with you.

And, there are no presentations :)

------ EVENT DETAILS ----

What: MashBash NYC : Mashable’s NYC Spring Party!
Who: 2,500 Sold Out Crowd, 400 Mashable VIP Tickets on Balcony, Grandmaster Flash starts the night off
When: Friday, May 16th, 2008
Drinks: Open Bar, 8 - 10 pm sponsored by Kluster
Where: Webster Hall, 125 East 11th Street, New York, NY

Schedule for the Evening: 8 - 10 pm: Mashable is hosting an exclusive 400 person VIP event on the 2nd Floor Balcony of Webster Hall’s Grand Ballroom. There will be an open bar sponsored by Kluster.com.
10:00 pm: Doors open to the public, a 2500 person sold out crowd
10:15 pm: Opening for Mashable’s VIP guests is none other than the legendary Grandmaster Flash.
Midnight till 4 am+: Mashable’s VIP guests are welcome to stay in the VIP area all night for music from acts including MSTRKRFT, L.A. Riots and more.

Monday, April 21, 2008

Back from the MySQL conference

This morning I landed back at my home airport, EWR, after spending a fun-filled week and a half at the MySQL Conference 2008.

This year's conference was the best ever for me. I have a lot of people to thank and a lot to blog about. The number of pings I have received about my lack of blogging during the conference is truly humbling. However, I did have a good reason for not being able to blog.

First, I was presenting three sessions, with two on the final day of the conference. Since I have the habit of continuously revising my presentations, that put a little bit of pressure on me. A big thanks to all those who came to my sessions.

Second, I was given a great opportunity to be a keynote panelist at the "Scaling up or out" session at the MySQL Conference. If you missed the keynote, you can watch the full video of the keynote posted by Sheeri.

Third, my wife, a few friends and I were invited on the trip of a lifetime by the hardcore community evangelists at Proven Scaling (Jeremy Cole, Eric Bergen and Mike Griffiths). We had a great time visiting Yosemite National Park (more on this later). This was my first time in nine years without checking email or being on the Internet.

Now that I am back, I intend to put all my thoughts regarding the conference and the trip as blog posts in the coming days so stay tuned.

Sunday, April 13, 2008

Heading to MySQL conference in Santa Clara

I am leaving in a few hours from Monterey for Santa Clara, the home of MySQL conference. I should be in the Hyatt Regency Lobby at 5:45 PM. I still have one more space in my car so if you haven't found a ride yet to go to the pre-conference dinner, you can reserve the spot by calling me or sending me a text message at 5/5/1/6/5/5/5/5/9/0.

Wednesday, April 09, 2008

Facebook Scary Message

A friend emailed me a message he had received when attempting to login to Facebook:

The message reads,
Warning: Facebook detected a potential scam to steal your account!
To prevent future problems, please reset your password.

Also, I heard in the news today that a significant percentage of scams are now targeted towards social networking sites.

Of course, it goes without saying that one should not use their "important" passwords with social networking sites.

On my way to MySQL conference

Later today around 5PM EST my wife and I will be flying to San Jose to attend the MySQL Conference happening next week. We will be staying the first two nights in Burlingame to meet family and friends.

Then on Friday evening we will be going to visit more family in Monterey. We will arrive at Hyatt Regency, Santa Clara, on Sunday afternoon.

Once at Hyatt, I will be happy to give a ride to anyone going to the Pre-Conference dinner.

After the conference, my plan is to spend time with a few friends. I will be flying red-eye, Sunday night, back home.

Like previous conferences, I can't wait to see all my old and new friends.

My passions include InnoDB, memcached, BLOBs, Latent Semantic Analysis, Ruby on Rails (why won't it scale), SEO, monetization, Solaris 10, Sun hardware, Hadoop, Lucene, replication and Blue Moon :). I would love to meet/talk with other users passionate about similar stuff.

Monday, April 07, 2008

Can backup really kill performance?

Yes, if you are running backups on a large database that is also handling production traffic (not a very smart idea to begin with). This is especially important for backups created using snapshots based on a copy-on-write algorithm.

Brian makes an important point in a comment to my post regarding backup. He points out "Backups are always onerous on IO" and that a better way to backup is to use slaves or a standby master (if using multi master replication).

If you *must* run backups on a production server, then ibbackup becomes very important as it doesn't affect performance as much as the evil snapshots created by snapshot tools like fssnap and LVM. I have found that in our case purchasing ibbackup licenses was worth every penny.

In our environment, running backups using copy-on-write snapshots was killing performance. Writes would start stalling several hours into the backup process. It didn't help that backups would take 27 hours to finish. I moved most systems to using ibbackup and for those systems running backups hasn't been an issue at all.
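
A deliberately simplified model of why copy-on-write snapshots hurt write-heavy servers: while a snapshot is held, the first write to each block also has to read the old contents and save them elsewhere. The class and the I/O accounting below are assumptions made for illustration, not how fssnap or LVM actually account for their work.

```python
class CowVolume:
    """Toy block device with an optional copy-on-write snapshot."""

    def __init__(self, num_blocks):
        self.blocks = [0] * num_blocks
        self.snapshot = None  # block number -> preserved old contents
        self.io_ops = 0       # physical reads + writes performed

    def take_snapshot(self):
        self.snapshot = {}

    def write(self, block, value):
        if self.snapshot is not None and block not in self.snapshot:
            # First write to this block since the snapshot was taken:
            # read the old data and write it to the snapshot area first.
            self.snapshot[block] = self.blocks[block]
            self.io_ops += 2
        self.blocks[block] = value
        self.io_ops += 1


def workload(volume):
    """Touch 1,000 different blocks once each, like a busy write load."""
    for block in range(1000):
        volume.write(block, 1)
    return volume.io_ops


plain = CowVolume(1000)
snapped = CowVolume(1000)
snapped.take_snapshot()

print(workload(plain))    # one physical write per logical write
print(workload(snapped))  # three physical operations per logical write
```

The longer the snapshot is held (27 hours in our case), the more blocks get touched for the first time, so the extra I/O keeps piling up on top of production traffic.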

Of course, if you must backup production servers, take snapshots to backup everything except the databases. That way the snapshots will be held for a much smaller period and you can continue backing up databases using ibbackup.

What about mysqldump? I don't consider mysqldump an appropriate tool for periodic backups. I can see it working for small databases but running it on enterprise level databases for daily backups is just not going to be feasible.

I would love to discuss backups more at the conference. I also would like to evaluate some of the backup vendors exhibiting at the conference.

Google App Engine Announced - Limited to 10,000 Accounts

Google's announcement tonight is much bigger than I thought. Google is releasing Google AppEngine (site goes live at midnight EST) tonight, a fully-hosted, "automatically scalable" web application platform that consists of Python App servers, BigTable and GFS.

By making App Engine available only for Python, Google is giving the language a big boost.

Amazon's EC2 (Elastic Compute Cloud) allows developers to choose their own stack. Furthermore, Amazon's S3 allows third party applications to connect directly. With Google AppEngine it seems one must interact with BigTable using a Python application.

Here's what Google's AppEngine promises developers:
- Write code once and deploy
- Absorb spikes in traffic
- Easily integrate with other Google services

Google App Engine is limited to first 10,000 developers
The website for Google App Engine (http://appengine.google.com/) goes live at 12:00 AM EST tonight. Only the first 10,000 developers will be given beta accounts. So hurry now before you are left out.

What is offered
The current limits imposed by Google include:
- 500 MB storage
- 200 million megacycles/day CPU time
- 10 GB bandwidth per day

Google App Engine Pricing
During the beta period, the service is completely free. Google has not announced the pricing after beta period finishes.


UPDATE:
I tried gaining an account right at 12:01 AM but thanks to Google "profiling" (which they have complete right to :) ), I got the following message:
Unfortunately, space is limited during Google App Engine's preview release. As we expand, we'll invite more developers, but for now you'll have to wait.

Would you like to be notified by email when space becomes available?


It seems like an "invite only" service. If you have invites or figure out how to get an account, please let me know. I'd love to get one.

UPDATE 2:
Many thanks to Nick Johnson of Google and others for sending me invites. Also, thanks to those who posted a comment. I was able to get an account and couldn't be happier.

Sources:
- Google Jumps Head First Into Web Services With Google AppEngine.
- Google App Engine readies for brawl with Amazon
- Google Launching App Engine for Python Developers
- Google Cloud Now on Tap for Python Developers


"The apps all appear on the appspot.com domain. Each developer currently gets three application ids. When apps are uploaded they will appear at http://application-id.appspot.com. Developers can, of course, bring their own domains. You can see the current set of apps in the application gallery. I love the Appspot domain name; it's an homage of sorts to Blogspot and fits in nicely with Jotspot."
- App Engine: Host your Python Apps with Google
- Google App Engine Blog- Introducing Google App Engine

Google's BigTable as a Web Service Announcement Expected Today

According to TechCrunch, Google is expected to announce BigTable as a web service tonight.

For those unfamiliar with BigTable:
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.

Pre-conference Community Dinner - MySQL Forge Registration

I like Arjen's suggestion of having a pre-conference community dinner and wanted to put my name down. I tried to register with softwareengineer99 and got this:

This kinds of usernames usually indicate spammers. If you feel like this was in err, contact the wiki administrator.

Return to MySQLConf2008CommunityDinner.

Ok, whatever. I then tried again with a "non spammer" username and multiple email addresses but kept getting this:

A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:

(SQL query hidden)

from within function "User::addToDatabase". MySQL returned error "1062: Duplicate entry '' for key 3 (localhost)".


So there seems to be an issue with MySQL Forge registration.
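
One plausible reading of that error, and it is only a guess, is that "key 3" is a unique index on a column that registration leaves as an empty string, so the second account ever created collides with the first. The sketch below reproduces that situation with Python's built-in sqlite3 module rather than MySQL (so the error text differs), and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a unique index on a column that defaults to ''.
conn.execute("CREATE TABLE wiki_user (name TEXT, token TEXT NOT NULL DEFAULT '')")
conn.execute("CREATE UNIQUE INDEX wiki_user_token ON wiki_user (token)")


def register(name):
    """Insert a user without setting the token, as a buggy signup might."""
    try:
        conn.execute("INSERT INTO wiki_user (name) VALUES (?)", (name,))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return "error: %s" % exc


first = register("first_user")
second = register("second_user")
print(first)   # the first empty token is accepted
print(second)  # the second one violates the unique index
```

If that is what is happening, no choice of username or email address would help, which matches what I saw.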

Sunday, April 06, 2008

Is Backup Really Irrelevant?

Brian Aker writes in his "PostgreSQL to scale to 1 billion users" post:
Backup is irrelevant for those of you who care about this discussion. LVM/ZFS snapshots are the rule of the land.


While I agree with most of Brian's statements in the article, I respectfully disagree with the statement above, especially the bolded part. Copy-on-write snapshots are EVIL for very large databases operating in a high I/O environment and backup, by no means, is entirely irrelevant. Please correct me if I am wrong but it is my understanding that both LVM and ZFS implement copy-on-write snapshots. Backup may be irrelevant for most sites but not for us.

If, however, by "irrelevant" Brian meant that backup is not important in choosing one database over another, I can agree with that. Why? Because no one benchmarks backup methodologies until the backup process starts becoming a major PITA.

Backup methods can be a performance killer when dealing with very large databases. If you're interested in finding out why, and more importantly how, ask me at the conference, come to my scaling MySQL and InnoDB on Solaris session, or check on this blog after the conference.

Is Read Replication Really Dying in Favor of Memcached?

I spent my Sunday working on the three presentations that I will be giving at the upcoming MySQL Conference. About two hours ago, as I was reviewing my stuff, I told my lovely wife that I may talk in my sessions about how replication for read scalability no longer makes sense in high traffic environments. I told her I am probably going to vote in favor of investing in memcached vs read slaves for scaling reads.

Believe it or not, she hammered me with all sorts of questions. I spent some time answering her questions. I scanned my brain to gather more evidence to support myself, including that at work we are moving away and staying away from replication as much as possible.

Then, I got busy writing the post about Facebook using MySQL replication to update Memcached. After publishing the post, I checked Planet MySQL and found Arjen discussing (and agreeing with) Brian Aker's post about "The Death of Read Replication."

At that point, I simply turned my MacBook screen towards my wife and smiled :)

I consider Brian's post a brave one from a MySQL point of view, as I can imagine not everyone at Sun/MySQL will be happy about this. I appreciate his candor.

However, what Brian says about replication, caching and memcached is very true. memcached is an incredibly important part of our infrastructure. It doesn't have the painful latency associated with MySQL replication. It requires much less hassle to set up, reset and scale. Like Facebook and all other major Web 2.0 sites, we have a considerably large memcached farm that allows us to serve our ever increasing demand.
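
For readers who haven't used memcached for scaling reads, the usual pattern is "check the cache first, hit the database only on a miss." Below is a minimal sketch of that pattern. The `Cache` class is a plain in-memory stand-in for a memcached client, and `load_from_db` is a made-up placeholder for a real query, so none of this reflects our actual code.

```python
class Cache:
    """In-memory stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


db_reads = 0


def load_from_db(user_id):
    """Placeholder for a SELECT against the master or a slave."""
    global db_reads
    db_reads += 1
    return {"id": user_id, "name": "user%d" % user_id}


def get_user(cache, user_id):
    key = "user:%d" % user_id
    row = cache.get(key)
    if row is None:  # cache miss: go to the database once
        row = load_from_db(user_id)
        cache.set(key, row)
    return row


cache = Cache()
for _ in range(100):
    get_user(cache, 42)

print(db_reads)  # 100 page views, one database read
```

Every read served from the cache is a read that no slave has to handle, which is why the money can go into cache servers instead of more read slaves.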

P.S. Just to be clear, I highly favor using master-master replication for high availability and a small number of slaves. I just don't favor investing money in slaves alone for scaling reads.

P.P.S. I will leave you with a quote from Arjen's post:
"What needs to be fixed is distributed writes. And economically!"

Facebook using MySQL to replicate Memcached

Faced with the challenge "to figure out a way for memcached servers to replicate data concurrently with the MySQL databases," across the country, Facebook came up with a clever solution of "embedding extra information in to the MySQL replication stream that allows [Facebook] to properly update memcached [servers] in Virginia."

This is very smart! I am curious about how they implemented this. I wonder if by "replication stream" they are just referring to binary logs. The article didn't mention whether they hacked MySQL to do synchronous replication as well, like Google. That would be really neat: synchronous replication that updates memcached.

Synchronous or not, the idea is still uber cool and I would love to see more discussion from Planet MySQL community regarding this.
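
Just to make the idea concrete, here is a speculative sketch of what "extra information in the replication stream" could look like: each replicated change carries the cache keys it makes stale, and whatever applies the change on the remote side deletes those keys from the local memcached. Every name below is invented, and a dict stands in for memcached; the article does not say that Facebook does it this way.

```python
def make_event(statement, cache_keys):
    """A replicated change plus the cache keys it makes stale (hypothetical format)."""
    return {"statement": statement, "cache_keys": list(cache_keys)}


def apply_on_replica(event, replica_db, local_cache):
    """Apply the data change, then drop the stale entries from the local cache."""
    replica_db.append(event["statement"])
    for key in event["cache_keys"]:
        local_cache.pop(key, None)


replica_db = []
virginia_cache = {"user:42": "old name", "user:7": "unrelated"}

event = make_event("UPDATE users SET name='new' WHERE id=42", ["user:42"])
apply_on_replica(event, replica_db, virginia_cache)

print(replica_db)
print(virginia_cache)  # the stale user:42 entry is gone, user:7 is untouched
```

The appeal is that the cache is cleaned up at the same moment the replicated data changes, so a remote reader never refills the cache from a row that is about to be overwritten.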

Making replication possible for Brian Aker's memcached storage engine for MySQL could be another way in the future to make MySQL replicate to memcached. Brian's blog post shows:
ENGINE=MEMCACHE DEFAULT CHARSET=latin1 CONNECTION='localhost,piggy,bitters'
The multiple host specification looks very interesting. I would definitely love to talk about this with the brains at the conference.

Also, something like this would make a nice candidate for programs like Google summer of code.

Thanks to my colleague and friend A. Lee for bringing this to my attention.

Friday, April 04, 2008

T5120 goes into production

On Friday, after weeks of benchmarking T5120 and T5220 and studying the Sparc T2 (Niagara 2 chip) architecture, I finally put one in production and the results have been very promising. Though Friday evening wasn't a peak period, we experienced 62% more throughput than the previously deployed V210. I expect T5120 to be able to handle our peaks without breaking a sweat.

We'll have to take a hit in certain database operations to benefit from this 62% gain. (Update: however, luckily, those operations do not occur everyday.) I will be presenting results of my benchmarks and information at the MySQL conference. If you are evaluating Sun servers for MySQL, you will find my session very interesting.

Now, I can't wait to receive a bunch of T5120s to replace all our db servers.

Kickfire looking to push MySQL limits

For the past few months, like Baron, Jeremy and Keith, I have been consulting for KickFire (formerly known as C2App). There is another startup currently in stealth mode with some very impressive solutions for MySQL. Unlike Kickfire, this other startup isn't SSD based. I was hoping they would be ready for an announcement at the conference as well, but it seems they will need some more time. I cannot go into much detail on this startup at this point.

I have been wanting to write about KickFire but I certainly won't be able to beat Baron. He does a wonderful job of capturing what KickFire is and presenting a detailed insight for PlanetMySQL readers.

Like Baron, I only provided consulting and didn't get a chance to actually play with the solution. If KickFire is able to deliver what they have been promising then I can see them becoming a major solution provider to MySQL community.

I can't wait for Kickfire's keynote. Should be very interesting for those interested in giving MySQL scalability a whole new meaning.

Challenges and Payoffs of running a Tech Business in NY

Another great event happening in New York on April 14th is the monthly meeting of the New York Software Industry Association. This month's topic is "Running a Tech Business in NY: Challenges and Payoffs." There is no cost to attend but you must pre-register.

Thursday, April 03, 2008

Golf with Scott McNealy?

Today, I missed out on an awesome opportunity: to play golf later this month with Scott McNealy. Scott held the title of 'best golfer among top executives' for eight straight years.

I was made the offer to play golf today at our weekly managers' meeting. Why will I miss it? Because I will be in California, speaking at the MySQL Conference.

There are several interesting Sun-related events happening in New York while I am in California for the MySQL Conference. This would have been a great chance for me to mingle with the top executives and talent at Sun.

I feel sad about missing this opportunity, but I am very excited as the conference draws closer. Can't wait to see old friends and make new ones.

Lunch with Sun

Yesterday, I had a very yummy lunch (Seared Halibut at Gotham Bar and Grill) with a team from Sun including Al Ballerini, Anthony Mazzei, Steve Spitz and Vasu Prakash.

The discussions were very interesting and informative. Some of the topics (that I am allowed to discuss publicly) were PNFS, QFS, LDAP for large scale authentication, Sun's new servers developed with Fujitsu and Sun's storage solutions.

Architecture-wise, I was able to gain some more insight into UltraSparc T1/T2, Sparc M series, and M1 vs M2 architecture. Yes, clarification was needed every time someone said T2 or T1, to differentiate T1000s and T2000s from UltraSparc T1 (Niagara 1) and UltraSparc T2 (Niagara 2). Someone please tell Sun they can use other letters of the alphabet to describe their servers and series.

The food, though very small in portions, was just out of this world. I can't wait to take my wife there.

Wednesday, April 02, 2008

Ronald is an evil genius. But we'll get you!

Never in my life have I fallen victim to an April Fools' joke as severely as I did to the one Ronald played through his blog.

My morning started with checking servers, then heading to PlanetMySQL, where I found the "sad" news. My wife and I spent the next hour discussing nothing but Ronald and every topic we could think of related to his 'situation'. In the back of my mind, I was thinking that this could be a joke, but then I thought I knew Ronald well enough to know that he wouldn't play a joke like this. Of course, I was wrong.

When I got Ronald's message saying "April Fools!" my response was "I HATE YOU!!!!"

In the evening, when I talked to a very good mutual friend, Marc, I found he was equally "mad" at Ronald. Today, I see that we were not alone and poor Jay was very worried.

I would love to form a coalition of all those affected by this so we can take revenge :)

Velocity Conference

O'Reilly's Velocity Conference is happening this year on June 23-24 in Burlingame, CA. The Velocity site describes this new conference as:

"Web companies, big and small, face many of the same challenges: sites must be faster, infrastructure needs to scale, and everything must be available to customers at all times, no matter what. Velocity is the place to obtain the crucial skills and knowledge to build successful web sites that are fast, scalable, resilient, and highly available."


When the call for papers was open for Velocity, I submitted a talk proposal on cutting MySQL I/O for cost-effective scaling and performance optimization.

Fotolog is one of the largest sites on the Internet. We are ranked the 13th most visited site by Alexa and the 3rd most active social network by ComScore. In the past two years, we have experienced, and continue to experience, incredible growth. By focusing on efficient data modeling and cutting I/O, we have literally pushed the limits of optimization and scalability when it comes to MySQL.

Learning today that my session was not accepted obviously came as a major disappointment to me. While I truly respect the conference chair's decision, I believe my session would have been useful for those who are experiencing strong growth but cannot afford to re-architect their database backend for one reason or another.

There is some good news as well: While Velocity rejected my proposal, I am presenting a somewhat similar session at this year's MySQL Conference. The session is called "Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community". If you're attending the conference and interested in knowing how you can push the limits of your MySQL database servers on Solaris, don't forget to attend my session. It will be a lot of fun, I promise!

I am also presenting two more talks at the MySQL Conference, Disaster is Inevitable—Are You Prepared? and The Power of Lucene.

Tuesday, February 26, 2008

'Decade Zero of Open Source'

Bruce Perens, the man who created the Open Source definition on February 9, 1998, writes about the past and present of Open Source. In his State of Open Source Message, he labels the past and the future of Open Source. In his own words:

"Friday, February 8 is the last day of Decade Zero of Open Source. Saturday, February 9 is the anniversary of Open Source and the start of Decade One. It's a computer scientist thing. We always start counting from zero :-)"

The article talks about the rise of Open Source, from Red Hat to, most recently, Sun's acquisition of MySQL. He also reiterates the need for non-traditional profit centers for Open Source companies, as in the case of MySQL.
"The largest part of the payment for Open Source development today comes from cost-center budgets of IT users, be they companies, institutions, or individuals, rather than profit-centers based on Open Source like that of MySQL. By participating in Open Source development, users distribute the cost and risk of the development of enabling technology and infrastructure for their businesses. Their profit centers are not tied to software sales, but to some other business. To find them, look to the communities rather than the companies. "

To Perens, 'Microsoft remains a problem,' and its current strategy, according to him, 'seems to be to poison us with money, most recently by making patent agreements with a number of Linux distributions.'

And regarding the potential impact of Microsoft's possible acquisition of Yahoo!:
"Some see the potential purchase of Yahoo by Microsoft as a threat. Certainly it might curtail or corrupt some of Yahoo's involvements in Open Source communities, and in half-Open-Source products like Zimbra. But a buy-the-loser strategy could potentially suck up a large part of Microsoft's unpleasantly (to us) ample cash while leaving them with the loser. An increase of Microsoft's influence in the content business could mean the entrance of DRM into conventional web pages. Goodbye "view source", printing without a fee, and Firefox, if Microsoft is ever successful with that. It wouldn't surprise me if Microsoft were to make more plays in the content market, perhaps investing in music and film companies. "

He also expresses his annoyance with SCO, appropriately calling it 'a toast'.

Overall, a very interesting read.