Tuesday, February 26, 2008

'Decade Zero of Open Source'

Bruce Perens, who created the Open Source Definition on February 9, 1998, writes about the past and present of Open Source in his State of Open Source Message, where he also gives names to its past and coming decades. In his own words:

"Friday, February 8 is the last day of Decade Zero of Open Source. Saturday, February 9 is the anniversary of Open Source and the start of Decade One. It's a computer scientist thing. We always start counting from zero :-)"

The article traces the rise of Open Source from Red Hat to, most recently, Sun's acquisition of MySQL. He also reiterates the need for non-traditional profit centers for Open Source companies, as in the case of MySQL.
"The largest part of the payment for Open Source development today comes from cost-center budgets of IT users, be they companies, institutions, or individuals, rather than profit-centers based on Open Source like that of MySQL. By participating in Open Source development, users distribute the cost and risk of the development of enabling technology and infrastructure for their businesses. Their profit centers are not tied to software sales, but to some other business. To find them, look to the communities rather than the companies. "

To Perens, 'Microsoft remains a problem,' and its current strategy, according to him, 'seems to be to poison us with money, most recently by making patent agreements with a number of Linux distributions.'

And regarding the potential impact of Microsoft's acquisition of Yahoo!:
"Some see the potential purchase of Yahoo by Microsoft as a threat. Certainly it might curtail or corrupt some of Yahoo's involvements in Open Source communities, and in half-Open-Source products like Zimbra. But a buy-the-loser strategy could potentially suck up a large part of Microsoft's unpleasantly (to us) ample cash while leaving them with the loser. An increase of Microsoft's influence in the content business could mean the entrance of DRM into conventional web pages. Goodbye "view source", printing without a fee, and Firefox, if Microsoft is ever successful with that. It wouldn't surprise me if Microsoft were to make more plays in the content market, perhaps investing in music and film companies. "

He also expresses his annoyance with SCO, appropriately calling it 'a toast'.

Overall, a very interesting read.

Monday, February 18, 2008

MySQL as a filesystem

For some time now, I have been pondering a storage engine for MySQL that interfaces with flat files. Yes, I see a few needs that it could solve for me.
Today, browsing around, I found Ben Martin's article on Using MySQL as a filesystem. The article uses MySQLfs to get the desired results. Not 100% what I was looking for, but still a good read. Ben writes:
With MySQLfs you can store a filesystem inside a MySQL relational database. MySQLfs breaks up the byte content of files that you store in its filesystem into tuples in the database, which allows you to store large files in the filesystem without requiring the database to support extremely large BLOB fields. With MySQLfs you can throw a filesystem into a MySQL database and take advantage of whatever database backup, clustering, and replication setup you have to protect your MySQLfs filesystem.
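To make the "break the byte content into tuples" idea concrete, here is a minimal sketch of my own. It is not MySQLfs code: the table name `data_blocks`, the column names, and the 4096-byte block size are all assumptions for illustration, and I use Python's built-in sqlite3 module as a stand-in for MySQL so the example runs without a server.

```python
import sqlite3

# Hypothetical block size for illustration; not MySQLfs's actual setting.
BLOCK_SIZE = 4096


def open_store():
    """Create an in-memory database with one row per file block.

    sqlite3 stands in for MySQL here; the schema is an assumption.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_blocks ("
        " name TEXT NOT NULL,"
        " seq INTEGER NOT NULL,"
        " chunk BLOB NOT NULL,"
        " PRIMARY KEY (name, seq))"
    )
    return conn


def store_file(conn, name, data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and insert one row per block."""
    rows = [
        (name, seq, data[offset:offset + block_size])
        for seq, offset in enumerate(range(0, len(data), block_size))
    ]
    conn.executemany(
        "INSERT INTO data_blocks (name, seq, chunk) VALUES (?, ?, ?)", rows
    )
    return len(rows)


def read_file(conn, name):
    """Reassemble a file by concatenating its blocks in sequence order."""
    cur = conn.execute(
        "SELECT chunk FROM data_blocks WHERE name = ? ORDER BY seq", (name,)
    )
    return b"".join(row[0] for row in cur)
```

Because no single row holds more than one block, a large file never needs a huge BLOB field, which is the point Ben makes above. Reading the file back is just an ordered scan over its blocks.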

Sunday, February 10, 2008

Someone please change mysqlconf.com redirection

MySQLConf.com uses a non-optimal temporary 302 redirect to the MySQL conference website.

This is very bad for the mysqlconf.com domain name and equally bad for people who link to http://mysqlconf.com (instead of linking to en.oreilly.com/mysql2008/): their links DON'T BENEFIT the conference site, and from Google's point of view they are linking to a page that engages in a temporary redirect. The result is that unless you link directly to an oreilly.com page for your conference links, your votes/links don't get passed on to the conference site.

A 302 redirect is considered bad from a search engine's point of view due to its temporary nature.

So please folks, change the redirection to 301 or I will have to go back and change my links to be "rel='nofollow'" links.

Currently, the site gives:
wget mysqlconf.com
--14:27:44-- http://mysqlconf.com/
=> `index.html.2'
Resolving mysqlconf.com... 209.204.146.28
Connecting to mysqlconf.com|209.204.146.28|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://en.oreilly.com/mysql2008/ [following]
--14:27:44-- http://en.oreilly.com/mysql2008/
=> `index.html.2'
Resolving en.oreilly.com... 208.201.239.26
Connecting to en.oreilly.com|208.201.239.26|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://en.oreilly.com/mysql2008/public/content/home [following]
--14:27:44-- http://en.oreilly.com/mysql2008/public/content/home
=> `home'
Reusing existing connection to en.oreilly.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 16,852 [text/html]

100%[==========================================>] 16,852 45.68K/s

14:27:45 (45.63 KB/s) - `home' saved [16852/16852]


What it should give:
wget fotolog.net  
--14:33:02-- http://fotolog.net/
=> `index.html.2'
Resolving fotolog.net... 65.118.195.131
Connecting to fotolog.net|65.118.195.131|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.fotolog.com/ [following]
--14:33:03-- http://www.fotolog.com/
=> `index.html.2'
Resolving www.fotolog.com... 64.111.215.105, 64.111.215.120
Connecting to www.fotolog.com|64.111.215.105|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 24,585 [text/html]

100%[==========================================>] 24,585 --.--K/s

14:33:03 (191.15 KB/s) - `index.html.2' saved [24585/24585]


It doesn't just end here. MySQL is also destroying its mysqluc.com domain. Look at the scary number of 302 redirects here:
wget mysqluc.com
--14:35:29-- http://mysqluc.com/
=> `index.html.3'
Resolving mysqluc.com... 209.204.146.28
Connecting to mysqluc.com|209.204.146.28|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.mysqlconf.com/ [following]
--14:35:30-- http://www.mysqlconf.com/
=> `index.html.3'
Resolving www.mysqlconf.com... 209.204.146.28
Connecting to www.mysqlconf.com|209.204.146.28|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://en.oreilly.com/mysql2008/ [following]
--14:35:30-- http://en.oreilly.com/mysql2008/
=> `index.html.3'
Resolving en.oreilly.com... 208.201.239.26
Connecting to en.oreilly.com|208.201.239.26|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://en.oreilly.com/mysql2008/public/content/home [following]
--14:35:31-- http://en.oreilly.com/mysql2008/public/content/home
=> `home.1'
Reusing existing connection to en.oreilly.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 16,852 [text/html]

100%[==========================================>] 16,852 41.99K/s

14:35:31 (41.95 KB/s) - `home.1' saved [16852/16852]

Now I can just hope that someone actually takes action. It's small changes like this that let you get the most out of your domain, or waste all the effort you previously put into making your sites rank high.

For instance, just think how many links people created to mysqluc.com when the MySQL Conference was known as the MySQL Users Conference. Just because MySQL used a senseless 302 redirect, all the community's effort in linking to that domain went down the drain. The reason is that Google neither transfers rank across a 302 redirect nor consolidates the incoming votes/links from old domains to new domains.
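If you want to check your own domains the same way, the wget logs above can be scanned mechanically. The following is a rough sketch of my own, assuming the log format shown in this post ("awaiting response... NNN" followed by a "Location:" line for each redirect). The function names are hypothetical, and treating 307 as temporary alongside 302 is my addition.

```python
import re

# Matches wget lines such as "HTTP request sent, awaiting response... 302 Found".
STATUS_RE = re.compile(r"awaiting response\.\.\. (\d{3})")
# Matches wget lines such as "Location: http://example.com/ [following]".
LOCATION_RE = re.compile(r"^Location: (\S+)", re.MULTILINE)


def redirect_hops(wget_log):
    """Return (status, target) pairs for each redirect hop in a wget log.

    Assumes every redirect response is followed by a Location line, so the
    final non-redirect response (for example 200) is dropped by zip().
    """
    statuses = [int(code) for code in STATUS_RE.findall(wget_log)]
    locations = LOCATION_RE.findall(wget_log)
    return list(zip(statuses, locations))


def temporary_hops(wget_log):
    """Return only the hops that use a temporary redirect (302 or 307)."""
    return [hop for hop in redirect_hops(wget_log) if hop[0] in (302, 307)]
```

Feeding it the mysqluc.com log would list every hop in the chain, while the fotolog.net log, with its single 301, would come back with nothing to fix.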

Update: Why I keep talking about this: With Google's Bourbon update, I lost a very well performing site that I had worked on for many years. My investigations all pointed to using a 302 redirect instead of a 301 redirect. Basically, my site was wrongfully classified by Google as engaging in sneaky redirects. The site never rebounded. It was listed in Google News and was doing millions of page views a month. It had proper licensing from all content and news providers. 'Not knowing' didn't set me free in Google's court. By all means, it was a legitimate, high-quality content provider site. Although this may never happen to you, Google still considers 302 a very bad form of redirecting, and it should be avoided whenever possible.

InnoDB Sessions at MySQL Conference

This year the MySQL Conference features some of the best talks on InnoDB and I couldn't be more excited. We'll be hearing from Heikki Tuuri, Ken Jacobs, Mark Callaghan, Vadim Tkachenko, Peter Zaitsev and me :)

Kudos to the conference organizers, who have really done a great job of balancing the sessions this year.

The MySQL Conference is a great venue to get up to date with what's happening with your favorite database/storage engine. Early registration ends soon, so save yourself some money and register now.

If you have known me for some time or if you are a regular blog reader, please send me a note and I will send you a coupon code to save even more when you register for the conference. You can email me at sofwareengineer99 at yahoo.

Without further ado, here are the mouth-watering InnoDB sessions scheduled for this year.

Mark Callaghan: Helping InnoDB Scale on Servers with Many CPU Cores and Disks

Ken Jacobs: InnoDB: Status, Architecture and New Features

Heikki Tuuri / Ken Jacobs: InnoDB: Fast, Reliable, Proven Transactional Storage for MySQL

Vadim Tkachenko and Peter Zaitsev: Investigating InnoDB Scaling Limits

Heikki Tuuri: InnoDB: Status, Architecture and New Features

Farhan "Frank" Mashraqi (Me): Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community

Sun's MySQL Acquisition Cleared by Antitrust Regulators

Sun's acquisition of MySQL received an "early termination" of antitrust review from federal antitrust regulators. As a result, the Sun/MySQL acquisition has been given a green light.

Saturday, February 09, 2008

Fotolog seeks MySQL DBA

Fotolog is seeking a MySQL DBA. You'll be working with me in a fast-paced, small-team-running-a-very-large-scale environment.

About Fotolog.com:
• 14th most trafficked site (Alexa)
• 3rd most active social network (ComScore)
• Parent company: Hi-Media (France)


Duties
• Work with Director of Database Infrastructure to maintain/improve and support a high traffic, fine-tuned, scalable and reliable database environment on a day-to-day basis.
• Pro-active and reactive performance analysis, monitoring, troubleshooting and resolution of issues.
• Optimize and tune contentions within the database environment
• Create logging environment(s) to log usage statistics about the environment
• Regularly monitor and periodically conduct random tests of restoration from backups generated within the environment
• Participate in large storage engine migrations
• Compile/patch/install MySQL
• Work closely with operations, development and product teams to ensure smooth deployment of new iterations and availability of database services.
• Create scripts for monitoring key server utilization indicators using DTrace/Perl etc.
• Migrate production systems from MySQL 4.x to 5.x

Ideal Candidates:
Note: (You’re encouraged to apply even if you are missing a few of these guidelines)
• BS in Computer Science or equivalent
• Minimum 3 years experience with Linux, Unix or Solaris
• Minimum 2 years of MySQL experience in production environment
• Experience with Partitioned architecture(s) and data shards
• Production experience with tuning and administering moderate-large scale mission critical MySQL/InnoDB environments.
• Experience with optimizing InnoDB for both OLTP and DW queries
• Experience with estimating database capacity planning
• Passion for hunting bottlenecks and optimizing IO operations
• Solid experience with both SQL and MySQL internals
• Experience with performance analysis tools for MySQL
• In-depth experience with storage engines
• Experience with backup methodologies for MySQL
• Document best practices and routine procedures
• In-depth knowledge of MySQL tuning parameters and performance strategies
• Experience with MySQL replication
• Excellent communication and problem solving skills with attention to detail without losing the big picture
• Must be a team player as well as able to tackle projects on your own upon assignment.
• Able to handle high stress situations without losing calm and focus.
• Comfortable with carrying a pager and participating in “on call” assignments with other members of the operations team and willing to provide 24/7 escalated on-call support.
• Experience with creating and deploying scripts to automate tasks wherever possible.
• Experience with benchmarking methodologies / tools / best practices
• Experience in supporting MySQL for production / development and QA environment(s).


Bonus
• Experience with storage arrays / 3Par
• Experience with master/master replication
• Experience with Hibernate
• Alumni of MySQL Conference
• Active MySQL community member or Planet MySQL blogger
• MySQL Proxy experience
• Experience with disaster recovery

Note
• You’ll be working at our office on 5th Ave., in New York City. We are just a block away from Union Square. Please note that telecommuting is not an option for this job.


Apply
• Only the candidates themselves should send their resume to fmashraqi at fotolog dot com.

Yahoo! rejects Microsoft's hostile bid

Yup, Yahoo! has finally decided to show balls and reject Microsoft's bid.

Thinking just from a search point of view, a Microsoft-Yahoo merger is less evil for the search economy (and by extension the online economy) than a Yahoo-Google deal. Of course, this is based on my biased view.

Update: I wonder how long before YHOO drops back to its pre-Microsoft-bid levels.

Update 2: Yahoo's 'Demented Board' rejects Microsoft.

Update 3: fixed typo.

Update 4: I should specify that the ideal outcome for Yahoo! is to survive on its own, without selling to Microsoft and without a deal with Google.

Will next billion dollar open source acquisition come in 12-18 months?

Found this quote from Michael Tiemann on Matt Asay's blog:
"I would not be surprised to see another $1B deal of some sort in the next 12-18 months. The reason is simple economics...."

Will this come true? Very unlikely, especially within that time frame. Statements like this make MySQL's billion dollar acquisition look like a walk in the park. The reality is that MySQL is a leader in creating an innovative model that brings them pretty decent revenue. It has taken a lot of work from the leadership at MySQL to get it where it is.

So why is it 'simple economics'?
"open source beats proprietary software as a development platform and as a value-delivery platform, and given how many millions of dollars companies are seeing wasted on proprietary software, it's only a matter of time before the majority of software technology deals are denominated in open source."


OK, to me 'open source beats proprietary software' and the opportunity cost of using proprietary vs. open source software are NOT enough reasons on their own for an open source company to become a billion dollar company. Proprietary software has a high-value licensing model. Open source software lacks that particular model.

There will need to be a very solid monetization plan behind the open source software for it to become that valuable.

In order to become a billion dollar open source company, first you need enterprise-strength customers. Then you need a stable and solid business model. Then you need great leadership to get your momentum going. Finally, you need a company with much more than a few billion dollars to make a leap of faith and buy you out. Or you will need a history-changing IPO, and getting there will be a very difficult journey.

So, I personally do not believe at all in the argument of 'simple economics' that is presented above.

If you believe the quote, what company do you see being acquired for a billion dollars in the next 1.5 years? Red Hat, maybe, but let's stick to cases where the first acquisition/IPO hasn't happened yet.

Thursday, February 07, 2008

Scaling the third most active Social Network with MySQL/InnoDB/Solaris


Sun Microsystems has published a case study, Delivering quality service to eleven million users with MySQL, InnoDB, and the Solaris 10 Operating System, on Fotolog, the third most active social network according to ComScore and the 14th most trafficked site according to Alexa.

Challenges:
• Scale to support eleven* million members and more than 100** million page views a day
• Increase performance without increase in database hardware (significant cost savings)

Solution
• MySQL database software
• InnoDB transactional storage engine for MySQL
• Solaris™ 10 Operating System
• Sun V440 and V210

Results
• Support for four times as many users with no additional hardware
• Higher percentage of working dataset in the memory with efficient schemas
• Four times the number of concurrent threads without adding servers
• Anticipated ability to double current number of threads

I will be presenting an updated and much juicier version of how we achieved this level of scalability, from the database point of view, at the MySQL Conference 2008. The talk is titled Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community. So if you needed a sign to attend the conference, well, you've got one now :)

Notes:
* Now, Fotolog is reaching 15 million members with the same number of database servers.
** Now reaching 150 million plus page views a day

Wednesday, February 06, 2008

re: mysql.com and mysql-press.com in Google

I don't know who's in charge of this at MySQL, so I thought I would just get their attention by posting here. In addition, I hope this can help others in a similar situation who serve content on multiple domains.

Basically, MySQL serves content on both mysql.com and mysql-press.com, which is a NO NO from the point of view of major search engines, most notably Google. Both domains go to the same IP (at least when I checked). This can potentially trigger a duplicate content penalty and may even be hurting MySQL's ranking in SERPs (search engine result pages).

Google seems to be already aware of the other domain.

Right way to serve content over multiple domains?
The ideal way to deal with serving content over multiple domains is to use a 301 Permanent Redirect from one domain to the other. So when users go to mysql-press.com, a 301 redirect should take them to mysql.com.
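As a rough illustration of what that looks like on the wire, here is a minimal sketch using Python's standard http.server module. It is only a toy under my own assumptions: a real site would normally configure this in the web server itself, the helper names are hypothetical, port 8080 is arbitrary, and the sketch treats every host other than mysql.com (including www.mysql.com) as non-canonical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# The one domain that should keep the content and the incoming links.
CANONICAL_HOST = "mysql.com"


def redirect_target(host_header, path):
    """Return the canonical URL to redirect to, or None if already canonical."""
    host = (host_header or "").split(":")[0].lower()
    if host == CANONICAL_HOST:
        return None
    return "http://" + CANONICAL_HOST + path


class CanonicalRedirectHandler(BaseHTTPRequestHandler):
    """Serve content on the canonical host and 301 everything else to it."""

    def do_GET(self):
        target = redirect_target(self.headers.get("Host"), self.path)
        if target is None:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"canonical content\n")
        else:
            # 301 Moved Permanently: the move is declared permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()


def serve(port=8080):
    """Run the toy server until interrupted (port is an arbitrary choice)."""
    HTTPServer(("", port), CanonicalRedirectHandler).serve_forever()
```

The path is preserved in the Location header, so a deep link to the secondary domain lands on the matching page of the primary one rather than on its home page.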

Don't dilute your PageRank
One more thing: by serving the same content on multiple domains, you are diluting your PageRank. Using a 301 redirect consolidates both your PageRank and potentially valuable incoming links.

BTW, this advice is not just for mysql.com. If you are serving content over multiple domains, it can trigger Google's famous duplicate content penalty. The penalty is especially severe if you serve Google's contextual advertising on all those domains, as that can potentially get your site wrongfully classified as an MFA or "Made for Adsense" site.

One quick note: there is an exception to this. If the last 2 subnets of the IP addresses for the two domains are different, there seems to be a lesser chance of a penalty associated with duplicated content. At least, this was true until not so long ago.

Friday, February 01, 2008

Microsoft's $44.6 billion hostile bid for Yahoo!

Microsoft has finally decided to bite the bullet and make a hostile $44.6 billion bid to acquire the dying Yahoo!

I personally think that Microsoft will eventually just kill Yahoo! Although who knows. I can't wait to see Microsoft advertising to hire BSD and MySQL gurus to keep Yahoo! running. That'll be funny. For some reason, I can't see them switching Yahoo! to SQL Server.