Wednesday, January 25, 2006

Does using 'Enum' affects the application performance?

Today, I got yet another interesting question from a friend who has 30,000 rows with 10,000 unique values in a table and needs to query the data in a random order. He asks:

If I change my column type to enum, Will it make any difference in performance of my application?

"I don't think so," was my immediate reaction.

I know enum, though not standard in SQL, makes it efficient to store data but I am not sure whether it would have any effect on the query processing speed. Enum, I believe is limited to 65,536 possibilities.

On a side note, I recently noticed while playing around with Ruby on Rails that enum fields were not supported. Need to make it a part of standard SQL.

While looking for further clarification, I also found Ronald's post titled "To enum or not to enum." He presents his three reasons on why he doesn't use enum.

So, am I right in assuming that changing column type to enum won't speed up queries?


Thanks
Frank

7 comments:

Roland Bouman said...

My immediate reaction would be to say that converting a string column to an enum could definitely improve performance for certain types of queries.

That's based on the idea that an index based on an enum column will usually be a lot smaller: less bytes to search through.

So, I expect the ENUM to be faster in cases where you'd have fairly long strings (at least, longer than than 4 bytes) and where to query would use simple equality comparison.

(I tried to test it, but I don't have enough data for a real test right now. Maybe tonight)

Mike Hillyer said...

I hate answering performance questions with 'benchmark it', but sometimes that is the real answer. On the one hand, an ENUM can save you a join, and joins can be costly depending on the curcumstances. On the other hand I haven't looked at the index efficiency of ENUM lately.

Might be time to break out some benchmarking tools and a good size data set.

Frankly Speaking! said...

Thanks Roland and Mike for your comments.

I agree with both of you.

Roland, you are right in pointing out that due to the smaller index size there will be "less bytes to search through."

Mike, yes, the best answer is to 'benchmark it.'

I was kind of hoping someone would have done it in the past.

I am sure the benchmark data would help many passionate fans of MySQL.

Thanks guys,

Frank

pabloj said...

You don't need to only benchmark, you need to maintain that app, you'll soon discover why enum are evil.

Roland Bouman said...

Just another note, totally beside the perfomance point...In MySQL Enums provide a natural, declarative means to restrict a domain to a fixed set of values, which is sometimes very useful (think column gender enum('female','male','not specified')) In fact, I would always use ENUM to solve such a problem even though you can use a trigger to check for this in standard SQL.

(On the other hand, if MySQL would support CHECK constraints, I'd probably revert (convert) to those.)

Frankly Speaking! said...

Very true Roland.

Looking back, I have used Enum type for a field only when I wanted to ensure that only a set of values would make it to the database.

I would be awesome to have MySQL support CHECK constraints. Like you, I would convert in a heart beat as this will be a giant leap in ensuring data integrity.

Frank

Karsten Wutzke said...

Another small thing I realized with my DB:

I have teams, which can be male, female or mixed.

I have persons, who can only be female or male.

Having the ability to use two different enums

teams: ENUM('female', 'male', 'mixed')
persons: ENUM('female', 'male')

won't allow 'mixed' to be set for persons, if you'd alternatively create one table for all three gender types... and again, it saves a join.

Plus: Your DB model becomes less complex and cluttered. If I imagine adding another table for my 6+ enums, that'll increase the complexity a lot and I'd have to move around the boxes all new...