Wednesday, December 06, 2006

Changing character set from latin1 to utf8: Alter table vs. dump and import

Chat log with kimseong on #mysql regarding character sets.
 when you dump data in a latin1 table with MySQL, is it converted by default to utf8?
FrankMash: mysqldump? yes it is, but there are options to set it to latin1
kimseong: thanks. I have a table in latin1 that I am planning to dump and then change one field in its structure to be utf8
kimseong: should I dump data in utf8 (default) or specify (latin1) to preserve special chars
FrankMash: alter table can change directly, no need to dump, but backup first anyway
alter table does takes longer than dump/importing in our test
FrankMash: if you data is really stored as latin1, then it does not matter
FrankMash: it gets tricky if data is stored as utf8 in latin1 column
kimseong: thanks. so upon re-importing the data (assuming dump is utf8) will the characters be converted back to the charset specified in the table definition
kimseong: yes that would be tricky. thank you :)


Sheeri said...

mysqldump by default does not put the DEFAULT CHARSET tag in, so you need to specify --create-options if you want to make sure to keep the same charset.

mostwanted said...

I am having some problem in shifting my database.
Currently I have MySql 4.1 DB and I am shifting to another database

With these configuration.
Server version: 5.0.22

When I am importing DB on MySql 5.0, the special Danish character stored in DB turns into funny chars but I am able to fix them using your guidance in your blog. But in one table there is fulltext index, and when I drop the index and then change the charset of the coluomn it fixes the Danish chars but it appends more funny character like this.
Erfarne .NET Arkitekter������������...

And this is the whole problem. Could please help me how can I fix this issue.

jmpp said...

I'm having problems migrating data from latin1 to utf8 on MySQL 5.0.32-Debian_7etch5-log. My table is defined as "CREATE TABLE `tabla1` (`cadena` varchar(255) default NULL) ENGINE=MyISAM DEFAULT CHARSET=latin1;". Using "alter table tabla1 convert to character set utf8" only converts the table/columns definitions, as far as I'm concerned (and, indeed, if I connect in utf8 mode I cannot read diacritics even after issuing that command, getting gibberish instead). Dumping my data with mysqldump and --default-character-set='utf8' and reimporting it with that same charset doesn't migrate it either, as I still get gibberish if I connect in utf8 mode and try to read it (regardless of the table definition still being latin1, which I could easily convert to utf8 with the above alter command and which, moreoever, would only affect newly inserted data). Am I missing something? I'd really appreciate any help!

PS: the "ALTER TABLE t1 CHANGE c1 c1 BLOB;" && "ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;" suggestions in the warning at aren't working either for me, I still have latin1 data in my table.

jmpp said...

I think I figured out my prior problem, or at least I haven't been able to reproduce it after realizing and fixing one tiny "uuppss!" that slipped my attention orignally: when I was inserting data with diacritics in a latin1 defined table, I hadn't noticed the file where the data was written on (a php script) was a utf8 file itself, so who knows what was actually being inserted... or what my conversion trials were producing. Afterward I made sure the script was latin1 and a couple conversion techniques worked.

tanha22 said...

Jag want to create a table with this command in mysql.
CREATE DATABASE `mytable` !40100 CHARACTER SET latin1 COLLATE latin1_swedish_ci;
I get the error 1064 for it.

another question. how can I see which version of mysql I have in my computer?
I use the mysql adminstartor tools and I can not import any sql file from this tools. Have somebody any ide about how I can do it?


Anonymous said...

'select version()' query would help!