MySQL and its funny charsets

Hi all,

I am about to finish my latest project and I made a data dump (using Yii) from the old SQL tables to the new ones. The people who programmed the old website didn’t take HTML into account when saving the data into the table, nor its table encoding (latin1 - latin1_swedish_ci), so I am having trouble decoding and re-encoding the text into what I need for my application.

Since I have used HTML editors in all the CMSs I have programmed for a long time, I have never faced this problem, and now I do not know how to solve it.

I need to convert their funny chars into appropriate HTML chars:

from -> URBANIZACIÓN -> to -> URBANIZACIÓN

I have tried htmlentities, mb_convert_encoding, preg_replace, str_replace, utf8_decode, blah blah blah…

I don’t want my customer to edit their hundreds of properties. Do you know any good solution for converting those funny chars into HTML encoding?

I was looking at whether Yii has that feature, but apart from CHtml::encode (which I have also tried, and which doesn’t do the conversion) I haven’t seen anything else.
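
For what it's worth, one reason htmlentities can seem to do nothing in a case like this is the encoding argument. Below is a minimal sketch, assuming the stored bytes really are latin1 (ISO-8859-1) and PHP 5.4 or later. The helper name latin1ToEntities is made up for illustration and is not part of Yii or PHP.

```php
<?php
// Hypothetical helper: turn a latin1 (ISO-8859-1) byte string into HTML entities.
function latin1ToEntities($text)
{
    // Passing the source encoding explicitly is the important part.
    return htmlentities($text, ENT_QUOTES, 'ISO-8859-1');
}

// "\xD3" is the single latin1 byte for "Ó" (assumed sample data).
$raw = "URBANIZACI\xD3N";

echo latin1ToEntities($raw), "\n";

// Treated as UTF-8, the lone byte 0xD3 is an invalid sequence, and
// htmlentities() returns an empty string for invalid input by default.
var_dump(htmlentities($raw, ENT_QUOTES, 'UTF-8') === '');
```

If the bytes in the table turn out to be something other than latin1, the encoding argument would have to change accordingly.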

Any help? Thanks!

SOLVED!

The more I use this framework the more I love it. I just had to go to the config/main.php file and change its charset:

'db'=>array(
    ....
    'charset' => 'latin1',
    ....
),

The encoding/decoding is automatic, and even though the accented chars are kept as they are throughout the site, they are displayed correctly. Then, in the CMS, since they are displayed in the HTML editors, when the user updates the text my code automatically converts them before saving to the database.

Man… Not even 10 minutes to solve the issue. I knew that was taken into account. :)

The solution you found didn’t actually do any encoding; it simply makes the data that arrives from the database come in the same encoding you specified in the header of layout/main.php.
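
To illustrate that point: the same character is stored as different bytes in latin1 and in UTF-8, so what matters is that the bytes coming out of the database match the charset the page declares. Here is a small sketch, assuming the mbstring extension is available (the variable names are only for illustration):

```php
<?php
// "Ó" as one latin1 (ISO-8859-1) byte.
$latin1 = "\xD3";

// The same character re-encoded as UTF-8 takes two bytes.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), "\n"; // d3
echo bin2hex($utf8), "\n";   // c393

// The latin1 byte on its own is not valid UTF-8, so a UTF-8 page
// cannot render it correctly if it arrives unconverted.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```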

Sometimes it is even necessary to do:

'charset' => 'utf8',
'initSQLs'=>array('SET NAMES utf8'),

At least for Cyrillic it is needed; for your accents I don’t know… just in case.

Thanks Zaccaria, that’s what I actually meant (my page is always UTF-8 and the text was latin1 (ISO-…)). Sorry for my bad English. The chars constantly showed up funny until I did that.

I never have problems with that when I use my own data, as chars are always HTML-encoded through my CMSs, but this was a special scenario.

Thanks for your help and your input Zac!

Not at all.

This “SET NAMES utf8” was the stuff I learnt on the first day of my work in Russia… In Italy I was used to simply not caring about the strange accents in the database, but with Cyrillic it was all a mess…

What an experience!!!