Converting data from windows-1251 to UTF-8

  • masterross
    Senior Member
    • Nov 2005
    • 525
    • 4.2.5

    Hi,

    I'm trying to convert my forum's content from windows-1251 to UTF-8.
    I successfully converted the DB from latin1_swedish_ci to utf8_general_ci, but the stored data is still windows-1251.
    How do I convert the data to UTF-8?

    Edit:
    I was able to export a MySQL dump with "readable" characters by selecting iso-8859-1 when exporting.
    But after import the data is still "unreadable" in phpMyAdmin.
    Last edited by masterross; Mon 31 Dec '18, 7:33am.
    Latest Tech News in the World
    Custom photo case made to order
  • Wayne Luke
    vBulletin Technical Support Lead
    • Aug 2000
    • 73981

    #2
    We're currently developing a set of scripts to perform this conversion on vBulletin 5 databases. The trick we've found is to make sure that PHP is communicating with the database in the proper character set. This will vary from server to server.

    It is complicated...

    You can try creating a database backup, removing all character set and collation data from the backup, and then reimporting the data into a database with the new UTF character set and collation. This should be done from the command line because the way PHP communicates with the database can alter the data.

    Otherwise, to convert the data in the text fields, you have to convert them all to a binary data field (i.e. varbinary or blob). Once this is done, you can change the character set and collation to a UTF character set and collation. Once all the fields and tables are converted, you can change the fields back to their previous content type.
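
The binary round-trip helps because a binary column type carries no character set, so the stored bytes are left alone while the label on the column changes. A minimal Python sketch of that byte-level idea, not of MySQL itself: it assumes the stored bytes really are windows-1251 (as in this thread), and the sample string and variable names are made up for illustration.

```python
# Hypothetical sample text; any Cyrillic string behaves the same way.
original = "Привет"

# What the old forum actually wrote to disk: windows-1251 bytes.
stored = original.encode("cp1251")

# A latin1-labelled column displays those same bytes as Western European letters.
seen_as_latin1 = stored.decode("latin-1")

# Going through a binary type keeps the bytes exactly as they are...
as_blob = bytes(stored)

# ...so they can be re-read with the charset they were really written in,
# and only then converted to UTF-8.
recovered = as_blob.decode("cp1251")
utf8_bytes = recovered.encode("utf-8")

print(seen_as_latin1)                # the "unreadable" view of the data
print(recovered)                     # the text as it was meant to be read
print(len(stored), len(utf8_bytes))  # byte length before and after conversion
```

If the same holds for the database, the bytes would presumably have to be labelled as cp1251 first and converted to UTF-8 afterwards. Labelling windows-1251 bytes as UTF-8 directly would not produce valid UTF-8.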

    Wayne Luke
    The Rabid Badger - a vBulletin Cloud demonstration site.
    vBulletin 5 API

    • masterross
      Senior Member
      • Nov 2005
      • 525
      • 4.2.5

      #3
      Hi Wayne,

      Happy New year to all of you!

      A little update.

      Code:
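      # Dump over a latin1 connection and without SET NAMES, so the windows-1251 bytes are written out unchanged
      mysqldump -uroot -c -e --default-character-set=latin1 --single-transaction --skip-set-charset --insert-ignore my_db -r dump.sql
      # Re-encode the dump from windows-1251 to UTF-8 (-c drops characters that cannot be converted, -s suppresses warnings)
      iconv -sc -f cp1251 -t UTF8 dump.sql > utf.sql
      # Point the table definitions at utf8 instead of latin1
      sed -i -e 's/CHARSET=latin1/CHARSET=utf8/g' utf.sql
      # Load the converted dump back into the database
      mysql -uroot my_db < utf.sql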
      The DB is now "readable", in UTF-8, and seems OK.
      I also set utf8 in config.php and in options/language.
      But now all Cyrillic characters show up as "????????????????" when I load the forum.

      My forum is vb4.2.5

      Any ideas?

      • motd2
        Member
        • Jun 2010
        • 49
        • 3.8.x

        #4
        Re-import the language in UTF-8.

        • Wayne Luke
          vBulletin Technical Support Lead
          • Aug 2000
          • 73981

          #5
          If you didn't have the config.php set to UTF-8 communication before the change, try turning it off again. Simply comment out that line again. This is what causes PHP to mess up UTF-8 communication.

          • masterross
            Senior Member
            • Nov 2005
            • 525
            • 4.2.5

            #6
            Guys,

            I think my eyes are failing me.

            Code:
            $config['Mysqli']['charset'] = 'utf8';
            It was commented out!
            Now all is OK.
            Thx!
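
For anyone hitting the same "????" symptom: a likely explanation is that the data was already correct UTF-8 in the database, but with the charset line commented out the connection fell back to a charset that cannot represent Cyrillic, so every such character was replaced with a question mark on the way out. The Python sketch below only imitates that replacement behaviour and does not talk to MySQL. The latin1 fallback is an assumption, and the function name and sample text are hypothetical.

```python
# Hypothetical post content that is stored correctly as UTF-8 in the database.
text = "Привет"


def send_over_connection(s: str, connection_charset: str) -> bytes:
    """Imitate a lossy charset conversion: unrepresentable characters become '?'."""
    return s.encode(connection_charset, errors="replace")


# Assumed fallback when the charset line in config.php is commented out.
latin1_result = send_over_connection(text, "latin-1")

# With the connection charset set to utf8, nothing is lost.
utf8_result = send_over_connection(text, "utf-8")

print(latin1_result)                # one '?' per Cyrillic character
print(utf8_result.decode("utf-8"))  # the original text, intact
```

The text itself is never damaged in this scenario, only the copy sent to the page, which matches the forum displaying correctly as soon as the config line was enabled.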