Converting data from windows-1251 to UTF-8

  • Converting data from windows-1251 to UTF-8

    Hi,

    I'm trying to convert my forum's content from windows-1251 to UTF-8.
    I successfully converted the DB from latin1_swedish_ci to utf8_general_ci, but the stored data is still windows-1251.
    How do I convert the data to UTF-8?

    Edit:
    I was able to export a MySQL dump with "readable" characters by selecting iso-8859-1 when exporting.
    But after import the data is still "unreadable" in phpMyAdmin.
    Last edited by masterross; Mon 31st Dec '18, 8:33am.

  • #2
    We're currently developing a set of scripts to perform this conversion on vBulletin 5 databases. The trick we've found is to make sure that PHP is communicating with the database in the proper character set. This will vary from server to server.

    It is complicated...

    You can try creating a database backup, removing all character set and collation data from the backup, and then reimporting the data into a database with the new UTF-8 character set and collation. This should be done from the command line because the way PHP communicates with the database can alter the data.

    Otherwise, to convert the data in the text fields in place, you first have to convert them all to a binary field type (e.g. varbinary or blob). Once this is done, you can change the character set and collation to a UTF-8 character set and collation. Once all the fields and tables are converted, you can change the fields back to their previous types.
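
The binary round trip described above could be sketched as a small shell helper that only prints the ALTER TABLE statements for one column, so they can be reviewed before anything is sent to mysql. The helper name binary_roundtrip_sql, the table and column (post, pagetext), and the MEDIUMTEXT/MEDIUMBLOB types are assumptions for illustration, not taken from this thread. This variant also assumes the stored bytes really are cp1251, so it relabels them as cp1251 after the binary step and lets MySQL convert them to utf8 in the last step.

```shell
#!/bin/sh
# Print (do not execute) the three statements for one text column.
# Usage: binary_roundtrip_sql TABLE COLUMN TEXT_TYPE BLOB_TYPE
binary_roundtrip_sql() {
    table=$1
    column=$2
    text_type=$3
    blob_type=$4

    # Step 1: switch to a binary type so MySQL stops treating the bytes as latin1.
    printf 'ALTER TABLE `%s` MODIFY `%s` %s;\n' "$table" "$column" "$blob_type"

    # Step 2: go back to a text type labelled cp1251; coming from a binary
    # type, the bytes are kept as they are.
    printf 'ALTER TABLE `%s` MODIFY `%s` %s CHARACTER SET cp1251;\n' \
        "$table" "$column" "$text_type"

    # Step 3: change the column to utf8; this time MySQL converts the text.
    printf 'ALTER TABLE `%s` MODIFY `%s` %s CHARACTER SET utf8 COLLATE utf8_general_ci;\n' \
        "$table" "$column" "$text_type"
}

# Hypothetical example column; adjust names and types to the real schema.
binary_roundtrip_sql post pagetext MEDIUMTEXT MEDIUMBLOB
```

Note that MODIFY replaces the whole column definition, so attributes such as NOT NULL or a default value would have to be restated in each statement. Try this on a copy of the database first.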

    Wayne Luke


    • #3
      Hi Wayne,

      Happy New year to all of you!

      A little update.

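      Code:
      # Dump over a latin1 connection without SET NAMES, so the stored cp1251 bytes are written unchanged
      mysqldump -uroot -c -e --default-character-set=latin1 --single-transaction --skip-set-charset --insert-ignore my_db -r dump.sql
      # Re-encode the dump from cp1251 to UTF-8 (-c drops unconvertible characters, -s silences warnings)
      iconv -sc -f cp1251 -t UTF8 dump.sql > utf.sql
      # Switch the table definitions in the dump from latin1 to utf8
      sed -i -e 's/CHARSET=latin1/CHARSET=utf8/g' utf.sql
      # Import the converted dump
      mysql -uroot my_db < utf.sql

      The DB is now "readable", in UTF8, and seems OK.
      I also set utf8 in config.php and in options/language.
      But now all Cyrillic characters show up as "????????????????" when I load the forum.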

      My forum is vB 4.2.5.

      Any ideas?
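
When a converted database looks fine but the forum shows question marks, it can help to first rule out the dump itself. The following is a minimal sketch, assuming an iconv build that knows CP1251 and UTF-8 (glibc's does); the helper name is_valid_utf8 and the sample files are made up for the demonstration. It relies on iconv exiting with a non-zero status when the input is not valid in the source encoding.

```shell
#!/bin/sh
# Return 0 if the file is entirely valid UTF-8, non-zero otherwise.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Demonstration with throwaway files.
tmpdir=$(mktemp -d)

# The bytes CF F0 E8 E2 E5 F2 are the Russian word "Привет" in cp1251.
printf '\317\360\350\342\345\362\n' > "$tmpdir/cp1251.txt"

# Same conversion step as in the pipeline above, on the sample file.
iconv -f CP1251 -t UTF-8 "$tmpdir/cp1251.txt" > "$tmpdir/utf8.txt"

if is_valid_utf8 "$tmpdir/utf8.txt"; then
    echo "utf8.txt: valid UTF-8"
fi
if ! is_valid_utf8 "$tmpdir/cp1251.txt"; then
    echo "cp1251.txt: not valid UTF-8"
fi

rm -rf "$tmpdir"
```

If is_valid_utf8 utf.sql succeeds on the real dump, the file is at least well-formed UTF-8, and the connection character set between PHP and MySQL becomes the more likely suspect.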


      • #4
        Re-import the language in UTF-8.



        • #5
          If config.php wasn't set to use UTF-8 for database communication before the change, try turning that setting off again by commenting out the line. This is what causes PHP to mess up UTF-8 communication.

          Wayne Luke


          • #6
            Guys,

            I think my eyes are failing me.

            Code:
            $config['Mysqli']['charset'] = 'utf8';
            This line was commented out!
            Now all is OK.
            Thx!