Can't get the search working with Swedish characters like åäö

  • Mr B
    Member
    • Mar 2007
    • 97
    • 5.3.x

    [Bug / Issue] Can't get the search working with Swedish characters like åäö

    I'm getting desperate here... I can't get the search working with Swedish characters like åäö, for example in Swedish fish names like gädda, gös, and färna. No words containing Swedish characters are searchable at all. The words display fine in the forum, but the search index strips all Swedish characters, so in the "words" database table the fish names are stored as "gdda", "gs", and "frna" instead of their correct names. This worked very well in version 4.2.5 but does not work in version 5.4.4. This must be fixed!
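To make the symptom concrete, here is a minimal Python sketch (not vBulletin's actual indexing code, which is PHP) that reproduces the same stored values. It assumes the indexer drops every character it cannot represent as ASCII; that is only one way to arrive at "gdda", and the helper names `strip_non_ascii` and `unicode_words` are made up for illustration.

```python
import re

def strip_non_ascii(word):
    """Drop every character that cannot be encoded as ASCII (assumed failure mode)."""
    return word.encode("ascii", "ignore").decode("ascii")

def unicode_words(text):
    """Split text into lowercase words; in Python 3 the word class is Unicode-aware for str patterns."""
    return re.findall(r"\w+", text.lower())

fish = ["gädda", "gös", "färna"]

# What the "words" table appears to contain according to the post.
stripped = [strip_non_ascii(w) for w in fish]   # ['gdda', 'gs', 'frna']

# What a Unicode-aware word splitter would keep.
kept = unicode_words("Gädda, gös och färna")    # ['gädda', 'gös', 'och', 'färna']
```

Comparing the two lists against the rows in the "words" table is a quick way to confirm that the characters are being dropped during indexing rather than when the posts are stored.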

    Some server info:
    Windows Server 2016 with IIS 10
    PHP 7.2.7 (default charset iso-8859-1)
    MySQL 5.7.20 (character set "latin1" and collation "latin1_swedish_ci")
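Since both PHP and MySQL are set to a Latin-1 style charset here, it may help to see how the same word looks as bytes in each encoding. The sketch below is a hypothetical illustration in Python, not a confirmed diagnosis: it shows that Latin-1 bytes read as UTF-8 with invalid sequences silently dropped would yield exactly "gdda".

```python
word = "gädda"

latin1_bytes = word.encode("latin-1")   # b'g\xe4dda' (one byte for ä)
utf8_bytes = word.encode("utf-8")       # b'g\xc3\xa4dda' (two bytes for ä)

# UTF-8 bytes misread as Latin-1 produce the classic two-character garbage.
mojibake = utf8_bytes.decode("latin-1")           # 'gÃ¤dda'

# Latin-1 bytes misread as UTF-8 are invalid; if the invalid byte is
# silently discarded, the letter simply disappears.
dropped = latin1_bytes.decode("utf-8", "ignore")  # 'gdda'
```

If the values in the "words" table look like `dropped` rather than `mojibake`, that would point toward bytes being discarded during decoding or tokenizing rather than a plain charset mislabel.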

    vBulletin info:
    vBulletin 5.4.4
    Search type is "DB Search"
    Swedish language pack installed

    I have tried loads of different settings in separate testvb environments and I can't get it to work…
    - I've tried several fresh installs of 5.4.4 with different character sets (both for PHP and in MySQL). For example, I've tried utf8 and utf8mb4 with the same problem.
    - I've tried installing the Sphinx search engine with no luck.
    - I've tried different language settings for the Swedish language pack with no luck.

    The same issue seems to exist here at the vBulletin forum. Try searching for "gädda", for example, and you will find nothing… try searching for "gdda" and you will find this thread.

    Why are the åäö characters removed when rebuilding the search index? Which settings affect which characters are included when the search index is rebuilt? Why is the "words" database table missing the Swedish characters? The åäö are also removed when new threads are created and their words are placed in the "words" table.

    As I mentioned before, this worked like a charm in vBulletin 4.2.5, and after the upgrade to 5.4.4 it stopped working… please help with this! I can't solve it by myself.
  • Wayne Luke
    vBulletin Technical Support Lead
    • Aug 2000
    • 74122

    #2
    You need to convert your database to UTF-8 character set and collation. We recommend utf8mb4 for the character set and utf8mb4_general_ci for the collation (how your searches are sorted). You then need to use a UTF-8 character set in your vBulletin language settings. And a UTF-8 Locale in the vBulletin language settings.

    Once that is done, you need to delete your search index and rebuild it.
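As a rough sketch of the database conversion step, the Python helper below only builds the MySQL statements; it does not run them. `ALTER DATABASE ... CHARACTER SET` and `ALTER TABLE ... CONVERT TO CHARACTER SET` are standard MySQL syntax, but the function name, the database name `testvb`, and the table list are assumptions for illustration, and this is not vBulletin's official conversion procedure.

```python
def build_conversion_sql(database, tables,
                         charset="utf8mb4",
                         collation="utf8mb4_general_ci"):
    """Return ALTER statements that convert a database and its tables.

    The statements are returned as strings so they can be reviewed
    before being run by hand against a backed-up database.
    """
    statements = [
        f"ALTER DATABASE `{database}` "
        f"CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements

# "testvb" is a hypothetical database name; "words" is the table named in this thread.
statements = build_conversion_sql("testvb", ["words"])
```

Take a full backup before running anything like this. `CONVERT TO` assumes the stored bytes really are in the charset the column currently declares; if they are not, the conversion can corrupt the text.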

    vBulletin 4 uses a completely different technology for searches and what works there isn't going to be very relevant here.

    Wayne Luke
    The Rabid Badger - a vBulletin Cloud demonstration site.
    vBulletin 5 API


    • Mr B
      Member
      • Mar 2007
      • 97
      • 5.3.x

      #3
      Originally posted by Wayne Luke
      You need to convert your database to UTF-8 character set and collation. We recommend utf8mb4 for the character set and utf8mb4_general_ci for the collation (how your searches are sorted). You then need to use a UTF-8 character set in your vBulletin language settings. And a UTF-8 Locale in the vBulletin language settings.

      Once that is done, you need to delete your search index and rebuild it.

      vBulletin 4 uses a completely different technology for searches and what works there isn't going to be very relevant here.

      But I have tried (as mentioned above) doing a clean, fresh install several times, both with utf8 and with utf8mb4. No success for me. I have also tried a lot of different locales in the language settings but haven't found one that works on Windows Server 2016 with IIS 10.

      The links provided in the locale settings help are old and do not have correct info. What is the correct UTF-8 locale to support Swedish characters like åäö on Windows Server with IIS?

      I have tried sv-SE, Swedish, swe etc with no luck.

      I have also tried adding sv_SE and sv_SE.UTF-8 directly in the language table in the database (they cannot be added through the admincp), and that partially works. When rebuilding the search index, the åäö appear in the words table, but not when posting new threads and posts. So I haven't found a solution that works, even for a fresh install...
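One way to narrow down which locale names the server OS accepts is to probe them directly. The sketch below uses Python's `locale.setlocale`, which calls the C runtime's `setlocale`; PHP's `setlocale()` does the same kind of lookup, but a PHP build under IIS may link a different runtime, so treat the result as indicative only. The function name `probe_locales` is made up, and the candidate list is just the names mentioned in this thread plus "C".

```python
import locale

def probe_locales(candidates):
    """Return the candidate locale names that the OS accepts for LC_CTYPE."""
    original = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    accepted = []
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
                accepted.append(name)
            except locale.Error:
                # The OS does not know this locale name.
                pass
    finally:
        # Always restore the setting that was active before probing.
        locale.setlocale(locale.LC_CTYPE, original)
    return accepted

CANDIDATES = ["C", "sv-SE", "Swedish", "swe", "sv_SE", "sv_SE.UTF-8"]
accepted = probe_locales(CANDIDATES)
```

A name that is missing from `accepted` is unknown to the OS, which would match the advice that the locale has to be installed at the server level.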


      • Wayne Luke
        vBulletin Technical Support Lead
        • Aug 2000
        • 74122

        #4
        I don't know how to make it work on a Windows Server. You should use a Linux Hosting provider. The locale has to be installed at the server OS level for it to work.

        The other option is to discuss this with your server administrator and get them to install the locale on the Windows server. Almost every website that talks about Locale is discussing Linux based operating systems. A certified Windows Administrator should know if it is possible on Windows Server.

        • Mr B
          Member
          • Mar 2007
          • 97
          • 5.3.x

          #5
          Originally posted by Wayne Luke
          I don't know how to make it work on a Windows Server. You should use a Linux Hosting provider. The locale has to be installed at the server OS level for it to work.

          The other option is to discuss this with your server administrator and get them to install the locale on the Windows server. Almost every website that talks about Locale is discussing Linux based operating systems. A certified Windows Administrator should know if it is possible on Windows Server.

          Do you have another person on the support staff who is familiar with Windows Servers? Maybe someone else can help me more?

          I have been running vBulletin on my site from 2002 until now, and in all other versions (except version 5.x.x) vBulletin has worked well under Windows Server. It must be a glitch in the search index function in 5.x.x when running on Windows Server with IIS. I have not found any documentation from vBulletin that says it will not work. Both the vBulletin manual and the admincp contain a lot of info about Windows servers, but it seems to be rather old information.


          • Wayne Luke
            vBulletin Technical Support Lead
            • Aug 2000
            • 74122

            #6
            Prior to version 5.0.0, vBulletin didn't even attempt to handle UTF-8 characters. If a character was outside the ISO-8859-1 character set, vBulletin would convert it to an HTML entity (e.g. åäö) and store it in MySQL that way. In addition, older versions of MySQL had problems storing UTF-8 characters. vBulletin 4 also used a fulltext index for the search engine, so it was able to do more than the current word-based index.
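The entity-based storage described above can be illustrated with a short Python sketch. It is an approximation using Python's standard codecs, not vBulletin 4's actual code: characters that fit in ISO-8859-1 are kept as-is, and anything outside it becomes a numeric HTML entity. The function name `to_latin1_with_entities` is made up for the example.

```python
import html

def to_latin1_with_entities(text):
    """Keep ISO-8859-1 characters; replace everything else with numeric entities."""
    return text.encode("latin-1", "xmlcharrefreplace").decode("latin-1")

# ä fits in ISO-8859-1 and is kept; the euro sign (U+20AC) does not and is escaped.
stored = to_latin1_with_entities("gädda €")   # 'gädda &#8364;'

# Reading the value back for display reverses the entity.
displayed = html.unescape(stored)             # 'gädda €'
```

Note that åäö themselves fit in ISO-8859-1, so under this scheme they would be stored as single Latin-1 bytes rather than as entities.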

            You're welcome to create a bug report and we can have someone look into it. We will need a copy of your database to investigate. Investigation will take time since the developers will need to request a Windows Server from the datacenter. All development and testing is done on Linux servers, though this doesn't matter in most cases because PHP is fairly agnostic about the OS as long as you're not doing a lot of system calls and command line work.

            As for support, I believe that I have the most experience with Windows Servers, and I spin one up from time to time for testing. Lately all of my work has been done using Apache, PHP, and MySQL within the Windows Subsystem for Linux because it is quicker than IIS and FastCGI by at least 100-fold and is more suitable for command line operations. However, I will bring this up at the support staff meeting next week and see if anyone else has a different opinion.