
  • WS Spiders List (for updated vBulletin "spiders_vbulletin.xml" files)

    #1
    ================================================

    NOTE: The spider list was last updated on Thursday 14th February 2013 @ 06:15 PM AEDT and has a total of 644 spiders.

    NOTE 2: As of Monday 1st July 2013, due to medical and lifestyle reasons, I will no longer be updating or maintaining the spiders list, and will no longer participate in the vBulletin community. At that point I will add the latest .xml file for people to continue using.

    ================================================

    Two Years On
    In the last year (two years since I launched this service) there have been an extra 41,000 downloads of the new spiders list, which means there have been an average of 122 downloads a day in the last year alone. This spiders list is still being added to and actively maintained.

    ================================================

    One Year On
    In the year since I launched this service there have been 8,401 downloads of the new spiders list, which is an average of 23 downloads a day, and I have added well over 100 new and updated spiders to the list since I took over from Dream.

    ================================================

    WS Spiders List

    ================================================

    Hi all,

    This is a follow-on thread to Dream's vBulletin Spiders Directory (for updated spiders_vbulletin.xml files) thread.

    As Dream has not updated his Spider List (which has 503 spiders) since early October 2009, I got the urge to update the spiders list. As I do not have access to Dream's source code, I decided to write my own Spider List service from scratch that emulates the system that Dream created.

    I would like to thank Dream for creating his service and maintaining the database for over a year and a half. My new service will hopefully pick up where his left off.

    The free service I have created allows you to download the updated "spiders_vbulletin.xml" file (currently with 515 spiders being identified) and this will hopefully increase weekly (with your help by submitting spiders).

    This is the official thread to submit spiders, and as long as the submitted spiders are not currently in the database, I will add them as soon as possible after they are submitted and I have confirmed they are valid.

    I hope you enjoy this new free service.

    Regards,

    Mosh Shigdar, former vBulletin Project Tools Developer.

    ================================================

    If you have questions about how this works read the FAQ:

    WS Spiders List FAQ

    ================================================

    How to Submit a Spider here:

    Post the following info in this thread:
    1. Spider name
    2. Spider Ident (aka user-agent) (It's not the IP address! Please read here for more info)
    3. Spider Website (optional)
    4. Spider Contact Email (optional)



    If you are not sure of the information to be submitted, then follow the instructions listed in the "I want to submit a spider. What do you need?" section of the FAQ.
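The four fields above, plus the two rules mentioned in this post (the ident is not an IP address, and a spider is only added if it is not already in the database), can be pictured with a small Python sketch. This is only an illustration, not the code behind the service: the `SpiderSubmission` class, the `check_submission` function and the `known_idents` set are hypothetical names made up for this example.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SpiderSubmission:
    """Hypothetical container for the four fields requested in this thread."""
    name: str                      # 1. Spider name
    ident: str                     # 2. Spider ident (part of the user-agent)
    website: Optional[str] = None  # 3. Spider website (optional)
    email: Optional[str] = None    # 4. Spider contact email (optional)


def looks_like_ip(text: str) -> bool:
    """Return True if the text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text.strip())
        return True
    except ValueError:
        return False


def check_submission(sub: SpiderSubmission, known_idents: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the submission looks usable.

    known_idents is assumed to hold lower-cased idents already in the list.
    """
    problems = []
    if not sub.name.strip():
        problems.append("missing spider name")
    if not sub.ident.strip():
        problems.append("missing spider ident")
    elif looks_like_ip(sub.ident):
        problems.append("ident is an IP address, not a user-agent")
    elif sub.ident.strip().lower() in known_idents:
        problems.append("ident is already in the list")
    return problems
```

For example, `check_submission(SpiderSubmission("Example", "192.168.0.1"), set())` reports the IP-address problem, while a submission with a name and a new user-agent fragment returns an empty list.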
    Last edited by Mosh; Sun 31st Mar '13, 10:18pm.

  • #2
    Sweet, Mosh! Thanks for this, updating mine now.



    • #3
      This one is new:
      Mozilla/5.0 (compatible; Search17Bot/1.1; http://www.search17.com/bot.php)

      This one is not good:

      Name: 80legs
      Ident: 008
      Info (website): http://www.80legs.com/spider.html
      Added on: Sat 22nd May 2010 @ 04:05am


      because the ident string is not unique (e.g. Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.20) Gecko/20081217 MRA 5.6 (build 03399) is identified as a bot with this ident, but it's not).
      Last edited by ALcorn; Tue 8th Jun '10, 12:42am. Reason: merging posts



      • #4
        Originally posted by ALcorn
        This one is new:
        Mozilla/5.0 (compatible; Search17Bot/1.1; http://www.search17.com/bot.php)
        Added to the list now.

        Originally posted by ALcorn
        This one is not good:

        Name: 80legs
        Ident: 008
        Info (website): http://www.80legs.com/spider.html
        Added on: Sat 22nd May 2010 @ 04:05am


        because the ident string is not unique (e.g. Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.20) Gecko/20081217 MRA 5.6 (build 03399) is identified as a bot with this ident, but it's not).
        There is nothing I can do about that, as it is its official "ident", see http://80legs.pbworks.com/FAQ#Webmasters

        It looks like this is down to how vBulletin itself compares the "spiders_vbulletin.xml" file with the user-agent of the visitor.

        So, you may want to report this as a bug on vBulletin's Bug Tracker - http://tracker.vbulletin.com/browse/VBIV

        Alternatively, you could always contact the authors of the 80legs spider and ask for a more unique "ident", which I can then use to update the list with.
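To picture the false positive, here is a minimal Python sketch. It assumes the comparison is a plain case-insensitive substring test, which is a guess at the behaviour and may not be exactly how vBulletin does it; the `match_spider` function is a hypothetical name, and "80legs" is used only as an illustration of a longer ident, not as a real one from the list.

```python
def match_spider(user_agent, idents):
    """Return the first ident found anywhere in the user-agent, ignoring case.

    Assumption: matching is a simple substring test, which may differ
    from what vBulletin actually does.
    """
    ua = user_agent.lower()
    for ident in idents:
        if ident.lower() in ua:
            return ident
    return None


# The ordinary browser user-agent reported earlier in this thread.
browser_ua = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.20) "
    "Gecko/20081217 MRA 5.6 (build 03399)"
)

# "008" occurs inside the build date "20081217", so the browser is matched.
print(match_spider(browser_ua, ["008"]))

# A longer, more distinctive ident (hypothetical) does not match the browser.
print(match_spider(browser_ua, ["80legs"]))
```

Under that assumption, any short numeric ident is likely to collide with version numbers or dates in ordinary user-agents, which is why a more distinctive ident from the spider's authors would help.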
        Last edited by Mosh; Tue 8th Jun '10, 1:11am.



        • #5
          Thanks for everything!
          Originally posted by Mosh
          There is nothing I can do about that, as it is its official "ident", see http://80legs.pbworks.com/FAQ#Webmasters
          I think it's a problem for this bot's owners too.



          • #6
            Hi all,

            Major update to the list - now stands at 560 spiders. So, head over to the WS Spiders List (see link in signature or in first post) and download an updated list.

            Regards,

            Mosh.



            • #7
              A bug report that might interest you guys:

              http://tracker.vbulletin.com/browse/VBIV-4390

              It's about how visitors get identified as spiders by the idents, which I haven't been able to test.

              Also new list downloaded and using it on my forums now, thanks Mosh.



              • #8
                Originally posted by Dream
                A bug report that might interest you guys:

                http://tracker.vbulletin.com/browse/VBIV-4390

                It's about how visitors get identified as spiders by the idents, which I haven't been able to test.

                Also new list downloaded and using it on my forums now, thanks Mosh.
                Thanks Dream

                I have also not been able to replicate this bug. If anyone does (and they have the latest WS Spider List installed), could they take a screenshot (with show useragents enabled) and post it here or in the bug report Dream posted above, so that hopefully it can get sorted.



                • #9
                  Originally posted by Mosh
                  I have also not been able to replicate this bug.
                  I can confirm that Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html) is correctly recognised as Cuil Spider, at least on my installation of vBulletin v4.0.3 PL1 with the latest spiders list (and I have posted my feedback to the bug report). So, if anyone does replicate it (and they have the latest WS Spider List installed), could they take a screenshot (with show useragents enabled) and post it in the bug report Dream posted above, so that hopefully it can get sorted.



                  • #10
                    I believe it is just "Twiceler" that doesn't get recognized, then; this was reported to me on the other thread.



                    • #11
                      I have never seen the "twiceler." ident appear on my forum while I have been logged in, so I cannot confirm whether it is the culprit or not. I do see the other ident on my forum daily. So, if someone else can confirm that this is the issue, then please post it to the bug tracker.



                      • #12
                        Removed a duplicate, added a new spider.



                        • #13
                          Originally posted by Mosh
                          Removed a duplicate, added a new spider.
                          Sorry, what does that mean? If you have updated the script, could you please say the script has been updated? Providing additional information would be fine, of course. I don't mean to be a nit-picker about it, but it may help eliminate any confusion at all if you said something similar to "script updated..."

                          Thanks, I appreciate the work you put into this.

                          Jim



                          • #14
                            It means the spiders list has been updated; the link to the spiders list is in the first post. I am not going to post in elaborate detail each and every time I add, remove or edit a spider (or spiders) on the spider list. At most I am going to indicate how many have been added, removed or edited (to update them with correct details) and amend the first post with the correct spider count.

                            BTW, I just added another spider to the list. Now there are a total of 561 spiders on the list.



                            • #15
                              I understand and, to be clear, I wasn't asking for any details. I only wanted to know if the script had been updated. That's all I really care about, although many others might like more info than that. My only problem was your previous message didn't say the script was updated. I was confused, so wanted clarification.

                              The easiest thing for anyone following this thread in the forum is to click the first unread link. That doesn't go to the first page. The custom previously was to make a simple announcement that the script had been updated. I would then usually just click the link in the signature to go to where I could download the updated script. I'm hoping a similar routine could be established to make it easy for anyone to know if it's been updated and then to click in your signature to get the script.

                              Thanks again,

                              Jim
                              PS: here's a transparent version of your avatar - it might look better in the postbit.
