View Full Version : Smarter URL parsing
mjames
Fri 22nd Feb '02, 7:17pm
When I type in a URL, I either have to wrap it in vB code (more typing than I should have to do) or put a space after it, which looks peculiar.
For example:
http://www.vbulletin.com,
should be:
http://www.vbulletin.com
This particularly comes about for me when I'm typing a post and have to include the URL before a comma or period. I'd like to see vB be able to recognize the punctuation and not parse the URL with that in it as the URL becomes defunct with a comma or period at the end. I can't tell you how many times I've clicked on links from forums and had to manually correct the URL myself.
tubedogg
Fri 22nd Feb '02, 7:25pm
The problem, at least with periods, is that a period is a valid character in a URL. There is no way for vBulletin to tell whether the following URL should have the period at the end or not:
http://www.vbulletin.com/file.
Commas on the other hand are not, at least on Unix, and so they could be filtered out, feasibly...
mjames
Fri 22nd Feb '02, 7:28pm
Good point - but filtering out commas seems like a good idea. :)
Wayne Luke
Fri 22nd Feb '02, 9:18pm
A lot of large sites like CNN, ZDNET and MSNBC use commas in their URLs. Considering ZDNET's strong anti-Microsoft zeal, I would hazard a guess that they don't use Windows for their servers. However, since it isn't actually file names we have to worry about here but URLs, I would hazard a guess that commas are valid characters in a URL.
The problem isn't the characters; the problem is getting it to recognize the space after the character. Mike has said in the past that he has worked and worked on this parsing issue and it just wasn't working. IIRC, it doesn't work at all on PHP 3.0.x and it only works half the time on 4.0.x. With newer versions of PHP this type of parsing may be more reliable, but relying on them would break compatibility with older versions.
Anyway, I think this is a constant project and when it works it will be put into place.
JamesUS
Sat 23rd Feb '02, 3:48am
No URL should have an ending comma or period and then nothing else after it...so if we can figure out a way to filter these out with regexps then that should be fine. It's just coming up with a regexp that works well and doesn't slow anything down.
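To make the idea concrete, here is a minimal sketch of that kind of filter. It is written in Python for illustration only and is not vBulletin's actual parser; the pattern `URL_RE` and the function names `split_trailing_punctuation` and `autolink` are hypothetical. It assumes a bare URL is simply a run of non-space characters starting with http:// or https://, and it moves any trailing periods or commas outside the link.

```python
import re

# Assumption: a bare URL is a run of non-space characters after http:// or https://.
URL_RE = re.compile(r"https?://\S+")


def split_trailing_punctuation(url):
    """Return (link, trailing), where trailing is any run of '.' or ',' at the end."""
    link = url.rstrip(".,")
    return link, url[len(link):]


def autolink(text):
    """Wrap bare URLs in [url] tags, leaving trailing periods and commas outside."""
    def repl(match):
        link, trailing = split_trailing_punctuation(match.group(0))
        return "[url]" + link + "[/url]" + trailing

    return URL_RE.sub(repl, text)


if __name__ == "__main__":
    print(autolink("See http://www.vbulletin.com, then reply."))
```

A single pass with one simple pattern like this should be cheap, but it will also strip the period from a URL that really does end in one.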
tubedogg
Sat 23rd Feb '02, 10:14am
Originally posted by JamesUS
No URL should have an ending comma or period and then nothing else after it...
Not saying that it happens often but to say it shouldn't isn't accurate, as it can...
http://www.idlewords.net/file.
is a valid URL to a file on my server.
JamesUS
Sat 23rd Feb '02, 11:16am
True, but I think it would be an acceptable solution if we could filter out URLs like that and if someone wants to link to one of those they just have to enclose it with [url] tags.
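As a rough sketch of how that could behave (Python for illustration, not vBulletin's real code; `TOKEN_RE` and `autolink_respecting_tags` are hypothetical names), the parser could leave anything already inside [url] tags untouched and strip trailing periods and commas from bare URLs only:

```python
import re

# Match either an explicit [url]...[/url] or [url=...]...[/url] block, or a bare URL.
# The tagged form is listed first so it wins when both could match at a position.
TOKEN_RE = re.compile(
    r"\[url(?:=[^\]]*)?\].*?\[/url\]|https?://\S+",
    re.IGNORECASE | re.DOTALL,
)


def autolink_respecting_tags(text):
    """Auto-link bare URLs without trailing '.' or ','; keep explicit tags as typed."""
    def repl(match):
        token = match.group(0)
        if token.startswith("["):
            # Explicitly tagged by the poster: trust it, trailing period and all.
            return token
        link = token.rstrip(".,")
        return "[url]" + link + "[/url]" + token[len(link):]

    return TOKEN_RE.sub(repl, text)


if __name__ == "__main__":
    sample = "[url]http://www.idlewords.net/file.[/url] and http://www.vbulletin.com."
    print(autolink_respecting_tags(sample))
```

With this approach the unusual case (a URL that really ends in a period) costs the poster a pair of tags, while the common case needs no extra typing.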
GameCrash
Mon 25th Feb '02, 10:09am
Why not remove a trailing , or . only when the URL is just a domain name? So if I write http:/ /www.domain.tld, or http:/ /www.domain.tld/. the trailing . or , will be removed (patterns like "http:/ /www.*.*." and "http:/ /www.*.*/.", and the same with ,), while a . or , in the path will not be removed...
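A minimal sketch of this rule, in Python for illustration only (the function name `strip_if_domain_only` is hypothetical, and the check for "just a domain name" is a simplifying assumption: nothing follows the host except an optional slash):

```python
def strip_if_domain_only(url):
    """Strip trailing '.' and ',' only when the URL has no path beyond an optional '/'."""
    candidate = url.rstrip(".,")
    # Drop the scheme (e.g. "http://") so only host and path remain.
    rest = candidate.split("://", 1)[1] if "://" in candidate else candidate
    host, _, path = rest.partition("/")
    if path == "":
        # Domain only (with or without a trailing slash): safe to strip.
        return candidate
    # There is a real path, so the punctuation may belong to the file name.
    return url


if __name__ == "__main__":
    for example in (
        "http://www.vbulletin.com,",
        "http://www.vbulletin.com/.",
        "http://www.idlewords.net/file.",
    ):
        print(example, "->", strip_if_domain_only(example))
```

This keeps a URL like the file example intact, at the price of still linking the trailing comma or period whenever a path is present.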