Comment On pider Detection

"I came across this snippet in our header file," wrote David, "it's a basic webspider detector that is used later on to record certain actions differently if $is_spider was set to 1."

Re: pider Detection

2008-08-18 09:11 • by Sa (unregistered)
I like it. Simple, creative, easy to understand. And it even works most of the time.

Re: pider Detection

2008-08-18 09:11 • by wtfrox
rillant!

Re: pider Detection

2008-08-18 09:17 • by Nikolai (unregistered)
Actually, this indicates that the developer knows what he is doing and tries to write the most efficient code possible (the kind of thing a lot of modern software engineers lack). Definitely not WTF.

Re: pider Detection

2008-08-18 09:19 • by delenit (unregistered)
ee roblems or ther oders hen hey ry o igure ut hat s oing n n his ode...

Re: pider Detection

2008-08-18 09:27 • by diaphanein (unregistered)
212521 in reply to 212518
Nikolai:
Actually, this indicates that the developer knows what he is doing and tries to write the most efficient code possible (the kind of thing a lot of modern software engineers lack). Definitely not WTF.
May your career die a horrible death porting code for the unix epoch fail.

Re: pider Detection

2008-08-18 09:28 • by frustrati (unregistered)
212522 in reply to 212518
Nikolai:
Actually, this indicates that the developer knows what he is doing and tries to write the most efficient code possible (the kind of thing a lot of modern software engineers lack). Definitely not WTF.
Yes, because there is absolutely no chance that PHP has an "inarray" type function. After all, PHP hardly has any library functions...

Re: pider Detection

2008-08-18 09:28 • by Greg (unregistered)
212523 in reply to 212518
Nikolai:
Actually, this indicates that the developer knows what he is doing and tries to write the most efficient code possible (the kind of thing a lot of modern software engineers lack). Definitely not WTF.

No, this just means the developer can't be bothered to either read the documentation or stop for one second and think. So instead of "someone must have had this problem before, let me read the manual" something entirely different went through his/her head: "Stupid C! Stupid computers! Nothing works! I hate this, why am I here" :)

Re: pider Detection

2008-08-18 09:32 • by roe
Hah! I actually like his style... He managed not only to make it resilient to differing first-letter case (googlebot or Googlebot, anyone?), but also to save a bunch of letters (the 'signatures' as well as the "=== false").

As a previous commenter put it... Absolutely rillant!

Re: pider Detection

2008-08-18 09:32 • by Val (unregistered)
And one more thing: 'pider' is kind of 'q...er' in Russian :(

Re: pider Detection

2008-08-18 09:32 • by powerlord
212526 in reply to 212514
Sa:
I like it. Simple, creative, easy to understand. And it even works most of the time.

This is one of those things where the "correct" solution only takes a few seconds to fix, though.

For that matter, the PHP online manual page for strpos() even explains why you should use === to check the return value.
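
A minimal sketch of that pitfall, assuming a made-up user-agent string (the variable names are illustrative and not taken from David's snippet): strpos() returns 0 when the needle sits at the very start of the haystack, and 0 is falsy under a loose check.

```php
<?php
// Hypothetical user agent that starts with the spider's name.
$agent = 'Googlebot/2.1 (+http://www.google.com/bot.html)';

$pos = strpos($agent, 'Googlebot');   // int(0): found at offset 0

// Loose check: 0 is falsy, so the match is missed.
$loose = $pos ? 1 : 0;

// Strict check: only a real boolean false means "not found".
$strict = ($pos !== false) ? 1 : 0;

echo "loose=$loose strict=$strict\n";   // prints "loose=0 strict=1"
```

The strict comparison is the only change needed; the full spider name can stay in the source.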

Re: pider Detection

2008-08-18 09:34 • by MET
212527 in reply to 212518
Nikolai:
Actually, this indicates that the developer knows what he is doing and tries to write the most efficient code possible (the kind of thing a lot of modern software engineers lack). Definitely not WTF.

In my 13 years of experience, a focus on efficiency first is always the mark of a n000b. After a while most realise that correctness and maintainability are almost always more important, and only a very few places need to be coded as if every cycle counts. I agree a good developer should know how to code very efficiently, I just think they should know not to do so most of the time.

Re: pider Detection

2008-08-18 09:34 • by heise (unregistered)
He he, that's funny.

Still, I'm not sure what surprised me more:
a) The "ingenuity" of developer.
b) That the function can return "0" AND "FALSE" and that's not the same.
c) That "strpos" is not appropriate function for searching substrings inside string.

Re: pider Detection

2008-08-18 09:35 • by powerlord
212529 in reply to 212524
roe:
He managed not only to make it resilient to differing first-letter case (googlebot or Googlebot, anyone?)

I can do that by adding one character (an i after str).

Re: pider Detection

2008-08-18 09:36 • by unting eb evelopers (unregistered)
212530 in reply to 212518
You're fucking joking, right? If you truly believe this then you have no fucking idea and have no right to even compare yourself to software engineers, let alone criticize.
How can you possibly put up a shitty, halfwitted string compare saving an entire one character as a paragon of efficiency when you routinely don't bat an eye at stuff that is 100% string and/or XML based, use interpreted scripting languages, javascript, reflection/runtime class interrogation and think it's ok to transmit 1000000 bytes to describe a basic screen with minimal functionality?

Copyright Infringement

2008-08-18 09:38 • by ParkinT
Don't you realize that each of those spider names is a Registered Trademark?
If they appeared in the source code, David's company would be obliged to pay usage fees!

{or post a disclaimer in the comments}

Re: pider Detection

2008-08-18 09:39 • by Jonathan (unregistered)
212533 in reply to 212525
Val:
And one more thing: 'pider' is kind of 'q...er' in Russian :(


Quaker? Or can I buy a vowel?

Re: pider Detection

2008-08-18 09:40 • by ideo (unregistered)
212534 in reply to 212526
powerlord:
Sa:
I like it. Simple, creative, easy to understand. And it even works most of the time.

This is one of those things where the "correct" solution only takes a few seconds to fix, though.

For that matter, the PHP online manual page for strpos() even explains why you should use === to check the return value.

While the code above is a WTF, a function call that either returns a boolean OR a number is a bigger WTF.

Re: pider Detection

2008-08-18 09:41 • by Pidgeot
212535 in reply to 212528
frustrati:
Yes, because there is absolutely no chance that PHP has an "inarray" type function. After all, PHP hardly has any library functions...


in_array wouldn't work unless he puts the complete UA string in there, which isn't a very good idea.

heise:
c) That "strpos" is not appropriate function for searching substrings inside string.


This surprises me too, since I don't see any indication this is the case.

Re: pider Detection

2008-08-18 09:44 • by damnum (unregistered)
Robert Donnelly:
Dude? What is the difference between 0 and false?

RD
www.useurl.us/12m
In a strongly typed language a nice huge about.
In PHP it depends on the location of a sig in forum software that does not support sigs.

So for you the difference is 'z', but for me it is 'awk { $1 }' as I don't have a sig line.

Re: pider Detection

2008-08-18 09:44 • by Val (unregistered)
212537 in reply to 212533
Stop kidding! ('f...ggot' also suits well).

Re: pider Detection

2008-08-18 09:45 • by powerlord
212538 in reply to 212534
ideo:
powerlord:
Sa:
I like it. Simple, creative, easy to understand. And it even works most of the time.

This is one of those things where the "correct" solution only takes a few seconds to fix, though.

For that matter, the PHP online manual page for strpos() even explains why you should use === to check the return value.

While the code above is a WTF, a function call that either returns a boolean OR a number is a bigger WTF.

Unfortunately PHP is filled with functions that do that. The alternative would have been to return negative numbers for failures... or to throw an Exception.

Re: pider Detection

2008-08-18 09:47 • by Quietust
I'm surprised he didn't just do "strpos(' '.$agent, $spider_name)" - it's what I use in a small script on my site (which I use to redirect traffic from my site's old hostname to the new one; search engines get a "301 Moved Permanently" response to make them update their index, while normal clients get a "300 Multiple Choices" to force them to read the page and either report a broken link or update their damn bookmarks), and it works nicely.
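
A rough sketch of the space-prefix trick described above; the helper name agent_contains() is hypothetical and only wraps the one-liner for illustration. Because the haystack gains a leading space, a match can never start at offset 0, so any hit yields a position of at least 1.

```php
<?php
// Hypothetical helper; only the strpos(' ' . ...) expression comes from the comment.
function agent_contains($agent, $spider_name) {
    // A hit is always at offset >= 1 (truthy); a miss is false.
    return (bool) strpos(' ' . $agent, $spider_name);
}

var_dump(agent_contains('Googlebot/2.1', 'Googlebot'));  // bool(true)
var_dump(agent_contains('Mozilla/5.0', 'Googlebot'));    // bool(false)
```

This trades one string concatenation for being able to keep the loose truthiness check.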

Re: pider Detection

2008-08-18 09:48 • by brodie
212540 in reply to 212527
MET:
I agree a good developer should know how to code very efficiently, I just think they should know not to do so most of the time.

In your mind, a "good developer" should know that they shouldn't write efficient code most of the time?

I'm sure glad you don't work with me. We obviously have extremely different views on what makes a "good" developer. Writing inefficient code "most of the time" is not the hallmark of a "good developer."

Re: pider Detection

2008-08-18 09:48 • by Harun (unregistered)
everybody can fake an agent string.
why not take reverse dns and use it?
see:
http://livebookmark.net/journal/2007/04/11/sitemaps-in-the-robotstxt-happy-harvesting/
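
A rough sketch of that idea, usually called forward-confirmed reverse DNS. The function name, the suffix list and the injectable resolver parameters are all assumptions made for illustration; only gethostbyaddr() and gethostbyname() are real PHP functions.

```php
<?php
// Hypothetical helper: returns true only if the IP reverse-resolves to a
// host under one of the given suffixes AND that host resolves back to the IP.
function is_verified_crawler($ip, array $suffixes,
                             $reverse = 'gethostbyaddr',
                             $forward = 'gethostbyname') {
    $host = call_user_func($reverse, $ip);

    // gethostbyaddr() returns false on malformed input and the
    // unmodified IP when the lookup fails.
    if ($host === false || $host === $ip) {
        return false;
    }

    foreach ($suffixes as $suffix) {
        // The host name must end with the expected domain suffix...
        if (substr($host, -strlen($suffix)) === $suffix) {
            // ...and must resolve forward to the same address.
            return call_user_func($forward, $host) === $ip;
        }
    }
    return false;
}
```

A call might look like is_verified_crawler($_SERVER['REMOTE_ADDR'], array('.googlebot.com')), with the suffix list being whatever the search engine in question documents. The resolvers are parameters only so the logic can be exercised without network access.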

Re: pider Detection

2008-08-18 09:49 • by Nick Johnson (unregistered)
Actually, there's a pseudo-plausible reason to do this: strpos(blah, "pider") matches both "spider" and "Spider".

Re: pider Detection

2008-08-18 09:52 • by powerlord
212544 in reply to 212542
Nick Johnson:
Actually, there's a pseudo-plausible reason to do this: strpos(blah, "pider") matches both "spider" and "Spider".

So does stripos(blah, "Spider")
except that stripos also matches "sPider", "SPIDER", etc... I imagine it just uses strtolower or strtoupper on both strings first.
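
A small comparison, assuming made-up user-agent fragments, of stripos() with the full name against strpos() with the first letter chopped off:

```php
<?php
// Hypothetical user-agent fragments differing only in case.
$variants = array('spider', 'Spider', 'sPider', 'SPIDER');

$results = array();
foreach ($variants as $name) {
    $agent = "ExampleCrawler-$name/1.0";
    $results[$name] = array(
        // Case-insensitive search for the full name.
        'stripos' => stripos($agent, 'Spider') !== false,
        // Case-sensitive search for the truncated name.
        'pider'   => strpos($agent, 'pider') !== false,
    );
}

var_dump($results);
```

Here stripos() matches all four variants, while the truncated needle only matches "spider" and "Spider".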

Re: pider Detection

2008-08-18 09:58 • by Andy Goth
212547 in reply to 212542
Nick Johnson:
Actually, there's a pseudo-plausible reason to do this: strpos(blah, "pider") matches both "spider" and "Spider".
Yeah, sometimes I do the same trick (leave off the first character) if I don't know what case it will be in. For example, CHAT scripts. "ogin:", "assword:" Hehehe, assword. :^)

Re: pider Detection

2008-08-18 10:00 • by Gustav (unregistered)
I agree with all of you who say that PHP is a poor language. I mean, that's why Facebook, Wikipedia, Yahoo!, Digg, Sourceforge and Flickr are built on it, right?

Re: pider Detection

2008-08-18 10:07 • by Nath (unregistered)
212553 in reply to 212548
Gustav:
I agree with all of you who say that PHP is a poor language. I mean, that's why Facebook, Wikipedia, Yahoo!, Digg, Sourceforge and Flickr are built on it, right?


What a cute comeback! It's a shame that nobody mentioned PHP was poor, but otherwise it was delightful. Well done.

Re: pider Detection

2008-08-18 10:13 • by biziclop (unregistered)
212556 in reply to 212548
Gustav:
I agree with all of you who say that PHP is a poor language. I mean, that's why Facebook, Wikipedia, Yahoo!, Digg, Sourceforge and Flickr are built on it, right?


If you feel yourself inferior because you are a PHP developer that's not our fault. :)

Re: pider Detection

2008-08-18 10:14 • by Smash King
Robert Donnelly:
Dude? What is the difference between 0 and false?

RD
<someURI>
Hmm I always thought this was a bot but this time it actually said something related to the article.

Probably it is a human as smart as a bot, then.

Re: pider Detection

2008-08-18 10:14 • by troels (unregistered)
Introducing:
piderman baman

Re: pider Detection

2008-08-18 10:15 • by Canthros (unregistered)
212559 in reply to 212540
brodie:
In your mind, a "good developer" should know that they shouldn't write efficient code most of the time?

I suspect the point was that maintainability is or should be a higher priority than efficiency in most modern code. Premature optimization and all that, plus the benefits of having code that your developers can still read six months or even six years down the road may be substantially greater than saving a clock cycle or three here and there.

Re: pider Detection

2008-08-18 10:16 • by AT (unregistered)
212560 in reply to 212540
brodie:
MET:
I agree a good developer should know how to code very efficiently, I just think they should know not to do so most of the time.

In your mind, a "good developer" should know that they shouldn't write efficient code most of the time?

I'm sure glad you don't work with me. We obviously have extremely different views on what makes a "good" developer. Writing inefficient code "most of the time" is not the hallmark of a "good developer."


Really? So in other words, you *don't* evaluate the trade-offs between clarity and efficiency (where they diverge) on a case-by-case basis and choose the solution that best meets your overall design goals? You just always choose efficiency without regard to the problem, technology, or environment at hand?

I'm glad I don't work with *you*!

Re: pider Detection

2008-08-18 10:17 • by andy (unregistered)
212562 in reply to 212548
well that's like saying a million flies can't be wrong, manure must be tasty!

Re: pider Detection

2008-08-18 10:24 • by biziclop (unregistered)
212565 in reply to 212560
AT:
brodie:
MET:
I agree a good developer should know how to code very efficiently, I just think they should know not to do so most of the time.

In your mind, a "good developer" should know that they shouldn't write efficient code most of the time?

I'm sure glad you don't work with me. We obviously have extremely different views on what makes a "good" developer. Writing inefficient code "most of the time" is not the hallmark of a "good developer."


Really? So in other words, you *don't* evaluate the trade-offs between clarity and efficiency (where they diverge) on a case-by-case basis and choose the solution that best meets your overall design goals? You just always choose efficiency without regard to the problem, technology, or environment at hand?

I'm glad I don't work with *you*!


We're all glad we don't work with each other, after all who wants to work with people who spend all day surfing on websites?

But your problem is with the definition of "efficiency". The code in this article should not be considered efficient by any definition, because only code that does the job properly should be ranked on a scale of efficiency, and this doesn't. It shouldn't even be considered for efficiency.

Re: pider Detection

2008-08-18 10:27 • by MG! (unregistered)
hat's enious, hat s!

Re: pider Detection

2008-08-18 10:39 • by Waffle (unregistered)
212571 in reply to 212529
powerlord:
roe:
He managed not only to make it resilient to differing first-letter case (googlebot or Googlebot, anyone?)

I can do that by adding one character (an i after str).

Maybe so, but He did it by *removing* letters... which is obviously better.

Re: pider Detection

2008-08-18 10:43 • by cookre (unregistered)
One is reminded of Michael Jackson's (no, not THAT one) rules of code optimization:

Rule #1 - Don't do it.

Rule #2 (for experts only) - Don't do it yet.

Re: pider Detection

2008-08-18 10:52 • by fmobus (unregistered)
212575 in reply to 212539
Quietust:
I'm surprised he didn't just do "strpos(' '.$agent, $spider_name)" - it's what I use in a small script on my site (which I use to redirect traffic from my site's old hostname to the new one; search engines get a "301 Moved Permanently" response to make them update their index, while normal clients get a "300 Multiple Choices" to force them to read the page and either report a broken link or update their damn bookmarks), and it works nicely.


Well, your solution requires a temporary string to be spawned and could cost you some cycles. Stop being lazy and just do what TFM tells you to:
if (strpos($haystack, $needle) !== false) {}

But yeah, I've got to admit the original code was clever; but it is simply wrong from a clarity standpoint.

The Real WTF(tm) is a library function returning int or bool. It should rather behave like Java's indexOf or Python's find(); both return -1 if the haystack does not contain the needle. It makes more sense that way: you're testing the position of a substring, which is a number. A boolean would be expected if you're testing if string contains substring, regardless of position.
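
For illustration only, a thin wrapper that gives strpos() the -1 convention. The name index_of() is hypothetical; PHP itself ships no such function.

```php
<?php
// Hypothetical wrapper: always returns an int, with -1 meaning "not found".
function index_of($haystack, $needle) {
    $pos = strpos($haystack, $needle);
    return ($pos === false) ? -1 : $pos;
}

var_dump(index_of('Googlebot', 'Google'));  // int(0)
var_dump(index_of('Googlebot', 'bot'));     // int(6)
var_dump(index_of('Googlebot', 'Yahoo'));   // int(-1)
```

With a single return type, callers can test `index_of(...) >= 0` without worrying about 0 versus false.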

Re: pider Detection

2008-08-18 10:57 • by benh (unregistered)
Where is the WTF? It's not the most visually appealing, but it would clearly work and is not too roundabout.

Re: pider Detection

2008-08-18 11:06 • by brazzy
212581 in reply to 212540
brodie:
I'm sure glad you don't work with me. We obviously have extremely different views on what makes a "good" developer. Writing inefficient code "most of the time" is not the hallmark of a "good developer."

Yes, it is. Because the most efficient code possible is usually unmaintainable garbage.

In nearly all applications, nearly all of the code has absolutely no reason to be efficient because it will be executed so rarely that it does not matter at all. Yes, even in embedded systems. It needs to be correct and maintainable first. Then, and only then does efficiency enter the picture, and *if* it runs too slow or uses too much memory, you do some profiling to see where the code needs to be more efficient.

Until you understand that, you're not a software engineer, you're not a developer, you're a wannabe cowboy coder.

Re: pider Detection

2008-08-18 11:25 • by WhiskeyJack
212586 in reply to 212547
Andy Goth:
For example, CHAT scripts. "ogin:", "assword:" Hehehe, assword. :^)


You mean buttword.

Re: pider Detection

2008-08-18 11:25 • by dmh2000 (unregistered)
that's a poorly designed API. if a call returns an integer most of the time, why not do like nearly every other language and have it return -1 if the string is not found?

Re: pider Detection

2008-08-18 11:32 • by Dave (unregistered)
Staggers me the amount of PHP developers who never heard of stristr(), including all commenters here, it seems.
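
A short sketch of the stristr() approach, assuming a made-up user-agent string. Note that stristr() returns the matched tail of the haystack rather than a position, so a strict comparison against false is still the safe check.

```php
<?php
// Hypothetical user agent with a lowercase spider name.
$agent = 'Mozilla/5.0 (compatible; googlebot/2.1)';

// Case-insensitive search: returns the haystack from the match onward,
// or false when the needle is absent.
$rest = stristr($agent, 'Googlebot');

$is_spider = ($rest !== false) ? 1 : 0;

// The returned tail can itself be falsy, e.g. the string "0".
$tail = stristr('build 0', '0');
var_dump($rest, $is_spider, $tail);
```

The last line is why a bare `if (stristr(...))` is not quite equivalent to the strict test.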

Re: pider Detection

2008-08-18 11:35 • by W. Snapper (unregistered)
I just did this on a recent project, to find the status of account applications. Their database had both "Approved" and "approved," "declined" and "Declined," etc.

Why would I not simply omit the first letter? And for those who are somehow claiming clarity, is there anyone who didn't instantly understand the code?

Re: pider Detection

2008-08-18 11:41 • by califa (unregistered)
212594 in reply to 212526
It eludes me why they couldn't make strpos() return -1 in case the "needle" was not found.

Re: pider Detection

2008-08-18 11:50 • by Dave (unregistered)
212597 in reply to 212587
It could be worse: in Perl you would probably pass in a hash of hashes containing any number of needles and haystacks, and you would get back either "FALSE", the position of the needle in the haystack, a hash containing the index of which haystack the needle was in and, for some reason, the name of the implementor's cat, or FILE_NOT_FOUND.

And of course the behaviour would be dependent on random symbols in global scope.

Re: pider Detection

2008-08-18 11:54 • by G (unregistered)
212600 in reply to 212587
because -1 is perfectly legal argument, for example, start or length of a substring (php.net/substr)
therefore
to prevent people from shooting their leg and not realizing it by directly using its return value as a parameter to another function, ie
substr($s, 0, strpos($s, '%'))
will return 'a' for 'a%b', but 'ab' for 'abc'
having strpos return false will issue a warning, under normal circumstances

do not criticize something you haven't thoroughly researched
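
The two substr() results mentioned above can be reproduced with this sketch; the second call hard-codes -1 to stand in for a hypothetical strpos() that returned -1 on a miss.

```php
<?php
// Needle present: strpos() gives 1, so substr() keeps the first character.
$with_hit = substr('a%b', 0, strpos('a%b', '%'));

// Needle absent, under the hypothetical -1 convention: a length of -1
// means "everything except the last character", silently.
$with_miss = substr('abc', 0, -1);

var_dump($with_hit, $with_miss);  // string(1) "a", string(2) "ab"
```

The second result looks plausible, which is exactly why the mistake would be easy to overlook.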