Hacker News

> Perplexity crawled it anyway

No, they did not. Crawling = recursive fetching, which wasn't what was happening here.

But also, I don't think there is anything wrong with ignoring robots.txt. In fact, I believe it is discriminatory and people should ignore it. See: https://wiki.archiveteam.org/index.php/Robots.txt





> I don't think there is anything wrong with ignoring robots.txt

Neither do I; I just thought your reply was disingenuous.

> Crawling = recursive fetching

I do not find this convincing. I am ok with using the word crawler for recursive fetching only. But robots.txt is not only for excluding crawlers and never has been. From the very beginning it was used to exclude specific automated clients, whether they only fetch one page or many, and that is certainly how the vast majority of people think about it today.
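
To make that concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, and the `is_allowed` helper are all made up for illustration; nothing here is taken from the article. The point is only that the check is keyed on user agent and URL, so it applies just as much to a single fetch as to a recursive crawl.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named client, allow everyone else.
# "ExampleBot" is an invented user agent, not a real product.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Nothing in this check knows or cares whether the client intends to
    follow links afterwards; it is evaluated per request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some/page"
    print("ExampleBot:", is_allowed("ExampleBot", page))
    print("OtherBot:", is_allowed("OtherBot", page))
```

A client that fetches exactly one page on a user's behalf would run the same lookup as a crawler walking the whole site, and would get the same answer.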

Like I implied in my first comment, I have no problem with you saying you dislike robots.txt, but it is not reasonable to pretend the article is unclear in some way.



