Rose here. Also @umbraroze for non-kbin stuff.

  • 1 Post
  • 2 Comments
Joined 1 year ago
Cake day: June 14th, 2023

  • Yup. The robots.txt file isn’t only meant to keep robots off the site altogether; it’s also meant to steer bots away from resources that aren’t interesting for human readers, even indirectly.

    For example, MediaWiki installations are pretty clever in that, in the conventional setup, /w/ is blocked and /wiki/ is encouraged, because nobody wants technical pages and wiki histories in search results; they only want the current versions of the pages.

    Fun tidbit: in the late 1990s, there was a real epidemic of spammers scraping web pages for email addresses. Some people developed wpoison.cgi, a script whose sole purpose was to generate garbage web pages with bogus email addresses. Real search engines ignored these, thanks to robots.txt. Guess what the spam bots did?

    Do the AI bros really want to go there? Are they asking for model collapse?
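
To make the /w/ versus /wiki/ split above concrete, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt contents, the example paths, and the "ExampleBot" user agent are all assumptions made up for illustration, not a copy of any real wiki's file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt in the MediaWiki style described above:
# script URLs under /w/ are off limits, article URLs under /wiki/ are not.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def can_fetch(path, agent="ExampleBot"):
    """Return True if a well-behaved bot may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


# An article page versus the edit history of the same page.
wiki_ok = can_fetch("/wiki/Main_Page")
history_ok = can_fetch("/w/index.php?title=Main_Page&action=history")

print("article page allowed:", wiki_ok)
print("history page allowed:", history_ok)
```

A polite crawler runs a check like this before every request. A bot that skips it is exactly the kind that ends up wandering into the garbage pages.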


  • I literally just looked at Reddit for the first time in ages.

    What the fuck.

    Here’s the thing: Reddit’s UI design has always been shitty. Old Reddit was fucking garbage, so admins cheerfully asked RES folks to fix their shit. (Instead of, you know, hiring them.) New Reddit? Always been shit, and nobody’s going to fix it.

    This Newer New Reddit? I… I don’t think they even know at this point. What. What’s going on.

    If they ask for critique from the community, some AI bot will AI-pat the admin’s arse and AI-splain to the remaining AI-users that things will be just fine. (Now, “things actually getting better” has literally never happened as far as Reddit or its user interface is concerned, as you should well know if you’ve ever been a human Reddit user.)