- Joined
- Nov 21, 2020
These 'people' are literally retarded.The software is only doing naïve user agent checks, and can be trivially bypassed by changing the user-agent header, which all browsers transmit to identify themselves to the web server. Very sloppy job on the dev's part.
Out of the box, Anubis is pretty heavy-handed. It will aggressively challenge everything that might be a browser (usually indicated by having `Mozilla` in its user agent). However, some bots are smart enough to get past the challenge. Some things that look like bots may actually be fine (IE: RSS readers). Some resources need to be visible no matter what. Some resources and remotes are fine to begin with.
Code:
{
"bots": [
{
"import": "(data)/bots/_deny-pathological.yaml"
},
{
"import": "(data)/bots/ai-robots-txt.yaml"
},
{
"import": "(data)/crawlers/_allow-good.yaml"
},
{
"import": "(data)/bots/aggressive-brazilian-scrapers.yaml"
},
{
"import": "(data)/common/keep-internet-working.yaml"
},
{
"name": "generic-browser",
"user_agent_regex": "Mozilla|Opera",
"action": "CHALLENGE"
}
],
"dnsbl": false,
"status_codes": {
"CHALLENGE": 200,
"DENY": 200
}
}
And yes, almost all of the conditionals in the yaml files are based on user_agent.
Last edited:
