aeo.press monitors a wide variety of web bots, including search engine crawlers, AI agents, social media scrapers, and other automated tools. Tracking these bots helps users understand and control automated access to their websites.
Overview of Bot Detection in aeo.press
- Detection Methods:
  - Uses both the DeviceDetector library and custom pattern matching for accurate identification.
  - Bot detection is applied both server-side and in client-side analytics scripts.
- Transparency:
  - The complete list of monitored bots is maintained in the open-source `robots.json` file.
  - This list is updated regularly and is open to community contributions.
Categories of Tracked Bots
Bots are grouped by their primary purpose:
| Category | Example Bots | Description |
|---|---|---|
| AI | OpenAI GPTBot, Claude | Large language model and AI crawlers |
| Search | Googlebot, Bingbot | Major search engine crawlers |
| Social | Facebook, Twitterbot | Social media and preview scrapers |
| Crawler | AhrefsBot, SemrushBot, MJ12bot, Unknown Bot | General web crawlers and scrapers |
List of Tracked Bots
The following bots are explicitly tracked by aeo.press:
- AI & Large Language Model Bots
  - OpenAI GPTBot (`gptbot`)
  - OpenAI SearchBot (`oai-searchbot`)
  - Claude / Anthropic (`claude`, `anthropic`)
  - Perplexity AI (`perplexitybot`)
- Major Search Engines
  - Googlebot (`googlebot`)
  - Bingbot (`bingbot`)
  - DuckDuckBot (`duckduckbot`)
  - Baiduspider (`baiduspider`)
  - YandexBot (`yandex`)
- Social & Scraper Bots
  - Facebook (`facebookexternalhit`, `meta-externalagent`)
  - Twitterbot (`twitterbot`)
  - LinkedInBot (`linkedinbot`)
- Other Crawlers
  - AhrefsBot (`ahrefsbot`)
  - SemrushBot (`semrushbot`)
  - MJ12bot (`mj12bot`)
  - Unknown Bot (`bot`, `spider`, `crawler`)
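To illustrate how the patterns listed above could be applied, here is a minimal sketch of the custom pattern-matching step only. It is not the aeo.press implementation: the case-insensitive substring matching, the ordering of the table, and the names `BOT_PATTERNS` and `classify_user_agent` are all assumptions made for this example, and the DeviceDetector step is left out.

```python
from typing import Optional, Tuple

# (bot name, category, patterns) taken from the list and category table above.
# ASSUMPTION: patterns are matched as case-insensitive substrings, in this order.
BOT_PATTERNS = [
    ("OpenAI GPTBot", "AI", ("gptbot",)),
    ("OpenAI SearchBot", "AI", ("oai-searchbot",)),
    ("Claude / Anthropic", "AI", ("claude", "anthropic")),
    ("Perplexity AI", "AI", ("perplexitybot",)),
    ("Googlebot", "Search", ("googlebot",)),
    ("Bingbot", "Search", ("bingbot",)),
    ("DuckDuckBot", "Search", ("duckduckbot",)),
    ("Baiduspider", "Search", ("baiduspider",)),
    ("YandexBot", "Search", ("yandex",)),
    ("Facebook", "Social", ("facebookexternalhit", "meta-externalagent")),
    ("Twitterbot", "Social", ("twitterbot",)),
    ("LinkedInBot", "Social", ("linkedinbot",)),
    ("AhrefsBot", "Crawler", ("ahrefsbot",)),
    ("SemrushBot", "Crawler", ("semrushbot",)),
    ("MJ12bot", "Crawler", ("mj12bot",)),
    # The generic fallback goes last so that specific bots such as
    # "googlebot" or "baiduspider" are not reported as "Unknown Bot".
    ("Unknown Bot", "Crawler", ("bot", "spider", "crawler")),
]


def classify_user_agent(user_agent: str) -> Optional[Tuple[str, str]]:
    """Return (bot_name, category) for the first matching entry, or None."""
    ua = user_agent.lower()
    for name, category, patterns in BOT_PATTERNS:
        if any(pattern in ua for pattern in patterns):
            return name, category
    return None
```

With this ordering, `classify_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")` returns `("Googlebot", "Search")`, while a typical desktop Chrome user agent returns `None`. Plain substring matching on short tokens such as `bot` can misclassify unusual user agents, which is one reason a dedicated library might be combined with custom patterns.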
How Bot Data Is Used in Analytics
For sites with analytics enabled, aeo.press tracks bot activity using a browser script. This script checks the browser's user agent string for known bot patterns and logs the following data:
- Whether the visitor is a bot (`is_bot`)
- Detected bot name (`bot_name`)
This information is sent alongside regular analytics data, allowing for more accurate reporting and filtering of non-human traffic.
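As a rough illustration of the filtering this enables, the sketch below separates human page views from bot hits using the `is_bot` and `bot_name` fields. The event records, the `path` field, and the `split_traffic` helper are hypothetical and only stand in for whatever shape the real analytics data takes.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def split_traffic(events: List[Event]) -> Tuple[List[Event], Counter]:
    """Return (human events, per-bot hit counts) from analytics events."""
    # Events without an is_bot flag are treated as human traffic (assumption).
    humans = [e for e in events if not e.get("is_bot", False)]
    # Bot events with no name are counted under "Unknown Bot" (assumption).
    bot_counts = Counter(
        e.get("bot_name") or "Unknown Bot"
        for e in events
        if e.get("is_bot", False)
    )
    return humans, bot_counts


# Hypothetical sample events; "path" is not a documented field.
sample_events = [
    {"path": "/", "is_bot": False, "bot_name": None},
    {"path": "/pricing", "is_bot": True, "bot_name": "Googlebot"},
    {"path": "/", "is_bot": True, "bot_name": "OpenAI GPTBot"},
    {"path": "/blog", "is_bot": True, "bot_name": "Googlebot"},
]

human_events, bot_hits = split_traffic(sample_events)
```

Here `human_events` holds the single non-bot page view, and `bot_hits` counts two hits for Googlebot and one for OpenAI GPTBot, so reports can be built from human traffic alone or broken down by bot.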