
Good Bot List

Reference list of legitimate bots recognized by WebDecoy.


WebDecoy maintains a list of 60+ known legitimate bots. When the list is enabled, requests from these bots are allowed through with minimal detection impact. AI crawlers are the exception: they are blocked by default.


Primary search engine crawlers that index your content.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| Googlebot | Googlebot | Google Search indexing |
| Googlebot Images | Googlebot-Image | Google Images indexing |
| Googlebot Video | Googlebot-Video | Google Video indexing |
| Googlebot News | Googlebot-News | Google News indexing |
| Google AdsBot | AdsBot-Google | Google Ads landing page check |
| Bingbot | bingbot | Bing Search indexing |
| Yahoo! Slurp | Slurp | Yahoo Search indexing |
| DuckDuckBot | DuckDuckBot | DuckDuckGo indexing |
| Baiduspider | Baiduspider | Baidu (China) indexing |
| YandexBot | YandexBot | Yandex (Russia) indexing |
| Sogou | Sogou | Sogou (China) indexing |
| Exabot | Exabot | Exalead indexing |
| Qwantify | Qwantify | Qwant indexing |

Google and Bing bots can be verified by reverse DNS:

  • Googlebot: *.googlebot.com or *.google.com
  • Bingbot: *.search.msn.com
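The reverse DNS check can be sketched in Python as follows. This is a minimal illustration, not WebDecoy's implementation: the names `VERIFIED_SUFFIXES`, `hostname_matches`, and `verify_bot_ip` are hypothetical, and the forward-confirmation step (resolving the hostname back to the original IP) is an addition based on common practice rather than something this page specifies. The default forward lookup uses `socket.gethostbyname_ex`, which handles IPv4 only.

```python
import socket

# Hostname suffixes taken from the list above.
VERIFIED_SUFFIXES = {
    "Googlebot": (".googlebot.com", ".google.com"),
    "Bingbot": (".search.msn.com",),
}


def hostname_matches(bot_name, hostname):
    """Return True if hostname ends with one of the bot's expected suffixes."""
    host = hostname.rstrip(".").lower()
    # An unknown bot name yields an empty tuple, for which endswith() is False.
    return host.endswith(VERIFIED_SUFFIXES.get(bot_name, ()))


def verify_bot_ip(bot_name, ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the suffix, then confirm the name maps back to ip.

    The lookup functions are injectable so the logic can be exercised offline.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname_matches(bot_name, hostname):
            return False
        # Assumption: forward confirmation guards against spoofed PTR records.
        return ip in forward_lookup(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False
```

Calling `verify_bot_ip("Googlebot", client_ip)` with no extra arguments performs live DNS lookups, so results are worth caching in production.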

Crawlers that generate link previews and share content.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| Facebook | facebookexternalhit | Link preview generation |
| Facebook Catalog | Facebot | Product catalog scraping |
| Twitter | Twitterbot | Tweet card generation |
| LinkedIn | LinkedInBot | Post preview generation |
| Pinterest | Pinterest | Pin image fetching |
| WhatsApp | WhatsApp | Link preview generation |
| Slack | Slackbot | Link unfurling |
| Telegram | TelegramBot | Link preview |
| Discord | Discordbot | Embed generation |
| Skype | SkypeUriPreview | Link preview |

Crawlers for AI training and LLM services.

| Bot | User Agent Pattern | Purpose | Default |
|---|---|---|---|
| GPTBot | GPTBot | OpenAI training | Blocked |
| ChatGPT-User | ChatGPT-User | ChatGPT browsing | Blocked |
| ClaudeBot | ClaudeBot, Claude-Web | Anthropic training | Blocked |
| Google-Extended | Google-Extended | Google AI training | Blocked |
| PerplexityBot | PerplexityBot | Perplexity AI | Blocked |
| CCBot | CCBot | Common Crawl | Blocked |
| Cohere | cohere-ai | Cohere training | Blocked |
| Applebot Extended | Applebot-Extended | Apple AI features | Blocked |

Note: AI crawlers are blocked by default. Enable them in Settings if you want your content included in AI training.


Services that check your site availability.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| Pingdom | Pingdom | Uptime monitoring |
| UptimeRobot | UptimeRobot | Uptime monitoring |
| StatusCake | StatusCake | Uptime monitoring |
| Site24x7 | Site24x7 | Performance monitoring |
| Datadog | Datadog | APM & monitoring |
| New Relic | NewRelicPinger | Performance monitoring |
| GTmetrix | GTmetrix | Performance testing |
| WebPageTest | WebPageTest | Performance testing |
| Catchpoint | Catchpoint | Synthetic monitoring |

Tools used for SEO analysis and marketing.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| Semrush | SemrushBot | SEO analysis |
| Ahrefs | AhrefsBot | Backlink analysis |
| Moz | MozBot, rogerbot | SEO analysis |
| Majestic | MJ12bot | Backlink analysis |
| Screaming Frog | Screaming Frog | Site crawling |
| Sistrix | SISTRIX | SEO analysis |
| Serpstat | serpstatbot | SEO analysis |
| SpyFu | SpyFu | Competitor analysis |

Services that fetch RSS/Atom feeds and aggregate content.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| Feedly | Feedly | RSS aggregation |
| Feedbin | Feedbin | RSS reader |
| Inoreader | Inoreader | RSS reader |
| NewsBlur | NewsBlur | RSS reader |
| Apple News | AppleNewsBot | Apple News aggregation |
| Flipboard | Flipboard | Content aggregation |

Tools used for website testing and development.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| W3C Validator | W3C_Validator | HTML validation |
| W3C Link Checker | W3C-checklink | Link validation |
| Google PageSpeed | Google Page Speed | Performance testing |
| Lighthouse | Chrome-Lighthouse | Web auditing |
| Archive.org | ia_archiver | Web archiving |

Cloud provider and CDN health checks.

| Bot | User Agent Pattern | Purpose |
|---|---|---|
| AWS ELB | ELB-HealthChecker | Load balancer health |
| Cloudflare | CloudFlare-AlwaysOnline | Always Online feature |
| Fastly | Fastly | CDN health check |
| Akamai | AkamaiGHost | CDN monitoring |

In Settings → Good Bots (or plugin settings):

| Option | Description |
|---|---|
| Allow Search Engines | Googlebot, Bingbot, etc. |
| Allow Social Media | Facebook, Twitter, LinkedIn, etc. |
| Allow Monitoring | Pingdom, UptimeRobot, etc. |
| Block AI Crawlers | GPTBot, ClaudeBot, etc. |

Add your own bots:

MyInternalBot
PartnerCrawler/1.0
CustomMonitor
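To illustrate how user agent patterns like these might be applied, here is a small Python sketch. It assumes case-insensitive substring matching, which this page does not confirm. The function `match_bot`, the `"custom"` category label, and the subset of built-in patterns are hypothetical choices made for the example. The other category names follow the values used in the detection result.

```python
# A small subset of the built-in patterns listed in the tables above.
BUILT_IN_PATTERNS = {
    "Googlebot": "search_engine",
    "bingbot": "search_engine",
    "facebookexternalhit": "social_media",
    "GPTBot": "ai_crawler",
    "UptimeRobot": "monitoring",
    "AhrefsBot": "seo_tool",
    "Feedly": "feed_reader",
}

# The custom patterns from the list above.
CUSTOM_PATTERNS = ["MyInternalBot", "PartnerCrawler/1.0", "CustomMonitor"]


def match_bot(user_agent, custom_patterns=CUSTOM_PATTERNS):
    """Return (pattern, category) for the first pattern found in user_agent, else None.

    Assumption: matching is a case-insensitive substring test, and custom
    patterns are checked before the built-in ones.
    """
    ua = user_agent.lower()
    for pattern in custom_patterns:
        if pattern.lower() in ua:
            return pattern, "custom"  # hypothetical label for user-added bots
    for pattern, category in BUILT_IN_PATTERNS.items():
        if pattern.lower() in ua:
            return pattern, category
    return None
```

A user agent match alone is easy to spoof, which is why identity verification matters for critical applications.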

For critical applications, verify bot identity:

  1. Reverse DNS (recommended for Google/Bing)

    host <ip_address>
    # Should resolve to *.googlebot.com or *.google.com (Googlebot), or *.search.msn.com (Bingbot)
  2. IP Range Verification

    • Google publishes Googlebot IP ranges
    • Bing publishes bingbot IP ranges
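IP range verification can be sketched with Python's standard `ipaddress` module. The ranges below are placeholders drawn from the reserved documentation blocks, not real Googlebot or bingbot ranges. In practice they would be replaced with the ranges each operator publishes. The names `EXAMPLE_RANGES` and `ip_in_published_ranges` are hypothetical.

```python
import ipaddress

# Placeholder CIDR blocks reserved for documentation (RFC 5737, RFC 3849).
# Replace them with the ranges the bot operator actually publishes.
EXAMPLE_RANGES = {
    "Googlebot": ["192.0.2.0/24", "2001:db8::/32"],
    "Bingbot": ["198.51.100.0/24"],
}


def ip_in_published_ranges(bot_name, ip, ranges=EXAMPLE_RANGES):
    """Return True if ip falls inside any CIDR block listed for bot_name."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IPv4 or IPv6 address.
        return False
    for cidr in ranges.get(bot_name, []):
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False
```

Published ranges change over time, so a real deployment would refresh them periodically rather than hard-coding them.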

When a good bot is detected, the detection result includes the following fields:

{
  "is_good_bot": true,
  "bot_name": "Googlebot",
  "bot_category": "search_engine",
  "bot_verified": true,
  "threat_score": 5
}

| Category | Examples |
|---|---|
| search_engine | Googlebot, Bingbot |
| social_media | Facebook, Twitter |
| ai_crawler | GPTBot, ClaudeBot |
| monitoring | Pingdom, UptimeRobot |
| seo_tool | Semrush, Ahrefs |
| feed_reader | Feedly, Feedbin |
| unknown | Unrecognized bot |
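An application consuming this result might act on it along the following lines. This is a sketch under assumptions: the field names come from the example above, but the `should_allow` function and its policy (block `ai_crawler`, require `bot_verified` for `search_engine`) are hypothetical and not WebDecoy's own logic.

```python
import json

# The sample detection result shown above.
SAMPLE = """
{
  "is_good_bot": true,
  "bot_name": "Googlebot",
  "bot_category": "search_engine",
  "bot_verified": true,
  "threat_score": 5
}
"""


def should_allow(result,
                 blocked_categories=("ai_crawler",),
                 require_verified=("search_engine",)):
    """Decide whether to allow a request based on a detection result dict.

    Hypothetical policy: reject anything not flagged as a good bot, reject
    blocked categories, and require verification for selected categories.
    """
    if not result.get("is_good_bot", False):
        return False
    category = result.get("bot_category", "unknown")
    if category in blocked_categories:
        return False
    if category in require_verified and not result.get("bot_verified", False):
        return False
    return True


result = json.loads(SAMPLE)
print(should_allow(result))
```

Requiring verification only for search engines mirrors the note that reverse DNS is recommended for Google and Bing. Other categories are matched on user agent alone in this sketch.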