AI2Bot
Allen Institute for AI
AI Training
Allen Institute for AI bot for academic research and model training
User Agent:
Mozilla/5.0 (compatible; AI2Bot/1.0; +https://allenai.org/)
Tags:
#ai2
#academic
#research
#training
Amazonbot
Amazon
AI Assistant
Amazon bot to improve Alexa and AWS AI services
User Agent:
Amazonbot/0.1 (+https://developer.amazon.com/support/amazonbot)
Tags:
#amazon
#alexa
#aws
#assistant
Andibot
Andi
AI Search
Andi AI search engine bot, competitor to Perplexity
User Agent:
Mozilla/5.0 (compatible; Andibot/1.0)
Tags:
#andi
#search
#answer-engine
#competitor
anthropic-ai
Anthropic
AI Training
Training bot for Anthropic's Claude models, collects data to improve models
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-Web/1.0; +https://www.anthropic.com)
Bot name:
anthropic-ai
Tags:
#claude
#anthropic
#training
#bulk-data
Anthropic-Claude
Anthropic
AI Assistant
Updated Anthropic Claude bot for real-time web access and citations
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Anthropic-Claude/1.0; +https://www.anthropic.com)
Bot name:
Anthropic-Claude
Tags:
#anthropic
#claude
#realtime
#citations
Claude-Web
Anthropic
AI Search
Claude's web bot for exploration and indexing of web content
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-Web/1.0; +https://www.anthropic.com)
Bot name:
claude-web
Tags:
#claude
#anthropic
#web
#crawling
ClaudeBot
Anthropic
AI Assistant
Bot used by Claude to fetch citations and references in real-time during conversations
User Agent:
ClaudeBot/1.0; +https://www.anthropic.com
Tags:
#claude
#anthropic
#citations
#assistant
Applebot-Extended
Apple
AI Training
Bot for training Apple AI models (Apple Intelligence)
User Agent:
Mozilla/5.0 (compatible; Applebot-Extended/1.0)
Bot name:
Applebot-Extended
Tags:
#apple
#apple-intelligence
#training
#siri
bigsur.ai
BigSur AI
AI Training
Emerging AI bot; details on its usage are still limited
User Agent:
Mozilla/5.0 (compatible; bigsur.ai/1.0)
Tags:
#bigsur
#emerging
#new
#training
Brightbot
Bright Data
AI Training
Bright Data analysis bot to collect data for AI
User Agent:
Mozilla/5.0 (compatible; Brightbot/1.0)
Tags:
#bright-data
#analysis
#data-collection
#training
Bytespider
ByteDance
AI Training
ByteDance (TikTok) bot for training their Chinese AI models
Bot name:
Bytespider
Tags:
#bytedance
#tiktok
#chinese
#training
TerraCotta
Ceramic
AI Training
Ceramic AI crawler for web content indexing and model training, first seen June 2025
User Agent:
TerraCotta https://github.com/CeramicTeam/CeramicTerracotta
Bot name:
TerraCotta
Tags:
#ceramic
#training
#emerging
#crawler
Character-AI
Character.AI
AI Assistant
Character.AI bot for training conversational AI characters
User Agent:
Mozilla/5.0 (compatible; Character-AI/1.0; +https://character.ai/)
Bot name:
Character-AI
Tags:
#character-ai
#conversational
#characters
#training
Devin
Cognition AI
AI Assistant
Devin AI code assistant bot to analyze and understand online code
User Agent:
Mozilla/5.0 (compatible; Devin/1.0)
Tags:
#devin
#code-assistant
#programming
#cognition-ai
Cohere-AI
Cohere
AI Training
Cohere bot for training its language models and NLP systems
User Agent:
Mozilla/5.0 (compatible; Cohere-AI/1.0; +https://cohere.com/)
Tags:
#cohere
#nlp
#training
#enterprise
Cohere-Command
Cohere
AI Assistant
Cohere Command model bot for real-time information retrieval
User Agent:
Mozilla/5.0 (compatible; Cohere-Command/1.0; +https://cohere.com/)
Bot name:
Cohere-Command
Tags:
#cohere
#command
#assistant
#enterprise
CCBot
Common Crawl
AI Training
Common Crawl bot, widely used for training open source AI models
User Agent:
CCBot/2.0 (https://commoncrawl.org/faq/)
Tags:
#common-crawl
#open-data
#training
#dataset
Crawlspace
Crawlspace
AI Training
Crawling service specialized for AI and data extraction
User Agent:
Mozilla/5.0 (compatible; Crawlspace/1.0)
Bot name:
Crawlspace
Tags:
#crawling-service
#data-extraction
#ai
#training
DeepseekBot
DeepSeek
AI Training
DeepSeek AI bot for training their advanced reasoning models and data collection
User Agent:
Mozilla/5.0 (compatible; DeepseekBot/1.0; +https://www.deepseek.com/bot)
Bot name:
DeepseekBot
Tags:
#deepseek
#reasoning
#training
#chinese
Diffbot
Diffbot
AI Training
Diffbot bot for structured data extraction and creating knowledge graphs for AI
User Agent:
Mozilla/5.0 (compatible; Diffbot/0.1; +http://www.diffbot.com/our-apis/crawler/)
Tags:
#diffbot
#knowledge-graph
#extraction
#structured-data
DuckAssistBot
DuckDuckGo
AI Assistant
DuckDuckGo bot for their privacy-respecting AI assistant
User Agent:
Mozilla/5.0 (compatible; DuckAssistBot/1.0; +https://duckduckgo.com/duckassist)
Bot name:
DuckAssistBot
Tags:
#duckduckgo
#privacy
#assistant
#search
FirecrawlAgent
Firecrawl
AI Training
New scraping service specialized for AI and LLMs
User Agent:
Mozilla/5.0 (compatible; FirecrawlAgent/1.0)
Bot name:
FirecrawlAgent
Tags:
#firecrawl
#scraping
#llm
#training
Bard-AI
Google
AI Assistant
Google Bard AI assistant bot for web content retrieval
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Bard-AI/1.0; +https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers)
Tags:
#google
#bard
#assistant
#search
Gemini-AI
Google
AI Assistant
Google Gemini AI model bot for training and web content analysis
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Gemini-AI/1.0; +https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers)
Tags:
#google
#gemini
#training
#analysis
Gemini-Deep-Research
Google
AI Assistant
Bot for Gemini Deep Research in-depth searches
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Gemini-Deep-Research/1.0)
Bot name:
Gemini-Deep-Research
Tags:
#google
#gemini
#deep-research
#assistant
Google-CloudVertexBot
Google
AI Training
Google crawler for Vertex AI Agents; crawls content at the request of site owners building AI agents
User Agent:
Mozilla/5.0 (compatible; Google-CloudVertexBot/1.0; +https://cloud.google.com/vertex-ai)
Bot name:
Google-CloudVertexBot
Tags:
#google
#vertex-ai
#training
#agents
Google-Extended
Google
AI Training
robots.txt product token that controls whether content may be used for Gemini (formerly Bard) and Vertex AI
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Google-Extended/1.0; +https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers)
Bot name:
Google-Extended
Tags:
#google
#gemini
#bard
#vertex-ai
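Because Google-Extended is described above as a control token, the natural place to apply it is robots.txt. The following is a minimal Python sketch, not an official tool: `build_robots_txt` is a hypothetical helper, the set of tokens is only illustrative, and the standard-library `urllib.robotparser` is used purely to sanity-check the generated rules.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_tokens, disallow_path="/"):
    """Build one robots.txt group per bot token (hypothetical helper)."""
    groups = []
    for token in blocked_tokens:
        # Each group names a single token and disallows the chosen path.
        groups.append(f"User-agent: {token}\nDisallow: {disallow_path}\n")
    # Every group ends with a newline, so joining leaves a blank line between groups.
    return "\n".join(groups)


# Illustrative selection of tokens taken from this list.
robots_txt = build_robots_txt(["Google-Extended", "GPTBot", "CCBot"])

# Parse the generated text locally to confirm the rules behave as intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(robots_txt)
print(parser.can_fetch("Google-Extended", "https://example.com/page"))
```

Rules like these only affect crawlers that choose to honor robots.txt; clients that send a generic browser user agent are not covered by them.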
Google-NotebookLM
Google
AI Assistant
Google NotebookLM bot that fetches individual URLs provided by users as sources for their research projects
User Agent:
Mozilla/5.0 (compatible; Google-NotebookLM/1.0; +https://notebooklm.google.com/)
Bot name:
Google-NotebookLM
Tags:
#google
#notebooklm
#user-triggered
#research
#assistant
GoogleAgent-Mariner
Google
AI Assistant
Google Project Mariner agentic browser for AI Ultra subscribers ($249.99/month). Operates on cloud-based virtual machines as a remote browser environment rather than as a traditional crawler.
User Agent:
GoogleAgent-Mariner
Bot name:
GoogleAgent-Mariner
Tags:
#google
#mariner
#agentic-browser
#premium
#cloud-vm
Groq-Bot
Groq
AI Training
Groq inference engine bot for high-speed AI model data collection
User Agent:
Mozilla/5.0 (compatible; Groq-Bot/1.0; +https://groq.com/)
Tags:
#groq
#inference
#high-speed
#training
HuggingFace-Bot
Hugging Face
AI Training
Hugging Face bot for training open-source AI models and datasets
User Agent:
Mozilla/5.0 (compatible; HuggingFace-Bot/1.0; +https://huggingface.co/)
Bot name:
HuggingFace-Bot
Tags:
#huggingface
#open-source
#training
#datasets
IbouBot
Ibou.io
AI Search
Ethical search engine crawler that drives traffic to original sources. Uses GenAI for query processing but does NOT train AI models. Respects creators' and publishers' rights.
User Agent:
Mozilla/5.0 (compatible; IbouBot/1.0; [email protected]; +https://ibou.io/iboubot.html)
IP Ranges:
217.113.196.0/24
Tags:
#ibou
#french
#ethical-search
#traffic-driver
#creator-friendly
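The IbouBot entry above is the only one that lists an IP range, which allows a stricter check than user-agent matching alone. The following is a minimal Python sketch under stated assumptions: `is_probably_iboubot` is a hypothetical helper, and the range is copied from this listing, so it should be verified against the operator's current documentation before being relied on.

```python
import ipaddress

# Range copied from the IbouBot entry in this list (assumed current).
IBOUBOT_NETWORK = ipaddress.ip_network("217.113.196.0/24")


def is_probably_iboubot(user_agent: str, remote_ip: str) -> bool:
    """Return True when the UA names IbouBot and the IP is in the listed range."""
    if "iboubot" not in user_agent.lower():
        return False
    try:
        return ipaddress.ip_address(remote_ip) in IBOUBOT_NETWORK
    except ValueError:
        # Malformed address strings are treated as non-matching.
        return False


print(is_probably_iboubot("Mozilla/5.0 (compatible; IbouBot/1.0)", "217.113.196.10"))
```

Requiring both conditions means a request that merely copies the user-agent string from another network is not accepted as IbouBot.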
FacebookBot
Meta
AI Training
Traditional Facebook bot extended for AI and machine learning
User Agent:
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
Bot name:
FacebookBot
Tags:
#meta
#facebook
#social
#ai
Meta-ExternalAgent
Meta
AI Training
Meta bot for training their AI models (Llama, etc.)
User Agent:
Meta-ExternalAgent/1.0 (+https://developers.facebook.com/docs/sharing/bot)
Bot name:
Meta-ExternalAgent
Tags:
#meta
#facebook
#llama
#training
meta-webindexer
Meta
AI Search
Meta web indexer bot for building independent search capabilities for the Meta AI chatbot
User Agent:
meta-webindexer/1.1
Bot name:
meta-webindexer
Tags:
#meta
#search
#indexing
#ai-search
BingBot
Microsoft
AI Search
Microsoft Bing crawler used for Bing Search and Copilot AI features
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/W.X.Y.Z Safari/537.36 Edg/W.X.Y.Z
Tags:
#microsoft
#bing
#copilot
#search
MistralAI-User
Mistral AI
AI Assistant
Mistral AI bot to retrieve citations in Le Chat
User Agent:
MistralAI-User/1.0
Bot name:
MistralAI-User
Tags:
#mistral
#le-chat
#french
#citations
ChatGPT Atlas
OpenAI
AI Assistant
⚠️ STEALTH
OpenAI's agentic browser with integrated AI. Uses standard Chrome user-agent, making it completely indistinguishable from regular browser traffic. Cannot be blocked via robots.txt. Features "agent mode" for autonomous task completion.
User Agent:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36
Bot name:
ChatGPT-Atlas
Tags:
#openai
#chatgpt
#atlas
#agentic-browser
#stealth
#undetectable
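The ChatGPT Atlas entry above notes that a standard Chrome user agent cannot be told apart from ordinary browser traffic. The following is a minimal Python sketch of why token matching fails in that case; `KNOWN_BOT_TOKENS` and `match_bot_token` are hypothetical names, the token list is a small illustrative subset of this directory, and both sample strings are copied from entries in this list.

```python
# Illustrative subset of the bot tokens that appear in this directory.
KNOWN_BOT_TOKENS = (
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "PerplexityBot",
    "ClaudeBot",
    "CCBot",
)


def match_bot_token(user_agent: str):
    """Return the first known token found in the user agent, or None."""
    ua = user_agent.lower()
    for token in KNOWN_BOT_TOKENS:
        if token.lower() in ua:
            return token
    return None


# A self-identifying crawler (GPTBot entry) and the generic Chrome string
# listed for ChatGPT Atlas.
declared = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
            "GPTBot/1.0; +https://openai.com/gptbot)")
stealth = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
           "(KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36")

print(match_bot_token(declared))
print(match_bot_token(stealth))
```

The second lookup finds nothing because the string carries no bot token at all, so any detection of such traffic would have to rely on signals other than the user agent.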
ChatGPT-Browser
OpenAI
AI Assistant
ChatGPT web browsing bot for real-time web access during conversations
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-Browser/1.0; +https://openai.com/bot)
Bot name:
ChatGPT-Browser
Tags:
#openai
#chatgpt
#browsing
#realtime
ChatGPT-User
OpenAI
AI Assistant
Bot used for real-time searches when a user asks ChatGPT a question
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-User/1.0; +https://openai.com/bot)
Bot name:
ChatGPT-User
Tags:
#chatgpt
#realtime
#search
#user-triggered
ChatGPT-User v2.0
OpenAI
AI Assistant
Updated version of ChatGPT-User bot for real-time searches (since February 2025)
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-User/2.0; +https://openai.com/bot)
Bot name:
ChatGPT-User-v2
Tags:
#chatgpt
#realtime
#search
#user-triggered
#v2
GPTBot
OpenAI
AI Training
Bot used by OpenAI to collect training data for ChatGPT and future GPT models
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)
Tags:
#chatgpt
#training
#openai
#gpt
OAI-SearchBot
OpenAI
AI Search
Dedicated indexing bot for ChatGPT Search, a competitor to Google Search
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot)
Bot name:
OAI-SearchBot
Tags:
#openai
#search
#indexation
#chatgpt-search
Perplexity Stealth
Perplexity AI
AI Assistant
⚠️ STEALTH
Perplexity uses headless browsers with Chrome user agents to bypass blocking
User Agent:
Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36
Bot name:
Perplexity-Stealth
Tags:
#perplexity
#stealth
#headless
#chrome
Perplexity-User
Perplexity AI
AI Assistant
Bot triggered when a user clicks on a link in a Perplexity response
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/bot)
Bot name:
Perplexity-User
Tags:
#perplexity
#user-triggered
#realtime
PerplexityBot
Perplexity AI
AI Search
Perplexity indexing bot to feed their AI search engine
User Agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/bot)
Bot name:
PerplexityBot
Tags:
#perplexity
#search
#answer-engine
#indexation
Replicate-Bot
Replicate
AI Training
Replicate platform bot for AI model training and data collection
User Agent:
Mozilla/5.0 (compatible; Replicate-Bot/1.0; +https://replicate.com/)
Bot name:
Replicate-Bot
Tags:
#replicate
#platform
#training
#models
RunPod-Bot
RunPod
AI Training
RunPod cloud platform bot for GPU-based AI training data collection
User Agent:
Mozilla/5.0 (compatible; RunPod-Bot/1.0; +https://runpod.io/)
Bot name:
RunPod-Bot
Tags:
#runpod
#gpu
#cloud
#training
ImagesiftBot
The Hive
AI Training
Bot for reverse image search and training image generation models
User Agent:
Mozilla/5.0 (compatible; ImagesiftBot/1.0)
Bot name:
ImagesiftBot
Tags:
#image-search
#reverse-search
#image-generation
#training
TimpiBot
Timpi
AI Training
Timpi bot for training their Large Language Models
User Agent:
Mozilla/5.0 (compatible; TimpiBot/1.0)
Tags:
#timpi
#llm
#training
#search
Together-Bot
Together AI
AI Training
Together AI platform bot for decentralized AI model training
User Agent:
Mozilla/5.0 (compatible; Together-Bot/1.0; +https://together.ai/)
Bot name:
Together-Bot
Tags:
#together-ai
#decentralized
#training
#platform
Kangaroo Bot
Unknown (China)
AI Training
Chinese AI bot, origin and exact usage unknown
User Agent:
Mozilla/5.0 (compatible; Kangaroo Bot/1.0)
Bot name:
Kangaroo Bot
Tags:
#chinese
#unknown
#training
#suspicious
PanguBot
Unknown (China)
AI Training
Another Chinese AI bot, possibly linked to Pangu models
User Agent:
Mozilla/5.0 (compatible; PanguBot/1.0)
Tags:
#chinese
#pangu
#training
#unknown
Cotoyogi
Unknown (Japan)
AI Training
Japanese AI bot, specific usage unknown
User Agent:
Mozilla/5.0 (compatible; Cotoyogi/1.0)
Tags:
#japanese
#unknown
#training
#asia
AkiraBot
Unknown (Malicious)
AI Training
⚠️ STEALTH
Malicious spam bot using OpenAI LLMs to generate custom spam messages for contact forms. Uses generic Chrome user-agent strings and residential proxies. Primarily targets customer support chats via Selenium automation.
User Agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Tags:
#spam
#malicious
#llm-powered
#selenium
#contact-form-spam
Webzio-Extended
Webz.io
AI Training
Webz.io bot that collects data to sell to AI companies for training
User Agent:
Mozilla/5.0 (compatible; Webzio-Extended/1.0)
Bot name:
Webzio-Extended
Tags:
#webzio
#data-broker
#training
#commercial
xAI-Bot
xAI
AI Training
Elon Musk's xAI bot for training Grok and other AI models
User Agent:
Mozilla/5.0 (compatible; xAI-Bot/1.0; +https://x.ai/)
Tags:
#xai
#grok
#elon-musk
#training
YouBot
You.com
AI Search
You.com AI search engine bot for indexing and answering questions
User Agent:
Mozilla/5.0 (compatible; YouBot/1.0; +https://you.com/bot)
Tags:
#you-com
#search
#answer-engine
#ai