n24q02m/wet-mcp

Free

Web search (embedded SearXNG), content extraction, and library docs indexing with hybrid search (FTS5 + semantic). Built-in Qwen3 embedding, no API keys require

GitHub

About

Web search (embedded SearXNG), content extraction, and library docs indexing with hybrid search (FTS5 + semantic). Built-in Qwen3 embedding, no API keys required.

README

mcp-name: io.github.n24q02m/wet-mcp

Web search, content extraction, and library docs for AI agents -- 5-strategy scraping, runs without API keys.

Phase	Status	Scope
Phase 1	Shipped	web-core ScrapingAgent migration, smart chunks output, search polish, media slim
Phase 2	Shipped	Context7-level docs search: library index (Tier 1 + Tier 2), version-aware queries with token cap, project lock (Cabinets)
Phase 3	Shipped	`extract.agent` multi-step research with cited synthesis, `extract.interact` click/fill/submit via patchright (optional session persistence), `docs_004_chunk_summaries` migration, `media.analyze` removed (v2.0.0)

Current release: v3.x. media(action="analyze") was removed in the v2.0.0 BREAKING release. Use imagine-mcp's understand action for vision/audio/video analysis. See docs/migration.md for the upgrade recipe.

CI codecov PyPI Docker License: MIT

Python SearXNG MCP semantic-release Renovate

Sister projects from n24q02m (click to expand)

Project	Tagline	Tag
better-code-review-graph	Knowledge graph for token-efficient code reviews -- semantic search and call-...	MCP
better-email-mcp	IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att...	MCP
better-godot-mcp	Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g...	MCP
better-notion-mcp	Markdown-first Notion for AI agents -- pages, databases, blocks, and comments...	MCP
better-telegram-mcp	Telegram for AI agents -- messages, chats, media, and contacts across both bo...	MCP
claude-plugins	Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea...	Marketplace
imagine-mcp	Image and video understanding + generation for AI agents -- across Gemini, Op...	MCP
jules-task-archiver	Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a...	Tooling
mcp-core	Shared foundation for building MCP servers -- Streamable HTTP transport, OAut...	MCP
mnemo-mcp	Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi...	MCP
qwen3-embed	Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF	Library
skret	Secrets without the server.	CLI
tacet	TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn...	Tooling
web-core	Shared web infrastructure package for search, scraping, HTTP security, and st...	Library
wet-mcp	Open-source MCP server for AI agents: web search, content extraction, and lib...	MCP

Features
Status
Quick install
Configuration
Documentation
Tools
Comparison
Security
Build from Source
Deploy to Cloudflare
Trust Model
License

Features

Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap. Optional cloud search backends (Tavily, Brave, Exa) as a fallback chain via SEARCH_BACKENDS
Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
Media -- List + download images / videos / audio files. analyze was removed in v2.0.0 -- use imagine-mcp.understand for vision/audio inference
Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere, xAI, Anthropic) selected per task via the EMBEDDING_MODELS / RERANK_MODELS / LLM_MODELS model chains for higher-quality vectors and LLM features
Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)

Quick install

# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins

# Method 2 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp

# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8080 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest

Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).

Configuration

wet runs zero-config out of the box: web search uses an embedded local SearXNG, and embedding/reranking fall back to the bundled local Qwen3 ONNX models when no cloud keys are set. For higher-quality results, point each task at a cloud model chain. All settings are plain environment variables (no app prefix) -- in the HTTP self-host mode they are entered through the browser setup form instead.

Model chains (CSV provider/model,provider/model; order = fallback). Leave a chain empty to use the local ONNX models (embedding/rerank) or to disable LLM features (LLM):

Env var	Task	Empty default
`EMBEDDING_MODELS`	Embeddings for docs search	Local Qwen3-Embedding ONNX
`RERANK_MODELS`	Result reranking	Local Qwen3-Reranker ONNX
`LLM_MODELS`	`extract(action="agent")` synthesis	LLM features disabled

Provider keys -- the provider is inferred from each model's prefix; supply the matching key (litellm <PROVIDER>_API_KEY convention):

Model prefix	Key env var	Get it at
`jina_ai/`	`JINA_AI_API_KEY`	jina.ai/api-key
`gemini/`	`GEMINI_API_KEY`	aistudio.google.com/apikey
`openai/` (or bare)	`OPENAI_API_KEY`	platform.openai.com
`cohere/`	`COHERE_API_KEY`	dashboard.cohere.com
`xai/`	`XAI_API_KEY`	console.x.ai
`anthropic/`	`ANTHROPIC_API_KEY`	console.anthropic.com

Any other litellm provider works via env passthrough -- see litellm provider docs for its key name.

Search backends -- SEARCH_BACKENDS (CSV, runtime fallback chain) over searxng (default, local) plus optional cloud providers tavily / brave / exa. Point at an external SearXNG with SEARXNG_URL. Cloud providers need TAVILY_API_KEY / BRAVE_API_KEY / EXA_API_KEY.

Docs sync -- SYNC_ENABLED (default true), GOOGLE_DRIVE_CLIENT_ID (required for sync), SYNC_FOLDER (default wet-mcp), SYNC_INTERVAL (default 300s). Sync uses Google Drive over the OAuth Device Code flow (no browser redirect).

HTTP self-host -- MCP_TRANSPORT=http, PUBLIC_URL=<your-domain>. The setup form is gated by MCP_RELAY_PASSWORD; multi-user deployments also require CREDENTIAL_SECRET (per-user vault key) and MCP_DCR_SERVER_SECRET.

Example stdio config (cloud chains):

{
  "mcpServers": {
    "wet": {
      "command": "uvx",
      "args": ["wet-mcp"],
      "env": {
        "EMBEDDING_MODELS": "jina_ai/jina-embeddings-v5-text-small",
        "RERANK_MODELS": "jina_ai/jina-reranker-v3",
        "LLM_MODELS": "gemini/gemini-3-flash-preview",
        "JINA_AI_API_KEY": "jina_xxx",
        "GEMINI_API_KEY": "AIza_xxx"
      }
    }
  }
}

Status

Stable architecture with two transports: stdio (default, local) and HTTP (self-host, OAuth-gated). No daemon-bridge layer and no auto-spawn from stdio. The media.analyze action was removed in the v2.0.0 BREAKING release -- see docs/migration.md for the upgrade recipe. Current release line: v3.x.

Documentation

Full docs at mcp.n24q02m.com/servers/wet-mcp/setup/:

Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
Modes overview -- stdio / local-relay / remote-relay / remote-oauth
Multi-user setup -- per-JWT-sub credential model

In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):

docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics

Install with AI agent -- paste this to your AI coding agent:

Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md

Tools

6 MCP tools (3 domain + config + help + config__open_relay). The legacy setup tool merged into config action dispatch.

Tool	Description
`search`	Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages. Includes `docs_resolve` (library name -> ranked id), `docs_query` (version-aware + topic + 5000-token cap), `docs_lock_project` (Cabinets project pin via pyproject / package.json / go.mod / Cargo.toml manifest detection).
`extract`	URL -> smart chunks dict (`clean_text` + `markdown` + `structured_data` + `code_blocks` + `metadata`) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema)
`media`	`list` (discover URLs from gallery pages), `download` (SSRF-safe). `analyze` was removed in v2.0.0 -- use `imagine-mcp.understand` instead
`config`	`status`, `set`, `cache_clear`, `docs_reindex`, `warmup`, `setup_sync`, `setup_status`, `setup_skip`, `setup_reset`, `setup_complete`
`help`	Per-tool documentation: `search`, `extract`, `media`, `config`
`config__open_relay`	Re-trigger the zero-config relay setup flow (prints a fresh relay URL for the browser form). Registered via `mcp-core`'s `register_open_relay_tool` so an LLM can restart setup without a manual restart.

Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze was removed in wet v2.0.0 -- use imagine-mcp.understand instead.

Comparison

How wet-mcp stacks up against direct competitors in each pillar:

Capability	wet-mcp	Brave Search	Tavily	Firecrawl	Context7
Web search	Yes (SearXNG aggregation)	Yes	Yes	No	No
Extract URL	Yes (5-strategy chain)	No	Yes (basic)	Yes	No
Media list / download	Yes	No	No	No	No
Library docs search	Yes (Tier 1 curated + Tier 2 on-demand, version-aware, Cabinets)	No	No	No	Yes
Academic research	Yes (6 providers)	No	No	No	No
Self-hostable	Yes	No	No	No	Yes
Free tier	Yes (open source)	Limited	Limited	Limited	Yes

Security

SSRF prevention -- URL validation on crawl targets
Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
Error sanitization -- No credentials in error messages
File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction

Build from Source

git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp

Deploy to Cloudflare

Run your own single-user wet instance serverless on Cloudflare (Containers + D1 + Vectorize + KV).

Prerequisites: a Cloudflare account on the Workers Paid plan and the wrangler CLI.

git clone https://github.com/n24q02m/wet-mcp && cd wet-mcp
wrangler login

Provision resources and apply the D1 schema:

wrangler d1 create wet-docs
wrangler d1 execute wet-docs --file migrations/0001_init_wet.sql --remote
wrangler vectorize create wet-docs-vectors --dimensions 768 --metric cosine
wrangler kv namespace create wet-kv

Paste the returned IDs into wrangler.jsonc.

Push the container image to your Cloudflare managed registry (CF Containers cannot pull from external registries directly), then set <YOUR_ACCOUNT_ID> in wrangler.jsonc:

docker pull ghcr.io/n24q02m/wet-mcp:beta
docker tag ghcr.io/n24q02m/wet-mcp:beta wet-mcp:beta
wrangler containers push wet-mcp:beta   # prints registry.cloudflare.com/<ACCOUNT_ID>/wet-mcp:beta

Set secrets (use SEARXNG_URL with basic-auth userinfo, e.g. https://user:[email protected], or TAVILY_API_KEY if you set SEARCH_BACKEND=tavily):

wrangler secret put CREDENTIAL_SECRET
wrangler secret put JINA_AI_API_KEY
wrangler secret put GOOGLE_VERTEX_EXPRESS_API_KEY
wrangler secret put XAI_API_KEY
wrangler secret put MCP_RELAY_PASSWORD
wrangler secret put MCP_DCR_SERVER_SECRET
wrangler secret put SEARXNG_URL

wrangler deploy and complete setup in the browser relay form at your Worker domain.

Storage maps to Cloudflare via MCP_STORAGE_BACKEND=cf-kv (credentials/tokens, encrypted), DOCS_DB_BACKEND=cf-d1 (docs + BM25 full-text), and Vectorize (embeddings). Web search uses a SearXNG instance (SEARCH_BACKEND=searxng, SEARXNG_URL) or Tavily (SEARCH_BACKEND=tavily); embed/rerank are forced cloud via EMBEDDING_MODELS/RERANK_MODELS.

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core trust model for full classification.

Mode	Storage	Encryption	Who can read your data?
stdio (default)	`~/.wet-mcp/config.json`	AES-GCM, machine-bound key	Only your OS user (file perm 0600)
HTTP self-host	Same as stdio	Same	Only you (admin = user)

License

MIT -- See LICENSE.

Install n24q02m/wet-mcp in Claude Desktop, Claude Code & Cursor

Run in your terminal:

claude mcp add n24q02m-wet-mcp -- npx

FAQ

Is n24q02m/wet-mcp MCP free?

Yes, n24q02m/wet-mcp MCP is free — one-click install via Unyly at no cost.

Does n24q02m/wet-mcp need an API key?

No, n24q02m/wet-mcp runs without API keys or environment variables.

Is n24q02m/wet-mcp hosted or self-hosted?

Self-hosted: the server runs locally on your machine via the install command above.

How do I install n24q02m/wet-mcp in Claude Desktop, Claude Code or Cursor?

Open n24q02m/wet-mcp on unyly.org, pick your client tab (Claude Desktop, Claude Code, Cursor) and press Install — the config is generated automatically, no JSON editing.

n24q02m/wet-mcp

About

README

Table of contents

Features

Quick install

Configuration

Status

Documentation

Tools

Comparison

Security

Build from Source

Deploy to Cloudflare

Trust Model

License

Install n24q02m/wet-mcp in Claude Desktop, Claude Code & Cursor

FAQ

Is n24q02m/wet-mcp MCP free?

Does n24q02m/wet-mcp need an API key?

Is n24q02m/wet-mcp hosted or self-hosted?

How do I install n24q02m/wet-mcp in Claude Desktop, Claude Code or Cursor?

Related MCPs

Compare n24q02m/wet-mcp with

Notion

Linear

Google Drive

mindsdb/mindsdb

Command Palette

n24q02m/wet-mcp

About

README

Table of contents

Features

Quick install

Configuration

Status

Documentation

Tools

Comparison

Security

Build from Source

Deploy to Cloudflare

Trust Model

License

Install n24q02m/wet-mcp in Claude Desktop, Claude Code & Cursor

FAQ

Is n24q02m/wet-mcp MCP free?

Does n24q02m/wet-mcp need an API key?

Is n24q02m/wet-mcp hosted or self-hosted?

How do I install n24q02m/wet-mcp in Claude Desktop, Claude Code or Cursor?

Related MCPs

Compare n24q02m/wet-mcp with

Notion

Linear

Google Drive

mindsdb/mindsdb