# SearchHub

A local search engine for your browser bookmarks and history.

Import bookmarks and history from Firefox, Zen, Chrome, or Chromium, search them with full-text queries, and optionally forward searches to external engines such as Wikipedia or SearXNG (which aggregates results from dozens of backends). Content can be tagged automatically via local ONNX embeddings (opt-in; set `tagging_enabled = true` in the config).

## Install

**Binaries** are available at [vit.am/~ololduck/search_hub/latest](https://vit.am/~ololduck/search_hub/latest/). Download the binary for your architecture, extract it, and run it.

**Source:** Clone the [repository](https://vit.am/~ololduck/search_hub/repository.git) and build with Rust.

**Prerequisites:** Rust (install via [rustup](https://rustup.rs/)).

```sh
git clone https://vit.am/~ololduck/search_hub/repository.git search_hub
cd search_hub
cargo install --path .
```

This installs the `search_hub` binary to `~/.cargo/bin/search_hub`. To update later, pull the latest code and reinstall.

## First steps

```sh
# Import bookmarks from Firefox (auto-discovers your profile)
search_hub import firefox

# Import from Chrome
search_hub import chrome

# Start the web UI
search_hub serve
```

Open http://127.0.0.1:8080 in your browser. You can now search your bookmarks.

Search queries are also forwarded to external engines: Wikipedia, [crates.io](https://crates.io) via its public JSON API, and optionally [SearXNG](https://searx.space) (which aggregates Google, Bing, DDG, and dozens more) if a SearXNG instance is configured under `[[engines]]`.

SearchHub also works as a custom search provider in Firefox/Zen via the OpenSearch protocol (your browser should auto-discover it at `/opensearch.xml`).

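If the browser does not pick up the search provider, it can help to check that the OpenSearch descriptor is actually being served. The following is a minimal sketch, assuming `curl` is installed and `search_hub serve` is running; the helper names `opensearch_url` and `check_opensearch` are hypothetical and not part of SearchHub.

```sh
# Build the descriptor URL for a host and port.
# Defaults match the address used in this README (127.0.0.1:8080).
opensearch_url() {
  printf 'http://%s:%s/opensearch.xml\n' "${1:-127.0.0.1}" "${2:-8080}"
}

# Fetch the descriptor and print it to stdout.
# A non-zero exit status means the server was unreachable
# or answered with an HTTP error (curl -f).
check_opensearch() {
  curl -fsS "$(opensearch_url "$@")"
}
```

Run `check_opensearch` for the default address, or `check_opensearch 127.0.0.1 3000` if the server was started with `--port 3000`.
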
## CLI reference

| Command | What it does |
|---------|-------------|
| `search_hub serve` | Start web UI on port 8080 |
| `search_hub serve --port 3000` | Start on a custom port |
| `search_hub import firefox` | Import bookmarks from Firefox |
| `search_hub import chrome` | Import from Chrome/Chromium |
| `search_hub import zen` | Import from Zen Browser |
| `search_hub search "query"` | Search bookmarks from the terminal |
| `search_hub list` | List all bookmarks |
| `search_hub insert "Title" "https://..."` | Add a bookmark (fetches content, auto-tags if enabled) |
| `search_hub remove --id 1` | Delete a bookmark by ID |
| `search_hub retag --all` | Re-run auto-tagging (requires `tagging_enabled = true` in config) |
| `search_hub init-config` | Create a default config file at `~/.config/search_hub/config.toml` |
| `search_hub self-update` | Check the abbaye Atom feed and update to the latest release |
| `search_hub self-update --dry-run` | Check for updates without downloading |
| `search_hub self-update --target x86_64-unknown-linux-gnu` | Override the target triple |

All commands use `~/.local/share/search_hub/bookmarks.db` by default. Override with `--db-path` or set `db_path` in the config file.

The first time you use a search or insert command, SearchHub downloads an ONNX embedding model (about 127 MB) to `$XDG_CACHE_HOME/search_hub` (defaults to `~/.cache/search_hub`).

## Configuration

Run `search_hub init-config` to create `~/.config/search_hub/config.toml` with all available options commented out.
Or create it manually:

```toml
# Bookmark database path (default: platform data directory)
# db_path = "/home/you/.local/share/search_hub/bookmarks.db"

# Custom tags override the built-in defaults
# [[tags]]
# name = "my-custom-tag"
# examples = ["example text one", "example text two"]

# Whether auto-tagging is enabled (default: false, requires ONNX model download on first use)
# tagging_enabled = true

# Minimum confidence for auto-tagging (0.0 to 1.0, default: 0.6)
# tagging_threshold = 0.6

# Hosts to skip when fetching content for bookmarking (default: local addresses)
# exclude_urls = ["localhost", "127.0.0.1", "::1"]

# Per-engine configuration (optional)
# Multiple instances supported (e.g., public + private crates.io registries)
[[engines]]
type = "searxng"
instance = "https://search.kael.ink"
# timeout_secs = 10.0  # optional per-engine timeout

# Best: use an existing public instance (see https://searx.space).
# Also possible: run your own with Docker:
#   docker run -d --name searxng -p 8888:8080 searxng/searxng

# Custom crates.io registry (optional)
# [[engines]]
# type = "crates_io"
# url = "https://registry.example.com/api/v1/crates?q={}&per_page=10"
# timeout_secs = 5.0

# Wikipedia search (optional, defaults to English)
# [[engines]]
# type = "wikipedia"
# lang = "fr"
# timeout_secs = 5.0

# MDN Web Docs search (optional, defaults to en-US)
# [[engines]]
# type = "mdn"
# locale = "fr"
# timeout_secs = 5.0

# Generic HTML-scraped engine (use with any search site)
# Provide a URL template with `{}` for the query and a CSS selector
# targeting the result container. Results are extracted from `<a>` links
# inside that container (deduplicated, up to 10, http/https only).
#
# Note: most commercial search engines (Google, Bing, DuckDuckGo, etc.)
# block automated requests. This engine works best with small/niche sites
# that don't enforce bot detection. To find the right selector, view the
# page source or use browser dev tools on the search results page.
# [[engines]]
# type = "generic"
# name = "DuckDuckGo"
# url = "https://html.duckduckgo.com/html/?q={}"
# selector = "div.results"
# timeout_secs = 10.0
# shortcode = "ddg"     # optional: override auto-generated shortcode
# bang_enabled = true   # optional: set to false to disable ! redirect but keep @
# bang_url = "..."      # optional: custom redirect URL (keeps shortcode)
```

## Search shortcuts

SearchHub supports two query prefixes that use **shortcodes**: compact aliases auto-generated from your configured `[[engines]]`.

| Prefix | Example | Behavior |
|--------|---------|----------|
| `!` | `!w Rust` | HTTP 302 redirect to the site's own search results page |
| `@` | `@w Rust` | Show search results from that engine only (bookmarks still shown) |

### Auto-generated shortcodes

Each engine type gets a sensible default shortcode:

| Engine | Shortcode | Bang URL |
|--------|-----------|----------|
| Wikipedia (lang=en) | `w` | `https://en.wikipedia.org/w/index.php?search={}` |
| Wikipedia (lang=fr) | `wfr` | `https://fr.wikipedia.org/w/index.php?search={}` |
| MDN (locale=en-US) | `mdn` | `https://developer.mozilla.org/en-US/search?q={}` |
| MDN (locale=fr) | `mdnfr` | `https://developer.mozilla.org/fr/search?q={}` |
| crates.io | `crates` | `https://crates.io/search?q={}` |
| SearXNG | `sx` | `{instance}/search?q={}` |
| Generic | slugified name | the engine's own URL template |

### Overriding shortcodes per engine

Set `shortcode`, `bang_url`, or `bang_enabled` directly on the engine:

```toml
[[engines]]
type = "wikipedia"
lang = "fr"
shortcode = "wikifr"       # overrides "wfr"
bang_enabled = false       # disable ! redirect (still searchable via @)
bang_url = "https://..."   # custom redirect URL
```

### Custom bangs (standalone shortcuts, no @ support)

```toml
[[bangs]]
trigger = "gh"
url = "https://github.com/search?q={}"
name = "GitHub"

# Suppress an auto-generated shortcut
[[bangs]]
trigger = "crates"
enabled = false
```

### Collisions

If two engines produce the same shortcode, SearchHub panics at startup with a message naming both engines. Set `shortcode` on one of them to resolve the collision.

## Run the web server as a systemd user service

This keeps the web UI running in the background and starts it automatically on login.

```sh
# Read the installed version so the unit file matches the binary
VERSION=$(search_hub --version | cut -d' ' -f2)
mkdir -p ~/.config/systemd/user
wget -O ~/.config/systemd/user/search-hub-web.service \
  "https://vit.am/~ololduck/search_hub/repository/browse/v$VERSION/contrib/search-hub-web.service"
systemctl --user daemon-reload
systemctl --user enable --now search-hub-web.service
```

Check status with `systemctl --user status search-hub-web`. View logs with `journalctl --user -u search-hub-web -f`.

## Auto-import with systemd

```sh
VERSION=$(search_hub --version | cut -d' ' -f2)
mkdir -p ~/.config/systemd/user
wget -O ~/.config/systemd/user/search-hub-import.service \
  "https://vit.am/~ololduck/search_hub/repository/browse/v$VERSION/contrib/search-hub-import.service"
wget -O ~/.config/systemd/user/search-hub-import.timer \
  "https://vit.am/~ololduck/search_hub/repository/browse/v$VERSION/contrib/search-hub-import.timer"
systemctl --user daemon-reload
systemctl --user enable --now search-hub-import.timer
```

This imports bookmarks from Zen Browser daily. Edit the service file to import from another browser.

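The systemd sections in this README repeat one pattern: read the installed version, then download unit files from `contrib/` at the matching tag. A minimal sketch of that pattern as shell functions follows. It assumes `search_hub --version` prints `<name> <version>` (inferred from the `cut` call in the commands above) and that `wget` is installed; `parse_version`, `unit_url`, and `fetch_units` are hypothetical helper names, not part of SearchHub.

```sh
# Extract the version from a "<name> <version>" string (assumed output format).
parse_version() {
  printf '%s\n' "$1" | cut -d' ' -f2
}

# Build the download URL of a contrib/ unit file at a given version tag.
unit_url() {
  printf 'https://vit.am/~ololduck/search_hub/repository/browse/v%s/contrib/%s\n' "$1" "$2"
}

# Download every unit file named on the command line
# into the systemd user unit directory.
fetch_units() {
  version=$(parse_version "$(search_hub --version)")
  mkdir -p "$HOME/.config/systemd/user"
  for unit in "$@"; do
    wget -O "$HOME/.config/systemd/user/$unit" "$(unit_url "$version" "$unit")"
  done
}
```

For example, `fetch_units search-hub-import.service search-hub-import.timer` followed by `systemctl --user daemon-reload` would replace the two `wget` calls of the auto-import setup.
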
## Auto-update with systemd

```sh
VERSION=$(search_hub --version | cut -d' ' -f2)
mkdir -p ~/.config/systemd/user
wget -O ~/.config/systemd/user/search-hub-update.service \
  "https://vit.am/~ololduck/search_hub/repository/browse/v$VERSION/contrib/search-hub-update.service"
wget -O ~/.config/systemd/user/search-hub-update.timer \
  "https://vit.am/~ololduck/search_hub/repository/browse/v$VERSION/contrib/search-hub-update.timer"
systemctl --user daemon-reload
# Enable the timer under the same name it was downloaded as
systemctl --user enable --now search-hub-update.timer
```

This checks for new releases weekly and updates the binary automatically.

## Run with Podman / Docker

A container image is available at `oci.vit.am/search-hub:latest`. It serves on port 8080 as the `search_hub` user and expects:

- **Config** mounted at `/home/search_hub/.config/search_hub/config.toml`
- **Database** directory mounted at `/home/search_hub/.local/share/search_hub/`

```sh
# Pull and run
podman run -d --name search-hub \
  -p 8080:8080 \
  -v ~/.config/search_hub:/home/search_hub/.config/search_hub:ro \
  -v ~/.local/share/search_hub:/home/search_hub/.local/share/search_hub \
  oci.vit.am/search-hub:latest serve

# SIGHUP reload (re-reads config)
podman kill -s HUP search-hub

# Build locally from the Containerfile
podman build -t search-hub:latest -f Containerfile .
```

### docker-compose

```sh
docker compose up -d
```

See `docker-compose.yaml` at the project root. A SearXNG service is included as a commented-out example.

### Podman Quadlet (systemd-native)

```sh
mkdir -p ~/.config/containers/systemd
wget -O ~/.config/containers/systemd/search-hub.container \
  "https://vit.am/~ololduck/search_hub/repository/browse/main/contrib/search-hub.container"
systemctl --user daemon-reload
systemctl --user enable --now search-hub
```

The Quadlet file uses `%h` (your home directory) for volume source paths.

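Both the `podman run` command and the Quadlet unit bind-mount the config and database directories from your home directory. Podman typically refuses to start a container whose bind-mount source is missing, so creating both directories first avoids that failure on a fresh machine. A minimal sketch, assuming the default paths from this README; `ensure_search_hub_dirs` is a hypothetical helper name.

```sh
# Create the config and data directories under a given home directory.
# With no argument, the current user's $HOME is used.
ensure_search_hub_dirs() {
  home_dir="${1:-$HOME}"
  mkdir -p "$home_dir/.config/search_hub" "$home_dir/.local/share/search_hub"
}
```

Call `ensure_search_hub_dirs` once before the first `podman run` or before enabling the Quadlet unit. It does not create `config.toml`; use `search_hub init-config` or write the file manually for that.
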
## Resources

- **Downloads:** [vit.am/~ololduck/search_hub/latest](https://vit.am/~ololduck/search_hub/latest/)
- **Repository browser:** [vit.am/~ololduck/search_hub/repository](https://vit.am/~ololduck/search_hub/repository)
- **Git clone:** `git clone https://vit.am/~ololduck/search_hub/repository.git search_hub`