# SearchHub

A local search engine for your browser bookmarks. Import bookmarks from Firefox, Zen, Chrome, or Chromium, search them with full-text queries, and optionally forward searches to external engines like crates.io (via its public JSON API) or SearXNG (which aggregates results from dozens of backends). Content can be automatically tagged via local ONNX embeddings (opt-in; set `tagging_enabled = true` in the config).

## Install

**Binaries** are available at [vit.am/~ololduck/search_hub/latest](https://vit.am/~ololduck/search_hub/latest/). Download the archive or the statically-linked binary for your architecture, extract, and run.

**Source:** Clone the [repository](https://vit.am/~ololduck/search_hub/repository.git) and build with Rust.

**Prerequisites:** Rust (install via [rustup](https://rustup.rs/)).

```sh
# Clone into a directory named search_hub (the default would be "repository")
git clone https://vit.am/~ololduck/search_hub/repository.git search_hub
cd search_hub
cargo install --path .
```

This installs the `search_hub` binary to `~/.cargo/bin/search_hub`. To update later, pull the latest code and reinstall.

## First steps

```sh
# Import bookmarks from Firefox (auto-discovers your profile)
search_hub import firefox

# Import from Chrome
search_hub import chrome

# Start the web UI
search_hub serve
```

Open http://127.0.0.1:8080 in your browser. You can now search your bookmarks.

Search queries are also forwarded to external engines: [crates.io](https://crates.io) via its public JSON API, and optionally [SearXNG](https://searx.space) (which aggregates Google, Bing, DuckDuckGo, and dozens more) if a `[[engines]]` entry is configured.

SearchHub also works as a custom search provider in Firefox/Zen via the OpenSearch protocol (your browser should auto-discover it at `/opensearch.xml`).

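To confirm that the server is up and that the OpenSearch descriptor is being served, something like the following may help. It is a minimal sketch, not part of `search_hub`: the helper names `opensearch_url` and `check_opensearch` are hypothetical, and it assumes `curl` is installed and that the server listens on the default address shown above (127.0.0.1:8080) unless a host and port are passed.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with search_hub).

# Print the OpenSearch descriptor URL for a SearchHub instance.
# $1 = host (default 127.0.0.1), $2 = port (default 8080)
opensearch_url() {
    host="${1:-127.0.0.1}"
    port="${2:-8080}"
    printf 'http://%s:%s/opensearch.xml\n' "$host" "$port"
}

# Return success if the descriptor URL answers without an HTTP error.
# -f: fail on HTTP errors, -sS: quiet but still report errors,
# -o /dev/null: discard the response body
check_opensearch() {
    curl -fsS -o /dev/null "$(opensearch_url "$@")"
}
```

After sourcing the snippet, `check_opensearch && echo "descriptor reachable"` checks the default address, and `check_opensearch 127.0.0.1 3000` checks a server started with `--port 3000`.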
## CLI reference

| Command | What it does |
|---------|-------------|
| `search_hub serve` | Start web UI on port 8080 |
| `search_hub serve --port 3000` | Start on a custom port |
| `search_hub import firefox` | Import bookmarks from Firefox |
| `search_hub import chrome` | Import from Chrome/Chromium |
| `search_hub import zen` | Import from Zen Browser |
| `search_hub search "query"` | Search bookmarks from the terminal |
| `search_hub list` | List all bookmarks |
| `search_hub insert "Title" https://...` | Add a bookmark (fetches content, auto-tags if enabled) |
| `search_hub remove --id 1` | Delete a bookmark by ID |
| `search_hub retag --all` | Re-run auto-tagging (requires `tagging_enabled = true` in config) |
| `search_hub init-config` | Create a default config file at `~/.config/search_hub/config.toml` |
| `search_hub self-update` | Check the abbaye Atom feed and update to the latest release |
| `search_hub self-update --dry-run` | Check for updates without downloading |
| `search_hub self-update --target x86_64-unknown-linux-gnu` | Override the target triple |

All commands use `~/.local/share/search_hub/bookmarks.db` by default. Override with `--db-path` or set `db_path` in the config file.

The first time you use a search or insert command, SearchHub downloads an ONNX embedding model (about 127 MB) to `.fastembed_cache/` in the project directory.

## Configuration

Run `search_hub init-config` to create `~/.config/search_hub/config.toml` with all available options commented out.
Or create it manually:

```toml
# Bookmark database path (default: platform data directory)
# db_path = "/home/you/.local/share/search_hub/bookmarks.db"

# Custom tags override the built-in defaults
# [[tags]]
# name = "my-custom-tag"
# examples = ["example text one", "example text two"]

# Whether auto-tagging is enabled (default: false, requires ONNX model download on first use)
# tagging_enabled = true

# Minimum confidence for auto-tagging (0.0 to 1.0, default: 0.6)
# tagging_threshold = 0.6

# Hosts to skip when fetching content for bookmarking (default: local addresses)
# exclude_urls = ["localhost", "127.0.0.1", "::1"]

# Per-engine configuration (optional)
# Multiple instances supported (e.g., public + private crates.io registries)
[[engines]]
type = "searxng"
instance = "https://search.kael.ink"
# timeout_secs = 10.0  # optional per-engine timeout
# Best: use an existing public instance (see https://searx.space).
# Also possible: run your own with Docker:
#   docker run -d --name searxng -p 8888:8080 searxng/searxng

# Custom crates.io registry (optional)
# [[engines]]
# type = "crates_io"
# url = "https://registry.example.com/api/v1/crates?q={}&per_page=10"
# timeout_secs = 5.0

# Wikipedia search (optional, defaults to English)
# [[engines]]
# type = "wikipedia"
# lang = "fr"
# timeout_secs = 5.0

# MDN Web Docs search (optional, defaults to en-US)
# [[engines]]
# type = "mdn"
# locale = "fr"
# timeout_secs = 5.0

# Generic HTML-scraped engine (use with any search site)
# Provide a URL template with `{}` for the query and a CSS selector
# targeting the result container. Results are extracted from `<a>` links
# inside that container (deduplicated, up to 10, http/https only).
#
# Note: most commercial search engines (Google, Bing, DuckDuckGo, etc.)
# block automated requests. This engine works best with small/niche sites
# that don't enforce bot detection. To find the right selector, view the
# page source or use browser dev tools on the search results page.
# [[engines]]
# type = "generic"
# name = "DuckDuckGo"
# url = "https://html.duckduckgo.com/html/?q={}"
# selector = "div.results"
# timeout_secs = 10.0
```

## Run the web server as a systemd user service

This keeps the web UI running in the background and starts it automatically on login.

```sh
cp contrib/search-hub-web.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now search-hub-web.service
```

Check status with `systemctl --user status search-hub-web`. View logs with `journalctl --user -u search-hub-web -f`.

## Auto-import with systemd

```sh
cp contrib/search-hub-import.service ~/.config/systemd/user/
cp contrib/search-hub-import.timer ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now search-hub-import.timer
```

This imports bookmarks from Zen Browser daily. Edit the service file to import from another browser.

## Auto-update with systemd

```sh
cp contrib/search-hub-self-update.service ~/.config/systemd/user/
cp contrib/search-hub-self-update.timer ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now search-hub-self-update.timer
```

This checks for new releases weekly and updates the binary automatically.

## Run with Podman / Docker

A container image is available at `oci.vit.am/search-hub:latest`. It serves on port 8080 as the `search_hub` user and expects:

- **Config** mounted at `/home/search_hub/.config/search_hub/config.toml`
- **Database** directory mounted at `/home/search_hub/.local/share/search_hub/`

```sh
# Pull and run
podman run -d --name search-hub \
  -p 8080:8080 \
  -v ~/.config/search_hub:/home/search_hub/.config/search_hub:ro \
  -v ~/.local/share/search_hub:/home/search_hub/.local/share/search_hub \
  oci.vit.am/search-hub:latest serve

# SIGHUP reload (re-reads config)
podman kill -s HUP search-hub

# Build locally from the Containerfile
podman build -t search-hub:latest -f Containerfile .
```

### docker-compose

```sh
docker compose up -d
```

See `docker-compose.yaml` at the project root. A SearXNG service is included as a commented-out example.

### Podman Quadlet (systemd-native)

```sh
mkdir -p ~/.config/containers/systemd
cp contrib/search-hub.container ~/.config/containers/systemd/
systemctl --user daemon-reload
systemctl --user enable --now search-hub
```

The Quadlet file uses `%h` (your home directory) for volume source paths.

## Resources

- **Downloads:** [vit.am/~ololduck/search_hub/latest](https://vit.am/~ololduck/search_hub/latest/)
- **Repository browser:** [vit.am/~ololduck/search_hub/repository](https://vit.am/~ololduck/search_hub/repository)
- **Git clone:** `git clone https://vit.am/~ololduck/search_hub/repository.git`