ang-zeyu/infisearch

Morsels.j/rs 🧀

A complete and more scalable pre-built index approach to client-side search.


Description

Morsels is a complete client-side search solution, including a search user interface and library that depends on a pre-built index generated by a command-line build tool.

The secondary value proposition here versus other pre-built index options is the option to split the index into many smaller chunks ("morsels"), so that the client retrieves and loads only what a given search needs. The index is also generated in a low-level format that employs various compression schemes, powered by WebAssembly, enabling a much smaller index size.

In all, this avoids blowing up network and memory usage on startup, and increases the scalability of client-side search options powered by a pre-built index tremendously.
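To make the chunking idea concrete, here is a minimal Python sketch. It is not Morsels' actual index format or API: the function names (`build_chunks`, `chunks_needed`), the `max_entries_per_chunk` budget, and the sample postings are all hypothetical. It only illustrates how a small term dictionary can tell the client which chunks a query needs.

```python
def build_chunks(postings, max_entries_per_chunk):
    """Pack terms (in sorted order) into chunks under a size budget.

    postings: dict mapping term -> list of document ids (hypothetical data).
    Returns (dictionary, chunks):
      dictionary: term -> chunk id; small enough to load up front.
      chunks: list of dicts, each holding the postings of a few terms.
    """
    dictionary = {}
    chunks = [{}]
    used = 0
    for term in sorted(postings):
        size = len(postings[term])
        # Start a new chunk when the current one would exceed the budget.
        if chunks[-1] and used + size > max_entries_per_chunk:
            chunks.append({})
            used = 0
        chunks[-1][term] = postings[term]
        dictionary[term] = len(chunks) - 1
        used += size
    return dictionary, chunks


def chunks_needed(dictionary, query_terms):
    """Return the sorted ids of the chunks a query has to fetch."""
    return sorted({dictionary[t] for t in query_terms if t in dictionary})


postings = {
    "cheese": [1, 4, 9],
    "index": [2, 4],
    "morsel": [1, 2, 3, 4],
    "search": [3, 9],
    "wasm": [7],
}
dictionary, chunks = build_chunks(postings, max_entries_per_chunk=5)
needed = chunks_needed(dictionary, ["search", "cheese"])
print(len(chunks), needed)
```

With a budget of 5 entries the sample index splits into three chunks, and the query touches only two of them. Raising the budget far enough collapses everything into a single chunk, which behaves like a monolithic pre-built index.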

Features

  • Multi-threaded CLI indexer powered by Rust
  • WebWorker built-in: no more hanging UI threads!
  • Disjunctive expression scoring using BM25
  • Standard search features, such as boolean queries and field filters
  • Positional search features: phrase queries, and query term proximity boosts
  • Gap and varint compression, giving you more bang-per-byte
  • Incremental indexing
  • Customisable dropdown / fullscreen popup user interface
  • A plugin for mdbook!
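As an illustration of the "gap and varint compression" item above, the following Python sketch encodes a sorted list of document ids as gaps and writes each gap as a variable-length integer. This is the generic textbook technique, not Morsels' actual byte layout; the function names and the 7-bits-per-byte, low-byte-first encoding are assumptions made for the example.

```python
def encode_varint(n):
    """Encode a non-negative int as base-128 bytes, low 7 bits first."""
    if n < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def encode_postings(doc_ids):
    """Store each id as the gap from the previous id, varint-encoded.

    doc_ids must be in increasing order so that every gap is non-negative.
    """
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        out += encode_varint(doc_id - prev)
        prev = doc_id
    return bytes(out)


def decode_postings(data):
    """Invert encode_postings: read varints and accumulate the gaps."""
    doc_ids = []
    current = 0  # running sum of gaps = current document id
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += value
            doc_ids.append(current)
            value = 0
            shift = 0
    return doc_ids


doc_ids = [1000, 1003, 1010, 1500]
encoded = encode_postings(doc_ids)
print(len(encoded), decode_postings(encoded) == doc_ids)
```

The four ids become the gaps 1000, 3, 7 and 490, which take 6 bytes in total, compared with 16 bytes if each id were stored as a fixed 32-bit integer. Small gaps between nearby document ids are what make the combination effective.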

Use Cases

The main target use case for this tool right now is providing a complete search solution for static sites (and possibly really, really large ones) or static site generators.

That said, the indexing tool was built with support for a few other file formats (.json, .csv, .html) in mind, and might be useful elsewhere as such.

Getting Started

Please check out the docs!

Preview

preview gif of morsels search


FAQ

How Scalable is it?

This tool should be able to handle collections of 800 MB of pure text (not counting things like HTML soup) with the full set of features enabled (numbers here).

What's the Catch?

  1. Latency & File Bloat

    Scaling this tool for larger collections necessitates fragmenting the index and retrieving only what's needed when searched, which means extra network requests (but to a reasonable degree).

    Nevertheless, this tradeoff can be configured to varying degrees. That is, Morsels can also function much like other monolithic pre-built index options for smaller collections.

  2. Wasm -- no IE support

  3. Not production ready

Contributing

Contributions are highly welcome! Please refer to the setup guide to get started.

License

This project is MIT licensed.