- Fixed typo in README abstract [@remram44, #295].
- Fixed typos in code and documentation [@kianmeng, #294].
- Rolled back `rocksdb` version, as the latest version does not link properly in `--release` mode [@valeriansaliou].
- Dependencies have been bumped to latest versions (namely: `rocksdb`, `clap`, `regex`) [@valeriansaliou].
- Dependencies have been bumped to latest versions (namely: `hashbrown`, `whatlang`, `regex`) [@valeriansaliou].
- Moved the release pipeline to GitHub Actions [@valeriansaliou].
- The language detection system is now about 2x faster (due to the upgrade of `whatlang` past `v0.14.0`) [@valeriansaliou].
- Added Armenian stopwords [@valeriansaliou].
- Added Georgian stopwords [@valeriansaliou].
- Added Gujarati stopwords [@valeriansaliou].
- Added Tagalog stopwords [@valeriansaliou].
- Fixed Norwegian stopwords [@valeriansaliou, #239].
- Code has been formatted according to `clippy` recommendations. This does not change the way Sonic behaves [@pleshevskiy, #233].
- Added support for Chinese word segmentation in the tokenizer (note that, as this adds considerable overhead to the final binary size, the `tokenizer-chinese` feature can be disabled when building Sonic) [@vincascm, #209].
- Apple Silicon is now supported [@valeriansaliou].
- Added Norwegian stopwords [@mikalv, #236].
- Added Catalan stopwords [@coopanio, #227].
- Dependencies have been bumped to latest versions (namely: `rocksdb`, `fst-levenshtein`, `fst-regex`, `hashbrown`, `whatlang`, `byteorder`, `rand`) [@valeriansaliou].
- A few rarely-used languages have been removed, following `whatlang` `v0.12.0` release, see the notes here [@valeriansaliou, 940d3c3].
- Added support for Slovak, which is now auto-detected from terms [@valeriansaliou, 19412ce].
- Added Slovak stopwords [@valeriansaliou, 19412ce].
- Dependencies have been bumped to latest versions (namely: `whatlang`) [@valeriansaliou, 19412ce].
- Fixed multiple deadlocks, which were not noticed in practice when running Sonic at scale, but that are still theoretically possible [@BurtonQin, #213, #211].
- Added support for Latin, which is now auto-detected from terms [@valeriansaliou, e6c5621].
- Added Latin stopwords [@valeriansaliou, e6c5621].
- Dependencies have been bumped to latest versions (namely: `rocksdb`, `radix`, `hashbrown`, `whatlang`) [@valeriansaliou].
- Added a release script, with cross-compilation capabilities (currently for the `x86_64` architecture, dynamically linked against GNU libraries) [@valeriansaliou, 961bab9].
- RocksDB compression algorithm has been changed from LZ4 to Zstandard, for a slightly better compression ratio, and much better read/write performance; this will be used for new SST files only [@valeriansaliou, cd4cdfb].
- Dependencies have been bumped to latest versions (namely: `rocksdb`) [@valeriansaliou, cd4cdfb].
- Fixed a regression on optional configuration values not working anymore, due to an issue in the environment variable reading system introduced in `v1.2.1` [@valeriansaliou, #155].
- Optimized some aspects of FST consolidation and pending operations management [@valeriansaliou, #156].
- FST graph consolidation is now able to ignore new words when the graph is over configured limits, which are set with the new `store.fst.graph.max_size` and `store.fst.graph.max_words` configuration variables [@valeriansaliou, 53db9c1].
- An integration testing infrastructure has been added to the Sonic automated test suite [@vilunov, #154].
- Configuration values can now be sourced from environment variables, using the `${env.VARIABLE}` syntax in `config.cfg` [@perzanko, #148].
- Dependencies have been bumped to latest versions (namely: `rand`, `radix` and `hashbrown`) [@valeriansaliou, c1b1f54].
- Fixed a rare deadlock occurring when 3 concurrent operations get executed on different threads for the same collection, in the following chronological order: `PUSH` then `FLUSHB` then `PUSH` [@valeriansaliou, d96546b].
- Reworked the KV store manager to perform periodic memory flushes to disk, thus reducing startup time [@valeriansaliou, 6713488].
- Stop accepting Sonic Channel commands when shutting down Sonic [@valeriansaliou, #131].
- Introduced a server statistics `INFO` command to Sonic Channel [@valeriansaliou, #70].
- Added the ability to disable the lexer for a command with the command modifier `LANG(none)` [@valeriansaliou, #108].
- Added a backup and restore system for both KV and FST stores, which can be triggered over Sonic Channel with `TRIGGER backup` and `TRIGGER restore` [@valeriansaliou, #5].
- Added the ability to disable KV store WAL (Write-Ahead Log) with the `write_ahead_log` option, which helps limit write wear on heavily loaded SSD-backed servers [@valeriansaliou, #130].
- RocksDB has been bumped to `v5.18.3`, which fixes a dead-lock occurring in RocksDB at scale when a compaction task is run under heavy disk writes (i.e. disk flushes). This dead-lock was causing Sonic to stop responding to any command issued for the frozen collection. This dead-lock was due to a bug in RocksDB internals (not originating from Sonic itself) [@baptistejamin, 19c4a10].
- Reworked the `FLUSHB` command internals, which now use the atomic `delete_range()` operation provided by RocksDB `v5.18` [@valeriansaliou, 660f8b7].
- Added the `LANG(<locale>)` command modifier for `QUERY` and `PUSH`, which lets a Sonic Channel client force a text locale (instead of letting the lexer system guess the text language) [@valeriansaliou, #75].
- The FST word lookup system, used by the `SUGGEST` command, now supports all scripts via a restricted Unicode range forward scan [@valeriansaliou, #64].
- A store acquire lock has been added to prevent 2 concurrent threads from opening the same collection at the same time [@valeriansaliou, 2628077].
- A superfluous mutex was removed from KV and FST store managers, in an attempt to solve a rare dead-lock occurring on high-traffic Sonic setups in the KV store [@valeriansaliou, 60566d2].
- Reverted changes made in `v1.1.5` regarding the open files `rlimit`, as this can be set from outside Sonic [@valeriansaliou, f6400c6].
- Added Chinese Traditional stopwords [@dsewnr, #87].
- Improved the way database locking is handled when calling a pool janitor; this prevents potential dead-locks under high load [@valeriansaliou, fa78372].
- Added the `server.limit_open_files` configuration variable to allow configuring `rlimit` [@valeriansaliou].
- Added Kannada stopwords [@dileepbapat].
- The Docker image is now much lighter [@codeflows].
- Automatically adjust `rlimit` for the process to the hard limit allowed by the system (allows opening more FSTs in parallel) [@valeriansaliou].
- Limit the size of words that can hit against the FST graph, as the FST gets slower for long words [@valeriansaliou, #81].
- Rework Sonic Channel buffer management using a VecDeque (Sonic should now work better in harsh network environments) [@valeriansaliou, 1c2b9c8].
- FST graph consolidation locking strategy has been improved even further, based on issues with the previous rework that we noticed at scale in production (consolidation locking is now done at a lower priority relative to actual queries and pushes to the index) [@valeriansaliou, #68].
- FST graph consolidation locking strategy has been reworked so as to allow queries to be executed lock-free when the FST consolidate task takes a lot of time (previously, queries were being deferred due to an ongoing FST consolidate task) [@valeriansaliou, #68].
- Removed special license clause introduced in `v1.0.2`, Sonic is full `MPL 2.0` now [@valeriansaliou].
- Change how buckets are stored in a KV-based collection (nest them in the same RocksDB database; this is much more efficient on setups with a large number of buckets - `v1.1.0` is incompatible with the `v1.0.0` KV database format) [@valeriansaliou].
- Bump `jemallocator` to version `0.3` [@valeriansaliou].
- Re-license from `MPL 2.0` to `SOSSL 1.0` (Sonic has a special license clause) [@valeriansaliou].
- Added automated benchmarks (can be run via `cargo bench --features benchmark`) [@valeriansaliou].
- Reduced the time to query the search index by 50% via optimizations (in multiple methods, e.g. the lexer) [@valeriansaliou].
- Initial Sonic release [@valeriansaliou].