Lucene term vectors are a bit heavy for the way this plugin uses them. And why encode/decode the vector ordinal numbers as terms at all? Instead, I propose the following:
Add a new special text field that has payloads enabled and no term vectors. This field will only ever index one nominal term, say the empty string or the single letter 'X'; it doesn't matter which. Each vector ordinal (0 through 5, or however long the vector is) becomes a term position of this term for a document. The payload encodes the number as a 4-byte float. The home page of this plugin shows the numbers as dense, but this approach (and the term vector one) could easily be sparse. This would be somewhat slower than a custom BinaryDocValues (another implementation path), but it leverages Lucene more and is less custom, for whatever benefit that is (e.g. easier debuggability).
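To make the position/payload layout concrete, here is a minimal sketch in plain Java. It is not the plugin's code and does not touch Lucene; the class and method names (`VectorPayloadSketch`, `toPositionPayloads`, etc.) are made up for illustration, and skipping zero components is just one assumed way to get the sparse form. Lucene ships `PayloadHelper.encodeFloat`/`decodeFloat` for the float encoding; the standard-library version is used here only to keep the sketch dependency-free.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

public class VectorPayloadSketch {

    // Encode one vector component as a 4-byte big-endian IEEE 754 float.
    public static byte[] encodeFloat(float value) {
        return ByteBuffer.allocate(Float.BYTES).putFloat(value).array();
    }

    // Decode a 4-byte payload back into the float it carries.
    public static float decodeFloat(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    // Hypothetical helper: map each vector ordinal (the would-be term
    // position of the single nominal term) to its payload bytes.
    // Assumption: zero components are simply not indexed (sparse form).
    public static Map<Integer, byte[]> toPositionPayloads(float[] vector) {
        Map<Integer, byte[]> positions = new TreeMap<>();
        for (int ordinal = 0; ordinal < vector.length; ordinal++) {
            if (vector[ordinal] != 0.0f) {
                positions.put(ordinal, encodeFloat(vector[ordinal]));
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        float[] vector = {0.1f, 4.75f, 0.0f, 1.2f, 0.7f, 4.0f};
        for (Map.Entry<Integer, byte[]> e : toPositionPayloads(vector).entrySet()) {
            System.out.println("position " + e.getKey()
                    + " -> payload float " + decodeFloat(e.getValue()));
        }
    }
}
```

In a real implementation a TokenFilter would emit the nominal term once per kept ordinal, setting the position increment so the position equals the ordinal and attaching the 4 bytes as the payload.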
Ideally a FieldType would be added to encapsulate the implementation details of analysis. It could even be used to query without adding any other top-level classes or plugins, since a FieldType works with most query parsers, including the default/standard/lucene one, and you can do some neat things this way, e.g. q=vecField:"0.1,4.75,0.3,1.2,0.7,4.0" (taken from the example).
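As a rough illustration of the query side, the FieldType would first have to turn the quoted value into a float array before building whatever query it uses. The sketch below shows only that parsing step in plain Java; `VectorQueryParseSketch` and `parseVector` are hypothetical names, and the comma-separated format is assumed from the example query above.

```java
import java.util.Arrays;

public class VectorQueryParseSketch {

    // Hypothetical helper: parse "0.1,4.75,0.3" into a float array.
    // Throws NumberFormatException if a component is not a valid float.
    public static float[] parseVector(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new float[0];
        }
        String[] parts = trimmed.split(",");
        float[] vector = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            vector[i] = Float.parseFloat(parts[i].trim());
        }
        return vector;
    }

    public static void main(String[] args) {
        float[] v = parseVector("0.1,4.75,0.3,1.2,0.7,4.0");
        System.out.println(Arrays.toString(v));
    }
}
```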