Support tagfiltertree for fast matching metricIDs to queries #4310
base: master
Changes from 7 commits
13cf2c2
4c96204
bde1d75
ee6c313
b9368bb
3423225
c65aa85
41769a2
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -186,7 +186,6 @@ linters: | |
- gci | ||
- goconst | ||
- gocritic | ||
- golint | ||
- gosimple | ||
- govet | ||
- ineffassign | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# Tag Filter Tree | ||
|
||
## Motivation | ||
There are many cases where we want to match an input metricID against | ||
a set of tag filters. One such use case is attributing metrics to namespaces. | ||
Matching each filter individually is extremely expensive | ||
because it has to be done for every incoming metricID. This data structure therefore | ||
pre-compiles a set of tag filters to speed up matching against an input metricID. | ||
Review comment: I don't really understand this paragraph. Perhaps a diagram or example would help here.
Reply: Updated the Readme
||
|
||
## Usage | ||
First create a trie using New() and then add tagFilters using AddTagFilter(). | ||
Review comment: And then I guess you use
Reply: Done!
||
The tags within a filter can be specified in any order, but to keep the compiled | ||
trie compact, specify the most common tags first | ||
and in the same order across filters. | ||
For instance, if you expect a tag "service" to be present | ||
in all filters, specify it first and then specify the remaining tags | ||
in the filter. | ||
The trie also supports "*" as a tag value, which can be used to require the existence of a tag | ||
in the input metricID. | ||
|
||
## Caveats | ||
The trie might return duplicates; it is up to the caller to de-duplicate the results. |
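A caller could handle this caveat with a small helper like the one below. It is only a sketch: `dedup` is a hypothetical name, and the results are assumed to be strings, which may not match the type the trie actually returns.

```go
package main

import "fmt"

// dedup returns the distinct values of in, preserving first-seen order.
func dedup(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue // already emitted
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(dedup([]string{"ns1", "ns2", "ns1"})) // prints [ns1 ns2]
}
```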
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
package tagfiltertree | ||
|
||
import "github.com/m3db/m3/src/metrics/filters" | ||
|
||
// Options is a set of options for the attributor. | ||
type Options interface { | ||
TagFilterOptions() filters.TagsFilterOptions | ||
SetTagFilterOptions(tf filters.TagsFilterOptions) Options | ||
} | ||
|
||
type options struct { | ||
tagFilterOptions filters.TagsFilterOptions | ||
} | ||
|
||
// NewOptions creates a new set of options. | ||
func NewOptions() Options { | ||
return &options{} | ||
} | ||
|
||
// TagFilterOptions returns the tag filter options. | ||
func (o *options) TagFilterOptions() filters.TagsFilterOptions { | ||
return o.tagFilterOptions | ||
} | ||
|
||
// SetTagFilterOptions sets the tag filter options. | ||
func (o *options) SetTagFilterOptions(tf filters.TagsFilterOptions) Options { | ||
opts := *o | ||
opts.tagFilterOptions = tf | ||
return &opts | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
package tagfiltertree | ||
|
||
import "math/bits" | ||
|
||
// PointerSet is a set of pointers backed by a bitmap to | ||
// represent a sparse set of at most 128 pointers. | ||
type PointerSet struct { | ||
bits [2]uint64 // Using 2 uint64 gives us 128 bits (0 to 127). | ||
} | ||
|
||
// Set adds a pointer at index i (0 <= i < 128). | ||
func (ps *PointerSet) Set(i byte) { | ||
if i < 64 { | ||
ps.bits[0] |= (1 << i) | ||
} else { | ||
ps.bits[1] |= (1 << (i - 64)) | ||
} | ||
} | ||
|
||
// IsSet checks if a pointer is present at index i. | ||
func (ps *PointerSet) IsSet(i byte) bool { | ||
if i < 64 { | ||
return ps.bits[0]&(1<<i) != 0 | ||
} | ||
return ps.bits[1]&(1<<(i-64)) != 0 | ||
} | ||
|
||
// CountSetBitsUntil counts how many bits are set to 1 up to index i (inclusive). | ||
func (ps *PointerSet) CountSetBitsUntil(i byte) int { | ||
if i < 64 { | ||
// Count bits in the first uint64 up to index i. | ||
return bits.OnesCount64(ps.bits[0] & ((1 << (i + 1)) - 1)) | ||
} | ||
|
||
// Count all bits in the first uint64. | ||
count := bits.OnesCount64(ps.bits[0]) | ||
// Count bits in the second uint64 up to index i - 64. | ||
count += bits.OnesCount64(ps.bits[1] & ((1 << (i - 64 + 1)) - 1)) | ||
return count | ||
} |
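One common use of a bitmap with a rank operation like CountSetBitsUntil is to index a dense slice that stores only the entries that are present. The sketch below assumes that usage; the diff does not confirm it is how the trie uses PointerSet. `sparseChildren`, `Put` and `Get` are hypothetical names, the PointerSet methods are restated so the example compiles on its own, and indexes are assumed to stay below 128.

```go
package main

import (
	"fmt"
	"math/bits"
)

// PointerSet mirrors the type in this diff: a 128-bit bitmap.
type PointerSet struct{ bits [2]uint64 }

func (ps *PointerSet) Set(i byte) { ps.bits[i/64] |= uint64(1) << (i % 64) }

func (ps *PointerSet) IsSet(i byte) bool { return ps.bits[i/64]&(uint64(1)<<(i%64)) != 0 }

// CountSetBitsUntil counts the set bits in positions 0..i inclusive.
func (ps *PointerSet) CountSetBitsUntil(i byte) int {
	if i < 64 {
		// A shift by 64 yields 0 in Go, so i == 63 gives an all-ones mask.
		return bits.OnesCount64(ps.bits[0] & ((uint64(1) << (i + 1)) - 1))
	}
	return bits.OnesCount64(ps.bits[0]) +
		bits.OnesCount64(ps.bits[1]&((uint64(1)<<(i-64+1))-1))
}

// sparseChildren (hypothetical) keeps one value per set index in a dense
// slice, ordered by index. Callers must pass i < 128.
type sparseChildren struct {
	present PointerSet
	values  []string
}

// Put stores v at logical index i, inserting into the dense slice if needed.
func (s *sparseChildren) Put(i byte, v string) {
	if s.present.IsSet(i) {
		s.values[s.present.CountSetBitsUntil(i)-1] = v
		return
	}
	s.present.Set(i)
	pos := s.present.CountSetBitsUntil(i) - 1 // rank of i among set bits
	s.values = append(s.values, "")
	copy(s.values[pos+1:], s.values[pos:]) // shift the tail right by one
	s.values[pos] = v
}

// Get returns the value at logical index i, if present.
func (s *sparseChildren) Get(i byte) (string, bool) {
	if !s.present.IsSet(i) {
		return "", false
	}
	return s.values[s.present.CountSetBitsUntil(i)-1], true
}

func main() {
	s := &sparseChildren{}
	s.Put(65, "b")
	s.Put(3, "a")
	s.Put(100, "c")
	fmt.Println(s.values) // prints [a b c]
	v, ok := s.Get(65)
	fmt.Println(v, ok) // prints b true
}
```

The trade-off in this sketch is that lookups are two popcounts and a slice index, while inserts pay for shifting the tail of the slice.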
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
package tagfiltertree | ||
|
||
import ( | ||
"math" | ||
"testing" | ||
|
||
"github.com/stretchr/testify/require" | ||
) | ||
|
||
func TestPointerSetCountBits(t *testing.T) { | ||
tests := []struct { | ||
name string | ||
setBits []uint64 | ||
expected int | ||
}{ | ||
{ | ||
name: "empty set", | ||
setBits: []uint64{0, 0}, | ||
expected: 0, | ||
}, | ||
{ | ||
name: "single set bit", | ||
setBits: []uint64{0, 1}, | ||
expected: 1, | ||
}, | ||
{ | ||
name: "multiple set bits", | ||
setBits: []uint64{7, 7}, | ||
expected: 6, | ||
}, | ||
{ | ||
name: "all set bits", | ||
setBits: []uint64{math.MaxUint64, math.MaxUint64}, | ||
expected: 128, | ||
}, | ||
} | ||
|
||
for _, tt := range tests { | ||
t.Run(tt.name, func(t *testing.T) { | ||
ps := PointerSet{} | ||
l := tt.setBits[0] | ||
r := tt.setBits[1] | ||
var i byte | ||
for i = 0; i < 128; i++ { | ||
if i < 64 { | ||
if l&0x1 == 1 { | ||
ps.Set(i) | ||
} | ||
l >>= 1 | ||
} else { | ||
if r&0x1 == 1 { | ||
ps.Set(i) | ||
} | ||
r >>= 1 | ||
} | ||
} | ||
|
||
require.Equal(t, tt.expected, ps.CountSetBitsUntil(127)) | ||
}) | ||
} | ||
} |
Why?
golint is now deprecated, and it does not detect types correctly when generics are used.