-
Notifications
You must be signed in to change notification settings - Fork 707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
map: Add Drain for traversing maps while removing entries #1349
base: main
Are you sure you want to change the base?
Conversation
d65df8e
to
9a2089a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for putting in the work. Though I would like to request some changes.
11c813b
to
2dfaef8
Compare
Many thanks again Dylan for the feedback! |
2dfaef8
to
23c82af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patches! I'm not completely convinced we want to make MapIterator an interface (yet).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question regarding semantics:
- What happens when you do a plain Lookup on a queue or stack? Does that function like a peek? Or does it return an error?
- Are there other map types which support LookupAndDelete? Are there ones which support passing a Key argument?
Thanks to all the reviewers for the feedback.
|
Have you had the time to look into my questions in #1349 (review) ? I think they will help us decide what the implementation looks like. |
@lmb I apologize, I missed the message.
|
Clarification, for stacks/queues:
Only hash maps(and variants: LRU, PER_CPU, LRU_PER_CPU) have support for Also interesting is that
Yes, the hashmap supports passing a key value. Actually, even before v5.14, a valid pointer must be provided to the key field. The kernel always checks it and copies it into kernel memory, just to discard it. After v5.14 it of course starts to actually get passed to the map implementation. |
Okay! That was very helpful, thanks to both of you. The peek operations only return the top of the stack / first item in the queue? And currently doing I'm currently leaning towards adding a new method |
Yes, peeking only returns the top of the stack. Any key provided will be ignored as far as I understand. But as @S41m0n mentioned the |
c3c539c
to
3acb798
Compare
Thanks for the info-gathering and brainstorming. I just pushed a possible implementation of what has been discussed. |
3acb798
to
c620552
Compare
39b3749
to
3ac6523
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the end result, but between the commits a lot happens that is canceled out later. So please squash all commits into one (or cleanup the git history some other way)
3ac6523
to
3d9f4e0
Compare
0916768
to
318ab1b
Compare
I removed from the code the support for the fallback sequential |
2ae048d
to
d6678a8
Compare
d6678a8
to
47fdd29
Compare
This commit introduces the `Map.Drain` API to traverse the map while also removing its entries. It leverages the same `MapIterator` structure, with the introduction of a new unexported method to handle the map draining. The tests make sure that the behavior is as expected, and that this API returns an error while invoked on the wrong map, such as arrays, for which `Map.Iterate` should be used instead. The `LookupAndDelete` system call support has been introduced in: 1. 5.14 for BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH, BPF_MAP_TYPE_LRU_HASH and BPF_MAP_TYPE_LRU_PERCPU_HASH. 2. 4.20 for BPF_MAP_TYPE_QUEUE, BPF_MAP_TYPE_STACK Do not expect the `Map.Drain` API to work on prior versions, according to the target map type. From the user perspective, the usage should be similar to `Map.Iterate`, as shown as follows: ```go m, err := NewMap(&MapSpec{ Type: Hash, KeySize: 4, ValueSize: 8, MaxEntries: 10, }) // populate here the map and defer close it := m.Drain() for it.Next(keyPtr, &value) { // here the entry doesn't exist anymore in the underlying map. ... } ``` Signed-off-by: Simone Magnani <[email protected]>
47fdd29
to
46f5343
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with this iteration 😉.
if ptr, ok := keyOut.(unsafe.Pointer); ok { | ||
copy(unsafe.Slice((*byte)(ptr), len(buf)), buf) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if this is a bit off topic. @ti-mo Do you recall why we have this exception here? This is just moving the existing code added in 4609dc7. But we seem to take so much care to ensure safety elsewhere, including in sysenc.Unmarshal
. But not here.
If for some reason the user passes in an unsafe pointer in keyOut
that is smaller than buf
this causes us to write into unknown parts of the heap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exactly sure, but the commit you linked does contain some hints around slices of types with trailing padding:
Allowing such a type creates an edge case: make([]padding, 1)
uses zero-allocation marshaling while make([]padding, 2)
doesn't, due to interior padding. It's simpler to skip such
types for now.
Passing in unsafe.Pointer sidesteps this limitation. Also see f0d238d. unsafe.Pointer
is supported for most map operations, see marshalMapSyscallInput
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still some changes to be made here, and I think some more testing needs to be done to understand the behaviour in the face of concurrent deletes.
It's also important to understand that LookupAndDelete may keep failing after a successful NextKey on very busy maps, and we may need to protect the application from getting into runaway CPU usage. I vaguely recall some existing code in the library that gives up after x attempts, but I can't remember where exactly.
Please reach out offline if you'd like some help with this.
We'll ship this in a 0.17.1 after the xmas break.
// Next decodes the next key and value. | ||
// | ||
// Iterating a hash map from which keys are being deleted is not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this comment removed? This applies in general, and it's documentation on exported API, so we shouldn't just remove it.
// safe. You may see the same key multiple times. Iteration may | ||
// also abort with an error, see IsIterationAborted. | ||
// | ||
// Iterating a queue/stack map returns an error (NextKey invalid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and below:
// Iterating a queue/stack map returns an error (NextKey invalid | |
// Iterating a queue/stack map returns an error, use [Map.Drain] instead. |
Nit: try to reduce your use of the term 'API', it's redundant and often implied. Less is more!
// but their respective outputs will differ. | ||
// | ||
// Draining a map that does not support entry removal such as | ||
// an array return an error (LookupAndDelete not supported): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exact error returned is an implementation detail and may change over time unless it's a sentinel. If it's in godoc, it becomes a contract.
If it does return an error wrapping ErrNotSupported
, this is valid docmentation:
Draining a map that doesn't support deletions will return [ErrNotSupported]
.
@@ -1602,6 +1623,12 @@ func marshalMap(m *Map, length int) ([]byte, error) { | |||
return buf, nil | |||
} | |||
|
|||
// isKeyValueMap returns true if map supports key-value pairs (ex. hash) | |||
// and false in case of value-only maps (ex. queue). | |||
func isKeyValueMap(m *Map) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would make more sense inverted, e.g. func (m *Map) isKeyless() bool
since you're negating its only use below. Then the docstring can also become a bit more focused: isKeyless returns true if the map is value-only, like a [Queue].
return mi.nextIterate(keyOut, valueOut) | ||
} | ||
|
||
func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool { | |
func (mi *MapIterator) pop(keyOut, valueOut interface{}) bool { |
This function pops a single entry, it doesn't drain the whole map.
return false | ||
} | ||
|
||
func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool { | |
func (mi *MapIterator) next(keyOut, valueOut interface{}) bool { |
|
||
iter := m.Drain() | ||
allocs := testing.AllocsPerRun(int(m.MaxEntries()-1), func() { | ||
if !iter.Next(keyPtr, &value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !iter.Next(keyPtr, &value) { | |
qt.Assert(t, qt.IsTrue(iter.Next(keyPtr, &value))) |
Or maybe qt.Assert allocates? Feel free to ignore.
} else { | ||
err = m.Put(uint32(i), uint32(i)) | ||
} | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if err != nil { | |
qt.Assert(t, qt.IsNil(err)) |
// Handle value-only maps (ex. queue). | ||
if !isKeyValueMap(mi.target) { | ||
if keyOut != nil { | ||
mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead") | |
mi.err = fmt.Errorf("non-nil keyOut provided for map without a key") |
mi.cursor = make([]byte, mi.target.keySize) | ||
} | ||
|
||
// Always retrieve first key in the map. This should ensure that the whole map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the case. Consider the following scenario:
- drainer A: NextKey(0) yields a
- drainer B: NextKey(0) yields a
- drainer B: LookupAndDelete(a) succeeds
- drainer A: LookupAndDelete(a) fails
The L&D failing can happen at any point due to a concurrent delete, also when e.g. a bpf program deletes a key, which is fairly common. In the current implementation, I think iteration halts at that point, while there may still be keys left in the map.
Dear community, with this PR I would like to introduce the
Map.Drain
API to traverse a map while also removing its entries.The current
MapIterator.Next
traverse a map without removing the entries. For some use cases, it might be useful to have this API so that maps such as Hash can be freed during the lookup.In addition, the
MapIterator.Next
would normally fail with maps of type Queue or Stack, for two main reasons:The proposed solution reuses the
MapIterator
structure and as much code as possible: theMap.Drain
API return an iterator with a special flag set, specifying that the retrieved entries must be deleted as well.To support different kernel versions in which the support forThese compatibility support has been dropped in this PR. Queues are expected to work since kernel v4.2, and the other Hash-like maps from v5.14.LookupAndDelete
system call is not equal among the different maps (e.g., queues and hash support for this has been introduced in different commit),this implementation introduces a fallback method that would leverage the sequential
Lookup
->Delete
operations in caseLookupAndDelete
for that specific map is not supported.Please any feedback is really welcome, don't hesitate ping me 😃