-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Index object for NativeTypes instead of index fields #303
Conversation
You found an ingeniously simple solution to this problem. I like it very much. 🥳 Overall performance improvement in Labkit is 20 %. That's quite some margin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me ;)
@@ -49,6 +50,8 @@ | |||
{ | |||
protected final T type; | |||
|
|||
protected final Index i; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it slightly annoying that this field is named "i" in CellCursor
while it is named "index" in AbstractArrayCursor
.
I understand the reason that the name is clashing with the already existing protect int index
. Maybe the exiting field could be removed now? Or maybe "typeIndex" could be used as a name everywhere consistently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I also didn't like that. I didn't want to make any changes unrelated to the problem for now.
We could swap the names 'i' and 'index' in the CellCursor.
Maybe the existing field could be removed, we would need to do benchmarks though. I remember that we spent a lot of time tweaking performance but that was before JMH. Maybe the current polymorphism issue played into the measurements and also maybe the JIT changed to favour different patterns.
In any case we will have to check that there is no derived class somewhere that uses the existing 'index'.
It would be good to revisit, but in a separate issue.
(Also #304 that I stumbled across while making the changes...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this this class in KNIME image processing that overloads CellCursor and uses the index
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's either merge this branch as it is, or rename all the fields to something consistent, like typeIndex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find typeIndex
a good compromise as it is unambiguous and is unlikely to create surprises (such as in for (int i ...
loops).
* This is used by accessors (e.g., a {@code Cursor}) to position {@code | ||
* NativeType}s in the container. | ||
*/ | ||
public void set( final int index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public void set( final int index) | |
public void set( final int index ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me with the caveat that I only looked at it at a high level without diving into every single file. This should automatically improve performance for practically every single downstream code base. Trying summarize in my own words why this fix works (please confirm that this is correct; may be helpful for others):
With NativeType.incIndex()
, we pay the polymorphism penalty at every single pixel. With the new Index
class, we pay the penalty only at creation time of the RandomAccess
or Cursor
. When moving RandomAccess
or Cursor
, we call Index.inc()
(or any other method on Index
) the JVM always sees the same class, so there is no polymorphism penalty when moving from pixel to pixel. Making Index
final
prevents accidental introduction of polymorphism in downstream code.
Great fix. I am in favor of the name change to typeIndex
as suggested by @maarzt
... instead of deprecated NativeType methods.
... instead of deprecated NativeType methods.
…tion ... instead of deprecated NativeType methods.
I did the |
This addresses a polymorphism issue discovered by @maarzt:
There are multiple implementations of
NativeType.incIndex()
etc, and when using e.g. aRandomAccess
to move over images of different types, this slows things down. Also in the case that the type is always the same at the call site.The problem is illustrated by this benchmark where pixels of a 1000x1000 image are summed using a
ArrayRandomAccess
. If different types than the one of the image have been observed anywhere else in the program this operation slows down by 4x. This can be traced toNativeType.updateIndex()
etc. This is obviously a problem for a lot of things using imglib.This PR introduces an
Index
class that is used by allNativeType
implementations to store their index.Accessors can directly obtain (and cache) this object and use it to modify the types index.
This solves the issue:
(As far as I can tell, using the
Index
object instead of a primitiveint
field does not cause performance overhead.)The only issue with this is that the types deriving from
AbstractBitType
had along
index before and now, throughIndex
, implicitly haveint
indexes. This means that while you could create anAbstractBitType
that would run on primitive arrays of more thanInteger.MAX_VALUE
bits before, you no longer can. I don't think this is an actual issue. As far as I can tell, it never really would work in any concrete use case because theint
index assumption is baked in deeply in all Accessors etc.I suggest that we just live with it for now.
If there are actual use cases that bypass images and accessors and do something directly with those types, I would suggest that we simply create a parallel class hierarchy explicitly for those scenarios.