-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional fixes for IR #55
Conversation
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/55/index.html |
adb7d8e
to
1e4d3c1
Compare
Codecov Report
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more @@ Coverage Diff @@
## main #55 +/- ##
==========================================
+ Coverage 91.77% 91.79% +0.01%
==========================================
Files 86 86
Lines 15286 15324 +38
==========================================
+ Hits 14029 14066 +37
- Misses 1257 1258 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
|
0576845
to
0bb0a7b
Compare
I had to update this to account for a specific fparser behaviour (see stfc/fparser#400) which has quite significant impact and for which we may have to watch out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, first of all, many thanks for figuring this stuff out. I will certainly appreciate 10% of CLOUDSC parsing time reduction.
I think the behavioural change is fine, but the implementation that exposes details of the outer IR to the inner is not ok, as it goes against the existing visitor pattern and will be hard to remove at a later stage.
In addition, it might be useful to separate the frontend change relating to fparser from the SubstituteExpression
change to streamline the merge process a little.
loki/expression/mappers.py
Outdated
@@ -520,6 +521,10 @@ def __init__(self, invalidate_source=True): | |||
def __call__(self, expr, *args, **kwargs): | |||
if expr is None: | |||
return None | |||
kwargs.setdefault( | |||
'recurse_to_declaration_attributes', | |||
'current_node' not in kwargs or isinstance(kwargs['current_node'], DECLARATION_NODES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😬 This sort-of starts violating all sorts of encapsulation rules, as the expression mapper (inner IR) is now aware of (and depends on) the implementation details of the outer IR. This creates even more cyclic dependencies that become hard to unfuddle later.
Instead, I can see two way to do this more neatly here:
- Honour
kwargs["recurse_to_declaration_attributes"]
and set it inSubstituteExpression.visit_Declaration(...)
. This results in only a single style of behviour, where declarations only recurse into the type, which is certainly what we want as default. - Otherwise, or in addition to the above, we could set a constructor flag in both
SubstituteExpression
and theMapper
toalways_recurse_into_types
(or similar), as a hard override to force the current behaviour if the user wants/needs this.
Since this here is constrained purely to substitutions from the same map, I think we should be safe with only the first case, and no universal override. For general "for all variable" traversals, this might not hold true, but for map-based substitutions, I cannot see an edge-case where we'd need this (but I could be wrong here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point, thanks for pointing that out! I'll try your first proposal, I think it is essentially what I was shooting for but with a much more awkward solution, and I don't remember why I picked that route (it's been so long 😏)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looks good now. Many thanks for addressing this.
loki/expression/mappers.py
Outdated
expr.scope.symbol_attrs[expr.name] = expr.type.clone(bind_names=as_tuple(bind_names)) | ||
if kwargs['recurse_to_declaration_attributes']: | ||
_kwargs = kwargs.copy() | ||
_kwargs['recurse_to_declaration_attributes'] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure I understand why the bind_names
are so special here that they require this entire code block. Maybe a comment why simple calling self.rec(...)
is not sufficient here might make this a little more accessible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I have tried to restructure the control flow in a way that makes it clearer what's going on, and add comments.
In essence: only when recurse_to_declaration_attributes is True, we recurse to all the bits that are stored in a symbol's type. But, because Fortran, we don't want this recursion to happen while actually updating these declaration attributes. Reason is that the declared variable can actually appear in its own initializer expression:
REAL var = HUGE(var)
Or "it" can appear in an allocation:
TYPE(SOME_TYPE), ALLOCATABLE :: var(:)
...
DO j=2,n
ALLOCATE(var(j)%field(size(var(j-1)%field)))
END DO
And since we insert the dimensions from allocate statements as type.shape
, this now becomes also part of the symbol table entries...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhhh, thanks for the explanation. Makes sense and looks much clearer now.
29a5a80
to
f01ea67
Compare
…before dispatching to expr tree mappers
eaaebaf
to
9cef8f4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing and fixing this all. This is now GTG, and going in next.
I've confirmed with EC-physics regression and indeed we're shaving 15-20% off certain stages in the pipeline with this change, so again, many, many thanks! 🙏 🙏 🙏
Note: This sits on top of #44
I'm filing this separately because it contributes a significant behaviour change for the
SubstituteExpressions
visitor, which I consider to be sensible but that might require extra testing to make sure it doesn't break behaviour in (untested) edge cases.In a few places, we ran into infinite recursion in the
AttachScopes
visitor (although it's not specific to that one, it's just the first to be encountered within the frontends), more specifically, if a declared variable is being used in one of the definining attributes. Encountered examples include:REAL(KIND=JPRB), PARAMETER :: ZEXPLIMIT = LOG(HUGE(ZEXPLIMIT))
shape
in the Loki IR), e.g.:allocate(levels(jscale)%data(size(levels(jscale-1)%data, 1), size(levels(jscale-1)%data, 2)))
procedure (some_proc), deferred :: some_proc
The cause of the infinite recursion is the fact that we traverse declaration attributes (
type.initial
,type.shape
,type.bind_names
) on every symbol use in theLokiIdentityMapper
. By adding arecurse_to_declaration_attributes
keyword argument to the visitor methods (defaults toTrue
for nodes that are the "authority" on a symbol's type:VariableDeclaration
,ProcedureDeclaration
,Import
), we recurse into these properties only once. Since these are properties that are cached in the symbol table, this is also sufficient (unless transformations rely on the implicit modification of a symbol's declaration when traversing a routine's body, which seems somewhat counterintuitive behaviour).Conceptually I think this is the right thing to do but the implementation might not be the most elegant...
This also removes quite a bit of traversal overhead in the expression transformer, which shaves off about 10% of the time required to read in cloudsc.F90.