-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory overhead of DatePeriod via virtual properties #15598
Reduce memory overhead of DatePeriod via virtual properties #15598
Conversation
c70de9d
to
9144b30
Compare
@@ -33,18 +33,25 @@ object(DatePeriod)#%d (%d) { | |||
["start"]=> | |||
object(DateTime)#%d (%d) { | |||
["date"]=> | |||
string(26) "2022-07-14 00:00:00.000000" | |||
string(26) "2024-08-28 00:00:00.000000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unserialization doesn't revert the already done changes, therefore if an error happens at include_start_date
(due to the wrong type) then the expected result is that the new values are applied for current
, start
, end
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a change in behaviour then? I don't quite understand why you need to make this changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous behavior was a by-product of storing properties in two places. The following happened: internal struct members of DatePeriod
were assigned a value during unserialization, until an incompatible type of the include_start_date
key caused a failure. That's why regular object properties were left intact, since https://github.com/php/php-src/pull/15598/files#diff-975ac482adc74b487c6ac9d8dde32153e4b8a803ef8ec41e8d6594f0a042db69L5846 was skipped. The test reflected the old object property values then, while the internal struct contained the new values.
The new behavior is consistent with the comment here:
Line 5694 in 176d285
/* this function does no rollback on error */ |
/* this function does no rollback on error */
). If there is an error during serialization, there is no "rollback", so the test reflects the new values for those properties which were affected by the unserialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your usage of virtual seems right. I'm not the codeowner of this component so I can't comment much on the other aspects. Ultimately it's up to Derick to decide what to do with this.
ext/date/php_date.c
Outdated
return; | ||
} | ||
|
||
zend_class_entry *ce_ptr = ce; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You moved this out of the else block, but is there a reason for this except for coding style? I would rather not have code motion especially since it's unrelated to this PR. Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes: #15598 (comment) Although I commented this to the wrong line, because line 320 (before my changes) was missing a return
and therefore zend_throw_error
was invoked twice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unrelated to the actual change though (memory reduction); I also don't think the code style should be updated in the same PR/commit, and even if you did, I would prefer that variable declarations be kept at the top of functions.
ext/date/php_date.c
Outdated
@@ -2038,7 +1999,7 @@ static int date_object_compare_timezone(zval *tz1, zval *tz2) /* {{{ */ | |||
|
|||
if (!o1->initialized || !o2->initialized) { | |||
zend_throw_error(date_ce_date_object_error, "Trying to compare uninitialized DateTimeZone objects"); | |||
return 1; | |||
return ZEND_UNCOMPARABLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like an unrelated cleanup to this PR and should perhaps be done separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it then need a rebase — because it shouldn't show up in this PR again then surely?
ext/date/php_date.c
Outdated
@@ -3066,6 +3027,7 @@ PHP_METHOD(DateTime, __wakeup) | |||
|
|||
if (!php_date_initialize_from_hash(&dateobj, myht)) { | |||
zend_throw_error(NULL, "Invalid serialization data for DateTime object"); | |||
RETURN_THROWS(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: I would not do these cleanups/clarifications in the code if it's not strictly necessary in this PR. Otherwise it shows up in the git history as if it's related and that's extra noise + confusion tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't this done in another PR? Perhaps this needs a rebase then first.
ext/date/php_date.c
Outdated
@@ -4656,6 +4623,10 @@ static void php_date_interval_initialize_from_hash(zval **return_value, php_inte | |||
Z_STRVAL_P(date_str), | |||
err->error_messages[0].position, | |||
err->error_messages[0].character ? err->error_messages[0].character : ' ', err->error_messages[0].message); | |||
|
|||
timelib_time_dtor(time); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a real fix, should perhaps be backported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was my original plan but I haven't had time yet: #15598 (comment) Will do it soon.
ext/date/php_date.c
Outdated
|
||
static HashTable *date_period_get_properties_for(zend_object *object, zend_prop_purpose purpose) | ||
{ | ||
switch (purpose) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by the implementation: why this switch is necessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copy-pasted this piece of code. I'm not sure why Nikita added the switch there... but after a closer examination, it seemed that ZEND_PROP_PURPOSE_GET_OBJECT_VARS
is missing from the cases 🤔 I don't see the reason... but in any case, I now removed the switch case so that the custom handler kicks in in all cases now.
ext/date/php_date.c
Outdated
return; | ||
} | ||
|
||
zend_class_entry *ce_ptr = ce; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unrelated to the actual change though (memory reduction); I also don't think the code style should be updated in the same PR/commit, and even if you did, I would prefer that variable declarations be kept at the top of functions.
ext/date/php_date.c
Outdated
@@ -3066,6 +3027,7 @@ PHP_METHOD(DateTime, __wakeup) | |||
|
|||
if (!php_date_initialize_from_hash(&dateobj, myht)) { | |||
zend_throw_error(NULL, "Invalid serialization data for DateTime object"); | |||
RETURN_THROWS(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't this done in another PR? Perhaps this needs a rebase then first.
ext/date/php_date.c
Outdated
@@ -5934,20 +5903,47 @@ PHP_METHOD(DatePeriod, __wakeup) | |||
ZEND_PARSE_PARAMETERS_NONE(); | |||
|
|||
period_obj = Z_PHPPERIOD_P(object); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't just add whitespace changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll get rid of the unnecessary changes once I rebase this PR on another batch of return fixes.
@@ -33,18 +33,25 @@ object(DatePeriod)#%d (%d) { | |||
["start"]=> | |||
object(DateTime)#%d (%d) { | |||
["date"]=> | |||
string(26) "2022-07-14 00:00:00.000000" | |||
string(26) "2024-08-28 00:00:00.000000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a change in behaviour then? I don't quite understand why you need to make this changes.
["var3":protected]=> | ||
int(3) | ||
["var4":protected]=> | ||
int(4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confused me. I thought these were now added where they originally were not. GH diff is confusing sometimes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the diff in this case was very confusing for me as well... But in fact, the only thing that changed is the order of the properties...
ext/date/php_date.c
Outdated
zend_throw_error(date_ce_date_object_error, "Object of type %s not been correctly initialized by calling parent::__construct() in its constructor", ZSTR_VAL(ce->name)); | ||
} | ||
zend_throw_error(date_ce_date_object_error, "Object of type %s (inheriting %s) has not been correctly initialized by calling parent::__construct() in its constructor", ZSTR_VAL(ce->name), ZSTR_VAL(ce_ptr->name)); | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be RETURN_THROWS()
, not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, RETURN_THROWS() requires a return_value
variable in scope (thus it can be mainly used inside PHP_FUNCTION
s/PHP_METHOD
s)
ext/date/php_date.c
Outdated
} | ||
zend_throw_error(date_ce_date_object_error, "Object of type %s (inheriting %s) has not been correctly initialized by calling parent::__construct() in its constructor", ZSTR_VAL(ce->name), ZSTR_VAL(ce_ptr->name)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this also misses a RETURN_THROWS()
(yes, I know it's the last line in the function, but still)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add them separately for sure. I agree that the RETURN_THROWS() are useful
ext/date/php_date.c
Outdated
@@ -2038,7 +1999,7 @@ static int date_object_compare_timezone(zval *tz1, zval *tz2) /* {{{ */ | |||
|
|||
if (!o1->initialized || !o2->initialized) { | |||
zend_throw_error(date_ce_date_object_error, "Trying to compare uninitialized DateTimeZone objects"); | |||
return 1; | |||
return ZEND_UNCOMPARABLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it then need a rebase — because it shouldn't show up in this PR again then surely?
79f50f4
to
042bfa6
Compare
Issues originally found in #15598
bbc6cf7
to
ceb6a95
Compare
Note that you'll need an unset handler that forbids unsetting these virtual properties, otherwise it's going to crash because there's no backing storage. I just did something similar for DOM: #15891 |
Thanks for the heads up, I've just implemented it. |
f611bf7
to
bd6162c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was at work when RC1 got tagged.
I'm not so sure that the rules allow optimizations to be merged still after RC1 is released.
From the PoV of how virtual properties works this LGTM. But wait for Derick to confirm. Also you still need to answer https://github.com/php/php-src/pull/15598/files#r1742364030
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke too soon.
The following will crash because when the object is uninitialized it executes the standard has_property object handler, but that can't handle virtuals of course:
<?php
$rc = new ReflectionClass('DatePeriod');
$di = $rc->newInstanceWithoutConstructor();
var_dump(isset($di->recurrences));
bd6162c
to
176d285
Compare
Very smart observation, thank you! I've just force pushed a new commit adding the missing functionality along with a rebase to recent master because I think this optimization can wait until PHP 8.5 indeed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only see one mistake anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anything wrong anymore
Related to #11644 and #13988. Test for high memory consumption: https://3v4l.org/QW9Tm