Skip to content

Conversation

@Girgias
Copy link
Member

@Girgias Girgias commented Nov 14, 2025

Commits should be reviewed in order.

@Girgias Girgias requested a review from ndossche November 14, 2025 09:02
@Girgias Girgias force-pushed the array-refactoring-2025-11-extract branch from 8e98d2e to 123d9e3 Compare November 14, 2025 14:02
@Girgias Girgias marked this pull request as ready for review November 16, 2025 14:54
@Girgias Girgias requested a review from bukka as a code owner November 16, 2025 14:54
Copy link
Member

@ndossche ndossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to such sensitive code make me a bit nervous ngl

/* }}} */

static zend_always_inline int php_valid_var_name(const char *var_name, size_t var_name_len) /* {{{ */
static zend_always_inline bool php_valid_var_name(const zend_string *var_name) /* {{{ */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that I always kinda wonder with changing (char*, len) pairs to zend_string is how much flexibility we lose.
A pair is more flexible because you can pass plain old C strings to it, while now you'd need a zend_string which we can't efficiently make from a C string.
Granted, in this particular case it doesn't really matter but it's food for thought.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I think we should move things to zend_string* rather than a char* + size_t pair. The main reason why certain parts of the engine (e.g. SAPI or streams) use a pair is because it predates PHP 7 and the introduction of them.

But in this instance all the inputs are zend_strings anyway so this improves readability IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to use zend_string* if it's pretty much always passed there like here (all calls looks like php_valid_var_name(ZSTR_VAL(var_name), ZSTR_LEN(var_name))) but many parts of streams and SAPI are different and I would prefer not to over-use zend_string* there.

zend_string *str = zend_long_to_str(num_key);
php_prefix_varname(&final_name, prefix, ZSTR_VAL(str), ZSTR_LEN(str), true);
zend_string_release_ex(str, 0);
zend_string *str = zend_ulong_to_str(num_key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ulong? If the key is a negative number the behaviour will differ.
I also almost missed this due to the commit title. If you were to do such a change this should be done in a separate commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefix is already a ulong so that's why I went for that changre, but yes should be in a seperate commit must have messed up when I tried rearranging order of them or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test, as I expected we actually drop indexes which are negative... probably should warn.

@Girgias Girgias force-pushed the array-refactoring-2025-11-extract branch from 123d9e3 to 3319bee Compare November 16, 2025 22:24
@Girgias Girgias force-pushed the array-refactoring-2025-11-extract branch from 3319bee to 66946a9 Compare November 16, 2025 22:26
@bukka
Copy link
Member

bukka commented Nov 17, 2025

Seems like a bit of waste of time that is risky but I don't see any visible issue from the quick look. Doesn't mean that there is none...

@bukka
Copy link
Member

bukka commented Nov 17, 2025

Btw. I'm not questioning that the new code isn't better. It looks like an improvement to me. Just not sure if benefit (which is really more cosmetic) outweights the risks and if the amount of work to put into this is really worth it.

Copy link
Member

@ndossche ndossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefix is already a ulong so that's why I went for that changre

Hash table keys are always ulongs, what matters is how they are interpreted.

Your test regarding the ulong changes is wrong. This shows a difference in behaviour:

<?php

$arr = [-1 => "foo"];
var_dump(extract($arr, EXTR_PREFIX_ALL | EXTR_REFS, "prefix"));

Please revert the long->ulong changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants