GH-6114: Static path matching #6146
class FileMatcherPattern
{
    public function __construct(public string $path)
this will/could include the $suffix, $prefix and $exclude also
or could simply remove this and handle the prefix/suffix independently of the "file matcher"
b2da2f7 to 797d6b7
I have refactored this to tokenize the glob string. While less performant, it's easier to reason about, and we only need to compile the regex for each
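As a rough illustration of the tokenize-then-compile idea in the comment above, here is a minimal sketch. The class name, token constants, and method names are hypothetical and are not taken from the PR; the treatment of `*` (spans `/`) and `?` (never matches `/`) is an assumption made for the example, and only those two wildcards are handled.

```php
<?php

// Hypothetical sketch; class, constant, and method names are illustrative only.
final class GlobTokenizerSketch
{
    public const T_LITERAL  = 'literal';
    public const T_ASTERISK = 'asterisk';
    public const T_QUESTION = 'question';

    /** @return list<array{string, string}> [type, char] pairs */
    public static function tokenize(string $glob): array
    {
        $tokens = [];

        // Byte-wise iteration: multibyte characters are not handled here.
        for ($i = 0, $length = strlen($glob); $i < $length; $i++) {
            $char = $glob[$i];
            $type = match ($char) {
                '*'     => self::T_ASTERISK,
                '?'     => self::T_QUESTION,
                default => self::T_LITERAL,
            };
            $tokens[] = [$type, $char];
        }

        return $tokens;
    }

    /** @param list<array{string, string}> $tokens */
    public static function toRegex(array $tokens): string
    {
        $regex = '';

        foreach ($tokens as [$type, $char]) {
            $regex .= match ($type) {
                self::T_ASTERISK => '.*',    // assumption: "*" also spans "/"
                self::T_QUESTION => '[^/]',  // assumption: "?" never matches "/"
                default          => preg_quote($char, '#'),
            };
        }

        // Anchored at both ends; "#" is used as the regex delimiter.
        return '#^' . $regex . '$#';
    }
}
```

Keeping tokenization separate from regex generation means each token type can be tested on its own, which is the "easier to reason about" trade-off the comment describes.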
792bae8 to 6f99ddc
        continue;
    }

    $resolved[] = [$type, $char];
this handles any "left over tokens" - including T_ASTERISK - maybe all tokens should be handled explicitly?
ba1d99d to 6fc5179
    self::T_BRACKET_CLOSE => ']',
    self::T_HYPHEN => '-',
    self::T_COLON => ':',
    self::T_BACKSLASH => '\\',
COLON and BACKSLASH are not tested
6fc5179 to e07a96f
@sebastianbergmann this PR is indicative of the approach but needs the final 20% of work:
But I'm keen for any thoughts you have on the approach so far. The biggest concern for me is providing confidence that it will be compatible with at least 99.99% of configuration files.
}

foreach ($this->includeDirectoryRegexes as $directoryRegex) {
    if ($directoryRegex->matches($path)) {
this will need to change to account for include/exclude matching on the basename, either in the regex or otherwise.
Thank you for your work so far. I am afraid that I might not be able to look into this before the PHPUnit Code Sprint in April in Berlin. There I will be able to look into this, for sure, and together with other developers. I hope I can look into this before then, but I cannot promise that.
We could delay this until PHPUnit 13 and break backward compatibility (for edge cases, "esoteric" glob patterns, etc.).
Is it expected that the tests are currently failing?
Yes - the prefix/suffix filtering hasn't been added yet and the tests fail due to the tests we added upstream. The PR is at an inflection point:
And then I was also considering how to implement the suffix/prefix logic - as it probably makes sense to include them in the compiled regex - but then it intrudes on the "copy the behavior of glob" approach - so if we decide to simplify the globbing behavior it makes that decision easier too.
[1] Currently all the source mapping/filtering tests share the same fixture, which makes it hard to test edge cases without breaking other tests - I think it could generate the file tree instead.
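One way to picture the alternative of keeping prefix/suffix filtering outside the compiled regex is sketched below. The function name, its parameters, and the `.php` default suffix are assumptions made for this example, not code from the PR; the directory patterns are passed in as plain PCRE strings.

```php
<?php

/**
 * Hypothetical helper: the directory regexes decide where a file may live,
 * while the prefix and suffix are checked separately against the basename.
 *
 * @param list<string> $directoryRegexes PCRE patterns including delimiters
 */
function isIncludedSketch(
    string $path,
    array $directoryRegexes,
    string $prefix = '',
    string $suffix = '.php'   // assumed default for the example
): bool {
    $basename = basename($path);

    // Basename checks are independent of any glob-derived regex.
    if ($prefix !== '' && !str_starts_with($basename, $prefix)) {
        return false;
    }

    if ($suffix !== '' && !str_ends_with($basename, $suffix)) {
        return false;
    }

    foreach ($directoryRegexes as $regex) {
        if (preg_match($regex, $path) === 1) {
            return true;
        }
    }

    return false;
}
```

With this split, the glob emulation only has to answer "is this path under an included directory?", so simplifying the globbing behavior later would not touch the prefix/suffix handling.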
1bb9097 to 5af6639
This (early stage) WIP PR introduces a static path matcher which intends to emulate the behavior of the PHPUnit `FileIterator` in order to prevent PHPUnit traversing the filesystem when a deprecation is triggered.

The PHPUnit FileIterator uses `glob` to find directories and we therefore need to support the glob patterns - which can vary according to the platform. This PR uses https://man7.org/linux/man-pages/man7/glob.7.html as a reference in addition to testing the behavior locally to confirm assumptions.

The webmozart/glob library provides a similar feature, however its behavior is different: it supports curly braces, and `*` is restricted to a single directory level, while `*` in PHPUnit will return all descendants. I'm sure there are other differences - however I've used that as a starting point.

TODO:
- `[:alnum:]` etc.
- `glob` behavior of unterminated `[` character groups.
- `/a**` will match `/b` and all other directories, whereas `/ab*` will not match anything. We can either copy that behavior or "fix" it.
- and maybe writing the implementation from scratch if regex turns out to be a bad fit.

Usages on Github:
- `directory.*\[`: 0: https://github.com./search?q=%2Fdirectory.*%5C%5B%2F+path%3A*.xml+path%3A**%2Fphpunit.xml+language%3AXML&type=code
- `?` any char: 0 https://github.com./search?q=%2Fdirectory.*%5C%5B%2F+path%3A*.xml+path%3A**%2Fphpunit.xml+language%3AXML&type=code
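To make the `*` difference noted in the description concrete, the following sketch compiles the same glob under both interpretations. The function name and the boolean flag are hypothetical; only `*` is handled, and the regex fragments are one possible encoding of "single directory level" versus "all descendants" as described above.

```php
<?php

/**
 * Hypothetical helper: translate a glob containing only "*" wildcards
 * into an anchored regex, under one of two interpretations of "*".
 */
function globStarToRegexSketch(string $glob, bool $starCrossesDirectories): string
{
    // ".*" spans "/" (all descendants); "[^/]*" stays within one level.
    $star = $starCrossesDirectories ? '.*' : '[^/]*';

    // Quote the literal pieces between the wildcards.
    $parts = array_map(
        static fn (string $literal): string => preg_quote($literal, '#'),
        explode('*', $glob)
    );

    return '#^' . implode($star, $parts) . '$#';
}

$descendants = globStarToRegexSketch('/src/*', true);
$singleLevel = globStarToRegexSketch('/src/*', false);

var_dump(preg_match($descendants, '/src/a/b')); // int(1)
var_dump(preg_match($singleLevel, '/src/a/b')); // int(0)
var_dump(preg_match($singleLevel, '/src/a'));   // int(1)
```

A configuration that relies on `*` reaching nested directories would stop matching under the single-level interpretation, which is why the two behaviors cannot be swapped silently.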