Bug Id:
Reporter: Guest
Product/Version: Crimson Editor / Version 3.72 Beta 241
Status: Confirmed
Severity: Enhancement
Duplicate Of: - none -
Summary: RegExp handling does not understand lazy star *? or *+
Report Time: August 9, 2007 07:32:08 AM
Assignment: - none -
Resolution: Open
Priority: Low
Dependencies: - none -


Votes
For: 0 (0%)
Against: 0 (0%)
Total: 0

August 9, 2007 07:32:08 AM Guest
I have been having trouble with the regular expression search function.
I would like to use the regexp "lazy star" to find the smallest match on a line.
e.g. regexp: <a.*?>
from the line: <a href="test.html">this is an <b>example</b></a>
should give: <a href="test.html">
but gives the error "Failed to compile Regular Expression".
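
For comparison, here is a minimal sketch of the expected result in an engine that does support lazy quantifiers. Python's `re` module is used purely as a stand-in; it is not the library Crimson Editor uses, and the variable names are only illustrative.

```python
import re

line = '<a href="test.html">this is an <b>example</b></a>'

# Lazy star: .*? expands only as far as needed to reach the first '>'.
match = re.search(r'<a.*?>', line)
lazy_result = match.group(0)
print(lazy_result)  # <a href="test.html">
```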

Also *+ is not understood.

August 9, 2007 07:38:13 AM richard
This bug was posted by richard.

March 22, 2008 07:47:08 PM Ankit Singla
What should '.*?' do exactly? As far as I understand it, '.*' means any character (the .) any number of times (0 or more of the character before; this is the *). Tacking a ? (0 or 1 match of the character before, which in this case is a character that means 0 or more) on to the end does nothing useful...

March 22, 2008 07:52:05 PM Ankit Singla
I should probably add that, according to your example, you should be able to use:
'<a.*>' to find, in this case, the largest block with '<a' at the beginning and '>' at the end. How would the '?' help here?
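
To make the difference concrete, the sketch below runs the greedy pattern against the reporter's line. It uses Python's `re` module as an assumed stand-in for a typical greedy engine, not Crimson Editor's own library.

```python
import re

line = '<a href="test.html">this is an <b>example</b></a>'

# Greedy star: .* runs to the end of the line, then backs off
# only as far as the LAST '>' it can find.
greedy_result = re.search(r'<a.*>', line).group(0)

# The match swallows the whole line instead of stopping at the first tag.
print(greedy_result == line)  # True
```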

May 16, 2008 01:46:04 AM Arantor
The lazy star is a feature of Perl (and other regexp flavors); by default a regexp finds the largest match it can, which is what CE is doing. The lazy star instead tells the regexp engine to match the shortest string it can.

It would appear that this feature is not coded into this particular regexp library. From what I understand from "Mastering Regular Expressions, 2nd Ed.", the author of this library used an unusual hybrid that gave good all-round performance but lacked some features, such as the lazy star.
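
Where lazy quantifiers are unavailable, a common workaround is a negated character class, which stops at the first '>' without needing '?'. The sketch below shows the idea in Python's `re` module; whether Crimson Editor's engine accepts `[^>]` is an assumption that this thread does not confirm.

```python
import re

line = '<a href="test.html">this is an <b>example</b></a>'

# [^>]* matches any run of characters that are not '>', so the
# pattern cannot run past the end of the first tag.
workaround_result = re.search(r'<a[^>]*>', line).group(0)
print(workaround_result)  # <a href="test.html">
```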

I can't remember offhand what *+ would do, but if you wanted to find 1 or more * symbols, you'd need to escape the * first.
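
As a small illustration of the escaping point, the sketch below finds runs of literal asterisks. It uses Python's `re` module as a stand-in, and the sample text is made up. (In flavors that support it, such as Java and PCRE, an unescaped `*+` is a possessive quantifier, which is a separate feature.)

```python
import re

text = 'a***b*c'  # hypothetical sample input

# \* is a literal asterisk; the trailing + means "one or more of them".
runs = re.findall(r'\*+', text)
print(runs)  # ['***', '*']
```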

May 16, 2008 09:38:12 AM Pvt_Ryan
Changing to an enhancement, as this is not a bug but simply a limitation of the regex lib used.

We may at some point update to a more fully featured lib.