Regex question mark

To match a string with a pattern like:

-TEXT-someMore-String

To get -TEXT-, I learned that this works:

/-(.+?)-/ // -TEXT-

As far as I know, ? makes the preceding token optional, as in:

colou?r matches both colour and color

I initially wrote the regex to get the -TEXT- part like this:

/-(.+)-/

But it gave -TEXT-someMore-.

How does adding ? make the regex match just the -TEXT- part? I thought ? only made the preceding token optional, as in the example above, not that it made the match stop at a certain point.

Answers:

Answer

As you say, ? sometimes means "zero or one", but in your regex +? is a single unit meaning "one or more — and preferably as few as possible". (This is in contrast to bare +, which means "one or more — and preferably as many as possible".)
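To make the difference concrete, here is a minimal sketch that runs both patterns from the question against the sample string. It uses Python's re module, which is an assumption (the question does not name a language), but greedy and lazy quantifiers behave the same way there for these patterns.

```python
import re

text = "-TEXT-someMore-String"

# Greedy: .+ grabs as much as it can, then backs off to the LAST "-".
greedy = re.search(r"-(.+)-", text)

# Lazy: .+? grabs as little as it can, so it stops at the NEXT "-".
lazy = re.search(r"-(.+?)-", text)

print(greedy.group(0))  # -TEXT-someMore-
print(lazy.group(0))    # -TEXT-
```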

As the documentation puts it:

However, if a quantifier is followed by a question mark, then it becomes lazy, and instead matches the minimum number of times possible, so the pattern /\*.*?\*/ does the right thing with the C comments. The meaning of the various quantifiers is not otherwise changed, just the preferred number of matches. Do not confuse this use of question mark with its use as a quantifier in its own right. Because it has two uses, it can sometimes appear doubled, as in \d??\d which matches one digit by preference, but can match two if that is the only way the rest of the pattern matches.
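The two patterns quoted from the documentation can be tried out in the same way. This is only a sketch in Python's re module (an assumption, since the quoted text describes PCRE), and the sample input string is made up for illustration.

```python
import re

# Hypothetical input containing two C-style comments.
source = "int a; /* first */ int b; /* second */"

# Lazy .*? ends each match at the nearest "*/".
lazy_comments = re.findall(r"/\*.*?\*/", source)

# Greedy .* runs on to the last "*/" and swallows the code in between.
greedy_comments = re.findall(r"/\*.*\*/", source)

print(lazy_comments)    # ['/* first */', '/* second */']
print(greedy_comments)  # ['/* first */ int b; /* second */']

# Doubled question mark: \d?? is an optional digit that prefers to be absent.
one_digit = re.search(r"\d??\d", "12").group(0)      # prefers a single digit
two_digits = re.fullmatch(r"\d??\d", "12").group(0)  # forced to take both

print(one_digit)   # 1
print(two_digits)  # 12
```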

Answer

Alternatively, you can use the Ungreedy modifier (U) to make every quantifier in the regular expression lazy by default, so that the whole pattern prefers the shortest possible match:

/-(.+)-/U

Answer

? after a token is shorthand for {0,1}, which means zero or one occurrence of that token.

+, however, is not a token but a quantifier. It is shorthand for {1,}: one or more occurrences, with no upper limit.

A ? after a quantifier switches it to non-greedy mode. In greedy mode, the quantifier matches as much of the string as possible; in non-greedy mode, it matches as little as possible.
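The shorthand equivalences can be checked directly. The following is a rough sketch in Python's re module (an assumption; the word list is made up for illustration), showing that ? behaves like {0,1}, that + behaves like {1,}, and that a trailing ? makes the explicit {1,} form lazy as well.

```python
import re

words = ["color", "colour", "colouur"]

# ? and {0,1} accept exactly the same words.
with_shorthand = [w for w in words if re.fullmatch(r"colou?r", w)]
with_braces = [w for w in words if re.fullmatch(r"colou{0,1}r", w)]
print(with_shorthand)  # ['color', 'colour']
print(with_braces)     # ['color', 'colour']

text = "-TEXT-someMore-String"

# {1,} is the long form of +, and {1,}? is the long form of +?.
greedy_braces = re.search(r"-(.{1,})-", text).group(0)
lazy_braces = re.search(r"-(.{1,}?)-", text).group(0)
print(greedy_braces)  # -TEXT-someMore-
print(lazy_braces)    # -TEXT-
```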

Answer

Another, and perhaps the underlying, problem with your regex is that you try to match a run of arbitrary characters via .+?. What you probably want is "any character except -", which you can get via [^-]+. In this case it doesn't matter whether the match is greedy or not: the repetition stops as soon as it reaches the second "-" in your string.
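As a quick check of that claim, this sketch (again in Python's re module, which is an assumption) runs the negated character class in both greedy and lazy form against the string from the question.

```python
import re

text = "-TEXT-someMore-String"

# [^-] cannot cross a "-", so greediness no longer changes the result.
greedy_class = re.search(r"-([^-]+)-", text).group(0)
lazy_class = re.search(r"-([^-]+?)-", text).group(0)

print(greedy_class)  # -TEXT-
print(lazy_class)    # -TEXT-
```

The negated class also states the intent directly in the pattern instead of relying on how the engine backtracks.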
