Regex question: Why does '[0-9]*' return a blank on '>335', but '[0-9]{1,3}' return 335?
This pertains to LibreOffice Calc, which tells me that it uses ICU regular expressions.
#Regex, #HelpNeeded, #LibreOffice
=> More informations about this toot | More toots from RobertoArchimboldi@kolektiva.social
@RobertoArchimboldi A regex always finds the leftmost match. Your first regex, [0-9]*, can match at offset 0 in >335 because * means "0 or more of the preceding thing", so it can match all 0 digits before >. On the other hand, [0-9]{1,3} requires at least one digit to match, so the first location where it can succeed is at offset 1, matching all three available digits.
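The leftmost-match behaviour described above can be sketched in Python. This is only an illustration: Python's re module stands in for ICU here, on the assumption that both engines treat these two simple patterns the same way, and the variable names are made up for the example.

```python
import re

text = ">335"

# [0-9]* may match zero digits, so it succeeds immediately at offset 0,
# in front of the ">" character, with an empty match.
star = re.search(r"[0-9]*", text)

# [0-9]{1,3} needs at least one digit, so the first offset where it can
# succeed is 1, where it greedily takes all three digits.
bounded = re.search(r"[0-9]{1,3}", text)

print(repr(star.group()), star.span())        # '' (0, 0)
print(repr(bounded.group()), bounded.span())  # '335' (1, 4)
```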
=> More informations about this toot | More toots from barubary@infosec.exchange
@barubary combined with @erAck I think that I now understand. So if my test string was '30-45', '[0-9]*' would match 3, 30, 30-, -, 4, and 45?
=> More informations about this toot | More toots from RobertoArchimboldi@kolektiva.social
@barubary @erAck experimenting, I don't understand. The third group of five is 45 in my spreadsheet. Groups 1, 4, 5 return an empty string, 2 returns 30 and 3 returns 45. '[0-9]*,,6' returns N/A
=> More informations about this toot | More toots from RobertoArchimboldi@kolektiva.social
@RobertoArchimboldi If you tell a regex engine to find all matches, they normally don't overlap; i.e. each search takes off where the previous match stopped. So for a pattern of [0-9]* against the string 30-45 I'd expect four matches: the two digits at offset 0 (30 at the beginning of the string), the zero digits at offset 2 (an empty match, just before -), the two digits at offset 3 (45, after -), and the zero digits at offset 5 (an empty match at the end of the string).
(None of the matches can contain - because [0-9] only matches digits.)
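The four non-overlapping matches listed above can be reproduced with a short Python sketch. It assumes Python 3.7 or later, where an empty match is allowed directly after a non-empty one; other engines (including ICU as used by LibreOffice Calc) may count empty matches differently, so treat this as an illustration rather than a description of Calc's behaviour.

```python
import re

text = "30-45"

# Collect (start, end, matched text) for every non-overlapping match.
matches = [(m.start(), m.end(), m.group()) for m in re.finditer(r"[0-9]*", text)]

for start, end, value in matches:
    print(start, end, repr(value))

# Requiring at least one digit with + avoids the empty matches entirely.
digits_only = re.findall(r"[0-9]+", text)
print(digits_only)
```

With * the list holds four entries, two of them empty; with + only the two runs of digits remain, which is often what a spreadsheet formula actually wants.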
=> More informations about this toot | More toots from barubary@infosec.exchange
@barubary I'm learning thank you. Your visualization did render well
=> More informations about this toot | More toots from RobertoArchimboldi@kolektiva.social
@RobertoArchimboldi
No. '[0-9]*' in '30-45' would match '30', then an empty match, then '45', then empty. The * ("zero or more") is a greedy operator; it matches as many as possible.
See https://regex101.com/r/4J7tZd/1
@barubary
=> More informations about this toot | More toots from erAck@social.tchncs.de
@erAck @barubary thank you. I'm learning super helpful
=> More informations about this toot | More toots from RobertoArchimboldi@kolektiva.social