Ancestors

Written by Roberto von Archimboldi on 2025-01-14 at 19:14

Regex question: Why does '[0-9]*' return a blank on '>335', but '[0-9]{1,3}' return 335?

This pertains to LibreOffice Calc, which tells me that it uses ICU regular expressions.

#Regex, #HelpNeeded, #LibreOffice

Written by erAck on 2025-01-14 at 20:13

@RobertoArchimboldi

Because the empty match is the first match for any "zero or more" pattern when the string does not start with a character the pattern can consume (and the result is not a blank but an empty string). The same happens with the pattern 'x*' on this ">335" string. The second possible match for '[0-9]*' is 335. If you want to match only the digits, use '[0-9]+' instead, or restrict the match to the second occurrence, like

=REGEX(">335";"[0-9]*";;2)

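'[0-9]{1,3}' matches between one and three digits, taking as many as it can. Because it needs at least one digit, it cannot match the empty string, so its first match is 335.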
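
The behaviour described in this reply can be reproduced outside Calc. The following is a minimal sketch in Python, whose `re` module is used here only as a stand-in for ICU (an assumption; the two engines agree on these simple patterns as far as I know). The helper `nth_match` is a hypothetical name that mimics the occurrence argument of REGEX.

```python
import re

def nth_match(pattern, text, occurrence=1):
    """Return the text of the n-th match (1-based), or None if there are fewer."""
    for index, match in enumerate(re.finditer(pattern, text), start=1):
        if index == occurrence:
            return match.group(0)
    return None

text = ">335"

# '*' allows zero digits, so the first match is the empty string before '>'.
first_star = nth_match(r"[0-9]*", text, 1)
# The second match is the run of digits.
second_star = nth_match(r"[0-9]*", text, 2)
# '+' and '{1,3}' both require at least one digit, so they skip the '>'.
first_plus = nth_match(r"[0-9]+", text, 1)
first_bounded = nth_match(r"[0-9]{1,3}", text, 1)

print(repr(first_star), repr(second_star), repr(first_plus), repr(first_bounded))
```

The first value printed is an empty string, not a space, which is why the cell looks blank.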

Written by Roberto von Archimboldi on 2025-01-14 at 20:29

@erAck Thank you very much. I'm nearly there. I am not quite clear what an 'empty match' is or a 'zero or more' pattern. Fortunately, I want to match one to three digits as many times as possible and then flag for the first or second occurrence. I want to separate a column that contains ranges, 30 - 355, into the lower and upper bounds.

I'd also like to sum the lower bounds without making a whole new column, but I haven't worked out how to do that yet.
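
As a rough illustration of the goal described here, the sketch below splits range strings into lower and upper bounds and sums the lower bounds. It is written in Python rather than as a Calc formula, the sample cells are made up, and `split_range` is a hypothetical helper, so treat it as a description of the logic rather than a ready-made spreadsheet solution.

```python
import re

# One to three digits, as in the pattern discussed in this thread.
BOUND = re.compile(r"[0-9]{1,3}")

def split_range(cell):
    """Return (lower, upper) from a string such as '30 - 355'."""
    numbers = [int(n) for n in BOUND.findall(cell)]
    if len(numbers) < 2:
        raise ValueError(f"expected two numbers in {cell!r}")
    # First occurrence is the lower bound, second is the upper bound.
    return numbers[0], numbers[1]

# Hypothetical column contents, for illustration only.
cells = ["30 - 355", "12 - 48", "7 - 100"]

bounds = [split_range(cell) for cell in cells]
lower_sum = sum(lower for lower, _ in bounds)

print(bounds)
print(lower_sum)
```

Note that '[0-9]{1,3}' would split a four-digit number such as 1234 into 123 and 4, so '[0-9]+' is the safer pattern if the bounds can exceed 999.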

Toot

Written by erAck on 2025-01-14 at 20:43

@RobertoArchimboldi

For your task, best ask on https://ask.libreoffice.org/

A * in a regex pattern tells the engine to match the preceding character or expression zero or more times. See the Regular Expression Operators section at https://unicode-org.github.io/icu/userguide/strings/regexp.html#regular-expression-operators for details. You may also test expressions at https://regex101.com/; it is best to use the Java 8 or ECMAScript flavour to approximate ICU behaviour.
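
To make "zero or more" and "empty match" concrete, this small sketch lists where each match starts and ends. It assumes Python 3.7 or later and uses Python's `re` module in place of ICU; `match_spans` is a hypothetical helper name.

```python
import re

def match_spans(pattern, text):
    """Return (start, end, matched_text) for every match, left to right."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]

# Zero digits fit before '>', so the first match has start == end (it is empty).
digits = match_spans(r"[0-9]*", ">335")

# 'x' never occurs, so every match of 'x*' is an empty match.
letters = match_spans(r"x*", ">335")

for span in digits:
    print(span)
print(len(letters), "empty matches for 'x*'")
```

A match whose start and end positions are equal consumed no characters; that is the empty match mentioned earlier in the thread.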

Descendants

Written by Roberto von Archimboldi on 2025-01-14 at 20:44

@erAck thank you. Will do

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113828654719485701