Wednesday, November 3, 2021

Gnome Text Editor gedit, and Regular Expressions

Gedit is the Gnome project's default text editor.

Somewhere over the last twenty years, it became a victim of malicious simplification, and it's hard to get information on advanced features. (I'm still trying to remember how to enable the included "advanced features" with the new UI.)

Since it's hard to find information on gedit's regular expressions, I'm taking notes here:

  • \s matches any whitespace, including newlines, so a match can run across the end of a line.
    \S seems to invert that (any non-whitespace character).
  • ^ is line beginning,
    $ is line end, but \s can consume the newline, so the line end you meant to stop at may be skipped.
  • \h matches non-newline whitespace, and
    \H inverts that.
  • () Parenthesized groups work; \1 through \9 refer to the first nine groups in the replacement pattern.
  • | Alternate patterns work, separated by | .
  • [] Brackets match any single character from the listed set:
    \d is the same as [0-9] .
Repeats:
  • * is 0 or more
  • + is one or more

Thus 

\h+ 

is one or more non-newline whitespace characters.
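
The difference between \s and \h can be checked outside gedit with a minimal Python sketch. Note the assumptions: Python's re module has no \h, so [ \t] is used here as a stand-in (gedit's \h may cover more kinds of horizontal space than that), and the sample text is made up.

```python
import re

# Assumption: [ \t] approximates gedit's \h, which Python's re does not support.
H = r"[ \t]"

# Two made-up lines: trailing spaces on the first, a leading tab on the second.
text = "LABEL  \n\tNOP"

# \s+ runs across the end of the line: spaces, newline, and tab all match.
s_match = re.search(r"\s+", text).group()

# The \h stand-in stops at the newline.
h_match = re.search(H + "+", text).group()

print(repr(s_match))
print(repr(h_match))
```

The first match includes the newline and the next line's tab; the second is only the two trailing spaces.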

I'll add notes as it seems appropriate.

Some examples:

  • Insert a semicolon comment character where it was left out of EQU statements:
    match: (\h+EQU)(\h+)(\S+)(\h+)(\S)
    replace: \1\2\3\4; \5
  • Replace LDA A style assembler lines with LDAA, inserting a semicolon before comments as you go (lines without comments are not matched):
    match: (\h+)(LDA)\h+([AB])(\h+)(\S+)(\h+)(\S)
    replace: \1\2\3\4\5\6; \7
  • Insert semicolon comment characters in LDD and STD lines with comments:
    match: (\h+)(LD|ST)D(\h+)(\S+)(\h+)(\S)
    replace: \1\2D\3\4\5; \6
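  • Insert semicolon comment characters in branch (Bcc) lines. The two condition-code characters are matched with \S\S; B?? would not do it, because ?? is a lazy "zero or one" quantifier rather than a two-character wildcard:
    match: (\h+B\S\S)(\h+)(\S+)(\h+)(\S)
    replace: \1\2\3\4; \5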
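
The first two examples above can be tried outside gedit with a small Python sketch. This is only an approximation: Python's re module has no \h, so the hypothetical helper gedit_replace below swaps in [ \t] for it, and the three assembler lines are made-up test data.

```python
import re

# Assumption: [ \t] approximates gedit's \h, which Python's re does not support.
H = r"[ \t]"

def gedit_replace(match: str, replace: str, text: str) -> str:
    """Apply a gedit-style match/replace pair using Python's re (hypothetical helper)."""
    # Naive textual substitution of \h; good enough for the patterns in this post.
    pattern = match.replace(r"\h", H)
    return re.sub(pattern, replace, text)

# Made-up sample: one EQU with a comment, one without, one LDA A with a comment.
source = (
    "COUNT\tEQU\t$10\tloop counter\n"
    "START\tEQU\t$100\n"
    "\tLDA A\t#$10\tload it\n"
)

# EQU example: insert "; " before the comment.
result = gedit_replace(r"(\h+EQU)(\h+)(\S+)(\h+)(\S)", r"\1\2\3\4; \5", source)

# LDA A example: join LDA and the accumulator name, then insert "; ".
result = gedit_replace(
    r"(\h+)(LDA)\h+([AB])(\h+)(\S+)(\h+)(\S)", r"\1\2\3\4\5\6; \7", result
)

print(result)
```

The EQU line without a comment is left alone because the \h stand-in cannot cross the newline to reach the next line; with \s in its place, the match could spill over into the following line.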

