Capturing Groups in Sed
While migrating this blog from Jekyll to Hugo, I found that my old markdown posts wrap their title params in single quotes, which Hugo does not appreciate, so I needed to change them all to double quotes.
For example, I have the following across more than 200 files:
---
title: 'abcdefgh'
tags:
- new
- stuff
---
And I needed to replace title: 'abcdefgh' with title: "abcdefgh".
To do that, I’m going to use sed, because it’s going to be terribly painful to do this manually.
Although I often use sed to replace full words, modifying the characters around some text is something that I haven’t had much practice with.
In normal regex, we often use (.*) to capture a group matching anything (.*) and $1 to reference the captured group.
In sed, it looks similar:
\(.*\) defines a capturing group (the parentheses are escaped in sed's default basic regex syntax)
\1 refers to the first captured group
\2 refers to the second captured group
...
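To see \1 and \2 in action before touching any real files, here is a minimal sketch that swaps two words. The input string and the variable names are made up for illustration. The second command assumes your sed supports the -E flag for extended regex (GNU and BSD sed both do, as far as I know), where the parentheses no longer need escaping.

```shell
# Basic regex (sed's default): capturing groups are written as \( ... \).
swapped=$(printf '%s\n' "hello world" | sed 's/\(hello\) \(world\)/\2 \1/')
echo "$swapped"        # world hello

# Extended regex via -E: plain ( ... ) captures, \1 and \2 stay the same.
swapped_ere=$(printf '%s\n' "hello world" | sed -E 's/(hello) (world)/\2 \1/')
echo "$swapped_ere"    # world hello
```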
So for the above scenario, we can quickly replace the single quotes with just the following, followed by the files to edit:
$ sed -i "s,title: '\(.*\)',title: \"\1\",g"
What this translates to is:
- find a line that contains title: followed by a pair of '' surrounding a captured group of characters \(.*\)
- then replace that text with title: and a placeholder \1 surrounded by ""
- then replace \1 with the characters captured by \(.*\)
So now,
title: 'abcdefgh'
...
becomes
title: "abcdefgh"
...
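If you want to try the whole thing safely first, the sketch below runs the same substitution against a throwaway copy of the front matter. The temp directory, the post.md file name, and the *.md glob are all assumptions for the demo. I use -i.bak rather than a bare -i so that sed keeps a backup of each file; that attached-suffix form should work with both GNU and BSD sed.

```shell
# Create a scratch directory with one sample post so no real files are touched.
workdir=$(mktemp -d)
cat > "$workdir/post.md" <<'EOF'
---
title: 'abcdefgh'
tags:
- new
- stuff
---
EOF

# Same substitution as above, applied in place; originals are kept as *.md.bak.
sed -i.bak "s,title: '\(.*\)',title: \"\1\",g" "$workdir"/*.md

# Show the rewritten title line.
result=$(grep '^title:' "$workdir/post.md")
echo "$result"    # title: "abcdefgh"
```

Once the output looks right, the same command can be pointed at the real posts directory, and the .bak files can be deleted after a quick check.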
- Note: in the above example, I am using , as the sed delimiter instead of the usual /
- Also, notice that we have \" just before and after \1. This is because we need to escape double quotes, since we’re already using double quotes to wrap the sed argument.
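To make the two notes concrete, here is a small sketch comparing three ways of writing the same substitution. The sample line and variable names are only for illustration. Since this pattern contains no slashes, the usual / delimiter works just as well as the comma here. The last variant wraps the script in single quotes instead, which means each literal single quote has to be written as '\'' and shows why double quotes were the less painful wrapper.

```shell
line="title: 'abcdefgh'"

# Comma delimiter, double-quoted script (the form used in this post).
with_comma=$(printf '%s\n' "$line" | sed "s,title: '\(.*\)',title: \"\1\",g")

# Slash delimiter: equivalent here because the pattern contains no / characters.
with_slash=$(printf '%s\n' "$line" | sed "s/title: '\(.*\)'/title: \"\1\"/g")

# Single-quoted script: no need to escape ", but each ' becomes '\''.
with_single=$(printf '%s\n' "$line" | sed 's,title: '\''\(.*\)'\'',title: "\1",g')

echo "$with_comma"     # title: "abcdefgh"
echo "$with_slash"     # title: "abcdefgh"
echo "$with_single"    # title: "abcdefgh"
```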