Capturing Groups in Sed

2021-09-19

While migrating this blog from Jekyll to Hugo, I found that my old markdown posts wrap their title params in single quotes, which Hugo does not appreciate, so I needed to replace them all with double quotes.

For example, I have the following across more than 200 files:

---
title: 'abcdefgh'
tags:
- new
- stuff
---

And I needed to replace title: 'abcdefgh' with title: "abcdefgh".

To do that, I’m going to use sed, because it’s going to be terribly painful to do this manually.

Although I often use sed to replace full words, modifying the characters around some text is something that I haven’t had much practice with.

In most regex flavors, we use (.*) to capture whatever .* matches, and $1 to reference the captured group.

In sed, it looks similar:

\(.*\) defines a capturing group (the parentheses are escaped in sed's default basic regex syntax)
\1 refers to the first captured group
\2 refers to the second captured group
...
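
As a quick check of how the numbering works, here is a minimal sketch with two groups. The input string is made up for illustration and is not from my posts.

```shell
# Swap two words: \1 is the first \(...\) group, \2 is the second.
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')
echo "$swapped"
```

Groups are numbered left to right by their opening parenthesis, so the replacement can reuse them in any order.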

So for the above scenario, we can quickly replace the single-quotes with just:

$ sed -i "s,title: '\(.*\)',title: \"\1\",g" *.md

What this translates to is:

  • find title: followed by a pair of single quotes surrounding a captured group of characters \(.*\)
  • then replace that match with title: and a placeholder \1 surrounded by double quotes
  • then fill in \1 with the text captured by \(.*\)

So now,

title: 'abcdefgh'
...

becomes

title: "abcdefgh"
...
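
Before running this over a couple of hundred real posts, it may be worth trying it on throwaway copies first. The sketch below assumes a scratch directory with two made-up files (a.md and b.md are hypothetical names). It does a dry run without -i, then an in-place run that keeps a .bak copy of each original. The attached-suffix form -i.bak is used here because both GNU and BSD sed accept it.

```shell
# Build a scratch directory with two hypothetical posts (file names are made up).
workdir=$(mktemp -d)
printf '%s\n' '---' "title: 'first post'" 'tags:' '- new' '---' > "$workdir/a.md"
printf '%s\n' '---' "title: 'second post'" '---' > "$workdir/b.md"

# Dry run: without -i, sed prints the result to stdout and leaves the file alone.
preview=$(sed "s,title: '\(.*\)',title: \"\1\",g" "$workdir/a.md")
printf '%s\n' "$preview"

# In-place run over every .md file, keeping a .bak copy of each original.
find "$workdir" -name '*.md' -exec sed -i.bak "s,title: '\(.*\)',title: \"\1\",g" {} +

# Show the rewritten title lines.
cat "$workdir/a.md" "$workdir/b.md" | grep '^title:'
```

Once the output looks right, the .bak files can be deleted, or the same find command can be pointed at the real content directory.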

  • Note: in the above example, I am using , as the sed delimiter instead of the normal /
  • Also, notice that we have \" just before and after \1. This is because the whole sed script is already wrapped in double quotes, so the literal double quotes inside it need to be escaped.
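
To make the two notes above concrete, here is a small sketch comparing equivalent spellings of the same substitution on a single sample line. The third variant assumes a sed that supports -E for extended regex (GNU and BSD sed both do), where parentheses group without backslashes.

```shell
line="title: 'abcdefgh'"

# Original form: "," delimiter, basic regex, double-quoted script.
a=$(printf '%s\n' "$line" | sed "s,title: '\(.*\)',title: \"\1\",")

# Same substitution with the usual "/" delimiter; the pattern contains no "/".
b=$(printf '%s\n' "$line" | sed "s/title: '\(.*\)'/title: \"\1\"/")

# Extended regex (-E): parentheses capture without being escaped.
c=$(printf '%s\n' "$line" | sed -E "s,title: '(.*)',title: \"\1\",")

printf '%s\n' "$a" "$b" "$c"
```

The delimiter is purely a matter of taste here. An alternative delimiter mostly pays off when the pattern itself contains slashes, such as file paths or URLs.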
