
Using Command Line Tools to Aid Development — Part 2 (Sed)

Living The Dream
Apr 3, 2019

This is the second part of a series of articles. I recommend you start with the first part to get an idea of what the series covers and why using command line tools to aid your development (in any language/environment) can be very useful.

I am going to dive right in here with another text-based tool called ‘sed’. Sed stands for ‘stream editor’. It is another tool from the very early days of Unix, created in 1974, and its goal was a stream-based implementation of text editing built on regular expressions, which back then was a new way of processing text efficiently. These articles assume at least some prior knowledge/use of regular expressions in any environment. Explaining regular expressions is beyond the scope of these use cases/tool tutorials, and because they are aimed at developers I think it’s safe to assume you know what they are.

These days GNU has its own version of sed, which has become the standard edition on Linux. This version adds a number of extensions and conveniences, such as the I (ignore case) flag and an -i switch for editing files in-place that works without an argument. GNU sed is available by default on most Linux distros. MacOS and the BSDs ship their own BSD version of sed instead, and sed can be installed for use on Windows-based setups as well.

Once you have made sure it is available in your chosen development environment, you can use sed for various operations that would be slow or tedious to do manually or with GUI tools. I will show examples of some of these situations below.

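I am using sed on MacOS, where the BSD version is installed by default. Most versions operate the same for most use cases, with the same switches & functionality. One difference to watch for is -i: as the usage output below shows, BSD sed expects an extension argument after it (-i '' for no backup file), whereas GNU sed accepts -i on its own.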

livethedream$ sed 
usage: sed script [-Ealn] [-i extension] [file ...]
sed [-Ealn] [-i extension] [-e script] ... [-f script_file] ... [file ...]
livethedream$

In modern programming there is a big emphasis on splitting your projects/scripts/programs into multiple neatly organised files in various formats and structures, whether to keep your project to standards, to make it easier for other developers to read & modify, or to satisfy compiler rules. This can mean you end up needing to replace certain words, phrases, variable names and the like project-wide or folder-wide when something is renamed. Sed can help you handle situations like this with ease.

Giving commands to Sed

On the most basic level, sed takes ‘commands’, most of which are built around regular expressions. The command determines what sed does with the expressions you provide. The most used command is substitute (s), which provides find/replace style functionality in the form s/<find>/<replace>/.

Remember that, like most regular expression tools, sed accepts flags after the final delimiter. The g flag stands for global: it performs the replacement for every occurrence on a line rather than only the first one it finds (it has nothing to do with ‘greedy’ matching). The I flag is for ignoring case (Note: the MacOS/BSD shipped version of sed is missing the ignore case flag, among a couple of other more modern features that the GNU one has).

You can also use different delimiters, which is handy especially if you're doing operations involving filenames, like this example:

livethedream$ sed 's_/usr/bin_/usr/local/bin_g' <filename>
^ The flags go directly after the closing delimiter, which here is _ rather than /.
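
To make the delimiter point concrete, here is a minimal sketch that runs the same substitution twice, once with / and once with | as the delimiter. The sample line and the variable names are made up for illustration.

```shell
# Hypothetical input line; the paths are only for illustration.
line='PATH=/usr/bin:/opt/tools:/usr/bin'

# With "/" as the delimiter every slash in the pattern must be escaped.
escaped=$(printf '%s\n' "$line" | sed 's/\/usr\/bin/\/usr\/local\/bin/g')

# With "|" as the delimiter the same substitution needs no escaping.
# Flags such as g still go directly after the closing delimiter.
readable=$(printf '%s\n' "$line" | sed 's|/usr/bin|/usr/local/bin|g')

printf '%s\n' "$escaped"
printf '%s\n' "$readable"
```

Both commands print the same line. The second form is simply easier to read and harder to get wrong.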

Specifying locations

If you want to be more specific on where in the file(s) the operations are carried out, you can tell sed line numbers or line number ranges to operate on and even do the same with pattern matches:

livethedream$ sed '3 s/<find>/<replace>/' <filename>
^ Only match on line 3. Note: sed numbers the first line as 1, not 0 like some environments.
livethedream$ sed '3,10 s/<find>/<replace>/' <filename>
^ Only match on lines 3 to 10
livethedream$ sed '/^#/ s/<find>/<replace>/' <filename>
^ Only operates on lines that start with a #, so comments in certain languages in this instance.
livethedream$ sed '/first/,/last/ s/<find>/<replace>/' <filename>
^ Only operates on the lines from the first line matching first up to the next line matching last, both included. Here's a working example where every line in that range, including the two pattern lines themselves, is deleted using the d command:
livethedream$ cat file.txt
one
first
two
three
last
four
livethedream$ sed '/first/,/last/ d' file.txt
one
four
livethedream$
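
The placeholder commands above can be tried on real input. The following is a small sketch using made-up sample lines, where each variable holds the result of one kind of address.

```shell
# Made-up sample input: six lines, two of which are # comments.
sample=$(printf '%s\n' 'a=1' '# a=2' 'a=3' 'a=4' '# a=5' 'a=6')

# Line address: substitute on line 3 only (sed counts lines from 1).
only_line3=$(printf '%s\n' "$sample" | sed '3 s/a/X/')

# Range address: substitute on lines 2 through 4.
lines_2_to_4=$(printf '%s\n' "$sample" | sed '2,4 s/a/X/')

# Pattern address: substitute only on lines that start with #.
comments_only=$(printf '%s\n' "$sample" | sed '/^#/ s/a/X/')

printf '%s\n---\n' "$only_line3" "$lines_2_to_4" "$comments_only"
```

Comparing the three outputs side by side shows which lines each address selected, since only the selected lines have their a turned into X.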

Using your matches with & and \1

If you want to use part of your matched input in your replacement, you can either use & to insert the entire text matched by the pattern or \1 up to \9 to insert individual grouped parts. Here are a few examples:

livethedream$ echo 'hello-world-123456789' | sed 's/[a-z-]*/&/'
hello-world-123456789
^ The & in the replacement stands for the entire text matched by the pattern, so here the match is replaced with itself and the line comes out unchanged.
livethedream$ echo 'hello-world-123456789' | sed 's/\([a-z-]*\).*/\1/'
hello-world-
livethedream$
^ You can see here that creating a group in the pattern using \(<group>\) gives each group a number, starting at \1 and going up to a maximum of \9, that you can use to refer to it in the replacement. The group here matches lowercase characters a-z and the character -, so it stops at the numbers at the end and stores everything before them in \1. The trailing .* matches the numbers so that they are dropped from the output.
livethedream$ echo 'hello world' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'
world hello
livethedream$
^ This is a common example where two words can be switched round using the grouping references.
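
As a further sketch, here are two replacements where the back-references visibly change the output. The key=value input is made up for illustration.

```shell
# & stands for the whole match: wrap the leading letters and hyphens in brackets.
wrapped=$(echo 'hello-world-123456789' | sed 's/[a-z-]*/[&]/')

# \1 and \2 refer to the first and second \( \) groups:
# turn a "key=value" pair into "value <- key".
swapped=$(echo 'colour=blue' | sed 's/\([a-z]*\)=\([a-z]*\)/\2 <- \1/')

echo "$wrapped"   # [hello-world-]123456789
echo "$swapped"   # blue <- colour
```

The square brackets in the first replacement are ordinary text. Only &, \1 to \9 and the delimiter have a special meaning on the replacement side.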

This sort of expansion to the commands gives a lot more power once you start to master them.

Chaining commands & Command files

Once you get the hang of some basic sed commands you can use some of the switches to chain commands, store commands in sed ‘script’ files and so on. Here are some examples of these more advanced switches. The -e switch can be used for chaining commands to sed. Some people also use it for single command calls out of personal preference or for clarity.

livethedream$ sed -i -e '<command>' -e '<command>' <filename>
e.g.
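livethedream$ sed -i -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' <filename>
^ This chains the commands to remove all lines that are # style comments and all blank/whitespace-only lines from your file. The comment pattern is anchored with ^ so that only whole-line comments are deleted; without the anchor, any line containing a # anywhere would be removed. You can chain up more expressions and eventually come up with one-liners to clean your code up when required.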

You can create sed scripts by just putting one sed command per line into a file and calling it with the -f switch.

File x.sed:

/^[[:space:]]*#/d
/^[[:space:]]*$/d

Command:

livethedream$ sed -f x.sed <filename>
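
Here is a self-contained sketch showing that chaining with -e and loading a script with -f give the same result. The file names app.conf and clean.sed are hypothetical, and the patterns use the POSIX [[:space:]] class on the assumption that it is the most portable way to match whitespace across GNU and BSD sed.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# Made-up config file with comment lines and blank lines.
printf '%s\n' '# settings' 'name=demo' '' '   ' '  # indented comment' 'port=8080' \
  > "$workdir/app.conf"

# Hypothetical script file: one sed command per line.
printf '%s\n' '/^[[:space:]]*#/d' '/^[[:space:]]*$/d' > "$workdir/clean.sed"

# The same two commands, first chained with -e, then loaded with -f.
chained=$(sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' "$workdir/app.conf")
from_file=$(sed -f "$workdir/clean.sed" "$workdir/app.conf")

echo "$chained"
```

Neither call uses -i, so app.conf itself is left as it was and only the cleaned text is printed.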

Creating little sed scripts and keeping them handy to clean up code or even data-files (e.g. simple batch csv processing) can be a good use of this feature.

There are lots of commands for sed, including reading from (r) & writing to (w) files, and appending (a), inserting (i) and changing (c) lines, all from within the sed script itself. Explaining all of these much less used commands would turn this article into a small book. If you want to learn more about the capabilities of the sed version on your system, I recommend reading the man page using this command:

livethedream$ man sed

This will give you in-depth documentation of the switches and commands for your exact system version, be it GNU or BSD.
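
As a brief taste of the insert, change and append commands, here is a sketch using the form where the text goes on the line after a backslash. To my knowledge this is the form that both GNU and BSD sed accept, but do check your man page. The sample lines are made up.

```shell
sample=$(printf '%s\n' 'first' 'second' 'third')

# i\ inserts text before the addressed line, c\ replaces the line,
# and a\ appends text after it. The text goes on the line after the backslash.
result=$(printf '%s\n' "$sample" | sed -e '1i\
# header' -e '2c\
SECOND' -e '3a\
# footer')

printf '%s\n' "$result"
```

GNU sed also accepts a shorter one-line spelling of these commands, which is one of the differences the man page will tell you about.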

Use — Replacing text in files

I will now go through some basic use cases for aiding development using sed. As I mentioned, the most used command is ‘substitute’. The -i switch means sed will edit the input file(s) in-place, like in this example:

livethedream$ sed -i s/<search>/<replace>/ <filename>
e.g.
livethedream$ cat file1.txt
abc
livethedream$ sed -i s/abc/cba/ file1.txt
livethedream$ cat file1.txt
cba
livethedream$

If you omit the -i switch, sed will simply print the full input with any changes it makes to standard output, which makes it useful with pipes or in any bash scripting scenario. If we take the above example without -i, it works out like below, with file1.txt untouched.

livethedream$ sed s/abc/cba/ file1.txt
cba
livethedream$ sed s/abc/cba/ file1.txt > file2.txt
livethedream$
^ This second command shows how you can easily take one file's sed output straight into a new file, keeping a copy of both states.
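
If you want the convenience of -i but also a safety net, a backup suffix can be attached to the switch. The sketch below assumes the -i.bak spelling (suffix attached directly, no space), which as far as I know is accepted by both GNU and BSD sed. The file lives in a temporary directory.

```shell
workdir=$(mktemp -d)
printf 'abc\n' > "$workdir/file1.txt"

# Without -i the file is untouched and the result goes to standard output.
preview=$(sed 's/abc/cba/' "$workdir/file1.txt")

# -i.bak edits in place and keeps the original as file1.txt.bak.
sed -i.bak 's/abc/cba/' "$workdir/file1.txt"

edited=$(cat "$workdir/file1.txt")
backup=$(cat "$workdir/file1.txt.bak")

echo "preview: $preview, edited: $edited, backup: $backup"
```

Previewing first and editing second is a cheap habit that avoids most accidents with in-place edits.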

Use — Replacing text in all files in a folder/directory

You can use the Bash shell * wildcard to unleash sed on every file in the current directory.

livethedream$ sed -i s/<search>/<replace>/ *
e.g.
livethedream$ sed -i s/abc/cba/ *
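
One caveat with a bare * is that it also matches subdirectories, which sed cannot edit. A hedged sketch of a more careful loop follows, using made-up files in a temporary directory and the -i.bak backup form.

```shell
workdir=$(mktemp -d)
printf 'abc one\n' > "$workdir/a.txt"
printf 'abc two\n' > "$workdir/b.txt"
mkdir "$workdir/subdir"

# Only pass regular files to sed; a bare * would also hand it subdir.
for f in "$workdir"/*; do
  [ -f "$f" ] || continue
  sed -i.bak 's/abc/cba/' "$f"
done

cat "$workdir/a.txt" "$workdir/b.txt"
```

The wildcard is expanded once before the loop starts, so the .bak files created along the way are not processed themselves.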

Use — Cleaning up code in your project or file

You can go further with this sort of concept and start using the functionality of sed to clean up your code, remove unwanted function calls and such with commands like the ones below. Some of these examples utilise the ‘delete’ command, specified of course by the letter d. It takes a single pattern like the find half of the substitute command, but it deletes every whole line that matches rather than just the matched text.

livethedream$ sed -i '/^[[:space:]]*$/d' <filename>
^ This removes all lines that are blank or contain only whitespace.
livethedream$ sed -i s/<function name>\(\)\;// <filename>
^ This would remove calls to a certain function (in the no-argument form shown) from a file.
livethedream$ sed -i '/^[[:space:]]*#/d' <filename>
^ This would remove all lines that are # style comments from your file. Comments that trail code on the same line are left alone.
livethedream$ sed -i 's/<old name>/<new name>/g' <filename>
^ This last one is self explanatory really, but it is very useful for when you need to change the name of a variable, function or class that gets referenced or called all over the place in a large project.

You will notice the different styles of quoting used above. Single quoting the command is useful if you're using characters that your command line shell would otherwise interpret itself. The function call example shows the other method, escaping each special character with \ so that the shell passes it through to sed. This is just another way to do the same thing.
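
Putting a few of these together, here is a small sketch that cleans up a made-up snippet in one pass. The names debug_log, old_total and new_total are hypothetical.

```shell
# Made-up script contents to clean up.
code=$(printf '%s\n' 'old_total=1' 'debug_log();' 'echo $old_total  # show it' '' 'debug_log();')

# 1. Remove the no-argument debug_log(); calls.
# 2. Rename old_total to new_total everywhere on each line.
# 3. Delete lines that are now blank or whitespace-only.
cleaned=$(printf '%s\n' "$code" | sed \
  -e 's/debug_log();//' \
  -e 's/old_total/new_total/g' \
  -e '/^[[:space:]]*$/d')

printf '%s\n' "$cleaned"
```

The order matters here: the blank-line delete runs last so that it also catches the lines emptied by the first substitution.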

Teaming up with find

Like in part one with Grep, you can pair sed up with find to really get recursive through a large project with some help from the xargs tool & a pipe.

livethedream$ find <path> -type f -print0 | xargs -0 sed -i 's/<find>/<replace>/g'
^ -print0 (supported by both GNU and BSD find) tells find to use the null character (\0) instead of a newline as the output delimiter between the files found. This is the safer option if your filenames contain blanks or other special characters. When you use -print0, xargs needs the matching -0 argument so that it splits its input on the same character.

Find called with -type f -print0 prints out a stream of all the filenames recursively in the specified path & deeper. Xargs then calls sed with the required arguments of the filename, -i for in-place and the requested regular expression replacement. This can be used to quickly replace on a large scale.

livethedream$ find <path> -type f -exec sed -i 's/<find>/<replace>/g' {} +
^ This is another option using find's -exec switch rather than xargs. Find passes the filenames to sed directly, with {} + batching many files into each sed call, so unusual filenames are handled safely here too.
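
To see the find and xargs pairing work end to end, here is a sketch on a throwaway directory tree. The directory and file names are made up, including ones with spaces, and a -name filter is added on the assumption that you would normally restrict the replacement to one kind of file.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub dir"
printf 'abc\n' > "$workdir/src/top.txt"
printf 'abc abc\n' > "$workdir/src/sub dir/deep file.txt"
printf 'abc\n' > "$workdir/src/notes.md"

# Limit the replacement to *.txt files. -print0 and -0 keep the
# names containing spaces intact on their way to sed.
find "$workdir/src" -type f -name '*.txt' -print0 |
  xargs -0 sed -i.bak 's/abc/cba/g'

cat "$workdir/src/top.txt" "$workdir/src/sub dir/deep file.txt" "$workdir/src/notes.md"
```

The notes.md file is printed unchanged because the -name filter never passed it to sed.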

You can also team up sed with anything else that outputs plaintext, so anything that writes to standard output in your shell can be piped through a sed command.

This concludes my brief introduction to using the great command line tool ‘sed’ for aiding development. There are lots of other great ways you can use sed to process any type of text file or even bash streams & pipes. I have tried to show you the switches and commands that get used most often with sed rather than going in-depth on every single piece of functionality it holds. It is a most powerful tool when leveraged correctly.

If you need a more adaptable, advanced version of what sed does, with even more functionality to really mangle your text, definitely look into the command line tool AWK.

If you like this article, please keep an eye out for part 3. If you have any feedback or questions please email me at root@livingthedream.fun. :)


Software/Web Development with data science, multiple disciplines and cynicism thrown in.