SED Tutorial - how to replace or substitute file contents

2021-08-18

The syntax for SED substitution is:

sed 's/regexp/replacement/g' inputFileName

s stands for substitute
g stands for global, which means that all matching occurrences in the line would be replaced

Let us consider a sample file, example.txt, as shown below:

~] cat example.txt
one Red-Hat is Linux
two red-hat is linux
(3) FreeBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian
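
To follow along, the sample file can be recreated roughly as below. This is only a sketch: the scratch directory and the names workdir and line_count are assumptions made for illustration, not part of the original setup.

```shell
# Create the sample file in a scratch directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
one Red-Hat is Linux
two red-hat is linux
(3) FreeBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian
EOF

# Count the lines to confirm the file was written (five text lines plus one empty line).
line_count=$(wc -l < "$workdir/example.txt")
empty_count=$(grep -c '^$' "$workdir/example.txt")
echo "lines: $line_count, empty lines: $empty_count"
```

The quoted 'EOF' keeps the shell from expanding anything inside the here-document, so the quotes and # characters are written literally.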

Add something to the beginning of every line in a file:

~] sed 's/^/our_string: /' example.txt      
our_string: one Red-Hat is Linux
our_string: two red-hat is linux
our_string: (3) FreeBSD is Unix
our_string: "4" freebsd is Unix
our_string: 
our_string: #6# xubuntu is debian
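
Note that the command above also prefixes the empty line. If that is not wanted, one possible approach is to put the address /./ in front of the substitution, so that only lines with at least one character are touched. The input below is a made-up three-line sample, and prefixed is a hypothetical variable name.

```shell
# /./ selects lines containing at least one character, so the empty line is skipped.
prefixed=$(printf 'one\n\ntwo\n' | sed '/./s/^/our_string: /')
printf '%s\n' "$prefixed"
```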

Add something to the end of every line in a file:

~] sed 's/$/ our_string/' example.txt     
one Red-Hat is Linux our_string
two red-hat is linux our_string
(3) FreeBSD is Unix our_string
"4" freebsd is Unix our_string
 our_string
#6# xubuntu is debian our_string

To replace only the first occurrence of a character on every line, say 'e' with 'E':

~] sed 's/e/E/' example.txt                   
onE Red-Hat is Linux
two rEd-hat is linux
(3) FrEeBSD is Unix
"4" frEebsd is Unix

#6# xubuntu is dEbian

To replace all occurrences of 'e' with 'E', add the `g` flag:

~] sed 's/e/E/g' example.txt 
onE REd-Hat is Linux
two rEd-hat is linux
(3) FrEEBSD is Unix
"4" frEEbsd is Unix

#6# xubuntu is dEbian

Delete all empty lines

~] sed '/^\s*$/d' example.txt        
one Red-Hat is Linux
two red-hat is linux
(3) FreeBSD is Unix
"4" freebsd is Unix
#6# xubuntu is debian

This uses \s to match any whitespace character, so lines that contain only spaces or tabs are deleted as well.

Or you can use:

sed '/^$/d' file

The ^$ pattern matches only completely empty lines, so this command does not remove lines that contain spaces.
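
The \s escape is a GNU extension, so on other sed implementations the POSIX character class [[:space:]] is probably the safer spelling. The sketch below compares both deletion patterns on a made-up three-line input whose middle line holds two spaces; the variable names are hypothetical.

```shell
# Three lines: "a", a line holding only two spaces, and "b".
input=$(printf 'a\n  \nb')

# /^$/d removes only truly empty lines, so the whitespace-only line survives.
only_empty=$(printf '%s\n' "$input" | sed '/^$/d' | wc -l)

# /^[[:space:]]*$/d also removes lines made up entirely of whitespace.
blank_too=$(printf '%s\n' "$input" | sed '/^[[:space:]]*$/d' | wc -l)

echo "lines left: $only_empty versus $blank_too"
```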

How to replace the second, the third, or in general the nth occurrence:

~] echo -e 'eeeeee\neeeee' | sed 's/e/E/2'  
eEeeee
eEeee

~] echo -e 'eeeeee\neeeee' | sed 's/e/E/4'  
eeeEee
eeeEe

Now, to replace all occurrences from the 3rd occurrence onwards:

~] echo -e 'eeeeee\neeeee' | sed 's/e/E/3g'  
eeEEEE
eeEEE
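
As a slightly more practical illustration, the same flags can be applied to a delimiter. Combining a number with g is defined by GNU sed; POSIX leaves that combination unspecified, so the second command below assumes GNU sed. The input string and variable names are made up for the example.

```shell
# Replace only the 2nd colon.
second_only=$(echo 'a:b:c:d' | sed 's/:/-/2')

# Replace the 2nd colon and every colon after it (GNU sed behaviour).
from_second=$(echo 'a:b:c:d' | sed 's/:/-/2g')

echo "$second_only"   # a:b-c:d
echo "$from_second"   # a:b-c-d
```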

To replace a string only on a specific line, say the 3rd line, rather than in the entire file:

~] sed '3s/e/E/g' example.txt                         
one Red-Hat is Linux
two red-hat is linux
(3) FrEEBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian

3s means the substitution is done only on the 3rd line.

To replace a string on a range of lines, say from the 1st to the 3rd line:

~] sed '1,3s/e/E/g' example.txt           
onE REd-Hat is Linux
two rEd-hat is linux
(3) FrEEBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian
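
A range does not need a fixed end: the address $ stands for the last line of the input. The following sketch uses a made-up three-line input and hypothetical variable names to show a range that runs to the end, and a substitution on the last line only.

```shell
# From line 2 through the last line, replace every 'e'.
from_second_line=$(printf 'eel\nbee\nsee\n' | sed '2,$s/e/E/g')

# On the last line only, replace the first 'e'.
last_line_only=$(printf 'eel\nbee\nsee\n' | sed '$s/e/E/')

printf '%s\n' "$from_second_line"
printf '%s\n' "$last_line_only"
```

The single quotes matter here: inside double quotes the shell would try to expand $s as a variable.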

To rewrite the entire line while keeping its original text:

Example:

~] sed 's/.*/& and run on my pc/' example.txt             
one Red-Hat is Linux and run on my pc
two red-hat is linux and run on my pc
(3) FreeBSD is Unix and run on my pc
"4" freebsd is Unix and run on my pc
 and run on my pc
#6# xubuntu is debian and run on my pc

The & symbol denotes the entire pattern matched. In this case, since we are using .* which means matching the entire line, & contains the entire line.

example:

~] sed 's/free/&---/' example.txt  
one Red-Hat is Linux
two red-hat is linux
(3) FreeBSD is Unix
"4" free---bsd is Unix

#6# xubuntu is debian
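
Two more small uses of &, sketched with made-up inputs and hypothetical variable names: wrapping whatever matched in brackets, and writing \& when a literal ampersand is wanted in the replacement.

```shell
# & is replaced by the matched text, here the first run of digits.
marked=$(echo 'listen on port 8080' | sed 's/[0-9][0-9]*/[&]/')

# \& inserts a literal ampersand instead of the matched text.
literal=$(echo 'R and D' | sed 's/ and / \& /')

echo "$marked"    # listen on port [8080]
echo "$literal"   # R & D
```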

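In this example we use find and sed to add a .txt extension to all files whose names start with file:

find . -name "file*" | sed 's/.*/mv & &.txt/' | sh

In the command above, find lists the files and sed turns each name into a mv command. The output of sed is then piped to sh, which runs the commands. Note that the s/// syntax is the same as the one used inside vi. Because the names are not quoted, file names that contain spaces or shell metacharacters will break this pipeline.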
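
Since the generated text is executed by sh, it can be worth printing the commands before running them. The sketch below works in a scratch directory with two made-up files and wraps each name in single quotes; it assumes no file name contains a single quote or a newline. The names workdir and commands are hypothetical.

```shell
# Scratch directory with two sample files (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/file1" "$workdir/file2"

# Dry run: build the mv commands, quoting each name, and print them for review.
commands=$(find "$workdir" -name 'file*' | sort | sed "s/.*/mv '&' '&.txt'/")
printf '%s\n' "$commands"

# Only after checking the printed commands, hand them to sh.
printf '%s\n' "$commands" | sh
ls "$workdir"
```

Keeping the commands in a variable means the text that was reviewed is exactly the text that gets executed.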

Replace some string only if line contains another string

We replace 'i' with 'I' only on lines that contain the string free:

~] sed '/free/s/i/I/g' example.txt  
one Red-Hat is Linux
two red-hat is linux
(3) FreeBSD is Unix
"4" freebsd Is UnIx

#6# xubuntu is debian
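
The condition can also be inverted: a ! after the address applies the command to the lines that do not match. A minimal sketch on a made-up two-line input (result is a hypothetical variable name):

```shell
# Substitute only on lines that do NOT contain "free".
result=$(printf 'freebsd is Unix\nxubuntu is debian\n' | sed '/free/!s/i/I/g')
printf '%s\n' "$result"
```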

Replace some string only in specified lines and if line contains another string

We replace 'i' with 'I' in lines 1 to 3, and only if the line contains the string red:

~] sed '1,3{/red/s/i/I/g}' example.txt      
one Red-Hat is Linux
two red-hat Is lInux
(3) FreeBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian

We can also do multiple substitutions in one command.

For example, say to replace all 'i' to 'I', and 'e' to 'E':

~] sed 's/i/I/g; s/e/E/g' example.txt                     
onE REd-Hat Is LInux
two rEd-hat Is lInux
(3) FrEEBSD Is UnIx
"4" frEEbsd Is UnIx

#6# xubuntu Is dEbIan

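Or this can also be done with one -e option per expression:

~] sed -e 's/i/I/g' -e 's/e/E/g' example.txt
onE REd-Hat Is LInux
two rEd-hat Is lInux
(3) FrEEBSD Is UnIx
"4" frEEbsd Is UnIx

#6# xubuntu Is dEbIan

The -e option is used when you have more than one expression to apply. Every expression needs its own -e: as soon as one -e is present, sed treats every bare argument as an input file name, so an expression given without -e produces a "can't read ... No such file or directory" error.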
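
The expressions are applied to each line in the order they are given, and a later expression sees the result of the earlier ones. The sketch below shows this on a one-character input; forward and reverse are hypothetical variable names.

```shell
# 'a' becomes 'b', and that 'b' is then turned into 'c'.
forward=$(echo 'a' | sed -e 's/a/b/' -e 's/b/c/')

# Here s/b/c/ runs first and finds nothing, then 'a' becomes 'b'.
reverse=$(echo 'a' | sed -e 's/b/c/' -e 's/a/b/')

echo "$forward $reverse"   # c b
```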

Print the first 'N' characters of a line

The syntax for this substitution is:

sed -e 's/^\(.\{N\}\).*/\1/' file

Example: we want to print only the first 6 characters:

~] sed -e 's/^\(.\{6\}\).*/\1/' example.txt 
one Re
two re
(3) Fr
"4" fr

#6# xu

s/regexp/replacement/ - Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.

You can get the same result with the cut utility: cat file | cut -c 1-N

~] cat example.txt | cut -c 1-6
one Re
two re
(3) Fr
"4" fr

#6# xu
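
If N should come from a shell variable, one option is to put the sed script in double quotes so that the shell expands the variable before sed sees it. This is a sketch with a hypothetical variable n and a made-up two-line input; it also shows that a line shorter than N is left unchanged, because the pattern does not match it.

```shell
n=6

# Double quotes let the shell expand ${n}; the backslashes reach sed untouched.
first_n=$(printf 'one Red-Hat is Linux\nabc\n' | sed "s/^\(.\{${n}\}\).*/\1/")

printf '%s\n' "$first_n"
```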

Delete the first 'N' chars from every line

~] sed -e 's/^\(.\{6\}\)//' example.txt
d-Hat is Linux
d-hat is linux
eeBSD is Unix
eebsd is Unix

buntu is debian

Substitute first 'N' chars

~] sed -e 's/^\(.\{6\}\)/@/' example.txt 
@d-Hat is Linux
@d-hat is linux
@eeBSD is Unix
@eebsd is Unix

@buntu is debian

Delete last N characters

The syntax for this substitution is:

sed -e 's/.\{N\}$//g' file

Example: we want to delete the last 3 chars from every line:

~] sed -e 's/.\{3\}$//g' example.txt
one Red-Hat is Li
two red-hat is li
(3) FreeBSD is U
"4" freebsd is U

#6# xubuntu is deb
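
One possible use of this pattern is stripping a suffix of known length, such as a four-character file extension. The sketch below uses made-up strings and hypothetical variable names, and also shows that a line shorter than N is left as it is.

```shell
# Remove the last 4 characters (".txt") from a name.
stem=$(echo 'report.txt' | sed 's/.\{4\}$//')

# A line with fewer than 3 characters does not match and stays unchanged.
short=$(echo 'ab' | sed 's/.\{3\}$//')

echo "$stem"    # report
echo "$short"   # ab
```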

Print last N characters

~] sed -e 's/.*\(.\{3\}\)$/\1/g' example.txt
nux
nux
nix
nix

ian

How to print characters from N to M

~] cat example.txt
one Red-Hat is Linux
two red-hat is linux
(3) FreeBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian

# capture chars from position 3 to position 16
# skip the first 2 chars, then capture the next 14 chars
~] sed -e 's/^.\{2\}\(.\{14\}\).*/\1/' example.txt 
e Red-Hat is L
o red-hat is l
) FreeBSD is U
" freebsd is U

# xubuntu is d

# capture chars from position 3 to 6
# don't include first 2 chars and then capture next 4 chars
~] echo -e "123456789\n987654321" | sed -e 's/^.\{2\}\(.\{4\}\).*/\1/'
3456
7654
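
The same idea works for fixed-width records. In the sketch below, a made-up log line has a level field at positions 12 to 15, so the pattern skips 11 characters and captures the next 4; cut -c 12-15 is shown alongside for comparison. The variable names are hypothetical.

```shell
line='2021-08-18 INFO started'

# Skip the first 11 chars, capture the next 4, drop the rest.
level_sed=$(echo "$line" | sed 's/^.\{11\}\(.\{4\}\).*/\1/')

# cut selects the same character positions directly.
level_cut=$(echo "$line" | cut -c 12-15)

echo "$level_sed $level_cut"   # INFO INFO
```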

How to delete characters from N to M

~] echo -e "123456789\n987654321" | sed -e 's/\(.\{2\}\).\{4\}\(.*\)/\1\2/'
12789
98321

Extract a string between two strings

~] echo "Here is a StringBefore and here is StringAfter this strings" | sed -e 's/.*StringBefore\(.*\)StringAfter.*/\1/'
 and here is
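
Keep in mind that .* is greedy, which matters when the delimiters occur more than once on a line. The sketch below uses a made-up input with two bracketed values: the greedy form returns the last one, while a bracket expression that excludes the delimiter returns the first. The variable names are hypothetical.

```shell
line='key=[alpha] other=[beta]'

# The leading .* swallows as much as it can, so the LAST [...] pair is captured.
greedy=$(echo "$line" | sed 's/.*\[\(.*\)\].*/\1/')

# [^[]* and [^]]* cannot cross a bracket, so the FIRST [...] pair is captured.
first=$(echo "$line" | sed 's/[^[]*\[\([^]]*\)\].*/\1/')

echo "$greedy $first"   # beta alpha
```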

Replace only if the string is found in a certain context

Replace red with bar only if there is a linux string later on the same line:

~] sed -e 's/red\(.*linux\)/bar\1/g' example.txt   
one Red-Hat is Linux
two bar-hat is linux
(3) FreeBSD is Unix
"4" freebsd is Unix

#6# xubuntu is debian

In sed, \( and \) save whatever they enclose, and you can then refer to the saved text in the replacement with \1. In basic regular expressions the parentheses must be preceded by a \ character to act as a group.
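
With more than one group, \1, \2 and so on refer to the groups in the order of their opening parentheses. A minimal sketch that swaps two space-separated words (the input and the variable name swapped are made up for the example):

```shell
# Group 1 is the first word, group 2 the second; the replacement reverses them.
swapped=$(echo 'Linux Red-Hat' | sed 's/\([^ ]*\) \([^ ]*\)/\2 \1/')
echo "$swapped"   # Red-Hat Linux
```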