{{Short description|Standard UNIX utility for editing streams of data}}
{{About|the text processing utility}}
{{Infobox programming language
| name = sed
| paradigm = scripting
| released = {{start date and age|1974}}
| designer = Lee E. McMahon
| programming_language = C
| influenced by = ed
| influenced = Perl, AWK
| screenshot = Sed stream editor cropped1.jpg
| screenshot caption = An excerpt from GNU sed's man page
| website = hide
}}
{{lowercase|sed}}
'''sed''' ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs,<ref name=sed-faq-2.1>{{cite web
| url=http://sed.sourceforge.net/sedfaq2.html#s2.1
| title=The sed FAQ, Section 2.1
| access-date=2013-05-21
| archive-date=2018-06-27
| archive-url=https://web.archive.org/web/20180627160704/http://sed.sourceforge.net/sedfaq2.html#s2.1
| url-status=dead
}}</ref>
and is available today for most operating systems.<ref name=sed-faq-2.2>{{cite web
| url=http://sed.sourceforge.net/sedfaq2.html#s2.2
| title=The sed FAQ, Section 2.2
| access-date=2013-05-21
| archive-date=2018-06-27
| archive-url=https://web.archive.org/web/20180627160704/http://sed.sourceforge.net/sedfaq2.html#s2.2
| url-status=dead
}}</ref> sed was based on the scripting features of the interactive editor ed ("editor", 1971) and the earlier qed ("quick editor", 1965–66). It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.
==History==
First appearing in Version 7 Unix,{{r|reader}} sed is one of the early Unix commands built for command line processing of data files. It evolved as the natural successor to the popular grep command.<ref name=early_history>
{{cite web
| title = On the Early History and Impact of Unix
| url = http://www.columbia.edu/~rh120/ch001j.c11
| quote = "A while later a demand arose for another special-purpose program, gres, for substitution: g/re/s. Lee McMahon undertook to write it, and soon foresaw that there would be no end to the family: g/re/d, g/re/a, etc. As his concept developed it became sed…"
}}
</ref> The original motivation was an analogue of grep (g/re/p) for substitution, hence "g/re/s".<ref name="reader">{{cite tech report |first1=M. D. |last1=McIlroy |author-link1=Doug McIlroy |year=1987 |url=http://www.cs.dartmouth.edu/~doug/reader.pdf |title=A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 |series=CSTR |number=139 |institution=Bell Labs}}</ref> Foreseeing that further special-purpose programs for each command would also arise, such as g/re/d, McMahon wrote a general-purpose line-oriented stream editor, which became sed.<ref name=early_history /> The syntax for sed, notably the use of <code>/</code> for pattern matching and <code>s///</code> for substitution, originated with ed, the precursor to sed, which was in common use at the time,<ref name=early_history /> and the regular expression syntax has influenced other languages, notably ECMAScript and Perl. Later, the more powerful language AWK was developed, and the two functioned as cousins, allowing powerful text processing to be done by shell scripts. sed and AWK are often cited as progenitors and inspiration for Perl, and influenced Perl's syntax and semantics, notably in the matching and substitution operators.

GNU sed added several new features, including in-place editing of files. ''Super-sed'' is an extended version of sed that includes regular expressions compatible with Perl. Another variant of sed is ''minised'', originally reverse-engineered from 4.1BSD sed by Eric S. Raymond and currently maintained by René Rebe. minised was used by the GNU Project until the GNU Project wrote a new version of sed based on the new GNU regular expression library. The current minised contains some extensions to BSD sed but is not as feature-rich as GNU sed. Its advantage is that it is very fast and uses little memory. It is used on embedded systems and is the version of sed provided with Minix.<ref>{{Cite web |last1=Raymond |first1=Eric Steven |author-link1=Eric S. Raymond |last2=Rebe |first2=René |author-link2=René Rebe |date=2017-03-03 |title=tar-mirror/minised: A smaller, cheaper, faster SED implementation |url=https://github.com/tar-mirror/minised |url-status=live |archive-url=https://web.archive.org/web/20180613031040/https://github.com/tar-mirror/minised |archive-date=2018-06-13 |access-date=2024-05-20 |website=GitHub}}</ref>
==Mode of operation==
sed is a line-oriented text processing utility: it reads text, line by line, from an input stream or file, into an internal buffer called the ''pattern space''. Each line read starts a ''cycle''. To the pattern space, sed applies one or more operations which have been specified via a ''sed script''. sed implements a programming language with about 25 ''commands'' that specify the operations on the text. For each input line, after running the script, sed ordinarily outputs the pattern space (the line as modified by the script) and begins the cycle again with the next line. Other end-of-script behaviors are available through sed options and script commands, e.g. <code>d</code> to delete the pattern space, <code>q</code> to quit, <code>N</code> to add the next line to the pattern space immediately, and so on. Thus a sed script corresponds to the body of a loop that iterates through the lines of a stream, where the loop itself and the loop variable (the current line number) are implicit and maintained by sed.

The sed script can either be specified on the command line (<code>-e</code> option) or read from a separate file (<code>-f</code> option). Commands in the sed script may take an optional ''address'', in terms of line numbers or regular expressions. The address determines when the command is run. For example, <code>2d</code> would only run the <code>d</code> (delete) command on the second input line (printing all lines but the second), while <code>/^ /d</code> would delete all lines beginning with a space. A separate special buffer, the ''hold space'', may be used by a few sed commands to hold and accumulate text between cycles. sed's command language has only two variables (the "hold space" and the "pattern space") and GOTO-like branching functionality; nevertheless, the language is Turing-complete,<ref>{{cite web
| title = Implementation of a Turing Machine as Sed Script
| url = http://sed.sourceforge.net/grabbag/scripts/turing.txt
| access-date = 2003-04-24
| archive-date = 2018-02-20
| archive-url = https://web.archive.org/web/20180220011912/http://sed.sourceforge.net/grabbag/scripts/turing.txt
| url-status = dead
}}</ref><ref>{{cite web
| title = Turing.sed
| url = http://sed.sourceforge.net/grabbag/scripts/turing.sed
| access-date = 2003-04-24
| archive-date = 2018-01-16
| archive-url = https://web.archive.org/web/20180116201401/http://sed.sourceforge.net/grabbag/scripts/turing.sed
| url-status = dead
}}</ref> and esoteric sed scripts exist for games such as Sokoban, Arkanoid,<ref name="gamez">{{cite web
| url = http://sed.sourceforge.net/#gamez
| title = The $SED Home - gamez
}}
</ref> chess,<ref>{{cite web |title=bolknote/SedChess |url=https://github.com/bolknote/SedChess |access-date=August 23, 2013 |website=GitHub}}</ref> and Tetris.<ref name="tetris">
{{cite web |title=Sedtris, a Tetris game written for sed |url=https://github.com/uuner/sedtris |access-date=October 3, 2016 |website=GitHub}}</ref>

A main loop executes for each line of the input stream, evaluating the sed script on each line of the input. Lines of a sed script are each a pattern-action pair, indicating what pattern to match and which action to perform, which can be recast as a conditional statement. Because the main loop, working variables (pattern space and hold space), input and output streams, and default actions (copy line to pattern space, print pattern space) are implicit, it is possible to write terse one-liner programs. For example, the sed program given by:
 10q
will print the first 10 lines of input, then stop.
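The addressing, quitting, and hold-space behaviour described in this section can be tried with a short shell sketch. The three-line sample input and the variable names are invented for illustration, and the commands are assumed to run under a POSIX-style shell with any common sed.
<syntaxhighlight lang="bash">
# Hypothetical three-line sample input used by every command below.
input=$(printf 'one\n two\nthree')

# Line-number address: delete only the second line.
no_second=$(printf '%s\n' "$input" | sed '2d')

# Regular-expression address: delete lines that begin with a space.
no_indented=$(printf '%s\n' "$input" | sed '/^ /d')

# Quit after the first line (the "10q" idea with a smaller count).
first_only=$(printf '%s\n' "$input" | sed '1q')

# Hold space: on every line but the first, append the held text (1!G),
# save the result (h), and print only at the last line ($p).
# The effect is to print the input in reverse order.
reversed=$(printf '%s\n' "$input" | sed -n '1!G;h;$p')

printf '%s\n' "$no_second" "$no_indented" "$first_only" "$reversed"
</syntaxhighlight>
The last command illustrates why the hold space matters: it is the only state that survives from one cycle to the next, so any script that must remember earlier lines has to route them through it.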
==Usage==
===Substitution command===
The following example shows a typical, and the most common, use of sed: substitution. This usage was indeed the original motivation for sed:<ref name=early_history />
<syntaxhighlight lang=bash>
sed 's/regexp/replacement/g' inputFileName > outputFileName
</syntaxhighlight>
In some versions of sed, the expression must be preceded by <code>-e</code> to indicate that an expression follows. The <code>s</code> stands for substitute, while the <code>g</code> stands for global, which means that all matching occurrences in the line are replaced. The regular expression (i.e. pattern) to be searched for is placed after the first delimiting symbol (slash here) and the replacement follows the second symbol. Slash (<code>/</code>) is the conventional symbol, originating in the character for "search" in ed, but any other character can be used to make the syntax more readable, provided it does not occur in the pattern or replacement; this is useful to avoid "leaning toothpick syndrome".

The substitution command, which originates in search-and-replace in ed, implements simple parsing and templating. The <code>regexp</code> provides both pattern matching and saving text via sub-expressions, while the <code>replacement</code> can be either literal text, or a format string containing the character <code>&</code> for "entire match" or the special escape sequences <code>\1</code> through <code>\9</code> for the ''n''th saved sub-expression. For example, <code>sed -r "s/(cat|dog)s?/\1s/g"</code> replaces all occurrences of "cat" or "dog" with "cats" or "dogs", without duplicating an existing "s": <code>(cat|dog)</code> is the 1st (and only) saved sub-expression in the regexp, and <code>\1</code> in the format string substitutes this into the output.
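The pieces of the substitution command can be exercised one at a time. The following is a minimal sketch with invented sample strings and variable names; it assumes a sed that accepts <code>-E</code> for extended regular expressions, as current GNU and BSD versions do (older GNU releases spell it <code>-r</code>).
<syntaxhighlight lang="bash">
# Saved sub-expression: \1 re-inserts whatever (cat|dog) matched.
plural=$(echo 'cat dogs' | sed -E 's/(cat|dog)s?/\1s/g')    # cats dogs

# & stands for the entire matched text.
tagged=$(echo 'version 42' | sed 's/[0-9][0-9]*/<&>/')      # version <42>

# Any delimiter may replace the slash; | avoids escaping the path slashes.
moved=$(echo '/usr/local/bin' | sed 's|/usr/local|/opt|')   # /opt/bin

printf '%s\n' "$plural" "$tagged" "$moved"
</syntaxhighlight>
Written with slashes, the last command would need the form <code>s/\/usr\/local/\/opt/</code>, which is the "leaning toothpick" problem the alternative delimiter avoids.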
===Other sed commands===
Besides substitution, other forms of simple processing are possible, using some 25 sed commands. For example, the following uses the ''d'' command to filter out lines that are empty or contain only spaces:
<syntaxhighlight lang=bash>
sed '/^ *$/d' inputFileName
</syntaxhighlight>
This example uses some of the following regular expression metacharacters:
* The caret (<code>^</code>) matches the beginning of the line.
* The dollar sign (<code>$</code>) matches the end of the line.
* The asterisk (<code>*</code>) matches zero or more occurrences of the previous character.
* The plus (<code>+</code>) matches one or more occurrence(s) of the previous character.
* The question mark (<code>?</code>) matches zero or one occurrence of the previous character.
* The dot (<code>.</code>) matches exactly one character.
The plus and the question mark belong to extended regular expressions, so they require the <code>-E</code> option (<code>-r</code> in older GNU sed) or, in GNU sed, the escaped forms <code>\+</code> and <code>\?</code>.

Complex sed constructs are possible, allowing it to serve as a simple, but highly specialized, programming language. Flow of control, for example, can be managed by the use of a label (a colon followed by a string) and the branch instruction <code>b</code>, as well as the conditional branch <code>t</code>. An instruction <code>b</code> followed by a valid label name will move processing to the command following that label. The <code>t</code> instruction will only do so if there was a successful substitution since the previous <code>t</code> (or since the current input line was read, if no <code>t</code> has yet run in this cycle). Additionally, the <code>{</code> instruction starts a subsequence of commands (up to the matching <code>}</code>); in most cases, it will be conditioned by an address pattern.
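Deletion, command grouping, and branching can each be shown in a line or two. This is an illustrative sketch only: the inputs, the label name <code>again</code>, and the variable names are invented, and each command is passed as a separate <code>-e</code> fragment because implementations differ in how they parse labels and braces followed by semicolons.
<syntaxhighlight lang="bash">
# d with a regular-expression address: drop empty and space-only lines.
compact=$(printf 'one\n\n   \ntwo\n' | sed '/^ *$/d')

# { ... } groups commands under one address: for lines starting with "#",
# strip the marker and print; -n suppresses the default output.
titles=$(printf '# title\nbody\n' |
    sed -n -e '/^#/{' -e 's/^# *//' -e 'p' -e '}')

# Loop: replace one pair of spaces with a single space, then branch back
# to the label (t) for as long as the substitution keeps succeeding.
squeezed=$(echo 'a    b' | sed -e ':again' -e 's/  / /' -e 't again')

printf '%s\n' "$compact" "$titles" "$squeezed"
</syntaxhighlight>
The loop is only a demonstration of <code>t</code>; the same result is available without branching from <code>s/  */ /g</code>.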
===sed used as a filter===
Under Unix, sed is often used as a filter in a pipeline:
<syntaxhighlight lang="console">
$ generateData | sed 's/x/y/g'
</syntaxhighlight>
That is, a program such as "generateData" generates data, and then sed makes the small change of replacing ''x'' with ''y''. For example:
<syntaxhighlight lang="console">
$ echo xyz xyz | sed 's/x/y/g'
yyz yyz
</syntaxhighlight>
<ref group="notes" name="quotes">
In command line use, the quotes around the expression are not required, and are only necessary if the shell would otherwise not interpret the expression as a single word (token). For the script <code>s/x/y/g</code> there is no ambiguity, so <code>generateData | sed s/x/y/g</code> works correctly. However, quotes are usually included for clarity, and are often necessary, notably for whitespace (e.g., <code>'s/x x/y y/'</code>). Most often single quotes are used, to avoid having the shell interpret <code>$</code> as a shell variable. Double quotes are used, such as <code>"s/$1/$2/g"</code>, to allow the shell to substitute for a command line argument or other shell variable.
</ref>
===File-based sed scripts===
It is often useful to put several sed commands, one command per line, into a script file such as <code>subst.sed</code>, and then use the <code>-f</code> option to run the commands (such as <code>s/x/y/g</code>) from the file:
<syntaxhighlight lang=bash>
sed -f subst.sed inputFileName > outputFileName
</syntaxhighlight>
Any number of commands may be placed into the script file, and using a script file also avoids problems with shell escaping or substitutions.

Such a script file may be made directly executable from the command line by prepending it with a "shebang line" containing the sed command and assigning the executable permission to the file. For example, a file <code>subst.sed</code> can be created with contents:
<syntaxhighlight lang="sed">
#!/bin/sed -f
s/x/y/g
</syntaxhighlight>
The file may then be made executable by the current user with the <code>chmod</code> command:
<syntaxhighlight lang=bash>
chmod u+x subst.sed
</syntaxhighlight>
The file may then be executed directly from the command line (the <code>./</code> prefix is needed unless the script's directory is on the search path):
<syntaxhighlight lang=bash>
./subst.sed inputFileName > outputFileName
</syntaxhighlight>
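As a rough end-to-end sketch of the <code>-f</code> workflow, the following creates a throwaway script file and applies it to a sample line. The file name comes from <code>mktemp</code>, and the second substitution is invented purely to show that commands run in order, each on the result of the previous one.
<syntaxhighlight lang="bash">
# Create a temporary script file holding one command per line.
script=$(mktemp)
cat > "$script" <<'EOF'
# comments are allowed; no shell quoting is needed inside the file
s/x/y/g
s/yyz/done/
EOF

# Apply both commands, in order, to each input line.
result=$(echo 'xyz xyz' | sed -f "$script")
rm -f "$script"

echo "$result"    # done yyz
</syntaxhighlight>
The first command turns the line into <code>yyz yyz</code>; the second has no <code>g</code> flag, so it rewrites only the first <code>yyz</code>.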
===In-place editing===
The <code>-i</code> option, introduced in GNU sed, allows in-place editing of files (actually, a temporary output file is created in the background, and then the original file is replaced by the temporary file). For example:
<syntaxhighlight lang=bash>
sed -i 's/abc/def/' fileName
</syntaxhighlight>
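The spelling of <code>-i</code> differs between implementations (BSD sed, for instance, expects a backup-suffix argument), so scripts meant to run everywhere sometimes reproduce the temporary-file approach by hand. The helper below is a hypothetical sketch of that idea, not part of sed; the function name <code>sed_inplace</code> and the sample file are assumptions made for illustration.
<syntaxhighlight lang="bash">
# Hypothetical helper: apply a sed script to a file and replace the file
# with the result, using only a temporary file and mv.
sed_inplace() {
    script=$1
    file=$2
    tmp=$(mktemp) || return 1
    # Replace the original only if sed succeeded.
    if sed "$script" "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demonstration on a throwaway file.
demo=$(mktemp)
echo 'abc abc' > "$demo"
sed_inplace 's/abc/def/' "$demo"
edited=$(cat "$demo")
rm -f "$demo"

echo "$edited"    # def abc
</syntaxhighlight>
One caveat of this sketch is that the replacement file takes on the permissions of the temporary file rather than those of the original, so a real script would need to restore them.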
==Examples==
===Hello, world! example===
<syntaxhighlight lang="sed">
# convert input text stream to "Hello, world!"
s/.*/Hello, world!/
q
</syntaxhighlight>
This "Hello, world!" script is in a file (e.g., script.txt) and invoked with <code>sed -f script.txt inputFileName</code>, where "inputFileName" is the input text file. The script changes "inputFileName" line #1 to "Hello, world!" and then quits, printing the result before sed exits. Any input lines past line #1 are not read, and not printed. So the sole output is "Hello, world!".

The example emphasizes many key characteristics of sed:
* Typical sed programs are rather short and simple.
* sed scripts can have comments (the line starting with the <code>#</code> symbol).
* The <code>s</code> (substitute) command is the most important sed command.
* sed allows simple programming, with commands such as <code>q</code> (quit).
* sed uses regular expressions, such as <code>.*</code> (zero or more of any character).
===Other simple examples===
<!-- please avoid the use of GNU sedisms. These are non-portable and lead to bad code -->
Below follow various sed scripts; these can be executed by passing as an argument to sed, or put in a separate file and executed via <code>-f</code> or by making the script itself executable.
To replace every instance of a certain word in a file with "REDACTED", such as an IRC password, and save the result:
<syntaxhighlight lang="shell-session">
$ sed -i "s/yourpassword/REDACTED/g" ./status.chat.log
</syntaxhighlight>
To delete any line containing the word "yourword" (the ''address'' is '/yourword/'):
<syntaxhighlight lang="sed">
/yourword/ d
</syntaxhighlight>
To delete all instances of the word "yourword":
<syntaxhighlight lang="sed">
s/yourword//g
</syntaxhighlight>
To delete two words from a file simultaneously:
<syntaxhighlight lang="sed">
s/firstword//g
s/secondword//g
</syntaxhighlight>
To express the previous example on one line, such as when entering at the command line, one may join two commands via the semicolon:
<syntaxhighlight lang="shell-session">
$ sed "s/firstword//g; s/secondword//g" inputFileName
</syntaxhighlight>
===Multiline processing example===
In the next example, sed, which usually only works on one line, removes newlines from sentences where the second line starts with one space.

Consider the following text:
 This is my dog,
  whose name is Frank.
 This is my fish,
 whose name is George.
 This is my goat,
  whose name is Adam.
The sed script below will turn the text above into the following text. Note that the script affects only the input lines that start with a space:
 This is my dog, whose name is Frank.
 This is my fish,
 whose name is George.
 This is my goat, whose name is Adam.
The script is:
<syntaxhighlight lang="sed">
N
s/\n / /
P
D
</syntaxhighlight>
This is explained as:
* (<code>N</code>) add the next line to the pattern space;
* (<code>s/\n / /</code>) find a newline followed by a space, replace with one space;
* (<code>P</code>) print the top line of the pattern space;
* (<code>D</code>) delete the top line from the pattern space and run the script again.
This can be expressed on a single line via semicolons:
 sed '{{codett|2=sed|N; s/\n / /; P; D}}' inputFileName
==Limitations and alternatives==
While simple and limited, sed is sufficiently powerful for a large number of purposes. For more sophisticated processing, more powerful languages such as AWK or Perl are used instead. These are particularly used if transforming a line in a way more complicated than a regex extraction and template replacement, though arbitrarily complicated transforms are in principle possible by using the hold buffer.

Conversely, for simpler operations, specialized Unix utilities such as grep (print lines matching a pattern), head (print the first part of a file), tail (print the last part of a file), and tr (translate or delete characters) are often preferable. For the specific tasks they are designed to carry out, such specialized utilities are usually simpler, clearer, and faster than a more general solution such as sed.

The ed/sed commands and syntax continue to be used in descendant programs, such as the text editors vi and vim. An analog to ed/sed is sam/ssam, where sam is the Plan 9 editor, and ssam is a stream interface to it, yielding functionality similar to sed.
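To make the comparison with the specialized utilities concrete, the sketch below runs each tool next to a sed command that produces the same output on a small sample. The sample text and variable names are invented, and the pairings hold for these simple cases rather than as general replacements.
<syntaxhighlight lang="bash">
# Hypothetical three-line sample input.
sample=$(printf 'alpha\nbeta\ngamma')

# grep '^b'     : print lines matching a pattern
grep_like=$(printf '%s\n' "$sample" | sed -n '/^b/p')

# head -n 2     : print the first two lines
head_like=$(printf '%s\n' "$sample" | sed '2q')

# tail -n 1     : print the last line
tail_like=$(printf '%s\n' "$sample" | sed -n '$p')

# tr 'abg' 'ABG': translate characters one for one (sed's y command)
tr_like=$(printf '%s\n' "$sample" | sed 'y/abg/ABG/')

printf '%s\n' "$grep_like" "$head_like" "$tail_like" "$tr_like"
</syntaxhighlight>
In each pair the dedicated utility states the intent more directly, which is the readability argument made above.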
== Notes ==
{{Reflist|group=notes}}
==References==
{{Reflist}}
==Further reading==
* {{cite book|title=sed & awk|edition=2nd|date=March 1997|url=http://shop.oreilly.com/product/9781565922259.do|author=Dale Dougherty & Arnold Robbins|publisher=O'Reilly|isbn=1-56592-225-5}}
* {{cite book|title=sed and awk Pocket Reference|edition=2nd|date=June 2002|url=http://shop.oreilly.com/product/9780596003524.do|author=Arnold Robbins|publisher=O'Reilly|isbn=0-596-00352-8}}
* {{cite book|title=UNIX AWK and SED Programmer's Interactive Workbook (UNIX Interactive Workbook)|author=Peter Patsis|publisher=Prentice Hall|date=December 1998|isbn=0-13-082675-8}}
* {{cite book|title=Definitive Guide to sed|url=https://www.ehdp.com/press/sed-book/|date=February 2013|author=Daniel Goldman|publisher=EHDP Press|isbn=978-1-939824-00-4}}
* , the sed FAQ (March, 2003)
==External links==
{{Wikibooks|Sed}}
* {{man|cu|sed|SUS}}
* {{man|1|sed|Plan 9}}
* , by Bruce Barnett
* {{cite web|url=https://www.gnu.org/software/sed/ |title=GNU sed homepage}} (includes manual)
* {{cite web|url=https://www.pement.org/sed/ |title=sed the Stream Editor |year=2004 |author=Eric Pement}}
* {{cite web|url=https://exactcode.com/opensource/minised/ |title=minised sed implementation |author=René Rebe |website=ExactCODE}}
{{Unix commands}}
{{Plan 9 commands}}
{{Authority control}}
Latest revision as of 13:02, 29 November 2024
Standard UNIX utility for editing streams of data This article is about the text processing utility. For other uses, see Sed (disambiguation).An excerpt from GNU sed's man page | |
Paradigm | scripting |
---|---|
Designed by | Lee E. McMahon |
First appeared | 1974; 51 years ago (1974) |
Implementation language | C |
Influenced by | |
ed | |
Influenced | |
Perl, AWK |
sed ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed was based on the scripting features of the interactive editor ed ("editor", 1971) and the earlier qed ("quick editor", 1965–66). It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.
History
First appearing in Version 7 Unix, sed is one of the early Unix commands built for command line processing of data files. It evolved as the natural successor to the popular grep command. The original motivation was an analogue of grep (g/re/p) for substitution, hence "g/re/s". Foreseeing that further special-purpose programs for each command would also arise, such as g/re/d, McMahon wrote a general-purpose line-oriented stream editor, which became sed. The syntax for sed, notably the use of /
for pattern matching, and s///
for substitution, originated with ed, the precursor to sed, which was in common use at the time, and the regular expression syntax has influenced other languages, notably ECMAScript and Perl. Later, the more powerful language AWK developed, and these functioned as cousins, allowing powerful text processing to be done by shell scripts. sed and AWK are often cited as progenitors and inspiration for Perl, and influenced Perl's syntax and semantics, notably in the matching and substitution operators.
GNU sed added several new features, including in-place editing of files. Super-sed is an extended version of sed that includes regular expressions compatible with Perl. Another variant of sed is minised, originally reverse-engineered from 4.1BSD sed by Eric S. Raymond and currently maintained by René Rebe. minised was used by the GNU Project until the GNU Project wrote a new version of sed based on the new GNU regular expression library. The current minised contains some extensions to BSD sed but is not as feature-rich as GNU sed. Its advantage is that it is very fast and uses little memory. It is used on embedded systems and is the version of sed provided with Minix.
Mode of operation
sed is a line-oriented text processing utility: it reads text, line by line, from an input stream or file, into an internal buffer called the pattern space. Each line read starts a cycle. To the pattern space, sed applies one or more operations which have been specified via a sed script. sed implements a programming language with about 25 commands that specify the operations on the text. For each input line, after running the script, sed ordinarily outputs the pattern space (the line as modified by the script) and begins the cycle again with the next line. Other end-of-script behaviors are available through sed options and script commands, e.g. d
to delete the pattern space, q
to quit, N
to add the next line to the pattern space immediately, and so on. Thus a sed script corresponds to the body of a loop that iterates through the lines of a stream, where the loop itself and the loop variable (the current line number) are implicit and maintained by sed.
The sed script can either be specified on the command line (-e
option) or read from a separate file (-f
option). Commands in the sed script may take an optional address, in terms of line numbers or regular expressions. The address determines when the command is run. For example, 2d
would only run the d
(delete) command on the second input line (printing all lines but the second), while /^ /d
would delete all lines beginning with a space. A separate special buffer, the hold space, may be used by a few sed commands to hold and accumulate text between cycles. sed's command language has only two variables (the "hold space" and the "pattern space") and GOTO-like branching functionality; nevertheless, the language is Turing-complete, and esoteric sed scripts exist for games such as sokoban, arkanoid, chess, and tetris.
A main loop executes for each line of the input stream, evaluating the sed script on each line of the input. Lines of a sed script are each a pattern-action pair, indicating what pattern to match and which action to perform, which can be recast as a conditional statement. Because the main loop, working variables (pattern space and hold space), input and output streams, and default actions (copy line to pattern space, print pattern space) are implicit, it is possible to write terse one-liner programs. For example, the sed program given by:
10q
will print the first 10 lines of input, then stop.
Usage
Substitution command
The following example shows a typical, and the most common, use of sed: substitution. This usage was indeed the original motivation for sed:
sed 's/regexp/replacement/g' inputFileName > outputFileName
In some versions of sed, the expression must be preceded by -e
to indicate that an expression follows. The s
stands for substitute, while the g
stands for global, which means that all matching occurrences in the line would be replaced. The regular expression (i.e. pattern) to be searched is placed after the first delimiting symbol (slash here) and the replacement follows the second symbol. Slash (/
) is the conventional symbol, originating in the character for "search" in ed, but any other could be used to make syntax more readable if it does not occur in the pattern or replacement; this is useful to avoid "leaning toothpick syndrome".
The substitution command, which originates in search-and-replace in ed, implements simple parsing and templating. The regexp provides both pattern matching and saving text via sub-expressions, while the replacement can be either literal text, or a format string containing the characters & for "entire match" or the special escape sequences \1 through \9 for the nth saved sub-expression. For example, sed -r "s/(cat|dog)s?/\1s/g" replaces all occurrences of "cat" or "dog" with "cats" or "dogs", without duplicating an existing "s": (cat|dog) is the 1st (and only) saved sub-expression in the regexp, and \1 in the format string substitutes this into the output.
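The following sketch exercises & and the numbered escapes on sample input. It assumes a sed that accepts -E for extended regular expressions (GNU sed and the BSD seds do; -r in the text is the older GNU spelling). The input strings and variable names are illustrative.

```shell
# & stands for the entire match: wrap the first word in parentheses.
wrapped=$(echo 'hello world' | sed 's/[a-z]*/(&)/')
# \1 and \2 refer to the saved sub-expressions: swap two words.
swapped=$(echo 'hello world' | sed -E 's/([a-z]+) ([a-z]+)/\2 \1/')
# The cat/dog expression from the text, applied to a sample line.
plural=$(echo 'cat dogs' | sed -E 's/(cat|dog)s?/\1s/g')
echo "$wrapped"
echo "$swapped"
echo "$plural"
```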
Other sed commands
Besides substitution, other forms of simple processing are possible, using some 25 sed commands. For example, the following uses the d command to filter out lines that only contain spaces, or only contain the end of line character:
sed '/^ *$/d' inputFileName
This example uses some of the following regular expression metacharacters (sed supports the full range of regular expressions):
- The caret (^) matches the beginning of the line.
- The dollar sign ($) matches the end of the line.
- The asterisk (*) matches zero or more occurrences of the previous character.
- The plus (+) matches one or more occurrence(s) of the previous character.
- The question mark (?) matches zero or one occurrence of the previous character.
- The dot (.) matches exactly one character.
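A short sketch of these metacharacters in use, on made-up sample input. Note one assumption: the unescaped + and ? forms belong to extended regular expressions, so the second command passes -E, which GNU sed and the BSD seds accept.

```shell
# Sample input: a text line, an empty line, a spaces-only line, a text line.
# /^ *$/ matches lines that consist of zero or more spaces and nothing else.
kept=$(printf 'first\n\n   \nsecond\n' | sed '/^ *$/d')
# With -E, ? is available unescaped: "colou?r" matches both spellings.
spelling=$(printf 'color\ncolour\n' | sed -E 's/colou?r/X/')
printf '%s\n' "$kept"
printf '%s\n' "$spelling"
```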
Complex sed constructs are possible, allowing it to serve as a simple, but highly specialized, programming language. Flow of control, for example, can be managed by the use of a label (a colon followed by a string) and the branch instruction b, as well as the conditional branch t. An instruction b followed by a valid label name will move processing to the command following that label. The t instruction will only do so if there was a successful substitution since the previous t (or the start of the program, in case of the first t encountered). Additionally, the { instruction starts a subsequence of commands (up to the matching }); in most cases, it will be conditioned by an address pattern.
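The label, conditional branch, and command block described above can be sketched as follows. The thousands-separator task, the label name `a`, and the sample inputs are assumptions picked for illustration; each command is given in its own -e argument so that the label ends cleanly, and -E is assumed to be supported.

```shell
# Loop: insert thousands separators. The label "a" marks the loop start;
# "ta" branches back to it only if the preceding s command made a change.
grouped=$(echo 1234567 | sed -E -e ':a' -e 's/^([0-9]+)([0-9]{3})/\1,\2/' -e 'ta')
# Block: the commands inside { } run only on lines selected by /^#/.
commented=$(printf '# foo\nfoo\n' | sed '/^#/{s/foo/bar/;}')
echo "$grouped"
printf '%s\n' "$commented"
```

The loop ends once the substitution no longer matches, at which point t falls through and the line is printed.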
sed used as a filter
Under Unix, sed is often used as a filter in a pipeline:
$ generateData | sed 's/x/y/g'
That is, a program such as "generateData" generates data, and then sed makes the small change of replacing x with y. For example:
$ echo xyz xyz | sed 's/x/y/g'
yyz yyz
File-based sed scripts
It is often useful to put several sed commands, one command per line, into a script file such as subst.sed, and then use the -f option to run the commands (such as s/x/y/g) from the file:
sed -f subst.sed inputFileName > outputFileName
Any number of commands may be placed into the script file, and using a script file also avoids problems with shell escaping or substitutions.
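A minimal sketch of the script-file workflow, assuming a temporary directory so that no existing file is overwritten. The second command (s/z/w/g), the sample input, and the variable names are hypothetical additions for the example.

```shell
# Work in a temporary directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
# Write a two-command script file, one command per line.
printf '%s\n' 's/x/y/g' 's/z/w/g' > "$workdir/subst.sed"
# Run the commands from the file with -f.
result=$(echo 'xyz xyz' | sed -f "$workdir/subst.sed")
echo "$result"
rm -rf "$workdir"
```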
Such a script file may be made directly executable from the command line by prepending it with a "shebang line" containing the sed command and assigning the executable permission to the file. For example, a file subst.sed can be created with contents:
#!/bin/sed -f
s/x/y/g
The file may then be made executable by the current user with the chmod command:
chmod u+x subst.sed
The file may then be executed directly from the command line:
./subst.sed inputFileName > outputFileName
In-place editing
The -i option, introduced in GNU sed, allows in-place editing of files (actually, a temporary output file is created in the background, and then the original file is replaced by the temporary file). For example:
sed -i 's/abc/def/' fileName
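Because in-place editing overwrites the file, it can be useful to keep a backup. The sketch below assumes a sed that accepts a backup suffix attached directly to -i (GNU sed and the BSD seds do); the temporary directory, file contents, and variable names are illustrative.

```shell
workdir=$(mktemp -d)
printf 'abc\nxyz\n' > "$workdir/fileName"
# -i.bak edits the file in place and keeps the original as fileName.bak.
sed -i.bak 's/abc/def/' "$workdir/fileName"
edited=$(cat "$workdir/fileName")
backup=$(cat "$workdir/fileName.bak")
printf '%s\n' "$edited"
rm -rf "$workdir"
```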
Examples
Hello, world! example
# convert input text stream to "Hello, world!"
s/.*/Hello, world!/
q
This "Hello, world!" script is in a file (e.g., script.txt) and invoked with sed -f script.txt inputFileName, where "inputFileName" is the input text file. The script changes "inputFileName" line #1 to "Hello, world!" and then quits, printing the result before sed exits. Any input lines past line #1 are not read, and not printed. So the sole output is "Hello, world!".
The example emphasizes many key characteristics of sed:
- Typical sed programs are rather short and simple.
- sed scripts can have comments (the line starting with the # symbol).
- The s (substitute) command is the most important sed command.
- sed allows simple programming, with commands such as q (quit).
- sed uses regular expressions, such as .* (zero or more of any character).
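The same two commands can also be passed on the command line rather than through a script file. This sketch assumes two lines of sample input to show that only the first is ever processed; the variable name `hello` is illustrative.

```shell
# The first line is rewritten, then q prints it and quits;
# the second input line is never processed.
hello=$(printf 'first line\nsecond line\n' | sed -e 's/.*/Hello, world!/' -e 'q')
echo "$hello"
```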
Other simple examples
Below follow various sed scripts; these can be executed by passing them as an argument to sed, or put in a separate file and executed via -f or by making the script itself executable.
To replace any instance of a certain word in a file with "REDACTED", such as an IRC password, and save the result:
$ sed -i "s/yourpassword/REDACTED/g" ./status.chat.log
To delete any line containing the word "yourword" (the address is '/yourword/'):
/yourword/ d
To delete all instances of the word "yourword":
s/yourword//g
To delete two words from a file simultaneously:
s/firstword//g
s/secondword//g
To express the previous example on one line, such as when entering at the command line, one may join two commands via the semicolon:
$ sed "s/firstword//g; s/secondword//g" inputFileName
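The difference between deleting whole lines and deleting only the matched words can be sketched on a few made-up lines (the sample text and variable names are assumptions for the example):

```shell
sample='keep this line
drop yourword line
firstword and secondword here'
# d with an address removes every line that matches.
without_lines=$(printf '%s\n' "$sample" | sed '/yourword/d')
# s with an empty replacement removes only the matched text.
without_words=$(printf '%s\n' "$sample" | sed 's/firstword//g; s/secondword//g')
printf '%s\n' "$without_lines"
printf '%s\n' "$without_words"
```

Note that removing a word this way leaves the surrounding spaces in place.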
Multiline processing example
In the next example, sed, which usually only works on one line, removes newlines from sentences where the second line starts with one space. Consider the following text:
This is my dog,
 whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
 whose name is Adam.
The sed script below will turn the text above into the following text. Note that the script affects only the input lines that start with a space:
This is my dog, whose name is Frank.
This is my fish,
whose name is George.
This is my goat, whose name is Adam.
The script is:
N
s/\n / /
P
D
This is explained as:
- (N) add the next line to the pattern space;
- (s/\n / /) find a new line followed by a space, replace with one space;
- (P) print the top line of the pattern space;
- (D) delete the top line from the pattern space and run the script again.
This can be expressed on a single line via semicolons:
sed 'N; s/\n / /; P; D' inputFileName
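A runnable sketch of the one-line form, with the sample text supplied inline instead of from inputFileName. It assumes, as the description states, that the continuation lines for the dog and the goat begin with exactly one space and the line about George does not; the variable names are illustrative.

```shell
input='This is my dog,
 whose name is Frank.
This is my fish,
whose name is George.
This is my goat,
 whose name is Adam.'
# N appends the next line, s joins it if it starts with a space,
# P prints the first line of the pattern space, D removes it and restarts.
joined=$(printf '%s\n' "$input" | sed 'N; s/\n / /; P; D')
printf '%s\n' "$joined"
```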
Limitations and alternatives
While simple and limited, sed is sufficiently powerful for a large number of purposes. For more sophisticated processing, more powerful languages such as AWK or Perl are used instead. They are preferred in particular when transforming a line requires more than regex extraction and template replacement, though arbitrarily complicated transforms are in principle possible by using the hold space.
Conversely, for simpler operations, specialized Unix utilities such as grep (print lines matching a pattern), head (print the first part of a file), tail (print the last part of a file), and tr (translate or delete characters) are often preferable. For the specific tasks they are designed to carry out, such specialized utilities are usually simpler, clearer, and faster than a more general solution such as sed.
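To make the comparison concrete, the sketch below pairs each specialized utility with a sed command that produces the same output on a small made-up input. The sed equivalents are commonly used idioms, not commands taken from this article, and the variable names are assumptions.

```shell
data='apple
banana
cherry'
# grep: print lines matching a pattern.
via_grep=$(printf '%s\n' "$data" | grep 'an')
via_sed_p=$(printf '%s\n' "$data" | sed -n '/an/p')
# head: print the first two lines.
via_head=$(printf '%s\n' "$data" | head -n 2)
via_sed_q=$(printf '%s\n' "$data" | sed 2q)
# tail: print the last line.
via_tail=$(printf '%s\n' "$data" | tail -n 1)
via_sed_last=$(printf '%s\n' "$data" | sed -n '$p')
# tr: translate characters (sed's y command maps characters one-to-one).
via_tr=$(printf '%s\n' "$data" | tr 'abc' 'ABC')
via_sed_y=$(printf '%s\n' "$data" | sed 'y/abc/ABC/')
printf '%s\n' "$via_sed_p" "$via_sed_q" "$via_sed_last" "$via_sed_y"
```

In each pair the specialized utility states the intent more directly than the sed form.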
The ed/sed commands and syntax continue to be used in descendant programs, such as the text editors vi and vim. An analog to ed/sed is sam/ssam, where sam is the Plan 9 editor, and ssam is a stream interface to it, yielding functionality similar to sed.
Notes
- In command line use, the quotes around the expression are not required, and are only necessary if the shell would otherwise not interpret the expression as a single word (token). For the script s/x/y/g there is no ambiguity, so generateData | sed s/x/y/g works correctly. However, quotes are usually included for clarity, and are often necessary, notably for whitespace (e.g., 's/x x/y y/'). Most often single quotes are used, to avoid having the shell interpret $ as a shell variable. Double quotes are used, such as "s/$1/$2/g", to allow the shell to substitute for a command line argument or other shell variable.
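The three quoting styles in this note can be sketched as follows. The shell variables `old` and `new` and the sample input are hypothetical names chosen for the example.

```shell
old=x
new=y
# No quotes: the expression is a single shell word, so none are needed.
unquoted=$(echo 'xyz xyz' | sed s/x/y/g)
# Double quotes: the shell expands $old and $new before sed runs.
expanded=$(echo 'xyz xyz' | sed "s/$old/$new/g")
# Single quotes: the shell leaves $ alone, so sed sees the end-of-line anchor.
anchored=$(echo 'xyz xyz' | sed 's/z$/!/')
echo "$unquoted"
echo "$expanded"
echo "$anchored"
```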
References
- "The sed FAQ, Section 2.1". Archived from the original on 2018-06-27. Retrieved 2013-05-21.
- "The sed FAQ, Section 2.2". Archived from the original on 2018-06-27. Retrieved 2013-05-21.
- McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
- "On the Early History and Impact of Unix". A while later a demand arose for another special-purpose program, gres, for substitution: g/re/s. Lee McMahon undertook to write it, and soon foresaw that there would be no end to the family: g/re/d, g/re/a, etc. As his concept developed it became sed…
- Raymond, Eric Steven; Rebe, René (2017-03-03). "tar-mirror/minised: A smaller, cheaper, faster SED implementation". GitHub. Archived from the original on 2018-06-13. Retrieved 2024-05-20.
- "Implementation of a Turing Machine as Sed Script". Archived from the original on 2018-02-20. Retrieved 2003-04-24.
- "Turing.sed". Archived from the original on 2018-01-16. Retrieved 2003-04-24.
- "The $SED Home - gamez".
- "bolknote/SedChess". GitHub. Retrieved August 23, 2013.
- "Sedtris, a Tetris game written for sed". GitHub. Retrieved October 3, 2016.
Further reading
- Bell Labs' Eighth Edition (circa 1985) Unix sed(1) manual page
- GNU sed documentation or the manual page
- Dale Dougherty & Arnold Robbins (March 1997). sed & awk (2nd ed.). O'Reilly. ISBN 1-56592-225-5.
- Arnold Robbins (June 2002). sed and awk Pocket Reference (2nd ed.). O'Reilly. ISBN 0-596-00352-8.
- Peter Patsis (December 1998). UNIX AWK and SED Programmer's Interactive Workbook (UNIX Interactive Workbook). Prentice Hall. ISBN 0-13-082675-8.
- Daniel Goldman (February 2013). Definitive Guide to sed. EHDP Press. ISBN 978-1-939824-00-4.
- Sourceforge.net, the sed FAQ (March, 2003)
External links
- sed – Shell and Utilities Reference, The Single UNIX Specification, Version 4 from The Open Group
- sed(1) – Plan 9 Programmer's Manual, Volume 1
- Sed - An Introduction and Tutorial, by Bruce Barnett
- "GNU sed homepage". (includes manual)
- Eric Pement (2004). "sed the Stream Editor".
- Eric S. Raymond. "minised sed implementation". ExactCODE.