Delete lines between patterns with sed
sed is a powerful tool that can be tricky to master, but it rose to the occasion tonight when I needed to clean up several old HTML pages. The JavaScript I wanted to remove always sat between two known HTML comments, so a multi-line delete seemed like the perfect fit.
Given some HTML that looked like this:
<div id="footer">
<!-- Start of StatCounter Code -->
<script type="text/javascript">
var sc_project=000000;
var sc_invisible=1;
var sc_partition=9;
var sc_security="00000000";
var sc_text=2;
</script>
<script type="text/javascript" src="https://www.statcounter.com/counter/counter.js"></script>
<!-- End of StatCounter Code -->
<p><img src="./validxhtml10.png" alt="Valid XHTML 1.0" /></p>
</div>
The following sed script did exactly what I wanted: it leaves the footer <div>
intact but removes the two comments and all of the code between them.
# kill-statcounter.sed
# remove all lines between 'Start' and 'End' inclusive
/Start of StatCounter/ {
    :loop
    # append the next input line to the pattern space
    N
    # if the pattern space now contains the end marker, delete all of it
    # AND start the next cycle, which leaves the loop
    /End of StatCounter/d
    # if we get here the end marker has not been seen yet, so keep looking
    b loop
}
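To try this out, the script can be saved to a file and run with sed -f. The sketch below does that against a trimmed copy of the footer above and, for comparison, also runs the same job as a one-line range address (/start/,/end/d), which is standard sed syntax but not the approach the script takes. The variable names and the temporary file are assumptions made for the demo only.

```shell
# Sample input modeled on the footer above, trimmed for brevity.
html='<div id="footer">
<!-- Start of StatCounter Code -->
<script type="text/javascript">
var sc_project=000000;
</script>
<!-- End of StatCounter Code -->
<p><img src="./validxhtml10.png" alt="Valid XHTML 1.0" /></p>
</div>'

# Write the loop-based script to a temporary file (path chosen by mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
/Start of StatCounter/ {
:loop
N
/End of StatCounter/d
b loop
}
EOF

# Run the script file against the sample input.
loop_result=$(printf '%s\n' "$html" | sed -f "$script")

# Range address: d applies to every line from the first match
# through the second match, inclusive.
range_result=$(printf '%s\n' "$html" | sed '/Start of StatCounter/,/End of StatCounter/d')

rm -f "$script"
printf '%s\n' "$loop_result"
```

Both forms should leave only the <div>, the <p> line, and the closing </div>. They can differ when the end comment is missing: the range form deletes through the end of the file, while the loop version runs N until input is exhausted, and what happens to the accumulated lines at that point depends on the sed implementation.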