As the URL (but not, sadly, the website itself) suggests, you're exhausting yourself with all that negativity; you can remove two '-'s (as well as two ' 's), and get a much tastier command line:
The 'e' character is seen as an argument to the '-i' switch, meaning this command line will yield `Can't open perl script "s/foo/bar/gi": No such file or directory`
They're functionally interchangeable for simple usage (minus some perl-specific stuff like PCRE).
As I note below, escape behavior is a little different because sed wants you to escape +'s to have the normal regex semantics ("one or more matches"). And I actually think Perl is correct here and you should only need to escape those characters if you want literal matches, but I have a weird environment (Cygwin) and it's possible the sed build there is a little messed up.
The major difference for me is that Perl can match across multiple lines using the -0777 flag. I've been doing a lot of regex-based mass manipulation of source code lately and most people write functions across multiple lines. You can't do that with sed without multi-line appending and it gets really ugly really fast. Sed is pretty much just single line matches only.
For example, I had 100-odd classes with getters for certain values but not setters. So I did:
grep -rle "getAddTime" . | while IFS= read -r file; do
  # Only touch files that have the getter but no setter yet.
  if ! grep -q "setAddTime" "$file"; then
    echo "$file"
    perl -i -0777 -pe 's/public\s+Date\s+getAddTime\s*\(\s*\)\s*\{[\s\w=;]+\}/$&\n\n public void setAddTime(Date addTime) { this.addTime = addTime; }/' "$file"
  fi
done
Translation: look for files that contain getAddTime; if they do not contain setAddTime, then find the string "public Date getAddTime() {...}" and append the setter after it. There are a few edge cases you could hit there, but it was close enough to work on my codebase.
I wish Perl would do an in-place edit of a file without creating a backup, though. I am under source control so there's no harm in just operating right on the files. It's not the end of the world to follow up with a find . -name '*.bak' -delete I guess, but it's annoying. At least they're in my .gitignore, which helps a little.
Fun factoid most have forgotten: regex is perl. The beginnings are elsewhere but regex as we know it was designed as part of the language and the engine was pulled out and reused when people found how useful it was.
Perl's regex engine is still far ahead of what more modern languages offer, supporting, among other things, code execution inside the pattern itself (the (?{ ... }) construct). If I remember right, the Perl regex engine is actually Turing complete.
I know you're not trying to say that regular expressions were created as part of Perl, but I think you're giving a bit too much credit to it[1] regarding regexes.
The PCRE library is indeed used all over. And Perl was, I think, the first first-class scripting language that integrated regexes so closely to control structures and other language features in a way that feels truly natural.
There are still a lot of tools out there that use other regex libraries. Don't have it in front of me, but there's a lovely chart in the book _Mastering Regular Expressions_[2] that breaks out regular expression library use by tool. But, generally, I think the diversity of regex libraries actually causes problems for adoption these days, because people who are tempted to use them (and thus learn more) tend to run into other tools where the things they've learned mysteriously don't work anymore, which scares them off.
Anyway, regular expressions in the wild go back to Unix v.4, which included Ken Thompson's grep.
[1] Perl deserves a ton of credit it doesn't get in general, including credit for giving the world PCREs.
[2] In general, if you work with regexes a lot and don't own this book, you're doing yourself a disservice. It is one of my top-10 technical books, not just for density of actionable information, but also for the pure general excellence.
Have a look at grammars in Perl6 and the new regexen. Light years ahead of anything else. Perl6 also does numeric division properly and, if I'm not mistaken, eliminates NPEs so what's not to like?
What you describe happened far earlier. As far as I understand it, regexps were originally a part of ed (having been derived from QED), the original Unix text editor. Its “g” command with a “p” flag, or “g/re/p”, for globally searching for a regexp and printing the matching lines, was later found so useful that it was implemented into a separate utility, “grep”. Many Unix utilities started using regular expressions from then on, including Perl.
Maybe I was a bit too excited about the Perl part. Perl perfected regex, and the Perl regex engine was integrated into other languages until it became a normal language feature.
Regex as we know it was largely a result of the adoption of Perl and the flexibility of its regex engine.
PCRE is a nice library. I read (on its site, IIRC) that it is used for the regex support in PHP and some other languages.
I once worked - as part of new product work in an enterprise company - on building the PCRE library as an object file on multiple Unixes from different vendors (like IBM AIX, HP-UX, Solaris, DEC Ultrix, etc.) and also on Windows (including on both 32-bit and 64-bit variants of some of those OSes), using the C compiler toolchain on each of those platforms. I was a bit surprised to see the amount of variation in the toolchain commands and flags (command-line options) across the tools on all those Unixes. But on further thought, knowing about the Unix wars [1] and configure [2], maybe I should not have been surprised.
> If no extension is supplied, and your system supports it, the original file is kept open without a name while the output is redirected to a new file with the original filename. When perl exits, cleanly or not, the original file is unlinked.
Which system are you using? With macOS and Linux, I get no automatic .bak extension when not providing a backup suffix, i.e. it behaves like you want under these systems.
Update: Apparently, anonymous files are not supported by Windows: http://stackoverflow.com/a/30539016 which would explain the behaviour you describe.
The same as for anything that could be done in the shell, but is done with a full-fledged programming language: when you discover you need to tweak it to accommodate an additional requirement, it's easy in Perl but hard in sed.
If you feel unproductive or uncomfortable with Perl, you should avoid it and use whatever language strikes your fancy.
Many people however are extremely comfortable and productive with Perl, for a wide variety of reasons. It's really an awesome language that has proven its vast usefulness a long time ago.
Perl has the best regex support, or one of the best. Also, afaik, it is the only language other than Python that is installed by default in common distros.
None really. I would think that if you already use Perl, you would probably just reach for the Perl one-liner, is all. Also, I believe sed doesn't have the in-place option on Solaris (and AIX) by default. So with Perl, you can do it everywhere.
A lot of these Perl one-liners are from the old days, from Unix admins that had versions of sed (or find, etc.) that didn't have fancy features like -i (in-place edit).
Also in the old days tools like sed and awk had more arbitrary limitations. E.g., line length was limited to something (reasonable) like 8192 bytes but input data I was processing was not reasonable.
For me, one part laziness and one part flexibility, and the flexibility part I might be rationalizing. :P. Perl's a superset of what can be done with sed, so I tend to use perl for this case even when sed would do, just because I can remember the perl command line args and regex syntax off the top of my head. I often have to take a trip to the sed man page if I use sed. It's easier for me to add special corner cases to a perl one-liner, etc.
That said, sed is simpler, sometimes less to type, sometimes there on systems where perl isn't (though that seems uncommon these days). But I'm using sed more often, and the more I use it the more familiar it is. Nicer to have two good tools around than just one, right?
One big advantage is portability / compatibility. For example, Linux uses GNU sed but Mac OS X uses BSD sed. I've run into issues where a sed script works on one but not the other. Using perl in sed mode avoids this.
Perl's substitution and translation facilities work pretty much the same way as sed's; if you understand sed, you can just drop those operations into Perl. For example, sed -i 's/old/new/' file will replace the first 'old' on each line of file with 'new', in place; add the g flag (s/old/new/g) to replace every occurrence.
I find that you do escaping a little differently with sed though. For example you need to escape +'s for them to have their normal "one or more matches" semantics. Might just be an abnormality of my environment (Cygwin).
The -i switch takes an optional string that creates a backup of the file with that extension. To use your example (with zsh globbing):
perl -p -i.bak -e 's/foo/bar/gi' **/*.txt
This recursively replaces foo with bar (case-insensitively) in every .txt file in the hierarchy, editing each file in place and saving the original next to it with a .bak extension.
Awk is easier for me to remember than sed or perl, just because "sub" (for substitution) is in there:
awk '{gsub(/this/,"that")}; 1'
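A short sketch of how that behaves, assuming any POSIX awk; the input text and notes.txt are made up. gsub replaces every match on the line, sub only the first, and the trailing 1 is an always-true pattern that prints each line. Awk writes to standard output, so editing a file in place takes a temporary file.

```shell
#!/bin/sh
# gsub: every occurrence on each line is replaced.
gsub_out=$(printf 'this and this\nother\n' | awk '{gsub(/this/,"that")}; 1')

# sub: only the first occurrence on each line is replaced.
sub_out=$(printf 'this and this\n' | awk '{sub(/this/,"that")}; 1')

# In-place edit via a temporary file (hypothetical file name notes.txt).
tmp=$(mktemp -d)
printf 'this\n' > "$tmp/notes.txt"
awk '{gsub(/this/,"that")}; 1' "$tmp/notes.txt" > "$tmp/notes.txt.tmp" &&
  mv "$tmp/notes.txt.tmp" "$tmp/notes.txt"
file_out=$(cat "$tmp/notes.txt")

printf 'gsub:\n%s\n\n' "$gsub_out"
printf 'sub:\n%s\n\n' "$sub_out"
printf 'file:\n%s\n' "$file_out"
rm -rf "$tmp"
```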
Edit: I just learned that "s" in the sed command is for substitute... (http://www.grymoire.com/Unix/Sed.html)
Well, maybe I'll be able to remember that.