My favorite Perl one-liner is a globbed search and replace. It even has its own...

JadeNB · on Nov 29, 2016

As the URL (but not, sadly, the website itself) suggests, you're exhausting yourself with all that negativity; you can remove two '-'s (as well as two ' 's), and get a much tastier command line:

    perl -pie 's/foo/bar/gi' ./*.txt

pre_action · on Nov 29, 2016

The 'e' character is seen as an argument to the '-i' switch, meaning this command line will yield `Can't open perl script "s/foo/bar/gi": No such file or directory`

JadeNB · on Nov 30, 2016

Sigh; I am not making much of a show of my Perl knowledge today. (I messed up a Perl find-alike elsethread.) Thanks for setting me straight.

jxy · on Nov 29, 2016

What is the advantage of this over sed?

paulmd · on Nov 29, 2016

They're functionally interchangeable for simple usage (minus some perl-specific stuff like PCRE).

As I note below, escape behavior is a little different because sed wants you to escape +'s to have the normal regex semantics ("one or more matches"). And I actually think Perl is correct here and you should only need to escape those characters if you want literal matches, but I have a weird environment (Cygwin) and it's possible the sed build there is a little messed up.

The major difference for me is that Perl can match across multiple lines using the -0777 flag. I've been doing a lot of regex-based mass manipulation of source code lately and most people write functions across multiple lines. You can't do that with sed without multi-line appending and it gets really ugly really fast. Sed is pretty much just single line matches only.

For example, I had 100-odd classes with getters for certain values but not setters. So I did:

    grep -rle "getAddTime" . | while read -r line; do
      # only touch files that have the getter but no setter yet
      if ! grep -q "setAddTime" "$line"; then
        echo "$line"
        perl -i -0777 -pe 's/public\s+Date\s+getAddTime\s*\(\s*\)\s*\{[\s\w=;]+\}/$&\n\n  public void setAddTime(Date addTime) { this.addTime = addTime; }/' "$line"
      fi
    done

Translation: look for files that contain getAddTime; if they do not contain setAddTime, find the string "public Date getAddTime() {...}" and append the setter after it. There are a few edge cases you could hit there but it was close enough to work on my codebase.

I wish Perl would do an in-place edit of a file without creating a backup, though. I am under source control so there's no harm in just operating right on the files. It's not the end of the world to follow up with a rm -r *.bak I guess, but it's annoying. At least they're in my .gitignore, which helps a little.

throwbsidbdk · on Nov 29, 2016

Fun factoid most have forgotten: regex is perl. The beginnings are elsewhere but regex as we know it was designed as part of the language and the engine was pulled out and reused when people found how useful it was.

Perl's regex parser is still far ahead of those in more modern languages in features, supporting, among other things, code execution within capture groups. If I remember right, the Perl regex parser is actually Turing complete.

__jal · on Nov 29, 2016

I know you're not trying to say that regular expressions were created as part of Perl, but I think you're giving a bit too much credit to it[1] regarding regexes.

The PCRE library is indeed used all over. And Perl was, I think, the first first-class scripting language that integrated regexes so closely to control structures and other language features in a way that feels truly natural.

There are still a lot of tools out there that use other regex libraries. Don't have it in front of me, but there's a lovely chart in the book _Mastering Regular Expressions_[2] that breaks out regular expression library use by tool. But, generally, I think the diversity of regex libraries actually causes problems for adoption these days, because people who are tempted to use them (and thus learn more) tend to run into other tools where the things they've learned mysteriously don't work anymore, which scares them off.

Anyway, regular expressions in the wild go back to Unix v.4, which included Ken Thompson's grep.

[1] Perl deserves a ton of credit it doesn't get in general, including credit for giving the world PCREs.

[2] In general, if you work with regexes a lot and don't own this book, you're doing yourself a disservice. It is one of my top-10 technical books, not just for density of actionable information, but also for the pure general excellence.

cutler · on Nov 29, 2016

Have a look at grammars in Perl6 and the new regexen. Light years ahead of anything else. Perl6 also does numeric division properly and, if I'm not mistaken, eliminates NPEs so what's not to like?

teddyh · on Nov 30, 2016

What you describe happened far earlier. As far as I understand it, regexps were originally a part of ed (having been derived from QED), the original Unix text editor. Its “g” command with a “p” flag, or “g/re/p”, for globally searching for a regexp and printing the matching lines, was later found so useful that it was implemented into a separate utility, “grep”. Many Unix utilities started using regular expressions from then on, including Perl.

throwbsidbdk · on Nov 30, 2016

Maybe I was a bit too excited about the Perl part. Perl perfected regex, and the Perl regex engine was integrated into other languages until it became a normal language feature.

Regex as we know it was largely a result of the adoption of Perl and the flexibility of its regex engine.

TazeTSchnitzel · on Nov 29, 2016

Thus the PCRE regex library, Perl-Compatible Regular Expressions, for instance.

theoh · on Nov 29, 2016

Note that this library is by Philip Hazel and did not originate in the Perl source code, but in Exim.

Regex facilities for text processing were first implemented by Ken Thompson, long before Perl.

On the topic of implementations, this is important: https://swtch.com/~rsc/regexp/regexp1.html

vram22 · on Nov 29, 2016

PCRE is a nice library. I read (on its site, IIRC) that it is used for the regex support in Python and some other languages.

I once worked - as part of new product work in an enterprise company - on building the PCRE library as an object file on multiple Unixes from different vendors (like IBM AIX, HP-UX, Solaris, DEC Ultrix, etc.) and also on Windows (including on both 32-bit and 64-bit variants of some of those OSes), using the C compiler toolchain on each of those platforms. I was a bit surprised to see the amount of variation in the toolchain commands and flags (command-line options) across the tools on all those Unixes. But on further thought, knowing about the Unix wars [1] and configure [2], maybe I should not have been surprised.

[1] https://en.wikipedia.org/wiki/Unix_wars

[2] https://en.wikipedia.org/wiki/Configure_script

harry8 · on Nov 30, 2016

grep == global regular expression print and has its origins in ed from pre-vi days if my memory of passed on lore is correct.

jsrn · on Nov 30, 2016

Regarding this: "I wish Perl would do an inplace edit of a file without creating a backup, though."

    $ perldoc perlrun

(or http://perldoc.perl.org/perlrun.html)

says:

> If no extension is supplied, and your system supports it, the original file is kept open without a name while the output is redirected to a new file with the original filename. When perl exits, cleanly or not, the original file is unlinked.

Which system are you using? With macOS and Linux, I get no automatic .bak extension when not providing a backup suffix, i.e. it behaves like you want under these systems.

Update: Apparently, anonymous files are not supported by Windows: http://stackoverflow.com/a/30539016 which would explain the behaviour you describe.

claystu · on Nov 30, 2016

"I wish Perl would do an inplace edit of a file without creating a backup, though."

It's been a while, but I think the -i command line switch accomplishes what you want here.

JadeNB · on Nov 29, 2016

> What is the advantage of this over sed?

The same as for anything that could be done in the shell, but is done with a full-fledged programming language: when you discover you need to tweak it to accommodate an additional requirement, it's easy in Perl but hard in sed.

nilved · on Nov 29, 2016

If that were the case, you'd think you'd reach for literally any language other than Perl.

aduitsis · on Nov 29, 2016

If you feel unproductive or uncomfortable with Perl, you should avoid it and use whatever language strikes your fancy.

Many people however are extremely comfortable and productive with Perl, for a wide variety of reasons. It's really an awesome language that has proven its vast usefulness a long time ago.

untoreh · on Nov 29, 2016

Perl has the best regex support, or one of the best. Also, apart from Python, it is the only language installed by default in common distros, afaik.

collyw · on Nov 29, 2016

Perl is usually my go-to language for that sort of task. It has loads of built-in features for interacting with the system.

dozzie · on Nov 29, 2016

> What is the advantage of [perl -pi -e 's///'] over sed?

PCRE and /e flag.

sigzero · on Nov 29, 2016

None really. I would think that if you use Perl, you would probably just use the Perl one-liner, is all. Also, sed doesn't have the in-place function on Solaris (and AIX) by default, I believe. So with Perl, you can do it everywhere.

tyingq · on Nov 29, 2016

Not much over GNU sed.

A lot of these perl one liners are from the old days, from unix admins that had versions of sed (or find, etc) that didn't have fancy features like -i (inplace edit).

mark-wagner · on Nov 29, 2016

Also in the old days tools like sed and awk had more arbitrary limitations. E.g., line length was limited to something (reasonable) like 8192 bytes but input data I was processing was not reasonable.

GNU sed does not have that limit: https://www.gnu.org/software/sed/manual/html_node/Limitation...

dahart · on Nov 30, 2016

For me, one part laziness and one part flexibility, and the flexibility part I might be rationalizing. :P. Perl's a superset of what can be done with sed, so I tend to use perl for this case even when sed would do, just because I can remember the perl command line args and regex syntax off the top of my head. I often have to take a trip to the sed man page if I use sed. It's easier for me to add special corner cases to a perl one-liner, etc.

That said, sed is simpler, sometimes less to type, sometimes there on systems where perl isn't (though that seems uncommon these days). But I'm using sed more often, and the more I use it the more familiar it is. Nicer to have two good tools around than just one, right?

electrum · on Nov 30, 2016

One big advantage is portability / compatibility. For example, Linux uses GNU sed but Mac OS X uses BSD sed. I've run into issues where a sed script works on one but not the other. Using perl in sed mode avoids this.

mitchty · on Nov 29, 2016

It works on non linux unixes.

swuecho · on Nov 29, 2016

for me, I do not know sed or awk.

paulmd · on Nov 29, 2016

Perl's substitution and translation facilities work pretty much the same way as sed's; if you understand sed, you can just drop those operations into Perl. For example, sed -i 's/old/new/g' file will do an in-place replacement of every occurrence of 'old' with 'new' (without the trailing g, only the first match on each line is replaced).

I find that you do escaping a little differently with sed though. For example you need to escape +'s for them to have their normal "one or more matches" semantics. Might just be an abnormality of my environment (Cygwin).

scrame · on Nov 29, 2016

The -i switch takes an optional string that creates a backup of the file with that extension. To use your example (with zsh globbing):

   perl -p -i.bak -e 's/foo/bar/gi' **/*.txt

Recursively replaces foo with bar on every .txt file in a hierarchy and saves the original file with a .bak extension, along with the in-place edit in the original.

nilved · on Nov 29, 2016

I try to avoid Perl when I can (not that sed is much better.)

    sed -i "s/foo/bar/gi" ./*.txt

shanemhansen · on Nov 29, 2016

I actually have trouble remembering perl pie, even though it's got a cute name. Usually I end up running something like this.

   git grep -l something | xargs sed -i.bak -e 's/something/else/g'

Because the longer chain is easier to remember for me.

sndean · on Nov 29, 2016

Awk is easier for me to remember than sed or perl, just because "sub" (for substitution) is in there:

    awk '{gsub(/this/,"that")}; 1'

Edit: I just learned that "s" in the sed command is for substitute... (http://www.grymoire.com/Unix/Sed.html) Well, maybe I'll be able to remember that.
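One caveat worth sketching: POSIX awk has no in-place switch, so to change a file you write to a temporary file and move it back. The file name `notes.txt` is made up for the demo.

```shell
# Hypothetical scratch file for the demo.
tmp=$(mktemp -d)
printf 'this and this\n' > "$tmp/notes.txt"

# gsub rewrites every match on the line; the trailing 1 prints each line.
# Only replace the original if awk succeeded.
awk '{gsub(/this/,"that")}; 1' "$tmp/notes.txt" > "$tmp/notes.txt.new" &&
  mv "$tmp/notes.txt.new" "$tmp/notes.txt"

cat "$tmp/notes.txt"
```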

nilved · on Nov 29, 2016

I think you'd like my `git sed` script.

https://github.com/devlinzed/dotfiles/blob/master/bin/git-se...