
FWIW yash is an impressive and mature project that seems to overlap with the goals of mrsh:

https://yash.osdn.jp/index.html.en

Yash, yet another shell, is a POSIX-compliant command line shell written in C99 (ISO/IEC 9899:1999). Yash is intended to be the most POSIX-compliant shell in the world while supporting features for daily interactive and scripting use.

I've installed the Ubuntu package and poked around at it a bit. Its source code seems well-written, and it even contains its own line editing library (i.e. the functionality of GNU readline).

It looks like it's still being developed, and magicant is the original author: https://github.com/magicant/yash

I mentioned it here (and just added a proper link): https://github.com/oilshell/oil/wiki/ExternalResources

I wrote some notes about POSIX here: http://www.oilshell.org/blog/2018/01/28.html#limit-to-posix



Thanks for sharing! A brief review of the home page shows that it is not compatible with the goals of mrsh, though. It seems Yash aims for high compatibility with POSIX, but adds extensions - mrsh is strictly POSIX, such that in some cases it even detects bashisms and aborts the shell. The goals of mrsh are:

- Support the proliferation of portable shell scripts and discourage the use of non-standard extensions

- Provide a "POSIX shell as a library" to have a useful standalone parser and pluggable event loop

- Provide a moderately comfortable interactive shell experience OOTB

Personally, I'm going to eventually use the second goal to make a new shell based on libmrsh which has a fish-like interactive experience but with a strict POSIX syntax.


On one hand I applaud, but on the other strict POSIX shell seems overly limited.

My personal grudges:

- no local variables (even dash implements them)

- no process substitution [0], so one has to use temp files, temporary named pipes, or possibly juggle fd numbers. I once wrote an external implementation of process substitution, so it's doable in POSIX shell, but then it's pointless.

- set -o pipefail

I understand that you know about those limitations and work around them. But isn't it too cumbersome when the Linux world has settled on bash and the BSDs on ksh? Especially since ksh is a bit like a subset of bash. At least the things I mentioned are common to both.

It's sad that shell makes it so hard to make more complicated pipelines than | the | old | classic. Fds, processes and pipelines should all be first class. Maybe that's the point to make it harder to make those complicated scripts in something other than shell. In other languages it may be easier to make correct pipelines, but with heavy boilerplate. Then it's probably easier to count on libraries instead of pipelines, even if those libraries spawn processes in background.

I have higher hopes for Oil shell, as it means to address all those issues and more, with an escape hatch to convert old scripts. I'm not so sure about the implementation details, though. It made more sense to me after I read that OVM itself would become a VM for Oil, so that there would be a single VM to run and not two as there are now. At least that's how I understand it.

POSIX shell is kinda magical, because it's nominally portable. However the rabbit seems to have died in the hat, because it's really that old. Shell was conceived in a different era with different limitations. I think that we should go further.

[0] take-in-and-out <(process-out) >(filter-this | grep that) | grep msg


>no local variables

They're not standard, so don't use them. Instead I often find a subshell to be sufficient. You can declare a function like so:

    my_func() (
        local_variable=10
        echo $local_variable
    )

This is also a good strategy for helping you make purer shell functions.
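
To make the scoping concrete, here is a minimal sketch (the names `counter` and `bump` are made up for illustration). The function body is a subshell, so the assignment inside it never reaches the caller:

```shell
#!/bin/sh
# Function body in ( ... ) instead of { ... }: it runs in a subshell,
# so assignments inside cannot leak into the calling shell.
counter=1

bump() (
    counter=$((counter + 41))   # changes only the subshell's copy
    echo "inside: $counter"
)

bump                       # called directly, not through $(...)
echo "outside: $counter"   # the caller's value is untouched
```

The trade-off is that each call costs a fork, and the function cannot set any of the caller's variables or change its working directory; `exit` inside it only leaves the subshell.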

>no process substitution, so one has to use temp files, temp named pipes or possibly juggle fd numbers

I don't find myself doing this often enough to care. It is possible, and if you're writing something which heavily relies on this perhaps you're better off with Go or Python.
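
For reference, a rough sketch of the named-pipe route, standing in for something like `comm -12 <(... | sort) <(... | sort)` in bash. The directory, file names, and sample data are invented for the example, and `mktemp` is assumed to be available:

```shell
#!/bin/sh
# Emulate two input process substitutions with named pipes (FIFOs).
dir=$(mktemp -d) || exit 1          # assumption: mktemp is installed
mkfifo "$dir/left" "$dir/right" || exit 1

# Each producer runs in the background and blocks until a reader opens its FIFO.
printf '%s\n' pear apple fig | sort > "$dir/left" &
printf '%s\n' fig kiwi apple | sort > "$dir/right" &

# comm reads both FIFOs as if they were regular sorted files.
common=$(comm -12 "$dir/left" "$dir/right")

wait                # reap the background producers
rm -rf "$dir"
echo "$common"
```

This works, but the setup and cleanup are exactly the bookkeeping that `<(...)` hides.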

>set -o pipefail

I agree, we've been talking to the Austin Group about getting this standardized.
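
In the meantime, one workaround sketch for shells without `set -o pipefail` is to record the producer's exit status in a temp file. The variable names and sample commands are illustrative only, and `mktemp` is assumed to be available:

```shell
#!/bin/sh
# Capture the exit status of the left side of a pipeline without pipefail.
status_file=$(mktemp) || exit 1     # assumption: mktemp is installed

# The brace group writes the producer's status before its end of the pipe closes.
{ sh -c 'echo data; exit 3'; echo "$?" > "$status_file"; } | cat > /dev/null
consumer_status=$?

read -r producer_status < "$status_file"
rm -f "$status_file"

echo "producer=$producer_status consumer=$consumer_status"
```

This relies on the consumer reading until end of input, as `cat` does. A consumer that exits early (such as `head -n 1`) could finish before the status file is written, so treat this as a sketch rather than a general replacement.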

I think we should go further, too, not by attempting to replace the shell with something that tries to do shell scripts better, but by making full-blown programming languages in which shell things are nicer to use.


Yeah making Oil more efficient is an open problem. I've been going back and forth on the whole OVM thing (long story). I think that's the right idea, but it's not clear how long it will take.

But I think the efficiency problem is easier than the problem of running big hairy unmodified shell scripts, which I view as essential to replacing bash.

That is, it's easier to take something that works and make it efficient than to be in the position of thinking you have 60% of a shell and really having 20% of one :-/ That was the case for a while, but since OSH can run many real programs, I'm pretty confident in its feature set.


In my experience, the parser definitely needs to be a library, but I'm not sure about the runtime.

There are a few places in POSIX where the parser has to be invoked recursively:

- command sub: $() and ``

- eval

- alias expansion. (I found some divergence in how shells implement this, but it does involve the parser.)

In Oil I also used the parser as a library in several other places:

- For interactive completion. Bash does not do this, and I don't believe any other POSIX-ish shell does. I wrote a bit about that in the latest blog post [1]. This turned out really well.

- For history expansion, because unevaluated words have to be picked out of previous command lines. Bash does not do this either.

Consider:

    $ echo ${x:-a b c}
    a b c
    $ echo !$
    echo c}
    c}

IMO this is fairly nonsensical behavior, and the underlying cause is that bash chooses to write duplicate, ad hoc parsers for its own language! There are many cases like this with completion, e.g.

    $ if ec<TAB>
    $ for i in 1 2 3; do ec<TAB>

Bash isn't smart enough to complete "echo" in these cases, because it doesn't know it's in the "first word" state.

It also chooses to treat = and : as completion word delimiters, even though they don't delimit normal words, and this causes a lot of problems that the bash-completion project patches over in a very ugly fashion.

----

As for the runtime, one problem is that the shell inherently modifies global process state. So there is a limit to the abstraction you can provide over it. For example consider this program:

    { echo hi
      ls / 
      echo bye
    } > out.txt

There's essentially one way to implement this with Unix system calls, but you couldn't have two different interpreters running them concurrently in the same process, because the process FD tables would get stomped on. (i.e. my definition of library is that you can make multiple instances of it with different parameters.)
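
The save / redirect / restore sequence can be spelled out in shell itself, which makes the global state visible. This is only a sketch of the idea, not a claim about how any particular shell implements it; the temp file stands in for out.txt and `ls /` is left out to keep the output predictable:

```shell
#!/bin/sh
# Manual version of "{ ...; } > file": fd 1 is process-wide state.
out=$(mktemp) || exit 1   # assumption: mktemp is installed

exec 3>&1          # save the current stdout on fd 3 (roughly dup)
exec 1>"$out"      # point fd 1 at the file (roughly open + dup2)
echo hi            # everything in between writes to the file
echo bye
exec 1>&3 3>&-     # restore stdout and close the saved copy

contents=$(cat "$out")
rm -f "$out"
echo "$contents"
```

Between the second and third `exec`, anything else running in the same process that writes to fd 1 would land in the file too, which is the stomping problem described above.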

----

The general idea of a shell that conforms to POSIX but provides a better interactive experience is a good one. (That seems to be the feedback on the Fish Shell 3.0 thread on the front page).

Although it is a huge amount of work! I hope that I will be able to metaprogram / compile Oil into something more compact, but that's an open problem now :) It is shaping up to be a better interactive shell than I originally thought though. Treating the parser as a library was a big win.

(Among other reasons, it's not a library in bash because it uses many global variables.)

[1] http://www.oilshell.org/blog/2018/12/16.html


Thanks for sharing your insights!

>There are a few places in POSIX where the parser has to be invoked recursively

In your examples, we have the runtime invoke the parser as necessary. The parser doesn't know about the runtime. For alias resolution, we have a callback function, which hooks into the runtime but is pretty thin and abstract.

>For history expansion

Thankfully, this is non-POSIX so mrsh doesn't have to worry about it.

>one problem is that the shell inherently modifies global process state [...] i.e. my definition of library is that you can make multiple instances of it with different parameters

My definition doesn't line up with yours. My definition is a shared object or static archive and a bunch of headers with an API you can link to instead of implementing something yourself.


Are you parsing command subs at runtime too? Bash does that [1], but I believe it's a bad idea. dash, mksh, and zsh seem to do it "the right way", although none of them statically parses as much as OSH.

IIRC a case that really seals the deal is:

    $ echo $(case x in x) echo foo;; esac)
    foo

How do you find the closing paren? You basically have to parse shell, so you might as well do that at parse time rather than runtime. There's a section in the aosabook bash chapter that talks about that.

In other words, bash has had parsing bugs with PAREN MATCHING for 20 years (I have a case in my suite that was fixed between bash 4.3 and 4.4). If you just statically parse then you can get it all right on the first try.

It can get arbitrarily complicated, you can add a subshell and nested command subs in there too, etc.:

    $ echo $( ( case x in $(echo x)) echo foo;; esac) )
    foo

Bash syntax makes it worse, but this problem appears in POSIX sh too.
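
As an aside for script authors, POSIX allows an optional leading `(` before a case pattern, which keeps the parentheses balanced for shells that only count parens. A small sketch of both examples written that way (the variable names are just for illustration):

```shell
#!/bin/sh
# Same commands as above, with the optional leading "(" on each case pattern
# so every ")" inside the command substitution has a matching "(".
simple=$(case x in (x) echo foo;; esac)

# Note the space after "$(" so it is not read as the start of "$((" arithmetic.
nested=$( ( case x in ($(echo x)) echo foo;; esac ) )

echo "$simple $nested"
```

This sidesteps the problem in scripts, but it doesn't change the point: a shell still needs a real parser to handle the unbalanced form correctly.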

[1] http://www.oilshell.org/blog/2016/10/13.html


    ~/s/m/build > cat test.sh
    #!/bin/sh
    echo $(case x in x) echo foo;; esac)
    ~/s/m/build > mrsh -n test.sh
    program
    program
    └─command_list ─ pipeline
      └─simple_command
        ├─name ─ word_string [2:1 → 2:5] echo
        └─argument 1 ─ word_command ─ program
          └─command_list ─ pipeline
            └─case_clause
              ├─word ─ word_string [2:13 → 2:14] x
              └─items
                └─case_item
                  ├─patterns
                  │ └─word_string [2:18 → 2:19] x
                  └─body
                    └─command_list ─ pipeline
                      └─simple_command
                        ├─name ─ word_string [2:21 → 2:25] echo
                        └─argument 1 ─ word_string [2:26 → 2:29] foo


OK it looks like mrsh is parsing command subs at parse time, which is good! bash doesn't do that.



