A bigger problem in my opinion is that Rust has chosen to follow the poll-based model (you can say it was effectively designed around epoll), while the completion-based one (e.g. io-uring and IOCP) will, in all likelihood, be the way async is done in the future (especially in light of Spectre and Meltdown).
Instead of carefully weighing the advantages and disadvantages of both models, the decision was effectively made on the grounds of "we want to ship async support as soon as possible" [1]. Unfortunately, because of this rush, Rust got stuck with a poll-based model with a whole bunch of problems without a clear solution in sight (async drop, anyone?). And instead of a proper solution for self-referencing structs (yes, a really hard problem), we ended up with the hack-ish Pin solution, which has already caused a number of problems since stabilization and may now block enabling noalias by default [2].
Many believe that Rust's async story was unnecessarily rushed. While it may have helped increase Rust adoption in the mid term, I believe it will cause serious issues in the longer term.
> A bigger problem in my opinion is that Rust has chosen to follow the poll-based model
This is an inaccurate simplification that, admittedly, their own literature has perpetuated. Rust uses informed polling: the resource can wake the scheduler at any time and tell it to poll. When this occurs it is virtually identical to completion-based async (sans some small implementation details).
What informed polling brings to the picture is opportunistic sync: a scheduler may choose to poll before suspending a task. This helps when e.g. there is data already in IO buffers (there often is).
There's also some fancy stuff you can do with informed polling that you can't with completion (such as stateless informed polling).
Everything else I agree with, especially Pin, but informed polling is really elegant.
I believe they mean that when you poll a future, you pass in a context. The future derives a "waker" object from this context which it can store, and use to later trigger itself to be re-polled.
By using a context with a custom "waker" implementation, you can learn which future specifically needs to be re-polled.
Normally only the executor would provide the waker implementation, so you only learn which top-level future (task) needs to be re-polled, but not what specific future within that task is ready to proceed. However, some future combinators also use a custom waker so they can be more precise about which specific future within the task should be re-polled.
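For the unfamiliar, the basic mechanics can be sketched with plain std: the executor builds a Context from a Waker and passes it into poll. Here is a minimal illustration (not how a real executor is structured) using a no-op waker, which is enough to poll a future that never actually suspends:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Minimal no-op waker: a real executor's waker would instead mark the
// owning task as runnable (or, in the combinator trick described above,
// record which sub-future inside the task woke up).
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Poll a future exactly once with our waker, returning its result if ready.
fn poll_once<T>(fut: impl Future<Output = T>) -> Option<T> {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn main() {
    // An async block with no await points completes on the first poll.
    assert_eq!(poll_once(async { 1 + 1 }), Some(2));
    println!("ok");
}
```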
So stateful async would be writing IO. You've passed in a buffer and the length to copy from it. In the continuation, you'd need to know which original call you were working with so that you can correlate it with those parameters you passed through:
    var state = socket.read(buffer);
    while (!state.poll()) {}
    state.bytesRead...
Stateless async is accepting a connection. In 95% of servers, you just care that a connection was accepted; you don't have any state that persists across the continuation:
    while (!listeningSocket.poll()) {}
    var socket = listeningSocket.accept();
Stateless async skirts around many of the issues that Rust async can have (because Pin etc. has to happen because of state).
No, it's not. Interrupting is when you call into the scheduler at any time, even when it's doing other work. When it's idle, it cannot be interrupted.
Interruptions are something one really tries to restrict to the hardware and kernel layers, because when people write interrupt handlers, they almost always get them wrong.
To elaborate on what it is rather than what it is not, when implementing poll based IO with rust async, typically you have code like “select(); waker.wake()” on a worker thread. Select blocks. Waking tells the executor to poll the related future again, from the top of its tree. The waker implementation may indeed cause an executor thread to stop waiting, it depends on the implementation. It could also be the case that the executor is already awake and the future is simply added to a synchronised queue. Etc. You can implement waking however you like, and technically this could involve an interruptible scheduler, if you really wanted. But you would kinda have to write that.
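To make the shape concrete, here's a toy waker (using std's Wake trait; the names are mine) whose wake() just pushes a task id onto a shared ready queue — the sort of thing an IO thread would do after select()/epoll_wait() reports readiness:

```rust
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

// Toy waker: waking records which top-level task should be polled again.
struct QueueWaker {
    task_id: usize,
    ready: Arc<Mutex<Vec<usize>>>,
}

impl Wake for QueueWaker {
    fn wake(self: Arc<Self>) {
        self.ready.lock().unwrap().push(self.task_id);
    }
}

fn main() {
    let ready = Arc::new(Mutex::new(Vec::new()));
    let waker = Waker::from(Arc::new(QueueWaker { task_id: 7, ready: ready.clone() }));
    // An IO thread would call this once select() reports the fd is ready:
    waker.wake();
    // The executor now knows task 7 must be re-polled from the top of its tree.
    assert_eq!(*ready.lock().unwrap(), vec![7]);
    println!("ready queue: {:?}", ready.lock().unwrap());
}
```

As the comment says, this is only one possible waker implementation; waking could just as well unpark a sleeping executor thread.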
> Instead of carefully weighing advantages and disadvantages of both models, the decision was effectively made on the ground of "we want to ship async support as soon as possible" [1].
That is not an accurate summary of that comment. withoutboats may have been complaining about someone trying to revisit the decision made in 2015-2016, but as the comment itself points out, there were good reasons for that decision.
Mainly two reasons, as far as I know.
First, Rust prefers unique ownership and acyclic data structures. You can make cyclic structures work if you use RefCell and Rc and Weak, but you're giving up the static guarantees that the borrow checker gives you in favor of a bunch of dynamic checks for 'is this in use' and 'is this still alive', which are easy to get wrong. But a completion-based model essentially requires a cyclic data structure: a parent future creates a child future (and can then cancel it), which then calls back to the parent future when it's complete. You might be able to minimize cyclicity by having the child own the parent and treating cancellation as a special case, but then you lose uniqueness if one parent has multiple children.
(Actually, even the polling model has a bit of cyclicity with Wakers, but it's kept to an absolute minimum.)
Second, a completion-based model makes it hard to avoid giving each future its own dynamic allocation, whereas Rust likes to minimize dynamic allocations. (It also requires indirect calls, which is a micro-inefficiency, although I'm not convinced that matters very much; current Rust futures have some significant micro-inefficiencies of their own.) The 2016 blog post linked in the comment goes into more detail about this.
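The allocation point is easy to demonstrate: everything a poll-based future keeps alive across an .await lands in one flat state machine whose size is known at compile time. A small illustration (not a benchmark):

```rust
fn main() {
    let fut = async {
        let buf = [0u8; 64];
        // `buf` is live across the await point, so it becomes part of
        // the future's state machine rather than a separate allocation.
        std::future::ready(()).await;
        buf.len()
    };
    // The whole nested computation is one flat value; an executor can
    // heap-allocate it once (or not at all, if it polls it in place).
    let size = std::mem::size_of_val(&fut);
    assert!(size >= 64);
    println!("future size: {} bytes", size);
}
```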
As you might guess, I find those reasons compelling, and I think a polling-based model would still be the right choice even if Rust's async model was being redesigned from scratch today. Edit: Though to be fair, the YouTube video linked from withoutboats' comment does mention that mio decided on polling simply because that's what worked best on Linux at the time (pre-io_uring), and that had some influence on how Futures ended up. But only some.
…That said, I do agree Pin was rushed and has serious problems.
> Suggestions that we should revisit our underlying futures model are suggestions that we should revert back to the state we were in 3 or 4 years ago, and start over from that point. <..> Trying to provide answers to these questions would be off-topic for this thread; the point is that answering them, and proving the answers correct, is work. What amounts to a solid decade of labor-years between the different contributors so far would have to be redone again.
How should I read it except as "we did the work on the poll-based model, so we don't want the results to go down the drain if the completion-based model turns out to be superior"?
I don't agree with your assertion regarding cyclic structures and the need for dynamic allocations in the completion-based model. Both models result in approximately the same cyclicity of task states, no wonder, since task states are effectively size-bounded stacks. In both models you have more or less the same finite state machines. The only difference is in how those FSMs interact with the runtime, and in the fact that in the completion-based model you usually pass ownership of part of the task state to the runtime during task suspension. So you cannot simply drop a task if you no longer need its results; you have to explicitly request its cancellation from the runtime.
There's a difference between "we decided this 3 years ago" and "we rushed the decision". At this point, it's no longer possible to weigh the two models on a neutral scale, because changing the model would cause a huge amount of ecosystem churn. But that doesn't mean they weren't properly weighed in the first place.
Regarding cyclicity… well, consider something like a task running two sub-tasks at the same time. That works out quite naturally in a polling-based model, but in a completion-based model you have to worry about things like 'what if both completion handlers are called at the same time', or even 'what if one of the completion handlers ends up calling the other one'.
Regarding dynamic allocations… well, what kind of desugaring are you thinking of? If you have
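(The code snippet appears to have been lost from this comment; here is a plausible stand-in, with invented names, matching the Arc-based `self`/`completion` desugaring discussed in the rest of the comment: an `async fn foo() -> String` turned into a start-then-callback pair, where each piece needs its own reference-counted allocation.)

```rust
use std::sync::{Arc, Mutex};

// Hypothetical completion callback trait (name invented for illustration).
trait Completion: Send + Sync {
    fn complete(&self, value: String);
}

// `async fn foo() -> String`, desugared completion-style: the operation
// owns its state behind an Arc and finishes by invoking the callback.
struct Foo;

impl Foo {
    fn start(self: Arc<Self>, completion: Arc<dyn Completion>) {
        // ... kick off IO here; when it finishes, the runtime calls back:
        completion.complete(String::from("done"));
    }
}

// A caller-side sink that remembers the result.
struct Sink(Mutex<Option<String>>);

impl Completion for Sink {
    fn complete(&self, value: String) {
        *self.0.lock().unwrap() = Some(value);
    }
}

fn main() {
    // Note the two separate allocations (`Foo` and `Sink`) for one call.
    let sink = Arc::new(Sink(Mutex::new(None)));
    Arc::new(Foo).start(sink.clone());
    assert_eq!(sink.0.lock().unwrap().as_deref(), Some("done"));
}
```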
Which by itself is no better; it still implies separate allocations. But then I suppose we could have an `ArcDerived<T>` which acts like `Arc<T>` but can point to a part of a larger allocation, so that `self` and `completion` could be parts of the same object.
However, in that case, how do you deal with borrowed arguments? You could rewrite them to Arc, I suppose. But if you must use Arc, performance-wise, ideally you want to be moving references around rather than actually bumping reference counts. You can usually do that if there's just `self` and `completion`, but not if there are a bunch of other Arcs.
Also, what if the implementation misbehaved and called `completion` without giving up the reference to `self`? That would imply that any further async calls by the caller could not use the same memory. It's possible to work around this, but I think it would start to make the interface relatively ugly, less ergonomic to implement manually.
Also, `ArcDerived` would have to consist of two pointers and there would have to be at least one `ArcDerived` in every nested future, bloating the future object. But really you don't want to mandate one particular implementation of Arc, so you need a vtable, but that means indirect calls and more space waste.
Most of those problems could be solved by making the interface unsafe and using something with more complex correctness requirements than Arc. But the fact that current async fns desugar to a safe interface is a significant upside. (...Even if the safety must be provided with a bunch of macros, thanks to Pin not being built into the language.)
>There's a difference between "we decided this 3 years ago" and "we rushed the decision".
As far as I understand the situation, the completion-based API simply was not on the table 3 years ago: io-uring was not a thing, and there was negligible interest in properly supporting IOCP. So when a viable alternative appeared right before stabilization of the epoll-centric API under development, the 3-year-old decision was not properly reviewed in light of the changed environment; instead the team pushed forward with stabilization.
>because changing the model would cause a huge amount of ecosystem churn.
No, the discussion happened before stabilization (it's literally in the stabilization issue). Most of the ecosystem at the time was on futures 0.2.
Regarding your examples, I think you are simply looking at the problem from the wrong angle. In my opinion, the compiler should not desugar async fns into ordinary functions; instead it should construct explicit FSMs out of them. So there is no need for Arcs: the String would be stored directly in the "output" FSM state generated for foo. Yes, this approach is harder for the compiler, but it opens up some optimization opportunities, e.g. regarding the trade-off between FSM "stack" size and the number of copies that state transition functions have to do. AFAIK right now Rust uses "dumb" enums, which can be quite sub-optimal, i.e. they always minimize the "stack" size at the expense of additional data copies, and they do not reorder fields in the enum variants to minimize copies.
In your example with two sub-tasks a generated FSM could look like this (each item is a transition function):
1) initialization [0 -> init_state]: create requests A and B.
2) request A is complete [init_state -> state_a]: if request B is already complete, do nothing; otherwise mark request A as complete and request cancellation of task B, but do not change the layout of the buffer used by request B.
3) cancellation of B is complete [state_a -> state_c]: process data from A, perform the data processing common to branches A and B, create request C. It's safe to overwrite the memory behind buffer B in this handler.
4) request B is complete [init_state -> state_b]: if request A is already complete, do nothing; otherwise mark request B as complete and request cancellation of task A, but do not change the layout of the buffer used by request A.
5) cancellation of A is complete [state_b -> state_c]: process data from B, perform the data processing common to branches A and B, create request C. It's safe to overwrite the memory behind buffer A in this handler.
(This FSM assumes that it's legal to request cancellation of an already-completed task)
Note that handlers 2 and 4 cannot be called at the same time, since they are bound to the same ring and thus executed on the same thread. One completion handler simply cannot call another, since they are part of the same FSM and only one FSM transition function can be executed at a time. At first glance all those states and transitions look like unnecessary complexity, but I think that's how a proper select should work under the hood.
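A hand-written sketch of that FSM (the names and the usize payloads are mine; real transition functions would also submit the cancellation SQEs described above):

```rust
// States of the select-style FSM described in the list above.
#[derive(Debug, PartialEq)]
enum SelectState {
    Init,                  // requests A and B are in flight
    AWon { a_len: usize }, // A finished; cancellation of B requested
    BWon { b_len: usize }, // B finished; cancellation of A requested
    Common,                // cancellation acked; loser's buffer reusable
}

// Transitions 2/4: one of the requests completed.
fn on_complete(state: SelectState, a_finished: bool, len: usize) -> SelectState {
    match state {
        // First finisher wins and (conceptually) submits cancel(other).
        SelectState::Init if a_finished => SelectState::AWon { a_len: len },
        SelectState::Init => SelectState::BWon { b_len: len },
        // The other request already won: do nothing.
        other => other,
    }
}

// Transitions 3/5: cancellation of the losing request is acknowledged;
// only now is it safe to reuse that request's buffer.
fn on_cancel_done(state: SelectState) -> SelectState {
    match state {
        SelectState::AWon { .. } | SelectState::BWon { .. } => SelectState::Common,
        other => other,
    }
}

fn main() {
    let s = on_complete(SelectState::Init, true, 42);
    assert_eq!(s, SelectState::AWon { a_len: 42 });
    // A CQE for B arriving late changes nothing:
    let s = on_complete(s, false, 7);
    assert_eq!(s, SelectState::AWon { a_len: 42 });
    assert_eq!(on_cancel_done(s), SelectState::Common);
    println!("ok");
}
```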
Our async IO model was based on the Linux industry standard (then and now) epoll, but that is not at all what drove the switch to a polling based model, and the polling based model presents no issues whatsoever with io-uring. You do not know what you are talking about.
>Our async IO model was based on the Linux industry standard (then and now) epoll, but that is not at all what drove the switch to a polling based model
Can you provide a link to a design document or at the very least to a discussion with motivation for this switch outside of the desire to be as compatible as possible with the "Linux industry standard"?
>the polling based model presents no issues whatsoever with io-uring
IIUC the best solutions right now are either to copy data around (bye-bye zero-cost) or to use another Pin-like awkward hack with executor-based buffer management, instead of using simple and familiar buffers which are part of the future's state.
The completion-based futures that Alex started with were also based on epoll. The performance issues they presented had nothing to do with any sort of impedance mismatch between a completion-based future and epoll, because there is no impedance issue. You are confused.
Thank you for the link! But immediately we can see the false equivalence: a completion-based API does not imply the callback-based approach. The article critiques the latter, but not the former. Earlier in this thread I've described how I see a completion-based model built on top of FSMs generated by the compiler from async fns. In other words, the arguments presented in that article do not apply to this discussion.
>The performance issues it presented had nothing to do any sort of impedence mismatch between a completion based future and epoll
Sorry, but what? Even aturon's article states zero-cost as one of the 3 main goals. So performance issues with strong roots in the selected model are a very big problem in my book.
You cannot literally make extremely inflammatory comments about people's work, and accuse them of all sorts of things, and then get upset when they are mad about it. You've made a bunch of very serious accusations on multiple people's hard work, with no evidence, and with arguments that are shaky at best, on one of the largest and most influential forums in the world.
I mean, you can get mad about it, but I don't think it's right.
I found it highly critical but not inflammatory - though I'm not sure if I'd've felt the same way had they been being similarly critical of -my- code.
However, either way, responding with condescension (which is how the 'industry standard' thing came across) and outright aggression is never going to be constructive, and if that's the only response one is able to formulate then it's time to either wait a couple hours or ask somebody else to answer on your behalf instead (I have a number of people who are kind enough to do that for me when my reaction is sufficiently exothermic to make posting a really bad idea).
boats-of-a-year-ago handled a similar situation much more graciously here - https://news.ycombinator.com/item?id=22464629 - so it's entirely possible this is a lockdown fatigue issue - but responding to calmly phrased criticism with outright aggression is still pretty much never a net win, and defending that behaviour seems contrary to the tone the rust team normally tries to set for discussions.
Of course I was more gracious to pornel - that remark was uncharacteristically flippant from a contributor who is normally thoughtful and constructive. pornel is not in the habit of posting that my work is fatally flawed because I did not pursue some totally unviable vaporware proposal.
I am not mad, it was nothing more than an attempt to urge a more civil tone from boats. If you both think that such tone is warranted, then so be it. But it does affect my (really high) opinion about you.
I do understand the pain of having your dear work harshly criticized. I have experienced it many times in my career. But my critique was intended as tough love for a language in which I am heavily invested. If you see my comments as only "extremely inflammatory"... Well, it's a shame I guess, since it's not the first case of the Rust team unnecessarily rushing something (see the 2018 edition debacle), so such an attitude only increases the rate at which Rust accumulates mistakes.
I do not doubt that you care about Rust. Civility, though, is a two-way street. Just because you phrase something in a way that has a more neutral tone does not mean that the underlying meaning cannot be inflammatory.
"Instead of carefully weighing advantages and disadvantages of both models," may be written in a way that more people would call "civil," but is in practice a direct attack on both the work, and the people doing the work. It is extremely difficult to not take this as a slightly more politely worded "fuck you," if I'm being honest. In some sense, that it is phrased as being neutral and "civil" makes it more inflammatory.
You can have whatever opinion that you want, of course. But you should understand that the stuff you've said here is exactly that. It may be politely worded, but is ultimately an extremely public direct attack.
>Earlier in this thread I've described how I see a completion-based model built on top of FSMs generated by compiler from async fns. In other words, the arguments presented in that article do not apply to this discussion.
I've been lurking your responses, but now I'm confused. If you are not using a callback-based approach, then what are you using? Rust's FSM approach is predicated on polling; in other words, if you aren't using callbacks, then how do you know that Future A has finished? If the answer is to use Rust's current systems, then that means the FSM is "polled" periodically, and then you still have the "async Drop" problem as described in withoutboats' notorious article, and furthermore, you haven't really changed Rust's design.
Edit: As I've seen you mention in other threads, you need a sound design for async Drop for this to work. I'm not sure this is possible in Rust 1.0 (as Drop isn't currently required to run in safe Rust). That said it's unfair to call async "rushed", when your proposed design wouldn't even work in Rust 1.0. I'd be hesitant to call the design of the entire language rushed just because it didn't include linear types.
I meant the callback based approach described in the article, for example take this line from it:
>Unfortunately, this approach nevertheless forces allocation at almost every point of future composition, and often imposes dynamic dispatch, despite our best efforts to avoid such overhead.
It clearly does not apply to the model which I've described earlier.
Of course, the described FSM state transition functions can be rightfully called callbacks, which adds a certain amount of confusion.
I can agree with the argument that a proper async Drop cannot be implemented in Rust 1.0, so we have to settle for a compromise solution. Same with proper self-referential structs vs Pin. But I would like to see this argument explicitly stated, with sufficient backing for the impossibility claims.
>Of course, the described FSM state transition functions can be rightfully called callbacks, which adds a certain amount of confusion.
No, I'm not talking about the state transition functions. I'm talking about the runtime - the thing that will call the state transition function. In the current design, abstractly, the runtime polls/checks every future to see if it's in a runnable state, and if so executes it. In a completion-based design the future itself tells the runtime that the value is ready (either driven by a kernel thread, another thread or some other callback). (Conceptually the difference is: in a poll-based design, the future calls waker.wake(), and in a completion-based one, the future just calls the callback fn.) Aaron has already described why that is a problem.
The confusion I have is that both would have problems integrating io_uring into rust (due to the Drop problem; as Rust has no concept of the kernel owning a buffer), but your proposed solution seems strictly worse as it requires async Drop to be sound which is not guaranteed by Rust; which would make it useless for programs that are being written today. As a result, I'm having trouble accepting that your criticism is actually valid - what you seem to be arguing is that async/await should have never been stabilized in Rust 1.0, which I believe is a fair criticism, but it isn't one that indicates that the current design has been rushed.
Upon further thought, I think your design ultimately requires futures to be implemented as a language feature, rather than a library (ex. for the future itself to expose multiple state transition functions without allocating is not possible with the current Trait system), which wouldn't have worked without forking Rust during the prototype stage.
>In an completion based design the future itself tells the runtime that the value is ready
I think there is a misunderstanding. In a completion-based model (read io-uring, but I think IOCP behaves similarly, though I am less familiar with it) it's the runtime that "notifies" tasks about completed IO requests. In io-uring you have two queues represented by ring buffers shared with the OS. You add submission queue entries (SQEs) to the first buffer describing what you want the OS to do; the OS reads them, performs the requested job, and places completion queue events (CQEs) for completed requests into the second buffer.
So in this model a task (Future in your terminology) registers an SQE (the registration process may be proxied via a user-space runtime) and suspends itself. Let's assume for simplicity that only one SQE was registered for the task. After the OS posts a CQE for the request, the runtime finds the correct state transition function (via meta-information embedded into the SQE, which gets mirrored to the relevant CQE) and simply executes it; the requested data (if it was a read) will already be filled into a buffer which is part of the FSM state, so there is no need for additional syscalls or interactions with the runtime to read this data!
If you are familiar with embedded development, then it should sound quite familiar, since it's roughly how hardware interrupts work as well! You register a job (e.g. DMA transfer), dedicated hardware block does it, and notifies a registered callback after the job was done. Of course, it's quite an oversimplification, but fundamental similarity is there.
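The dispatch step can be mocked in a few lines (this is a toy model of the user_data round-trip, not real io-uring bindings):

```rust
use std::collections::HashMap;

// Toy CQE: `user_data` is copied verbatim from the SQE that started the op.
struct Cqe {
    user_data: u64,
    result: i32,
}

fn main() {
    // The runtime's table mapping user_data back to a task's transition fn.
    let mut transitions: HashMap<u64, Box<dyn FnMut(i32) -> &'static str>> =
        HashMap::new();

    // Task 42 submits an SQE tagged user_data = 42 and suspends; the read
    // buffer lives inside its FSM state, so completion needs no extra copy.
    transitions.insert(
        42,
        Box::new(|res| if res >= 0 { "read done" } else { "io error" }),
    );

    // Later the kernel posts a CQE mirroring that tag, and the runtime
    // simply looks up and runs the matching transition function:
    let cqe = Cqe { user_data: 42, result: 128 };
    let outcome = transitions.get_mut(&cqe.user_data).unwrap()(cqe.result);
    assert_eq!(outcome, "read done");
    println!("{}", outcome);
}
```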
>I think your design ultimately requires futures to be implemented as a language feature, rather than a library
I am not sure if this design would have had a Future type at all, but you are right, the advocated approach requires a deeper integration with the language compared to the stabilized solution. Though I disagree with the opinion that it would've been impossible to do in Rust 1.
It does not work in the current version of Rust, but it's not a given that a backwards-compatible solution for it could not have been designed, e.g. by using a deeper integration of async tasks with the language or by adding proper linear types; hence all the discussions around reliable async Drop. The linked blog post takes for granted that we should be able to drop futures at any point in time, which, while convenient, has a lot of implications.
As I've mentioned several times, in this model you cannot simply "drop the task" without running its asynchronous Drop. Each state in the FSM will be generated with a "drop" transition function, which may include asynchronous cancellation requests (i.e. cleanup can span more than one transition function and may represent a mini sub-FSM). This would require introducing more fundamental changes to the language (same as with proper self-referential types), be it some kind of linear type capabilities or a deeper integration of runtimes with the language (so you would not be able to manipulate FSM states directly like any other data structure), since right now it's safe to forget anything and destructors are not guaranteed to run. IMO such changes would've made Rust a better language in the end.
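A tiny sketch of what such a generated drop transition might look like (hypothetical; today's Rust has nothing like this): dropping a state whose buffer the kernel still owns first becomes a cancellation request, and only the cancellation acknowledgement actually tears the state down.

```rust
// Hypothetical FSM states for a single in-flight read.
#[derive(Debug, PartialEq)]
enum ReadState {
    Pending,         // kernel owns the buffer: must not be freed yet
    CancelRequested, // async drop in progress: waiting for the cancel CQE
    Done,            // safe to reclaim everything
}

// The generated "drop" transition: not a plain destructor, but one more
// asynchronous step that the runtime has to drive to completion.
fn drop_transition(state: ReadState) -> ReadState {
    match state {
        ReadState::Pending => ReadState::CancelRequested, // submit cancel SQE
        ReadState::CancelRequested | ReadState::Done => ReadState::Done,
    }
}

fn main() {
    let s = drop_transition(ReadState::Pending);
    assert_eq!(s, ReadState::CancelRequested);
    // The cancel CQE arrives, completing the asynchronous drop:
    assert_eq!(drop_transition(s), ReadState::Done);
    println!("ok");
}
```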
“Rust would have been a better language by breaking its stability guarantees” is just saying “Rust would have been a better language by not being Rust.” Maybe true, but not relevant to the people whose work you’ve blanket criticized. Rust language designers have to work within the existing language and your arguments are in bad faith if you say “async could have been perfect with all this hindsight and a few breaking language changes”.
I do not think that the impossibility of a reliable async Drop in Rust 1 was a proven thing (prior to the stabilization of async in its current form). Yes, it may require some unpleasant additions, such as making Futures and async fns more special than they are right now, and implementing it would in all likelihood have required a lot of work (at least on the same scale as was invested into the poll-based model), but that does not automatically make it impossible.
I don’t agree with this analysis TBH - async drop has been revisited multiple times recently with no luck. Without a clear path there I don’t know why that would seem like an option for async/await two years ago. Do you actually think the language team should have completely exhausted that option in order to try to require an allocator for async/await?
Async drop would still not address the single-allocation-per-state-machine advantage of the current design that you’ve mostly not engaged with in this thread.
No worries, I like when someone disagrees with me and argues his or her position well, since it's a chance for me to learn.
>async drop has been revisited multiple times recently with no luck
The key word is "recently", meaning "after stabilization". That's exactly my point: this problem was not sufficiently explored, in my opinion, prior to stabilization. I would've been fine with a well-argued position of "async Drop is impossible without breaking language changes, so we will not care about it", but instead we now try to shoehorn async Drop on top of the stabilized feature.
>Async drop would still not address the single-allocation-per-state-machine advantage of the current design that you’ve mostly not engaged with in this thread.
> First, Rust prefers unique ownership and acyclic data structures. You can make cyclic structures work if you use RefCell and Rc and Weak, but you're giving up the static guarantees that the borrow checker gives you in favor of a bunch of dynamic checks for 'is this in use' and 'is this still alive',
One way to get around that is, instead of the very structureless async style, to actually impose restrictions on the lifetimes of async entities. For example, if you use the ideas of the structured concurrency movement ([x], for example, which has since been picked up by Kotlin, Swift and other projects), then the parent is guaranteed to live longer than any child, thus solving most of the problem that way.
Well the link I posted is older than your link from 2019 so I doubt that. The other direction may be possible.
However, neither is the first to have ideas along these lines - structures and logic like this can also be found in Erlang supervisors, after all.
And as for the quote, it is quite explicitly referring to Dijkstra and structured programming constructs in nonconcurrent settings.
No doubt. In any case, the author notes his main influences in footnote 3, and those are not part of that. It seems he has a more practical than academic background in this.
I agree that Rust async is currently in a somewhat awkward state.
Don't get me wrong, it's usable and many projects use it to great effect.
But there are a few important features like async trait methods (blocked by HKT), async closures, async drop, and (potentially) existential types, that seem to linger. The unresolved problems around Pin are the most worrying aspect.
The ecosystem is somewhat fractured, partially due to a lack of commonly agreed abstractions, partially due to language limitations.
There also sadly seems to be a lack of leadership and drive to push things forward.
I'm ambivalent about the rushing aspect. Yes, async was pushed out the door. Partially due to heavy pressure from Google/Fuchsia and a large part of the userbase eagerly .awaiting stabilization.
Without stabilizing when they did, we very well might still not have async on stable for years to come. At some point you have to ship, and the benefits for the ecosystem can not be denied. It remains to be seen if the design is boxed into a suboptimal corner; I'm cautiously optimistic.
But what I disagree with is that polling was a mistake. It is what distinguishes Rust's implementation, and it provides significant benefits. A completion model would require a heavier, standardized runtime and associated inefficiencies like extra allocations and indirection, and would prevent efficiencies that emerge with polling. Being able to just locally poll futures without handing them off to a runtime, or to cheaply drop them, are big benefits.
Completion is the right choice for languages with a heavy runtime. But I don't see how having Rust dictate completion would make io_uring wrapping more efficient than implementing the same patterns in libraries.
UX and convenience are a different topic. Rust async will never be as easy to use as Go, or async in languages like JavaScript/C#. To me the whole point of Rust is providing abstractions that are as high-level and safe as possible, without constraining the ability to achieve maximum efficiency. (How well that goal is achieved, or hindered by certain design patterns that are more or less dictated by the language design, is debatable, though.)
>A completion model would require a heavier, standardized runtime and associated inefficiencies like extra allocations and indirection, and prevent efficiencies that emerge with polling.
You are not the first person to use such arguments, but I don't see why they would be true. In my understanding both models would use approximately the same FSMs, which would however interact differently with the runtime (i.e. instead of registering a waker, you would register an operation on a buffer which is part of the task state). Maybe I am missing something, so please correct me if I am wrong in a reply to this comment: https://news.ycombinator.com/item?id=26407824
Correct me if I'm wrong, but isn't any sort of async support non-integral to Rust? For example, in something like JavaScript you can't implement your own async. But in C, C++ or Rust you can do pretty much anything you want.
So if in the future io-uring and friends become the standard can't that just be a library you could then use?
Similar to how in C you don't need the standard library to do threads or async.
I agree. Completion-based APIs are more high level, and not a good abstraction at the systems language level. IOCP and io_uring use poll-based interfaces internally. In io_uring's case, the interfaces are basically the same ones available in user space. In Windows case IOCP uses interfaces that are private, but some projects have figured out the details well enough to implement decent epoll and kqueue compatibility libraries.
Application developers of course want much higher level interfaces. They don't want to do a series of reads; they want "fetch_url". But if "fetch_url" is the lowest-level API available, good luck implementing an efficient streaming media server. (Sometimes we end up with things like HTTP Live Streaming, a horrendously inefficient protocol designed for ease of use in programming environments, client- and server-side, that effectively only offer the equivalent of "fetch_url".)
Plus, IOCP models tend to heavily rely on callbacks and closures. And as demonstrated in the article, low-level languages suck at providing ergonomic first-class functions, especially if they lack GC. (It's a stretch to say that Rust even supports first-class functions.) If I were writing an asynchronous library in Rust, I'd do it the same way I'd do it in C--a low-level core that is non-blocking and stateful. For example, you repeatedly invoke something like "url_fetch_event", which returns a series of events (method, header, body chunk) or EAGAIN/EWOULDBLOCK. (It may not even pull from a socket directly, but rely on the application to write source data into an internal buffer.) Then you can wrap that low-level core in progressively higher-level APIs, including alternative APIs suited to different async event models, as well as fully blocking interfaces. And if a high-level API isn't to some application developer's liking, they can create their own API around the low-level core API. This also permits easier cross-language integration. You can easily use such a low-level core API for bindings to Python, Lua, or even Go, including plugging into whatever event systems they offer, without losing functional utility.
It's the same principle with OS and systems language interfaces--you provide mechanisms that can be built upon. But so many Rust developers come from high-level application environments, including scripting language environments, where this composition discipline is less common and less relevant.
> IOCP models tend to heavily rely on callbacks and closures
While perhaps higher level libraries are written that way, I can’t think of a reason why the primitive components of IOCP require callbacks and closures. The “poll for io-readiness and then issue non-blocking IO” and “issue async IO and then poll for completion” models can be implemented in a reactor pattern in a similar manner. It is just a question of whether the system call happens before or after the reactor loop.
EDIT: Reading some of the other comments and thinking a bit, one annoying thing about IOCP is the cancellation model. With polling for IO readiness, it is really easy to cancel IO and close a socket: just unregister from epoll and close it. With IOCP, you have to cancel the in-flight operation and wait for the completion notification to come in before you can close the socket (if I understand correctly).
Anyways, I've been playing around with implementing some async socket APIs on top of IOCP for Windows in Rust [1]. Getting the basic stuff working is relatively easy. Figuring out a cancellation model is going to be a bit difficult. And ultimately I think it would be cool if the threads polling the completion ports could directly execute the wakers in such a way that the future could be polled inline, but getting all the lifetimes right is making my head hurt.
Yes, you can encode state machines manually, but it will be FAR less ergonomic than the async syntax. Rust has started with a library-based approach, but it was... not great. Async code was littered with and_then methods and it was really close to the infamous JS callback hell. The ergonomic improvements which async/await brings is essentially a raison d'être for incorporating this functionality into the language.
For comparison, Haskell went with the library approach but has the syntactic sugar of the equivalent of `and_then` built into the language. (I am talking about Monads and do-notation.)
It's a bit like iterating in Python: for-loops are a convenient syntactic sugar to something that can be provided by a library.
It is a PhD-level research problem to know whether monads and do notation would be able to work in Rust. The people most qualified to look into it (incidentally, a lot of the same crew who were working on async) believe that it may literally be impossible.
Yes, a number of people have suggested introducing a general do notation (or an analog of it) instead of the use-case-specific async/await syntax, but since Rust does not have proper higher-kinded types (and some Rust developers say it never will), such proposals have been deemed impractical.
Can't Rust come up with a new syntax (that matches io_uring idea better) and deprecate the old one? Or simply replace the implementation keeping the old syntax if it's semantically the same?
It could, although I highly doubt that any deficiencies with the current implementation of async/await are so severe as to warrant anything so dramatic.
FWIW I've done a fair bit of research with io_uring. For file operations it's fast, but over epoll the speedups are negligible. The creator is a good guy, but they've had issues with the performance numbers being skewed by various deficiencies in the benchmark code, such as skipping error checks in the past.
Also, io_uring can certainly be used via polling. Once the shared rings are set up, no syscalls are necessary afterward.
We've briefly been playing with io_uring (in async Rust) for a network service that is CPU-bound and seems to be bottlenecked on context switches. In a very synthetic comparison, the io_uring version seemed very promising (as in "it may be worth rewriting a production service to target an experimental IO setup"). We ran out of the allocated time before we got to something closer to a real-world benchmark, but I'm fairly optimistic that even for non-file operations there are real performance gains in io_uring for us.
I'm not sure io_uring polling counts as polling since you're really just polling for completions, you still have all the completion-based-IO things like the in-flight operations essentially owning their buffers.
Yes, I should have specified - in theory io_uring is much faster and less resource intensive. With the right polish, it can certainly be the next iteration of I/O syscalls.
That being said, you have to restructure a lot of your application in order to be io_uring ready in order to reap the most gains. In theory, you'll also have to be a bit pickier with CPU affinities, namely when using SQPOLL (submit queue poll), which creates a kernel thread. Too much contention means such facilities will actually slow you down.
The research is changing weekly and most of the exciting stuff is still on development branches, so tl;dr (for the rest of the readers) if you're on the fence, best stick to epoll for now.
This post is completely and totally wrong. At least you got to ruin my day, I hope that's a consolation prize for you.
There is NO meaningful connection between the completion vs polling futures model and the epoll vs io-uring IO models. comex's comments regarding this fact are mostly accurate. The polling model that Rust chose is the only approach that has been able to achieve single allocation state machines in Rust. It was 100% the right choice.
After designing async/await, I went on to investigate io-uring and how it would be integrated into Rust's system. I have a whole blog series about it on my website: https://without.boats/tags/io-uring/. I assure you, the problems it presents are not related to Rust's polling model AT ALL. They arise from the limits of Rust's borrow system in describing dynamic loans across the syscall boundary (i.e. it cannot describe them). A completion model would not have made it possible to pass a lifetime-bound reference into the kernel and guarantee no aliasing. But all of these problems have fine solutions building on work that already exists.
Pin is not a hack any more than Box is. It is the only way to fit the desired ownership expression into the language that already exists, squaring these requirements with other desirable primitives we had already committed to: shared ownership pointers, mem::swap, etc. It is simply FUD - frankly, a lie - to say that it will block "noalias"; following that link shows Niko and Ralf having a fruitful discussion about how to incorporate self-referential types into our aliasing model. We were aware of this wrinkle before we stabilized Pin (I had conversations with Ralf about it); it's just that now that we want to support self-referential types in some cases, we need to do more work to incorporate them into our memory model. None of this is unusual.
And none of this was rushed. Ignoring the long prehistory, a period of 3 and a half years stands between the development of futures 0.1 and the async/await release. The feature went through a grueling public design process that burned out everyone involved, including me. It's not finished yet, but we have an MVP that, contrary to this blog post, does work just fine, in production, at a great many companies you care about. Moreover, getting a usable async/await MVP was absolutely essential to getting Rust the escape velocity to survive the ejection from Mozilla - every other funder of the Rust Foundation considers async/await core to their adoption of Rust, as does every company that is now employing teams to work on Rust.
Async/await was, both technically and strategically, as well executed as possible under the circumstances of Rust when I took on the project in December 2017. I have no regrets about how it turned out.
Everyone who reads Hacker News should understand that the content you're consuming is usually from one of these kinds of people: a) dilettantes, who don't have a deep understanding of the technology; b) cranks, who have some axe to grind regarding the technology; c) evangelists, who are here to promote some other technology. The people who actually drive the technologies that shape our industry don't usually have the time and energy to post on these kinds of things, unless they get so angry about how their work is being discussed, as I am here.
Thank you for this post. I have been interested in Rust because of Matrix, and although I found it a bit more intimidating than Go to toy with, I was inclined to try it on a real project over Go because it felt the closest to the hardware while not having the memory risks of C. The coroutines/async was/is the most daunting aspect of Rust, and a post with a sensational title like the grandparent's could have swayed me the other way.
As an aside, it would be great to have some sort of federated cred (meritocratic in some way) on Hacker News, instead of a flat democratic populist point system; it would lessen the potential eternal-September effect.
I would love to see a personal meta-pointing system, it could be on wrapping site: if I downvote a "waste of hackers daytime" article (say a long form article about what is life) in my "daytime" profile, I get a weighted downvoted feed by other users that also downvoted this item--basically using peers that vote like you as a pre-filter. I could have multiple filters, one for quick daytime hacker scan, and one for leisure factoid. One could even meta-meta-vote and give some other hackers' handle a heavier weight...
Please, calm down. I do appreciate your work on Rust, but people do make mistakes, and I strongly believe that in the long term the async stabilization will prove to have been one of them. It's debatable whether async was essential for Rust; I agree it gave Rust a noticeable boost in popularity, but personally I don't think it was worth the long-term cost. I do not intend to change your opinion, but I will keep mine and reserve the right to speak about it publicly.
In this thread [1] we have a more technical discussion about those models; I suggest continuing there.
>I assure you, the problems it present are not related to Rust's polling model AT ALL.
I do not agree about all problems, but my OP was indeed worded somewhat poorly, as I've admitted here [2].
>Pin is not a hack any more than Box is. It is the only way to fit the desired ownership expression into the language that already exists
I can agree that it was the easiest solution, but I strongly disagree that it was the only one. And frankly it's quite disheartening to hear such absolutist statements from a tech leader.
>It is simply FUD - frankly, a lie - to say that it will block "noalias,
Where did I say "will"? I think you will agree that it will at the very least cause a delay. Also, the issue shows that Pin was not properly thought out, especially in the light of other safety issues it has caused. And as you can see from the other comments, I am not the only one who thinks so.
>the content your consuming is usually from one of these kinds of people:
Please, satisfy my curiosity. To which category do I belong in your opinion?
By the way, that will almost certainly be taken in a bad way. It's never a good idea to start a comment with something like "chill" or "calm down", as it feels incredibly dismissive.
> I do appreciate your work on Rust, but
There's a saying that anything before a "but" is meaningless.
This is not meant to critique the rest of the comment, just point out a couple parts that don't help in defusing the tense situation.
So why did you not present your own solutions to the issues that you criticized, or better yet fix them with an RFC, rather than declaring a working system basically a failure (per your title)? I think you wouldn't have 10% of the saltiness if you didn't have such an aggressive title on your article.
I tried to raise those issues in the stabilization issue (I know, quite late in the game), but it simply got shut down by the linked comment, with a clear message that further discussion, be it in an issue or in a new RFC, would be pointless.
You know the F-35 is a disaster of a government project from looking at it, why not submit a better design? That isn't helpful. You might be interested in the discussion from here:
https://news.ycombinator.com/item?id=26407770
Is it just me or you're supporting your parent's point of:
> ...the decision was effectively made on the ground of "we want to ship async support as soon as possible" [1].
When you write:
> Moreover, getting a usable async/await MVP was absolutely essential to getting Rust the escape velocity to survive the ejection from Mozilla...
This whole situation saddens me. I wish Mozilla could have given you guys more breathing room to work on such critical parts. Regardless, thank you for your dedication.
That is not a correct reading of the situation. async/await was not rushed, and does not have flaws that could have been solved with more time. async/await will continue to improve in a backward compatible way, as it already has since it was released in 2019.
Please keep going, Rust is awesome and one of the few language projects trying to push the efficient frontier and not just rolling a new permutation of the trade-off dice.
I've jumped on the Rust bandwagon as part of ZeroTier 2.0 (not rewriting its core, but rewriting some service stuff in Rust and considering the core eventually). I've used a bit of async and while it's not as easy as Go (nothing is!) it's pretty damn ingenious for language-native async in a systems programming language.
I personally would have just chickened out on language native async in Rust and told people to roll their own async with promise patterns or something.
Ownership semantics are hairy in Rust and require some forethought, but that's also true in C and C++ and in those languages if you get it wrong there you just blow your foot off. Rust instead tells you that the footgun is dangerously close to going off and more or less prohibits you from doing really dangerous things.
My opinion on Rust async is that its warts are as much the fault of libraries as they are of the language itself. Async libraries are overly clever, falling into the trap of favoring code brevity over code clarity. I would rather have them force me to write just a little more boilerplate but have a clearer idea of what's going on than rely on magic voodoo closure tricks like:
WUT? I'm still not totally 100% sure why mine works and how theirs works, and I don't blame Rust. I'd rather have seen this interface (in hyper) implemented with traits and interfaces. Yes, it would force me to write something like a "factory," but I would have spent 30 minutes doing that instead of three hours figuring out how the fuck make_service_fn() and service_fn() are supposed to be used and how to get a f'ing Arc<> in there. It would also result in code that someone else could load up and easily understand what the hell it was doing without a half page of comments.
The rest of the Rust code in ZT 2.0 is much clearer than this. It only gets ugly when I have to interface with hyper. Tokio itself is even a lot better.
Oh, and Arc<> gets around a lot of issues in Rust. It's not as zero-cost as Rc<> and Box<> and friends but the cost is really low. While async workers are not threads, it can make things easier to treat them that way and use Arc<> with them (as long as you avoid cyclic structures). So if async ownership is really giving you headaches try chickening out and using Arc<>. It costs very very little CPU/RAM and if it saves you hours of coding it's worth it.
Oh, and to remind people: this is a systems language designed to replace C/C++, not a higher level language, and I don't expect it to ever be as simple and productive as Go or as YOLO as JavaScript. I love Go too but it's not a systems language and it imposes costs and constraints that are really problematic when trying to write (in my case) a network virtualization service that's shooting (in v2.0) for tens of gigabits performance on big machines.
I skimmed some of this, but are you asking why you need to clone in the closure? Because "async closures" don't exist at the moment, the closest you can get is a closure that returns a future, which usually has the form:
<F, Fut> where F: Fn() -> Fut, Fut: Future
i.e. you call some closure f that returns a future that you can then await on. when writing that out it will look like:
`make_service_fn` likely takes something like this and puts it in a struct, then for every request it will call the closure to create the future to process the request. (edit: and indeed it does, its definition literally takes your closure and uses it to implement the Service trait, which you are free to do also if you didn't want to write it this way https://docs.rs/hyper/0.14.4/src/hyper/service/make.rs.html#...)
The reason you need to clone in the closure is that the closure is what 'closes over' the scope and is able to capture the Arc reference you need to pass to your future. Whenever make_service_fn uses the closure you passed to it, it will call the closure, which can clone your Arc references and then create a future with those references "moved" in.
It's a little deceptive, as this means the exact same thing as above, just with the first set of curly braces not needed:
|| async move {}
This is still a closure which returns a Future. Does all of that make sense? Perhaps they could use a more explicit example, but it also helps to carefully read the type signature.
Wait so you're saying "|| async move {}" is equivalent to "|| move { async move {} }"? If so then mystery solved, but that is not obvious at all and should be documented somewhere more clearly.
In that case all I'm doing vs. their example is explicitly writing the function that returns the promise instead of letting it be "inferred?"
Well, no, that second one isn't valid rust, perhaps you mean:
move || async move {}
But this is not equivalent to:
|| async move {}
crucially, the closure is not going to take ownership of anything. This is kind of beside the point, though; what I'm getting at is that both of the above are closures which return a future, i.e. you can also write them in this style:
|| {
return async move {};
}
Maybe that's more clear with the explicit return?
I don't understand your second question about it being "inferred"; I never used that word. make_service_fn is a convenience function for implementing the Service trait.
Ohhh.... I think I get it. The root of my confusion is that BRACES ARE OPTIONAL in Rust closures.
This is apparently valid Rust:
let func = || println!("foo!");
I didn't know that, which is why I thought "|| async move ..." was some weird form of pseudo-async-closure instead of what it is: a function that returns an async function.
Most of the code I see always uses braces in closures for clarity, but I now see that a lot of async code does not.
> I didn't know that, which is why I thought "|| async move ..." was some weird form of pseudo-async-closure instead of what it is: a function that returns an async function.
It does not return an async function, it is a closure that returns a future. Carefully read the function signature I had posted:
In all this time, maestro Andrei Alexandrescu was right when he said Rust feels like it "skipped leg day" when it comes to concurrency and metaprogramming capabilities. Tim Sweeney complained about similar things, saying that Rust is one step forward, one step backward. These problems will become evident later, when it is already too late. I will continue experimenting with Rust, but Zig seems to have some great things going on, especially the colourless functions and the comptime thingy. Its safety story does not disappoint either, even if it is not at Rust's level of guarantees.
And Zap (scheduler for Zig) is already faster than Tokio.
Zig and other recent languages have been invented after Rust and Go, so they could learn from them, while Rust had to experiment a lot in order to combine async with borrow checking.
So, yes, the async situation in Rust is very awkward, and doing something beyond a Ping server is more complicated than it could be. But that’s what it takes to be a pioneer.
D and Zig have dynamically typed generics (templates/"comptime thingy"), while Rust has statically typed generics. A lot of people confuse this for Rust having less powerful generics. It's simply a different approach: the dynamic vs. static types distinction, at the type level instead of the value level.
Since you clearly have expertise, I'm curious if you might provide some insight into what would roughly be different in an async completion-based model, and why that might be at fundamental odds with the event-based one? Is it an incompatibility with the runtime, or does it change the actual semantics of async/await in a fundamental way, to the point where you can't just swap out the runtime and reuse existing async code?
It's certainly possible to pave over the difference between models to a certain extent, but the resulting solution will not be zero-cost.
Yes, there is a fundamental difference between those models (otherwise we would not have two separate models).
In a poll-based model, interactions between the task and the runtime look roughly like this:
- task to runtime: I want to read data on this file descriptor.
- runtime: the FD is ready, I'll wake up the task.
- task: great, the FD is ready! I will read data from the FD and then process it.
While in a completion-based model it looks roughly like this:
- task to runtime: I want data to be read into this buffer, which is part of my state.
- runtime: the requested buffer is filled, I'll wake up the task.
- task: great, the requested data is in the buffer! I can process it.
As you can see, the primary difference is that in the latter model the buffer becomes "owned" by the runtime/OS while the task is suspended. It means that you cannot simply drop a task if you no longer need its results, as Rust currently assumes. You either have to wait for the read request to complete or (possibly asynchronously) request cancellation of it. With the current Rust async, if you want to integrate with io-uring you have to use awkward buffers managed by the runtime, instead of simple buffers which are part of the task state.
Even outside of integration with io-uring/IOCP we have use-cases which require async Drop and we currently don't have a good solution for it. So I don't think that the decision to allow dropping tasks without an explicit cancellation was a good one, even despite the convenience which it brings.
FWIW, I'd bet almost anything that this problem isn't solvable in any general way without linear types, at which point I bet it would be a somewhat easy modification to what Rust has already implemented. (Most of my development for a long time now has been in C++ using co_await with I/O completion, and essentially all of the issues I run into - including the things analogous to "async Drop", which I would argue is actually the same problem as being able to drop a task itself - are solvable using linear types, and any other solutions feel like they would be one-off hacks.) Now, the problem is that the Rust people seem to be against linear types (and no one else is even considering them), so I'm pretty much resigned that I'm going to have to develop my own language at some point (and see no reason to go too deep into Rust in the meantime) :/.
I did a double take seeing your username above this comment!
Thank you for your contributions to the jailbreak community, it’s what got me started down the programming / tinkering path back in middle school and has significantly shaped the opportunities I have today. Can’t believe I’m at the point where I encountered you poking around the same threads on a forum... made my day! :)
This is an article about why linear types are hard to implement... and it doesn't even claim they can't be done; regardless, I have argued this at you before :(.
I continue to believe that the strongest point in that article is actually the third footnote, which correctly admits that this is mostly about a lack of appreciation.
> The Swift devs have basically the exact same argument for move-only code, and their implicit Copy bound. Hooray!
My claim here is that, given linear types, it should be trivial to use async/await style coroutines for I/O continuation. You have given no evidence against this idea.
Ah! I misunderstood you, sorry. I thought you were saying that linear types would be easy to implement. I wasn't trying to say anything about the stuff you'd do with them if you had them.
> FWIW, I'd bet almost anything that this problem isn't solvable in any general way without linear types
I think this part of your comment is absolutely right and it's fatal to the argument that Rust made the wrong decision about I/O models for Rust. Maybe in the context of some other language, Rust's decision was not the best one, but not for Rust, because Rust just doesn't have linear types.
Rust implements affine types, which means every object may be used at most once: you cannot use an object twice, but you can discard it and not do anything with it. Linear types mean exactly once.
but I don't think you can easily move from affine types to linear types in the case of Rust, see leakpocalypse[1]
The problem here is not with Rust's async design. It's that Rust has affine types and not linear types. This is not something that could have been solved with more work on the design. It is not that there was "a decision" to allow dropping tasks; it's a constraint on the design that the language requires. (Personally, I'm unsure as to whether a practical language with true linear types is possible, but it's worth experimenting with. Rust is not and never will be that language, however.)
I'm also curious about this. Boats wrote some interesting posts about Rust async and io-uring a while ago [1], which also point out a very clear path forward that's not actually outside the framework of Rust's Future or async implementation: using interfaces that treat the kernel as the owner of the buffers being read into/out of. That seems in line with my expectations of what should work for this.
But I haven't touched IOCP in nearly 20 years and haven't gotten into io-uring yet, so maybe I'm missing something.
Really the biggest problem might be that switching out backends is currently very difficult in rust, even the 0.x to 1.x jump of tokio is painful. Switching from Async[Reader|Writer] to AsyncBuf[Reader|Writer] might be even harder.
There's a workaround, but it's unidiomatic, requires more traits, and requires inefficient copying of data if you want to adapt from one to the other.
However, I wouldn't call this a problem with a polling-based model.
At least part of the goal here must be to avoid allocations and reference counting. If you don't care about that, then the design could have been to 'just' pass around atomically-reference-counted buffers everywhere, including as the buffer arguments to AsyncRead/AsyncWrite. That would avoid the need for AsyncBufRead to be separate from AsyncRead. It wouldn't prevent some unidiomaticness from existing – you still couldn't, say, have an async function do a read into a Vec, because a Vec is not reference counted – but if the entire async ecosystem used reference counted buffers, the ergonomics would be pretty decent.
But we do care about avoiding allocations and reference counting, resulting in this problem. However, that means a completion-based model wouldn't really help, because a completion-based model essentially requires allocations and reference counting for the futures themselves.
To me, the question is whether Rust could have avoided this with a different polling-based model. It definitely could have avoided it with a model where the allocations for async functions are always managed by the system, just like the stacks used for regular functions are. But that would lose the elegance of async fns being 'just' a wrapper over a state machine. Perhaps, though, Rust could also have avoided it with just some tweaks to how Pin works [1]… but I am not sure whether this is actually viable. If it is, then that might be one motivation for eventually replacing Pin with a different construct, albeit a weak motivation by itself.
Having investigated this myself, I would be very surprised to discover that it is.
The only viable solution to make AsyncRead zero-cost for io-uring would be to require futures to be polled to completion before they are dropped. That would mean giving up on select and most necessary concurrency primitives. You really want to be able to stop running futures you don't need, after all.
If you want the kernel to own the buffer, you should just let the kernel own the buffer. Therefore, AsyncBufRead. This will require the ecosystem to shift where the buffer is owned, of course, and that's a cost of moving to io-uring. Tough, but those are the cards we were dealt.
Well, you can still have select; it "just" has to react to one of the futures becoming ready by cancelling all the other ones and waiting (asynchronously) for the cancellation to be complete. Future doesn't currently have a "cancel" method, but I guess it would just be represented as async drop. So this requires some way of enforcing that async drop is called, which is hard, but I believe it's equally hard as enforcing that futures are polled to completion: either way you're requiring that some method on the future be called, and polled on, before the memory the future refers to can be reused. For the sake of this post I'll assume it's somehow possible.
Having to wait for cancellation does sound expensive, especially if the end goal is to pervasively use APIs like io_uring where cancellation can be slow.
But then, in a typical use of select, you don't actually want to cancel the I/O operations represented by the other futures. Rather, you're running select in a loop in order to handle each completed operation as it comes.
So I think the endgame of this hypothetical world is to encourage having the actual I/O be initiated by a Future or Stream created outside the loop. Then within the loop you would poll on `&mut future` or `stream.next()`. This already exists and is already cheaper in some cases, but it would be significantly cheaper when the backend is io_uring.
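The "operation outside the loop, cheap poll inside the loop" pattern can be sketched with std types alone. `Countdown` here is a made-up stand-in for a real I/O future, and the manual loop stands in for a `select!`-style scheduler; the point is that each iteration polls through `&mut op` without consuming it, so an iteration that doesn't complete the operation doesn't cancel it:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Stand-in for an in-flight I/O operation: ready on the third poll.
struct Countdown(u32);

impl Future for Countdown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}

// A waker that does nothing; enough for this spin-polling demo.
fn noop_waker() -> Waker {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        RawWaker::new(std::ptr::null(), &RawWakerVTable::new(clone, noop, noop, noop))
    }
    unsafe { Waker::from_raw(raw()) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);

    // The operation is created *outside* the loop...
    let mut op = Countdown(2);

    // ...and each iteration polls `&mut op` (non-consuming), which is
    // roughly what selecting on `&mut future` amounts to.
    let mut polls = 0;
    let result = loop {
        polls += 1;
        match Pin::new(&mut op).poll(&mut cx) {
            Poll::Ready(v) => break v,
            Poll::Pending => continue, // a real select would service other arms here
        }
    };
    assert_eq!(result, "done");
    assert_eq!(polls, 3);
    println!("{result} after {polls} polls");
}
```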
> But then, in a typical use of select, you don't actually want to cancel the I/O operations represented by the other futures. Rather, you're running select in a loop in order to handle each completed operation as it comes.
You often do want to cancel them in some branches of the code that handles the result (for example, if they error). It may indeed be prohibitively expensive to wait until cancellation is complete, because io-uring cancellation requires a full round trip through the interface: the IORING_OP_ASYNC_CANCEL op is just a hint to the kernel to cancel any blocking work, and you still have to wait for a completion to come back before you know the kernel will not touch the buffer you passed in.
And this doesn't even get into the much better buffer management strategies io-uring has baked into it, like registered buffers and buffer pre-allocation. I'm really skeptical of making those work with AsyncRead (now you need to define buffer types that deref to slices that are tracking these things independent of the IO object), but since AsyncBufRead lets the IO object own the buffer, it is trivial.
Moving the ecosystem that cares about io-uring to AsyncBufRead (a trait that already exists) and letting the low-level IO code handle the buffer is a strictly better solution than requiring futures to run until they're fully, truly cancelled. Protocol libraries should already expose the ability to parse the protocol from an arbitrary stream of buffers, instead of directly owning an IO handle. I'm sure some libraries don't, but that's a mistake this will course-correct.
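The ownership difference is easy to see in miniature using std's *synchronous* BufRead as a stand-in for AsyncBufRead (same buffer-ownership shape, minus the async). The IO object owns the buffer, so the kernel-facing layer could register or pool it freely, and the caller only ever borrows the filled bytes:

```rust
use std::io::{BufRead, BufReader, Cursor};

fn main() {
    // BufRead-style API: the reader owns its internal buffer.
    let mut reader = BufReader::new(Cursor::new(b"hello world".to_vec()));

    // The caller only *borrows* the filled portion of the buffer...
    let filled = reader.fill_buf().unwrap();
    assert_eq!(filled, b"hello world");

    // ...and reports how much of it was consumed.
    let n = filled.len();
    reader.consume(n);

    // Contrast with the Read/AsyncRead shape, where the *caller* owns the
    // buffer and the IO layer must write into memory it doesn't control --
    // awkward to combine with io_uring-style registered buffers.
    assert!(reader.fill_buf().unwrap().is_empty());
}
```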
> Well, you can still have select; it "just" has to react to one of the futures becoming ready by cancelling all the other ones and waiting (asynchronously) for the cancellation to be complete.
Right. Which is more or less what the structured concurrency primitives in Kotlin, Trio, and soon Swift are doing.
Wouldn't a more 'correct' implementation be to move the buffer into the thing that initiates the future (and thus, abstractly, into the future), rather than refcounting? At least with IOCP you aren't really supposed to even touch the memory region given to the completion port until it's signaled completion, IIRC.
I.e., to me, an implementation of read() that would work for a completion model could be basically:
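Something along these lines, perhaps: the caller moves the buffer into the operation and gets it back together with the result once the operation completes (the `(result, buffer)` return shape is what tokio-uring-style crates use). `MockFile` and the trivial spin-polling `block_on` below are illustrative stand-ins, not a real io_uring/IOCP binding:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

struct MockFile {
    data: Vec<u8>,
}

impl MockFile {
    // Completion-style read: take ownership of the buffer, return it with
    // the result. A real implementation would submit to the kernel here and
    // only hand the buffer back once completion is signaled.
    async fn read(&self, mut buf: Vec<u8>) -> (std::io::Result<usize>, Vec<u8>) {
        let n = self.data.len().min(buf.capacity());
        buf.clear();
        buf.extend_from_slice(&self.data[..n]);
        (Ok(n), buf)
    }
}

// Minimal executor that busy-polls with a no-op waker; fine for this mock,
// which never actually returns Pending.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        RawWaker::new(std::ptr::null(), &RawWakerVTable::new(clone, noop, noop, noop))
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let file = MockFile { data: b"hello".to_vec() };
    let buf = Vec::with_capacity(16);
    // The buffer is moved in; until the future completes, nobody else can
    // touch the memory region -- matching the IOCP/io_uring contract.
    let (res, buf) = block_on(file.read(buf));
    assert_eq!(res.unwrap(), 5);
    assert_eq!(&buf, b"hello");
    println!("read {} bytes", buf.len());
}
```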
I recognize this doesn't resolve the early drop issues outlined, and it obviously does require copying to adapt it to the existing AsyncRead trait, or if you want to like.. update a buffer in an already allocated object. It's just what I would expect an api working against iocp to look like, and I feel like it avoids many of the issues you're talking about.
Essentially each component has a buffered interface (an interface message queue), which static analysis sizes at compile time. This buffer can act as a daemon, ref counter, offline dropbox, cache, cancellation check, and can probably help with cycle checking.
Is this the sort of model which would be useful here?
I should have worded my message more carefully. A completion-based model is not a silver bullet that would magically solve all problems (though I think it would help a bit with the async Drop problem). The problem is that Rust async was rushed without careful deliberation, which causes a number of problems with no clear solution in sight.
> The problem is that Rust async was rushed without careful deliberation
As someone who observed the process, this couldn't be further from the truth. Just because one disagrees with the conclusion does not mean that the conclusion was made in haste or in ignorance.
> Without introducing Rust 2? Highly unlikely.
This is incorrect. async/await is a leaf node on the feature tree; it is supported by other language features, but does not support any others. Deprecating or removing it in favor of a replacement would not be traumatic for the language itself (less so for the third-party async/await ecosystem, of course). But this scenario is overly dramatic: the benefits of a completion-based model are not so clear-cut as to warrant such actions.
>Just because one disagrees with the conclusion does not mean that the conclusion was made in haste or in ignorance.
Believe me, I do understand the motivation behind the decision to push async stabilization in its developed form (at least I think I do). And I do not intend to argue in bad faith. My point is that, in my opinion, the Rust team chose a mid-term boost in Rust's popularity at the expense of Rust's long-term health.
Yes, you are correct that in theory it's possible to deprecate the current version of async. But as you note yourself, it's highly unlikely to happen since the current solution is "good enough".
I and many others would disagree that they made the decision "at the expense of the long-term Rust health". You aren't arguing in good faith if you put words in their mouth. There is no data to suggest that the long-term health of Rust is at stake because of the years-long path they took in stabilizing async. There are merits to both models, but nothing is as clear-cut as you make it out to be: completion-based futures are not definitively better than poll-based ones, and would have come with plenty of trade-offs of their own. To phrase this as "completion-based is totally better and the only reason it wasn't done was because it would take too long and Rust needed popularity soon" is ridiculous.
I do not put words in their mouth; or have you missed the "in my opinion" part?
The issues with Pin, the problems around noalias, the inability to design a proper async Drop solution, the poor compatibility with io-uring and IOCP: in my eyes these are all indicators that Rust's health in the async field has suffered.
>Completion based is totally better and the only reason it wasn't done was because it would take too long and Rust needed popularity soon
I find your statements so strange. I honestly don't care about noalias, and very few people really should. Same with 'async drop'. Same with io-uring, which seems to be totally fine in Rust so far.
Despite your repeated statements that async has harmed Rust, I don't have any problem whatsoever day to day writing 10s of thousands of lines of async code with regards to what you've brought up.
Yes, it's a real possibility. But the problem is that the other route was never properly explored, so we cannot compare their advantages and disadvantages. Instead, Rust went all-in on a bet made three years ago.
> My point is that in my opinion the Rust team has chosen to get the mid-term boost of Rust popularity at the expense of the long-term Rust health.
I don't think a conscious decision of that sort was made? My impression is that at the time the road taken was understood to be the correct solution and not a compromise. Is that wrong?
The decision was made 3 years ago and at the time it was indeed a good one, but the situation has changed and the old decision was not (in my opinion) properly reviewed. See this comment: https://news.ycombinator.com/item?id=26408524
Yes, it's possible, but it will be a second way of doing async, which will split the ecosystem even further. So without a REALLY good motivation it simply will not happen. Unfortunately, the poll-based solution is "good enough"... I guess, some may say that "perfect is the enemy of good" applies here, but I disagree.
> The problem is that Rust async was rushed without careful deliberation, which causes a number of problems without a clear solution in sight.
Are we talking about the same Rust? I remember the debate and consideration over async was enormous and involved. It was practically the polar opposite of “without careful deliberation”.
There actually exists a proposal for adding completion-based futures at [1], which is compatible with what exists now and certainly doesn't require a Rust 2. It would, however, certainly increase the language's surface area.
First of all yes, Rust futures use a poll model, where any state changes from different tasks don't directly call completions, but instead just schedule the original task to wake up again. I still think this is a good fit, and makes a lot of sense. It avoids a lot of errors on having a variety of state on the call stack before calling the continuation, which then gets invalidated. The model by itself also doesn't automatically make using completion based IO impossible.
However, the polling model in Rust is combined with the convention that a Future can always be dropped in order to cancel a task. That makes it impossible to directly use lower-level APIs that need operations to run to completion (or be explicitly cancelled first) without applying additional workarounds.
However, that part of Rust's model could be enhanced if there is enough interest in it; e.g. [1] discusses a proposal for it.
Why did polling have to be baked into the language? Seems bizarre for a supposedly portable language to assume the functionality of an OS feature which could change in the future.
Meanwhile C and C++ can easily adopt any async system call style because it made no assumptions in the standards about how that would be done.
Rust also didn't solve the colored functions problem. Most people think that's an impossible problem to solve without a VM/runtime (like Java Loom), but people also thought garbage collection was impossible in a systems language until Rust solved it. It could have been a great opportunity for them.
> people also thought garbage collection was impossible in a systems language until Rust solved it
No, they didn't. Linear typing for systems languages had already been done in ats, cyclone, and clean, the latter two of which were a major inspiration for rust.
Venturing further into gc territory: long before rust was even a twinkle in graydon hoare's eye, smart pointers were happening in c++, and apple was experimenting with objective c for drivers.
Perhaps more accurate to say "safe reclamation of dynamic allocations without GC was not known to be possible in a practical programming language, before Rust".
The problem with languages like ATS and Cyclone is that you need heavy usage in real-world applications to prove that your approach is actually usable by developers at scale. Rust achieved that first.
Cyclone was a c derivative (I believe it was even backwards compatible), and ats a blend of ml and c. Ml and c are both certainly proven.
Cyclone was, and ats is, a research project; not necessarily intended to achieve widespread use. And again, obj-c was being used by apple in drivers, which is certainly a real-world application.
> without GC
I don't know what you mean by this. GC is a memory management policy in which the programmer does not need to manually end the lifetimes of objects. Rust is a garbage collected language. How many manual calls to 'drop' or 'free' does the average rust program have?
ATS is not just "a blend of ML and C", it has a powerful proof system on top.
You can't just say "well, these languages were derived from C in part, THEREFORE they must be easy to adopt at scale", that doesn't follow at all.
Yes, Cyclone and ATS were research projects, that's why they were never able to accumulate the real-world experience needed to demonstrate that their ideas work at scale.
Objective-C isn't memory safe.
By "GC" here I meant memory reclamation schemes that require runtime support and object layout changes ... which is the way most people use it. If you use the term "garbage collection" in a more expansive way, so that you say Rust "is a garbage collected language", then most people are going to misunderstand you.
> By "GC" here I meant memory reclamation schemes that require [...] object layout changes
Changes with respect to what?
One example of a popular GC is the boehm GC. It provides a drop-in replacement for malloc, usable in c for existing c structures without any ABI changes.
Perhaps you are thinking specifically of compacting GCs, which usually need objects to have a header with a forwarding pointer?
> require runtime support
‘malloc’ and ‘free’ are a memory reclamation scheme that is part of the c runtime. I don't think there's any argument to be made that they are garbage collection. What's the difference between them and some other runtime support?
-----------------------------------------
Broadly, you are referring to mechanisms which can be used to implement garbage collection, but those are not what's interesting here. What's interesting is a memory management policy which supports garbage collection and is usable for a systems programming language.
-----------------------------------------
> If you use the term "garbage collection" in a more expansive way, so that you say Rust "is a garbage collected language", then most people are going to misunderstand you.
‘Garbage collection’ is a technical term with a specific, precise meaning. This meaning is generally understood and accepted throughout the literature. It's also the thing that's specifically interesting here: manually managing object lifetimes is error-prone and tends to lead to bugs, and bugs in systems software tend to be far-reaching, so a way to eliminate those bugs categorically is considered valuable.
> You can't just say "well, these languages were derived from C in part, THEREFORE they must be easy to adopt at scale", that doesn't follow at all.
That's fair as such, but I think the situation is a bit more nuanced than that. The semantics of ats and cyclone are largely designed to augment c directly. Ats's proof semantics in particular map very well to the semantics of c programs as written. Which, true, doesn't prove anything, but shows that there is much less to be proved: the existing paradigm can still be used.
> Yes, Cyclone and ATS were research projects, that's why they were never able to accumulate the real-world experience needed to demonstrate that their ideas work at scale.
Are the several multi-100-kloc ats compilers out there not real-world enough? If not then, on the topic of proof languages, ada/spark and isabelle/hol had proven themselves long before rust.
I have connections with the academic GC community. I gave an invited talk at ISMM 2012. I guarantee that they will not agree "Rust is a garbage-collected language".
FWIW Wikipedia describes "garbage collection" as "a form of automatic memory management" and goes on to say "Other similar techniques include stack allocation, region inference, memory ownership ..." so whoever wrote that doesn't agree that all forms of automatic reclamation are garbage collection.
I prefer to avoid arguing about the meaning of words but it's not good to sow confusion.
> Are the several multi-100-kloc ats compilers out there not real-world enough?
Yes, projects written by the creators of the language are not enough.
> If not then, on the topic of proof languages, ada/spark and isabelle/hol had proven themselves long before rust.
> Meanwhile C and C++ can easily adopt any async system call style because it made no assumptions in the standards about how that would be done.
This is comparing apples to oranges; Rust's general, no-assumptions-baked-in coroutines feature is called "generators", and it is not yet stable. It is this feature that is internally used to implement async/await. https://github.com/rust-lang/rust/issues/43122
>Meanwhile C and C++ can easily adopt any async system call style because it made no assumptions in the standards about how that would be done.
Do you know about co_await in C++20? AFAIK (I only have a very cursory knowledge of it, so I may be wrong) it also makes some trade-offs, e.g. it requires allocations, while in Rust async tasks can live on the stack or in statically allocated memory.
Also do not forget that Rust has to ensure memory safety at compile time, while C++ can be much more relaxed about it.
C++20 coroutines are not async in the standard; they are just coroutines. In fact, the standard ships no implementation: the user has to write classes to implement the promise type and the awaitable type. You could just as easily write a coroutine library wrapping epoll as you could io_uring. The only thing it does behind your back (other than compiling to stackless coroutines) is allocate memory, which also goes for a lot of other things.
Is this not also true of Rust? Are you saying Rust in some sense hardcodes an implementation to await in a way C++ doesn't? (I am not a Rust programmer, but I am very very curious about this and would appreciate any insight; I do program in C++ with co_await daily, with my own promise/task classes.)
Rust's async/await support is not intended as a general replacement of coroutines. In fact, async/await is built on top of coroutines (what Rust calls "generators"), but these are not yet stable. https://github.com/rust-lang/rust/issues/43122
Ouch... thanks; I didn't realize the Rust situation was this bad :(. FWIW, I do not look at generators as being what I would want as my interface for working with coroutines, and am very much on board there with the comments from tommythorn. I guess I just have too many decades of experience working with coroutines in various systems I have used :(.
TL;DR: Rust makes you bring some sort of executor along. You can write your own, or you can use someone else's. I haven't done a deep enough dive into what made it into the C++ standard to give you a great line-by-line comparison.
It requires allocation if the coroutine outlives the scope that created it.
Otherwise, compilers are free to implement heap allocation elision (which Clang does).
Compared to Rust: assuming you have a series of coroutines processing a deferred event, Rust will allocate once for the whole series, while C++ would allocate once per coroutine to store them in the reactor/proactor.
> people also thought garbage collection was impossible in a systems language until Rust solved it
Only if you understand "garbage collection" in the narrow sense of memory safety, no explicit free() calls, a relatively readable syntax for passing objects around, and an acceptable amount of unused memory. This comes with a non-negligible amount of fine print for Rust when compared to garbage-collected languages.
[1]: https://github.com/rust-lang/rust/issues/62149#issuecomment-... [2]: https://github.com/rust-lang/rust/issues/63818