-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Structured concurrency for server applications #447
base: main
Are you sure you want to change the base?
Conversation
server/guides/concurrency.md
Outdated
tree for the above-mentioned reasons. Broadly speaking libraries expose two | ||
kinds of APIs short lived almost request response like APIs e.g. | ||
`HTTPClient.get("http://example.com)` and long-lived APIs such as an HTTP server | ||
that accepts inbound connections. In reality, libraries often expose both since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we say:
- Clients have both kinds of requests: request-response style methods which are run in the request task and background work which needs to be scheduled on some root task.
- Servers only need a root task to schedule their request handling in?
server/guides/concurrency.md
Outdated
> Note: With future language features such as `~Escapable` types it might be | ||
possible to encode this constraint in the language itself. | ||
|
||
## Task executors in Swift on Server applications |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before we go into NIO land here, we should probably create something like how Server applications in structured concurrency:
withDiscardingTaskGroup { taskGroup in
for try await connection in server.newConnections {
taskGroup.addTask {
try await handleConnection(connection) // local reasoning... yay!
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then you should probably explain how handConnection should consume the connections incoming messages as an AsyncSequence. Again. Local reasoning! All code is right there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you have established this pattern, you can explain that a NIOAsyncChannel works exactly like your connection here. Then link to the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added an example above for the server which I am going to pick up here again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually just moved the whole section around task executors out. Let's keep it focused on structured concurrency.
server/guides/concurrency.md
Outdated
Highly performant server application often rely on handling incoming connections | ||
and requests almost synchronously without incurring unnecessary allocations or | ||
context switches. Swift Concurrency is by default executing any non-isolated | ||
method on the global concurrent executor. On the other hand, `NIO` has its own | ||
thread pool in the shape of an `EventLoopGroup`. `NIO` picks an `EventLoop` out | ||
of the `EventLoopGroup` for any `Channel` and executes all of the `Channel`s IO | ||
on that `EventLoop`. When bridging from `NIO` into Swift Concurrency by default | ||
the execution has to context switch between the `Channel`s `EventLoop` and one | ||
of the threads of the global concurrent executor. To avoid this context switch | ||
Swift Concurrency introduced the concept of preferred task executors in | ||
[SE-XXX](). When interacting with the `NIOAsyncChannel` the preferred task | ||
executor can be set to the `Channel`s `EventLoop`. If this is beneficial or | ||
disadvantageous for the performance of the application depends on a couple of | ||
factors: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I would mention this in this document even.
server/guides/concurrency.md
Outdated
then cancelled in the `deinit`. Since `deinit`s of classes and actors are run at | ||
arbitrary times it becomes impossible to tell when resources created by those | ||
unstructured tasks are released. Since, the `run()` method pattern has come up a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since
deinit
s of classes and actors are run at
arbitrary times it becomes impossible to tell when resources created by those
unstructured tasks are released.
This makes it sound like a garbage collector, but ARC is deterministic, so maybe tweak the wording here to instead focus on the fact that it can be difficult to enforce cleanup at a specific time when a reference is shared between tasks. That said, this might not be important, for example if you have 5 identical worker tasks sharing a resource, and they're all shutting down, you might not care which one is the last one to shut down (and thus allow the resource to be deinited), so maybe add an example of where this is important.
server/guides/concurrency.md
Outdated
is not allowed. | ||
|
||
> Note: With future language features such as `~Escapable` types it might be | ||
possible to encode this constraint in the language itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even more importantly, a non-copyable type can represent a resource like this, where you do have full control of when it's cleaned up, making the scope-based API unnecessary. If you're linking to non-escaping, might be worth linking to non-copyable as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with ~Copyable
and ~Escapable
types we still can't express deinit based resource clean up in all cases. Closing FDs or deleting VMs is an asynchronous action and deinit
s cannot be async at this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, but "in all cases" is a very high bar. Non-copyable types help substantially over the status quo, by allowing a single owner known at compile time, who's responsible for managing the resource. When and how that owner chooses to free resources is orthogonal, but the important part is that you can achieve deterministic resource management without a with
style API, which was my original point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You still cannot. Anything that requires an asynchronous deinit cannot achieve deterministic resource management. Deterministic means at any point in your program you can tell when the resource is freed which works for simple things with ~Copyable
but won't work for stuff like file descriptors, sockets or virtual machines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow. If I have a non-copyable type with a func close() async
on it which must be called before deinit (otherwise a precondition fails), by looking at the code you know exactly when it's being freed, and when the freeing finished.
I'm not talking about deinit doing the work, I'm saying that the lack of non-copyable types meant that the only way to enforce a single known entity to free the resource was a with
API. With non-copyable types, there is always exactly one owner of the value, who's responsible for freeing it (in any way that makes sense for the type, for sync types, it can be using deinit, for async types it can be using an explicit close, whatever).
I don't see any non-determinism here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, async deinit would make this even more flexible, but it's not a blocker for using non-copyable types correctly even without with
style APIs. (You still can, of course, it's just that now there's a second way.)
server/guides/concurrency.md
Outdated
shutdown of applications. Gracefully shutting down applications is often | ||
required in modern cloud environments during roll out of new application | ||
versions or infrastructure reformations. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if in-scope for this guide, or if it should be a separate one, but once I have these structured tasks nicely set up, how do I communicate between them? More guidance on that topic would be great, especially how to wire up the async sequences (I presume) at task creation time, some patterns around that.
server/guides/concurrency.md
Outdated
tree for the above-mentioned reasons. Broadly speaking libraries expose two | ||
kinds of APIs short lived almost request response like APIs e.g. | ||
`HTTPClient.get("http://example.com)` and long-lived APIs such as an HTTP server | ||
that accepts inbound connections. In reality, libraries often expose both since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can be a bit more general here:
tree for the above-mentioned reasons. Broadly speaking libraries expose two | |
kinds of APIs short lived almost request response like APIs e.g. | |
`HTTPClient.get("http://example.com)` and long-lived APIs such as an HTTP server | |
that accepts inbound connections. In reality, libraries often expose both since | |
tree for the above-mentioned reasons. Broadly speaking there are two | |
kinds of tasks: | |
1. Short lived tasks, for example request-response like APIs, and | |
2. Long lived background tasks, for example an HTTP server | |
that accepts inbound connections. | |
Most libraries will use both since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the more general approach leads to clear communication of our intend here:
For client requests, we have clear rules:
Everything that we not consider part of a client request we dispatch into the background task via AsyncSequence, whereas the original client request remains within whatever task the user scheduled the work in. for the background work we need a task root. This is the reason why we need a run()
for clients.
For server tasks:
We need a task root to schedule the request-response handling in. This is a behavior that is contrary to clients.
--
This distinction is extremely important to ensure that cancellation works correctly in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I have rewritten this part and actually added some code examples for a simplified HTTPClient
and HTTPServer
server/guides/concurrency.md
Outdated
|
||
## Task executors in Swift on Server applications | ||
|
||
Most of the Swift on Server ecosystem is build on top of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the Swift on Server ecosystem is build on top of | |
Most of the Swift on Server ecosystem is built on top of |
server/guides/concurrency.md
Outdated
[swift-nio](https://github.com/apple/swift-nio) - a high performant event-driven | ||
networking library. `NIO` has its own concurrency model that predates Swift |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[swift-nio](https://github.com/apple/swift-nio) - a high performant event-driven | |
networking library. `NIO` has its own concurrency model that predates Swift | |
[swift-nio](https://github.com/apple/swift-nio) - a high performance event-driven | |
networking library. `NIO` has its own concurrency model that predates Swift |
server/guides/concurrency.md
Outdated
|
||
Highly performant server application often rely on handling incoming connections | ||
and requests almost synchronously without incurring unnecessary allocations or | ||
context switches. Swift Concurrency is by default executing any non-isolated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
context switches. Swift Concurrency is by default executing any non-isolated | |
context switches. By default, Swift Concurrency executes any non-isolated |
server/guides/concurrency.md
Outdated
|
||
Most of the Swift on Server ecosystem is build on top of | ||
[swift-nio](https://github.com/apple/swift-nio) - a high performant event-driven | ||
networking library. `NIO` has its own concurrency model that predates Swift |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIO
is a module, I think we should be calling it SwiftNIO here
f8cf141
to
7991382
Compare
7991382
to
eaf02d8
Compare
Thanks for the feedback everyone. I just pushed a commit that addresses most of it and provides some code examples. Would appreciate if you could take another look at it |
Structured Concurrency allows you to organize your code into high-level tasks | ||
and their child component tasks. These tasks are the primary unit of concurrency | ||
and enable the flow of information up and down the task hierarchy. Furthermore, | ||
child tasks enhance local reasoning since at the end of a method all child tasks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
child tasks enhance local reasoning since at the end of a method all child tasks | |
structured concurrency enforces local reasoning since at the end of a method all child tasks, that were spawned within this method, |
} | ||
|
||
// This consumes new incoming requests and dispatches them onto connections from the pool. | ||
for await (request, continuation) in workStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for await (request, continuation) in workStream { | |
for await (request, continuation) in requestStream { |
public func execute(request: HTTPRequest) async throws -> HTTPResponse { | ||
try await withCheckedContinuation { continuation in | ||
self.requestStreamContinuation.yield((request, continuation)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is weird! Why do we dispatch into the run call here?! This breaks task cancellation. Requests should remain in the task that the user already has.
} | ||
``` | ||
|
||
In the above simplified examples, you can already see that both expose a `func |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the above simplified examples, you can already see that both expose a `func | |
In the above simplified examples, you can already see that both objects – HTTPClient and HTTPServer – expose a `func |
your applications. It allows you to compose different services under a single | ||
top-level task while getting out-of-the-box support for signal handling allowing | ||
you to gracefully terminate services. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potentially show an example what a main
would look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree - working out how to kick off a couple of services with long running tasks with the service lifecycle has been one of the more challenging things recently whilst investigation Vapor's next steps
The goal of structuring libraries and applications like this is enabling a | ||
seamless integration between them. Furthermore, since everything is inside the | ||
same task tree and task locals propagate down the tree we unlock new APIs in | ||
libraries such as `Swift Log` or `Swift Distributed Tracing`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! What is the benefit of those new APIs? Tell me about automatic context propagation! And link to the docs!
After adopting this structure a common question that comes up is how to model | ||
communication between the various components. This can be achieved by: | ||
- Using dependency injection to pass one component to the other, or | ||
- Inverting the control between components using `AsyncSequence`s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh. this is where the meat really is! Some "how to draw an owl" vibes here ;)
Open questions:
- How do I invert the control between components using an AsyncSequence?
- How can I use my http-client/database-client in my server? Can a Service be injected into another service? Can you show me how cool task cancellation is in there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start and will provide some guidelines for library authors. It does appear to be missing some features as mentioned in other comments - like integration with other recommended packages and how those play nicely with each other. It would also be good to show some examples of task cancellation, an example with service lifecycle would be a great place to put this
No description provided.