Observed a significant performance degradation in multipart uploads after upgrading from Helidon MP 1.4.14 to Helidon MP 4.1.1 #9678
Comments
I'm not able to reproduce the problem so far. I've tried uploading a 4 MB multipart file on Linux and macOS and the upload times seem reasonable. Please see attached zip file.
|
Looks like the problem is related to running the server on Linux and using a non-loopback interface. Here is a quick test using Mac and Linux:
We need to look more closely at how Jersey is reading the data from Helidon and why it seems to perform badly on Linux. |
For reference (a previous perf issue on Linux): #7983 |
Looks like this is related to the low-level TCP receive buffer configuration. Apparently, this affects Linux in particular. Increasing that buffer size solves the problem as follows (e.g. setting it to 32K):
So it seems that the auto-tuning we enable in Linux does not work well in this case. |
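For illustration only, here is a minimal sketch of what the receive buffer setting means at the socket level. It uses plain JDK NIO rather than Helidon configuration, so the class name `ReceiveBufferSketch` and the method `effectiveReceiveBuffer` are hypothetical and not part of Helidon. It requests a 32 KB `SO_RCVBUF` on a listening socket and reports the value the OS actually applied.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

// Hypothetical helper, not a Helidon API: shows the JDK-level socket option only.
public class ReceiveBufferSketch {

    // 32K, the example value mentioned in the comment above.
    static final int REQUESTED_BYTES = 32 * 1024;

    // Opens a listening socket on an ephemeral loopback port, requests the given
    // receive buffer size, and returns the size the OS reports back.
    static int effectiveReceiveBuffer(int requestedBytes) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Set before bind() so that accepted connections inherit the value.
            // On Linux, setting SO_RCVBUF explicitly turns off receive-buffer
            // auto-tuning for the socket, and the kernel may report a larger
            // value than requested.
            server.setOption(StandardSocketOptions.SO_RCVBUF, requestedBytes);
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            return server.getOption(StandardSocketOptions.SO_RCVBUF);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int effective = effectiveReceiveBuffer(REQUESTED_BYTES);
        System.out.println("requested=" + REQUESTED_BYTES + " effective=" + effective);
    }
}
```

Comparing the requested and effective values on the server host is a quick way to check whether the setting is being applied at all; the reported number is platform dependent.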
Hi! We tested the suggested configuration, but it did not change the observed latency. We also reviewed the performance tuning documentation and experimented with other configurations, but saw no change.
|
@DeepakR-Oracle That link may need to be updated, the only option you need is the one I listed above. I can clearly see the improvement if the receive buffer is increased, but let me double check one more time using version 4.1.1. If this works for me, I will provide step-by-step instructions on how I tested it for you to try out. |
@DeepakR-Oracle The setting for the receive buffer size works for me. The scenario under test is described below. Prerequisites
Testing
|
Hi, we tested the provided files, but the issue still persists. :/ Environment: Server:
Client:
We saw the following warning log in the server at some point:
This is what we are seeing even after uncommenting the receive buffer configuration (note: the server was killed after 2 minutes of waiting, hence the Broken pipe):
Interestingly, when we enabled SSH port-forwarding, the upload was much quicker. Here is how the curl looks with port forwarding enabled:
Thanks ! |
@DeepakR-Oracle That's interesting. Honestly, this looks more like a network issue now. Is HTTP traffic routed differently? Are there any load balancers along the way? I just don't know how to connect your findings with anything we do in Helidon. I'll ask PSR for some help evaluating this. |
@spericas There are no load balancers or intermediate layers, and we don’t see this issue with any other HTTP requests, only those involving multipart data. We initially suspected a network-related cause too, but this bug was raised because the same environment works fine with Helidon 1.x and JDK 8. The latency issue appears only with Helidon 4.x and JDK 21. Thanks! |
@DeepakR-Oracle We are going to attempt to do some testing of this on OCI to see if we can reproduce the problem. We obviously need to reproduce this before we can try to fix it. |
I managed to do some testing using two OCI machines on the same subnet. I'm not able to reproduce the same upload problem described above using the 2 Ubuntu systems on a wi-fi network. In particular, the setting
does not help nor hurt the upload performance, which is about 1.5 secs for a 4 MB file.
Server version is:
Unfortunately, we still cannot reproduce the problem reported by @DeepakR-Oracle at this time. |
@DeepakR-Oracle I somehow missed that your client in the test above is Mac. I'm assuming that means you're going over VPN then? If so, could you reproduce the problem using two OCI machines? |
I've done one more test with a Linux server version 8.1 (thanks @barchetta!). If the client runs on another OCI machine (same client as in the test above), I see no problems at all with the upload. Naturally, if the client is a Mac connecting over VPN, then it is quite slow to upload, but that's the network, not Helidon, AFAICT. |
Issue Description:
We are experiencing a significant performance degradation in multipart file uploads after upgrading our application. Our application has an API that allows users to upload files using multipart (we use the jersey-media-multipart dependency). Previously, with Helidon MP 1.4.14 and JDK 8, it took approximately 4 seconds to ingest a 4.2 MB file. After upgrading to Helidon MP 4.1.1 and JDK 21, uploading the same file takes minutes.
Environment Details:
Kubernetes : v1.28.11
Kubernetes CNI : Flannel v0.25.4
Helidon MP : v4.1.1
JDK : 21
Information
Here's how our API looks currently.
Async profiler data :
async-profiler-data.zip
What we were able to observe internally is that, in Helidon 4.x, reading data from the socket seems to be handled by the LazyInputStream class, whereas this behavior is not observed in Helidon 1.x.
We are wondering if lazy reading might contribute to the slowness observed in multipart processing. Could the LazyInputStream implementation or its interaction with jersey-media-multipart be causing delays in reading and processing multipart data?
Additional Information
We have a GA next month; any timely help would be greatly appreciated.