
Uploading a PDF with many pages (for example, more than 20 pages) to Chainlit may make the Chainlit page unresponsive #1558

Open
seleniumpython2016 opened this issue Dec 2, 2024 · 7 comments
Labels
backend Pertains to the Python backend. bug Something isn't working needs-triage

Comments

@seleniumpython2016

seleniumpython2016 commented Dec 2, 2024

Describe the bug
When uploading a PDF with many pages (for example, more than 20) to Chainlit, the Chainlit page may become unresponsive while the content is being embedded into the vector database, if that step takes a long time. You may need to refresh the page to continue.

To Reproduce
Steps to reproduce the behavior:

  1. Use the upload button in the dialog box to upload PDF files.
  2. The backend program performs OCR scanning and embeds the content into the vector database.
  3. After performing OCR recognition and embedding tasks, the Chainlit page becomes unresponsive.

Screenshots
image

Desktop (please complete the following information):
chainlit ==1.3.1

@dosubot dosubot bot added backend Pertains to the Python backend. bug Something isn't working labels Dec 2, 2024
@dokterbob
Collaborator

@seleniumpython2016 Thanks for the feedback.

Could it be that your OCR code is not running in async?

@dokterbob
Collaborator

Does this problem also occur when you're not doing OCR? If not, the problem is the OCR (probably because it is not async), not Chainlit.

If it still persists, I'd love to hear what's happening in the browser (e.g. profiling, console log, etc.).
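One way to check whether a step is really async is to measure how long the event loop stalls while it runs. The sketch below is only an illustration using the standard library: `max_loop_stall`, `fake_ocr_blocking`, and `fake_ocr_cooperative` are hypothetical names, not Chainlit APIs, and the sleeps stand in for real OCR work. It shows that a function declared with `async def` still freezes the loop if its body never awaits.

```python
import asyncio
import time


async def max_loop_stall(work, interval: float = 0.01) -> float:
    """Run `work()` while a heartbeat ticks; return the longest gap between ticks."""
    worst = 0.0
    done = False

    async def heartbeat():
        nonlocal worst
        last = time.perf_counter()
        while not done:
            await asyncio.sleep(interval)
            now = time.perf_counter()
            worst = max(worst, now - last)
            last = now

    beat = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await work()
    done = True
    await beat
    return worst


async def fake_ocr_blocking():
    # Hypothetical stand-in for OCR: declared async, but the body never yields.
    time.sleep(0.3)


async def fake_ocr_cooperative():
    # Same duration, but it yields to the event loop while waiting.
    await asyncio.sleep(0.3)


if __name__ == "__main__":
    print("blocking stall:", asyncio.run(max_loop_stall(fake_ocr_blocking)))
    print("cooperative stall:", asyncio.run(max_loop_stall(fake_ocr_cooperative)))
```

A stall close to the full duration of the step suggests the step is blocking the loop, which would also keep the server from answering the browser during that time.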

@seleniumpython2016
Author

seleniumpython2016 commented Dec 17, 2024

@dokterbob I used async OCR, but the page still froze. In the background, I saw 'connection closed'. When I refreshed the page, the reply from the large language model had already been generated.

@seleniumpython2016
Author

@dokterbob I guess the session timed out? What is the default session timeout, and how can I change it?

@seleniumpython2016
Author

@dosu what do you think ?


dosubot bot commented Dec 25, 2024

The default session timeout for Chainlit is 3600 seconds. You can adjust the session timeout by configuring the session_timeout parameter in your project's configuration file. Here's how you can set it:

[project]
session_timeout = <desired_timeout_in_seconds>

This should help you manage session timeouts more effectively, especially if your operations, like OCR and embedding, take a significant amount of time [1].


@willydouhard
Copy link
Collaborator

My guess is that your embedding step is taking a long time (which is fine) but is using synchronous code (which is not fine). Check https://docs.chainlit.io/guides/sync-async
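As a rough illustration of the point above, a blocking embedding call can be moved to a worker thread so the event loop stays responsive. This is a minimal sketch using only the standard library (`asyncio.to_thread`, Python 3.9+). `embed_pages_sync` is a hypothetical stand-in for the real embedding code, and the linked guide describes Chainlit's own helper for wrapping synchronous functions, which should be preferred inside a Chainlit app.

```python
import asyncio
import time


def embed_pages_sync(pages):
    # Hypothetical stand-in for a slow, blocking embedding call.
    time.sleep(0.2)
    return [len(p) for p in pages]


async def embed_pages(pages):
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(embed_pages_sync, pages)


async def main():
    ticks = 0

    async def keep_alive():
        # Simulates other work the server must keep doing (e.g. answering the browser).
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(keep_alive())
    result = await embed_pages(["page one", "page two!"])
    beat.cancel()
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `embed_pages_sync` were called directly inside the coroutine instead, `ticks` would stay at zero until the embedding finished, which matches the frozen page described in this issue.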

3 participants