[Refactor] Improve and optimize crawling API performance #37

Open
4 tasks
yxhwxn opened this issue Aug 12, 2024 · 0 comments · Fixed by #42

yxhwxn commented Aug 12, 2024

🔨 What needs refactoring

Comment crawling API logic

Refactoring work branch

refactor/37

✅ refactoring TODO

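  • Asynchronous processing
  • Distribute comment processing across threads (multithreading)
  • Improve the crawl termination condition (currently the crawl stops only when no new comments remain; instead, stop after a set number of consecutive attempts with no new comments, or enforce a stricter crawl time limit)
  • Use database bulk operations (instead of saving records one at a time, collect them into fixed-size batches and save each batch in a single operation)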
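
A minimal sketch of how the termination and bulk-save items above could fit together. It is not the project's actual crawler: `crawl_comments`, `fetch_page`, `save_batch`, and the defaults for `max_empty_rounds`, `time_limit_sec`, and `batch_size` are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Iterable, List


def crawl_comments(
    fetch_page: Callable[[], Iterable[str]],   # hypothetical: returns the comments currently visible
    save_batch: Callable[[List[str]], None],   # hypothetical: one bulk write per call
    max_empty_rounds: int = 3,                 # assumed default, not from the issue
    time_limit_sec: float = 60.0,              # assumed default, not from the issue
    batch_size: int = 100,                     # assumed default, not from the issue
) -> int:
    """Crawl until no new comments appear for `max_empty_rounds` rounds in a
    row or the time limit passes; save comments in batches. Returns the
    number of comments saved."""
    seen = set()
    buffer: List[str] = []
    empty_rounds = 0
    saved = 0
    deadline = time.monotonic() + time_limit_sec

    while empty_rounds < max_empty_rounds and time.monotonic() < deadline:
        new_count = 0
        for comment in fetch_page():
            if comment in seen:
                continue  # already collected in an earlier round
            seen.add(comment)
            buffer.append(comment)
            new_count += 1
            if len(buffer) >= batch_size:
                save_batch(buffer)  # bulk write instead of one write per comment
                saved += len(buffer)
                buffer = []
        # Reset the counter when something new arrived, otherwise count the miss.
        empty_rounds = 0 if new_count else empty_rounds + 1

    if buffer:  # flush whatever is left below the batch size
        save_batch(buffer)
        saved += len(buffer)
    return saved
```

In a real service, `save_batch` would wrap whatever bulk-insert call the project's database layer provides, and `fetch_page` would wrap the scroll-and-parse step of the crawler.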

Notes

By policy, comments can currently be re-collected only once. We also need logic that adds a field to the event table and blocks any further crawling once its value is True.
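
The one-time re-collection rule could be guarded roughly as follows. This is an in-memory sketch only: the `Event` class, the `recrawled` field name, the `RecrawlNotAllowed` exception, and the `crawl` callable are hypothetical stand-ins for the real event table and crawler.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    event_id: int
    recrawled: bool = False  # hypothetical field added to the event table


class RecrawlNotAllowed(Exception):
    """Raised when comments for an event were already re-collected."""


def recrawl_comments(event: Event, crawl: Callable[[int], int]) -> int:
    """Run the crawl once per event and block every later attempt."""
    if event.recrawled:
        raise RecrawlNotAllowed(f"event {event.event_id} was already re-collected")
    result = crawl(event.event_id)
    # Set the flag only after a successful crawl, so a failed run can be retried.
    event.recrawled = True
    return result
```

Setting the flag after the crawl succeeds is a design choice in this sketch; if the policy should count failed attempts too, the flag would be set before calling `crawl`.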

@yxhwxn yxhwxn added the refactor code refactoring label Aug 12, 2024
@yxhwxn yxhwxn self-assigned this Aug 12, 2024
@yxhwxn yxhwxn linked a pull request Aug 16, 2024 that will close this issue
@yxhwxn yxhwxn reopened this Aug 17, 2024