Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent app from crashing #498

Closed
wants to merge 3 commits into from
Closed

Conversation

rupav
Copy link
Member

@rupav rupav commented Feb 6, 2018

Fixes #494

Checklist

  • I have read the Contribution & Best practices Guide and my PR follows them.
  • My branch is up-to-date with the Upstream master branch.
  • I have added necessary documentation (if appropriate)

Changes proposed in this pull request:

  • Prevents app from crashing for those engines whose qtype like news not implemented, yet is parsed, and creates a 400/404 error internally.
  • parsijoo, although with 200 status_code, returns empty urls, and hence cases like this is also handled.
    https://aqueous-refuge-13068.herokuapp.com/

Few queries to look for:
search chelsea, friend, local or any other query on google+news and parsijoo+news in both original http://query-server.herokuapp.com/ and https://aqueous-refuge-13068.herokuapp.com/ .

print('Parsijoo parsed: ' + str(urls))

try:
print('Parsijoo parsed: ' + str(urls))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't use try except block, I am getting error :
'charmap' codec can't encode characters in position 29-31: character maps to <undefined>.
This error comes up when urls are empty.
It seems to be a console error rather than python as per statckoverflow . So handled this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like encoding problem

@rupav
Copy link
Member Author

rupav commented Feb 6, 2018

@raju249 @realslimshanky let me know if app is now not crashing anyhow.
I know for a given status code I haven't put a correct corresponding message to show.
But for that, in this PR only, I will create an external file, ContentManagement, where I will map, all error codes like,
400, 419 etc., to their respective message to show on browser. I will fetch this in python script and then I will show message as per error code.
Can I have the liberty to create another file ContentManagement?

Copy link
Member

@vaibhavsingh97 vaibhavsingh97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Job 👏

@@ -42,8 +42,12 @@ def get_page(self, query, startIndex=0, qtype=''):
if self.name == 'mojeek' and qtype == 'news':
payload['fmt'] = 'news'
response = requests.get(url, headers=self.headers, params=payload)
status_code = response.status_code
print(status_code)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this statement. IMO, we can use logging to give detail logs on console. Let me know your opinion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vaibhavsingh97 I don't know about this logging.(I thought print is only used to debug) Can you please point me to a link to read on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can look here and here

status_code = response.status_code
print(status_code)
if(status_code == 400 or status_code == 404):
return (None, status_code)
print(response.url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

if(len(urls) == 0):
return (None, status_code)
else:
print("Couldn't fetch more results.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the case where, urls fetched are not equal to expected number of urls. Should not be removed from logs then in my opinion. As we have not implemented no. of urls fetched yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not asking to remove this, I am asking to replace with logging statements

print('Parsijoo parsed: ' + str(urls))

try:
print('Parsijoo parsed: ' + str(urls))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like encoding problem

app/server.py Outdated
@@ -36,11 +36,24 @@ def store(url, links):
def index():
return render_template('index.html', engines_list=scrapers.keys())

'''
def bad_request(error):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function doesn't handle the case of csv

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vaibhavsingh97 can this be created as a separate issue for beginners. No case handling for csv too was present earlier. Should I open an issue for the same for beginners?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rupav Yes, That would be great and help beginner to solve the issue as well 👍

app/server.py Outdated
return make_response(response, error[0])
print(error[2])
if error[2] == 'xml':
print("Its XML!!!!!")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@vaibhavsingh97
Copy link
Member

? But for that, in this PR only, I will create an external file, ContentManagement, where I will map, all error codes like,
400, 419 etc., to their respective message to show on browser. I will fetch this in python script and then I will show message as per error code.
Can I have the liberty to create another file ContentManagement?

Will this will not create problem for other file types like csv and xml. Can you please give more detail on this.

@rupav
Copy link
Member Author

rupav commented Feb 6, 2018

@vaibhavsingh97 .No, regarding your query on adding a message corresponding to a given error,
it will not create error for xml. And for csv, its message will be shown in default json object. It will work just like it now...
2018-02-06

@vaibhavsingh97
Copy link
Member

@rupav I am getting [object Object] in CSV

@rupav
Copy link
Member Author

rupav commented Feb 6, 2018

Ok, @vaibhavsingh97 looking into it. 👍

@realslimshanky
Copy link
Member

Prevents app from crashing for those engines whose qtype like news not implemented, yet is parsed, and creates a 400/404 error internally.

IMHO, a user-friendly message should be included on the client side like News hasn't been implemented for this search engine yet.

@vaibhavsingh97
Copy link
Member

@realslimshanky This we can add as an enhancement and let beginner would solve that issue so from this atleast some new comer will join us and can join our awesome team 😄

@rupav
Copy link
Member Author

rupav commented Feb 6, 2018

@vaibhavsingh97 done (PS: https://travis-ci.org/fossasia/query-server/jobs/338069203 tests are failing because I introduced one more value to return 😅 i.e. status_code) These can be changed after this PR?
@realslimshanky please review and let me know if I should make ContentManagement file too.

@vaibhavsingh97
Copy link
Member

@rupav It will better if you fix travis as well

@rupav
Copy link
Member Author

rupav commented Feb 6, 2018

Ok @vaibhavsingh97 , so I have to change tests then. But one question. If I change tests in this PR, how can travis will be validating with new tests since new tests has not been merged 🤔?

@vaibhavsingh97
Copy link
Member

@rupav So what travis does is pull your commit and run tests present in your commit so if you change/add tests than travis will test those tests

Copy link
Member

@realslimshanky realslimshanky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the travis report for your PR as many tests are failing.

@rupav
Copy link
Member Author

rupav commented Feb 7, 2018

yes @realslimshanky , since I introduces one more parameter, tests are failing. Looking into them 👍

@rupav rupav force-pushed the mock-494-issue branch 7 times, most recently from 60fec61 to b469b4d Compare February 8, 2018 04:32
@rupav rupav force-pushed the mock-494-issue branch 7 times, most recently from 33fda7b to feeec13 Compare February 8, 2018 15:33
@rupav
Copy link
Member Author

rupav commented Feb 8, 2018

@vaibhavsingh97 I solved almost all conflicts. had to comment out a test too (as it will always fail),
there is one I couldnt solve. Please see python 3.6 travis logs, which states
AssertionError: assert [{'desc': 'mo...'mock_title'}] == {'error': 'Cou...Server Error'} .
This means its not fetching results. But I tried running locally with given query, qformat, and engine, giving correct results.
ping @realslimshanky

@realslimshanky
Copy link
Member

Firstly it's not recommended to skip tests, especially when we are modifying the API.
I tried to look into your changes and break apart the test alongside your modifications. The error Travis is throwing is the one when server.py breaks while retrieving and sending result from the scrapper. Maybe I'm not getting to the point where the problem lies, even the environment is almost the same for local dev as well. We should hold this implementation until there is a solution.

@@ -76,6 +77,7 @@ def test_api_search_for_no_response(mock_bad_request):
mock_bad_request.assert_called_with([404, 'No response',
'google:fossasia'])
assert resp == "Mock Response"
'''
Copy link
Member Author

@rupav rupav Feb 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@realslimshanky can you please see this test I skipped(commented out)...
test_api_searh_for_no_response, why line 77 will ever be called? My point is why bad_request function will be called when '/api/v1/search/google?query=fossasia' fetches correct results, so it will not lead to calling up bad request. That's why this test will fail every time 😫

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it is not testing for response from the server, it's testing the response already stored in catch. And since the request is made only once it's not present in cache.

Copy link
Member Author

@rupav rupav Feb 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if not present in the cache, which code ideally should be executed, bad_request should have been called ?
Which statements are we talking about here ... Please help. @realslimshanky

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not worked with pytest/mock yet and would take some time to fully understand in order to assist you better. I would like to request contributors who have worked on this module to please help out.
cc: @vaibhavsingh97 @bhaveshAn

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@umangahuja1
Copy link
Contributor

Bing news is also crashing (Try : cricket)

@vaibhavsingh97
Copy link
Member

@rupav Status?

@rupav
Copy link
Member Author

rupav commented May 17, 2018

@vaibhavsingh97 sorry for such a long delay, will get back to this, once I am through the documentation of pytest. I will update this as soon as I understand the problem ( which I raised earlier ).

@mariobehling
Copy link
Member

Closing this due to inactivity. Please reopen new PR if you want to resolve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Few search engines crashing the app, taking too much time to parse, e.g. Parsijoo.
5 participants