RFC: Skip proxies for requests allowed by robots.txt #334
Replies: 5 comments 2 replies
-
I would recommend creating a tutorial instead, explaining that some scholarly requests comply with robots.txt. The ones that do not comply with robots.txt really need a proxy. That part will also explain to newcomers that just because the |
-
A tutorial sounds good in general, but should we then leave it to the individual applications to turn the proxy on or off depending on the type of query? I'm thinking of applications that build citation networks, which need to query both authors and publications. It would be cleaner if this were handled within the library instead of in the application. Turning it off and on in the application has a small penalty in that, |
-
I believe that a proxy may be needed even for the vanilla queries, especially when people overload the scholar service. So, I would be hesitant to put code in scholarly that switches off the proxy automatically. |
-
After some thought, I think I'd like to design the The use case that I'm really trying to optimize here is that of having a free ScraperAPI account that allows only 1000 successful requests a month, that I'd like to use only sparingly when required. |
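The comment above does not spell out the design, so the following is only one possible reading of "use sparingly when required": try a request directly first and spend a proxy request only when the direct attempt is blocked. Everything here (`Blocked`, `ProxyBudget`, `fetch_with_fallback`, and the `fetch` callable) is a hypothetical name for illustration and is not part of scholarly.

```python
from typing import Callable, Optional


class Blocked(Exception):
    """Hypothetical signal that a direct request was refused or rate limited."""


class ProxyBudget:
    """Counts proxy attempts against a fixed limit (e.g. a monthly quota).

    Assumption: every proxy attempt is counted, which is stricter than a
    provider that bills only successful requests.
    """

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def spend(self) -> bool:
        """Reserve one proxy request; return False if the budget is exhausted."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def fetch_with_fallback(
    url: str,
    fetch: Callable[[str, Optional[dict]], str],
    proxies: dict,
    budget: ProxyBudget,
) -> str:
    """Fetch directly; fall back to the proxy only if the direct attempt is blocked."""
    try:
        return fetch(url, None)
    except Blocked:
        if not budget.spend():
            raise  # no quota left, surface the original failure
        return fetch(url, proxies)
```

With this shape, the application never toggles the proxy itself; the quota is touched only on the fallback path.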
-
I am ready to drop both Tor and FreeProxies. They do not really work in most cases anymore. |
-
Currently, once a proxy is set, scholarly uses it to get all pages, even those that are allowed by Google Scholar's robots.txt. This leads to the proxy API charging the user for such requests when no proxy would have worked just fine. Does it make sense for scholarly to skip the configured proxies when retrieving information from "citations?" pages and use proxies (if set) for "scholar?" pages only? Should we give users an option to stay behind a proxy even for such requests? Looking for opinions and comments before I work on this feature.
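To make the proposal concrete, here is a minimal sketch of the routing decision, assuming the rule is simply "paths starting with /citations go direct, everything else uses the proxy". The function name `choose_proxies`, the `always_proxy` opt-out, and the prefix list are hypothetical and are not scholarly's actual API; the prefix list stands in for whatever robots.txt really allows.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical rule set: path prefixes assumed to be allowed by robots.txt.
PROXY_FREE_PREFIXES = ("/citations",)


def choose_proxies(
    url: str,
    proxies: Optional[dict],
    always_proxy: bool = False,
) -> Optional[dict]:
    """Return the proxy mapping to use for `url`, or None to connect directly."""
    # No proxy configured, or the user asked to stay behind it for every page.
    if proxies is None or always_proxy:
        return proxies
    path = urlparse(url).path
    # str.startswith accepts a tuple of prefixes.
    if path.startswith(PROXY_FREE_PREFIXES):
        return None
    return proxies
```

The returned value has the same shape as the `proxies` argument accepted by common HTTP clients, so a caller could pass it straight through. The `always_proxy` flag corresponds to the question of letting users stay behind a proxy even for allowed pages.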