I have a slightly strange problem with quickscrape.
I want to run something like this:
quickscrape --urllist test_dois.txt --scraper ../journal-scrapers/scrapers/plos.json --output plos-test2
That is, I want to use relative paths for the URL list and the scraper file.
When running this on OS X it works fine, but when running on my Linux server I get an error saying that it can't find the urllist file.
Simplifying this a bit and looking just at the urllist file, if I run
./quickscrape.js --urllist test_dois.txt --scraper /mnt/cm-volume/content-mine/journal-scrapers/scrapers/plos.json --output plos-test2
I get:
info: quickscrape 0.4.7 launched with...
info: - URLs from file: undefined
info: - Scraper: /mnt/cm-volume/content-mine/journal-scrapers/scrapers/plos.json
info: - Rate limit: 3 per minute
info: - Log level: info
fs.js:427
return binding.open(pathModule._makeLong(path), stringToFlags(flags), mode);
^
Error: ENOENT, no such file or directory 'test_dois.txt'
at Object.fs.openSync (fs.js:427:18)
at Object.fs.readFileSync (fs.js:284:15)
at loadUrls (/mnt/cm-volume/content-mine/quickscrape/bin/quickscrape.js:154:17)
at Object.<anonymous> (/mnt/cm-volume/content-mine/quickscrape/bin/quickscrape.js:164:41)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Function.Module.runMain (module.js:497:10)
at startup (node.js:119:16)
I have absolutely no idea why this behaves differently on Linux than on OS X.
Interestingly, I seem to be able to fix this error by moving the process.chdir call further down the file, so that it is called only after the URL list has been loaded (see the diff at master...robintw:relative-paths). This seems to work on both Linux and OS X, and I'm happy to submit this as a PR if that would be useful.
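To illustrate the idea behind that change, here is a minimal sketch. It is not the actual quickscrape code or the diff linked above. It assumes that the process.chdir call switches into a different directory (for example the output directory) before the URL list is read, which I have not confirmed in detail. The helper resolveFromLaunchDir and the function demo are hypothetical names made up for this example. The sketch shows that a relative path stops resolving after process.chdir, while a path made absolute beforehand still works.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not part of quickscrape): make a user-supplied path
// absolute relative to the directory the tool was launched from.
function resolveFromLaunchDir(userPath, launchDir) {
  return path.resolve(launchDir, userPath);
}

// Reproduces the situation in temporary directories so it can run anywhere.
function demo() {
  const launchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'launch-'));
  const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'output-'));
  fs.writeFileSync(
    path.join(launchDir, 'test_dois.txt'),
    'http://example.org/a\nhttp://example.org/b\n'
  );

  const originalCwd = process.cwd();
  process.chdir(launchDir); // pretend the tool was started here

  // Resolve the relative path BEFORE changing directory.
  const absoluteList = resolveFromLaunchDir('test_dois.txt', process.cwd());

  // Assumption: the tool changes directory at some point, e.g. into its output dir.
  process.chdir(outputDir);

  // The bare relative path no longer points at the file.
  let relativeFails = false;
  try {
    fs.readFileSync('test_dois.txt', 'utf8');
  } catch (err) {
    relativeFails = err.code === 'ENOENT';
  }

  // The absolute path resolved earlier still works.
  const urls = fs
    .readFileSync(absoluteList, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '');

  process.chdir(originalCwd); // restore the working directory
  return { relativeFails, urls };
}
```

Calling demo() returns an object saying whether the relative read failed and which URLs were loaded through the absolute path. Either approach (loading the list before process.chdir, as in my diff, or resolving paths to absolute ones first) avoids depending on the working directory at read time.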
I must say, I'm a bit confused by all of this though - and wondering whether I am being really stupid!