changed method of loading npm packages #4
base: master
@@ -17,6 +17,7 @@ const fs = require('fs')
"allPackagesOutput" : "/path/to/allpackages.json"
, "repositoriesOutput" : "/path/to/repositories.json"
, "githubOutput" : "/path/to/githubusers.json"
, "aussieOutput" : "/path/to/aussieOutput.json"
is this correct? I don't think this is needed here
I needed to, as it is used on line 87.
Good work (I hope!). Some style issues; I stopped commenting because I think you get the idea re spaces and semis.
Yeah, sorry, it is very foreign to my style so I struggled a bit. Will try to clean up now.
@rvagg hopefully that is all good.
I'm going to have to ponder this one; one request per package is going to be a pain in the backside.
For any particular reason? I'm not sure there is currently an API that does what we want in fewer requests and handles the current repository size.
Well, one alternative is to use an npm mirror. I have one in my house with each package as a JSON file on disk. I could use that; I just need to think through whether I want to rely on it!
Ok. Let me know what you think. How much space does a mirror need? I could potentially host one too. There would be a lot more data in there than we need for this system, but it may have other uses.
Happy to help host a mirror on my AWS if it's relatively easy.
ping @rvagg, have you had a chance to think about this approach yet?
hey @rvagg, just another gentle ping as I noticed you are online :)
It would be great if #3 was fixed, as I imagine there are a whole heap of developers not being listed on the site.
@MauriceButler I'm keen to get this in, are we still sure it works?
I am not sure. It has been a long time. I'll try to get time to run it up tonight and make sure it still works. If not, at least in the next couple of days.
Change the load-npm-data functionality to use some endpoints directly, as the previous method was getting incorrect / truncated data because of the growth of the registry, as per #3.
Using an endpoint to get just the package ids, as per npm/npm-registry-couchapp#162.
Then loading just the latest data for each module to get correct and up-to-date maintainers.
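As a rough illustration of the "one request per package" step, here is a minimal sketch that limits how many requests are in flight at once. It is not the code in this PR: `loadMaintainers`, the `fetchLatest` callback and the `concurrency` default are all hypothetical names chosen for the example, and the document shape (a `maintainers` array of `{ name }` objects) is an assumption.

```javascript
// Sketch only: `fetchLatest` is a hypothetical function supplied by the caller
// that resolves to the latest-version document for one package id.
async function loadMaintainers (packageIds, fetchLatest, concurrency = 10) {
  const maintainersById = {}
  const failed = []
  let next = 0

  // Each worker claims the next unprocessed id until the list is exhausted.
  async function worker () {
    while (next < packageIds.length) {
      const id = packageIds[next++]
      try {
        const doc = await fetchLatest(id)
        // Assumed shape: doc.maintainers is an array of { name } objects.
        maintainersById[id] = (doc.maintainers || []).map((m) => m.name)
      } catch (err) {
        failed.push(id) // keep going; one bad package should not stop the run
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, packageIds.length))
  const workers = []
  for (let i = 0; i < workerCount; i++) workers.push(worker())
  await Promise.all(workers)

  return { maintainersById, failed }
}
```

Capping the number of workers keeps the total request count the same but avoids opening thousands of connections to the registry (or a mirror) at once.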
Also skipping the adding of repos for a bunch of unprocessable packages, thus massively reducing the time to process the GitHub requests.
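To show the kind of filter this could be, here is a small sketch. What counts as "unprocessable" below is an assumption made for the example (no `repository` field, or a URL that does not point at a GitHub user/repo), and `githubRepoFor` and `processablePackages` are hypothetical helper names, not functions from this PR.

```javascript
// Sketch only: the definition of "unprocessable" here is an assumption
// (no repository field, or a URL that is not a GitHub user/repo URL).
function githubRepoFor (pkg) {
  const repo = pkg && pkg.repository
  // The repository field may be a plain string or an object with a url.
  const url = typeof repo === 'string' ? repo : repo && repo.url
  if (typeof url !== 'string') return null

  // Accepts both github.com/user/repo and git@github.com:user/repo forms.
  const match = url.match(/github\.com[\/:]([^\/\s]+)\/([^\/\s#]+)/i)
  if (!match) return null

  return { user: match[1], project: match[2].replace(/\.git$/, '') }
}

// Keep only packages that can be mapped to a GitHub repo before making
// any GitHub requests for them.
function processablePackages (packages) {
  return packages.filter((pkg) => githubRepoFor(pkg) !== null)
}
```

Filtering before the GitHub step means no request is spent on a package that could never resolve to a GitHub user.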
Hopefully, once this is deployed and the results are uploaded to https://github.com/polyhack/npm-github-data, issue #3 should be resolved.