The project consists of two parts: the CLI version and the UI. Build instructions for each are outlined in the corresponding sections below.
- Pull the git project to the Documents folder on macOS under the name RecommenderSystem (any other folder works if you don't wish to use the UI, or if you are using Windows or Linux).
- Install all packages listed in requirements.txt using pip.
- Install Tor as a brew service if you wish to use the Tor functionality (macOS only). This step is not recommended due to recent changes in Google's robot detection.
- Write the initial 4-6 citations into a txt/CSV input file.
- Run the scraper with a command similar to the following:
python3 scraper.py -i input.txt -o dataset.csv -t -n 10
-i specifies the input file, -o the output file, -t enables Tor, and -n sets the number of threads.
Please note: due to recent changes in Google's robot detection, using multiple threads or Tor may cause scraping errors. The following command is safe to use:
python3 scraper.py -i input.txt -o dataset.csv
- Run the analyser methods with the following commands:
python3 analysis.py -m jaccard -i dataset.csv -o output.csv -n 10
python3 analysis.py -m jaccard_no_division -i dataset.csv -o output.csv -n 10
python3 analysis.py -m network_degree -i dataset.csv -o output.csv -n 10
python3 analysis.py -m network_betweenness -i dataset.csv -o output.csv -n 10
python3 analysis.py -m paper -i dataset.csv -o output.csv -n 10
-m specifies the method, -i the input file, -o the output file, and -n the number of recommendations.
- Get your answer from the output file (output.csv in the example).
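The CLI steps above can also be chained from a small Python driver. The following is only a sketch: it uses the flags and method names exactly as they appear in the commands above, but the helper names (scraper_command, analysis_command, run_pipeline) and the output_<method>.csv naming scheme are hypothetical and not part of the project. It uses sys.executable in place of python3 so that the same interpreter runs both scripts.

```python
import subprocess
import sys

# Method names copied from the analysis commands above.
METHODS = [
    "jaccard",
    "jaccard_no_division",
    "network_degree",
    "network_betweenness",
    "paper",
]


def scraper_command(input_file, dataset_file):
    """Build the single-threaded, Tor-free scraper invocation."""
    return [sys.executable, "scraper.py", "-i", input_file, "-o", dataset_file]


def analysis_command(method, dataset_file, output_file, n_recommendations=10):
    """Build one analysis.py invocation for the given method."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    return [
        sys.executable, "analysis.py",
        "-m", method,
        "-i", dataset_file,
        "-o", output_file,
        "-n", str(n_recommendations),
    ]


def run_pipeline(input_file="input.txt", dataset_file="dataset.csv"):
    """Scrape once, then run every method into its own output file."""
    subprocess.run(scraper_command(input_file, dataset_file), check=True)
    for method in METHODS:
        # Hypothetical naming scheme so the methods do not overwrite each other.
        output_file = f"output_{method}.csv"
        subprocess.run(
            analysis_command(method, dataset_file, output_file), check=True
        )
```

Calling run_pipeline() from inside the RecommenderSystem folder would run the scraper once and then each analysis method in turn, stopping at the first command that exits with an error.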
The UI is contained in the MacOSFYPUI folder and is an Xcode project. The compilation instructions can be found below.
- Install the terminal version of the project into the Documents folder as described above.
- Navigate to RecommenderSystem/MacOSFYPUI.
- Open the Xcode project.
- Build it and move the resulting application to the Applications folder.
- Give the application permission to access the disk on the machine.
Now the application should be ready to use.
To use the application, perform the following steps:
- Start the application
- Enter the citations in the input field, one per line (no more than 6-7 is recommended).
- Pick the number of threads and whether you wish to use Tor (recommended settings: Tor = off and 1 thread).
- Click the Collect the data button.
- Wait until the data is collected.
- If you wish to use already collected data, select the file with the dataset to use; otherwise leave it at the default value.
- Select the method and the number of recommendations you wish to receive.
- The recommendations are displayed in the bottom right part of the application.
Q: Scraper doesn't work.
A: Install the Mozilla Firefox browser. It is required when Google Scholar blocks the program with a captcha. Then go to scholar.google.com and search for anything; the captcha will be triggered and can then be solved. Alternatively, Google may have blocked your IP address; in that case, use another IP or wait for approximately an hour.
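The "wait for approximately an hour" advice can be automated with a small retry wrapper. This is a minimal sketch under one important assumption that the project does not confirm: that scraper.py exits with a non-zero status code when Google blocks it. The function name run_with_retries and its parameters are hypothetical.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, wait_seconds=3600,
                     run=subprocess.run, sleep=time.sleep):
    """Run `command`; on a non-zero exit code, wait and try again.

    Returns True as soon as one attempt succeeds, False if all attempts fail.
    The `run` and `sleep` parameters exist only so they can be swapped out
    (for example in tests).
    """
    for attempt in range(1, attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return True
        if attempt < attempts:
            # Assumption: a blocked scrape shows up as a non-zero exit code.
            sleep(wait_seconds)
    return False


# Example call (run from the project folder):
# run_with_retries(["python3", "scraper.py", "-i", "input.txt", "-o", "dataset.csv"])
```

If the scraper reports a block in some other way (for example by writing an empty dataset), the success check would need to be adapted accordingly.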