
Web crawler that logs into the website and locates the secret flags, passing along the appropriate cookies and CSRF tokens to retrieve account data.


High-Level Approach

The high-level approach begins with establishing a TCP socket connection; since we are crawling the website, it is critical that this connection be secure. Once the connection has been established, the crawler logs in to the FakeSpot website. The login consists of a GET request to fetch the login page, a POST request to actually log in, passing along the cookies from those responses, and then receiving all of the links on the page. The crawler must be logged in properly before it can collect links and search for the flags. From there, detection continues: the crawler keeps searching the website until it finds all 5 of the flags. We also built in a load-management feature: if the crawler exceeds 5,000 requests, it restarts the socket connection.
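Below is a minimal sketch of that connection and login flow. It assumes the site is served over HTTPS, a Django-style login form at /accounts/login/, and a csrftoken cookie; the host name, port, paths, and form field names are placeholders for illustration, not the project's actual values.

```python
import re
import socket
import ssl

HOST = "fakespot.example.edu"   # placeholder host name (assumption)
PORT = 443                      # assumed HTTPS/TLS port

def open_connection(host=HOST, port=PORT):
    """Open a TLS-wrapped TCP socket to the crawl target."""
    ctx = ssl.create_default_context()
    return ctx.wrap_socket(socket.create_connection((host, port)),
                           server_hostname=host)

def get_cookies(response):
    """Collect Set-Cookie values into a single Cookie header string."""
    return "; ".join(re.findall(r"Set-Cookie: ([^;\r\n]+)", response))

def login(sock, username, password):
    """GET the login page, then POST the credentials with the cookies and CSRF token."""
    # 1. GET the login page to obtain the session cookie and CSRF token.
    sock.sendall((f"GET /accounts/login/ HTTP/1.1\r\nHost: {HOST}\r\n"
                  "Connection: keep-alive\r\n\r\n").encode())
    page = sock.recv(65535).decode(errors="replace")
    cookies = get_cookies(page)
    match = re.search(r"csrftoken=([^;\s]+)", page)  # assumed cookie name
    token = match.group(1) if match else ""

    # 2. POST the credentials, passing the cookies and CSRF token along.
    body = (f"username={username}&password={password}"
            f"&csrfmiddlewaretoken={token}")
    sock.sendall((f"POST /accounts/login/ HTTP/1.1\r\nHost: {HOST}\r\n"
                  f"Cookie: {cookies}\r\n"
                  "Content-Type: application/x-www-form-urlencoded\r\n"
                  f"Content-Length: {len(body)}\r\n"
                  "Connection: keep-alive\r\n\r\n" + body).encode())
    # The session cookies in this response are what later requests must carry.
    more = get_cookies(sock.recv(65535).decode(errors="replace"))
    return "; ".join(c for c in (cookies, more) if c)
```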

Next comes the search for the secret flags. If a link has not been seen before, it is added to a global list. The crawler then extracts the cookies from the response and sends them back with each subsequent request, mimicking the way a browser works; passing the cookies along prevents running into CSRF errors. For each link it retrieves, the crawler checks the response code, which is parsed out by a helper function. Based on that response code, it either ignores the link, retries it, retries at a new link, or scans the page for a secret flag. Whenever a secret flag is found, it is printed out. This continues indefinitely until all of the flags (a maximum of 5) have been printed.
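A minimal sketch of that crawl loop is shown below, continuing from the connection sketch above (it reuses HOST and the cookie string returned by login). The status-code handling mirrors the description: dead links are ignored, server errors are retried, redirects are retried at the new location, and 200 responses are scanned for flags. The flag pattern, start path, and link format are assumptions for illustration.

```python
import re
from collections import deque

FLAG_RE = re.compile(r"FLAG: ([0-9a-f]{64})")   # assumed flag format
LINK_RE = re.compile(r'href="(/[^"]*)"')        # assumed internal-link format

def fetch(sock, path, cookies):
    """Send a GET with the session cookies and split the raw response."""
    sock.sendall((f"GET {path} HTTP/1.1\r\nHost: {HOST}\r\n"
                  f"Cookie: {cookies}\r\nConnection: keep-alive\r\n\r\n").encode())
    raw = sock.recv(65535).decode(errors="replace")
    head, _, body = raw.partition("\r\n\r\n")
    status = int(head.split(" ", 2)[1])
    headers = dict(line.split(": ", 1)
                   for line in head.split("\r\n")[1:] if ": " in line)
    return status, headers, body

def crawl(sock, cookies, start_path="/"):
    frontier = deque([start_path])      # links waiting to be visited
    seen = {start_path}
    flags = []

    while frontier and len(flags) < 5:
        path = frontier.popleft()
        status, headers, body = fetch(sock, path, cookies)

        if status in (403, 404):
            continue                                        # abandon the link
        elif status == 503:
            frontier.append(path)                           # retry the same link later
        elif status in (301, 302):
            frontier.append(headers.get("Location", path))  # retry at the new URL
        elif status == 200:
            for flag in FLAG_RE.findall(body):
                print(flag)                                 # print each flag as it is found
                flags.append(flag)
            for link in LINK_RE.findall(body):
                if link not in seen:                        # only queue unseen links
                    seen.add(link)
                    frontier.append(link)
```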

Challenges you faced

The major challenge we faced was understanding what was expected and how to go about it. We had never worked on something like this before, so there were obstacles to get through just to understand what was going on before we could start coding it. An hour-long video from one of the professors, specifically designed to explain how the details of the channels worked, would have improved our understanding of the material and led us to the correct path faster.

Overview of testing the code

The main way we tested the code was by running it and checking whether the secret flags came back. We added debugging print statements to make sure that links were being added to the appropriate variable. We then verified that each added link was parsed correctly, so that if a secret flag was embedded in the page we would be able to locate it. After finding a secret flag, we kept the crawler running to see whether there were other secret flags. We ran it multiple times locally to ensure that the code works correctly.
