A Python script that scrapes Twitter's trending topics using proxies, stores the data in MongoDB, and generates an HTML report.
- Scrapes top trending topics from Twitter
- Uses proxy rotation for reliable data collection
- Stores results in MongoDB for historical tracking
- Generates an interactive HTML report
- Supports re-running the scraper from the web interface
Requirements:
- Python 3.8+
- MongoDB database
- ProxyMesh account or similar proxy service
- Twitter account
The HTML report includes:
- Top 5 trending topics
- Current proxy IP address
- MongoDB record details
- Option to re-run the scraper
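As a rough illustration of how a report with these four elements might be assembled, the sketch below renders a small HTML page using only the standard library. The `render_report` function, the page layout, and the `/run` link target are assumptions made for this example; the actual script may build its report differently.

```python
import json
from html import escape


def render_report(trends, proxy_ip, record):
    """Return a standalone HTML page summarising one scraper run (hypothetical helper)."""
    # Escape every dynamic value so trend text cannot inject markup.
    top_five = list(trends)[:5]
    items = "\n".join(f"      <li>{escape(str(t))}</li>" for t in top_five)
    # default=str lets non-JSON values (e.g. datetimes) be shown as text.
    record_json = escape(json.dumps(record, indent=2, default=str))
    return f"""<!DOCTYPE html>
<html>
  <body>
    <h1>Trending topics</h1>
    <ol>
{items}
    </ol>
    <p>Proxy IP used: {escape(str(proxy_ip))}</p>
    <pre>{record_json}</pre>
    <!-- "/run" is a placeholder route for the re-run option -->
    <a href="/run">Run the scraper again</a>
  </body>
</html>
"""
```

Escaping the trend text matters here because topic names come from an external page and could otherwise break the report markup.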
- Clone the repository:
git clone https://github.com/yourusername/twitter-trending-scraper.git
cd twitter-trending-scraper
- Install required dependencies:
pip install -r requirements.txt
Configure your MongoDB connection in the script:
MONGO_URI = "mongodb+srv://your-connection-string"
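For orientation, here is a minimal sketch of how one scraper run might be stored once the connection is configured. The record fields (`trends`, `proxy_ip`, `scraped_at`) and the helper names are assumptions for illustration, not the script's actual schema. The `collection` argument can be any object with an `insert_one` method, such as a pymongo collection.

```python
import uuid
from datetime import datetime, timezone


def build_trend_record(trends, proxy_ip):
    """Build one document per scraper run (field names are assumed)."""
    return {
        "_id": uuid.uuid4().hex,            # unique id for this run
        "trends": list(trends)[:5],         # keep the top five topics
        "proxy_ip": proxy_ip,               # IP the request went out on
        "scraped_at": datetime.now(timezone.utc),
    }


def store_trends(collection, trends, proxy_ip):
    """Insert the record and return it so the report can display it."""
    record = build_trend_record(trends, proxy_ip)
    collection.insert_one(record)
    return record


# With pymongo installed, usage might look like this
# (database and collection names are placeholders):
#   from pymongo import MongoClient
#   collection = MongoClient(MONGO_URI)["twitter_trends"]["runs"]
#   store_trends(collection, ["#Example"], "203.0.113.7")
```

Storing a timestamp with each run is what makes the historical tracking possible, since runs can then be sorted or filtered by date.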
Update the proxy settings:
PROXY_HOST = "your.proxy.host"
PROXY_PORT = 8080  # placeholder: replace with your proxy port (an integer, not a string)
PROXY_USERNAME = "your_proxy_username"
PROXY_PASSWORD = "your_proxy_password"
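One common way to use these four settings is to combine them into a single authenticated proxy URL. The sketch below shows that idea under the assumption of an HTTP proxy; `build_proxy_url` is a hypothetical helper, and how the script itself hands the proxy to its browser or HTTP client is not shown here.

```python
from urllib.parse import quote


def build_proxy_url(host, port, username, password, scheme="http"):
    """Combine proxy settings into scheme://user:pass@host:port (hypothetical helper)."""
    # Percent-encode credentials so characters like "@" or ":" stay unambiguous.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{int(port)}"


if __name__ == "__main__":
    url = build_proxy_url("your.proxy.host", 8080, "your_proxy_username", "p@ss:word")
    print(url)
```

HTTP client libraries such as `requests` accept a URL of this form in their `proxies` mapping, which makes it a convenient way to check that the proxy credentials work before running the full scraper.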
Set your Twitter credentials as environment variables:
export TWITTER_USERNAME="your_username"
export TWITTER_PASSWORD="your_password"
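On the Python side, these variables can be read with `os.environ`. The following is a small sketch, assuming the script wants to fail early with a clear message when either variable is missing; `load_twitter_credentials` is an illustrative name, not necessarily a function in the script.

```python
import os


def load_twitter_credentials(env=os.environ):
    """Return (username, password) from the environment, or raise if either is unset."""
    names = ("TWITTER_USERNAME", "TWITTER_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TWITTER_USERNAME"], env["TWITTER_PASSWORD"]
```

Keeping credentials in the environment rather than in the source file means they are less likely to be committed to version control by accident.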
Run the script:
python selenium_script.py
The script will:
- Connect to Twitter using the configured proxy
- Fetch current trending topics
- Store the data in MongoDB
- Generate an HTML report
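The four steps above can be pictured as a simple pipeline. The sketch below is only a schematic: `run_once` and the three callables passed to it are hypothetical stand-ins for the script's real scraping, storage, and reporting code.

```python
def run_once(fetch_trends, store_record, render_report, proxy_ip):
    """Chain the steps: fetch trends, store them, then render a report."""
    trends = fetch_trends()                        # scrape via the configured proxy
    record = store_record(trends, proxy_ip)        # persist the run, get the stored record
    return render_report(trends, proxy_ip, record) # HTML built from the same data


if __name__ == "__main__":
    # Stub callables stand in for the real implementations.
    html = run_once(
        fetch_trends=lambda: ["#Example"],
        store_record=lambda trends, ip: {"trends": trends, "proxy_ip": ip},
        render_report=lambda trends, ip, rec: f"<p>{len(trends)} trend(s) via {ip}</p>",
        proxy_ip="203.0.113.7",
    )
    print(html)
```

Passing the steps in as functions keeps each one testable in isolation, for example with stubs in place of the browser and the database.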
This project is licensed under the MIT License - see the LICENSE.md file for details.
This tool is for educational purposes only. Ensure you comply with Twitter's Terms of Service and rate-limiting policies when using this scraper.