Individual interdisciplinary project 2017/2018, Supervisors: Tobias Scheffer & Paul Prasse, Submission Date: 02.03.2018 @ University of Potsdam, Germany
This project runs on Anaconda with Python 3.6.3.
Run in console:
pip install -r requirements.txt
Set your Twitter API keys in config.py:
#Twitter API credentials
consumer_key = '#####'
consumer_secret = '#####'
access_token = '#####'
access_secret = '#####'
Create a folder ./data/datasets/ in root. Run in console:
python crawltwitter.py -a [twitter_user] -c1 [max_tweets] -c2 [max_accounts]
Example:
python crawltwitter.py -a DataScienceCtrl -c1 10000 -c2 100
Data for each retrieved account will be stored in data/datasets/.
The account names found during crawling will be stored in data/TwitterCrawlXXXX-XX-XXXXX.json.
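The crawl itself can be sketched as a cap-limited pagination loop over some fetch function (crawltwitter.py presumably uses the Twitter API via a client such as tweepy; the fetch_page callable and page size below are stand-ins, not the script's actual code):

```python
def crawl_tweets(fetch_page, max_tweets, page_size=200):
    """Collect up to max_tweets tweets by paging through fetch_page.

    fetch_page(max_id) must return a list of tweet dicts with an 'id' key,
    newest first; an empty list signals the end of the timeline.
    """
    tweets = []
    max_id = None  # None means: fetch the newest page first
    while len(tweets) < max_tweets:
        page = fetch_page(max_id)
        if not page:
            break
        tweets.extend(page)
        max_id = page[-1]["id"] - 1  # continue below the oldest tweet seen
    return tweets[:max_tweets]
```

This is the standard max_id-based pagination pattern for timeline APIs: each request asks only for tweets older than the last one already collected, so no tweet is fetched twice.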
Create a folder ./data/gathered/ in root. Run in console:
python gather.py -max 10000
A JSON file will be created in data/gathered/ with data ready for training.
Create folders ./data/cache/ and ./data/publish/ in root. Run in console:
python training.py -d data/gathered/gathering_xxx_xxx.json -save -i -i -i -i
The four -i options correspond to the highest level of dataset optimisation; they can be removed.
Feature importances and baseline test results will be displayed.
With the -save option, the model will be saved in ./data/publish/.
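As a sanity check on the reported baselines, a majority-class baseline is easy to reproduce by hand (this illustrates the idea only, not training.py's actual baseline code):

```python
from collections import Counter

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label.

    Any trained model should beat this number; if it does not, the
    features carry no usable signal (or something is broken).
    """
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority)
    return correct / len(test_labels)
```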
Run in console:
python server.py -f data/publish/xxxxxx.model.json
Then go to http://127.0.0.1:5000 and play with predictions.