Commit

Merge pull request #17 from claromes/unique-tweets
Collapse by timestamp
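
As a rough illustration of what the `collapse=timestamp:8` parameter in this commit asks the CDX server to do: per the CDX server README, adjacent result rows whose timestamps share the same first 8 digits (`YYYYMMDD`) are reduced to the first row of the run. The sketch below models only that adjacency rule locally. `collapse_by_prefix` and the sample rows are hypothetical and not part of `app.py`; the real filtering happens server-side.

```python
from itertools import groupby


def collapse_by_prefix(captures, digits=8):
    """Keep the first capture of each run of adjacent captures
    whose timestamps share the same leading `digits` characters.

    Hypothetical helper: it mimics the adjacency rule described in the
    CDX server README and is not code from app.py.
    """
    return [
        next(run)  # first capture of the run
        for _, run in groupby(captures, key=lambda c: c["timestamp"][:digits])
    ]


# Made-up sample rows, for illustration only.
sample = [
    {"timestamp": "20231213010000", "original": "https://twitter.com/example/status/1"},
    {"timestamp": "20231213230000", "original": "https://twitter.com/example/status/1"},
    {"timestamp": "20231214000500", "original": "https://twitter.com/example/status/1"},
]

collapsed = collapse_by_prefix(sample)
```

With these sample rows, the two captures from 2023-12-13 collapse into the earlier one and the 2023-12-14 capture is kept. Because the rule only looks at adjacent rows, two rows with the same 8-digit prefix that are separated by a different day would both survive.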
claromes authored Dec 13, 2023
2 parents 1cf0181 + 6c2a264 commit e5d9ba8
Showing 2 changed files with 7 additions and 1 deletion.
3 changes: 2 additions & 1 deletion app.py
@@ -220,7 +220,7 @@ def query_api(handle, limit, offset, saved_at):
st.warning('username, please!')
st.stop()

- url = f'https://web.archive.org/cdx/search/cdx?url=https://twitter.com/{handle}/status/*&output=json&limit={limit}&offset={offset}&from={saved_at[0]}&to={saved_at[1]}'
+ url = f'https://web.archive.org/cdx/search/cdx?url=https://twitter.com/{handle}/status/*&collapse=timestamp:8&output=json&limit={limit}&offset={offset}&from={saved_at[0]}&to={saved_at[1]}'
try:
response = requests.get(url)
response.raise_for_status()
@@ -378,6 +378,7 @@ def next_page():

st.session_state.count = tweets_count(handle, st.session_state.saved_at)

+ st.caption('The search optimization uses an 8-digit [collapsing strategy](https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md?ref=hackernoon.com#collapsing), refining the captures to one per day. The number of tweets per page is set to 25, and this is a fixed value due to the API rate limit.')
st.write(f'**{st.session_state.count} URLs have been captured**')

if st.session_state.count:
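
A side note on the hunk in `query_api`: the query string is assembled by hand in an f-string, so each new parameter such as `collapse` lengthens a single long literal. One possible alternative, shown only as a sketch, builds the same parameters with `urllib.parse.urlencode`. `build_cdx_url` is a hypothetical helper that does not exist in `app.py`, and `saved_at` is assumed here to be a pair of strings that the CDX `from` and `to` parameters accept.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(handle, limit, offset, saved_at):
    """Build the CDX query URL used in this diff from a parameter dict.

    Hypothetical helper, not part of app.py. `saved_at` is assumed to be
    a (from, to) pair of strings.
    """
    params = {
        "url": f"https://twitter.com/{handle}/status/*",
        "collapse": "timestamp:8",  # at most one capture per YYYYMMDD prefix
        "output": "json",
        "limit": limit,
        "offset": offset,
        "from": saved_at[0],
        "to": saved_at[1],
    }
    # Keep ':', '/' and '*' unescaped so the result matches the
    # hand-written URL; other special characters are percent-encoded.
    return f"{CDX_ENDPOINT}?{urlencode(params, safe=':/*')}"


example_url = build_cdx_url("example", 25, 0, ("2006", "2023"))
```

For plain handles this yields the same string as the f-string in the diff, while a handle containing characters such as `&` or a space would be escaped instead of altering the query.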
5 changes: 5 additions & 0 deletions docs/CHANGELOG.md
@@ -1,5 +1,10 @@
# Changelog

+ ## [v0.4.3](https://github.com/claromes/waybacktweets/releases/tag/v0.4.3) - 2023-12-13
+ - Add:
+ - 8-digit collapsing strategy (one capture per day)
+ - Messages about collapsing strategy and number of tweets displayed
+
## [v0.4.2](https://github.com/claromes/waybacktweets/releases/tag/v0.4.2) - 2023-12-13
- Add:
- Parse tweet URLs to delete `/photos`, `/likes`, `/retweets` and other sub-endpoints
