Evaluating GPT-4's Ability in Summarizing Key Points from Terms of Service Documents

Cheung, Ka Pui
AP Research
Grade 12, High School

This project started in Oct 2023 and ended in May 2024.

The code for this project is licensed under the MIT license. You are free to use this as a template for your research (with proper citation).

Abstract

This project aims to assess the effectiveness of GPT-4 in summarizing key points from Terms of Service (ToS) documents. By doing so, it seeks to address the challenges of accessibility and comprehension that internet service users often encounter. The project employs a diverse dataset of ToS documents drawn from various online services.

The evaluation methodology compares GPT-4-generated summaries against reference summaries written by human volunteers in the ToS;DR database. Several statistical metrics are used to assess the model's performance, including SMOG, Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, Dale-Chall Readability Formula, word count, BLEU score, and F1-score. These metrics provide insight into GPT-4's potential as a tool for making ToS documents more accessible.
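To make the readability metrics more concrete, here is a minimal sketch of two of them, Flesch Reading Ease and Flesch-Kincaid Grade Level, computed from sentence, word, and syllable counts. The project itself relies on the packages listed under Credits; the helper names below (`countText`, `estimateSyllables`, `fleschReadingEase`, `fleschKincaidGrade`) and the simple vowel-group syllable heuristic are assumptions made for illustration, not the project's actual code.

```typescript
// Illustrative sketch only; not the code used in this repository.

interface TextCounts {
  sentences: number;
  words: number;
  syllables: number;
}

// Rough syllable estimate: count groups of consecutive vowels.
// (Assumption: dedicated syllable counters handle many more exceptions.)
function estimateSyllables(word: string): number {
  const groups = word.toLowerCase().match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 0);
}

// Count sentences, words, and estimated syllables in a non-empty English text.
function countText(text: string): TextCounts {
  const sentences = text
    .split(/[.!?]+/)
    .filter((s) => s.trim().length > 0).length;
  const wordList: string[] = text.match(/[A-Za-z']+/g) ?? [];
  const syllables = wordList.reduce((sum, w) => sum + estimateSyllables(w), 0);
  return { sentences, words: wordList.length, syllables };
}

// Flesch Reading Ease: higher scores mean easier text.
function fleschReadingEase(c: TextCounts): number {
  return 206.835 - 1.015 * (c.words / c.sentences) - 84.6 * (c.syllables / c.words);
}

// Flesch-Kincaid Grade Level: approximate US school grade needed to read the text.
function fleschKincaidGrade(c: TextCounts): number {
  return 0.39 * (c.words / c.sentences) + 11.8 * (c.syllables / c.words) - 15.59;
}
```

Scoring both the original ToS document and its summary with the same functions shows whether the summary is easier to read than the source.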

Technical Information

This codebase uses several technologies for data management and processing. Research data is stored in a PostgreSQL database, which runs in a Docker container and is accessed through the Prisma ORM. Data is downloaded and processed through the OpenAI and ToS;DR APIs. Automation tools, including a command-line interface (CLI), are written in TypeScript.

Usage

To run the project code, clone this repository and install dependencies:

git clone https://github.com/kapuic/gpt-tos.git
cd gpt-tos
bun install # or use your preferred package manager

Start a PostgreSQL instance. You may use the docker-compose.yml file included in this repository:

docker-compose up -d

Copy the .env.template file to .env and fill in the necessary environment variables.

View Database

bun prisma studio

Open Grafana Analytics Dashboard

Go to http://127.0.0.1:3000.

Commands

`bun sample`

This script will create a sample of services.

`bun summarize`

This script will run the study by summarizing all documents in a sample.

`bun analyze`

This script will analyze documents and GPT summarizations, and find correlations between variables.
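As an illustration of what "finding correlations between variables" can look like, the sketch below computes a Pearson correlation coefficient between two numeric series, for example a readability score of each document and an accuracy score of its summary. The README does not state which correlation measure the script uses, so Pearson's r and the function name `pearson` are assumptions.

```typescript
// Illustrative sketch only; the actual analysis script may use a different measure.

// Pearson correlation coefficient between two equal-length numeric series.
// Returns a value in [-1, 1]; the result is NaN if either series is constant.
function pearson(xs: number[], ys: number[]): number {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("Need two equal-length arrays with at least two values");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;

  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}
```

A value near 1 or -1 suggests a strong linear relationship between the two variables, while a value near 0 suggests little or none.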

Possible Improvements for Edge Cases

Failed attempts are not captured in a log.
The current document download process does not account for the possibility of multiple services using the same document URL. (Is this a problem at all?)
Show dynamic stats when sampling and summarizing.
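One possible way to handle the shared-URL case is to group services by document URL before downloading, so that each URL is fetched once and its result is attached to every service that uses it. The `ServiceDocument` shape and the `groupByUrl` function below are hypothetical and do not reflect the actual Prisma schema; URLs are compared as exact strings, with no normalization.

```typescript
// Hypothetical sketch; field names do not reflect the project's actual schema.

interface ServiceDocument {
  serviceId: number;
  url: string;
}

// Map each distinct document URL to the IDs of all services that reference it.
function groupByUrl(docs: ServiceDocument[]): Map<string, number[]> {
  const groups = new Map<string, number[]>();
  for (const doc of docs) {
    const ids = groups.get(doc.url) ?? [];
    ids.push(doc.serviceId);
    groups.set(doc.url, ids);
  }
  return groups;
}
```

Iterating over the map's keys then gives one download per distinct URL, and any entry with more than one service ID marks a shared document.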

Why Open Source?

Open-sourcing this project enables others to replicate this work and promotes transparency and credibility. In addition, the project may be used as a template by researchers conducting similar studies on large language models. For these reasons, I decided to open-source the code after the College Board finished grading my AP Research project.

Credits

Datasets

Open PageRank

Third-Party Libraries

Extractors

@extractus/article-extractor by Extractus @extractus (GitHub, npm)

Complexity Analysis of ToS Documents

sentence-extractor by Gavin Song @Gavin-Song (GitHub, npm)
words-count by Baozier @byn9826 (GitHub, npm)
syllable by words @words (GitHub, npm)
smog-formula by words @words (GitHub, npm)
flesch-kincaid by words @words (GitHub, npm)
flesch by words @words (GitHub, npm)
gunning-fog by words @words (GitHub, npm)
coleman-liau by words @words (GitHub, npm)
dale-chall-formula by words @words (GitHub, npm)

Accuracy Analysis of GPT Summarizations

bleu-score by words @words (GitHub, npm)
rouge by Kenneth Lim @kenlimmj (GitHub, npm)

To use the above packages in another project:

bun install @extractus/article-extractor sentence-extractor words-count syllable smog-formula flesch-kincaid flesch gunning-fog coleman-liau dale-chall-formula bleu-score rouge
