Commit

Adding ESCI for UBI.
jzonthemtn committed Nov 10, 2024
1 parent fb21919 commit a80ba10
Showing 5 changed files with 24 additions and 0 deletions.
1 change: 1 addition & 0 deletions data/esci/.gitattributes
@@ -0,0 +1 @@
ubi_queries_events_1000.ndjson.bz2 filter=lfs diff=lfs merge=lfs -text
15 changes: 15 additions & 0 deletions data/esci/README.md
@@ -0,0 +1,15 @@
# ESCI Data in UBI Format

This directory contains ESCI data in the UBI format. It was created using the UBI data generator: https://github.com/opensearch-project/user-behavior-insights/tree/main/ubi-data-generator.

The source data is the Shopping Queries Dataset: https://github.com/amazon-science/esci-data

```
@article{reddy2022shopping,
title={Shopping Queries Dataset: A Large-Scale {ESCI} Benchmark for Improving Product Search},
author={Chandan K. Reddy and Lluís Màrquez and Fran Valero and Nikhil Rao and Hugo Zaragoza and Sambaran Bandyopadhyay and Arnab Biswas and Anlu Xing and Karthik Subbian},
year={2022},
eprint={2206.06588},
archivePrefix={arXiv}
}
```
3 changes: 3 additions & 0 deletions data/esci/index.sh
@@ -0,0 +1,3 @@
#!/bin/bash -e

curl -X POST "http://localhost:9200/_bulk?pretty" -H "Content-Type: application/x-ndjson" --data-binary @ubi_queries_events_1000.ndjson
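Note that the script posts `ubi_queries_events_1000.ndjson`, while the commit only adds the compressed, LFS-tracked `ubi_queries_events_1000.ndjson.bz2`. The following is a minimal sketch of how the decompression step could be handled before indexing. The function names `decompress_ndjson` and `bulk_index` are hypothetical and not part of the commit; the URL, header, and file name are taken from `index.sh`. It assumes `bunzip2` is installed and that the real archive has been fetched (for example with `git lfs pull`) rather than left as an LFS pointer.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of the commit.

# Decompress ARCHIVE (*.bz2) next to itself, keeping the archive (-k),
# and print the path of the decompressed file.
decompress_ndjson() {
  local archive="$1"
  local target="${archive%.bz2}"
  # Skip decompression if the target already exists.
  [ -f "$target" ] || bunzip2 -k "$archive" || return 1
  echo "$target"
}

# POST an NDJSON file to the _bulk endpoint used in index.sh.
bulk_index() {
  local ndjson="$1"
  curl -X POST "http://localhost:9200/_bulk?pretty" \
    -H "Content-Type: application/x-ndjson" \
    --data-binary "@${ndjson}"
}
```

With both functions defined, indexing could be run as `bulk_index "$(decompress_ndjson ubi_queries_events_1000.ndjson.bz2)"`.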
3 changes: 3 additions & 0 deletions data/esci/ubi_queries_events_1000.ndjson.bz2
Git LFS file not shown
@@ -9,6 +9,8 @@ services:
      plugins.security.disabled: "true"
      logger.level: info
      OPENSEARCH_INITIAL_ADMIN_PASSWORD: SuperSecretPassword_123
      http.max_content_length: 500mb
      OPENSEARCH_JAVA_OPTS: "-Xms8192m -Xmx8192m"
    ulimits:
      memlock:
        soft: -1
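The `http.max_content_length: 500mb` setting caps the size of a single HTTP request body, which presumably matters here because `index.sh` sends the whole NDJSON file in one `_bulk` request. As a rough sketch, a pre-flight check like the one below could confirm that a file fits under the limit before posting it. The function name `fits_in_request` is hypothetical, and the check assumes `mb` means 1024 × 1024 bytes.

```shell
#!/bin/bash
# Sketch only: fits_in_request is a hypothetical helper, not part of the commit.
# Succeeds (exit 0) if FILE is no larger than LIMIT_MB megabytes (default 500).
fits_in_request() {
  local file="$1" limit_mb="${2:-500}"
  local size_bytes
  size_bytes=$(wc -c < "$file")
  # Assumes 1 mb = 1024 * 1024 bytes.
  (( size_bytes <= limit_mb * 1024 * 1024 ))
}
```

For example, `fits_in_request ubi_queries_events_1000.ndjson 500 || echo "payload exceeds limit"` would warn before the request is sent.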
