Error trying to index osm-planet #129
@WolfgangFahl If you use the latest version of QLever and the latest version of the Qleverfile (see the linked PR, which will be merged soon), this problem should disappear.
The qlv script is now available at https://github.com/WolfgangFahl/qlv
git clone https://github.com/WolfgangFahl/qlv
./qlv -qc
Setting up QLever control in /opt/qlever-control...
Cloning into '/opt/qlever-control'...
remote: Enumerating objects: 2646, done.
...
Successfully built UNKNOWN
Installing collected packages: UNKNOWN
Successfully installed UNKNOWN-0.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
✅:QLever control setup and installed successfully.
✅:QLever is installed and available at /home/wf/bin/qlever
qlv -p
Pulling QLever Docker images...
Using default tag: latest
latest: Pulling from adfreiburg/qlever
5a7813e071bf: Pull complete
...
323a3c577443: Pull complete
Digest: sha256:8494e8f862a7be0450902e445c4043542460c1eacba05c9c34f8badf863ddc75
Status: Downloaded newer image for adfreiburg/qlever:latest
docker.io/adfreiburg/qlever:latest
✅:Successfully pulled adfreiburg/qlever
Using default tag: latest
latest: Pulling from adfreiburg/qlever-ui
c6a83fedfae6: Already exists
...
26d998712fd5: Pull complete
Digest: sha256:5ab6e9a2f44d159737c9fe0c7cd7f1bd6f10b43cefd0b8130a6c1fbc979252fa
Status: Downloaded newer image for adfreiburg/qlever-ui:latest
docker.io/adfreiburg/qlever-ui:latest
✅:Successfully pulled adfreiburg/qlever-ui
cd /opt/qlever-control
git pull
Updating 0e74e90..ba2823d
Fast-forward
.github/workflows/pytest.yml | 29 ++
.github/workflows/qleverfiles-check.yml | 1 +
pyproject.toml | 7 +-
...
Stored in directory: /home/wf/.cache/pip/wheels/e5/cd/6c/cbe6881bcd0490208d9bc2c9eb1e1f577f3b753b7e33f9e035
Successfully built qlever
Installing collected packages: qlever
Attempting uninstall: qlever
Found existing installation: qlever 0.5.11
Uninstalling qlever-0.5.11:
Successfully uninstalled qlever-0.5.11
Successfully installed qlever-0.5.17
pip install qlever works, while installing from source fails; see #136.
qlv --disk gamma --kg osm-planet -ir
✅:Created directory /hd/gamma/qlever/osm-planet_20250213
✅:Started screen session qlever_osm-planet_20250213.
✅:Logging to /hd/gamma/qlever/osm-planet_20250213/screen.log
is now running for a new attempt.
@WolfgangFahl Out of curiosity: why are you using your own […]? If you need more functionality, you can extend the […]. A further advantage is that if you have a useful extension, you can just turn it into a pull request and it might become part of the official script.
@hannahbast at qlever-api.conf
# QLever API Wikidata HTTPS configuration
# Last updated: 2024-10-18
<VirtualHost *:443>
ServerName qlever-api.wikidata.dbis.rwth-aachen.de
ServerAdmin webmaster@localhost
ErrorLog ${APACHE_LOG_DIR}/qlever-api_error.log
CustomLog ${APACHE_LOG_DIR}/qlever-api.access.log combined
ProxyPreserveHost On
Timeout 5400
ProxyTimeout 5400
ProxyPass / http://localhost:7001/
ProxyPassReverse / http://localhost:7001/
Include wikidata-ssl-common.conf
</VirtualHost>
<VirtualHost *:80 >
ServerName qlever-api.wikidata.dbis.rwth-aachen.de
ServerAdmin webmaster@localhost
ErrorLog ${APACHE_LOG_DIR}/qlever-api_error.log
CustomLog ${APACHE_LOG_DIR}/qlever-api.access.log combined
ProxyPreserveHost On
# 90 min timeout?
Timeout 5400
ProxyTimeout 5400
ProxyPass / http://localhost:7001/
ProxyPassReverse / http://localhost:7001/
#<Proxy *>
# Order deny,allow
# Allow from all
# Authtype Basic
# Authname "Password Required"
# AuthUserFile /etc/apache2/.htpasswd
# Require valid-user
#</Proxy>
</VirtualHost>
So in the Apache config I could exchange the port as needed. I do not know how to do that with the API, since I only see manual add buttons and import/export, and I have never seen a proper export file that I could reuse. But that would be an issue for qlever-ui, I think.
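The port exchange described for the Apache config can be scripted. The following is a minimal sketch, assuming the proxy targets always have the form `http://localhost:<port>/` as in the config above; the function name `switch_backend_port` is hypothetical and not part of qlv or QLever.

```python
import re

# Matches active (uncommented) ProxyPass / ProxyPassReverse lines whose
# target is http://localhost:<port>/..., capturing the port separately.
_PROXY_LINE = re.compile(
    r"^(\s*ProxyPass(?:Reverse)?\s+\S+\s+http://localhost:)(\d+)(/.*)$",
    re.MULTILINE,
)


def switch_backend_port(config_text: str, new_port: int) -> str:
    """Return config_text with every localhost proxy target set to new_port.

    Hypothetical helper for illustration; commented-out directives and
    Timeout / ProxyTimeout lines are left untouched.
    """
    return _PROXY_LINE.sub(
        lambda m: f"{m.group(1)}{new_port}{m.group(3)}", config_text
    )
```

Reading the config file, applying the function, writing the result back, and reloading Apache would complete the rotation; those steps depend on the local setup and are left out here.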
qlever status
...
PID USER START RSS COMMAND
5376 th Feb13 0G ServerMain -i olympics -j 8 -p 7019 -m 5G -c 2G -e 1G -k 100 -a olympics_7643543846
5399 wf Feb13 3G ServerMain -i wikidata -j 8 -p 7001 -m 20G -c 10G -e 1G -k 200 -s 30s -a wikidata_K71G2U2bike0
5408 wf Feb13 6G ServerMain -i dblp -j 8 -p 7015 -m 20 -c 5 -e 1 -k 100 -a dblp_110931226 -t
171951 wf Feb13 32G IndexBuilderMain -i osm-planet -s osm-planet.settings.json -F ttl -f - -p true --stxxl-memory 40G --parser-buffer-size 100M
Working on ad-freiburg/qlever-ui#125 and other related qlever-ui issues now would IMHO be very helpful. Being able to set the active servers with name, hostname, description, and port via an API would allow automation of the rotation. Being able to have local and remote server configurations in parallel on the same UI would be great. For my research, the most important part would be to have persistent access to the query logs and the short URLs generated for the queries. Our RWTH Aachen server is intended to be a public server as part of a network of snapquery-based Wikidata mirrors that hide the Query Execution Context from the users to avoid Query Rot.
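For automating the rotation, the index name and port of each running server can be read from the `qlever status` listing shown above. This is a rough sketch that assumes the column layout seen in that output (PID, USER, START, RSS, COMMAND) and that `-i` and `-p` carry the index name and port of a `ServerMain` process; the names `Server` and `parse_status` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Server(NamedTuple):
    pid: int
    user: str
    index: str
    port: int


def _option(args: List[str], flag: str) -> Optional[str]:
    """Return the value following flag in args, or None if absent."""
    if flag in args:
        pos = args.index(flag)
        if pos + 1 < len(args):
            return args[pos + 1]
    return None


def parse_status(status_text: str) -> List[Server]:
    """Extract running ServerMain processes from status output.

    Assumes whitespace-separated columns as in the listing above;
    IndexBuilderMain lines and the header line are skipped.
    """
    servers = []
    for line in status_text.splitlines():
        fields = line.split()
        if "ServerMain" not in fields or not fields[0].isdigit():
            continue
        args = fields[fields.index("ServerMain") + 1:]
        index, port = _option(args, "-i"), _option(args, "-p")
        if index is not None and port is not None and port.isdigit():
            servers.append(Server(int(fields[0]), fields[1], index, int(port)))
    return servers
```

The resulting list could then drive whatever step switches the public endpoint to a freshly built index.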
On my 128 GB machine I get: […]
On the 512 GB machine the log ends with: […]
in the middle of the indexing ... very strange ...
This happens when the process gets killed by the operating system because it used up too much memory. You can verify this by checking the messages in […]. 512 GB of RAM should be more than sufficient. We usually build these indexes on machines with 128 GB of RAM. And don't forget that OSM Planet is a pretty big dataset. The version we provide on https://osm2rdf.cs.uni-freiburg.de has almost 100 B triples, and the version we provide on https://qlever.cs.uni-freiburg.de/osm-planet has 250 B triples (and over 300 B triples internally).
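Whether the kernel's out-of-memory killer ended the index build can also be checked programmatically. The sketch below scans kernel log text for the `Out of memory: Killed process <pid> (<name>)` lines that recent Linux kernels emit; where that text comes from (for example the output of `dmesg`) is an assumption here and not necessarily the location meant in the comment above. The function name `find_oom_kills` is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Format used by recent Linux kernels when the OOM killer ends a process.
# Older kernels word this line differently, so an empty result is not proof
# that no OOM kill happened.
_OOM = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(
    kernel_log: str, process_name: Optional[str] = None
) -> List[Tuple[int, str]]:
    """Return (pid, name) for each OOM kill found in kernel_log.

    The kernel truncates command names to 15 characters, so a requested
    name such as "IndexBuilderMain" is compared in truncated form.
    """
    wanted = process_name[:15] if process_name is not None else None
    kills = []
    for match in _OOM.finditer(kernel_log):
        pid, name = int(match.group(1)), match.group(2)
        if wanted is None or name == wanted:
            kills.append((pid, name))
    return kills
```

A non-empty result for `find_oom_kills(log_text, "IndexBuilderMain")` would support the explanation that the build was killed for using too much memory.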
ERROR: The regex ".[\t ]*([\r\n]+)" which marks the end of a statement was not found at all within a single batch that was not the last one. Please increase the FILE_BUFFER_SIZE or set "parallel-parsing: false" in the settings file.
There seems to be no such parallel-parsing: in the Qleverfile (assuming this is the settings file mentioned in the error message).
If such a setting is possible, I would hope to see a commented-out version of it to get the syntax right.
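To make the error message more concrete, here is a small sketch of the kind of check it describes: the input is cut into fixed-size batches, and every batch except the last one must contain at least one statement end. This is not QLever's parser; the batch size, the function names, and the escaped leading dot in the regex are assumptions for illustration (the message as quoted shows the dot unescaped).

```python
import re
from typing import Iterator, List

# Assumption: the leading dot is meant literally, i.e. the "." that ends a
# Turtle statement, followed by optional blanks and at least one line break.
STATEMENT_END = re.compile(r"\.[\t ]*([\r\n]+)")


def batches(data: str, size: int) -> Iterator[str]:
    """Yield consecutive fixed-size slices of data (the last may be shorter)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def batches_without_statement_end(data: str, size: int) -> List[int]:
    """Return indices of non-final batches that contain no statement end."""
    chunks = list(batches(data, size))
    return [
        i for i, chunk in enumerate(chunks[:-1])
        if not STATEMENT_END.search(chunk)
    ]
```

In this toy model, a single very long statement (for example a large geometry literal) that spans a whole batch triggers the condition, and a larger batch size makes it go away, which matches the two remedies the message suggests.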
While at it you might want to update
The data file of OSM Planet is 409 GB by now, and the download was much slower than expected even on our RWTH server with decent internet access: it took more than 3 hours.