Device Module fails to return large number of devices #9

Open
susanhooks opened this issue Sep 25, 2024 · 4 comments

@susanhooks

Device module fails when there are a large number of devices due to response size/time.

Kentik Ansible Collection=1.0.6

I am trying to use the kentik_device module to add devices to Kentik. When there are a large number of devices (around 5000 in this instance), the function gather_devices is unable to return the devices, and the module fails.

I did some testing, and the response via Postman is approximately 142MB. Changing the timeout to 5 minutes resulted in a 504 error from the API. I think a potential solution would be to change the gather_devices function to check for the specific device, and handle a 404 response as the device not existing.

fatal: [localhost]: FAILED! => { "ansible_facts": { "discovered_interpreter_python": "/usr/bin/python3" }, "changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/usr/lib/python3/dist-packages/urllib3/connectionpool.py\", line 422, in _make_request\n six.raise_from(e, None)\n File \"<string>\", line 3, in raise_from\n File \"/usr/lib/python3/dist-packages/urllib3/connectionpool.py\", line 417, in _make_request\n httplib_response = conn.getresponse()\n File \"/usr/lib/python3.8/http/client.py\", line 1348, in getresponse\n response.begin()\n File \"/usr/lib/python3.8/http/client.py\", line 316, in begin\n version, status, reason = self._read_status()\n File \"/usr/lib/python3.8/http/client.py\", line 277, in _read_status\n line = str(self.fp.readline(_MAXLINE + 1), \"iso-8859-1\")\n File \"/usr/lib/python3.8/socket.py\", line 669, in readinto\n return self._sock.recv_into(b)\n File \"/usr/lib/python3.8/ssl.py\", line 1270, in recv_into\n return self.read(nbytes, buffer)\n File \"/usr/lib/python3.8/ssl.py\", line 1128, in read\n return self._sslobj.read(len, buffer)\nsocket.timeout: The read operation timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/home/susan/.local/lib/python3.8/site-packages/requests/adapters.py\", line 667, in send\n resp = conn.urlopen(\n File \"/usr/lib/python3/dist-packages/urllib3/connectionpool.py\", line 720, in urlopen\n retries = retries.increment(\n File \"/usr/lib/python3/dist-packages/urllib3/util/retry.py\", line 400, in increment\n raise six.reraise(type(error), error, _stacktrace)\n File \"/usr/lib/python3/dist-packages/six.py\", line 703, in reraise\n raise value\n File \"/usr/lib/python3/dist-packages/urllib3/connectionpool.py\", line 666, in urlopen\n httplib_response = self._make_request(\n File \"/usr/lib/python3/dist-packages/urllib3/connectionpool.py\", line 424, in _make_request\n self._raise_timeout(err=e, url=url, 
timeout_value=read_timeout)\n File \"/usr/lib/python3/dist-packages/urllib3/connectionpool.py\", line 331, in _raise_timeout\n raise ReadTimeoutError(\nurllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='grpc.api.kentik.eu', port=443): Read timed out. (read timeout=30)\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/home/susan/.ansible/tmp/ansible-tmp-1727298498.8673475-852646-240976643149169/AnsiballZ_kentik_device.py\", line 107, in <module>\n _ansiballz_main()\n File \"/home/susan/.ansible/tmp/ansible-tmp-1727298498.8673475-852646-240976643149169/AnsiballZ_kentik_device.py\", line 99, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/home/susan/.ansible/tmp/ansible-tmp-1727298498.8673475-852646-240976643149169/AnsiballZ_kentik_device.py\", line 47, in invoke_module\n runpy.run_module(mod_name='ansible_collections.kentik.kentik_config.plugins.modules.kentik_device', init_globals=dict(_module_fqn='ansible_collections.kentik.kentik_config.plugins.modules.kentik_device', _modlib_path=modlib_path),\n File \"/usr/lib/python3.8/runpy.py\", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File \"/usr/lib/python3.8/runpy.py\", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File \"/usr/lib/python3.8/runpy.py\", line 87, in _run_code\n exec(code, run_globals)\n File \"/tmp/ansible_kentik.kentik_config.kentik_device_payload_k8lc6t3u/ansible_kentik.kentik_config.kentik_device_payload.zip/ansible_collections/kentik/kentik_config/plugins/modules/kentik_device.py\", line 629, in <module>\n File \"/tmp/ansible_kentik.kentik_config.kentik_device_payload_k8lc6t3u/ansible_kentik.kentik_config.kentik_device_payload.zip/ansible_collections/kentik/kentik_config/plugins/modules/kentik_device.py\", line 594, in main\n File 
\"/tmp/ansible_kentik.kentik_config.kentik_device_payload_k8lc6t3u/ansible_kentik.kentik_config.kentik_device_payload.zip/ansible_collections/kentik/kentik_config/plugins/modules/kentik_device.py\", line 318, in gather_devices\n File \"/home/susan/.local/lib/python3.8/site-packages/requests/api.py\", line 59, in request\n return session.request(method=method, url=url, **kwargs)\n File \"/home/susan/.local/lib/python3.8/site-packages/requests/sessions.py\", line 589, in request\n resp = self.send(prep, **send_kwargs)\n File \"/home/susan/.local/lib/python3.8/site-packages/requests/sessions.py\", line 703, in send\n r = adapter.send(request, **kwargs)\n File \"/home/susan/.local/lib/python3.8/site-packages/requests/adapters.py\", line 713, in send\n raise ReadTimeout(e, request=request)\nrequests.exceptions.ReadTimeout: HTTPSConnectionPool(host='grpc.api.kentik.eu', port=443): Read timed out. (read timeout=30)\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1 }

@kentikethan kentikethan self-assigned this Oct 2, 2024
@kentikethan
Collaborator

Looking into this. Currently we have no way to check whether a single device exists, because single-device requests are made using the device ID. Gathering our options.

@susanhooks
Author

To get around this, I look up the device by name using "https://api.kentik.eu/api/v5/device/{{ ansible_eda.event.body.context.object_repr }}". If I get a 404, I assume the device does not exist and create it. There may be something I'm missing, though.
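A minimal Python sketch of that lookup, assuming the v5 endpoint accepts a device name in the path as described above. The function name `device_exists`, the injectable `opener` parameter, and the two auth header names are assumptions for illustration, not part of the collection:

```python
import urllib.error
import urllib.parse
import urllib.request

# Base URL from the workaround above; the device name is appended to it.
BASE_URL = "https://api.kentik.eu/api/v5/device/"


def device_exists(name, email, token, opener=urllib.request.urlopen, timeout=30):
    """Return True if the lookup succeeds and False on a 404.

    Any other HTTP error (for example a 504) is re-raised so that it is not
    mistaken for "device does not exist".
    """
    request = urllib.request.Request(
        BASE_URL + urllib.parse.quote(name, safe=""),
        # Header names are an assumption; adjust to the auth scheme in use.
        headers={"X-CH-Auth-Email": email, "X-CH-Auth-API-Token": token},
    )
    try:
        with opener(request, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Only the 404 is treated as "missing"; timeouts and server errors still surface, so a flaky API does not trigger a duplicate create.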

@kentikethan
Collaborator

kentikethan commented Oct 8, 2024

@susanhooks It looks like I get the following response if I try to create a device that already exists:

"message": "ValidationError: Internal IP (10.1.7.200) Already Exists (errxid cw2v36mtkfgg0c9d33xg)",

We also get this message if the device has the same name but a different IP:
"message": "ValidationError: Device name (acc_geg_200) Already Exists (errxid cw2v585tkfgg0e1fh0a0)",

This is good. I think we can edit the code to just attempt to create the device and rely on this validation check. The downside is that it means more API calls.
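As a rough sketch of how the module could tell "already exists" apart from other create failures, the two messages quoted above can be matched with a pattern like the one below. The helper name `classify_create_error` is hypothetical, and the pattern is inferred only from those two samples, so other message formats may exist:

```python
import re

# Pattern inferred from the two sample messages in this thread.
_ALREADY_EXISTS = re.compile(
    r"ValidationError: (?P<field>.+?) \((?P<value>[^)]*)\) Already Exists"
)


def classify_create_error(message):
    """Return (field, value) if the message reports a duplicate, else None."""
    match = _ALREADY_EXISTS.search(message)
    if match is None:
        return None
    return match.group("field"), match.group("value")
```

A `None` result means the failure was something other than a duplicate and should still be reported as an error.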

I think we are going to need a solution to batch-add a large list of devices. If we do not, it is highly likely that you will start to run into our API limits as well. We do have a batch API, so that should be possible, but again it would require us to get the entire list of devices back successfully.

I am thinking we may need two different solutions:

  1. The ability to add one or a few devices quickly even when the portal contains thousands of devices. (Sorting and comparing is inefficient here.)
  2. The ability to bulk add thousands of devices without hitting API limits.

What do you think?
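For the second case, one possible shape is to split the device list into fixed-size batches and pause between submissions. This is only a sketch: `send_batch` is a hypothetical callable standing in for whatever batch API call is used, and the batch size and pause are placeholder values, not documented limits:

```python
import time


def chunked(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_in_batches(devices, send_batch, batch_size=100, pause_seconds=1.0,
                      sleep=time.sleep):
    """Send devices in batches, pausing between batches; return the count sent."""
    sent = 0
    for index, batch in enumerate(chunked(devices, batch_size)):
        if index:
            # Pause before every batch except the first to stay under rate limits.
            sleep(pause_seconds)
        send_batch(batch)
        sent += len(batch)
    return sent
```

With about 5000 devices and a batch size of 100, this makes 50 calls instead of 5000.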

@susanhooks
Author

For #1, yes, we definitely need that, and I think adding an API call to check for an existing IP/name and then adding the device works. Maybe there's a newer version that has this built in, but I'm wondering whether there is pagination/offset support for the calls that return large amounts of data. The module could then rebuild the data set locally instead of hitting a timeout on the remote resource.
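If the list endpoint did support an offset and a page size (nothing in this thread confirms that it does), rebuilding the data set locally could look roughly like this. `fetch_page` is a hypothetical callable that returns one page of devices as a list:

```python
def collect_all(fetch_page, page_size=500):
    """Call fetch_page(offset, page_size) until a short page signals the end."""
    devices = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        devices.extend(page)
        if len(page) < page_size:
            # A page shorter than requested means there is nothing left.
            return devices
        offset += page_size
```

Each request then carries only a slice of the roughly 142MB response, which should keep individual calls well under the read timeout.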

I'm attempting to check for the existence of a particular device with the URI module, and create a new device if it does not exist, but I'm running into "Unknown parameter validation error" on my POST and haven't figured out what piece is wrong just yet. Need to put my glasses on and give the schema another close look.

For #2, I think that would be needed as well. Is there not already a bulk upload via CSV or something similar? I would imagine it would be a similar process, but I don't know the backend.
