5xx errors from random fails on asserts on moongoo socket connection (send) #36

Open
dev0pz opened this issue Feb 19, 2020 · 2 comments


dev0pz commented Feb 19, 2020

Hi Isage!

I was testing it under stress conditions and found sporadic 500 errors caused by "lua entry thread aborted: runtime error".

example:

2020/02/18 21:22:09 [error] 7#7: *1242 lua entry thread aborted: runtime error: /usr/local/openresty/lualib/resty/moongoo/connection.lua:157: bad request
stack traceback:
coroutine 0:
	[C]: in function 'send'
	/usr/local/openresty/lualib/resty/moongoo/connection.lua:157: in function '_query'
	/usr/local/openresty/lualib/resty/moongoo/cursor.lua:155: in function 'find_one'

that would be:
https://github.com/isage/lua-resty-moongoo/blob/master/lib/resty/moongoo/connection.lua#L157
connection.lua:157: assert(self:send(data))

and also

2020/02/18 21:19:37 [error] 7#7: *283 lua entry thread aborted: runtime error: /usr/local/openresty/lualib/resty/moongoo/connection.lua:88: assertion failed!
stack traceback:
coroutine 0:
	[C]: in function 'assert'
	/usr/local/openresty/lualib/resty/moongoo/connection.lua:88: in function '_query'
	/usr/local/openresty/lualib/resty/moongoo/database.lua:43: in function '_cmd'
	/usr/local/openresty/lualib/resty/moongoo.lua:75: in function 'connect'
	/usr/local/openresty/lualib/resty/moongoo/cursor.lua:139: in function 'find_one'

that would be:
https://github.com/isage/lua-resty-moongoo/blob/master/lib/resty/moongoo/connection.lua#L88
connection.lua:88: assert ( r_to == cbson.uint(self._id) )
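The second log line says only "assertion failed!" because that assert is called without a message, so the log cannot show which ids were compared. A minimal sketch of how a message would surface the mismatch; check_reply and its arguments are hypothetical names for illustration, not part of moongoo:

```lua
-- Hypothetical helper, not part of moongoo: compare the reply's responseTo
-- value with the id of the request that was sent.
local function check_reply(response_to, request_id)
  -- A bare assert(response_to == request_id) would only report
  -- "assertion failed!". Passing a message makes the mismatch visible.
  assert(response_to == request_id,
    string.format("reply mismatch: responseTo=%s, expected %s",
      tostring(response_to), tostring(request_id)))
  return true
end

-- pcall captures the raised error so both messages can be compared.
local ok_bare, err_bare = pcall(assert, 1 == 2)
local ok_msg, err_msg = pcall(check_reply, 7, 8)

print(ok_bare, err_bare)
print(ok_msg, err_msg)
```

The exact prefix of the message (file and line) depends on the Lua runtime, so only the trailing text should be relied on.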

I managed to lower the error rate further by setting socketTimeoutMS=30000 (60s is supposedly the default value, so this brings it down to 30s, like mongo).

(I also made a minor lint change at:)
https://github.com/isage/lua-resty-moongoo/blob/master/lib/resty/moongoo.lua#L30
from:
local stimeout = conninfo.query.socketTimeoutMS and conninfo.query.socketTimeoutMS or nil
to:
local stimeout = conninfo.query and conninfo.query.socketTimeoutMS or nil
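To illustrate the difference between the two forms, here is a small sketch that assumes conninfo.query can be absent (for example, a connection string without a query part). The conninfo tables are made-up stand-ins, not what moongoo actually parses:

```lua
-- Made-up connection info without a query table (assumption for illustration).
local conninfo = { host = "127.0.0.1", port = 27017 }

-- Original form: indexes conninfo.query before checking that it exists.
local ok_old, err_old = pcall(function()
  return conninfo.query.socketTimeoutMS and conninfo.query.socketTimeoutMS or nil
end)

-- Proposed form: short-circuits to nil when conninfo.query is missing.
local ok_new, stimeout = pcall(function()
  return conninfo.query and conninfo.query.socketTimeoutMS or nil
end)

-- With a query table present, the proposed form returns the configured value.
local with_query = { query = { socketTimeoutMS = "30000" } }
local stimeout_set = with_query.query and with_query.query.socketTimeoutMS or nil

print(ok_old, err_old)
print(ok_new, stimeout)
print(stimeout_set)
```

Under that assumption the original form raises an "attempt to index" error, while the proposed form simply yields nil.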

I'm considering replacing those asserts with a pcall wrapper and letting the query fail quietly, without returning any data, to avoid the sporadic 5xx errors.
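A rough sketch of what such a pcall wrapper could look like. find_one_stub stands in for a driver call that raises on failure, and safe_call is a hypothetical helper name; neither comes from moongoo:

```lua
-- Stand-in for a driver call that raises an error instead of returning nil, err.
local function find_one_stub(should_fail)
  if should_fail then
    error("closed")
  end
  return { _id = 1 }
end

-- Hypothetical helper: run fn under pcall and turn a raised error into nil, err.
-- Only the first return value of fn is passed through.
local function safe_call(fn, ...)
  local ok, result = pcall(fn, ...)
  if not ok then
    return nil, result
  end
  return result
end

local doc, err = safe_call(find_one_stub, true)   -- failure path
local doc2 = safe_call(find_one_stub, false)      -- success path

print(doc, err)
print(doc2._id)
```

The caller can then decide whether to return an empty result, log the error, or retry, instead of letting the request abort with a 500.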

Any other suggestion/fix?

Owner

isage commented Feb 20, 2020 via email

Author

dev0pz commented Feb 21, 2020

I added some extra verbosity:

  --assert(self:send(data))
  local send_status, send_error = self:send(data)
  if not send_status then
    -- log the failure instead of raising, so the request is not aborted here
    ngx.log(ngx.STDERR, "Moongoo failed to send data over socket connection with data: ", tostring(data), " with error: ", send_error)
  end
  return self:_handle_reply()

and then:

  --local header = assert ( self.sock:receive ( 16 ) )
  local header, rcv_err = self.sock:receive(16)
  if rcv_err then
    -- log the receive error (e.g. closed or timeout) instead of raising
    ngx.log(ngx.STDERR, "Moongoo failed to receive data over socket connection with error: ", rcv_err)
  end

So, with the extra verbosity I found sporadic closed and timeout errors at _handle_reply and some bad request errors at self:send(data). I ended up cleaning up some of my code to create connections at the last minute, settled on a conservative socketTimeoutMS=10000, and wrapped the query in a pcall with a fallback to a custom retry query function, which saves the day whenever a thread abort is caught. But perhaps moongoo could try to catch those closed sockets in the driver itself. Do you agree?
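For reference, the pcall-plus-retry fallback described above could be sketched roughly like this. with_retry and flaky_query are hypothetical names, and the stub only simulates a query that fails twice before succeeding; a real retry would presumably also need to open a fresh connection on each attempt:

```lua
-- Hypothetical helper: call fn under pcall up to max_attempts times.
-- Returns result, attempt_number on success, or nil, last_error on failure.
local function with_retry(fn, max_attempts, ...)
  local last_err
  for attempt = 1, max_attempts do
    local ok, result = pcall(fn, ...)
    if ok then
      return result, attempt
    end
    last_err = result
  end
  return nil, last_err
end

-- Stub that raises on its first two calls, then succeeds.
local calls = 0
local function flaky_query()
  calls = calls + 1
  if calls < 3 then
    error("timeout")
  end
  return "document"
end

local result, attempts = with_retry(flaky_query, 3)
local failed, fail_err = with_retry(function() error("closed") end, 2)

print(result, attempts)
print(failed, fail_err)
```

Capping the number of attempts keeps a persistently broken connection from stalling the request indefinitely.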
