Inconsistent connection of multiple clients #34

georgeharker · 2021-02-27T00:48:04Z

I have a mioXL that exports a number of connections via rtpmidi. When I try to see those using rtpmidid they're all listed (e.g. aconnect -l shows them). When I connect to them with ALSA-based connections (even a simple mido-based Python script which uses rtmidi as a backend to ALSA), one or more of the connections will usually land in a bad state.

I get a fair few 'not connected yet' messages. And often lots of feedback messages.

I do sometimes see some 'NO' refusals. Not always. Though it's easy for them to get lost in logs.

I tried increasing connection timeouts in code and a couple of other things. But I can't quite see what's going on. It felt a bit like there was a race somewhere. But I can't see it by looking at the code rn.

Do you have any idea what might be going on?

It seems like once it gets into a 'not connected yet' state it doesn't retry connecting.

Is there a way to have rtpmidid try to reconnect even if it gets a NO?

This looks super promising. I just can't get it to connect reliably enough yet.

Thanks

George

georgeharker · 2021-03-02T00:37:11Z

I've done a lot of digging on this; here are my findings.

The auto retry code wasn't working because (I think) the poller time queue sort order was backwards.
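A minimal sketch of why the sort order matters, assuming the poller keeps pending timers in a priority queue. `TimerQueue` is a hypothetical stand-in, not rtpmidid's actual class. With the earliest deadline at the front, short retry timers fire on time; if the comparison is reversed, the poller ends up waiting on the latest deadline and retries seem never to happen.

```python
import heapq
import itertools


class TimerQueue:
    """Hypothetical stand-in for a poller's timer list (not rtpmidid's real class)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal deadlines

    def add(self, when, callback):
        # heapq is a min-heap: the entry with the smallest `when` sits at index 0.
        heapq.heappush(self._heap, (when, next(self._counter), callback))

    def next_deadline(self):
        # The value a poller would use to compute its wait timeout.
        return self._heap[0][0] if self._heap else None

    def run_due(self, now):
        """Run every callback whose deadline has passed, earliest first."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback()
            fired += 1
        return fired


order = []
q = TimerQueue()
q.add(5.0, lambda: order.append("retry in 5s"))
q.add(1.0, lambda: order.append("retry in 1s"))
q.add(3.0, lambda: order.append("retry in 3s"))
q.run_due(now=3.5)
print(order)  # ['retry in 1s', 'retry in 3s']
```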

Auto reconnect needs some things reset before retrying: ports need closing, new initiator ids need generating, and some callbacks need disconnecting (mostly the connected one, or it would register multiple times and fire several times).
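A rough sketch of the reset-before-retry idea, under the assumption that each peer registers a "connected" callback on some signal object. `Signal` and `Peer` are hypothetical names for illustration only, and closing the ports is left out. The point is that each attempt drops the previous handler and draws a fresh initiator id.

```python
import secrets


class Signal:
    """Tiny callback registry; hypothetical, for illustration only."""

    def __init__(self):
        self._slots = {}
        self._next_id = 0

    def connect(self, fn):
        self._next_id += 1
        self._slots[self._next_id] = fn
        return self._next_id

    def disconnect(self, slot_id):
        self._slots.pop(slot_id, None)

    def emit(self):
        for fn in list(self._slots.values()):
            fn()


class Peer:
    def __init__(self):
        self.connected_event = Signal()
        self.initiator_id = None
        self.notifications = 0
        self._connected_slot = None

    def _on_connected(self):
        self.notifications += 1

    def start_connect(self):
        # Drop the handler left over from a previous attempt, if any,
        # so a retry does not register the same callback twice.
        if self._connected_slot is not None:
            self.connected_event.disconnect(self._connected_slot)
        # Fresh random 32-bit initiator id for every attempt.
        self.initiator_id = secrets.randbits(32)
        self._connected_slot = self.connected_event.connect(self._on_connected)


peer = Peer()
peer.start_connect()
peer.start_connect()          # retry after a NO or a timeout
peer.connected_event.emit()   # the remote finally answers OK
print(peer.notifications)     # 1
```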

Ssrc on the source end is important. And needs something repeatable but random. I used a hash on the source name. Which for connections from alsa programs using mido is always the same. But maybe it should be local host name + dest host name hashed. The rfc indicates there's need for collision detection logic. Which I didn't quite grok their example of yet.
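One way the "repeatable but random-looking" ssrc could be derived, as a sketch only: hash the local and remote names and keep 32 bits. `stable_ssrc` is a hypothetical helper, the choice of SHA-256 is an assumption, and the collision detection mentioned above is not attempted here. Python's built-in `hash()` is avoided because string hashing is randomized per process by default.

```python
import hashlib
import socket


def stable_ssrc(local_name: str, remote_name: str = "") -> int:
    """Derive a repeatable 32-bit value from the names (hypothetical helper)."""
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from hashing alike.
    key = f"{local_name}\0{remote_name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big")


# Same inputs give the same ssrc across runs and reconnects.
ssrc = stable_ssrc(socket.gethostname(), "mioXL")
print(f"{ssrc:#010x}")
```

Passing only the local name gives one ssrc per originating device rather than one per connection.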

After some serious code mangling, I still couldn't get it to reliably connect. At some point in connecting to the 6 streams from the mioXL I have, a connection to the control port would succeed (get ok), but the subsequent midi port connection would get NO'd.

It was also clear the mioXL got increasingly confused as it behaved better after restart. But after one successful connect / reconnect it would start refusing again.

After some time with Wireshark, watching what my Mac does to connect to the same 6 streams, I finally understood.

The Mac creates two ports only. All control traffic comes from one of them, to whichever target on the mioXL, and all midi traffic likewise originates from a single port.

All ssrc identifiers are the same (based on the originating device, not individual per connection).

I think the mioXL is seeing multiple connections from a single source with different ports and it's getting confused between the ssrc mappings it internally holds.

So I took some seriously hacky swipes at making this happen: forced the control port to a specified port, with socket reuse turned on via setsockopt, and did the same for the midi port.

Made all callbacks check either the initiator_id (for ok, no) or remote ssrc for the other commands, and effectively had multiple overlapping sockets all receiving the same data and ignoring the bits not meant for that rtpclient / peer.

And it works. But it's obviously a gross hack and not how the code is meant to be structured (and is probably an abuse of sockets to have multiple bound to the same port for udp).

In summary, I believe the correct implementation is to mux all outgoing and incoming connections onto a pair of ports for control and midi (when connecting to the same host, but regardless of service name connected to / dest port). And for all of those connections to share the same local ssrc, but use distinct initiator ids on connection. The remote ssrcs will be different and it is this which allows figuring out which port on the other side sent which packet.
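A sketch of what that muxing could look like on the receive side, assuming one lookup table per shared local port. `PortMux` is hypothetical, and the packet layout used (0xFFFF signature, two-letter command, then version, initiator token and ssrc as big-endian 32-bit fields) is my understanding of the AppleMIDI session packets rather than something confirmed in this thread. CK and RS packets are left out. OK/NO replies are routed by initiator id, everything else by remote ssrc.

```python
import struct

SESSION_COMMANDS = {b"IN", b"OK", b"NO", b"BY"}


class PortMux:
    """Hypothetical dispatcher for packets arriving on one shared local UDP port."""

    def __init__(self):
        self.by_initiator = {}  # initiator id -> peer still waiting for OK/NO
        self.by_ssrc = {}       # remote ssrc  -> established peer

    def dispatch(self, packet: bytes):
        """Return (peer, kind); peer is None if nobody claims the packet."""
        if (len(packet) >= 16 and packet[:2] == b"\xff\xff"
                and packet[2:4] in SESSION_COMMANDS):
            command = packet[2:4]
            _version, initiator_id, ssrc = struct.unpack_from(">III", packet, 4)
            if command in (b"OK", b"NO"):
                peer = self.by_initiator.pop(initiator_id, None)
                if peer is not None and command == b"OK":
                    # From now on this remote ssrc identifies the peer.
                    self.by_ssrc[ssrc] = peer
                return peer, command.decode()
            return self.by_ssrc.get(ssrc), command.decode()
        if len(packet) >= 12 and (packet[0] >> 6) == 2:
            # Plain RTP header (version 2): the ssrc is at bytes 8..11.
            (ssrc,) = struct.unpack_from(">I", packet, 8)
            return self.by_ssrc.get(ssrc), "RTP"
        return None, "unknown"


mux = PortMux()
mux.by_initiator[0x1111] = "mioXL DIN 1"
mux.by_initiator[0x2222] = "mioXL DIN 2"

# Made-up packets for illustration: an OK for the second invitation...
ok = b"\xff\xff" + b"OK" + struct.pack(">III", 2, 0x2222, 0xBBBB0002) + b"DIN 2\0"
print(mux.dispatch(ok))   # ('mioXL DIN 2', 'OK')

# ...followed by an RTP packet from the same remote ssrc.
rtp = struct.pack(">BBHII", 0x80, 0x61, 1, 0, 0xBBBB0002)
print(mux.dispatch(rtp))  # ('mioXL DIN 2', 'RTP')
```

A table lookup like this replaces having every client inspect every packet and ignore the ones that are not its own.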

I'm not sure what status development is at for your project (thanks this is a great kick start). And I'm not sure quite how to proceed given changes seem like they might necessitate a reworking of the structure somewhat.

I was thinking of keeping the list of clients with rtpclient for the host match only but having each client keep a list of remote ssrcs and manage the mapped aseq<-> peer connection with mapping of ssrc to disambiguate received packets.

I'd be interested in your thoughts. I'm happy to contribute code, collaborate /fork or otherwise help out. I don't have a patch I feel good about rn, so much as tentative proof that the above works more smoothly for connection to the mioXL and mirrors what Apple does from OS X.

George

davidmoreno · 2021-05-02T21:57:50Z

Hi. First, sorry I did not see the messages until now, and thank you for the detailed report.

I will add new bug requests for the poller queue problem, auto reconnect, and the repeatability of ssrc (with collision detection). If you already have fixes for them, would you mind sending a pull request?

About all the ports problems, my idea was to be able to have several clients and servers, and I thought the best way was to have several ports. As you say, initiator_id and ssrc should be enough to identify existing connections, but to initiate them we need a way to differentiate them.

If I understand this properly the problem is when rtpmidid is a client, and it connects to several servers on the same IP, the server side sends data to a random client port, and somehow assumes the ssrc or initiator_id will make the data arrive properly?

It all sounds like a mioXL bug, but anyway we want to be compatible.

A simple temporary workaround might be to connect to a single server endpoint.

Long term I think it might be possible to reuse the client port for several or even all the connections... I have to think about it. And hope no other rtpmidi implementation has the reverse bug that requires different ports. Or add an option to select the behaviour, maybe even per connection.

Can you send me a wireshark / tcpdump dump of an example with this behaviour?

Thanks again for the report,
David.

georgeharker · 2021-05-02T22:38:44Z

Hi David, trying to resurrect my brain cells on this as it’s a while since I worked on the rtpmidid code. I can definitely send a PR for what I have. It may not match the intended design from your end though.

> If I understand this properly the problem is when rtpmidid is a client, and it connects to several servers on the same IP, the server side sends data to a random client port, and somehow assumes the ssrc or initiator_id will make the data arrive properly?

The mioXL will export each physical connection as a separate named service. It will accept the first connection from a new IP on both the control and data channel ports, but after that it associates the IP, ssrc and port, and any subsequent connection from the same IP is expected to come from the same port and ssrc. I think this is to do with the part of the spec that works out whether there have been ssrc clashes. Anyway, subsequent connections to a different named service will succeed on control but be refused when repeating the handshake on the data port. So the dirty thing I did is create all the clients on the same port, but have them ignore messages with the wrong ssrc. This isn’t ideal, in the sense that ideally there’d be a table of known ssrcs that would check and dispatch, rather than having every client look at every message and n-1 of them ignore it. But it works well enough for my purposes (making a hardware sequencer).

> It all sounds like a mioXL bug, but anyway we want to be compatible. A simple temporary workaround might be to connect to a single server endpoint.

The mioXL exports separate named services for each port by default. It might be possible to have them all mashed onto one service on the mioXL, but it didn’t seem to be the suggested pattern.

> Long term I think it might be possible to reuse the client port for several or even all the connections... I have to think about it. And hope no other rtpmidi implementation has the reverse bug that requires different ports. Or add an option to select the behaviour, maybe even per connection.

I pored over the specs, such as they’re available, and it looked to me like the session initiation was supposed to do some ssrc matching. But it’s not well documented.

> Can you send me a wireshark / tcpdump dump of an example with this behaviour?

Yep. I’ll dig that out, though it might take a bit longer than sending the PR. I don’t have an easy way to intercept the pi->mioXL traffic which caused the issue. But I did capture a session showing what the Mac does when connecting to the mioXL successfully. And it uses the same port for all connections.

> Thanks again for the report, David.

No problem. George

This was referenced May 2, 2021

Auto reconnect needs some things reset before retrying #37

Closed

Consistent creation of ssrc #38

Open

Poller time queue backwards #39

Closed

sadguitarius mentioned this issue Dec 23, 2022

port connection fails when connecting many ports in quick succession #92

Open
