
Lost Packet handling #41

Open
tobox opened this issue Jan 30, 2014 · 21 comments

tobox commented Jan 30, 2014

When tftpy runs as a server and a data packet is lost, all further requests for the lost packet are just logged as duplicate ACKs by tftpy. If I add

```python
self.sendDAT()
```

right after

```python
elif pkt.blocknumber < self.context.next_block:
    log.debug("Received duplicate ACK for block %d"
        % pkt.blocknumber)
    self.context.metrics.add_dup(pkt)
```

the transfer continues properly after packet loss. I don't know whether this is a bug in tftpy or in my TFTP client (the U-Boot bootloader), but the problem does not occur with other TFTP servers.
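
Combined, the suggested change would look roughly like this (a sketch against the handler quoted above, not a tested patch):

```python
# Sketch only: tftpy's duplicate-ACK branch with the proposed resend added.
elif pkt.blocknumber < self.context.next_block:
    log.debug("Received duplicate ACK for block %d"
        % pkt.blocknumber)
    self.context.metrics.add_dup(pkt)
    self.sendDAT()  # per the report above, this lets the transfer resume
```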

Regards,
Thomas

msoulier (Owner) commented Jun 7, 2015

Hmm. So we already received the ACK and sent the DAT, but perhaps the DAT to the client was lost.

The reason we don't send a DAT in response to a duplicate ACK is the sorcerer's apprentice problem. So how can I safely send a DAT there without introducing that problem?
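
For context, the Sorcerer's Apprentice Syndrome arises when the sender answers every ACK copy with a fresh DAT: one ACK duplicated by the network then makes every later DAT (and its ACK) travel twice, forever. A minimal simulation of that doubling (illustration only, not tftpy code):

```python
# Count DAT copies sent per block after a single ACK is duplicated in transit.
def dats_sent_per_block(blocks, respond_to_every_ack):
    ack_copies = 2                       # the network duplicated one ACK
    sent = []
    for _ in range(blocks):
        # Broken behavior: one DAT per ACK copy. The RFC 1123 fix: only the
        # first ACK for a block triggers the next DAT.
        dat_copies = ack_copies if respond_to_every_ack else 1
        sent.append(dat_copies)
        ack_copies = dat_copies          # the receiver ACKs every DAT copy
    return sent

print(dats_sent_per_block(5, respond_to_every_ack=True))   # [2, 2, 2, 2, 2]
print(dats_sent_per_block(5, respond_to_every_ack=False))  # [1, 1, 1, 1, 1]
```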

guillon commented Jun 18, 2015

I've proposed a pull request for this: "Implement retransmission of packet on peer request".
#58

The reported issue is actually due to an incomplete implementation of the protocol as per RFC 1350.
In summary, receiving a duplicate of the peer's previous packet is a request to resend one's own previous packet, as per the protocol.

The pull request handles this.

I've tested it against other clients such as the U-Boot TFTP client, and it fixes the issue.

@posborne

FWIW, I believe I have seen this same problem when doing U-Boot TFTP transfers. I have not tested the PR, but I could probably give it a whirl to see whether it increases robustness on a congested network.

guillon commented Jun 23, 2015

Indeed, the U-Boot TFTP client implements a timeout policy while waiting for peer packets; when a timeout occurs, it asks for the expected packet to be resent, as described in the protocol.

This unimplemented tftpy server behavior (packet retransmission) will not show up when testing the tftpy server against tftpy's own client (tftpy_client), because tftpy_client is simpler and just waits for a peer packet forever.

More precisely, as soon as the client's timeout is lower than the tftpy server's timeout, the client keeps sending requests for the previous packet to be resent. Since the server sees traffic on the socket (packets it wrongly drops as duplicates), it never reaches its own timeout and never resends the packet itself. In the end the client never receives another packet from the server and the transfer is completely stuck.
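
A tiny simulation of that stuck state (an illustration of the scenario described above, not tftpy code): the client re-sends its duplicate ACK on every timeout, and as long as the server counts any received packet as progress, its own retransmission timeout never fires:

```python
# One DAT was lost; the client resends its last ACK whenever it times out.
def simulate(server_resets_on_any_traffic, client_timeout=1,
             server_timeout=5, ticks=20):
    server_idle = 0
    for t in range(1, ticks + 1):
        server_idle += 1
        if t % client_timeout == 0:       # client timeout: resend duplicate ACK
            if server_resets_on_any_traffic:
                server_idle = 0           # buggy: any traffic looks like progress
        if server_idle >= server_timeout:
            return "server retransmitted the lost DAT at tick %d" % t
    return "stuck: server never retransmitted"

print(simulate(server_resets_on_any_traffic=True))   # stuck: never retransmitted
print(simulate(server_resets_on_any_traffic=False))  # retransmitted at tick 5
```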

navlrac (Contributor) commented Apr 12, 2016

See PR #67, commit "Improve lost DAT handling".

This fixes the problem without reintroducing the sorcerer's apprentice problem. The solution is not to update the timeout counter when a duplicate ACK packet is received, so the server's timeout waits for "progress", not just "any traffic".
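
In outline, the approach looks something like this (a hedged sketch of the idea, not the code in PR #67; the class and method names here are made up for illustration):

```python
import time

class SenderState:
    """Sketch: retransmit only when no progress is made for `timeout` seconds."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.next_block = 1
        self.last_progress = time.monotonic()

    def on_ack(self, blocknumber):
        if blocknumber == self.next_block:
            self.next_block += 1
            self.last_progress = time.monotonic()  # real progress resets the timer
            self.send_next_dat()
        else:
            # Duplicate ACK: count it in metrics, but deliberately leave
            # last_progress alone so the timeout below can still fire.
            pass

    def on_tick(self):
        if time.monotonic() - self.last_progress >= self.timeout:
            self.resend_last_dat()                 # recover a lost DAT on timeout only
            self.last_progress = time.monotonic()  # back off between retransmits

    def send_next_dat(self):    # placeholder: real code sends the next block
        pass

    def resend_last_dat(self):  # placeholder: real code resends the last DAT
        pass
```

Because retransmission is driven only by the timeout, a duplicated ACK can never trigger an extra DAT by itself, which is what keeps the sorcerer's apprentice problem out.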

msoulier added the "plan to fix" label (A valid problem that I plan to fix) on Mar 22, 2019
PiQuer commented Jun 15, 2021

Hi @msoulier, is there any chance of getting the fix by @navlrac merged?

Thanks!

@BuhtigithuB (Contributor)

Hi,

It seems that I have the U-Boot issue described by @guillon. Does tftpy 0.8.1 fix this problem?

@BuhtigithuB (Contributor)

Hi,

Here is an attempt to resolve PR #67's merge conflicts: PR #121

@msoulier and @navlrac, please check that I didn't break anything, and if I didn't, please merge ASAP.

Thanks

PiQuer commented Aug 3, 2021

@msoulier please help us get this issue fixed; it is causing quite a lot of trouble in our CI pipeline. If the fix by @BuhtigithuB is fine, please merge it.

@BuhtigithuB (Contributor)

@PiQuer, my attempt to resolve the conflicts wasn't successful, and I haven't had time to retry. If you have some time, you can try it yourself, starting from @navlrac's PR #67.

@BuhtigithuB (Contributor)

With this cherry-pick: https://github.com/BuhtigithuB/tftpy/tree/navlrac-windowsize-rfc-7440-support-only

Tests pass, so the cherry-pick succeeded. However, when I add navlrac's tests for the RFC 7440 feature, three tests that aren't even navlrac's start to fail. I am a bit puzzled.

I think I didn't fetch upstream before attempting the cherry-pick :( so I am not up to date. I will try to update my repo and rebase.

@BuhtigithuB (Contributor)

I fetched upstream from the GitHub UI and it didn't end in a merge conflict 😅

Can you have a look?

I will push a clone of this branch with navlrac's failing tests so you can also look at those.

@BuhtigithuB (Contributor)

Here you have all commits but two that seem dated: https://github.com/BuhtigithuB/tftpy/tree/navlrac-windowsize-all-cherry-picks

@msoulier: waiting for your input to make a proper PR that can be merged.

@BuhtigithuB (Contributor)

UP!

msoulier self-assigned this on Oct 19, 2021
@BuhtigithuB (Contributor)

Let me know if I can help you any further...

@BuhtigithuB (Contributor)

UP!

@BuhtigithuB (Contributor)

@msoulier do you intend to include navlrac's code in tftpy? Is it going to solve the instability issue identified with U-Boot? Can we expect progress here? I am happy to help more if I can, but please give us some feedback...

@msoulier (Owner)

Frankly I need to set aside some time to reproduce this problem in a test case, and then evaluate the fix. I've been very busy.

I will try to find some time.

BuhtigithuB (Contributor) commented Nov 12, 2021 via email

msoulier (Owner) commented Nov 16, 2021 via email
