Moohar Archive

Then it broke

14th February 2024

It was all working fine when I tested it on my local machine. Then I moved it to my server and tested it from an external IP, and I kept getting timeouts and disconnects. It was late so I left it and returned to it today.

I’ve not yet worked out the solution but I think I understand the problem. Basically I was closing the socket too fast, so the connection gets killed before all the data has downloaded. When testing locally this was not a problem, but put a router or two and an ISP between me and the server and it is. The solution would be to not close the socket; in fact HTTP/1.1 is supposed to use persistent connections by default. This is how I had my code at an earlier point in the project, only closing the client socket when the client sent zero data, AKA a disconnect request.
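For the "closing too fast" half of the problem, one common pattern is to half-close the socket and wait for the client to hang up before fully closing. The sketch below is only an illustration in Python, not the code from this project; `send_and_close` and `drain_timeout` are made-up names, and whether this fixes the truncation seen here is an assumption.

```python
import socket


def send_and_close(conn: socket.socket, payload: bytes, drain_timeout: float = 2.0) -> None:
    """Send payload, then close without cutting off data still in flight.

    Hypothetical helper: the name and the timeout value are assumptions.
    """
    conn.sendall(payload)
    # Half-close: signal that we have finished sending, but keep reading.
    conn.shutdown(socket.SHUT_WR)
    conn.settimeout(drain_timeout)
    try:
        # recv() returning b"" means the client has closed its side,
        # which is the "zero data" disconnect described above.
        while conn.recv(4096):
            pass
    except OSError:
        # Timed out or the connection was reset; stop waiting.
        pass
    finally:
        conn.close()
```

The drain loop throws away anything the client sends after the response, so this only suits the close-after-every-response approach, not a persistent connection.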

However, I then see a problem in Chrome where the first request loads fine but the second request never finishes. According to my server console the second request is received from the client and the response is sent. According to the network tab in Chrome developer tools the connection is stalled at the initialising connection phase. Closing the socket from the server side seems to complete the request from Chrome’s point of view. This was why I changed my code originally and made a mental note to look deeper into persistent connections.
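One thing worth checking, and this is a guess rather than a diagnosis: on a persistent connection the client has to know where each response ends without relying on the socket closing. The usual way to say so is a `Content-Length` header. A minimal sketch in Python, where `build_response` is a hypothetical helper and nothing here is taken from the actual server:

```python
def build_response(body: bytes, content_type: str = "text/html; charset=utf-8") -> bytes:
    """Build a 200 response whose length is stated up front.

    Hypothetical helper: the name and default content type are assumptions.
    """
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        # Byte count of the body, so the client can tell when the
        # response is complete while the connection stays open.
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body
```

If the responses already carry a correct `Content-Length`, the cause lies elsewhere and this sketch does not apply.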

Now that I’m sending more data, like images and more text, it seems I have to give the socket time before closing it, otherwise the data gets truncated. But leaving it open causes Chrome to freak out. I’m obviously doing something wrong with how I’m handling the connection and sending the data. I think I need to do some proper reading of the HTTP spec. It was only a matter of time.

That said, I did some hackery to make the site work while I go read. I now don’t close the socket after sending a response, which avoids the router problem. I then have a 2.5-second timeout which will close all client sockets if no other activity has happened, which works around the Chrome problem. It’s not neat. It’s not going to work if any number of people use the website at the same time. I’m not proud of it. But it works for now.

I’m feeling the need to better track connected clients, especially if I’m going to maintain persistent connections. Generally I need to tidy up the code and move some of the network setup out of the main function. The next version is going to have quite a few changes, I think. This is why I’ve tagged the current hacky state as v0.2.0 on GitHub.
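As a rough idea of what tracking clients could look like, the sketch below records the last activity time per connection, so that only the idle ones get closed rather than all of them at once. It is Python for illustration only; `ClientTable` and its methods are invented names, and the 2.5 second default is simply the figure from the current hack.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ClientTable:
    """Hypothetical registry of connected clients and when each was last active."""

    idle_timeout: float = 2.5  # seconds, matching the timeout mentioned above
    last_seen: dict = field(default_factory=dict)

    def touch(self, client_id, now=None):
        # Call on accept and whenever a request arrives from this client.
        self.last_seen[client_id] = time.monotonic() if now is None else now

    def remove(self, client_id):
        # Call after the client's socket has been closed.
        self.last_seen.pop(client_id, None)

    def expired(self, now=None):
        # Clients that have been silent for at least idle_timeout seconds.
        now = time.monotonic() if now is None else now
        return [c for c, seen in self.last_seen.items()
                if now - seen >= self.idle_timeout]
```

The server loop would call `touch` on every read, then periodically close and `remove` whatever `expired` returns. The `now` parameter exists only to make the timing easy to test.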

Interestingly, Firefox does not have the same issue as Chrome when I keep the socket open. I assume it’s being more forgiving and that I’m not that far away from the correct handling. Also, because I can see the sockets being opened on my server in real time, I now know that Firefox will open a socket as soon as you hover over a link. I assume they do this to get the TCP handshake out of the way before you click the link, which makes the browser feel more responsive. Similarly, Chrome always opens two sockets, again I assume for performance reasons, but in my tiny example I’ve never seen it use the second socket.

I’m off to read, be back in a couple of days I hope.

TC