Biggish files

25th February 2024

I tested serving up files that are about 10 megabytes in size. It worked, but it made the server very unstable, and it became a real problem if the server received multiple requests at the same time. I also tested serving an MP3 file. This almost always blocked the server from handling other requests, and if I navigated away or cancelled before the MP3 finished downloading, the server would crash.

When calling send to send data to the client, there is no guarantee it will send all the data you've asked it to send. Thankfully the function returns how much data it did send, so you can react appropriately. Up to now I've been using some code which repeatedly calls send, keeping track of what actually gets sent, until all the data is gone.

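void send_all(int sock, char *buf, size_t len) {
	size_t total = 0;
	ssize_t sent;	// send returns ssize_t: bytes sent, or -1 on error

	while (total < len) {
		// Blocks until the socket accepts at least some of the remaining bytes.
		sent = send(sock, buf+total, len-total, 0);
		if (sent == -1) {
			fprintf(stderr, "send error: %s\n", strerror(errno));
			return;
		}
		total += sent;
	}
}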

This is a bit like foie gras, force-feeding the client all the bytes whether it wants them or not. Not a good idea. This is also the code that is blocking the server, because if the client is not ready to receive the data, send will just wait until it is, ignoring everything else.

To make the code more client friendly, only sending the client data when it is ready, I needed to ditch my send_all function and instead use polling to detect when the client socket is ready to receive data.

First, I moved the output buffer to the client_state structure. This way each socket has its own output buffer that persists over multiple iterations of the main network loop, a requirement if I'm not sending all the data in one go. I also updated the buffer so it contains a read pointer to keep track of how much of the buffer has been read and sent. I was in two minds about whether this should be a feature of the buffer or something to store in the client_state directly. I went with the buffer for now.
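As a rough sketch, a buffer with a read pointer could look something like this. The names buf_read_ptr, buf_read_max, buf_seek and buf_reset match the calls made in client_write; the struct layout, the fixed 4096-byte capacity and the buf_append helper are assumptions made purely for illustration, not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, assumed for this sketch only. */
struct buffer {
	char data[4096];	/* fixed capacity chosen for the sketch */
	size_t len;		/* bytes written into data */
	size_t read_pos;	/* bytes already read out and sent */
};

/* Hypothetical helper: append bytes, returning how many actually fit. */
size_t buf_append(struct buffer* b, const char* src, size_t n) {
	size_t space = sizeof b->data - b->len;
	if (n > space) n = space;
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return n;
}

/* Start of the bytes that have not been sent yet. */
char* buf_read_ptr(struct buffer* b) {
	return b->data + b->read_pos;
}

/* Number of bytes that have not been sent yet. */
long buf_read_max(const struct buffer* b) {
	return (long)(b->len - b->read_pos);
}

/* Advance the read pointer past n bytes that were just sent. */
void buf_seek(struct buffer* b, long n) {
	assert(n >= 0 && (size_t)n <= b->len - b->read_pos);
	b->read_pos += (size_t)n;
}

/* Empty the buffer so it can hold the next response. */
void buf_reset(struct buffer* b) {
	b->len = 0;
	b->read_pos = 0;
}
```

Keeping the read position inside the buffer means a partial send only needs a buf_seek, and the next call picks up from wherever the last one stopped.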

Second, I wrote a new function called client_write to handle sending data to the client. This makes a call to send but with an additional MSG_DONTWAIT flag instructing it not to block if it can’t send data. If send sends fewer bytes than are available, the code changes the poll file descriptor for this socket to wait for a POLLOUT event. This is triggered when the socket can accept more outgoing data without blocking. If all the data is sent, it sets the events list back to POLLIN to wait for the next request.

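int client_write(struct pollfd* pfd, struct client_state* state) {
	long len = buf_read_max(state->out);
	long sent = send(pfd->fd, buf_read_ptr(state->out), len, MSG_DONTWAIT);
	if (sent < 0) {
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			// The socket is full, not broken: wait for POLLOUT and try again.
			pfd->events = POLLOUT;
			return 0;
		}
		fprintf(stderr, "send error from %s (%d): %s\n", state->address, pfd->fd, strerror(errno));
		return -1;
	}
	if (sent < len) {
		// Partial send: skip the bytes that went out and wait for more room.
		buf_seek(state->out, sent);
		pfd->events = POLLOUT;
	} else {
		// Everything sent: clear the buffer and wait for the next request.
		buf_reset(state->out);
		pfd->events = POLLIN;
	}
	return sent;
}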
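To get a feel for what MSG_DONTWAIT and POLLOUT do, here is a small standalone sketch that is not part of the server. The helpers fill_socket, drain_socket and writable_now are hypothetical names invented for this example. One detail it shows: when the socket cannot take any bytes at all, a non-blocking send does not return a short count. It returns -1 with errno set to EAGAIN or EWOULDBLOCK, which means "full", not "broken".

```c
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send without blocking until the kernel refuses to take more.
 * Returns the bytes accepted, or -1 on a real error.
 * The 64 MB cap is only a safety net for this sketch. */
long fill_socket(int fd) {
	char chunk[4096];
	long total = 0;
	const long cap = 64L * 1024 * 1024;

	memset(chunk, 'x', sizeof chunk);
	while (total < cap) {
		ssize_t sent = send(fd, chunk, sizeof chunk, MSG_DONTWAIT);
		if (sent < 0) {
			if (errno == EAGAIN || errno == EWOULDBLOCK) {
				return total;	/* full, not broken */
			}
			return -1;
		}
		total += sent;
	}
	return total;
}

/* Read and discard everything currently queued on fd.
 * Returns the number of bytes drained. */
long drain_socket(int fd) {
	char chunk[4096];
	long total = 0;

	for (;;) {
		ssize_t got = recv(fd, chunk, sizeof chunk, MSG_DONTWAIT);
		if (got <= 0) {
			return total;
		}
		total += got;
	}
}

/* Ask poll, without waiting, whether fd would accept data right now. */
int writable_now(int fd) {
	struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
	return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLOUT) != 0;
}
```

On Linux, trying this on a pair of connected sockets from socketpair(AF_UNIX, SOCK_STREAM, 0, sv) should show writable_now(sv[0]) being true at first, false after fill_socket(sv[0]), and true again once drain_socket(sv[1]) has emptied the other end. That last transition is the moment poll would report POLLOUT to the server. Exact buffer sizes vary between systems, so the number of bytes accepted is not something to rely on.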

I created the equivalent client_read function and moved most of the code from the client_listener function to it. I tweaked the code that prepares the response so it just populates the output buffer but doesn't try to send it yet. That is left to the client_write function.

Finally I updated the client_listener function to check the incoming events and direct the code to the appropriate read or write function.

void client_listener(struct sockets_list* sockets, int index, struct routes* routes) {
	struct pollfd* pfd = &sockets->pollfds[index];
	struct client_state* state = sockets->states[index];

	int flag = 0;
	if (pfd->revents & POLLIN) {
		flag = client_read(pfd, state, routes);
	} else if (pfd->revents & POLLOUT) {
		flag = client_write(pfd, state);
	}
	
	if (flag < 0) {
		close(pfd->fd);
		client_state_free(state);
		sockets_list_rm(sockets, index);
	}
}

This fixed one of my problems. The server was no longer blocked when one client was downloading large files. Two clients could now happily query the server at the same time. Happy days.

However, the server still crashed if I cancelled a download. This is because I wasn't handling hang-ups correctly. I thought I had dealt with this in the read code: if a call to recv reads zero bytes, I interpret this as the client closing the connection, which is correct. It is, however, possible for the client to close the connection at other times, like when I’m trying to send data to it. It's easy enough to detect this by checking for a POLLHUP event, which I was not doing.

I updated the client_listener code to first check for POLLHUP events before trying to send data to the client. While I was there I also got it to look for the other error events, POLLERR and POLLNVAL. In all three cases it now closes the socket nicely instead of crashing instantly. Super happy days.

void client_listener(struct sockets_list* sockets, int index, struct routes* routes) {
	struct pollfd* pfd = &sockets->pollfds[index];
	struct client_state* state = sockets->states[index];

	int flag = 0;
	if (pfd->revents & POLLHUP) {
		tprintf("connection from %s (%d) hung up\n", state->address, pfd->fd);
		flag = -1;
	} else if (pfd->revents & (POLLERR | POLLNVAL)) {
		fprintf(stderr, "Socket error from %s (%d): %d\n", state->address, pfd->fd, pfd->revents);
		flag = -1;
	} else {
		if (pfd->revents & POLLIN) {
			flag = client_read(pfd, state, routes);
		} else if (pfd->revents & POLLOUT) {
			flag = client_write(pfd, state);
		}
	}

	if (flag < 0) {
		close(pfd->fd);
		client_state_free(state);
		sockets_list_rm(sockets, index);
	}
}
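The two ways of noticing a closed connection described above, a zero-byte recv and a POLLHUP event, can be tried out in isolation. This is a minimal sketch, not code from the server: pending_events and peer_gone are hypothetical helpers. Which flags show up when the peer closes depends on the platform and socket type, so the sketch accepts either signal. POLLHUP, POLLERR and POLLNVAL are reported in revents even if they were not requested in events.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Poll fd once without waiting. Returns the revents bits,
 * 0 if nothing is pending, or -1 if poll itself failed. */
int pending_events(int fd) {
	struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
	if (poll(&pfd, 1, 0) < 0) {
		return -1;
	}
	return pfd.revents;
}

/* Returns 1 if the peer appears to have gone away, 0 otherwise. */
int peer_gone(int fd) {
	int ev = pending_events(fd);
	if (ev < 0) {
		return 1;
	}
	if (ev & (POLLHUP | POLLERR | POLLNVAL)) {
		return 1;
	}
	if (ev & POLLIN) {
		char c;
		/* MSG_PEEK leaves the byte queued, so real data is not consumed.
		 * A result of 0 means the peer closed the connection. */
		ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);
		if (n == 0) {
			return 1;
		}
	}
	return 0;
}
```

With a connected pair from socketpair(AF_UNIX, SOCK_STREAM, 0, sv), peer_gone(sv[0]) should be 0 while both ends are open and 1 once sv[1] has been closed.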

The server seems much more stable now. Before, it would crash every so often after getting a request from random crawlers looking for an admin page or something. It would seem these clients hang up when they don't get what they were looking for. Now that I'm handling hang-ups, they don't seem to crash the server anymore. Another thing fixed. I wonder what will be next?