Tinn v0.1.0

12th February 2024

I would like to introduce version 0.1.0 of Tinn, my web server. You can see the full source code on GitHub. Most of the credit has to go to Brian "Beej Jorgensen" Hall and his book "Beej's Guide to Network Programming using Internet Sockets". You can check out the book on his web site at https://beej.us/guide/bgnet or, like me, you can buy a real-world physical copy.

A check of the source code for my server will show that a good chunk of the code comes straight from Beej's book. To be fair to me, I typed every character myself; this is part of my learning process. If I just copy and paste code, it doesn't go in my head, but if I retype it, I'm forced to think about it, which I feel helps me learn.

This is not going to be a tutorial, so I'm not going to explain every line of code. First, others have already done a better job, and second, I'm not that confident my code doesn't have a horrible memory leak. That said, here is a quick breakdown of how the code works. It's all in a single file, stub.c, for now because the whole program is only 225 lines long. Skipping to the main function on line 92, I start with all the networking stuff, which in summary completes the following tasks:

Gets address info for the local machine
Creates a socket
Binds that socket to a port
Starts listening to that socket
Adds that socket to a list of sockets for the call to poll later
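The first four steps can be sketched roughly as follows. This is a minimal illustration in the style of Beej's guide, not the actual Tinn source: the helper name open_listener, the backlog of 10, and the use of SO_REUSEADDR are all assumptions made for the example.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from Tinn): returns a listening TCP socket
   bound to `port` on the local machine, or -1 on failure. */
int open_listener(const char *port)
{
    struct addrinfo hints, *results, *p;
    int fd = -1;
    int yes = 1;

    /* Step 1: get address info for the local machine. */
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */
    hints.ai_flags = AI_PASSIVE;     /* fill in the local address */
    if (getaddrinfo(NULL, port, &hints, &results) != 0)
        return -1;

    /* Steps 2 and 3: create a socket and bind it, trying each result. */
    for (p = results; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        /* Assumption: allow quick restarts on the same port. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd < 0)
        return -1;

    /* Step 4: start listening (the backlog of 10 is an arbitrary choice). */
    if (listen(fd, 10) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would pass the port as a string, for example open_listener("8080"), and then add the returned descriptor to whatever list it later hands to poll.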

The server then enters an infinite loop which polls its list of sockets for when they have something to say. Initially the list will only contain that first server socket. When it does have something to say, it's because a client wants to connect to the server. In this case the program accepts that connection creating a new socket for that client. This new client socket is then added to the list of sockets to poll.
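One pass of that loop might look something like the sketch below. The fd_list struct, the fixed capacity of 64, and the function names are invented for illustration; the real code may well manage its list differently.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 64 /* assumed fixed capacity, for simplicity */

/* Hypothetical list of sockets to hand to poll(). */
struct fd_list {
    struct pollfd fds[MAX_FDS];
    nfds_t count;
};

/* Adds a socket to the list; returns -1 if the list is full. */
int fd_list_add(struct fd_list *list, int fd)
{
    if (list->count >= MAX_FDS)
        return -1;
    list->fds[list->count].fd = fd;
    list->fds[list->count].events = POLLIN; /* tell us when it is readable */
    list->fds[list->count].revents = 0;
    list->count++;
    return 0;
}

/* Removes an entry by moving the last entry into its slot. */
void fd_list_remove(struct fd_list *list, nfds_t index)
{
    list->fds[index] = list->fds[list->count - 1];
    list->count--;
}

/* One pass of the loop: wait for activity, accept any new client on the
   listener, and return the index of a client socket with data to read,
   or -1 if there is none (timeout, error, or only new connections). */
int poll_once(struct fd_list *list, int listener, int timeout_ms)
{
    nfds_t i;

    if (poll(list->fds, list->count, timeout_ms) <= 0)
        return -1;

    for (i = 0; i < list->count; i++) {
        if (!(list->fds[i].revents & POLLIN))
            continue;
        if (list->fds[i].fd == listener) {
            /* The server socket is readable: a client wants to connect. */
            int client = accept(listener, NULL, NULL);
            if (client >= 0 && fd_list_add(list, client) < 0)
                close(client); /* list full, so drop the connection */
        } else {
            return (int)i; /* a client socket has a request waiting */
        }
    }
    return -1;
}
```

The server's infinite loop would call poll_once repeatedly with a timeout of -1 (wait forever), handle the request on the returned socket, then close it and call fd_list_remove.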

When a client socket gets chatty, it's because the client is making a request. This is where the code leaves the comfort of Beej's guide and instead we have to start thinking about HTTP. For now I'm ignoring most of the request and just concerning myself with the first two words. In an HTTP request the first word is the method, the second is the resource path. I will worry about the rest of the request at a later date.
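Pulling those first two words out of a request could be done along these lines. The function name parse_request_line and its signature are made up for this sketch, which assumes the request has already been read into a NUL-terminated buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the first word of `request` into `method`
   and the second into `path`. Returns 0 on success, or -1 if either word
   is missing or does not fit in its buffer. */
int parse_request_line(const char *request,
                       char *method, size_t method_size,
                       char *path, size_t path_size)
{
    const char *rest;
    size_t path_len;
    /* The method runs up to the first space (or the end of the line). */
    size_t method_len = strcspn(request, " \r\n");

    if (method_len == 0 || method_len >= method_size ||
        request[method_len] != ' ')
        return -1;

    /* The path is the next run of non-space characters. */
    rest = request + method_len + 1;
    path_len = strcspn(rest, " \r\n");
    if (path_len == 0 || path_len >= path_size)
        return -1;

    memcpy(method, request, method_len);
    method[method_len] = '\0';
    memcpy(path, rest, path_len);
    path[path_len] = '\0';
    return 0;
}
```

Given "GET / HTTP/1.1" followed by headers, this fills method with "GET" and path with "/"; everything after the second word is ignored.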

At the moment the server only supports the "GET" method, and only supports the path "/". For any other method the server responds with an HTTP error code 501 (Not Implemented) and for any other path responds with a 404 (Not Found). For a valid GET / request, the program loads the contents of the content.html file, composes the correct HTTP response with the appropriate headers (Content-Type and Content-Length being the key ones), and sends the response. The client socket is then closed and removed from the list of sockets.
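As a rough sketch of that logic, the status choice and the response layout might look like this. The names status_for and build_response are hypothetical, and the "HTTP/1.1" version string, the "text/html" content type, and the "Connection: close" header are assumptions rather than details taken from the Tinn source.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Picks a status code using the rules described above. */
int status_for(const char *method, const char *path)
{
    if (strcmp(method, "GET") != 0)
        return 501; /* Not Implemented */
    if (strcmp(path, "/") != 0)
        return 404; /* Not Found */
    return 200;
}

/* Writes the status line, headers, a blank line, and the body into `out`.
   Returns the total number of bytes to send, or -1 if `out` is too small.
   The result is raw bytes and is not NUL-terminated after the body. */
int build_response(char *out, size_t out_size, int status,
                   const char *reason, const char *body, size_t body_len)
{
    int header_len = snprintf(out, out_size,
                              "HTTP/1.1 %d %s\r\n"
                              "Content-Type: text/html\r\n"
                              "Content-Length: %zu\r\n"
                              "Connection: close\r\n"
                              "\r\n",
                              status, reason, body_len);

    /* snprintf reports the length it wanted, so check for truncation. */
    if (header_len < 0 || (size_t)header_len >= out_size ||
        body_len > out_size - (size_t)header_len)
        return -1;

    /* Copy the body by byte count rather than as a C string. */
    memcpy(out + header_len, body, body_len);
    return header_len + (int)body_len;
}
```

Note that Content-Length is the size of the body in bytes, which is why the body is passed with an explicit length instead of relying on string functions.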

That's it. Step one complete. I have the beginnings of a web server. I have a much better understanding of how sockets work. Most importantly, I have a list of things I'm eager to do next.

First up, this has highlighted how rusty my C coding is. Specifically, I need to review how strings work. I'm fairly certain the way I'm composing the responses is suboptimal. I didn't expect this to trip me up, but years of working in higher level languages have left me reaching for a concatenation operator that does not exist.
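One common stand-in for a concatenation operator in C is a small "append with formatting" helper built on vsnprintf. The sketch below is only one possible approach, and the strbuf struct and strbuf_appendf function are names invented for the example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-size string builder. The caller supplies the storage;
   `len` is the number of characters written so far, always < `size`. */
struct strbuf {
    char *data;
    size_t size;
    size_t len;
};

/* Appends printf-style formatted text. Returns 0 on success, or -1 if the
   text would not fit, in which case the buffer is left as it was. */
int strbuf_appendf(struct strbuf *b, const char *fmt, ...)
{
    va_list args;
    size_t room = b->size - b->len;
    int n;

    va_start(args, fmt);
    n = vsnprintf(b->data + b->len, room, fmt, args);
    va_end(args);

    if (n < 0 || (size_t)n >= room) {
        b->data[b->len] = '\0'; /* undo any partial write */
        return -1;
    }
    b->len += (size_t)n;
    return 0;
}
```

With a char array as storage and len starting at 0, repeated calls such as strbuf_appendf(&b, "Content-Length: %d\r\n", 42) build up the string piece by piece, and the return value flags the point at which the buffer would have overflowed.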

On a similar note, the server has an issue with Unicode. The UTF-8 content file had some characters (e.g. fancy quotes) that take more than one byte to encode, and these got corrupted somewhere in the chain. I need to fix that.
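Where the corruption actually happens is not known, so the following is only an illustration of the bytes-versus-characters distinction that is worth ruling out. In UTF-8, a fancy quote such as U+201C takes three bytes, so a character count and a byte count disagree. The function names here are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Counts Unicode code points in a UTF-8 string by skipping continuation
   bytes, which always have the bit pattern 10xxxxxx. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* The number Content-Length needs: bytes, not characters. */
size_t body_bytes(const char *s)
{
    return strlen(s);
}
```

For a body consisting of an opening fancy quote, the letter q, and a closing fancy quote, utf8_code_points gives 3 while body_bytes gives 7. Anything in the chain that sizes a buffer or a header by characters rather than bytes would cut such content short. Another possibility, again only a guess, is that the browser is not told the encoding; a Content-Type of "text/html; charset=utf-8" states it explicitly.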

Then I need more functionality, starting with being able to return different files based on the path and for those files to support different content types. Then I plan to add some much-needed style to this blog. How do images work over HTTP?