Internet Relay Chat is one of the earliest open network protocols for communication. Its popularity has waned in the Age of Slack/Twitter/Snapchat but it still has a devoted user-base, especially in the open-source community.
Your mission today, should you choose to accept it, is to write an IRC bot using Core, the Async library for concurrency, and the Angstrom parsing library.
The protocol used by IRC servers and clients is a text-based TCP protocol. The original RFC was published in 1993 (RFC1459) and the protocol was subsequently specified in more detail in 2000 through a series of updated RFCs:
- RFC2810 Internet Relay Chat: Architecture
- RFC2811 Internet Relay Chat: Channel Management
- RFC2812 Internet Relay Chat: Client Protocol
- RFC2813 Internet Relay Chat: Server Protocol
The protocol is rich: There are many different commands that clients can issue to servers as well as protocols for server-to-server and client-to-client communication.
Luckily, for our purposes, to write a bot we’ll only need to implement a few of these:
NICK <nickname>
USER <user> <mode> <unused> <realname>
JOIN <channels> [<keys>]
PRIVMSG <msgtarget> <message>
PING <server1>
PONG <server1>
IRC is a text-based protocol made up of \r\n-delimited messages and as a
result it’s fairly human-readable. It’s easy to debug or play around by using
a utility like nc or socat or reading the verbatim messages exchanged
between clients and servers.
Paraphrasing RFC2812:
- Servers and clients send each other messages, which may or may not generate a reply.
- Each IRC message may consist of up to three main parts, each separated by a
space: an optional prefix, a command, and any number of parameters for the
command (from 0 to 15, inclusive).
NICK jdoe
JOIN #some-channel
- Messages are a maximum of 512 characters including a required delimiter on
the end of \r\n (i.e. there is space for 510 useful characters)[fn:1].
- When the final parameter is prefixed with a ‘:’ character, the value of
that parameter, including any space characters, is the remainder of the
message. E.g.:
PRIVMSG jdoe :This is a long parameter with spaces in it.
[fn:1] In examples below we omit the terminating CRLF for convenience.
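The framing rules above can be sketched as a small serializer. This is only an illustration under stated assumptions: it uses the OCaml standard library alone (not Core), and the `message` record and the `to_wire` and `needs_colon` functions are hypothetical names that do not come from bin/bot.ml or from any IRC library.

```ocaml
(* Hypothetical representation of one IRC message, following the
   prefix / command / parameters structure described above. *)
type message = {
  prefix : string option;  (* e.g. Some "irc.example.com" *)
  command : string;        (* e.g. "PRIVMSG" or "001" *)
  params : string list;    (* at most 15 entries *)
}

(* A parameter needs the ':' marker if it is empty, contains a space,
   or itself starts with ':'. *)
let needs_colon p = p = "" || String.contains p ' ' || p.[0] = ':'

(* Render a message as it would appear on the wire, CRLF included.
   Returns Error if the message breaks one of the rules above. *)
let to_wire { prefix; command; params } =
  let rec render = function
    | [] -> Ok []
    | [ last ] -> Ok [ (if needs_colon last then ":" ^ last else last) ]
    | p :: rest ->
      if needs_colon p then Error "only the final parameter may contain spaces"
      else Result.map (fun tl -> p :: tl) (render rest)
  in
  if List.length params > 15 then Error "too many parameters"
  else
    match render params with
    | Error e -> Error e
    | Ok rendered ->
      let head = match prefix with None -> [] | Some p -> [ ":" ^ p ] in
      let line = String.concat " " (head @ (command :: rendered)) in
      (* 510 useful characters plus the two-character CRLF delimiter. *)
      if String.length line > 510 then Error "message too long"
      else Ok (line ^ "\r\n")
```

For instance, `to_wire { prefix = None; command = "NICK"; params = [ "jdoe" ] }` evaluates to `Ok "NICK jdoe\r\n"`. The sketch does not check parameters for embedded CR or LF characters, which a real bot would also need to reject.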
As alluded to above, some messages warrant replies. A reply is just a message with some additional constraints:
- The optional prefix (described above) is always included.
- The command is a three-digit reply code (the full list of possibilities is specified in section 5 of RFC2812).
- The first command parameter is always the “target” of the reply, for our purposes, typically a nick.
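As a rough sketch of how a bot might tell replies apart from ordinary messages, the helper below checks for a three-digit command and splits off the target. It assumes the message has already been broken into a command and a parameter list; `reply_code` and `classify_reply` are hypothetical names, and only the standard library is used.

```ocaml
(* Hypothetical helper: interpret a command as a numeric reply code.
   Returns None for ordinary commands such as "PRIVMSG". *)
let reply_code command =
  let is_digit c = c >= '0' && c <= '9' in
  if String.length command = 3
     && is_digit command.[0] && is_digit command.[1] && is_digit command.[2]
  then Some (int_of_string command)
  else None

(* For a reply, return the code, the target (the first parameter), and
   the remaining parameters. Anything else yields None. *)
let classify_reply ~command ~params =
  match reply_code command, params with
  | Some code, target :: rest -> Some (code, target, rest)
  | _ -> None
```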
Putting it all together, here’s a full example of a client (nick “jdoe”) connecting to an IRC server, identifying herself, joining a channel, sending a message, and receiving a response. Here, ‘>’ denotes a message from jdoe’s IRC client to the server, and ‘<’ denotes a response from the server to jdoe’s client.
> NICK jdoe
> USER jdoe * * :Jane Doe
< :irc.example.com 001 jdoe :Welcome to the example.com Internet Relay Network jdoe!jdoe@some-hostname
< :irc.example.com 002 jdoe :Your host is some-hostname, running ircd
< :irc.example.com 003 jdoe :This server was created Sun, 11 Mar 2018 23:18:53 EDT
< :irc.example.com 004 jdoe example.com ircd-1.0 iox beIikntplsZ
< :irc.example.com 251 jdoe There are 1 users and 0 invisible on 1 servers
< :irc.example.com 255 jdoe I have 1 clients and 1 servers
< :irc.example.com 422 jdoe :MOTD File is missing
> JOIN #test
< :jdoe!jdoe@some-hostname JOIN #test
< :irc.example.com 331 jdoe #test :No topic is set
< :irc.example.com 353 jdoe = #test :@jdoe
< :irc.example.com 366 jdoe #test :End of NAMES list
> PRIVMSG #test :Hello test!
< :psmith!psmith@another-hostname JOIN #test
< :psmith!psmith@another-hostname PRIVMSG #test :Hello jdoe!
< :psmith!psmith@another-hostname QUIT :connection closed
> QUIT
With this, we should know just about everything we need to know to be able to write a functional bot!
For the purposes of this workshop, we’ll use a channel on the Freenode IRC
network called ##js-ocaml-workshop-2018. You can connect to Freenode at
irc.freenode.org:6667[fn:2]. Since IRC is text-based, you can use a utility
like netcat, socat, or nc to connect to an IRC server and try sending
some commands manually. nc has a -C flag which will cause it to terminate
your lines with \r\n as we want here:
$ nc -C irc.freenode.org 6667
Once you’re connected, the first order of business is to identify yourself
by issuing a NICK command to register a nickname for yourself and then the
USER command to give a bit more information. For example:
NICK jdoe
USER jdoe * * :Jane Doe
After that, you can join the channel mentioned above like this:
JOIN :##js-ocaml-workshop-2018
And send a message to everyone else in the channel like this:
PRIVMSG ##js-ocaml-workshop-2018 :Hi everyone!
Once you’re done, you can leave the channel with the PART command, issue a
QUIT to disconnect entirely, or just close the connection by exiting nc
with Ctrl-C.
[fn:2] In practice, you should almost certainly be connecting using TLS, but we’ll keep things simple and leave that out of scope for this exercise.
Now that you have the basics of the IRC protocol down, try writing a simple bot which connects to a (configurable) channel and responds to anyone who says “hi” or “hello” with a friendly greeting. Make sure it doesn’t get itself into a politeness loop by responding to its own greetings!
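One possible shape for the greeting logic is sketched below. It assumes the bot has already extracted the sender’s nick and the message text from an incoming PRIVMSG; `is_greeting` and `should_greet` are hypothetical names, and the code uses only the standard library. Comparing the sender against the bot’s own nick is one simple way to avoid answering yourself, though it will not stop two different bots from greeting each other forever.

```ocaml
(* Hypothetical helper: true if [text] contains "hi" or "hello" as a
   whole word, ignoring case and punctuation. *)
let is_greeting text =
  let is_letter c = c >= 'a' && c <= 'z' in
  String.lowercase_ascii text
  |> String.map (fun c -> if is_letter c then c else ' ')
  |> String.split_on_char ' '
  |> List.exists (fun word -> word = "hi" || word = "hello")

(* Decide whether to answer a PRIVMSG. [own_nick] is the bot's nickname
   and [sender] is the nick taken from the message prefix. Nicks are
   compared case-insensitively (ASCII only in this sketch). *)
let should_greet ~own_nick ~sender ~text =
  String.lowercase_ascii sender <> String.lowercase_ascii own_nick
  && is_greeting text
```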
To help you on your way, bin/bot.ml defines a simple command-line based bot
which identifies itself, connects to a configurable channel, and sends a
single message before disconnecting without making any attempt to validate
arguments or check for error replies from the server.
If you’re going to keep your bot connected for a long time (as you probably should), you’ll need to handle PING messages from the server and respond with a PONG so your bot isn’t disconnected for lack of activity.
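A minimal sketch of the PING handling might look like the following. It assumes the bot reads one line at a time with the trailing CRLF already removed, and that the server sends PING without a prefix, as is common; `pong_for` is a hypothetical name and only the standard library is used.

```ocaml
(* Hypothetical helper: given one raw line from the server (without the
   trailing CRLF), return the PONG line to send back, if any. The PING
   argument is echoed back unchanged. *)
let pong_for line =
  let n = String.length line in
  if n > 5 && String.sub line 0 5 = "PING " then
    Some ("PONG " ^ String.sub line 5 (n - 5) ^ "\r\n")
  else None
```

A line such as `PING :irc.example.com` yields `Some "PONG :irc.example.com\r\n"`, while any other line yields `None` and can be passed on to the rest of the bot.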
CAUTION: If your bot takes input then you should be very careful to consider whether you need to sanitize it before using it. The Internet is a scary place so you should be careful not to trust any old message.
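One concrete risk is text that contains CR or LF characters: if the bot copies such text into an outgoing message, the sender can smuggle in an extra IRC command. The sketch below shows one way to neutralize that, replacing the dangerous characters with spaces. The name `sanitize` is hypothetical, and this covers only line injection, not every hazard the caution above has in mind.

```ocaml
(* Hypothetical helper: replace characters that could end the current
   IRC message early (CR, LF) or that the protocol forbids (NUL). *)
let sanitize text =
  String.map
    (fun c -> match c with '\r' | '\n' | '\000' -> ' ' | c -> c)
    text
```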
Once you have your bot working (woo!) there are a number of different improvements that you can make:
- Update your bot to greet users by name
- Batch greetings of users who join at close to the same time into one message
- Extend your bot to be able to connect to multiple channels at the same time and spread happiness across the whole network
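For the first two improvements, a sketch of the string handling involved is shown below. It assumes prefixes of the form nick!user@host as in the transcript earlier, and it leaves out the timing logic for deciding which joins belong in one batch. `nick_of_prefix` and `batch_greeting` are hypothetical names, and the greeting wording is just an example.

```ocaml
(* Extract the nick from a prefix of the form "nick!user@host".
   A prefix without '!' (e.g. a server name) is returned unchanged. *)
let nick_of_prefix prefix =
  match String.index_opt prefix '!' with
  | Some i -> String.sub prefix 0 i
  | None -> prefix

(* Build a single greeting for everyone who joined recently.
   Returns None when there is nobody to greet. *)
let batch_greeting nicks =
  match List.rev nicks with
  | [] -> None
  | [ only ] -> Some (Printf.sprintf "Hello %s!" only)
  | last :: rest ->
    let others = String.concat ", " (List.rev rest) in
    Some (Printf.sprintf "Hello %s and %s!" others last)
```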
Once you have a working bot that can handle the above, you’re well on your way to making it support whatever you want! There are a few different directions in which you can proceed.
One idea is to amend your bot to use the angstrom library to parse IRC protocol messages so that you can be more sure that it can handle all the different messages that IRC servers in the wild might throw at you. Be sure to write some expect tests to make sure your parser works!
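As a starting point, here is a rough sketch of what such a parser could look like. It is deliberately simplified and makes several assumptions: the input is a single line with the CRLF already stripped, the `message` record and all parser names are hypothetical, and `parse_string` is called with the `~consume` argument found in recent Angstrom releases (older releases omit that argument). It does not enforce the 15-parameter or 512-character limits.

```ocaml
open Angstrom

(* Hypothetical result type for one parsed IRC message. *)
type message = {
  prefix : string option;
  command : string;
  params : string list;
}

let is_space c = c = ' '
let is_word_char c = c <> ' ' && c <> '\r' && c <> '\n'
let is_alnum = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' -> true
  | _ -> false

(* One or more separating spaces. *)
let spaces = take_while1 is_space

(* ":server-or-nick " at the very start of the line. *)
let prefix = char ':' *> take_while1 is_word_char <* spaces

(* A command word such as PRIVMSG, or a numeric reply such as 001. *)
let command = take_while1 is_alnum

(* ":rest of the line", spaces included. *)
let trailing = char ':' *> take_while (fun c -> c <> '\r' && c <> '\n')

(* An ordinary parameter without spaces. *)
let middle = take_while1 is_word_char

(* Each parameter is preceded by at least one space. *)
let param = spaces *> (trailing <|> middle)

let message =
  lift3
    (fun prefix command params -> { prefix; command; params })
    (option None (prefix >>| fun p -> Some p))
    command
    (many param)

(* Parse one complete line; returns Error if any input is left over. *)
let parse line = parse_string ~consume:Consume.All message line
```

A function like `parse` is convenient to exercise from expect tests, for example by feeding it each line of the transcript shown earlier and printing the resulting record.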
If you want to keep expanding the skills of your bot, you are bounded only by your imagination. You could write a bot which:
- Archives all the messages sent to a channel for future reading.
- Prints a fortune from /usr/bin/fortune on demand.
- Knows how to do unit conversions.
- Uses a Markov chain and a text corpus of your choice to generate made-up but convincing sounding responses to people’s messages.
Something else! The world is your oyster.
Just a reminder to BE CAREFUL: If your bot takes input then you should be very careful to consider whether you need to sanitize it before using it.