Tech Bits - Digital Ocean and RStudio

Originally this blog was meant to focus mostly on data analysis of horse racing, done primarily in the R statistical programming language. However, as I've spent more and more time writing code in recent weeks, I have more to say of a technical nature and somewhat less regarding actual race analysis. Regular readers, those lonely few, who expect mostly racing content will be disappointed with this post and a collection of future ones. To help racing enthusiasts avoid the tech bits, posts of this nature will be prefaced with Tech Bits.

Recently I've been working on a project for Formbet, a quality purveyor of daily race ratings. Using their historical ratings and results database, proprietor Dave McAuley was interested in providing subscribers with an online searchable database. Exposing the data in this way would allow users to conduct their own research, rather than relying on Dave to run SQL queries on their behalf, and also provide significant added value for subscribers. More on this in future posts.

The larger my code base for this project became, the more my development machine, a slightly aging i3 laptop, began to groan under the strain. I needed a new solution for fast and effective programming. I pulled an older but higher-specification laptop out of the cupboard, only to be highly annoyed once again by its excessive fan noise after less than twenty-four hours. Instead, I turned to Digital Ocean, a cloud-based computing resource provider (I'm not sure if that's how they describe themselves in their marketing spiel, but that's how they're categorised in my mind). In short, Digital Ocean allows customers to start and stop virtual computing resources, hosted remotely on their servers and paid for on an hourly basis. It's pretty straightforward. Amazon, Google, Microsoft and many others already play in this space. The difference with Digital Ocean is that working with their technology is so easy. Literally within minutes of initial sign-up I had started a Debian 7 Virtual Private Server (VPS).

Once the server was up and running, the hard part started. Well, not really, but there were a few steps and some software to install. To cut a long, boring and technology-focused story somewhat short, I installed RStudio Server, an Integrated Development Environment (IDE), which is a fancy way of saying software specifically written for writing software. The server version of RStudio allows people to connect to its interface via a web browser. I could now use the same familiar coding tools I had been using on my local computer. The difference is that, with access via a web browser, the computing overhead of running the software and testing any code written is handled by a more powerful remote server, thus making life easier.

The total cost for the server instance I am running, or droplet in Digital Ocean parlance, is US$0.03 per hour. Each evening, when I have finished work, I stop the server, take a snapshot, which is like a freeze frame of all data and settings on the server at that point in time, and then destroy the droplet. If one forgets to destroy the droplet, hourly charging continues. When I am ready to start work again, I simply re-create the droplet from the saved snapshot. Everything magically re-appears just as it had been.

Without further rambling, suffice it to say that at around US$0.30 per day, and with a more comfortable working environment due to less local overhead, this is a very nice solution. The only real downsides are sometimes spending seconds (seconds!) looking for an open RStudio application on my laptop, when all I really need to do is find the relevant browser tab, and still needing from time to time to access files or historic scripts on my local machine. Small prices to pay for the ease of working with Digital Ocean's droplets and a browser-based RStudio.
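
For anyone wanting to check the sums, here is a minimal Python sketch of the arithmetic behind those figures. The US$0.03 hourly rate and the US$0.30 daily figure come from this post. The function names are made up for illustration, the ten-hour working day is simply what those two numbers imply, and any separate charge for storing snapshots is ignored.

```python
# Rough cost check for a droplet billed by the hour.
# HOURLY_RATE_USD is the rate quoted in the post; everything else is illustrative.
HOURLY_RATE_USD = 0.03


def daily_cost(hours_running, hourly_rate=HOURLY_RATE_USD):
    """Cost in USD of keeping the droplet alive for `hours_running` hours."""
    return round(hours_running * hourly_rate, 2)


def hours_for_budget(daily_budget, hourly_rate=HOURLY_RATE_USD):
    """Hours of droplet time that a given daily spend in USD buys."""
    return round(daily_budget / hourly_rate, 1)


if __name__ == "__main__":
    # A working day that is destroyed each evening (assumed to be 10 hours).
    print("10 hours:", daily_cost(10))
    # Forgetting to destroy the droplet, so it is billed around the clock.
    print("24 hours:", daily_cost(24))
    # Hours implied by the roughly US$0.30 per day mentioned above.
    print("Hours for $0.30:", hours_for_budget(0.30))
```

In other words, US$0.30 per day works out to about ten hours of droplet time, while a droplet left running around the clock would cost about US$0.72 per day at the same rate.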


Digital Ocean - an affiliate link which will provide new accounts with $10 credit on registration.

RStudio - probably the best R development tool.

Formbet - outstanding daily horse racing ratings, and soon a brilliant online data analysis tool.