This post comes out of some retrospective thinking this month. I worked on a mid-sized Node.js installation for the last three quarters. We had a whole pile of JavaScript that ran in the web browser and a handful of Node.js processes that delivered that JavaScript and answered all the API calls it made.
The first thing to note is that we did not use Node to do the heavy lifting. If it involved touching a database, it was shoved out to a LAMP app. A lot of folks will probably ask why we didn’t use something like MongoDB or a similar DB. Short answer: we couldn’t find anyone who was using it at the kind of volume we expected and who had great things to say about it. There was one really scary response, which amounted to ‘we have a dedicated Ops guy keeping MongoDB running for this app.’ If the choice is between hiring an Ops guy solely to keep a single DB running and spending the same money on an engineer writing new features on a well-understood DB, then it’s a pretty easy choice in my view.
Node acted as a middle layer that processed a number of API calls. The company had tons of data in a few different locations: Redis, MySQL, HBase, etc. After Node grabbed the data and parsed it, it got shoved out to a connection. We learned quickly that the way we were going caused Node to crash on us regularly. Enter Forever, a little Node module that restarts Node when it crashes. You’ll need it quite a bit early on. We also put Nginx in front of all our Node processes. Nginx is a great tool for reverse-proxy load balancing.
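To make the restart idea concrete, here is a minimal sketch of a restart-on-crash supervisor. It is not Forever's implementation, just an illustration of the pattern. The names `supervise`, `maxRestarts`, and `spawnFn` are made up for this example, and the restart cap is my own assumption to avoid a tight crash loop.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: start a command and restart it whenever it exits
// abnormally (non-zero exit code, or killed by a signal so code is null).
function supervise(command, args, options = {}) {
  const maxRestarts = options.maxRestarts ?? 5; // assumed cap, not from Forever
  const spawnFn = options.spawnFn ?? spawn;     // injectable so it can be faked
  let restarts = 0;

  function start() {
    const child = spawnFn(command, args, { stdio: 'inherit' });
    child.on('exit', (code) => {
      const crashed = code !== 0;
      if (crashed && restarts < maxRestarts) {
        restarts += 1;
        start(); // bring the process straight back up
      }
    });
  }

  start();
  return { getRestarts: () => restarts };
}
```

Usage would look something like `supervise('node', ['app.js'], { maxRestarts: 10 })`, where `app.js` is a placeholder for your own entry point. A clean exit (code 0) is left alone; anything else triggers a restart until the cap is hit.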
From an Operations POV the app was a black box. Requests were made, stuff happened, a response occurred. Sometimes it was an HTTP 200 with valid data, sometimes it was a 500 and we had no idea what was going on. To deal with this I ended up implementing lots of StatsD checks. StatsD is a small, lightweight Node server with accompanying client libraries that let you implement a few basic types of checks, such as counters and timers, on events and network calls. The clients send their results out in UDP packets. This has the advantage of being very low overhead and not requiring the receiver to be alive. I updated a version of the StatsD library for JS that Steve Ivy wrote.
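For a feel of how little is involved, here is a rough sketch of a fire-and-forget StatsD client built on Node's `dgram` module. This is not the library mentioned above; `formatMetric` and `createClient` are names invented for the example, and the host and port defaults assume a StatsD daemon listening locally on its usual port, 8125.

```javascript
const dgram = require('dgram');

// StatsD's line protocol is plain text: <name>:<value>|<type>
// where type is "c" for a counter or "ms" for a timer.
function formatMetric(name, value, type) {
  return `${name}:${value}|${type}`;
}

// Hypothetical minimal client; host and port are assumptions.
function createClient(host = '127.0.0.1', port = 8125) {
  const socket = dgram.createSocket('udp4');
  socket.unref(); // don't keep the process alive just for metrics

  function send(line) {
    // UDP is fire-and-forget: if nobody is listening, the packet is dropped.
    // The empty callback swallows send errors so metrics never crash the app.
    socket.send(Buffer.from(line), port, host, () => {});
  }

  return {
    increment: (name) => send(formatMetric(name, 1, 'c')),
    timing: (name, ms) => send(formatMetric(name, ms, 'ms')),
    close: () => socket.close(),
  };
}
```

With this, `createClient().increment('api.redis.calls')` puts the line `api.redis.calls:1|c` on the wire (the metric name is a placeholder). Because nothing waits for a reply, a dead StatsD server costs the app nothing.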
From here we could trend everything happening inside the app. New release is a little slow? Oh, look! It’s making 50% more calls to Redis, so something is up. In the Ops game this is great data: I can use all those finely honed visual processing skills to spot behavior changes in the app quickly. That’s about it; I’ll be fleshing this out later.
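One way to get that kind of per-dependency data is to wrap each outbound call so it reports a count and a duration. The sketch below is an assumption about how you might do it, not what our app actually did; `instrument` and the `report` callback are hypothetical names, and `report` stands in for whatever metrics client you use.

```javascript
// Wrap an async function so every call reports one counter and one timing.
// `report(metricName, value, type)` is a hypothetical callback supplied by
// the caller, for example a thin adapter over a StatsD client.
function instrument(name, fn, report) {
  return async function (...args) {
    const start = Date.now();
    report(`${name}.calls`, 1, 'c'); // count the call up front
    try {
      return await fn.apply(this, args);
    } finally {
      // Record the duration whether the call succeeded or threw.
      report(`${name}.duration`, Date.now() - start, 'ms');
    }
  };
}
```

Wrapping a Redis lookup this way, say `instrument('redis.get', getFromRedis, report)` with `getFromRedis` being your own function, means a release that suddenly makes far more Redis calls shows up as a jump in the `redis.get.calls` graph.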