This is a simple little program that I wrote mostly to practice using CoffeeScript, Node, and Promises. I have a lot of little web projects going on my PC at any given time, and keeping track of them all is sometimes difficult. I wanted to make that task easier.
The basics of the task are simple: get a list of all the ports where I usually drop off a web-oriented project, try to fetch the home page on each, and if it's there, pull the title out of the HTML.
For this project, I used the excellent node-promise library, mostly because its behavior most closely matched that of jQuery's Deferreds. I also used the scraper library for screen-scraping the HTML; scraper hands back a jQuery object for the fetched page, which you can manipulate just as you would on the client.
Although Node is famous for being asynchronous, let's face it: some things must be done in order. In this case, you must get all the ports, then visit every port to get its title, and then print the results. Because I'm going to use Haml, I must also get the template; this can happen in parallel with, well, just about everything else. But we cannot display the results until we have all the titles and the template.
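To make that ordering concrete, here is a minimal sketch in plain JavaScript with native Promises rather than CoffeeScript and node-promise. Every step is a made-up stand-in that just waits on a timer, and the names (step, getPorts, getTitle, getTemplate, run) are mine for illustration. Unlike the program below, it kicks off the template read before the ports come back, simply to show that nothing forces it to wait.

```javascript
// Hypothetical stand-ins for the real steps; each records when it starts and ends.
const events = [];
const step = (name, value, ms) =>
  new Promise((resolve) => {
    events.push('start ' + name);
    setTimeout(() => {
      events.push('end ' + name);
      resolve(value);
    }, ms);
  });

const getPorts = () => step('ports', [3000, 8080], 10);
const getTitle = (port) => step('title ' + port, [port, 'Site on ' + port], 10);
const getTemplate = () => step('template', '%ul', 5);

function run() {
  // The template read starts right away and overlaps everything else.
  const template = getTemplate();
  // Titles cannot start until the port list is known.
  const titles = getPorts().then((ports) => Promise.all(ports.map(getTitle)));
  // Rendering has to wait for both.
  return Promise.all([titles, template]).then(([rows, tpl]) => ({ rows, tpl }));
}

const done = run();
done.then(() => console.log(events.join('\n')));
```

The printed event log shows the template starting first, the title fetches starting only after the port list has arrived, and nothing being handed to the renderer until all of them have ended.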
Literate Program
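As is my usual practice, this article was written with the Literate Programming toolkit Noweb. Where you see something that looks like <this>, it's a placeholder for code described elsewhere in the document. A placeholder followed by an equals sign marks the place where that code is defined. The link (U->) indicates that the code is used later in the document, and (<-U) indicates that it was used earlier.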
The Program
The first step is to get all the ports. Netstat is the cheapest way to do that, and spawning processes and reading from them is something Node does very well.
The only thing of note here is the promise. This function returns a promise that, when resolved, delivers everything netstat wrote to standard output.
<a href="#NWD1fH8dg-1" name="NW1fH8dg-2a1g46-1"></a><dfn><get ports>=</dfn> (<a href="#NWD1fH8dg-7">U-></a>)
get_ports = () ->
    promise = new deferred.Promise()
    data = ''
    accrue = (d) ->
        data += d
    netstat = spawn 'netstat', ['-anp', '-t', 'tcp']
    netstat.stdout.on 'data', accrue
    netstat.on 'exit', () ->
        promise.resolve(data)
    promise
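For comparison, here is a minimal sketch of the same pattern in plain JavaScript with a native Promise instead of node-promise. The helper name collectOutput is mine, not part of any library. It listens for 'close' rather than 'exit', because Node only promises that the child's stdio streams are finished once 'close' fires; at 'exit' they might still be open.

```javascript
const { spawn } = require('child_process');

// Run a command and resolve with everything it wrote to stdout.
function collectOutput(command, args) {
  return new Promise((resolve, reject) => {
    let data = '';
    const child = spawn(command, args);
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => {
      data += chunk;
    });
    // 'error' fires when the process could not be spawned at all.
    child.on('error', reject);
    // 'close' fires after the process has ended and its stdio streams are closed.
    child.on('close', () => resolve(data));
  });
}
```

Something like collectOutput('netstat', ['-an']) would then stand in for get_ports(); the exact flags depend on your platform's netstat.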
Once we have the ports, we want to de-dupe them, as netstat sometimes returns duplicates. The de-dupe is trivial in CoffeeScript:
<a href="#NWD1fH8dg-2" name="NW1fH8dg-2Wcoz7-1"></a><dfn><de-duplicate an array>=</dfn> (<a href="#NWD1fH8dg-7">U-></a>)
dedupe = (arr) ->
    obj = {}
    for i in arr
        obj[i] = 0
    for i of obj
        i
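One side effect of the object-key trick is worth seeing in action: object keys are always strings, so the port numbers come back as strings. That's harmless here, since they only get pasted into a URL. The sketch below is plain JavaScript rather than CoffeeScript, the function names are mine, and the Set version is an alternative I'm showing for contrast, not what the program uses.

```javascript
// Same idea as the chunk above: use the keys of an object as a set.
function dedupeViaKeys(arr) {
  const seen = {};
  for (const item of arr) seen[item] = 0;
  return Object.keys(seen); // keys are strings, whatever went in
}

// A Set does the same job and keeps the original types and order.
function dedupeViaSet(arr) {
  return Array.from(new Set(arr));
}

console.log(dedupeViaKeys([8080, 3000, 8080])); // [ '3000', '8080' ]
console.log(dedupeViaSet([8080, 3000, 8080])); // [ 8080, 3000 ]
```

Note that Object.keys also lists integer-like keys in ascending numeric order, which is why 3000 comes out ahead of 8080 in the first result.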
For each port, we want to get the title. Nothing actually blocks here: each call returns a promise, so the caller can wait until the fetch has finished before using the result.
<a href="#NWD1fH8dg-3" name="NW1fH8dg-1lIkx5-1"></a><dfn><get titles>=</dfn> (<a href="#NWD1fH8dg-7">U-></a>)
get_title = (port) ->
    promise = new deferred.Promise()
    scraper 'http://localhost:' + port, (err, jQuery) ->
        if err
            promise.resolve [port, err.message]
            return
        promise.resolve [port, jQuery('title').text()]
    promise
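Notice that get_title resolves even when the fetch fails, putting the error message where the title would go. Here is a minimal sketch of why that is handy, in plain JavaScript with native Promises: with Promise.all, a single rejection rejects the whole batch, so turning failures into ordinary rows keeps one dead port from sinking the report. The fetcher is a fake that I made up for the example, as are the names titleRow and allRows.

```javascript
// Hypothetical fetcher: rejects for one port to simulate a dead server.
const fakeFetchTitle = (port) =>
  port === 8080
    ? Promise.reject(new Error('connection refused'))
    : Promise.resolve('Project on ' + port);

// Mirror get_title: a failure becomes an ordinary [port, message] row.
const titleRow = (port, fetchTitle) =>
  fetchTitle(port).then(
    (title) => [port, title],
    (err) => [port, err.message]
  );

const allRows = (ports, fetchTitle) =>
  Promise.all(ports.map((port) => titleRow(port, fetchTitle)));

// rows: [[3000, 'Project on 3000'], [8080, 'connection refused']]
allRows([3000, 8080], fakeFetchTitle).then((rows) => console.log(rows));
```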
We want to get the titles from all of the ports, and then spew out the results. As this is a CGI program, we want a simple header.
It's the deferred.when() wrapped around a double deferred.all() that makes the difference. when() takes a promise as an argument. deferred.all() takes a bunch of promises and returns a single promise that resolves when all of the promises passed in finish. So here, we're saying that all of the get_title() operations, and the getTemplate() operation, must complete before we go on to render the results. Notice how the data that gets returned is the array from the inner deferral and the template.
<a href="#NWD1fH8dg-4" name="NW1fH8dg-1qTuoA-1"></a><dfn><display ports>=</dfn> (<a href="#NWD1fH8dg-7">U-></a>)
display_ports = (data) ->
    <a href="#NWD1fH8dg-5" name="NW1fH8dg-1qTuoA-1-u1"></a><return matched ports>
    <a href="#NWD1fH8dg-6" name="NW1fH8dg-1qTuoA-1-u2"></a><get template file>
    matches = dedupe(matcher(i) for i in data.split(/\n/) when matcher(i))
    promises = (get_title(i) for i in matches)
    deferred.when deferred.all(deferred.all(promises), getTemplate()), (data) ->
        [data, template] = data
        console.log("Content-type: text/html\r\n\r\n")
        handler = haml(template)
        console.log handler({data: data})
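The shape of the value that comes out of the nested all() is easier to see in isolation. This is a minimal sketch in plain JavaScript with native Promises, using made-up titles and a placeholder template string. One difference to keep in mind: the chunk above passes its two promises to deferred.all() as separate arguments, whereas the native Promise.all takes a single array.

```javascript
// Made-up results standing in for get_title() and getTemplate().
const titlePromises = [
  Promise.resolve([3000, 'Blog draft']),
  Promise.resolve([8080, 'Admin panel']),
];
const templatePromise = Promise.resolve('%ul.ports'); // placeholder template text

// Outer all: [result of the inner all, template].
const combined = Promise.all([Promise.all(titlePromises), templatePromise]);

combined.then(([rows, template]) => {
  // rows is the inner array, one [port, title] pair per port;
  // template is the second slot of the outer array.
  for (const [port, title] of rows) console.log(port + ': ' + title);
  console.log('template: ' + template);
});
```

The destructuring in the callback does the same job as the [data, template] = data line in display_ports.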
The matcher is just a regular expression check:
<a href="#NWD1fH8dg-5" name="NW1fH8dg-42ae3g-1"></a><dfn><return matched ports>=</dfn> (<a href="#NWD1fH8dg-4"><-U</a>)
matcher = (i) ->
    r = (/^.{20}.*?\:(\d+)/).exec(i)
    if not r
        return null
    r = parseInt(r[1])
    if (r >= 3000 and r < 3099) or (r >= 8000 and r < 8300) or (r == 80) or (r == 81)
        return r
    null
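To see what that regular expression actually grabs, here is a minimal sketch in plain JavaScript that runs the same pattern and range check over a few lines. The sample lines are made up in the usual Linux netstat column layout, and the function names are mine. The leading .{20} presumably exists to skip the protocol and queue columns, so that the first colon followed by digits is the one in the local address.

```javascript
// The same pattern as matcher: skip 20 characters, then capture the digits
// after the first colon that is followed by a number.
const PORT_PATTERN = /^.{20}.*?:(\d+)/;

function localPort(line) {
  const m = PORT_PATTERN.exec(line);
  return m ? parseInt(m[1], 10) : null;
}

// The range check, copied from matcher.
function isProjectPort(r) {
  return (r >= 3000 && r < 3099) || (r >= 8000 && r < 8300) || r === 80 || r === 81;
}

// Made-up sample lines, not real output.
const header = 'Proto Recv-Q Send-Q Local Address           Foreign Address         State';
const ipv4 = 'tcp        0      0 127.0.0.1:3000          0.0.0.0:*               LISTEN';
const ipv6 = 'tcp6       0      0 :::8080                 :::*                    LISTEN';
const ssh = 'tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN';

for (const line of [header, ipv4, ipv6, ssh]) {
  const port = localPort(line);
  console.log(port, port !== null && isProjectPort(port));
}
```

The header line has no colon at all and yields null, the two project ports pass the range check, and port 22 is parsed but filtered out.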
And the template get is equally trivial. dReadFile is an asynchronous read function from node-promise that returns a promise, the resolution of which is the contents of the file.
<a href="#NWD1fH8dg-6" name="NW1fH8dg-3ke9wX-1"></a><dfn><get template file>=</dfn> (<a href="#NWD1fH8dg-4"><-U</a>)
getTemplate = () ->
    dReadFile('layout.haml', 'utf8')
The whole of the program becomes:
<a href="#NWD1fH8dg-7" name="NW1fH8dg-3tKBm6-1"></a><dfn><counter.cgi>=</dfn>
#!/usr/bin/coffee
deferred = require('promise')
dReadFile = require('fs-promise').readFile
spawn = require('child_process').spawn
scraper = require('scraper')
haml = require('haml')
<a href="#NWD1fH8dg-1" name="NW1fH8dg-3tKBm6-1-u1"></a><get ports>
<a href="#NWD1fH8dg-3" name="NW1fH8dg-3tKBm6-1-u2"></a><get titles>
<a href="#NWD1fH8dg-2" name="NW1fH8dg-3tKBm6-1-u3"></a><de-duplicate an array>
<a href="#NWD1fH8dg-4" name="NW1fH8dg-3tKBm6-1-u4"></a><display ports>
deferred.when get_ports(), display_ports
And that is pretty much it. The last line launches the script, and guarantees the process runs in the right order.
Cakefile
<a href="#NWD1fH8dg-8" name="NW1fH8dg-2LCiO7-1"></a><dfn><Cakefile>=</dfn>
exec = require('child_process').exec
task 'build', 'Build the main program out of Noweb', ->
    exec 'notangle -c -Rcounter.cgi counter.nw > counter.cgi', (err) ->
        console.log err if err
xelatex_cmd = ('xelatex counter.tex; ' +
    'while grep -s "Rerun to get cross-references right" counter.log; ' +
    'do xelatex counter.tex;\n done')
task 'docs', 'Build the PDF of this document', ->
    exec 'noweave -x -delay counter.nw > counter.tex', (err, stdout) ->
        if err
            console.log err
            return
        exec xelatex_cmd, (err) ->
            console.log err if err
task 'html', 'Build the HTML version of this document', ->
    exec 'noweave -filter l2h -delay -index -autodefs c -html counter.nw > counter_doc.html', (err) ->
        if err
            console.log err
Index
* _<Cakefile>_: D1
* _<counter.cgi>_: D1
* _<de-duplicate an array>_: D1, U2
* _<display ports>_: D1, U2
* _<get ports>_: D1, U2
* _<get template file>_: U1, D2
* _<get titles>_: D1, U2
* _<return matched ports>_: U1, D2
Warning
Okay, so this is a fairly simple program. It also requires a ton of stuff to be installed in the directory the CGI runs from, so it exposes a lot of stuff you might not want to expose. Like I said, this was an experiment.
Source Code
The source code is available from GitHub at PortProject. Yeah, it's a boring name. Also, at this moment there is a bug in jsdom (Fix ReferenceError in the scanForImportRules helper function) that causes this script to spew warnings about CSS parsing. Those can safely be ignored (really!).