Monday, October 29, 2007

Update! New god conditions added to source repository!

Click the link to go to the god source repository - as of today, two of the three conditions I submitted for inclusion have been added (the third being the mysql_failed condition, which may end up in some kind of auxiliary gem, as previously mentioned...). So... look for these new conditions in god v0.6.0!

complex.rb

disk_usage.rb

Friday, October 26, 2007

New god Conditions

If you haven't checked out god yet as an alternative to monit or other system/software-monitoring tools, do yourself a favour and head over to that link for a while and then come back... it's an awesome little monitoring tool written in Ruby with all kinds of cool features, including event-based conditions that activate as soon as a process dies instead of waiting for a periodic check, etc...

I won't repeat what's on their site - suffice it to say that it's a pretty nifty little piece of software, and highly useful. As such, I'm already using it in numerous places, and have written a few custom conditions to extend what's packaged with it.

At the time of writing, these have been submitted to the maintainers for future inclusion (current version is 0.5.0, so look for them in 0.6.0 hopefully!), but I've posted the files for my wonderful readers so they can start using them right away ;)

All three of these have been tested with god v0.5.0; mysql_failed has additionally been tested with mysql.rb v1.24, and disk_usage with df v5.3.0 (GNU coreutils) on Linux 2.6.21.3.

mysql_failed (if at all) might end up in an auxiliary god gem, since it's application-specific and has external dependencies. Likewise, I'm going to try to rewrite disk_usage (this was a quick one that I just wanted to get working) so that it's not dependent on external programs (df) but figures it out some other way, although I suspect this may still have to depend somehow on the environment (ie: /proc or something).
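For the curious, one df-free approach would be something like the following - a minimal sketch assuming the third-party sys-filesystem gem (which wraps statvfs(2)); this is just an illustration of the idea, not what the posted condition actually does:

require 'sys/filesystem'   # gem install sys-filesystem

# Percent of the filesystem mounted at mount_point that's in use,
# taken from statvfs(2) rather than parsed out of `df` output.
def disk_usage_percent(mount_point)
  stat = Sys::Filesystem.stat(mount_point)
  used = stat.blocks - stat.blocks_free
  (used * 100.0 / stat.blocks).round
end

puts disk_usage_percent('/usr')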

A few usage notes:

1) These files should be put in <god-gem-install-root>/lib/god/conditions.
2) You must edit <god-gem-install-root>/lib/god.rb to require them (at the top of the file) in order to be able to use them - see the example below.
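For example, assuming the files were dropped into the directory from step 1, the requires would look something like this:

# At the top of <god-gem-install-root>/lib/god.rb:
require 'god/conditions/complex'
require 'god/conditions/disk_usage'
require 'god/conditions/mysql_failed'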

complex.rb:

This condition lets you combine other conditions into compound conditions... ie: this AND (that OR these). Usage is fairly straightforward; it works more or less like any other condition (see the example below). Complex conditions can be nested ad infinitum (to emulate parentheses in 'real' compound logic statements), and the 'this()' method can be omitted if it makes your code DRYer...


on.condition(:complex) do |c|
  c.this(:memory_usage) do |c2|
    ...
  end

  c.or(:complex) do |c2|
    files = %w(file1.txt file2.txt file3.conf file4.bak)

    files.each do |filename|
      c2.or(:some_kind_of_file_based_condition) do |c3|
        c3.filename = filename
      end
    end
  end
end


mysql_failed.rb:

This condition is intended to test all aspects of a mysql dependency that your app may have (ie: connection and privileges... others to be added in the future if necessary/requested - perhaps a mysql version check?). It's fairly simple to use; the only things requiring examples are the default config setup, which lets you skip specifying any/all config info when setting up subsequent instances of this condition, and the way of specifying privileges for the connection...


on.condition(:mysql_failed) do |c|
  c.setUser('username')
  c.setPass('password')
  c.setHost('mysql.domain.com')
  c.setDB('database_name')
  c.setPrivs = {
    'select' => %w(table_one table_two),
    'delete' => ['table_three', 'table_four']
  }
end

on.condition(:mysql_failed) do |c|
  c.setHost('mysql2.domain.com')
end


Any info not explicitly set in later instances of the condition is inherited from the previously configured instance. So in the example above, the second mysql_failed condition will inherit the first one's username, password, database name and privilege set to test for.

disk_usage.rb:

This one is simple enough to let the example do the talking...


on.condition(:disk_usage) do |c|
  c.limit = 90           # trigger once the partition is at least this % full
  c.mount_point = '/usr'
end

Thursday, May 3, 2007

IPTables Firewall Map

Filter vs. Nat? (Chicken vs. Egg, movie at 11...)

A few years ago I had to set up a couple of relatively complex firewalls (under Linux), and in the process managed to find some documentation on the order in which a packet traverses each table and its rules.

Sounds pretty basic; however, nothing in the documentation or man pages for iptables itself explains how the tables relate to one another. For example, a packet arriving at a machine but destined for another machine will go through the FORWARD chain of both the filter and mangle tables - but which table's FORWARD chain will be examined first? A fairly significant question, since each chain can and likely will have vastly different rules which may affect the packet, and the order those rules are applied in will usually affect the outcome.

Perl to the rescue!

And so, I created iptables-map.

Like most of us who have worn a sysadmin hat at one point or another, I tend to reach for Perl as my Swiss Army Knife of choice. Something else may have been faster/simpler/more elegant/whatever, but at the time I was doing a lot of work in Perl, so it was a natural first choice... plus it's always awesome for anything involving heavy string manipulation and/or regular expressions, and when I started I assumed there would be more of that involved, although it ended up pretty basic...

A Sample of the Sweetness

SENT Packets
mangle::OUTPUT >>> ACCEPT
nat::OUTPUT >>> ACCEPT
filter::OUTPUT >>> DROP
-A OUTPUT -o lo >>> ACCEPT
-A OUTPUT -s 10.10.10.10 -d 11.11.11.11 -o eth0 -p tcp -m tcp --dport 22 >>> ACCEPT
-A OUTPUT -s 12.12.12.12 -d 13.13.13.13 -o eth0 -p tcp -m tcp --dport 22 >>> ACCEPT
-A OUTPUT -o eth0 -p tcp -m tcp --dport 22 >>> DROP
-A OUTPUT -s 14.14.14.14 -o eth0 >>> ACCEPT
mangle::POSTROUTING >>> ACCEPT
nat::POSTROUTING >>> ACCEPT

This is a snippet of an imaginary firewall that restricts outgoing ssh connections. We're looking at the portion of the iptables-map output that displays the firewall path for packets originating from the local host ("SENT Packets"). The order in which the tables are listed (along with their default targets) is the order in which a packet traverses them: the OUTPUT chain of the mangle table first, then of the nat table, then of the filter table, then the POSTROUTING chain of the mangle table, and lastly the POSTROUTING chain of the nat table.

Pretty obvious in this output, but not at all obvious or intuitive when all you have to work with are man pages and packaged iptables documentation.

The rest of the output shows the specifics for each rule in each chain, listing the rules in the order in which they're traversed/examined.

The Laundry List

In resurrecting this script for SourceForge, I've noticed that there's a new table ('raw') which is now part of a default iptables install, and ip6tables is also ready for prime time. I'll be adding support for both of these in the very near future. As an aside, if you'd like to request a version of this script in a different language, ie: as a bash or awk script or perhaps even C source for a binary, please email me and let me know!

IPTables-Map

Thursday, April 19, 2007

Gmail and GreaseMonkey

So I finally got around to playing with GreaseMonkey... and I regret not doing it sooner - there's so much I could have done already!

If you're not familiar, GreaseMonkey is a Firefox extension that lets you add your own custom JavaScript code to websites. And it's dead simple - based on a handful of specially formatted comments (the ==UserScript== metadata block) and a file-naming convention (scripts end in .user.js), your JavaScript becomes custom code you can use to modify any site you want.

I used it to write a script for Gmail - when I switched over recently from mutt and pine (I know, to all you text-purists out there: I'm sorry, but I couldn't help it...) I lost the ability to use my nifty cron-based signature auto-rotator. I had written this dead-simple script a while ago to rotate my signature files every 5 minutes, so essentially every time I sent an email I'd have a different/random sig. Not too useful for business communication, but great for my personal box.
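I never posted the original, but the whole idea boils down to something like this (a sketch in Ruby, with made-up paths - not the original script):

#!/usr/bin/env ruby
# Make a randomly chosen signature the active one. Run from cron, ie:
#   */5 * * * * /home/me/bin/rotate_sig.rb
require 'fileutils'

sigs = Dir[File.expand_path('~/.signatures/*.sig')]
unless sigs.empty?
  FileUtils.cp(sigs[rand(sigs.size)], File.expand_path('~/.signature'))
end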

Anyways, Gmail obviously doesn't do this by default, but I managed to get it working with a little custom JavaScript via GreaseMonkey in VERY little time... even considering the nasty size (depth) of Gmail's HTML tree (navigating it with the DOM Inspector actually took more time than writing the script).

I'll post a link/the source once I've polished it - at the moment I just have a couple of signatures hard-coded for it to choose from, but I'll add a basic interface for managing signatures as well as post it to userscripts.org and/or SourceForge shortly...

Command-Line Highlighter

I was grepping through some logs the other day at home and figured, "wouldn't it be nice if I could pipe this through something that would highlight lines matching a regex, instead of just having grep pull those lines out?" Wouldn't you know it, such a tool doesn't exist as far as I can tell - which is very weird, since I've already found mine VERY useful...

grep *will* give you context lines if you ask for them explicitly (ie: give me 2 lines before and 4 after the matching line), but I wanted to see ALL the output, with the matches highlighted for easy spotting.

Anyways, I wrote a little perl script to do the work - it just inserts 'standard' shell color-escape sequences before and after the matching word/line to highlight it (bright green on black by default; if you're a Matrix fan/l33t hax0r who uses a green-on-black setup anyways, there's an option that lets you change the highlighting colors).
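To give you a feel for it, the core idea fits in a few lines - here's a rough sketch (in Ruby rather than Perl, and not the actual HighLite source):

#!/usr/bin/env ruby
# Usage: highlight PATTERN < some.log
# Echoes every line of stdin, wrapping regex matches in ANSI color
# escapes (bold green on black) so they stand out in the terminal.
pattern   = Regexp.new(ARGV[0] || abort('usage: highlight PATTERN'))
highlight = "\e[1;32;40m"   # bold; green foreground; black background
reset     = "\e[0m"

STDIN.each_line do |line|
  puts line.gsub(pattern) { |match| highlight + match + reset }
end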

It's fairly basic at the moment, but I plan on porting it to C as soon as I have time (possibly tonight) now that it's up on SourceForge, and modifying it a bit so that the options/usage syntax matches grep wherever possible/appropriate.

HighLite

Wednesday, March 28, 2007

XMMS disk_writer plugin patch

This is a (very) small patch I wrote a while ago for the disk_writer plugin of xmms (an open-source X11 media player modelled after WinAmp). The post in the link is pretty self-explanatory - it's taken from the xmms development list, where I sent it... basically, it modifies how the plugin builds the filename to write to, so that you get an 'auto-rotate' effect instead of overwriting the previous file recorded from the same source.
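The patch itself is a handful of lines of C in the plugin's filename-building code; the logic amounts to roughly this (illustrative Ruby, not the actual patch):

# Find the first numbered variant of the output name that doesn't
# already exist, so each recording gets a fresh file instead of
# clobbering the previous one.
def rotated_filename(base, ext = '.wav')
  return base + ext unless File.exist?(base + ext)
  n = 1
  n += 1 while File.exist?("#{base}-#{n}#{ext}")
  "#{base}-#{n}#{ext}"
end

rotated_filename('stream')   # => "stream.wav", then "stream-1.wav", ...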

Unfortunately, xmms was already lacking active development at the time, and as far as I can tell it's no longer an active project. Nevertheless, the patch is there if you're interested/want to take a look, and it does work (I use it myself)...

Sunday, March 25, 2007

RFC-Compliant URI Validation

Recently, as part of another project, I needed some code to validate a URI string against RFC 2396. The goal was the ability to ensure that a URI is RFC-compliant. As such, I decided to use a set of regular expressions modelled directly on the ABNF definitions in the RFC. ABNF is by its nature a very close match for regular expressions in terms of usage, syntax and purpose, so using them seemed like a logical way to build the URI validation code.

I started by creating expressions for the simplest (and first) definitions in the RFC. 'lowalpha' is defined by the ABNF as one of the characters a-z inclusive, while 'upalpha' is defined as A-Z inclusive. 'alpha' is defined as either a 'lowalpha' or an 'upalpha' character. 'digit' is defined as one of the characters 0-9 inclusive. Lastly, 'alphanum' is defined as either an 'alpha' or a 'digit' character. Based on these five definitions, I could create five matching regular expressions, each indicating whether an arbitrary string matches its definition or not.

<?php

define('LOWALPHA', '[a-z]');
define('UPALPHA', '[A-Z]');

define('ALPHA', '(?:'.LOWALPHA.'|'.UPALPHA.')');
/// (?:[a-z]|[A-Z])

define('ALPHA_OPT', '[a-zA-Z]');

define('DIGIT', '[0-9]');

define('ALPHANUM', '(?:'.ALPHA.'|'.DIGIT.')');
/// (?:(?:[a-z]|[A-Z])|[0-9])

define('ALPHANUM_OPT', '[a-zA-Z0-9]');

?>
The defined expressions ending in _OPT are optimized versions of the corresponding regular expression - ie: it's much more efficient to execute a single character range like [a-zA-Z] than two alternated ranges like [a-z]|[A-Z].

Within the final implementation, expressions have been optimized where possible, but for the most part they mirror the ABNF in the document more or less directly. Almost all of the optimization occurs at the lowest level, ie: in the simplest base expressions from which the more complicated expressions are constructed. This works well because low-level expressions are interpolated into many higher-level ones, so an optimization there is multiplied everywhere the expression appears.
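To make the layering concrete, the next definitions up in RFC 2396 ('hex' and 'escaped') compose the same way - shown here as Ruby regexp strings rather than PHP defines, purely for illustration:

# hex     = digit | "A" | "B" | ... | "F" | "a" | "b" | ... | "f"
# escaped = "%" hex hex
DIGIT   = '[0-9]'
HEX     = '[0-9A-Fa-f]'      # already the optimized single-range form
ESCAPED = "%#{HEX}{2}"       # "%" followed by exactly two hex characters

puts 'ok' if '%7E' =~ /\A#{ESCAPED}\z/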

UriValidator

Thursday, March 22, 2007

A New Project

So today I finally decided to start writing a proper blog, after toying with the idea on and off for a little while. Among other things, I want all of the random code I write to be freely available and accessible somewhere online under the GPL, and it doesn't always fit nicely into existing projects/categories/paradigms/etc. I'll still send and/or upload whatever I can where an appropriate project exists, but this gives me a nice way to keep everything referenced from one place, regardless of what it is. Yes, I know - I could just upload everything into its own project on SourceForge or something, but this also lets me explain everything and talk about stuff in an informal way. It also provides space for me to talk about programming and related topics in general, and gives me somewhere to post random thoughts and opinions on the world of software development. Lastly, I believe the whole idea of open source shouldn't be limited to the actual source itself - the whole process of creating software should be an organic, open, and cooperative activity in most cases, and so I'll be using this to document the life of my various projects, from start to 'release'.