Optimising the ipaddress module from Python 3.3

February 27th, 2014 by exhuma.twn

As of Python 3.3, the “ipaddress” module has been integrated into the stdlib. Personally, I find it a bit premature, as the library code does not look very PEP8 compliant. Still, it fills a huge gap in the stdlib.

Over the last few days, I needed to find a way to collapse consecutive IP networks into supernets whenever possible. Turns out, there’s a function for that: ipaddress.collapse_addresses. Unfortunately, I was unable to use it as-is, because I don’t have a collection of networks, but rather object instances which have “network” as a member variable. Once the networks are extracted and collapsed, there is no way to correlate the results back to the original instances.
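To make the problem concrete, here is a minimal sketch of the stdlib function at work. The Route type and the sample networks are hypothetical stand-ins for “objects with a network member”, not code from the actual project:

```python
import ipaddress
from collections import namedtuple

# Hypothetical stand-in for "an object with a network member".
Route = namedtuple("Route", "name network")

routes = [
    Route("a", ipaddress.ip_network("192.0.2.0/25")),
    Route("b", ipaddress.ip_network("192.0.2.128/25")),
    Route("c", ipaddress.ip_network("198.51.100.0/24")),
]

# collapse_addresses() takes an iterable of network objects and returns an
# iterator over the collapsed networks.
collapsed = list(ipaddress.collapse_addresses(r.network for r in routes))
print(collapsed)
# [IPv4Network('192.0.2.0/24'), IPv4Network('198.51.100.0/24')]
```

The result only contains freshly built network objects. Nothing in it says that 192.0.2.0/24 came from routes “a” and “b”.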

So I decided to dive into the stdlib source code and get some “inspiration” to accomplish this task. To me personally, the code was fairly difficult to follow: about 60 lines, made up of two functions, one of which calls the other recursively.

I thought I could do better, and preliminary tests are promising. The new version is no longer recursive (it’s shift-reduce-ish, if you will) and about 30 lines shorter. The original code does some type checking, which I might add later on; that would increase the line count a bit and might even hurt performance. I’m still confident.
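The actual implementation is not shown in this post. As an illustration only, a shift-reduce style collapse could look roughly like the sketch below. The function name collapse_with_owners, the Route type and the “owners” bookkeeping are assumptions made for this example; like the version described above, it does no type checking and expects all networks to share one IP version:

```python
import ipaddress
from collections import namedtuple
from operator import attrgetter


def collapse_with_owners(items, key=attrgetter("network")):
    """Collapse the networks of *items* and remember which items fed each result.

    Returns a list of (network, [items]) pairs in ascending address order.
    """
    stack = []  # disjoint (network, owners) pairs, ascending by address
    for item in sorted(items, key=key):
        net = key(item)
        if stack:
            top, owners = stack[-1]
            # Sorting guarantees that net starts at or after top, so comparing
            # the last addresses is enough to detect containment.
            if net.broadcast_address <= top.broadcast_address:
                owners.append(item)
                continue
        stack.append((net, [item]))  # "shift"
        # "reduce": merge the two topmost entries for as long as they are the
        # two halves of the same supernet.
        while len(stack) >= 2:
            (left, left_owners), (right, right_owners) = stack[-2], stack[-1]
            if (left.prefixlen != right.prefixlen
                    or left.supernet() != right.supernet()):
                break
            del stack[-2:]
            stack.append((left.supernet(), left_owners + right_owners))
    return stack


# Hypothetical stand-in for "an object with a network member".
Route = namedtuple("Route", "name network")

routes = [
    Route("a", ipaddress.ip_network("2001:db8::/33")),
    Route("b", ipaddress.ip_network("2001:db8:8000::/33")),
    Route("c", ipaddress.ip_network("2001:db8:1::/48")),
    Route("d", ipaddress.ip_network("2001:db9::/32")),
]

for network, owners in collapse_with_owners(routes):
    print(network, [route.name for route in owners])
```

Because every stack entry carries the list of objects that were folded into it, the link between a collapsed supernet and the original instances is never lost.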

A run with 300k IPv6 networks took 93 seconds with the new algorithm, with a peak memory usage of about 490MB. The old stdlib code took 230 seconds to finish, with a peak memory usage of about 550MB. All in all, good results.

Note that in both cases the 300k networks had to be loaded into memory, so they account for a considerable share of those figures, but that share is the same in both runs.
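The benchmark script itself is not part of this post. A run like this could be set up along the following lines; the helper names, the prefix-length range and the fixed seed are all assumptions made for the sketch, and the numbers it produces are not comparable to the ones quoted here:

```python
import ipaddress
import random
import time


def random_ipv6_networks(count, seed=0):
    """Generate *count* random IPv6 networks (hypothetical test data)."""
    rng = random.Random(seed)
    networks = []
    for _ in range(count):
        prefixlen = rng.randint(16, 64)
        shift = 128 - prefixlen
        # Clear the host bits so the value is a valid network address.
        address = (rng.getrandbits(128) >> shift) << shift
        networks.append(ipaddress.IPv6Network(
            "{}/{}".format(ipaddress.IPv6Address(address), prefixlen)))
    return networks


def time_collapse(collapse, networks):
    """Return (elapsed seconds, number of collapsed networks)."""
    start = time.perf_counter()
    result = list(collapse(networks))
    return time.perf_counter() - start, len(result)


if __name__ == "__main__":
    nets = random_ipv6_networks(1000)
    elapsed, remaining = time_collapse(ipaddress.collapse_addresses, nets)
    print("stdlib:", elapsed, "seconds,", remaining, "networks left")
```

Timing inside the process only covers the collapse itself. Peak memory is easier to read from the outside, for example with a verbose run of the “time” command.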

I still have an idea in mind to improve the memory usage. I’ll give that a try.

Here are a few stats:

With the new algorithm:

collapsing 300000 IPv6 networks 1 times
generating 300000 addresses...
... done
new:  92.98410562699428
        Command being timed: "./env/bin/python mantest.py 300000"
        User time (seconds): 92.79
        System time (seconds): 0.28
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:33.07
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 491496
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 123911
        Voluntary context switches: 1
        Involuntary context switches: 154
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

and with the old algorithm:

collapsing 300000 IPv6 networks 1 times
generating 300000 addresses...
... done
old:  229.66894743399462
        Command being timed: "./env/bin/python mantest.py 300000"
        User time (seconds): 229.35
        System time (seconds): 0.38
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 3:49.76
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 549592
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 144970
        Voluntary context switches: 1
        Involuntary context switches: 1218
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

I’ll add more details as I go… I’m too “into it” and keep losing track of time and forgetting to post fun stuff online… stay tuned.

Posted in Python | No Comments »

Colourising Python logging for console output.

December 27th, 2013 by exhuma.twn

I’ve seen my fair share of code fragments colourising console output, especially when using logging. Sometimes the colour codes are embedded directly into the format string, which makes it really hairy to deal with different colours for different levels. Sometimes even the log message itself is wrapped in a colour string, along the lines of LOG.info("{YELLOW}Message{NORMAL}") or something equally atrocious.

Most logging frameworks support this use-case with “Formatters”. Use them! Here’s a quick example of how to do it “the right way™”:

Disclaimer: For whatever reason, this gist is borking the foobar.lu theme. I’m guessing it’s the UTF-8 char in the docstring? So maybe a web-server misconfig? So I’ll have to link it the “old way”! Go figure…

Clicky clicky → https://gist.github.com/exhuma/8147910
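Without reproducing the gist, the general idea can be sketched as follows. This is a minimal example written for illustration, not the linked code: the ColourFormatter name, the colour choices and the format string are assumptions, and it presumes a terminal that understands ANSI escape codes:

```python
import logging

RESET = "\033[0m"
# Hypothetical colour scheme: one ANSI escape sequence per log level.
LEVEL_COLOURS = {
    logging.DEBUG: "\033[36m",       # cyan
    logging.INFO: "\033[32m",        # green
    logging.WARNING: "\033[33m",     # yellow
    logging.ERROR: "\033[31m",       # red
    logging.CRITICAL: "\033[1;31m",  # bold red
}


class ColourFormatter(logging.Formatter):
    """Wrap the fully formatted line in the colour of the record's level."""

    def format(self, record):
        message = super().format(record)
        colour = LEVEL_COLOURS.get(record.levelno)
        if colour is None:
            # Custom levels without a colour are left untouched.
            return message
        return colour + message + RESET


handler = logging.StreamHandler()
handler.setFormatter(ColourFormatter("%(levelname)-8s %(name)s: %(message)s"))

LOG = logging.getLogger("demo")
LOG.addHandler(handler)
LOG.setLevel(logging.DEBUG)

LOG.info("Message")          # no colour codes in the message itself
LOG.warning("Other message")
```

The colours live in exactly one place, and the calling code logs plain messages. Swapping the handler’s formatter is enough to turn the colours off again, for example when logging to a file.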

Posted in Python | No Comments »

Introduction to google-closure with plovr

September 1st, 2013 by exhuma.twn

I’m about to embark on a quest to understand the development of custom google-closure components (UI widgets if you will). Reading through the relevant section in “Closure – The Definitive Guide” makes me believe it’s not all that difficult. But there are still a bunch of concepts which I need to familiarize myself with. This article briefly outlines my aim for this “learning trail”, and starts off with a tiny HelloWorld project using plovr. This article assumes a minimal knowledge of google closure (you should know what “provides” and “requires” are, and “exportSymbol” should not surprise you either). Read the rest of this entry »

Posted in JavaScript | No Comments »

Automagic __repr__ for SQLAlchemy entities with primary key columns with Declarative Base.

July 5th, 2013 by exhuma.twn