biais.org

Tuesday 27 February 2007

Profiling the profiler

A small snippet to measure profiling time:

from profileit import profileit
import operator
import time
timed = 0
 
def timemeth(func):
    def _inner(self, *args, **kw):
        start = time.time()
        res = func(self, *args, **kw)
        global timed
        timed = time.time() - start
        return res
    return _inner
 
@timemeth
@profileit(0)
def mip0(iterable):
    reduce(operator.add, iterable, 0)
 
@timemeth
@profileit(0)
def mip1(iterable):
    reduce(lambda x, y: x + y, iterable, 0)
 
mip0(xrange(100000))
print "timed %.3f seconds" % timed
 
mip1(xrange(100000))
print "timed %.3f seconds" % timed

Notes:

  • profileit is described here.
  • putting @timemeth above @profileit(0) measures the time consumed by the whole profiled call, profiler overhead included.
  • swap @timemeth and @profileit(0) and you profile the timemeth wrapper instead.
  • read my previous post to see why I used python2.5 and not python2.4.
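
The effect of decorator order can be checked with a small sketch. It is written in Python 3 syntax, and the tag decorator and its labels are hypothetical stand-ins for timemeth and profileit, not the real decorators:

```python
import functools

calls = []


def tag(name):
    """Hypothetical decorator that records when its wrapper is entered and left."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            calls.append("enter " + name)
            try:
                return func(*args, **kwargs)
            finally:
                calls.append("leave " + name)
        return wrapper
    return decorator


@tag("timer")      # outermost: wraps everything below it
@tag("profiler")   # innermost: wraps only the function itself
def work():
    calls.append("work")


work()
print(calls)
```

The decorator listed first ends up outermost, so its wrapper is entered first and left last. Applied to the script above, that is why timed includes the profiler's own overhead.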

Outputs:

$ python2.5 reducecomp.py|grep seconds
         1 function calls in 0.097 CPU seconds
timed 0.109 seconds
         100001 function calls in 0.406 CPU seconds
timed 8.883 seconds

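Unsurprisingly, the profiler becomes expensive when many Python functions are called: with the lambda version, 100001 calls are recorded and the run takes 8.883 seconds, against 0.109 seconds for the operator.add version.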
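
A similar comparison can be sketched without the profileit decorator. This version is an assumption-laden rewrite, not the script above: it uses Python 3 syntax and the standard cProfile module, and the function names are made up for the example:

```python
import cProfile
import operator
import time
from functools import reduce  # reduce is no longer a builtin in Python 3


def sum_with_operator(iterable):
    return reduce(operator.add, iterable, 0)


def sum_with_lambda(iterable):
    return reduce(lambda x, y: x + y, iterable, 0)


def wall_time(func, iterable, profiled):
    """Return (result, elapsed seconds), optionally running under cProfile."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    if profiled:
        result = profiler.runcall(func, iterable)
    else:
        result = func(iterable)
    return result, time.perf_counter() - start


for func in (sum_with_operator, sum_with_lambda):
    _, plain = wall_time(func, range(100000), profiled=False)
    _, traced = wall_time(func, range(100000), profiled=True)
    print("%s: %.3f s plain, %.3f s profiled" % (func.__name__, plain, traced))
```

The lambda version is the one expected to show the larger gap, since each addition is a Python-level call for the profiler to record. Exact figures depend on the machine and the interpreter.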

Monday 26 February 2007

Profiling results with python2.4 and python2.5

I was playing with a python function that I wanted to profile. I used python2.5 and python2.4 to run the following code:

from profileit import profileit
import operator
import time
timed = 0
 
def timemeth(func):
    def _inner(self, *args, **kw):
        start = time.time()
        res = func(self, *args, **kw)
        global timed
        timed = time.time() - start
        return res
    return _inner
 
@profileit(0)
@timemeth
def mip0(iterable):
    reduce(operator.add, iterable, 0)
 
mip0(xrange(100000))
print "timed %.3f seconds" % timed

Outputs with python2.5:

$ python2.5 reducecomp.py|grep seconds
           2 function calls in 1.310 CPU seconds
timed 1.310 seconds      

Everything seems OK, and now the output with python2.4:

$ python2.4 reducecomp.py|grep seconds
         2 function calls in 0.000 CPU seconds
timed 1.336 seconds

0.000 seconds? That's really fast... Is something wrong with the profiler?

Saturday 24 February 2007

Distellamap: visualize goto label

Very geeky project: Distellamap

Like any other game console, Atari 2600 cartridges contained executable code also commingled with data. This lists the code as columns of assembly language. Most of it is math or conditional statements (if x is true, go to y), so each time there's "go to" a curve is drawn from that point to its destination. When a byte of data (as opposed to code) is found in the cartridge, it is shown as an orange row: a solid block for a "1" or a dot for a "0". The row is eight elements long, representing a whole byte. This usually means that the images can be seen in their entirety when a series of bytes are shown as rows. The images were often stored upside-down as a programming method.

Tuesday 20 February 2007

Playing with Python metaclasses (bunch __metaclass__)

Warning: this is a snippet I wrote to play with metaclasses; the same feature can be written in other ways (like this bunch recipe).

class BunchMeta(type):
    def __new__(cls, name, bases, new_attrs):
        def __create(*args):
            new_attrs["names"] = {}
            for n, i in enumerate(args):
                new_attrs["names"][i] = n
            return type(name, bases, new_attrs)
        return __create
 
class Bunch(list):
    __metaclass__ = BunchMeta
    def __init__(self, *args):
        list.__init__(self, args)
 
    def get(self, m):
        return self[self.names[m]]
 
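    def __str__(self):
        # dicts are unordered: sort the names by position before pairing them
        ordered = sorted(self.names, key=self.names.get)
        return '(' + ', '.join(["%s:%s" % (str(name), str(d)) for name, d in
                       zip(ordered, self)]) + ')'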

Note:

  • I used a dict for the names and a list for the bunch because I wanted to use the bunch like a dict but also like an ordered list (see *slaves[3] in the usage below).
  • I did not find out how to set names as a class variable with the type builtin.

Usage:

HostBunch = Bunch("ip", "port")
slaves = [HostBunch("192.168.10.%d" % i, "8754") for i in range(100, 211)]
for i in slaves[:3]:
    print i
# print the fourth slave "ip"
print slaves[3].get("ip")
# print the fourth slave "port"
print slaves[3].get("port")
 
def strize(*args):
    return "%s:%s" % args
 
print strize(*slaves[3])

Outputs:

(ip:192.168.10.100, port:8754)
(ip:192.168.10.101, port:8754)
(ip:192.168.10.102, port:8754)
192.168.10.103
8754
192.168.10.103:8754
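
As one of the other ways to write it, here is a minimal sketch that builds the class with a plain factory function and the type builtin, without a metaclass. It uses Python 3 syntax, and make_bunch and its helper names are hypothetical:

```python
def make_bunch(*names):
    """Return a list subclass whose items can also be read by name."""
    # Map each field name to its position in the list.
    index = dict((name, pos) for pos, name in enumerate(names))

    def init(self, *args):
        list.__init__(self, args)

    def get(self, name):
        return self[index[name]]

    def to_str(self):
        pairs = ["%s:%s" % (name, value) for name, value in zip(names, self)]
        return "(" + ", ".join(pairs) + ")"

    # The attribute dict passed to type() becomes the class namespace,
    # so "names" ends up as a class variable.
    attrs = {"names": names, "__init__": init, "get": get, "__str__": to_str}
    return type("Bunch", (list,), attrs)


HostBunch = make_bunch("ip", "port")
host = HostBunch("192.168.10.103", "8754")
print(host)
print(host.get("port"))
print("%s:%s" % tuple(host))
```

Keeping the names in a tuple preserves their order for printing, while the index dict keeps lookups by name cheap.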

Friday 16 February 2007

Textorizer

An online tool that vectorizes a picture using user-defined words: textorizer.

Thursday 15 February 2007

First CAPTCHA resistance test results

In this blog post, I set out to test spammer crawlers. I checked the test mailboxes today:

  • The first spam arrived on February 8, 2007 at 11:11 in the "recetansis" mailbox, 13 days after the test began.
  • The second arrived 2 minutes later in the "ceresistan" mailbox.

Today's results, about 20 days after the beginning of the test:

  • 7 spams in the "recetansis" mailbox.
  • 9 spams in the "ceresistan" mailbox: the same 7 as in the "recetansis" mailbox + 2 originals ;).
  • 0 spam in the other mailboxes.

Note: this blog receives about 250 unique visitors per day.

Tuesday 13 February 2007

Color Code: Full-color portrait of the English language

Color Code:

33,000 words, grouped by meaning. Each word is given the average color of web images found when searching for that term. You can see clusters of words related to plants, flesh, food, and water.

Wednesday 7 February 2007

Python itertools recipe: tuplewise

The itertools module helps you transform iterators. The following snippet is a simple recipe that creates an iterator generating tuples of n consecutive items from an input iterable. It is a modification of the pairwise recipe: tuplewise(iterable, 2) is equivalent to pairwise(iterable).

from itertools import tee, izip
 
def tuplewise(iterable, n):
    """s -> (s0,s1,...,sn-1), (s1,s2,...,sn), (s2,s3,...,sn+1), ...
    >>> print list(tuplewise([1, 2, 3, 4, 5], 3))
    [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
    >>> print list(tuplewise(xrange(4), 2))
    [(0, 1), (1, 2), (2, 3)]
    >>> print list(tuplewise([1], 2))
    []
    >>> print list(tuplewise([1, 2], 2))
    [(1, 2)]
    >>> print list(tuplewise([1, 2], 0))
    []
    >>> print list(tuplewise([], -1))
    Traceback (most recent call last):
       ...
    ValueError: n must be >= 0
    """
    tees = tee(iterable, n)
    try:
        for i, cur in enumerate(tees):
            for j in xrange(i):
                cur.next()
    except StopIteration:
        pass
    return izip(*tees)

  • tee(iterable[, n=2]): returns n independent iterators from a single iterable.
  • izip is the iterator equivalent of the built-in zip.

Note: the function docstring is doctestable. Add the following code to the file that contains the tuplewise function and run it.

def _test():
    import doctest
    doctest.testmod()
 
if __name__ == '__main__':
    _test()
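
For readers on Python 3, where izip and xrange no longer exist, the same idea can be sketched with the built-in zip and itertools.islice. This is a variant under those assumptions, not the recipe above:

```python
from itertools import islice, tee


def tuplewise(iterable, n):
    """Yield tuples of n consecutive items: (s0..sn-1), (s1..sn), ..."""
    tees = tee(iterable, n)
    # Skip `offset` items on the copy at position `offset`, so that the
    # copies are staggered by one item each.
    shifted = [islice(it, offset, None) for offset, it in enumerate(tees)]
    # zip stops as soon as the shortest (most advanced) copy is exhausted.
    return zip(*shifted)


print(list(tuplewise([1, 2, 3, 4, 5], 3)))
print(list(tuplewise(range(4), 2)))
```

islice replaces the explicit next() loop, and a negative n still raises the ValueError coming from tee.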

Monday 5 February 2007

Python interactive in your program

I'm working on a program that takes a very long time to load, so I can't test it with the usual run/test/(fail|stop) cycle. Fortunately, I found the interactive interpreter objects in the Python standard library.

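import readline
import rlcompleter
from code import interact

def start_interactive(banner):
    # Foo stands for the slow-loading object of your program
    foo = Foo()
    foo.very_long_starting_function()
    namespace = locals()
    # complete names from this namespace, not only from __main__
    readline.set_completer(rlcompleter.Completer(namespace).complete)
    readline.parse_and_bind("tab: complete")
    interact(banner=banner, local=namespace)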

This starts an interactive interpreter with completion (thanks to the readline and rlcompleter modules).
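
One detail worth checking is which namespace the completer looks at: by default rlcompleter completes names from __main__, so objects handed over through local= may need an explicit completer. A minimal sketch in Python 3 syntax, assuming a hypothetical Foo with moo and goo attributes and a made-up make_console helper:

```python
import code
import rlcompleter


class Foo(object):
    """Hypothetical stand-in for the slow-loading object."""
    def __init__(self):
        self.moo = 1
        self.goo = 2


def make_console(namespace):
    """Build a console and a completer that share the same namespace."""
    completer = rlcompleter.Completer(namespace)
    try:
        import readline  # not available on every platform
        readline.set_completer(completer.complete)
        readline.parse_and_bind("tab: complete")
    except ImportError:
        pass
    return code.InteractiveConsole(namespace), completer


console, completer = make_console({"foo": Foo()})
# Ask the completer directly, as readline would on <TAB>.
print(completer.complete("foo.mo", 0))
# console.interact(banner="foo is ready") would open the prompt here.
```

Calling the completer by hand is only a way to check the wiring without opening a prompt; in the real program the call to interact takes its place.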

>>> print foo.# (<TAB> pressed)
foo.__class__   foo.__doc__  foo.__module__  
foo.__init__  foo.moo   foo.goo
>>>

If the code you are working on is not the Foo class itself, you can keep using your usual editor (instead of coding in the interactive interpreter) and then call the reload() function to reload the module you modified. With the file bar.py:

def bar():
    return "bar bar"

>>> import bar
>>> bar.bar()
'bar bar'

Modify the bar function (replace return "bar bar" with return "foo foo"), then reload the bar module:

>>> reload(bar)
<module 'bar' from '/home/max/work/blogcode/bar.py'>
>>> bar.bar()
'foo foo'