wesley tanaka

Unintuitive Python Behavior


Curious about the Google App Engine, I started playing with it, which meant writing code in Python.

Unicode support in Python seems to be a tangle of confusing function names, source code file encodings, backslash escapes, the letter "u" before strings, and probably other things. Together, this smells to me like the language originally didn't support Unicode, but wanted to keep backward compatibility with pre-unicode-support scripts.

But other than that, the fact that I haven't spent the time to learn Python didn't seem to be too much of an impediment to using it . . . until now.

I've run into my first surprising, unintuitive behavior:

>>> class SomeClass():
...    list = []
...
>>> a = SomeClass()
>>> b = SomeClass()
>>> print a
<__main__.SomeClass instance at 0xb7ea18ec>
>>> print b
<__main__.SomeClass instance at 0xb7ea18cc>
>>>
>>> a.list.append(1)
>>> print a.list
[1]
>>> print b.list
[1]
>>>

Putting the assignment in the class definition like that causes both a.list and b.list to be aliased to the same object. This can be checked with id(a.list) and id(b.list):

>>> print id(a.list)
3085424716
>>> print id(b.list)
3085424716
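Another way to probe this, beyond comparing id() values, is to look at where the attribute is actually stored. The sketch below is only an illustration: it assumes a new-style class (SomeClass(object)) and uses vars() to inspect the attribute dictionaries of the class and of an instance. The print calls use parentheses with a single argument so the snippet should run the same under Python 2 and Python 3.

```python
class SomeClass(object):
    # Assigned in the class body: stored once, on the class itself.
    list = []


a = SomeClass()
b = SomeClass()

# Neither instance has its own "list" entry, so attribute lookup
# falls back to the class and both names reach the same object.
print('list' in vars(a))           # False
print('list' in vars(SomeClass))   # True
print(a.list is SomeClass.list)    # True
print(a.list is b.list)            # True

# Rebinding through an instance creates a separate attribute on that
# instance only; the class attribute (and b's view of it) is untouched.
a.list = [42]
print('list' in vars(a))           # True
print(b.list)                      # []
```

If this reading is right, a.list.append(1) mutates the one list stored on the class, while a.list = [...] creates a new attribute on a alone. That would explain why appending shows up in b.list but plain assignment does not.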

I spent an hour looking through Python documentation for an explanation of what's going on with these top-level class assignment statements (it's how Google App Engine suggests creating fields for Model subclasses), but I didn't find the explanation I was looking for.

I did a few experiments. It looks like the way to create a mutable instance field that isn't shared with other instances is to put the assignment in the __init__ method (and not at the top level of the class definition) as follows:

class SomeClass():
    def __init__(self):
        # Runs once per instance, so each instance gets its own new list.
        self.list = []

I figure I'll proceed by:

  1. Assuming I'm correct,
  2. writing a few unit tests and
  3. remaining otherwise blissfully ignorant
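The unit tests in step 2 could look something like the sketch below. It uses the standard unittest module. The test class name, the test method names, and the run_tests helper are made up for illustration, and SomeClass is repeated so the snippet is self-contained.

```python
import unittest


class SomeClass(object):
    def __init__(self):
        # A fresh list is created for every instance.
        self.list = []


class SomeClassTest(unittest.TestCase):
    def test_append_does_not_leak_between_instances(self):
        a = SomeClass()
        b = SomeClass()
        a.list.append(1)
        self.assertEqual(a.list, [1])
        self.assertEqual(b.list, [])

    def test_lists_are_distinct_objects(self):
        a = SomeClass()
        b = SomeClass()
        self.assertTrue(a.list is not b.list)


def run_tests():
    # Hypothetical helper: load and run only the tests defined above.
    suite = unittest.TestLoader().loadTestsFromTestCase(SomeClassTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == '__main__':
    run_tests()
```

Moving the assignment back to the top level of the class body should make the first test fail, which is a quick way to confirm the tests are checking the behavior in question.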
