| Author: | Dave Kuhlman |
|---|---|
| Address: | dkuhlman@rexx.com http://www.rexx.com/~dkuhlman |
| Revision: | 1.2b |
| Date: | April 28, 2008 |
| Copyright: | Copyright (c) 2006 Dave Kuhlman. All Rights Reserved. This software is subject to the provisions of the MIT License http://www.opensource.org/licenses/mit-license.php. |
Abstract
Various notes on Python, Jython, training, etc.
Yes, you can debug into a Python extension module using gdb on Linux. Below are several files (run_gdb, test1.gdb, and test2.gdb) that should help.
Here is a bit of explanation about the attached files:
Needless to say, you will need to modify these files for your own use. Here are the sample files:
run_gdb:
#!/bin/sh -v
gdb -d /w2/XML/Libxml/libxsltmod-1.4a -x test2.gdb python `ps -C python -o pid=`
test1.gdb:
# Set command line arguments.
set args --param ital 'no' --param border 2 camping_par.xsl camping_list.xml
#set args --param ital 'no' --param border 2 --output tmp1.html camping_par.xsl camping_list.xml
# Set breakpoints.
b main
b xsltProcess
test2.gdb:
b translate_to_file
b translate_to_string
If I recall correctly, you could separately:
in two distinct steps. The run_gdb sample script combines those two steps.
For more information try:
Here are a few notes that might help with understanding Python's object model:
Consequences and examples:
The following code creates one object (a list) and then copies the reference to that object to another variable:
In [1]: a = [11, 22, 33]
In [2]: b = a
In [3]: print a
[11, 22, 33]
In [4]: print b
[11, 22, 33]
In [5]: a[1] = 44
In [6]: print a
[11, 44, 33]
In [7]: print b
[11, 44, 33]
Notes:
Passing a mutable object into a function enables the function to make changes to the object which will be reflected in the calling environment. Example:
In [9]: def t(x, y):
   ...:     x.append(y)
   ...:
   ...:
In [10]: a = range(4)
In [11]: print a
[0, 1, 2, 3]
In [12]: t(a, 44)
In [13]: print a
[0, 1, 2, 3, 44]
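By contrast, rebinding the parameter name inside a function does not change what the caller's variable refers to. Here is a small sketch of the difference. The function names mutate and rebind are made up for illustration, and the code is written so that it also runs under Python 3 (hence list(range(4)) and print() as a function):

```python
def mutate(x, y):
    # Changes the object that both the caller's variable and x refer to.
    x.append(y)

def rebind(x, y):
    # Binds the local name x to a new list; the caller's object is untouched.
    x = x + [y]
    return x

a = list(range(4))
mutate(a, 44)
print(a)          # [0, 1, 2, 3, 44]
b = rebind(a, 55)
print(a)          # [0, 1, 2, 3, 44]
print(b)          # [0, 1, 2, 3, 44, 55]
```

Only the call to mutate is visible through a; rebind produces a separate object.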
[Note: read the Iterators and generators updated entry before this one.]
I received a comment that I was confusing classes that implement iterators with classes that implement iterables. That may seem like a small, legalistic distinction, but Kent's comment also explained that this distinction had important consequences.
So, let's look at some of the differences between iterators and iterables.
The __iter__ method:
Resetting the iterator:
Multiple iterators from the same object:
An iterator object cannot produce multiple iterators. The __iter__ method always returns the object itself. And, once that object exhausts its sequence, it is required by the iterator protocol to stay exhausted.
An iterable object can produce multiple iterator (generator) objects, each of which can produce a sequence. In fact, an iterable object can produce iterators that produce different sequences. Consider:
Dictionary objects provide different methods that return iterators over different sequences. iterkeys(), itervalues(), and iteritems() return iterators over the keys, the values, and the (key, value) tuples of the dictionary, respectively.
We can imagine a tree structure that creates iterators that return a sequence of all nodes in the tree that satisfy a predicate. Here is an example:
class Node(object):
    o
    o
    o
    def recursive_walk_tree_with_predicate(self, predicate):
        if predicate(self):
            yield self
        for child in self.children:
            child_iter = child.recursive_walk_tree_with_predicate(predicate)
            for child1 in child_iter:
                yield child1

def predicate1(node):
    val = node.get_value()
    val = ord(val[-1])
    if val % 2 == 0:
        return True
    else:
        return False

def test(tree):
    for node in tree.recursive_walk_tree_with_predicate(predicate1):
        f.write('value: "%s"\n' % (node.value, ))
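The fragment above omits most of the Node class. The following is a self-contained sketch of the same idea, not the original class: the constructor, the shortened method name walk_with_predicate, and the predicates is_even and is_odd are assumptions made for illustration. It shows one iterable (the tree) handing out iterators that produce different sequences, depending on the predicate. It is written so that it also runs under Python 3.

```python
class Node(object):
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []

    def get_value(self):
        return self.value

    def walk_with_predicate(self, predicate):
        # Yield this node if it satisfies the predicate, then recurse.
        if predicate(self):
            yield self
        for child in self.children:
            for node in child.walk_with_predicate(predicate):
                yield node

def is_even(node):
    # True when the last character of the value has an even code point.
    return ord(node.get_value()[-1]) % 2 == 0

def is_odd(node):
    return not is_even(node)

tree = Node('a', [Node('b', [Node('d')]), Node('c')])
evens = [n.value for n in tree.walk_with_predicate(is_even)]
odds = [n.value for n in tree.walk_with_predicate(is_odd)]
print(evens)   # ['b', 'd']
print(odds)    # ['a', 'c']
```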
Consequences:
An iterator object can be iterated over only once. Another way of saying this is that an iterator produces only one sequence. An iterable can produce multiple iterators and even iterators that produce different sequences.
An iterable object (but not an iterator object) can produce multiple iterators whose use can be interleaved. These iterators are independent in the sense that calling one does not alter the sequence produced by the other. Here is an example:
In [1]: class A(object):
   ...:     def __init__(self, collection):
   ...:         self.collection = collection
   ...:     def __iter__(self):
   ...:         return iter(self.collection)
   ...:
   ...:
In [2]: a = A(range(5))
In [3]: b = iter(a)
In [4]: c = iter(a)
In [5]: b.next()
Out[5]: 0
In [6]: b.next()
Out[6]: 1
In [7]: b.next()
Out[7]: 2
In [8]: c.next()
Out[8]: 0
In [9]: c.next()
Out[9]: 1
In [10]: b.next()
Out[10]: 3
To say that the multiple iterators produced by an iterable are independent is not to say that an object could not be implemented which produces iterators that do communicate. However, that will often be a bug, or would serve some special purpose.
The above are not "hard and fast" rules. You are certainly free to implement iterators and iterables any way that Python enables you to. However, understanding the distinctions between iterator objects and iterable objects explained above should help you to implement the iterator or iterable that you do want, and not something that you don't.
Here are a few pitfalls and corner cases that you will want to be aware of:
An iterator (using the definition above) does not refresh or reset itself. Each time we ask an iterator object for its iterator/generator it returns itself, which may be (partially) exhausted. That might be what you want, but might not. Here is an example:
In [24]: a = range(5)
In [25]: iter1 = iter(a)
In [26]: iter1.next()
Out[26]: 0
In [27]: iter1.next()
Out[27]: 1
In [28]: for x in iter1:
   ....:     print x
   ....:
2
3
4
An iterable, in contrast with an iterator, might have behaved differently. In particular, depending on the implementation, an iterable might have begun from the beginning of the sequence when used in the for statement.
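A short sketch of that contrast, using a list as the iterable (written so that it also runs under Python 3; the variable names are only for illustration):

```python
a = list(range(5))

# A list is an iterable: each loop asks it for a fresh iterator.
first = [x for x in a]
second = [x for x in a]

# An iterator returns itself from __iter__, so a second pass finds it exhausted.
it = iter(a)
third = [x for x in it]
fourth = [x for x in it]

print(first, second)   # [0, 1, 2, 3, 4] [0, 1, 2, 3, 4]
print(third, fourth)   # [0, 1, 2, 3, 4] []
```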
[Note: read the Iterators and generators entry before this one.]
I posted the Iterators and generators entry below on the Python tutor email list, and I also asked about how to write a non-recursive tree walk function. The solutions below are a slight reworking of the answer I received.
In both versions below, the significant feature is the use of a stack to keep the list of nodes that remain to be processed.
This version has a next() method, as required by the iterator protocol.
class Node(object):
    def __init__(self, value='<no value>', children=None):
        self.value = chr(value + 97) * 3
        if children is None:
            # Always set the attribute, even when no children are given.
            self.children = []
        else:
            self.children = children

    def walk_tree(self):
        # stack to hold nodes as we walk through
        stack = []
        stack.append(self)
        while stack:
            node = stack.pop()
            # reverse children to get the right order.
            stack.extend(reversed(node.children))
            yield node

    def __iter__(self):
        self.iterator = self.walk_tree()
        return self

    def next(self):
        return self.iterator.next()

TREE = Node(0, [Node(1, [Node(2, [Node(3, []),
                                  Node(4, []),
                                  Node(5, [Node(6, []),
                                           Node(7, []),
                                           Node(8, []),
                                           Node(9, []),
                                           ]),
                                  Node(10, [Node(11, []),
                                            Node(12, []),
                                            ]),
                                  ]),
                         Node(13, [Node(14, []),
                                   Node(15, []),
                                   Node(16, [])]),
                         Node(17, []),
                         Node(18, []),
                         ]),
                Node(19, []),
                Node(20, []),
                ])

def test():
    for node in TREE:
        print 'value: %s' % (node.value, )

test()
This version does not have a next() method, but the iterators returned by __iter__() do have a next().
import sys

class Node(object):
    def __init__(self, value='<no value>', children=None):
        self.value = chr(value + 97) * 3
        if children is None:
            # Always set the attribute, even when no children are given.
            self.children = []
        else:
            self.children = children

    def show(self, indent):
        self.showindent(indent)
        print 'value: %s' % (self.value, )
        for child in self.children:
            child.show(indent+1)

    def showindent(self, indent):
        sys.stdout.write(' ' * (indent * 4))

    def walk_tree(self):
        # stack to hold nodes as we walk through
        stack = []
        stack.append(self)
        while stack:
            node = stack.pop()
            # reverse children to get the right order.
            stack.extend(reversed(node.children))
            yield node

    def __iter__(self):
        return self.walk_tree()

TREE = Node(0, [Node(1, [Node(2, [Node(3, []),
                                  Node(4, []),
                                  Node(5, [Node(6, []),
                                           Node(7, []),
                                           Node(8, []),
                                           Node(9, []),
                                           ]),
                                  Node(10, [Node(11, []),
                                            Node(12, []),
                                            ]),
                                  ]),
                         Node(13, [Node(14, []),
                                   Node(15, []),
                                   Node(16, [])]),
                         Node(17, []),
                         Node(18, []),
                         ]),
                Node(19, []),
                Node(20, []),
                ])

def test():
    for node in TREE:
        print 'value: %s' % (node.value, )

test()
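One practical difference between the two versions shows up when two iterations over the same tree are interleaved. The following compact sketch illustrates it. The class names BaseNode, SelfIterNode, and FreshIterNode and the helper interleave are hypothetical stand-ins for the two versions above, with integer values instead of strings. The code defines __next__ (with next as an alias) so that it runs under Python 3 as well.

```python
class BaseNode(object):
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []

    def walk_tree(self):
        # Non-recursive, depth-first walk using an explicit stack.
        stack = [self]
        while stack:
            node = stack.pop()
            stack.extend(reversed(node.children))
            yield node

class SelfIterNode(BaseNode):
    # Like the first version: the node acts as its own iterator.
    def __iter__(self):
        self.iterator = self.walk_tree()
        return self

    def __next__(self):
        return next(self.iterator)

    next = __next__  # Python 2 spelling of the same method

class FreshIterNode(BaseNode):
    # Like the second version: each call returns a new generator.
    def __iter__(self):
        return self.walk_tree()

def interleave(tree):
    # Advance one iterator twice, start a second one, then resume the first.
    it1 = iter(tree)
    first = [next(it1).value, next(it1).value]
    it2 = iter(tree)
    second = next(it2).value
    third = next(it1).value
    return first, second, third

shared = SelfIterNode(0, [SelfIterNode(1), SelfIterNode(2)])
fresh = FreshIterNode(0, [FreshIterNode(1), FreshIterNode(2)])
print(interleave(shared))   # ([0, 1], 0, 1)
print(interleave(fresh))    # ([0, 1], 0, 2)
```

In the first case, asking for a second iterator restarts the walk that the first one was using, so the first iteration loses its place. In the second case the two walks are independent.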
Producers and consumers -- The separation of the producer of a stream or sequence of items from the consumers of that stream of items enables us to (1) reuse the producers in different contexts and applications, (2) maintain the producer implementation independently from the consumers, and (3) implement producers (iterators in Python talk) that can be used in different contexts (for example, in a for statement or passed to different parts of our program and used there).
Sequences -- An iterator produces a sequence of items. The iterator produces the elements of the sequence one by one. We can use the items in that stream (1) in a for statement, (2) in a list comprehension or a generator expression, or (3) in locations throughout our program (by calling the next() method on the iterator).
Concrete sequences and abstract sequences -- We can think of some sequences as being concrete, meaning that all the items in the sequence exist, at one time, in a data structure. Python lists, tuples, and strings would be concrete sequences in this sense. In contrast, we can think of an abstract sequence as one that we can define, or at least can create an implementation that produces it. However, the sequence might not exist until we produce it and, even then, not all of the sequence may exist at any given time. Our implementation, when called, will produce one item at a time as and when requested. An abstract sequence (and the items in it) can be defined by connotation (rather than denotation) or by a calculation.
Consuming an iterator at a single location -- the for statement.
Consuming an iterator at multiple locations -- An iterator object can be passed around in our program, can be passed into a function, can be returned from a function, and can be saved in a data structure. In other words an iterator is a first class object.
How to use an iterator object -- An iterator object obeys the iterator protocol. Specifically, it implements a next() method and raises the StopIteration exception when it is exhausted (there are no more items to be produced). So, here is an example of how we might get the next object from an iterator:
def use_one(aniter):
    try:
        obj = aniter.next()
        print obj
    except StopIteration, e:
        print "That's all"

def test():
    myiter = iter(range(3))
    use_one(myiter)
    use_one(myiter)
    use_one(myiter)
    use_one(myiter)

test()
Which, when run, produces the following output:
0
1
2
That's all
Iterators are "first class objects" -- (1) We can pass an iterator to a function. (2) We can return an iterator from a function. And (3), we can stuff an iterator into a data structure for use later. Therefore, the consumer of a sequence is not restricted to using all the items in the sequence in a single location.
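To make the earlier distinction between concrete and abstract sequences more tangible, here is a small sketch. The generator name squares is made up for illustration, and the code is written so that it also runs under Python 3. The list holds all of its items at once, while the generator computes each item only when it is requested:

```python
import itertools

def squares():
    # An abstract sequence: items are computed on request, never stored.
    n = 0
    while True:
        yield n * n
        n += 1

concrete = [n * n for n in range(5)]    # all items exist at once
abstract = squares()                    # nothing has been computed yet
first_five = list(itertools.islice(abstract, 5))
sixth = next(abstract)                  # the sequence continues where it left off

print(concrete)     # [0, 1, 4, 9, 16]
print(first_five)   # [0, 1, 4, 9, 16]
print(sixth)        # 25
```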
An iterator is a producer.
An iterator produces a sequence, or perhaps more correctly, it produces the elements of a sequence one by one.
An iterator obeys (implements) the iterator protocol. See below.
An iterator can be used (1) in a for statement, (2) in a list comprehension or a generator expression, (3) by calling the next() method implemented by the iterator.
Iterator objects themselves are required to support the following two methods, which together form the iterator protocol:
You will (almost?) always want a constructor (the __init__ method). I'm not sure whether the protocol requires it, but it's usually needed.
For more on the iterator protocol, see 3.5 Iterator Types in the "Python Library Reference".
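As a minimal sketch of a class that obeys the protocol, consider the following. Countdown is a hypothetical example class, not something from the standard library. Python 2 names the method next(), while Python 3 names it __next__(); the sketch defines both so that it runs under either.

```python
class Countdown(object):
    """Iterator that produces n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself.
        return self

    def __next__(self):
        # Return the next item, or signal exhaustion.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

    next = __next__  # Python 2 name for the same method

countdown = Countdown(3)
print(list(countdown))   # [3, 2, 1]
print(list(countdown))   # [] because an exhausted iterator stays exhausted
```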
Use the yield statement -- If a function definition contains a yield statement, then the value returned by the function is a generator/iterator, i.e. an object that obeys the iterator protocol.
Example:
>>> def test():
...     yield 'hi'
...     yield 'bye'
...
>>> testiter = test()
>>> testiter
<generator object at 0xb7d6b7cc>
>>> testiter.next()
'hi'
>>> testiter.next()
'bye'
>>> testiter.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
>>> dir(testiter)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__', 'close', 'gi_frame', 'gi_running', 'next', 'send', 'throw']
Notes:
Here is a more complicated example:
import sys
import getopt

def get_reply():
    while True:
        reply = raw_input("Entry ('quit' to quit): ")
        if reply == 'quit':
            # Returning from a generator ends the iteration.
            return
        yield reply

def get_bundle(bundle_size):
    bundle = []
    count = 0
    for x in get_reply():
        count += 1
        bundle.append(x)
        if count >= bundle_size:
            yield bundle
            bundle = []
            count = 0
    # Produce the final, partially filled bundle (if there is one).
    if bundle:
        yield bundle

def test():
    getter = get_bundle(3)
    for bundle in getter:
        print 'bundle:', bundle
        print

test()
When you run it, it might look like this:
$ python test_iterator.py
Entry ('quit' to quit): aaaa
Entry ('quit' to quit): bbbb
Entry ('quit' to quit): cccc
bundle: ['aaaa', 'bbbb', 'cccc']
Entry ('quit' to quit): dddd
Entry ('quit' to quit): eeee
Entry ('quit' to quit): quit
bundle: ['dddd', 'eeee']
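Because the example above reads from the keyboard, it is awkward to test. A variation, sketched here under the assumption that the items can come from any iterable, separates the bundling logic from the source of the items. The function name bundles is made up for illustration, and the code is written so that it also runs under Python 3.

```python
def bundles(items, bundle_size):
    # Group items from any iterable into lists of at most bundle_size.
    bundle = []
    for item in items:
        bundle.append(item)
        if len(bundle) >= bundle_size:
            yield bundle
            bundle = []
    if bundle:
        # Produce the final, partially filled bundle.
        yield bundle

result = list(bundles(['aaaa', 'bbbb', 'cccc', 'dddd', 'eeee'], 3))
print(result)   # [['aaaa', 'bbbb', 'cccc'], ['dddd', 'eeee']]
```

The producer of the items and the consumer that groups them are now independent, so the same generator can be fed from a list, a file, or another generator.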
Implement an iterator object (class) -- Two alternative methods:
Use yield -- In this technique, you will (1) implement a generator method in your class using the yield statement and (2) return the value returned by the generator method (an iterator) in your __iter__ method. Here is an example:
class Tree(object):
    ...
    def walk_tree(self, level):
        yield (level, self, )
        for child in self.get_children():
            iterchildren = child.walk_tree(level+1)
            for level1, tree1 in iterchildren:
                yield level1, tree1

    def __iter__(self):
        return self.walk_tree(self.initlevel)
Notes:
Implement an explicit next() method -- In this technique, you will (1) implement a next method in your class and (2) return the object itself (i.e. self) in your __iter__ method. An example:
class Doubler(object):
    def __init__(self, collection):
        self.collection = collection
        self.index = 0

    def next(self):
        if self.index < len(self.collection):
            value = self.collection[self.index]
            self.index += 1
            return value * 2
        else:
            raise StopIteration

    def __iter__(self):
        self.index = 0
        return self

def test():
    collection = range(5)
    doubler = Doubler(collection)
    for value in doubler:
        print 'value: %d' % (value, )
    print '-' * 30
    for value in doubler:
        print 'value: %d' % (value, )

test()
Notes:
A few additional notes and suggestions on implementations:
Suppose we have an iterator myiter, that produces a sequence of words. Here are some ways that we can use that iterator:
In [1]: words = 'the baby bison romped in the meadow'.split()
In [2]: words
Out[2]: ['the', 'baby', 'bison', 'romped', 'in', 'the', 'meadow']
In [3]: myiter = iter(words)
Use it in a for statement:
>>> words = 'the baby bison romped in the meadow'.split()
>>> myiter = iter(words)
>>> for item in myiter:
        print item.capitalize()

The
Baby
Bison
Romped
In
The
Meadow
Use it in a list comprehension:
>>> words = 'the baby bison romped in the meadow'.split()
>>> myiter = iter(words)
>>> doublewords = [item * 2 for item in myiter]
>>> doublewords
['thethe', 'babybaby', 'bisonbison', 'rompedromped', 'inin', 'thethe', 'meadowmeadow']
Use it in a generator expression:
>>> words = 'the baby bison romped in the meadow'.split()
>>> myiter = iter(words)
>>> doublewords = (item * 2 for item in myiter)
>>> doublewords.next()
'thethe'
>>> doublewords.next()
'babybaby'
>>> doublewords.next()
'bisonbison'
>>> doublewords.next()
'rompedromped'
>>> doublewords.next()
'inin'
>>> doublewords.next()
'thethe'
>>> doublewords.next()
'meadowmeadow'
>>> doublewords.next()
Traceback (most recent call last):
File "<pyshell#25>", line 1, in <module>
doublewords.next()
StopIteration
Pass it to a function:
>>> words = 'the baby bison romped in the meadow'.split()
>>> myiter = iter(words)
>>> def consumer(worditer):
        for item in worditer:
            print item.upper()

>>> consumer(myiter)
THE
BABY
BISON
ROMPED
IN
THE
MEADOW
Stuff it into a data structure:
>>> words = 'the baby bison romped in the meadow'.split()
>>> myiter = iter(words)
>>> def consumer(struct):
        print 'before:', struct[0]
        for item in struct[1]:
            print item.upper()
        print 'after:', struct[2]

>>> struct = ('start', myiter, 'finish')
>>> consumer(struct)
before: start
THE
BABY
BISON
ROMPED
IN
THE
MEADOW
after: finish
See PEP 255 -- Simple Generators.
Lib/test/test_generators.py in the Python source distribution contains many examples.
There is more information, some of which is elementary and some advanced here: Functional Programming HOWTO.
The "What's new" section for Python 2.2 also discusses iterators. See: 3 PEP 234: Iterators
I've recently started using workingenv.py to try to protect myself from this problem. (See sys.path, PYTHONPATH, site.py, etc.) Here are a few thoughts about this.
The problem defined -- When you use easy_install to install a package such as Pylons, for example, a number of different Python packages will be installed on your machine. And, easy_install forces the path to the packages that it installs up higher in sys.path than other packages installed on your system. If you have a specially customized version of one of these packages, your Python will no longer find that version. I had this difficulty with Docutils: my copy contains an extra output writer, the ODF/ODT writer for Docutils, which I wrote to produce files that can be loaded into OpenOffice Writer and which is not in the standard Docutils distribution. After installing Pylons, Python finds the version of Docutils installed by easy_install instead of the version in which the ODF/ODT writer is installed. If this is an issue for you, then you may have to use a sitecustomize.py module or other technique in order to use your version of that module from Python.
easy_install, in a sense, hijacks your Python environment. It does this by installing a .pth file that gives the packages which easy_install installs priority over other installed Python packages. What this means is (1) that after using easy_install to install Pylons (or any other package) some of your Python applications may not operate in the same way that they formerly did. And, (2) if, after using easy_install to install Pylons, you install another Python package that was also installed by easy_install, your Python will find the version of that package installed by easy_install rather than the version you just installed.
So, what can you do? One suggestion is to use workingenv, which you can learn about at the workingenv Web site.
You might use workingenv.py as follows:
Your application will be running in your new Python environment. And, Python applications that you start up in a session where you have not used bin/activate in your new environment, will be running under your old original environment and will use the packages that you've installed there.
Once Pylons has been installed in this new environment, you will only need to use steps 2 and 4 in order to start Pylons.
Here is an example showing how you might use workingenv.py to create a Python environment and then install and use Pylons within that environment:
$ python ../Workingenv/workingenv.py PylonsEnv
$ source PylonsEnv/bin/activate
$ easy_install Pylons
$ cd Test/Sudoku01/
$ paster serve --reload development.ini
And on subsequent days, after your environment has been created and Pylons has been installed in that environment, you might use the following to start your Pylons application (for development and testing):
$ source PylonsEnv/bin/activate
$ cd Test/Sudoku01/
$ paster serve --reload development.ini
First, let's understand what we are trying to accomplish. We are trying to figure out where Python looks for modules to be imported and how to tell Python where to look.
When Python processes an import statement, it searches directories listed in sys.path and it searches in those directories in the order that they occur in sys.path. This implies the following:
You can find out what is in sys.path using something like the following:
>>> import sys
>>> for path in sys.path:
...     print path
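When two directories in sys.path each contain a module with the same name, it can help to ask Python which file it would actually load. The following sketch assumes Python 3, where importlib.util.find_spec is available; the helper name where_is is made up for illustration:

```python
import importlib.util
import sys

def where_is(module_name):
    # Report the file Python would load for module_name, or None if not found.
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin

# Directories are searched in this order; the first match wins.
for position, path in enumerate(sys.path):
    print(position, path)

print(where_is('json'))
```

Comparing the reported location with the positions printed for sys.path shows which entry supplied the module.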
So, if sys.path is the key to determining where Python looks for modules to be imported and the key to controlling where Python looks, how do items (directories) get into sys.path? The next section explains this.
Notes on the standard process can be found in your python2.x/site.py.
A qualification -- I believe, but am not sure, that if you have used easy_install on your system, then it might be that lib/python2.x/site.py, lib/python2.x/site-packages/site.py have been replaced.
An important thing to note is that, while .pth files are processed last, easy_install is able to patch the easy-install.pth file so that it shoves its paths to the top of sys.path.
If you have used easy_install on your system and you did not specify the -m or --multi-version option, then easy_install creates and adds directories to an easy-install.pth file. But, it is even a bit more complex than that because easy_install also puts a bit of code in easy-install.pth that forces paths in which it has installed a package to the top of sys.path. Furthermore, this apparently happens after the PYTHONPATH environment variable is processed. So, you cannot
And, easy_install also appears to add its own site.py, and tricks Python into using the easy_install version of site.py instead of the standard one. See these notes:
So perhaps a message to somebody would help, but I don't know who:
I've noticed some strange behavior on my machine since installing pylons. I believe that it is done by easy_install, but I'm not sure.
There is also a site.py in lib/python2.5/site-packages (in addition to the one in lib/python2.5). Where did that come from? Did easy_install put it there? There are very few comments in it, as though someone does not want me to know who put it there.
Does anyone have a recommended method of overriding paths inserted into sys.path by easy_install? Basically, I want to force lib/python2.5/site-packages to the front/top. I've been using a sitecustomize.py file in my current directory, but that has the feel of a kludge to it.
Thanks for help.
Dave
Next step: Study the site.py replacement from easy_install.
Some options:
A variable is a name bound to a value in a namespace.
A namespace is a dictionary in which Python can look up a name (possible) to obtain its value. Note that the use of dictionaries to implement namespaces is an implementation specific to Python. Other programming languages implement namespaces differently.
Names refer to objects. Objects can be integers, tuples, lists, dictionaries, strings, functions, classes (themselves), instances of classes, and other Python built-in types. And, don't forget, references to objects can also be held in data structures (lists, dictionaries, etc).
Determining which namespace a name is in is static. It can be determined by a lexical scan of the code. If a variable is assigned a value anywhere in a scope (specifically within a function or method body), then that variable is local to that scope. If Python does not find a variable in the local scope, then it looks next in the global scope (also sometimes called the module scope) and then in the built-ins scope. But, the global statement can be used to force Python to find and use a global variable (a variable defined at top level in a module) rather than create a local one.
Determining whether a name is bound (has a value) in a namespace is dynamic. You must follow the logic of the code in order to determine (1) when a variable has been bound to a value and (2) what value the variable has been bound to at any given point in the execution of the program. A variable is given a value at a certain time during the execution of the code in a scope. For example, in the following function, the variable count is not bound to a value until the end of the first iteration of the loop:
def test_dynamic():
    for idx in range(5):
        if idx > 1:
            x = count
            print 'x:', x
        count = idx
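What happens when a local name is read before it has been bound can be seen in a smaller sketch. The function name read_before_bind is made up for illustration, and the code is written so that it also runs under Python 3. Python decides statically that count is local, because it is assigned in the function, but raises UnboundLocalError at run time if the name is read too early:

```python
def read_before_bind():
    # count is local (it is assigned below), but it is not yet bound here.
    try:
        x = count
    except UnboundLocalError:
        x = 'count is not bound yet'
    count = 1
    return x, count

print(read_before_bind())   # ('count is not bound yet', 1)
```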
In Python, since the use of objects and references to them are so pervasive and consistent, we sometimes conflate a variable and the object it refers to. So, for example, if we have the following code:
total = 25
items = [11, 22, 33]

def func1():
    pass

class Class1:
    pass
we sometimes say:
But, if we were more careful, we might say:
Or, even:
Trying to keep this as simple as possible ... might have to ignore a few corner cases ...
Use these rules to determine the scope of a name/variable:
In order to force Python to use a global variable when it is assigned a value in a function, use the global statement.
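A brief sketch of the difference the global statement makes. The names counter, increment_local, and increment_global are made up for illustration, and the code is written so that it also runs under Python 3:

```python
counter = 0

def increment_local():
    # Assignment makes counter local to this function; the global is untouched.
    counter = 100
    return counter

def increment_global():
    # The global statement makes the assignment rebind the module-level name.
    global counter
    counter = counter + 1
    return counter

increment_local()
print(counter)   # 0
increment_global()
print(counter)   # 1
```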
A few additional notes:
Modules and functions (and methods) create scopes.
Python does not treat statement blocks as separate scopes. For example, the body or block in an if/else or for statement does not create a separate scope.
A class does not create a separate scope (although the methods in it do). For example an instance of the following class:
ClassGlobal1 = 'Class global data'

class TestScope:
    ClassGlobal1 = 'Class local data'
    def show(self):
        print '(TestScope.show) ClassGlobal1: %s' % ClassGlobal1
references the global (module) variable ClassGlobal1, not the one defined in the class, and will print out:
(TestScope.show) ClassGlobal1: Class global data
In order to access the class level variable, I must qualify it with the class, for example, as follows:
TestScope.ClassGlobal1
Although it may seem obvious, if from moduleA I import moduleB, then call function1 in moduleB, it is moduleB's module scope that is seen by function1, not moduleA's. In other words, it is lexical scope that matters, not dynamic scope. This is true by design, and so that I can understand moduleB and the code in it by reading that code and without knowing the code in moduleA or any other module from which function1 might be called.
All the following Python statements bind a value to a name (a variable) in a namespace:
Also, function or method parameters bind an actual parameter value from a call of the function to the formal parameter name in the local (function/method) scope. Or, if the actual value is omitted in the call and there is a default value, the default value is bound to the formal parameter name in the local scope. Additional notes: (1) The binding of actual values to formal parameters happens each time the function is called. (2) Default values, if the function definition has any, are evaluated only once, which is why you usually do not want to use mutable objects as default values. Specifically, do:
def f(values=None):
    if values is None:
        values = []
    ...
Do not do:
def f(values=[]):
    ...
The second of the above creates only one empty list which is shared by all invocations that omit the parameter.
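The effect is easy to observe. In the following sketch (the function names append_shared and append_fresh are made up for illustration; the code also runs under Python 3), the first function keeps accumulating items across calls, while the second starts with a new list each time:

```python
def append_shared(item, values=[]):
    # The default list is created once, when the def statement is executed.
    values.append(item)
    return values

def append_fresh(item, values=None):
    # A new list is created on each call that omits values.
    if values is None:
        values = []
    values.append(item)
    return values

print(append_shared(1))   # [1]
print(append_shared(2))   # [1, 2]
print(append_fresh(1))    # [1]
print(append_fresh(2))    # [2]
```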
In Python, it is important to understand that functions and classes (and other types of objects too) are objects that can be referred to by a variable, stored in a data structure (e.g. a list or dictionary), passed to a function, returned by a function, etc.
It is also important to realize that variables in Python are simply names in a namespace that refer to objects of some kind.
Or, summarized in other words:
Names are references to objects and
Objects are first class, which means that we can:
Although, perhaps we should qualify the above a bit by saying that variables and data structures hold references to objects, and that we can pass references to objects into functions, and so on.
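A short sketch of what "first class" means in practice. All of the names here (double, Greeter, registry, call_with, make_adder) are made up for illustration, and the code is written so that it also runs under Python 3:

```python
def double(n):
    return n * 2

class Greeter(object):
    def greet(self):
        return 'hello'

# Store references to a function and a class in a dictionary.
registry = {'double': double, 'greeter': Greeter}

def call_with(func, value):
    # Receive a function as an argument and call it.
    return func(value)

def make_adder(amount):
    # Create a new function and return it.
    def adder(n):
        return n + amount
    return adder

print(call_with(registry['double'], 21))   # 42
print(registry['greeter']().greet())       # hello
print(make_adder(3)(4))                    # 7
```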
The built-in functions globals() and locals() return the dictionaries that represent the global and the local namespaces respectively. Caution: Although you can get a dictionary that represents the current global or local namespace, it is questionable that you can modify a namespace by modifying the dictionary returned by globals() or locals(). Actually, it seems to work with globals(), but definitely fails in some cases with locals().
Nested scopes -- Note that for lexically/statically nested scopes (for example, a function defined inside a function), it seems that globals() and locals() still give access to all items in the accessible namespaces, but do not give dictionary style access to all visible scopes. In particular, variables in nested scopes are not included in locals() and globals(). For more on this, see PEP 227: Statically Nested Scopes.
When you want to inspect Python's namespaces and symbol tables, also look at the built-in functions dir() and vars(). dir() returns a (not necessarily complete) list of names in the current local symbol table. vars() returns a dictionary corresponding to the current local symbol table. Note that dir() and vars() can take an optional object as an argument. See the Python documentation (below) for more on this.
For more information on the above built-in functions, see 2.1 Built-in Functions in the Python Library Reference.
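The caution about locals() and the behavior of vars() and dir() can be checked with a small experiment. This is a sketch of behavior observed in CPython, not a guarantee for every implementation; the names try_to_modify_locals, Point, and made_via_globals are made up for illustration, and the code is written so that it also runs under Python 3:

```python
def try_to_modify_locals():
    x = 1
    # Writing to the dictionary returned by locals() does not rebind x.
    locals()['x'] = 2
    return x

class Point(object):
    def __init__(self):
        self.x = 10
        self.y = 20

# Writing to globals() does create a module-level name.
globals()['made_via_globals'] = 'created'

print(try_to_modify_locals())   # 1
print(made_via_globals)         # created
print(vars(Point()))            # the instance's attribute dictionary
print('x' in dir(Point()))      # True
```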
More help and explanation on names, namespaces, bindings, etc is here:
When and why are global variables considered evil? Are there ways to reduce their evil-ness?
Many programming "authorities" consider global variables harmful. Why? Perhaps if we can answer this question we could ...
What is wrong with global variables? Why are they harmful?
A few comments on global variables in Python:
If and when you must use global variables, here are things that you can do:
Python is less rigid than some languages with respect to interfaces. And, while there are several implementations of some of the capabilities provided by interfaces (for example, in Zope), there is, as of this date, no implementation of interfaces in the Python standard library.
So, in Python, what do we mean by an interface? An interface, sometimes called a protocol, is a description of the methods and their signatures that must be implemented by a class in order to satisfy the protocol. If a class implements those methods, then we say that the class implements the interface/protocol and we say that the interface is provided by the class (that implements it).
Why would we care about defining and implementing interfaces? An interface can serve as a way for a service provider to tell a service consumer/user about the characteristics of a class or instance that the user must provide. An example is provided by the standard input, output, and error streams in the Python sys module:
stdin stdout stderr
File objects corresponding to the interpreter's standard input, output and error streams. stdin is used for all interpreter input except for scripts but including calls to input() and raw_input(). stdout is used for the output of print and expression statements and for the prompts of input() and raw_input(). The interpreter's own prompts and (almost all of) its error messages go to stderr. stdout and stderr needn't be built-in file objects: any object is acceptable as long as it has a write() method that takes a string argument. (Changing these objects doesn't affect the standard I/O streams of processes executed by os.popen(), os.system() or the exec*() family of functions in the os module.) [emphasis added]
See: 3.1 sys -- System-specific parameters and functions (and search for "stdin").
This suggests that we adopt the following point of view -- If I learn the required interface for a replacement for sys.stdout, then I have learned how to implement a class that redirects or filters output from my program. In a similar way, the developer of a framework can tell me how to implement a class, so that I can pass the class or an instance of the class to the framework, and by doing so can customize the behavior of the framework.
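Here is a sketch of such a replacement. The class name UpperCaseWriter is made up for illustration, and the code is written so that it also runs under Python 3. The only part of the interface it relies on is the write() method described in the documentation quoted above:

```python
import sys

class UpperCaseWriter(object):
    """Stand-in for sys.stdout: any object with a write() method will do."""

    def __init__(self, target):
        self.target = target
        self.lines_seen = []

    def write(self, text):
        # Record the text, then pass a filtered copy on to the real stream.
        self.lines_seen.append(text)
        self.target.write(text.upper())

original = sys.stdout
sys.stdout = UpperCaseWriter(original)
try:
    print('hello from the filter')
finally:
    # Always restore the real stream.
    sys.stdout = original
```

The print call knows nothing about UpperCaseWriter; it only calls write(), which is the whole of the interface in this case.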
Let's categorize several approaches to interfaces in Python from strict to loose:
Zope provides an implementation of interfaces for Python. Using Zope interfaces, if you are not already using Zope, might be more trouble than it is worth. And, it will require all of your users to install Zope also. Still, especially for someone who is interested in tutoring and teaching new Python programmers, possibly programmers who are familiar with Java, Zope interfaces may be worth thinking about.
Although interfaces are implemented in the Zope distribution, the zope.interface package can also be installed separately and can be used outside of Zope applications.
Here are a few of the capabilities provided by Zope interfaces (copied from my_zope_install/lib/python/zope/interface/README.txt):
We can ask an interface whether it is implemented by a class:
>>> IFoo.implementedBy(Foo) True
We can ask whether an interface is provided by an object:
>>> foo = Foo() >>> IFoo.providedBy(foo) True
We can ask what interfaces are implemented by an object:
>>> list(zope.interface.implementedBy(Foo)) [<InterfaceClass __main__.IFoo>]
For more information on the Zope implementation of interfaces, see:
Here is a trivial example:
from zope.interface import Interface, implements

class IA(Interface):
    def show(self, level):
        """Show this object.
        """

class A:
    implements(IA)
    def show(self, msg):
        print '(A.show) msg: "%s"' % msg

def test():
    a = A()
    a.show('hello')
    print IA.implementedBy(A)

test()
Notes:
We can document an interface by providing a Python class containing method headers and method doc-strings but no method implementations.
Here is an example:
class MyAbstract:
    def __init__(self, name):
        if self.__class__ == MyAbstract:
            raise NotImplementedError, 'class MyAbstract is abstract'
    def write(self, msg):
        """Write a message.
        """
    def show(self):
        """Display the name etc.
        """

class MyConcrete(MyAbstract):
    """This class implements interface MyAbstract.
    """
    def __init__(self, name):
        MyAbstract.__init__(self, name)
        self.name = name
    def write(self, msg):
        print '(MyConcrete:%s) msg: %s' % (self.name, msg, )
    def show(self):
        print '(MyConcrete) name: %s' % self.name
Notes:
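In the example above, an abstract method that a subclass forgets to override silently does nothing. One common variation, sketched here under that observation, is to have each abstract method raise NotImplementedError so that the omission shows up when the method is called. The class names AbstractWriter and PartialWriter are hypothetical and are not part of the original example.

```python
class AbstractWriter(object):
    """Hypothetical abstract class: documents the interface only."""

    def write(self, msg):
        """Write a message."""
        raise NotImplementedError('subclass must implement write')

    def show(self):
        """Display the name etc."""
        raise NotImplementedError('subclass must implement show')


class PartialWriter(AbstractWriter):
    """Implements write() but forgets to implement show()."""

    def __init__(self):
        self.messages = []

    def write(self, msg):
        self.messages.append(msg)


writer = PartialWriter()
writer.write('hello')
try:
    writer.show()
except NotImplementedError as exc:
    # The missing method is reported at call time instead of being ignored.
    problem = str(exc)
```

The trade-off is that the error appears only when the missing method is actually called, not when the class is defined or instantiated.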
We can describe the interface or protocol in text, saying something like "any class that has a method named this with these arguments and a method named that with those arguments and ...". Needless to say, if there are more than a couple of methods, you are likely to want to switch to the use of an abstract class to document your interface.
The following example might provide suggestions for those who want to do a bit of error checking in code that requires a class or instance to provide a particular interface:
import inspect

class A:
    def show(self, level):
        print '(A.show) level: %s' % level

class B:
    # Deliberately takes no argument besides self.
    def show(self):
        print '(B.show)'

#
# This function requires an object that implements a method named 'show'
# which takes one argument in addition to self.
#
def test_interface(obj):
    # Does it have a 'show' attribute?
    if not hasattr(obj, 'show'):
        raise RuntimeError, 'obj must support method show'
    meth = getattr(obj, 'show')
    argnames = inspect.getargspec(meth)[0]
    # Does it take exactly one argument in addition to self?
    if len(argnames) != 2:
        raise RuntimeError, 'method show must take two args (self + level)'
    obj.show(25)

def test():
    a = A()
    # This call is OK.
    test_interface(a)
    b = B()
    # This call generates an exception.  Class B does not
    # support the required protocol.
    test_interface(b)

test()
Running the above code produces the following output:
(A.show) level: 25
Traceback (most recent call last):
  File "tmp.py", line 33, in ?
    test()
  File "tmp.py", line 31, in test
    test_interface(b)
  File "tmp.py", line 24, in test_interface
    raise RuntimeError, 'method show must take two args (self + level)'
RuntimeError: method show must take two args (self + level)
Notes:
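The example above is written for Python 2 and relies on inspect.getargspec(), which was removed in Python 3.11. A rough Python 3 counterpart, assuming inspect.signature() is available (Python 3.3 or later), might look like the following. The function name check_show and the use of TypeError are choices made for this sketch. Note that inspect.signature() on a bound method leaves out self, so the expected count is one rather than two.

```python
import inspect


class A(object):
    def show(self, level):
        print('(A.show) level: %s' % level)


class B(object):
    # Deliberately takes no argument besides self.
    def show(self):
        print('(B.show)')


def check_show(obj):
    """Raise TypeError unless obj.show takes exactly one argument besides self."""
    meth = getattr(obj, 'show', None)
    if not callable(meth):
        raise TypeError('obj must support method show')
    # For a bound method, the signature does not include self.
    params = inspect.signature(meth).parameters
    if len(params) != 1:
        raise TypeError('method show must take one argument besides self')
    return True


def test():
    # This call is OK.
    check_show(A())
    try:
        # Class B does not support the required protocol.
        check_show(B())
    except TypeError as exc:
        print('rejected: %s' % exc)


test()
```

As in the original, this checks only the number of parameters; it does not verify their names or what the method does with them.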
A few words in summary:
The following articles provide additional reading on interfaces:
Python is a simple language.
Python is an advanced and complex language.
The classes I have taught have been three- and four-day classes. It is not possible to teach all of Python in that amount of time. So, designing the contents of a class requires balance and compromise between covering the basic parts and covering the advanced parts.
So, to start off a discussion, I'll list the basic parts and the more advanced parts.
Basic:
Advanced:
A Jython course, in particular, seems to be attended by students with different needs: (1) some need to write and maintain scripts; (2) some need to extend Jython with Java; (3) some need to embed Jython in a Java application. I'm suggesting that you get this divergence out on the table and that you teach two separate sections for those two separate needs.
And, what if you cannot separate your training into two separate classes for two groups of students? Some suggestions:
The next three sub-sections give a very high level description of suggested contents for these two separate courses (or sessions).
Prerequisites -- None, but familiarity with some other programming language is helpful, and familiarity with an object-oriented language is a plus.
Content:
Advantages and limitations of adding special Jython support to a Java class.
Features:
Prerequisites -- (1) Knowledge of Python; (2) knowledge of Java.
Content:
I typically use my class notes as my materials for the class. These materials are posted at my Web site here: http://www.rexx.com/~dkuhlman/#proposed-python-courses
I usually ask the class members to open a Web browser window and point it at the relevant materials. I also point out that the text version in reST (reStructuredText) can be found by following the link at the bottom of the page. Some students like to take notes by loading the text version into a text editor and adding their notes to it.
Docutils can also be used to produce LaTeX from reST, and then I use pdflatex to produce PDF. One training company I worked with took the PDF and printed paper copies. Some students prefer that, although I do not know why.
This is one of the aspects of Python/Jython training that gives me the most difficulty. There are classes where I stop and ask for questions, and cannot get a response. I try probing. I try to start a question session by asking questions of the class. But, sometimes there is nothing. I believe that some class members believe that they are being polite by not putting me on the spot with a difficult question. But, actually, responding to questions is the part I like the most.
If pulling questions out of the class is the part that gives you difficulty, too, here are a few things you might try:
Having an agenda or schedule is important. It establishes goals for the class, which will be of help and guidance for you, and also informs class members of what they can expect. It also gives you something to keep yourself on track.
Prepare a one- to two-page agenda. I usually break it up into a description of morning and afternoon sessions. You can find an example here: Jython Agenda. Note that this, like most of my documents, was generated from reStructuredText (reST) using Docutils, and you will find a link to the source document at the bottom of the page.
A few notes based on my experiences in teaching Jython/Python:
Teaching Jython breaks down into two major areas:
In a typical class, you are likely to face two categories of students:
The difficulty is that the first type of student will be lost when you get to the Java part, and the second type of student will be bored while you do the Python part.
An ideal solution would be to teach two separate classes for the two types of students. But, you will often not have that option.
My own solution so far has been to try to make the Python sessions as lively as possible and to make the Jython/Java sessions as basic as possible. That's not the best of solutions, but it's the best I have been able to do so far. One possible tweak on this approach is to try to save the last afternoon of the training for practical exercises and individual projects. This allows each student to pick the area where s/he has an interest and feels a need to learn. In my last training, this approach was reasonably successful.
If you intend to teach Jython/Python, planning for practical exercises is almost as important as planning the teaching, explaining, and lecturing that you will do. Exercises are especially problematic in classes I teach because these classes are relatively short. How do you teach programming in Python, cover Jython's connections with Java, and include reasonably realistic practical exercises, all in a three- or four-day training schedule?
Homework? That might help stretch the available time. But, I don't think assigning work for after-hours is realistic.