doc = os.path.splitext(os.path.basename(doc))[0]

# remove the document if it was already processed
if doc in completed_files:
    return False  # remove
return True  # keep

A variable used in a closure is defined in a loop. Every closure created in the loop refers to the same variable, so they all see whatever value it holds when they are called (usually the value from the last iteration). This can cause bugs that are hard to track down.
Consider the following code snippet:
# Motivation: Create 3 functions that return `x**2`, `x**3`, and `x**4`.
# Note: The lambdas deliberately use the loop variable to show the bug.
powers = [lambda x: x**i for i in range(2,5)]
So, powers is supposed to contain three functions that return the 2nd, 3rd, and 4th powers of a given number.
On execution, these are the results:
In [1]: powers = [lambda x: x**i for i in range(2,5)]
In [2]: powers[0](2) # Expected result: 4 (2**2)
Out[2]: 16
In [3]: powers[1](2) # Expected result: 8 (2**3)
Out[3]: 16
In [4]: powers[2](2) # Expected result: 16 (2**4)
Out[4]: 16
This happens because i is not local to the lambdas. It is defined in the enclosing scope, and it is looked up when a lambda is called, not when the lambda is defined. By the end of the loop the value of i is 4, so all three functions return x**4.
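The same late lookup occurs with an ordinary for loop and a nested def, not only with lambdas in a comprehension. A minimal sketch (the names make_powers and power are illustrative and do not come from the flagged code):

```python
def make_powers():
    """Build closures in a plain for loop; each one closes over the same `i`."""
    funcs = []
    for i in range(2, 5):
        def power(x):
            # `i` is looked up when power() is called, not when it is defined.
            return x ** i
        funcs.append(power)
    return funcs


late_bound = make_powers()
results = [f(2) for f in late_bound]
print(results)  # [16, 16, 16]
```

By the time any of the functions is called, the loop has finished and i is 4, so each call computes 2**4.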
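To avoid this, bind the current value of i to a name that is local to each lambda, so that the lambda no longer depends on i in the enclosing scope. A default argument works because default values are evaluated once, when the lambda is defined: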
In [1]: powers = [lambda x, n=i: x**n for i in range(2,5)]
In [2]: powers[0](2)
Out[2]: 4
In [3]: powers[1](2)
Out[3]: 8
In [4]: powers[2](2)
Out[4]: 16
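Default arguments are one way to bind the value early. Two other common options are a factory function and functools.partial. The sketch below assumes the same goal as the powers example; make_power and raise_to are hypothetical helper names introduced only for illustration:

```python
from functools import partial


def make_power(n):
    """Return a function that raises its argument to the fixed power n."""
    def power(x):
        # `n` is local to this particular call of make_power.
        return x ** n
    return power


def raise_to(x, n):
    """Plain two-argument helper used with functools.partial below."""
    return x ** n


# Each call to make_power creates a new scope with its own `n`.
by_factory = [make_power(i) for i in range(2, 5)]

# partial stores the current value of `i` as the keyword argument `n`.
by_partial = [partial(raise_to, n=i) for i in range(2, 5)]

print([f(2) for f in by_factory])  # [4, 8, 16]
print([f(2) for f in by_partial])  # [4, 8, 16]
```

One difference from the default-argument fix is that the factory version does not add an extra n parameter to the resulting function, so a caller cannot override the exponent by accident.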