Cell variable is_within_directory defined in loop
for member in tar.getmembers():
    member_path = os.path.join(path, member.name)
    if not is_within_directory(path, member_path):
        raise Exception("Attempted Path Traversal in Tar File")

tar.extractall(path, members, numeric_owner=numeric_owner)
Description
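A variable used in a closure is defined in a loop. A closure captures the variable itself, not the value it held when the closure was created, so every closure created in the loop sees whatever value the variable holds when the closure is finally called, which is usually the value from the last iteration. This can cause bugs that are hard to track down.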
Consider the following code snippet:
# Motivation: Create 3 functions that return `x**2`, `x**3`, and `x**4`.
# Note: This will be using the loop variable to show the bug.
powers = [lambda x: x**i for i in range(2,5)]
So, powers is supposed to contain 3 functions that return the 2nd, 3rd and 4th powers of a given number.
On execution, these are the results:
In [1]: powers = [lambda x: x**i for i in range(2,5)]
In [2]: powers[0](2) # Expected result: 4 (2**2)
Out[2]: 16
In [3]: powers[1](2) # Expected result: 8 (2**3)
Out[3]: 16
In [4]: powers[2](2) # Expected result: 16 (2**4)
Out[4]: 16
This happens because i is not local to the lambdas. It is defined in the enclosing scope, and its value is looked up when the lambda is called, not when it is defined. By the end of the loop, the value of i is 4, so all the functions return x**4.
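The same late-binding behaviour can be seen with ordinary nested functions in a for loop. The following is a minimal sketch for illustration only; the names make_callbacks and callback are hypothetical and do not come from the flagged code.

```python
def make_callbacks():
    """Build three closures in a loop; each one refers to the loop variable i."""
    callbacks = []
    for i in range(3):
        def callback():
            # i is looked up when callback() runs, not when it is defined.
            return i
        callbacks.append(callback)
    return callbacks


callbacks = make_callbacks()

# The loop has finished by the time the closures are called, so i is 2 for all.
results = [cb() for cb in callbacks]
print(results)  # [2, 2, 2]

# Each closure holds a reference to a cell for i rather than a copied value.
cell_values = [cb.__closure__[0].cell_contents for cb in callbacks]
print(cell_values)  # [2, 2, 2]
```

Inspecting `__closure__` is only a debugging aid here; it shows that the closures store a reference to the variable, which is why they all agree on its final value.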
Recommended
To avoid this, save the value in a variable local to each lambda, so that the lambda doesn't rely on the value of i in the enclosing scope. A default argument works because it is evaluated when the lambda is defined:
In [1]: powers = [lambda x, n=i: x**n for i in range(2,5)]
In [2]: powers[0](2)
Out[2]: 4
In [3]: powers[1](2)
Out[3]: 8
In [4]: powers[2](2)
Out[4]: 16
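Default arguments are not the only way to bind the value early. As a sketch of two common alternatives, assuming hypothetical helper names power and make_power that are not part of the original example, the value can be fixed with functools.partial or with a small factory function that gives each closure its own scope:

```python
from functools import partial


def power(x, n):
    """Return x raised to the n-th power."""
    return x ** n


# partial() stores the current value of i as the keyword argument n right away.
powers_partial = [partial(power, n=i) for i in range(2, 5)]


def make_power(n):
    """Return a function that raises its argument to the n-th power."""
    # n is local to this call, so each returned lambda closes over its own n.
    return lambda x: x ** n


powers_factory = [make_power(i) for i in range(2, 5)]

print([f(2) for f in powers_partial])  # [4, 8, 16]
print([f(2) for f in powers_factory])  # [4, 8, 16]
```

Compared with the default-argument form, these variants do not add an extra parameter that a caller could override by accident, at the cost of a little more code.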