A DeepSource analyzer inspects and analyzes source code in a repository to raise issues and track metrics. Analyzers look for anti-patterns and bug risks and raise issues to show the developer. Analyzers also create and track metrics like dependency counts, documentation coverage, etc. Analyzers operate at a file-level (e.g., an anti-pattern found in a file at a specific location) and repository-level (e.g., 4 dependencies found that are not installed).
In addition to detecting issues, DeepSource analyzers can now suggest fixes for commonly occurring issues and can create a pull request with the fixes. Look for the 'Autofix' button in the dashboard adjacent to certain supported issues. We're steadily increasing the coverage of issues on all the language analyzers that we support.
DeepSource detects issues using static analysis across categories like anti-patterns, bug risks, coverage, performance issues, security flaws, style issues and type check issues.
Anti-patterns are certain ways of writing code that result in poor design. While anti-patterns are correct code, they are not recommended as they often affect maintainability, readability, performance, and security.
Examples of anti-patterns:
In Python, the file class is equipped with special methods that are automatically called whenever a file is opened via a with statement (e.g. with open("file.txt", "r") as file). These special methods ensure that the file is properly and safely opened and closed.
The code below does not use with to open a file. This code depends on you remembering to close the file manually via close() when finished. Even if you remember to call close(), the code is still dangerous: if an exception occurs before the call to close(), close() is never reached, so the file handle can leak and buffered data may never be written to disk.
f = open('/tmp/.deepsource.toml', 'w')  # 'w' so that write() is allowed
f.write("config file.")
f.close()  # skipped if write() raises
It is a good practice to use with to open a file. The file class has special built-in methods called __enter__() and __exit__(), which respectively open and close the file for you. Python guarantees that these special methods are always called, even if an exception occurs.
with open("/tmp/.deepsource.toml", "w") as f:
    f.write("config file.")
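To see that guarantee in action, the minimal sketch below writes to a file inside a with block, raises an exception partway through, and then checks that the file was closed anyway. The temporary path and the RuntimeError are illustrative choices, not part of any DeepSource example.

```python
import os
import tempfile

# Hypothetical scratch location; any writable path would work.
path = os.path.join(tempfile.mkdtemp(), "example.toml")

handle = None
try:
    with open(path, "w") as f:
        handle = f
        f.write("config file.")
        raise RuntimeError("simulated failure after the write")
except RuntimeError:
    pass

# __exit__() ran while the exception propagated, so the file is closed
# and the buffered text was flushed to disk.
print(handle.closed)
with open(path) as f:
    print(f.read())
```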
Bug risks are issues in code that can cause errors in code and breakages in production. A bug is a flaw in the code that produces undesired or incorrect results.
Code often has bug risks due to poor coding practices, lack of version control, miscommunication of requirements, unrealistic time schedules for development and buggy third-party tools.
Examples of bug-risks:
Passing mutable lists or dictionaries as default arguments to a function can have unforeseen consequences. Usually, when you use a list or dictionary as the default argument to a function, you want the program to create a new list or dictionary every time that function is called. However, this is not what Python does. The first time the function is called, Python creates a persistent object for the list or dictionary. Every subsequent time the function is called, Python uses that same persistent object created during the first call to the function.
In the code below, the append function is used under the assumption that a new list with the object passed as the first argument would be returned each time that the function is called without the second argument. In reality, this is not what happens. The first time the function is called, Python creates a persistent list. Every subsequent call to append appends the value to that persistent list instead of creating a new one.
def append(number, number_list=[]):
    number_list.append(number)
    print(number_list)
    return number_list

append(5)  # expecting: [5], actual: [5]
append(7)  # expecting: [7], actual: [5, 7]
append(2)  # expecting: [2], actual: [5, 7, 2]
It is a good practice to use a sentinel value to denote an empty list or dictionary. This means that if you want the function to return a singleton list each time it is called without the second argument, you should use a sentinel value to represent this case and then modify the function's body to support it.
# the keyword None is the sentinel value representing an empty list
def append(number, number_list=None):
    if number_list is None:
        number_list = []
    number_list.append(number)
    print(number_list)
    return number_list

append(5)  # expecting: [5], actual: [5]
append(7)  # expecting: [7], actual: [7]
append(2)  # expecting: [2], actual: [2]
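The difference between the two versions can be checked directly. The sketch below is illustrative only: the names append_shared and append_fresh are made up for this comparison. It uses object identity and the function's __defaults__ attribute to show where the persistent list lives.

```python
def append_shared(number, number_list=[]):
    # The default list is created once, when the def statement runs.
    number_list.append(number)
    return number_list


def append_fresh(number, number_list=None):
    # None is the sentinel; a new list is built on every call that omits it.
    if number_list is None:
        number_list = []
    number_list.append(number)
    return number_list


first = append_shared(5)
second = append_shared(7)
print(first is second)             # both calls returned the same object
print(append_shared.__defaults__)  # the stored default holds both numbers

print(append_fresh(5) is append_fresh(7))  # two distinct lists
print(append_fresh.__defaults__)           # the default is still None
```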
Performance issues are issues that impact the performance of code being executed by slowing it down. Considerable performance gains can be obtained when the appropriate functions and directives are used.
Examples of performance issues:
Checking whether an item is in a list can take up to n comparisons, where n is the number of items in the list. If possible, change the list to a set or dictionary instead: Python locates items in a set or dictionary by hash lookup rather than by scanning, which takes roughly constant time on average and is much more efficient for large collections.
The code below defines a list l and then uses the expression if 3 in l: to check if the number 3 exists in the list. This is inefficient. Behind the scenes, Python iterates through the list until it finds the number or reaches the end of the list.
l = [1, 2, 3, 4]

# iterates over three elements in the list
if 3 in l:
    print("The number 3 is in the list.")
else:
    print("The number 3 is NOT in the list.")
In the modified code below, the list has been changed to a set. This is much more efficient behind the scenes, as Python can attempt to directly access the target number in the set, rather than iterate through every item in the list and compare every item to the target number.
s = {1, 2, 3, 4}

# a single hash lookup, no scan over the elements
if 3 in s:
    print("The number 3 is in the set.")
else:
    print("The number 3 is NOT in the set.")
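With only four elements the difference is negligible, so a rough measurement on a larger collection makes the point clearer. The sketch below is an illustration under assumed sizes (100,000 items, 200 repetitions) and uses the standard timeit module. Absolute timings depend on the machine.

```python
import timeit

n = 100_000
numbers_list = list(range(n))
numbers_set = set(numbers_list)
target = n - 1  # worst case for the list: the last element

# Time 200 membership checks against each container.
list_time = timeit.timeit(lambda: target in numbers_list, number=200)
set_time = timeit.timeit(lambda: target in numbers_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```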
A bug in code which could potentially be used to compromise security is a security vulnerability issue. Using libraries and tools that are out-of-date or have known security issues can also introduce vulnerabilities in the system.
A highly dynamic language like Python that gives you many ways to change the runtime behavior of code and even dynamically execute new code is powerful but can be a security risk as well.
The three main types of security vulnerabilities based on their more extrinsic weaknesses are:
- Porous defenses
- Risky resource management
- Insecure interaction between components
Examples of security vulnerabilities:
Python's exec enables you to dynamically execute arbitrary Python code stored in strings. Building a complex string of Python code and then passing that code to exec results in code that is hard to read and hard to test. Anytime the "Use of exec" error is encountered, you should go back to the code and check if there is a clearer, more direct way to accomplish the task.
The sample code below composes a literal string containing Python code and then passes that string to exec for execution. This is an indirect and confusing way to program in Python.
s = "print(\"Hello, World!\")"
exec(s)
In most scenarios, you can easily refactor the code to avoid the use of exec. In the example below, the use of exec has been removed and replaced by a function.
def print_hello_world():
    print("Hello, World!")

print_hello_world()
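When the code to run is only known at runtime, one common way to avoid exec is a dictionary that maps names to ordinary functions. The sketch below is a minimal illustration of that idea. The command names and the run helper are hypothetical.

```python
def say_hello():
    return "Hello, World!"


def say_goodbye():
    return "Goodbye, World!"


# Hypothetical command names mapped to ordinary functions.
COMMANDS = {"hello": say_hello, "goodbye": say_goodbye}


def run(command):
    """Look up and call the function registered for command."""
    try:
        handler = COMMANDS[command]
    except KeyError:
        raise ValueError(f"Unknown command: {command!r}") from None
    return handler()


print(run("hello"))
```

Only the functions listed in COMMANDS can ever be called, so unexpected input results in a ValueError rather than in arbitrary code being executed.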
The assert statement is a debugging aid that tests a condition. If the condition is true, it does nothing, and your program continues to execute. But if the assert condition evaluates to false, it raises an AssertionError exception with an optional error message.
In the code below, an assert statement is used in application logic, which is discouraged. Asserts can be turned off globally in the Python interpreter (for example, by running it with the -O flag). Don't rely on assert expressions to be executed for data validation or data processing.
def delete_product(product_id, user):
    assert user.is_admin(), 'Must have admin privileges to delete'
    assert store.product_exists(product_id), 'Unknown product id'
    store.find_product(product_id).delete()
Assert statements should ideally be used only in tests, and to verify invariants while debugging. Instead of using assert in production code, you could do your validation with regular if-statements and raise validation exceptions if necessary.
def delete_product(product_id, user):
    if not user.is_admin():
        raise AuthError('Must have admin privileges to delete')
    if not store.product_exists(product_id):
        raise ValueError('Unknown product id')
    store.find_product(product_id).delete()
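The effect of disabling asserts can be observed by running the same snippet in two interpreter processes, once normally and once with -O. The sketch below assumes that sys.executable points at a working interpreter and that the PYTHONOPTIMIZE environment variable is not set. The snippet itself is illustrative.

```python
import subprocess
import sys

snippet = "assert False, 'validation failed'\nprint('reached')"

# Normal run: the assert fires and the interpreter exits with an error.
normal = subprocess.run([sys.executable, "-c", snippet],
                        capture_output=True, text=True)

# Optimized run (-O): assert statements are removed, so execution continues.
optimized = subprocess.run([sys.executable, "-O", "-c", snippet],
                           capture_output=True, text=True)

print(normal.returncode, "AssertionError" in normal.stderr)
print(optimized.returncode, optimized.stdout.strip())
```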
Style issues are violations in the code format according to a style guide. If the code does not follow the specified code style guidelines, it fails to express its intent in the most readable way.
Code style covers any stylistic choice in the code that has no effect on its behavior.
Any large code base with multiple team members should look as if only one programmer wrote it. If a team agrees on a given style it can help keep the code consistent.
Examples of style issues:
Per the PEP-8 Style Guide, Python code should be consistently indented with 4 spaces per level, and spaces and tabs should never be mixed.
The following code mixes spaces and tabs for indentation. The print("Hello, World!") statement is indented with a tab. The print("Goodbye, World!") statement is indented with 4 spaces.
def print_hello_world():
	# indented with tab
	print("Hello, World!")

def print_goodbye_world():
    # indented with 4 spaces
    print("Goodbye, World!")
All Python code should be consistently indented with 4 spaces as in the code below.
def print_hello_world():
    print("Hello, World!")  # indented with 4 spaces

def print_goodbye_world():
    print("Goodbye, World!")  # indented with 4 spaces
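A check for this kind of issue can be approximated in a few lines. The sketch below is a simplified illustration, not how any particular analyzer works. The helper name lines_indented_with_tabs is made up, and it only reports lines whose leading whitespace contains a tab.

```python
def lines_indented_with_tabs(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged


sample = (
    "def print_hello_world():\n"
    "\tprint('Hello, World!')\n"
    "def print_goodbye_world():\n"
    "    print('Goodbye, World!')\n"
)
print(lines_indented_with_tabs(sample))
```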
In Python, there should be whitespace after the characters "," ";" and ":". For instance, in the code below, there is a missing whitespace after the comma in (1,10).
class BaseNumberGenerator:
    def __init__(self):
        self.limits = (1,10)  # missing whitespace after the comma

    def get_number(self, min_max):
        raise NotImplementedError
Documentation issues arise when certain parts of the code are left undocumented. Documentation is a collection of easy-to-understand images and written descriptions that describe what a codebase does and how it can be used.
Documentation is important because it ensures that the next time you dive into a codebase, you won't have to take as much time to get up to speed.
Examples of documentation issues:
PEP-8 states that all public modules, classes, functions, and methods should have a documentation string. A documentation string is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a string becomes the __doc__ special attribute of that object.
Ensuring that every public module, class, function, and method is documented makes it easier for other developers to maintain the code.
If a module, class, function, or method needs to be public, add a documentation string that describes the purpose or use of the object (see PEP-257 for guidelines). If the object does not need to be public, make it "private" by prefixing its name with an underscore.
The following simple, public function should be updated to include a documentation string immediately after the def line.
def add(x, y):
    return x + y
You might insert the documentation string """Return the sum of x and y.""" on line 2 as shown in the code below.
def add(x, y):
    """Return the sum of x and y."""
    return x + y
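A rough version of this check can be written with the standard ast module. The sketch below is an illustration under simplifying assumptions: it only looks at functions, and it treats any name without a leading underscore as public. The helper name functions_missing_docstrings is made up.

```python
import ast


def functions_missing_docstrings(source):
    """Return names of public functions in source that lack a docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not node.name.startswith("_") and ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


sample = '''
def add(x, y):
    return x + y

def subtract(x, y):
    """Return x minus y."""
    return x - y

def _helper():
    pass
'''
print(functions_missing_docstrings(sample))
```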
The following table lists all supported analyzers.
Available release channels:
A DeepSource Transformer automatically 'transforms' all incoming source code in
a repository with popular code auto-formatters (e.g.,
Whenever a PR is created, all transformers enabled in .deepsource.toml are run and the changes detected, if any, are committed back to that branch.
If changes are introduced without a PR (i.e. committed directly into the default branch), a new PR is created with transformations.
Note: Since GitHub restricts creating commits on branches from other forks, it may not be possible to commit back the changes sometimes. In this case, we recommend merging the PR as is.
Once merged, DeepSource would automatically pick up the transformations introduced in the merge commit, and create a new PR with these formatting changes only.
When transformers are first introduced in .deepsource.toml (in the default branch), the entire codebase is transformed. Afterwards, only the files added or modified in a commit, pull request (PR), or merge request (MR) are transformed.
For PRs which have non-default branches as base (e.g., PRs to release branches, etc.), transformations are not run.
The following table lists all supported transformers and their shortcodes.
|Google Java Format|
The analysis configuration for a repository on DeepSource is defined in a .deepsource.toml file in the repository's root. This file must be present at that location for analysis to run.
.
├── .deepsource.toml
├── README.md
├── bar
│   └── baz.py
└── foo.py
The configuration is written in the TOML format.
- Type: Integer
- Presence: mandatory
- Description: The version property is required and must be set to 1, which is the only supported version at the moment.
version = 1
- Type: Array
- Presence: optional
- Description: List of glob patterns that should be excluded when the analyses are run. These patterns should be relative to the repository's root.
exclude_patterns = [
    "bin/**",
    "**/node_modules/",
    "js/**/*.min.js"
]
- Type: Array
- Presence: optional
- Description: List of glob patterns that should be marked as tests or containing test files. These patterns should be relative to the repository's root.
test_patterns = [ "tests/**", "test_*.py" ]
- Type: Array of Tables
- Presence: mandatory
- Description: List of analyzers. For each analyzer that has to be enabled for this repo, an entry has to be added in the analyzers table. Refer to analyzer specific configuration docs for available options.
[[analyzers]]
name = "python"
enabled = true
dependency_file_paths = [
    "requirements.txt",
    "Pipfile"
]
- Type: Array of Tables
- Presence: mandatory
- Description: List of transformers. For each transformer that has to be enabled for this repo, an entry has to be added in the transformers table. Refer to transformer specific configuration docs for available options.
[[transformers]]
name = "black"
enabled = true
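The rules described above can be sketched as a small validator over an already-parsed configuration. This is an illustrative sketch, not DeepSource's own validation: the validate_config helper is hypothetical, the requirement that every analyzer entry has a name is an assumption based on the example, and the dict stands in for what a TOML parser would return.

```python
def validate_config(config):
    """Return a list of problems found in a parsed .deepsource.toml dict."""
    problems = []
    if config.get("version") != 1:
        problems.append("version must be set to 1")
    analyzers = config.get("analyzers")
    if not isinstance(analyzers, list) or not analyzers:
        problems.append("at least one [[analyzers]] entry is required")
    else:
        for entry in analyzers:
            # Assumption: each entry carries a name, as in the example.
            if not isinstance(entry, dict) or "name" not in entry:
                problems.append("every analyzer entry needs a name")
    for key in ("exclude_patterns", "test_patterns"):
        value = config.get(key, [])  # both properties are optional
        if not isinstance(value, list) or not all(isinstance(p, str) for p in value):
            problems.append(f"{key} must be an array of strings")
    return problems


# Dict equivalent of the examples above, as a TOML parser would produce it.
config = {
    "version": 1,
    "exclude_patterns": ["bin/**", "**/node_modules/", "js/**/*.min.js"],
    "test_patterns": ["tests/**", "test_*.py"],
    "analyzers": [{"name": "python", "enabled": True,
                   "dependency_file_paths": ["requirements.txt", "Pipfile"]}],
    "transformers": [{"name": "black", "enabled": True}],
}
print(validate_config(config))
print(validate_config({"version": 2}))
```

On Python 3.11 or later, the standard tomllib module can produce such a dict by passing a file opened in binary mode to tomllib.load.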