Qubole Data Service (QDS) is the #1 cloud-native big data platform that revolutionizes the way companies draw insights from their data and make them actionable for their business use cases. It serves some of the largest data-driven companies such as Lyft, Expedia, Box, and Oracle.
Anything ‘data’ nowadays is incomplete without the mention of the smart minds behind it: the data engineers, analysts and data scientists who form Qubole’s major user base. To access the Qubole Data Service API in their day-to-day work, they use qds-sdk-py, a Python SDK. The SDK provides an easy-to-use command-line interface that lets its users run Hive, Hadoop, Pig, Presto and shell commands synchronously, or submit a command and check its status against QDS.
Qubole has one or two developers from each team working on qds-sdk-py, which brings the total to 15-20 developers. According to Joy Lal Chattaraj, member of technical staff at Qubole, to maintain the project’s health the team relied on unit test coverage and de facto standards such as PEP8 and compact function definitions for any new code added. All of these conventions were enforced manually during code reviews.
With multiple developers from different teams working on the project:
Joy realized that automating code reviews would be a win-win for everyone. In August 2019, he set up DeepSource to run static code analysis for the project. The straightforward configuration required zero technical support, and the analysis was up and running in a few minutes. The team’s onboarding followed shortly after.
Automation proved to be a savior for Qubole’s developers. DeepSource scans run with every pull request and flag issues directly in GitHub checks within seconds. What happens next?
The developers conduct the first round of code review themselves and fix the flaws detected before involving the reviewer, saving both the developer and reviewer a lot of time. Recalling one such instance, Joy says:
With DeepSource, Qubole’s review time has decreased 3-fold, the feature release cycle has picked up pace, and the scope for missing flaws, be it an obvious error or an elusive one, has reduced considerably. A few instances of the flaws detected:
Given the accuracy of the issues reported and the low rate of false positives, Joy and his team saw an evident improvement in the quality metrics of the code base. That’s when he decided to make the checks mandatory, which means that until all the issues flagged by DeepSource are resolved, neither the developer nor the reviewer can merge the code. This has helped Qubole block pull requests that do not comply with the project’s coding standards.
At Qubole, the team aims to have 100% unit test coverage for all incoming code and above 80% for the project overall, which is tracked regularly. Already having a tool in place that tracks test coverage without any overhead became a bonus.
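The two thresholds above can be read as a simple pass/fail gate on each pull request. The following is only an illustrative sketch of that logic, not how DeepSource or Qubole implement it; the names `CoverageReport`, `percent` and `coverage_gate` are hypothetical, and the line counts are assumed to come from whatever coverage tool the project uses.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the text; the constant names are hypothetical.
NEW_CODE_MIN_COVERAGE = 100.0  # percent required for incoming code
OVERALL_MIN_COVERAGE = 80.0    # project-wide coverage must be above this


@dataclass
class CoverageReport:
    """Line counts for one pull request (assumed inputs, for illustration)."""
    new_lines_covered: int
    new_lines_total: int
    overall_lines_covered: int
    overall_lines_total: int


def percent(covered: int, total: int) -> float:
    """Return coverage as a percentage; no lines at all counts as 100%."""
    return 100.0 if total == 0 else 100.0 * covered / total


def coverage_gate(report: CoverageReport) -> List[str]:
    """Return failure messages; an empty list means the gate passes."""
    failures = []
    new_pct = percent(report.new_lines_covered, report.new_lines_total)
    overall_pct = percent(report.overall_lines_covered,
                          report.overall_lines_total)
    if new_pct < NEW_CODE_MIN_COVERAGE:
        failures.append(
            f"new code coverage {new_pct:.1f}% is below {NEW_CODE_MIN_COVERAGE:.0f}%"
        )
    # "Above 80%" is read strictly here, so exactly 80% fails.
    if overall_pct <= OVERALL_MIN_COVERAGE:
        failures.append(
            f"overall coverage {overall_pct:.1f}% is not above {OVERALL_MIN_COVERAGE:.0f}%"
        )
    return failures
```

Under these assumptions, a pull request whose new lines are all covered and which leaves the project at 85% overall would pass, while one that leaves a single new line untested would be flagged.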
DeepSource reports and updates the test coverage status after every run. It helps the developers ensure that the defined threshold is maintained in every pull request that is merged to master. “Test coverage is the check I like the most,” Joy says, “and the capability to integrate it with pull requests is icing on the cake.”