Pysa, a static analyzer for Python offered by Facebook

Facebook has introduced an open source static analyzer called Pysa (Python Static Analyzer), which is designed to identify potential vulnerabilities in Python code.

Pysa analyzes how data flows through the code as it would be executed, which makes it possible to identify many potential vulnerabilities and privacy problems caused by data ending up in places where it should not appear.

For example, Pysa can track the use of raw external data in calls that execute external programs, in file operations and in SQL constructs.

Today, we share details about Pysa, an open source static analysis tool that we have built to detect and prevent security and privacy issues in Python code. Last year, we shared how we created Zoncolan, a static analysis tool that helps us analyze more than 100 million lines of Hack code and has helped engineers prevent thousands of potential security problems. That success inspired us to develop Pysa, which is an acronym for Python Static Analyzer.

Pysa uses the same algorithms to perform static analysis and even shares code with Zoncolan. Like Zoncolan, Pysa tracks data flows through a program.

The user defines sources (places where important data originates) as well as sinks (places where data from a source should not end up).

For security applications, the most common types of sources are places where user-controlled data enters the application, such as Django's HttpRequest.GET dictionary.

Sinks tend to be much more varied, but can include APIs that run code, such as eval, or APIs that access the file system, such as os.open.
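To make the two terms concrete, here is a minimal sketch of a flow from a source to a sink. FakeRequest is a hypothetical stand-in for a Django request object so that the snippet runs without Django installed, and the function names are invented for illustration.

```python
class FakeRequest:
    """Hypothetical stand-in for a Django request; GET holds user-controlled data."""

    def __init__(self, params):
        self.GET = dict(params)


def get_operator(request):
    # Source: the value comes straight from user-controlled input.
    return request.GET["operator"]


def calculate(request):
    operator = get_operator(request)
    # Sink: eval runs whatever string it receives, so tainted data must not reach it.
    return eval(f"1 {operator} 2")


print(calculate(FakeRequest({"operator": "+"})))  # prints 3
```

A request carrying a harmless "+" evaluates to 3, but nothing stops a caller from sending arbitrary Python instead. That path from request.GET to eval is the kind of flow a taint analyzer is meant to report.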

Pysa performs iterative rounds of analysis to build summaries that determine which functions return data from a source and which functions have parameters that eventually reach a sink. If Pysa finds that a source eventually connects to a sink, it reports a problem.
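The iterative rounds described above can be pictured with a toy fixed-point loop. This is only a simplified sketch and not Pysa's actual algorithm or data format: the PROGRAM dictionary, the CALL_SITES list and all function names are hypothetical.

```python
# Toy program description (hypothetical, not Pysa's internal format).
# "returns": callees whose result the function returns.
# "passes_param_to": callees that receive the function's parameter.
PROGRAM = {
    "read_query": {"returns": ["SOURCE"], "passes_param_to": []},
    "load_input": {"returns": ["read_query"], "passes_param_to": []},
    "run_code": {"returns": [], "passes_param_to": ["SINK"]},
    "dispatch": {"returns": [], "passes_param_to": ["run_code"]},
    "log_value": {"returns": [], "passes_param_to": []},
}

# Call sites where the result of the first function is passed to the second.
CALL_SITES = [("load_input", "dispatch"), ("load_input", "log_value")]


def build_summaries(program):
    returns_source = {"SOURCE"}
    reaches_sink = {"SINK"}
    changed = True
    while changed:  # repeat rounds until no summary changes (a fixed point)
        changed = False
        for name, info in program.items():
            if name not in returns_source and any(
                callee in returns_source for callee in info["returns"]
            ):
                returns_source.add(name)
                changed = True
            if name not in reaches_sink and any(
                callee in reaches_sink for callee in info["passes_param_to"]
            ):
                reaches_sink.add(name)
                changed = True
    return returns_source, reaches_sink


def find_issues(program, call_sites):
    returns_source, reaches_sink = build_summaries(program)
    # An issue is a call site that hands source data to a sink-reaching function.
    return [(p, c) for p, c in call_sites if p in returns_source and c in reaches_sink]


print(find_issues(PROGRAM, CALL_SITES))  # [('load_input', 'dispatch')]
```

Only the call from load_input to dispatch is reported: load_input indirectly returns source data and dispatch indirectly forwards its parameter to a sink, while log_value never reaches one.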

The analyzer's work boils down to identifying sources of incoming data and dangerous calls in which that raw data should not be used.

Pysa monitors the passage of data through the chain of function calls and associates the original data with potentially dangerous places in the code.

Because we use open source Python server frameworks like Django and Tornado for our own products, Pysa can start finding security issues in projects that use these frameworks from the very first run. Using Pysa for frameworks we don't have coverage for yet is generally as simple as adding a few configuration lines to tell Pysa where data is coming into the server.

One example of a vulnerability identified by Pysa is an open redirect issue (CVE-2019-19775) in the Zulip messaging platform, caused by passing unsanitized external parameters when displaying thumbnails.
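For readers unfamiliar with this class of bug, the sketch below shows the general shape of an open redirect and one simple way to guard against it. It is not Zulip's code: the params dictionary, the "url" key and both function names are hypothetical, and the check is deliberately simplified rather than an exhaustive validator.

```python
from urllib.parse import urlparse


def is_local_path(url):
    """Return True only for same-site paths such as "/static/a.png"."""
    if "\\" in url:  # some browsers treat a backslash like a forward slash
        return False
    parsed = urlparse(url)
    # An absolute URL has a scheme; a protocol-relative one ("//host") has a netloc.
    return parsed.scheme == "" and parsed.netloc == "" and url.startswith("/")


def redirect_target(params, default="/"):
    # params["url"] is user-controlled; returning it unchecked would be an open redirect.
    candidate = params.get("url", default)
    return candidate if is_local_path(candidate) else default


print(redirect_target({"url": "/thumbnails/1.png"}))
print(redirect_target({"url": "https://evil.example/"}))
```

In taint-analysis terms, the request parameter is the source and the redirect response is the sink; a check like is_local_path sits between them.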

Pysa's data flow tracking capabilities can also be used to check that additional frameworks are used correctly and to verify compliance with policies on the use of user data.

For example, Pysa can be used without additional configuration to check projects built on the Django and Tornado frameworks. Pysa can also identify common vulnerabilities in web applications, such as SQL injection and cross-site scripting (XSS).
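As an illustration of the SQL case, the following sketch uses Python's built-in sqlite3 module to contrast a query built by string formatting with a parameterized one. The table, the function names and the payload are invented for the example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_unsafe(conn, name):
    # Tainted data is spliced into the SQL text: the flow a taint analyzer should flag.
    return conn.execute(f"SELECT secret FROM users WHERE name = '{name}'").fetchall()


def find_safe(conn, name):
    # The value is bound as a parameter and is never parsed as SQL.
    return conn.execute("SELECT secret FROM users WHERE name = ?", (name,)).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"
print(len(find_unsafe(conn, payload)))  # 2: every row leaks
print(len(find_safe(conn, payload)))    # 0: no user has that literal name
```

The same user-controlled string returns every row through the formatted query and nothing through the parameterized one.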

At Facebook, the analyzer is used to check the code of the Instagram service. During the first quarter of 2020, Pysa helped identify 44% of all problems found by Facebook engineers in Instagram's server-side code base.

In total, automated checking of code changes with Pysa flagged 330 problems: 49 (15%) were rated significant, 131 (40%) were judged not dangerous, and the remaining 150 (45%) were false positives.

The new analyzer is designed as an add-on to the Pyre type-checking toolkit and is hosted in its repository. The code is released under the MIT license.

Finally, if you want to know more about it, you can check the details in the original post.

