Channel: Dynamics Communities

How Microsoft Fabric Handles Common Python Vulnerabilities, Version Conflicts

Microsoft Fabric UG

Now that Microsoft has made Python publicly available in Excel and Microsoft Fabric, I’ve been curious about how Common Vulnerabilities and Exposures (CVEs) and library version conflicts are handled in both products.

Python is a popular programming language that offers many benefits for data analysis, automation, and web development. However, as with many other open-source solutions and programming languages, Python also poses some security risks and challenges that need to be addressed, especially when working with Microsoft products such as Excel and Microsoft Fabric.

What Are CVEs and Library Version Conflicts in Python?

CVE stands for Common Vulnerabilities and Exposures, a list of publicly known security flaws in software systems. Each CVE is assigned a unique identifier of the form CVE-YYYY-NNNN and is usually accompanied by a severity score, such as a CVSS score, that reflects its impact and exploitability.

A library version conflict occurs when different libraries or packages require incompatible versions of the same dependency. For example, if “library A” requires requests==2.3.1 and “library B” requires requests==2.4.0, then installing both libraries causes a version conflict, because only one version of requests can be installed in a given environment at a time. Whichever version ends up installed, one of the two libraries runs against a version it was not built for, which can lead to unexpected errors or failures in the application that uses them.
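The requests example above can be sketched in a few lines of Python. This is only an illustration: the helper name find_pin_conflicts and the library names are hypothetical, and the function understands nothing beyond exact "name==version" pins, whereas a real resolver such as pip handles much richer version specifiers.

```python
from collections import defaultdict


def find_pin_conflicts(requirements):
    """Return dependencies that are pinned to more than one exact version.

    `requirements` maps a library name to its list of "name==version" pins.
    Only exact pins are understood; this is an illustration, not a resolver.
    """
    pins = defaultdict(dict)  # dependency -> {version: [libraries requiring it]}
    for library, specs in requirements.items():
        for spec in specs:
            name, _, version = spec.partition("==")
            pins[name.strip().lower()].setdefault(version.strip(), []).append(library)
    # A conflict exists when one dependency is pinned to two or more versions.
    return {dep: versions for dep, versions in pins.items() if len(versions) > 1}


# Hypothetical libraries mirroring the example in the text.
conflicts = find_pin_conflicts({
    "library-a": ["requests==2.3.1"],
    "library-b": ["requests==2.4.0"],
})
print(conflicts)
```

The result groups each conflicting dependency by version and lists which library asked for it, which is usually the first thing needed to decide which pin to relax.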

How Microsoft Fabric Addresses CVEs and Version Conflicts

Microsoft Fabric supports Python as one of its primary languages for Apache Spark and data science workflows. To handle CVEs and version conflicts in Python, Microsoft Fabric provides the following features and tools:

Workspace Libraries

Workspace libraries allow users to install and manage Python libraries at the workspace level, which are shared across all notebooks and Spark jobs in the workspace. Users can install both feed libraries (from public sources such as PyPI or Conda) and custom libraries using the workspace settings.

As a good practice, standardize libraries and their versions at the workspace level. This gives every Fabric user and developer working with Python or Spark in that workspace the same baseline.

Notebook Libraries

Notebook libraries allow users to install and manage Python libraries at the notebook level, which are specific to each notebook session. Users can install feed libraries with the %pip magic command in a notebook cell. Custom libraries, such as a .whl file, are typically uploaded to a location the notebook can read (for example, the notebook's resources or an attached lakehouse) and then installed with %pip as well. Notebook libraries can help resolve version conflicts by overriding the workspace libraries for a particular notebook.

When a Python library is installed at notebook scope, only the current notebook and its associated jobs have access to that library; other notebooks attached to the same cluster are not affected. Notebook-scoped libraries also do not persist across sessions, so they must be reinstalled at the beginning of each notebook session.
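Because notebook-scoped installs disappear with the session, it can help to confirm in the first cell which version the session actually sees. The sketch below uses only the standard library (importlib.metadata); the helper names installed_version and matches_pin are hypothetical, and the %pip line appears only as a comment because it is a notebook magic rather than Python.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package, expected):
    """True when `package` is installed at exactly the `expected` version."""
    return installed_version(package) == expected


# In a notebook, the install itself would be a separate cell, for example:
#   %pip install requests==2.4.0
# The check below then reports what the current session actually sees.
print("requests:", installed_version("requests"))
```

A check like matches_pin("requests", "2.4.0") at the top of a notebook makes a silently missing or overridden library visible before the rest of the cells run.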

Microsoft Spark Utilities

Microsoft Spark Utilities (MSSparkUtils) is a built-in package that helps users perform common tasks in Microsoft Fabric notebooks, such as working with file systems, chaining notebooks, and handling secrets. A CVE check for installed Python libraries, exposed as mssparkutils.cve.check(), has also been described: it would scan the libraries and report any CVEs found, along with their severity and description. That function is not among the commonly documented MSSparkUtils modules, so confirm that it is available in your Fabric runtime before relying on it.

Best Practices to Reduce Python CVEs and Library Version Conflicts

Here are some tips and best practices for business users to reduce exposure to Python CVEs and library version conflicts when working with Microsoft Fabric and Excel:

  • Use virtual environments: Virtual environments are isolated Python environments, each with its own Python version and set of installed libraries. They help avoid version conflicts and make Python code more reproducible. Users can create and activate virtual environments with tools such as virtualenv or venv.
  • Use pip-conflict-checker and pipdeptree: These two tools help users detect and resolve version conflicts in Python libraries. pip-conflict-checker recursively checks the requirements of each library and prints out any conflicts. pipdeptree displays the installed packages as a dependency tree and warns about conflicting requirements.
  • Use pandas API on Spark: The pandas API on Spark (pyspark.pandas) is an Apache Spark feature, available in Microsoft Fabric, that lets users scale pandas-style workloads by distributing them across multiple nodes. Because it ships with the Fabric Spark runtime, it works with the runtime's built-in pandas version, which reduces the need to install a separate pandas build and keeps pandas and its dependencies on the versions maintained with the runtime.
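As a minimal sketch of the first tip, a virtual environment can be created from Python itself with the standard-library venv module, for example on a local development machine. The helper name create_isolated_env and the directory name demo-env are hypothetical, and the environment is built in a temporary directory purely for demonstration.

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_env(target_dir):
    """Create a virtual environment in `target_dir` and return its config file path."""
    # with_pip=False keeps the sketch fast; pass with_pip=True to bootstrap pip as well.
    venv.create(target_dir, with_pip=False)
    return Path(target_dir) / "pyvenv.cfg"


# Build the environment in a throwaway directory and confirm it exists.
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_isolated_env(Path(tmp) / "demo-env")
    env_was_created = cfg.is_file()
    print("environment created:", env_was_created)
```

In practice the environment would live in a project folder rather than a temporary one, and the equivalent command-line form is python -m venv followed by the target directory.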

The post How Microsoft Fabric Handles Common Python Vulnerabilities, Version Conflicts appeared first on Dynamics Communities.
