Python tips to speed up your data analysis

Profiling the ‘pandas’ dataframe

Profiling is a process that helps us understand our data, and Pandas Profiling is a Python package that does exactly that. It's a simple and fast way to perform exploratory data analysis (EDA) of a pandas DataFrame. The pandas df.describe() and df.info() functions are normally used as a first step in the EDA process. However, they give only a very basic overview of the data and don't help much with large data sets. Pandas Profiling, on the other hand, extends the pandas DataFrame with df.profile_report() for quick data analysis. It produces a lot of information with a single line of code and presents it as an interactive HTML report.

Installation

The package can be installed with pip: pip install pandas-profiling

Usage

Let’s use the age-old Titanic dataset to demonstrate the capabilities of the versatile Python profiler.
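
A minimal sketch of what the usage might look like, assuming the package is installed under its original name, pandas-profiling (newer releases are distributed as ydata-profiling), and that a copy of the Titanic data is saved locally. The function name build_profile_report and the file names titanic.csv and titanic_report.html are hypothetical placeholders.

```python
def build_profile_report(csv_path, html_path):
    """Profile a CSV file and save the interactive report as HTML."""
    # Imported inside the function so this sketch can be loaded even
    # where the third-party packages are not installed.
    import pandas as pd
    import pandas_profiling  # noqa: F401  (importing it adds df.profile_report())

    df = pd.read_csv(csv_path)
    report = df.profile_report()  # the single line that builds the report
    report.to_file(html_path)     # save it as a standalone HTML page
    return report


# Hypothetical file names, shown only for illustration:
# build_profile_report("titanic.csv", "titanic_report.html")
```

In a notebook, evaluating df.profile_report() as the last line of a cell should display the report inline.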

Bringing interactivity to pandas plots

Pandas has a built-in .plot() function as part of the DataFrame class. However, the visualizations it renders aren’t interactive, which makes them less appealing. On the other hand, the ease of plotting charts with pandas.DataFrame.plot() is hard to give up. What if we could plot interactive, plotly-like charts with pandas without making major modifications to the code? You can do exactly that with the help of the Cufflinks library.

Installation

Cufflinks builds on plotly, so install both with pip: pip install plotly cufflinks

Usage
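
A rough sketch of the idea, assuming cufflinks and plotly are installed and compatible with your pandas version (Cufflinks is sensitive to version combinations). The helper name interactive_plot is made up for this illustration.

```python
def interactive_plot(df):
    """Render an interactive Plotly chart from a pandas DataFrame."""
    # Imported inside the function so this sketch can be loaded even
    # where cufflinks is not installed.
    import cufflinks as cf  # importing it adds .iplot() to DataFrames

    cf.go_offline()  # draw charts locally instead of using the Plotly cloud
    df.iplot()       # interactive counterpart of df.plot()
```

The point of the design is that df.plot() becomes df.iplot() and the rest of the plotting code stays as it was.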

A dash of magic

Magic commands are a set of convenient functions in Jupyter Notebooks that are designed to solve some of the common problems in standard data analysis. You can see all available magics with the help of %lsmagic.

  • %matplotlib notebook
Figure: %matplotlib inline vs %matplotlib notebook

The interactive debugger is also a magic function, but I have given it a category of its own. If you get an exception while running a code cell, type %debug in a new cell and run it. This opens an interactive debugging environment that takes you to the position where the exception occurred. There you can inspect the values of the variables assigned in the program and perform operations on them. To exit the debugger, hit q.
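
As a small illustration of that workflow, here is a sketch with a made-up function, average, that fails on an empty list. The %debug steps are left as comments because magic commands only work inside IPython or Jupyter.

```python
def average(values):
    """Return the arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


# Running the next line in a cell raises ZeroDivisionError,
# because the list is empty:
# average([])
#
# Then, in a new cell, run:
# %debug
#
# At the ipdb> prompt, `p values` prints the offending argument
# and `q` quits the debugger.
```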

Printing can be pretty too

If you want to produce aesthetically pleasing representations of your data structures, pprint is the go-to module. It is especially useful when printing dictionaries or JSON data. Let’s have a look at an example which uses both print and pprint to display the output.
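
A minimal sketch of such a comparison. The employee record below is made-up illustration data, not the data from the original example.

```python
from pprint import pprint

# Hypothetical nested record, similar to parsed JSON.
employee = {
    "name": "Ada",
    "skills": ["python", "pandas", "jupyter"],
    "projects": [
        {"title": "sales dashboard", "status": "done", "year": 2019},
        {"title": "churn model", "status": "in progress", "year": 2020},
    ],
}

print(employee)   # the whole structure on one long line
pprint(employee)  # wrapped across several indented lines
```

pprint only wraps output that does not fit within its width (80 characters by default), so short structures look the same with both functions.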

Making the notes stand out

You can use alert or note boxes in your Jupyter Notebooks to highlight something important or anything that needs to stand out. The color of the box depends on the type of alert that is specified. Just add any or all of the following codes to a cell that needs to be highlighted.
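
The boxes are plain HTML that goes into a Markdown cell. The sketch below generates that markup with a small helper, assuming the notebook front end ships the Bootstrap alert styles, as the classic Jupyter Notebook does. The names ALERT_COLOURS and alert_box are hypothetical, introduced only for this illustration.

```python
# Bootstrap alert classes and the colour each one produces.
ALERT_COLOURS = {
    "alert-info": "blue",
    "alert-warning": "yellow",
    "alert-success": "green",
    "alert-danger": "red",
}


def alert_box(kind, message):
    """Return the HTML for an alert box; paste it into a Markdown cell."""
    if kind not in ALERT_COLOURS:
        raise ValueError(f"unknown alert type: {kind}")
    return (
        f'<div class="alert alert-block {kind}">\n'
        f"{message}\n"
        "</div>"
    )


print(alert_box("alert-info", "<b>Tip:</b> Use blue boxes for tips and notes."))
```

Copy the printed markup into a Markdown cell and run the cell to see the coloured box.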

Consider a cell of Jupyter Notebook containing the following lines of code:

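A typical way of running a Python script from the command line is python hello.py. However, if you add -i when running the same script, e.g. python -i hello.py, Python does not exit when the script finishes. It drops into the interactive interpreter instead, so you can keep inspecting the script's variables and functions.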
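
To make this concrete, here is a sketch of what hello.py could contain. The contents are a hypothetical example, since the original script is not shown.

```python
# hello.py (hypothetical contents, for illustration only)

def greet(name):
    """Build a greeting for the given name."""
    return f"Hello, {name}!"


message = greet("world")
print(message)

# Run as `python hello.py`: prints the greeting and exits.
# Run as `python -i hello.py`: prints the greeting, then leaves you at
# the >>> prompt with `greet` and `message` still defined, so you can
# inspect them or call greet("Jupyter") by hand.
```

If the script ends with an exception, the same prompt lets you examine the failure, for example with import pdb; pdb.pm().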

Ctrl/Cmd + / automatically comments out the selected lines in the cell. Hitting the combination again uncomments the same lines.

Have you ever accidentally deleted a cell in a Jupyter Notebook? If so, here is a shortcut that can undo the deletion.

  • If you need to recover an entire deleted cell, hit ESC+Z or choose EDIT > Undo Delete Cells.

In this article, I’ve listed the main tips I have gathered while working with Python and Jupyter Notebooks. I’m sure these simple hacks will be of use to you at some point in your career. Till then, happy coding!

This article has been published from the source link without modifications to the text. Only the headline has been changed.
