Linux: Python Modules for Data Analysis, Installation and Basics with Jupyter Notebook

Alright, so the modules we need right now are:

1. NumPy (Numerical Python)

2. Pandas

3. Matplotlib

Installation and setup (python3)

NumPy

To install NumPy on Ubuntu 20.04, execute the following command.

PYTHON 3:
$ sudo apt install python3-numpy

Using pip

pip3 install numpy

Check the version:

python3 -c "import numpy; print(numpy.__version__)"

Note: Some people run into problems after using the pip3 command. The module appears to install, but importing it raises ModuleNotFoundError. This usually means pip3 installed the module for a different Python interpreter or environment than the one you are running, for example a different Jupyter kernel.
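One way to narrow this down is to check which interpreter is actually running and whether it can see the module. Here is a minimal sketch using only the standard library; the helper name describe_module is made up for illustration.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file the running interpreter would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter running this code (in Jupyter, this is the kernel's Python).
print("interpreter:", sys.executable)

for module in ("numpy", "pandas", "matplotlib"):
    location = describe_module(module)
    print(module, "->", location or "not found by this interpreter")
```

If a module shows up as not found, installing it with that same interpreter (python3 -m pip install numpy) puts it where this Python will look.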

Pandas

pip3 install pandas

Check the version of pandas:

python3 -c "import pandas as pd; print(pd.__version__)"

Matplotlib

pip3 install matplotlib
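To confirm all three installs in one go, something like the following can be run in a notebook cell. It is only a sketch: installed_versions is a hypothetical helper, and it relies on importlib.metadata, which needs Python 3.8 or newer.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for package in packages:
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = None
    return versions


for package, version in installed_versions(["numpy", "pandas", "matplotlib"]).items():
    print(package, version or "not installed")
```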

Basic Data Analysis

Source: https://github.com/wesm/pydata-book

Download the file: usa_gov

This file serves as our data file.

Alright, so using this data file we can analyse various aspects of the data and even create graphs with matplotlib.

Let us start by reading the file. I'm assuming you're familiar with Jupyter Notebook, so we go straight to opening the file.
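Reading the file could look roughly like this. The sketch assumes the data file holds one JSON record per line; the field names and the tiny sample below are made up so the cell runs on its own, and read_records is a hypothetical helper. Point it at the path of your downloaded file instead.

```python
import json
import os
import tempfile


def read_records(path):
    """Parse a file holding one JSON object per line into a list of dicts."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Made-up sample standing in for the downloaded file (assumed format).
sample_lines = [
    '{"tz": "America/New_York", "a": "Mozilla/5.0"}',
    '{"tz": "", "a": "GoogleMaps/RochesterNY"}',
    '{"a": "Mozilla/4.0"}',
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(sample_lines) + "\n")
    records = read_records(path)

print(len(records))      # number of records read
print(records[0]["tz"])  # time zone field of the first record
```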

Counting time zones with pandas
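A rough sketch of the counting step, assuming each record stores its time zone under a "tz" key (an assumption about the data file, as are the made-up records below). Records without the key are labelled "Missing" and empty strings "Unknown" so they still show up in the counts.

```python
import pandas as pd

# Made-up records; the "tz" key is an assumption about the data file.
records = [
    {"tz": "America/New_York"},
    {"tz": "America/Chicago"},
    {"tz": "America/New_York"},
    {"tz": ""},
    {},
]

frame = pd.DataFrame(records)

# Label records with no tz key as "Missing" and empty strings as "Unknown".
clean_tz = frame["tz"].fillna("Missing").replace("", "Unknown")

# Count how often each time zone appears, most frequent first.
tz_counts = clean_tz.value_counts()
print(tz_counts)
```

With matplotlib installed, tz_counts.plot(kind="barh") should draw the same counts as a horizontal bar chart.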

And here is the output you get:

Download this notebook for reference.

Here are all the notebooks on various data sources and their analysis.

Download the database from here:

Notebook for this database

data analysis part 2

data analysis part 3
