Python for Data Sciences (PDS)

 

Course Overview

This course offers a comprehensive introduction to Python for data science, equipping participants with the skills to manipulate, analyze, and visualize data effectively. Starting with foundational Python programming, the course progresses to essential tools: Pandas for data manipulation, Matplotlib and Seaborn for visualization, and NumPy for numerical computation. Participants will learn to work with relational databases such as SQLite and PostgreSQL, as well as NoSQL databases such as MongoDB, to manage and analyze large datasets. The course also covers using Jupyter Notebooks to organize analyses, cleaning and preparing data, and performing advanced computations with SciPy. By the end, attendees will be prepared to tackle real-world data challenges and make data-driven decisions confidently.

Who should attend

  • Data Analysts or Administrators
  • Business Intelligence Professionals
  • Data Scientists
  • Software Developers

Prerequisites

  • Basic Keyboard Proficiency: Ability to navigate and use a keyboard effectively, including typing, copy-pasting, and performing basic text editing in the terminal or text editors.
  • Prior familiarity with a programming language is beneficial but not mandatory.

Course Objectives

  • Build a Strong Python Foundation: Master the fundamentals of Python programming, including functions, data structures, and control flow, as a basis for data science.
  • Use Jupyter Notebooks for Data Science Workflows: Learn to create and manage Jupyter Notebooks for organizing and presenting data analyses effectively.
  • Manipulate Data with Pandas: Work with DataFrames to clean, modify, and analyze structured data using Boolean masks, time series, and groupby operations.
  • Interact with Databases: Connect to and query relational databases like SQLite and PostgreSQL, as well as NoSQL databases like MongoDB, to manage and analyze data.
  • Visualize Data with Matplotlib and Seaborn: Create insightful visualizations, including histograms, bar graphs, and relational plots, to explore and communicate data trends.
  • Leverage NumPy for Numerical Analysis: Use NumPy arrays for efficient numerical computations, including generating data, indexing, and reshaping multi-dimensional arrays.
  • Explore Advanced Visualizations with Seaborn: Visualize multi-dimensional datasets and relational data to uncover deeper insights.
  • Utilize Regular Expressions for Data Parsing: Apply regex techniques to search and process text-based data effectively.
  • Perform Scientific Computations with SciPy: Use SciPy for advanced mathematical and scientific computations to support complex data analyses.
  • Clean and Prepare Data for Analysis: Master techniques for handling missing values, cleaning datasets, and transforming data for analytical workflows.

Outline: Python for Data Sciences (PDS)

Day 1: Foundational Python

  • Lecture + Lab: Built-in Functions
  • Lecture + Lab: Custom Functions
  • Lecture + Lab: Objects and Methods
  • Lecture: Python Lists
  • Lecture + Lab: Python Lists
  • Lecture: Python Dictionaries
  • Lecture + Lab: Python Dictionaries
  • Lecture: Conditionals
  • Lecture + Lab: If, Elif, and Else Conditions
  • Lecture + Lab: While Loops

Day 2: Foundational Python (Continued)

  • Lecture + Lab: For Loops
  • Lecture: Reading and Writing to Files
  • Lecture + Lab: Reading Files
  • Lecture + Lab: Using Modules
  • Lecture + Lab: PIP and Third Party Libraries
  • Lecture + Lab: Try and Except
  • Lecture + Lab: Python Classes & Inheritance

Days 3-5: Python Data Science

  • Lecture + Lab: Introduction to Jupyter Notebook

Regular Expressions

  • Lecture: Introduction to Regular Expressions (RegEx)
  • Lecture + Lab: Use RegEx to Search Text
  • Lecture + Lab: Search and Replace Data
  • Lecture + Lab: Compiling RegEx Search Objects
  • Lecture + Lab: Testing if a Match Exists
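
As a rough illustration of the topics in this module (not taken from the course labs), the sketch below uses Python's standard re module to compile a pattern, test whether a match exists, list all matches, and search and replace. The sample text and variable names are hypothetical.

```python
import re

# Hypothetical sample text; not from the course materials.
text = "Order 1042 shipped 2024-03-15; order 1043 shipped 2024-03-18."

# Compile the pattern once so the same search object can be reused.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Test whether at least one match exists.
has_date = date_pattern.search(text) is not None

# Collect every matching substring.
dates = [match.group(0) for match in date_pattern.finditer(text)]

# Search and replace: rewrite YYYY-MM-DD as DD/MM/YYYY using the groups.
rewritten = date_pattern.sub(r"\3/\2/\1", text)

print(has_date)
print(dates)
print(rewritten)
```

Compiling the pattern is optional for one-off searches; it mainly helps readability when the same expression is used several times.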

Pandas

  • Lecture + Lab: Intro to Pandas
  • Lecture + Lab: Examining Cashflow with Pandas
  • Lecture: Cleaning Data with Pandas
  • Lecture + Lab: Cleaning Data with Pandas
  • Challenge: Modifying DataFrames
  • Lecture + Lab: Boolean Masks for DataFrames
  • Lecture + Lab: Time Series Data

Python and Databases

  • Lecture: Interacting with Databases
  • Lecture + Lab: Learning sqlite3
  • Lecture + Lab: PostgreSQL with Python
  • Lecture + Lab: Python and MongoDB
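
To give a flavor of the sqlite3 topic above, here is a minimal sketch using only the standard library. It assumes an in-memory database and a hypothetical sales table; the table name, columns, and values are illustrative and do not come from the course labs.

```python
import sqlite3

# An in-memory database keeps the example self-contained (assumption:
# the course labs may use a file-backed database instead).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, for illustration only.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
rows = [("East", 120.0), ("West", 80.5), ("East", 45.0)]

# Parameterized inserts avoid building SQL strings by hand.
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
conn.commit()

# Aggregate the amounts per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

conn.close()
print(totals)
```

PostgreSQL and MongoDB need third-party drivers and a running server, so they are not sketched here.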

Matplotlib

  • Lecture: Matplotlib
  • Lecture + Lab: Creating Plots with Matplotlib
  • Lecture + Lab: Matplotlib - Histograms and Bar Graphs
  • Lecture + Lab: Annotating Graphs on Matplotlib
  • Lecture + Lab: Making Subplots with Matplotlib
  • Lecture + Lab: Customizing Matplotlib Graphs
  • Challenge: Create a Graph with Matplotlib

Seaborn

  • Lecture + Lab: Pandas groupby and Graphing Relational Data with Seaborn
  • Lecture + Lab: Visualizing Multi-Dimensional Data Using Seaborn

NumPy

  • Lecture + Lab: Using NumPy Arrays
  • Lecture + Lab: Generating Data with NumPy’s arange and linspace
  • Lecture + Lab: Indexing, Slicing, and Reshaping Multi-Dimensional Arrays

SciPy

  • Lecture + Lab: Introducing SciPy

Additional Python Data Science Tools

  • Lecture + Lab: Building Interactive Maps
  • Lecture + Lab: folium and Flask - Returning Maps from Custom API Endpoints
  • Challenge: Map the Location of the ISS

Optional: PCEP Certification Guide

  • Lecture: Introduction to the PCEP Exam
  • Lecture + Lab: Advanced Numbers and Operators
  • Lecture + Lab: Pythonic Loops and Iteration
  • Lecture + Lab: Advanced Lists and Tuples
  • Lecture + Lab: Advanced Functionality and Error Handling

Prices & Delivery methods

Online Training

Duration
5 days

Price
  • Online Training: CAD 3,860
  • Online Training: USD 2,795

Classroom Training

Duration
5 days

Price
  • Canada: CAD 3,860

Schedule

Currently there are no training dates scheduled for this course.