Course Overview
This course offers a comprehensive introduction to Python for data science, equipping participants with the skills to manipulate, analyze, and visualize data effectively. Starting with foundational Python programming, the course progresses to essential tools: Pandas for data manipulation, Matplotlib and Seaborn for visualization, and NumPy for numerical computation. Participants will learn to work with relational databases such as SQLite and PostgreSQL, as well as NoSQL databases such as MongoDB, to manage and analyze large datasets. The course also covers using Jupyter Notebooks to organize analyses, cleaning and preparing data, and performing advanced computations with SciPy. By the end, attendees will be prepared to tackle real-world data challenges and make data-driven decisions confidently.
Who Should Attend
- Data Analysts or Administrators
- Business Intelligence Professionals
- Data Scientists
- Software Developers
Prerequisites
- Basic Keyboard Proficiency: Ability to navigate and use a keyboard effectively, including typing, copy-pasting, and performing basic text editing in the terminal or text editors.
- Prior familiarity with a programming language is beneficial but not mandatory.
Course Objectives
- Build a Strong Python Foundation: Master the fundamentals of Python programming, including functions, data structures, and control flow, as a basis for data science.
- Use Jupyter Notebooks for Data Science Workflows: Learn to create and manage Jupyter Notebooks for organizing and presenting data analyses effectively.
- Manipulate Data with Pandas: Work with DataFrames to clean, modify, and analyze structured data using Boolean masks, time series, and groupby operations.
- Interact with Databases: Connect to and query relational databases like SQLite and PostgreSQL, as well as NoSQL databases like MongoDB, to manage and analyze data.
- Visualize Data with Matplotlib and Seaborn: Create insightful visualizations, including histograms, bar graphs, and relational plots, to explore and communicate data trends.
- Leverage NumPy for Numerical Analysis: Use NumPy arrays for efficient numerical computations, including generating data, indexing, and reshaping multi-dimensional arrays.
- Explore Advanced Visualizations with Seaborn: Visualize multi-dimensional datasets and relational data to uncover deeper insights.
- Utilize Regular Expressions for Data Parsing: Apply regex techniques to search and process text-based data effectively.
- Perform Scientific Computations with SciPy: Use SciPy for advanced mathematical and scientific computations to support complex data analyses.
- Clean and Prepare Data for Analysis: Master techniques for handling missing values, cleaning datasets, and transforming data for analytical workflows.
Outline: Python for Data Sciences (PDS)
Day 1: Foundational Python
- Lecture + Lab: Built-in Functions
- Lecture + Lab: Custom Functions
- Lecture + Lab: Objects and Methods
- Lecture: Python Lists
- Lecture + Lab: Python Lists
- Lecture: Python Dictionaries
- Lecture + Lab: Python Dictionaries
- Lecture: Conditionals
- Lecture + Lab: If, Elif, and Else Conditions
- Lecture + Lab: While Loops
Day 2: Foundational Python (Continued)
- Lecture + Lab: For Loops
- Lecture: Reading and Writing to Files
- Lecture + Lab: Reading Files
- Lecture + Lab: Using Modules
- Lecture + Lab: PIP and Third Party Libraries
- Lecture + Lab: Try and Except
- Lecture + Lab: Python Classes & Inheritance
Days 3-5: Python Data Science
- Lecture + Lab: Introduction to Jupyter Notebook
Regular Expressions
- Lecture: Introduction to Regular Expressions (RegEx)
- Lecture + Lab: Use RegEx to Search Text
- Lecture + Lab: Search and Replace Data
- Lecture + Lab: Compiling RegEx Search Objects
- Lecture + Lab: Testing if a Match Exists
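The regular expression topics above can be previewed with Python's built-in `re` module. The following is a minimal sketch, not course lab material; the sample text and variable names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical sample text, not taken from the course labs.
log = "2024-01-15 ERROR disk full; 2024-01-16 INFO backup done"

# Compile the pattern once so it can be reused across searches.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Search: find the first date in the text.
first = date_pattern.search(log)
first_date = first.group(0) if first else None

# Find every date in the text.
all_dates = [m.group(0) for m in date_pattern.finditer(log)]

# Search and replace: rewrite YYYY-MM-DD as DD/MM/YYYY using group references.
swapped = date_pattern.sub(r"\3/\2/\1", log)

# Test whether a match exists without using the match object itself.
has_error = re.search(r"\bERROR\b", log) is not None
has_warning = re.search(r"\bWARNING\b", log) is not None

print(first_date)
print(all_dates)
print(swapped)
print(has_error, has_warning)
```

Compiling the pattern is mainly a convenience when the same expression is applied many times; for a one-off check, `re.search` with a pattern string works just as well.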
Pandas
- Lecture + Lab: Intro to Pandas
- Lecture + Lab: Examining Cashflow with Pandas
- Lecture: Cleaning Data with Pandas
- Lecture + Lab: Cleaning Data with Pandas
- Challenge: Modifying DataFrames
- Lecture + Lab: Boolean Masks for DataFrames
- Lecture + Lab: Time Series Data
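As a rough preview of the Pandas topics above, the sketch below cleans a small DataFrame, filters it with a Boolean mask, and groups it by month. It assumes pandas is installed; the cashflow figures and column names are made up for illustration and are not the course's lab data.

```python
import pandas as pd

# Hypothetical cashflow records; one amount is missing on purpose.
df = pd.DataFrame({
    "date": ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-18"],
    "amount": [250.0, -75.0, None, 120.0],
})

# Convert the text dates to datetime values so date-based operations work.
df["date"] = pd.to_datetime(df["date"])

# Cleaning: drop rows where the amount is missing.
clean = df.dropna(subset=["amount"])

# Boolean mask: keep only the rows with a positive amount.
inflows = clean[clean["amount"] > 0]

# Time series: total the amounts by calendar month.
monthly = clean.groupby(clean["date"].dt.month)["amount"].sum()

print(inflows)
print(monthly)
```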
Python and Databases
- Lecture: Interacting with Databases
- Lecture + Lab: Learning sqlite3
- Lecture + Lab: PostgreSQL with Python
- Lecture + Lab: Python and MongoDB
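For a sense of what database access from Python looks like, here is a minimal sketch using the standard-library `sqlite3` module with an in-memory database. The table name, columns, and rows are hypothetical and exist only for this example.

```python
import sqlite3

# Use an in-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Create a small table and insert rows with parameterized queries.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
rows = [("north", 120.0), ("south", 80.5), ("north", 45.0)]
conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", rows)
conn.commit()

# Query: total sales per region, sorted by region name.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```

The `?` placeholders let the driver handle quoting, which is generally safer than building SQL strings by hand.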
Matplotlib
- Lecture: Matplotlib
- Lecture + Lab: Creating Plots with Matplotlib
- Lecture + Lab: Matplotlib - Histograms and Bar Graphs
- Lecture + Lab: Annotating Graphs in Matplotlib
- Lecture + Lab: Making Subplots with Matplotlib
- Lecture + Lab: Customizing Matplotlib Graphs
- Challenge: Create a Graph with Matplotlib
Seaborn
- Lecture + Lab: Pandas groupby and Graphing Relational Data with Seaborn
- Lecture + Lab: Visualizing Multi-Dimensional Data Using Seaborn
NumPy
- Lecture + Lab: Using NumPy Arrays
- Lecture + Lab: Generating Data with NumPy’s arange and linspace
- Lecture + Lab: Indexing, Slicing, and Reshaping Multi-Dimensional Arrays
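The NumPy topics above can be illustrated with a short sketch, assuming NumPy is installed. The array sizes and variable names are arbitrary choices for this example.

```python
import numpy as np

# arange: evenly spaced values with a fixed step (the stop value is excluded).
steps = np.arange(0, 12, 2)

# linspace: a fixed number of evenly spaced points (the stop value is included).
points = np.linspace(0.0, 1.0, 5)

# Reshape a flat array of 12 values into 3 rows and 4 columns.
grid = np.arange(12).reshape(3, 4)

# Indexing and slicing a two-dimensional array.
second_row = grid[1]        # one full row
last_column = grid[:, -1]   # one full column
corner = grid[:2, :2]       # top-left 2x2 block

print(steps)
print(points)
print(grid)
print(second_row, last_column)
print(corner)
```

A practical difference between the two generators: `arange` is driven by the step size, while `linspace` is driven by the number of points wanted.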
SciPy
- Lecture + Lab: Introducing SciPy
Additional Python Data Science Tools
- Lecture + Lab: Building Interactive Maps
- Lecture + Lab: folium and Flask - Returning Maps from Custom API Endpoints
- Challenge: Map the Location of the ISS
Optional: PCEP Certification Guide
- Lecture: Introduction to the PCEP Exam
- Lecture + Lab: Advanced Numbers and Operators
- Lecture + Lab: Pythonic Loops and Iteration
- Lecture + Lab: Advanced Lists and Tuples
- Lecture + Lab: Advanced Functionality and Error Handling